YOUR FEEDBACK
Iterators in C#: Nothing Beats the Foreach Loop
Darryl wrote: I'm with Dave, screw your crappy pop-ups.

SYS-CON.TV
TOP MICROSOFT .NET LINKS


String Combinatorics with Visual Basic.NET
Create a simple custom string combination class with Visual Basic.NET

Digg This!

Page 1 of 2   next page »

Sooner or later you'll probably find it useful to be able to create and manipulate combinations programmatically. By far the most useful kinds of combinations are string combinations. A string combination of order (n, k) is a subset of k strings chosen from a set of n strings, where order doesn't matter. For example, suppose you have the strings {"ant", "bat", "cow", "dog", "elk"}, so n = 5. If you set k = 3, there are 10 string combination elements:

{"ant", "bat", "cow"}
{"ant", "bat", "dog"}
{"ant", "bat", "elk"}
{"ant", "cow", "dog"}
{"ant", "cow", "elk"}
{"ant", "dog", "elk"}
{"bat", "cow", "dog"}
{"bat", "cow", "elk"}
{"bat", "dog", "elk"}
{"cow", "dog", "elk"}

Notice I don't list an element like {"cow", "bat", "ant"}. Because order doesn't matter this element is the same as {"ant", "bat", "cow"}. In this example, and throughout this article, I use lexicographical combinations. That simply means that the combination elements are listed in dictionary order. Let me point out that combinations are sometimes confused with permutations. String permutations of order n are rearrangements of all possible n strings. The first two string permutations of the original string set above are {"ant", "bat", "cow", "dog", "elk"} and {"ant", "bat", "cow", "elk", "dog"}. In the sections of this article that follow, I present a simple custom StringCombination class written in Visual Basic.NET, explain the code in detail so you can modify it to meet your own needs, and point out a few of the ways you can use string combinations. This article assumes you are a .NET developer, tester, or manager and have intermediate-level Visual Basic.NET coding skill. I think you'll find the ability to use string combinations programmatically a valuable addition to your skillset.

The best way to show you where I'm headed is with a screenshot. If you examine Figure 1 you'll see that I start with a set of five arbitrary strings. I specify a subset size of three. My demonstration program first calculates that there are 10 combination elements if n = 5 and k = 3 and then prints all 10 elements. You'll find the complete program that produced the screenshot in Figure 1 in Listing 1.

Creating a String Combination Class
A string combination lends itself nicely to implementation as a class. There are many possible designs you can use. First I decided to implement my StringCombination class in a VB.NET Class Library so I can easily reuse it. The complete source code for the library is in Listing 2. My overall structure is in Listing 3.

I declared Private fields n and k to hold the total number of strings and the subset size respectively. I declared a string array field named data to hold the strings for a particular combination element. It turned out that I needed to know the original string elements to calculate the successor to a particular combination element, so I declared a string array field named orig to hold this information.

This design is very simple and easily modifiable but, as I'll discuss later, it's inefficient to a certain degree. I have a single class constructor to create a new StringCombination object. The constructor accepts a string array, a, and a subset size, k. The constructor assumes that the array argument is sorted. Next I declare a Private helper function indexOf(). One of the key methods of the StringCombination class is a Successor() method that returns the next combination element relative to a particular element. My Successor() implementation needs to know the successor string to a particular string. For example, in the original set of strings in the example above, {"ant", "bat", "cow", "dog", "elk"}, the successor to "cow" is "dog." I'll use the indexOf() function along with the orig array to get this information. Other fields are a ToString() method to display a string combination element in a friendly way, and a Choose(n,k) method to calculate the total number combination elements there are for a given initial string set size, n, and a subset size, k.

With this design, after I implement the constructor and methods, if I write code like this:

Dim animals As String() = New String() {"ant", "bat", "cow", "dog", "elk"}
Dim c As StringCombination = New StringCombination(animals, 3)
c = c.Successor

then the StringCombination object c will be represented in memory as shown in Figure 2.

Here's how I implemented the StringCombination constructor:

Public Sub New(ByVal a As String(), ByVal k As Integer)
   If a Is Nothing OrElse k < 1 Then
       Throw New Exception("Invalid argument in StringCombination constructor")
     End If
     Me.n = a.Length
     Me.k = k
     Me.data = New String(k - 1) {}
     Me.orig = New String(n - 1) {}

     Dim i As Integer = 0
     While i < Me.n
       Me.orig(i) = a(i)
       i = i + 1
     End While
    i = 0
    While i < Me.k
       Me.data(i) = a(i)
       i = i + 1
   End While
End Sub 'New()

First I do a rudimentary check of the two input parameters. It's pretty obvious that the array argument, a, can't be a Nothing reference. But I don't check if the input array is sorted. Checking the subset size, k, is a bit more subtle. I don't allow a value of 0 but you can allow a subset size of 0 if you want - the idea being that perhaps you want to return a Nothing reference or a self-reference. Next I determine the set size, n, by using the input array's Length property. Notice that as a matter of style, I prefer to use the Me keyword for class fields rather than naming alternatives such a m_n. Next I allocate memory for the data and orig arrays. If you're a C# programmer be careful to remember that the array allocation in VB.NET allocates one more cell than similar C# code. Next I copy all the values in the input array to the orig array using a While loop, and I copy the first k values into the data array. In both cases I could have used the slightly more efficient String.CopyTo() method instead.

Class Methods
Now let's take a look at the StringCombination methods' implementations. Here's my ToString():

Public Overrides Function ToString() As String
   Dim s As String = "{ "
   Dim i As Integer = 0
   While i < Me.k
     s = s & Me.data(i) & " "
     i = i + 1
   End While
   s = s & "}"
   Return s
End Function 'ToString()

As usual for a class ToString() method, I use the Overrides keyword to distinguish my custom ToString() from the ToString() method automatically generated by the System.Object parent class. I simply build up a meta-string separated by spaces that represents the individual strings in the data array. If efficiency is a concern you can use the StringBuilder class instead. You can also use my method as a basis for ToLongString() and ToShortString() methods, which give you more or less information respectively.

The trickiest part of the StringCombination class is the Successor() method that returns the next lexicographical string combination element for a given element. As I'll show you in a moment, my Successor() implementation needs to know the index value of each of the original strings. Here's a Private helper function that does that:

Private Function indexOf(ByVal s As String) As Integer
   Dim i As Integer = 0
   While i < Me.n
     If Me.orig(i) = s Then
       Return i
     End If
     i = i + 1
   End While
   Throw New Exception("Could not find the index in indexOf()")
End Function 'indexOf()

Because I've stored the original strings into array orig, I just iterate through each value in orig looking for the input string and return the current index when I hit the target string. If I make it all the way through array orig but haven't returned an index value then I know the target string isn't in the array and I throw an exception. The Successor() method is in Listing 4.

I begin by writing code that determines if I'm at the last combination element. If you refer back to the screen shot in Figure 1, you'll notice that the last combination element, {"cow", "dog", "elk"}, is the only one that has "cow" in cell data[0], or more generally, the nth-kth original value in cell 0. This relationship is generally true. If the current string combination element instance is the last element then I return Nothing. After that check, I create a result StringCombination object. I pass in the orig data array to seed the new combination element and pass the fixed subset size, k. Then I copy the current combination instance data values to the resulting data array.

The heart of the Successor() method is finding the rightmost-most data string that's "out of order," so to speak. I start by putting an index variable x at the rightmost cell, which is location Me.k-1, and work my way to the left by decrementing x. For example, consider the last combination element, {"cow", "dog", "elk"}. Notice that the original index value of each string (2, 3, 4) is equal to Me.n-Me.k, so when the index of a particular string is not equal to Me.n-Me.k, this is the string that must be changed to its successor string. Then every string to the right of the changed string must be a successor of the string on its left.

The algorithm is a bit tricky but if you trace through a few examples you'll see how it works. The good news is that you won't have to change any of this code if you don't want to.



Page 1 of 2   next page »

About James McCaffrey
Dr. James McCaffrey works for Volt Information Sciences, Inc., where he manages technical training for software engineers working at Microsoft's Redmond, WA campus. He has worked on several Microsoft products, including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.

SYS-CON Brazil News Desk wrote: Sooner or later you'll probably find it useful to be able to create and manipulate combinations programmatically. By far the most useful kinds of combinations are string combinations. A string combination of order (n, k) is a subset of k strings chosen from a set of n strings, where order doesn't matter.
read & respond »
MICROSOFT .NET LATEST STORIES
"Cloud Computing Is the Plan" - Ballmer Memo
With Microsoft mandarin Kevin Johnson bolting to Jupiter, leaving Microsoft to lick its wounds over Yahoo and reorganize, CEO Steve Ballmer sent out an all-hands e-mail to Microsoft folk encapsulating the message he delivered to financial analysts gathering in Redmond Thursday. Ballmer
Adobe's Kevin Lynch and Microsoft's Scott Guthrie to Keynote AJAX World RIA Conference & Expo
Two of the biggest launches in Rich Internet Application history took place in 2007/2008 when Adobe launched AIR 1.0 in February '08 and Microsoft launched Silverlight (September '07). At the 6th International AJAXWorld RIA Conference & Expo in October SYS-CON Events is delighted to be
Virtualization, Microsoft, Yahoo & Google
Citrix has tapped its VP of channels and emerging product sales Al Monserrat to replace its departing sales chief John Burris, who, as previously reported, is going to Sourcefire as CEO. A couple of years ago Monserrat was responsible for Citrix' North American sales. Meanwhile, Citrix
AJAX RIA Tutorial - Accessing the ASP.NET Authentication, Profile and Role Service in Silverlight
In ASP.NET 2.0, we introduced a very powerful set of application services in ASP.NET (Membership, Roles and profile). In 3.5 we created a client library for accessing them from Ajax and .NET Clients and exposed them via WCF web services. For more information on the base level ASP.NET
Cloud Computing - IBM's Got Its Head in the Clouds
Reminding people of how its backing was the making of Linux, IBM, to no one's surprise, has thrown its support behind cloud computing, that delicious nexus of every chi-chi buzzword technology currently in vogue: Web 2.0, rich Internet applications, software-as-a-service, SOA, grid com
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS
SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
Click to Add our RSS Feeds to the Service of Your Choice:
Google Reader or Homepage Add to My Yahoo! Subscribe with Bloglines Subscribe in NewsGator Online
myFeedster Add to My AOL Subscribe in Rojo Add 'Hugg' to Newsburst from CNET News.com Kinja Digest View Additional SYS-CON Feeds
Publish Your Article! Please send it to editorial(at)sys-con.com!

Advertise on this site! Contact advertising(at)sys-con.com! 201 802-3021


SYS-CON FEATURED WHITEPAPERS

ADS BY GOOGLE
BREAKING NEWS FROM THE WIRES
Anexeon Achieves Microsoft Gold Partner Certification
Anexeon, LLC announced today it has achieved Microsoft Gold Certified Partner status along wit