YOUR FEEDBACK
Three RIA Platforms Compared: Adobe Flex, Google Web Toolkit, and OpenLaszlo
NN wrote: Yeah you are right GWT is poor man's Flex. After using GWT on two...

SYS-CON.TV
TOP MICROSOFT .NET LINKS


String Combinatorics with Visual Basic.NET
Create a simple custom string combination class with Visual Basic.NET

Digg This!

Page 1 of 2   next page »

Sooner or later you'll probably find it useful to be able to create and manipulate combinations programmatically. By far the most useful kinds of combinations are string combinations. A string combination of order (n, k) is a subset of k strings chosen from a set of n strings, where order doesn't matter. For example, suppose you have the strings {"ant", "bat", "cow", "dog", "elk"}, so n = 5. If you set k = 3, there are 10 string combination elements:

{"ant", "bat", "cow"}
{"ant", "bat", "dog"}
{"ant", "bat", "elk"}
{"ant", "cow", "dog"}
{"ant", "cow", "elk"}
{"ant", "dog", "elk"}
{"bat", "cow", "dog"}
{"bat", "cow", "elk"}
{"bat", "dog", "elk"}
{"cow", "dog", "elk"}

Notice I don't list an element like {"cow", "bat", "ant"}. Because order doesn't matter this element is the same as {"ant", "bat", "cow"}. In this example, and throughout this article, I use lexicographical combinations. That simply means that the combination elements are listed in dictionary order. Let me point out that combinations are sometimes confused with permutations. String permutations of order n are rearrangements of all possible n strings. The first two string permutations of the original string set above are {"ant", "bat", "cow", "dog", "elk"} and {"ant", "bat", "cow", "elk", "dog"}. In the sections of this article that follow, I present a simple custom StringCombination class written in Visual Basic.NET, explain the code in detail so you can modify it to meet your own needs, and point out a few of the ways you can use string combinations. This article assumes you are a .NET developer, tester, or manager and have intermediate-level Visual Basic.NET coding skill. I think you'll find the ability to use string combinations programmatically a valuable addition to your skillset.

The best way to show you where I'm headed is with a screenshot. If you examine Figure 1 you'll see that I start with a set of five arbitrary strings. I specify a subset size of three. My demonstration program first calculates that there are 10 combination elements if n = 5 and k = 3 and then prints all 10 elements. You'll find the complete program that produced the screenshot in Figure 1 in Listing 1.

Creating a String Combination Class
A string combination lends itself nicely to implementation as a class. There are many possible designs you can use. First I decided to implement my StringCombination class in a VB.NET Class Library so I can easily reuse it. The complete source code for the library is in Listing 2. My overall structure is in Listing 3.

I declared Private fields n and k to hold the total number of strings and the subset size respectively. I declared a string array field named data to hold the strings for a particular combination element. It turned out that I needed to know the original string elements to calculate the successor to a particular combination element, so I declared a string array field named orig to hold this information.

This design is very simple and easily modifiable but, as I'll discuss later, it's inefficient to a certain degree. I have a single class constructor to create a new StringCombination object. The constructor accepts a string array, a, and a subset size, k. The constructor assumes that the array argument is sorted. Next I declare a Private helper function indexOf(). One of the key methods of the StringCombination class is a Successor() method that returns the next combination element relative to a particular element. My Successor() implementation needs to know the successor string to a particular string. For example, in the original set of strings in the example above, {"ant", "bat", "cow", "dog", "elk"}, the successor to "cow" is "dog." I'll use the indexOf() function along with the orig array to get this information. Other fields are a ToString() method to display a string combination element in a friendly way, and a Choose(n,k) method to calculate the total number combination elements there are for a given initial string set size, n, and a subset size, k.

With this design, after I implement the constructor and methods, if I write code like this:

Dim animals As String() = New String() {"ant", "bat", "cow", "dog", "elk"}
Dim c As StringCombination = New StringCombination(animals, 3)
c = c.Successor

then the StringCombination object c will be represented in memory as shown in Figure 2.

Here's how I implemented the StringCombination constructor:

Public Sub New(ByVal a As String(), ByVal k As Integer)
   If a Is Nothing OrElse k < 1 Then
       Throw New Exception("Invalid argument in StringCombination constructor")
     End If
     Me.n = a.Length
     Me.k = k
     Me.data = New String(k - 1) {}
     Me.orig = New String(n - 1) {}

     Dim i As Integer = 0
     While i < Me.n
       Me.orig(i) = a(i)
       i = i + 1
     End While
    i = 0
    While i < Me.k
       Me.data(i) = a(i)
       i = i + 1
   End While
End Sub 'New()

First I do a rudimentary check of the two input parameters. It's pretty obvious that the array argument, a, can't be a Nothing reference. But I don't check if the input array is sorted. Checking the subset size, k, is a bit more subtle. I don't allow a value of 0 but you can allow a subset size of 0 if you want - the idea being that perhaps you want to return a Nothing reference or a self-reference. Next I determine the set size, n, by using the input array's Length property. Notice that as a matter of style, I prefer to use the Me keyword for class fields rather than naming alternatives such a m_n. Next I allocate memory for the data and orig arrays. If you're a C# programmer be careful to remember that the array allocation in VB.NET allocates one more cell than similar C# code. Next I copy all the values in the input array to the orig array using a While loop, and I copy the first k values into the data array. In both cases I could have used the slightly more efficient String.CopyTo() method instead.

Class Methods
Now let's take a look at the StringCombination methods' implementations. Here's my ToString():

Public Overrides Function ToString() As String
   Dim s As String = "{ "
   Dim i As Integer = 0
   While i < Me.k
     s = s & Me.data(i) & " "
     i = i + 1
   End While
   s = s & "}"
   Return s
End Function 'ToString()

As usual for a class ToString() method, I use the Overrides keyword to distinguish my custom ToString() from the ToString() method automatically generated by the System.Object parent class. I simply build up a meta-string separated by spaces that represents the individual strings in the data array. If efficiency is a concern you can use the StringBuilder class instead. You can also use my method as a basis for ToLongString() and ToShortString() methods, which give you more or less information respectively.

The trickiest part of the StringCombination class is the Successor() method that returns the next lexicographical string combination element for a given element. As I'll show you in a moment, my Successor() implementation needs to know the index value of each of the original strings. Here's a Private helper function that does that:

Private Function indexOf(ByVal s As String) As Integer
   Dim i As Integer = 0
   While i < Me.n
     If Me.orig(i) = s Then
       Return i
     End If
     i = i + 1
   End While
   Throw New Exception("Could not find the index in indexOf()")
End Function 'indexOf()

Because I've stored the original strings into array orig, I just iterate through each value in orig looking for the input string and return the current index when I hit the target string. If I make it all the way through array orig but haven't returned an index value then I know the target string isn't in the array and I throw an exception. The Successor() method is in Listing 4.

I begin by writing code that determines if I'm at the last combination element. If you refer back to the screen shot in Figure 1, you'll notice that the last combination element, {"cow", "dog", "elk"}, is the only one that has "cow" in cell data[0], or more generally, the nth-kth original value in cell 0. This relationship is generally true. If the current string combination element instance is the last element then I return Nothing. After that check, I create a result StringCombination object. I pass in the orig data array to seed the new combination element and pass the fixed subset size, k. Then I copy the current combination instance data values to the resulting data array.

The heart of the Successor() method is finding the rightmost-most data string that's "out of order," so to speak. I start by putting an index variable x at the rightmost cell, which is location Me.k-1, and work my way to the left by decrementing x. For example, consider the last combination element, {"cow", "dog", "elk"}. Notice that the original index value of each string (2, 3, 4) is equal to Me.n-Me.k, so when the index of a particular string is not equal to Me.n-Me.k, this is the string that must be changed to its successor string. Then every string to the right of the changed string must be a successor of the string on its left.

The algorithm is a bit tricky but if you trace through a few examples you'll see how it works. The good news is that you won't have to change any of this code if you don't want to.



Page 1 of 2   next page »

About James McCaffrey
Dr. James McCaffrey works for Volt Information Sciences, Inc., where he manages technical training for software engineers working at Microsoft's Redmond, WA campus. He has worked on several Microsoft products, including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.

SYS-CON Brazil News Desk wrote: Sooner or later you'll probably find it useful to be able to create and manipulate combinations programmatically. By far the most useful kinds of combinations are string combinations. A string combination of order (n, k) is a subset of k strings chosen from a set of n strings, where order doesn't matter.
read & respond »
MICROSOFT .NET LATEST STORIES
Peer Networking Series - A Closer Look at PNRP vs. Bonjour/ZeroConf
It seems as though whenever I bring up PNRP and its benefits, I am immediately inundated with a list of questions or comments indicating that Microsoft is re-inventing the wheel and that PNRP has already been implemented before in the form of ZeroConf and, more specifically, Apple's im
Microsoft, Unisys, Yahoo and Vista
Microsoft, which spent $6 billion on aQuantive and was chasing Yahoo for its ads before it came to a dead stop, has been supporting - as in helping write - legislation in New York and Connecticut that would regulate the data that companies like Yahoo and Google collect for targeted adv
AJAX World - Xceed Launches Microsoft Silverlight 2 Control
Xceed launched Xceed Upload for Silverlight, the commercial offering in support of Microsoft's promising new Silverlight technology. The product is available now for purchase or as a fully functional 45-day trial on Xceed's website. Xceed Upload for Silverlight lets developers add uplo
Microsoft To Keynote 4th International Virtualization Conference & Expo
Mike Neil is general manager for virtualization strategy in the Windows Server Division at Microsoft. Mike is focused on the delivery of the Windows virtualization technology, including Windows Server 2008 Hyper-V, Microsoft Hyper-V Server and Virtual PC 2007. Mike also directs the tec
Microsoft Virtualization Takes Management Cross-Platform
Microsoft is making System Center, its central management scheme, natively manage Linux, Unix and VMware virtual servers. The widgetry has always been a Windows-only affair, but now there are betas available showing off Microsoft's cross-platform prowess, important to Microsoft's place
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS
SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
Click to Add our RSS Feeds to the Service of Your Choice:
Google Reader or Homepage Add to My Yahoo! Subscribe with Bloglines Subscribe in NewsGator Online
myFeedster Add to My AOL Subscribe in Rojo Add 'Hugg' to Newsburst from CNET News.com Kinja Digest View Additional SYS-CON Feeds
Publish Your Article! Please send it to editorial(at)sys-con.com!

Advertise on this site! Contact advertising(at)sys-con.com! 201 802-3021

SYS-CON FEATURED WHITEPAPERS

ADS BY GOOGLE
BREAKING NEWS FROM THE WIRES
Actium Partners with GraphOn to Web-Enable Financial Management Platform
GraphOn Corporation (OTCBB:GOJO), a leading worldwide developer of application publishi