YOUR FEEDBACK
Jeremy Geelan wrote: In response to inquiries and suggestions from readers this lexicon has recently...

SYS-CON.TV
TOP MICROSOFT .NET LINKS


Iterators in C#: Nothing Beats the Foreach Loop
Neat Uses Pioneered in Python Can Easily be Applied in C#

One Fish,
Two Fish,
Red Fish,
Blue Fish!
- Dr. Seuss, 1960

My favorite construct in the C# programming language has to be the foreach loop. I get a little jolt of pleasure every time I use it. I can enumerate rows in SQL Queries, nodes in XML documents, ArrayLists, and Hashtables. You name it, I can enumerate it all without mucking around with a single loop conditional.

foreach is super convenient to use, but I don't use it as much as I should in my own programs because writing the code to support foreach is tedious and dull. Personally I'd much rather be off doing some kind of asynchronous bind to a Web service thing than grinding out yet another iterator class. That's a shame because iterators can be used for a lot more than just enumerating data structures.

So I was happy to find that in C# 2.0 the dreary work is largely a relic of the past. The introduction of iterator functions reduces the backing infrastructure for foreach to a single function. This allows me to use foreach for all kinds of new purposes while still giving me time to check out the latest installment of Grand Theft Auto.

Supporting foreach Today
As shown in Figure 1, any time you use foreach the compiler expects the expression you give it to result in an implementation of IEnumerable (or IEnumerable<T> in C# 2.0). The compiler emits code that calls GetEnumerator on the resulting object and a loop that keeps calling IEnumerator.Move Next until MoveNext returns false. Each time through the loop the Current property on the iterator is evaluated and the result is used in the body of the foreach loop.

This means that if you want to give your clients something to enumerate, you must implement an iterator class devoted to feeding items to the client. This class is often a private nested class on some larger entity. The only way for the outside world to get at it is to call GetEnumerator on the container. When they do, you create an instance of your iterator class and pass it some sort of reference to the data it will be enumerating. The iterator class keeps a reference to the "current" item to hand out in the Current property and updates its state every time someone calls MoveNext. Listing 1 shows a simple iterator that can be used to iterate simple integer arrays along with a class that uses it to implement IEnumerable.

This code doesn't look too bad - it's less than 40 lines - but an array is a pretty simple data structure. Other structures (particularly recursive ones such as trees) are difficult to iterate with IEnumerable, so iterators can grow to many hundreds of lines or more. There are various other niggling problems that tend to complicate iterating over data structures. What if the client decides to open several iterators simultaneously? What if it tries to modify the collection in between calls to MoveNext? What if you want to support multiple threads iterating simultaneously? These problems each add code and complexity.

The biggest complaint I have about IEnumerable-style iteration, however, is its lack of flexibility. Very often I would like the client to be able to specify parameters to control how iteration should proceed. Maybe the client would prefer to iterate backwards. Maybe it only wants to see every other item or only the last 10. If the client is iterating tree-like data, it might prefer pre-order, in order, or post-order traversal, depending on what it's doing. If I'm generating the items as I go, there are any number of ways I might want the client to control iteration. This code is largely boilerplate, but it multiplies rapidly because you don't get to specify any parameters to GetEnumerator. This means you have to write an IEnumerable/IEnumerator pair for every option you want to support. This is great for clients, but it's painful for me, so today I'm more likely to just ditch the paradigm entirely and do something else.

Generators
An interesting contrast to this pain can be found in the Icon and Python programming languages. These languages approach iteration with an elegant construct known as "generator functions." Generator functions are written to look like normal functions, but they behave just a bit differently. Generator functions can stop at will and "yield" a value to their caller. When they do this, their execution state is preserved somewhere and the caller gets to run. The next time the function is invoked, the generator picks up where it left off (along with its local variables) and continues running as if nothing had happened. This makes it easy to write a function that yields a series of values to the client. A simple Python generator function looks like the following:


def hello():
	yield  'hello'
	yield  'world'

This function returns an object to the client called a "generator" that can be iterated in a fashion familiar to anyone who has ever used IEnumerator (see Listing 2).

There's a definite family resemblance to IEnumerable/ IEnumerator here, but the beauty is that the generator class itself is generated by the Python compiler based on the function definition.

We can write code in C# today that behaves a lot like hello by using switch statements and gotos. Listing 3 shows a crude translation of the Python program above into C#. Ugh. It works but C# is clearly the ugly stepsister in this family.

Generator Functions in C#
C# 2.0 borrows the concept of generator functions from Python, including a syntax that is pretty much identical. Implementing generator functions in C# 2.0 requires a fairly simple set of steps:

  1. You write a function and declare the return type to be IEnumerable or IEnumerable<T>.
  2. In your function body, you use "yield return" to yield a value to the caller (while preserving your state) or "yield break" to abandon iteration.
  3. The compiler translates your code into a class resembling the code in Listing 3.
  4. Clients consume your function using foreach (or by manually using the IEnumerator returned from the function). When the function body exits (or encounters a yield break statement), MoveNext will return false and the loop will exit.
Using the new syntax in C# 2.0, we can condense the code from Listing 3 into something more closely resembling the Python example (see Listing 4; for space reasons listings 4-8 are online at www.sys-con.com/ dotnet/source.cfm).

As you can see, function SayHello returns IEnumerable<string> instead of just string. The presence of this return type, plus the fact that the method contains the yield return expression, tells the compiler not to emit an ordinary method body. Instead the compiler will create a generator class that will yield the values "hello" and "world" in the Current property after successive calls to MoveNext. If we compile the class and crack it open with Reflector.NET we see that SayHello just creates an instance of the generator class and returns it:


private static IEnumerable<string> SayHello()
{
      return new hello.<GetStrings>d__0(0);
}

The compiler assigns a peculiar name to the generated class: <GetStrings>d__0. However, remember that no one should ever create an instance of this class - they should call SayHello.

We'd expect the generated class to implement the IEnumerable<string> interface. Looking at the class definition with Reflector.NET shows us that it does (see Listing 5).

A look at MoveNext (Listing 6) reveals code that looks suspiciously like the code we wrote in Listing 3.

This is a lot easier than writing all this code by hand, and there's less opportunity for bugs to creep in. The client doesn't see things any differently than it has in the past. It just merrily consumes the items generated by SayHello with a normal foreach loop like the following:


foreach(string s in Greeter.SayHello())
   Console.WriteLine(s);

Taking it to the Next Level
SayHello may be a simple program, but it subtly illustrates the real promise of generator functions. Rather than traverse a data structure, it just generates items and returns them sequentially. This is the real power of iterator functions: actually generating sequences of data. All kinds of problems become simpler with iterator functions because you get to calculate in whatever manner is most convenient for you without the consumer having to get involved. This makes lots of programs easier to write, especially if the items are generated recursively.

For a fun example, let's look at the classic "Tower of Hanoi" puzzle (where you move the disks from one of the three pegs to another one without placing a larger disk on a smaller one). This game can be easily solved with a simple recursive algorithm (see Listing 7).

It would be nice to allow the user to say something like the following:


foreach(Move m in TowerOfHanoi.Solution(10)) ...

However, the recursive nature of the algorithm would make it difficult to implement in C# 1.0. In C# 2.0 it's no problem - we just build a generator that yields a sequence of successive moves to solve the game. Listing 8 shows the technique in action. Notice how the function recursively calls itself to generate partial solutions to the game. The client is insulated from this detail and just iterates over the results with a simple foreach loop.

Conclusion
Iterator functions are a pretty new idea in the mainstream software world, but they've been around for awhile in academic languages like Icon and Python. A lot of really interesting uses have been pioneered in Python, and it's pretty easy to apply these ideas in C#. Poking around Google I turned up everything from Abstract State Machines to lightweight Thread Scheduling. I have a feeling a lot of techniques are still waiting to be discovered, so get out there and start playing with them. I have more samples and fun things at my Web site (www.neocranium.com). I hope to see you there.

About Jason Whittington
Jason Whittington is a consultant and researcher with an irrational fascination with virtual execution environments. When he's not researching or consulting he can often be found delivering courses for DevelopMentor. His Web site can be found at http://staff.develop.com/jasonw.

MICROSOFT .NET LATEST STORIES
Many of today (and tomorrow’s) development projects lend themselves nicely to RIA application patterns. Silverlight offers a compelling RIA development experience that works on Linux, the Mac and windows as well as all major browsers. With HD video, vector based graphics and a rich s...
As a long-time PB developer, I have successfully created or maintained many PB applications for clients and for myself. Since day one, I was impressed with the ease with which applications can be created using PowerBuilder. Although I had been exposed to Visual Studio and other client/...
TeamExpand, a developer of products complimentary to Microsoft Visual Studio Team System (VSTS), announced the commercial version of its TX Chrono, a timesheet tracking application targeted at software development organizations standardizing on the Visual Studio .NET environment. Besid...
China’s new anti-monopoly law went into effect August 1 and China-based Evermore Software, an Office wannabe, would love to haul Microsoft into court. It says it’s collecting evidence and has suggested to MarketWatch that the integration between Office and Windows might be just...
Developer Express announced the immediate availability of its reporting platform for WinForms and ASP.NET – the XtraReports Suite v2008 vol 2. Built and optimized for Visual Studio, the DevExpress suite of reporting components allows software developers to deliver cutting-edge capabi...
SUBSCRIBE TO THE WORLD'S MOST POWERFUL NEWSLETTERS
SUBSCRIBE TO OUR RSS FEEDS & GET YOUR SYS-CON NEWS LIVE!
Click to Add our RSS Feeds to the Service of Your Choice:
Google Reader or Homepage Add to My Yahoo! Subscribe with Bloglines Subscribe in NewsGator Online
myFeedster Add to My AOL Subscribe in Rojo Add 'Hugg' to Newsburst from CNET News.com Kinja Digest View Additional SYS-CON Feeds
Publish Your Article! Please send it to editorial(at)sys-con.com!

Advertise on this site! Contact advertising(at)sys-con.com! 201 802-3021


SYS-CON FEATURED WHITEPAPERS

ADS BY GOOGLE
BREAKING NEWS FROM THE WIRES

The Consumer Electronics Association (CEA)®<...