The "Yield" Keyword in C# (AKA Execution Control Hot Potato)
I first read about this topic a while back when I saw a C# code sample containing an unfamiliar keyword: yield. Just today I saw another code sample using this keyword, and realized I had forgotten everything I had learned. As I Googled the topic again, I happened across the same article I read back then, a terrific blog post by Joshua Flanagan. Here is a link to Joshua’s simple introduction to the yield C# keyword:
I would also suggest the following MSDN article:
I’ll give you the gist of this feature, which arrived with C# 2.0 (the .NET 2.0 release). If you have a method that returns a sequence of items and the consumers of your method only need read-only, foreach-style access to it, then it is good practice to have your method return the general-purpose IEnumerable interface (or its generic form, IEnumerable<T>) rather than arbitrarily picking a concrete collection type and forcing it on the consumers of your code. In C#, when a method's return type is IEnumerable, IEnumerable<T>, IEnumerator, or IEnumerator<T>, you are allowed to use the yield keyword in order to transform your method into what Microsoft calls an "iterator block". Here is a quote from the MSDN article I mentioned above to explain this phrase and the yield keyword:
"The yield keyword signals to the compiler that the method in which it appears is an iterator block. The compiler generates a class to implement the behavior that is expressed in the iterator block. In the iterator block, the yield keyword is used together with the return keyword to provide a value to the enumerator object. This is the value that is returned, for example, in each loop of a foreach statement. The yield keyword is also used with break to signal the end of iteration."
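To make the quote concrete, here is a minimal sketch of my own (not taken from either article). The class and method names, YieldBasics and TakeUntilNegative, are made up for illustration. It shows yield return handing back one value at a time and yield break ending the iteration early.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBasics
{
    // Hypothetical example method: passes values through until it meets a negative one.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n < 0)
            {
                yield break;    // signals the end of the iteration
            }
            yield return n;     // provides one value to the consuming foreach
        }
    }

    public static void Main()
    {
        // Prints 3, 1 and 4 on separate lines; -1 stops the iteration, so 5 is never reached.
        foreach (int n in TakeUntilNegative(new[] { 3, 1, 4, -1, 5 }))
        {
            Console.WriteLine(n);
        }
    }
}
```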
You are essentially building an enumerator (or iterator). Calling your method does not run its body right away; it just hands back an object the compiler generated for you. When the foreach loop in the consuming code asks for its first item, the body of your method runs until it reaches a yield return statement. At this point, execution pauses and a single item is handed back to the consuming foreach loop for immediate logic execution by the body of the loop. On the next iteration of the consuming foreach loop, your method is not called again from the top. Instead, it resumes right after the yield return where it left off, with its local variables intact, and runs until the next yield return hands back another single item (or until the method ends, which ends the loop).
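Here is a small sketch of that back-and-forth, assuming nothing beyond the standard library. HotPotatoTrace and its Log list are names I invented so the order of events can be recorded and printed.

```csharp
using System;
using System.Collections.Generic;

public static class HotPotatoTrace
{
    // Hypothetical helper: records the order in which the two sides run.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        Log.Add("iterator: started");
        yield return 1;
        Log.Add("iterator: resumed after 1");
        yield return 2;
        Log.Add("iterator: resumed after 2, finishing");
    }

    public static void Main()
    {
        foreach (int n in Numbers())
        {
            Log.Add("loop body: got " + n);
        }

        // Prints, in order:
        //   iterator: started
        //   loop body: got 1
        //   iterator: resumed after 1
        //   loop body: got 2
        //   iterator: resumed after 2, finishing
        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

Notice how the iterator and loop body lines interleave: control passes back and forth once per item rather than the iterator running to completion first.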
This game of execution control Hot Potato has its benefits, in that you can stop the flip-flop of execution control when the music has stopped. In this analogy, I am comparing the "music stopping" to your code satisfying some logic that makes the rest of your foreach loop iterations unnecessary (perhaps you have found the one item in the collection you were looking for by meeting the sufficient requirements of an if statement). The benefit is that when you break out of the consuming foreach loop before completing the iterations, you prevent wasteful code execution and memory use. This feature prevents one piece of code from fully populating a potentially large collection of items, just to have another consuming piece of code iterate partially through the list and then prematurely exit. Consider the savings especially when adding each item to the collection may be computationally intensive or if each item in the collection is itself a very large data structure. In the Hot Potato analogy, this would be as if the kids in the circle who didn’t touch the "hot potato" never existed in the game, saving floor space and unnecessary tosses to people you don’t really want to target for humiliating removal from the game.
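A rough sketch of the savings, with made-up names (EarlyExit, Squares, FirstSquareAbove) and a simple counter standing in for "expensive work". The iterator is willing to produce a million items, but the consumer stops as soon as it finds what it wants.

```csharp
using System;
using System.Collections.Generic;

public static class EarlyExit
{
    // Counts how many items the iterator actually produced (for illustration only).
    public static int Produced = 0;

    // Stand-in for a sequence whose items are costly to compute.
    public static IEnumerable<int> Squares(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Produced++;
            yield return i * i;
        }
    }

    // Returns the first square greater than the threshold, or -1 if there is none.
    public static int FirstSquareAbove(int threshold)
    {
        foreach (int square in Squares(1000000))
        {
            if (square > threshold)
            {
                return square;   // the music stops: the remaining items are never computed
            }
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(FirstSquareAbove(50));   // prints 64
        Console.WriteLine(Produced);               // prints 8, not 1000000
    }
}
```

Had Squares filled a List<int> and returned it, all one million squares would have been computed and stored before the loop ever looked at the first one.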
For good sample code snippets of using the yield keyword, see the links provided above.
I hope to identify situations in the near future where I could benefit from using this C# keyword. Hopefully then I won’t forget a year down the road from now and then have to re-search the internet for Joshua’s blog post and the MSDN article to remind myself what I already learned.
FOLLOWUP (03/26/09): So I’m not sure how I missed it, but it took reading a very recent article by Scott Mitchell of 4GuysFromRolla.com to understand the real reason the yield keyword was introduced. When the keyword arrived with C# 2.0 (.NET 2.0), its biggest payoff was in implementing the IEnumerable interface, which until then typically required you to write a separate (often nested) class that implemented IEnumerator and tracked the current position by hand. In many cases, it allowed you to take about 50 lines of code and turn it into 2 lines of code using the yield keyword. The later introduction of LINQ brought heavy reliance on this kind of lazy, item-at-a-time enumeration in order to pipeline query operators together. Scott Mitchell takes you through this whole journey, tying all of the loose ends together to create a complete history of this keyword and how it’s most commonly utilized. Please visit the following article to get the whole picture:
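To give a feel for the difference, here is my own side-by-side sketch (not code from Scott's article). All type and method names here (CountdownByHand, CountdownWithYield, Pipeline.Evens) are invented for illustration. The first class implements a countdown the pre-yield way, the second does the same with yield, and Evens is a tiny filter in the spirit of a LINQ operator that can be chained onto either one.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// The old way: an enumerable plus a hand-written enumerator class.
public class CountdownByHand : IEnumerable<int>
{
    private readonly int _start;
    public CountdownByHand(int start) { _start = start; }

    public IEnumerator<int> GetEnumerator() { return new Enumerator(_start); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private class Enumerator : IEnumerator<int>
    {
        private readonly int _start;
        private int _current;

        public Enumerator(int start) { _start = start; _current = start + 1; }

        public int Current { get { return _current; } }
        object IEnumerator.Current { get { return _current; } }

        public bool MoveNext()
        {
            if (_current <= 1) return false;   // nothing left to count down
            _current--;
            return true;
        }

        public void Reset() { _current = _start + 1; }
        public void Dispose() { }
    }
}

// The yield way: the compiler generates the enumerator class for us.
public class CountdownWithYield : IEnumerable<int>
{
    private readonly int _start;
    public CountdownWithYield(int start) { _start = start; }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = _start; i >= 1; i--) yield return i;
    }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Pipeline
{
    // Hypothetical filter operator: lazily passes along only the even values.
    public static IEnumerable<int> Evens(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n % 2 == 0) yield return n;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", new CountdownByHand(3)));                  // 3,2,1
        Console.WriteLine(string.Join(",", new CountdownWithYield(3)));               // 3,2,1
        Console.WriteLine(string.Join(",", Evens(new CountdownWithYield(6))));        // 6,4,2
    }
}
```

The two countdown classes behave the same from the consumer's point of view; the yield version simply leaves the bookkeeping (current position, MoveNext, Current) to the compiler.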