Are iterators fundamentally flawed ?
Posted
Sat, Aug 2 2008 13:14
by
bill
It seems apparent that with computers as we currently know them, processors are now set to scale out not up.. that is, clock speeds aren’t rapidly growing, and certainly not doubling every year or two, instead the number of processors on a chip is. As a case in evidence, 2 or 3 years ago, my PC’s were all single CPU, then a bit over a year ago I bought my new desktop PC and it was a dual core. And as of a few days ago I swapped that dual core out for a quad core. For me, the number of cores/CPU’s is clearly doubling every year or two, but clock speed itself isn’t that much faster.
So what that means for computing is the emphasis is on parallel or concurrent computing. As such, we need to look at the fundamentals and see if they are designed for the task. Iterators don’t look like they are.
If you consider an iterator primary task of getting the next element, then look at how it does it, you should observe a major design flaw… the operation is not atomic. An iterator’s implementation is IEnumerator or IEnumerator(Of T), and IEnumerator require you to first call MoveNext then read the Current property. If you allow multiple threads to use the same IEnumerator, then you could get a call sequence such as MoveNext, MoveNext, Current, Current, which would skip one item and repeat the next one. So IEnumerator is not well designed for threading. This is a known design limitation, but rather than address that, languages like C# have implemented their iterators to be a different iterator per thread. That is they not only recognise the limitation but they also enforce it.
Going forward, perhaps we need a new IEnumerator class. In the good bad old days that would be called IEnumerator2(Of T). IEnumerator2 would make iterating a single method call that would return a Tuple(Of T, Boolean). It could extend IEnumerator, but it would be difficult to enforce thread safety to the IEnumerator interface. The problem that language like C# would then face is do they risk breaking existing code by returning the same enumerator to different threads ? Well no they would have to provide an IEnumerable2.GetEnumerator as well, so it would be possible to have the same class used for both single instance per thread and shared class across threads. As such, then the code that calls the iteration, the For Each loop, would then have to indicate whether it wants to do so using parallel or single threaded.. this would probably surface in the languages such as the For Parallel instruction on a loop.
So maybe iterators aren’t fundamentally flawed, they just currently have a lot of limitations which can and hopefully will be addressed as we venture more into the world of multi core :)