Reimplementing LINQ to Objects: Part 5 - Empty
Continuing with the non-extension methods, it's time for possibly the simplest LINQ operator around: "Empty".
What is it?
"Empty" is a generic, static method with just a single signature and no parameters:
public static IEnumerable<TResult> Empty<TResult>()
It returns an empty sequence of the appropriate type. That's all it does.
There's only one bit of interesting behaviour: Empty is documented to cache an empty sequence. In other words, it returns a reference to the same empty sequence every time you call it (for the same type argument, of course).
What are we going to test?
There are really only two things we can test here:
- The returned sequence is empty
- The returned sequence is cached on a per type argument basis
I'm using the same approach as for Range to call the static method, but this time with an alias of EmptyClass. Here are the tests:
[Test]
public void EmptyContainsNoElements()
{
using (var empty = EmptyClass.Empty<int>().GetEnumerator())
{
Assert.IsFalse(empty.MoveNext());
}
}
[Test]
public void EmptyIsASingletonPerElementType()
{
Assert.AreSame(EmptyClass.Empty<int>(), EmptyClass.Empty<int>());
Assert.AreSame(EmptyClass.Empty<long>(), EmptyClass.Empty<long>());
Assert.AreSame(EmptyClass.Empty<string>(), EmptyClass.Empty<string>());
Assert.AreSame(EmptyClass.Empty<object>(), EmptyClass.Empty<object>());
Assert.AreNotSame(EmptyClass.Empty<long>(), EmptyClass.Empty<int>());
Assert.AreNotSame(EmptyClass.Empty<string>(), EmptyClass.Empty<object>());
}
Of course, that doesn't verify that the cache isn't per-thread, or something like that... but it'll do.
Let's implement it!
The implementation is actually slightly more interesting than the description so far may suggest. If it weren't for the caching aspect, we could just implement it like this:
public static IEnumerable<TResult> Empty<TResult>()
{
yield break;
}
... but we want to obey the (somewhat vaguely) documented caching aspect too. It's not really hard, in the end. There's a very handy fact that we can use: empty arrays are immutable. Arrays always have a fixed size, but normally there's no way of making an array read-only... you can always change the value of any element. But an empty array doesn't have any elements, so there's nothing to change. So, we can reuse the same array over and over again, returning it directly to the caller... but only if we have an empty array of the right type.
At this point you may be expecting a Dictionary<Type, Array> or something similar... but there's another useful trick we can take advantage of. If you need a per-type cache and the type is being specific as a type argument, you can use static variables in a generic class, because each constructed type will have a distinct set of static variables.
Unfortunately, Empty is a generic method rather than a non-generic method in a generic type... so we've got to create a separate generic type to act as our cache for the empty array. That's easy to do though, and the CLR takes care of initializing the type in a thread-safe way, too. So our final implementation looks like this:
public static IEnumerable<TResult> Empty<TResult>()
{
return EmptyHolder<TResult>.Array;
}
private static class EmptyHolder<T>
{
internal static readonly T[] Array = new T[0];
}
That obeys all the caching we need, and is really simple in terms of lines of code... but it does mean you need to understand how generics work in .NET reasonably well. In some ways this is the opposite of the situation in the previous post - this is a sneaky implementation instead of the slower but arguably simpler dictionary-based one. In this case, I'm happy with the trade-off, because once you do understand how generic types and static variables work, this is simple code. It's a case where simplicity is in the eye of the beholder.
Conclusion
So, that's Empty. The next operator - Repeat - is likely to be even simpler, although it'll have to be another split implementation...
Addendum
Due to the minor revolt over returning an array (which I still think is fine), here's an alternative implementation:
public static IEnumerable<TResult> Empty<TResult>()
{
return EmptyEnumerable<TResult>.Instance;
}
#if AVOID_RETURNING_ARRAYS
private class EmptyEnumerable<T> : IEnumerable<T>, IEnumerator<T>
{
internal static IEnumerable<T> Instance = new EmptyEnumerable<T>();
private EmptyEnumerable()
{
}
public IEnumerator<T> GetEnumerator()
{
return this;
}
IEnumerator IEnumerable.GetEnumerator()
{
return this;
}
public T Current
{
get { throw new InvalidOperationException(); }
}
object IEnumerator.Current
{
get { throw new InvalidOperationException(); }
}
public void Dispose()
{
}
public bool MoveNext()
{
return false;
}
public void Reset()
{
}
}
#else
private static class EmptyEnumerable<T>
{
internal static readonly T[] Instance = new T[0];
}
#endif
Hopefully now everyone can build a version they're happy with :)