Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

After the masses of code required for all the permutations of First/Last/etc, DefaultIfEmpty is a bit of a relief.

What is it?

Even this simple operator has two overloads:

public static IEnumerable<TSource> DefaultIfEmpty<TSource>(
    this IEnumerable<TSource> source)

public static IEnumerable<TSource> DefaultIfEmpty<TSource>(
    this IEnumerable<TSource> source,
    TSource defaultValue)

The behaviour is very simple to describe:

  • If the input sequence is empty, the result sequence has a single element in it, the default value. This is default(TSource) for the overload without an extra parameter, or the specified value otherwise.
  • If the input sequence isn't empty, the result sequence is the same as the input sequence
  • The source argument must not be null, and this is validated eagerly
  • The result sequence itself uses deferred execution - the input sequence isn't read at all until the result sequence is read
  • The input sequence is streamed; any values read are yielded immediately; no buffering is used

Dead easy.

What are we going to test?

Despite being relatively late in the day, I decided to test for argument validation - and a good job too, as my first attempt failed to split the implementation into an argument validation method and an iterator block method for the real work! It just shows how easy it is to slip up.

Other than that, I can really only see four cases worth testing:

  • No default value specified, empty input sequence
  • Default value specified, empty input sequence
  • No default value specified, non-empty input sequence
  • Default value specified, non-empty input sequence

So I have tests for all of those, and that's it. I don't have anything testing for streaming, lazy evaluation etc.

Let's implement it!

Despite my reluctance to implement one operator in terms of another elsewhere, this felt like such an obvious case, I figured it would make sense just this once. I even applied DRY to the argument validation aspect. Here's the implementation in all its brief glory:

public static IEnumerable<TSource> DefaultIfEmpty<TSource>(
    this IEnumerable<TSource> source)
{
    // This will perform an appropriate test for source being null first.
    return source.DefaultIfEmpty(default(TSource));
}

public static IEnumerable<TSource> DefaultIfEmpty<TSource>(
    this IEnumerable<TSource> source,
    TSource defaultValue)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    return DefaultIfEmptyImpl(source, defaultValue);
}

private static IEnumerable<TSource> DefaultIfEmptyImpl<TSource>(
    IEnumerable<TSource> source,
    TSource defaultValue)
{
    bool foundAny = false;
    foreach (TSource item in source)
    {
        yield return item;
        foundAny = true;
    }
    if (!foundAny)
    {
        yield return defaultValue;
    }
}

Of course, now that I've said how easy it was, someone will find a bug...

Aside from the use of default(TSource) to call the more complex overload from the simpler one, the only aspect of any interest is the bottom method. It irks me slightly that we're assigning "true" to "foundAny" on every iteration... but the alternative is fairly unpleasant:

private static IEnumerable<TSource> DefaultIfEmptyImpl<TSource>(
    IEnumerable<TSource> source,
    TSource defaultValue)
{
    using (IEnumerator<TSource> iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            yield return defaultValue;
            yield break; // Like a "return"
        }
        yield return iterator.Current;
        while (iterator.MoveNext())
        {
            yield return iterator.Current;
        }
    }
}

This may be slightly more efficient, but it feels a little clumsier. We could get rid of the "yield break" by putting the remainder of the method in an "else" block, but I'm not dead keen on that, either. We could use a do/while loop instead of a simple while loop - that would at least remove the repetition of "yield return iterator.Current" but I'm not really a fan of do/while loops. I use them sufficiently rarely that they cause me more mental effort to read than I really like.

If any readers have suggestions which are significantly nicer than either of the above implementations, I'd be interested to hear them. This feels a little inelegant at the moment. It's far from a readability disaster - it's just not quite neat.

Conclusion

Aside from the slight annoyance at the final lack of elegance, there's not much of interest here. However, we could now implement FirstOrDefault/LastOrDefault/SingleOrDefault using DefaultIfEmpty. For example, here's an implementation of FirstOrDefault:

public static TSource FirstOrDefault<TSource>(
    this IEnumerable<TSource> source)
{
    return source.DefaultIfEmpty().First();
}

public static TSource FirstOrDefault<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    // Can't just use source.DefaultIfEmpty().First(predicate)
    return source.Where(predicate).DefaultIfEmpty().First();
}

Note the comment in the predicated version - the defaulting has to be the very last step after we've applied the predicate... otherwise if we pass in an empty sequence and a predicate which doesn't match with default(TSource), we'll get an exception instead of the default value. The other two ...OrDefault operators could be implemented in the same way, of course. (I haven't done so yet, but the above code is in source control.)

I'm currently unsure what I'll implement next. I'll see whether I get inspired by any particular method in the morning.

Published Wed, Dec 29 2010 21:59 by skeet
Filed under: , ,

Comments

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

When you said "Despite my reluctance to implement one operator in terms of another elsewhere, this felt like such an obvious case, I figured it would make sense just this once."  I thought you were going to do something like this (which I think is a little cleaner):

private static IEnumerable<TSource> DefaultIfEmptyImpl<TSource>(

   IEnumerable<TSource> source,

   TSource defaultValue)

{

   if (source.Any())

   {

       foreach (TSource item in source)

       {

           yield return item;

       }

   }

   else

   {

       yield return defaultValue;

   }

}

Wednesday, December 29, 2010 4:46 PM by Michael

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

C# really needs an operator like F#'s yield! ("yield bang"), which could be used on an IEnumerable<T> or IEnumerator<T>. The keyword could be something like "yield all". It would make the implementation much cleaner...

   using (IEnumerator<TSource> iterator = source.GetEnumerator())

   {

       if (iterator.MoveNext())

       {

           yield return iterator.Current;

           yield all iterator;

       }

       else

       {

           yield return defaultValue;

       }

   }

Wednesday, December 29, 2010 4:52 PM by Thomas Levesque

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

@Michael: That ends up fetching the iterator for source twice, if it's non-empty. I think that's a bad idea - I treat it as a general principle to only read from a sequence once.

Wednesday, December 29, 2010 5:13 PM by skeet

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

How about:

   using (IEnumerator<TSource> iterator = source.GetEnumerator())

   {

       if (iterator.MoveNext())

       {

           do

           {

               yield return iterator.Current;

           }

           while (iterator.MoveNext());

       }

       else

       {

           yield return defaultValue;

       }

   }

Seems pretty clean to me...

Wednesday, December 29, 2010 11:00 PM by John Gietzen

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

Jon, you could attempt to fetch the first item in the sequence, then compare and determine, whether to continue to yield or return the default value (I'm off to work right now, but I can definitely see an issue with struct vs. class types - will check this later, but it's just an idea).

Personally, I don't find the second alternative clumsy. Reassigning for each iteration feels clumsy imo.

Btw. would be really interesting if you could continue the LINQ series with additional helper extensions (I'll come up with a list later), but there are daily requirements which the default BCL LINQ extensions doesn't support in a single method. Again, more later.

Thursday, December 30, 2010 2:13 AM by Anders Borum

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

@John: I mention something similar in the post... I'm not keen on do/while in general. Possibly just a personal habit.

@Anders: If MoveNext() returns false, it would be a bug to call Current *at all*. As for continuing the series afterwards - maybe. I've got some extensions at morelinq.googlecode.com already. I suspect everyone will be sick of LINQ by the time I'm finished :)

Thursday, December 30, 2010 2:21 AM by skeet

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

How about spilting into to methods on that takes care of a potential empty sequence and one that "converts

" an IEnumerator<T> to IEnumerable<T>. The latter could be an extension method in it's own rights since it's applicable in more cases than this (However in the below implementation the contract is not validated but simply relied upon. Ie. that there has to be a valid .Current when entering the method):

private static IEnumerable<TSource> DefaultIfEmptyImpl<TSource>(

   IEnumerable<TSource> source,

   TSource defaultValue)

{

   using (IEnumerator<TSource> iterator = source.GetEnumerator())

   {

       if (!iterator.MoveNext())

       {

           return new [] {defaultValue};

       }

       return iterator.Rest();

   }

}

public static IEnumerable<T> Rest<T>(this IEnumerator<T> iterator)

{

 yield return iterator.Current;

 while (iterator.MoveNext())

 {

    yield return iterator.Current;

 }

}

Thursday, December 30, 2010 3:32 AM by runefs

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

@runefs: Nope, that wouldn't work - because then it's calling GetEnumerator and MoveNext eagerly... you're breaking deferred execution.

Thursday, December 30, 2010 5:33 AM by skeet

# re: Reimplementing LINQ to Objects: Part 12 - DefaultIfEmpty

Very nice series.

For the following operators, I would be very curious to see something like Distinct, Intersect, Union or OrderBy :)

Thursday, December 30, 2010 7:09 AM by Philippe