Paulo Morgado

.NET Development & Architecture

LINQ: Enhancing Distinct With The PredicateEqualityComparer

Today I was writing a LINQ query and I needed to select distinct values based on a comparison criterion.

Fortunately, LINQ’s Distinct method allows an equality comparer to be supplied. Unfortunately, this sometimes means having to write a custom equality comparer.

Because I was going to need more than one equality comparer for this set of tools I was building, I decided to build a generic equality comparer that would just take a custom predicate. Something like this:

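using System;
using System.Collections.Generic;

public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
    {
        if (predicate == null)
        {
            throw new ArgumentNullException("predicate");
        }

        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        // Two nulls are equal; a null is never equal to a non-null value.
        if (x != null)
        {
            return ((y != null) && this.predicate(x, y));
        }

        if (y != null)
        {
            return false;
        }

        return true;
    }

    public override int GetHashCode(T obj)
    {
        // Always return the same value to force the call to IEqualityComparer<T>.Equals.
        // The price is that every element lands in the same hash bucket, so Distinct
        // compares each element with every distinct element found so far.
        return 0;
    }
}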

Now I can write code like this:

.Distinct(new PredicateEqualityComparer<Item>((x, y) => x.Field == y.Field))

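But I felt that I’d lost all the conciseness and expressiveness of LINQ, and it doesn’t support anonymous types, because the comparer’s type argument has to be written out and an anonymous type has no name to write. So I came up with another Distinct extension method: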

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
{
    return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
}

And the query is now written like this:

.Distinct((x, y) => x.Field == y.Field)

Looks a lot better, doesn’t it? And it works with anonymous types.
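To see the pieces working together, here is a minimal, self-contained sketch that applies the predicate overload to a sequence of anonymous types. The class names PredicateDistinctExtensions and Demo and the sample data are assumptions made for illustration; only PredicateEqualityComparer<T> and the Distinct overload come from the post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null value.
        if (x == null || y == null) return x == null && y == null;
        return this.predicate(x, y);
    }

    // Constant hash code, so Distinct always falls back to Equals.
    public override int GetHashCode(T obj) { return 0; }
}

// Hypothetical host class for the extension method from the post.
public static class PredicateDistinctExtensions
{
    public static IEnumerable<TSource> Distinct<TSource>(
        this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
    {
        return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
    }
}

public static class Demo
{
    public static List<string> DistinctCities()
    {
        // Illustrative data; the element type is anonymous, so it could not
        // be named in "new PredicateEqualityComparer<...>(...)".
        var people = new[]
        {
            new { Name = "Ana", City = "Lisboa" },
            new { Name = "Rui", City = "Porto" },
            new { Name = "Eva", City = "Lisboa" },
        };

        // TSource is inferred by the compiler from the sequence.
        return people
            .Distinct((x, y) => x.City == y.City)
            .Select(p => p.Name)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", DistinctCities()));
    }
}
```

Type inference is what makes the anonymous type case work: the compiler fills in TSource, so the unnameable type never has to appear in the source code.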

Update: I had accidentally published the wrong version of the IEqualityComparer<T>.Equals method.

Published Thu, Apr 8 2010 2:18 by Paulo Morgado


Comments

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Thursday, April 08, 2010 8:43 PM

Using the predicate greatly improves the readability, conciseness and expressiveness of the queries, but it can be even better. Most of the time, we don’t want to provide a comparison method, just to extract the comparison key for the elements.

So, I developed a SelectorEqualityComparer that takes a method that extracts the key value for each element.

msmvps.com/.../linq-enhancing-distinct-with-the-selectorequalitycomparer.aspx

Paulo Morgado
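For readers who do not follow the link, a key-selector comparer could look roughly like the sketch below. This is an assumption about its shape, based only on the description above and on the constructor call and the IEquatable<TKey> constraint that appear later in this thread; the real implementation is in the linked post. SelectorDemo and its sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not the author's published code.
public class SelectorEqualityComparer<TSource, TKey> : EqualityComparer<TSource>
    where TKey : IEquatable<TKey>
{
    private readonly Func<TSource, TKey> selector;

    public SelectorEqualityComparer(Func<TSource, TKey> selector)
    {
        this.selector = selector;
    }

    public override bool Equals(TSource x, TSource y)
    {
        if (x == null || y == null) return x == null && y == null;

        // Elements are equal when their extracted keys are equal.
        return EqualityComparer<TKey>.Default.Equals(this.selector(x), this.selector(y));
    }

    public override int GetHashCode(TSource obj)
    {
        if (obj == null) return 0;

        // Unlike the predicate version, a real hash code is available
        // because the key itself can be hashed.
        TKey key = this.selector(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

public static class SelectorDemo
{
    public static int CountDistinctLengths()
    {
        var words = new[] { "apple", "pear", "grape", "plum" };

        // Words are considered equal when they have the same length.
        return words
            .Distinct(new SelectorEqualityComparer<string, int>(w => w.Length))
            .Count();
    }
}
```

Because the key supplies the hash code, Distinct can bucket the elements instead of comparing every pair.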

# Using a Delegate/Anonymous Method with the Linq Distinct Extension@ Wednesday, September 08, 2010 12:34 PM

Using a Delegate/Anonymous Method with the Linq Distinct Extension

IT Ramblings

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Friday, December 02, 2011 1:59 PM

Nice extension. I had to do a distinct on 3 fields with different validations and still keep all the fields of my IQueryable in the result.

A real lifesaver!

Cheers

Diego Costa

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Friday, December 02, 2011 2:02 PM

Idk why but I sent you my comment in Portuguese.. no prob I guess.

Well, it helped me a lot. I had to distinct all my IQueryable items by 3 fields, and the result had to include all the fields.

With your extension I could just distinct on the fields I needed without having to create a class or a new list with only the fields to distinct.

Thanks!

Diego Costa

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Sunday, December 04, 2011 6:44 PM

Glad to have helped, Diego. :)

Paulo Morgado

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Monday, June 11, 2012 4:51 AM

Great Post!

Can you post a small example of how to compare them with two or three fields?

Yves

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Monday, June 11, 2012 5:08 AM

Yves,

Is this what you're looking for?

.Distinct((x, y) => (x.Field1 == y.Field1) && (x.Field2 == y.Field2))

Paulo Morgado

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Monday, June 11, 2012 9:26 AM

Yeah, exactly that.

The point is that I couldn't make it work, so I wanted to make sure I was calling the Distinct function appropriately. The answer is taking too long (I stopped it after 5 minutes, when it usually takes 10 seconds).

Can you please help me out? I must be doing something really wrong and I can't see it!

Here is my problematic code:

var query =
    (from c in customers
     from ad in addresses
     where (ad.customer_id == c.customer_id)
     select new
     {
         c.customer_id,
         c.customer_name,
         address = ad.address_street + " " + ad.address_city,
         c.customer_birthday
     })
    .Distinct((x, y) => (x.customer_id == y.customer_id) && (x.customer_name == y.customer_name) && (x.address == y.address)) // PredicateEqualityComparer version
    .OrderByDescending(c => c.customer_birthday).Take(10);

I didn't repeat your class here, but I placed the static Distinct method inside this static class:

public static class Extensions
{
    public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
    {
        return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
    }

    public static IEnumerable<TSource> Distinct<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
        where TKey : IEquatable<TKey>
    {
        return source.Distinct(new SelectorEqualityComparer<TSource, TKey>(selector));
    }
}

Thanks again!

Yves

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Monday, June 11, 2012 12:18 PM

Yves,

How many items does your enumerable have?

You can always do something like this:

.Distinct(
    (x, y) =>
    {
        Debug.Write("something meaningful");
        return (x.Field1 == y.Field1) && (x.Field2 == y.Field2);
    })

If you can calculate a hash code for your comparison, you would probably be better off with a custom implementation of IEqualityComparer<TSource>. There are some optimizations around the use of the hash code.

Paulo Morgado
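As an illustration of that suggestion, a hand-written comparer for two fields might look like the sketch below. The Item type, its Field1 and Field2 members and the ItemFieldsComparer name are hypothetical, and the hash combination is one common choice among many.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type, mirroring the Field1/Field2 names used above.
public sealed class Item
{
    public string Field1 { get; set; }
    public int Field2 { get; set; }
}

public sealed class ItemFieldsComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Field1 == y.Field1 && x.Field2 == y.Field2;
    }

    public int GetHashCode(Item obj)
    {
        if (obj == null) return 0;

        // Combine the hash codes of exactly the fields that Equals compares,
        // so that equal items always produce equal hash codes.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (obj.Field1 == null ? 0 : obj.Field1.GetHashCode());
            hash = hash * 31 + obj.Field2.GetHashCode();
            return hash;
        }
    }
}

public static class HashDemo
{
    public static int CountDistinct()
    {
        var items = new[]
        {
            new Item { Field1 = "A", Field2 = 1 },
            new Item { Field1 = "A", Field2 = 1 },
            new Item { Field1 = "A", Field2 = 2 },
            new Item { Field1 = "B", Field2 = 1 },
        };

        return items.Distinct(new ItemFieldsComparer()).Count();
    }
}
```

With a meaningful hash code, Equals only runs for elements that land in the same bucket, instead of for every pair as happens when GetHashCode always returns 0.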

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Tuesday, June 12, 2012 3:02 AM

Uff..

Man, that definitely killed my VS debugger!

Well, too bad I cannot use your solution. I love it because it's pretty elegant and I wanted to use it in every Distinct of my applications, but the use of Distinct itself is too unhealthy in my case.

So, if you need to compare multiple fields in big tables (thousands of records), you are better off using group by than Distinct with an equality comparer!

You can read why here:

imar.spaanjaars.com/.../using-grouping-instead-of-distinct-in-entity-framework-to-optimize-performance

Thanks for your help! It's indeed such an elegant solution!

Yves
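The grouping approach mentioned in this comment can be sketched as follows, here against an in-memory sequence. CustomerRow, GroupingDemo and the sample rows are hypothetical stand-ins for the query above, and whether a particular LINQ provider translates this shape to SQL depends on the provider and its version, which this sketch does not attempt to show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the projected query result.
public sealed class CustomerRow
{
    public int CustomerId { get; set; }
    public string CustomerName { get; set; }
    public string Address { get; set; }
    public DateTime Birthday { get; set; }
}

public static class GroupingDemo
{
    public static List<CustomerRow> DistinctByGrouping(IEnumerable<CustomerRow> rows)
    {
        return rows
            // Anonymous types compare and hash by value, so the key groups
            // rows that agree on all three fields.
            .GroupBy(r => new { r.CustomerId, r.CustomerName, r.Address })
            // Keep one complete row per group, with all of its fields.
            .Select(g => g.First())
            .OrderByDescending(r => r.Birthday)
            .Take(10)
            .ToList();
    }

    public static List<CustomerRow> SampleRows()
    {
        return new List<CustomerRow>
        {
            new CustomerRow { CustomerId = 1, CustomerName = "Ana", Address = "Rua A Lisboa", Birthday = new DateTime(1980, 1, 1) },
            new CustomerRow { CustomerId = 1, CustomerName = "Ana", Address = "Rua A Lisboa", Birthday = new DateTime(1980, 1, 1) },
            new CustomerRow { CustomerId = 2, CustomerName = "Rui", Address = "Rua B Porto", Birthday = new DateTime(1975, 5, 5) },
        };
    }
}
```

Grouping on a key that hashes by value avoids the pairwise comparisons that a comparer with a constant hash code forces.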

# re: LINQ: Enhancing Distinct With The PredicateEqualityComparer@ Tuesday, June 12, 2012 3:48 AM

Oh! You're working with SQL Server.

It's not that grouping is better than distinct per se. It's that distinct with an equality comparer cannot be translated to SQL.

You need to be very careful when using these operators. Once you use an IEnumerable operator in a sequence of IQueryable operators, you'll retrieve all the data from the queryable source and end up with an IEnumerable.

Meanwhile, I've changed the implementation of the PredicateEqualityComparer to accept a hash function. You can get it here: http://pmlinq.codeplex.com/

Paulo Morgado
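The switch from IQueryable to IEnumerable described in this comment can be observed with a small in-memory sketch. It uses AsQueryable over an array instead of a database, and PredicateComparer, PredicateDistinct and QueryableDemo are hypothetical names that mirror the shape of the post's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the post's comparer.
public sealed class PredicateComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateComparer(Func<T, T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        if (x == null || y == null) return x == null && y == null;
        return this.predicate(x, y);
    }

    public override int GetHashCode(T obj) { return 0; }
}

public static class PredicateDistinct
{
    // Declared on IEnumerable<T>, like the extension method in the post.
    public static IEnumerable<T> Distinct<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        return source.Distinct(new PredicateComparer<T>(predicate));
    }
}

public static class QueryableDemo
{
    public static bool[] Run()
    {
        IQueryable<int> source = new[] { 1, 2, 2, 3 }.AsQueryable();

        // Where binds to Queryable.Where, so the result is still a query
        // that a provider could translate.
        object filtered = source.Where(n => n > 1);

        // The predicate overload only exists for IEnumerable<T>, so from
        // here on the work happens in memory.
        object distinct = source.Where(n => n > 1).Distinct((x, y) => x == y);

        return new[] { filtered is IQueryable<int>, distinct is IQueryable<int> };
    }
}
```

In this sketch Run returns true for the filtered query and false for the distinct one. Against a database provider, everything after that point would operate on rows that have already been fetched.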
