I am sure that the Parallel Framework (PFx) will become more and more important in future development.
However, demos are often out of step with reality. Sure, a fractal demo on a 1000-core machine is ideal for showing off parallelism, but in real life I have never needed to compute a fractal, and I don’t think I am the only one.
So my idea was to build an example inspired by my daily work: an example around… the Entity Framework.
Imagine a database with a Customers table that potentially holds a lot of rows (100,000 in my example, which is quite a lot), and a WPF application with a DataGrid to show the customers. Note that the WPF DataGrid is better suited to this case than the WinForms DataGridView because the WPF DataGrid loads only the visible items.
The ObjectQuery class implements IListSource, which means we can use it as the ItemsSource (the Customers property is of type ObjectSet<Customer>, which inherits from ObjectQuery). Then we add some controls to filter the items (for example, one TextBox per string column). We won’t use the MVVM pattern, to avoid complicating this sample. To filter the items, we can set the DataGrid ItemsSource to Customers.Where(…) instead of Customers. The result of the Where is also an ObjectQuery instance, so no problem there. But… this implies a new query against the database. We should avoid executing a new SQL query each time the value of one of the filter controls changes. After all, we already loaded all the customers when the window loaded.
In addition to the useless query, we have another issue: once we change the filter, new customers that haven’t been persisted yet no longer appear in the DataGrid, and we can’t get them back into the grid until we call the context’s SaveChanges method. Conversely, deleted customers stay in the grid as long as the changes aren’t persisted. The reason is that EF queries only return what is in the database. To fix it, we could systematically persist every add and delete and wrap everything in a transaction so that we can still cancel the modifications, but this solution isn’t great.
We need to work with the ObjectContext cache once the entities are loaded. For this, we can use the ObjectStateManager property but… with the ObjectStateManager, we only get an IEnumerable<Customer>, so additions and deletions won’t be handled automatically.
To fix it, I made my own class:
public class ObjectSetDataSource<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable, INotifyCollectionChanged where T : class, new()
{
    […]
    private IEnumerable<T> AllEntities
    {
        get
        {
            if (_isLoaded == false)
            {
                // First pass: enumerate the ObjectSet, which queries the DB.
                foreach (var e in ObjectSet)
                    yield return e;
                _isLoaded = true;
            }
            else
            {
                // Later passes: read the entities from the context cache only.
                foreach (var e in ObjectSet.Context.ObjectStateManager.GetObjectStateEntries(EntityState.Added | EntityState.Modified | EntityState.Unchanged).Select(ose => ose.Entity).OfType<T>())
                    yield return e;
            }
        }
    }
    private void CreateEntitiesList()
    {
        _entities = null;
        if (_allEntities == null)
            _allEntities = AllEntities;
        if (Predicate == null)
        {
            _entities = _allEntities.ToList();
            return;
        }
        _entities = _allEntities.Where(Predicate).ToList();
    }
    public Func<T, bool> Predicate
    {
        get { return _predicate; }
        set
        {
            _predicate = value;
            CreateEntitiesList();
            OnCollectionChanged(NotifyCollectionChangedAction.Reset);
        }
    }
}
OK, it works. In the window, we can handle the TextChanged events like this:
private void lastNameTB_TextChanged(object sender, TextChangedEventArgs e)
{
    Filter();
}
private void firstNameTB_TextChanged(object sender, TextChangedEventArgs e)
{
    Filter();
}
private void Filter()
{
    string lastName = lastNameTB.Text;
    string firstName = firstNameTB.Text;
    _customers.Predicate = c =>
    {
        // Artificial delay so the effect is easier to observe.
        int i = 0;
        while (i < 100000) i++;
        return c.LastName.StartsWith(lastName) && c.FirstName.StartsWith(firstName);
    };
}
Note the loop, which slows the predicate down so that we can see better what happens.
We will quickly run into performance issues. Imagine that we want all the customers whose last name starts with “MEZ”.
What happens?
First issue: when we type MEZ, we probably type M, then E, and then Z. But with our sequential process, we get the list of customers whose last name starts with M, then with ME, and only then with MEZ. The problem is that we can’t cancel the processing: while it isn’t finished, the next TextChanged event isn’t raised. The second issue is that our code uses only one core of the processor, and so only 50% of the CPU power on my dual core:
If we parallelize our code, we will fix the second issue:
To use 100% of the CPU, we should execute our LINQ query on several threads. This is possible because the filter is evaluated independently for each customer. However, creating one thread per customer is out of the question! Tasks are very interesting for this, and in our case we will use PLINQ, which is really fantastically easy to use. This is the code I propose:
public class ObjectSetDataSource<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable, INotifyCollectionChanged where T : class, new()
{
    […]
    private void CreateEntitiesList()
    {
        _entities = null;
        if (_allEntities == null)
            _allEntities = AllEntities;
        if (Predicate == null)
        {
            _entities = _allEntities.ToList();
            return;
        }
        try
        {
            // AsParallel() is a method call: it turns the query into a PLINQ query.
            _entities = _allEntities.AsParallel().Where(Predicate).ToList();
        }
        catch (OperationCanceledException)
        {
        }
    }
    public Func<T, bool> Predicate
    {
        get { return _predicate; }
        set
        {
            _predicate = value;
            CreateEntitiesList();
            OnCollectionChanged(NotifyCollectionChangedAction.Reset);
        }
    }
}
CPU usage shows a real improvement:
All it took was the AsParallel extension method. It’s hard to make it any easier!
With this, our LINQ query runs on all the cores of the machine. AsParallel splits the source into n parts (2 on my dual core) and processes them in parallel. Note that when the query runs on more than one thread, the results (without an OrderBy) can come back in a different order than with our first, single-threaded code, but in our case that doesn’t matter.
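To make the ordering remark concrete, here is a minimal, self-contained sketch. It uses a plain list of strings instead of EF entities, and the class and method names (PlinqOrderingSketch, FilterUnordered, FilterOrdered) are mine, not part of ObjectSetDataSource. It also shows AsOrdered, which I did not need in my sample but which asks PLINQ to keep the source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the customer filter; not part of ObjectSetDataSource.
public static class PlinqOrderingSketch
{
    // Parallel filter: the order of the result is not guaranteed.
    public static List<string> FilterUnordered(IEnumerable<string> names, string prefix)
    {
        return names.AsParallel()
                    .Where(n => n.StartsWith(prefix, StringComparison.Ordinal))
                    .ToList();
    }

    // Parallel filter that preserves the source order, at some extra cost.
    public static List<string> FilterOrdered(IEnumerable<string> names, string prefix)
    {
        return names.AsParallel()
                    .AsOrdered()
                    .Where(n => n.StartsWith(prefix, StringComparison.Ordinal))
                    .ToList();
    }

    public static void Main()
    {
        // 1000 made-up names, half of them starting with "MEZ".
        var names = Enumerable.Range(0, 1000)
                              .Select(i => (i % 2 == 0 ? "MEZ" : "ABC") + i)
                              .ToList();

        var sequential = names.Where(n => n.StartsWith("MEZ", StringComparison.Ordinal)).ToList();
        var unordered = FilterUnordered(names, "MEZ");
        var ordered = FilterOrdered(names, "MEZ");

        // Same elements either way; only the ordered variant is guaranteed
        // to match the sequential order.
        Console.WriteLine(unordered.Count == sequential.Count);
        Console.WriteLine(ordered.SequenceEqual(sequential));
    }
}
```

For a grid that the user can sort anyway, the unordered version is usually enough, which is why I stayed with a plain AsParallel.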
However, the first issue is still there: we need to be able to cancel the list processing when the filter changes, so that we don’t have to generate the lists for M and ME when the filter we actually want is MEZ. Of course, PFx offers everything we need to do this:
public class ObjectSetDataSource<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable, INotifyCollectionChanged where T : class, new()
{
    […]
    private void CreateEntitiesList()
    {
        _entities = null;
        if (_allEntities == null)
            _allEntities = AllEntities;
        if (Predicate == null)
        {
            _entities = _allEntities.ToList();
            return;
        }
        try
        {
            _entities = _allEntities.AsParallel().WithCancellation(_cancellationToken.Token).Where(Predicate).ToList();
        }
        catch (OperationCanceledException)
        {
        }
    }
    public Func<T, bool> Predicate
    {
        get { return _predicate; }
        set
        {
            _predicate = value;
            // Cancel the filter that may still be running for the previous predicate.
            if (_cancellationToken != null)
                _cancellationToken.Cancel();
            _cancellationToken = new CancellationTokenSource();
            // Capture the token locally so that this task checks its own token.
            var cancellationToken = _cancellationToken.Token;
            new Task(() =>
            {
                CreateEntitiesList();
                if (cancellationToken.IsCancellationRequested)
                    return;
                var app = Application.Current;
                if (app != null)
                    app.Dispatcher.BeginInvoke((Action)(() => OnCollectionChanged(NotifyCollectionChangedAction.Reset)));
            }, cancellationToken).Start();
        }
    }
}
CPU usage shows one more significant improvement:
The WithCancellation extension method allows us to cancel the LINQ query while it is executing.
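Here is a small sketch of what WithCancellation does, taken out of the WPF and EF context. The class and method names are hypothetical, and the numbers are only there to have something to filter. To keep the demo deterministic, the token is cancelled before the query starts; in the real application the cancellation arrives while the query is running.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical demo class; not part of ObjectSetDataSource.
public static class PlinqCancellationSketch
{
    // Runs a PLINQ filter under the given token.
    // Returns false (and a count of 0) if the query was cancelled.
    public static bool TryCountMultiplesOfSeven(CancellationToken token, out int count)
    {
        try
        {
            count = Enumerable.Range(0, 700000)
                              .AsParallel()
                              .WithCancellation(token)
                              .Where(i => i % 7 == 0)
                              .ToList()
                              .Count;
            return true;
        }
        catch (OperationCanceledException)
        {
            // PLINQ reports cancellation by throwing OperationCanceledException.
            count = 0;
            return false;
        }
    }

    public static void Main()
    {
        int count;

        // Normal run: the token is never cancelled.
        bool completed = TryCountMultiplesOfSeven(CancellationToken.None, out count);
        Console.WriteLine(completed + " " + count);

        // Cancelled run: the token is already cancelled when the query starts.
        using (var cts = new CancellationTokenSource())
        {
            cts.Cancel();
            completed = TryCountMultiplesOfSeven(cts.Token, out count);
            Console.WriteLine(completed + " " + count);
        }
    }
}
```

This is also why CreateEntitiesList catches OperationCanceledException: a cancelled query is an expected outcome here, not an error.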
The reason for creating the task in the Predicate property is to avoid having to wait for the M filter to finish, and then for the ME filter to finish, before we finally get our MEZ filter. The task runs the filtering asynchronously and keeps the UI from freezing.
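The same “cancel the previous run, start a new one” idea can be sketched without any UI. LatestOnlyFilter below is a hypothetical helper written for this illustration (it is not the ObjectSetDataSource code), it works on strings instead of customers, and it assumes that FilterAsync is always called from the same thread, as the Predicate setter is from the UI thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: only the most recent filter request is allowed to finish.
public class LatestOnlyFilter
{
    private readonly IList<string> _source;
    private CancellationTokenSource _cts;

    public LatestOnlyFilter(IList<string> source)
    {
        _source = source;
    }

    // Cancels the previous request, if any, and starts a new one in a task.
    // Assumption: callers invoke this from a single thread.
    public Task<List<string>> FilterAsync(string prefix)
    {
        if (_cts != null)
            _cts.Cancel();
        _cts = new CancellationTokenSource();
        var token = _cts.Token; // captured so each task observes its own token

        return Task.Factory.StartNew(() =>
            _source.AsParallel()
                   .WithCancellation(token)
                   .Where(n => n.StartsWith(prefix, StringComparison.Ordinal))
                   .ToList(),
            token);
    }
}

public static class LatestOnlyFilterDemo
{
    public static void Main()
    {
        var names = new List<string> { "MEZIL", "MEYER", "MARTIN", "DUPONT" };
        var filter = new LatestOnlyFilter(names);

        // Simulates typing M, then E, then Z without waiting in between.
        filter.FilterAsync("M");
        filter.FilterAsync("ME");
        var last = filter.FilterAsync("MEZ");

        // Only the last request is guaranteed to complete; the first two
        // may have been cancelled before or during execution.
        Console.WriteLine(string.Join(",", last.Result));
    }
}
```

The earlier tasks end up either completed or cancelled depending on timing, so the caller should only rely on the last one.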
The last point is the BeginInvoke method, which allows us to raise the CollectionChanged event on the main (UI) thread. This is required to avoid the following exception: “This type of CollectionView does not support changes to its SourceCollection from a thread different from the Dispatcher thread”.
This scenario, which is very common, shows the value of using PFx. Be careful, though: not every algorithm can be parallelized. Moreover, parallel code often adds complexity and requires a good command of parallel algorithms. However, if you have the time, I very, very, very strongly suggest that you learn PFx and parallelism in general. As I wrote at the start, the Parallel Framework (PFx) will become more and more important in the future.
You can download the complete ObjectSetDataSource class here.