LINQ: Mean, Median, and Mode
Posted Fri, May 7 2010 0:23 by Deborah Kurata
If you do any kind of statistical analysis, you probably need to calculate the mean, median, and mode. There are many places on the Web where you can find these calculations. This post differs from most in that it uses LINQ and lambda expressions.
The mean is the arithmetic average of a set of numbers: the sum divided by the count. This one is easy with LINQ because of the Average extension method.
In C#:
// Requires: using System.Linq; using System.Diagnostics;
int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };
double mean = numbers.Average();
Debug.WriteLine("Mean: " + mean);
In VB:
Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}
Dim mean As Double = numbers.Average()
Debug.WriteLine("Mean: " & mean)
The result is:
Mean: 2.88888888888889
This code uses the Average extension method, which LINQ defines for IEnumerable<int> and the other numeric sequence types, to calculate the mean, or average, of the numbers.
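One thing to watch for: on an empty sequence of int, Average throws an InvalidOperationException. The sketch below shows one possible way around that, by projecting to int? so that the nullable overload of Average is chosen, which returns null when there are no values. MeanOrNull is a hypothetical helper name used only for illustration; it is not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MeanExample
{
    // Hypothetical helper (not part of LINQ): returns null for an empty
    // sequence instead of throwing InvalidOperationException.
    public static double? MeanOrNull(IEnumerable<int> source)
    {
        // Casting each value to int? selects the Average overload for
        // IEnumerable<int?>, which returns null when there are no values.
        return source.Select(n => (int?)n).Average();
    }
}
```

Whether null is the right answer for "no data" depends on your application; throwing may be preferable if an empty list indicates a bug upstream.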
The median is the middle value of a set of numbers once they are sorted. If there is an even number of entries, it is the average of the two middle values.
In C#:
int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };
int numberCount = numbers.Count();
int halfIndex = numberCount / 2; // integer division: 9 / 2 == 4
var sortedNumbers = numbers.OrderBy(n => n);
double median;
if ((numberCount % 2) == 0)
{
    // Divide by 2.0: dividing an int sum by 2 would truncate, e.g. (2 + 3) / 2 == 2
    median = (sortedNumbers.ElementAt(halfIndex) +
        sortedNumbers.ElementAt(halfIndex - 1)) / 2.0;
}
else
{
    median = sortedNumbers.ElementAt(halfIndex);
}
Debug.WriteLine("Median is: " + median);
In VB:
Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}
Dim numberCount As Integer = numbers.Count
Dim halfIndex As Integer = numbers.Count \ 2
Dim sortedNumbers = numbers.OrderBy(Function(n) n)
Dim median As Double
If (numberCount Mod 2 = 0) Then
median = (sortedNumbers.ElementAt(halfIndex) +
sortedNumbers.ElementAt(halfIndex - 1)) / 2
Else
median = sortedNumbers.ElementAt(halfIndex)
End If
Debug.WriteLine("Median is: " & median)
The result is:
Median is: 3
This code first counts the numbers and divides the count by 2 to find the middle of the list. Note that VB uses the backslash (\) for integer division, whereas in C# the forward slash (/) performs integer division whenever both operands are integers.
It then sorts the numbers in ascending order using the OrderBy extension method. The lambda expression returns each number unchanged, so the numbers themselves are the sort key.
The last step is to get the element at the middle (if the count is odd) or the average of the two middle elements (if the count is even). The result is the median.
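If you need the median in more than one place, the steps above can be wrapped in an extension method. The following is a minimal sketch, not a definitive implementation: the Median name and the MedianExample class are hypothetical, and the choice to throw on an empty sequence is an assumption made to mirror what Average does. It copies the sorted values into an array so the sort runs only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianExample
{
    // Hypothetical extension method (not part of LINQ).
    public static double Median(this IEnumerable<int> source)
    {
        // Sort once and materialize, so indexing does not re-run the sort.
        int[] sorted = source.OrderBy(n => n).ToArray();
        if (sorted.Length == 0)
        {
            // Assumption: treat "no data" as an error, as Average does.
            throw new InvalidOperationException("Sequence contains no elements.");
        }

        int halfIndex = sorted.Length / 2; // integer division
        if (sorted.Length % 2 == 0)
        {
            // Convert to double before adding to avoid int overflow and truncation.
            return ((double)sorted[halfIndex - 1] + sorted[halfIndex]) / 2.0;
        }
        return sorted[halfIndex];
    }
}
```

With this in scope, numbers.Median() can be called on the sample array in the same way as numbers.Average().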
The mode is the number that occurs most often.
In C#:
int[] numbers = { 4, 4, 4, 4, 3, 2, 2, 2, 1 };
var mode = numbers.GroupBy(n => n)
    .OrderByDescending(g => g.Count())
    .Select(g => g.Key)
    .FirstOrDefault();
Debug.WriteLine("Mode is: " + mode);
In VB:
Dim numbers() As Integer = {4, 4, 4, 4, 3, 2, 2, 2, 1}
Dim mode = numbers.GroupBy(Function(n) n).
OrderByDescending(Function(g) g.Count).
Select(Function(g) g.Key).FirstOrDefault
Debug.WriteLine("Mode is: " & mode)
The result is:
Mode is: 4
This code uses the GroupBy extension method to group equal numbers together. It then orders the groups by their count, largest first, and selects the key of the first group. That key is the number that occurs most often.
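The query above returns a single value even when several numbers tie for the highest count. Because GroupBy yields groups in order of first appearance and OrderByDescending is a stable sort, the tie goes to whichever value appears first in the input. For an empty array, FirstOrDefault returns 0. If you want every mode instead, one possible approach is sketched below; Modes and ModeExample are hypothetical names, and returning an empty list for empty input is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModeExample
{
    // Hypothetical helper (not part of LINQ): returns all values that share
    // the highest count, in ascending order. Empty input gives an empty list.
    public static List<int> Modes(IEnumerable<int> source)
    {
        // Count each distinct value once.
        var counts = source.GroupBy(n => n)
                           .Select(g => new { Value = g.Key, Count = g.Count() })
                           .ToList();
        if (counts.Count == 0)
        {
            return new List<int>();
        }

        // Keep every value whose count equals the maximum count.
        int maxCount = counts.Max(c => c.Count);
        return counts.Where(c => c.Count == maxCount)
                     .Select(c => c.Value)
                     .OrderBy(v => v)
                     .ToList();
    }
}
```

For the sample data this returns a list containing only 4; for { 1, 1, 2, 2, 3 } it returns 1 and 2.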
Use these techniques whenever you need to calculate the mean, median, or mode.
Enjoy!