Entity Framework Core 3.0 – “Hidden” GROUP BY Capabilities (Part 1)

With Entity Framework Core 3.0 (EF) the internal implementation of the LINQ query translation has been changed a lot. So, some queries that didn't work previously are working now and some that worked are not working anymore. :)

In this article:

Entity Framework Core 3.0 – “Hidden” GROUP BY Capabilities (Part 1)
Pawel Gerr is architect consultant at Thinktecture. He focuses on backends with .NET Core and knows Entity Framework inside out.

The LINQ extension method GroupBy is a special one, sometimes it works and sometimes it throws an exception. One of the use cases that are supported is the usage of an aggregate function right after calling GroupBy:

				
					var productCount = Context.Products  __
  .GroupBy(p => p.GroupId)  
  .Select(g => new  
               {  
                  GroupId = g.Key,  
                  Count = g.Count()  
               })  
  .ToList();
				
			

The generated SQL statement is the one we expect:

				
					SELECT p.GroupId, COUNT(*) AS Count  
FROM Products AS p  
GROUP BY p.GroupId
				
			

The previous use case is simple and very limited in its usefulness. A use case that I see in the projects more often is something like: “Give me the first/last product for each group ordered by columns x, y“. The query that comes to mind first is:

				
					var firstProducts = Context.Products  
   .GroupBy(p => p.GroupId)  
   .Select(g => g.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();
				
			

Looks good but leads to an InvalidOperationException. To get the desired result without using the method GroupBy is to start the LINQ query with ProductGroups and use the navigational property Products.

				
					var firstProducts = Context.ProductGroups  
   .Select(g => g.Products.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();
				
			

Remarks: If a group has no products then null is going to be pushed into our list firstProducts. Most of the time these nulls are of no use and should be filtered out.

The generated SQL is using the window function ROW_NUMBER for grouping.

				
					SELECT t0.*  
FROM   
  ProductGroups AS p  
  LEFT JOIN  
  (  
     SELECT *  
     FROM   
     (  
         SELECT *, ROW_NUMBER() OVER(PARTITION BY p0.GroupId ORDER BY p0.Name) AS row  
         FROM Products AS p0  
     ) AS t  
     WHERE t.row <= 1  
  ) AS t0 ON p.Id = t0.GroupId
				
			

In the end we can say that the support of GroupBy has been improved in version 3.0. The queries that have been evaluated on the client are now translated to SQL and executed on the database. The only drawback is that we need to rewrite our LINQ queries sometimes.

Free
Newsletter

Current articles, screencasts and interviews by our experts

Don’t miss any content on Angular, .NET Core, Blazor, Azure, and Kubernetes and sign up for our free monthly dev newsletter.

Related Articles
.NET
Roslyn Source Generators: Logging – Part 11
In previous part we lerned how to pass parameters to a Source Generator. In this article we need this knowledge to pass futher parameters to implement logging.
29.08.2023
.NET
Roslyn Source Generators: Configuration – Part 10
In this article we will see how to pass configuration parameters to a Roslyn Source Generator to control the output or enable/disable features.
29.08.2023
.NET
Roslyn Source Generators: Reduction of Resource Consumption in IDEs – Part 9
In this article we will see how to reduce the resource consumption of a Source Generator when running inside an IDE by redirecting the code generation to RegisterImplementationSourceOutput.
29.08.2023