With Entity Framework Core 3.0 (EF) the internal implementation of the LINQ query translation has been changed a lot. So, some queries that didn't work previously are working now and some that worked are not working anymore. :)

The LINQ extension method GroupBy is a special one, sometimes it works and sometimes it throws an exception. One of the use cases that are supported is the usage of an aggregate function right after calling GroupBy:

var productCount = Context.Products  __
  .GroupBy(p => p.GroupId)  
  .Select(g => new  
               {  
                  GroupId = g.Key,  
                  Count = g.Count()  
               })  
  .ToList();

The generated SQL statement is the one we expect:

SELECT p.GroupId, COUNT(*) AS Count  
FROM Products AS p  
GROUP BY p.GroupId

The previous use case is simple and very limited in its usefulness. A use case that I see in the projects more often is something like: "Give me the first/last product for each group ordered by columns x, y". The query that comes to mind first is:

var firstProducts = Context.Products  
   .GroupBy(p => p.GroupId)  
   .Select(g => g.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();

Looks good but leads to an InvalidOperationException. To get the desired result without using the method GroupBy is to start the LINQ query with ProductGroups and use the navigational property Products.

var firstProducts = Context.ProductGroups  
   .Select(g => g.Products.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();

Remarks: If a group has no products then null is going to be pushed into our list firstProducts. Most of the time these nulls are of no use and should be filtered out.

The generated SQL is using the window function ROW_NUMBER for grouping.

SELECT t0.*  
FROM   
  ProductGroups AS p  
  LEFT JOIN  
  (  
     SELECT *  
     FROM   
     (  
         SELECT *, ROW_NUMBER() OVER(PARTITION BY p0.GroupId ORDER BY p0.Name) AS row  
         FROM Products AS p0  
     ) AS t  
     WHERE t.row <= 1  
  ) AS t0 ON p.Id = t0.GroupId

In the end we can say that the support of GroupBy has been improved in version 3.0. The queries that have been evaluated on the client are now translated to SQL and executed on the database. The only drawback is that we need to rewrite our LINQ queries sometimes.

Related Articles

 | Pawel Gerr

It has been a while since I released my article about the usage of temp tables in Entity Framework (v6). Meanwhile, Microsoft has released a completely rewritten version of its O/R mapper so my old approach is no longer applicable. But before we learn about a new one, let us…

Read article
 | Pawel Gerr

Both a LINQ query and an SQL statement are descriptions that state which data should be fetched, but not how.. Sure, when reading LINQ or SQL, we can make assumptions about the performance but not in every case. Some queries are either too fancy or too big to grasp, so our…

Read article
 | Pawel Gerr

After talking about performance issues like N+1 Queries and the Cartesian Explosion that made its comeback in Entity Framework Core 3, we will today look at a performance issue that is not tied to any Entity Framework version but is rather a general one. What do I mean by…

Read article