With Entity Framework Core 3.0 (EF) the internal implementation of the LINQ query translation has been changed a lot. So, some queries that didn't work previously are working now and some that worked are not working anymore. :)

The LINQ extension method GroupBy is a special one, sometimes it works and sometimes it throws an exception. One of the use cases that are supported is the usage of an aggregate function right after calling GroupBy:

var productCount = Context.Products  __
  .GroupBy(p => p.GroupId)  
  .Select(g => new  
               {  
                  GroupId = g.Key,  
                  Count = g.Count()  
               })  
  .ToList();

The generated SQL statement is the one we expect:

SELECT p.GroupId, COUNT(*) AS Count  
FROM Products AS p  
GROUP BY p.GroupId

The previous use case is simple and very limited in its usefulness. A use case that I see in the projects more often is something like: "Give me the first/last product for each group ordered by columns x, y". The query that comes to mind first is:

var firstProducts = Context.Products  
   .GroupBy(p => p.GroupId)  
   .Select(g => g.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();

Looks good but leads to an InvalidOperationException. To get the desired result without using the method GroupBy is to start the LINQ query with ProductGroups and use the navigational property Products.

var firstProducts = Context.ProductGroups  
   .Select(g => g.Products.OrderBy(p => p.Name).FirstOrDefault())  
   .ToList();

Remarks: If a group has no products then null is going to be pushed into our list firstProducts. Most of the time these nulls are of no use and should be filtered out.

The generated SQL is using the window function ROW_NUMBER for grouping.

SELECT t0.*  
FROM   
  ProductGroups AS p  
  LEFT JOIN  
  (  
     SELECT *  
     FROM   
     (  
         SELECT *, ROW_NUMBER() OVER(PARTITION BY p0.GroupId ORDER BY p0.Name) AS row  
         FROM Products AS p0  
     ) AS t  
     WHERE t.row <= 1  
  ) AS t0 ON p.Id = t0.GroupId

In the end we can say that the support of GroupBy has been improved in version 3.0. The queries that have been evaluated on the client are now translated to SQL and executed on the database. The only drawback is that we need to rewrite our LINQ queries sometimes.

Related Articles

entity framework core
Do Not Waste Performance by Not Using Temp Tables With Entity Framework Core
It has been a while since I released my article about the usage of temp tables in Entity Framework (v6). Meanwhile, Microsoft has released a completely rewritten version of its O/R mapper so my old approach is no longer applicable. But before we learn about a new one, let us…
Pawel Gerr
entity framework core
Better Entity Framework Core Performance by Reading Execution Plans
Both a LINQ query and an SQL statement are descriptions that state which data should be fetched, but not how.. Sure, when reading LINQ or SQL, we can make assumptions about the performance but not in every case. Some queries are either too fancy or too big to grasp, so our…
Pawel Gerr
entity framework core
Unnecessary Fuzzy Searches may hurt your Entity Framework Core Performance
After talking about performance issues like N+1 Queries and the Cartesian Explosion that made its comeback in Entity Framework Core 3, we will today look at a performance issue that is not tied to any Entity Framework version but is rather a general one. What do I mean by…
Pawel Gerr
entity framework core
The performance issue "Cartesian Explosion" made its comeback in Entity Framework Core 3
In Entity Framework Core 3.0/3.1 the SQL statement generation underwent significant changes. As we have seen in the previous post these changes removed both the implicit client-side evaluation and the N+1 Query Problem (which is good!). Unfortunately, these changes (re)introduced…
Pawel Gerr