In Entity Framework Core 3.0/3.1, SQL statement generation underwent significant changes. As we saw in the previous post, these changes removed both implicit client-side evaluation and the N+1 Query Problem (which is good!).
Unfortunately, these changes (re)introduced another issue: the Cartesian Explosion Problem.

What is a "Cartesian Explosion"?

As the name implies, it has something to do with a Cartesian product, i.e. with JOINs. When performing a JOIN on a one-to-many relationship, the rows of the one-side are replicated N times, where N is the number of matching records on the many-side.

Here is an example of JOIN-ing 1 ProductGroup with 1000 Products.
The corresponding LINQ query would look like:

var groups = Context.ProductGroups
          .Include(g => g.Products)
          .ToList();

The generated SQL statement is similar to the following one:

SELECT *
FROM ProductGroups
LEFT JOIN
    Products
    ON Products.GroupId = ProductGroups.Id

And the result set:

ProductGroup Id    Product Id
1                  1
1                  2
1                  3
1                  ...
1                  1000

As we can see, the columns of the ProductGroup are replicated 1000 times. Now imagine each Product is offered by the same 10 sellers: the result set will contain 1 * 1000 * 10 = 10000 rows, although we have just 1 + 1000 + 10 = 1011 ProductGroup, Product, and seller records in the database.
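The row arithmetic can be reproduced without a database. The following is a minimal LINQ-to-Objects sketch, not EF-generated code: the anonymous types and the variable names are hypothetical stand-ins for the tables, and it assumes that every product is paired with the same 10 sellers.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the tables: 1 group, 1000 products, 10 sellers.
var groups   = new[] { new { Id = 1 } };
var products = Enumerable.Range(1, 1000).Select(id => new { Id = id, GroupId = 1 }).ToList();
var sellers  = Enumerable.Range(1, 10).Select(id => new { Id = id }).ToList();

// One JOIN-ed result set: the group columns are repeated on every row,
// and every product row is repeated once per seller.
var joinedRows =
    (from g in groups
     join p in products on g.Id equals p.GroupId
     from s in sellers // assumption: each product is paired with all 10 sellers
     select new { GroupId = g.Id, ProductId = p.Id, SellerId = s.Id })
    .ToList();

var storedRecords = groups.Length + products.Count + sellers.Count;

Console.WriteLine($"Rows in the joined result set: {joinedRows.Count}"); // 10000
Console.WriteLine($"Records in the source lists:   {storedRecords}");    // 1011
```

The gap between the two numbers grows multiplicatively with every additional collection that is joined in.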

It should be clear what happens if we add a few more Includes. The result set (i.e. the Cartesian product) would explode.

EF-forced "ORDER BY"

The larger result set caused by the JOINs is not the only reason for lower performance. Let's look at the SQL statement EF actually generates. By the way, the SQL statement above was not complete, but the following one is:

SELECT 
    [p].[Id], [p].[Name], [p].[RowVersion],
    [p0].[Id], [p0].[GroupId], [p0].[Name], [p0].[RowVersion]
FROM
    [ProductGroups] AS [p]
LEFT JOIN 
    [Products] AS [p0] 
    ON [p].[Id] = [p0].[GroupId]
ORDER BY
    [p].[Id], [p0].[Id]

For internal purposes, EF adds an ORDER BY clause to order the entities by their identifiers. With a result set of that size, sorting the data puts considerable load on the database.

Query splitting (back to the roots)

The solution to the Cartesian Explosion Problem in Entity Framework Core 3 is the same as in Entity Framework 6 (non-Core): we split one LINQ query into multiple queries if (and only if) the database load rises significantly.

For our (oversimplified) example from above, the solution is to load Products and ProductGroups separately.

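// With the default tracking behavior, EF's relationship fix-up connects the
// loaded products to their groups, so no JOIN is needed.
var groups = Context.ProductGroups.ToList();
var products = Context.Products.ToList();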
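To illustrate what bringing the two result sets back together amounts to, here is a rough in-memory sketch. It is not what EF does internally; the anonymous types and the names productsByGroup and groupsWithProducts are hypothetical. It might be useful as a mental model, or when the entities are loaded without change tracking and have to be related by hand.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory results of the two separate queries (not real EF entities).
var groups   = Enumerable.Range(1, 2).Select(id => new { Id = id, Name = $"Group {id}" }).ToList();
var products = Enumerable.Range(1, 6).Select(id => new { Id = id, GroupId = id % 2 + 1 }).ToList();

// Index the many-side once by its foreign key ...
var productsByGroup = products.ToLookup(p => p.GroupId);

// ... then attach the matching products to each group; no row is duplicated.
var groupsWithProducts = groups
    .Select(g => new { Group = g, Products = productsByGroup[g.Id].ToList() })
    .ToList();

foreach (var entry in groupsWithProducts)
    Console.WriteLine($"{entry.Group.Name}: {entry.Products.Count} products");
```

Each record is read exactly once, so the amount of data transferred grows with the sum of the table sizes rather than with their product.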

Here are some database statistics (MS SQL Server) I get when loading data with two one-to-many relationships, before and after query splitting. The absolute numbers are not relevant; just look at the relative difference, especially in Reads and Rows.

            Before splitting    After splitting
CPU         31                  16
Duration    75                  3
Reads       5300                350
Rows        12000               2300

Summary

In this blog article, I wanted to convey two things: there is a new (old) issue we have to be aware of, and this issue can be solved.
The difficulty is finding such queries and determining how to split them. If we split too much, we waste time. If we split too little, we waste performance. The tools I highly recommend using for this task are the database statistics and execution plans.
