The Performance Issue „Cartesian Explosion“ Made Its Comeback In Entity Framework Core 3

In Entity Framework Core 3.0/3.1 the SQL statement generation underwent significant changes. As we have seen in the previous post these changes removed both the implicit client-side evaluation and the N+1 Query Problem (which is good!). Unfortunately, these changes (re)introduced another issue: the Cartesian Explosion Problem.

In diesem Artikel:

The Performance Issue „Cartesian Explosion“ Made Its Comeback In Entity Framework Core 3
Pawel Gerr ist Architekt und Consultant bei Thinktecture. Er hat sich auf .NET Core Backends spezialisiert und kennt Entity Framework von vorne bis hinten.

What is a "Cartesian Explosion"?

As implied by the name, it has something to do with a cartesian product, i.e. with JOINs. When performing a JOIN on the one-to-many relationship then the rows of the one-side are being replicated N times whereby N is the number of matching records on the many-side.

Here is an example for JOIN-ing 1 ProductGroup with 1000 Products.
The corresponding LINQ query would look like:

				
					var groups = Context.ProductGroups
          .Include(g => g.Products)
          .ToList();
				
			

The is similar to the following one:

				
					SELECT *
FROM ProductGroups
LEFT JOIN
    Products
    ON Products.GroupId = ProductGroups.Id
				
			

And the result set:

ProductGroup Id
Product Id
1
1
1
2
1
3
1
1
1000

As we see, the columns of the ProductGroup are replicated 1000 times. Imagine there are 10 sellers per Product – the result set will contain 1 * 1000 * 10 = 10000 rows although we have just 1 + 1000 + 10 = 1011 records in the database.

I should be clear what happens if we add a few Includes more. The result set (i.e. the cartesian product) would explode.

EF-forced "ORDER BY"

The larger result set due to JOINs is not the only cause for lower performance. Let’s look at the SQL statement generated by EF. Btw, the SQL statement above is not complete but the following one is:

				
					SELECT 
    [p].[Id], [p].[Name], [p].[RowVersion],
    [p0].[Id], [p0].[GroupId], [p0].[Name], [p0].[RowVersion]
FROM
    [ProductGroups] AS [p]
LEFT JOIN 
    [Products] AS [p0] 
    ON [p].[Id] = [p0].[GroupId]
ORDER BY
    [p].[Id], [p0].[Id]

				
			

For internal purposes, the EF adds an ORDER BY clause to order the entities by their identifiers. So, with a result set of that huge size, the ordering of this data will produce considerable load on the database.

Query splitting (back to the roots)

The solution of the Cartesian Explosion Problem that came with Entity Framework Core 3 is the same as with Entity Framework (non-Core) 6. We split 1 LINQ query in multiple queries if (and only if) the database load rises significantly.

When using our (oversimplified) example from above then the solution is to load Products and ProductGroups separately.

				
					var groups = Context.ProductGroups.ToList();
var products = Context.Products.ToList();
				
			

Here are some database statistics (MS SQL Server) I get when loading data having two one-to-many relationships before and after query splitting. The absolute numbers are not relevant, just look at the relative difference, especially in the Reads and Rows.

Type
Library
License
Apache 2
Language
C# / .NET

Summary

In this blog article, I wanted to convey two things: there is a new (old) issue we have to be aware of, and this issue can be solved.
The difficulty is finding such queries and determining how to split them. If we split too much, we waste time. If we split too little, we waste performance. The tools I highly recommend using for this task are the database statistics and execution plans.

Kostenloser
Newsletter

Aktuelle Artikel, Screencasts, Webinare und Interviews unserer Experten für Sie

Verpassen Sie keine Inhalte zu Angular, .NET Core, Blazor, Azure und Kubernetes und melden Sie sich zu unserem kostenlosen monatlichen Dev-Newsletter an.

Diese Artikel könnten Sie interessieren
Blazor
[Sneak Preview] Blazor WebAssembly: The Power Of EF Core And SQLite In The Browser – In-Depth

[Sneak Preview] Blazor WebAssembly: The Power Of EF Core And SQLite In The Browser – In-Depth

Rumor has it, this is the dream of many .NET developers: Using SQLite and EF Core in the browser with Blazor WebAssembly. Is this possible? In this article, we will have a look how you can manage your offline data with the help of SQLite and EF Core by persisting the data with standard browser features and APIs.
12.04.2022
Entity Framework Core
Entity Framework Core: Default Comparer For Byte Arrays May Waste Lots Of Memory And CPU

Entity Framework Core: Default Comparer For Byte Arrays May Waste Lots Of Memory And CPU

The default implementation of Entity Framework Core prefers to play it safe (for good reasons) when working with byte arrays. This 'safety' is - in some use cases - unnecessary and costs us a lot of memory and CPU. In this article, we will see that doing less is sufficient for the given property thanks to one of the most overlooked features of Entity Framework.
24.03.2022
.NET CORE
Entity Framework Core 5 Performance: Power Of Table Hints

Entity Framework Core 5 Performance: Power Of Table Hints

A few months ago, I had to implement a Web API for a relatively simple use case, but there was a challenge. The web endpoint must read and update a specific record using Entity Framework Core 5 and be capable of handling thousands of requests per second. Another requirement was the *data integrity*, so a transaction was a must. With high concurrency and Microsoft SQL Server as the database, we can end up in a deadlock if the SQL Server locks the records or rather the pages in an *unfavorable* way. Let's see how *table hints* can help us.
29.07.2021
.NET
Do Not Waste Performance By Not Using Temp Tables With Entity Framework Core

Do Not Waste Performance By Not Using Temp Tables With Entity Framework Core

It has been a while since I released my article about the usage of temp tables in Entity Framework (v6). Meanwhile, Microsoft has released a completely rewritten version of its O/R mapper so my old approach is no longer applicable. But before we learn about a new one, let us think about what we might need temp tables for.
25.06.2020
.NET
Better Entity Framework Core Performance By Reading Execution Plans

Better Entity Framework Core Performance By Reading Execution Plans

Both a LINQ query and an SQL statement are descriptions that state which data should be fetched, but not how.. Sure, when reading LINQ or SQL, we can make assumptions about the performance but not in every case. Some queries are either too fancy or too big to grasp, so our predictions may be way too vague if we can make any at all.
10.06.2020
.NET
Unnecessary Fuzzy Searches May Hurt Your Entity Framework Core Performance

Unnecessary Fuzzy Searches May Hurt Your Entity Framework Core Performance

After talking about performance issues like N+1 Queries and the Cartesian Explosion that made its comeback in Entity Framework Core 3, we will today look at a performance issue that is not tied to any Entity Framework version but is rather a general one.
24.03.2020