Entity Framework Core: Default Comparer For Byte Arrays May Waste Lots Of Memory And CPU

The default implementation of Entity Framework Core prefers to play it safe (for good reasons) when working with byte arrays. This 'safety' is, in some use cases, unnecessary and costs a lot of memory and CPU. In this article, we will see that doing less is sufficient for such properties, thanks to one of the most overlooked features of Entity Framework.

Pawel Gerr is an architect and consultant at Thinktecture. He specializes in .NET Core backends and knows Entity Framework inside out.

Please note: this article is not about whether a byte array should or should not be used with relational databases but rather about “if you do, then be aware of …”

EF's default behavior with byte arrays

When change tracking is active, Entity Framework Core (EF) does not just compare the object references of byte arrays on SaveChanges, but their content as well. If the corresponding property represents some kind of bit mask, i.e., individual bytes in the array are changed independently, then comparing every byte is necessary. Most of the time, however, I see projects using such properties to persist small binary data, like thumbnails, which is treated as immutable. In such cases, it is unlikely that someone will change single bytes inside the array. If the thumbnail has to be changed, then the byte array is replaced by another byte array, i.e., the new one is a completely new object reference.
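The difference between the two kinds of comparison can be sketched in plain C#, without EF. This is only an illustration: the class and method names are made up for this example, the content check uses LINQ's SequenceEqual as a stand-in rather than EF's actual comparer, and the 1024-byte size is an assumption.

```csharp
using System;
using System.Linq;

public static class ByteArrayEquality
{
    // Content comparison: true when both arrays hold the same bytes.
    // It has to walk the arrays until the first difference is found.
    public static bool ContentEquals(byte[] left, byte[] right) =>
        left.SequenceEqual(right);

    // Reference comparison: true only when both variables
    // point to the very same array object. No byte is read.
    public static bool SameReference(byte[] left, byte[] right) =>
        ReferenceEquals(left, right);

    public static void Main()
    {
        var original = new byte[1024];    // stands in for the array read from the database
        var replacement = new byte[1024]; // a new array object, e.g. a new thumbnail
        replacement[^1] = 1;

        Console.WriteLine(ContentEquals(original, replacement)); // False
        Console.WriteLine(SameReference(original, replacement)); // False

        // In-place mutation (the bit-mask scenario): the reference stays the same,
        // so a reference comparison cannot see the change.
        var sameObject = original;
        original[0] = 42;
        Console.WriteLine(SameReference(sameObject, original));  // True
    }
}
```

For replaced arrays both checks agree that something changed, but only the reference check does so without touching the content. The last lines show why reference equality is only appropriate when the array is never modified in place.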

How much does it cost?

For this kind of binary data, comparing the content is not wrong, just unnecessary. I’ve benchmarked a few use cases in terms of memory and CPU usage. One entity uses the default behavior, the other a custom ValueComparer.

The benchmarks update 10k entities, each with a 1 kB array. Before calling SaveChanges, the property is assigned one of two new arrays. One differs from the stored array in the first byte and is considered the best case; the other differs only in the last byte and is considered the worst case.

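// array read from database = [0, 0, 0, ..., 0] (1 kB, assumed here to be 1024 bytes)

var newArray_bestCase = new byte[1024];  // [1, 0, 0, ..., 0]
newArray_bestCase[0] = 1;
var newArray_worstCase = new byte[1024]; // [0, 0, 0, ..., 1]
newArray_worstCase[^1] = 1;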
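Why the position of the differing byte matters can be illustrated with a simple front-to-back scan. The sketch below is not EF's comparer; it is a hypothetical helper that merely counts how many bytes such a scan looks at before it can stop, assuming arrays of equal length and a size of 1024 bytes.

```csharp
using System;

public static class MismatchScan
{
    // Returns how many bytes a front-to-back comparison inspects
    // before it can decide. Assumes both arrays have the same length.
    public static int BytesInspected(byte[] left, byte[] right)
    {
        for (var i = 0; i < left.Length; i++)
        {
            if (left[i] != right[i])
                return i + 1; // first difference found at index i
        }

        return left.Length;   // no difference: every byte was inspected
    }

    public static void Main()
    {
        var fromDatabase = new byte[1024];

        var bestCase = new byte[1024];
        bestCase[0] = 1;

        var worstCase = new byte[1024];
        worstCase[^1] = 1;

        Console.WriteLine(BytesInspected(fromDatabase, bestCase));  // 1
        Console.WriteLine(BytesInspected(fromDatabase, worstCase)); // 1024
    }
}
```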

All benchmarks do two things: update the property bytes and call SaveChanges.

entitiesLoadedFromDb.ForEach(e => e.Bytes = newArray_bestCase); // or newArray_worstCase

await myDbContext.SaveChangesAsync();

For benchmarking, I use the library BenchmarkDotNet with the MemoryDiagnoser.

The source code can be found on GitHub.

|            Method |       Mean |    Error |   StdDev | Gen 0 | Gen 1 | Allocated |
|------------------ |-----------:|---------:|---------:|------:|------:|----------:|
|  Default_BestCase |   337.0 ms |  4.01 ms |  3.75 ms |  7000 |  2000 |     63 MB |
| Default_WorstCase | 1,220.7 ms | 11.84 ms | 11.07 ms | 66000 |  2000 |    531 MB |
|   Custom_BestCase |   325.6 ms |  6.16 ms |  5.46 ms |  8000 |  2000 |     65 MB |
|  Custom_WorstCase |   330.5 ms |  5.02 ms |  4.70 ms |  8000 |  2000 |     65 MB |

In the worst case, the default behavior raises the memory usage from 63 MB to 531 MB (roughly 8.4 times as much) and the duration from 337 ms to about 1,221 ms (roughly 3.6 times as long). With the custom ValueComparer, both values stay low in either case.

Use reference equality for opaque binary data

The ValueComparer can be changed in OnModelCreating or in IEntityTypeConfiguration<T> via the method SetValueComparer on the property's metadata. The method expects an instance of ValueComparer, which can be implemented from scratch or by using the generic class ValueComparer<T>. The constructor of ValueComparer<T> expects three expressions:

  • equalsExpression: compares two instances using reference equality
  • hashCodeExpression: computes the hash code
  • snapshotExpression: passes the reference of the array as is because it is enough for reference equality
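
builder.Property(e => e.Bytes)
       .Metadata // SetValueComparer is defined on the property metadata
       .SetValueComparer(new ValueComparer<byte[]>(
            (obj, otherObj) => ReferenceEquals(obj, otherObj), // equalsExpression
            obj => obj.GetHashCode(),                          // hashCodeExpression
            obj => obj));                                      // snapshotExpression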
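For context, here is a minimal sketch of how this configuration could sit inside an IEntityTypeConfiguration<T>. The entity Thumbnail and the class ThumbnailConfiguration are hypothetical names chosen for this example and are not taken from the benchmark repository. The sketch assumes a reference to the Microsoft.EntityFrameworkCore package and calls SetValueComparer through the property's Metadata.

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.ChangeTracking;
using Microsoft.EntityFrameworkCore.Metadata.Builders;

// Hypothetical entity holding opaque binary data that is replaced, never mutated in place.
public class Thumbnail
{
    public int Id { get; set; }
    public byte[] Bytes { get; set; } = Array.Empty<byte>();
}

// Hypothetical configuration class for the entity above.
public class ThumbnailConfiguration : IEntityTypeConfiguration<Thumbnail>
{
    public void Configure(EntityTypeBuilder<Thumbnail> builder)
    {
        builder.Property(e => e.Bytes)
               .Metadata
               .SetValueComparer(new ValueComparer<byte[]>(
                   (obj, otherObj) => ReferenceEquals(obj, otherObj), // same object means unchanged
                   obj => obj.GetHashCode(),                          // reference-based hash of the array
                   obj => obj));                                      // snapshot keeps the reference, no copy
    }
}
```

Such a configuration would typically be registered in OnModelCreating with modelBuilder.ApplyConfiguration(new ThumbnailConfiguration()). Note that with this comparer, changing single bytes inside an existing array is no longer detected, so it only fits properties whose arrays are always replaced as a whole.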


In this article, we looked at the ValueComparer and how it affects memory and CPU usage when using byte arrays with EF. Although we were talking about byte arrays only, the same performance issues could arise with all custom objects with a ValueConverter (please note: Converter, not Comparer).

