Sample Application
First, we will create a sample application for the reproduction of the issues and for measuring the throughput. For that, we create an ASP.NET Core application with one endpoint and a DbContext
with one entity, Product
.
Demo DbContext
The entity Product
has two properties: an ID and a DateTime
, which will be changed on every call of the web API.
public class Product
{
public Guid Id { get; set; }
public DateTime LastProductAccess { get; set; }
}
public class DemoDbContext : DbContext
{
public DbSet Products { get; set; }
public DemoDbContext(DbContextOptions options)
: base(options)
{
}
}
Later, during the startup of the application, we will create one Product
with a hard-coded identifier. This ID will be used for load tests.
var id = new Guid("06D6E029-320B-4347-9E47-03CCFF153652");
if (!await dbContext.Products.AnyAsync(p => p.Id == id))
{
dbContext.Products.Add(new() { Id = id, LastProductAccess = DateTime.UtcNow });
await dbContext.SaveChangesAsync();
}
Demo Controller
The ASP.NET controller has one action, which reads a Product
with the provided ID and updates the property LastProductAccess
.
[Route("[controller]")]
public class DemoController : Controller
{
private const IsolationLevel _ISOLATION_LEVEL =
IsolationLevel.ReadCommitted // Violates our business requirements,
// but just for testing purposes
// IsolationLevel.RepeatableRead
// IsolationLevel.Serializable
;
private readonly DemoDbContext _ctx;
public DemoController(DemoDbContext ctx)
{
_ctx = ctx;
}
[HttpPost("{id}")]
public async Task DoAsync(Guid id)
{
await using var tx = await _ctx.Database.BeginTransactionAsync(_ISOLATION_LEVEL);
var product = await _ctx.Products.FirstAsync(p => p.Id == id);
// do more or less complex stuff
product.LastProductAccess = DateTime.UtcNow;
await _ctx.SaveChangesAsync();
await tx.CommitAsync();
}
}
Choosing the Transaction Isolation Level
For consistency, a transaction is the first that comes to mind. The question is, what isolation level should we choose, and what are the implications?
Isolation Level: ReadCommitted
The isolation level ReadCommitted
is usually the default one. Although it is not sufficient for our purposes because it allows others to modify the data we are currently working on, still, it is interesting to see how it will perform.
I’m using hey for the execution of multiple concurrent requests. We execute 10k requests with a concurrency level of 50.
After the execution of the load tests, we see that all requests have responded with the status code 200
, and we are able to handle 5k requests per second.
Please note: the absolute values are not important and will vary from machine to machine. What matters are the differences between measurements.
> hey -n 10000 -c 50 -m POST http://localhost:5000/demo/06D6E029-320B-4347-9E47-03CCFF153652
Summary:
Total: 1.9819 secs
Slowest: 0.0303 secs
Fastest: 0.0052 secs
Average: 0.0099 secs
Requests/sec: 5045.5772
Status code distribution:
[200] 10000 responses
[Route("[controller]")]
public class DemoController : Controller
{
private const IsolationLevel _ISOLATION_LEVEL =
// IsolationLevel.ReadCommitted // Violates our business requirements,
// but just for testing purposes
IsolationLevel.RepeatableRead
// IsolationLevel.Serializable
;
Shortly after the start of the load test, our application drowns in exceptions due to deadlocks.
System.InvalidOperationException: An exception has been raised that is likely due to a
transient failure. Consider enabling transient error resiliency by adding
'EnableRetryOnFailure()' to the 'UseSqlServer' call.
--> Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while updating
the entries. See the inner exception for details.
--> Microsoft.Data.SqlClient.SqlException (0x80131904): Transaction (Process ID 102)
was deadlocked on lock resources with another process and has been chosen as the
deadlock victim. Rerun the transaction.
Most of the requests ended up with the status code 500
, and the performance turned out very bad. Sure, we could try to catch the exception and retry or activate EnableRetryOnFailure()
as suggested in the error message, but it will make everything worse.
> hey -n 10000 -c 50 -m POST http://localhost:5000/demo/06D6E029-320B-4347-9E47-03CCFF153652
Summary:
Total: 56.0751 secs
Slowest: 8.3635 secs
Fastest: 0.0957 secs
Average: 0.2801 secs
Requests/sec: 178.3322
Status code distribution:
[200] 400 responses
[500] 9600 responses
Isolation Level: Serializable
The isolation level Serializable
is the strictest one, so it may help us to prevent the deadlocks – but it did not. We still get as many deadlocks as with RepeatableRead
.
> hey -n 10000 -c 50 -m POST http://localhost:5000/demo/06D6E029-320B-4347-9E47-03CCFF153652
Summary:
Total: 55.2665 secs
Slowest: 7.5805 secs
Fastest: 0.0133 secs
Average: 0.2756 secs
Requests/sec: 180.9414
Status code distribution:
[200] 401 responses
[500] 9599 responses
Table Hints
The table hints aren’t supported by Entity Framework by default, i.e., we have to implement this feature ourselves or use a 3rd party library. In this article, I use the library Thinktecture.EntityFrameworkCore.SqlServer.
We change the LINQ query so that the Product
is loaded with a UpdLock
and RowLock
, and retry the load test.
var product = await _ctx.Products
.WithTableHints(SqlServerTableHint.UpdLock,
SqlServerTableHint.RowLock)
.FirstAsync(p => p.Id == id);
The output of the load test shows that the application is stable now; we have no deadlocks anymore. The throughput is not as high as with IsolationLevel.ReadCommitted
and without table hints, but it is the worst-case scenario we have here. In production, we won’t be hammering on the one and only record over and over again but use different records.
> hey -n 10000 -c 50 -m POST http://localhost:5000/demo/06D6E029-320B-4347-9E47-03CCFF153652
Summary:
Total: 4.6708 secs
Slowest: 0.0446 secs
Fastest: 0.0160 secs
Average: 0.0233 secs
Requests/sec: 2140.9711
Status code distribution:
[200] 10000 responses
Bonus: Compiled Queries
In this section, we will replace the only LINQ query with a compiled query.
[Route("[controller]")]
public class DemoController : Controller
{
// definition and "caching" of the compiled query
private static readonly Func> _loadProductAsync
= EF.CompileAsyncQuery((DemoDbContext ctx, Guid id)
=> ctx.Products
.WithTableHints(SqlServerTableHint.UpdLock, SqlServerTableHint.RowLock)
.First(p => p.Id == id));
...
[HttpPost("{id}")]
public async Task DoAsync(Guid id)
{
await using var tx = await _ctx.Database.BeginTransactionAsync(_ISOLATION_LEVEL);
// replacement of the LINQ-query with compiled query
var product = await _loadProductAsync(_ctx, id);
// do more or less complex stuff
product.LastProductAccess = DateTime.UtcNow;
await _ctx.SaveChangesAsync();
await tx.CommitAsync();
}
The results are not bad. We almost got 500 requests per second more, for free.
>hey -n 10000 -c 50 -m POST http://localhost:5000/demo/06D6E029-320B-4347-9E47-03CCFF153652
Summary:
Total: 3.8418 secs
Slowest: 0.0295 secs
Fastest: 0.0091 secs
Average: 0.0192 secs
Requests/sec: 2602.9580
Status code distribution:
[200] 10000 responses
Summary
The table hints may have been forgotten when working with Entity Framework. Still, they shouldn’t be underestimated when it comes to performance and stability of the application.