Do Not Waste Performance By Not Using Temp Tables With Entity Framework Core

It has been a while since I released my article about the usage of temp tables in Entity Framework (v6). Meanwhile, Microsoft has released a completely rewritten version of its O/R mapper so my old approach is no longer applicable. But before we learn about a new one, let us think about what we might need temp tables for.

In diesem Artikel:

Do Not Waste Performance By Not Using Temp Tables With Entity Framework Core
Pawel Gerr ist Architekt und Consultant bei Thinktecture. Er hat sich auf .NET Core Backends spezialisiert und kennt Entity Framework von vorne bis hinten.

Potential issues without using temp tables

The usage of temp tables is beneficial if we have to work with a lot of data and/or with multiple databases, particularly with different types of DbContext. One of the most common use cases is when we have to load data like Products with specific identifiers.

				
					// could contain hundreds or thousands of product ids
List<Guid> productIds = ...;
DemoDbContext ctx = ...;

var products = await ctx.Products
                        .Where(p => productIds.Contains(p.Id))
                        .ToListAsync();
				
			

Without temp tables there is not much choice but to use the method Contains that sends all productIds to the database. The corresponding SQL statement looks like this:

				
					SELECT *
FROM Products
WHERE Id IN 
(
    'a1204953-d5e5-477c-a5d3-00111e810bcd', 
    ...,
    '880d3f31-2184-4566-9b63-1a762dcc5c13'
)
				
			

With a few dozen IDs, this approach performs very well, but with hundreds or thousands of values, the performance starts to degrade. Furthermore, it is not uncommon that the productIds are used in more than one query, for example, to load Products and OrderItems.

				
					List<Guid> productIds = ...;

// 1st usage of "productIds"
var products = await ctx.Products
                        .Where(p => productIds.Contains(p.Id))
                        .ToListAsync();

// 2nd usage of "productIds"
var orderItems = await ctx.OrderItems
                          .Where(p => productIds.Contains(p.ProductId))
                          .ToListAsync();
				
			

Multiple transmissions of a large SQL statement to the database and its parsing is certainly an issue, but likely not the biggest one. Depending on the particular query, the database may decide on an execution plan that is not the ideal one, due to the lack of information about the data. One wrong decision by the database may lead to a chain reaction (see: Scanning of the data source is just the tip of the iceberg) that can cause considerable load on the database server.

Benefits of temp tables

One benefit of the temp tables is that we do not need to send the same data like productIds multiple times to the database when using it in multiple queries. Furthermore, we may choose a more performant approach to send the identifiers to the database like using SqlBulkCopy in case of MS SQL Server. Last but not least, we can provide the database with more information about the data, so the database can to choose a better execution plan.
In our case, we can insert all productIds into a temp table and create a unique (clustered) index. By this, the database receives important information, so the records are unique and ordered. With unique, ordered data, the database should be able to choose the best operations to process the request.

The previous example may not yield the best performance when using built-in features only, but it is supported by Entity Framework Core. However, working with a collection of simple values like productIds is not the same as having a collection of tuples because we can not use Contains() anymore but need the method Join().
A concrete example would be loading OrderItems for specific customers and products. The following LINQ query symbolizes our intent but throws an InvalidOperationException at runtime because a Join of an EF-query with an in-memory collection is not supported.

				
					List<(Guid CustomerId, Guid ProductId)> customerProductTuples = ...;

var orderItems = await ctx.OrderItems
                          //.AsEnumerable()
                          .Join(customerProductTuples,
                                i => new { i.Order.CustomerId, i.ProductId },
                                t => new { t.CustomerId, t.ProductId },
                                (i, t) => i)
                          .ToListAsync();
				
			

Sure, we could add AsEnumerable() before Join(), but that will load all OrderItems from the database, which is not an option, especially with big tables.

Adding support for temp tables

Adding support for temp tables to Entity Framework Core can be divided into three parts: introduction of temp tables to Entity Framework Core, creation of the temp tables, and inserting records. The first part does not depend on the concrete vendor of the database, so it applies to the MS SQL Server, SQLite, MySQL, and further. The last two parts are vendor-specific, especially the (bulk) insert of data, so each database requires a specific implementation.

Moreover, I will show you some key elements required to add temp table support to Entity Framework Core 3.1. By the end of this article, we will have a fully functional prototype. If you need more detailed information, feel free to look into the real code. The sources can be found in Azure DevOps: Thinktecture.EntityFrameworkCore.

We will start with the easier part, the introduction of temp tables to Entity Framework Core.

Using temp tables in queries

A temp table has to be introduced to Entity Framework Core before it can be used in queries. For that, we go to OnModelCreating() of the corresponding DbContext and configure a new entity. The entity may have any number of columns you need but in this example we will use just 1 column of the type Guid.

				
					public class MyTempTable
{
   public Guid Id { get; set; }
}

public class DemoDbContext : DbContext
{
   ...
   
   protected override void OnModelCreating(ModelBuilder modelBuilder)
   {
      ...

      modelBuilder.Entity<MyTempTable>()
                  .HasNoKey()
                  .ToView("#MyTempTable");
   }
}
				
			

The entity MyTempTable is created as keyless because (1), I do not want to decide yet, whether the column Id will be unique or not, and (2), it does not really matter for temp tables. Furthermore, the entity is configured as a View, so the EF migrations ignores it. The mentioned configuration parameters are not essential. The only line that matters is modelBuilder.Entity<MyTempTable>().

To get a reference to an IQueryable<MyTempTable>, we use the method Set<T> defined in DbContext. Additionally, I select the column Id right away because the helper class MyTempTable is of no interest.

				
					public class DemoDbContext : DbContext
{
   ...
   
   public IQueryable<Guid> MyTempTable => Set<MyTempTable>()
                                               .Select(t => t.Id);
}
				
			

Please note: In this simplified example, I am using the fixed table name #MyTempTable. This limits us to one temp table per database connection. In the real code, I add a suffix (#MyTempTable_1#MyTempTable_2, etc.) so the names are unique per connection.

For a generic approach the access to corresponding IQueryable<T> would be:

				
					public IQueryable<Guid> MyTempTable 
            => Set<MyTempTable>()
                   .FromSqlRaw($"SELECT * FROM {escaped-table-name}")
                   .Select(t => t.Id);
				
			

After configuration of the new temp table, we can use it like any other entity. 

				
					var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();
				
			

The LINQ query above produces the following SQL statement.

				
					SELECT *
FROM Products
WHERE Id IN (SELECT Id FROM #MyTempTable)
				
			

Looks good so far, despite the exception SqlException: Invalid object name '#MyTempTable' because there is no #MyTempTable yet.

Creation of the temp table

 

The creation of a table requires some manual work, but the Model of Entity Framework Core will help us. First, we need to implement a method that generates the required SQL. We do this inside our DemoDbContext. For simplicity, we skip some column properties, like the DEFAULT-value, that are not necessary in our example.

				
					private string GetCreateTableSql(
   bool createPk)
{
   var sqlGenHelper = this.GetService<ISqlGenerationHelper>();

   var entity = Model.FindEntityType(typeof(MyTempTable));
   var tableName = entity.GetTableName();
   var escapedTableName = sqlGenHelper.DelimitIdentifier(tableName);

   var idProperty = entity.FindProperty(nameof(MyTempTable.Id));
   var columnName = idProperty.GetColumnName();
   var escapedColumnName = sqlGenHelper.DelimitIdentifier(columnName);
   var columnType = idProperty.GetColumnType();
   var nullability = idProperty.IsNullable ? "NULL" : "NOT NULL";

   var pkSql = createPk ? $", PRIMARY KEY ({escapedColumnName})" : null;

   var sql = $@"
CREATE TABLE {escapedTableName}
(
   {escapedColumnName} {columnType} {nullability}
   {pkSql}
);";
   return sql;
}
				
			

For a more generic approach, we could iterate over all properties to generate SQL for all columns. The Model provides us with all the necessary information.

What is left is the implementation of a method that executes the SQL statement. But before the execution, we must open the connection and keep it open. If the connection is closed, the Entity Framework Core will open the connection, execute the SQL, and close the connection again, which drops the temp table.

Please note that OpenConnectionAsync will not throw an exception if the connection is open already, in this case, the Entity Framework Core increments an internal counter only. The (real) database connection is going to be closed if the counter drops to 0.

				
					public async Task CreateMyTempTableAsync(
   bool createPk,
   CancellationToken cancellationToken = default)
{
   var sql = GetCreateTableSql(createPk);

   await Database.OpenConnectionAsync(cancellationToken);

   try
   {
      await Database.ExecuteSqlRawAsync(sql, cancellationToken);
   }
   catch (Exception)
   {
      Database.CloseConnection();
      throw;
   }
}
				
			

Let us try out everything we have implemented so far. The LINQ query does not throw any exceptions anymore, so the temp table has been created on the database successfully.

				
					await ctx.CreateMyTempTableAsync(true);

var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();
				
			

The SQL statement of CreateMyTempTableAsync is:

				
					CREATE TABLE [#MyTempTable]
(
   [Id] uniqueidentifier NOT NULL
   , PRIMARY KEY ([Id])
)
				
			

Let’s insert some data into #MyTempTable.

(Bulk) Insert into temp table

 

The implementation of the bulk insert into a (temp) table differs the most depending on the database vendors. With MS SQL Server, we could use SqlBulkCopy, with SQLite we may Prepare a SqliteCommand and re-use it, or we build an INSERT statement with multiple VALUES to save round trips to the database, etc.

For this prototype, I will use SqlBulkCopy and to get the best performance, and we need to implement IDataReaderSqlBulkCopy will use the data reader to iterate over the productIds. Our implementation will not implement all methods and properties of the interface, but just a few that are used by the SqlBulkCopy.

				
					public class MyTempTableDataReader : IDataReader
{
   private readonly IEnumerator<Guid> _enumerator;

   public int FieldCount => 1;

   public MyTempTableDataReader(
      IEnumerable<Guid> values)
   {
      _enumerator = values.GetEnumerator();
   }

   public bool Read()
   {
      return _enumerator.MoveNext();
   }

   public object GetValue(int i)
   {
      if (i == 0)
         return _enumerator.Current;

      throw new ArgumentOutOfRangeException();
   }

   public void Dispose()
   {
      _enumerator.Dispose();
   }

   // all other members throw NotImplementedException
   public int Depth => throw new NotImplementedException();
   public object this[int i] => throw new NotImplementedException();
   public bool GetBoolean(int i) => throw new NotImplementedException();
   ...
}
				
			

Next, we have to configure the instance of SqlBulkCopy.

				
					private SqlBulkCopy GetSqlBulkCopy()
{
   var sqlGenHelper = this.GetService<ISqlGenerationHelper>();

   var sqlCon = (SqlConnection)Database.GetDbConnection();
   var sqlTx = (SqlTransaction?)Database.CurrentTransaction?.GetDbTransaction();

   var entity = Model.FindEntityType(typeof(MyTempTable));
   var tableName = entity.GetTableName();
   var escapedTableName = sqlGenHelper.DelimitIdentifier(tableName);

   var idProperty = entity.FindProperty(nameof(MyTempTable.Id));
   var idColumnName = idProperty.GetColumnName();

   return new SqlBulkCopy(sqlCon, SqlBulkCopyOptions.Default, sqlTx)
          {
             DestinationTableName = escapedTableName,
             ColumnMappings =
             {
                new SqlBulkCopyColumnMapping(0, idColumnName)
             }
          };
}
				
			

In the end, we open the connection, create the MyTempTableDataReader, push all data to the database, and close the connection again.

				
					public async Task BulkInsertIntoMyTempTableAsync(
   IEnumerable<Guid> values,
   CancellationToken cancellationToken = default)
{
   using var bulkCopy = GetSqlBulkCopy();
   await Database.OpenConnectionAsync(cancellationToken);

   try
   {
      using var reader = new MyTempTableDataReader(values);
      await bulkCopy.WriteToServerAsync(reader, cancellationToken);
   }
   finally
   {
      Database.CloseConnection();
   }
}
				
			

Now our prototype is complete, and the example behaves as expected!

				
					List<Guid> productIds = ...;

await ctx.CreateMyTempTableAsync(true);
await ctx.BulkInsertIntoMyTempTableAsync(productIds);

var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();
				
			

Summary

In this article, we looked at an approach, how to create and use temp tables with Entity Framework Core 3.1. Interestingly enough, the bulk insert was the most elaborate part because Entity Framework Core cannot help us much besides providing metadata about the table.
To improve the prototype, we may implement some (project-specific) convenience methods, so the temp tables‘ usage goes almost unnoticed.

If you have any ideas, suggestions, or questions, please write me an email pawel.gerr@thinktecture.com!

Kostenloser
Newsletter

Aktuelle Artikel, Screencasts, Webinare und Interviews unserer Experten für Sie

Verpassen Sie keine Inhalte zu Angular, .NET Core, Blazor, Azure und Kubernetes und melden Sie sich zu unserem kostenlosen monatlichen Dev-Newsletter an.

Diese Artikel könnten Sie interessieren
.NET
Incremental Roslyn Source Generators in .NET 6: Adapt Code Generation Based on Project Dependencies – Part 5

Incremental Roslyn Source Generators in .NET 6: Adapt Code Generation Based on Project Dependencies – Part 5

The Roslyn Source Generator, implemented in the previous articles of the series, emits some C# code without looking at the dependencies of the current .NET (Core) project. In this article our DemoSourceGenerator should implement a JsonConverter, but only if the corresponding library (e.g. Newtonsoft.Json) is referenced by the project.
08.07.2022
Unterschiede
.NET
Blazor WebAssembly vs. Blazor Server – Welche Unterschiede gibt es und wann wähle ich was?

Blazor WebAssembly vs. Blazor Server – Welche Unterschiede gibt es und wann wähle ich was?

Das Blazor Framework von Microsoft gibt es inzwischen in drei "Geschmacksrichtungen". Die erste ist Blazor WebAssembly, die zweite Blazor Server, und zu guter Letzt gibt es noch Blazor Hybrid. In diesem Artikel wollen wir uns die zwei "echten", also Browser-basierten, Web-Anwendungs-Szenarien WebAssembly und Server anschauen.
04.07.2022
Three different textured walls
.NET
Dependency Injection Scopes in Blazor

Dependency Injection Scopes in Blazor

The dependency injection system is a big part of how modern ASP.NET Core works internally: It provides a flexible solution for developers to structure their projects, decouple their dependencies, and control the lifetimes of the components within an application. In Blazor - a new part of ASP.NET Core - however, the DI system feels a bit odd, and things seem to work a bit differently than expected. This article will explain why this is not only a feeling but indeed the case in the first place and how to handle the differences in order to not run into problems later on.
31.05.2022
.NET
Asynchrone Operationen: Blazor WebAssembly für Angular-Entwickler – Teil 5 [Screencast]

Asynchrone Operationen: Blazor WebAssembly für Angular-Entwickler – Teil 5 [Screencast]

Eine Webanwendung will natürlich auch mit Daten gefüttert werden. Doch diese müssen irgendwo her kommen. Nichts liegt näher als diese von einer Web API zu laden. Dieser Screencast zeigt, wie asynchrone Operationen in Blazor funktionieren und welche gravierenden Unterschiede es zu Angular gibt.
26.05.2022
.NET
Typings: Blazor WebAssembly für Angular-Entwickler – Teil 4 [Screencast]

Typings: Blazor WebAssembly für Angular-Entwickler – Teil 4 [Screencast]

C# und TypeScript entstammen der Feder der selben Person. Doch sind sie deshalb auch gleich? In diesem Teil der Screencast-Serie erfahren Sie, wie mit Typen in den beiden Programmiersprachen verfahren wird und welche Unterschiede es gibt.
19.05.2022
.NET
Bindings: Blazor WebAssembly für Angular-Entwickler – Teil 3 [Screencast]

Bindings: Blazor WebAssembly für Angular-Entwickler – Teil 3 [Screencast]

Wer Komponenten einsetzt, steht früher oder später vor der Fragestellung, wie man Daten an die Komponente übergibt oder auf Ereignisse einer Komponente reagiert. In diesem Screencast wird gezeigt wie Bindings bei Komponenten funktionieren, also wie eine Komponente Daten von außerhalb benutzen und Rückmeldung bei Aktionen geben kann.
12.05.2022