It has been a while since I released my article about the usage of temp tables in Entity Framework (v6). Meanwhile, Microsoft has released a completely rewritten version of its O/R mapper so my old approach is no longer applicable. But before we learn about a new one, let us think about what we might need temp tables for.

Potential issues without using temp tables

The usage of temp tables is beneficial if we have to work with a lot of data and/or with multiple databases, particularly with different types of DbContext. One of the most common use cases is when we have to load data like Products with specific identifiers.

// could contain hundreds or thousands of product ids
List<Guid> productIds = ...;
DemoDbContext ctx = ...;

var products = await ctx.Products
                        .Where(p => productIds.Contains(p.Id))
                        .ToListAsync();

Without temp tables there is not much choice but to use the method Contains that sends all productIds to the database. The corresponding SQL statement looks like this:

SELECT *
FROM Products
WHERE Id IN 
(
    'a1204953-d5e5-477c-a5d3-00111e810bcd', 
    ...,
    '880d3f31-2184-4566-9b63-1a762dcc5c13'
)

With a few dozen IDs, this approach performs very well, but with hundreds or thousands of values, the performance starts to degrade. Furthermore, it is not uncommon that the productIds are used in more than one query, for example, to load Products and OrderItems.

List<Guid> productIds = ...;

// 1st usage of "productIds"
var products = await ctx.Products
                        .Where(p => productIds.Contains(p.Id))
                        .ToListAsync();

// 2nd usage of "productIds"
var orderItems = await ctx.OrderItems
                          .Where(p => productIds.Contains(p.ProductId))
                          .ToListAsync();

Sending a large SQL statement to the database multiple times and having it parsed each time is certainly an issue, but likely not the biggest one. Depending on the particular query, the database may choose a suboptimal execution plan due to the lack of information about the data. One wrong decision by the database may lead to a chain reaction (see: Scanning of the data source is just the tip of the iceberg) that can cause considerable load on the database server.

Benefits of temp tables

One benefit of temp tables is that we do not need to send the same data, like productIds, to the database multiple times when using it in multiple queries. Furthermore, we may choose a more performant way to send the identifiers to the database, such as SqlBulkCopy in the case of MS SQL Server. Last but not least, we can provide the database with more information about the data, so it can choose a better execution plan.
In our case, we can insert all productIds into a temp table and create a unique (clustered) index. This tells the database that the records are unique and ordered. With unique, ordered data, the database should be able to choose the best operations to process the request.

The previous example may not yield the best performance when using built-in features only, but at least it is supported by Entity Framework Core. However, working with a collection of simple values like productIds is not the same as working with a collection of tuples, because we cannot use Contains() anymore and need the method Join() instead.
A concrete example would be loading OrderItems for specific customers and products. The following LINQ query symbolizes our intent but throws an InvalidOperationException at runtime because a Join of an EF-query with an in-memory collection is not supported.

List<(Guid CustomerId, Guid ProductId)> customerProductTuples = ...;

var orderItems = await ctx.OrderItems
                          //.AsEnumerable()
                          .Join(customerProductTuples,
                                i => new { i.Order.CustomerId, i.ProductId },
                                t => new { t.CustomerId, t.ProductId },
                                (i, t) => i)
                          .ToListAsync();

Sure, we could add AsEnumerable() before Join(), but that will load all OrderItems from the database, which is not an option, especially with big tables.

Adding support for temp tables

Adding support for temp tables to Entity Framework Core can be divided into three parts: introduction of temp tables to Entity Framework Core, creation of the temp tables, and inserting records. The first part does not depend on the concrete database vendor, so it applies to MS SQL Server, SQLite, MySQL, and others. The last two parts are vendor-specific, especially the (bulk) insert of data, so each database requires a specific implementation.

In the following, I will show you the key elements required to add temp table support to Entity Framework Core 3.1. By the end of this article, we will have a fully functional prototype. If you need more detailed information, feel free to look into the real code. The sources can be found in Azure DevOps: Thinktecture.EntityFrameworkCore.

We will start with the easier part, the introduction of temp tables to Entity Framework Core.

Using temp tables in queries

A temp table has to be introduced to Entity Framework Core before it can be used in queries. For that, we go to OnModelCreating() of the corresponding DbContext and configure a new entity. The entity may have any number of columns you need but in this example we will use just 1 column of the type Guid.

public class MyTempTable
{
   public Guid Id { get; set; }
}

public class DemoDbContext : DbContext
{
   ...
   
   protected override void OnModelCreating(ModelBuilder modelBuilder)
   {
      ...

      modelBuilder.Entity<MyTempTable>()
                  .HasNoKey()
                  .ToView("#MyTempTable");
   }
}

The entity MyTempTable is configured as keyless because (1) I do not want to decide yet whether the column Id will be unique or not, and (2) it does not really matter for temp tables. Furthermore, the entity is configured as a view, so EF migrations ignore it. These configuration settings are not essential. The only line that matters is modelBuilder.Entity<MyTempTable>().

To get a reference to an IQueryable<MyTempTable>, we use the method Set<T> defined in DbContext. Additionally, I select the column Id right away because the helper class MyTempTable is of no interest.

public class DemoDbContext : DbContext
{
   ...
   
   public IQueryable<Guid> MyTempTable => Set<MyTempTable>()
                                               .Select(t => t.Id);
}

Please note: In this simplified example, I am using the fixed table name #MyTempTable. This limits us to one temp table per database connection. In the real code, I add a suffix (#MyTempTable_1, #MyTempTable_2, etc.) so the names are unique per connection.
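One way to get names that are unique per connection is to keep track of the names currently in use and pick the first free suffix. The following is only a minimal sketch of that idea, not the code from the real library. The helper NextTempTableName and the set namesInUse are hypothetical, and the set is assumed to be tracked per database connection by the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the first name of the form "#<baseName>_<n>"
// that is not in use yet and marks it as used.
// "namesInUse" is assumed to be tracked per database connection by the caller.
static string NextTempTableName(string baseName, ISet<string> namesInUse)
{
   for (var suffix = 1; ; suffix++)
   {
      var candidate = $"#{baseName}_{suffix}";

      // ISet<T>.Add returns false if the name is already present
      if (namesInUse.Add(candidate))
         return candidate;
   }
}

// usage
var namesInUse = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
var first = NextTempTableName("MyTempTable", namesInUse);  // #MyTempTable_1
var second = NextTempTableName("MyTempTable", namesInUse); // #MyTempTable_2
```

When a temp table is dropped, its name would have to be removed from the set again so the suffix can be re-used.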

For a generic approach the access to corresponding IQueryable<T> would be:

public IQueryable<Guid> MyTempTable 
            => Set<MyTempTable>()
                   // escapedTableName: placeholder for the delimited temp table name
                   .FromSqlRaw($"SELECT * FROM {escapedTableName}")
                   .Select(t => t.Id);

After configuration of the new temp table, we can use it like any other entity.

var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();

The LINQ query above produces the following SQL statement.

SELECT *
FROM Products
WHERE Id IN (SELECT Id FROM #MyTempTable)

Looks good so far, apart from the exception SqlException: Invalid object name '#MyTempTable', which is thrown because there is no #MyTempTable yet.

Creation of the temp table

The creation of a table requires some manual work, but the Model of Entity Framework Core will help us. First, we need to implement a method that generates the required SQL. We do this inside our DemoDbContext. For simplicity, we skip some column properties, like the DEFAULT-value, that are not necessary in our example.

private string GetCreateTableSql(
   bool createPk)
{
   var sqlGenHelper = this.GetService<ISqlGenerationHelper>();

   var entity = Model.FindEntityType(typeof(MyTempTable));
   var tableName = entity.GetTableName();
   var escapedTableName = sqlGenHelper.DelimitIdentifier(tableName);

   var idProperty = entity.FindProperty(nameof(MyTempTable.Id));
   var columnName = idProperty.GetColumnName();
   var escapedColumnName = sqlGenHelper.DelimitIdentifier(columnName);
   var columnType = idProperty.GetColumnType();
   var nullability = idProperty.IsNullable ? "NULL" : "NOT NULL";

   var pkSql = createPk ? $", PRIMARY KEY ({escapedColumnName})" : null;

   var sql = $@"
CREATE TABLE {escapedTableName}
(
   {escapedColumnName} {columnType} {nullability}
   {pkSql}
);";
   return sql;
}

For a more generic approach, we could iterate over all properties to generate SQL for all columns. The Model provides us with all the necessary information.
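To illustrate the generic variant, here is a minimal sketch that builds the statement for any number of columns. It is deliberately independent of Entity Framework Core: the column tuples stand in for the metadata that GetCreateTableSql reads from the Model, and the identifiers are assumed to be escaped already. The helper BuildCreateTableSql and the table name #CustomerProducts are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds a CREATE TABLE statement for any number of columns.
// The column tuples stand in for the metadata taken from the EF Core model
// (escaped column name, column type, nullability).
static string BuildCreateTableSql(
   string escapedTableName,
   IReadOnlyList<(string EscapedName, string Type, bool IsNullable)> columns,
   IReadOnlyList<string> escapedPkColumns)
{
   var lines = columns
      .Select(c => $"   {c.EscapedName} {c.Type} {(c.IsNullable ? "NULL" : "NOT NULL")}")
      .ToList();

   // the primary key is optional, same as "createPk" in GetCreateTableSql
   if (escapedPkColumns.Count > 0)
      lines.Add($"   PRIMARY KEY ({string.Join(", ", escapedPkColumns)})");

   return $"CREATE TABLE {escapedTableName}\n(\n{string.Join(",\n", lines)}\n);";
}

// usage: a temp table for (CustomerId, ProductId) tuples
var createSql = BuildCreateTableSql(
   "[#CustomerProducts]",
   new[]
   {
      ("[CustomerId]", "uniqueidentifier", false),
      ("[ProductId]", "uniqueidentifier", false)
   },
   new[] { "[CustomerId]", "[ProductId]" });
```

A two-column table like this is what the tuple example with customers and products from the beginning of the article would need.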

What is left is the implementation of a method that executes the SQL statement. But before the execution, we must open the connection and keep it open. If the connection is closed, Entity Framework Core will open the connection, execute the SQL, and close the connection again, which drops the temp table.

Please note that OpenConnectionAsync will not throw an exception if the connection is already open. In this case, Entity Framework Core only increments an internal counter. The (real) database connection is closed when the counter drops to 0.

public async Task CreateMyTempTableAsync(
   bool createPk,
   CancellationToken cancellationToken = default)
{
   var sql = GetCreateTableSql(createPk);

   await Database.OpenConnectionAsync(cancellationToken);

   try
   {
      await Database.ExecuteSqlRawAsync(sql, cancellationToken);
   }
   catch (Exception)
   {
      Database.CloseConnection();
      throw;
   }
}

Let us try out everything we have implemented so far. The LINQ query does not throw any exceptions anymore, so the temp table has been created on the database successfully.

await ctx.CreateMyTempTableAsync(true);

var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();

The SQL statement of CreateMyTempTableAsync is:

CREATE TABLE [#MyTempTable]
(
   [Id] uniqueidentifier NOT NULL
   , PRIMARY KEY ([Id])
)

Let's insert some data into #MyTempTable.

(Bulk) Insert into temp table

The implementation of the bulk insert into a (temp) table is the part that differs the most between database vendors. With MS SQL Server, we could use SqlBulkCopy; with SQLite, we may prepare a SqliteCommand and re-use it; or we build an INSERT statement with multiple VALUES to save round trips to the database, etc.
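As an illustration of the last option, the following minimal sketch builds a single INSERT statement with one VALUES row per identifier. It is not part of the prototype. The helper BuildInsertSql is hypothetical, the identifiers are assumed to be escaped already, and the parameter names @p0, @p1, etc. are an assumption. The caller would have to add the returned values to the database command under the same names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds one INSERT statement with a VALUES row per id.
// The values are passed as parameters (@p0, @p1, ...) instead of literals.
static (string Sql, object[] Parameters) BuildInsertSql(
   string escapedTableName,
   string escapedColumnName,
   IReadOnlyList<Guid> ids)
{
   if (ids.Count == 0)
      throw new ArgumentException("At least one id is required.", nameof(ids));

   // one row per id: (@p0), (@p1), ...
   var rows = ids.Select((_, index) => $"(@p{index})");
   var sql = $"INSERT INTO {escapedTableName} ({escapedColumnName}) VALUES {string.Join(", ", rows)};";

   return (sql, ids.Cast<object>().ToArray());
}

// usage
var ids = new[] { Guid.NewGuid(), Guid.NewGuid(), Guid.NewGuid() };
var (insertSql, insertParameters) = BuildInsertSql("[#MyTempTable]", "[Id]", ids);
// insertSql: INSERT INTO [#MyTempTable] ([Id]) VALUES (@p0), (@p1), (@p2);
```

Databases limit the number of parameters per command, so a larger collection would have to be split into several batches.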

For this prototype, I will use SqlBulkCopy. To get the best performance, we need to implement IDataReader. SqlBulkCopy will use the data reader to iterate over the productIds. Our implementation will not implement all methods and properties of the interface, but just the few that are used by SqlBulkCopy.

public class MyTempTableDataReader : IDataReader
{
   private readonly IEnumerator<Guid> _enumerator;

   public int FieldCount => 1;

   public MyTempTableDataReader(
      IEnumerable<Guid> values)
   {
      _enumerator = values.GetEnumerator();
   }

   public bool Read()
   {
      return _enumerator.MoveNext();
   }

   public object GetValue(int i)
   {
      if (i == 0)
         return _enumerator.Current;

      throw new ArgumentOutOfRangeException();
   }

   public void Dispose()
   {
      _enumerator.Dispose();
   }

   // all other members throw NotImplementedException
   public int Depth => throw new NotImplementedException();
   public object this[int i] => throw new NotImplementedException();
   public bool GetBoolean(int i) => throw new NotImplementedException();
   ...
}

Next, we have to configure the instance of SqlBulkCopy.

private SqlBulkCopy GetSqlBulkCopy()
{
   var sqlGenHelper = this.GetService<ISqlGenerationHelper>();

   var sqlCon = (SqlConnection)Database.GetDbConnection();
   var sqlTx = (SqlTransaction?)Database.CurrentTransaction?.GetDbTransaction();

   var entity = Model.FindEntityType(typeof(MyTempTable));
   var tableName = entity.GetTableName();
   var escapedTableName = sqlGenHelper.DelimitIdentifier(tableName);

   var idProperty = entity.FindProperty(nameof(MyTempTable.Id));
   var idColumnName = idProperty.GetColumnName();

   return new SqlBulkCopy(sqlCon, SqlBulkCopyOptions.Default, sqlTx)
          {
             DestinationTableName = escapedTableName,
             ColumnMappings =
             {
                new SqlBulkCopyColumnMapping(0, idColumnName)
             }
          };
}

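Finally, we open the connection, create the MyTempTableDataReader, push all data to the database, and close the connection again. Because CreateMyTempTableAsync has left the connection open, CloseConnection only decrements the internal counter here, so the temp table is not dropped.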

public async Task BulkInsertIntoMyTempTableAsync(
   IEnumerable<Guid> values,
   CancellationToken cancellationToken = default)
{
   using var bulkCopy = GetSqlBulkCopy();
   await Database.OpenConnectionAsync(cancellationToken);

   try
   {
      using var reader = new MyTempTableDataReader(values);
      await bulkCopy.WriteToServerAsync(reader, cancellationToken);
   }
   finally
   {
      Database.CloseConnection();
   }
}

Now our prototype is complete, and the example behaves as expected!

List<Guid> productIds = ...;

await ctx.CreateMyTempTableAsync(true);
await ctx.BulkInsertIntoMyTempTableAsync(productIds);

var products = await ctx.Products
                        .Where(p => ctx.MyTempTable.Contains(p.Id))
                        .ToListAsync();

Summary

In this article, we looked at an approach to creating and using temp tables with Entity Framework Core 3.1. Interestingly enough, the bulk insert was the most elaborate part because Entity Framework Core cannot help us much besides providing metadata about the table.
To improve the prototype, we may implement some (project-specific) convenience methods, so the temp tables' usage goes almost unnoticed.

If you have any ideas, suggestions, or questions, please send me an email at pawel.gerr@thinktecture.com!
