Fetching one record from a collection using navigational properties in Entity Framework may lead to unnecessary JOINs. To show the problem we need two tables Products and Prices.

EF Blog - Redundant Joins - DB Schema

The query shown below is fetching products along with their first price.

var products = ctx.Products
      .Select(p => new
      {
          p.Name,
          FirstPriceStartdate = p.Prices
                                 .OrderBy(price => price.Startdate)
                                 .FirstOrDefault().Startdate,
          FirstPriceValue = p.Prices
                             .OrderBy(price => price.Startdate)
                             .FirstOrDefault()
                             .Value,
      })
      .ToList();

Looks simple.
Lets look at the SQL statement or rather the execution plan.

EF Blog - Redundant Joins - Before Subselect

The table Prices is JOINed twice because of the two occurrences of the expression p.Prices.OrderBy(...).FirstOrDefault(). The Entity Framework doesn't recognize that these expressions are identical but we can help. Just use a sub-select.

var products = ctx.Products
       .Select(p => new
       {
           Product = p,
           FirstPrice = p.Prices
                         .OrderBy(price => price.Startdate)
                         .FirstOrDefault()
       })
      .Select(p => new
      {
          p.Product.Name,
          FirstPriceStartdate = p.FirstPrice.Startdate,
          FirstPriceValue = p.FirstPrice.Value,
      })
      .ToList();

That's it, the table Prices is JOINed only once now.

EF Blog - Redundant Joins - After Subselect

Having a complex query you may need multiple sub-select to select a navigational property of another navigational property. But in this case please write an email to your colleagues or a comment so the other developers understand what's going on otherwise your funny looking query will be refactored pretty soon :)

Related Articles

 | Pawel Gerr

Implementing database access with Entity Framework is pretty convenient, but sometimes the query performance can be very poor. Especially using navigational properties to load collections leads to significantly longer execution times and more I/O. To see the impact of the loading…

Read article
 | Pawel Gerr

Whether in an application in production or in integration tests on CI, the deadlocks are virtually unavoidable. The behavior of a deadlock depends on the database (MS SQL Server, MySQL, etc)  and the isolation level (like Snapshot Isolation). Some of the databases are blocking…

Read article
 | Pawel Gerr

There are some use cases that force me to use raw T-SQL instead of using LINQ. One of them is the need for ROW_NUMBER. But not anymore... One again, due to great work of Entity Framework (EF) team we are able to add more capabilities to EF quite easily. At the moment there are…

Read article