Auditing EF Code First Migrations

EF Code First is awesome. I cannot emphasize enough how useful, intuitive and overall amazing it is (though I tried to do just that in this post, so I won't start it again).

One problem that usually comes up against EF Code First is that you cannot handle changes of the model efficiently in the DB. This (poorly constructed) argument can be countered with introducing the opposition to EF Migrations: a set of tools that aims to do just that. If you don't know EF Migrations, you can learn more about it here.
Basically what you have to do is run the Enable-Migrations command, and then every time you want to create a new "snapshot" of your model you call Add-Migration. Finally, when you actually want to update the database, you call Update-Database.

A second argument that usually comes up is that OK, I have migrations, that there are still problems with it.

  • "I cannot call Update-Database when deploying to a production environment." True, and you shouldn't. What you should do is create a script from your migrations and apply that as part of the CI process (stay tuned, I have something coming up on this).
  • "Migrations are not easy to merge in team environments". Well, there is some truth to it. But there is a technique that you can use to merge migrations quite simply. And come on, have you ever tried to merge an edmx, or pure SQL-based changescripts? (Not to mention that this drawback of migrations goes away entirely in EF Core, so there).
  • "OK, so migrations are awesome. I can use them as part of my CI builds, they can be merged more easily then edmx or SQL. Still, I have no way of auditing who and when applies a migration to the database." Now this is unfortunately kind of true...

How tracking migrations works

Well, I'm not gonna explain all the details (check out the link above to see how it exactly works). For now it's enough to know that if you use Code First Migrations, a whole new table gets created in the database. It's called __MigrationHistory and it contains some basic info about the migrations (e.g. last run migration), but it does not store any timestamps or user info. And I do have to agree that this is kinda sad.

How tracking migrations should work

So we need to specify two more columns to our __MigrationHistory table:

  • MigratingUser: The user who created the migration. Defaults to CURRENT_USER (because we are on Azure SQL, of course).
  • MigrationDate: The timestamp of the migration. Defaults to GETDATE().

How tracking migrations should not work

I've seen many solutions over the years for this. Most of them have not been to my liking at all.

How tracking migrations could work

There is one solution that I almost like :)
So it turns out, that you can overwrite the migration handling.
You can create you own HistoryRow type, which is basically a migration history record:

public class AuditingHistoryRow : HistoryRow
{
  public string MigratingUser { get; set; }
  public DateTime MigrationDate { get; set; }
}

Then create a custom HistoryContext (notice the new in the IDbSet):

public class AuditingHistoryContext : HistoryContext
{
  public new IDbSet<AuditingHistoryRow> History { get; set; }
}

And then in the constructor of the migration configuration class, register this newly created type:

public Configuration()
{
  AutomaticMigrationsEnabled = false;
  SetHistoryContextFactory("System.Data.SqlClient", 
    (conn, schema) => new AuditingHistoryContext(conn, schema)
   );
}

And your done. Not that big of a deal, right? My only problem with this solution is that you cannot easily and elegantly specify the default values. As far as I know, the only way to specify a default value for a column in EF Code First, you have to use (guess what?) migrations. And I don't want to add a custom migration snapshot just so that I can handle this requirement. EF should be concerned with your business entities and business rules — and this is not a business rule. Migrations have nothing to do with business requirements, much less the auditing of migrations. You cannot create an EF Migration for something which is not part of your EF model, and the migrationhistory should never be a part of your model. So this solution could be a good one if you could specify a default value easily with the fluent configuration API (but again, as far as I know, you can't; but then again, I might be wrong).

How tracking migrations can work

So after this jibber-jabber about being all elegant and transparent and stuff, here's my solution.

Here's what we've established so far:

  • Migrations are code-based, but you can create SQL scripts from this code-based version.
  • We don't use the code-based version when deploying to a production environment. We use the scripts. So technically, it's enough to introduce the changes in the migration history auditing to the generated SQL.
  • The migration history should not be part of the model, so anything related to migrations should not be in the code-based migration. Thus it is truly enough to change the script generation (check out the default behavior: if you create an initial migration, the code-based migration has no info on migration history, but the generated SQL does).

So I have to modify the process of generating SQL from the migrations. And yes, you can do that.
So if you want to create scripts for Sql Server, then you have to inherit the SqlServerMigrationSqlGenerator class. This class has a number of methods that you can overwrite, depending on what type of code item you want to generate yourself. For example, if you want to overwrite the generation of tables (and that's what we want), you can create something like this:

public class AuditingMigrationSqlGenerator : SqlServerMigrationSqlGenerator
{
  protected override void Generate(CreateTableOperation createTableOperation)
  {
    // custom code to generate table goes here
  }
}

And if you have a custom generator like this, you can register it withing the cosntructor of your migrations configuration class:

public Configuration()
{
  AutomaticMigrationsEnabled = false;
  SetSqlGenerator("System.Data.SqlClient", new AuditingMigrationSqlGenerator(this));            
        }

Now what should go inside the generate method? Well, something like this:

  1. First, check if the table being generated is the migration history table.
  2. If the table being generated is the migration history table, add the two new columns.

So all in all, something like this:

public class AuditingMigrationSqlGenerator : SqlServerMigrationSqlGenerator
{

  protected override void Generate(CreateTableOperation createTableOperation)
  {
    if (IsHistoryTable(createTableOperation))
    {
      AddCustomHistoryTableColumns(createTableOperation);
    }
    base.Generate(createTableOperation);
  }
}

  private bool IsHistoryTable(CreateTableOperation createTableOperation)
  {
    // check if the table being created is the history table
  }
  private bool AddCustomHistoryTableColumns(CreateTableOperation createTableOperation)
  {
    // add the custom columns to the history table
  }

Adding custom columns to the table

Let's start with the easier one. This is actually pretty straightforward, you just have to mess around with the API a bit, but it's kinda-sorta intuitive:

private CreateTableOperation AddCustomHistoryTableColumns(CreateTableOperation createTableOperation)
{
  createTableOperation.Columns.Add(
     new ColumnModel(PrimitiveTypeKind.String)
     {
       Name = "MigratingUser",
       DefaultValueSql = "CURRENT_USER"
     });
  
  createTableOperation.Columns.Add(
     new ColumnModel(PrimitiveTypeKind.DateTime)
     {
       Name = "MigrationDate",
       DefaultValueSql = "GETDATE()"
     });
  return createTableOperation;
}

And that's it :) You have to create a ColumnModel, specify the type of the column, add a name, and see how easy it is to specify the default value? That's what we've been missing from the kinda-sorta OK approach.

Checking for the migration history table

And now for the hard part — checking for the migration history table. This is important, because as you can see above, the migration history table can be customized, e.g. its name in the database can be changed. So we have to do some magic to determine the name of the migration history table. If we have the name of the migration history table, then checking every table's name during generation is just a matter of name equality (see IsHistoryTable in the next code listing).
First of, we'll need the actual migration configuration class, so the sql generator should be modified as well:

public class AuditingMigrationSqlGenerator : SqlServerMigrationSqlGenerator
{
  private readonly DbMigrationsConfiguration config;
  private readonly string historyTableName;
  
  public AuditingMigrationSqlGenerator(DbMigrationsConfiguration config)
  {
    this.config = config;
    this.historyTableName = GetHistoryTableName();
  }

  private string GetHistoryTableName()
  {
    // get the history table name from config
  }

  private bool IsHistoryTable(CreateTableOperation createTableOperation)
  {
    return createTableOperation.Name == historyTableName;            
  }
  // other stuff with adding custom columns etc. is still here
}

And this changes the registration a bit:

public Configuration()
{
  AutomaticMigrationsEnabled = false;
  SetSqlGenerator("System.Data.SqlClient", new AuditingMigrationSqlGenerator(this));            
}

So let's try to find the history table. To do this, first you have to create a your DbContext. To do this, you have to use the migration configuration instance (that's why it is passed to the generator):

private string GetHistoryTableName()
{
  using (var originalCtx = (DbContext)Activator.CreateInstance(config.ContextType))
  {
    // other stuff
  }
}

If you have the original context and the migration configuration, then you can access the current history context like this:

private string GetHistoryTableName()
{
  using (var originalCtx = (DbContext)Activator.CreateInstance(config.ContextType))
  using (var histCtx = config.GetHistoryContextFactory("System.Data.SqlClient")(originalCtx.Database.Connection, null))
  {
    // other stuff
  }
}

Notice the using blocks. Technically, you don't need them, because at no point during my solution are any of the contexts actually opened. If you check the Connection property in them, you'll find that they are closed all the way. But better to be safe then sorry, I guess.
Now all you have to do is find the set in histCtx for HistoryRow, or any of its descendant types (keep in mind that you can still customize history entries with the other kinda-sorta OK solution). And this is where we enter black-magic territory. To see how to do this exactly, check out this blog post by Rowan Miller, product director of Entity Framework. I took his code and modified it a bit (his version checks for type equality, mine checks for type compatibility and I also add the schema name to the table).

private string GetHistoryTableName()
{
  using (var originalCtx = (DbContext)Activator.CreateInstance(config.ContextType))
  using (var histCtx = config.GetHistoryContextFactory("System.Data.SqlClient")(originalCtx.Database.Connection, null))
  {
    var metadata = ((IObjectContextAdapter)histCtx).ObjectContext.MetadataWorkspace;
    var objectItemCollection = ((ObjectItemCollection)metadata.GetItemCollection(DataSpace.OSpace));

    // Get the entity type from the model that maps to the CLR type
    var entityType = metadata
                   .GetItems<EntityType>(DataSpace.OSpace)
                   .Single(e => typeof(HistoryRow)   
        .IsAssignableFrom(objectItemCollection.GetClrType(e)));

   // Get the entity set that uses this entity type
   var entitySet = metadata
               .GetItems<EntityContainer>(DataSpace.CSpace)
               .Single()
               .EntitySets
               .Single(s => s.ElementType.Name == entityType.Name);

   // Find the mapping between conceptual and storage model for this entity set
   var mapping = metadata
                 .GetItems<EntityContainerMapping>(DataSpace.CSSpace)
                 .Single()
                 .EntitySetMappings
                 .Single(s => s.EntitySet == entitySet);

  // Find the storage entity set (table) that the entity is mapped
  var table = mapping
             .EntityTypeMappings.Single()
             .Fragments.Single()
             .StoreEntitySet;

  // Return the table name from the storage entity set
  return string.Join(".",
                 (string)table.MetadataProperties["Schema"].Value ?? "dbo",
                 (string)table.MetadataProperties["Table"].Value ?? table.Name
                   );
  }
}

And there you have it! The last piece of the puzzle. Now you can find the name of the migration history table in the database. During SQL generation, your custom SQL generator checks every time when a table is being generated, whether that table is the migration table (whose name it already has by that point), and if it is, just tweaks the table definition a little bit before handing it over to the built-in base logic.
Go ahead and check it out. If you create your initial migration with Add-Migration, and then use Update-Database -Script, you'll see your new and improved migration history table.
You can check out the whole solution on Github.