Do you understand a data type change = "Data Motion Scripts"?

Last updated by Github-actions[bot] about 1 month ago.See history

Scripting out a schema change is easy, worrying about data is not. "'Data motion" refers to a change in the meaning of data, which will require scripts which touch data and schema.

Let's look at an example:

We have a 'Gender' column (that is a Boolean) storing 0's and 1's. All works well for a while.

TableBit
Figure: Anything wrong this Gender column?

Later you learn you need to change the data type to char(2) to support 'MA', 'FE', 'NB' and 'NA'

CasterSemenya
Figure: Caster Semenya was the first to teach us a thing or two about the right data type for Gender

The data then must be migrated to the new data type this way:

Rename 'Gender' to 'ztGender' Add a new column 'Gender' with type char(2) Insert the existing data from 'ztGender' to 'Gender' (map 0 to 'F' and 1 to 'M') Delete the column ztGender

TableChar
Figure: Changing the data type and data required a "Data Motion Script"

Note: zt stands for Temporary.

Visual Studio does not automatically support this scenario, as data type changes are not part of the refactoring tools. However, if you add pre and post scripting events to handle the data type change the rest of the changes are automatically handled for you.

microsoft schema compare 1710232021939
Figure: Don't use Visual Studio Schema Compare Tool (aka Data Dude)

Note: In order to achieve this you must use the built in Refactor tools as it create a log of all the refactors in order. This helps Visual Studio generate the schema compare and make sure no data is lost.

There are few options available to perform data type change correctly:

  1. Use manual scripts. All data type changes including data migration can be performed by writing scripts manualy. This way you have full control over the change. It is recommended to use:

    • DbUp to automate script deployment and keep track of all database changes.
  2. Use Database Project. As mentioned above, Visual Studio does not support data type changes out of the box and should not be used to perform this kind of task.
  3. Use Entity Framework (EF) Code First Migrations. If your application uses Entity Framework Code First, then it is strongly recommended to use Migrations feature.

Using EF Code First Migrations is comparable to using one of the below combinations:

public partial class GenderToString : DbMigration
{
   public override void Up()
   {
      AlterColumn("dbo.Customers", "Gender", c => c.String(maxLength: 2));
   }
        
   
   public override void Down()
   {
       AlterColumn("dbo.Customers", "Gender", c => c.Boolean(nullable: false));
   }
}

Bad Example - the default scaffolded migration will not perform any mapping of your data\

public partial class GenderToString : DbMigration
{
   public override void Up()
   {
     AddColumn("dbo.Customers", "GenderTemp", c => c.Boolean(nullable: false));
     Sql("UPDATE [dbo].[Customers] set GenderTemp = Gender");
     DropColumn("dbo.Customers", "Gender");
     AddColumn("dbo.Customers", "Gender", c => c.String(maxLength: 2));
     Sql("UPDATE [dbo].[Customers] set Gender = 'MA' where GenderTemp=1");
     Sql("UPDATE [dbo].[Customers] set Gender = 'FE' where GenderTemp=0");
     DropColumn("dbo.Customers", "GenderTemp");
   }
}

Good Example - Data motion with EF Migrations

We open source. Powered by GitHub