Have you ever tried to add a new non-nullable ForeignKey field to an existing Django model? If so, you were probably prompted to provide a default value that existing rows should use for the new ForeignKey. But what should this value be? How do you automatically set it to the right value?
I was recently working on a project where the requirements changed greatly over the course of the project. We started with a quite simple model that had more and more fields added to it, and we ended up with many copies of the same model instances with only slight changes to the data.
It made a lot of sense to refactor this database model and to split it up into a parent and child model, which meant that any items that shared most of the data could inherit from the parent, and only specify the new data in the child.
E.g. instead of doing something like this:
    from django.db import models


    class FooModel(models.Model):
        field_a = models.IntegerField()
        field_b = models.IntegerField()
        field_c = models.IntegerField()
        field_d = models.IntegerField()
        field_that_is_updated = models.IntegerField()
We could simply refactor it into something like this:
    from django.db import models


    class BarModel(models.Model):
        field_a = models.IntegerField()
        field_b = models.IntegerField()
        field_c = models.IntegerField()
        field_d = models.IntegerField()


    class FooModel(models.Model):
        parent = models.ForeignKey(BarModel, on_delete=models.CASCADE)
        field_that_is_updated = models.IntegerField()
Note that the new parent field is not nullable. So what would happen to all the existing FooModel entries? Which BarModel would they point to?
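At the database level, the problem looks roughly like the sketch below. It uses the standard library's sqlite3 module with hypothetical foo and bar tables rather than Django itself, so treat it as an illustration of the constraint and not as what Django executes.

```python
import sqlite3

# Hypothetical tables standing in for FooModel and BarModel.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, field_that_is_updated INTEGER)")
conn.execute("INSERT INTO foo (field_that_is_updated) VALUES (1)")  # one existing row


def try_add_column(ddl):
    """Run an ALTER TABLE statement and report whether the database accepted it."""
    try:
        conn.execute(ddl)
        return True
    except sqlite3.DatabaseError:
        return False


# Required column without a default: the existing row has nothing to point to.
required_ok = try_add_column(
    "ALTER TABLE foo ADD COLUMN parent_id INTEGER NOT NULL REFERENCES bar(id)"
)
# Nullable column: accepted, but the existing row is left with parent_id = NULL.
nullable_ok = try_add_column(
    "ALTER TABLE foo ADD COLUMN parent_id INTEGER REFERENCES bar(id)"
)
print(required_ok, nullable_ok)
```

The required column is rejected because the existing row would have no value for it, which is the same situation that makes Django ask for a default.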
This is a very common scenario that you will definitely run into. Sometimes I see the lazy solution, which is to simply make the field null=True. This makes the migration pass, but you still have the problem that the old entries are no longer complete.
So how did I solve this problem, where we want to add a new non-nullable ForeignKey field to a model and we don't have any existing data for the existing entries to point to?
Migrate in Multiple Steps
We can achieve this by creating multiple migration files that execute the changes to our models in incremental ways. We will step by step do the following things:
- Create a new Model and add a new ForeignKey that is nullable.
- Write a custom migration that generates the data for old entries.
- Remove the migrated fields from the original Model.
- Remove nullable from the new ForeignKey.
All the migrations get executed sequentially, one by one, and we end up with a non-nullable ForeignKey that is populated with data created during the migration process.
Sounds good? Let's do it!
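Before going through the Django version, here is a rough sketch of the same sequence in raw SQL, run through the standard library's sqlite3 module. The foo and bar tables and the parent_id column are hypothetical stand-ins for the models above, and the statements only approximate what the migrations end up doing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Starting point: everything lives in one table.
    CREATE TABLE foo (
        id INTEGER PRIMARY KEY,
        field_a INTEGER, field_b INTEGER, field_c INTEGER, field_d INTEGER,
        field_that_is_updated INTEGER
    );
    INSERT INTO foo (field_a, field_b, field_c, field_d, field_that_is_updated)
    VALUES (1, 2, 3, 4, 10), (1, 2, 3, 4, 20), (5, 6, 7, 8, 30);

    -- Step 1: new parent table plus a NULLABLE foreign key on the old table.
    CREATE TABLE bar (
        id INTEGER PRIMARY KEY,
        field_a INTEGER, field_b INTEGER, field_c INTEGER, field_d INTEGER
    );
    ALTER TABLE foo ADD COLUMN parent_id INTEGER REFERENCES bar(id);

    -- Step 2: create one parent per distinct combination, then link each row.
    INSERT INTO bar (field_a, field_b, field_c, field_d)
    SELECT DISTINCT field_a, field_b, field_c, field_d FROM foo;
    UPDATE foo SET parent_id = (
        SELECT bar.id FROM bar
        WHERE bar.field_a = foo.field_a AND bar.field_b = foo.field_b
          AND bar.field_c = foo.field_c AND bar.field_d = foo.field_d
    );
""")

# Steps 3 and 4 (dropping the copied columns, making parent_id required)
# only become safe once no row is left without a parent.
unlinked = conn.execute("SELECT COUNT(*) FROM foo WHERE parent_id IS NULL").fetchone()[0]
parents = conn.execute("SELECT COUNT(*) FROM bar").fetchone()[0]
print(unlinked, parents)  # → 0 2
```

Three original rows end up sharing two parents, and none of them is left unlinked, which is the state we want before tightening the column.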
Instantiate New Models and Allow Null Foreign Key
The first step is to set up our new Model and create a ForeignKey field to it, but keep the fields that you want to migrate in the old model and make sure that the ForeignKey is nullable.
This allows us to set up the new model while still keeping our original data in the existing model's table. We don't want to delete the fields and the data with it just yet.
At this point in time, our code would look something like this.
    class BarModel(models.Model):
        field_a = models.IntegerField()
        field_b = models.IntegerField()
        field_c = models.IntegerField()
        field_d = models.IntegerField()


    class FooModel(models.Model):
        field_a = models.IntegerField()
        field_b = models.IntegerField()
        field_c = models.IntegerField()
        field_d = models.IntegerField()
        field_that_is_updated = models.IntegerField()
        parent = models.ForeignKey(BarModel, on_delete=models.CASCADE, null=True)
Note that we added our new BarModel and we added a ForeignKey to it from the FooModel. Also, note that the field_a through field_d fields are duplicated in both models at this point in time.
Write Custom Migration that Generates Data
Next, we want to fill the new BarModel table with data and point the existing FooModel entries to these new table rows through their parent field. We can achieve this by creating a custom migration file that we will fill with our own content.
    python manage.py makemigrations --empty myapp
The command above will generate a new migration file for the myapp Django application. This newly generated file will contain an empty operations list that you can fill with the actions you want the migration to execute.
The django.db.migrations package is filled with many useful actions that you can execute during a migration, such as (but not limited to):
- Add new fields.
- Remove existing fields.
- Run custom SQL.
- Execute Python code.
The last point is the interesting one for us: we can make our custom migration execute Python code as the migration gets applied. Awesome!
We can leverage this feature to generate the new data on the fly, populate the BarModel table with data from the existing FooModel entries, and then point the parent field to this new BarModel entry.
When you generate your new migration using the --empty flag, the file should end up looking something like this.
    from django.db import migrations, models


    class Migration(migrations.Migration):

        dependencies = [
            ('myapp', '0011_auto_20190108_0750'),
        ]

        operations = []
We can then add a RunPython operation to the operations list that executes a custom function.
    from django.db import migrations, models


    def create_bars(apps, schema_editor):
        ...


    class Migration(migrations.Migration):

        dependencies = [
            ('myapp', '0011_auto_20190108_0750'),
        ]

        operations = [
            migrations.RunPython(create_bars)
        ]
This means that the migration will execute the create_bars function when it is applied.
We can then fill our new create_bars function with something like this:
    def create_bars(apps, schema_editor):
        # Use the historical models from the migration state,
        # not direct imports from models.py.
        FooModel = apps.get_model('myapp', 'FooModel')
        BarModel = apps.get_model('myapp', 'BarModel')

        for foo in FooModel.objects.all():
            # Reuse an existing BarModel with the same values, or create one.
            instance, _ = BarModel.objects.get_or_create(
                field_a=foo.field_a,
                field_b=foo.field_b,
                field_c=foo.field_c,
                field_d=foo.field_d,
            )
            foo.parent = instance
            foo.save()
So what does this function do? Well, we loop through all of our FooModel entries and set each one's parent field to a BarModel entry. Note that we use get_or_create, which means we avoid creating duplicate BarModel entries and instead reuse each one for many FooModel entries.
After this migration is run, we should have created all the required BarModel entries, and the nullable parent field should be populated on every row. There should be no empty parent fields after this migration is applied.
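To make the deduplication concrete, here is a plain-Python sketch that mimics the look-up-or-create behaviour on in-memory lists. The get_or_create helper below is a hypothetical stand-in written for this illustration, not Django's implementation, and the dictionaries only imitate table rows.

```python
def get_or_create(table, **fields):
    """Return (row, created): reuse a row matching all fields, else append a new one."""
    for row in table:
        if all(row[name] == value for name, value in fields.items()):
            return row, False
    row = {"id": len(table) + 1, **fields}
    table.append(row)
    return row, True


# Hypothetical stand-ins for the FooModel and BarModel tables.
foos = [
    {"id": 1, "field_a": 1, "field_b": 2, "field_c": 3, "field_d": 4, "parent": None},
    {"id": 2, "field_a": 1, "field_b": 2, "field_c": 3, "field_d": 4, "parent": None},
    {"id": 3, "field_a": 5, "field_b": 6, "field_c": 7, "field_d": 8, "parent": None},
]
bars = []

for foo in foos:
    shared = {name: foo[name] for name in ("field_a", "field_b", "field_c", "field_d")}
    bar, _created = get_or_create(bars, **shared)
    foo["parent"] = bar["id"]

print(len(bars), [foo["parent"] for foo in foos])  # → 2 [1, 1, 2]
```

The first two entries share identical values, so they end up pointing at the same parent, and only the third one triggers the creation of a second parent.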
Finalize the State of our Models
At this point in time, all of our existing FooModel entries should point to a BarModel, and even though the field is null=True, no entries should have a null value. This means that we are now ready to finalize the state of our models by updating them to their final version.
    class BarModel(models.Model):
        field_a = models.IntegerField()
        field_b = models.IntegerField()
        field_c = models.IntegerField()
        field_d = models.IntegerField()


    class FooModel(models.Model):
        field_that_is_updated = models.IntegerField()
        parent = models.ForeignKey(BarModel, on_delete=models.CASCADE)
As you can see from the code example above, we have now removed null=True from the parent field, and we have also removed the old field_a through field_d fields from the FooModel model. This will delete those columns from the database along with their data, but since our previous custom migration already copied the data to a new BarModel entry, we should now be safe.
At this point you should be able to generate your final migration file and then apply all of these migrations with the following commands:
    python manage.py makemigrations
    python manage.py migrate
Summary of adding new ForeignKey to Django
At first glance, this approach might look a bit confusing. We wanted to create a new non-nullable ForeignKey, but the first thing we do is create a field that is nullable.
This is OK though: we are creating multiple migration files that get executed and applied sequentially. This process usually takes just a few seconds to execute, so in practice your field is only nullable for a moment while the migrations are applied.
By doing this we end up with complete data without any missing entries, and our old data will still be usable with our new data structure.