Dump and Load data with Django

Kode Jurig
4 min read · Jan 30, 2021

Maybe you want to hand your Django project to a colleague, or to anyone else, and you're thinking, “yes, they’ll try my project”. Any application looks better when it already has data in it.

Imagine you've built a “Blog” application and you show it off to your colleagues. One of them is interested and asks, “What’s your repo?”, so you give them the link to clone.

Your colleague clones the repo and tries out the application. But the first impression is really annoying, because they have to fill in the data by hand, writing entries one by one. They might say, “Why isn't there any initial data, so I can see and try it without filling everything in one by one?”

You can do that. Anyone trying your app for the first time can have initial data ready to load into the database. Django provides a feature for exactly this problem: you can dump and load data at any time.

In Django, the files that hold this data are known as fixtures. In this article, we'll explore step by step how to use them to back up data and load it into the database. Django's documentation calls this “providing initial data for models”. Same thing!

Assume we already have an application; it doesn't need to be complex. For example, a blog. The blog application has two models, Article and Comment. The database already contains 5 articles and some comments, and an admin user already exists too.

5 Articles

Now let's back up the article data. We can run this command:

$ python manage.py dumpdata blog.article > article.json

blog is the name of the application, and article is the model inside it. article.json is the file where we save the dump. If you open it, the output is probably one very long line. Here's an image of the result after I beautified it:

Results
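To get a feel for the structure, here is a minimal sketch that re-indents a dump with Python's json module. The sample record is hypothetical: only the author field comes from this article, while the title field and all values are made up for illustration. dumpdata also accepts an --indent option if you'd rather get readable output directly.

```python
import json

# Hypothetical one-record dump in the fixture layout (model / pk / fields).
# Only "author" is a field named in this article; "title" and the values are made up.
raw_dump = '[{"model": "blog.article", "pk": 1, "fields": {"title": "Hello", "author": 1234}}]'


def beautify(fixture_text):
    """Re-indent a JSON fixture so each record is easy to read."""
    return json.dumps(json.loads(fixture_text), indent=2)


print(beautify(raw_dump))
```

Each record carries the model label, the primary key, and a fields object holding the column values.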

As you can see, the dump is a list of records, and each one looks self-contained. But that’s not the case! Take a look at the author field: it stores only the id of a user. Suppose one article has an author value of 1234, and your colleague's database has only one user, with id 1. Your colleague then runs:

$ python manage.py loaddata article.json

If this data is loaded into your colleague's database, it may fail with an error, because author 1234 doesn't exist there:

django.db.utils.IntegrityError: Problem installing fixtures: insert or update on table "blog_article" violates foreign key constraint "blog_article_author_id_905add38_fk_auth_user_id"
DETAIL:  Key (author_id)=(1234) is not present in table "auth_user".

To solve this, think about relationships. If we want to dump the article model, we must also dump the models it relates to:

$ python manage.py dumpdata blog.article auth.user > userart.json
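Before handing a dump like this over, you could sanity-check that it is self-contained. The sketch below is only an illustration, not part of Django: missing_authors is a hypothetical helper, the sample records are made up, and it assumes the foreign key field is called author, as in the example above.

```python
import json


def missing_authors(fixture):
    """Return author ids used by blog.article records that no
    auth.user record in the same fixture provides."""
    user_pks = {obj["pk"] for obj in fixture if obj["model"] == "auth.user"}
    author_ids = {
        obj["fields"]["author"]
        for obj in fixture
        if obj["model"] == "blog.article"
    }
    return author_ids - user_pks


# Made-up records in the fixture layout (model / pk / fields).
articles_only = json.loads(
    '[{"model": "blog.article", "pk": 1, "fields": {"author": 1234}}]'
)
with_users = articles_only + json.loads(
    '[{"model": "auth.user", "pk": 1234, "fields": {"username": "demo"}}]'
)

print(missing_authors(articles_only))  # authors the dump refers to but does not contain
print(missing_authors(with_users))     # nothing missing once users are dumped too
```

An empty result means every article's author travels with the dump, so the foreign key error above should not come from this relationship.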

However, relationships in Django are complex

It isn't practical to trace the relationships of every model by hand. If your application has more than 5 models, and each one relates to others, things get complicated quickly. In my opinion, it's better to dump all models, both your application's and Django's own, but you should leave out contenttypes:

$ python manage.py dumpdata --exclude contenttypes > db.json

Why avoid contenttypes?

If you dump all models, colleagues whose database is a copy of yours will be fine. However, a colleague who starts from their own database will likely hit an integrity error when loading the data. The reason is that Django creates the contenttypes rows automatically when migrations run, so their database already has its own rows, and the dumped ones can clash with them. Only include contenttypes if the other developer's database matches yours, and you usually can't count on that. Every developer's setup is different.
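If you already have a full dump that includes contenttypes, you don't necessarily have to dump again. As a rough sketch, the entries can be filtered out of the JSON afterwards. drop_app is a hypothetical helper and the sample records are made up; it relies only on the model label that every fixture record carries.

```python
import json


def drop_app(fixture, app_label="contenttypes"):
    """Return the fixture without any record belonging to app_label."""
    prefix = app_label + "."
    return [obj for obj in fixture if not obj["model"].startswith(prefix)]


# Made-up two-record dump in the fixture layout (model / pk / fields).
full_dump = json.loads("""
[
  {"model": "contenttypes.contenttype", "pk": 7,
   "fields": {"app_label": "blog", "model": "article"}},
  {"model": "blog.article", "pk": 1, "fields": {"author": 1}}
]
""")

cleaned = drop_app(full_dump)
print(json.dumps(cleaned, indent=2))
```

Keep in mind that some records, such as auth.permission, still point at content types by id, so this may not be a complete fix for every project.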

Conclusion

So, this Django feature is very useful. However, we should still be careful when dumping data. It will serve you well if you pay attention to the databases other developers are using.
