In my current project I'm now at a point where the application has come pretty far but is still far from finished. The domain model is not perfect, but very stable at this point -- changes however, are still possible.
To get going with the application we are planning to perform tests with some real data. The problem at this point is that entering real data into a database that might be wiped due to changes in the domain model is a very tedious task, at least for the person responsible for entering the data. And the reason for running such a test is to find problems with the domain model which lead to the dreaded changes in the domain model. This cat bites her own tail...
As wiping out the database and re-entering the data is not an option at this point, I decided to look into LiquiBase which as been sitting on my delicious list for quite a while.
This post is an attempt to write down my experiences and learned lessons with LiquiBase. The documentation of LiquiBase is technically complete and extensive but lacks more information about how to use it in a real project. This is an attempt to create a guide on how to leverage LiquiBase in a real world project.
What is this LiquiBase anyways?
For starters LiquiBase is a tool that helps developers to track and apply changes to database schemas. Basically what it does is storing all changes that are made to a database in a special XML file, called a database changelog file. This changelog file can be used to determine if there are any changes between the schema in the changelog file and the schema found in a database. The nice thing about LiquiBase is that it is database neutral so you can use with any database Java is able to connect to, and that are a lot...
But there is more to LiquiBase that just tracking changes. The outstanding feature of LiquiBase is that it is able to migrate data using the information stored in the database changelog file. This doesn't sound like very much when you are starting a project from scratch, but as soon as you start entering data in your evolving application, you'll understand. And as every application evolves, this is a feature handy for every developer, especially for Grails developers as Hibernate as ORM basically supports only two modes of operation to handle the database schema (configured via the hibernate.hbm2ddl.auto option): wiping out the database and applying a new schema (called create or create-drop) or attempting to update the schema (simply called update). The update option looks like a good way to go at first but as it turns out it has its drawbacks, especially if there is data in the database.
How does it work?
The hub of all activity in LiquiBase is the changelog file. Basically there are two possible ways to create and maintain the changelogs, the first is to use the generated changelogs as far as possible and the other is to write changelogs by hand. I'm sticking mostly to the first one as the generated changelogs are sufficient most of the time and writing changelogs manually is way to tedious though easy as there is a complete XSD.
Each entry in the changelog file is called a changeset and has a unique id. The id is used to refer to specific datasets. But how does LiquiBase know which changesets have been applied to the database? The answer is simple, it creates a table called DATABASECHANGELOG into which it will insert a row for each applied changeset. So make sure you start using LiquiBase early because you really want to have these tables in your database when changes have been incorporated into a running system.
Applying changesets against the database is called migration but more on this later.
Setting the Scene
The work flow to push in changes may depend on your project. The work flow I'm describing can be applied to team of couple of Grails developers and has been tested on a real world project.
As an example for the rest of the post let's assume a project that is developed by a couple of code monkeys. The change the domain classes and to monkey stuff. On the other hand there is a data entry system deployed so real data can be inserted into the database and the software can be demoed. I will refer to this system as the demo system. The demo system and the data entered into it has several purposes. First to see if the domain model is adequate for real data and second to provide the developers with real testing data.
Problems arise quickly in this scenario, for example changes are made to the domain model and the database schema of the demo system must be updated, preferably without wiping out all the data that has been entered. At this point LiquiBase comes in.
Grails and LiquiBase - Getting Started
Ok, now how to use LiquiBase in a Grails Project? Easy: there is a LiquiBase grails plugin for grails which can easily be installed using the following command:
grails install-plugin liquibase
This will install a copy of LiquiBase in your application and set up a couple of new commands for the grails command line interface. Before you can do anything useful with LiquiBase you have to create a database changelog file for your current database, preferably using the generate-changelog command:
grails generate-changelog grails-app/migrations/changelog.xml
Note: if you omit the filename the changelog will be dumped to the console. You can dump the changelog into any file but grails-app/migrations/changelog.xml is the default and LiquiBase will keep looking for this file. With the changelog file in place you should add it to your version control system.
So the next step is to make some changes to the domain model which result in a change of the database schema. You now have two possibilities for updating your changelog file. Either manually or using some of the neat tools LiquiBase offers. I won't go into the details for creating the files manually, the LiquiBase manual does a good job at this. So let's talk about the tools.
LiquiBase ships with a very powerful command: diff. What this does is it looks at the current database and at the changelog (I have to correct myself here, see next section! Sorry about this!) another database and outputs the difference between them. As LiquiBase changelog XML! So this is awesome because you can make changes to your Grails domain model, run it against a database and pull out the changes. All that is then to do is to add the newly generated changesets to your database changelog file.
The db-diff Tool
The db-diff tool in the Grails plugin is kind of mysterious. The LiquiBase documentation for the Grails plugin does not even mention it and I got it wrong in the first run. I had to reading the section about the database diff provided by LiquiBase and dig through the Groovy scripts of the LiquiBase Grails plugin to understand what it does.
The diff utility of LiquiBase compares two databases with each other, which kind of explains its name (doh). When using the LiquiBase command line interface you have to provide two different connection strings for two databases. When using the Grails plugin however, you cannot specify two databases (not yet?); so what happens is that: the database for the current environment ist compared to the database of the test environment. This behaviour is hardcoded into plugins/liquibase-184.108.40.206/scripts/DbDiff.groovy and limits the usefulness of the tool. So what I did is every time a database has been migrated the new schema gets propagated to the testing database. If the domain model changes the schema of the dev database, the changes can easily be determined by diff'ing it against the testing database.
This is not exactly a perfect solution as you manually have to keep the testing database in sync with the changelog. So I am currently playing around with a modification of the Grails LiquiBase plugin which adds a compareToChangelog command which populates a separate database from the current changelog and then compares this to the database of the currently used environment. Using this tool makes updating the changelog really easy as you get the changesets just for those changes you made to the domain model. If there is interest I will publish this to the LiquiBase plugin.
Migrating a Database
Finally your changelog has been updated and you now need to update your demo database. The first thing you should do: Make a backup of your database! Although LiquiBase has matured to some extent, migrating a database full of data is a risky endeavor. So make sure you are on the safe side before touching a database with real data.
The easiest way to migrate the database is to set up a copy of your project and modify the datasource to point at the demo database. The first thing you might want to do is to check which changesets have to be executed by running the following command:
This will generate a list of changesets which have to be applied, assuming there is a DATABASECHANGELOG table in the database. If you are happy with what you see, you can either migrate the database directly or do a dry run. I prefer to do a dry run before touching the database by running:
This will output a sequence of SQL statements that LiquiBase will issue to migrate the database. This is quite handy as you can see what will happen. Then, finally the database is migrated by issuing the following command:
When everything goes fine your database has just been updated. If something goes wrong, you can either use the rollback commands provided or fall back to the backup of your database.
The following points present pitfalls you want to avoid (its enough that I hit them . Please note that this is a living section, I keep adding new points I encounter.
Use LiquiBase early!
Make sure you start using LiquiBase early. Insert the DATABASECHANGELOG table as soon as possible and ensure that changesets are run against the databases.
Backups, Backups, Backups...
Backup your database before migrating! Really, do it!
Stick to one Database Flavor
Do not switch database flavors, e.g. from HSQLDB to MySQL. I tried this and I've encountered a couple of issues with this. For example a changelog created from a HSQLDB may not run on a MySQL and vice versa due to different named datatypes (e.g. MySQL MEDIUMBLOB vs. HSQLDB VARBINARY). Further the changelog generated from a MySQL can cause problems on other databases as it names all primary keys "PRIMARY" per default. This leads to erroneous SQL that cannot be run.
Those problemes are nothing that couldn't be solved with a little bit of search and replace and even might be necessary to create the schema for a production database; but it is definitively nothing you want to do very often.
How to Start Using LiquiBase in the Middle of a Project
When you start to use LiquiBase in the middle of a project where there already is a database filled with precious production data, things can easily become complicated. The key to success here is to freeze the project or at least the part which affects the database schema for some time, generate a database changelog file from the current state and propagate this to all database instances. The reason why this is so important is that you need the DATABASECHANGELOG table on those databases; without this table and its entries LiquiBase does not know which changes to apply. You could use the db-diff tool here, but the far more elegant approach is to simply propagate those changes to all databases once.
As soon as you have the DATABASECHANGELOG on all database instances, updating the schema becomes so easy that you will almost get addicted to it!
So, step by step:
- freeze your project for half an hour, maybe you can do this in the evening
- create a new changelog file from the current schema with the grails generate-changelog command
- mark all those changesets as imported with the grails changelog-sync command
- push those changes to ALL databases associated with the project. If you fail to do so, migrating the schema of a database without those tables is almost impossible. Depending on the size of your database this requires a massive amount of time.
- With the DATABASECHANGELOG table in place, migrating LiquiBase to the next version is as simple as entering grails migrate.
LiquiBase is a great addition to Grails and you should really really think about using it. Otherwise managing changes to the database can easily become a big problem and im talking out of experience here. Hopefuly this post helps you to get started with LiquiBase. As always if you have any comments please feel to post.