Jakob Külzer Ninja Coding Monkey goes Canada

19Sep/08Off

Grails and LiquiBase – How to use

In my current project I'm now at a point where the application has come pretty far but is still far from finished. The domain model is not perfect, but very stable at this point -- changes however, are still possible.

To get going with the application we are planning to perform tests with some real data. The problem at this point is that entering real data into a database that might be wiped due to changes in the domain model is a very tedious task, at least for the person responsible for entering the data. And the reason for running such a test is to find problems with the domain model which lead to the dreaded changes in the domain model. This cat bites her own tail...

As wiping out the database and re-entering the data is not an option at this point, I decided to look into LiquiBase which as been sitting on my delicious list for quite a while.

This post is an attempt to write down my experiences and learned lessons with LiquiBase. The documentation of LiquiBase is technically complete and extensive but lacks more information about how to use it in a real project. This is an attempt to create a guide on how to leverage LiquiBase in a real world project.

What is this LiquiBase anyways?

For starters LiquiBase is a tool that helps developers to track and apply changes to database schemas. Basically what it does is storing all changes that are made to a database in a special XML file, called a database changelog file. This changelog file can be used to determine if there are any changes between the schema in the changelog file and the schema found in a database. The nice thing about LiquiBase is that it is database neutral so you can use with any database Java is able to connect to, and that are a lot...

But there is more to LiquiBase that just tracking changes. The outstanding feature of LiquiBase is that it is able to migrate data using the information stored in the database changelog file. This doesn't sound like very much when you are starting a project from scratch, but as soon as you start entering data in your evolving application, you'll understand. And as every application evolves, this is a feature handy for every developer, especially for Grails developers as Hibernate as ORM basically supports only two modes of operation to handle the database schema (configured via the hibernate.hbm2ddl.auto option): wiping out the database and applying a new schema (called create or create-drop) or attempting to update the schema (simply called update). The update option looks like a good way to go at first but as it turns out it has its drawbacks, especially if there is data in the database.

How does it work?

The hub of all activity in LiquiBase is the changelog file. Basically there are two possible ways to create and maintain the changelogs, the first is to use the generated changelogs as far as possible and the other is to write changelogs by hand. I'm sticking mostly to the first one as the generated changelogs are sufficient most of the time and writing changelogs manually is way to tedious though easy as there is a complete XSD.

Each entry in the changelog file is called a changeset and has a unique id. The id is used to refer to specific datasets. But how does LiquiBase know which changesets have been applied to the database? The answer is simple, it creates a table called DATABASECHANGELOG into which it will insert a row for each applied changeset. So make sure you start using LiquiBase early because you really want to have these tables in your database when changes have been incorporated into a running system.

Applying changesets against the database is called migration but more on this later.

Setting the Scene

The work flow to push in changes may depend on your project. The work flow I'm describing can be applied to team of couple of Grails developers and has been tested on a real world project.

As an example for the rest of the post let's assume a project that is developed by a couple of code monkeys. The change the domain classes and to monkey stuff. On the other hand there is a data entry system deployed so real data can be inserted into the database and the software can be demoed. I will refer to this system as the demo system. The demo system and the data entered into it has several purposes. First to see if the domain model is adequate for real data and second to provide the developers with real testing data.

Problems arise quickly in this scenario, for example changes are made to the domain model and the database schema of the demo system must be updated, preferably without wiping out all the data that has been entered. At this point LiquiBase comes in.

Grails and LiquiBase - Getting Started

Ok, now how to use LiquiBase in a Grails Project? Easy: there is a LiquiBase grails plugin for grails which can easily be installed using the following command:

grails install-plugin liquibase

This will install a copy of LiquiBase in your application and set up a couple of new commands for the grails command line interface. Before you can do anything useful with LiquiBase you have to create a database changelog file for your current database, preferably using the generate-changelog command:

grails generate-changelog grails-app/migrations/changelog.xml

Note: if you omit the filename the changelog will be dumped to the console. You can dump the changelog into any file but grails-app/migrations/changelog.xml is the default and LiquiBase will keep looking for this file. With the changelog file in place you should add it to your version control system.

Tracking Changes

So the next step is to make some changes to the domain model which result in a change of the database schema. You now have two possibilities for updating your changelog file. Either manually or using some of the neat tools LiquiBase offers. I won't go into the details for creating the files manually, the LiquiBase manual does a good job at this. So let's talk about the tools.

LiquiBase ships with a very powerful command: diff. What this does is it looks at the current database and at the changelog (I have to correct myself here, see next section! Sorry about this!) another database and outputs the difference between them. As LiquiBase changelog XML! So this is awesome because you can make changes to your Grails domain model, run it against a database and pull out the changes. All that is then to do is to add the newly generated changesets to your database changelog file.

The db-diff Tool

The db-diff tool in the Grails plugin is kind of mysterious. The LiquiBase documentation for the Grails plugin does not even mention it and I got it wrong in the first run. I had to reading the section about the database diff provided by LiquiBase and dig through the Groovy scripts of the LiquiBase Grails plugin to understand what it does.

The diff utility of LiquiBase compares two databases with each other, which kind of explains its name (doh). When using the LiquiBase command line interface you have to provide two different connection strings for two databases. When using the Grails plugin however, you cannot specify two databases (not yet?); so what happens is that: the database for the current environment ist compared to the database of the test environment. This behaviour is hardcoded into plugins/liquibase-1.8.0.0/scripts/DbDiff.groovy and limits the usefulness of the tool. So what I did is every time a database has been migrated the new schema gets propagated to the testing database. If the domain model changes the schema of the dev database, the changes can easily be determined by diff'ing it against the testing database.

This is not exactly a perfect solution as you manually have to keep the testing database in sync with the changelog. So I am currently playing around with a modification of the Grails LiquiBase plugin which adds a compareToChangelog command which populates a separate database from the current changelog and then compares this to the database of the currently used environment. Using this tool makes updating the changelog really easy as you get the changesets just for those changes you made to the domain model. If there is interest I will publish this to the LiquiBase plugin.

Migrating a Database

Finally your changelog has been updated and you now need to update your demo database. The first thing you should do: Make a backup of your database! Although LiquiBase has matured to some extent, migrating a database full of data is a risky endeavor. So make sure you are on the safe side before touching a database with real data.

The easiest way to migrate the database is to set up a copy of your project and modify the datasource to point at the demo database. The first thing you might want to do is to check which changesets have to be executed by running the following command:

grails status

This will generate a list of changesets which have to be applied, assuming there is a DATABASECHANGELOG table in the database. If you are happy with what you see, you can either migrate the database directly or do a dry run. I prefer to do a dry run before touching the database by running:

grails migrate-sql

This will output a sequence of SQL statements that LiquiBase will issue to migrate the database. This is quite handy as you can see what will happen. Then, finally the database is migrated by issuing the following command:

grails migrate

When everything goes fine your database has just been updated. If something goes wrong, you can either use the rollback commands provided or fall back to the backup of your database.

Gotchas

The following points present pitfalls you want to avoid (its enough that I hit them ;-) . Please note that this is a living section, I keep adding new points I encounter.

Use LiquiBase early!

Make sure you start using LiquiBase early. Insert the DATABASECHANGELOG table as soon as possible and ensure that changesets are run against the databases.

Backups, Backups, Backups...

Backup your database before migrating! Really, do it!

Stick to one Database Flavor

Do not switch database flavors, e.g. from HSQLDB to MySQL. I tried this and I've encountered  a couple of issues with this. For example a changelog created from a HSQLDB may not run on a MySQL and vice versa due to different named datatypes (e.g. MySQL MEDIUMBLOB vs. HSQLDB VARBINARY). Further the changelog generated from a MySQL can cause problems on other databases as it names all primary keys "PRIMARY" per default. This leads to erroneous SQL that cannot be run.

Those problemes are nothing that couldn't be solved with a little bit of search and replace and even might be necessary to create the schema for a production database; but it is definitively nothing you want to do very often.

How to Start Using LiquiBase in the Middle of a Project

When you start to use LiquiBase in the middle of a project where there already is a database filled with precious production data, things can easily become complicated. The key to success here is to freeze the project or at least the part which affects the database schema for some time, generate a database changelog file from the current state and propagate this to all database instances. The reason why this is so important is that you need the DATABASECHANGELOG table on those databases; without this table and its entries  LiquiBase does not know which changes to apply. You could use the db-diff tool here, but the far more elegant approach is to simply propagate those changes to all databases once.

As soon as you have the DATABASECHANGELOG on all database instances, updating the schema becomes so easy that you will almost get addicted to it!

So, step by step:

  • freeze your project for half an hour, maybe you can do this in the evening
  • create a new changelog file from the current schema with the grails generate-changelog command
  • mark all those changesets as imported with the grails changelog-sync command
  • push those changes to ALL databases associated with the project. If you fail to do so, migrating the schema of a database without those tables is almost impossible. Depending on the size of your database this requires a massive amount of time.
  • With the DATABASECHANGELOG table in place, migrating LiquiBase to the next version is as simple as entering grails migrate.

Conclusion

LiquiBase is a great addition to Grails and you should really really think about using it. Otherwise managing changes to the database can easily become a big problem and im talking out of experience here. Hopefuly this post helps you to get started with LiquiBase. As always if you have any comments please feel to post.

Comments (17) Trackbacks (7)
  1. Hey Jakob, great article and definitely worth to be mentioned in the next Grails Podcast. Keep us up to date, good luck in Toronto & Viel Spass im Praktikum! Sven

  2. Hey Sven,
    that would be an honor. :) Thanks!

    Cheers,
    Jakob

  3. Is the db-diff command supposed to add diff’s to changelog or do you just copy it from the std-out? Nice post well done!

  4. The db-diff command just ouputs the changes to the console and usually this is what you want. You can review the changes, copy it into your changelog file or just use it to see if there are differences.

    Cheers,
    Jakob

  5. thanks for the tutorial. it really helpfull …

    i am using autobase before (in grails 1.0.4) and now move to use liquibase (in grails 1.1.1) since the autobase plugin are not working :(

  6. I recently wrote a JDBC driver wrapper to perform schema management. Basically what you do is setup your jdbc URL to use the wrapper around your backend JDBC driver and in doing so deployment will occur automatically prior to the release of the first client connection. This solution works with LiquiBase, Grails, plain old Java, and/or any tool that uses JDBC.

    The driver currently the LiquiBase schema manager in addition to a more direct internal schema manager. This gives the user choice since LiquiBase provides database independence via the XML layer and for users who must support more then one DB type using LiquiBase as the schema manger is the best approach. The standard manager is the best approach for a shop that uses only one type of DB and wants to take advantage of driver level features for schema record and/or use vendor specific SQL statements rather then the XML indirection model.

    In any case please checkout the project on my blog.
    http://markfarnsworth.wordpress.com/2009/06/27/garin/

  7. Hi jakob,

    nice post. Could you publish the mentioned compareToChangelog ? It would be helpful!!

    thanks,
    masiar

  8. Hi,

    Thanks for the tutorial !
    But, there is one thing that I am not sure to have understood. In fact, the changelog.xml is not generated with domain classes but with the database or with informations writen manually. We can’t generate this file with domain classes ?

    Thanks,

    maxou

  9. @maxou: there’s two ways you can generate your changelog. You can either write it manually or you can get liquibase to generate it for you. The later will take a certain existing database structure and create your changelog.xml from it. To answer your question, no, you cannot generate a changelog.xml directly from domain classes (because this is highly database dependent). But you can get Grails and hbm2ddl to populate a database for you and then use liquibase to create the changelog.xml.

  10. Ok, I had time to test the tool.

    In fact, the command which is very cool is db-diff (generate changelog is not very necessary in fact, db-diff do better). You don’t have to write by hand the xml code. You just have copy and paste.
    On the other hand, it’s weird that the diff is done between dev and test databases. Why not between dev and prod databases ?
    So, I had put the production database in the test environnement. I do : “grails test migrate”.

    With a Db-diff which work correctly, Autobase not have reasons to exist.

    My only problem is with foreign keys of the framework tables (fmwk_item…) which it searches to create and drop in every db-diff.

  11. Thanks a lot!! Nice post!!

  12. Great post!

    I am wondering if the grails liquibase plugin can support row data as well as schema changes. I have created a new table and want it populated with data when I perform the migration. Is there some other mechanism other than manually creating a data-load script and applying it to all my databases.

    Thanks.

  13. Hey Justin,
    that is indeed something I’ve been thinking about for a while. You can use liquibase to do that, using the insert data (http://www.liquibase.org/manual/insert_data) or load data (http://www.liquibase.org/manual/load_data) refactorings. You could also use custom sql refactoring (http://www.liquibase.org/manual/custom_sql_file) depending on your needs. I have to admit that I never used these refactorings, though.

    The obvious drawback is that loading data via these mechanisms doesn’t offer a rollback. An interesting question is whether this could be part of an automated database synchronization mechanism… hmmm…

    Your question actually made me look at the new refactorings — letting me solve some other problems elegantly now. Thanks. :)

  14. Hey Jakob. You have done a great job by writing this article and simplifying the database change management through liquibase. I keep on referring to this article every now and then, especially the section, “Using LiquiBase in the Middle of a Project”. Thanks.

  15. Hello Imran,
    thank you, I’m glad my post is so useful to you. :)

  16. Hi Jakob. Thanks for such a great article. I am going to try it out, but in the meantime I have a question. Did you ever implement the compareToChangelog functionality mentioned in your post?

  17. This post helped me in realizing that the use of Liquibase in the application isn’t black magic. The explanation couldn’t have gotten any simpler. Thanks for this post!

    (The post looks very old, but the fact that it still helps people like me explains its usefulness. :) )


Leave a comment