Flyway
Goals
- Understand the importance of managed, automated schema migration.
- Learn about the Flyway tool.
- Be able to create SQL-based and Java-based migrations.
- Understand how to integrate Flyway both from Maven and directly from within Java.
Concepts
- baseline
- database administrator (DBA)
- Flyway
- schema migration
Library
java.sql.ResultSet.CONCUR_UPDATABLE
java.sql.ResultSet.TYPE_FORWARD_ONLY
java.sql.ResultSet.updateBytes(String columnLabel, byte[] x)
java.util.UUID
javax.sql.DataSource
org.flywaydb.core.Flyway
org.flywaydb.core.Flyway.baseline()
org.flywaydb.core.Flyway.clean()
org.flywaydb.core.Flyway.info()
org.flywaydb.core.Flyway.migrate()
org.flywaydb.core.Flyway.repair()
org.flywaydb.core.Flyway.setDataSource(DataSource dataSource)
org.flywaydb.core.Flyway.setDataSource(String url, String user, String password, String... initSqls)
org.flywaydb.core.Flyway.setInstalledBy(String installedBy)
org.flywaydb.core.Flyway.validate()
org.flywaydb.core.api.migration.jdbc.BaseJdbcMigration
org.flywaydb.core.api.migration.jdbc.JdbcMigration
org.flywaydb.core.api.migration.jdbc.JdbcMigration.migrate(Connection connection)
Dependencies
org.flywaydb:flyway-core:5.1.3
org.flywaydb:flyway-maven-plugin:5.1.3
org.postgresql:postgresql:42.2.2
Lesson
One of the most onerous aspects of developing a database application is schema management. An application's domain model is seldom completely understood up front; it evolves over time. The domain model should drive the database schema, which means schema change is inevitable during the course of development.
A changing schema presents several complications.
- Developers must have an agreed-upon way, some shared “recipe”, to move from one schema variation to the next without losing data already stored in the database.
- For such incremental changes, there must be some standard way to identify the version of each schema.
- Different developers may be working on different versions of the schema, e.g. in different branches.
- Developer need to work on a different version of the schema than is deployed, not changing the production database schema until time for release.
Managing these aspects of incremental schema change is referred to as schema migration, and with all but the smallest projects is difficult to achieve without the help of an automated schema migration tool. Flyway is a popular, simple, and open-source migration tool for Java. It provides a library and API for automated migrations; a command-line interface; and a Maven plugin for automated migrations during builds.
Migration
To manage schema migrations, Flyway creates and maintains a new table in your database named schema_version
which keeps track of which schema versions schema change that has been applied to the database. Developers give each incremental schema change a unique version number—even the DDL statements that create the initial database schema itself.
When it comes time to migrate the schema, Flyway checks the schema_version
table and applies all available incremental migrations, one at a time, in order. After successful migration, Flyway updates the schema_version
table with the results of the migration. Future migrations will note the current version of the schema and perform no changes if the schema is already at the latest version.
schema_version
table after initial migration.installed _rank | version | description | type | script | checksum | installed _by | installed _on | execution _time | success |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | Initial database schema. | SQL | V1 | 1234567890 | jdoe | 2018-01-02 33:44:00.0 | 543 | TRUE |
2 | 2 | Added item color. | SQL | V2 | 9876543210 | jdoe | 2018-01-02 33:55:00.0 | 123 | TRUE |
SQL-Based Migrations
The easiest and preferred way to define migrations is to provide .sql
files containing the actual SQL commands needed to be executed for each migration. Usually these files contain the initial DDL statements to define the schema, later schema updates, and DML statements for bulk data updates. Flyway automatically discovers SQL-based migrations if they are placed in the src/main/resources/db/migration
directory of your project and use a particular format for the filename as shown in the figure. The format of many of the filename components are configurable. The version number can contain full stop .
characters, as you would expect with semantic versions, and underscore _
characters, with as many components as needed. The version is separated from the description by two underscore __
characters. You can include SQL --
comments or C-style multi-line /*…*/
comments as needed in the file.
The initial migration file usually creates the database schema. In this case, the first migration would contain the DDL statements from the earlier lesson on SQL, shown below.
Later as your application evolves, you may decide that you need to support item color as discussed in the lesson on relational algebra. Once you add an SQL DDL file with V2__
in the filename, Flyway will discover and perform this migration to a database that is at a previous migration version.
Java-Based Migrations
While you should prefer SQL-based migrations, some DDL or DML manipulations are so intricate that they are better performed programmatically using JDBC. The migration might also involve access to other data or configurations only available via Java. You can have SQL-based and Java-based migrations interspersed, as long as their version numbers are in the correct sequence.
For Java-based migrations, you will create a .java
source code file rather than an .sql
file, placing it in the src/main/java/
directory tree. The migration logic must be in a Java class in the db.migration
package and must implement org.flywaydb.core.api.migration.jdbc.JdbcMigration
. Flyway recommends that you extend org.flywaydb.core.api.migration.jdbc.BaseJdbcMigration
rather than implementing JdbcMigration
directly.
The class must use a particular format for its name as shown in the figure. The format of many of the class name components are configurable. The version number cannot contain full stop .
characters, as they are not allowed in Java class names; you can use underscore _
characters instead. The version is separated from the description by two underscore __
characters.
Place migration logic inside your implementation of the JdbcMigration.migrate(Connection connection)
method. During migration Flyway will automatically call this method with java.sql.Connection
instance that is already connected to the database. You should not close the connection when you are finished, but any statements you create inside the method must be closed as normal.
Imagine that you wish to add a UUID in string form to each supplier. (On products that support it, such as PostgreSQL, you would want to use an actual UUID type.) You would first need to add a new uuid
column, which is easily done in SQL. You would also want to add not-null constraint and a unique constraint, both of which are also easily done in SQL. The complication is that, after adding the column but before adding the constraints, you would need to generate UUIDs in the column for the existing rows. As standard SQL provides no UUID functionality (and UUID support across products is inconsistent at best), this is more easily accomplished using the Java java.util.UUID
class.
Integration
There are several ways to invoke Flyway migration. Although Flyway provides a command-line tool for isolated manual migrations, it is most useful to integrate Flyway into the build process itself so that migrations happen automatically on a coordinated bases during development.
Maven
The Flyway Maven Plugin provides access to Flyway migration via the Maven POM using the org.flywaydb:flyway-maven-plugin
dependency in the <plugins>
section. The migrate
goal by default is bound to the Maven pre-integration-test
phase, so you can be assured that migration occurs after the project is successfully built and packaged, but before integration tests begin. The simple configuration below will automatically discover migration scripts and migrate the schema for the database indicated by <url>
. You can also migrate manually using mvn flyway:migrate
, even if you don't bind the plugin to any goal in the <executions>
section.
The Maven plugin has several other goals to help you manage your schema in addition to actual migration. The goals that have default phases are bound to the pre-integration-test
phase. See the documentation for each goal to find out the available plugin settings. The Summary section has a convenient table with more details on each goal.
migrate
- Migrates the database.
clean
- Drops all objects from the schema.
info
- Provides information about migration history and status.
validate
- Validates the migrations against those available.
baseline
- Baselines the database to a specific version by setting the database to the specified
baselineVersion
and running migrations only after this version. repair
- Repairs the Flyway metadata table
schema_version
, removing failed migrations and updating the checksums to match those of the currently available migrations.
Java
If you want more control over migration, and/or you want to integrate the migration process into the application logic itself, you can directly access the Flyway Java API by including the org.flywaydb:flyway-core
dependency. The main class for accessing Flyway is org.flywaydb.core.Flyway
. Once you create and configure a Flyway
instance, you can invoke the equivalent of the Maven plugin commands by calling methods such as Flyway.migrate()
. Java integration via the API still requires that schema migration files be configured in the project as explained above.
Most importantly you will need to specify the database for Flyway to use. If you are using a javax.sql.DataSource
, you can indicate this by calling Flyway.setDataSource(DataSource dataSource)
. Otherwise if you have a URL connection string, you can call Flyway.setDataSource(String url, String user, String password, String... initSqls)
.
CLI
Flyway also comes with a downloadable command-line tool which may be used to manage migrations independently from Maven. Without the benefit of Maven's dependency resolution and configuration and the lack of a POM-based configuration, this tool requires a separate configuration. Documentation can be found on the CLI tool page.
Review
Summary
Maven Goal | API Method | Description | Default Phase |
---|---|---|---|
baseline | Flyway.baseline() | Baselines the database to a specific version by setting the database to the specified baselineVersion and running migrations only after this version. | N/A |
clean | Flyway.clean() | Drops all objects from the schema. | pre-integration-test |
info | Flyway.info() | Provides information about migration history and status. | N/A |
migrate | Flyway.migrate() | Migrates the database. | pre-integration-test |
repair | Flyway.repair() | Repairs the Flyway metadata table schema_version , removing failed migrations and updating the checksums to match those of the currently available migrations. | N/A |
validate | Flyway.validate() | Validates the migrations against those available. | pre-integration-test |
Gotchas
- If you perform a Java-integrated Flyway migration using in-memory H2 in Oracle mode with no database username, you'll need to set some username or the current version of Flyway will produce an error.
In the Real World
Baselining Schemas
Normally you would never need to baseline a database if you used Flyway to manage your schema from the start, including the initial DDL statements. In the real world you may be presented with a schema that already exists and which your tour DBA has been migrating manually. You will likely want to create Flyway migrations containing the table creation DDL along with historical schema changes, perhaps to use with an in-memory database for testing, but the DBA has already performed those changes manually on the production database—the earlier Flyway migrations have effectively already been performed. Baselining thus becomes a valuable tool for the initial switchover to Flyway, to indicate which migrations have effectively already been applied to the database in the past, and to only apply new Flyway migrations going forward.
Changing Keys and Inter-Table Migration
Correlated subqueries are particularly useful when a schema migration moves or copies information from one table to another, allowing you to ensure that the data is assigned to the appropriate rows. The V3__Added_supplier_uuids.java
migration above added a UUID column named uuid
to the Supplier
table, producing Supplier(
. The items are still related to the suppliers using the generated ID: id
, name, address, uuid)Item(
. Now that each supplier has a UUID, there is no need to maintain a surrogate key generated by the database. You can create an SQL migration that will add a new supplierUuid column to the Item table, copy over the correct UUIds, and then switch to using code
, name, stock, supplierId
, color)Supplier.uuid
as the new primary key, with Item.supplierUuid
as the foreign key.
Think About It
- Why is schema migration necessary? Why can't a developer simply manually modify the database schema when needed?
- What sort of schema version convention will you use in your project? Will it take the same form as your project version? Will you use the issue tracking ticket number? What happens when two migrations are created in separate branches for the same product release, and the order of the migrations is important—that is, one migration depends on the other?
Self Evaluation
- How does Flyway keep track of which migrations have already been applied to a database schema?
- What is the purpose of a baseline version? Will every schema migration use baselining?
- What is the difference between SQL-based and Java-based migrations? Which should you prefer, and why? When would that not be possible?
Task
- Put your current database schema migration under control of Flyway.
- Create all the necessary migrations to create the schema from scratch using DDL statements, as well as to migrate the schema to its current state.
- Use Flyway Java integration to turn on Flyway migration for your in-memory database tests.
- Add Flyway Maven integration to update the production database schema as part of the build. As you've already created and updated the production database schema manually, you'll first need to baseline the schema to the appropriate Flyway migration version.
- Do not add populate books and publishers as part of your migrations. You should be able to add data separately as part of the
load-snapshot
command.
- Make it possible for each publisher to have multiple web sites.
- Create an additional Flyway SQL-based migration to add the appropriate table and copy over existing data as needed.
- Make sure you remove any unnecessary table columns, as well as update your Java code and tests.
- Find a publisher with multiple web sites and add it those web sites your snapshot data. Try to find a publisher that already exists as part of your snapshot data.
- Load a publisher with multiple web sites into your production database, either manually or via
load-snapshot
.
- Switch to using a surrogate UUID key to identify publishers in the database, to allow for multiple publishers with the same name and to support changing a publisher's name in the future.
- Choose the appropriate column type for the UUID and be prepared to explain your decision.
- Create an additional Flyway Java-based migration to generate the UUIDs.
- The schema changes must be part of the same migration so they will be performed as one atomic change.
See Also
- Flyway: Get Started (Flyway)
- Flyway: The agile database migration framework for Java (vimeo - JavaZone)
- Come Fly With Me: Database Migration Patterns with Flyway (YouTube - NLJUG)
- Evolutionary Database Design (Martin Fowler)
References
Resources
Acknowledgments
- Flyway illustrations are used from the Flyway site licensed under CC BY-SA 4.0.
- Flyway is a registered trademark of Boxfuse GmbH.
- Some symbols are from Font Awesome by Dave Gandy.