Chapter 13. Saving & Restoring a VoltDB Database

Documentation

VoltDB Home » Documentation » Using VoltDB

Chapter 13. Saving & Restoring a VoltDB Database

There are times when it is necessary to save the contents of a VoltDB database to disk and then restore it. For example, if the cluster needs to be shut down for maintenance, you may want to save the current state of the database before shutting down the cluster and then restore the database once the cluster comes back online. Performing periodic backups of the data can also provide a fallback in case of unexpected failures — either physical failures, such as power outages, or logic errors where a client application mistakenly corrupts the database contents.

VoltDB provides shell commands, system procedures, and an automated snapshot feature that help you perform these operations. The following sections explain how to save and restore a running VoltDB cluster, either manually or automatically.

13.1. Performing a Manual Save and Restore of a VoltDB Cluster

Manually saving and restoring a VoltDB database is useful when you need to modify the database's physical structure or make schema changes that cannot be made to a running database. For example, changing the K-safety value, the number of sites per site, or changing the partitioning column of a partitioned table. The normal way to perform such a maintenance operation using save and restore is as follows:

  1. Stop database activities (using pause).

  2. Use save to write a snapshot of the current data to disk.

  3. Shutdown the cluster.

  4. Make changes to the VoltDB schema, cluster configuration, and/or configuration file as desired.

  5. Reinitialize the database with the modified configuration file, using voltdb init --force.

  6. Restart the cluster in admin mode, using voltdb start --pause.

  7. Optionally, reload the schema and stored procedures (if you are changing the schema).

  8. Restore the previous snapshot.

  9. Restart client activity (using resume).

The key is to make sure that all database activity is stopped before the save and shutdown are performed. This ensures that no further changes to the database are made (and therefore lost) after the save and before the shutdown. Similarly, it is important that no client activity starts until the database has started and the restore operation completes.

Also note that Step #7, reloading the schema, is optional. If you are going to reuse the same schema in a new database instance, the restore operation will automatically load the schema from the snapshot itself. If you want to modify the schema in any way, such as changing indexes or tables and columns, you should load the modified schema before restoring the data from the snapshot. If the database schema is not empty (that is there are tables already defined), only the data is loaded from the snapshot. See Section 13.1.3.2, “Modifying the Database Schema and Stored Procedures” for more information on modifying the schema when restoring snapshots.

Save and restore operations are performed either by calling VoltDB system procedures or using the corresponding voltadmin shell commands. In most cases, the shell commands are simpler since they do not require program code to use. Therefore, this chapter uses voltadmin commands in the examples. If you are interested in programming the save and restore procedures, see Appendix G, System Procedures for more information about the corresponding system procedures.

When you issue a save command, you specify a path where the data will be saved and a unique identifier for tagging the files. VoltDB then saves the current data on each node of the cluster to a set of files at the specified location (using the unique identifier as a prefix to the file names). This set of files is referred to as a snapshot, since it contains a complete record of the database for a given point in time (when the save operation was performed).

The --blocking option lets you specify whether the save operation should block other transactions until it completes. In the case of manual saves, it is a good idea to use this option since you do not want additional changes made to the database during the save operation.

Note that every node in the cluster uses the same absolute path, so the path specified must be valid, must exist on every node, and must not already contain data from any previous saves using the same unique identifier, or the save will fail.

When you issue a restore command, you specify the same absolute path and unique identifier used when creating the snapshot. VoltDB checks to make sure the appropriate save set exists on each node, then restores the data into memory.

13.1.1. How to Save the Contents of a VoltDB Database

To save the contents of a VoltDB database, use the voltadmin save command. The following example creates a snapshot at the path /tmp/voltdb/backup using the unique identifier TestSnapshot.

$ voltadmin save --blocking /tmp/voltdb/backup "TestSnapshot"

In this example, the command tells the save operation to block all other transactions until it completes. It is possible to save the contents without blocking other transactions (which is what automated snapshots do). However, when performing a manual save prior to shutting down, it is normal to block other transactions to ensure you save a known state of the database.

Note that it is possible for the save operation to succeed on some nodes of the cluster and not others. When you issue the voltadmin save command, VoltDB displays messages from each partition indicating the status of the save operation. If there are any issues that would stop the process from starting, such as a bad file path, they are displayed on the console. It is a good practice to examine these messages to make sure all partitions are saved as expected.

Note that it is also possible to issue the voltadmin save command without arguments. In that case the snapshot is saved to the default snapshots folder in the database root directory. This can be useful because the voltdb start command can automatically restore the latest snapshot in that directory as described in the next section.

13.1.2. How to Restore the Contents of a VoltDB Database Manually

The easiest way to restore a snapshot is to let VoltDB do it for you as part of the recover operation. If you are not changing the cluster configuration you can use an automated snapshot or other snapshot saved into the voltdbroot/snapshots directory by simply restarting the cluster nodes using the voltdb start command. With the start action VoltDB automatically starts and restores the most recent snapshot. If command logging is enabled, it also replays any logs after the snapshot. This approach has the added benefit that VoltDB automatically loads the previous schema as well as part of the snapshot.

However, you cannot use voltdb start to restore a snapshot if the physical configuration of the cluster has changed or if you want to restore an earlier snapshot or a snapshot stored in an alternate location. In these cases you must do a manual restore.

To manually restore a VoltDB database from a snapshot previously created by a save operation, you can create a new database instance and use the voltadmin restore command. So, for example, if you modify the configuration, you must re-initialize the root directory with the new configuration file, using the --force flag to overwrite the previous configuration and database content:

$ voltdb init --config=newconfig.xml --force

Then you can start the reconfigured database, which creates a new empty database. It is also a good idea to start the database in admin mode by including the --pause flag:

$ voltdb start --pause

Finally, you restore the previously saved snapshot using the same pathname and unique identifier used during the save. The following example restores the snapshot created by the example in Section 13.1.1 and resumes normal operation (that is, exits admin mode).

$ voltadmin restore /tmp/voltdb/backup "TestSnapshot"
$ voltadmin resume

As with save operations, it is always a good idea to check the status information displayed by the command to ensure the operation completed as expected.

13.1.3. Changing the Cluster Configuration Using Save and Restore

Most changes to a VoltDB database can be made "on the fly" while the database is running. Adding and removing tables, enabling and disabling database features such as import and export, and adding or updating stored procedures can all be done while the database is active. However, between a save and a restore, it is possible to make changes to the database and cluster configuration that cannot be made on a running cluster. For example, you can:

  • Add or remove nodes from the cluster

  • Modify the schema and/or stored procedures that:

    • Change partitioned tables to replicated and vice versa

    • Change the partitioning column on partitioned tables

    • Add unique indexes to tables with existing data

  • Change the number of sites per host

  • Change the K-safety value

The following sections discuss these procedures in more detail.

13.1.3.1. Adding and Removing Nodes from the Database

To add nodes to the cluster, use the following procedure:

  1. Save the database with the voltadmin save command.

  2. Shutdown and re-initialize the database root directories on each node (including initializing new root directories for the nodes you are adding).

  3. Start the cluster (including the new nodes) specifying the new server count with the --count argument to the voltdb start command.

  4. Restore the database with the voltadmin restore command..

When the snapshot is restored, the database (and partitions) are redistributed over the new cluster configuration.

It is also possible to remove nodes from the cluster using this procedure. However, to make sure that no data is lost in the process, you must copy the snapshot files from the nodes that are being removed to one of the nodes that is remaining in the cluster. This way, the restore operation can find and restore the data from partitions on the missing nodes.

13.1.3.2. Modifying the Database Schema and Stored Procedures

The easiest and recommended way to change the database schema is by sending the appropriate SQL database definition language (DDL) statements to the sqlcmd utility. Similarly you can update the stored procedures on a running database using the LOAD CLASSES and REMOVE CLASSES directives.

However, there are a few changes that cannot be made to a running database,. For example, changing the partitioning column of a table if the table contains data. For these changes, you must use save and restore to change the schema.

To modify the database schema or stored procedures between a save and restore, make the appropriate changes to the source files (that is, the database DDL and the stored procedure Java source files). If you modify the stored procedures, be sure to repackage any Java stored procedures into a JAR file. Then you can:

  1. Save the database with the voltadmin save command.

  2. Shutdown and re-initialize the database root directories on each node.

  3. Start the cluster with the voltdb start command.

  4. Load the modified schema and stored procedures using sqlcmd.

  5. Restore the database contents with the voltadmin restore command.

Two points to note when modifying the database structure before restoring a snapshot are:

  • When existing rows are restored to tables where new columns have been added, the new columns are filled with either the default value (if defined by the schema) or nulls.

  • When changing the datatypes of columns, it is possible to decrease the datatype size (for example, going from an INT to an TINYINT). However, if any existing values exceed the capacity of the new datatype (such as an integer value of 5,000 where the datatype has been changed to TINYINT), the entire restore will fail.

If you remove or modify stored procedures (particularly if you change the number and/or datatype of the parameters), you must make sure the corresponding changes are made to client applications as well.