4.3. Upgrading the Cluster

Sometimes you need to update or reconfigure the server infrastructure on which the VoltDB database is running. Server upgrades are one example: fixing or replacing hardware, updating the operating system, or otherwise modifying the underlying system.

Server upgrades usually require stopping the VoltDB database process on the specific server being serviced. However, if your database cluster uses K-safety for enhanced availability, it is possible to complete server upgrades without any database downtime by performing a rolling hardware upgrade, where each server is upgraded in turn using the voltadmin stop and rejoin commands.

Another type of upgrade is reconfiguring the cluster as a whole, for example to add or remove servers or to change the number of partitions per server that VoltDB uses.

Adding servers to the cluster can happen without stopping the database. This is called elastic scaling. Removing servers or changing the number of sites per host requires restarting the cluster during a maintenance window.

The following sections describe four methods of cluster upgrade:

  • Performing server upgrades

  • Performing rolling hardware upgrades on K-safe clusters

  • Adding servers to a running cluster through elastic scaling

  • Reconfiguring the cluster with a maintenance window

4.3.1. Performing Server Upgrades

If you need to upgrade or replace the hardware or software (such as the operating system) of individual servers, you can do so without taking down the database as a whole. As long as the cluster is running with a K-safety value of one or more, you can take a server out of the cluster without stopping the database. You can then fix the server hardware, upgrade software (other than VoltDB), or even replace the server entirely, then bring the server back into the cluster.

To perform a server upgrade:

  1. Stop the VoltDB server process on the server using the voltadmin stop command. As long as the cluster is K-safe, the rest of the cluster will continue running.

  2. Perform the necessary upgrades.

  3. Have the server rejoin the cluster using the voltdb rejoin command.

The rejoin command starts the database process on the server, contacts the database cluster, then copies the necessary partition content from other cluster nodes so the server can participate as a full member of the cluster. While the server is rejoining, the other database servers remain accessible and actively process queries from client applications.

When rejoining a cluster you must specify a host server that the rejoining node will connect to. The host can be any server still in the cluster; it does not have to be the same host specified when the cluster was initially started. For example:

$ voltdb rejoin --host=voltsvr4 \
         --deployment=deployment.xml \
         --license=~/license.xml
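
The stop and rejoin steps can be wrapped in small helper functions. The following is a minimal sketch, not a supported tool: the function names, the DRY_RUN, DEPLOYMENT, and LICENSE variables, and the hostnames are illustrative assumptions, and the exact arguments that voltadmin stop accepts (here a --host option naming a server that stays in the cluster, followed by the server to stop) should be checked against the voltadmin help for your VoltDB version.

```shell
#!/bin/sh
# Sketch of steps 1 and 3 of a single-server upgrade.
# Assumption: DRY_RUN, DEPLOYMENT, LICENSE and the function names are
# illustrative; they are not part of VoltDB.

DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands, 0 = run them
DEPLOYMENT="${DEPLOYMENT:-deployment.xml}"
LICENSE="${LICENSE:-$HOME/license.xml}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: stop the VoltDB process on the server being serviced.
# $1 = a server that remains in the cluster, $2 = the server to stop.
# Assumption: the argument layout of "voltadmin stop" may differ by version.
stop_server() {
    run voltadmin stop --host="$1" "$2"
}

# Step 3: run on the serviced server once the upgrade work is finished.
# $1 = any server still in the cluster.
rejoin_server() {
    run voltdb rejoin --host="$1" \
        --deployment="$DEPLOYMENT" \
        --license="$LICENSE"
}
```

With DRY_RUN left at its default of 1, calling stop_server voltsvr4 voltsvr2 only prints the command it would run, which makes it easy to review the sequence before setting DRY_RUN=0.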

If the cluster is not K-safe — that is, the K-safety value is 0 — then you must follow the instructions in Section 4.3.4, “Reconfiguring the Cluster During a Maintenance Window” to upgrade the servers.

4.3.2. Performing Rolling Hardware Upgrades on K-Safe Clusters

If you need to upgrade all of the servers in a K-safe cluster (for example, if you are upgrading the operating system), you can perform a rolling hardware upgrade by stopping, upgrading, then rejoining each server one at a time. Using this process the entire cluster can be upgraded without suffering any downtime of the database. Just be sure to wait until the rejoining server has become a full member of the cluster before removing and upgrading the next server in the rotation. Specifically, wait until the following message appears in the log or on the console for the rejoining server:

Node rejoin completed. 
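
One way to automate this wait is to poll the server log for the message. The following is a minimal sketch under stated assumptions: wait_for_rejoin is an illustrative name, the log file path must be supplied by the caller (its location depends on how the server was started), and the log is assumed not to contain a completion message left over from an earlier rejoin.

```shell
#!/bin/sh
# Poll a log file until the rejoin completion message appears.
# $1 = path to the rejoining server's log (assumption: caller knows it)
# $2 = maximum number of seconds to wait (default 600)
# Returns 0 when the message is found, 1 on timeout.
wait_for_rejoin() {
    logfile="$1"
    timeout="${2:-600}"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # -F matches the text literally; -q suppresses output.
        if grep -qF "Node rejoin completed" "$logfile" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A caller might use it as wait_for_rejoin /path/to/volt.log 900 and move on to the next server only if the function returns success.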

Alternatively, you can attempt to connect to the server remotely, for example using the sqlcmd command line utility. If your connection is rejected, the rejoin has not finished. If you successfully connect to the client port of the rejoining node, the rejoin is complete:

$ sqlcmd --servers=myserver
SQL Command :: myserver:21212
1>
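
The rotation itself can be sketched as a loop that refuses to move on until the current server has rejoined. This is only an outline: rolling_upgrade, service_server, and wait_until_rejoined are hypothetical names, and the two hooks are placeholders that you would replace with your own stop, upgrade, rejoin, and verification logic.

```shell
#!/bin/sh
# Rolling upgrade outline: one server at a time, never two at once.

# Placeholder hook (assumption): stop, upgrade, and rejoin one server.
service_server() {
    echo "TODO: stop, upgrade, and rejoin $1"
}

# Placeholder hook (assumption): succeed only once the server has
# rejoined, for example by checking its log or connecting with sqlcmd.
wait_until_rejoined() {
    echo "TODO: confirm $1 has rejoined"
}

# Service each server named on the command line in turn.
# Stops at the first failure so the cluster never loses a second node
# while an earlier one is still out.
rolling_upgrade() {
    for server in "$@"; do
        service_server "$server" || {
            echo "upgrade failed on $server; stopping" >&2
            return 1
        }
        wait_until_rejoined "$server" || {
            echo "$server did not rejoin; stopping" >&2
            return 1
        }
    done
}
```

Stopping at the first failure is the important design choice here: continuing to the next server while one is still out would take a second node out of the cluster, which a cluster with a K-safety value of 1 cannot survive.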

Note

You cannot update the VoltDB software itself using the rolling hardware upgrade process, only the operating system, hardware, or other software. See the section on upgrading VoltDB software using database replication for information about minimizing downtime during a VoltDB software upgrade.

4.3.3. Adding Servers to a Running Cluster with Elastic Scaling

If you want to add servers to a VoltDB cluster — usually to increase performance and/or capacity — you can do this without having to restart the database. You add servers to the cluster with the voltdb add command, specifying one of the existing nodes with the --host flag. For example:

$ voltdb add --host=voltsvr4 \
         --license=~/license.xml

You must add a full complement of servers to match the K-safety value (K+1) before the servers can participate in the cluster. For example, if the K-safety value is 2, you must add 3 servers before they actually become part of the cluster and the cluster rebalances its partitions.
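
The K+1 rule is simple arithmetic, sketched below. The function names are illustrative, and the second function assumes that servers added beyond one full complement wait for the next complete set, which the text above implies but does not state outright.

```shell
#!/bin/sh
# Arithmetic for elastic scaling: servers join in sets of K+1.
# Function names are illustrative, not part of VoltDB.

# Size of a full complement for K-safety value $1.
servers_per_group() {
    echo $(( $1 + 1 ))
}

# How many more servers are needed to complete the current set.
# $1 = K-safety value, $2 = servers added so far.
# Assumption: extra servers wait until another full set is present.
servers_still_needed() {
    group=$(( $1 + 1 ))
    echo $(( (group - $2 % group) % group ))
}
```

For a K-safety value of 2, servers_per_group 2 prints 3, and servers_still_needed 2 1 prints 2, meaning that two more servers must be added before the first one becomes an active part of the cluster.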

When you add servers to a VoltDB database, the cluster performs the following actions:

  1. The new servers are added to the cluster configuration and sent copies of the schema, stored procedures, and deployment file.

  2. Once sufficient servers are added, copies of all replicated tables and their share of the partitioned tables are sent to the new servers.

  3. As the data is rebalanced, the new servers begin processing transactions for the partition content they have received.

  4. Once rebalancing is complete, the new servers are full members of the cluster.

4.3.4. Reconfiguring the Cluster During a Maintenance Window

If you want to remove servers from the cluster permanently (as opposed to temporarily removing them for maintenance as described in Section 4.3.1, “Performing Server Upgrades”) or you want to change other cluster-wide attributes, such as the number of partitions per server, you need to restart the cluster. Stopping the database temporarily to perform this sort of reconfiguration is known as a maintenance window.

The steps for reconfiguring the cluster with a maintenance window are:

  1. Place the database in admin mode (voltadmin pause).

  2. Perform a manual snapshot of the database (voltadmin save --blocking).

  3. Shut down the database (voltadmin shutdown).

  4. Make the necessary changes to the deployment file.

  5. Start a new database using the voltdb create --force command and the edited deployment file.

  6. Restore the snapshot created in step 2 (voltadmin restore).

  7. Return the database to normal operations (voltadmin resume).
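
The voltadmin portions of this procedure can be sketched as two helper functions, one for each side of the restart. This is a sketch under stated assumptions rather than a complete tool: the function names and the DRY_RUN, SNAP_DIR, and SNAP_ID variables are illustrative, the snapshot directory and identifier are placeholders you must choose, and connection options for voltadmin are omitted.

```shell
#!/bin/sh
# Maintenance window helpers, split around the manual steps 4 and 5.
# Assumption: DRY_RUN, SNAP_DIR, SNAP_ID and the function names are
# illustrative; they are not part of VoltDB.

DRY_RUN="${DRY_RUN:-1}"                       # 1 = print commands, 0 = run them
SNAP_DIR="${SNAP_DIR:-/tmp/voltdb/backup}"    # placeholder snapshot directory
SNAP_ID="${SNAP_ID:-maintenance}"             # placeholder snapshot identifier

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 1 to 3: pause, take a blocking snapshot, shut down.
# Each step runs only if the previous one succeeded.
quiesce_and_stop() {
    run voltadmin pause &&
    run voltadmin save --blocking "$SNAP_DIR" "$SNAP_ID" &&
    run voltadmin shutdown
}

# Steps 6 and 7: restore the snapshot from step 2, then resume.
restore_and_resume() {
    run voltadmin restore "$SNAP_DIR" "$SNAP_ID" &&
    run voltadmin resume
}
```

Between the two calls, edit the deployment file and start the new database with voltdb create --force (steps 4 and 5). Those steps are left out of the script because they involve manual edits and starting the database servers. Chaining the commands with && ensures the database is never shut down unless the snapshot succeeded.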