Chapter 6. Database Replication in Kubernetes


VoltDB Home » Documentation » VoltDB Kubernetes Administrator's Guide

Chapter 6. Database Replication in Kubernetes

Previous chapters describe how to run a single VoltDB cluster within Kubernetes. Of course, you can run multiple independent VoltDB databases in Kubernetes. You do this by starting each cluster in separate regions, under different namespaces within the same Kubernetes cluster, or running a single instance of the VoltDB Operator managing multiple clusters in the same namespace. However, some business applications require the same database running in multiple locations — whether for data redundancy, disaster recovery, or geographic distribution. In VoltDB this is done through Cross Datacenter Replication, or XDCR.

XDCR in Kubernetes works the same way it does in any other network environment, as described in the chapter on Database Replication in the Using VoltDB guide. The key differences when using XDCR in Kubernetes are:

  • You must establish a network mesh between the Kubernetes clusters containing the VoltDB databases so that the nodes of each VoltDB cluster can identify and resolve the IP addresses and ports of all the nodes from the other VoltDB cluster.

  • You must configure the VoltDB clusters with unique replication IDs and appropriate connection properties, just as you would outside of Kubernetes, except here you define them in YAML rather than XML.

The following sections explain how to perform these actions.

6.1. Configuring the Network Mesh

For XDCR to work, the network environment must support each cluster reaching the nodes of the other cluster through the IP addresses and ports that the clusters advertise. This is necessary because the XDCR relationship occurs in three distinct steps:

  1. First, the clusters connect over the replication port (port 5555, by default). The initial connection confirms that the schema of the two clusters match for all DR tables and that there is data in only one of the clusters.

  2. Once the clusters agree on the schema, each cluster sends a list of node IP addresses and ports to the other cluster and multiple connections are made, node-to-node, between the two clusters.

  3. Finally, if there is existing data, a synchronization snapshot is sent between the clusters before replication can begin.

For Step #2 to occur, both clusters must agree on (and be able to resolve) the IP addresses and ports for each node in the other clusters. This can be challenging when crossing network domains and firewalls. It is even more daunting in Kubernetes, where the IP and service addresses within one Kubernetes cluster may not be visible by default outside that cluster. Kubernetes provides services, such as cluster IPs and loadBalancers, that help expose your services beyond the individual cluster. However, the resulting addresses do not match the identities of the cluster nodes themselves.

So there are two separate configurations to consider:

  • XDCR between two databases within the same Kubernetes cluster

  • XDCR between two database in different Kubernetes clusters, regions, or even hosted by different service providers

In the first case, where both copies of the database are within the same Kubernetes cluster, Kubernetes' default network scope allows the databases to resolve the resulting network addresses. In the second case, where the databases are in different clusters (which is essentially always the case for geographically distributed databases), you must use additional services, such as Consul, to create the necessary network mesh between the clusters.

How you set up the network mesh depends on the network services you use and in some cases the configuration of the host provider of Kubernetes itself. The network configuration may also affect what IP addresses you use when configuring your XDCR clusters. It is beyond the scope of this guide to account for all possible alternatives. Instead the following sections provide the overall guidelines for configuring XDCR clusters in a local environment, which you can extend to a distributed environment based on the network mesh and resulting IP addresses you create.