Chapter 8. Cross Datacenter Replication in Kubernetes

Previous chapters describe how to run a single VoltDB cluster within Kubernetes. You can also run multiple independent VoltDB databases in Kubernetes by starting each cluster in a separate region, by using different namespaces within the same Kubernetes cluster, or by running a single instance of the VoltDB Operator that manages multiple clusters in the same namespace. However, some business applications require the same database running in multiple locations, whether for data redundancy, disaster recovery, or geographic distribution. In VoltDB this is done through Cross Datacenter Replication, or XDCR.

Important

In addition to the Kubernetes-specific guidance in this chapter, the following rules apply to XDCR in any operating environment:

  • You must enable command logging when the XDCR environment contains three or more clusters.

  • You can only join (or rejoin) one cluster at a time to the XDCR environment.

Command logging is always recommended when using XDCR to ensure durability. With two clusters and no command logging, transactions processed on one cluster can be lost if that cluster crashes before the binary log is sent to the other cluster. For three or more clusters, command logging is required. Without it, not only can XDCR transactions be lost, but the databases are likely to diverge without warning if a cluster crashes after sending a binary log to one cooperating cluster but not to another.
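The command logging rule can be expressed as a small pre-flight check run before clusters are added to the XDCR environment. The following Python sketch is illustrative only: the ClusterPlan type and the check_command_logging function are hypothetical names invented for this example and are not part of VoltDB, the VoltDB Operator, or its Helm charts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterPlan:
    """Hypothetical description of one VoltDB cluster in an XDCR plan."""
    name: str
    command_logging: bool


def check_command_logging(clusters: List[ClusterPlan]) -> List[str]:
    """Return messages for clusters that break the XDCR command logging rule.

    Three or more clusters: command logging is required (reported as ERROR).
    Exactly two clusters: command logging is recommended (reported as WARNING).
    """
    if len(clusters) < 2:
        return []  # a single cluster is not an XDCR environment
    level = "ERROR" if len(clusters) >= 3 else "WARNING"
    return [
        f"{level}: command logging is disabled on {cluster.name}"
        for cluster in clusters
        if not cluster.command_logging
    ]
```

For example, passing three ClusterPlan entries where one has command_logging set to False yields a single ERROR message naming that cluster, while the same situation with only two entries yields a WARNING.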

8.1. Requirements for XDCR in Kubernetes

Once established, XDCR in Kubernetes works the same way it does in any other network environment, as described in the chapter on Database Replication in the Using VoltDB guide. The key difference when using XDCR in Kubernetes is how you establish the initial connection between the clusters. Unlike traditional servers, which have known IP addresses, Kubernetes assigns network addresses on the fly, and those addresses are not normally accessible outside individual namespaces or regions. Therefore, you must do additional work to create the appropriate network relationships. Specifically, you must:

  • Establish a network mesh between the Kubernetes clusters containing the VoltDB databases so that the nodes of each VoltDB cluster can identify and resolve the IP addresses and ports of all the nodes from the other VoltDB clusters.

  • Configure the VoltDB clusters, including properties that identify the type of mesh involved and mesh-specific annotations that determine what network addresses and ports to use.
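The first requirement, name resolution across clusters, can be spot-checked from a pod in one cluster before XDCR is configured. The sketch below is a minimal example under stated assumptions: resolve_peers is a hypothetical helper, the hostnames you pass in depend entirely on your mesh and naming scheme, and the default port of 5555 assumes the replication port has not been changed. It checks only that names resolve, not that the ports are reachable.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_peers(
    hostnames: Iterable[str],
    port: int = 5555,  # assumed default replication port; adjust if yours differs
    resolver: Callable = socket.getaddrinfo,
) -> Dict[str, List[str]]:
    """Map each hostname to the sorted IP addresses it resolves to.

    A hostname that cannot be resolved maps to an empty list.
    """
    results: Dict[str, List[str]] = {}
    for host in hostnames:
        try:
            infos = resolver(host, port, type=socket.SOCK_STREAM)
        except OSError:  # socket.gaierror is a subclass of OSError
            infos = []
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        results[host] = sorted({info[4][0] for info in infos})
    return results
```

The resolver parameter exists so the lookup can be replaced in tests. In normal use, call resolve_peers with the node names of the remote VoltDB cluster and treat any empty list as a sign that the mesh or DNS configuration is incomplete.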

The following sections describe the different approaches to establishing a network mesh and how to configure the clusters in each case.