9.3. Resetting XDCR When a Cluster Leaves Unexpectedly


Normally, when a cluster is removed from XDCR in an orderly fashion, the other clusters are notified that the cluster has left the mesh. However, if a cluster leaves unexpectedly (for example, if it crashes, or is shut down and deleted without first setting its role to "none" to notify the other clusters), the remaining clusters still treat it as a member that may return. As a result, they continue to save DR logs for the missing member, using up processing cycles and disk space unnecessarily. To correct this situation, you need to reset the XDCR network mesh.

To reset the mesh, you notify the remaining clusters that the missing cluster is no longer a member. You do this by adding the DR ID of the missing cluster to the cluster.clusterSpec.dr.excludeClusters property. The property value is an array of DR IDs. For example, if the DR ID (cluster.config.deployment.dr.id) of the lost cluster is "3", you set the property to "{3}":

--set cluster.clusterSpec.dr.excludeClusters='{3}'
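As a rough sketch of how this flag might be applied in practice, the following script prints one helm upgrade command for each cluster that remains in the mesh. The release names (xdcr-east, xdcr-west), the chart reference voltdb/voltdb, and the use of --reuse-values are assumptions made for illustration and are not taken from this section; substitute the values from your own installation. The script only prints the commands so you can review them before running them.

```shell
#!/bin/sh
# Hypothetical Helm release names for the clusters still in the XDCR mesh.
REMAINING_RELEASES="xdcr-east xdcr-west"
# DR ID of the cluster that left unexpectedly ("3" in the example above).
LOST_DR_ID=3

# Print the Helm command that excludes a DR ID for one release.
# The chart reference (voltdb/voltdb) and --reuse-values are assumptions;
# adjust them to match how the cluster was originally installed.
exclude_cmd() {
  printf "helm upgrade %s voltdb/voltdb --reuse-values --set cluster.clusterSpec.dr.excludeClusters='{%s}'\n" "$1" "$2"
}

# Emit one command per remaining cluster; nothing is executed here.
for release in $REMAINING_RELEASES; do
  exclude_cmd "$release" "$LOST_DR_ID"
done
```

If the remaining clusters run in different namespaces or Kubernetes contexts, each printed command would also need the matching namespace or context options before you run it.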

You must set this property on every cluster that remains in the XDCR environment. If you later want to add the missing cluster (or another cluster with the same DR ID) back into the XDCR mesh, you must first clear this property on each of those clusters. For example:

--set cluster.clusterSpec.dr.excludeClusters=null