2.9. Eliminating Server Process Latency

The preceding sections explain how to configure your servers and network to maximize the performance of VoltDB. The goal is to prevent server functions, such as swapping or Java garbage collection, from disrupting the proper operation of the VoltDB process.

Any latency in the scheduling of VoltDB threads can impact the performance of your database. These delays produce corresponding latency in the database transactions themselves. Equally important, prolonged latency can interrupt intra-cluster communication, to the point where the cluster incorrectly assumes a node has failed and drops it as a member. If server latency keeps a node from responding to network messages for longer than the heartbeat timeout setting, the rest of the cluster drops the node as a "dead host".

Therefore, in addition to the configuration settings described earlier in this chapter, the following are some known causes of latency you should watch out for:

Other applications — Clearly, running other applications on the same servers as VoltDB can result in unpredictable resource conflicts for memory, CPU, and disk access. Running VoltDB on dedicated servers is always recommended for production environments.
Frequent snapshots — Initiating snapshots consumes resources. Especially on a database under heavy load, this can result in latency spikes. Although it is possible to run both automated snapshots and command logging (which performs its own snapshots), they are redundant and can cause unnecessary delays. Also, when using command logging on a busy database, consider increasing the size of the command log segments if snapshots are occurring too frequently.
I/O contention — Contention for disk resources can interfere with the effective processing of VoltDB durability features. This can be avoided by allocating separate devices for individual disk-based activity. For example, wherever possible locate command logs and snapshots on separate devices.
JVM statistics collection — Enabling Java Virtual Machine (JVM) statistics can produce erratic latency issues for memory-intensive applications like VoltDB. Disabling JVM stats is strongly recommended when running VoltDB servers. You can disable JVM stats by issuing the following command before starting the VoltDB process:
```
export VOLTDB_OPTS='-XX:+PerfDisableSharedMem'
```
Alternatively, you can write the JVM stats to an in-memory virtual disk, such as /tmpfs.
Hardware power saving options — Beware of hardware options that attempt to conserve energy by putting "idle" processes or resources into a reduced or sleep state. Resuming quiesced resources takes time and the requesting process is blocked for that period. Make sure power saving options are disabled for the resources you need (such as CPUs and disks).
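
The JVM statistics and power saving items above lend themselves to a quick check before starting a server. The following is a minimal sketch, not an official VoltDB tool. The function names add_perf_flag and list_power_saving_cpus are hypothetical. The sketch assumes a Linux host that exposes CPU frequency governors under /sys/devices/system/cpu, and it treats any governor other than "performance" as worth reviewing, which is a simplification. It does not inspect disk or CPU sleep states.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; these are not part of VoltDB.

# Add -XX:+PerfDisableSharedMem to VOLTDB_OPTS unless it is already there,
# keeping any options that were set earlier instead of overwriting them.
add_perf_flag() {
    case " ${VOLTDB_OPTS:-} " in
        *" -XX:+PerfDisableSharedMem "*) ;;   # already present, nothing to do
        *) VOLTDB_OPTS="${VOLTDB_OPTS:+$VOLTDB_OPTS }-XX:+PerfDisableSharedMem" ;;
    esac
    export VOLTDB_OPTS
}

# Print one line per CPU whose cpufreq governor is not "performance".
# $1: sysfs CPU directory (assumed default: /sys/devices/system/cpu)
list_power_saving_cpus() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for gov_file in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip CPUs without a readable governor file (or an unmatched glob).
        [ -r "$gov_file" ] || continue
        governor=$(cat "$gov_file")
        if [ "$governor" != "performance" ]; then
            cpu_dir=${gov_file%/cpufreq/scaling_governor}
            echo "${cpu_dir##*/}: $governor"
        fi
    done
}
```

Calling add_perf_flag in the shell that later starts the VoltDB process has the same effect as the export command shown earlier, but preserves any options already in VOLTDB_OPTS. Calling list_power_saving_cpus with no argument prints nothing when every CPU reports the "performance" governor.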

Although not specific to server resources, the following are some additional causes of latency due to improper database and application design. When combined with the previous server issues, they can result in erratic and troublesome performance and even node failures.

Sequential scans of large tables — Perhaps the most common cause of latency is queries that require a sequential scan of extremely large tables of data. Any query that must read through every record in a table will perform badly in proportion to the size of the table. Be sure to review the execution plans for key transactions to ensure indexes are used as expected and add indexes to avoid sequential scans wherever possible.
Large deletes — VoltDB retains and reuses memory whenever you delete tuples. If the amount of deleted space reaches a certain percentage of overall memory usage, VoltDB compacts the memory used. Transactions wait while this function is performed. To avoid latency caused by compaction, you can perform deletes in smaller, ongoing transactions. The USING TTL feature of the CREATE TABLE statement can assist in automating the incremental purging of old records.
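
Reviewing execution plans for sequential scans can be partly automated. The following is a minimal sketch. The function name find_sequential_scans is hypothetical, and it assumes that the plan text marks full table scans with the words "SEQUENTIAL SCAN", so check that against the plan output of your VoltDB version. The usage comment further assumes that sqlcmd accepts statements on standard input and supports the explain directive. The ORDERS table and AMOUNT column are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper; not part of VoltDB.

# Read execution-plan text on standard input and print every line that
# mentions a sequential scan. The exit status is 0 if at least one such
# line was found and non-zero otherwise (this is grep's own status).
find_sequential_scans() {
    grep -i "SEQUENTIAL SCAN"
}

# Possible usage (assumes sqlcmd reads statements from standard input):
#   echo "explain SELECT * FROM ORDERS WHERE AMOUNT > 100;" | sqlcmd | find_sequential_scans
```

Any output is a prompt to look at the query and consider whether an index on the filtered columns would let the planner avoid reading every record.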
