The database memory management schemes described in the previous sections are designed to relieve you, as a database developer or administrator, from having to manage memory consumption yourself. However, there may be times when you want to optimize the system for your specific application's needs. The following sections explain what controls are available to help you adjust:
Memory consumption in the Java Heap
Frequency and size of the compaction process for table data
Although you as a developer or database administrator cannot control when temporary storage is allocated and freed, you can control how much memory is used. Java provides a way to specify the size of the heap, the portion of memory the JVM uses to store runtime data such as class instances and arrays. The -Xms and -Xmx arguments to the java command specify the initial and maximum heap size, respectively.
By setting both the -Xmx and -Xms arguments, you can control not only the maximum amount of memory used, but also the amount of fluctuation that can occur. Figure 13.3, “Controlling the Java Heap Size”, illustrates how the -Xms and -Xmx arguments can be used to control the overall size of temporary storage.
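For example, to fix the heap at a constant size so it never grows or shrinks, you can set both arguments to the same value. The classpath and class name below are placeholders for your own client application, not part of VoltDB:

```shell
# Hypothetical invocation: initial (-Xms) and maximum (-Xmx) heap both set
# to 2048 MB, so the heap size cannot fluctuate at runtime.
# Replace the classpath and class name with those of your application.
java -Xms2048m -Xmx2048m -classpath ./lib/myapp.jar org.example.MyClient
```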
However, you must be careful when setting the values for the Java heap size, since the JVM will not exceed the value you set as a maximum. It is possible, under some conditions, to force a Java out-of-memory error if the maximum heap size is not large enough for the temporary storage VoltDB requires. See the VoltDB Planning Guide for recommendations on calculating the appropriate heap size for your specific application.
Remember, temporary storage is used to queue the procedure requests and responses. If you are using synchronous procedure calls (and therefore little or no queuing occurs on the server), a small heap size is acceptable. Similarly, if the procedure invocations (in terms of the arguments passed into the procedures) and the return values are small, a lower heap size is acceptable. But if you are invoking procedures asynchronously with large argument lists or return values, be very careful when setting a low maximum heap size.
The compaction process is designed to run in parallel with the database workload, constantly managing the memory incrementally in the background. Under normal operations the process is transparent to the users. However, if your application has an unusual workload or requires special attention regarding memory reclamation, there are database configuration options you can use to modify the compaction process. There are two key triggers you can adjust regarding compaction:
The frequency, or interval, of compaction transactions
The maximum limit for tuples moved as part of a single compaction transaction
By default, the compaction process runs every 60 seconds. You can modify this setting to check for defragmentation more or less frequently by setting the interval attribute of the <compaction> element, which can be found under <systemsettings> in the configuration file. You specify the interval as an integer number of seconds. If you specify the interval as zero (0), no compaction processing is done by the system automatically.
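For instance, the following configuration fragment (a sketch of the syntax just described) disables automatic compaction by setting the interval to zero:

```xml
<systemsettings>
    <!-- interval="0" turns off automatic compaction transactions -->
    <compaction interval="0"/>
</systemsettings>
```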
You can also change the maximum number of tuples moved in a single compaction transaction by setting the maxcount attribute of <compaction>. You do not specify the maximum count as a number of tuples, since tuples can differ dramatically in size depending on the schema, and moving 10 large tuples can be more expensive than moving 10 small ones. Instead, to make the transactions equivalent in size and execution time, you specify the maximum as the number of blocks' worth of tuples that can be moved in one transaction.
For example, if you specify the maxcount as 2, and one table (A) can fit 200 tuples in a block while another table (B) can fit 1,000 tuples in a block, compaction can move at most 400 tuples in table A or 2,000 tuples in table B at one time. You can specify maxcount as 1, 2, or 3 blocks' worth of tuples. The default maximum count is 1.
The following configuration sets the interval to 120 seconds and the maximum count to 2 blocks' worth of tuples.
<systemsettings>
    <compaction interval="120" maxcount="2"/>
</systemsettings>
You can modify the compaction settings at any time: when initializing the database with the voltdb init command, or while it is running using the voltadmin update command. If, for any reason, compaction is not keeping up with the number of delete operations and the amount of fragmentation is growing, you can force the system to run compaction using the voltadmin defrag command. The voltadmin defrag command begins an iteration of the compaction process, regardless of the currently configured schedule, and triggers any compaction transactions that are required.
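For example, assuming a revised configuration file named myconfig.xml (the file name here is hypothetical), the two commands might be used as follows:

```shell
# Apply revised compaction settings to a running database
# (myconfig.xml is a placeholder for your configuration file)
$ voltadmin update myconfig.xml

# Force an immediate compaction pass, regardless of the configured interval
$ voltadmin defrag
```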
When running compaction manually, you have three additional controls available.
You can specify a different maximum count for the volume of data to move during the compaction, using the --count flag and specifying the maximum number of blocks' worth of tuples to move.
You can choose which tables to compact (normally, the compaction process checks all tables on each iteration). To limit the compaction to selected tables, use the --tables flag, specifying a comma-separated list of table names.
You can also choose to do a full compaction, where the maximum limit is ignored and all gaps in memory for the table are filled. You initiate a full compaction using the --full flag. (If --count and --full are both specified, --count is ignored.)
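For instance, the following command (a sketch based on the flags described above) limits a manual compaction pass to three blocks' worth of tuples per transaction:

```shell
# Run compaction now, moving at most 3 blocks' worth of tuples
# in any single compaction transaction
$ voltadmin defrag --count=3
```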
For example, the following command performs a full compaction of the orders and shipments tables:
$ voltadmin defrag --tables=orders,shipments --full
Depending on how much data exists in the table and the amount of fragmentation, a full compaction can take a significant amount of time to execute and could result in increased latency for other transactions. Do not run a full compaction without considering its potential impact on the current workload; preferably, schedule it for off hours.
Finally, as part of operational planning, it is possible to set resource limits related to memory usage, using the <resourcemonitor> element in the database configuration. When you set the limit (using the <memorylimit> subelement), you can also specify compact="true" so that, when the limit is triggered, the database not only pauses but also performs a full compaction of the database tables to alleviate the memory pressure. For example:
<systemsettings>
<resourcemonitor>
<memorylimit size="80%" compact="true"/>
</resourcemonitor>
</systemsettings>