May 26, 2015 (Updated May 27, 2015)
This document provides information about known issues and limitations to the current release of VoltDB. If you encounter any problems not listed below, please be sure to report them to email@example.com. Thank you.
For customers upgrading from pre-5.0 releases of VoltDB, please see the V4.0 Upgrade Notes for special considerations when upgrading from previous major versions. Otherwise, the process for upgrading from a previous version of VoltDB is as follows:
1. Place the database in admin mode (using voltadmin pause).
2. Perform a manual snapshot of the database (using voltadmin save).
3. Shut down the database (using voltadmin shutdown).
4. Upgrade the VoltDB software.
5. Restart the database (using the voltdb create action).
6. Reload any Java stored procedures and the database schema (using the sqlcmd directives load classes and file).
7. Restore the snapshot created in Step #2 (using voltadmin restore).
8. Return the database to normal operations (using voltadmin resume).
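The steps above can be sketched as an ordered command plan. This is only an illustration, not an official upgrade script: the snapshot directory, snapshot identifier, deployment file, host name, JAR file, and schema file are hypothetical placeholders, and the --deployment and --host arguments are borrowed from the voltdb create example elsewhere in these notes. The function builds and prints the commands; it does not run them.

```python
def upgrade_plan(snapshot_dir, nonce, deployment, host, jar, schema):
    """Return the eight upgrade steps as (step, tool, command) tuples.

    Every argument is a placeholder chosen by the caller; nothing is executed.
    """
    commands = [
        ("shell", "voltadmin pause"),
        ("shell", f"voltadmin save {snapshot_dir} {nonce}"),
        ("shell", "voltadmin shutdown"),
        ("manual", "upgrade the VoltDB software"),
        ("shell", f"voltdb create --deployment={deployment} --host={host}"),
        ("sqlcmd", f"load classes {jar}; file {schema};"),
        # The restore must name the same directory and identifier as the save.
        ("shell", f"voltadmin restore {snapshot_dir} {nonce}"),
        ("shell", "voltadmin resume"),
    ]
    return [(step, tool, cmd) for step, (tool, cmd) in enumerate(commands, start=1)]


# Hypothetical values, for illustration only.
plan = upgrade_plan("/tmp/voltdb/backup", "preupgrade",
                    "deployment.xml", "voltsvr1",
                    "myprocs.jar", "myschema.sql")
for step, tool, cmd in plan:
    print(f"{step}. [{tool}] {cmd}")
```

Keeping the snapshot location and identifier in one place makes it harder to restore from a different snapshot than the one taken in step 2.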
Users of previous versions of VoltDB should take note of the following changes that might impact their existing applications.
Additional subquery support
Previously it was possible to use subqueries in the FROM clause of a SELECT statement. Now subqueries are supported in many more situations. Scalar subqueries, which return a single value, can be used in most cases where SQL functions can be used. Non-scalar subqueries, which return multiple rows or multiple columns per row, can be used in array comparisons and in the IN and EXISTS predicates.
Subqueries are valid only in the SELECT statement. They cannot be used in other SQL statements or index definitions. Also, for the initial release of extended subquery support, only replicated tables can be used in subqueries outside of the FROM clause.
VoltDB Management Center improvements
The VoltDB Management Center contains a number of improvements and new capabilities, including:
Improved performance of JDBC prepared statements and ad hoc queries with parameters
When processing ad hoc queries that use parameters and placeholders (rather than a single SQL statement as a text string), VoltDB now makes use of previously cached queries to significantly improve performance for repeated queries. This can be most notable for JDBC prepared statements that are implemented as ad hoc queries with parameters.
SHA-256 support in Python client
The Python client library has been updated to support SHA-256 hashing of passwords when authenticating to servers with security enabled.
In addition to the new features and capabilities described above, the following limitations in previous versions have been resolved:
Excessive memory use when overflowing queued export or DR data corrected
There was an issue in earlier releases where, if the target of export or database replication (DR) stalled, the sending cluster buffered queued data to disk but did not properly free the associated memory, so memory usage would increase. If the service buffered data to disk for an extended period of time, the server process could run out of memory.
This issue has been resolved and memory associated with data buffered to disk is released appropriately. Note, however, that even though excessive memory usage is no longer a problem, you should always try to resolve issues with stalled downstream systems when using export or DR because buffered data could eventually exceed disk storage capacity.
New C++ client supports SHA-256 hashing
The C++ client library has been updated to support SHA-256 hashing of passwords when authenticating to servers with security enabled. By default, the client supports past and present server versions by using SHA-1 hashing. However, when connecting to VoltDB 5.2 and later servers, you can use SHA-256 hashing by specifying the hash type in the client configuration. For example:
voltdb::ClientConfig config("myusername", "mypassword", voltdb::HASH_SHA256);
voltdb::Client client = voltdb::Client::create(config);
Now both the Java and C++ client libraries support SHA-256 hashing. The new C++ client is available from the VoltDB client downloads page.
Support for SHA-2 in the voltdb mask command
VoltDB 5.2 introduced use of SHA-2 hashing for authentication. This release brings the voltdb mask function up to date with the new authentication scheme. For customers using the mask function, be sure to re-hash your deployment file using the 5.2.1 voltdb mask command and use the newly hashed deployment file when starting the database to ensure all command utilities can authenticate properly.
Ability for database replication (DR) to resume across cluster outages
Previously, database replication (DR) was able to continue despite individual node failures (in a K-safe environment). However, failure of either the master or replica cluster would force a restart of DR. Beginning with 5.2, DR can resume across cluster failures when either the master or replica is recovered from command logs. See the chapter on "Database Replication" in the Using VoltDB manual for details.
Secure export to Hadoop using Kerberos
The HTTP export connector now supports the use of Kerberos authentication when exporting to a WebHDFS endpoint that is configured to use Kerberos. See the section on using the HTTP export connector in the Using VoltDB manual for details.
Support for partial indexes
VoltDB now supports partial indexes. That is, the index definition can contain a WHERE clause limiting the rows that are included in the index. For example:
CREATE INDEX completed_tasks ON tasks (task_id, startdate, enddate) WHERE enddate IS NOT NULL;
For the initial release of partial indexes, there are certain limitations on when and where such indexes and the tables associated with them can be modified. For now, you cannot use the ALTER TABLE statement to modify a table with a partial index. This limitation is expected to be relaxed in a future release.
New VoltDB Management Center features
The Management Center, VoltDB's web-based management console, continues to be extended and improved. This release contains two major new features:
New bitwise functions
VoltDB now supports several new functions for performing bitwise operations on BIGINT values. The new functions support standard bitwise operations such as AND, OR, XOR, and NOT as well as bit shifting operations. See the reference pages for the BITAND(), BITNOT(), BITOR(), BITXOR(), BIT_SHIFT_LEFT(), and BIT_SHIFT_RIGHT() functions in the Using VoltDB manual for details.
New HEX() function
Another new function, HEX(), converts a BIGINT value into its hexadecimal representation as a string. See the reference page for the HEX() function in the Using VoltDB manual for details.
Support for SHA-2
VoltDB now supports SHA-2 hashing of credentials between the Java and C++ client libraries and the server. When you pass a username and password, the updated client library uses a SHA-2 hash of the credentials. On the server side, the VoltDB server accepts both SHA-1 (sent by previous versions of the client) and SHA-2. So both current and previous versions of the client libraries continue to work with the latest server release.
New voltadmin command to stop individual servers
The VoltDB command line utility, voltadmin, now supports the stop command. The voltadmin stop command stops the VoltDB server process on the specified node. Note that the stop command can only be used on a K-safe cluster and will not intentionally shut down the database. That is, the command will only stop a node if there are enough nodes left for the cluster to remain viable.
In addition to the new features and capabilities described above, the following limitations in previous versions have been resolved:
VoltDB 5.1.2 is a patch release that provides performance and stability improvements for Database Replication (DR).
Improved performance for initial DR snapshot.
When database replication starts, a snapshot is sent from the master database to the replica. In this release several I/O improvements have been made to improve the performance and reliability of the initial DR snapshot.
Improved management of DR buffers
There was an issue with how multiple buffers were grouped and managed in DR, which could result in decreased replication throughput. This issue has been resolved.
VoltDB 5.1.1 is a patch release that fixes an issue introduced in 5.1.
Bug fix: Excessive CPU usage on idle database
Changes in VoltDB 5.1 introduced a process that, when the database was idle, would "spin" making it appear that the database was consuming significant CPU cycles. When the database was active processing queries, the CPU usage would drop to normal levels.
Although not dangerous, this behavior was misleading and is corrected by the current update.
VoltDB 5.1 introduces several significant new features and enhancements. Existing customers should pay close attention to the following notes to see what changes, if any, they may want to make to their applications and operations to take advantage of the new capabilities.
New implementation of Database Replication (DR)
Database Replication (DR) lets you automatically copy updates to database tables from one database (the master) to another (the replica). Starting with VoltDB 5.1, DR has been rewritten to remove any single point of failure, improve performance, and allow new capabilities in the future. New features include:
For existing DR customers, the new capabilities and the elimination of the DR agent do necessitate some operational changes. Specifically, you must now:
See the chapter on Database Replication in the Using VoltDB manual for details.
Ability to export data to multiple streams
VoltDB now allows you to export data to more than one target at a time. By assigning export tables to individual streams and then configuring each stream separately in the deployment file, you can export data to multiple targets simultaneously. For example, you might export deduped sensor data to Hadoop once it has been processed and export alerts regarding unusual events to HTTP for distribution via SMS, email, or other notification service. See the chapter on exporting live data in the Using VoltDB manual for details.
Use of multiple streams does require additional information in the schema and the deployment file. For example, the EXPORT TABLE statement now requires a TO STREAM clause so you can specify the stream to which each export table is directed. However, for backwards compatibility, the old syntax is still supported temporarily to allow customers time to migrate existing applications at their convenience.
Batch processing of interactive DDL statements
VoltDB 5.0 introduced interactive DDL, eliminating the need for a precompiled application catalog. However, a large schema could take a significant time to process interactively. VoltDB 5.1 solves this problem by allowing you to batch DDL statements. If you have your DDL statements in a single file, you can use the file --batch directive to process them as a batch. For example:
$ sqlcmd
1> file --batch myschema.sql;
If you have a mix of DDL (data definition language) statements, DML (data manipulation language) statements, and directives, you can batch process only the DDL statements by enclosing them in a file --inlinebatch directive, which ends at the marker you specify. For example:
load classes myprocs.jar;
file --inlinebatch END_OF_BATCH
CREATE PROCEDURE FROM CLASS procs.AddEmp;
CREATE PROCEDURE FROM CLASS procs.ChangeDept;
PARTITION PROCEDURE AddEmp ON TABLE emp COLUMN empid;
PARTITION PROCEDURE ChangeDept ON TABLE emp COLUMN empid;
END_OF_BATCH
Batch processing DDL statements can speed up the processing of those statements by a factor of 10 or more, depending on the number and complexity of the statements and the size of the cluster.
New Administrative features in VoltDB Management Center
VoltDB Management Center, the web-based console for managing and monitoring a VoltDB database, now has a tab for administrative functions. On the Admin tab, you can pause and resume the database, save and restore snapshots, as well as review and update the database configuration. If security is enabled for the database, only users with the ADMIN permission can see and use the Admin tab in the Management Center.
In addition to the new features and capabilities described above, the following limitations in previous versions have been resolved:
This release contains no new features but corrects the following issues from the original 5.0 release.
Issues related to using INSERT INTO SELECT with export tables
There was an issue in earlier releases where using an INSERT INTO SELECT statement with an export table as the target for the insert either generated a null pointer exception or did not insert the expected data into the export stream. The issue only applies to INSERT INTO SELECT as an ad hoc query or within a multi-partitioned stored procedure.
These issues have now been corrected.
Database failure when reporting long-running queries
There was an issue in previous versions (starting with VoltDB 4.8) where, if a query ran for a significant amount of time, VoltDB attempted to log a warning. However, the warning generated an error (index out of bounds) and stopped the database.
This issue is now fixed.
Lines starting with "file" in sqlcmd incorrectly interpreted as a file directive.
In the original 5.0 release, any sqlcmd input line beginning with "file" (regardless of upper or lowercase) was interpreted as a file directive, even in the middle of a multi-line statement. This would happen, for example, if a CREATE TABLE statement included a column name starting with "file":
CREATE TABLE archive (
    ID INTEGER,
    Directory VARCHAR(128),
    Filename VARCHAR(128)
);
This usually resulted in several errors and the intended statement not being interpreted correctly. This issue is now fixed.
The major new feature in VoltDB 5.0 is the ability to enter data definition language (DDL) statements interactively, for example by using sqlcmd on the command line or the VoltDB Management Center SQL Query interface. This makes the process of creating a database and defining the schema more flexible. As part of the support for interactive DDL, the following features have been added:
Please note that processing DDL interactively can take longer than compiling an application catalog all at once. This is most noticeable when processing a large schema and especially on a multi-node cluster (where each change must be coordinated among the servers).
If you find entering DDL interactively too slow, it is possible to revert to precompiling the schema before starting the database. You have two choices:
Performance improvements for processing large schemas interactively are expected in upcoming releases.
Ability to "trim" rows using LIMIT PARTITION ROWS EXECUTE
The LIMIT PARTITION ROWS constraint now supports an EXECUTE clause that lets you specify a DELETE statement that is executed when the constraint value is exceeded. The EXECUTE clause gives you the ability to automatically "prune" older data when the constraint is reached. See the description of the CREATE TABLE statement in the Using VoltDB manual for details.
Support for HttpFS targets in Hadoop export
The HTTP connector now supports Apache HttpFS (Hadoop HDFS over HTTP) servers as a target when exporting using the WebHDFS protocol. Set the export property
Addition of the ORDER BY clause to the DELETE statement
It is now possible to use the ORDER BY clause with LIMIT and/or OFFSET when performing a DELETE operation. ORDER BY allows you to more selectively remove database rows. For example, the following DELETE query removes the five oldest records, based on a timestamp column:
DELETE FROM events ORDER BY event_time ASC LIMIT 5;
Note that DELETE queries that include the ORDER BY clause must be single-partitioned and the ORDER BY clause must be deterministic. See the description of the DELETE statement in the Using VoltDB manual for details.
In addition to the new features listed above, VoltDB V5.0 includes fixes to several known issues:
The following are known limitations to the current release of VoltDB. Workarounds are suggested where applicable. However, it is important to note that these limitations are considered temporary and are likely to be corrected in future releases of the product.
Changing the deployment configuration when recovering command logs can result in unexpected settings.
There is an issue where, if the command log contains schema changes (performed through interactive DDL statements, voltadmin update, or @UpdateApplicationCatalog), when the command logs are recovered, the previous deployment file settings are used, even if an alternate deployment file is specified on the voltdb recover command line. Then, after recovering the database, a new schema update can result in the deployment settings specified on the command line taking effect.
Until this issue is resolved, the safest workaround to ensure the desired configuration is achieved is to perform the voltdb recover operation without modifying the current deployment file, then make deployment changes with the voltadmin update command after the database has started.
Command logs can only be recovered to a cluster of the same size.
To ensure complete and accurate restoration of a database, recovery using command logs can only be performed to a cluster with the same number of unique partitions as the cluster that created the logs. If you restart and recover to the same cluster with the same deployment options, there is no problem. But if you change the deployment options for number of nodes, sites per host, or K-safety, recovery may not be possible.
For example, if a four-node cluster is running with four sites per host and a K-safety value of one, the cluster has two copies of eight unique partitions (4 X 4 / 2). If one server fails, you cannot recover the command logs from the original cluster to a new cluster made up of the remaining three nodes, because the new cluster only has six unique partitions (3 X 4 / 2). You must either replace the failed server to reinstate the original hardware configuration or otherwise change the deployment options to match the number of unique partitions. (For example, increasing the sites per host to eight and K-safety to two.)
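The arithmetic in this example can be checked with a short sketch. The function name is hypothetical, and the divisibility check is an assumption of this sketch rather than a statement of how VoltDB validates its configuration; the formula itself (nodes times sites per host, divided by the K-safety value plus one) is the one used in the example above.

```python
def unique_partitions(nodes, sites_per_host, k_safety):
    """Unique partitions = total sites / number of copies (K-safety + 1)."""
    total_sites = nodes * sites_per_host
    copies = k_safety + 1
    # Assumption of this sketch: reject layouts that do not divide evenly.
    if total_sites % copies != 0:
        raise ValueError("total sites must be a multiple of K-safety + 1")
    return total_sites // copies


original = unique_partitions(nodes=4, sites_per_host=4, k_safety=1)  # 4 X 4 / 2
degraded = unique_partitions(nodes=3, sites_per_host=4, k_safety=1)  # 3 X 4 / 2
adjusted = unique_partitions(nodes=3, sites_per_host=8, k_safety=2)  # 3 X 8 / 3

print(original, degraded, adjusted)
print("recoverable on three nodes as-is:", degraded == original)
print("recoverable after adjusting options:", adjusted == original)
```

Comparing the result for the new configuration against the original before attempting recovery shows whether the command logs can be replayed on the new cluster.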
Do not use the subfolder name "segments" for the command log snapshot directory.
VoltDB reserves the subfolder "segments" under the command log directory for storing the actual command log files. Do not add, remove, or modify any files in this directory. In particular, do not set the command log snapshot directory to a subfolder "segments" of the command log directory, or else the server will hang on startup.
Some DR data may not be delivered if master database nodes fail and rejoin in rapid succession.
Because DR data is buffered on the master database and then delivered asynchronously to the replica, there is always the danger that data does not reach the replica if a master node stops. This situation is mitigated in a K-safe environment by all copies of a partition buffering on the master cluster. Then if a sending node goes down, another node on the master database can take over sending logs to the replica. However, if multiple nodes go down and rejoin in rapid succession, it is possible that some buffered DR data — from transactions when one or more nodes were down — could be lost when another node with the last copy of that buffer also goes down.
If this occurs and the replica recognizes that some binary logs are missing, DR stops and must be restarted.
To avoid this situation, especially when cycling through nodes for maintenance purposes, the key is to ensure that all buffered DR data is transmitted before stopping the next node in the cycle. You can do this using the @Statistics system procedure to make sure the last ACKed timestamp (using @Statistics DR on the master cluster) is later than the timestamp when the previous node completed its rejoin operation.
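The check described above amounts to a comparison of two timestamps, repeated until it passes. The sketch below shows only that logic. How the last ACKed timestamp is read from @Statistics DR is deliberately left to a caller-supplied function (read_last_acked is a hypothetical name), and the polling interval and retry count are arbitrary illustrative values.

```python
import time


def dr_buffers_drained(last_acked_ts, rejoin_completed_ts):
    """True once the replica has ACKed data newer than the previous rejoin."""
    return last_acked_ts > rejoin_completed_ts


def wait_until_drained(read_last_acked, rejoin_completed_ts,
                       poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until it looks safe to stop the next node, or give up.

    read_last_acked is a caller-supplied callable returning the last ACKed
    timestamp (for example, taken from @Statistics DR on the master cluster).
    Both timestamps must use the same units.
    """
    for _ in range(max_polls):
        if dr_buffers_drained(read_last_acked(), rejoin_completed_ts):
            return True
        sleep(poll_seconds)
    return False
```

In a maintenance cycle, a False result would mean leaving the next node running and investigating why the replica is not acknowledging data.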
Synchronous export in Kafka can use up all available file descriptors and crash the database.
A bug in the Apache Kafka client can result in file descriptors being allocated but not released if the producer.type attribute is set to "sync" (which is the default). The consequence is that the system eventually runs out of file descriptors and the VoltDB server process will crash.
Until this bug is fixed, use of synchronous Kafka export is not recommended. The workaround is to set the Kafka producer.type attribute to "async" using the VoltDB export properties.
Comments containing unmatched single quotes in multi-line statements can produce unexpected results.
When entering a multi-line statement at the sqlcmd prompt, if a line ends in a comment (indicated by two hyphens) and the comment contains an unmatched single quote character, the following lines of input are not interpreted correctly. Specifically, the comment is incorrectly interpreted as continuing until the next single quote character or a closing semi-colon is read. This is most likely to happen when reading in a schema file containing comments. This issue is specific to the sqlcmd utility.
A fix for this condition is planned for an upcoming point release.
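Until a fix is available, a schema file can be scanned for the pattern described above before it is loaded. The following is a minimal sketch, not part of VoltDB: it reports every comment (introduced by two hyphens) that contains an odd number of single quotes. It assumes string literals do not span lines, so quote tracking restarts on each line.

```python
def find_risky_comments(sql_text):
    """Return (line_number, comment) for each -- comment with an unmatched quote."""
    risky = []
    for number, line in enumerate(sql_text.splitlines(), start=1):
        in_string = False
        comment = None
        for i, ch in enumerate(line):
            if ch == "'":
                # Toggle on every quote; an escaped quote ('') toggles twice.
                in_string = not in_string
            elif not in_string and line.startswith("--", i):
                comment = line[i:]
                break
        if comment is not None and comment.count("'") % 2 == 1:
            risky.append((number, comment))
    return risky


schema = (
    "CREATE TABLE t (\n"
    "  id INTEGER, -- the row's key\n"
    "  name VARCHAR(16) -- plain comment\n"
    ");\n"
)
for number, comment in find_risky_comments(schema):
    print(f"line {number}: {comment}")
```

Rewording or removing the flagged comments avoids the problem without changing the statements themselves.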
Do not use assertions in VoltDB stored procedures.
VoltDB currently intercepts assertions as part of its handling of stored procedures. Attempts to use assertions in stored procedures for debugging or to find programmatic errors will not work as expected.
The UPPER() and LOWER() functions currently convert ASCII characters only.
The UPPER() and LOWER() functions return a string converted to all uppercase or all lowercase letters, respectively. However, for the initial release, these functions only operate on characters in the ASCII character set. Other case-sensitive UTF-8 characters in the string are returned unchanged. Support for all case-sensitive UTF-8 characters will be included in a future release.
Avoid using decimal datatypes with the C++ client interface on 32-bit platforms.
There is a problem with how the math library used to build the C++ client library handles large decimal values on 32-bit operating systems. As a result, the C++ library cannot serialize and pass Decimal datatypes reliably on these systems.
Note that the C++ client interface can send and receive Decimal values properly on 64-bit platforms.
The VoltDB Enterprise Manager is part of the VoltDB Enterprise Edition and continues to be supported for customers who are currently using it. However, due to limitations in its implementation, no further development work is being done on the Enterprise Manager and it is not recommended for new deployments. The Enterprise Manager's functionality will be replaced by new, more robust, deployment and management capabilities in the future.
Manual snapshots not copied to the Management Server properly.
Normally, manual snapshots (those created with the button) are copied to the management server. However, if automated snapshots are also being created and copied to the management server, it is possible for an automated snapshot to override the manual snapshot.
If this happens, the workaround is to turn off automated snapshots (and their copying) temporarily. To do this, uncheck the box for copying snapshots, set the frequency to zero, and click. Then re-open the Edit Snapshots dialog and take the manual snapshot. Once the snapshot is complete and copied to the management server (that is, the manual snapshot appears in the list on the dialog box), you can re-enable copying and automated snapshots.
Old versions of Enterprise Manager files are not deleted from the /tmp directory
When the Enterprise Manager starts, it unpacks files that the web server uses into a subfolder of the /tmp directory. It does not delete these files when it stops. Under normal operation, this is not a problem. However, if you upgrade to a new version of the Enterprise Edition, files for the new version become intermixed with the older files and can result in the Enterprise Manager starting databases using the wrong version of VoltDB. To avoid this situation, make sure these temporary files are deleted before starting a new version of VoltDB Enterprise Manager.
The /tmp directory is emptied every time the server reboots. So the simplest workaround is to reboot your management server after you upgrade VoltDB. Alternatively, you can delete these temporary files manually by deleting the winstone subfolders in the /tmp directory:
$ rm -vr /tmp/winstone*
Enterprise Manager configuration files are not upwardly compatible.
When upgrading VoltDB Enterprise Edition, please note that the configuration files for the Enterprise Manager are not upwardly compatible. New product features may make existing database and/or deployment definitions unusable. It is always a good idea to delete existing configuration information before upgrading. You can delete the configuration files by deleting the ~/.voltdb directory. For example:
$ rm -vr ~/.voltdb
Enterprise Manager cannot start two databases on the same server.
In the past, it was possible to run two (or more) databases on a single physical server by defining two logical servers with the same IP address and making the ports for each database unique. However, as a result of internal optimizations introduced in VoltDB 2.7, this technique no longer works when using the Enterprise Manager.
We expect to correct this limitation in a future release. Note that it is still possible to start multiple databases on a single server manually using the VoltDB shell commands.
The Enterprise Manager cannot start or manage a replica database for database replication.
Starting with VoltDB 5.1, database replication (DR) has changed and the VoltDB Enterprise Manager can no longer correctly configure, start, or manage a replica database. The recommended method is to start the database manually and use the built-in VoltDB Management Center to manage the database by connecting to the cluster nodes directly on the HTTP port (8080 by default).
The following notes provide details concerning how certain VoltDB features operate. The behavior is not considered incorrect. However, this information can be important when using specific components of the VoltDB product.
Schema updates clear the stored procedure data table in the Management Center Monitor section
Any time the database schema or stored procedures are changed, the data table showing stored procedure statistics at the bottom of the Monitor section of the VoltDB Management Center is reset. As soon as new invocations of the stored procedures occur, the statistics table will show new values based on performance after the schema update. Until invocations occur, the procedure table is blank.
You cannot partition a table on a column defined as ASSUMEUNIQUE.
The ASSUMEUNIQUE attribute is designed for identifying columns in partitioned tables where the column values are known to be unique but the table is not partitioned on that column, so VoltDB cannot verify complete uniqueness across the database. Using interactive DDL, you can create a table with a column marked as ASSUMEUNIQUE, but if you try to partition the table on the ASSUMEUNIQUE column, you receive an error. The solution is to drop and add the column using the UNIQUE attribute instead of ASSUMEUNIQUE.
Adding or dropping column constraints (UNIQUE or ASSUMEUNIQUE) is not supported by the ALTER TABLE ALTER COLUMN statement.
You cannot add or remove a column constraint such as UNIQUE or ASSUMEUNIQUE using the ALTER TABLE ALTER COLUMN statement. Instead, to add or remove such constraints, you must first drop and then add the modified column. For example:
ALTER TABLE employee DROP COLUMN empID;
ALTER TABLE employee ADD COLUMN empID INTEGER UNIQUE;
Do not use UPDATE to change the value of a partitioning column
For partitioned tables, the value of the column used to partition the table determines what partition the row belongs to. If you use UPDATE to change this value and the new value belongs in a different partition, the UPDATE request will fail and the stored procedure will be rolled back.
Updating the partition column value may or may not cause the record to be repartitioned (depending on the old and new values). However, since you cannot determine if the update will succeed or fail, you should not use UPDATE to change the value of partitioning columns.
The workaround, if you must change the value of the partitioning column, is to use both a DELETE and an INSERT statement to explicitly remove and then re-insert the desired rows.
Certain SQL syntax errors result in the error message "user lacks privilege or object not found" when compiling the runtime catalog.
If you refer to a table or column name that does not exist, the VoltDB compiler issues the error message "user lacks privilege or object not found". This can happen, for example, if you misspell a table or column name.
Another situation where this occurs is if you mistakenly use double quotation marks to enclose a string
literal (such as
If you receive this error, the workaround is to look for misspelled table or column names, or string literals delimited by double quotes, in the offending SQL statement.
File Descriptor Limits
VoltDB opens a file descriptor for every client connection to the database. In normal operation, this use of file descriptors is transparent to the user. However, if there are an inordinate number of concurrent client connections, or clients open and close many connections in rapid succession, it is possible for VoltDB to exceed the process limit on file descriptors. When this happens, new connections may be rejected or other disk-based activities (such as snapshotting) may be disrupted.
In environments where there are likely to be an extremely large number of connections, you should consider increasing the operating system's per-process limit on file descriptors.
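Before raising the limit, it can help to see what the current limit is. The sketch below uses Python's standard resource module (available on Unix-like systems) to read the per-process file descriptor limit and compare it with an expected number of client connections. The reserve of 256 descriptors for non-connection use is an arbitrary illustrative value, not a VoltDB recommendation, and the result reflects the environment the script runs in, so it should be run as the same user and from the same shell as the server process.

```python
import resource


def fd_limits():
    """Return the (soft, hard) per-process limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(expected_connections, reserve=256):
    """True if the soft limit covers the expected connections plus a reserve.

    reserve is an illustrative allowance for descriptors used by files and
    sockets other than client connections.
    """
    soft, _hard = fd_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= expected_connections + reserve


soft, hard = fd_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
print("room for 10000 connections:", has_headroom(10000))
```

If the check fails, the soft limit can typically be raised up to the hard limit for the session (for example with the shell's ulimit -n), while raising the hard limit itself requires administrator configuration of the operating system.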
Protecting VoltDB Against Port Scanners
VoltDB uses a number of different ports for interprocess communication as well as features such as HTTP access, DR, and so on. Port scanning software often interferes with normal operation of such ports by sending bogus data to them in an attempt to identify open ports.
VoltDB has hardened its port usage to ignore unexpected or irrelevant data from port scanners. However, the ports used for Database Replication (DR) cannot be protected in this way. So, in V4.6, a Java property was introduced to allow you to disable the DR ports, for situations where port scanning cannot be avoided. To disable the DR ports, set the Java property VOLTDB_DISABLE_DR to true before starting the database. For example:
$ export VOLTDB_OPTS="-DVOLTDB_DISABLE_DR=true"
$ voltdb create myapplication.jar \
    --deployment=deployment.xml \
    --host=voltsvr1
Note that, if you disable the DR ports, you cannot use the database as a master for database replication.