Chapter 1. Managing VoltDB Databases

VoltDB is a distributed, in-memory database designed from the ground up to maximize throughput performance on commodity servers. The VoltDB architecture provides many advantages over traditional database products while avoiding the pitfalls of NoSQL solutions:

  • By partitioning the data and stored procedures, VoltDB can process multiple queries in parallel without sacrificing the consistency or durability of an ACID-compliant database.

  • By managing all data in memory with a single thread for each partition, VoltDB avoids overhead such as record locking, latching, and device contention inherent in traditional disk-based databases.

  • VoltDB databases can scale up to meet new capacity or performance requirements simply by adding more nodes to the cluster.

  • Partitioning is automated, based on the schema, so there is no need to manually shard or repartition the data when scaling up as with many NoSQL solutions.

  • Finally, VoltDB Enterprise Edition provides features to ensure durability and high availability through command logging, locally replicating partitions (K-safety), and wide-area database replication.

Each of these features is described, in detail, in the Using VoltDB manual. This book explains how to use these and other features to manage and maintain a VoltDB database cluster from a database administrator's perspective.

1.1. Getting Started

Before you set up VoltDB for use in a production environment, you need to make four decisions:

  • What database features to use — Which features you want to use are defined in the configuration file and set with the voltdb init command.

  • Physical structure of the cluster — The number and addresses of the nodes in the cluster, which you specify when you start the cluster with the voltdb start command.

  • Logical structure of the database — The logical structure of the database tables and views, otherwise known as the schema, is defined in standard SQL statements and can be applied to the database using the sqlcmd command line utility.

  • Stored procedures — The schema declares stored procedures. The procedures themselves execute transactions against the data and are written as Java classes. You load the stored procedures as JAR files using the sqlcmd command line utility.

To initialize a VoltDB database cluster, you need a configuration file. The configuration file lets you enable and configure various database options including availability, durability, and security. The configuration file also defines certain attributes of the database on the current server, in particular the paths for disk-based files created by the database such as command logs and snapshots. All nodes in the cluster must specify the same cluster configuration file when they initialize the database root directory with the voltdb init command.
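As a minimal sketch, initializing a node with a configuration file might look like the following. The file name deployment.xml is a placeholder chosen for this example, not a name required by VoltDB, and the script only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Sketch: initialize the database root directory on the current node.
# deployment.xml is a placeholder name for the cluster configuration file.
CONFIG_FILE="deployment.xml"

# Build the command as a string so it can be reviewed before it is run.
INIT_CMD="voltdb init --config=$CONFIG_FILE"

# Printed for review only. To initialize for real, run the printed command
# on every node in the cluster, using the same configuration file each time.
echo "$INIT_CMD"
```

Keeping one copy of the configuration file under version control and distributing it to each node is one way to make sure every node initializes with identical settings.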

When you actually start the database cluster, using the voltdb start command, you declare the size of the cluster by specifying the number of nodes in the cluster and one or more of the nodes as potential hosts. VoltDB selects one of the specified nodes as the "leader" to coordinate startup.
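The startup step can be sketched as follows for a three-node cluster. The node count and the host names server1 and server2 are illustrative assumptions. As before, the script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: start one node of a three-node cluster.
# NODE_COUNT and HOSTS are illustrative values; substitute your own.
NODE_COUNT=3
HOSTS="server1,server2"   # hypothetical names of the potential host nodes

# --count declares the cluster size; --host lists the potential hosts.
START_CMD="voltdb start --count=$NODE_COUNT --host=$HOSTS"

# Printed for review only. Each node in the cluster would issue the same
# command so that all nodes agree on the cluster size and potential hosts.
echo "$START_CMD"
```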

When using the VoltDB Enterprise Edition, you will also need a license file, often called license.xml. VoltDB automatically looks for the license file in the user's current working directory, the home directory, or the voltdb/ subfolder where VoltDB is installed. If you keep the license file in a different directory or under a different name, you can use the --license argument on the voltdb init command to specify the license file location.
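The following sketch checks the three default locations for a license.xml file and, if none is found, prints a voltdb init command that names the license explicitly. The variables VOLTDB_HOME and CUSTOM_LICENSE, their default paths, and the deployment.xml file name are assumptions made for this example. The order in which the script checks the locations simply follows the order in which they are listed above and is not a statement about VoltDB's own precedence.

```shell
#!/bin/sh
# Sketch: look for license.xml in the default locations described above.
# VOLTDB_HOME and CUSTOM_LICENSE are hypothetical names for this example.
VOLTDB_HOME="${VOLTDB_HOME:-/opt/voltdb}"
CUSTOM_LICENSE="${CUSTOM_LICENSE:-/etc/voltdb/license.xml}"

find_default_license() {
    # Prints the first default location that contains license.xml and
    # returns 0; returns 1 if no default location contains the file.
    for dir in "$PWD" "$HOME" "$VOLTDB_HOME/voltdb"; do
        if [ -f "$dir/license.xml" ]; then
            echo "$dir/license.xml"
            return 0
        fi
    done
    return 1
}

if found=$(find_default_license); then
    # A default location holds the file, so no extra argument is needed.
    echo "license found at $found"
    echo "voltdb init --config=deployment.xml"
else
    # No default location holds the file, so name it explicitly.
    echo "voltdb init --config=deployment.xml --license=$CUSTOM_LICENSE"
fi
```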

Finally, to prepare the database for a specific application, you will need the database schema, including the DDL statements that describe the database's logical structure, and a JAR file containing the stored procedure class files. In general, the database schema and stored procedures are produced as part of the database development process, which is described in the Using VoltDB manual.
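As a rough sketch, the schema and stored procedures could be applied with sqlcmd directives along these lines. The file names procedures.jar and schema.sql are placeholders. Loading the classes before the schema is an assumption that fits the common case where the schema declares procedures implemented by those classes.

```shell
#!/bin/sh
# Sketch: prepare sqlcmd input that loads the stored procedure classes
# and then applies the schema. Both file names are placeholders.
PROCS_JAR="procedures.jar"
SCHEMA_FILE="schema.sql"

# "load classes" and "file" are sqlcmd directives. Classes are loaded first
# so that procedure declarations in the schema can refer to them.
SQLCMD_INPUT="load classes $PROCS_JAR;
file $SCHEMA_FILE;"

# Printed for review only. To apply the input to a running database:
#   echo "$SQLCMD_INPUT" | sqlcmd
echo "$SQLCMD_INPUT"
```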

This book assumes the schema and stored procedures have already been created. The configuration file, on the other hand, defines the run-time configuration of the cluster. Establishing the correct settings for the configuration file and physically managing the database cluster are the duties of the administrators responsible for maintaining database operations. This book is written for those individuals and covers the standard procedures associated with database administration.