Chapter 7. Creating Flexible Schemas With JSON

Documentation

VoltDB Home » Documentation » Guide to Performance and Customization

Chapter 7. Creating Flexible Schemas With JSON

A major part of any relational database is the schema: the structure of the data as defined by the tables and columns. It is possible to change the schema when necessary. However, at any given time, each table has a set number of columns, each with a specific name and datatype.

It is possible to store unstructured data in a relational database as a "blob" using a VARBINARY or VARCHAR column. However, the database has no way to operate on your data effectively beyond simply storing and retrieving it.

Sometimes data is not as strictly organized as a relational database schema requires, but does have structure within it. For example, a table may have a set of properties, each with a different name and matching value. But not all records use the same set of properties.

JSON (JavaScript Object Notation) is a light-weight data interchange format that lets you describe data structures on the fly. JSON-encoded strings are composed of a hierarchy of key-value pairs that can be as simple or as complex as needed. More importantly, the actual structure of the object is determined at run-time and does not need to be predefined.

VoltDB gives you the ability to mix the efficiency of the relational schema with the flexibility of JSON. By using JSON-encoded columns with VoltDB SQL functions and index capabilities, you can work more naturally with JSON data while maintaining the efficiency and transactional consistency of a relational database.

7.1. Using JSON Data Structures as VoltDB Content

JSON processing is easiest to understand by example. For the purposes of this chapter, we will use the example of a single sign-on (SSO) application using VoltDB. The application needs to store login sessions for multiple online sites under a common username. Each login session must manage different properties describing the state of the session, where properties can vary from simple data values to more complex data structures. Additionally, future enhancements to the application may change what properties are required. Because of the variability of the data, a good strategy is to store this session information as JSON data structures.

To store JSON data in VoltDB, you declare the storage as a standard VARCHAR column, large enough to contain the maximum expected length of the JSON data. For example, the following table declaration includes a VARCHAR column named session_info with a maximum length of 2048 characters.

CREATE TABLE user_session_table (
    username           VARCHAR(200)   UNIQUE NOT NULL,
    password           VARCHAR(100)   NOT NULL,
    global_session_id  VARCHAR(200)   ASSUMEUNIQUE NOT NULL,
    last_accessed      TIMESTAMP,
    session_info       VARCHAR(2048)
);
PARTITION TABLE user_session_table ON COLUMN username;

The JSON data is stored as text and it is up to your application to perform the conversion from an in-memory structure to the textual representation. For example, the VoltDB software kit includes an example, json-sessions, that provides a version of the application described here, which uses an open source package from Google called GSON to convert from plain old Java objects (POJOs) to text and vice versa.

Once you have a text representation of the JSON object, you insert it into the database column using standard SQL syntax. For example:

INSERT INTO user_session_table (username, password,
                                global_session_id, 
                                last_accessed, session_info) 
         VALUES (?, ?, ?, ?, ?);

Note that VoltDB does not validate that the text being inserted into the VARCHAR column is properly encoded JSON. Validation of the encoding occurs when accessing such columns using the JSON-specific functions, described next.