

SELECT — Fetches the specified rows and columns from the database.


Select-statement [{set-operator} Select-statement ] ...

SELECT [ TOP integer-value ]
{ * | [ ALL | DISTINCT ] { column-name | selection-expression } [AS alias] [,...] }
FROM { table-reference } [ join-clause ]...
[WHERE [NOT] boolean-expression [ {AND | OR} [NOT] boolean-expression]...]

table-reference:
{ table-name [AS alias] | view-name [AS alias] | sub-query AS alias }


join-clause:
, table-reference
[INNER | {LEFT | RIGHT | FULL } [OUTER]] JOIN [{table-reference}] [join-condition]

join-condition:
ON conditional-expression
USING (column-reference [,...])

ORDER BY { column-name | alias } [ ASC | DESC ] [,...]
GROUP BY { column-name | alias } [,...]
HAVING boolean-expression
LIMIT integer-value [OFFSET row-count]



The SELECT statement retrieves the specified rows and columns from the database, filtered and sorted by any clauses that are included in the statement. In its simplest form, the SELECT statement retrieves the values associated with individual columns. However, the selection expression can also be a function such as COUNT or SUM.

The following features and limitations are important to note when using the SELECT statement with VoltDB:

  • See Appendix C, SQL Functions for a full list of the SQL functions that VoltDB supports.

  • VoltDB supports the following operators in expressions: addition (+), subtraction (-), multiplication (*), division (/) and string concatenation (||).

  • TOP n is a synonym for LIMIT n.

  • The WHERE expression supports the boolean operators: equals (=), not equals (!= or <>), greater than (>), less than (<), greater than or equal to (>=), less than or equal to (<=), LIKE, IS NULL, IS DISTINCT FROM, IS NOT DISTINCT FROM, AND, OR, and NOT. Note, however, that although OR is supported syntactically, VoltDB does not optimize OR operations, so using OR may reduce the performance of your queries.

  • The boolean expression LIKE provides text pattern matching in a VARCHAR column. The syntax of the LIKE expression is {string-expression} LIKE '{pattern}' where the pattern can contain text and wildcards, including the underscore (_) for matching a single character and the percent sign (%) for matching zero or more characters. The string comparison is case sensitive.

    Where an index exists on the column being scanned and the pattern starts with a text prefix (rather than starting with a wildcard), VoltDB will attempt to use the index to maximize performance. For example, a query limiting the results to rows from the EMPLOYEE table where the primary index, the JOB_CODE column, begins with the characters "Temp" looks like this:

    SELECT * from EMPLOYEE where JOB_CODE like 'Temp%';
  • The boolean expression IN determines if a given value is found within a list of alternatives. For example, in the following code fragment the IN expression looks to see if a record is part of Hispaniola by evaluating whether the column COUNTRY is equal to either "Dominican Republic" or "Haiti":

    WHERE Country IN ('Dominican Republic', 'Haiti')

    Note that the list of alternatives must be enclosed in parentheses. The result of an IN expression is equivalent to a sequence of equality conditions separated by OR. So the preceding code fragment produces the same boolean result as:

    WHERE Country='Dominican Republic' OR Country='Haiti'

    The advantages are that the IN syntax provides more compact and readable code and can provide improved performance by using an index on the initial expression where available.

  • The boolean expression BETWEEN determines if a value falls within a given range. The evaluation is inclusive of the end points. In this way BETWEEN is a convenient alias for two boolean expressions determining if a value is greater than or equal to (>=) the starting value and less than or equal to (<=) the end value. For example, the following two WHERE clauses are equivalent:

    WHERE salary BETWEEN ? AND ?
    WHERE salary >= ? AND salary <= ?
  • The boolean expressions IS DISTINCT FROM and IS NOT DISTINCT FROM are similar to the equals ("=") and not equals ("<>") operators respectively, except when evaluating null operands. If either or both operands are null, the equals and not equals operators return a boolean null value, or false. IS DISTINCT FROM and IS NOT DISTINCT FROM consider null a valid operand. So if only one operand is null IS DISTINCT FROM returns true and IS NOT DISTINCT FROM returns false. If both operands are null IS DISTINCT FROM returns false and IS NOT DISTINCT FROM returns true.

  • When using placeholders in SQL statements involving the IN list expression, you can either replace individual values within the list or replace the list as a whole. For example, consider the following statements:

    SELECT * from EMPLOYEE where STATUS IN (?, ?,?);
    SELECT * from EMPLOYEE where STATUS IN ?;

    In the first statement, there are three parameters that replace individual values in the IN list, allowing you to specify exactly three selection values. In the second statement the placeholder replaces the entire list, including the parentheses. In this case the parameter to the procedure call must be an array and allows you to change not only the values of the alternatives but the number of criteria considered.

    The following Java code fragment demonstrates how these two queries can be used in a stored procedure, resulting in equivalent SQL statements being executed:

    String arg1 = "Salary";
    String arg2 = "Hourly";
    String arg3 = "Parttime";
    voltQueueSQL( query1, arg1, arg2, arg3);
    String listargs[] = new String[3];
    listargs[0] = arg1;
    listargs[1] = arg2;
    listargs[2] = arg3;
    voltQueueSQL( query2, (Object) listargs);

    Note that when passing arrays as parameters in Java, it is a good practice to explicitly cast them as an object to avoid the array being implicitly expanded into individual call parameters.

  • VoltDB supports the use of CASE-WHEN-THEN-ELSE-END for conditional operations. For example, the following SELECT expression uses a CASE statement to return different values based on the contents of the price column:

    SELECT Prod_name,
        CASE WHEN price > 100.00
              THEN 'Expensive'
              ELSE 'Cheap'
        END
    FROM products ORDER BY Prod_name;

    For more complex conditional operations with multiple alternatives, use of the DECODE() function is recommended.

  • VoltDB supports both inner and outer joins.

  • The SELECT statement supports subqueries as a table reference in the FROM clause. Subqueries must be enclosed in parentheses and must be assigned a table alias. Note that subqueries are only supported in the SELECT statement; they cannot be used in data manipulation statements such as UPDATE or DELETE.

  • VoltDB currently supports two window functions, RANK() and DENSE_RANK(), as expressions in the selection list. Both functions generate a BIGINT value (starting at 1) representing the ranking of the current result, using the following syntax:

    RANK() OVER ([PARTITION BY {expression} [,...]] ORDER BY {expression})

    DENSE_RANK() OVER ([PARTITION BY {expression} [,...]] ORDER BY {expression})

    The PARTITION expression(s) define how the rankings are grouped and the ORDER BY expression specifies what values are ranked.[1] The ORDER BY expression must be of either an integer or TIMESTAMP datatype.

    For example, if you rank a column (say, CITY_POPULATION) and use the COUNTRY column as the partitioning column for the ranking, the cities of each country will be ranked separately. If you use both STATE and COUNTRY as partitioning columns, then the cities for each state in each country will be ranked separately. The following SELECT statement uses the RANK function with the voter sample application to show the top ten states, in order, where a contestant has received the most votes:

    SELECT contestant_number, state, 
            rank() over (partition by contestant_number order by num_votes) 
            from v_votes_by_contestant_number_state 
            where contestant_number=1 LIMIT 10;                    

    Please be aware of the following limitations when using the window functions:

    • There can be only one window function, either RANK() or DENSE_RANK(), per SELECT statement.

    • You cannot use a window function and GROUP BY in the same SELECT statement.

    • The ORDER BY clause is required.

    The difference between RANK() and DENSE_RANK() is how they handle ranking when there is more than one row with the same ORDER BY value. If more than one row has the same ORDER BY value, those rows receive the same rank value in both cases. However, with the RANK() function, the rank that follows a tie skips ahead by the number of tied rows, so it always equals one plus the number of preceding rows. For example, if the ORDER BY values of four rows are 100, 98, 98, and 73 the respective rank values using RANK() will be 1, 2, 2, and 4. By contrast, with the DENSE_RANK() function, the next rank value is always incremented by only one. So, if the ORDER BY values are 100, 98, 98, and 73, the respective rank values using DENSE_RANK() will be 1, 2, 2, and 3.

  • You can only join two or more partitioned tables if those tables are partitioned on the same value and joined on equality of the partitioning column. Joining two partitioned tables on non-partitioned columns or on a range of values is not supported. However, there are no limitations on joining to replicated tables.

  • Extremely large result sets (greater than 50 megabytes in size) are not supported. If you execute a SELECT statement that generates a result set of more than 50 megabytes, VoltDB will return an error.
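
The null handling described above for IS DISTINCT FROM and IS NOT DISTINCT FROM can be illustrated with a small Java sketch. It models a SQL value as a nullable Integer and the unknown result of the equals operator as a Java null. The class and method names are hypothetical and exist only for this illustration; this is not VoltDB code.

```java
import java.util.Objects;

public class DistinctSketch {
    // SQL "=": yields null (unknown) when either operand is null.
    public static Boolean sqlEquals(Integer a, Integer b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // IS DISTINCT FROM: null is a valid operand, so the result is never null.
    public static boolean isDistinctFrom(Integer a, Integer b) {
        return !Objects.equals(a, b);
    }

    // IS NOT DISTINCT FROM is the negation of IS DISTINCT FROM.
    public static boolean isNotDistinctFrom(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sqlEquals(1, null));            // null (unknown)
        System.out.println(isDistinctFrom(1, null));       // true
        System.out.println(isDistinctFrom(null, null));    // false
        System.out.println(isNotDistinctFrom(null, null)); // true
        System.out.println(isNotDistinctFrom(1, 1));       // true
    }
}
```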
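
The note above about casting arrays to Object when replacing a whole IN list comes down to how Java resolves calls to methods with a variable-length parameter list. The following self-contained sketch shows the effect. The countParams method is a hypothetical stand-in for any such method and only reports how many parameters it received; it does not call VoltDB.

```java
public class VarargsSketch {
    // Hypothetical stand-in for a method that takes a variable-length
    // parameter list; it returns the number of parameters it received.
    public static int countParams(Object... params) {
        return params.length;
    }

    public static void main(String[] args) {
        String[] listargs = { "Salary", "Hourly", "Parttime" };

        // Without the cast, Java passes the String[] as the parameter array
        // itself, so the method sees three separate parameters.
        // (The compiler may warn about this call, which is the point.)
        System.out.println(countParams(listargs));          // 3

        // With the cast, the whole array travels as a single parameter,
        // which is what a whole-list IN placeholder expects.
        System.out.println(countParams((Object) listargs)); // 1
    }
}
```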
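
The difference between RANK() and DENSE_RANK() described above can be reproduced with a short Java sketch that ranks values already sorted in ORDER BY order. The class and method names are hypothetical, and the code only mirrors the tie-handling rules from the text; it is not how VoltDB computes rankings internally.

```java
import java.util.Arrays;

public class RankSketch {
    // Rank values that are already sorted in ORDER BY order.
    // dense == false emulates RANK(); dense == true emulates DENSE_RANK().
    public static long[] rank(long[] sorted, boolean dense) {
        long[] ranks = new long[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0) {
                ranks[i] = 1;                     // rankings start at 1
            } else if (sorted[i] == sorted[i - 1]) {
                ranks[i] = ranks[i - 1];          // ties share a rank
            } else if (dense) {
                ranks[i] = ranks[i - 1] + 1;      // next rank, no gaps
            } else {
                ranks[i] = i + 1;                 // one plus the preceding row count
            }
        }
        return ranks;
    }

    public static void main(String[] args) {
        long[] values = { 100, 98, 98, 73 };
        System.out.println(Arrays.toString(rank(values, false))); // [1, 2, 2, 4]
        System.out.println(Arrays.toString(rank(values, true)));  // [1, 2, 2, 3]
    }
}
```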


The SELECT statement can include subqueries. Subqueries are separate SELECT statements, enclosed in parentheses, where the results of the subquery are used as values, expressions, or arguments within the surrounding SELECT statement.

Subqueries, like any SELECT statement, are extremely flexible and can return a wide array of information. A subquery might return:

  • A single row with a single column — this is sometimes known as a scalar subquery and represents a single value

  • A single row with multiple columns — this is also known as a row value expression

  • Multiple rows with one or more columns

In general, VoltDB supports subqueries in the FROM clause, in the selection expression, and in boolean expressions in the WHERE clause or in CASE-WHEN-THEN-ELSE-END operations. However, different types of subqueries are allowed in different situations, depending on the type of data returned.

  • In the FROM clause, the SELECT statement supports all types of subquery as a table reference. The subquery must be enclosed in parentheses and must be assigned a table alias.

  • In the selection expression, scalar subqueries can be used in place of a single column reference.

  • In the WHERE clause and CASE operations, both scalar and non-scalar subqueries can be used as part of boolean expressions. Scalar subqueries can be used in place of any single-valued expression. Non-scalar subqueries can be used in the following situations:

    • Row value comparisons — Boolean expressions that compare one row value expression to another can use subqueries that resolve to one row with multiple columns. For example:

      select * from t1 
         where (a,c) > (select a, c from t2 where b=t1.b);
    • IN and EXISTS — Subqueries that return multiple rows can be used as an argument to the IN or EXISTS predicate to determine if a value (or set of values) exists within the rows returned by the subquery. For example:

      select * from t1 
         where a in (select a from t2);
      select * from t1
         where (a,c) in (select a, c from t2 where b=t1.b);
      select * from t1 where c > 3 and 
         exists (select a, b from t2 where a=t1.a);
    • ANY and ALL — Multi-row subqueries can also be used as the target of an ANY or ALL comparison, using either a scalar or row expression comparison. For example:

      select * from t1 
         where a > ALL (select a from t2);
      select * from t1
         where (a,c) = ANY (select a, c from t2 where b=t1.b);
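
The row value and ANY/ALL comparisons listed above can be read as ordinary boolean logic. The following Java sketch spells that logic out, assuming the usual SQL rule that row values are compared column by column from left to right. The class and method names are hypothetical, a subquery result is modeled as a plain list, and null values are not modeled.

```java
import java.util.Arrays;
import java.util.List;

public class RowCompareSketch {
    // (a, c) > (x, y): compare the first columns, and fall back to the
    // second column only when the first columns are equal.
    public static boolean rowGreater(int a, int c, int x, int y) {
        return a > x || (a == x && c > y);
    }

    // a > ALL (subquery): true when a exceeds every returned value
    // (and therefore also true when the subquery returns no rows).
    public static boolean greaterThanAll(int a, List<Integer> subquery) {
        return subquery.stream().allMatch(v -> a > v);
    }

    // a = ANY (subquery): true when at least one returned value equals a.
    public static boolean equalsAny(int a, List<Integer> subquery) {
        return subquery.stream().anyMatch(v -> a == v);
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(3, 7, 9);
        System.out.println(rowGreater(2, 1, 1, 9));   // true, first column decides
        System.out.println(rowGreater(1, 5, 1, 3));   // true, second column breaks the tie
        System.out.println(rowGreater(1, 3, 1, 3));   // false, the rows are equal
        System.out.println(greaterThanAll(10, rows)); // true
        System.out.println(greaterThanAll(8, rows));  // false, 9 is larger
        System.out.println(equalsAny(7, rows));       // true
    }
}
```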

Note that subqueries are only supported in the SELECT statement; they cannot be used in data manipulation statements such as UPDATE or DELETE, or in CREATE VIEW statements or index definitions. Also, VoltDB does not support subqueries in the HAVING, ORDER BY, or GROUP BY clauses.

For the initial release of subqueries in selection and boolean expressions, only replicated tables can be used in the subquery. Both replicated and partitioned tables can be used in subqueries in place of table references in the FROM clause.

Set Operations

VoltDB also supports the set operations UNION, INTERSECT, and EXCEPT. These keywords let you perform set operations on two or more SELECT statements. UNION includes the combined result sets from the two SELECT statements, INTERSECT includes only those rows that appear in both SELECT statement result sets, and EXCEPT includes only those rows that appear in the first result set but not in the second.

Normally, UNION and INTERSECT provide a set including unique rows. That is, if a row appears in both SELECT results, it only appears once in the combined result set. However, if you include the ALL modifier, all matching rows are included. For example, UNION ALL will result in single entries for the rows that appear in only one of the SELECT results, but two copies of any rows that appear in both.
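
The effect of these operations, including the ALL modifier on UNION, can be illustrated with ordinary Java collections. In the following sketch each SELECT result is modeled as a list of single-column rows, and EXCEPT is modeled as the rows of the first result that are absent from the second. The class and method names are hypothetical and the code does not involve VoltDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetOpsSketch {
    // UNION ALL: keep every row from both results, duplicates included.
    public static List<String> unionAll(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    // UNION: combine both results and keep each distinct row once.
    public static Set<String> union(List<String> a, List<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.addAll(b);
        return out;
    }

    // INTERSECT: keep only rows that appear in both results.
    public static Set<String> intersect(List<String> a, List<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.retainAll(b);
        return out;
    }

    // EXCEPT: keep rows from the first result that are not in the second.
    public static Set<String> except(List<String> a, List<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.removeAll(b);
        return out;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("Ann", "Bob", "Cy");
        List<String> second = Arrays.asList("Bob", "Cy", "Dee");
        System.out.println(unionAll(first, second));  // [Ann, Bob, Cy, Bob, Cy, Dee]
        System.out.println(union(first, second));     // [Ann, Bob, Cy, Dee]
        System.out.println(intersect(first, second)); // [Bob, Cy]
        System.out.println(except(first, second));    // [Ann]
    }
}
```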

The UNION, INTERSECT, and EXCEPT operations obey the same rules that apply to joins:

  • You cannot perform set operations on SELECT statements that reference the same table.

  • All tables in the SELECT statements must either be replicated tables or partitioned tables partitioned on the same column value, using equality of the partitioning column in the WHERE clause.


The following example retrieves all of the columns from the EMPLOYEE table where the last name is "Smith":

SELECT * FROM employee WHERE lastname = 'Smith';

The following example retrieves selected columns from two tables at once, joined by the employee_id using an implicit inner join and sorted by last name:

SELECT lastname, firstname, salary 
    FROM employee AS e, compensation AS c
    WHERE e.employee_id = c.employee_id
    ORDER BY lastname DESC;

The following example includes both a simple SQL query defined in the schema and a client application to call the procedure repeatedly. This combination uses the LIMIT and OFFSET clauses to "page" through a large table, 500 rows at a time.

When retrieving very large volumes of data, it is a good idea to use LIMIT and OFFSET to constrain the amount of data in each transaction. However, to perform LIMIT OFFSET queries effectively, the database must include a tree index that encompasses all of the columns of the ORDER BY clause (in this example, the lastname and firstname columns).


Schema:

       CREATE PROCEDURE EmpByLimit AS
       SELECT lastname, firstname FROM employee
       WHERE company = ?
       ORDER BY lastname ASC, firstname ASC
       LIMIT 500 OFFSET ?;


Java Client Application:

long offset = 0;
String company = "ACME Explosives";
boolean alldone = false;
while ( ! alldone ) {
   // Fetch the next page of up to 500 rows, starting at the current offset.
   VoltTable results[] = client.callProcedure("EmpByLimit",
                         company, offset).getResults();
   if (results[0].getRowCount() < 1) {
        // No more records.
        alldone = true;
   } else {
        // do something with the results.
   }
   offset += 500;
}

[1] Use of the keyword PARTITION is for compatibility with SQL syntax from other databases and is unrelated to the columns used to partition single-partitioned tables. You can use the RANK() functions with either partitioned or replicated tables and the ranking column does not need to be the same as the partitioning column for VoltDB partitioned tables.