CREATE VIEW

Documentation

VoltDB Home » Documentation » Using VoltDB

CREATE VIEW

CREATE VIEW — Creates a view into one or more tables, optimizing access to a summary of their contents.

Synopsis

CREATE VIEW view-name ( view-column-name [,...] )
AS SELECT { column-name | selection-expression } [AS alias] [,...]
FROM table-reference [join-clause...]
[where-clause] [group-clause]

CREATE VIEW view-name
[migrate-target-definition | migrate-topic-definition]
( view-column-name [,...] )
AS SELECT { column-name | selection-expression } [AS alias] [,...]
FROM stream-reference
[where-clause] [group-clause] [ttl-definition]

table-reference: { table-name [AS alias] }

stream-reference: { view-name [AS alias] }

join-clause: , table-reference [INNER] JOIN [{table-reference}] [join-condition]

join-condition: ON conditional-expression USING (column-reference [,...])

where-clause WHERE [NOT] boolean-expression
[ {AND | OR} [NOT] boolean-expression]...

group-clause GROUP BY { column-name | selection-expression } [,...]

migrate-target-definition MIGRATE TO TARGET {target-name}

migrate-topic-definition MIGRATE TO TOPIC {topic-name}
[WITH [KEY (column-name[,...])] [VALUE (column-name[,...])]]

ttl-definition: USING TTL value [time-unit] ON COLUMN column-name
[BATCH_SIZE number-of-rows] [MAX_FREQUENCY value]

time-unit: SECONDS | MINUTES | HOURS | DAYS

Description

The CREATE VIEW statement creates a view of a table, a stream, or joined tables with selected columns and aggregates. VoltDB implements views as materialized views. In other words, the view is stored as a special table in the database and is updated each time the corresponding database contents are modified. This means there is a small, incremental performance impact for any inserts or updates to the tables, but selects on the view will execute efficiently.

The following limitations are important to note when using the CREATE VIEW statement with VoltDB:

  • If the SELECT statement contains a GROUP BY clause, all of the columns and expressions listed in the GROUP BY must be listed in the same order at the start of the SELECT statement. Aggregate functions, including COUNT(*), are allowed following the GROUP BY columns.

  • Views are allowed on individual tables or streams, or joins of multiple tables. Joining streams is not supported.

  • Joins must be inner joins and cannot be self-joins. All other limitations for joins as described in the SELECT statement also apply to joins in views.

  • Views that join multiple tables must include a COUNT(*) field listed after all GROUP BY columns.

  • To avoid performance problems when inserting data into a view that joins multiple tables, it is strongly recommended you define indexes on the table columns involved in the join.

Managing Stream Views

You can create views on both tables and streams. Views on tables are materialized, which means the content of the view is tied to the current contents of the table. You cannot modify the records of the view except by changing the contents of the table.

Views on streams, however, are historical. As data is inserted into the stream, the view contents are incremented. However, streams are ephemeral and have no storage associated with them, so there is no way update or delete the view records from the stream itself. Instead, you are allowed to use the DELETE and UPDATE statements to manage the contents of stream views.

For example, you might want to track user sessions, keeping a log of daily logins and time online per user. You do not need to keep a record of every session, so you can use a stream to capture each event and a view on the stream to track the daily totals.

CREATE STREAM user_session 
  PARTITION ON COLUMN userid (
    userid INTEGER NOT NULL, 
    start_time TIMESTAMP,
    session_length BIGINT
);
CREATE VIEW session_view 
  (userid, session_day, total_count, total_time)
  AS SELECT userid, TRUNCATE(DAY, start_time),
            count(*), SUM(session_length)
     FROM user_session GROUP BY userid, TRUNCATE(DAY, start_time);

But you do not need to keep these records forever, so you might choose to delete views for any days more than a week old. You could do this with the following DELETE statement, deleting records from the stream view:

DELETE FROM session_view WHERE session_day < DATEADD(DAY, -7, NOW());

Similarly, if you knew there was an invalid session logged earlier, you might update the view to remove the incorrect data from the view using an UPDATE statement:

UPDATE session_view 
  SET total_count = total_count-1,
      total_time = total_time-72
  WHERE userid = 1234
        AND session_day = TRUNCATE(DAY,NOW());

Automating Stream View Management

As mentioned in the preceding section, you can "clean up" stream views by using the DELETE statement. However, doing this manually is usually impractical. So, VoltDB lets you use the same time-to-live (TTL) and migration capabilities available to tables for stream views.

When you create the stream view you can define how long to keep the view data by adding the USING TTL clause to specify when records should be deleted based on a timestamp column in the view. If the timestamp is older than the specified value, the record will be deleted. For example, the following view definition automates the delete operation shown earlier by specifying that records should be deleted after the value of the session_day column is older than seven days.

CREATE VIEW session_view 
  (userid, session_day, total_count, total_time)
  AS SELECT userid, TRUNCATE(DAY, start_time),
            count(*), SUM(session_length)
     FROM user_session GROUP BY userid, TRUNCATE(DAY, start_time)
  USING TTL 7 DAYS ON COLUMN session_day;

If you want to delete the records from the stream view but not lose them, you can use the MIGRATE clause to specify where to send the records, then use the MIGRATE rather than DELETE statement. Stream view records can be migrated to either export connectors or to topics, depending on whether you specify MIGRATE TO TARGET or MIGRATE TO TOPIC. The key difference between the two is that MIGRATE TO TARGET ensures that the record reaches the external export target system and receipt is acknowledged before it is deleted. Whereas MIGRATE TO TOPIC deletes the record as soon as it has been inserted into the VoltDB topic queue and made available to external consumers.

For example, the following CREATE VIEW statement specifies that migrated records are sent to the topic OldSessions. When migrating to a topic, you can also specify which columns compose the topic key and which columns to include in the message itself.

CREATE VIEW session_view
  MIGRATE TO TOPIC OldSessions 
    WITH KEY(userid) VALUES(session_day,total_count,total_time)
  (userid, session_day, total_count, total_time)
  AS SELECT userid, TRUNCATE(DAY, start_time),
            count(*), SUM(session_length)
     FROM user_session GROUP BY userid, TRUNCATE(DAY, start_time);

Finally, you can combine USING TTL and MIGRATE TO to automate the removal and transfer of old records. Whenever the TTL value is triggered, the records are sent to the specified target or topic and then deleted from the view.

CREATE VIEW session_view
  MIGRATE TO TOPIC OldSessions 
    WITH KEY(userid) VALUES(session_day,total_count,total_time)
  (userid, session_day, total_count, total_time)
  AS SELECT userid, TRUNCATE(DAY, start_time),
            count(*), SUM(session_length)
     FROM user_session GROUP BY userid, TRUNCATE(DAY, start_time)
  USING TTL 7 DAYS ON COLUMN session_day;

It is important when evaluating view contents for TTL to be aware of the possibilities of additional stream inserts arriving "late". That is, the timestamp value being used for the TTL evaluation may not be synchronous to the system clock. So if you attempt to delete view records immediately after they expire (in the preceding example, when the session start time is only one day old), late arriving records will create a new record for that older time window. It is always a good idea to allow a certain leeway before deleting or migrating records to make sure any potentially late arrivals are included before the record is archived.

Note that you cannot define TTL or migration on table views. However, a similar effect can be achieved by defining the TTL and migration constraints on the table itself.

Examples

The following example defines a view that counts the number of records for a specific product item grouped by its location (that is, the warehouse the item is in).

CREATE VIEW inventory_count_by_warehouse (
     productID,
     warehouse,
     total_inventory
) AS SELECT
     productID,
     warehouse,
     COUNT(*)
FROM inventory GROUP BY productID, warehouse;

The next example uses a WHERE clause but no GROUP BY to provide a count and minimum and maximum aggregates of all records that meet a certain criteria.

CREATE VIEW small_towns ( number, minimum, maximum )
    AS SELECT count(*), min(population), max(population) 
    FROM TOWNS WHERE population < 10000;

The final example demonstrates joining two tables in a view. This definition provides a similar view to the first example, except it uses the productID column to join two tables, Product and Inventory:

CREATE VIEW inventory_count_by_warehouse (
     productName,
     warehouse,
     total_inventory
) AS SELECT
     product.productName,
     inventory.warehouse,
     COUNT(*)
FROM product JOIN inventory 
     ON product.productID = inventory.productID
     GROUP BY product.productName, inventory.warehouse;