4.6. Modifying the Schema

Documentation

VoltDB Home » Documentation » Using VoltDB

4.6. Modifying the Schema

You can use DDL to add, modify, or remove schema objects as the database is running. For a list of all valid DDL you can use, see Appendix A, Supported SQL DDL Statements. You can do the following types of schema changes:

  • Modifying Tables — You can add, modify (alter), and remove (drop) table columns. You can also add and drop table constraints. Finally, you can drop entire tables.

  • Adding and Dropping Indexes — You can add and remove (drop) named indexes.

  • Modifying Partitioning for Tables and Stored Procedures — You can un-partition stored procedures and re-partition stored procedures on a different column, For tables you can change a table between partitioned and replicated, and repartition a table on a different column,

  • Modify roles and users — To learn about modifying roles and users, see Chapter 12, Security.

VoltDB safely handles sqlcmd DDL entered by different users on different nodes of the cluster because it manages sqlcmd commands as transactions, just like stored procedures. To demonstrate the DDL statements to modify the schema, the following sections use a new table, Airport, added to the fight reservation as shown below:

CREATE TABLE Airport (
   AirportID integer NOT NULL,
   Name varchar(15) NOT NULL,
   City varchar(25),
   Country varchar(15),
   PRIMARY KEY (AirportID)
);

4.6.1. Effects of Schema Changes on Data and Clients

You can make many schema changes on empty tables with few restrictions. However, be aware that if a table has data, some schema changes are not allowed and other schema changes may modify or even remove data. When working with test data in your database, you can use TRUNCATE TABLE to empty the data from a table you are working on. Note that all DDL examples in this chapter assume the tables are empty.

We can think of the effects of schema changes on data in three severity levels:

  • Schema change completes without damage to data

  • Schema change fails to complete to avoid damage to data

  • Schema change destroys data

VoltDB error messages and the documentation can help you avoid schema change attempts that fail to complete. For example, you cannot drop a table that has referencing procedures or views.

Obviously you need to be most aware of which schema changes cause data to be destroyed. In particular, removing objects from the schema will also remove the data they contain. Note that schema objects cannot be renamed with DDL, but objects can be replaced by performing a DROP and then ADD. However, it is important to realize that as a result of a DROP operation, such as DROP TABLE, the data associated with that table will be deleted before the new definition is added.

Plan and coordinate changes with client development. Stored procedures and ad hoc queries provide an API that clients use to access the database correctly. Changes to the schema can break the stored procedure calls client applications have developed, so use well-planned schedules to communicate database schema changes to others. Client applications depend on many schema definition features including (but not limited to):

  • Table names

  • Column names

  • Column data types

  • Primary key definitions

  • Table partitions

  • Stored procedure names

  • Stored procedure partitioning

Plan and test carefully before making schema changes to a production database. Be aware that clients may experience connection issues during schema changes, especially for changes that take longer to complete, such as view or index changes.

Schema changes not only affect data, but the existence of data in the database affects the time it takes to process schema changes. For example, when there are large amounts of data, some DDL statements can block processing, resulting in a noticeable delay for other pending transactions. Examples include adding indexes, creating new table columns, and modifying views.

4.6.2. Viewing the Schema

The VoltDB Management Center provides a web browser view of database information, including the DDL schema source. Use a web browser to view the VoltDB Management Center on port 8080 of one of the cluster hosts (http://host-name:8080).

You can also use the sqlcmd show directive to see a list of the current database tables and all procedures. For additional details about the schema, execute the @SystemCatalog system procedure. Use any of the following arguments to @SystemCatalog to obtain details about a component of the database schema:

  • TABLES

  • COLUMNS

  • INDEXINFO

  • PRIMARYKEYS

  • PROCEDURES

  • PROCEDURECOLUMNS

For example:

$ sqlcmd
1> SHOW TABLES;
2> SHOW PROCEDURES;
3> EXEC @SystemCatalog COLUMNS;

4.6.3. Modifying Tables

After creating a table in a database with CREATE TABLE, you can use ALTER TABLE to make the following types of table changes:

To drop an entire table, use the DROP TABLE DDL statement.

4.6.3.1. Altering a Table Column's Data Definition

You can make the following types of alterations to a table column's data definition:

$ sqlcmd
1> ALTER TABLE Airport ALTER COLUMN Name VARCHAR(25);            1
2> ALTER TABLE Airport ALTER COLUMN Country SET DEFAULT 'USA';   2
3> ALTER TABLE Airport ALTER COLUMN Name SET NOT NULL;           3

The examples are described as follows:

1

Change a column's data type. In our example we decided we needed more than 15 characters for the Airport Name so we changed it to 25 characters.

If the table has no existing data, you can make any data type changes. However, if the table already contains data, the new type must be larger than the old one. This restriction prevents corrupting existing data values that might be larger than the size of the new data type (See also Table A.1, “Supported SQL Datatypes”.)

2

Set or drop the column's DEFAULT value. In our example we assume the application is to be used mostly for US domestic travel so we can set a default value for the Airport Country of 'USA'.

To remove a default, redefine the column data definition, for example:

ALTER TABLE Airport ALTER COLUMN Country VARCHAR(15);

3

Change whether the column is NULL or NOT NULL. In our example we set the AirportID to be not null because this is a required field.

If the table has existing data, you cannot change a column to not null.

4.6.3.2. Adding and Dropping Table Columns

$ sqlcmd
1> ALTER TABLE Airport ADD COLUMN AirportCode VARCHAR(3)        1
2> BEFORE AirportID;  
3> ALTER TABLE Airport DROP COLUMN AirportID;                   2

The examples are described as follows:

1

Add table columns. In our example, we have decided not to use the integer AirportID for airport identification but to instead add an AirportCode, which uses a unique three-letter code for any airport as defined by the International Air Transport Association's airport codes.

You cannot rename or overwrite a column but you can drop and add columns. When adding a column, you must include the new column name and the data type. Options you may include are:

  • DEFAULT value — If a table contains data, the values for the new column will be automatically filled in with the default value.

  • NOT NULL — If the table contains data, you must include a default value if you specify a NOT NULL column.

  • One of the following index type constraints including PRIMARY KEY, UNIQUE, or ASSUMEUNIQUE.

    Note, we recommend that you not define the UNIQUE or ASSUMEUNIQUE constraint directly on a column definition when adding a column or creating a table. If you do, the constraint has no name so you cannot drop the constraint without dropping the entire column. Instead, we recommend you apply UNIQUE or ASSUMEUNIQUE by adding the constraint (see Section 4.6.3.3, “Adding and Dropping Table Constraints”) or by adding an index with the constraint (see Section 4.6.4, “Adding and Dropping Indexes”). Defining these constraints this way names the constraint, which makes it easier to drop later if necessary.

  • BEFORE column-name — Table columns cannot be reordered but the BEFORE clause allows you to place a new column in a specific position with respect to the existing columns of the table.

2

Drop table columns. In our example we drop the AirportID column because we are replacing it with the AirportCode column.

You cannot remove a column that has a reference to it. You have to remove all references to the column first. References to a column may include:

  • A stored procedure

  • An index

  • A view

4.6.3.3. Adding and Dropping Table Constraints

You cannot alter a table constraint but you can add and drop table constraints. If the table contains existing data, you cannot add UNIQUE, ASSUMEUNIQUE, or PRIMARY KEY constraints.

$ sqlcmd                                                                    
1> ALTER TABLE Airport ADD CONSTRAINT                        1
2>    uniquecode UNIQUE (Airportcode);
3> ALTER TABLE Airport ADD PRIMARY KEY (AirportCode);        2

The examples are described as follows:

1

Add named constraints UNIQUE or ASSUMEUNIQUE. In our example, we add the UNIQUE constraint to the AirportCode column. To drop a named constraint, include the name using the format in the following example:

ALTER TABLE Airport DROP CONSTRAINT uniquecode;

2

Add unnamed constraints PRIMARY KEY or LIMIT PARTITION ROWS, each of which can apply to a table only once. In our example, we add the PRIMARY KEY constraint to the new AirportCode column.

When adding a table constraint, it must not conflict with the other columns of the table. For example, only one primary key is allowed for a table so you cannot add the PRIMARY KEY constraint to an additional column.

To drop the PRIMARY KEY or LIMIT PARTITION ROWS constraint, include the type of constraint using the format in the following example:

ALTER TABLE Airport DROP PRIMARY KEY;

4.6.4. Adding and Dropping Indexes

Use CREATE INDEX to create an index on one or more columns of a table. Use DROP INDEX to remove an index from the schema. The following example modifies the flight reservation schema by adding an index to the Flight table to improve performance when looking up flights.

$ sqlcmd
1> CREATE INDEX flightTimeIdx ON Flight (departtime);

The CREATE INDEX statement explicitly creates an index. VoltDB creates an index implicitly when you specify the table constraints UNIQUE, PRIMARY KEY, or ASSUMEUNIQUE. Use the ALTER TABLE statement to add or drop these table constraints along with their associated indexes, as shown in Section 4.6.3, “Modifying Tables”.

4.6.5. Modifying Partitioning for Tables and Stored Procedures

Any changes to the schema must be carefully coordinated with the design and development of stored procedures. This not only applies to column names, data types, and so on, but also to the partition plan.

How to partition tables and stored procedures using the PARTITION TABLE and CREATE PROCEDURE PARTITION ON statements is explained in Section 4.4, “Partitioning Database Tables” and Section 5.3.3, “Partitioning Stored Procedures in the Schema”.

You can change the partitioning of stored procedures, and you can change a table to a replicated table or repartition it on a different column. However, because of the intricate dependencies of partitioned tables and stored procedures, this can only be done by dropping and re-adding the tables and procedures. Also, you must pay close attention to the order in which objects are dropped and added.

The following DDL examples demonstrate some partitioning modifications to a table and stored procedures.

The following DDL is added to the Flight reservation schema to help demonstrate the DDL partition changes described in this section.

$ sqlcmd
1> PARTITION TABLE Airport ON COLUMN Name;
2> CREATE PROCEDURE FindAirportCodeByName 
3>    PARTITION ON TABLE Airport COLUMN Name
4>    AS SELECT TOP 1 AirportCode FROM Airport WHERE Name=?;
5> 
6> CREATE PROCEDURE FindAirportCodeByCity AS
7>    SELECT TOP 1 AirportCode FROM Airport WHERE City=?;

The stored procedures are tested with the following sqlcmd directives:

$ sqlcmd
1> exec FindAirportCodeByName 'Logan Airport';
2> exec FindAirportCodeByCity 'Boston';

4.6.5.1. Un-partitioning a Stored Procedure

In the simplest case, you can un-partition a single-partitioned stored procedure by dropping and re-creating that procedure without including the PARTITION ON clause. In this example we drop the single-partitioned FindAirportCodeByName procedure and re-create it as multi-partitioned because it needs to search all partitions to find an airport code by name.

$ sqlcmd
1> DROP PROCEDURE FindAirportCodeByName;
2> CREATE PROCEDURE FindAirportCodeByName AS
3>    SELECT TOP 1 AirportCode FROM Airport WHERE Name=?;

4.6.5.2. Changing a Partitioned Table to a Replicated Table

Important

You cannot change the partitioning of a table that has data in it. To change a partitioned table to a replicated one, you drop and re-create the table, which deletes any data that might be in the table.

Before executing the following steps, save the existing schema so you can easily re-create the table. The VoltDB Management Center provides a view of the existing database schema DDL source, which you can download and save.

$ sqlcmd
1> DROP PROCEDURE FindAirportCodeByName;                     1
2> DROP PROCEDURE FindAirportCodeByCity;
3> DROP TABLE Airport IF EXISTS CASCADE;                     2
4> CREATE TABLE AIRPORT (                                    3
5>    AIRPORTCODE varchar(3) NOT NULL,
6>    NAME varchar(25),
7>    CITY varchar(25),
8>    COUNTRY varchar(15) DEFAULT 'USA',
9>    CONSTRAINT UNIQUECODE UNIQUE (AIRPORTCODE),
10>    PRIMARY KEY (AIRPORTCODE)
11> );
12> CREATE PROCEDURE FindAirportCodeByName AS                4
13>    SELECT TOP 1 AirportCode FROM Airport WHERE Name=?;
14> CREATE PROCEDURE FindAirportCodeByCity AS
15>    SELECT TOP 1 AirportCode FROM Airport WHERE City=?;

The example is described as follows:

1

Drop all stored procedures that reference the table. You cannot drop a table if stored procedures reference it.

2

Drop the table. Options you may include are:

  • IF EXISTS — Use the IF EXISTS option to avoid command errors if the named table is already removed.

  • CASCADE — A table cannot be removed if it has index or view references. You can remove the references explicitly first or use the CASCADE option to have VoltDB remove the references along with the table.

3

Re-create the table. By default, a newly created table is a replicated table.

4

Re-create the stored procedures that access the table. If the stored procedure is implemented with Java and changes are required, modify and reload the code before re-creating the stored procedures. For more, see Section 5.3, “Installing Stored Procedures into the Database”.

4.6.5.3. Re-partitioning a Table to a Different Column

Important

You cannot change the partitioning of a table that has data in it. In order to re-partition a table you have to drop and re-create the table, which deletes any data that might be in the table.

Follow these steps to re-partition a table:

  1. Un-partition the table by following the instructions in Section 4.6.5.2, “Changing a Partitioned Table to a Replicated Table”. The sub-steps are summarized as follows:

    1. Drop all stored procedures that reference the table.

    2. Drop the table.

    3. Re-create the table.

    4. Re-create the stored procedures that access the table.

  2. Partition the table on the new column. In our example, it makes sense to partition the Airport table on the AirportCode column, where each row must be unique and non null.

    $ sqlcmd
    1> PARTITION TABLE Airport ON COLUMN AirportCode;
  3. Re-partition stored procedures that should be single-partitioned. See Section 4.6.5.4, “Updating a Stored Procedure”.

4.6.5.4. Updating a Stored Procedure

This section describes how to update a stored procedure that has already been declared in the database with the CREATE PROCEDURE statement. The steps to update a stored procedure are summarized as follows:

  1. If the procedure is implemented in Java, update the procedure's code, recompile, and repackage the jar file. For details, see Section 5.3, “Installing Stored Procedures into the Database”.

  2. Ensure all tables and columns the procedure accesses are in the database schema.

  3. Update the procedure in the database.

    • If the procedure is implemented in Java, use the sqlcmd load classes directive to update the class in the database. For example:

      $ sqlcmd
      1> load classes GetAirport.jar;
    • If the procedure is implemented with SQL, use the CREATE PROCEDURE AS command to update the SQL.

  4. If required, re-partition the stored procedure. You partition procedures using the PARTITION ON clause in the CREATE PROCEDURE statement. If you need to re-partition the procedure, either changing the partitioning column or switching from replicated to partitioned or vice versa, perform the following steps:

    1. Use DROP PROCEDURE to remove the stored procedure.

    2. Use CREATE PROCEDURE to re-declare the stored procedure, including the new partitioning scheme.

    In our example so far, we have three stored procedures that are adequate to access the Airport table, so no additional procedures need to be partitioned:

    • VoltDB automatically defined a default select stored procedure, which is partitioned on the AirportCode column. It takes an AirportCode as input and returns a table structure containing the AirportCode, Name, City, and Country.

    • The FindAirportCodeByName stored procedure should remain multi-partitioned because it needs to search in all partitions.

    • The FindAirportCodeByCity stored procedure should also remain multi-partitioned because it needs to search in all partitions.

4.6.5.5. Removing a Stored Procedure from the Database

If you've decided a stored procedure is no longer needed, use the following steps to remove it from the database:

  1. Drop the stored procedure from the database.

    $ sqlcmd
    1> DROP PROCEDURE GetAirport;
  2. Remove the code from the database. If the procedure is implemented with Java, use the sqlcmd remove classes directive to remove the procedure's class from the database.

    2> remove classes myapp.procedures.GetAirport;