3.4. Handling Errors

One special situation to consider when calling VoltDB stored procedures is error handling. The VoltDB client interface catches most exceptions, including connection errors, errors thrown by the stored procedures themselves, and even exceptions that occur in asynchronous callbacks. These error conditions are not returned to the client application as exceptions. However, the application can still receive notification and interpret these conditions using the client interface.

The following sections explain how to identify and interpret errors that occur while executing stored procedures and in asynchronous callbacks.

3.4.1. Interpreting Execution Errors

If an error occurs in a stored procedure (such as an SQL constraint violation), VoltDB catches the error and returns information about it to the calling application as part of the ClientResponse class.

The ClientResponse class provides several methods to help the calling application determine whether the stored procedure completed successfully and, if not, what caused the failure. The two most important methods are getStatus() and getStatusString().

The getStatus() method tells you whether the stored procedure completed successfully and, if not, what type of error occurred. The possible values of getStatus() are:

  • CONNECTION_LOST — The network connection was lost before the stored procedure returned status information to the calling application. The stored procedure may or may not have completed successfully.

  • CONNECTION_TIMEOUT — The stored procedure took too long to return to the calling application. The stored procedure may or may not have completed successfully. See Section 3.4.2, “Handling Timeouts” for more information about handling this condition.

  • GRACEFUL_FAILURE — An error occurred and the stored procedure was gracefully rolled back.

  • RESPONSE_UNKNOWN — This is a rare error that occurs if the coordinating node for the transaction fails before returning a response. The node to which your application is connected cannot determine if the transaction failed or succeeded before the coordinator was lost. The best course of action, if you receive this error, is to use a new query to determine if the transaction failed or succeeded and then take action based on that knowledge.

  • SUCCESS — The stored procedure completed successfully.

  • UNEXPECTED_FAILURE — An unexpected error occurred on the server and the procedure failed.

  • USER_ABORT — The code of the stored procedure intentionally threw a UserAbort exception and the stored procedure was rolled back.

It is good practice to always check the status of the ClientResponse before evaluating the results of a procedure call, because if the status is anything but SUCCESS, there will not be any results returned. In addition to identifying the type of error, for any values other than SUCCESS, the getStatusString() method returns a text message providing more information about the specific error that occurred.
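
The status values above fall into three groups: success, a definite failure with rollback, and an unknown outcome. The following is a minimal sketch of that grouping, not part of the VoltDB API. The CallStatus enum is a hypothetical stand-in for the status constants on ClientResponse, used so the sketch compiles without the client library, and the NextStep names are likewise illustrative assumptions.

```java
import java.util.EnumSet;
import java.util.Set;

final class StatusTriage {
    // Hypothetical stand-in for the status constants defined on ClientResponse.
    enum CallStatus {
        SUCCESS, USER_ABORT, GRACEFUL_FAILURE, UNEXPECTED_FAILURE,
        CONNECTION_LOST, CONNECTION_TIMEOUT, RESPONSE_UNKNOWN
    }

    // Hypothetical names for what the application does next.
    enum NextStep { READ_RESULTS, REPORT_FAILURE, VERIFY_OUTCOME }

    // Statuses for which the procedure may or may not have completed.
    private static final Set<CallStatus> OUTCOME_UNKNOWN = EnumSet.of(
        CallStatus.CONNECTION_LOST,
        CallStatus.CONNECTION_TIMEOUT,
        CallStatus.RESPONSE_UNKNOWN);

    static NextStep triage(CallStatus status) {
        if (status == CallStatus.SUCCESS) {
            return NextStep.READ_RESULTS;      // results are available
        }
        if (OUTCOME_UNKNOWN.contains(status)) {
            return NextStep.VERIFY_OUTCOME;    // query the database before retrying
        }
        return NextStep.REPORT_FAILURE;        // the procedure failed or was rolled back
    }

    public static void main(String[] args) {
        for (CallStatus s : CallStatus.values()) {
            System.out.println(s + " -> " + triage(s));
        }
    }
}
```

In a real client, the same branching would be applied to the value returned by getStatus(), with getStatusString() supplying the message for the failure cases.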

If your stored procedure wants to provide additional information to the calling application, there are two more methods on the ClientResponse class that you can use. The methods getAppStatus() and getAppStatusString() act like getStatus() and getStatusString(), but rather than returning information set by VoltDB, they return information set by the stored procedure code itself.

In the stored procedure, you can use the methods setAppStatusCode() and setAppStatusString() to set the values returned to the calling application. For example:

Stored Procedure

final byte AppCodeWarm = 1;
final byte AppCodeFuzzy = 2;
              . . .
setAppStatusCode(AppCodeFuzzy);
setAppStatusString("I'm not sure about that...");
              . . .

Client Application

static class MyCallback implements ProcedureCallback {
  @Override
  public void clientCallback(ClientResponse clientResponse) {
    final byte AppCodeWarm = 1;
    final byte AppCodeFuzzy = 2;

    if (clientResponse.getStatus() != ClientResponse.SUCCESS) {
        System.err.println(clientResponse.getStatusString());
    } else {
        if (clientResponse.getAppStatus() == AppCodeFuzzy) {
           System.err.println(clientResponse.getAppStatusString());
        }
        myEvaluateResultsProc(clientResponse.getResults());
    }
  }
}

3.4.2. Handling Timeouts

One particular error that needs special handling is if a connection or a stored procedure call times out. By default, the client interface only waits a specified amount of time (two minutes) for a stored procedure to complete. If no response is received from the server before the timeout period expires, the client interface returns control to your application, notifying it of the error. For synchronous procedure calls, the client interface returns the error CONNECTION_TIMEOUT to the procedure call. For asynchronous calls, the client interface invokes the callback including the error information in the clientResponse object.

Similarly, if no response of any kind is returned on a connection (even if no transactions are pending) within the specified timeout period, the client connection times out. When this happens, the connection is closed, any open stored procedures on that connection are closed with a return status of CONNECTION_LOST, and then the client status listener callback method connectionLost is invoked. Unlike a procedure timeout, when the connection times out, the connection no longer exists, so your client application will receive no further notifications concerning pending procedures, whether they succeed or fail.

It is important to note that CONNECTION_TIMEOUT does not necessarily mean the procedure failed. In fact, it is very possible that the procedure may complete and return information after the timeout error is reported. The timeout is provided to avoid locking up the client application when procedures are delayed or the connection to the cluster hangs for any reason.

Similarly, CONNECTION_LOST does not necessarily mean a pending procedure failed. It is possible that the procedure completed but was unable to return its status due to a connection failure. The goal of the connection timeout is to notify the client application of a lost connection in a timely manner, even if there are no outstanding procedures using the connection.

There are several things you can do to address potential timeouts in your application:

  • Change the timeout period by calling either or both of the methods setProcedureCallTimeout and setConnectionResponseTimeout on the ClientConfig object. The default timeout period is 2 minutes for both procedures and connections. You specify the timeout period in milliseconds, where a value of zero disables the timeout altogether. For example, the following client code sets the procedure timeout to 90 seconds and the connection timeout to 3 minutes, or 180 seconds:

    ClientConfig config = new ClientConfig("advent","xyzzy");
    config.setProcedureCallTimeout(90 * 1000);        // 90 seconds, in milliseconds
    config.setConnectionResponseTimeout(180 * 1000);  // 3 minutes, in milliseconds
    Client client = ClientFactory.createClient(config);

  • Catch and respond to the timeout error as part of the response to a procedure call. For example, the following code excerpt from a client callback procedure reports the error to the console and ends the callback:

    static class MyCallback implements ProcedureCallback {
      @Override
      public void clientCallback(ClientResponse response) {

        // The call timed out; it may still complete on the server.
        if (response.getStatus() == ClientResponse.CONNECTION_TIMEOUT) {
             System.out.println("A procedure invocation has timed out.");
             return;
        }
        // The connection dropped before a status was returned.
        if (response.getStatus() == ClientResponse.CONNECTION_LOST) {
             System.out.println("Connection lost before procedure response.");
             return;
        }
        // (remainder of the callback omitted in this excerpt)
      }
    }

  • Set a status listener to receive the results of any procedure invocations that complete after the client interface times out. See the following section, Section 3.4.3, “Interpreting Other Errors”, for an example of creating a status listener for delayed procedure responses.
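
Because the timeout setters take milliseconds and treat zero as "no timeout", it can help to derive the value from a java.time.Duration rather than hand-written arithmetic. The helper below is a minimal sketch under that assumption. TimeoutMillis is a hypothetical class name, not part of the VoltDB client, and the sketch assumes the setters accept a long value.

```java
import java.time.Duration;

final class TimeoutMillis {
    // Converts a Duration to the millisecond value expected by the timeout setters.
    // Duration.ZERO maps to 0, which disables the timeout as described above.
    static long of(Duration timeout) {
        if (timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must not be negative: " + timeout);
        }
        return timeout.toMillis();
    }

    public static void main(String[] args) {
        System.out.println(of(Duration.ofSeconds(90)));  // prints 90000
        System.out.println(of(Duration.ofMinutes(3)));   // prints 180000
    }
}
```

The result could then be passed to setProcedureCallTimeout or setConnectionResponseTimeout in place of expressions such as 90 * 1000.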

3.4.3. Interpreting Other Errors

Certain types of errors can occur that the ClientResponse class cannot notify you about immediately. These errors include:

Backpressure

If backpressure causes the client interface to wait, the stored procedure is never queued and so your application does not receive control until after the backpressure is removed. This can happen if the client applications are queuing stored procedures faster than the database cluster can process them. The result is that the execution queue on the server gets filled up and the client interface will not let your application queue any more procedure calls.

Lost Connection

If a connection to the database cluster is lost or times out and there are outstanding asynchronous requests on that connection, the ClientResponse for those procedure calls will indicate that the connection failed before a return status was received. This means that the procedures may or may not have completed successfully. If no requests were outstanding, your application might not be notified of the failure under normal conditions, since there are no callbacks to identify the failure. Since the loss of a connection can impact the throughput or durability of your application, it is important to have a mechanism for general notification of lost connections outside of the procedure callbacks.

Exceptions in a Procedure Callback

An error can occur in an asynchronous callback after the stored procedure completes. These exceptions are also trapped by the VoltDB client, but occur after the ClientResponse is returned to the application.

Delayed Procedure Responses

Procedure invocations that time out in the client may later complete on the server and return results. Since the client application can no longer react to this response inline (for example, with asynchronous procedure calls, the associated callback has already received a connection timeout error) the client may want a way to process the returned results.

In each of these cases, an error happens and is caught by the client interface outside of the normal stored procedure execution cycle. If you want your application to address these situations, you need to create a listener, which is a special type of asynchronous callback, that the client interface will notify whenever such errors occur.

You must define the listener before you define the VoltDB client or open a connection. The ClientStatusListenerExt interface has four methods that you can implement — one for each type of error situation — connectionLost, backpressure, uncaughtException, and lateProcedureResponse. Once you declare your ClientStatusListenerExt, you add it to a ClientConfig object that is then used to define the client. The configuration class also defines the username and password to use for all connections.

By performing the operations in this order, you ensure that all connections to the VoltDB database cluster use the same credentials for authentication and will notify the status listener of any error conditions outside of normal procedure execution.

The following example illustrates:

  1. Declaring a ClientStatusListenerExt

  2. Defining the client configuration, including authentication credentials and the status listener

  3. Creating a client with the specified configuration

For the sake of example, this status listener does little more than display a message on standard output. However, in real-world applications the listener would take appropriate actions based on the circumstances.

      /* 
      *  Declare the status listener 
      */
ClientStatusListenerExt mylistener = new ClientStatusListenerExt()  // 1
  {

    @Override
    public void connectionLost(String hostname, int port, 
                               int connectionsLeft, 
                               DisconnectCause cause) 
      {
        System.out.printf("A connection to the database has been lost. "
         + "There are %d connections remaining.\n", connectionsLeft);
      }

    @Override
    public void backpressure(boolean status) 
      {
        System.out.println("Backpressure from the database "
         + "is causing a delay in processing requests.");
      }

    @Override
    public void uncaughtException(ProcedureCallback callback,
                                  ClientResponse r, Throwable e) 
     {
        System.out.println("An error has occurred in a callback "
         + "procedure. Check the following stack trace for details.");
        e.printStackTrace();
     }

    @Override
    public void lateProcedureResponse(ClientResponse response, 
                                   String hostname, int port)
     {
        System.out.printf("A procedure that timed out on host %s:%d"
           + " has now responded.\n", hostname, port);
     }

  };

      /* 
      *  Declare the client configuration, specifying 
      *  a username, a password, and the status listener 
      */
ClientConfig myconfig = new ClientConfig("username",       // 2
                                       "password",
                                       mylistener);

      /* 
      *  Create the client using the specified configuration. 
      */
Client myclient = ClientFactory.createClient(myconfig);    // 3
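
As one illustration of a listener that does more than print a message, the sketch below records the most recent backpressure notification so that the code submitting procedure calls can pause while backpressure is in effect. BackpressureGate is a hypothetical helper, not part of the VoltDB client. The sketch assumes the boolean passed to backpressure is true when backpressure begins and false when it ends.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class BackpressureGate {
    // Latest backpressure state reported by the client interface.
    private final AtomicBoolean backpressured = new AtomicBoolean(false);

    // Intended to be called from the listener's backpressure(boolean status) method.
    // Assumption: status is true while backpressure is in effect.
    void onBackpressure(boolean status) {
        backpressured.set(status);
    }

    // The submitting thread checks this before queuing another procedure call.
    boolean maySubmit() {
        return !backpressured.get();
    }

    public static void main(String[] args) {
        BackpressureGate gate = new BackpressureGate();
        gate.onBackpressure(true);
        System.out.println("may submit: " + gate.maySubmit());  // prints "may submit: false"
        gate.onBackpressure(false);
        System.out.println("may submit: " + gate.maySubmit());  // prints "may submit: true"
    }
}
```

The listener's backpressure method would forward its argument to onBackpressure, and the application would check maySubmit() before each asynchronous call, for example to slow down or buffer work instead of blocking inside the client interface.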
