Chapter 2 Querying Cassandra with CQL

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

CHAPTER 2 QUERYING CASSANDRA WITH CQL

If you are transitioning from a relational database and SQL, you will find Cassandra Query Language (CQL) easy to use for accessing the Cassandra server. CQL has a syntax similar to SQL and can be used from a command line shell (cqlsh) or from client APIs such as the Hector API introduced in Chapter 1, “Using Cassandra with Hector.” Although Cassandra is a NoSQL database, Cassandra’s data model is similar to a relational database with a storage model based on column families, columns, and rows. Instead of querying a relational database table, you query a column family. Instead of querying relational database columns and rows, you query Cassandra’s columns and rows. This chapter introduces CQL using the Hector client API for running CQL statements. Another API that supports CQL may be used just as well for running the CQL statements.

OVERVIEW OF CQL

CQL 3 is the latest version of CQL. Being a query language for a non-relational database, some constructs used in SQL are not supported in CQL—for example, JOINS. CQL 3 identifiers are case-insensitive unless enclosed in double quotes. CQL 3 keywords are also case-insensitive. An identifier in CQL is a letter followed by any sequence of letters, digits, and the underscore. A string literal in CQL is specified with single quotes, and to use a single quotation mark in a query, it must be delimited, or escaped, with another single quote. CQL 3 data types were discussed in Chapter 1. The CQL 3 commands are discussed in Table 2.1.

Table 2.1 CQL 3 Commands

For a complete syntax of CQL 3 commands, see http://cassandra.apache.org/doc/cql3/CQL.html.

Note that not all Java clients support CQL 3. For example, Hector does not support CQL 3, but supports CQL 2.0 (http://cassandra.apache.org/doc/cql/CQL.html). Subsequent sections discuss most CQL 2 statements with an example. Later in the chapter, we will discuss some of the new features in CQL 3. You will use the CQL 3 commands in Chapter 3, “Using Cassandra with DataStax Java Driver,” on the DataStax Java driver.

SETTING THE ENVIRONMENT

You will use Hector Java client to run CQL statements. Download the following software:

Apache Cassandra apache-cassandra-2.0.4-bin.tar.gz or a later version from http://cassandra.apache.org/download/

Hector Java client hector-core-1.1-4.jar or a later version from https://github.com/hector-client/hector/downloads

Eclipse IDE for Java EE developers from http://www.eclipse.org/downloads/

Java SE 7 from http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html

Then follow these steps:

1. Install the Eclipse IDE.

2. Extract the Apache Cassandra TAR file to a directory (for example, C:Cassandraapache-cassandra-2.0.4).

3. Add the bin folder, C:Cassandraapache-cassandra-2.0.4in, to the PATH environment variable.

4. Start Apache Cassandra server with the following command: cassandra –f

The Cassandra server starts and begins listening for CQL clients on localhost:9042. Cassandra listens for Thrift clients on localhost:9160.

CREATING A JAVA PROJECT

In this section, you will create a Java project in Eclipse for running CQL statements using a Hector client. Follow these steps:

1. Select File > New > Other in the Eclipse IDE.

2. In the New window select the Java Project wizard and click Next, as shown in Figure 2.1.

Figure 2.1
Selecting the Java Project wizard.

Source: Eclipse Foundation.

3. In the Create a Java Project screen, specify a project name (for example, CQL), select the directory location for the project (or choose Use Default Location), select the JRE, and click Next, as shown in Figure 2.2.

Figure 2.2
Creating a new Java project.

Source: Eclipse Foundation.

4. In the Java Settings screen, select the default settings and click Finish, as shown in Figure 2.3. A new Java project is created in Eclipse, as shown in the Package Explorer (see Figure 2.4).

Figure 2.3
The Java Settings screen.

Source: Eclipse Foundation.

Figure 2.4
The new Java project in the Package Explorer.

Source: Eclipse Foundation.

5. You need to add the same JAR files to the project Java build path as for the Chapter 1 project. To begin, right-click the project node in the Package Explorer and select Properties.

6. In the Properties for CQL dialog box, select the Java Build Path node and click the Add External JARs button to add JAR files, as shown in Figure 2.5. Then click OK.

Figure 2.5
Adding JAR files.

Source: Eclipse Foundation.

7. Add a Java class for the Hector client application. Select File > New > Other and, in the New window, select Java > Class, as shown in Figure 2.6. Then click Next.

Figure 2.6
Selecting Java > Class.

Source: Eclipse Foundation.

8. In the New Java Class wizard, select the source folder (CQL/src), specify a package (cql), and specify a class name (CQLClient). Then select the main method stub to create and click Finish, as shown in Figure 2.7. The CQLClient Java class is created and added to the Eclipse Java project, as shown in the Package Explorer in Figure 2.8.

Figure 2.7
Creating a new Java class.

Source: Eclipse Foundation.

Figure 2.8
The Java class CQLClient.

Source: Eclipse Foundation.

CREATING A KEYSPACE

In CQL 3, the syntax for creating a keyspace is as follows:

CREATE KEYSPACE <keyspace_name> WITH <property1> = {} AND <property2> = {};

The properties supported are discussed in Table 2.2.

Table 2.2 CREATE KEYSPACE Command Properties

If SimpleStrategy is used as the replication strategy, an example of a command to create a keyspace is as follows:

CREATE KEYSPACE CQLKeyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1}
AND durable_writes = false;

The 'replication_factor' sub-option can be used only with SimpleStrategy. If NetworkTopologyStrategy is used, an example of a command to create a keyspace is as follows:

CREATE KEYSPACE CQLKeyspace
    WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 1}
    AND durable_writes = true;

The DC1 and DC2 refer to the data centers DC1 and DC2. The sub-option values are the individual replication factors for each data center.

The CQL 2 syntax for creating a keyspace is as follows:

CREATE KEYSPACE <ks_name>
    WITH strategy_class = <value>
    [ AND strategy_options:<option> = <value> [strategy_options:<option> = <value>]];

For example:

CREATE KEYSPACE CQLKeyspace WITH strategy_class = 'SimpleStrategy'
    AND strategy_options:replication_factor = 1;

If Cassandra CLI (client interface utility) is used to create a keyspace, the syntax of the CREATE KEYSPACE command is different than that discussed. Cassandra CLI does not completely support CQL, and the Thrift API is supported. To create a keyspace in Cassandra CLI, start Cassandra CLI with the following command:

cassandra-cli

Run the following command to create a keyspace with the name CQLKeyspace, the replication strategy SimpleStrategy, and a replication factor of 1:

CREATE KEYSPACE CQLKeyspace WITH placement_strategy= 'org.apache.cassandra.
locator.SimpleStrategy' AND strategy_options={replication_factor:1};

A keyspace CQLKeyspace is created and the output from the command is shown in Figure 2.9. To use the keyspace run the following command:

use CQLKeyspace;

As indicated by the message output in Figure 2.9, the CQLKeyspace is authenticated.

Figure 2.9
The CQLKeyspace is authenticated.

Source: Microsoft Corporation.

CREATING A COLUMN FAMILY

You will use the Hector client to run the CQL statement to create a column family. To begin, add a createCF() method to the CQLClient class and invoke the method from the main method. Hector provides the me.prettyprint.cassandra.model.CqlQuery class to run CQL statements. The constructor for the class is CqlQuery(Keyspace k, Serializer<K> keySerializer,Serializer<N> nameSerializer, Serializer<V> valueSerializer). But, before you may run CQL statements, you need to create a Cluster instance and a Keyspace instance as discussed in Chapter 1.

Cluster cluster = HFactory.getOrCreateCluster("cql-cluster", "localhost:9160");
Keyspace keyspace = HFactory.createKeyspace("CQLKeyspace", cluster);

Create a CQLQuery instance using the class constructor with StringSerializer instances for key, column name, and column value.

CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
StringSerializer.get(), StringSerializer.get());

Next, set the CQL query to create a column family. Set the comparator as UTF8Type, which implies that columns are sorted based on UTF8Type sorting and columns are displayed as UTF8Type text. The other supported types are AsciiType, BytesType (the default), CounterColumnType, IntegerType, LexicalUUIDType and LongType. The default validation class is set using the default_validation parameter set to UTF8Type and is the validator to use for column values. The supported types and default setting are the same as for the comparator.

cqlQuery.setQuery("CREATE COLUMNFAMILY catalog (catalog_id text PRIMARY KEY,
journal text,publisher text,edition text,title text,author text)WITH
comparator=UTF8Type AND default_validation_class=UTF8Type");

Some of the other supported options are discussed in Table 2.3, all of the column family options being optional. Only the column family name is a required parameter.

Table 2.3 CREATE COLUMNFAMILY Command Options

To run the CQL query, invoke the execute() method:

cqlQuery.execute();

The CQLClient application to create a column family catalog using the Hector client to run the CQL statement appears in Listing 2.1.

Listing 2.1 CQLClient Application

package cql;
 
import me.prettyprint.cassandra.model.CqlQuery;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
 
public class CQLClient {
       private static Cluster cluster;
       private static Keyspace keyspace;
       public static void main(String[] args) {
              cluster = HFactory.getOrCreateCluster("cql-cluster",
"localhost:9160");
              createKeyspace();
               createCF();
       }
       private static void createKeyspace() {
              keyspace = HFactory.createKeyspace("CQLKeyspace", cluster);
       }
       private static void createCF() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("CREATE COLUMNFAMILY catalog (catalog_id text
PRIMARY KEY,journal text,publisher text,edition text,title text,author text) WITH
comparator=UTF8Type AND default_validation=UTF8Type AND caching=keys_only AND
replicate_on_write=true");
       cqlQuery.execute();
       }
}

If it’s not already started, start Cassandra, right-click CQLClient, and select Run As > Java Application as shown in Figure 2.10.

Figure 2.10
Running the CQLClient application.

Source: Eclipse Foundation.

The "catalog" column family is created in the CQLKeyspace keyspace. (This keyspace must be created prior to running the CQL statement to create the column family.) In subsequent sections, you will add other methods to the CQLClient class to run CQL statements and invoke the methods from the main method. The primary key column of the "catalog" column family is named something other than KEY, which makes it unsuitable for being specified in the WHERE clause of CQL 2 queries, as you will see in a later section. To create a primary key column called KEY, run the following CQL query:

cqlQuery.setQuery("CREATE COLUMNFAMILY catalog2 (KEY text PRIMARY KEY,journal
text,publisher text,edition text,title text,author text)");

One of the columns must be a primary key column. If a primary key is not specified, the following exception is generated:

InvalidRequestException(why:You must specify a PRIMARY KEY)

USING THE `INSERT` STATEMENT

In this section, you will run the INSERT CQL statement. The syntax for the INSERT statement with the required clauses is as follows:

INSERT INTO <tablename> (<column1>, <column2>, <column>) VALUES (<value1>,
<value2>, <valueN>)

The number of values must match the number of columns or the following exception is generated:

InvalidRequestException(why:unmatched column names/values)

However, the number of columns/values may be less than in the schema for the column family. The primary key column must be specified, as the primary key identifies a row. Add an insert() method to the CQLClient class and invoke the method from the main method. Create a CQLQuery object as before.

CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
StringSerializer.get(), StringSerializer.get());

Set the query to add a row to the catalog table using the setQuery(String) method:

cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal, publisher, edition,
title,author) VALUES ('catalog1','Oracle Magazine', 'Oracle Publishing',
'November-December 2013', 'Engineering as a Service','David A. Kelly')");

Then run the query with the execute() method:

cqlQuery.execute();

Similarly, add another row:

cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal, publisher, edition,
title,author) VALUES ('catalog2','Oracle Magazine', 'Oracle Publishing',
'November-December 2013', 'Quintessential and Collaborative','Tom Haunert')");
cqlQuery.execute();

The INSERT statement adds a new row if one does not exist and replaces the row if a row with the same primary key already exists. Run the CQLClient application to invoke the insert() method and add data to the "catalog" column family. Then add the rows in Cassandra CLI. To fetch the row with the "catalog1" and "catalog2" keys, run the following commands:

get catalog['catalog1'];
get catalog['catalog2'];

The output from the command fetches the rows added with the INSERT statement, as shown in Figure 2.11.

Figure 2.11
Adding rows with the INSERT statement.

Source: Microsoft Corporation.

You add a row to a column family with the name KEY in a similar manner:

cqlQuery.setQuery("INSERT INTO catalog2 (KEY, journal, publisher, edition,title,
author) VALUES ('catalog1','Oracle Magazine', 'Oracle Publishing', 'November-
December 2013', 'Engineering as a Service','David A. Kelly')");
cqlQuery.execute();

When a row is added, all the columns/values do not have to be specified. For example, the following CQL query adds a row without the journal column. Flexible schema is one of the features of the Cassandra database and of NoSQL databases in general.

cqlQuery.setQuery("INSERT INTO catalog (catalog_id,   publisher, edition,title,
author) VALUES ('catalog4', 'Oracle Publishing', 'November-December 2013',
'Engineering as a Service','David A.  Kelly')");
cqlQuery.execute();

USING THE `SELECT` STATEMENT

In this section, you will query using the SELECT statement. The SELECT statement must have the following required clauses and keywords:

SELECT <select-clause> FROM <tablename>

The SELECT statement queries one or more columns from one or more rows and returns the result as a rowset, with each row having the columns specified in the query. Even if a column name not defined in the column family schema is specified in the SELECT statement’s <select-clause>, the column value is returned—a null value for a non-existent column. The columns whose values are to be selected are specified in the <select-clause> as comma-separated column names. Alternatively, to select all columns, specify *. The <tablename> is the column family or table from which to select.

Add a method called select() to the CQLClient application and invoke the method from the main method. Then create a CQLQuery object as before.

CqlQuery<String, String, String> cqlQuery = new CqlQuery<String, String, String>
(keyspace, StringSerializer.get(), StringSerializer.get(), StringSerializer.get());

As an example, select all columns using *:

cqlQuery.setQuery("select * from catalog");

Invoke the execute() method to run the CQL statement. The result of the query is returned as a QueryResult<CqlRows<K, N, V>> object.

QueryResult<CqlRows<String, String, String>> result = cqlQuery.execute();

Fetch the result using the get() method and create an Iterator over the result using the iterator() method.

Iterator iterator = result.get().iterator();

Iterate over the result to fetch individual rows. A row is represented with the Row interface, and a Row instance consists of a key/column slice tuple. Get the key value using the getKey() method and get the column slice represented with the ColumnSlice interface using the getColumnSlice() method. Fetch the collection of columns from the ColumnSlice instance using the getColumns method. Create another Iterator over the list of columns and iterate over the columns to fetch individual HColumn instances, which represent the columns in the column slice. Output the column name using the getName() method from HColumn and output the column value using the getValue() method.

      while (iterator.hasNext()) {
                    Row row = (Row) iterator.next();
                    String key = (String) row.getKey();
                    ColumnSlice columnSlice = row.getColumnSlice();
                    List columnList = columnSlice.getColumns();
                    Iterator iter = columnList.iterator();
                    while (iter.hasNext()) {
                           HColumn column = (HColumn) iter.next();
                           System.out.println("Column name: " +
column.getName() + " ");
 
                           System.out.println("Column Value: " +
column.getValue());
                           System.out.println("
");
                      }
              }

Run the CQLClient application to fetch all the column values from the catalog table. The catalog1 row columns are output as shown in Figure 2.12.

Figure 2.12
The result of a SELECT statement.

Source: Eclipse Foundation.

The catalog2 row columns are output as shown in Figure 2.13.

Figure 2.13
The result of a SELECT statement (continued).

Source: Eclipse Foundation.

The SELECT statement also supports a WHERE clause to filter a query based on the value of another column.

SELECT <select-clause> FROM <tablename> WHERE <where-clause>

CQL requires that the WHERE clause with the = comparison be used with the table key alone or an indexed column alone. The column in the = comparison after WHERE must either be the primary key column called KEY or some other column that has a secondary index. Before we discuss how to filter a SELECT query using the WHERE clause, let’s add a secondary index on a column.

CREATING A SECONDARY INDEX

CQL provides the CREATE INDEX command to create a secondary index on a column already defined in a column family. For example, the following command will add a secondary index called titleIndex on column called title in table called catalog.

All existing data for the column is indexed asynchronously. When new data is added, it is indexed automatically at the time of insertion.

CREATE INDEX titleIndex ON catalog (title)

Add a createIndex() method to the CQLClient class to create a secondary index on a column. Then specify and run the preceding CQL query using a CQLQuery instance.

cqlQuery.setQuery("CREATE INDEX titleIndex ON catalog (title)");
cqlQuery.execute();

Invoke the createIndex() method in the main method and run the CQLClient application to create a secondary index on the title column in the catalog table.

USING THE `SELECT` STATEMENT WITH THE `WHERE` CLAUSE

As mentioned, CQL requires the column in an = comparison specified in the WHERE clause to be an indexed column or a primary key column called KEY. If you run a CQL query using the WHERE clause on a primary key column that is not called KEY or on some other column that has not been indexed, the following exception is generated:

Caused by: InvalidRequestException(why:No indexed columns present in by-columns
clause with "equals" operator)

The following CQLQuery query would generate the preceding exception because catalog_id used with the = operator is not an indexed column, and even though it is a primary key column, it is not called KEY.

cqlQuery.setQuery("SELECT catalog_id, journal, publisher, edition,title,author
FROM catalog WHERE catalog_id='catalog1'");

The same goes for the following query because the journal column used in the = comparison is not an indexed column.

cqlQuery.setQuery("SELECT KEY, journal, publisher, edition,title,author FROM
catalog WHERE journal='Oracle Magazine'");

Because you created a secondary index on the title column in the catalog table, you can use the title column in the = comparison after the WHERE clause:

cqlQuery.setQuery("SELECT catalog_id, journal, publisher, edition,title,author
FROM catalog WHERE title='Engineering as a Service'");

For example, if catalog1 is the only column with the title “Engineering as a Service,” then the preceding query would generate the following result using the same iteration over the QueryResult<CqlRows<String, String, String>> result returned by the query:

Column name: catalog_id
Column Value: catalog1
Column name: journal
Column Value: Oracle Magazine
Column name: publisher
Column Value: Oracle Publishing
Column name: edition
Column Value: November-December 2013
Column name: title
Column Value: Engineering as a Service
Column name: author
Column Value: David A.  Kelly

The SELECT statement with the WHERE clause may also be used with the KEY column in the = comparison—for example, to select the columns where KEY is catalog1.

cqlQuery.setQuery("SELECT KEY, journal, publisher, edition,title,author FROM
catalog2 WHERE KEY='catalog1'");

The result of the query is shown in Figure 2.14.

Figure 2.14
The result of a SELECT statement with a WHERE query.

Source: Eclipse Foundation.

USING THE `UPDATE` STATEMENT

The UPDATE statement is used to update the column values of row(s). You update a row using an UPDATE CQL statement. The syntax of the UPDATE statement is as follows:

UPDATE <tablename> ( USING <option> ( AND <option> )* )? SET <assignment1> ( ','
<assignmentN> )* WHERE <where-clause>;

Add a method called update() to the CQLClient class. Then set the CQL UPDATE statement as the query in the CQLQuery object.

cqlQuery.setQuery("UPDATE catalog USING CONSISTENCY ALL SET 'edition' = '11/12
2013', 'author' = 'Kelley, David A.' WHERE CATALOG_ID = 'catalog1'");

The column in the WHERE clause to select the row must be the primary key column. If some other column is used, the following exception is generated:

Caused by: InvalidRequestException(why:Expected key 'CATALOG_ID' to be present in
WHERE clause for 'catalog')

UPDATE does not try to determine whether the row identified by the primary key column exists. If the row does not exist, a row is created. Run a SELECT query after the UPDATE statement. The result of the query indicates that the columns were updated, as shown in Figure 2.15.

Figure 2.15
The result of a SELECT statement after an UPDATE statement.

Source: Eclipse Foundation.

USING THE `BATCH` STATEMENT

The BATCH statement is used to run a group of modification statements (insertions, updates, deletions) in a batch as a single statement. Only UPDATE, INSERT, and DELETE statements may be grouped in a BATCH statement. Running multiple statements as a single statement saves round trips between the client and the server. The syntax of the BATCH statement is as follows:

BEGIN BATCH (USING <option> ( AND <option> )* )? <modification-stmt> ( ';'
<modification-stmt> )* APPLY BATCH;

Add a method called batch() to the CQLClient class. Set a BATCH statement in the CQLQuery object. The BATCH statement includes two INSERT statements and two UPDATE statements.

cqlQuery.setQuery("BEGIN BATCH USING CONSISTENCY QUORUM   UPDATE catalog     SET
'edition' = '11/12 2013', 'author' = 'Haunert, Tom' WHERE CATALOG_ID = 'catalog2'
INSERT INTO catalog (catalog_id, journal, publisher, edition,title,author) VALUES
('catalog3','Oracle Magazine', 'Oracle Publishing', 'November-December 2013',
'','') INSERT INTO catalog (catalog_id, journal, publisher, edition,title,author)
VALUES ('catalog4','Oracle Magazine', 'Oracle Publishing', 'November-December
2013', '','') UPDATE catalog   SET 'edition' = '11/12 2013'  WHERE CATALOG_ID =
'catalog3' APPLY BATCH");

The consistency level cannot be set for individual statements within a BATCH statement. If the consistency level is set on individual statements, the following error is generated:

Caused by: InvalidRequestException(why:Consistency level must be set on the BATCH,
not individual statements)

Invoke the batch() method from the main method and run the CQLClient class in the Eclipse IDE. All the statements grouped in the BATCH statement are run and applied. Next, invoke the select() method after the batch() method to output all the columns in all the rows. The result of the query, shown here, indicates that the BATCH statement has been applied.

Result took (38195us) for query (me.prettyprint.cassandra.model.CqlQuery@65b57dc
c) on host: localhost(127.0.0.1):9160
Column name: catalog_id
Column Value: catalog1
Column name: author
Column Value: Kelley, David A.
Column name: edition
Column Value: 11/12 2013
Column name: journal
Column Value: Oracle Magazine
Column name: publisher
Column Value: Oracle Publishing
Column name: title
Column Value: Engineering as a Service
Column name: catalog_id
Column Value: catalog2
Column name: author
Column Value: Haunert, Tom
Column name: edition
Column Value: 11/12 2013
Column name: journal
Column Value: Oracle Magazine
Column name: publisher
Column Value: Oracle Publishing
Column name: title
Column Value: Quintessential and Collaborative
Column name: catalog_id
Column Value: catalog3
Column name: author
Column Value:
Column name: edition
Column Value: 11/12 2013
Column name: journal
Column Value: Oracle Magazine
Column name: publisher
Column Value: Oracle Publishing
Column name: title
Column Value:
Column name: catalog_id
Column Value: catalog4
Column name: author
Column Value:
Column name: edition
Column Value: November-December 2013
Column name: journal
Column Value: Oracle Magazine
Column name: publisher
Column Value: Oracle Publishing
Column name: title
Column Value:

USING THE `DELETE` STATEMENT

The DELETE statement is used to delete columns and rows. The syntax of the DELETE statement is as follows:

DELETE ( <selection> ( ',' <selection> )* )? FROM <tablename> WHERE <where-clause>

The <selection> items refer to the columns to be deleted. The column in the WHERE clause must be the primary key column. If no column is specified, all the columns are deleted. The row itself is not deleted because the primary key column is not deleted even if the primary column is specified in the <selection> items. Add a method called delete() to CQLClient class. Then set a query to delete the journal and publisher columns from the catalog table from the row with the primary key "catalog3".

cqlQuery.setQuery("DELETE journal, publisher from catalog WHERE
catalog_id='catalog3'");
cqlQuery.execute();

Next, set a query to delete all columns from the catalog table from the row with the primary key "catalog4".

cqlQuery.setQuery("DELETE from catalog WHERE catalog_id='catalog4'");
cqlQuery.execute();

To demonstrate that the primary key column cannot be deleted, include the catalog_id column in the columns to delete:

cqlQuery.setQuery("DELETE catalog_id, journal, publisher, edition, title, author
from catalog WHERE catalog_id='catalog4'");
cqlQuery.execute();

Invoke the delete() method from the main method and run the CQLClient class in the Eclipse IDE. Then invoke the select() method after the delete() method to query the rows after deletion. As shown in the Eclipse IDE in Figure 2.16, the journal and publisher columns are deleted from the catalog3 row, and all the columns have been deleted from the catalog4 row. The primary key column is not deleted.

Figure 2.16
The result of a SELECT statement after a DELETE statement.

Source: Eclipse Foundation.

USING THE `ALTER COLUMNFAMILY` STATEMENT

The ALTER COLUMNFAMILY or ALTER TABLE statement is used to alter the column family definitions to add columns, drop columns, change the type of existing columns, and update the table options. The syntax of the statement is as follows:

ALTER (TABLE | COLUMNFAMILY) <tablename> <instruction>

The <instruction> supports the alterations using the keywords discussed in Table 2.4.

Table 2.4 ALTER Command Keywords

Add updateCF() and updateCF2() methods to the CQLClient class. In the updateCF() method, change the column type of the edition column to int in the catalog table using statement ALTER COLUMNFAMILY catalog ALTER edition TYPE int.

cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER edition TYPE int");
cqlQuery.execute();

Invoke the updateCF() method in the main method and run the CQLClient class. The edition column type gets changed to int. The value in the edition column is still of type text. A subsequent select() method returns the value of the edition column as text. In updateCF2(), change the type of the edition column back to text.

cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER edition TYPE text");
cqlQuery.execute();

If a column type is modified, a column value that was previously addable becomes nonaddable. For example, set the column type of the journal column to int:

cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER journal TYPE int");
cqlQuery.execute();

Next, add a journal column value of type text:

cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal, publisher, edition,
title,author) VALUES ('catalog5','Oracle Magazine', 'Oracle Publishing',
'November-December 2013', '','')");
cqlQuery.execute();

The following exception is generated, indicating that the text value cannot be added to an int type column:

HInvalidRequestException: InvalidRequestException(why:unable to make int from
'Oracle Magazine')

DROPPING THE COLUMN FAMILY

The DROP TABLE or DROP COLUMNFAMILY statement may be used to drop a column family, including all the data in the column family. Add a dropCF() method to the CQLClient class. Then set the query on a CQLQuery object to be DROP COLUMNFAMILY catalog, which would drop the catalog column family.

cqlQuery.setQuery("DROP COLUMNFAMILY catalog");
cqlQuery.execute();

Next, invoke the dropCF() method from the main method and run the CQLClient application. The catalog column family gets dropped. If only the table data is to be removed but not the table, use the TRUNCATE statement:

TRUNCATE <tablename>

DROPPING THE KEYSPACE

You can use the DROP KEYSPACE statement to drop a keyspace:

DROP KEYSPACE <identifier>

Add a dropKeyspace() method to drop a keyspace. Drop the CQLKeyspace by setting the CQLQuery object query to DROP KEYSPACE CQLKeyspace.

cqlQuery.setQuery("DROP KEYSPACE CQLKeyspace");
cqlQuery.execute();

Invoke the dropKeyspace() method from the main method and run the CQLClient application to drop the CQLKeyspace. The execute() method must be invoked after you set a query with setQuery(). Queries do not get added to the CQLQuery object so they can be run in a batch. If a keyspace is used after it has been dropped, the following error is generated:

Caused by: InvalidRequestException(why:Keyspace 'CQLKeyspace' does not exist)

THE CQLCLIENT APPLICATION

The CQLClient application appears in Listing 2.2. Some of the method invocations in the main method have been commented out and should be uncommented as required to run individually or in sequence.

Listing 2.2 The CQLClient Application

package cql;
 
import java.util.Iterator;
import java.util.List;
 
import me.prettyprint.cassandra.model.CqlQuery;
import me.prettyprint.cassandra.model.CqlRows;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
 
public class CQLClient {
 
       private static Cluster cluster;
       private static Keyspace keyspace;
       public static void main(String[] args) {
               cluster = HFactory.getOrCreateCluster("cql-cluster",
"localhost:9160");
              /*Some of the method invocations in the main method have been commented
out and should be uncommented as required to run individually or in sequence. */
              createKeyspace();
               createCF();
              //insert();
              //select();
              //createIndex();
              //selectFilter();
              //update();
              //select();
              //batch();
              //select();
              //delete();
              //update2();
              //select();
              //updateCF();
              //select();
              //updateCF2();
              //dropCF();
              //dropKeyspace();
       }
/*Creates a Cassandra keyspace*/
       private static void createKeyspace() {
              keyspace = HFactory.createKeyspace("CQLKeyspace", cluster);
       }
       /*Drops a Cassandra keyspace*/
       private static void dropKeyspace() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("DROP KEYSPACE  CQLKeyspace");
              cqlQuery.execute();
       }
       /*Creates an index*/
       private static void createIndex() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("CREATE INDEX titleIndex ON catalog (title)");
              cqlQuery.execute();
       }
       /*Creates a column family*/
       private static void createCF() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("CREATE COLUMNFAMILY catalog (catalog_id text
PRIMARY KEY,journal text,publisher text,edition text,title text,author text) WITH
comparator=UTF8Type AND default_validation=UTF8Type AND caching=keys_only AND
replicate_on_write=true");
       cqlQuery.execute();
              cqlQuery.setQuery("CREATE COLUMNFAMILY catalog2 (KEY text PRIMARY
KEY,journal text,publisher text,edition text,title text,author text)");
cqlQuery.execute();
       }
       /*Adds data to a column family*/
       private static void insert() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal,
publisher, edition,title,author) VALUES ('catalog1','Oracle Magazine', 'Oracle
Publishing', 'November-December 2013', 'Engineering as a Service','David
A.  Kelly')");
              cqlQuery.execute();
              cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal,
publisher, edition,title,author) VALUES ('catalog2','Oracle Magazine', 'Oracle
Publishing', 'November-December 2013', 'Quintessential and Collaborative','Tom
Haunert')");
              cqlQuery.execute();
              cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal,
publisher, edition,title,author) VALUES ('catalog3','Oracle Magazine', 'Oracle
Publishing', 'November-December 2013', 'Engineering as a Service','David A.
Kelly')");
              cqlQuery.execute();
              cqlQuery.setQuery("INSERT INTO catalog (catalog_id,   publisher,
edition,title,author) VALUES ('catalog4', 'Oracle Publishing', 'November-
December 2013', 'Engineering as a Service','David A.  Kelly')");
              cqlQuery.execute();
              cqlQuery.setQuery("INSERT INTO catalog2 (KEY, journal, publisher,
edition,title,author) VALUES ('catalog1','Oracle Magazine', 'Oracle Publishing',
'November-December 2013', 'Engineering as a Service','David A.  Kelly')");
              cqlQuery.execute();
       }
       /*Selects data from a column family*/
       private static void select() {
              CqlQuery<String, String, String> cqlQuery = new CqlQuery<String,
String, String>(
                            keyspace, StringSerializer.get(), StringSerializer.
get(),
                            StringSerializer.get());
              cqlQuery.setQuery("select * from catalog");
              QueryResult<CqlRows<String, String, String>> result = cqlQuery
                            .execute();
              System.out.println(result);
              Iterator iterator = result.get().iterator();
              while (iterator.hasNext()) {
                     Row row = (Row) iterator.next();
                     String key = (String) row.getKey();
                     ColumnSlice columnSlice = row.getColumnSlice();
                     List columnList = columnSlice.getColumns();
                     Iterator iter = columnList.iterator();
                     while (iter.hasNext()) {
                            HColumn column = (HColumn) iter.next();
                            System.out.println("Column name: " +
column.getName() + " ");
 
                            System.out.println("Column Value: " +
column.getValue());
                            System.out.println("
");
                      }
              }
       }
 
       /*Selects data from a column family using a WHERE clause*/
private static void selectFilter() {
               CqlQuery<String, String, String> cqlQuery = new CqlQuery<String,
String, String>(
                             keyspace, StringSerializer.get(), StringSerializer.
get(),
                             StringSerializer.get());
              //cqlQuery.setQuery("SELECT catalog_id, journal, publisher,
edition,title,author FROM catalog WHERE title='Engineering as a Service'");
              cqlQuery.setQuery("SELECT   journal, publisher, edition,title,
author FROM catalog2 WHERE KEY='catalog1'");
              //cqlQuery.setQuery("SELECT catalog_id, journal, publisher,
edition,title,author FROM catalog WHERE catalog_id='catalog1'");//Generates
exception
              QueryResult<CqlRows<String, String, String>> result = cqlQuery
                            .execute();
              System.out.println(result);
              Iterator iterator = result.get().iterator();
              while (iterator.hasNext()) {
                     Row row = (Row) iterator.next();
                     String key = (String) row.getKey();
                     ColumnSlice columnSlice = row.getColumnSlice();
                     List columnList = columnSlice.getColumns();
                     Iterator iter = columnList.iterator();
                     while (iter.hasNext()) {
                            HColumn column = (HColumn) iter.next();
                            System.out.println("Column name: " +
column.getName() + " ");
                            System.out.println("Column Value: " +
column.getValue());
                            System.out.println("
");
                      }
              }
       }
       /*Updates a row or rows of data in a column family*/
       private static void update() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                             StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("UPDATE catalog USING CONSISTENCY ALL   SET
'edition' = '11/12 2013', 'author' = 'Kelley, David A.' WHERE CATALOG_ID =
'catalog1'");
              cqlQuery.execute();
       }
       /*Updates a row or rows of data in a column family*/
       private static void update2() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("UPDATE catalog USING CONSISTENCY ALL   SET
'edition' = 'November-December 2013', 'author' = 'Kelley, David A.' WHERE
CATALOG_ID = 'catalog1'");
              cqlQuery.execute();
       }
       /*Deletes columns from a row or rows of data in a column family*/
       private static void delete() {
 
              CqlQuery cqlQuery = new CqlQuery<String, String, String>(keyspace,
                            StringSerializer.get(), StringSerializer.get(),
                            StringSerializer.get());
              cqlQuery.setQuery("DELETE journal, publisher from catalog WHERE
catalog_id='catalog3'");
              cqlQuery.execute();
              cqlQuery.setQuery("DELETE from catalog WHERE
catalog_id='catalog4'");
              cqlQuery.execute();
              cqlQuery.setQuery("DELETE catalog_id, journal, publisher, edition,
title, author from catalog WHERE catalog_id='catalog4'");
              cqlQuery.execute();
       }
       /*Runs multiple statements in a batch*/
       private static void batch() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("BEGIN BATCH USING CONSISTENCY QUORUM   UPDATE
catalog   SET 'edition' = '11/12 2013', 'author' = 'Haunert, Tom' WHERE CATALOG_ID =
'catalog2' INSERT INTO catalog (catalog_id, journal, publisher, edition,title,
author) VALUES ('catalog3','Oracle Magazine', 'Oracle Publishing', 'November-
December 2013', '','') INSERT INTO catalog (catalog_id, journal, publisher,
edition,title,author) VALUES ('catalog4','Oracle Magazine', 'Oracle Publishing',
'November-December 2013', '','') UPDATE catalog       SET 'edition' = '11/12 2013'
WHERE CATALOG_ID = 'catalog3' APPLY BATCH");
              cqlQuery.execute();
       }
       /*Updates a column family*/
       private static void updateCF() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER edition TYPE
int");
              cqlQuery.execute();
       }
 
       /*Updates a column family*/
       private static void updateCF2() {
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER edition TYPE
text");
              cqlQuery.execute();
              cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                             StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("ALTER COLUMNFAMILY catalog ALTER journal TYPE
int");
              cqlQuery.execute();
              /* CF gets updated with column to a type different from column value*/
              cqlQuery.setQuery("INSERT INTO catalog (catalog_id, journal,
publisher, edition,title,author) VALUES ('catalog5','Oracle Magazine', 'Oracle
Publishing', 'November-December 2013', '','')");
              cqlQuery.execute();
       }
       /*Drops a column family*/
       private static void dropCF() {
 
              CqlQuery cqlQuery = new CqlQuery(keyspace, StringSerializer.get(),
                            StringSerializer.get(), StringSerializer.get());
              cqlQuery.setQuery("DROP COLUMNFAMILY   catalog");
              cqlQuery.execute();
       }
}

NEW FEATURES IN CQL 3

CQL 3 has added support for several new features and is backward-compatible. The keyword COLUMNFAMILY has been replaced with TABLE. Some of the salient new features are discussed next.

Compound Primary Key

The CREATE TABLE command has added a provision for a multiple column primary key, also called a compound primary key. The CREATE COLUMNFAMILY example in this chapter makes use of a simple primary key—a primary key with only one column. A compound primary key for the catalog table may be declared as follows:

CREATE TABLE catalog (
       catalog_id text,
       journal text,
       edition text,
       title text,
       author text,
       PRIMARY KEY (catalog_id, journal)
   );

The preceding statement creates a table using the catalog_id and journal columns to form a compound primary key. A table that has a compound primary key must have at least one column that is not included in the primary key.

To run an INSERT statement on a table with a compound primary key, each of the columns in the primary key must be specified. In addition, at least one of the non–primary key columns must be specified.

The WHERE clause may specify each of the columns in the compound primary key using AND as follows:

UPDATE catalog     SET  'edition' = 'November-December 2013', 'author' = 'Kelley,
David A.' WHERE CATALOG_ID = 'catalog1' AND journal='Oracle Magazine';

If a compound primary key is used in a WHERE clause, key-component columns other than the first may have a > (greater than) or < (less than) comparison. If all the preceding key-component columns have been identified with an = comparison, the last key-component may specify any kind of relation.

Conditional Modifications

The CREATE statements for KEYSPACE, TABLE, and INDEX support an IF NOT EXISTS condition. In CQL 2.0, the CREATE statement for KEYSPACE, TABLE, and INDEX throws an exception if the construct already exists.

CREATE KEYSPACE IF NOT EXISTS CQLKeyspace WITH replication = { 'class':
'SimpleStrategy','replication_factor' : 1 };
CREATE TABLE IF NOT EXISTS catalog (catalog_id text PRIMARY KEY,journal text,
publisher text,edition text,title text,author text);

The DROP statements support an IF EXISTS condition:

DROP KEYSPACE IF EXISTS CQLKeyspace;

The INSERT statement supports an IF NOT EXISTS condition. CQL 3 has added the provision to add a new row only if a row by the same primary key value does not already exist. The CQL 3 clause to add conditionally is IF NOT EXISTS. In CQL 2, the INSERT statement was run even if a row by the same primary key was already defined. The following CQL 3 statement adds a row only if a row identified by catalog1 does not exist:

INSERT INTO catalog (catalog_id, journal, publisher, edition,title,author) VALUES
('catalog1','Oracle Magazine', 'Oracle Publishing', 'November-December 2013',
'Engineering as a Service','David A. Kelly') IF NOT EXISTS;

The UPDATE statement supports an IF condition:

UPDATE  table_name
USING option1 AND optionN
SET assignment1 , assignmentN
WHERE <where-clause>
IF column_name1 = literal AND column_nameN = literal

The columns in the IF clause may be different from the columns to be updated. The IF condition incurs a negligible performance overhead, as Paxos is used internally. Paxos is a consensus protocol for a distributed system.

SUMMARY

This chapter introduced Cassandra Query Language (CQL), including the CQL commands. You used CQL 2 queries with the Hector Java client to add, select, update, and delete data from a Cassandra column family. You also discovered the salient new features in CQL 3. The next chapter discusses the DataStax Java driver, which supports CQL 3.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 2 Querying Cassandra with CQL

Create new playlist

Sign In

Sign Up

CHAPTER 2

QUERYING CASSANDRA WITH CQL

OVERVIEW OF CQL

SETTING THE ENVIRONMENT

CREATING A JAVA PROJECT

CREATING A KEYSPACE

CREATING A COLUMN FAMILY

USING THE INSERT STATEMENT

USING THE SELECT STATEMENT

CREATING A SECONDARY INDEX

USING THE SELECT STATEMENT WITH THE WHERE CLAUSE

USING THE UPDATE STATEMENT

USING THE BATCH STATEMENT

USING THE DELETE STATEMENT

USING THE ALTER COLUMNFAMILY STATEMENT

DROPPING THE COLUMN FAMILY

DROPPING THE KEYSPACE

THE CQLCLIENT APPLICATION

NEW FEATURES IN CQL 3

Compound Primary Key

Conditional Modifications

SUMMARY

Table of Contents for
Chapter 2 Querying Cassandra with CQL

USING THE `INSERT` STATEMENT

USING THE `SELECT` STATEMENT

USING THE `SELECT` STATEMENT WITH THE `WHERE` CLAUSE

USING THE `UPDATE` STATEMENT

USING THE `BATCH` STATEMENT

USING THE `DELETE` STATEMENT

USING THE `ALTER COLUMNFAMILY` STATEMENT