Datastax has an updated reference documentation at http://www.datastax.com/docs/1.1/references/cql/index which we would like to suggest referring to in case you need more than what we cover here. We will be covering things which will be useful as an administrator.
Cassandra supports multiple interfaces for clients to query data; native transport supports Cassandra Query Language queries (with clients written in multiple languages). With the help of Thrift, Cassandra supports multiple languages for which native transport support doesn't exist. Cassandra Query Language (CQL) is similar to SQL and is used for querying Cassandra. In other words, CQL is a
domain-specific language (DSL) for querying data from Cassandra. You can also choose to use the cqlsh
command to connect to Cassandra to run ad hoc CQL queries.
A USE
statement sets the keyspace of the current client session; all subsequent operations in the connection are within the context of the keyspace. You can reset the keyspace by executing the USE
command again:
USE <keyspace_name>;
CREATE TABLE
creates a new column family under the current keyspace. While creating a table, we need to define the primary/row key's data type and the comparator for the column family. In CQL3, the column family type is assumed to be string and the value types can be defined. You can log in to CQL to define a different data type to the column names if you wish.
In case of a composite primary key (as shown in the following code snippet), Cassandra uses the first part of the primary key as the actual partition key; the second part of the primary key is used as the first part of the composite column name, following which we will have the actual column family name defined:
CREATE TABLE <column family name> ( <column_name> <type>, [<column_name2> <type>, ...] PRIMARY KEY ( <column_name> <type> [, <column_name2> <type>,...] ) [WITH <option name> = <value> [AND <option name> = <value> [...]]] );
Cassandra also supports secondary indexes, but the index queries span across the cluster; this will be cheaper on a smaller installation and expensive on large installations. Cassandra secondary indexes are a good fit for low cardinality values (more unique values):
CREATE INDEX table ON users (<Column name>);
ALTER TABLE
is used to alter the table schematics after the table or keyspace is created and used. The schematics for the command are the same as CREATE
, except the keyword needs to be replaced with ALTER
.
ALTER TABLE [<keyspace_name>].<column_family> ( ALTER <column_name> TYPE <data_type> | ADD <column_name><data_type> | DROP <column_name> | WITH <optionname> = <val> [AND <optionname> = <val> [...]] );
The DESCRIBE
command is used to describe an object's valid options: DESCRIBE CLUSTER
, DESCRIBE TABLE
, DESCRIBE KEYSPACE
, and DESCRIBE SCHEMA
. It is highly recommended that you run these commands on your test cluster while reading this section.
The
DESCRIBE CLUSTER
option allows users to view the cluster name, partitioner, and snitch information.
The DESCRIBE SCHEMA
option allows users to view schema information. You can copy-paste to replicate the schema.
Selecting data out of Cassandra will follow the following syntax:
SELECT <select expression> FROM <column family> [USING CONSISTENCY <level>] [WHERE (<clause>)] [LIMIT <n>] [ORDER BY <composite key 2>] [ASC, DESC] <clause> syntax is: <primary key name> { = | < | > | <= | >= } <key_value> <primary key name> IN (<key_value> [,...]);
It is important to understand that an efficient query with a WHERE
clause should be specified with a primary key. The first part of the primary key query has to be an exact match and the second part can have <
, >
, or =
clauses (primary key order does matter).
3.144.100.237