Database

The database in Hive describes a collection of tables that are used for a similar purpose or belong to the same groups. If the database is not specified, the default database is used and uses /user/hive/warehouse in HDFS as its root directory. This path is configurable by the hive.metastore.warehouse.dir property in hive-site.xmlWhenever a new database is created, Hive creates a new directory for each database under /user/hive/warehouse. For example, the myhivebook database is located at /user/hive/datawarehouse/myhivebook.db. In addition, DATABASE has a name alias, SCHEMA, meaning they are the same thing in HQL. The following is the major DDL for databases operations:

  1. Create the database/schema if it doesn't exist:
      > CREATE DATABASE myhivebook;
> CREATE SCHEMA IF NOT EXISTS myhivebook;
  1. Create the database with the location, comments, and metadata information:
      > CREATE DATABASE IF NOT EXISTS myhivebook
> COMMENT 'hive database demo'
> LOCATION '/hdfs/directory'
> WITH DBPROPERTIES ('creator'='dayongd','date'='2018-05-01');


-- To show the DDL use show create database since v2.1.0
> SHOW CREATE DATABASE default;
+------------------------------------------------+
| createdb_stmt |
+------------------------------------------------+
| CREATE DATABASE `default` |
| COMMENT |
| 'Default Hive database' |
| LOCATION |
| 'hdfs://localhost:9000/user/hive/warehouse' |
+------------------------------------------------+
  1. Show and describe the database with wildcards:
      > SHOW DATABASES;
+----------------+
| database_name |
+----------------+
| default |
+----------------+
1 row selected (1.7 seconds)


> SHOW DATABASES LIKE 'my.*';

> DESCRIBE DATABASE default;
+-------+----------------------+--------------------------------+
|db_name| comment | location |
+-------+----------------------+--------------------------------+
|default|Default Hive database | hdfs://localhost:9000
/user/hive/warehouse |

+-------+----------------------+--------------------------------+
1 row selected (1.352 seconds)
  1. Switch to use one database or directly qualify the table name with the database name:
      > USE myhivebook;
> --SELECT * FROM myhivebook.table_name;
  1. Show the current database:
      > SELECT current_database();
+----------+
| _c0 |
+----------+
| default |
+----------+
1 row selected (0.218 seconds)
  1. Drop the database:
      > DROP DATABASE IF EXISTS myhivebook;--failed when database is 
not empty
> DROP DATABASE IF EXISTS myhivebook CASCADE;--drop database and
tables
Hive databases/tables are directories/subdirectories in HDFS. In order to remove the database directory, we need to remove the subdirectories (for tables) first. By default, the database cannot be dropped if it is not empty, unless the CASCADE option is specified. With this option, it drops all tables in the database automatically before dropping the database.
  1. Alter the database properties. The ALTER DATABASE statement can only apply to dbproperties, owner, and location on the database. The other database properties cannot be changed:
      > ALTER DATABASE myhivebook SET DBPROPERTIES ('edited-
by'='Dayong');

> ALTER DATABASE myhivebook SET OWNER user dayongd;
> ALTER DATABASE myhivebook SET LOCATION '/tmp/data/myhivebook';
Since Hive v2.2.1, the ALTER DATABASE ... SET LOCATION statement can be used to modify the database's location, but it does not move all existing tables/partitions in the current database directory to the newly specified location. It only changes the location for newly added tables after the database is altered. This behavior is analogous to how changing a table-directory does not move existing partitions to a different location.
The SHOW and DESC (or DESCRIBE) statements in Hive are used to show the definition for most of the objects, such as tables and partitions. The SHOW statement supports a wide range of Hive objects, such as tables, tables' properties, table DDL, index, partitions, columns, functions, locks, roles, configurations, transactions, and compactions. The DESC statement supports a small range of Hive objects, such as databases, tables, views, columns, and partitions. However, the DESC statement is able to provide more detailed information combined with the EXTENDED or FORMATTED keywords. In this book, there is no dedicated section to introduce SHOW and DESC. Instead, we introduce them in line with other HQL through the remaining chapters.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.226.109