For fine-grained access control on a column and row level, we can use SQL standard-based mode, available since Hive v0.13.0. It is similar to relational database authorization by using the GRANT and REVOKE statements to control access through the hiveserver2 configuration. However, tools such as Hive or HDFS commands do not access data through hiveserver2, so SQL standard-based mode cannot authorize their access.
Therefore, it is recommended you use storage-based mode together with SQL standard-based mode to authorize users connecting from various tools. To enable SQL standard-based mode authorization, we can set the following properties in the hive-site.xml file:
<property> <name>hive.security.authorization.enabled</name> <value>true</value> <description>enable or disable the hive client authorization </description> </property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>Allows Hive queries to be run by the user who submits the query rather than the hive user. Need to turn if off for this SQL standard-base mode</description>
</property>
<property>
<name>hive.users.in.admin.role</name>
<value>dayongd,administrator</value>
<description>Comma-separated list of users assigned to the ADMIN role.</description>
</property>
<property>
<name>hive.security.authorization.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sql</value> </property>
<property>
<name>hive.security.authenticator.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
</property>
<property>
<name>hive.security.metastore.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly</value>
<description>It takes a comma separated list, so we can add MetaStoreAuthzAPIAuthorizerEmbedOnly along with StorageBasedAuthorization parameter,if we want to enable that as well</description>
</property>
In addition, we need to put the following configurations in hiveserver2-site.xml, before restarting hiveserver2, to make SQL standard-based authorization effective:
<configuration>
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description></description>
</property>
<property>
<name>hive.security.authorization.manager</name
<value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
<name>hive.security.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9085</value>
<description>Use 9083 for hive1 and 9085 for hive2</description>
</property>
</configuration>
With SQL standard-based mode authorization, we can manage privileges on two levels: role or object.
The syntax to grant or revoke an authorization at the role level is as follows:
- GRANT <ROLE_NAME> TO <PRINCIPLES> [ WITH ADMIN OPTION]
- REVOKE [ADMIN OPTION FOR] <ROLE_NAME> FROM <PRINCIPLES>
The usage of the parameters is as follows:
- <ROLE_NAME>: This is a comma-separated role name
- <PRINCIPLES>: This is a user or a role
- WITH ADMIN OPTION: This is optional. Once specified, it makes sure that the user gets the privileges to grant the role to other users/roles
On the other hand, the syntax to grant or revoke an authorization at the object level is as follows:
- GRANT <PRIVILEGE> ON <OBJECT> TO <PRINCIPLES>
- REVOKE <PRIVILEGE> ON <OBJECT> FROM <PRINCIPLES>
Here, the following parameters are used:
- <PRIVILEGE>: This can be INSERT, SELECT, UPDATE, DELETE, or ALL
- <PRINCIPLES>: This can be a user or a role
- <OBJECT>: This is a table or a view
For more examples of HQL statements to manage SQL standard-based authorization, please refer to the Hive wiki at https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-Configuration.