SQL standard-based mode

For fine-grained access control on a column and row level, we can use SQL standard-based mode, available since Hive v0.13.0. It is similar to relational database authorization by using the GRANT and REVOKE statements to control access through the hiveserver2 configuration. However, tools such as Hive or HDFS commands do not access data through hiveserver2, so SQL standard-based mode cannot authorize their access.

Therefore, it is recommended you use storage-based mode together with SQL standard-based mode to authorize users connecting from various tools. To enable SQL standard-based mode authorization, we can set the following properties in the hive-site.xml file:

<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description>enable or disable the hive client authorization </description>
</property>

<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
<description>Allows Hive queries to be run by the user who submits the query rather than the hive user. Need to turn if off for this SQL standard-base mode</description>
</property>

<property>
<name>hive.users.in.admin.role</name>
<value>dayongd,administrator</value>
<description>Comma-separated list of users assigned to the ADMIN role.</description>
</property>

<property>
<name>hive.security.authorization.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sql</value> </property>

<property>
<name>hive.security.authenticator.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory</value>
</property>

<property>
<name>hive.security.metastore.authorization.manager</name>
<value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider,org.apache.hadoop.hive.ql.security.authorization.MetaStoreAuthzAPIAuthorizerEmbedOnly</value>
<description>It takes a comma separated list, so we can add MetaStoreAuthzAPIAuthorizerEmbedOnly along with StorageBasedAuthorization parameter,if we want to enable that as well</description>
</property>

In addition, we need to put the following configurations in hiveserver2-site.xml, before restarting hiveserver2, to make SQL standard-based authorization effective:

<configuration>

<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
<description></description>
</property>

<property>
<name>hive.security.authorization.manager</name
<value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>

<property>
<name>hive.security.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>

<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9085</value>
<description>Use 9083 for hive1 and 9085 for hive2</description>
</property>

</configuration>
Before restarting hiveserver2 to enable the preceding setting, do not forget to grant admin roles to the users defined in hive.users.in.admin.role using GRANT admin TO USER <user_name>.

With SQL standard-based mode authorization, we can manage privileges on two levels: role or object.

The syntax to grant or revoke an authorization at the role level is as follows:

  • GRANT <ROLE_NAME> TO <PRINCIPLES> [ WITH ADMIN OPTION]
  • REVOKE [ADMIN OPTION FOR] <ROLE_NAME> FROM <PRINCIPLES>

The usage of the parameters is as follows:

  • <ROLE_NAME>: This is a comma-separated role name
  • <PRINCIPLES>: This is a user or a role
  • WITH ADMIN OPTION: This is optional. Once specified, it makes sure that the user gets the privileges to grant the role to other users/roles

On the other hand, the syntax to grant or revoke an authorization at the object level is as follows:

  • GRANT <PRIVILEGE> ON <OBJECT> TO <PRINCIPLES>
  • REVOKE <PRIVILEGE> ON <OBJECT> FROM <PRINCIPLES>

Here, the following parameters are used:

  • <PRIVILEGE>: This can be INSERT, SELECT, UPDATE, DELETE, or ALL
  • <PRINCIPLES>: This can be a user or a role
  • <OBJECT>: This is a table or a view

For more examples of HQL statements to manage SQL standard-based authorization, please refer to the Hive wiki at https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-Configuration.

Apache Sentry is a highly modular system for providing centralized, fine-grained, role-based authorization to both data and metadata stored on an Apache Hadoop cluster. It can be integrated with Hive to deliver advanced authorization controls. For more information about Sentry, please refer to https://sentry.apache.org/. Sentry is usually distributed in the Cloudera CDH package. Another similar project is Apache Ranger (https://ranger.apache.org/), which is usually distributed in the Hortonworks HDP package.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.133.180