Security

Financial organizations must meet strict requirements for access control, confidentiality, and privacy. Hadoop was designed for processing large amounts of unstructured public data on commodity servers, so security was never a driver for its design or development. Initially, this was a big barrier to Hadoop's entry into the financial sector.

In 2009, Yahoo chose Kerberos as the authentication mechanism for Hadoop, and Kerberos has since become the basis of Hadoop's security model. Combined with Hadoop's own file system security, Kerberos has addressed this security concern within the financial sector.

This section covers the topic only at a high level. Please refer to your product's security documentation for more details.

The three main aspects of security are:

  • Authentication: Kerberos is used for authentication in two ways: Kerberos RPC using SASL/GSSAPI, and Kerberos HTTP SPNEGO for Hadoop's Web UIs.

  • Authorization: Hadoop provides authorization controls to authenticated users through HDFS file permissions and service-level authorization. HDFS uses a permissions model for files and directories that is similar to the UNIX model.

    The cluster authorizes requests based on users and their group permissions. If you have users from multiple domains, you have to map their Kerberos principals to usernames.

    If you are already using Active Directory for Kerberos authentication, it might make sense to manage groups through your LDAP instance or Active Directory as well.

  • Encryption: Hadoop provides mechanisms to encrypt data in transit on the network. By default, clients and DataNodes exchange data using the Hadoop Data Transfer Protocol, which is unencrypted. If sensitive data crosses your network and you want to protect your cluster from attackers and network sniffers, you need to configure encryption for the Hadoop components.

    Since Hadoop 2.6.0, HDFS has also provided transparent end-to-end encryption: data read from and written to configured directories is encrypted and decrypted transparently.
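The UNIX-like HDFS permissions model mentioned above can be illustrated with a small sketch. This is not Hadoop code: the `Inode` class, the `is_allowed` function, and the sample users and groups are hypothetical names chosen for illustration. The sketch assumes the usual owner/group/other evaluation order and leaves out the superuser and ACLs.

```python
from dataclasses import dataclass

# Permission bits, as in the UNIX model.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class Inode:
    """Hypothetical stand-in for a file or directory entry."""
    owner: str
    group: str
    mode: int  # octal mode such as 0o750


def is_allowed(inode, user, groups, wanted):
    """Return True if `user`, a member of `groups`, holds all `wanted` bits."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7   # owner class
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7   # group class
    else:
        bits = inode.mode & 7          # everyone else
    return (bits & wanted) == wanted


# Example: a directory readable by the "risk" group, closed to others.
trades = Inode(owner="alice", group="risk", mode=0o750)
print(is_allowed(trades, "bob", {"risk"}, READ))     # group member reading
print(is_allowed(trades, "carol", {"sales"}, READ))  # outsider reading
```

Only one class applies per request: an owner is judged by the owner bits even if the group bits would grant more.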
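The mapping of Kerberos principals to usernames for users from multiple domains can be sketched as follows. In a real cluster this mapping is configured through the `hadoop.security.auth_to_local` rules rather than written by hand. The function below is only a simplified model of the idea, and `short_name`, the realm names, and the per-realm rules are all assumptions made for illustration.

```python
def short_name(principal, realm_rules, default_realm):
    """Map 'primary[/instance]@REALM' to a local username.

    `realm_rules` maps a foreign realm to a function that turns the
    primary component into a username (a hypothetical simplification).
    """
    name, sep, realm = principal.partition("@")
    if not sep:
        realm = default_realm          # no realm given: assume the default
    primary = name.split("/", 1)[0]    # drop the optional instance part
    if realm == default_realm:
        return primary
    rule = realm_rules.get(realm)
    if rule is None:
        raise ValueError(f"no mapping rule for realm {realm!r}")
    return rule(primary)


# Hypothetical setup: one home realm and one partner realm.
rules = {"PARTNER.EXAMPLE.ORG": lambda primary: "partner_" + primary}
print(short_name("hdfs/nn1.example.com@CORP.EXAMPLE.COM", rules, "CORP.EXAMPLE.COM"))
print(short_name("bob@PARTNER.EXAMPLE.ORG", rules, "CORP.EXAMPLE.COM"))
```

Rejecting principals from unknown realms, rather than guessing a username, keeps an unmapped domain from silently acquiring a local user's permissions.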
