Multicluster integration with ESS
This chapter describes the implementation and recommended practices for multicluster integration with ESS and contains the following sections:
4.1 Introduction
Multicluster ESS setups allow connections to different clusters. You have these options:
Allow clusters to access one or more file systems belonging to a different cluster.
Mount file systems that belong to other clusters, provided that the correct authorization is set up.
A multicluster setup is useful for sharing data across clusters.
The use cases for sharing data across multiple ESS clusters include these scenarios:
Separating the data ingest cluster from the data analytics cluster.
Sharing data across different departments for collaboration.
Isolating the storage cluster from the application clients.
It is possible to share data across multiple clusters within a physical location or across locations. Clusters are most often attached by using a LAN, but they might also include a SAN.
4.2 Planning
When you plan for a multicluster setup, it is important to consider the use case for setting up multiple clusters. In other words, consider the reason for sharing the data. This consideration helps you to determine the type of access that you need to set up across clusters for data sharing. Determine what the different clusters are used for, and why the clusters need to be separated. The separation might be into an ESS storage cluster, a protocol cluster, or a compute client cluster.
ESS allows users to have shared access to files in either the cluster where the file system was created or other ESS clusters. These configurations are described in the following figures. It is possible to export specific file systems to specific remote clusters, to export them as read-only, and to apply root squashing.
Figure 4-1 on page 43 shows a set of application client cluster nodes that are mounting file systems from remote ESS storage clusters.
Figure 4-1 Application client cluster nodes that are mounting file systems from remote ESS storage clusters
Figure 4-2 depicts the protocol cluster with an alternative authentication scheme.
Figure 4-2 Protocol cluster with an alternative authentication scheme
Finally, Figure 4-3 on page 44 shows the separate protocol and client clusters that are mounting different file systems.
Figure 4-3 Separate protocol and client clusters that are mounting different file systems
While you plan for a multicluster installation, it is important to note the following points:
Each cluster is installed, managed, and upgraded independently. So, follow these guidelines:
A file system is administered only by the cluster where the file system was created. Other clusters might be allowed to mount the file system. However, their administrators cannot add or delete disks, change characteristics of the file system, enable or disable quotas, run the mmfsck command, and so on. The only commands that other clusters can issue are list-type commands, such as mmlsfs, mmlsdisk, mmlsmount, and mmdf.
Because each cluster is managed independently, there is no automatic coordination and propagation of changes between clusters, unlike the automatic coordination that happens between the nodes within a cluster. This has the following implications:
 – If the administrator of cluster1 (the owner of file system gpfs1) decides to delete it or rename it, the information for gpfs1 in cluster2 becomes obsolete.
 – An attempt to mount gpfs1 from cluster2 fails.
It is assumed that when such changes take place, the two administrators inform each other. Then, the administrator of cluster2 can use the update or delete options of the mmremotefs command to make the appropriate changes.
Use the update option of the mmremotecluster command to reflect changes to these values (see the example at the end of this list):
 – names of the contact nodes
 – name of the cluster
 – public key files
Every node in a cluster needs to have a TCP/IP connection to every other node in the cluster. Similarly, every node in the cluster that requires access to another cluster's file system must be able to open a TCP/IP connection to every node in the other cluster.
Each cluster requires its own GUI. The GUI might be installed onto the CES nodes, but performance must be considered.
Each cluster has its own REST API.
Each cluster has its own health monitoring. This means that error events that are raised in one cluster are not visible in the other cluster, and vice versa.
Availability of certain performance metrics depends on the role of the cluster. That is, NFS metrics are available on protocol clusters only.
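Returning to the earlier example of cluster1 and cluster2, the following sketch shows how the administrator of cluster2 might update or remove the remote definitions after cluster1 renames or deletes file system gpfs1. The new file system name, node names, and key file path are illustrative only; adjust them to your environment.
# On cluster2: update the local definition if cluster1 renamed file system gpfs1
mmremotefs update gpfs1 -f gpfs1new

# On cluster2: remove the stale definition if cluster1 deleted the file system
mmremotefs delete gpfs1

# On cluster2: reflect new contact nodes or a new public key for cluster1
mmremotecluster update cluster1.example.com -n node1.example.com,node2.example.com \
  -k /tmp/cluster1_id_rsa.pub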
4.3 Guidelines
The following practices are recommended when you implement multicluster integration with ESS:
Unless the cluster is small, it is recommended to separate the storage cluster from the compute cluster.
It is recommended that the storage cluster include only storage-related nodes. The following types of nodes belong in their own cluster:
 – Protocol nodes (such as NFS, SMB, Object)
 – Backup nodes (such as IBM Spectrum Protect nodes)
 – Archive nodes (IBM Linear Tape File System™ Enterprise Edition (LTFS-EE) nodes)
 – Active File Management (AFM) gateway nodes
The storage cluster can be managed by the storage administrator.
If a protocol cluster is installed into an environment with an existing storage cluster, the ESS version that is used should be compatible with the protocol cluster. The installation can be performed either manually or by using the installation toolkit.
Different clusters might be administered by different organizations, so each cluster might have its own set of user accounts. For consistency of ownership and access control, a uniform user-identity namespace is preferred: users must be known to the other cluster. You typically achieve this goal as follows:
a. Create a user account in the other cluster.
b. Give this account the same set of user and group IDs that the account has in the cluster where the file system was created.
However, this approach might pose problems in some situations.
GPFS helps to solve this problem by optionally performing user ID and group ID remapping internally, by using user-supplied helper applications. For a detailed description of the GPFS user ID remapping convention, see the IBM white paper entitled UID Mapping for GPFS in a Multi-cluster Environment in the IBM Knowledge Center.
Access from a remote cluster by a root user presents a special case. It is often desirable to disallow root access from a remote cluster, while you allow regular user access. Such a restriction is commonly known as root squash. A root squash option is available when you make a file system available for mounting by other clusters by using the mmauth command. This option is similar to the NFS root squash option. When enabled, it causes GPFS to squash superuser authority on accesses to the affected file system on nodes that are located in remote clusters. The example at the end of this list shows how to grant read-only, root-squashed access to a remote cluster.
The configuration limits need to be treated the same for nodes in all the clusters. For example, the maximum socket connections value (sysctl net.core.somaxconn) should be the same.
The cluster that owns a file system might have a maxblocksize configuration parameter that differs from the maxblocksize configuration parameter of the cluster that wants to mount the file system. In this case, the file system mount request might fail with a message to that effect. Use the mmlsconfig command to check the maxblocksize configuration parameter on both clusters, and correct any discrepancies with the mmchconfig command.
Use the show option of the mmremotecluster and mmremotefs commands to display the current information about remote clusters and file systems, as shown in the example at the end of this list.
Cross-protocol change notifications do not work on remotely mounted file systems. For example, if an NFS client changes a file, the system does not issue a file change notification to an SMB client that asked for notifications.
Each protocol cluster must use a dedicated file system. Sharing a file system between multiple protocol clusters is not allowed.
The storage cluster owns all of the exported ESS file systems. This ownership includes at least two file systems per protocol cluster (one CES shared root + one data file system).
The protocol clusters cannot own any ESS file systems. Only remote mounts from the storage cluster are allowed.
Any file system can be remotely mounted by exactly one protocol cluster. Sharing a file system between multiple protocol clusters might cause data inconsistencies.
The primary use case for multi-cluster protocol is to allow multiple authentication configurations. Do not use the setup for these purposes:
 – extending the scalability of Cluster Export Services (CES)
 – working around defined limitations (for example, number of SMB connections)
This setup provides some level of isolation between the clusters, but there is no strict isolation of administrative operations. Also, there is no guarantee that administrators on one cluster cannot see data from another cluster. Strict isolation is guaranteed through NFS or SMB access only.
Storage and protocol clusters are expected to be at the same site or location. High network latency between them can cause problems.
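The following sketch illustrates several of these guidelines. It assumes a storage cluster that owns file system gpfs1 and a remote cluster named cluster2.example.com; all names are examples only.
# On the storage cluster: export gpfs1 to the remote cluster as read-only,
# squashing root to UID 99 and GID 99
mmauth grant cluster2.example.com -f gpfs1 -a ro -r 99:99

# On both clusters: verify that maxblocksize matches
mmlsconfig maxblocksize

# Correct a mismatch on the mounting cluster if necessary
# (this parameter typically cannot be changed while file systems are in use)
mmchconfig maxblocksize=16M

# On the accessing cluster: display the current remote cluster and file system definitions
mmremotecluster show all
mmremotefs show all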
4.4 Networking
Each node in a cluster needs to have a TCP/IP connection to every other node in the cluster. Similarly, every node in the cluster that requires access to another cluster's file system must be able to open a TCP/IP connection to every node in the other cluster.
Nodes in two separate remote clusters that mount the same file system are not required to be able to open a TCP/IP connection to each other. Here is an example:
A node in clusterA mounts a file system from clusterB.
A node in clusterC desires to mount the same file system.
Nodes in clusterA and clusterC do not have to communicate with each other.
4.4.1 Node roles
When a remote node mounts a local file system, it joins the local cluster. So, the cluster manager manages its leases, the token servers on the local cluster manage its tokens, and so on. You must take this into account when you size the cluster and plan the node designations.
You might have a protocol cluster that is separate from the ESS storage cluster. In this case, it is recommended that you have the quorum, cluster manager, and file system manager functions on the remote protocol cluster.
Be aware that the availability of certain performance metrics depends on the role of the cluster. For example, NFS metrics are available on protocol clusters only.
Due to the separation of duties (storage clusters own the file systems and protocol clusters own the NFS/SMB exports), certain management tasks must be done in the corresponding cluster:
File system-related operations like creating file systems, filesets, or snapshots must be done in the storage cluster.
Export-related operations like creating exports, managing CES IP addresses, and managing authentication must be done in the protocol cluster.
The storage cluster is unaware of the authentication setup and UID mapping. For this reason, all actions that require a user name or a group name must be done in the corresponding protocol cluster. For example, you must generate quota reports and manage access control lists (ACLs) in that protocol cluster.
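As an illustration of this split, the following sketch shows typical commands on each side. The file system, fileset, path, and ACL file names are examples only.
# On the storage cluster (owns file system gpfs1): create a fileset and a snapshot
mmcrfileset gpfs1 projects --inode-space new
mmlinkfileset gpfs1 projects -J /gpfs/gpfs1/projects
mmcrsnapshot gpfs1 snap_daily

# On the protocol cluster (knows the users and ID mapping): manage ACLs on the
# remotely mounted path
mmgetacl /gpfs/remote_gpfs1/projects
mmputacl -i /tmp/projects.acl /gpfs/remote_gpfs1/projects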
4.5 Initial setup
The procedure to set up remote file system access involves the generation and exchange of authorization keys between the two clusters. In addition, the administrator of the GPFS cluster that owns the file system must authorize the remote clusters that are to access it. In turn, the administrator of the GPFS cluster that seeks access to a remote file system must define to GPFS the remote cluster and file system whose access is desired.
The package gpfs.gskit must be installed on all the nodes of the owning cluster and the accessing cluster. For more information, see the installation chapter for your operating system, such as Installing IBM Spectrum Scale on Linux nodes and deploying protocols.
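At a high level, the exchange looks like the following sketch, in which the storage cluster (storagecluster.example.com) owns file system gpfs1 and the protocol cluster mounts it remotely. All cluster names, contact nodes, mount points, and key file locations are illustrative; see Mounting a remote GPFS file system in IBM Knowledge Center for the full procedure.
# On both clusters: generate a key pair and enable authentication
# (the GPFS daemon might need to be down the first time security is enabled)
mmauth genkey new
mmauth update . -l AUTHONLY

# Exchange the public key files (by default /var/mmfs/ssl/id_rsa.pub) out of band.

# On the owning (storage) cluster: authorize the accessing cluster and grant access
mmauth add protocolcluster.example.com -k /tmp/protocolcluster_id_rsa.pub
mmauth grant protocolcluster.example.com -f gpfs1

# On the accessing (protocol) cluster: define the remote cluster and file system, then mount
mmremotecluster add storagecluster.example.com -n ems1.example.com,essio1.example.com \
  -k /tmp/storagecluster_id_rsa.pub
mmremotefs add remote_gpfs1 -f gpfs1 -C storagecluster.example.com -T /gpfs/remote_gpfs1
mmmount remote_gpfs1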
4.5.1 Setting up protocols over remote cluster mounts
Figure 4-3 on page 44 shows the separation of tasks that are performed by each cluster.
The storage cluster owns the file systems and the storage. The protocol clusters contain the protocol nodes that provide access to the remotely mounted file systems, through NFS or SMB.
Here, the storage cluster owns a file system and the protocol cluster remotely mounts the file system. The protocol nodes (CES nodes) in the protocol cluster export the file system via SMB and NFS.
Although only one set of protocol nodes can be defined per cluster, you can use multiple independent protocol clusters that remotely mount file systems. Protocol clusters can share access to a storage cluster but not to a file system. Each protocol cluster requires a dedicated file system. Each protocol cluster can have a different authentication configuration, thus allowing different authentication domains while you keep the data at a central location. Another benefit is the ability to access existing ESS-based file systems through NFS or SMB without adding nodes to the ESS cluster.
Configuring protocols on a separate cluster
The process for configuring protocols on a separate cluster is in many respects the same as for a single cluster. However, there are a few differences, mainly in the order of the procedure.
This procedure assumes an environment in which the server, network, storage, and operating systems are installed and ready for ESS use. For more information, see the Installing section of the IBM Spectrum Scale documentation in IBM Knowledge Center.
Perform the following steps:
1. Install ESS on all nodes that are in the storage and protocol clusters. If you install a protocol cluster into an environment with an existing ESS cluster, the ESS version that is used should be compatible with the protocol cluster. The installation can be performed either manually or by using the installation toolkit. Do not create clusters, file systems, or Cluster Export Services yet.
2. Create the storage and protocol clusters. Proceed with cluster creation of the storage cluster and one or more protocol clusters. Ensure that the configuration parameter maxBlockSize is set to the same value on all clusters.
3. Create file systems on the storage cluster, taking the following points into consideration:
 – CES shared root file system: Each protocol cluster requires its own CES shared root file system. Having a shared root file system that is different from the file system that serves data eases the management of CES.
 – Data file systems: At least one file system is required for each protocol cluster that is configured for Cluster Export Services. A data file system can be exported only from a single protocol cluster.
4. Before you install and configure Cluster Export Services, consider the following points:
 – Authentication: Separate authentication schemes are supported for each CES cluster.
 – ID mapping: The ID mapping of users that authenticate to each CES cluster. Unique ID mapping across clusters is recommended, but not mandatory.
You must judiciously determine the ID mapping requirements and prevent possible interference or security issues.
 – GUI: GUI support for remote clusters is limited. Each cluster should have its own GUI. The GUI may be installed onto CES nodes but performance must be considered.
 – Object: Object is not supported in multi-cluster configurations.
5. Configure clusters for remote mount. For more information, see Mounting a remote GPFS file system.
6. Install and configure Cluster Export Services by using the installation toolkit or manually. For more information, see Installing IBM Spectrum Scale on Linux nodes and deploying protocols in IBM Knowledge Center.
7. Use the remotely mounted CES shared root file system. After SMB and/or NFS is enabled, new exports can be created on the remotely mounted data file system.
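For example, assuming that the data file system was remotely mounted at /gpfs/remote_data1 and that the NFS and SMB services are enabled, exports might be created on the protocol cluster as follows (the client specification and paths are illustrative only):
# Create an NFS export for a subnet, read/write, with root squashing
mmnfs export add /gpfs/remote_data1/projects \
  --client "192.0.2.0/24(Access_Type=RW,Squash=root_squash)"

# Create an SMB share on the same remotely mounted file system
mmsmb export add projects /gpfs/remote_data1/projects

# Verify the exports
mmnfs export list
mmsmb export list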
4.6 Ongoing maintenance (upgrades)
Before you schedule an upgrade of a cluster that contains ESS and protocol nodes, planning discussions must take place to determine the current cluster configuration and to understand which functions might face an outage.
Each cluster is installed, managed, and upgraded independently. There is no special process to upgrade clusters in a multi-cluster environment. Upgrades are performed on a cluster-boundary basis.
When you choose an IBM ESS version, the release should comply with release-level limitations.
After all clusters in the environment are upgraded, the release and the file system version should be changed. The release version might be changed concurrently. However, changing the file system version requires the file system to be unmounted. To view the differences between file system versions, see Listing file system attributes.
To change the IBM ESS release, issue the following command on each cluster:
mmchconfig release=LATEST
 
Note: Nodes on the remote cluster that run an older version of ESS can no longer mount the file system. The command fails if any nodes that run an older version have the file system mounted at the time that the command is issued.
To change the file system version, issue the following command for each file system on the storage cluster:
mmchfs <fs> -V full
If your requirements call for it, issue the following command:
mmchfs <fs> -V compat
This command enables only backward-compatible format changes.
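To verify the result, you can check the effective release level and the file system format version. The file system name gpfs1 is an example:
# Display the cluster-wide minimum release level after mmchconfig release=LATEST
mmlsconfig minReleaseLevel

# Display the file system format version
mmlsfs gpfs1 -V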
 