In the previous section, we learned how to configure a software load balancer using the Apache Web Server. In this section, we extend the setup by configuring a cluster of Liferay Portal Server nodes. To set up a cluster of Liferay Portal Server nodes, we need to ensure that all shared resources are either centralized or replicated. The resources that need to be handled for the cluster setup are user sessions, caches, the Media Library store, search indexes, and scheduled jobs.
In this section, we will learn how to configure these resources to work in a clustered environment. We will also learn about the best practices associated with each option.
Session replication is a technique to replicate session information across all the nodes. With the help of session replication, we can ensure automatic recovery after the failure of any node. In our load balancer configuration, we configured session stickiness, which ensures that all requests belonging to the same user session are served by a specific node. Now suppose that node goes down; in this case, the load balancer sends subsequent requests to another node in the cluster. If the new node does not have the session information of that user, it treats the request as a new session, and the user is logged out of the system. With the help of session replication, we can avoid this situation and ensure transparent switching between nodes.
Let's learn how to configure session replication.
Open the server.xml file of liferay-node-01, located in node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/conf, and add the following configuration inside the <Engine> tag:

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="6">
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4" port="45564" frequency="500" dropTime="3000"/>
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto" port="5000" selectorTimeout="100" maxThreads="6"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
         filter=".*.gif;.*.js;.*.jpg;.*.png;.*.htm;.*.html;.*.css;.*.txt;"/>
  <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
  <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>
Open the web.xml file of liferay-node-01, located in node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/ROOT/WEB-INF, and at the bottom of the file, just before the </web-app> tag, add the following content:

<distributable/>
Repeat the same steps for liferay-node-02.

With this configuration, session replication between both the Liferay Portal servers is set up. The Tomcat server provides a simple TCP cluster which connects multiple Tomcat servers using the TCP protocol. In our configuration, we used DeltaManager, which identifies session changes and transfers them to the other nodes in the cluster. We used IP multicast to connect both the Tomcat servers. Once the nodes discover each other, they establish a set of sender and receiver socket channels, and the session replication data is transferred over these channels. We also configured various interceptors to intercept the data transfer. The replication manager checks the session data after every request and transfers any changed session data to the other nodes. For some kinds of requests, the session data is guaranteed not to change; for example, requests for static resources such as images, videos, and so on. It is therefore unnecessary to check the session data after such requests, so we configured a filter for all such resources in the replication valve configuration. The application server does not replicate the sessions of an application unless the application is enabled for session replication, so we enabled session replication for the Liferay Portal application by adding the <distributable/> tag in web.xml.
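The JvmRouteBinderValve and JvmRouteSessionIDBinderListener configured above rely on each Tomcat instance advertising a unique route name to the load balancer. As a minimal sketch, assuming the node names used elsewhere in this chapter, the jvmRoute attribute can be set on the <Engine> tag of each node (liferay-node-02 would use jvmRoute="node-02"):

<!-- server.xml of liferay-node-01; illustrative only, the route names must match
     the worker names defined in the load balancer configuration -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="node-01">
    ...
</Engine>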
Session replication is not a mandatory requirement for cluster configuration, and it consumes a lot of server and network resources. So, unless there is a real need for transparent failover, it is advisable to avoid session replication.
Caching is a very important technique to boost the performance of a system. Liferay Portal, by default, caches resources of the persistence layer and the service layer. By default, Liferay Portal uses the Ehcache framework for caching, and it caches resources in memory and on the filesystem. In a clustered environment, each Liferay Portal node has its own copy of the cache, so whenever the cache is invalidated or updated on any node, the change must be propagated to all the other Liferay Portal nodes. To achieve this, we need to replicate the cache. In this section, we will learn about multiple options to replicate Ehcache across the cluster.
The Ehcache framework provides RMI (Remote Method Invocation) based cache replication across the cluster. It is the default implementation for replication. The RMI-based replication works on the TCP protocol. Cached resources are transferred using the serialization and deserialization mechanism of Java. RMI is a point-to-point protocol and hence, it generates a lot of network traffic between clustered nodes. Each node will connect to other nodes in the cluster and send cache replication messages. Liferay provides Ehcache replication configuration files in the bundle. We can re-use them to set up Ehcache replication using RMI. Let's learn how to configure Ehcache replication using RMI for our cluster.
Add the following properties to the portal-ext.properties file of both the Liferay Portal nodes:

net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
net.sf.ehcache.configurationResourceName.peerProviderProperties=peerDiscovery=automatic,multicastGroupAddress=${multicast.group.address["hibernate"]},multicastGroupPort=${multicast.group.port["hibernate"]},timeToLive=1
ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
ehcache.multi.vm.config.location.peerProviderProperties=peerDiscovery=automatic,multicastGroupAddress=${multicast.group.address["multi-vm"]},multicastGroupPort=${multicast.group.port["multi-vm"]},timeToLive=1
multicast.group.address["hibernate"]=233.0.0.4
multicast.group.port["hibernate"]=23304
multicast.group.address["multi-vm"]=233.0.0.5
multicast.group.port["multi-vm"]=23305
Liferay Portal uses two separate Ehcache configurations: one for the hibernate cache and one for the Liferay service layer cache. Liferay ships with two different sets of configuration files for the hibernate and service layer caches, and by default it uses the non-replicated versions. Using the portal-ext.properties file, we can tell Liferay to use the replicated cache configuration files instead. In the preceding step, we configured the replicated versions of the cache files for both the hibernate and service layer caches using the net.sf.ehcache.configurationResourceName and ehcache.multi.vm.config.location properties. The replicated Ehcache configuration files internally use IP multicast to discover peers and establish RMI connections between the Liferay nodes. We configured the multicast group addresses and ports used for establishing these connections.
Another option to replicate Ehcache is to use JGroups. JGroups is a powerful framework for multicast communication, and the Ehcache framework supports replication using JGroups as well. Similar to the RMI-based Ehcache replication, Liferay also supports JGroups-based replication. Let's learn how to configure JGroups-based Ehcache replication.
Add the following properties to the portal-ext.properties file of both the Liferay Portal nodes:

ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
ehcache.multi.vm.config.location.peerProviderProperties=connect=UDP(mcast_addr=${multicast.group.address["multi-vm"]};mcast_port=${multicast.group.port["multi-vm"]};):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS
ehcache.bootstrap.cache.loader.factory=com.liferay.portal.cache.ehcache.JGroupsBootstrapCacheLoaderFactory
ehcache.cache.event.listener.factory=net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory
net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
net.sf.ehcache.configurationResourceName.peerProviderProperties=connect=UDP(mcast_addr=${multicast.group.address["hibernate"]};mcast_port=${multicast.group.port["hibernate"]};):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS
multicast.group.address["hibernate"]=233.0.0.4
multicast.group.port["hibernate"]=23304
multicast.group.address["multi-vm"]=233.0.0.5
multicast.group.port["multi-vm"]=23305
The Ehcache replication configuration is very similar to the RMI-based replication, except that here we used the UDP protocol to connect the Liferay Portal nodes. With this option, the Liferay Portal nodes again discover each other using IP multicast.
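The long connect string supplied through the peerProviderProperties defines the JGroups protocol stack used for replication. A rough, annotated breakdown of the stack named above (tuning parameters omitted):

# UDP            - IP multicast transport that carries messages between peers
# PING           - discovers the initial set of cluster members
# MERGE2         - re-merges subgroups after a network partition
# FD_SOCK        - failure detection over TCP sockets between neighbours
# VERIFY_SUSPECT - double-checks members suspected of having failed
# pbcast.NAKACK  - reliable delivery with retransmission of lost messages
# UNICAST        - reliable point-to-point delivery
# pbcast.STABLE  - agrees which messages all members have seen so they can be purged
# FRAG           - fragments messages larger than the transport can carry
# pbcast.GMS     - group membership service handling joins, leaves, and view changes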
We learned about the JGroups- and RMI-based Ehcache replication. The Liferay Enterprise version includes another powerful feature called Cluster Link, which provides the Ehcache replication mechanism. Internally, this feature uses JGroups to replicate the cache across the network. Let's go through the steps to configure this feature.
Install the ehcache-cluster-web enterprise plugin on both the Liferay Portal servers.

Then, add the following properties to the portal-ext.properties file of both the nodes:

cluster.link.enabled=true
ehcache.cluster.link.replication.enabled=true
net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
In Chapter 1, Architectural Best Practices, we talked about this option. Unlike the JGroups- or RMI-based Ehcache replication, this option centralizes all Ehcache changes at one place and then distributes changes to all the nodes of the cluster. This in turn reduces unnecessary network transfers.
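Cluster Link opens its own JGroups channels, and their multicast groups can be moved through portal-ext.properties if the defaults clash with other traffic on the network. A sketch using the same multicast.group.* convention as the Ehcache properties earlier in this section; the addresses and ports shown are placeholders, and the exact keys and defaults should be verified against the portal.properties bundled with your Liferay version:

multicast.group.address["cluster-link-control"]=239.255.0.1
multicast.group.port["cluster-link-control"]=23301
multicast.group.address["cluster-link-udp"]=239.255.0.2
multicast.group.port["cluster-link-udp"]=23302
multicast.group.address["cluster-link-mping"]=239.255.0.3
multicast.group.port["cluster-link-mping"]=23303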
We talked about different options to configure Ehcache replication. Let's look at some of the best practices related to Ehcache replication. One of them concerns the underlying transport: the JGroups channel used by Cluster Link can be tuned through the cluster.link.channel.properties.transport configuration property.

Media Library is one of the most important features of Liferay Portal. The Media Library content is divided into two repositories: the metadata of the Media Library content is stored in the Liferay database, while the actual media files are, by default, stored on the filesystem. For a clustered setup, we need to make sure that the media files are stored in a centralized repository; otherwise, each node will have its own copy of the files. Liferay Portal provides various options to store media files in centralized storage. Let's learn how to configure the Media Library for the clustered environment and then talk about the related best practices.
In Chapter 1, Architectural Best Practices, we talked about the Advanced File System store. It is a pluggable Media Library repository store that stores files on the filesystem but divides them across multiple directories, which improves the efficiency of locating files, especially when they are stored on a network filesystem. To use this option in a clustered environment, we need a Storage Area Network (SAN) appliance or a Network File System (NFS), and we need to mount the SAN storage or the NFS directory on both the Liferay Portal nodes. Let's learn how to configure the Media Library with the Advanced File System store.
Add the following properties to the portal-ext.properties file of both the Liferay Portal nodes:

dl.store.impl=com.liferay.portlet.documentlibrary.store.AdvancedFileSystemStore
dl.store.file.system.root.dir=<SAN Directory>
We have configured the Media Library to use AdvancedFileSystemStore and provided a networked location where the Portal should store the Media Library content. Both the Portal nodes will store content in the same filesystem location. To use this option, we need to make sure the SAN appliance supports file locking, as multiple nodes will access the filesystem at the same time. As this option requires specialized hardware such as a SAN or NFS, it adds additional cost to the solution.
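For the NFS variant, the shared export has to be mounted at the same path on both nodes before pointing dl.store.file.system.root.dir at it. A minimal sketch of an /etc/fstab entry, assuming a hypothetical NFS server named storage-server exporting /exports/liferay-media and a local mount point of /mnt/liferay-media:

# /etc/fstab on both liferay-node-01 and liferay-node-02 (host and paths are placeholders)
storage-server:/exports/liferay-media   /mnt/liferay-media   nfs   defaults   0 0

With this mount in place, dl.store.file.system.root.dir would be set to /mnt/liferay-media on both nodes.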
Liferay Portal provides an option to store the Media Library content in the database using the JCR store. Liferay Portal uses Apache Jackrabbit as its JCR implementation. Jackrabbit provides both filesystem-based and database-based storage for the content, and by default the Jackrabbit configuration uses filesystem-based storage. Another option is to configure Jackrabbit to use the database for the Media Library content. Let's learn how to configure the Media Library using the JCR store.
Open the portal-ext.properties file of both the nodes and add the following configuration:

dl.store.impl=com.liferay.portlet.documentlibrary.store.JCRStore
Go to the node-01/liferay-portal-6.1.1-ce-ga2/data/jackrabbit directory, open the repository.xml file, and make the following changes.

Remove the following configuration:

<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
  <param name="path" value="${rep.home}/repository" />
</FileSystem>

In its place, add the following configuration. Make sure you provide the correct IP, username, and password of the MySQL database:

<FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
  <param name="driver" value="com.mysql.jdbc.Driver"/>
  <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal"/>
  <param name="schema" value="mysql"/>
  <param name="user" value="{Database User Id}"/>
  <param name="password" value="{Database Password}"/>
  <param name="schemaObjectPrefix" value="J_R_FS_"/>
</FileSystem>

Remove the following configuration from within the <workspace> tag:

<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
  <param name="path" value="${wsp.home}" />
</FileSystem>
<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleFsPersistenceManager" />

Add the following configuration within the <workspace> tag. Make sure you provide the correct IP, username, and password of the MySQL database:

<PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager">
  <param name="driver" value="com.mysql.jdbc.Driver" />
  <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
  <param name="user" value="{Database User Id}"/>
  <param name="password" value="{Database Password}"/>
  <param name="schema" value="mysql" />
  <param name="schemaObjectPrefix" value="J_PM_${wsp.name}_" />
  <param name="externalBLOBs" value="false" />
</PersistenceManager>
<FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
  <param name="driver" value="com.mysql.jdbc.Driver"/>
  <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
  <param name="user" value="{Database User Id}"/>
  <param name="password" value="{Database Password}"/>
  <param name="schema" value="mysql"/>
  <param name="schemaObjectPrefix" value="J_FS_${wsp.name}_"/>
</FileSystem>

Remove the following configuration from within the <versioning> tag:

<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
  <param name="path" value="${rep.home}/version" />
</FileSystem>
<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleFsPersistenceManager"/>

Add the following configuration within the <Versioning> tag. Make sure you provide the correct IP, username, and password of the MySQL database:

<FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
  <param name="driver" value="com.mysql.jdbc.Driver"/>
  <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
  <param name="user" value="{Database User Id}"/>
  <param name="password" value="{Database Password}"/>
  <param name="schema" value="mysql"/>
  <param name="schemaObjectPrefix" value="J_V_FS_"/>
</FileSystem>
<PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager">
  <param name="driver" value="com.mysql.jdbc.Driver" />
  <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
  <param name="user" value="{Database User Id}"/>
  <param name="password" value="{Database Password}"/>
  <param name="schema" value="mysql" />
  <param name="schemaObjectPrefix" value="J_V_PM_" />
  <param name="externalBLOBs" value="false" />
</PersistenceManager>

Finally, add the following <Cluster> configuration. Make sure you provide the correct IP, username, and password of the MySQL database:

<Cluster id="node_1" syncDelay="5">
  <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
    <param name="revision" value="${rep.home}/revision"/>
    <param name="driver" value="com.mysql.jdbc.Driver"/>
    <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal"/>
    <param name="user" value="{Database User Id}"/>
    <param name="password" value="{Database Password}"/>
    <param name="schema" value="mysql"/>
    <param name="schemaObjectPrefix" value="J_C_"/>
  </Journal>
</Cluster>
Repeat the same changes in the repository.xml file of liferay-node-02, but change the id attribute of the Cluster tag to node_2.

In the preceding configuration, we first enabled the JCR store for the Media Library. With this change, Liferay internally uses Jackrabbit to store the Media Library content. The Jackrabbit configuration is stored in the repository.xml file, and by default it stores the Media Library content in the data folder. We changed the repository.xml file to store the content in the same lportal database; we could also configure it to store the Media Library content in a separate database. Jackrabbit internally divides the Media Library data into several types of data in the database: the global repository filesystem, the workspace content (persistence manager and filesystem), the versioning data (persistence manager and filesystem), and the cluster journal. We configured the repository.xml file such that all of this data is stored in the database.
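If the Jackrabbit content should not live in the Liferay database, the same blocks can point to a dedicated schema instead. A small sketch of just the connection parameters, assuming a hypothetical database named jackrabbit on the same MySQL server (every url parameter in the blocks above would change in the same way):

<param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/jackrabbit"/>
<param name="user" value="{Database User Id}"/>
<param name="password" value="{Database Password}"/>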
Liferay Portal 6.1 introduced a new type of repository store that persists the Media Library content in the Liferay database. It is very simple to configure and provides better performance than the JCR store backed by the database. Let's learn how to configure the Media Library to use DBStore.

Open the portal-ext.properties file of both the nodes and add the following configuration:

dl.store.impl=com.liferay.portlet.documentlibrary.store.DBStore
We talked about two options to centralize the Media Library content storage. In Chapter 1, Architectural Best Practices, we briefly talked about other options too. Let's talk about some of the best practices related to the Media Library. One of them concerns the JCR store: if it is used, it is advisable to store the Jackrabbit content in a separate database schema by changing the connection details in the Jackrabbit configuration file (repository.xml).

Liferay Portal uses Apache Lucene as its search engine. Apache Lucene creates search indexes to provide the search functionality and, by default, stores these indexes on the filesystem. To make sure the search functionality works properly in a clustered environment, we need to synchronize the search indexes of all the Liferay Portal nodes. There are multiple options to achieve this. Let's learn how to configure these options and then talk about the best practices associated with them.
Liferay's Lucene configuration provides a way to configure the index storage directory through the portal-ext.properties file. In order to use this option, we will need a specialized Storage Area Network (SAN) appliance with file locking capabilities. Let's learn how to configure Lucene to store index files on the SAN appliance.
Add the following property to the portal-ext.properties file of both the Liferay Portal nodes:

lucene.dir=<SAN based mapped directory>
We have just added a property to the portal-ext.properties file that specifies the location of the search indexes. Both the Liferay Portal nodes point to the same network storage directory and hence refer to the same copy of the search indexes. As the index storage location has changed, we need to rebuild the search indexes for the existing data. This is the easiest option to centralize the search indexes.
We talked about the Cluster Link feature for Ehcache replication. Cluster Link is a very powerful feature, and it can be used for Lucene index replication as well. Using Cluster Link, Liferay Portal sends index changes to all the other Liferay Portal nodes in the group. Internally, Cluster Link uses JGroups to send the index data across to the other nodes. Let's learn how to configure Cluster Link to replicate the search indexes.
Add the following properties to the portal-ext.properties file of both the nodes:

cluster.link.enabled=true
lucene.replicate.write=true
We simply enabled Cluster Link through portal-ext.properties. We also enabled the Lucene property that generates a replication event through Cluster Link for every search index change. Cluster Link then distributes the event to all the nodes in the cluster. With this option, each node keeps its own copy of the search indexes.
Apache Solr is a powerful open source search engine. Liferay supports Apache Solr integration, so we can replace the default Lucene search engine with Solr. Unlike Lucene, Solr runs as a separate application. In a clustered environment, the Liferay Portal nodes connect to a centralized Solr server to index and search the data. Let's learn how to configure Liferay Portal with Solr.
1. On the server designated for the search engine, create a new directory called solr, in which we will install Solr.
2. Extract the apache-tomcat-7.0.34.zip file in the solr directory.
3. Extract the apache-solr-1.4.0.zip file to a temporary directory. From the extracted directory, copy the content of the apache-solr-1.4.0/example/solr directory to the solr directory created in step 1.
4. Locate the Solr WAR file in the apache-solr-1.4.0/dist directory. Rename the WAR file to solr.war and copy it to the solr/apache-tomcat-7.0.34/webapps directory.
5. In the catalina.sh file, add the JVM argument -Dsolr.solr.home=<fully qualified path of solr directory created in step 1>.
6. Start the Solr Tomcat server and verify the installation by accessing the http://localhost:8080/solr/admin URL.
7. Download the solr-web plugin from the Liferay Marketplace; the downloaded plugin file will have a .lpkg extension.
8. Copy the downloaded plugin to the deploy directory of both the nodes, under the liferay-portal-6.1.1-ce-ga2 directory.
9. Start both the Liferay Portal nodes to deploy the solr-web plugin.
10. Once the solr-web plugin is deployed successfully, stop both the nodes again.
11. Open node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/solr-web/WEB-INF/classes/META-INF/solr-spring.xml and change the Solr server URL as follows:

<bean id="com.liferay.portal.search.solr.server.BasicAuthSolrServer"
    class="com.liferay.portal.search.solr.server.BasicAuthSolrServer">
    <constructor-arg type="java.lang.String" value="http://localhost:8080/solr" />
</bean>

12. Repeat the same change for liferay-node-02.
13. Replace the Solr server's solr/conf/schema.xml file with the node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/solr-web/WEB-INF/conf/schema.xml file from liferay-node-01.

We just configured Solr 1.4 as a separate application on the Tomcat server. We then deployed the solr-web plugin on both the nodes. The solr-web plugin connects to the Solr server; we configured the URL of our Solr server by changing the Spring configuration file. The Solr server uses a predefined schema for its indexes, and Liferay Portal has its own schema, which is supplied with the solr-web plugin. We therefore replaced the Solr server's schema with the one provided with the solr-web plugin. After the preceding setup, when we create any data, such as a user or a blog entry, the indexes of the related data will be created in the Solr server.
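In a real deployment, the Solr server usually runs on a separate host rather than on one of the Liferay nodes, so the value in solr-spring.xml would point to that host instead of localhost. A hypothetical example, assuming a dedicated machine named solr-server:

<!-- solr-spring.xml on both Liferay nodes; solr-server is a placeholder host name -->
<bean id="com.liferay.portal.search.solr.server.BasicAuthSolrServer"
    class="com.liferay.portal.search.solr.server.BasicAuthSolrServer">
    <constructor-arg type="java.lang.String" value="http://solr-server:8080/solr" />
</bean>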
We learned about three options available to configure the search engine to work properly in a clustered environment. Let's learn some of the best practices associated with them.
Liferay Portal includes a built-in scheduler engine. Many features in Liferay Portal use the scheduler; for example, the expiration of web content, the LDAP import functionality, and so on. Liferay also supports setting up a scheduler for custom portlets. Internally, Liferay Portal uses the Quartz scheduler, a very popular open source scheduler engine. The Quartz scheduler stores data related to scheduled jobs in the Liferay database. Hence, in a clustered environment, it is possible that multiple nodes start the same job at the same time, which can cause serious problems. To prevent this situation, we need to configure Quartz for the clustered environment.
Let's learn how to configure the Quartz scheduler to run in the clustered environment.
Add the following property to the portal-ext.properties file of both the Liferay Portal nodes:

org.quartz.jobStore.isClustered=true
From the lportal database, drop all the tables whose names start with QUARTZ_. This step is required only if the Liferay tables have already been created.

We just added a property to let the Quartz scheduler know that we are running multiple instances of the Quartz scheduler connected to a single database. By enabling this property, the Quartz scheduler will make sure that each job is executed only once.
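Quartz clustering also requires every scheduler instance to register with a unique instance ID in the shared tables. A minimal sketch of the related properties; Liferay's bundled portal.properties normally configures the instance ID generation already, so verify the defaults of your version before overriding anything:

# Let Quartz generate a unique instance ID for each node automatically
org.quartz.scheduler.instanceId=AUTO
# Run the job store in clustered mode (same property as in the step above)
org.quartz.jobStore.isClustered=true

If the defaults already provide unique instance IDs, the single isClustered property added above is all that is needed.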