Liferay Portal cluster configuration

In the previous section, we learned about the software load balancer configuration using the Apache Web Server. In this section, we extend the setup by configuring the cluster between Liferay Portal Server nodes. To set up a cluster of Liferay Portal Server nodes, we need to ensure all shared resources are either centralized or replicated. The following list highlights the resources that need to be handled for cluster setup:

  • Liferay Portal web sessions: For every user conversation, a web session object is created and managed by the Liferay Portal application server. A web session object stores important data related to a specific user conversation. In a clustered environment, it is possible that subsequent user requests are served by different Liferay Portal nodes. So, it is very important to make sure that the same session object is available on all clustered nodes.
  • Cache replication: Liferay Portal, by default, uses the Ehcache caching framework for caching persistence and service layer resources. It is very important to invalidate or replicate caches across the cluster to avoid stale cache issues.
  • Media Library: Media Library is one of the key features of Liferay. It is used to store documents, videos, images, and so on. Liferay stores the metadata of the Media Library content in the Liferay database, but the actual resources are stored using various repository stores. So, we need to ensure that the Media Library content is stored at a centralized place.
  • Search indexes: Liferay provides a powerful built-in search feature. The default installation uses the Lucene search engine to provide search capability. The Lucene search engine stores the index on the filesystem. It is very important to ensure that search indexes are either centralized or replicated across all the nodes.
  • Quartz jobs: There are various features in Liferay that internally use scheduled jobs. In a clustered environment, it is very important to ensure that all the nodes are aware of the running scheduler jobs.

In this section, we will learn how to configure these resources to work in a clustered environment. We will also learn about the best practices associated with each option.

Session replication configuration

Session replication is a technique to replicate session information across all the nodes. With session replication, we can ensure automatic recovery after the failure of any node. In our load balancer configuration, we configured session stickiness, which ensures that all requests related to the same user session are served by a specific node. Now suppose that node goes down; the load balancer then sends subsequent requests to another node in the cluster. If the new node does not have the session information of that user, it treats the request as a new session and the user is logged out of the system. With the help of session replication, we can avoid this situation and ensure transparent switching between nodes.

Let's learn how to configure session replication.

  1. Stop the Liferay Portal nodes if they are running.
  2. Edit the server.xml file of liferay-node-01 located in node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/conf, and add the following configuration inside the <Engine> tag:
    <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
             channelSendOptions="6">
      <Manager className="org.apache.catalina.ha.session.DeltaManager"
               expireSessionsOnShutdown="false"
               notifyListenersOnReplication="true"/>
      <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <Membership className="org.apache.catalina.tribes.membership.McastService"
                    address="228.0.0.4"
                    port="45564"
                    frequency="500"
                    dropTime="3000"/>
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                  address="auto"
                  port="5000"
                  selectorTimeout="100"
                  maxThreads="6"/>
        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
          <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
        </Sender>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
      </Channel>
      <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
             filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
      <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
      <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
      <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
    </Cluster>
  3. Edit the web.xml file of liferay-node-01 located in node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/ROOT/WEB-INF, and at the bottom of the file, before the </web-app> tag, add the following content:
    <distributable/>
  4. Now repeat steps 2 and 3 on liferay-node-02.
  5. Restart both the Liferay Portal nodes.

With this configuration, session replication between both the Liferay Portal servers is set up. The Tomcat server provides a simple TCP cluster implementation which connects multiple Tomcat servers over the TCP protocol. In our configuration, we used DeltaManager, which identifies session changes and transfers them to the other nodes in the cluster. We used IP multicast to connect both the Tomcat servers. Once the nodes discover each other, they establish a set of sender and receiver socket channels, and the session replication data is transferred over these channels. We also configured various interceptors to intercept the data transfer. The replication manager checks the session data after every request and transfers the changed session data to the other nodes accordingly. For some kinds of requests, such as requests for static resources like images or stylesheets, the session data is certain not to change, so it is unnecessary to check the session data after such requests. We configured a filter for all such resources in the replication valve configuration. The application server does not replicate the sessions of an application unless that application is enabled for session replication, so we enabled session replication for the Liferay Portal application by adding the <distributable/> tag in web.xml.
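If session stickiness is configured on the load balancer, as in the previous section, it is also worth checking that the <Engine> element of each node carries a jvmRoute attribute, because the JvmRouteBinderValve relies on it when moving a sticky session to another node after a failure. The following is a minimal sketch for the first node, assuming its load balancer worker is named liferay-node-01 (your worker names may differ); add the attribute to the existing <Engine> element rather than creating a new one:

    <!-- server.xml of liferay-node-01; the jvmRoute value must match the
         worker name used for this node in the load balancer configuration. -->
    <Engine name="Catalina" defaultHost="localhost" jvmRoute="liferay-node-01">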

Session replication is not a mandatory requirement for cluster configuration. It consumes a lot of server and network resources, so if there is no real need for transparent failover, it is advisable to avoid session replication.

Cache replication

Caching is a very important technique to boost the performance of the system. Liferay Portal, by default, caches resources of the persistence layer and the service layer. It uses the Ehcache framework for this, caching resources in memory and on the filesystem. In a clustered environment, each Liferay Portal node has its own copy of the cache, so it is very important to invalidate or replicate the cache on all the Liferay Portal nodes whenever the cache is invalidated or updated on any one of them. To achieve this, we need to replicate the cache. In this section, we will learn about multiple options for replicating Ehcache across the cluster.

Ehcache replication using RMI

The Ehcache framework provides RMI (Remote Method Invocation)-based cache replication across the cluster, which is the default replication implementation. RMI-based replication works over the TCP protocol, and cached resources are transferred using Java's serialization and deserialization mechanism. RMI is a point-to-point protocol and hence generates a lot of network traffic between clustered nodes: each node connects to every other node in the cluster and sends cache replication messages. Liferay provides Ehcache replication configuration files in the bundle, which we can reuse to set up Ehcache replication using RMI. Let's learn how to configure Ehcache replication using RMI for our cluster.

  1. Stop both the Liferay Portal nodes if they are running.
  2. Add the following properties to the portal-ext.properties file of both the Liferay Portal nodes:
    net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
    net.sf.ehcache.configurationResourceName.peerProviderProperties=peerDiscovery=automatic,multicastGroupAddress=${multicast.group.address["hibernate"]},multicastGroupPort=${multicast.group.port["hibernate"]},timeToLive=1
    ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
    ehcache.multi.vm.config.location.peerProviderProperties=peerDiscovery=automatic,multicastGroupAddress=${multicast.group.address["multi-vm"]},multicastGroupPort=${multicast.group.port["multi-vm"]},timeToLive=1
    multicast.group.address["hibernate"]=233.0.0.4
    multicast.group.port["hibernate"]=23304
    multicast.group.address["multi-vm"]=233.0.0.5
    multicast.group.port["multi-vm"]=23305
  3. Now restart both the Liferay Portal nodes.

Liferay Portal uses two separate Ehcache configurations: one for the Hibernate cache and one for the Liferay service layer cache. Liferay ships with a non-replicated and a replicated configuration file for each of these caches, and by default it uses the non-replicated versions. Through the portal-ext.properties file, we can tell Liferay to use the replicated configuration files instead. In the preceding steps, we did this for both the Hibernate and the service layer cache using the net.sf.ehcache.configurationResourceName and ehcache.multi.vm.config.location properties. The replicated Ehcache configuration files internally use IP multicast to establish the RMI connections between the Liferay nodes, so we also configured the multicast group addresses and ports used for establishing these connections.
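To get a feel for what the replicated configuration files add, the following is a simplified sketch of a replicated cache entry in the spirit of liferay-multi-vm-clustered.xml; the cache name, sizes, and listener properties shown here are illustrative only, and the actual files shipped with the bundle define many caches with their own settings:

    <cache
        name="SampleReplicatedCache"
        maxElementsInMemory="10000"
        eternal="false"
        timeToIdleSeconds="600"
    >
        <!-- Replicates changes to this cache to the other nodes over RMI. -->
        <cacheEventListenerFactory
            class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
            properties="replicatePuts=false,replicateUpdatesViaCopy=false"
            propertySeparator=","
        />
    </cache>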

Ehcache configuration using JGroups

Another option to replicate Ehcache is to use JGroups, a powerful framework for multicast communication. The Ehcache framework also supports replication using JGroups, and similar to RMI-based Ehcache replication, Liferay supports JGroups-based replication. Let's learn how to configure JGroups-based Ehcache replication.

  1. Stop both the Liferay Portal nodes if they are running.
  2. Add the following properties to the portal-ext.properties file of both the Liferay Portal nodes:
    ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
    ehcache.multi.vm.config.location.peerProviderProperties=connect=UDP(mcast_addr=${multicast.group.address["multi-vm"]};mcast_port=${multicast.group.port["multi-vm"]};):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS
    ehcache.bootstrap.cache.loader.factory=com.liferay.portal.cache.ehcache.JGroupsBootstrapCacheLoaderFactory
    ehcache.cache.event.listener.factory=net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory
    net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
    net.sf.ehcache.configurationResourceName.peerProviderProperties=connect=UDP(mcast_addr=${multicast.group.address["hibernate"]};mcast_port=${multicast.group.port["hibernate"]};):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS
    multicast.group.address["hibernate"]=233.0.0.4
    multicast.group.port["hibernate"]=23304
    multicast.group.address["multi-vm"]=233.0.0.5
    multicast.group.port["multi-vm"]=23305
  3. Now restart both the nodes one by one to activate the preceding configuration.

The JGroups-based Ehcache replication configuration is very similar to the RMI-based replication. Here, we used the UDP protocol to connect the Liferay Portal nodes. With this option too, the Liferay Portal nodes discover each other using IP multicast.

Ehcache replication using Cluster Links

We learned about the JGroups- and RMI-based Ehcache replication. The Liferay Enterprise edition includes another powerful feature called Cluster Link, which provides its own Ehcache replication mechanism. Internally, this feature uses JGroups to replicate the cache across the network. Let's go through the steps to configure this feature.

  1. Stop both the Liferay Portal nodes if they are running.
  2. Now deploy the ehcache-cluster-web enterprise plugin on both the Liferay Portal servers.
  3. Now, edit portal-ext.properties of both the nodes:
    cluster.link.enabled=true
    ehcache.cluster.link.replication.enabled=true
    net.sf.ehcache.configurationResourceName=/ehcache/hibernate-clustered.xml
    ehcache.multi.vm.config.location=/ehcache/liferay-multi-vm-clustered.xml
  4. Now restart both the Liferay Portal servers to activate this configuration.

In Chapter 1, Architectural Best Practices, we talked about this option. Unlike the JGroups- or RMI-based Ehcache replication, this option centralizes all Ehcache changes in one place and then distributes them to all the nodes of the cluster, which reduces unnecessary network transfers.

Note

This option is only available in the Liferay Enterprise version. Hence, the preceding steps are applicable only if you are using the Liferay Enterprise version.

Ehcache clustering best practices

We talked about different options to configure Ehcache replication. Let's learn the best practices related to Ehcache replication.

  • If there are more than two nodes in the cluster, it is recommended to use either Cluster Link- or JGroups-based replication. If we are using the Liferay Enterprise edition, it is recommended to use Cluster Link for Ehcache replication.
  • All three options that we discussed previously use IP multicast to establish connections with the other nodes. The IP multicast technique uses a group IP and port to discover other nodes in the same group, so it is very important to ensure that the same IP and port are used by all the nodes of the same cluster.
  • It is advisable to keep the group IP and port different for the development, testing, and staging environments to make sure that nodes from other environments do not pair up with the production environment.
  • Cluster Link provides up to 10 transport channels for transferring cached resources across the cluster. If the application is expected to have a huge cache with frequent cache changes, it is advisable to configure multiple transport channels using the cluster.link.channel.properties.transport configuration properties, as sketched after this list.
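The following is a hypothetical sketch of configuring additional transport channels in portal-ext.properties; it assumes the numbered property keys cluster.link.channel.properties.transport.0 through .9, and the values shown are placeholders for JGroups channel configurations, not the defaults shipped with Liferay (check the portal.properties of your version for the exact format):

    # Placeholder values; each entry defines the JGroups configuration of one
    # Cluster Link transport channel.
    cluster.link.channel.properties.transport.0=<JGroups configuration for channel 0>
    cluster.link.channel.properties.transport.1=<JGroups configuration for channel 1>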

Media Library configuration

Media Library is one of the most important features of Liferay Portal. The Media Library content is divided into two repositories: the metadata of the Media Library content is stored in the Liferay database, while the actual media files are stored, by default, on the filesystem. For a clustered setup, we need to make sure that the media files are stored in a centralized repository; otherwise each node will have its own copy of the files. Liferay Portal provides various options to store media files in centralized storage. Let's learn how to configure Media Library for the clustered environment and then talk about best practices.

Network file storage using the Advanced File System store

In Chapter 1, Architectural Best Practices, we talked about the Advanced File System store. It is a pluggable Media Library repository store. It stores files on the filesystem, but divides them into multiple directories, which improves the efficiency of locating files, especially when they are stored on a network filesystem. To use this option in a clustered environment, we need a Storage Area Network (SAN) appliance or a Network File System (NFS), and the SAN or NFS directory must be mounted on both the Liferay Portal nodes. Let's learn how to configure Media Library with the Advanced File System store.

  1. Stop both the Liferay Portal nodes if they are running.
  2. Add the following properties to portal-ext.properties of both the Liferay Portal nodes:
    dl.store.impl=com.liferay.portlet.documentlibrary.store.AdvancedFileSystemStore
    dl.store.file.system.root.dir=<SAN Directory>
  3. Now restart both the Liferay Portal nodes one by one.

We have configured Media Library to use AdvancedFileSystemStore and provided the network location where the Portal should store the Media Library content, so both Portal nodes store content in the same filesystem location. To use this option, we need to make sure the SAN appliance supports file locking, as multiple nodes will access the filesystem at the same time. As this option requires specialized hardware such as a SAN or NFS, it adds cost to the solution.
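As a concrete illustration of the mounting step, the following is a hypothetical /etc/fstab entry that would be added on both Liferay nodes to mount a shared NFS export; the server name, export path, and mount point are assumptions and must match your environment:

    # Shared Media Library storage mounted identically on both nodes.
    nfs-server:/exports/liferay/document_library  /opt/liferay/document_library  nfs  rw,hard  0  0

With such a mount in place, dl.store.file.system.root.dir would be set to /opt/liferay/document_library on both nodes.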

Database storage using the JCR store

Liferay Portal provides an option to store the Media Library content in the database using the JCR store. Liferay Portal uses Apache Jackrabbit as its JCR implementation. Jackrabbit supports both filesystem-based and database-based storage for the content, and by default its configuration uses filesystem-based storage. The alternative is to configure Jackrabbit to use the database for the Media Library content. Let's learn how to configure Media Library using the JCR store.

  1. Stop both the Liferay Portal nodes if they are already running.
  2. Edit portal-ext.properties of both the nodes and add the following configuration:
    dl.store.impl=com.liferay.portlet.documentlibrary.store.JCRStore
  3. Now edit node-01/liferay-portal-6.1.1-ce-ga2/data/jackrabbit/repository.xml and make the following changes:
    1. Comment the following lines from the file:
      <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/repository" />
      </FileSystem>
    2. Uncomment the following lines and change the values as given in the following code snippet. Make sure you provide the correct IP, username, and password of the MySQL database:
      <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
      <param name="driver" value="com.mysql.jdbc.Driver"/>
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal"/>
      <param name="schema" value="mysql"/>
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schemaObjectPrefix" value="J_R_FS_"/>
      </FileSystem>
    3. Comment the following lines that appear within the <Workspace> tag:
      <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
        <param name="path" value="${wsp.home}" />
      </FileSystem>
      <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleFsPersistenceManager" />
    4. Uncomment and change the following lines that appear within the <Workspace> tag. Make sure you provide the correct IP, username, and password of the MySQL database:
      <PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager">
      <param name="driver" value="com.mysql.jdbc.Driver" />
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schema" value="mysql" />
      <param name="schemaObjectPrefix" value="J_PM_${wsp.name}_" />
      <param name="externalBLOBs" value="false" />
      </PersistenceManager>
      <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
      <param name="driver" value="com.mysql.jdbc.Driver"/>
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schema" value="mysql"/>
      <param name="schemaObjectPrefix" value="J_FS_${wsp.name}_"/>
      </FileSystem>
    5. Comment the following lines of code within the <Versioning> tag:
      <FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
      <param name="path" value="${rep.home}/version" />
      </FileSystem>
      <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleFsPersistenceManager"/>
    6. Uncomment the following lines that appear within the <Versioning> tag. Make sure you provide the correct IP, username, and password of the MySQL database:
      <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
      <param name="driver" value="com.mysql.jdbc.Driver"/>
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schema" value="mysql"/>
      <param name="schemaObjectPrefix" value="J_V_FS_"/>
      </FileSystem>
      <PersistenceManager class="org.apache.jackrabbit.core.state.db.SimpleDbPersistenceManager">
      <param name="driver" value="com.mysql.jdbc.Driver" />
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal" />
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schema" value="mysql" />
      <param name="schemaObjectPrefix" value="J_V_PM_" />
      <param name="externalBLOBs" value="false" />
      </PersistenceManager>
    7. Finally, uncomment and change the <Cluster> tag as follows. Make sure you provide the correct IP, username, and password of the MySQL database:
      <Cluster id="node_1" syncDelay="5">
      <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
      <param name="revision" value="${rep.home}/revision"/>
      <param name="driver" value="com.mysql.jdbc.Driver"/>
      <param name="url" value="jdbc:mysql://{IP of MySQL Database Server}/lportal"/>
      <param name="user" value="{Database User Id}"/>
      <param name="password" value="{Database Password}"/>
      <param name="schema" value="mysql"/>
      <param name="schemaObjectPrefix" value="J_C_"/>
      </Journal>
      </Cluster>
  4. Now copy the same repository.xml file to the other node and change the id attribute of its <Cluster> tag to node_2.
  5. Restart both the Liferay Portal nodes one by one.

In the preceding configuration, we first enabled the JCR store for Media Library, which makes Liferay use Jackrabbit internally to store the Media Library content. The Jackrabbit configuration is stored in the repository.xml file, and by default it stores the Media Library content in the data folder. We changed the repository.xml file to store the content in the same lportal database; it can also be configured to store the Media Library content in a separate database. Jackrabbit internally divides the Media Library data in the database into the following types of data:

  • Repository-filesystem-related data
  • Workspace-related data
  • Versioning-related data
  • Cluster-related data

We configured the repository.xml file such that the preceding data is stored in the database.

Database storage using DBStore

Liferay Portal 6.1 introduced a new type of repository store, DBStore, to persist Media Library content in the Liferay database. It is very simple to configure and provides better performance than the JCR store backed by the database. Let's learn how to configure Media Library to use DBStore.

  1. Stop both the Liferay Portal nodes if they are already running.
  2. Edit portal-ext.properties of both the nodes and add the following configuration:
    dl.store.impl=com.liferay.portlet.documentlibrary.store.DBStore
  3. Restart both the Liferay Portal nodes one by one.

Media Library clustering best practices

We talked about two options to centralize Media Library content storage. In Chapter 1, Architectural Best Practices, we briefly talked about other options too. Let's talk about some of the best practices related to Media Library.

  • In a clustered environment, the filesystem-based Media Library store can only be used with SAN or NFS that supports file locking.
  • If the Media Library content needs to be stored in the database, DBStore is preferred over the JCR store with the database, as DBStore gives better performance and scalability.
  • If JCR-based database storage is used for Media Library, it is recommended to keep the JCR database separate.
  • If JCR-based database storage is used for Media Library, it is very important to ensure that the cluster node ID is unique in the Jackrabbit configuration file (repository.xml).

Search engine configuration

Liferay Portal uses Apache Lucene as its search engine. Apache Lucene creates search indexes to provide the search functionality and, by default, stores these indexes on the filesystem. To make sure the search functionality works properly in a clustered environment, we need to synchronize the search indexes of all the Liferay Portal nodes. There are multiple options to achieve this. Let's learn how to configure these options and then talk about the best practices associated with them.

Lucene index storage on network storage

Liferay's Lucene configuration provides a way to configure the index storage directory through the portal-ext.properties file. In order to use this option, we will need a specialized Storage Area Network (SAN) appliance with file locking capabilities. Let's learn how to configure Lucene to store index files on the SAN appliance.

  1. Stop both the Liferay Portal servers if they are already running.
  2. Add the following property to portal-ext.properties of both the Liferay Portal nodes:
    lucene.dir=<SAN based mapped directory>
  3. Now start both the nodes one by one.
  4. Now access the Portal and sign in as an administrator. Then from the Dockbar, access the Control Panel, and in the Server Administration section, click on the button beside the Rebuild all search indexes label.

We have just added a property to the portal-ext.properties file that specifies the location of the search indexes. Both the Liferay Portal nodes point to the same network storage directory and hence refer to the same copy of the search indexes. As the index storage location has changed, we rebuilt the search indexes for the existing data. This is the easiest option for centralizing search indexes.

Lucene index replication using Cluster Link

We talked about the Cluster Link feature for Ehcache replication. Cluster Link is a very powerful feature, and it can be used for Lucene index replication as well. Using Cluster Link, Liferay Portal sends index changes to all the other Liferay Portal nodes in the group. Internally, Cluster Link uses JGroups to send the index data to the other nodes. Let's learn how to configure Cluster Link to replicate search indexes.

  1. Stop both the Liferay Portal nodes if they are already running.
  2. Add the following properties to the portal-ext.properties file of both the nodes:
    cluster.link.enabled=true
    lucene.replicate.write=true
  3. Now restart both the nodes one by one.

We simply enabled Cluster Link through portal-ext.properties, together with the Lucene property that generates a replication event through Cluster Link for every search index change. Cluster Link then distributes the event to all the nodes in the cluster. With this option, each node keeps its own copy of the search indexes.
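A related property worth knowing about on servers with multiple network interfaces is cluster.link.autodetect.address: Liferay opens a connection to this address to work out which local interface Cluster Link should bind to. The following is a minimal sketch, assuming a database server at 192.168.1.10 that is reachable only over the cluster network; the host and port are placeholders, and you should verify this property in the portal.properties of your version:

    # Probe address used to detect the network interface for Cluster Link;
    # the value shown here is an assumption for illustration.
    cluster.link.autodetect.address=192.168.1.10:3306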

Using the Apache Solr search engine

Apache Solr is one of the most powerful open source search engines. Liferay supports Apache Solr integration, so we can replace the default Lucene search engine with Solr. Unlike Lucene, Solr runs as a separate application. In a clustered environment, the Liferay Portal nodes connect to a centralized Solr server to search and index the data. Let's learn how to configure Liferay Portal with Solr.

  1. Connect to the server on which Solr has to be installed and create a directory named solr.
  2. Download Apache Tomcat 7.0.34 server from the http://apache.techartifact.com/mirror/tomcat/tomcat-7/v7.0.34/bin/apache-tomcat-7.0.34.zip URL.
  3. Extract the apache-tomcat-7.0.34.zip file in the solr directory.
  4. Download Apache Solr 1.4.0 from the http://archive.apache.org/dist/lucene/solr/1.4.0/apache-solr-1.4.0.zip URL.
  5. Extract the preceding apache-solr-1.4.0.zip file to a temporary directory. From the extracted directory, copy the content of the apache-solr-1.4.0/example/solr directory to the solr directory created in step 1.
  6. In the preceding temporary directory, you can locate the Apache Solr WAR file in the apache-solr-1.4.0/dist directory. Rename the WAR file to solr.war and copy it to the solr/apache-tomcat-7.0.34/webapps directory.
  7. In the catalina.sh file of this Tomcat server, add the JVM argument -Dsolr.solr.home=<fully qualified path of the solr directory created in step 1>.
  8. Start the Solr Tomcat server and access Solr Admin using the http://localhost:8080/solr/admin URL.
  9. Now from the Liferay Marketplace, download the Solr Search Engine CE app. Liferay Marketplace can be accessed from the http://www.liferay.com/marketplace URL. From the Marketplace, we will get a file with the .lpkg extension.
  10. Now copy this file to the deploy directory of both the nodes and start them. By default, the deploy directory is located in the liferay-portal-6.1.1-ce-ga2 directory.
  11. On startup, both the Liferay nodes will deploy the solr-web plugin.
  12. Once the solr-web plugin is deployed successfully, stop both the nodes again.
  13. Now edit node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/solr-web/WEB-INF/classes/META-INF/solr-spring.xml and change the Solr server URL as follows:
    <bean id="com.liferay.portal.search.solr.server.BasicAuthSolrServer" class="com.liferay.portal.search.solr.server.BasicAuthSolrServer">
    <constructor-arg type="java.lang.String" value="http://localhost:8080/solr" />
    </bean>
  14. Make the same changes in liferay-node-02.
  15. Now on the Solr server, replace the solr/conf/schema.xml file with the node-01/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/webapps/solr-web/WEB-INF/conf/schema.xml file from liferay-node-01.
  16. Now restart the Solr Tomcat server. Then, restart both the Liferay Portal nodes.

We just configured Solr 1.4 as a separate application on a Tomcat server and then deployed the solr-web plugin on both the nodes. The solr-web plugin connects to the Solr server; we configured the URL of our Solr server by changing the plugin's Spring configuration file. The Solr server uses a predefined schema for its indexes, and Liferay Portal has its own index schema, which is supplied with the solr-web plugin, so we replaced the Solr server's schema with the one provided by the plugin. After this setup, whenever we create any data, such as a user or a blog entry, the related indexes are created on the Solr server.
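As an alternative to editing catalina.sh directly in step 7, the same JVM argument can be supplied through a setenv.sh file in the Solr Tomcat's bin directory, which catalina.sh picks up automatically. The following is a minimal sketch, assuming the solr directory created in step 1 is /opt/solr; the path is an assumption:

    # solr/apache-tomcat-7.0.34/bin/setenv.sh (create the file if it does not exist)
    # Points Solr at the home directory created in step 1.
    CATALINA_OPTS="$CATALINA_OPTS -Dsolr.solr.home=/opt/solr"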

Clustering best practices for the search engine

We learned about three options available to configure the search engine to work properly in a clustered environment. Let's learn some of the best practices associated with them.

  • If the Portal application is expected to perform only a few index writes, it is recommended to use the Cluster Link option. It is a lightweight option and can be configured quickly.
  • As indexes are accessed and changed frequently, network filesystem-based index storage can create issues with concurrent file access. Hence, it is advisable to avoid that option even though it gives the best performance.
  • If the Portal application is expected to have a large amount of data written to search indexes, it is advisable to use the Solr search engine instead of other options.
  • The Solr server provides a master/slave server concept. If the Portal application is expected to have a huge amount of read and write transactions on search indexes, it is advisable to use that option to manage heavy loads.
  • If the Cluster Link option is used to replicate search indexes and the Portal application is expected to have frequent index changes, it is advisable to configure multiple transport channels for the Cluster Link.

Quartz scheduler configuration

Liferay Portal includes a built-in scheduler engine. Many features in Liferay Portal use the scheduler; for example, the expiration of web content, the LDAP import functionality, and so on. Liferay also supports setting up schedulers for custom portlets. Internally, Liferay Portal uses the Quartz scheduler, a very popular open source scheduler engine. The Quartz scheduler stores data related to scheduled jobs in the Liferay database. Hence, in a clustered environment, it is possible that multiple nodes start the same job at the same time, which can cause serious problems. To prevent this situation, we need to configure Quartz for the clustered environment.

Let's learn how to configure the Quartz scheduler to run in the clustered environment.

  1. Stop both the Liferay Portal servers if they are running.
  2. Add the following property to the portal-ext.properties file of both the Liferay Portal nodes:
    org.quartz.jobStore.isClustered=true
  3. From the lportal database, drop all the tables whose names start with QUARTZ_ (see the example statements after this list). This step is required only if the Liferay tables have already been created.
  4. Now restart both the Liferay Portal servers.
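For step 3, the Quartz tables can be located and removed with standard MySQL statements. The following is a minimal sketch to be run against the lportal database; the exact list of QUARTZ_ tables can vary between Liferay versions, so drop whatever the first query returns:

    -- List the Quartz tables that need to be dropped.
    SHOW TABLES LIKE 'QUARTZ\_%';
    -- Drop each table returned by the query above (child tables first if
    -- foreign keys are present), for example:
    -- DROP TABLE QUARTZ_SIMPLE_TRIGGERS;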

We just added a property to let the Quartz scheduler know that multiple instances of the scheduler are running against a single database. With this property enabled, the Quartz scheduler makes sure that each job is executed only once.
