Liferay DXP is the latest version of Liferay. It ships with many new features that make it one of the best platforms for portal development for both developers and enterprises. It enables businesses to build easy-to-use interfaces and digital experience platforms for their internal and external stakeholders. Clustering is a major part of delivering that experience: connecting two or more nodes together so that they behave like a single application is termed "clustering".

Clustering is used for parallel processing (scalability), fault tolerance (availability) and load balancing. On high-traffic websites a single server gets overloaded and the application becomes unresponsive; to prevent this kind of failure, clustering needs to be enabled for the portal.

A common misconception is that simply configuring Liferay automatically creates a highly available, clustered environment. That is not the case: a clustered environment includes load balancers, web servers, clustered application servers and databases. Once this complete clustered environment is set up, Liferay can be installed into it.

Clustering is currently only available in Liferay DXP; however, the Liferay 7 Community Edition roadmap now includes support for a new cluster module.

Read more: Liferay DXP Vs Adobe Experience Manager.

This article provides instructions with a basic configuration for installing Liferay DXP in a clustered environment.

See the image below for a typical representation of a cluster architecture:

[Image: Liferay cluster architecture]

1. Load Balancer: Distributes traffic across multiple web server resources to ensure smooth and efficient balancing of load. It can be software, e.g. Apache, or hardware (preferred), e.g. F5 BIG-IP, Cisco LocalDirector or Cisco Content Services Switch (CSS), with firewall, reverse proxy and routing rules configured. It redirects requests to web/application servers based on algorithms such as round robin, weighted round robin, least connections and least response time, or on application-specific data such as HTTP headers, cookies or the value of an application-specific parameter. Load balancers also ensure reliability and availability by monitoring the "health" of the applications.

2. Web Server: Processes requests via HTTP, the network protocol used to distribute information on the World Wide Web. It delivers static content such as images, rich media and CSS files, and also provides integration modules for single sign-on solutions like CA SiteMinder, Oracle Identity and Ping.

  • In this scenario the Apache web server acts as a software load balancer, distributing incoming traffic across the application servers in a pool. An intelligent mechanism allows the best-performing, most suitable server to be targeted for each client request.
  • Sticky sessions are used together with load balancing to achieve server affinity, so that the HTTP session does not lose validity across application instances.
  • Three algorithms are available: Request Counting, Weighted Traffic Counting and Pending Request Counting. They are selected via the lbmethod value of the balancer definition.
  • The Apache configuration below uses the mod_proxy module to implement a proxy/gateway for Apache, with sticky sessions enabled, the application server nodes defined as BalancerMember entries, load balancing by traffic, and so on.
mod_proxy (httpd)
 
<Proxy balancer://wwwcluster>
    BalancerMember http://LIFERAY.APPLICATION.NODE1.IP:8080 route=node1 keepalive=on
    BalancerMember http://LIFERAY.APPLICATION.NODE2.IP:8080 route=node2 keepalive=on
    ProxySet lbmethod=bytraffic
    ProxySet stickysession=ROUTEID
</Proxy>

ProxyPass / balancer://wwwcluster/
ProxyPassReverse / balancer://wwwcluster/
ProxyTimeout 600

# Issue a ROUTEID cookie (requires mod_headers) so requests stick to the node that served them
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

3. Liferay configurations that need to be considered for setting up Liferay in a cluster

3.1. Database Configuration: All Liferay application server nodes must point to the same Liferay database (or database cluster). The connection to the database is a plain JDBC or JNDI connection; JNDI is the recommended configuration.

portal-ext.properties in liferay_home directory

jdbc.default.jndi.name=jdbc/LiferayPool

root.xml in liferay_home/tomcat/conf/Catalina/localhost directory

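Below is a minimal sketch of the JNDI data source definition in root.xml, assuming a MySQL database and the bundled Tomcat 8; the driver class, JDBC URL, credentials and pool sizes are placeholders to adapt to your environment.

<Context crossContext="true">
    <!-- JNDI data source referenced by jdbc.default.jndi.name=jdbc/LiferayPool -->
    <Resource
        name="jdbc/LiferayPool"
        auth="Container"
        type="javax.sql.DataSource"
        driverClassName="com.mysql.jdbc.Driver"
        url="jdbc:mysql://DB.SERVER.IP:3306/lportal?useUnicode=true&amp;characterEncoding=UTF-8"
        username="liferay"
        password="liferay"
        maxTotal="100"
        maxIdle="30"
        maxWaitMillis="10000"
    />
</Context>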

* The database is usually configured in a master/slave setup for high availability; this is transparent from Liferay's point of view, since we only specify the IP and port of the master database. Liferay's JDBC connection properties can also be configured to read from and write to two different data sources, as sketched below.
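A sketch of Liferay's reader/writer data source split in portal-ext.properties, assuming MySQL; the host names and credentials are placeholders.

# Read data source (replica)
jdbc.read.driverClassName=com.mysql.jdbc.Driver
jdbc.read.url=jdbc:mysql://READ.DB.IP:3306/lportal?useUnicode=true&characterEncoding=UTF-8
jdbc.read.username=liferay
jdbc.read.password=liferay

# Write data source (master)
jdbc.write.driverClassName=com.mysql.jdbc.Driver
jdbc.write.url=jdbc:mysql://WRITE.DB.IP:3306/lportal?useUnicode=true&characterEncoding=UTF-8
jdbc.write.username=liferay
jdbc.write.password=liferay

# Append META-INF/dynamic-data-source-spring.xml to the existing spring.configs list
# so that the read and write data sources above are actually used.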

Start the Liferay Tomcat servers (nodes 1 and 2) sequentially, so that the Quartz scheduler can elect a master node.

Verification: Log in and add a portlet (e.g. Hello Velocity) on node 1, then refresh the page on node 2; the addition should show up on node 2. Repeat the steps with the nodes reversed to test the other direction.

3.2. File Storage (Data folder): The repository for the Documents and Media library. The default file store is the simple FileSystemStore; other store implementations are available out of the box and can be configured, e.g. AdvancedFileSystemStore, DBStore and JCRStore, and third-party repositories such as S3Store and CMISStore can be mounted as well.

portal-ext.properties in liferay_home directory

## Default store implementation
#dl.store.impl=com.liferay.portal.store.file.system.FileSystemStore
## Recommended for production: shared storage on a NAS file system (e.g. NFS, SMB, AFS) or SAN file system (e.g. Fibre Channel, iSCSI)
dl.store.impl=com.liferay.portal.store.file.system.AdvancedFileSystemStore
## Database storage
#dl.store.impl=com.liferay.portal.store.db.DBStore
## Java Content Repository storage
#dl.store.impl=com.liferay.portal.store.jcr.JCRStore
## Amazon S3 cloud storage
#dl.store.impl=com.liferay.portal.store.s3.S3Store
## 3rd-party CMIS-compatible storage
#dl.store.impl=com.liferay.portal.store.cmis.CMISStore

Configuring Data folder: The store’s default root folder/directory path is [Liferay_Home]/data/document_library

Option 1: You can specify a different root directory from within the Control Panel: Configuration -> System Settings -> search for the Advanced File System Store entry and edit its properties.

[Image: Configuring the data folder via System Settings]

Option 2: Create config file with name “com.liferay.portal.store.file.system.configuration.FileSystemStoreConfiguration.cfg” in [Liferay Home]/osgi/configs directory.

Specify your shared/mounted NAS/SAN path in the rootDir property, for example:

rootDir=/opt/dxp/data/document_library
** A trick for finding this configuration file's name and its properties is to save the configuration in Option 1 above and then download it through the Export option under the Action icon.

3.3. Clustering Activation on Nodes: Data is generated once and copied/replicated to the other servers in the cluster, saving time and resources. To enable Cluster Link, add the following properties to every node's portal-ext.properties.

Liferay's algorithm uses a single UDP multicast channel, so nodes don't have to create a thread for every other node in the cluster. The Ehcache Cluster Link plugin is required; it comes deployed and installed out of the box with Liferay DXP. Make sure it is started on all the clustered Liferay instances (telnet into the Gogo shell and check the module status with lb | grep "Multiple").

portal-ext.properties
 
cluster.link.enabled=true
cluster.link.autodetect.address=DB.Server.IP:Port
ehcache.cluster.link.replication.enabled=true

3.3.1.a Option 1: Distributed Caching (Multicast – Default & Preferred): Allows the Liferay cluster nodes to share cache content via Ehcache. Ehcache is configured to use Liferay's Cluster Link layer as an efficient replication mechanism, handling both discovery and transport of cluster cache updates. Cluster Link internally uses JGroups as its underlying technology.

By default, two protocols (UDP or TCP) can be used to send/receive messages to/from the network depending on the needs of an environment.

Default (MPING + UDP/IP Multicast): update the multicast addresses/ports according to your network specifications. This configuration uses MPING for discovery and UDP for transport. In general, if UDP can be used, there is no need to use or configure any other protocol.

portal-ext.properties
 
## Multicast
 
# Each address and port combination represent a conversation that is made
# between different nodes. If they are not unique or correctly set, there
# will be a potential of unnecessary network traffic that may cause slower
# updates or inaccurate updates.
 
multicast.group.address["cluster-link-control"]=239.255.0.1
multicast.group.port["cluster-link-control"]=23301
 
multicast.group.address["cluster-link-udp"]=239.255.0.2
multicast.group.port["cluster-link-udp"]=23302
 
multicast.group.address["cluster-link-mping"]=239.255.0.3
multicast.group.port["cluster-link-mping"]=23303
 
multicast.group.address["multi-vm"]=239.255.0.5
multicast.group.port["multi-vm"]=23305

3.3.1.b. Option 2: TCP Transports (Unicast): This configuration uses TCP_PING for discovery and TCP for transport. For TCP_PING you must pre-specify all members of the cluster; it is not an auto-discovery protocol, i.e. cluster members cannot be added or removed dynamically without updating the list.

setenv.sh: Add the IP addresses of all the cluster nodes to the properties below via JVM parameters in setenv.sh (for Linux environments), as shown in the example that follows.

-Djgroups.bind_addr=[node IP address]
-Djgroups.tcpping.initial_hosts=[node1 IP][port1],[node2 IP][port2]...
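For example, in setenv.sh this could look like the sketch below; the IP addresses are placeholders and 7800 is assumed as the JGroups TCP bind port.

CATALINA_OPTS="$CATALINA_OPTS -Djgroups.bind_addr=192.168.1.11 -Djgroups.tcpping.initial_hosts=192.168.1.11[7800],192.168.1.12[7800]"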

tcp.xml: Create TCP discovery configuration files

Extract tcp.xml from $liferay_home/osgi/marketplace/Liferay Foundation.lpkg/com.liferay.portal.cluster.multiple-[version number].jar and place it in a convenient location.

Since TCP_PING is the default discovery method, the only thing that needs to be modified is the TCP bind_port, and then only if you run several nodes on the same machine.

To share one transport between multiple channels, add singleton_name="liferay_tcp_cluster" to the TCP tag in tcp.xml, as sketched below.
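A sketch of the relevant fragment of the TCP tag; only the two attributes discussed above are shown, and all other attributes from the extracted tcp.xml should be kept unchanged.

<!-- fragment of tcp.xml -->
<TCP bind_port="7800"
     singleton_name="liferay_tcp_cluster"/>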

portal-ext.properties: Point Liferay at the configuration files by adding the following properties

cluster.link.channel.properties.control=[CONFIG_FILE_PATH]/tcp.xml
cluster.link.channel.properties.transport.0=[CONFIG_FILE_PATH]/tcp.xml
** Similarly, discovery can also be configured using:

JDBC_PING (jdbc_ping_config.xml),
S3_PING (s3_ping_config.xml Amazon only),
RACKSPACE_PING (Rackspace only)
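For example, a jdbc_ping_config.xml would replace the TCPPING element of the protocol stack with a JDBC_PING element along the lines of the sketch below; MySQL is assumed, and the URL, driver and credentials are placeholders.

<!-- JGroups JDBC_PING discovery: nodes register themselves in a shared database table -->
<JDBC_PING
    connection_driver="com.mysql.jdbc.Driver"
    connection_url="jdbc:mysql://DB.SERVER.IP:3306/lportal"
    connection_username="liferay"
    connection_password="liferay"/>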

Infographic: Core Attributes Of Liferay Digital Experience Platform

3.3.2. Modifying/Tuning Cache Configuration: Specific files need to be deployed/configured in order for your Hibernate, multi-VM or single-VM cache tuning to be picked up by the system. In Liferay DXP, cache configurations can be tuned via deployable plugins when Cluster Link replication is used (the benefit is that you don't need to restart the server for the settings to take effect).

Download: Sample configuration module to override liferay-multi-vm-clustered.xml

You can find the out-of-the-box XML files in the [Liferay_home]/osgi/marketplace folder, inside com.liferay.portal.ehcache-[version].jar; the default XML files are in the /ehcache folder inside the .jar.

Ehcache offers many settings for caching specific objects, and users can tune them to their needs. The XML files contain configuration settings that can be modified for your requirements, for example the number of elements in a cache, the expiry time of entries, or whether the cache overflows to disk, as sketched below.
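A sketch of a single tuned cache entry in liferay-multi-vm-clustered.xml; the cache name and values are illustrative only and should be replaced with the cache you actually want to tune.

<cache
    name="com.liferay.portal.kernel.dao.orm.EntityCache.com.liferay.portal.model.impl.UserImpl"
    maxElementsInMemory="10000"
    eternal="false"
    timeToIdleSeconds="600"
    overflowToDisk="false"
/>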

Optionally, the properties below can be set in portal-ext.properties to point to tuned XML files located in a custom folder under the Liferay_home directory:

ehcache.single.vm.config.location=/custom-ehcache/liferay-single-vm.xml
ehcache.multi.vm.config.location=/custom-ehcache/liferay-multi-vm-clustered.xml
net.sf.ehcache.configurationResourceName=/custom-ehcache/hibernate-clustered.xml
** For advanced optimization and configuration, please refer to the Ehcache documentation: http://www.ehcache.org/documentation/configuration

3.4. Index Replication: Starting with Liferay DXP, the search engine needs to be separated from the Liferay server itself for scalability reasons. There are two ways to achieve this: Elasticsearch or Solr.

Note: Storing indexes locally is no longer an option; lucene.replicate.write=true is only needed when Lucene (Solr) is being used.

3.4.1. Installing Elasticsearch: Download ES

  • Extract the zip file to a location on the Elasticsearch server, e.g. /opt/elasticsearch-2.4.6/
  • Edit /opt/elasticsearch-2.4.6/config/elasticsearch.yml, set the name of your cluster and save: cluster.name: LiferayElasticsearchCluster

** Tune and scale Elasticsearch with the recommended production configuration, e.g. adjust -Xmx, ES_JAVA_OPTS and ES_HEAP_SIZE.

  • Install the required Elasticsearch plugins from the bin directory:

plugin install analysis-smartcn
plugin install analysis-kuromoji

  • Deploy the elasticsearch-head plugin so you can access your cluster from a browser at http://elastic.server.ip:9200/_plugin/head/
  • Restart the Elasticsearch server and wait until it is up and running

3.4.2. Connect Elasticsearch and Liferay: There are two ways to configure Elasticsearch for a Liferay cluster: from the Control Panel, or via a .cfg file (advisable for production).

Option 1: Navigate to Control Panel -> System Settings -> Foundation -> search for Elasticsearch and edit its properties: set the cluster name, change the operation mode to REMOTE, enter the transport address, add any additional advanced configuration and save.

Download the saved configuration (.cfg file) through Export option from Action icon.

OR

Option 2: It is advisable to use the configuration file for production: copy com.liferay.portal.search.elasticsearch.configuration.ElasticsearchConfiguration.cfg into the LIFERAY_HOME/osgi/configs directory.

ElasticsearchConfiguration.cfg Sample Configuration file
 
operationMode=REMOTE
clientTransportIgnoreClusterName=false
indexNamePrefix=liferay-
httpCORSConfigurations=
additionalConfigurations=
httpCORSAllowOrigin=/https?://localhost(:[0-9]+)?/
networkBindHost=
transportTcpPort=
bootstrapMlockAll=false
networkPublishHost=
clientTransportSniff=true
additionalIndexConfigurations=
retryOnConflict=5
httpCORSEnabled=true
clientTransportNodesSamplerInterval=5s
additionalTypeMappings=
logExceptionsOnly=true
httpEnabled=true
networkHost=
transportAddresses=ip.of.elasticsearch.node:9300
discoveryZenPingUnicastHostsPort=9300-9400
clusterName=LiferayElasticsearchCluster

  • Restart the Liferay servers
  • Index replication: Navigate to Control Panel -> Configuration -> Server Administration -> Resources and execute Reindex all search indexes
  • Verification: Test search by adding a basic web content article and searching for it from both nodes
  • Modules/plugins must be deployed through the hot deploy folder on each node if you are not using a centralized server farm deployment.
  • Liferay needs to have the same patches installed across all the nodes to avoid any inconsistencies.

Read more: Liferay Unicast Clustering On Amazon EC2

4. Application Server Configuration: Configuring the application server for cluster awareness, in this case Tomcat.

server.xml: Enable Tomcat clustering on all the nodes by specifying jvmRoute="Node1" in the <Engine> tag (and jvmRoute="Node2" on the other node, respectively). Configure Tomcat's worker threads for clustering with maxThreads="150" in the <Connector> tag. Enable session replication so that when a user request goes to any Tomcat in the cluster, that session is copied across the entire cluster, providing failover capabilities: the user session is not lost and the experience is seamless. Add the session replication configuration inside the <Engine> tag, as sketched below.

web.xml: To switch on session replication, also add the <distributable/> tag in web.xml just after the start of the <web-app> tag.
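A minimal sketch of the session replication configuration in server.xml, assuming the default SimpleTcpCluster with multicast membership and the DeltaManager; production setups usually tune the membership, receiver and sender settings further.

<Engine name="Catalina" defaultHost="localhost" jvmRoute="Node1">
    <!-- Default all-to-all session replication via DeltaManager and multicast membership -->
    <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="8"/>
    <!-- keep the existing Realm and Host definitions as they are -->
</Engine>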

That should be it. Restart the Liferay Tomcat servers (nodes 1 and 2) sequentially and verify that clustering and session replication are working.

Liferay DXP is redefining the customer experience with its many features and plugins. We at KNOWARTH are pioneers in developing custom portals for enterprises and providing Liferay Consulting Services for Liferay Portal Development.

Blog is written by Ankit Pancholi, Principal Liferay Consultant, Anblicks
