Clustering Grails

I did a talk on Grails clustering at the Groovy & Grails eXchange 2009 in London last week and wanted to put up the slides and the sample application and clustering scripts. This was the third time I’ve given a version of this talk (first at a Boston Grails meetup and again at SpringOne 2GX) so I’m way overdue getting this up.

I created a simple Grails app and installed the Spring Security plugin (to test web cluster login failover and to have domain classes for Hibernate 2nd level caching) and the Quartz plugin to demonstrate support for multiple servers. The basic non-clustered version is here so you can compare it to the version that has the required changes here.

In no particular order, here are the relevant project files:

  • There’s a simple Quartz job in grails-app/jobs/ClusterTestJob.groovy that prints which instance it’s running on to stdout every two seconds.
  • grails-app/conf/QuartzConfig.groovy specifies jdbcStore=true so that the plugin stores jobs in the database instead of in memory. (A sketch of both files appears after this list.)
  • Externalized configuration is enabled in grails-app/conf/Config.groovy using
    grails.config.locations = [
         "classpath:${appName}-config.groovy",
         "file:./${appName}-config.groovy"]
    

    which allows local dev environment configuration using a file named clustered-config.groovy in the root directory of the project, or production configuration (e.g. to keep production database passwords out of source control) using a clustered-config.groovy on the classpath. Tomcat’s lib directory is in its classpath, so that’s a convenient place to put the file.

  • Note also that the Log4j file loggers are configured using ServerAwareFileAppender. Specifying the location of log files is tricky: if you leave the path out and make it relative, files get created relative to the directory Tomcat was started from, which might not always be the same, but if you hard-code the path, it won’t work cross-platform. So this class figures out whether you’re running in Tomcat or Jetty and whether you’re in dev mode, and it’s also cluster-aware; depending on how you’re running, it works out where the logs directory is and writes the logs there. (See the configuration sketch after this list.)
  • grails-app/conf/hibernate/hibernate.cfg.xml loads grails-app/conf/hibernate/Quartz.mysql.innodb.hbm.xml which has DDL for creating the tables needed for Quartz. Use the appropriate DDL for your database from one of the files in the Quartz plugin’s src/templates/sql directory.
  • Role.groovy has been customized to support the Hibernate 2nd-level cache (sketched after this list):
    – implements Serializable
    – read-only cache configured in mapping (along with disabling optimistic locking since it’s not needed)
    – custom list() and count() methods that optionally use the cache or bypass it
  • User.groovy is also customized for the Hibernate 2nd-level cache:
    – implements Serializable
    – read-write cache configured in mapping
  • grails-app/conf/spring/resources.groovy has the fix for GRAILSPLUGINS-1207
  • src/java/ehcache.xml has caches configured for the domain classes. It’s not a distributed configuration, though, since it’s the one used in the development environment.
  • scripts/_Events.groovy has code to delete unused jars (unrelated to clustering, just a good idea) and to delete the version of ehcache.jar that ships with Grails, since the app uses a newer version. It also replaces the non-distributed ehcache.xml with the distributed version from cluster_resources and copies quartz.properties in case you need to customize settings beyond the defaults. (A sketch follows this list.)
  • grails-app/conf/BootStrap.groovy has code to create the admin role and an admin user to test login (also sketched below).
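
To make those pieces concrete, here’s a minimal sketch of the job and the Quartz config, using the 0.4.x-era plugin syntax. The real files in the sample project are the authority; treat the names here as assumptions.

// grails-app/conf/QuartzConfig.groovy
quartz {
   autoStartup = true
   jdbcStore = true // persist jobs and triggers in the database so all nodes coordinate
}

// grails-app/jobs/ClusterTestJob.groovy
class ClusterTestJob {

   def timeout = 2000 // run every two seconds (0.4.x plugin syntax)

   def execute() {
      // print the host name so it's obvious which node ran the job
      println "ClusterTestJob running on ${InetAddress.localHost.hostName}"
   }
}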
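
The Log4j wiring in Config.groovy might look something like this. ServerAwareFileAppender is the sample app’s own class, so the idea that it needs no explicit file path (it works out the log location itself) comes from the description above, while the constructor properties shown are assumptions:

log4j = {
   appenders {
      // the appender picks its own file location based on container, mode and instance
      appender new ServerAwareFileAppender(
         name: 'clusterdemo',
         layout: new org.apache.log4j.PatternLayout('%d [%t] %-5p %c - %m%n'))
   }
   root {
      info 'stdout', 'clusterdemo'
   }
}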
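
The cache customizations might look roughly like this (each class would live in its own file under grails-app/domain; the list/count variants are named listRoles/countRoles here only to avoid shadowing GORM’s injected methods in a sketch, and their exact shape in the sample project may differ):

class Role implements Serializable {

   String authority
   String description

   static mapping = {
      cache usage: 'read-only' // roles rarely change
      version false            // optimistic locking isn't needed for read-only data
   }

   // let callers route reads through the query cache or bypass it
   static List listRoles(boolean useCache = true) {
      Role.withCriteria { cache useCache }
   }

   static int countRoles(boolean useCache = true) {
      Role.createCriteria().get {
         projections { rowCount() }
         cache useCache
      } as int
   }
}

class User implements Serializable {

   String username
   String passwd
   boolean enabled

   static mapping = {
      cache usage: 'read-write' // users change, so the cache must handle writes
   }
}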
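
Here’s a hedged sketch of the war-build hook in scripts/_Events.groovy; the cluster_resources paths come from the description above, but the jar include pattern is hypothetical:

eventCreateWarStart = { warName, stagingDir ->
   // swap in the distributed cache config for production wars
   ant.copy(file: 'cluster_resources/ehcache.xml',
            todir: "${stagingDir}/WEB-INF/classes", overwrite: true)
   // copy quartz.properties in case the defaults need customizing
   ant.copy(file: 'cluster_resources/quartz.properties',
            todir: "${stagingDir}/WEB-INF/classes")
   // remove the older ehcache jar that ships with Grails (pattern is hypothetical)
   ant.delete {
      fileset(dir: "${stagingDir}/WEB-INF/lib", includes: 'ehcache-1.5*.jar')
   }
}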
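
And the bootstrap data would be created along these lines; the service and property names follow the old Acegi-based Spring Security plugin conventions and are assumptions here (in particular, addToPeople assumes Role declares hasMany = [people: User]):

class BootStrap {

   def authenticateService // Acegi-era plugin service with encodePassword()

   def init = { servletContext ->
      def adminRole = Role.findByAuthority('ROLE_ADMIN') ?:
            new Role(authority: 'ROLE_ADMIN', description: 'admin role').save(flush: true)

      if (!User.findByUsername('admin')) {
         def admin = new User(username: 'admin',
               passwd: authenticateService.encodePassword('p4ssw0rd'),
               enabled: true).save(flush: true)
         adminRole.addToPeople(admin)
      }
   }
}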

The files from Tomcat and the cluster scripts are available here. Untar them somewhere on the server where you want to create a cluster:

tar xfz cluster.tar.gz
cd cluster

You’ll see five scripts:

  • cleanup.sh
  • createCluster.sh
  • createInstance.sh
  • deploy.sh
  • run.sh

These are combinations of bash scripts and an Ant build file (clusterTasks.xml). All of the scripts need a cluster root directory; you can either specify it for each script invocation (run a script without parameters to see its usage) or set the CR environment variable:

export CR=/usr/local/tomcatcluster

The first script to run is createCluster.sh:

./createCluster.sh

This will create the directory structure and copy the Tomcat jars and other shared files.

There’s an empty clustered-config.groovy that you can use to customize the production deployment. One common use is to externalize database passwords. This can be done with JNDI, but I prefer a self-contained war file that doesn’t require container-specific configuration. The file is copied to $CR/shared/lib when you run createCluster.sh since it’s shared by all instances.
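
For example, a minimal clustered-config.groovy might override just the database credentials; the host and password here are placeholders, not values from the sample project:

// picked up via "classpath:${appName}-config.groovy" since $CR/shared/lib
// is on Tomcat's classpath; merged over the config baked into the war
dataSource {
   url = 'jdbc:mysql://db.example.com/clustered' // hypothetical production host
   username = 'clustered'
   password = 'changeme' // the real secret lives only on the server
}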

Next you need to run createInstance.sh once for each cluster node on this server. The syntax is

createInstance.sh [server number] [instance number] [cluster root dir]

and you need to ensure that the server number and instance number combination is unique throughout the cluster. This is simple if you number each server and choose a unique instance number for each instance on that server, e.g.

./createInstance.sh 1 1
./createInstance.sh 1 2
./createInstance.sh 1 3

on server one,

./createInstance.sh 2 1
./createInstance.sh 2 2
./createInstance.sh 2 3

on server two, etc. So go ahead and create at least two nodes on this server:

./createInstance.sh 1 1
./createInstance.sh 1 2

Some files to note are $CR/shared/conf/server.xml and $CR/instance_X/conf/catalina.properties. Most values in server.xml have been replaced with ${} placeholders whose values are specified in catalina.properties. Tomcat reads catalina.properties at startup and sets system properties from it, and the placeholders are then resolved via System.getProperty(). With this approach we can have one parameterized server.xml and still allow per-instance customization of ports, etc. Another config file is $CR/instance_X/bin/setenv.sh, which configures CATALINA_OPTS (and optionally other environment variables); edit this file to change the heap or permgen memory for each instance.
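
To illustrate the mechanism (this isn’t Tomcat’s actual code, just a small Groovy demo of the same idea):

// catalina.properties entries become system properties...
def props = new Properties()
new File('instance_1/conf/catalina.properties').withInputStream { props.load(it) }
props.each { k, v -> System.setProperty(k as String, v as String) }

// ...and a server.xml attribute like port="${http.port}" resolves against them
def attr = '${http.port}'
def resolved = attr.replaceAll('\\$\\{(.+?)\\}') { all, name -> System.getProperty(name) }
println resolved // the per-instance port from catalina.properties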

Having created the instances, now you need to deploy a war. Download the test project and unpack it:

tar xfz clustered.tar.gz
cd clustered

and run grails compile to trigger installation of the project’s plugins. As mentioned in the presentation, there’s a bug in the Quartz plugin that causes a problem in clusters since it tries to re-register jobs, so edit plugins/quartz-0.4.1/QuartzGrailsPlugin.groovy and change

if(scheduler.getTrigger(trigger.name, trigger.group)) {
   scheduler.rescheduleJob(trigger.name, trigger.group, ctx.getBean("${key}Trigger"))
} else {
   scheduler.scheduleJob(ctx.getBean("${key}Trigger"))
}

to

if(scheduler.getTrigger(trigger.triggerAttributes.name,
                        trigger.triggerAttributes.group)) {
   scheduler.rescheduleJob(trigger.triggerAttributes.name,
                           trigger.triggerAttributes.group,
                           ctx.getBean("${key}Trigger"))
} else {
   scheduler.scheduleJob(ctx.getBean("${key}Trigger"))
}

Now you can build the war:

grails clean && grails war

and use deploy.sh to deploy it to the cluster:

./deploy.sh /path/to/clustered-0.1.war

The name of the war isn’t important since deploy.sh deploys as the root context (in $CR/shared/webapps/ROOT). Note that you only deploy a war once per server since it’s shared by all instances on that box.

Next create the database:

$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.1.39-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database clustered;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on clustered.* to clustered@localhost identified by 'clustered';
Query OK, 0 rows affected (0.00 sec)

The project has a modified version of the SchemaExport script that fixes a bug reading hbm.xml files, so run

grails schema-export

and choose option [2] to use the project version instead of the Grails version (the bug is fixed in Grails 1.2). This will create a file called ddl.sql that you can use to create the tables. The generated script is designed for Hibernate’s schema export tool, which ignores the errors from dropping foreign keys on tables that don’t exist yet, but those statements will fail from the command line. So edit the file, remove all of the DROP TABLE ... and alter table ... drop foreign key ... statements, and run

mysql -u clustered -pclustered -D clustered < ddl.sql
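
If you’d rather script that edit than do it by hand, a throwaway Groovy snippet like this works, assuming the export writes one statement per line (it does for these drop statements):

def ddl = new File('ddl.sql')
def kept = ddl.readLines().findAll { line ->
   def l = line.trim().toLowerCase()
   // drop the statements that only work inside Hibernate's schema export tool
   !(l.startsWith('drop table') ||
     (l.startsWith('alter table') && l.contains('drop foreign key')))
}
ddl.text = kept.join('\n') + '\n'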

Now that we’ve deployed the war and created the database, we can start the instances using run.sh. The syntax is

run.sh [start|stop|run] [instance number]

so you would run

run.sh start 1

to start instance #1. Once it has successfully started you can start the other instances:

run.sh start 2

You should see output indicating that the instances are discovering one another.

The server logs are in a few different directories. $CR/instance_X/logs will contain that instance’s catalina.out and localhost_access_log, and the two rolling Log4j logs configured in Config.groovy, X_clusterdemo.log and X_sql.log, prefixed with the instance number.

$CR/shared/logs contains the remaining Tomcat logs: admin, catalina, host-manager, localhost, and manager, all prefixed with the instance number. instance_X.pid files are here too, containing the process id of each instance to help with shutting down individual instances.


There is no load balancing configured yet; this can be configured using hardware load balancers, Apache, etc. $CR/shared/conf/server.xml configures the jvmRoute attribute of the <Engine> element (worker${cluster.server.number}_${cluster.instance.number}) for use with mod_jk.

So for now, to test the app we’ll go to specific instances. Open http://localhost:8091/admin/ and, since that controller is secured, you’ll be prompted to log in. Use the username and password configured in BootStrap.groovy (admin/p4ssw0rd) and you should be allowed to view the page.

To test session replication, kill this instance. Find the pid for instance 1 in $CR/shared/logs/instance_1.pid and run kill -9 to ensure there’s no orderly cleanup:

kill -9 `cat $CR/shared/logs/instance_1.pid`

You’ll see messages in the logs for instance 2 indicating that instance 1 has disappeared (org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared). You’ll also see that if Quartz wasn’t running in instance 2, it is now.

Open http://localhost:8092/admin/ and you should still be logged in since your Authentication is stored in the HTTP session and was replicated.


To shut down an instance, use run.sh with the stop argument:

run.sh stop 1

You can use cleanup.sh to delete log files and temp files in the temp and work directories. This isn’t required, but it’s convenient.



15 Responses to “Clustering Grails”

  1. Dean says:

    Great write up!

    Will this also work when clustering Tomcat instances which are located on different physical servers?

  2. Burt says:

    @Dean sure, the example shows creating instances on multiple servers (./createInstance.sh 1 1, ./createInstance.sh 2 1, etc. – the 1st parameter is the server number and the 2nd is the instance number on that server). You’ll probably want to run multiple instances per physical server and have as many physical servers as you need to handle the load. The cool thing is that since they auto-discover each other, you can scale up as your load increases without configuration changes.

  3. Dan Lynn says:

    Thanks so much for posting this, Burt. Great info. I’m also looking forward to trying your new spring security core plugin!

  4. omarji says:

    This is really cool. Thanks a lot for sharing.

    Does this all still apply to grails 1.3.x?

  5. omarji says:

    Perfect!! thx

  6. sonpham says:

    Hello,
    I’m a student at a university in Vietnam.
    It’s a nice guide, but it only covers deployment on Linux. What about Windows?
    Alternatively, can I configure a Tomcat cluster myself and then deploy your clustered project to it?
    I’m looking forward to hearing from you. Please help me.

    • Burt says:

      I have no interest in deploying on Windows, but it shouldn’t be that much work to convert the bash scripts to batch files.

      • sonpham says:

        hi!
        I have a question, please help me!
        I generated a war file with the command grails war (by default it runs on port 8080).
        I created a cluster with 3 Tomcat instances, tomcatA, tomcatB and tomcatC (Tomcat 6), on ports 8081, 8082 and 8083. How do I run the war in this cluster?

        • sonpham says:

          Hi!
          Sorry, I was wrong. I didn’t use your cluster package; I followed the approach mentioned in your slides, with Apache (httpd) in front of 3 Tomcat instances. I checked carefully and it works: I deployed 3 war files on the 3 Tomcat instances. Now I’ll continue with MySQL JDBC replication. Thanks.

      • sonpham says:

        Hi!
        Happy New Year.
        Thanks!

  7. buitahau says:

    Hi, when I run “./createCluster.sh” I get the error:
    “./createCluster.sh: line 18: ant: command not found”
    Please help me solve it. Thanks!!

    • buitahau says:

      Hi, I installed Ant and now I get another error:
      Buildfile: /home/hau/cluster/clusterTasks.xml

      getClusterInfo:

      confirmClusterDelete:

      noDeleteCluster:

      ensureConfigExists:

      doCreateCluster:

      BUILD FAILED
      /home/hau/cluster/clusterTasks.xml:36: Directory /usr/local/tomcatcluster creation was not successful for an unknown reason

      Can you help me? Thanks!!

  8. […] plugin for Grails on a recent project and ran into a minor annoyance after (partially) following Burt Beckwith’s clustering example.  Part of the problem was caused by not wanting to have Hibernate auto-run the DDL for the quartz […]

  9. Firebat says:

    Hi,

    Thanks for the great article. I was successful running multiple instances on the same server. However, I have a question about deploying a clustered app across multiple physical servers. How exactly do you do that?

    I have tried:

    createCluster.sh on each
    createInstance 1 1 (srv 1)
    createInstance 2 1 (srv 2)
    deploy on both
    run 1 (on both)

    this seems to create 2 independent clusters.

    if I don’t run createCluster.sh on srv2 then I can’t run createInstance.

    if I mount $CR (located in srv1) to srv2 and do:

    createCluster.sh (srv1)
    createInstance 1 1 (srv 1)
    createInstance 2 1 (srv 2)
    deploy (srv1)
    run 1 (on both)

    srv2$ ./run.sh 1 fails because there is already a pid file for that instance (since they share the same storage).

    What is the right way to do this for multiple servers?

    Thanks a lot!

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.