A Grails Memory Leak

I'm in the process of building a large database to load-test our application at work. Index usage and execution plans are quite different with many rows than with a few, so I'm creating millions of rows in tables for domain classes that will see a lot of activity to find missing indexes, inefficient queries, etc.

I used the typical strategy for Hibernate batch processing - flush() and clear() the Session periodically to push the data to the database and free up memory used by domain instances. But even with a 1-gig heap, the process kept running out of memory after several hours, and was running very slowly for the last couple of hours - typically indicative of lots of garbage-collection thrashing.

Profiling the process using YourKit showed that domain instance counts were steadily increasing. Repeatedly running gc() had no effect on these instances, although it clearly ran since memory usage and other object counts dropped (note that calls to gc() can't force garbage collection, it's just a request).

The code and domain classes aren't complicated and didn't have any obvious couplings that could cause instances to not be garbage collected, and I kept ripping out more and more code to try to find what was causing the problem. Finally I got it down to the smallest code block that illustrated the problem:

for (int i = 0; i <100000; i++) {
   new Thing(name: "thing_${i}")
   Thread.sleep(50) // to allow time to watch things in the profiler
}

I ran this in the Grails console and with Hibernate 2nd-level caching turned off, both in a transaction and not (to eliminate the transaction as the culprit). Just creating the instances and not saving them - in this code Hibernate isn't involved at all - triggered the problem.

So I described the issue on the Grails User mailing list and after an interesting discussion Graeme mentioned that it had to do with the validation Errors objects. He suggested setting errors to null after the instances were no longer used, i.e

for (int i = 0; i <100000; i++) {
   Thing thing = new Thing(name: "thing_${i}")
   thing.errors = null
   Thread.sleep(50) // to allow time to watch things in the profiler
}

This should work in a web context, but I was working in the console, so I got a NullPointerException. The problem was clear when I looked at the MetaClass code in DomainClassGrailsPlugin:

static final PROPERTY_INSTANCE_MAP = new org.codehaus.groovy.grails.support.SoftThreadLocalMap()

metaClass.setErrors = { Errors errors ->
   def request = RCH.getRequestAttributes()?.request
   def storage = request ? request : PROPERTY_INSTANCE_MAP.get()
   def key = "org.codehaus.groovy.grails.ERRORS_${delegate.class.name}_${System.identityHashCode(delegate)}"
   storage[key] = errors
}

metaClass.getErrors = {->
   def request = RCH.getRequestAttributes()?.request
   def errors
   def storage = request ? request : PROPERTY_INSTANCE_MAP.get()
   def key = "org.codehaus.groovy.grails.ERRORS_${delegate.class.name}_${System.identityHashCode(delegate)}"
   errors = storage[key]
   if(!errors) {
      errors =  new BeanPropertyBindingResult( delegate, delegate.getClass().getName())
      storage[key] = errors
   }
   errors
}

Instead of storing the Errors in the Request, which in the console doesn't exist, it stores it in a Thread-local Commons Collections ReferenceMap. This map doesn't accept nulls, so instead I had to access it directly, i.e.

DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()

This worked - memory usage was fine and I could see that instances were being garbage-collected while running under the profiler.

But this begs the question - why was I in this mess in the first place? Why are there Errors objects involved if I'm not calling validate() or save(), just creating new instances?

ControllersGrailsPlugin wires up a DataBindingDynamicConstructor that takes a Map to allow creating domain instances like this:

def foo = new Foo(prop1: 123, prop2: 'foo', prop3: new Date())

This dynamic constructor copies the map data using a GrailsDataBinder into the new instance in DataBindingUtils.bindObjectToInstance(), and at the end calls setErrors() passing the BindingResult (which implements Errors) generated from copying the data. These are stored in the Map (or as a Request attribute if running in a web context) and this ended up being the reference that kept the instances from being garbage collection candidates even though they were no longer in scope in the loop.

So, mystery solved. In the end this is all academic for me since I needed to get things working so I rewrote the data population code using JDBC (and Java) and it runs quite a bit faster than the equivalent Grails/Groovy/GORM code. This isn't usually an issue for small web requests but for a process that takes many hours it definitely makes a difference.

7 Responses to “A Grails Memory Leak”

  1. Jack
    September 23rd, 2008 21:58
    1

    Great write up on this, always good to see the solution illustrated so clearly.

  2. Trails
    July 25th, 2009 11:18
    2

    I’m running into a memory leak on my grails app.

    I have a pretty substantial amount of data processing, including domain object instantiation, running inside a quartz job. If things are being persisted indefinitely in thread local, this is probably the cause.

    Very informative, thanks for the post.

  3. Ted Naleid
    September 12th, 2009 15:08
    3

    Thanks Burt, this post was a big help in figuring out what was going on with some batch loading I’m doing through GORM. I wish I had found it a couple of days ago :) .

  4. Ted Naleid » Batch Import Performance with Grails and MySQL
    October 1st, 2009 00:54
    4

    [...] reading through a couple of very helpful posts by Burt Beckwith, and reading through the Hibernate chapter on batch processing, we have all [...]

  5. Mark O'Brien
    November 6th, 2009 08:06
    5

    Hi Burt,

    Where does this piece of code:
    DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()

    need to go…is it related to every domain class access?

    Thanks
    M

  6. Burt
    November 6th, 2009 14:09
    6

    Mark -
    Typically when doing batch imports in Hibernate you flush and clear the session regularly, e.g. every 25 or 100 inserts or so. I put the call to clear PROPERTY_INSTANCE_MAP in there, e.g.:

    for (int i = 0; i < count; i++) {
      new Thing(...).save()
      if (i > 0 && i % BATCH_SIZE == 0) {
        session.flush()
        session.clear()
        DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP.get().clear()
      }
    }

    but you could also call it after every save() since as long as that’s successful you no longer need the associated errors.

  7. Tweets that mention An Army of Solipsists -- Topsy.com
    January 20th, 2010 20:20
    7

    [...] This post was mentioned on Twitter by Ryuichi Ohtaki, ふむふむ. ふむふむ said: RT rohtaki: A Grails Memory Leak http://bit.ly/8Mbpwy なるほど。これでバッチ処理でメモリーリークするわけだ。 [...]

Leave a Reply

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.