Thursday, July 5, 2012

Avoiding WebappClassLoader memory leaks.

Recently, I have been working on memory leak issues in our Integration web application and would like to document my findings.

The issue is that the WebappClassLoader object is not dereferenced when the application context is destroyed, so it cannot be garbage collected. As a result, every class it loaded, along with the objects those classes reference, stays in memory (to be precise, the classes stay in the PermGen space).


Following are interesting links about this issue:
1) Frank Kieviet's blog.
2) A nice tutorial on diagnosing memory leaks.
3) Creating memory leaks in Java.
4) Class loaders in depth.
5) As always, the Java Language Specification is a great guide.

It turns out, the web application had a couple of issues:
1) Static final object reference: A static field is held by its class for as long as the class stays loaded, and making the field final means the object cannot even be made available for garbage collection by "nulling the reference".
Issue: The application sneakily created a reference to this Log4j object through FileWatchdog (instantiated through PropertyConfigurator). Because these object references are kept in a Hashtable (the field "ht") inside the Log4j library's Hierarchy class, restarting the web application didn't clear the reference, as it was also reachable from a thread which was never stopped. More details at #4 below.
Solution: Removed the static keyword from the field, because only one object of the class ever exists anyway. It worked for us; it might not for you. For another solution, take a look at Log4jConfigListener (define it in web.xml before other listeners), which manages Log4j during the life cycle of the web application.

2) There were threads which were started during web application startup but never stopped. This prevented the WebappClassLoader from being garbage collected. The code was not properly architected: the thread even swallowed InterruptedException and continued processing indefinitely.
Solution: Make the thread respond to interrupts and return on InterruptedException.
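
A minimal sketch of a worker that honors interruption, assuming a queue-driven job loop. The class name InterruptibleWorker, the thread name and the helper stopsOnInterrupt are made up for illustration and are not taken from our application.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical worker; names are illustrative, not from the real application.
class InterruptibleWorker implements Runnable {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();

    @Override
    public void run() {
        try {
            // Check the flag on every pass so a pending interrupt ends the loop.
            while (!Thread.currentThread().isInterrupted()) {
                String job = queue.take(); // blocks; throws InterruptedException on interrupt
                process(job);
            }
        } catch (InterruptedException e) {
            // Do not swallow: restore the flag and fall through so run() returns.
            Thread.currentThread().interrupt();
        }
    }

    void submit(String job) {
        queue.add(job);
    }

    private void process(String job) {
        // real work would go here
    }

    // Starts a worker, interrupts it, and reports whether it ended within the timeout.
    static boolean stopsOnInterrupt(long timeoutMillis) {
        Thread t = new Thread(new InterruptibleWorker(), "integration-worker");
        t.start();
        t.interrupt();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```

Whatever owns the thread (for example a context listener) can then call interrupt() on shutdown, and the thread no longer pins the class loader once run() has returned.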

3) A module in the web application fires jobs at regular intervals based on cron expressions. SchedulerFactoryBean.destroy() was never called when the context was destroyed.
Solution: Added destroy-method="destroy" to the bean definition in the Spring application context. Note that this might not work for older versions of Spring, as discussed here. In that case, use the solution discussed there: write your own servlet listener which shuts down the Quartz classes when the app context is destroyed.
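
The shape of the listener approach can be sketched as follows. To keep the example runnable without extra jars, the JDK's ScheduledExecutorService stands in for the Quartz scheduler, and the class SchedulerLifecycle is a hypothetical name. In a real web application the class would implement javax.servlet.ServletContextListener and shut down the Quartz Scheduler instead.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical lifecycle hook; the two methods mirror the callbacks of
// javax.servlet.ServletContextListener without depending on the servlet API.
class SchedulerLifecycle {
    private ScheduledExecutorService scheduler; // stand-in for the Quartz scheduler

    // Counterpart of contextInitialized: start the periodic job.
    void contextInitialized() {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                // fire the job
            }
        }, 0, 100, TimeUnit.MILLISECONDS);
    }

    // Counterpart of contextDestroyed: stop the scheduler thread and wait for it.
    // Returns true if the thread terminated within the timeout.
    boolean contextDestroyed(long timeoutMillis) {
        scheduler.shutdownNow();
        try {
            return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The point is only that every thread started on context initialization has a matching, explicit stop on context destruction.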

4) FileWatchdog thread in Log4j: This is the most serious of all the issues because there is no solution to it yet. The thread is created by PropertyConfigurator.configureAndWatch, which enables dynamic reloading of log4j.properties without restarting the web container (in our case, Tomcat). (Note: every invocation of this method creates a new thread.)
Issue: The code doesn't take interruption into account (InterruptedException is swallowed), so there is no way to shut down this thread.
 From Stack Overflow, some solutions can be gathered, but they won't work unless the VM shuts down. The reason is that there is no way to get to this thread (i.e. PropertyConfigurator does not keep a reference to the thread it creates).
  • Using LogManager.shutdown(): this method shuts down every appender and Category associated with the root, but not the FileWatchdog thread.
  • Defining Log4jConfigListener in web.xml doesn't help, because when the context is destroyed it internally calls LogManager.shutdown() (through Log4jWebConfigurer.shutdownLogging).
Solution:
  • I have to admit that not using dynamic reloading of the Log4j configuration in Tomcat is the easiest!
  • A more hacky way: the FileWatchdog thread repeatedly executes the method checkAndConfigure(). Below is the code snippet:
protected void checkAndConfigure() {
  boolean fileExists;
  try {
    fileExists = file.exists();
  } catch (SecurityException e) {
    LogLog.warn("Was not allowed to read check file existance, file:[" +
        filename + "].");
    interrupted = true; // there is no point in continuing
    return;
  }

  if (fileExists) {
    long l = file.lastModified(); // this can also throw a SecurityException
    if (l > lastModif) {          // however, if we reached this point this
      lastModif = l;              // is very unlikely.
      doOnChange();
      warnedAlready = false;
    }
  } else {
    if (!warnedAlready) {
      LogLog.debug("[" + filename + "] does not exist.");
      warnedAlready = true;
    }
  }
}
 
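If read access to the file were denied just before the context is destroyed (maybe from a custom listener), file.exists() inside checkAndConfigure would throw a SecurityException, the catch block would set the boolean "interrupted", and the loop would end. Note that file.exists() throws SecurityException only when a security manager denies read access to the file, so this trick depends on a security manager being in place; marking the file read-only on the filesystem is unlikely to be enough on its own. Don't forget to reset the file permission on context initialization.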
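
To make the control flow concrete, here is a simplified model of the watchdog loop. It is not the Log4j class: WatchdogModel, the denyRead switch and the helper stopsWhenReadDenied are hypothetical names, and the denied read check is simulated rather than produced by a real security manager.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified model of the watchdog's control flow; NOT the Log4j class itself.
class WatchdogModel extends Thread {
    private final AtomicBoolean denyRead;        // hypothetical switch simulating a denied read check
    private volatile boolean interrupted = false;

    WatchdogModel(AtomicBoolean denyRead) {
        this.denyRead = denyRead;
        setDaemon(true);
    }

    // Stand-in for file.exists(), which may throw SecurityException.
    private boolean fileExists() {
        if (denyRead.get()) {
            throw new SecurityException("read denied");
        }
        return true;
    }

    void checkAndConfigure() {
        try {
            fileExists();
        } catch (SecurityException e) {
            interrupted = true; // the only way out of the loop below
        }
    }

    @Override
    public void run() {
        while (!interrupted) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // swallowed on purpose, to model why interrupt() alone does not end the loop
            }
            checkAndConfigure();
        }
    }

    // Interrupts the watchdog (no lasting effect), then denies the read check
    // and reports whether the thread ended within the timeout.
    static boolean stopsWhenReadDenied(long timeoutMillis) {
        AtomicBoolean denyRead = new AtomicBoolean(false);
        WatchdogModel watchdog = new WatchdogModel(denyRead);
        watchdog.start();
        watchdog.interrupt();
        denyRead.set(true);
        try {
            watchdog.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !watchdog.isAlive();
    }
}
```

In this model, calling interrupt() changes nothing; the thread exits only after a check raises SecurityException and the flag is set.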

We went ahead with the first solution as dynamic reload didn't add much value but hopefully the second solution might help someone...
