Tuesday, June 14, 2011

Fixing errors you cannot see and how I tried {to} catch (them){}

If you follow me on Twitter (@ctoisrael), you will know I was bug fixing today.  My system had a bug, and the bug produced a tremendous amount of output.  I can only presume that whatever went wrong produced some sort of uncaught exception.  Uncaught exceptions are not written to the log (by default their stack traces go to System.err), which is not usually a huge problem.  I run my system using multiple "windows" in GNU screen.  If you use the Linux command line a lot, particularly remotely, it is one of the most useful tools out there.  I will save screen for another post.  I actually have it set to keep a large number of scrollback lines, but today even 10000 lines was not enough.  There was just too much output.  Anything written to System.out or System.err was gone for good.  The log was not revealing.  I could see from a malformed log line that there must have been an error.  It didn't look right, but I couldn't tell where it came from.

Some of the most critical errors in Java are NullPointerExceptions.  They are unchecked exceptions, so neither the compiler nor the IDE forces you to surround code with try and catch blocks.  So I am guessing that somewhere outside of my scrollback was a NullPointerException or ConcurrentModificationException that was not caught and, though it did not crash the whole system, resulted in some malformed data that produced a glitch that got me into a big mess, thankfully not a costly one.  The symptoms were resolved and no long-term damage was done.  I am still no further on with the root issue.

I cannot find the root issue.  I checked the code where it could have happened, but I cannot see anything wrong.  I have been through the log; I can see the result but still not the cause.  The next time this happens I must be prepared.  In order to do this I have to catch all possible exceptions.  This, it turns out, is not as easy as you might think.  Ideally I need to catch all exceptions, handle them if necessary and write them to the log.

I have looked into this in the past.  I have found two suggested solutions:

  1. Surround everything with try/catch blocks, catching Exception, so that all unhandled / uncaught exceptions will be caught and then handled.
  2. Create a class that implements Thread.UncaughtExceptionHandler and add it to the Thread.

The try/catch method is a simple solution and a flexible one.  One must remember that every thread needs to have a try/catch block around its code.  My suggestion would be to put it in the run method.

public void run() {
  try {
    // all code to run the thread
  } catch (Exception e) {
    // handle the exception, e.g. write it to the log
  }
}


This can be a huge pain to implement if your existing code creates threads in many different places.  Additionally, I find that too many try/catch blocks look ugly and make the code longer than it should be.  The one advantage is that it gives the coder control over each potential exception in every place that it could occur.  Clever use of more specific exceptions, which would otherwise go uncaught, allows handling them in different ways.
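To make the point about more specific exceptions concrete, here is a minimal sketch.  The class name GuardedWorker, the events list and the sample task are all made up for illustration; the only thing taken from the approach above is the try/catch inside run, with the most specific catch clause first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical worker: the name, the events list and the task are illustrative only. */
final class GuardedWorker implements Runnable {
    // Records what happened so the outcome can be inspected afterwards.
    static final List<String> events =
            Collections.synchronizedList(new ArrayList<String>());

    private final Runnable task;

    GuardedWorker(Runnable task) {
        this.task = task;
    }

    @Override
    public void run() {
        try {
            task.run();
            events.add("finished");
        } catch (NullPointerException e) {
            // Most specific first: a likely programming error gets its own handling.
            events.add("null pointer");
        } catch (RuntimeException e) {
            // Every other unchecked exception ends up in this broader clause.
            events.add("runtime: " + e.getMessage());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new GuardedWorker(() -> {
            throw new NullPointerException("simulated");
        }));
        t.start();
        t.join();
        System.out.println(events);
    }
}
```

In real code the catch clauses would write to the log rather than to a list; the list is only there so the behaviour is easy to check.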

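Despite this, the second method is far better.  I created an abstract class:

import org.apache.log4j.Logger;

public abstract class Log4JUncaughtExceptionHandler implements Thread.UncaughtExceptionHandler {
  // protected (not private) so that subclasses can write to the same logger
  protected static final Logger log = Logger.getLogger("log");
}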

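Because it is abstract, any concrete subclass must implement the method uncaughtException(Thread t, Throwable ex).  Note that the second parameter is a Throwable, so Errors reach the handler as well as Exceptions.
I created a subclass: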

public class Log4JCatchAllExceptions extends Log4JUncaughtExceptionHandler {

  @Override
  public void uncaughtException(Thread t, Throwable ex) {
    // log the name of the thread that died together with the stack trace
    log.error(t.getName(), ex);
  }
}

I used them as follows.  In any class with a main method I use the class to catch all exceptions with the line

Thread.setDefaultUncaughtExceptionHandler(new Log4JCatchAllExceptions());

This only needs to go in once for the entire program.  All uncaught exceptions will be caught and handled by the Log4JCatchAllExceptions class.  It's pretty neat: there is no need to add lots of try/catch blocks to the code, and it has pretty much the same effect.  What you don't get is the possibility of handling exceptions differently in different cases.  The reason is that the handler is a static property of the Thread class, so it applies to every thread that does not have a handler of its own.  This can be overridden for individual thread instances by using:

Thread worker = new Thread();
worker.setUncaughtExceptionHandler(new Log4JUncaughtExceptionHandler() {
  @Override
  public void uncaughtException(Thread t, Throwable ex) {
    log.error(t.getName(), ex);
  }
});
worker.start();

This will set the handler of the specific thread to the anonymous class that is declared here.  So any time there needs to be specific handling of an error, it can be done like this.  Here I have just written to the log in the same way that I did with the default class.  But in my real code I restart threads, so that if they die thanks to a null pointer exception they will respawn, reducing the effects of the error while recording it at the same time.
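The respawning idea can be sketched roughly like this.  It is not my production code: RespawningHandler, maxRestarts and the flaky demo task are hypothetical names invented for the example, and System.err stands in for the log4j call so the sketch runs without any extra libraries.  The restart cap is an assumption I would recommend, so that a task which always fails cannot respawn forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: restarts a task on a fresh thread when its thread dies. */
final class RespawningHandler implements Thread.UncaughtExceptionHandler {
    private final Runnable task;
    private final AtomicInteger restartsLeft;

    RespawningHandler(Runnable task, int maxRestarts) {
        this.task = task;
        this.restartsLeft = new AtomicInteger(maxRestarts);
    }

    /** Creates a new thread for the task with this handler attached. */
    Thread newThread() {
        Thread t = new Thread(task);
        t.setUncaughtExceptionHandler(this);
        return t;
    }

    @Override
    public void uncaughtException(Thread dead, Throwable ex) {
        // Record the failure first (the real handler would call log.error here).
        System.err.println(dead.getName() + " died: " + ex);
        // Respawn only while the restart budget lasts.
        if (restartsLeft.getAndDecrement() > 0) {
            newThread().start();
        }
    }

    /** Runs a task that fails twice and then succeeds; returns how often it ran. */
    static int demo() {
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        Runnable flaky = () -> {
            if (runs.incrementAndGet() < 3) {
                throw new IllegalStateException("simulated failure");
            }
            done.countDown();
        };
        new RespawningHandler(flaky, 5).newThread().start();
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("task ran " + demo() + " times");
    }
}
```

Note that uncaughtException runs on the dying thread itself, so the replacement has to be a brand new Thread object; a thread that has terminated cannot be started again.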
And don't forget that there are times when going back to regular try/catch blocks will fill in any holes that are not handled by these two classes.
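One such hole, offered as an example rather than as the case I actually ran into: a task handed to ExecutorService.submit never reaches an uncaught exception handler, because the executor captures the exception and stores it in the returned Future.  The sketch below assumes Java 8 or later, and the class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical demo: exceptions in submitted tasks bypass the default handler. */
final class SubmitHoleDemo {

    /** Returns true if the default handler saw the failure of a submitted task. */
    static boolean handlerSawSubmittedFailure() {
        AtomicBoolean handlerCalled = new AtomicBoolean(false);
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((t, ex) -> handlerCalled.set(true));

        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable failing = () -> {
                throw new IllegalStateException("simulated failure");
            };
            Future<?> result = pool.submit(failing);
            try {
                // The failure only surfaces here, wrapped in an ExecutionException.
                result.get();
            } catch (ExecutionException e) {
                // A regular try/catch is the place to log it, via e.getCause().
                System.err.println("task failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            pool.shutdown();
            // Put back whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return handlerCalled.get();
    }

    public static void main(String[] args) {
        System.out.println("default handler called: " + handlerSawSubmittedFailure());
    }
}
```

If nobody ever calls get() on the Future, the exception disappears silently, which is exactly the kind of invisible error this post is about.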
