April 26, 2005

Try/Catch Blues - Error Checking Gone Horribly Wrong

This is my old posting posted at planetsourcecode.com which I am re-posting on my own blog...

It may seem like you are doing good by putting try/catch blocks all throughout your code, but you are probably being redundant. When you declare a try/catch block in Java, you cause the compiler to create a “protected zone” of execution, which the runtime has to track and which adds further processing whenever an exception is actually thrown. You are not making the program more stable by doing this; you may well be making it slower!

If you have a try/catch block which has only one line of code inside the try { … } you are probably being inefficient.

Here is an example of a bad piece of code:

Example 1a:

public Hashtable doEvent(Event event) throws CBException {
    // result is assumed to be a Hashtable declared elsewhere in the class
    Connection conn = null;
    EntityClass entClass = null;
    ArrayList list = null;

    try {
        try {
            conn = dataAdapter.getConnection(DataSource.PRIMARY);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred obtaining connection: " + e.getMessage());
        }

        try {
            entClass = new EntityClass();
            list = entClass.list(conn);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred obtaining data from the Entity: " + e.getMessage());
        }

        try {
            result.put("stuff", list);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred putting data into HashTable: " + e.getMessage());
        }
    }
    catch (Exception e) {
        System.out.println("[SomeClass] An error occurred in the method: " + e.getMessage());
    }
    finally {
        dataAdapter.releaseConnection(conn);
    }

    return result;
}

The above example is a terrible piece of code. It is inherently inefficient, represents poor error handling (despite all the try/catch blocks), and is very ugly to look at, to say the least. For one, wrapping dataAdapter.getConnection(DataSource.PRIMARY) in its own try/catch is pointless: why would you create a separate protected zone just for this call? Worse, when the connection cannot be obtained, the catch block only prints a message and the method carries on with a null connection. This represents poor thinking. Consider the following, much more “cleaned up”, example.

Example 1b:

public Hashtable doEvent(Event event) {
    // result is assumed to be a Hashtable declared elsewhere in the class
    Connection conn = null;
    EntityClass entClass = null;
    ArrayList list = null;

    try {
        conn = dataAdapter.getConnection(DataSource.PRIMARY);
        entClass = new EntityClass();
        list = entClass.list(conn);
        result.put("stuff", list);
    }
    catch (Exception e) {
        System.out.println("[SomeClass] An error occurred in the method: " + e.getMessage());
    }
    finally {
        dataAdapter.releaseConnection(conn);
    }

    return result;
}

The method in Example 1b is just as effective as Example 1a at handling exceptions, and it has less work to do at runtime. Think logically: if, for example, the dataAdapter.getConnection() method fails, the Exception will still be caught by the single catch block. Also, you can generally rely on the ConnectionPoolManager and DataAdapter to pass along an understandable failure message.
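To see that a single try/catch/finally really does cover every statement inside it, here is a minimal, self-contained sketch. The class SingleTryDemo and its getConnection() and list() helpers are hypothetical stand-ins for the dataAdapter and EntityClass calls in Example 1b, not the real API; the log list exists only so the order of events can be inspected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SingleTryDemo {
    // Records what happened, in order, so the control flow can be inspected.
    static final List<String> log = new ArrayList<String>();

    // Hypothetical stand-in for dataAdapter.getConnection(); fails on demand.
    static String getConnection(boolean fail) throws Exception {
        if (fail) {
            throw new Exception("connection pool exhausted");
        }
        return "conn";
    }

    // Hypothetical stand-in for entClass.list(conn).
    static List<String> list(String conn) {
        return Arrays.asList("row1", "row2");
    }

    // One protected zone around the whole unit of work, as in Example 1b.
    static List<String> doEvent(boolean failConnection) {
        List<String> result = new ArrayList<String>();
        String conn = null;
        try {
            conn = getConnection(failConnection);
            result.addAll(list(conn));
        } catch (Exception e) {
            // A failure in any statement above ends up here.
            log.add("caught: " + e.getMessage());
        } finally {
            // Runs whether or not an exception was thrown.
            log.add("released: " + conn);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(doEvent(true));
        System.out.println(doEvent(false));
        System.out.println(log);
    }
}
```

When the connection fails, the remaining statements in the try block are skipped, the one catch block reports the error, and the finally block still releases the (null) connection.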

Let’s take a look at another bad habit:

Example 2a:

try {
    try {
        compEntity.fetch(conn, compToolEvent.getComponentId());
        ArrayList roles = genRolesEntity.list(conn);
        results.put("roles", roles);
        results.put("component", compEntity);
    }
    catch (CBException e) {
        throw new CBException("Error: " + e.toString());
    }
    catch (Exception e) {
        throw new CBException("Error: " + e.toString());
    }

    … Hidden Code Here …
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

First of all, if you don’t see a problem in the above block of code, then you don’t understand Exceptions. The above code shows massive redundancy in the error handling. Let’s say, for example, that compEntity.fetch() throws a CBException. This is what happens at runtime:

1. Line: compEntity.fetch(conn, compToolEvent.getComponentId()) Throws a CBException

2. CBException is CAUGHT by first catch block.

3. A NEW CBException is THROWN.

4. CBException is caught in the Exception catch block at the bottom.

5. A NEW CBException is THROWN.

So what’s the problem, you ask? The big problem is that three CBExceptions are instantiated instead of one! But what does it matter? A lot, actually: every new exception has to be allocated and has to capture a stack trace, so this type of boundary checking is expensive and foolish. Consider this revised code.

Example 2b:

try {
    try {
        compEntity.fetch(conn, compToolEvent.getComponentId());
        ArrayList roles = genRolesEntity.list(conn);
        results.put("roles", roles);
        results.put("component", compEntity);
        return results;
    }
    catch (CBException e) {
        throw e;
    }
    catch (Exception e) {
        throw e;
    }

    … Hidden Code Here …
}
catch (CBException e) {
    throw e;
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

The above example makes much more sense. When we catch a CBException, we just rethrow the existing CBException instance, which is caught again and thrown on to the higher level. In this case we only create one Exception object. This example is still inherently redundant, however, and can be compressed even further into the following (remember the discussion earlier):

Example 2c:

try {
    compEntity.fetch(conn, compToolEvent.getComponentId());
    ArrayList roles = genRolesEntity.list(conn);
    results.put("roles", roles);
    results.put("component", compEntity);
    return results;

    … Hidden Code Here …
}
catch (CBException e) {
    throw e;
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

That’s more like it. It could still be better, but that can wait until a later article.
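To make the cost of Example 2a concrete, here is a minimal, self-contained sketch that counts how many exception objects each style creates. The CBException class below is a hypothetical stand-in with an instance counter added purely for the demonstration, and fetch() is an assumed replacement for compEntity.fetch() that always fails.

```java
// Hypothetical stand-in for the article's CBException; counts created instances.
class CBException extends Exception {
    static int created = 0;

    CBException(String message) {
        super(message);
        created++;
    }
}

class RethrowDemo {
    // Stands in for compEntity.fetch(): always fails with a CBException.
    static void fetch() throws CBException {
        throw new CBException("fetch failed");
    }

    // Example 2a style: every catch block wraps the exception in a new one.
    static void wrapAtEveryLevel() throws CBException {
        try {
            try {
                fetch();
            } catch (CBException e) {
                throw new CBException("Error: " + e.toString());
            } catch (Exception e) {
                throw new CBException("Error: " + e.toString());
            }
        } catch (Exception e) {
            throw new CBException("Error: " + e.toString());
        }
    }

    // Example 2c style: an existing CBException is passed along untouched.
    static void rethrow() throws CBException {
        try {
            fetch();
        } catch (CBException e) {
            throw e;
        } catch (Exception e) {
            throw new CBException("Error: " + e.toString());
        }
    }

    // Runs one of the two styles and reports how many CBExceptions it created.
    static int countCreated(boolean wrap) {
        CBException.created = 0;
        try {
            if (wrap) {
                wrapAtEveryLevel();
            } else {
                rethrow();
            }
        } catch (CBException e) {
            // Expected: both styles end by throwing to the caller.
        }
        return CBException.created;
    }

    public static void main(String[] args) {
        System.out.println("wrapping:   " + countCreated(true));  // prints 3
        System.out.println("rethrowing: " + countCreated(false)); // prints 1
    }
}
```

The wrapping style builds three exception objects for a single failure, while the rethrowing style builds exactly one.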

Final Words

To reiterate, don’t use excessive try/catch blocks. Plan more carefully, and take advantage of stacked catches to encapsulate code rather than creating many protected areas. Performance is very important.

Programming Efficient Code - Loops and Locks

This is my old posting posted at planetsourcecode.com which I am re-posting on my own blog...

One of the worst things that a programmer can assume is that the compiler and middleware will do the optimizations for you! Most applications are being targeted for 50-200 concurrent users, which is why we need to constantly be worrying about the performance of our code.

Say you have a search component (which will be on everybody’s desktop) which takes 3-5 seconds to load, and consequently ties down the database. Imagine what will happen when 200 people try to load this at the same time. Simple math would suggest (say using an average of 4 seconds, with the requests handled one after another): (4 x 200) / 60 ≈ 13 – THAT’S ABOUT 13 MINUTES! And actually, when dealing with situations of high contention, you cannot assume 100% efficiency, and you could realistically be dealing with something in the range of 20-25 minutes of processing time.

There is no ‘silver bullet’ for making fast and efficient code. Middleware will not solve the problem for you, and databases will not solve the problem for you; it is up to you as a computational process engineer (how’s that for a title?) to understand and deal with the underlying inefficiencies in the software you design. There are many things which you need to consider, and you need to think your logic out carefully. One thing you should always be asking yourself is “could this be done better?”.

In programming, we find ourselves in loops a lot. In Java, we especially find ourselves looping through Collection objects an awful lot. This is one of the particular areas where many of us need some improvement. When you use a Collection object, how do you decide what type of Collection to use, and how to apply it? It seems to me that most Java programmers are just using “whatever works”, and they use the one which they “believe” to be the fastest. The fact of the matter is, different types of collections are for different kinds of applications. Do you truly know the differences between Vector and ArrayList? The most common misconception I have heard is that a Vector will automatically grow in size and an ArrayList will not. This is simply not true: both grow as needed. The main practical difference is that Vector is thread-safe and ArrayList is not. And what does this mean?

Being thread-safe is not always a good thing. When something is thread-safe, it means that the runtime must acquire and release locks on certain objects whenever they are accessed, in order to prevent concurrent modification. In many cases, this additional check is unnecessary and very costly to performance. On the other hand, there are situations where it is very necessary to do thread-safe operations. Many people understand the gist of synchronization, but don’t truly understand how to take advantage of it properly. One thing I have seen people doing a lot is applying synchronized in places where they should not. Consider the following:

Example 1A:

synchronized void addUser(User user) {
    this.list.add(user);
}

Another common misconception is that the synchronized keyword will only protect that particular method. If you think this, you should read on. Declaring addUser() as synchronized does ensure that only one Thread at a time can add to list through this method. But to do this, the runtime must acquire the lock (monitor) of the entire class instance, which means that no other synchronized instance method of the same object can execute while addUser() is running. In most cases, this is inefficient. Other threads may need access to other, unaffected items.

The following example addresses this problem.

Example 2A:

void addUser(User user) {
    // Lock only the list, and only for the duration of the add().
    synchronized (this.list) {
        this.list.add(user);
    }
}

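In this example, we only lock the instance of list for the duration of the add() execution. This is much more efficient than Example 1A. Keep in mind that it only protects list if every other piece of code that touches list synchronizes on the same object.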
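Which lock each style actually takes can be checked with Thread.holdsLock(), a standard java.lang.Thread method. The following is a minimal sketch; the class LockScopeDemo, its method names, and its boolean fields are hypothetical and exist only to record what was observed, and String stands in for the User type.

```java
import java.util.ArrayList;
import java.util.List;

class LockScopeDemo {
    private final List<String> list = new ArrayList<String>();

    // Observations recorded while each method runs.
    boolean methodHeldThis;
    boolean blockHeldThis;
    boolean blockHeldList;

    // Example 1A style: the whole method runs while holding the monitor of this.
    synchronized void addUserSynchronizedMethod(String user) {
        methodHeldThis = Thread.holdsLock(this);
        list.add(user);
    }

    // Example 2A style: only the list is locked, and only around the add().
    void addUserSynchronizedBlock(String user) {
        synchronized (list) {
            blockHeldThis = Thread.holdsLock(this);
            blockHeldList = Thread.holdsLock(list);
            list.add(user);
        }
    }

    // Readers must synchronize on the same object for the block style to be safe.
    int size() {
        synchronized (list) {
            return list.size();
        }
    }

    public static void main(String[] args) {
        LockScopeDemo demo = new LockScopeDemo();
        demo.addUserSynchronizedMethod("alice");
        demo.addUserSynchronizedBlock("bob");
        System.out.println("method held this: " + demo.methodHeldThis); // true
        System.out.println("block held this:  " + demo.blockHeldThis);  // false
        System.out.println("block held list:  " + demo.blockHeldList);  // true
    }
}
```

Because the block style never takes the lock on the enclosing object, other synchronized methods of that object remain free to run while a user is being added.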

Now what does this have to do with picking ArrayList or Vector? Well, a lot really. Every method of a Vector is synchronized, so every call pays for a lock. In instances where we are dealing with temporary sets of data or method-scoped instances, using a Vector is very inefficient. In situations where there is no chance of there being concurrent access, you should most certainly choose an ArrayList; using a Vector would serve absolutely no useful purpose, and would only add unnecessary lock-checking. We’ll leave hashed collections for later :).
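As a rough illustration of that rule of thumb, the sketch below uses an ArrayList for data that never leaves the method and a Vector for data that two threads append to at once. The class CollectionChoiceDemo and its method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class CollectionChoiceDemo {
    // Method-scoped data: only the calling thread ever touches this list
    // while it is being filled, so an unsynchronized ArrayList is enough.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<Integer>(n);
        for (int i = 1; i <= n; i++) {
            result.add(Integer.valueOf(i * i));
        }
        return result;
    }

    // Shared data: two threads append to the same collection concurrently,
    // so the synchronized add() of Vector is doing useful work here.
    static int sharedAppend(final int perThread) {
        final Vector<Integer> shared = new Vector<Integer>();
        Runnable writer = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    shared.add(Integer.valueOf(i));
                }
            }
        };
        Thread first = new Thread(writer);
        Thread second = new Thread(writer);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(squares(4));         // prints [1, 4, 9, 16]
        System.out.println(sharedAppend(1000)); // prints 2000
    }
}
```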

Loop Iteration and Tail Recursion

As we said earlier, our programs spend a lot of time in loops, and unfortunately lose a lot of their performance in them as well. I will try to cover a few pointers which may help you, in certain situations, shave some unnecessary computational cycles off your code.

People tend to think from beginning to end, and they tend to program in this forward direction as well. But this can often be inefficient. Sometimes the computer can find its way from the end to the beginning much faster.

Consider the code in Example 1B.

Example 1B:

for (int i = 0; i < arrayList.size(); i++) {
    // Task is a hypothetical element type that has a doSomething() method
    Task task = (Task) arrayList.get(i);
    task.doSomething();
}

This is a fairly straightforward for-loop that iterates through an entire collection to do something. But consider Example 2B:

Example 2B:

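for (int i = arrayList.size(); i != 0; i--) {
    // i counts down from size() to 1, so the element index is i - 1
    // Task is a hypothetical element type that has a doSomething() method
    Task task = (Task) arrayList.get(i - 1);
    task.doSomething();
}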

This example is considerably more efficient than Example 1B. In Example 1B, we make a call to arrayList.size() on every iteration through the loop, which is unnecessary; here size() is called only once. We are also comparing the counter directly against zero to determine whether the loop should continue, which is cheaper than comparing it against another value. By looping backwards through the ArrayList (which is fine whenever the order of processing does not matter) we manage to increase processing efficiency by 50% or more!
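The number of size() calls each loop shape makes can be counted directly. In this minimal sketch, CountingList is a hypothetical wrapper around an ArrayList whose only purpose is to count calls to size(); a third variant is included to show that a forward loop can cache the size as well. Note that the backward loop reads index i - 1, since valid indexes run from 0 to size() - 1.

```java
import java.util.ArrayList;
import java.util.List;

class LoopBoundDemo {
    // Hypothetical wrapper that counts how often size() is called.
    static class CountingList {
        private final List<Integer> items = new ArrayList<Integer>();
        int sizeCalls = 0;

        void add(int value) { items.add(Integer.valueOf(value)); }
        int get(int index) { return items.get(index).intValue(); }
        int size() { sizeCalls++; return items.size(); }
    }

    // Forward loop: size() is evaluated before every iteration.
    static int sumForward(CountingList list) {
        int sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Backward loop: size() is evaluated once, then i is compared with zero.
    static int sumBackward(CountingList list) {
        int sum = 0;
        for (int i = list.size(); i != 0; i--) {
            sum += list.get(i - 1);
        }
        return sum;
    }

    // Forward loop with the size cached in a local variable.
    static int sumForwardCached(CountingList list) {
        int sum = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        CountingList list = new CountingList();
        list.add(10);
        list.add(20);
        list.add(30);
        System.out.println(sumForward(list) + " with " + list.sizeCalls + " size() calls");  // 60 with 4
        list.sizeCalls = 0;
        System.out.println(sumBackward(list) + " with " + list.sizeCalls + " size() calls"); // 60 with 1
    }
}
```

How much time this saves in practice depends on the collection and the JVM, so it is worth measuring before rewriting loops wholesale.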

Another magical method for performing ultra-efficient loops has long since been forgotten. Yes, I am talking about “tail recursion”. This is one of the best ways to do mathematical sums on large lists. It also works brilliantly with Java’s Iterator and Enumeration interfaces. Consider the following example:

Example 1C:

public int getRecordsSum(Iterator iter) {
    // Start the running total at zero.
    return _getRecordsSum(iter, 0);
}

public int _getRecordsSum(Iterator iter, int counter) {
    if (iter.hasNext()) {
        // Tail call: the recursive call is the last thing the method does.
        return _getRecordsSum(iter, counter + ((Integer) iter.next()).intValue());
    }
    else {
        return counter;
    }
}

Now for those of you who are keen, you might be thinking StackOverflowError here. In C and C++, an optimizing compiler will usually see the opportunity: because _getRecordsSum() keeps no local state and the recursive call is the last thing it does, the call can be turned into a jump that reuses the current stack frame. Be careful in Java, though. Neither the Java language nor the JVM is required to eliminate tail calls, and javac does not do it, so on a very large list this method can still run away with the stack. The tail-recursive form is still worth knowing, because it translates mechanically into a loop that carries the same running total.
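Since standard Java compilers and JVMs do not promise to eliminate tail calls, it is worth seeing how a tail-recursive sum maps onto a plain loop with the same accumulator. This is a minimal sketch; the class RecordsSumDemo and its method names are made up for this example, and the generic Iterator<Integer> is used only to avoid the casts.

```java
import java.util.Collections;
import java.util.Iterator;

class RecordsSumDemo {
    // Tail-recursive form: the recursive call is the last action, and the
    // running total travels along as a parameter.
    static int sumRecursive(Iterator<Integer> iter, int total) {
        if (iter.hasNext()) {
            return sumRecursive(iter, total + iter.next().intValue());
        }
        return total;
    }

    // The same computation as a loop: the parameter becomes a local variable
    // and the recursive call becomes the next iteration, so the stack depth
    // stays constant no matter how long the input is.
    static int sumLoop(Iterator<Integer> iter) {
        int total = 0;
        while (iter.hasNext()) {
            total += iter.next().intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Ten copies of the value 5 sum to 50 either way.
        System.out.println(sumRecursive(Collections.nCopies(10, 5).iterator(), 0)); // prints 50
        System.out.println(sumLoop(Collections.nCopies(10, 5).iterator()));         // prints 50

        // A million elements is no problem for the loop form.
        System.out.println(sumLoop(Collections.nCopies(1000000, 1).iterator()));    // prints 1000000
    }
}
```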

Final Words

Programming is all about problem solving. And as with other kinds of problem solving, there are always many different ways to solve the problem. However, some ways are most certainly better than others. You should take the time to understand how the underlying components you are using actually work, and why they work the way they do.