April 26, 2005

Try/Catch Blues - Error Checking Gone Horribly Wrong

This is an old post of mine from planetsourcecode.com, which I am re-posting on my own blog...

It may seem like you are doing the right thing by putting try/catch blocks all throughout your code, but you are probably being redundant. When you declare a try/catch block in Java, the compiler creates “protected zones” of execution that the runtime has to track, which creates further processing for the system. You are not making the program more stable by doing this; you are making it slower and harder to read!

If you have a try/catch block which has only one line of code inside the try { … } you are probably being inefficient.

Here is an example of a bad piece of code:

Example 1a:

public Hashtable doEvent(Event event) throws CBException {
    Connection conn = null;
    EntityClass entClass = null;
    ArrayList list = null;
    Hashtable result = new Hashtable();

    try {
        try {
            conn = dataAdapter.getConnection(DataSource.PRIMARY);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred obtaining connection: " + e.getMessage());
        }

        try {
            entClass = new EntityClass();
            list = entClass.list(conn);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred obtaining data from the Entity: " + e.getMessage());
        }

        try {
            result.put("stuff", list);
        }
        catch (Exception e) {
            System.out.println("[SomeClass] An error occurred putting data into Hashtable: " + e.getMessage());
        }
    }
    catch (Exception e) {
        System.out.println("[SomeClass] An error occurred in the method: " + e.getMessage());
    }
    finally {
        dataAdapter.releaseConnection(conn);
    }

    return result;
}

The above example is a terrible piece of code. It is inherently inefficient, represents poor error handling (despite the try/catch blocks), and is very ugly to look at, to say the least. For one, doing a boundary check on dataAdapter.getConnection(DataSource.PRIMARY) is pointless. Why would you create a unique boundary for this? It also keeps going after a failure: if the connection cannot be obtained, the method still tries to use it. This represents poor thinking. Consider the following example of much more “cleaned up” code.

Example 1b:

public Hashtable doEvent(Event event) {
    Connection conn = null;
    EntityClass entClass = null;
    ArrayList list = null;
    Hashtable result = new Hashtable();

    try {
        conn = dataAdapter.getConnection(DataSource.PRIMARY);
        entClass = new EntityClass();
        list = entClass.list(conn);
        result.put("stuff", list);
    }
    catch (Exception e) {
        System.out.println("[SomeClass] An error occurred in the method: " + e.getMessage());
    }
    finally {
        dataAdapter.releaseConnection(conn);
    }

    return result;
}

The method in Example 1b is just as effective as Example 1a at handling exceptions, and its execution time is much faster. Think logically: if, for example, the dataAdapter.getConnection() method fails, the Exception will still be caught by the single catch block. Also, you can generally rely on the ConnectionPoolManager and DataAdapter to pass along an understandable failure message.
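To make that point concrete, here is a minimal sketch of a single try/catch/finally that still reports which step failed. It is only an illustration under assumptions: SingleCatchDemo, loadRows() and the failure message are hypothetical stand-ins, not the real dataAdapter or EntityClass code.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

// Hypothetical stand-ins for the data layer in Example 1b; none of these
// names come from a real library.
class SingleCatchDemo {

    // Simulates a data-access step that fails with a descriptive message.
    static List<String> loadRows(boolean connectionOk) throws Exception {
        if (!connectionOk) {
            throw new Exception("connection pool exhausted");
        }
        List<String> rows = new ArrayList<String>();
        rows.add("row-1");
        return rows;
    }

    // One try/catch/finally around the whole unit of work, as in Example 1b.
    static Hashtable<String, Object> doEvent(boolean connectionOk, StringBuilder log) {
        Hashtable<String, Object> result = new Hashtable<String, Object>();
        try {
            List<String> rows = loadRows(connectionOk);
            result.put("stuff", rows);
        } catch (Exception e) {
            // The message from the failing step is still available here.
            log.append("[SomeClass] An error occurred in the method: ").append(e.getMessage());
        } finally {
            // Stands in for releasing the connection.
            log.append(" [released]");
        }
        return result;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Hashtable<String, Object> result = doEvent(false, log);
        System.out.println(log);
        System.out.println("keys: " + result.keySet());
    }
}
```

The single catch block sees the message raised by the failing step, and the finally block runs whether or not the step succeeded.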

Let’s take a look at another bad habit:

Example 2a:

try {
    try {
        compEntity.fetch(conn, compToolEvent.getComponentId());
        ArrayList roles = genRolesEntity.list(conn);
        results.put("roles", roles);
        results.put("component", compEntity);
    }
    catch (CBException e) {
        throw new CBException("Error: " + e.toString());
    }
    catch (Exception e) {
        throw new CBException("Error: " + e.toString());
    }

    … Hidden Code Here …
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

First of all, if you don’t see a problem in the above block of code then you don’t understand Exceptions. The above code shows massive redundancy in the error handling. Let’s say, for example, that compEntity.fetch() throws a CBException. This is what happens at runtime:

1. Line: compEntity.fetch(conn, compToolEvent.getComponentId()) Throws a CBException

2. CBException is CAUGHT by first catch block.

3. A NEW CBException is THROWN.

4. CBException is caught in the Exception catch block at the bottom.

5. A NEW CBException is THROWN.

So what’s the problem, you ask? The big problem is that three CBExceptions are instantiated instead of one! But what does it matter? A lot, actually: this type of boundary checking is expensive and foolish. Consider this revised code.

Example 2b:

try {
    try {
        compEntity.fetch(conn, compToolEvent.getComponentId());
        ArrayList roles = genRolesEntity.list(conn);
        results.put("roles", roles);
        results.put("component", compEntity);
    }
    catch (CBException e) {
        throw e;
    }
    catch (Exception e) {
        throw e;
    }

    … Hidden Code Here …

    // Return last, so the hidden code above stays reachable
    return results;
}
catch (CBException e) {
    throw e;
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

The above example makes much more sense. When we catch a CBException, we just throw the existing CBException instance, which is caught again and thrown to the higher level. In this case we only create one Exception object. This example is still inherently redundant, however, and can be compressed further into the following example (remembering the discussion earlier):

Example 2c:

try {
    compEntity.fetch(conn, compToolEvent.getComponentId());
    ArrayList roles = genRolesEntity.list(conn);
    results.put("roles", roles);
    results.put("component", compEntity);

    … Hidden Code Here …

    // Return last, so the hidden code above stays reachable
    return results;
}
catch (CBException e) {
    throw e;
}
catch (Exception e) {
    throw new CBException("Error: " + e.toString());
}

That’s more like it. It could still be better, but that can wait until a later article.
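To check the “three instead of one” count from Examples 2a and 2c, here is a small self-contained sketch. The CBException below is a hypothetical stand-in that counts its own instances (the real class is not shown in this post), and fetch() simply fails on purpose.

```java
class RethrowDemo {
    static int created = 0;

    // Hypothetical stand-in for the article's CBException; counts instances.
    static class CBException extends Exception {
        CBException(String message) {
            super(message);
            created++;
        }
    }

    // Always fails, like compEntity.fetch() in the walkthrough above.
    static void fetch() throws CBException {
        throw new CBException("fetch failed");
    }

    // Example 2a style: every level wraps the failure in a new CBException.
    static void wrapEverywhere() throws CBException {
        try {
            try {
                fetch();
            } catch (CBException e) {
                throw new CBException("Error: " + e.toString());
            }
        } catch (Exception e) {
            throw new CBException("Error: " + e.toString());
        }
    }

    // Example 2c style: an existing CBException is passed up unchanged.
    static void rethrowExisting() throws CBException {
        try {
            fetch();
        } catch (CBException e) {
            throw e;
        } catch (Exception e) {
            throw new CBException("Error: " + e.toString());
        }
    }

    // Runs one variant and reports how many CBException objects it created.
    static int countCreated(boolean wrap) {
        created = 0;
        try {
            if (wrap) {
                wrapEverywhere();
            } else {
                rethrowExisting();
            }
        } catch (CBException e) {
            // Expected: only the instance count matters here.
        }
        return created;
    }

    public static void main(String[] args) {
        System.out.println("wrapping at every level: " + countCreated(true));
        System.out.println("rethrowing the original: " + countCreated(false));
    }
}
```

Wrapping at every level creates three exception objects for one failure, while rethrowing the original creates one.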

Final Words

To reiterate, don’t use excessive try/catch blocks. Plan more carefully, and take advantage of stacked catches to encapsulate code rather than creating many protected areas. Performance is very important.

Programming Efficient Code - Loops and Locks

This is an old post of mine from planetsourcecode.com, which I am re-posting on my own blog...

One of the worst things that a programmer can assume is that the compiler and middleware will do the optimizations for you! Most applications are being targeted for 50-200 concurrent users, which is why we need to constantly be worrying about the performance of our code.

Say you have a search component (which will be on everybody’s desktop) that takes 3-5 seconds to load and consequently ties down the database. Imagine what will happen when 200 people try to load this at the same time. Simple math would suggest (say using an average of 4 seconds): (4 x 200) / 60 ≈ 13 – THAT’S 13 MINUTES! And actually, when dealing with situations of high contention, you cannot assume 100% efficiency, and you could realistically be dealing with something in the range of 20-25 minutes of processing time.

There is no ‘silver bullet’ to making fast and efficient code. Middleware will not solve the problem for you, databases will not solve the problem for you, it is up to you as a computational process engineer (how’s that for a title?) to understand and deal with the underlying inefficiencies in the software you design. There are many things which you need to consider, and you need to think your logic out carefully. One thing you should always be asking yourself is “could this be done better?”.

In programming, we find ourselves in loops a lot. In Java, we especially find ourselves looping through Collection objects an awful lot. This is one of the particular areas where many of us need some improvement. When you use a Collection object, how do you decide what type of Collection to use, and how to apply it? It seems to me that most Java programmers are just using “whatever works”, and they use the one which they “believe” to be the fastest. The fact of the matter is, different types of collections are for different kinds of applications. Do you truly know the differences between Vector and ArrayList? The most common misconception I have heard is that a Vector will automatically grow in size and an ArrayList will not. This is simply not true. The main practical difference is that Vector is thread-safe and ArrayList is not. And what does this mean?

Being thread-safe is not always a good thing. When something is thread-safe, it means that the runtime must maintain locks on certain objects while they are being accessed, to prevent concurrent modification. In many cases, this additional check is unnecessary and very costly to performance. On the other hand, there are situations where it is very necessary to do thread-safe operations. Many people understand the gist of synchronization, but don’t truly understand how to take advantage of it properly. One thing I have seen people doing a lot is applying synchronized in places where they should not. Consider the following:

Example 1A:

synchronized void addUser(User user) {
    this.list.add(user);
}

Another common misconception is that the synchronized keyword only protects that particular method. If you think this, you should read on. Declaring the method synchronized does effectively allow only one thread at a time to reach list through addUser(). But by doing this, you force the runtime to place a lock on the entire class instance, which essentially means that no other synchronized instance method of that object can be executed during the execution of addUser(). In most cases, this is inefficient. Other threads may need access to other, unaffected items.

The following example addresses this problem.

Example 2A:

void addUser(User user) {
    synchronized (this.list) {
        this.list.add(user);
    }
}

In this example, we only lock the instance of list for the duration of the add() execution. This is much more efficient than Example 1A.
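One way to see which lock each version takes is the standard Thread.holdsLock() call. The sketch below is only an illustration: LockScopeDemo, its two method names, and the use of String in place of User are assumptions made to keep it self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class LockScopeDemo {
    private final List<String> list = new ArrayList<String>();

    // Example 1A style: the monitor of the whole instance ("this") is held.
    synchronized boolean[] addUserMethodLock(String user) {
        list.add(user);
        return new boolean[] { Thread.holdsLock(this), Thread.holdsLock(list) };
    }

    // Example 2A style: only the monitor of the list is held.
    boolean[] addUserBlockLock(String user) {
        synchronized (list) {
            list.add(user);
            return new boolean[] { Thread.holdsLock(this), Thread.holdsLock(list) };
        }
    }

    public static void main(String[] args) {
        LockScopeDemo demo = new LockScopeDemo();
        boolean[] a = demo.addUserMethodLock("alice");
        boolean[] b = demo.addUserBlockLock("bob");
        System.out.println("synchronized method: this=" + a[0] + ", list=" + a[1]);
        System.out.println("synchronized block:  this=" + b[0] + ", list=" + b[1]);
    }
}
```

The synchronized method holds the lock on the instance itself, while the synchronized block holds only the lock on the list, leaving the rest of the object free for other threads.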

Now what does this have to do with picking ArrayList or Vector? Well, a lot really. In instances where we are dealing with temporary sets of data or method-scoped instances, using a Vector is very inefficient. In situations where there is no chance of there being concurrent access, you should most certainly choose an ArrayList. Using a Vector would serve absolutely no useful purpose, and would add unnecessary lock-checking. We’ll leave hashed collections for later :).
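As a quick sanity check on the growth misconception mentioned earlier, here is a minimal sketch that fills both collection types far past their initial capacity. GrowthDemo and fill() are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class GrowthDemo {
    // Adds count elements to the given list and returns its final size.
    static int fill(List<Integer> list, int count) {
        for (int i = 0; i < count; i++) {
            list.add(i);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Both collections start with room for 2 elements and grow on demand.
        System.out.println("ArrayList size: " + fill(new ArrayList<Integer>(2), 1000));
        System.out.println("Vector size:    " + fill(new Vector<Integer>(2), 1000));
    }
}
```

Both grow without any help from the caller, so for a method-scoped list the choice comes down to whether you need the locking.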

Loop Iteration and Tail Recursion

As we said earlier, our programs spend a lot of time in loops, and unfortunately lose a lot of their performance in them as well. I will try to cover a few pointers which may help you, in certain situations, shave some unnecessary computational cycles off your code.

People tend to think from beginning to end, and they tend to program in this forward direction as well. But this can often be inefficient. Sometimes the computer can find its way from the end to the beginning much faster.

Consider the code in Example 1B.

Example 1B:

for (int i = 0; i < arrayList.size(); i++) {
    // Task is a placeholder for whatever type the list holds
    Task obj = (Task) arrayList.get(i);
    obj.doSomething();
}

This is a fairly straightforward for-loop that iterates through an entire collection to do something. But consider Example 2B:

Example 2B:

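for (int i = arrayList.size(); i != 0; i--) {
    // Valid indexes run from size() - 1 down to 0, hence i - 1.
    // Task is a placeholder for whatever type the list holds.
    Task obj = (Task) arrayList.get(i - 1);
    obj.doSomething();
}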

This example is many times more efficient than Example 1B. In Example 1B, we make a call to arrayList.size() on every iteration through the loop, which is unnecessary. In Example 2B we call it only once, and the loop test is a direct comparison against zero, which is also more efficient. By looping backwards through the ArrayList we manage to increase processing efficiency by 50% or more!
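If you want to try both directions yourself, here is a small runnable sketch. LoopDemo and its method names are hypothetical, and it makes no timing claims; it only shows that both loops read size() once and visit every element. Note that a backward loop has to start at size() - 1, since size() itself is not a valid index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopDemo {
    // Forward loop with the size read once instead of on every iteration.
    static List<String> forward(List<String> items) {
        List<String> visited = new ArrayList<String>();
        for (int i = 0, n = items.size(); i < n; i++) {
            visited.add(items.get(i));
        }
        return visited;
    }

    // Backward loop: valid indexes run from size() - 1 down to 0.
    static List<String> backward(List<String> items) {
        List<String> visited = new ArrayList<String>();
        for (int i = items.size() - 1; i >= 0; i--) {
            visited.add(items.get(i));
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println("forward:  " + forward(items));
        System.out.println("backward: " + backward(items));
    }
}
```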

Another magical method for writing ultra-efficient loops has long since been forgotten. Yes, I am talking about “tail recursion”. It is an elegant way to do mathematical sums over lists, and it works nicely with Java’s Iterator and Enumeration interfaces. Consider the following example:

Example 1C:

public int getRecordsSum(Iterator iter) {
    // Start the running total at 0 so the result is the plain sum
    return _getRecordsSum(iter, 0);
}

public int _getRecordsSum(Iterator iter, int counter) {
    if (iter.hasNext()) {
        return _getRecordsSum(iter, counter + ((Integer) iter.next()).intValue());
    }
    else {
        return counter;
    }
}

Now, those of you who are keen might be thinking StackOverflowError here. C and C++ compilers will often see the optimization opportunity and turn the tail call into a loop, because _getRecordsSum() keeps no local variables and simply passes its arguments back into itself. Be aware, though, that the standard Java compiler and the HotSpot JVM do not perform this optimization, so in Java each call still takes a stack frame. For modest lists this is a tidy way of processing numbers; for very long ones, an ordinary loop is the safe choice.
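Whether or not a given runtime turns the tail call into a loop, the explicit loop form is guaranteed to use constant stack space. Here is a minimal sketch of both forms side by side; SumDemo and its method names are hypothetical, and the accumulator starts at 0 so the result is the plain sum.

```java
import java.util.Arrays;
import java.util.Iterator;

class SumDemo {
    // Tail-recursive form: the running total is carried in an argument.
    static int sumRecursive(Iterator<Integer> iter, int total) {
        if (iter.hasNext()) {
            return sumRecursive(iter, total + iter.next());
        }
        return total;
    }

    // Loop form: same result, constant stack depth.
    static int sumLoop(Iterator<Integer> iter) {
        int total = 0;
        while (iter.hasNext()) {
            total += iter.next();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumRecursive(Arrays.asList(1, 2, 3, 4).iterator(), 0));
        System.out.println(sumLoop(Arrays.asList(1, 2, 3, 4).iterator()));
    }
}
```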

Final Words

Programming is all about problem solving. And as with other kinds of problem solving, there are always many different ways to solve the problem. However, some ways are most certainly better than others. You should take the time to understand how the underlying components you are using actually work, and why they work the way they do.

March 16, 2005

EVDO Technology

We are working on a J2EE web application for a customer that offers high-speed internet connections using EVDO technology. This is just another retail website, but the new thing is the technology associated with the business.

The EVDO system allows cellular service providers to provide broadband high-speed data services to their customers. The EVDO is an "always-on" system that allows users to browse the Internet without complicated dialup connections. EVDO provides wireless data connections that are 10 times as fast as a regular modem.

With the announcement of EVDO deployment, OEM IDs (laptop manufacturers) are looking to embed the EVDO/1X chipset into their devices and provide customers a wide area wireless service.

February 18, 2005

Presentation of table with alternate row background colors (works with Java, .NET, Ruby, everything else too)

Yesterday a team member asked me (after spending two hours on it) for the logic to present a table with alternating row background colors. I felt that it may be simple for people who know it and difficult for others, hence I am blogging about the solution.

This solution works with Java, .NET, Ruby, and every other language. I’ll use an HTML example for simplicity and focus on the solution rather than technology syntax.

First, declare two style classes (or something you need to set the font/background/etc that needs to be altered):

<style>

.rowstyle1 {background-color:white;}

.rowstyle0 {background-color:lightyellow;}

</style>

Second, declare a numeric (int) variable to count (running number of) the table rows:

int rowNumber = 0;

Third, increment the running number before rendering the table row:

rowNumber++;

Fourth, use the running number to assign the style class to each row:

class='rowstyle<%=(rowNumber % 2)%>'
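Put together in plain Java, the four steps might look like the sketch below. RowStyleDemo and renderRows() are hypothetical names, and the cell markup is only an assumption about what a row contains.

```java
class RowStyleDemo {
    // Builds table rows whose class alternates between rowstyle1 and rowstyle0.
    static String renderRows(String[] values) {
        StringBuilder html = new StringBuilder();
        int rowNumber = 0;
        for (String value : values) {
            rowNumber++; // increment before rendering the row
            html.append("<tr class='rowstyle").append(rowNumber % 2).append("'><td>")
                .append(value).append("</td></tr>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderRows(new String[] { "first", "second", "third" }));
    }
}
```

Odd-numbered rows get rowstyle1 and even-numbered rows get rowstyle0, so the two styles declared above alternate.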

Post me if you need help with logic building in any situation. I’ll try to help.

January 26, 2005

Using NDM (Network Data Mover)

This is an old post of mine from planetsourcecode.com, which I am re-posting on my own blog...

Recently, I got an assignment to create a module for archiving transaction files. The files range from 1KB to 100GB. The system gets about a million such files every day, for transactions generated for the Asia Pacific region by Citibank customers all over the world. The destination can be any type of system (Windows, Unix, mainframe, etc.).

The good thing for us is that these files can be identified by their source and time zone. The best part is that the 100GB files come from the US; they are consolidated files, at most 12 a day. I need to focus on the Japan firm banking system, since it has many 1KB files that all need to be processed immediately.

Unfortunately, there is no Java API to do it and we cannot use third-party tools. I have written such solutions in Unix shell scripts before, so the concept is clear.

Solution:

1. Java component to poll the incoming folders round-robin (one folder for each destination) to look for files. This component will manage threads so as to use Citibank’s existing NDM licenses (configurable in an XML file).

2. Java thread component to create threads using the priority set in configuration for each destination folder.

3. Java component to write the Unix shell script and execute it using JNI and track the Unix process.
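As a rough sketch of the polling part of step 1 only, one round-robin pass over the incoming folders could look like this. FolderPollerDemo and pollOnce() are hypothetical names; thread management, priorities, and the actual NDM invocation from steps 2 and 3 are deliberately left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class FolderPollerDemo {
    // One round-robin pass: visit each incoming folder in turn and collect
    // the regular files waiting in it.
    static List<File> pollOnce(List<File> incomingFolders) {
        List<File> pending = new ArrayList<File>();
        for (File folder : incomingFolders) {
            File[] files = folder.listFiles();
            if (files == null) {
                continue; // not a directory, or not readable
            }
            for (File file : files) {
                if (file.isFile()) {
                    pending.add(file);
                }
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Folder paths are taken from the command line for this sketch.
        List<File> folders = new ArrayList<File>();
        for (String path : args) {
            folders.add(new File(path));
        }
        for (File file : pollOnce(folders)) {
            System.out.println("pending: " + file.getPath());
        }
    }
}
```

A real poller would repeat this pass on a schedule and hand each pending file to a worker thread for its destination.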