Monday 17 March 2008

Do you know Java Shutdown Hook?

JVMs are killed and restarted either automatically by network management systems or manually by a network administrator or an infrastructure management team. When a JVM is killed, it is often necessary to perform some cleanup work before the JVM finishes shutting down. How can we achieve such behaviour?

JVM shutdown hooks provide a clean and simple mechanism for registering application-specific behavior that performs cleanup work when a JVM terminates or is killed.

We first need to define the shutdown hook, which is a thread that the JVM starts automatically when it terminates. So whatever action we want to take as part of the shutdown process goes in the run method of that thread.


public class ShutdownHook extends Thread {
    public ShutdownHook() {
        super();
    }

    @Override
    public void run() {
        System.out.println("ShutdownHook started... do whatever you want here !!!");
    }
}

Once you have defined the shutdown hook, what's next? How will it be invoked?
We need to register an instance of this hook with the JVM ...

public class HookDemo {
    public HookDemo() {
        ShutdownHook sh = new ShutdownHook();
        // register the hook like this ...
        Runtime.getRuntime().addShutdownHook(sh);
    }

    public static void main(String[] args) {
        HookDemo obj = new HookDemo();
        // let us terminate the program now...
        System.exit(1);
        // what output do you expect?
    }
}

Output : ShutdownHook started... do whatever you want here !!!

Let us move to a more complex example now... Let us assume I have a JDBC manager in my project, and I want to make sure that whenever the application shuts down, the JDBCManager#cleanup() method is always executed, so that the manager is shut down properly.
public class JDBCManager {
    public JDBCManager() {
        ShutdownHook sh = new ShutdownHook(this);
        // register the hook like this ...
        Runtime.getRuntime().addShutdownHook(sh);
    }

    public void cleanup() {
        System.out.println(" Hold on ... Let me do the cleanup please .... ");
    }

    public static void main(String[] args) {
        // creating the manager registers its shutdown hook
        JDBCManager obj = new JDBCManager();
        // let us terminate the program now...
        System.exit(1);
        // what output do you expect?
    }
}
What we did is create the shutdown hook inside the manager itself and pass the manager instance to the hook... why would we do that?
In plain language: all the shutdown-related actions have to happen in the hook#run() method, so it is essential that the hook has the references it needs to take those actions.

public class ShutdownHook extends Thread {
    private JDBCManager manager;

    public ShutdownHook(JDBCManager manager) {
        this.manager = manager;
    }

    @Override
    public void run() {
        System.out.println("ShutdownHook started ...");
        // let us invoke the cleanup operation here
        manager.cleanup();
    }
}


The following Q&A extract from Sun java documentation addresses some of the design issues of the Shutdown Hooks API

Isn't this what runFinalizersOnExit is for?

You can use the Runtime.runFinalizersOnExit method, or the equivalent method in the System class, to schedule actions to take place when the VM shuts down due to exit. This technique does not, however, work for termination-triggered shutdowns. It is also inherently unsafe, and in fact these methods were deprecated in version 1.2 of the Java 2 Platform.

Why don't you provide information as to why the VM is shutting down?

On some platforms a native process can't distinguish a shutdown due to exit from a shutdown due to termination. Other platforms provide much richer capabilities, in some cases including notification of system suspension and restart or of imminent power failure. In short, it's impossible to generalize such information in a portable way.

Will shutdown hooks be run if the VM crashes?

If the VM crashes due to an error in native code then no guarantee can be made about whether or not the hooks will be run.

Why are shutdown hooks run concurrently? Wouldn't it make more sense to run them in reverse order of registration?

Invoking shutdown hooks in their reverse order of registration is certainly intuitive, and is in fact how the C runtime library's atexit procedure works. This technique really only makes sense, however, in a single-threaded system. In a multi-threaded system such as the Java platform the order in which hooks are registered is in general undetermined and therefore implies nothing about which hooks ought to be run before which other hooks. Invoking hooks in any particular sequential order also increases the possibility of deadlocks. Note that if a particular subsystem needs to invoke shutdown actions in a particular order then it is free to synchronize them internally.
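
One way a subsystem could order its own shutdown actions is to register a single hook that runs its cleanup steps sequentially. The sketch below is only an illustration: OrderedCleanup and its register method are hypothetical names, not part of the JDK, and the printed messages are placeholders.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not part of the JDK): one shutdown hook that runs
// its cleanup steps one after another, most recently registered first.
class OrderedCleanup extends Thread {
    private final Deque<Runnable> steps = new ArrayDeque<>();

    // Later registrations run earlier, similar to C's atexit.
    public synchronized void register(Runnable step) {
        steps.push(step);
    }

    @Override
    public synchronized void run() {
        while (!steps.isEmpty()) {
            Runnable step = steps.pop();
            try {
                step.run();
            } catch (RuntimeException e) {
                // Keep going so one failing step does not skip the rest.
                System.err.println("Cleanup step failed: " + e);
            }
        }
    }

    public static void main(String[] args) {
        OrderedCleanup cleanup = new OrderedCleanup();
        cleanup.register(() -> System.out.println("close connection pool"));
        cleanup.register(() -> System.out.println("flush pending writes"));
        // Only this one thread is registered with the JVM.
        Runtime.getRuntime().addShutdownHook(cleanup);
    }
}
```

Because the JVM sees only one hook, the ordering is entirely under the subsystem's control and does not depend on how the VM schedules other hooks.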

Why are hooks just threads, and unstarted ones at that? Wouldn't it be simpler to use Runnable objects, or Beans-style event and listener patterns?

The approach taken here has two advantages over the more obvious, and more frequently suggested, callback-oriented designs based upon Runnable objects or Beans-style event listeners.

First, it gives the user complete control over the thread upon which a shutdown action is executed. The thread can be created in the proper thread group, given the correct priority, context, and privileges, and so forth.

Second, it simplifies both the specification and the implementation by isolating the VM from the hooks themselves. If shutdown actions were executed as callbacks then a robust implementation would wind up having to create a separate thread for each hook anyway in order for them to run concurrently. The specification would also have to include explicit language about how the threads that execute the callbacks are created.

Aren't threads pretty expensive things to keep around, especially if they won't be started until the VM shuts down?

Most implementations of the Java platform don't actually allocate resources to a thread until it's started, so maintaining a set of unstarted threads is actually very cheap. If you look at the internals of java.lang.Thread you can see that its various constructors just do security checks and initialize private fields. The native start() method does the real work of allocating a thread stack, etc., to get things going.

What about Personal and Embedded Java? Won't starting threads during shutdown be too expensive on those platforms?

This API may not be suitable for the smaller Java platforms. Threads in the Java 2 Platform carry more information than threads in JDK 1.1 and p/eJava. A thread has a class loader, it may have some inherited thread-local variables, and, in the case of GUI apps, it may be associated with a specific application context. Threads will come to carry even more information as the platform evolves; for example, the security team is planning to introduce a notion of per-thread user identity in their upcoming authentication framework.

Because of all this contextual information, shutdown hooks would be harder to write and maintain if they were just Runnable objects or Beans-style event listeners. Suppose that a Runnable shutdown hook, or an equivalent event listener, needed a specific bit of thread-contextual information in order to carry out its operations. Such information could be saved in some shared location before the hook is registered. While this is merely awkward, suppose further that threads acquire some new type of contextual information in a future release. If an operation invoked by the hook also evolves to need that information then the code that registers the hook would have to be amended to save that information as well. Making hooks be threads instead of Runnables or event listeners insulates them from this sort of future change.

Okay, but won't I have to write a lot of code just to register a simple shutdown hook?

No. Simple shutdown hooks can often be written as anonymous inner classes, as in this example:

Runtime.getRuntime().addShutdownHook(new Thread() {
    public void run() { database.close(); }
});

This idiom is fine as long as you'll never need to cancel the hook, in which case you'd need to save a reference to the hook when you create it.
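
As a minimal sketch of the cancellable case, the hook can be held in a variable and later passed to Runtime.removeShutdownHook. The class name CancellableHookDemo and the printed message are assumptions made for illustration; the println stands in for a real cleanup call such as database.close().

```java
// Sketch: keep a reference to the hook so it can be cancelled later.
class CancellableHookDemo {
    // Registers a hook, then removes it again. Returns what
    // removeShutdownHook reported: true if the hook had been registered.
    static boolean registerAndCancel() {
        Thread hook = new Thread(() -> System.out.println("closing database"));
        Runtime.getRuntime().addShutdownHook(hook);
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        System.out.println("cancelled: " + registerAndCancel());
    }
}
```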

What about security? Can an untrusted applet register a shutdown hook?

If there is a security manager installed then the addShutdownHook and removeShutdownHook methods check that the caller's security context permits RuntimePermission("shutdownHooks"). An untrusted applet will not have this permission, and will therefore not be able to register or de-register a shutdown hook.

What happens if a shutdown hook throws an exception and the exception is not caught?

Uncaught exceptions are handled in shutdown hooks just as in any other thread, by invoking the uncaughtException method of the thread's ThreadGroup object. The default implementation of this method prints the exception's stack trace to System.err and terminates the thread. Note that uncaught exceptions do not cause the VM to exit; this happens only when all non-daemon threads have finished or when the Runtime.exit method is invoked.
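
To observe this behaviour without shutting the VM down, one can start a failing thread directly and attach a per-thread handler. This is only a sketch under assumptions: it uses Thread.setUncaughtExceptionHandler (available since Java 5) instead of the ThreadGroup default described above, and HookExceptionDemo is a made-up name.

```java
import java.util.concurrent.atomic.AtomicReference;

class HookExceptionDemo {
    // Runs a failing thread the way the VM would run a hook (start it,
    // then wait for it) and returns the exception its handler received.
    static Throwable runFailingHook() throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread hook = new Thread(() -> {
            throw new IllegalStateException("cleanup failed");
        });
        // Per-thread handler; without it the ThreadGroup default applies.
        hook.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        hook.start();
        hook.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The exception ends the hook thread only; this thread continues.
        System.out.println("handler saw: " + runFailingHook());
    }
}
```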

Why did you add the Runtime.halt method? Isn't it pretty dangerous?

The new halt method is certainly powerful, and it should be used with the utmost caution. It's provided so that applications can insulate themselves from shutdown hooks that deadlock or run for inordinate amounts of time. It also allows applications to force a quick exit in situations where that is necessary.

What happens if finalization-on-exit is enabled? Will finalizers be run before, during, or after shutdown hooks?

Finalization-on-exit processing is done after all shutdown hooks have finished. Otherwise a hook may fail if some live objects are finalized prematurely.


Thursday 13 March 2008

How to invoke a stored procedure using JDBC

The very first thing you need to know is the stored procedure's signature:
What is the name of the stored procedure?
Does the stored procedure have a return type?
How many parameters does it take, and which parameters are of type OUT or INOUT?

Stored procedure parameters are basically of 3 types: "in", "out" and "inout".
1) "in" is the default type. These parameters are used to pass values to the stored procedure when we invoke it.
e.g. let us say we have int sum(n1 int, n2 int)
Here n1 and n2 are the "in" parameters, and the sum is returned through the return type int.

2) "out": Some stored procedures return values through parameters. When a parameter in a SQL statement or stored procedure is declared as "out", the value of the parameter is returned to the caller.

3) "inout": This is both "in" and "out". When the procedure is called, a value is passed in through this parameter, and the stored procedure returns a value to the caller through the same parameter.

Another SQL question... what's the difference between a function and a procedure?
Simple question :) If you don't know the answer, then you need to do some homework on your SQL skills first :)

Invoking any stored procedure from JDBC is a 6 step process.

Step I: Define the call for the SP.
e.g. let us assume I have a SQL function int sum(n1 int, n2 int)
So calling sum requires n1 and n2 as inputs, and it returns the result.
I would write my call as String spCall = "{ ? = call sum(?, ?)}";

If I have a SQL procedure sum(n1 int, n2 int, n3 int OUT)
this implies the procedure requires 2 input parameters and we get the result in the 3rd parameter. In this case the call would be String spCall = "{call sum(?, ?, ?)}";

If I have a SQL procedure sum(n1 int, n2 int INOUT)
this implies the procedure requires 2 input parameters and we get the result in the 2nd parameter itself. In this case the call would be String spCall = "{call sum(?, ?)}";

Isn't it quite simple?

Let us complete the remaining steps with the first example (the function).
So the call is String spCall = "{ ? = call sum(?, ?)}";

Step II: The next step is to create the CallableStatement object. A CallableStatement lets a JDBC program execute any valid SQL block or procedure. So we create a CallableStatement for our spCall as follows

CallableStatement sp = con.prepareCall(spCall);

Step III: Identify the types of all the OUT or INOUT parameters and register them. What do I mean by register? It means specifying 2 things:
1) The index of each such parameter.
2) The type of value the parameter will return. How do we specify the type? JDBC has java.sql.Types, which identifies the SQL type of a parameter, e.g. Types.INTEGER identifies integers and Types.BIGINT identifies longs. Note that there is no Types.STRING; strings are identified by Types.VARCHAR.

Given the above facts, what happens with our sum example? The first parameter is the one that will contain the result, and its type is int.
So here we go ...

sp.registerOutParameter(1, Types.INTEGER);

Step IV: Set all the input parameters which the stored procedure needs for invocation. This is very similar to how we use prepared statements. Take this as a rule of thumb.
e.g. let us say we want to invoke sum with n1=10 and n2=20
sp.setInt(2, 10);
sp.setInt(3, 20);
This is what I meant. I set the 2nd parameter to 10 and the 3rd parameter to 20.

All is set now....

Step V: The most straightforward one ... call execute on the sp
sp.execute();
This executes the procedure...


Step VI: Reap what you sow ... Time to get the results out.
We fetch the results from the stored procedure. This is very similar to how you retrieve values from a ResultSet. Again, a rule of thumb.

int sum_n1_n2 = sp.getInt(1);

Did you notice 2 things here?
1) I called getInt() because I registered the out parameter as an integer in step III.
2) I called getInt(1) because index "1" is registered as an out parameter in step III.

And this completes the 6 step process. This is the most general sequence you need to invoke any stored procedure. Any of these steps may become optional, depending on what your SP's signature demands.
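
Put together, the six steps could look roughly like the sketch below. It assumes a database that really defines sum in the function form and in the OUT-parameter form described in Step I, and a Connection obtained elsewhere; the class and method names are made up for illustration.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

class StoredProcedureDemo {
    // Function form: int sum(n1 int, n2 int), result arrives in parameter 1.
    static int sumViaFunction(Connection con, int n1, int n2) throws SQLException {
        CallableStatement sp = con.prepareCall("{ ? = call sum(?, ?) }"); // Steps I and II
        try {
            sp.registerOutParameter(1, Types.INTEGER); // Step III
            sp.setInt(2, n1);                          // Step IV
            sp.setInt(3, n2);
            sp.execute();                              // Step V
            return sp.getInt(1);                       // Step VI
        } finally {
            sp.close(); // always release the statement
        }
    }

    // Procedure form: sum(n1 int, n2 int, n3 int OUT), result in parameter 3.
    static int sumViaOutParameter(Connection con, int n1, int n2) throws SQLException {
        CallableStatement sp = con.prepareCall("{ call sum(?, ?, ?) }");
        try {
            sp.setInt(1, n1);
            sp.setInt(2, n2);
            sp.registerOutParameter(3, Types.INTEGER);
            sp.execute();
            return sp.getInt(3);
        } finally {
            sp.close();
        }
    }
}
```

Only the parameter indexes change between the two forms: the index passed to registerOutParameter and getInt is always the position of the "?" that carries the result.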

Wednesday 12 March 2008

common mistake on statements

Most of the memory leaks that I have seen with Java occurred when developers forgot to close database statements. A mistake in the closing logic propagates to many places.
Generally I keep my JDBC program structure as follows

PreparedStatement st = null, st2 = null; // ... more statements as needed
Connection con = null;

try {
    // your JDBC code goes here ....
    con = ConnectionMgr.getConnection(); // get the connection from a connection broker class...
    st = con.prepareStatement(
        "INSERT INTO subscribers (name, email) VALUES (?, ?)");
    st.setString(1, name);
    st.setString(2, email);
    st.executeUpdate();
    // ...
}
catch (SQLException e) {
    // handle or log exception
}
finally {
    // ensure in finally that all the resources are released...
    try {
        if (st != null) {
            st.close();
        }
        if (st2 != null) {
            st2.close();
        }
        if (con != null) {
            con.close();
        }
    }
    catch (SQLException e) {
        // handle or log exception ...
    }
}

I have to write at least 10 additional lines to make sure whatever I have opened is closed properly.
I have heard people say: why bother closing the connections/statements, the garbage collector is there to do the job :)
Dear friends, the garbage collector is never going to do this for you. If you are relying on the GC to do this job, then be prepared for the worst.
If you don't close the connections, what is going to happen?
Every database has a timeout set on a connection. If the connection shows no activity for that period, the database expires the connection. So if you don't close the connection yourself, it stays stale until it times out. The greater the timeout value, the longer such orphaned connections sit in the JVM, and eventually your application may hit the limit on the maximum number of connections to the database.

However, the problem still remains of how to write code that closes the statements and result sets reliably. You want to always attempt to close the statements, whether an exception occurs or not. In the Jakarta Commons DbUtils project, functions are provided to ensure that your statements are also always closed.

JDBC resource cleanup code is mundane, error prone work so these classes abstract out all of the cleanup tasks from your code leaving you with what you really wanted to do with JDBC in the first place: query and update data.

This is how you would use its QueryRunner:

import org.apache.commons.dbutils.QueryRunner;
import java.sql.*;

public class Database2 {
    private QueryRunner queryRunner = new QueryRunner();

    // QueryRunner prepares, executes and closes the statement for us
    public int insertSubscriber(Connection con, String name, String email)
            throws SQLException {
        String sql = "INSERT INTO subscribers (name, email) VALUES (?, ?)";
        Object[] params = { name, email };
        return queryRunner.update(con, sql, params);
    }
}

There are very good features available in DbUtils. For example, you can associate Java beans with the ResultSet of a select query, and when you fire the query you get back a collection of the associated Java objects. Please have a look at the examples in the DbUtils documentation.
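
Separately from DbUtils, on Java 7 and later the try-with-resources statement closes a statement automatically, whether or not an exception occurs. The sketch below is an alternative to the finally block shown earlier, not part of DbUtils; SubscriberDao is a hypothetical class name and the subscribers table is assumed to exist.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class SubscriberDao {
    // Inserts one row. The statement is closed automatically when the try
    // block exits, either normally or because an exception was thrown.
    // The connection is left open because the caller owns it.
    static int insertSubscriber(Connection con, String name, String email)
            throws SQLException {
        String sql = "INSERT INTO subscribers (name, email) VALUES (?, ?)";
        try (PreparedStatement st = con.prepareStatement(sql)) {
            st.setString(1, name);
            st.setString(2, email);
            return st.executeUpdate();
        }
    }
}
```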

Why are Prepared Statements important?

Databases have a tough job. They accept SQL queries from many clients concurrently and execute the queries as efficiently as possible against the data. Processing statements can be an expensive operation but databases are now written in such a way so that this overhead is minimized. However, these optimizations need assistance from the application developers if we are to capitalize on them. This article shows you how the correct use of PreparedStatements can significantly help a database perform these optimizations.

How is a statement executed?

From the JDBC perspective, when a database receives a statement, the database engine first parses the statement and looks for syntax errors. Once the statement is parsed, the database needs to figure out the most efficient way to execute it. This can be quite an expensive task. The database checks which indexes, if any, can help, or whether it should do a full read of all rows in a table. Databases use statistics on the data to figure out the best way, and the result is a query execution plan. Once the query plan is created, it can be executed by the database engine.

Every time we execute a SQL query, all these jobs are repeated again and again.

Statement Caches
Databases are tuned to avoid this repeated work, and they usually include some kind of statement cache. This cache uses the statement itself as a key, and the access plan is stored in the cache with the corresponding statement. This allows the database engine to reuse the plans for statements that have been executed previously. For example, if we sent the database a statement such as "select a,b from t where c = 2", then the computed access plan is cached. If we send the same statement later, the database can reuse the previous access plan, thus saving us CPU power.

Note however, that the entire statement is the key. For example, if we later sent the statement "select a,b from t where c = 3", it would not find an access plan. This is because the "c=3" is different from the cached plan "c=2". So, for example:

for (int i = 0; i < 1000; ++i)
{
    // the SQL text is different on every iteration
    PreparedStatement ps = conn.prepareStatement("select a,b from t where c = " + i);
    ResultSet rs = ps.executeQuery();
    rs.close();
    ps.close();
}

Here the cache won't be effective. Each iteration of the loop sends a different SQL statement to the database (Why? The loop counter changes on each iteration, so the SQL text is different each time). A new access plan is computed for each iteration, and we're basically throwing CPU cycles away with this approach. However, look at the next snippet:

PreparedStatement ps = conn.prepareStatement("select a,b from t where c = ?");
// We are using the parameterized form of the SQL ... kind of a skeleton query ...
for (int i = 0; i < 1000; ++i)
{
    ps.setInt(1, i);
    ResultSet rs = ps.executeQuery();
    rs.close();
}
ps.close();

Here it will be much more efficient. The statement sent to the database is parameterized using the '?' marker in the SQL. This means every iteration sends the same statement to the database, with different parameters for the "c=?" part. This allows the database to reuse the access plan for the statement and makes the program execute more efficiently inside the database. This basically lets your application run faster or makes more CPU available to other users of the database.
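
The effect of keying the cache on the exact SQL text can be imitated with a toy model. The sketch below is not how any real database implements its cache; PlanCacheDemo is a made-up class and the "plan" is just a placeholder string. It only counts how many plans would have to be computed for the two loops above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a statement cache keyed by the exact SQL text.
class PlanCacheDemo {
    private final Map<String, String> plans = new HashMap<>();
    private int plansComputed = 0;

    // Returns the cached plan, computing a new one only on a cache miss.
    String planFor(String sql) {
        return plans.computeIfAbsent(sql, key -> {
            plansComputed++;
            return "plan#" + plansComputed;
        });
    }

    int plansComputed() {
        return plansComputed;
    }

    // Sends 1000 statements and reports how many plans were computed.
    static int countPlans(boolean parameterized) {
        PlanCacheDemo cache = new PlanCacheDemo();
        for (int i = 0; i < 1000; i++) {
            String sql = parameterized
                    ? "select a,b from t where c = ?"
                    : "select a,b from t where c = " + i;
            cache.planFor(sql);
        }
        return cache.plansComputed();
    }

    public static void main(String[] args) {
        System.out.println("literal SQL:       " + countPlans(false) + " plans");
        System.out.println("parameterized SQL: " + countPlans(true) + " plans");
    }
}
```

In this model the literal form computes 1000 plans and the parameterized form computes 1, which is the whole argument in miniature.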

PreparedStatements and J2EE servers

Things can get more complicated when we use a J2EE server. Normally, a prepared statement is associated with a single database connection. When the connection is closed, the preparedstatement is discarded. Normally, a fat client application would get a database connection and then hold it for its lifetime. It would also create all prepared statements eagerly or lazily. Eagerly means that they are all created at once when the application starts. Lazily means that they are created as they are used. An eager approach gives a delay when the application starts but once it starts then it performs optimally. A lazy approach gives a fast start but as the application runs, the prepared statements are created when they are first used by the application. This gives an uneven performance until all statements are prepared but the application eventually settles and runs as fast as the eager application. Which is best depends on whether you need a fast start or even performance.

The problem with a J2EE application is that it can't work like this. It only keeps a connection for the duration of the request. This means that it must create the prepared statements every time the request is executed. This is not as efficient as the fat client approach where the prepared statements are created once, rather than on every request. J2EE vendors have noticed this and designed connection pooling to avoid this performance disadvantage.

When the J2EE server gives your application a connection, it isn't giving you the actual connection; you're getting a wrapper. You can verify this by looking at the name of the class for the connection you are given. It won't be a database JDBC connection, it'll be a class created by your application server. Normally, if you called close on a connection then the jdbc driver closes the connection. We want the connection to be returned to the pool when close is called by a J2EE application. We do this by making a proxy jdbc connection class that looks like a real connection. It has a reference to the actual connection. When we invoke any method on the connection then the proxy forwards the call to the real connection. But, when we call methods such as close instead of calling close on the real connection, it simply returns the connection to the connection pool and then marks the proxy connection as invalid so that if it is used again by the application we'll get an exception.
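
The wrapper idea can be sketched with java.lang.reflect.Proxy. This is a much simplified, single-threaded illustration and not how any particular application server implements its pool; TinyPool and its methods are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified pool (not thread-safe).
class TinyPool {
    private final Deque<Connection> idle = new ArrayDeque<>();

    void add(Connection real) {
        idle.push(real);
    }

    int idleCount() {
        return idle.size();
    }

    // Hands out a proxy that forwards every call to a real connection,
    // except close(), which returns the real connection to the pool.
    Connection borrow() {
        final Connection real = idle.pop();
        final boolean[] returned = { false };
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("close")) {
                        if (!returned[0]) {
                            returned[0] = true; // mark the wrapper invalid
                            idle.push(real);    // the real connection stays open
                        }
                        return null;
                    }
                    if (returned[0]) {
                        throw new SQLException("connection already returned to pool");
                    }
                    try {
                        return method.invoke(real, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the driver's own exception
                    }
                });
    }
}
```

Printing getClass().getName() on the borrowed object would show a generated proxy class rather than the driver's class, which is the check suggested above.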

Wrapping is very useful as it also helps J2EE application server implementers to add support for prepared statements in a sensible way. When an application calls Connection.prepareStatement, it is returned a PreparedStatement object by the driver. The application then keeps the handle while it has the connection and closes it before it closes the connection when the request finishes. However, after the connection is returned to the pool and later reused by the same or another application, we ideally want the same PreparedStatement to be returned to the application.

J2EE PreparedStatement Cache

The J2EE PreparedStatement cache is implemented using a cache inside the J2EE server's connection pool manager. The J2EE server keeps a list of prepared statements for each database connection in the pool. When an application calls prepareStatement on a connection, the application server checks whether that statement was previously prepared. If it was, the PreparedStatement object will be in the cache and will be returned to the application. If not, the call is passed to the JDBC driver and the resulting PreparedStatement object is added to that connection's cache.

We need a cache per connection because that's the way JDBC drivers work. Any prepared statements returned are specific to that connection.
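
A per-connection cache of this kind can be sketched as a map from SQL text to PreparedStatement held next to each pooled connection. The class below is a hypothetical illustration only: real servers also wrap the statement so that close() does not really close it, and they bound the cache size, both of which are left out here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-connection prepared statement cache.
class CachingConnection {
    private final Connection real;
    private final Map<String, PreparedStatement> cache = new HashMap<>();

    CachingConnection(Connection real) {
        this.real = real;
    }

    // Returns the cached statement for this SQL text, going to the driver
    // only the first time the text is seen on this connection.
    PreparedStatement prepareStatement(String sql) throws SQLException {
        PreparedStatement ps = cache.get(sql);
        if (ps == null) {
            ps = real.prepareStatement(sql);
            cache.put(sql, ps);
        }
        return ps;
    }

    int cachedStatements() {
        return cache.size();
    }
}
```

Because the key is the SQL text, only parameterized statements produce repeat hits; statements built by concatenating values would each occupy their own cache entry.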

If we want to take advantage of this cache, the same rules apply as before. We need to use parameterized queries so that they will match ones already prepared in the cache. Most application servers will allow you to tune the size of this prepared statement cache.

Summary

In conclusion, we should use parameterized queries with prepared statements. This reduces the load on the database by allowing it to reuse access plans that were already prepared. This cache is database-wide so if you can arrange for all your applications to use similar parameterized SQL, you will improve the efficiency of this caching scheme as an application can take advantage of prepared statements used by another application. This is an advantage of an application server because logic that accesses the database should be centralized in a data access layer (either an OR-mapper, entity beans or straight JDBC).

Finally, the correct use of prepared statements also lets you take advantage of the prepared statement cache in the application server. This improves the performance of your application as the application can reduce the number of calls to the JDBC driver by reusing a previous prepared statement call. This makes it competitive with fat clients efficiency-wise and removes the disadvantage of not being able to keep a dedicated connection.

If you use parameterized prepared statements, you improve the efficiency of the database and your application server hosted code. Both of these improvements will allow your application to improve its performance.