Thursday, October 21, 2010

Hibernate Batch Processing with MySQL

Batch processing is a common solution to improve database performance by inserting or updating massive data in bulk. However, there are some hiccups you might run into. Let's walk through an example to discover the issues. Suppose we have a simple table named Account with column name and age. Followings are the SQL script (MySQL), Hibernate bean, and batch insert codes.

Account.sql
CREATE TABLE `Account` (
  `id` int(20) NOT NULL auto_increment,
  `name` varchar(20) NOT NULL,
  `age` int(11) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Account.java
@Entity
@Table(name="Account")
public class Account implements Serializable {
 
  private static final long serialVersionUID = 6954737637154168665L;
 
  private long id;
  private String name;
  private int age;
 
  @Id @GeneratedValue(strategy=GenerationType.IDENTITY)
  public long getId() {
    return id;
  }

  public void setId(long id) {
    this.id = id;
  }

  @Basic
  @Column(name="name", nullable=false, length=20)
  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }
 
  @Basic
  @Column(name="age", nullable=false)
  public int getAge() {
    return age;
  }

  public void setAge(int age) {
    this.age = age;
  }
 
}

Test.java
Transaction tx = session.beginTransaction();
        
for(int i=0; i < accounts.size(); i++) {
  Account account = accounts.get(i);
  session.save(account);

  if(i % 50 == 0) {
    session.flush();
    session.clear();
  }
}
tx.commit();

First thing to notice is you will need to flush and clear the data every N (50 here) records as Hibernate caches newly inserted records in session-level cache. Otherwise, you will see OutOfMemoryException somewhere if you have a large amount of records.

Secondly, it's essential to configure hibernate.jdbc.batch_size to achieve optimal performance. The value should be the same with N. You could play with different values to find the best one for your application. In addition, the second-level cache needs to be disabled either in the configuration or through CacheMode programmingly.

But wait... Why didn't I see performance improvement??

The problem is Hibernate explicitly says:
Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.

As a result, we have to use alternative identity generator. I am going to show you how to use MultipleHiLoPerTableGenerator due to efficiency of the HiLo algorithm for identity generation. Here are the updated annotation in Account.java as well as the new table to store the sequence's next hi value.

@GenericGenerator(name="AccountIdGenerator", strategy="org.hibernate.id.MultipleHiLoPerTableGenerator",
  parameters = {
    @Parameter(name="table", value="IdentityGenerator"),
    @Parameter(name="primary_key_column", value="sequence_name"),
    @Parameter(name="primary_key_value", value="Account"),
    @Parameter(name="value_column", value="next_hi_value"),
    @Parameter(name="primary_key_length", value="100"),
    @Parameter(name="max_lo", value="1000")
  }
)
@Id @GeneratedValue(strategy=GenerationType.TABLE, generator="AccountIdGenerator")

CREATE TABLE `IdentityGenerator` (
  `id` int(11) NOT NULL auto_increment,
  `sequence_name` varchar(100) NOT NULL,
  `next_hi_value` int(11) NOT NULL,
  PRIMARY KEY  (`id`)
) ENGINE= InnoDB DEFAULT CHARSET=latin1;

Yes.. I know.. The performance still sucks.

The reason is because MySQL does not support batch form of prepare statement parameter bindings. Fortunately, we could turn on rewriteBatchedStatements so that it will rewrite them to the form as "INSERT INTO Account VALUES (...), (...), (...)".

<property name="hibernate.connection.url">
  jdbc:mysql://127.0.0.1/sheng?rewriteBatchedStatements=true
</property>

Congratulations, you are all set! You will see great performance gain with the same test set you run.

Tuesday, June 22, 2010

Running Seam 2.0 on Tomcat

Java Server Faces or JSF is a component-based MVC framework designed for rapid development. It includes predefined UI components and event model. The framework enables the integration of third party components. One of my favorites is RichFaces which provides an array of rich components and skinability.

JBoss Seam is a powerful enterprise web framework for seamless integration between JSF and EJB. Seam supports several features not in the JSF such as conversation scope, JBPM integration, navigation enhancement, just to name a few (Some features are supported since JSF 2.0).

If you had worked on JSF project without using Seam, you might see some exception complaining your session is closed when rendering Hibernate objects. With Seam, you can inject the Hibernate session (or EntityManager) into your bean. Seam will manage the session for you. It saves some tedious plumbing codes to to use Data Transfer Object.

However, I am not the fan of EJB as it is heavyweight. And you would need to deploy on J2EE compliance server such as JBoss. It takes comparatively longer to start up than our old friend Tomcat. So I began my research on how to deploy my web application on Tomcat without sacrificing the features I want in Seam. The first attempt was to use embedded JBoss in Tomcat. It works, but... it still gave me some exceptions during startup and I like to make deployment easier. Why can't I just deploy war file on Tomcat without any change? Fortunately, here is the solution. I assume readers have knowledge on how to configure JPA and Seam.

persistence.xml in the META-INF
<persistence xmlns="http://java.sun.com/xml/ns/persistence"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence
http://java.sun.com/xml/ns/persistence/persistence_1_0.xsd">

  <persistence-unit name="testdb" type="RESOURCE_LOCAL">
    <provider>org.hibernate.ejb.HibernatePersistence</provider>
    <properties>
      <name="hibernate.connection.driver_class" value="org.gjt.mm.mysql.Driver"/>
      <name="hibernate.dialect" value="org.hibernate.dialect.MySQLInnoDBDialect"/>
      <name="hibernate.connection.url" value="jdbc:mysql://127.0.0.1/test"/>
      <name="hibernate.connection.username" value="test"/>
      <name="hibernate.connection.password" value="test"/>
      <name="hibernate.default_schema" value="test"/>
      <name="hibernate.bytecode.use_reflection_optimizer" value="false"/>
      <name="hibernate.cache.provider_class" value="org.hibernate.cache.HashtableCacheProvider"/>
      <name="show_sql" value="false"/>

      <name="hibernate.connection.provider_class" value="org.hibernate.connection.C3P0ConnectionProvider"/>
      <name="hibernate.c3p0.acquire_increment" value="1"/>
      <name="hibernate.c3p0.idle_test_period" value="120"/>
      <name="hibernate.c3p0.timeout" value="60"/>
      <name="hibernate.c3p0.max_size" value="50"/>
      <name="hibernate.c3p0.min_size" value="1"/>
      <name="hibernate.c3p0.max_statements" value="0"/>
      <name="hibernate.c3p0.preferredTestQuery" value="select 1;"/>
    </properties>
  </persistence-unit>
</persistence>

components.xml in Seam
<components xmlns="http://jboss.com/products/seam/components"
xmlns:core="http://jboss.com/products/seam/core"
xmlns:persistence="http://jboss.com/products/seam/persistence"
xmlns:transaction="http://jboss.com/products/seam/transaction"
xmlns:security="http://jboss.com/products/seam/security"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ui="http://jboss.com/products/seam/ui"
xsi:schemaLocation=
"http://jboss.com/products/seam/core http://jboss.com/products/seam/core-2.1.xsd 
http://jboss.com/products/seam/persistence http://jboss.com/products/seam/persistence-2.1.xsd
http://jboss.com/products/seam/transaction http://jboss.com/products/seam/transaction-2.1.xsd 
http://jboss.com/products/seam/security http://jboss.com/products/seam/security-2.1.xsd 
http://jboss.com/products/seam/components http://jboss.com/products/seam/components-2.1.xsd">

  <core:init debug="false" transaction-management-enabled="false"/>
  <core:manager conversation-timeout="120000"/>
  <core:resource-loader>
  <core:bundle-names>
    <value>org.sheng.bundle.messages</value>
  </core:bundle-names>
  </core:resource-loader>

  <persistence:entity-manager-factory name="entityManagerFactory" persistence-unit-name="testdb"/>
  <persistence:managed-persistence-context name="entityManager" auto-create="true" entity-manager-factory="#{entityManagerFactory}"/>
  <transaction:entity-transaction entity-manager="#{entityManager}"/>
  <factory name="session" scope="STATELESS" auto-create="true" value="#{entityManager.delegate}"/>
</components>

One thing to notice is the persistence-unit-name needs to match the one in persistence.xml. If you are using JPA, just remove the factory tag and inject your entityManager. Now you could simply drop the war file in the webapps folder of your tomcat. The version I use at the time of this writing is 6.0.26.

Tuesday, March 16, 2010

Effective Unit Testing with TestNG

Writing good unit tests is essential to maintain product quality. JUnit is a popular testing framework for Java. Starting with JUnit 4, it has a radical change on the framework in support of annotation based configuration. With annotations, it's no longer required to extend TestCase for all your test cases. However, the lack of good supporting dependency testing makes TestNG, another test automation framework, a formidable alternative to JUnit.

TestNG natively supports dependency testing with dependsOnMethods and dependsOnGroups. It has a concept of group that helps organize test cases on the level of category. Another important feature is parametric testing by using Data Provider. A variety of data values can be injected into same test method to verify different scenarios.

Ideally, every project should have as complete test cases covered as possible so that future enhancements could be assured not breaking existing codes. But, let's face the real world. Developers are driven by tight deadlines. Sluggish testing process would make them reluctant to run the tests.

With the help of in-memory database, you could run your unit testing comfortably fast. Here is an example of how Hibernate defines in-memory database using H2, a SQL database written in Java and proven very fast.

test-hibernate.cfg.xml
<!-- Basic -->
<property name="hibernate.connection.driver_class">org.h2.Driver</property>
<property name="hibernate.dialect">org.hibernate.dialect.HSQLDialect</property>
<property name="hibernate.connection.url">jdbc:h2:mem:test</property>
<property name="hibernate.connection.username">username</property>
<property name="hibernate.connection.password">password</property>
<property name="hibernate.connection.default_schema">test</property>
<property name="hbm2ddl.auto">create</property>

<!-- Mappings -->
<mapping class="org.sheng.db.User"/>

The property hbm2ddl.auto is set to create to make the testing process idempotent, meaning every execution of same test cases generates same result. You also don't need to clean testing data created from testing phases. Please note I intentionally use HSQLDialect here due to H2Dialect in some version of Hibernate is buggy. Followings are more details of the example:

testng.xml
<suite name="Test Suite">
  <test name="Test Cases">
    <classes>
      <class name="org.sheng.test.MyTest"/>
    </classes>
  </test>
</suite>

MyTest.java
public class MyTest {

  protected Session session;

  @BeforeSuite
  public void init() throws Exception {
    DaoFactory.configureSessionFactory("test-hibernate.cfg.xml");
  }

  @BeforeClass
  public void setUp() throws Exception {
    session = DaoFactory.newSession();
  }

  @DataProvider(name = "users") 
  public Object[][] getUsers() { 
    return new Object[][] { 
      new Object[] { "admin", "password" },
      new Object[] { "manager", "password" },
      new Object[] { "user1", "password" },
      new Object[] { "user2", "password" }
    };
  }

  @Test(groups = {"database"}, dataProvider = "users")
  public void createUser(String username, String password) throws Exception {
    User user = new User();
    user.setUsername(username);
    uesr.setPassword(password);
    DaoFactory.getUserDao().createUser(session, user);
  }

  @Test(groups = {"database"}, dependsOnMethods = {"createUser"})
  public void verifyUsers() throws Exception {
    //Verify users
  }

}

MyTest.java
public class DaoFactory {

  private static AnnotationConfiguration configuration;
  private static SessionFactory sessionFactory;

  public synchronized static void configureSessionFactory(String resource) {
    URL url = getResource(resource);
    configuration = new AnnotationConfiguration();
    sessionFactory = configuration.configure(url).buildSessionFactory();
  }

  public static Session newSession() {
    return sessionFactory.openSession();
  }

  public static URL getResource(String resource) {
    ClassLoader classLoader = null;
    URL url = null;

    try {
      Method method = Thread.class.getMethod("getContextClassLoader", (Class[])null);
      classLoader = (ClassLoader)method.invoke(Thread.currentThread(), (Object[])null);
      url = classLoader.getResource(resource);      

      if(url != null) {
        return url;
      }
    }
    catch(Exception e) {
      e.printStackTrace()
    }

    return ClassLoader.getSystemResource(resource);
  }

}

Effective testing process is important to the success of a project. It's difficult to keep track of every change especially when you are collaborating with other teammates. Well-written test cases also serves as documents to help a developer who is new to the project to make contributions in a timely manner.

Monday, October 19, 2009

Wicket in iPhone Development

Since Apple released iPhone on June 29th 2007, it has attracted a lot of attention. It revolutionizes the mobile world with its novel touch screen and motion sensor. In addition, iPhone SDK provides developers with new possibilities to bring their creativity on the mobile platform. The advent of iPhone has certainly brought mobile computing to next generation.

Wicket is a lightweight Java web framework. After prototyping, Wicket quickly caught my interest due to its Simplicity, Clean Separation of Concerns, and Ease of Deployment. With my experience in Swing, I found it familiar and easy to use, especially for small scale web development.

To start with Wicket, first download the latest binary from Apache Wicket. The latest stable version at the time of this writing is 1.4.1. Here is a hello world example:

web.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE web-app
PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
"http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
  </display-name>MyApp</display-name>
  <filter>
    <filter-name>MyApp</filter-name>
    <filter-class>org.apache.wicket.protocol.http.WicketFilter</filter-class>
    <init-param>
      <param-name>applicationClassName</param-name>
      <param-value>com.mycompany.wicket.MyApp</param-value>
    </init-param>
  </filter>
</web-app>

Next, you can start developing your web application. A Wicket page consists of two parts, HTML and Java class. They should have the same naming, but just with different file extension. For instance, HelloPage.html and HelloPage.java.

HelloPage.html
<html>
  <head>
    <title>Hello World</title>
  </head>
  <body>
    <span wicket:id="hello">Label here</span>
  </body>
</html>
HelloPage.java
package com.mycompany.wicket;

import org.apache.wicket.markup.html.WebPage;
import org.apache.wicket.markup.basic.Label;

public HelloPage extends WebPage {

  public HelloPage() {
    add(new Label("hello", "Hello World"));
  }

}
Please notice Wicket defines a property wicket:id which is used in Java for correlation. The discerning reader may notice that the text "Label here" will be replaced with "Hello World" in this example. It provides the clean way to separate the view from controller and model. Therefore, web designer can prototype the web pages with mockup data and only need to deal with pure HTML.

Remember we set applicationClassNamein web.xml. Before deployment, this class need to be created. In this example, we configure the bookmarkable URL for HelloPage.class as "/hello". Now, it's time to deploy the web application. I chose Jetty in favor of its lightweightness so that it's fast to test. Once deployed, you could access the page by pointing your browser to http://127.0.0.1:8080/hello if on local.

MyApp.java
package com.mycompany.wicket;

import org.apache.wicket.markup.html.WebApplication;

public MyApp extends WebApplication {

  @Override
  public void init() {
    super.init();
    mountBookmarkablePage("/hello", HelloPage.class);
  }

  @Override
  public Class getHomePage() {
    return HelloPage.class;
  }

}

You should see "Hello World" if everything goes smoothly so far. However, you will find it's not displayed in correct zoom level on iPhone. To solve the issue, simply add following line in the header of your HTML.
<head>
  <meta name="viewport" content="width=device-width; initial-scale=1.0; maximum-scale=1.0;">
  <title>Hello World</title>
</head>

Programming on mobile devices has its own discipline which lies in the limitations of the resources such as CPU power, memory, network bandwidth, just to name a few. There may be some unexpected behaviors when you start testing on the handset. For instance, you might see a noticeable flash while navigating pages. Following is a list of some advices:

1. Avoid background images in CSS.
2. Store larger static files in native application.
3. Delayed refresh and update in a batch.
4. Server side caching (ex. Hibernate caching).
5. Avoid using redirect.

The iPhone SDK allows you to create sleek and user-friendly interface. And mobile web provides you the flexibility to update your software. How you decide to use them really depends on your requirements. A common best practice is try to mimic the look and feel of native interface for you mobile web if used. It will give your users better navigation experience.

Thursday, October 15, 2009

Hibernate, C3p0, and MySQL

In today's applications, database connection pooling is a common best practice. It improves performance by reusing existing connections as establishing a new one is fairly expensive. C3p0 is one of the well-known database connection pool libraries. And Hibernate is now the de facto standard for object-relational mapping (ORM) in Java world. Here is a sample configuration to use C3p0 in Hibernate for MySQL database:
<!-- Basic -->
<property name="hibernate.connection.driver_class">org.gjt.mm.mysql.Driver</property>
<property name="hibernate.dialect">org.hibernate.dialect.MySQLInnoDBDialect</property>
<property name="hibernate.connection.url">jdbc:mysql://127.0.0.1/test</property>
<property name="hibernate.connection.username">username</property>
<property name="hibernate.connection.password">password</property>
<property name="hibernate.connection.default_schema">test</property>

<!-- C3p0 -->
<property name="hibernate.c3p0.acquire_increment">1</property>
<property name="hibernate.c3p0.idle_test_period">300</property>
<property name="hibernate.c3p0.timeout">120</property>
<property name="hibernate.c3p0.max_size">25</property>
<property name="hibernate.c3p0.min_size">1</property>
<property name="hibernate.c3p0.max_statement">0</property>
<property name="hibernate.c3p0.preferredTestQuery">select 1;</property>
Two properties to put focus on are idle_test_period and timeout. The property idle_test_period defines the interval to test unchecked-out connections while timeout is the time unused connections can live in the pool. However, you might see Communication Link Failure errors occasionally in your log, even though it's configured to have connections tested. Listed are some possible causes:

Network
Make sure your system allows connecting to the database. For example, your unix system may have /etc/hosts.allow or /etc/hosts.deny turned on for access control. Or your firewall could block or kill your connections based on the rules.

Database
Check the system configuration on your database. MySQL has a wait_timeout setting which will close the connections idle over a period of time. If your idle_test_period and timeout are greater than MySQL wait_timeout (28800 seconds by default), your unused connections will have chance to be reset by MySQL server before tested or timed out by C3p0.

Configuration
Choose the connection pool library for your configuration. By default, Hibernate should pick the right one for you. But, if there are more than one connection pool libraries on your classpath, it may not be the way you expect. Hibernate allows you to specify the library by setting the property provider_class as follows:
<property name="hibernate.connection.provider_class">
  org.hibernate.connection.C3P0ConnectionProvider
</property>
Of course, you could just simply remove other connection pool libraries.

Reference
How to configure the C3P0 connection pool