Using ClusterJ (part of MySQL Cluster Connector for Java) – a tutorial

Fig. 1 Java access to MySQL Cluster

ClusterJ is part of the MySQL Cluster Connector for Java which is currently in beta as part of MySQL Cluster 7.1. It is designed to provide a high performance method for Java applications to store and access data in a MySQL Cluster database. It is also designed to be easy for Java developers to use and is “in the style of” Hibernate/Java Data Objects (JDO) and JPA. It uses the Domain Object Model DataMapper pattern:

  • Data is represented as domain objects
  • Domain objects are separate from business logic
  • Domain objects are mapped to database tables

The purpose of ClusterJ is to provide a mapping from the table-oriented view of the data stored in MySQL Cluster to the Java objects used by the application. This is achieved by annotating interfaces representing the Java objects; where each persistent interface is mapped to a table and each property in that interface to a column. By default, the table name will match the interface name and the column names match the property names but this can be overridden using annotations.

Fig. 2 ClusterJ Interface Annotations

If the table does not already exist (for example, this is a brand new application with new data) then the table must be created manually – unlike OpenJPA, ClusterJ will not create the table automatically.

Figure 2 shows an example of an interface that has been created in order to represent the data held in the ‘employee’ table.

ClusterJ uses the following concepts:

  • Fig. 3 ClusterJ Terminology

    SessionFactory: There is one instance per MySQL Cluster instance for each Java Virtual Machine (JVM). The SessionFactory object is used by the application to get hold of sessions. The configuration details for the ClusterJ instance are defined in the Configuration properties which is an artifact associated with the SessionFactory.

  • Session: There is one instance per user (per Cluster, per JVM) and represents a Cluster connection
  • Domain Object: Objects representing the data from a table. The domain objects (and their relationships to the Cluster tables) are defined by annotated interfaces (as shown in the right-hand side of Figure 2.
  • Transaction: There is one transaction per session at any point in time. By default, each operation (query, insert, update, or delete) is run under a new transaction. . The Transaction interface allows developers to aggregate multiple operations into a single, atomic unit of work.

ClusterJ will be suitable for many Java developers but it has some restrictions which may make OpenJPA with the ClusterJPA plug-in more appropriate. These ClusterJ restrictions are:

  • Persistent Interfaces rather than persistent classes. The developer provides the signatures for the getter/setter methods rather than the properties and no extra methods can be added.
  • No Relationships between properties or between objects can be defined in the domain objects. Properties are primitive types.
  • No Multi-table inheritance; there is a single table per persistent interface
  • No joins in queries (all data being queried must be in the same table/interface)
  • No Table creation – user needs to create tables and indexes
  • No Lazy Loading – entire record is loaded at one time, including large object (LOBs).

Tutorial

This tutorial uses MySQL Cluster 7.1.2a on Fedora 12. If using earlier or more recent versions of MySQL Cluster then you may need to change the class-paths as explained in http://dev.mysql.com/doc/ndbapi/en/mccj-using-clusterj.html

It is necessary to have MySQL Cluster up and running. For simplicity all of the nodes (processes) making up the Cluster will be run on the same physical host, along with the application.

These are the MySQL Cluster configuration files being used :

config.ini:

[ndbd default]noofreplicas=2
datadir=/home/billy/mysql/my_cluster/data

[ndbd]
hostname=localhost
id=3

[ndbd]
hostname=localhost
id=4

[ndb_mgmd]
id = 1
hostname=localhost
datadir=/home/billy/mysql/my_cluster/data

[mysqld]
hostname=localhost
id=101

[api]
hostname=localhost

my.cnf:

[mysqld]
ndbcluster
datadir=/home/billy/mysql/my_cluster/data
basedir=/usr/local/mysql

This tutorial focuses on ClusterJ rather than on running MySQL Cluster; if you are new to MySQL Cluster then refer to running a simple Cluster before trying this tutorial.

ClusterJ needs to be told how to connect to our MySQL Cluster database; including the connect string (the address/port for the management node), the database to use, the user to login as and attributes for the connection such as the timeout values. If these parameters aren’t defined then ClusterJ will fail with run-time exceptions. This information represents the “configuration properties” shown in Figure 3.  These parameters can be hard coded in the application code but it is more maintainable to create a clusterj.properties file that will be imported by the application. This file should be stored in the same directory as your application source code.

clusterj.properties:

com.mysql.clusterj.connectstring=localhost:1186
 com.mysql.clusterj.database=clusterdb
 com.mysql.clusterj.connect.retries=4
 com.mysql.clusterj.connect.delay=5
 com.mysql.clusterj.connect.verbose=1
 com.mysql.clusterj.connect.timeout.before=30
 com.mysql.clusterj.connect.timeout.after=20
 com.mysql.clusterj.max.transactions=1024

As ClusterJ will not create tables automatically, the next step is to create ‘clusterdb’ database (referred to in clusterj.properties) and the ‘employee’ table:

[billy@ws1 ~]$ mysql -u root -h 127.0.0.1 -P 3306 -u root
 mysql>  create database clusterdb;use clusterdb;
 mysql> CREATE TABLE employee (
 ->     id INT NOT NULL PRIMARY KEY,
 ->     first VARCHAR(64) DEFAULT NULL,
 ->     last VARCHAR(64) DEFAULT NULL,
 ->     municipality VARCHAR(64) DEFAULT NULL,
 ->     started VARCHAR(64) DEFAULT NULL,
 ->     ended  VARCHAR(64) DEFAULT NULL,
 ->     department INT NOT NULL DEFAULT 1,
 ->     UNIQUE KEY idx_u_hash (first,last) USING HASH,
 ->     KEY idx_municipality (municipality)
 -> ) ENGINE=NDBCLUSTER;

The next step is to create the annotated interface:

Employee.java:

import com.mysql.clusterj.annotation.Column;
import com.mysql.clusterj.annotation.Index;
import com.mysql.clusterj.annotation.PersistenceCapable;
import com.mysql.clusterj.annotation.PrimaryKey;
@PersistenceCapable(table="employee")
@Index(name="idx_uhash")
public interface Employee {
@PrimaryKey
int getId();
void setId(int id);
String getFirst();
void setFirst(String first);

String getLast();
void setLast(String last);
@Column(name="municipality")
@Index(name="idx_municipality")
String getCity();
void setCity(String city);
String getStarted();
void setStarted(String date);
String getEnded();
void setEnded(String date);
Integer getDepartment();
void setDepartment(Integer department);
}

The name of the table is specified in the annotation @PersistenceCapable(table=”employee”) and then each column from the employee table has an associated getter and setter method defined in the interface. By default, the property name in the interface is the same as the column name in the table – the column name has been overridden for the City property by explicitly including the @Column(name=”municipality”) annotation just before the associated getter method. The @PrimaryKey annotation is used to identify the property whose associated column is the Primary Key in the table. ClusterJ is made aware of the existence of indexes in the database using the @Index annotation.

The next step is to write the application code which we step through here block by block; the first of which simply contains the import statements and then loads the contents of the clusterj.properties defined above:

Main.java (part 1):

import com.mysql.clusterj.ClusterJHelper;
import com.mysql.clusterj.SessionFactory;
import com.mysql.clusterj.Session;
import com.mysql.clusterj.Query;
import com.mysql.clusterj.query.QueryBuilder;
import com.mysql.clusterj.query.QueryDomainType;
import java.io.File;
import java.io.InputStream;
import java.io.FileInputStream;
import java.io.*;
import java.util.Properties;
import java.util.List;
public class Main {
public static void main (String[] args) throws java.io.FileNotFoundException,java.io.IOException {
// Load the properties from the clusterj.properties file
File propsFile = new File("clusterj.properties");
InputStream inStream = new FileInputStream(propsFile);
Properties props = new Properties();
props.load(inStream);
//Used later to get userinput
BufferedReader br = new BufferedReader(new
InputStreamReader(System.in));

The next step is to get a handle for a SessionFactory from the ClusterJHelper class and then use that factory to create a session (based on the properties imported from clusterj.properties file.

Main.java (part 2):

// Create a session (connection to the database)
SessionFactory factory = ClusterJHelper.getSessionFactory(props);
Session session = factory.getSession();

Now that we have a session, it is possible to instantiate new Employee objects and then persist them to the database. Where there are no transaction begin() or commit() statements, each operation involving the database is treated as a separate transaction.

Main.java (part 3):

// Create and initialise an Employee
Employee newEmployee = session.newInstance(Employee.class);
newEmployee.setId(988);
newEmployee.setFirst("John");
newEmployee.setLast("Jones");
newEmployee.setStarted("1 February 2009");
newEmployee.setDepartment(666);
// Write the Employee to the database
session.persist(newEmployee);

At this point, a row will have been added to the ‘employee’ table. To verify this, a new Employee object is created and used to read the data back from the ‘employee’ table using the primary key (Id) value of 998:

Main.java (part 4):

// Fetch the Employee from the database
 Employee theEmployee = session.find(Employee.class, 988);
if (theEmployee == null)
 {System.out.println("Could not find employee");}
else
 {System.out.println ("ID: " + theEmployee.getId() + "; Name: " +
 theEmployee.getFirst() + " " + theEmployee.getLast());
 System.out.println ("Location: " + theEmployee.getCity());
 System.out.println ("Department: " + theEmployee.getDepartment());
 System.out.println ("Started: " + theEmployee.getStarted());
 System.out.println ("Left: " + theEmployee.getEnded());
}

This is the output seen at this point:

ID: 988; Name: John Jones
Location: null
Department: 666
Started: 1 February 2009
Left: null
Check the database before I change the Employee - hit return when you are done

The next step is to modify this data but it does not write it back to the database yet:

Main.java (part 5):

// Make some changes to the Employee & write back to the database
theEmployee.setDepartment(777);
theEmployee.setCity("London");
System.out.println("Check the database before I change the Employee -
hit return when you are done");
String ignore = br.readLine();

The application will pause at this point and give you chance to check the database to confirm that the original data has been added as a new row but the changes have not been written back yet:

mysql> select * from clusterdb.employee;
+-----+-------+-------+--------------+-----------------+-------+------------+
| id  | first | last  | municipality | started         | ended | department |
+-----+-------+-------+--------------+-----------------+-------+------------+
| 988 | John  | Jones | NULL         | 1 February 2009 | NULL  |        666 |
+-----+-------+-------+--------------+-----------------+-------+------------+

After hitting return, the application will continue and write the changes to the table, using an automatic transaction to perform the update.

Main.java (part 6):

session.updatePersistent(theEmployee);
System.out.println("Check the change in the table before I bulk add
Employees - hit return when you are done");
ignore = br.readLine();

The application will again pause so that we can now check that the change has been written back (persisted) to the database:

mysql> select * from clusterdb.employee;
+-----+-------+-------+--------------+-----------------+-------+------------+
| id  | first | last  | municipality | started         | ended | department |
+-----+-------+-------+--------------+-----------------+-------+------------+
| 988 | John  | Jones | London       | 1 February 2009 | NULL  |        777 |
+-----+-------+-------+--------------+-----------------+-------+------------+

The application then goes onto create and persist 100 new employees. To improve performance, a single transaction is used to that all of the changes can be written to the database at once when the commit() statement is run:

Main.java (part 7):

// Add 100 new Employees - all as part of a single transaction
 newEmployee.setFirst("Billy");
 newEmployee.setStarted("28 February 2009");
session.currentTransaction().begin();
for (int i=700;i<800;i++) {
 newEmployee.setLast("No-Mates"+i);
 newEmployee.setId(i+1000);
 newEmployee.setDepartment(i);
 session.persist(newEmployee);
 }
session.currentTransaction().commit();

The 100 new employees will now have been persisted to the database. The next step is to create and execute a query that will search the database for all employees in department 777 by using a QueryBuilder and using that to build a QueryDomain that compares the ‘department’ column with a parameter. After creating the, the department parameter is set to 777 (the query could subsequently be reused with different department numbers). The application then runs the query and iterates through and displays each of employees in the result set:

Main.java (part 8):

// Retrieve the set all of Employees in department 777
QueryBuilder builder = session.getQueryBuilder();
QueryDomainType<Employee> domain =
builder.createQueryDefinition(Employee.class);
domain.where(domain.get("department").equal(domain.param(
"department")));
Query<Employee> query = session.createQuery(domain);
query.setParameter("department",777);
List<Employee> results = query.getResultList();
for (Employee deptEmployee: results) {
System.out.println ("ID: " + deptEmployee.getId() + "; Name: " +
deptEmployee.getFirst() + " " + deptEmployee.getLast());
System.out.println ("Location: " + deptEmployee.getCity());
System.out.println ("Department: " + deptEmployee.getDepartment());
System.out.println ("Started: " + deptEmployee.getStarted());
System.out.println ("Left: " + deptEmployee.getEnded());
}
System.out.println("Last chance to check database before emptying table
- hit return when you are done");
ignore = br.readLine();

At this point, the application will display the following and prompt the user to allow it to continue:

ID: 988; Name: John Jones
Location: London
Department: 777
Started: 1 February 2009
Left: null
ID: 1777; Name: Billy No-Mates777
Location: null
Department: 777
Started: 28 February 2009
Left: null

We can compare that output with an SQL query performed on the database:

mysql> select * from employee where department=777;
 +------+-------+-------------+--------------+------------------+-------+------------+
 | id   | first | last        | municipality | started          | ended | department |
 +------+-------+-------------+--------------+------------------+-------+------------+
 |  988 | John  | Jones       | London       | 1 February 2009  | NULL  |        777 |
 | 1777 | Billy | No-Mates777 | NULL         | 28 February 2009 | NULL  |        777 |
 +------+-------+-------------+--------------+------------------+-------+------------+

Finally, after pressing return again, the application will remove all employees:

Main.java (part 9):

session.deletePersistentAll(Employee.class);
 }
}

As a final check, an SQL query confirms that all of the rows have been deleted from the ‘employee’ table.

mysql> select * from employee;
Empty set (0.00 sec)

Compiling and running the ClusterJ tutorial code

javac -classpath /usr/local/mysql/share/mysql/java/clusterj-api.jar:. Main.java Employee.java
java -classpath /usr/local/mysql/share/mysql/java/clusterj.jar:. -Djava.library.path=/usr/local/mysql/lib Main
 

Download the source code for this tutorial from here (together with the code for the up-coming ClusterJPA tutorial).





7 comments

  1. […] is a follow up to the earlier post Using ClusterJ (part of MySQL Cluster Connector for Java) – a tutorial but covers the ClusterJPA interface rather than […]

  2. Tousif Khan says:

    Excellent Tutorial. its really very helpful.
    Thanks Andrew.

  3. Afzal says:

    how can i do this?
    [billy@ws1 ~]$ mysql -u root -h 127.0.0.1 -P 3306 -u root
    mysql> create database clusterdb;use clusterdb;
    mysql> CREATE TABLE employee (
    -> id INT NOT NULL PRIMARY KEY,
    -> first VARCHAR(64) DEFAULT NULL,
    -> last VARCHAR(64) DEFAULT NULL,
    -> municipality VARCHAR(64) DEFAULT NULL,
    -> started VARCHAR(64) DEFAULT NULL,
    -> ended VARCHAR(64) DEFAULT NULL,
    -> department INT NOT NULL DEFAULT 1,
    -> UNIQUE KEY idx_u_hash (first,last) USING HASH,
    -> KEY idx_municipality (municipality)
    -> ) ENGINE=NDBCLUSTER;

  4. Salman says:

    Hey,

    How can we batch multiple read operation in a transaction? Any code snippet would be great

    /Salman

  5. marcos says:

    ClusterJ needs at least Java 5? Can not run on 1.4?

Leave a Reply to andrew Cancel reply