-
Glimpse of the future – massively improved JOIN performance for MySQL Cluster
Posted on October 22nd, 2010 No commentsA great chance to find out about the “SPJ” project that’s under way to improve the performance of complex table JOINs. A free webinar has been scheduled for Thursday, November 04 at 09:00 Pacific time (16:00 UK; 17:00 Central European Time) – just register for the webinar at mysql.com. Even if you can’t attend, by registering you’ll get an email telling you where to watch the replay shortly after the webinar.
MySQL Cluster performance has always been extremely high and scalable when the work load is primarily primary key reads and write but complex JOINS (many tables in the JOIN and/or a large number of results from the first part of the query) have traditionally been much slower than when using other storage engines. Work is underway to address this and SPJ is the name we’ve been using.
Traditionally, JOINs have been performed as Nested Loop JOINs in the MySQL Server which is fine when all of the data is held there (e.g. MyISAM) but when the data is held externally (in Cluster’s case, in the data nodes) it can result in a massive amount of messaging. SPJ works by pushing the processing of JOINs down into the data nodes where they can be performed much more efficiently as the data is local.
As well as finding out about the implementation, you’ll also learn:
- What queries benefit -> how you might tweak you application to get the biggest benefits
- The kind of performance improvements you might expect to see
- How to try the (pre-GA!) software for yourself.
Note that (as always) Oracle reserves the right to alter the timing and/or existence of new product releases.
-
MySQL Cluster 7.1.8 binaries released
Posted on October 11th, 2010 2 comments
The binary version for MySQL Cluster 7.1.8 has now been made available at http://www.mysql.com/downloads/cluster/A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.8 (compared to 7.1.5) can be found in the official MySQL Cluster documentation.
-
MySQL Cluster session from Oracle OpenWorld – replay is available.
Posted on September 28th, 2010 No comments
As part of “MySQL Sunday” at this year’s Oracle Open World, Mat Keep and I presented on the MySQL Cluster architecture and the latest features. If you weren’t able to attend then you can watch/listen to the replay here. -
MySQL Cluster documentation on iPhone/iPad
Posted on September 13th, 2010 No commentsMySQL documentation has now been made available in ePub format (fetch the MySQL Cluster versions from http://dev.mysql.com/doc/index-cluster.html ). This format is intended for various book readers.
I’ve tried it on 2 iPhone applications; Apple’s iBooks and the iPhone version of Stanza. The documents render well in both applications but as with most reference books you get the best results if you reduce the font size so that you get more on the screen at once.
With both applications, you can add the books through iTunes but Stanza also has the advantage that if you browse to the document in Opera on the iPhone then it gives you the option of opening it in Stanza – cutting out the need to use iTunes. I also prefer the rendering in Stanza. The only problems I’ve come across with Stanza is that you have to add the cover images yourself (if you care) – not a huge problem.
-
Upcoming webinar: MySQL Cluster deployment best practices
Posted on August 2nd, 2010 No commentsOn Wednesday 25 August, Johan Anderson and Mat keep will be presenting a free webinar on the best practices for deploying MySQL Cluster. If you’d like to attend then just register here.
Johan has years of MySQL Cluster professional services experience – and so if you’re interested in deploying a product ready MySQL Cluster database then this is a must-view session. Even if you can’t make it live, register anyway and you’ll be sent a link to the recording and charts afterwards.
Content
An invaluable session for those who are about to, or who already have, deployed MySQL Cluster. Delivered by the lead MySQL Cluster professional services consultant in Oracle, this session will present best practices on deploying MySQL Cluster in order to accelerate time to service with the highest levels of availability and performance to support your most critical web and telecoms applications.
The session will cover:
- identifying suitable applications for MySQL Cluster
- comparing differences in functionality and configuration with InnoDB
- using triggers and stored procedures to implement foreign key functionality
- hardware, networking and file system selection
- dimensioning and capacity planning
- configuration best practices
- disk data deployment
- administration and monitoring
- services available to get started
WHEN
Wednesday, August 25, 2010: 09:00 Pacific time (America)Wed, Aug 25: 06:00 Hawaii timeWed, Aug 25: 10:00 Mountain time (America)Wed, Aug 25: 11:00 Central time (America)Wed, Aug 25: 12:00 Eastern time (America)Wed, Aug 25: 16:00 UTCWed, Aug 25: 17:00 Western European timeWed, Aug 25: 18:00 Central European timeWed, Aug 25: 19:00 Eastern European time -
How can a database be in-memory and durable at the same time?
Posted on July 13th, 2010 1 commentThere is often confusion as to how it can be claimed that MySQL Cluster delivers in-memory performance while also providing durability (the “D” in ACID). This post explains how that can be achieved as well as how to mix and match scalability, High Availability and Durability.
As an aside, the user can specify specific MySQL Cluster tables or columns to be stored on disk rather than in memory – this is a solution for extra capacity but you don’t need to take this performance hit just to have the data persisted to disk. This post focuses on the in-memory approach.
There is a great deal of flexibility in how you deploy MySQL Cluster with in-memory data – allowing the user to decide which features they want to make use of.
The simplest (and least common) topology is represented by the server sitting outside of the circles in the diagram. The data is held purely in memory in a single data node and so if power is lost then so is the data. This is an option if you’re looking for an alternative to the MEMORY storage engine (and should deliver better write performance as well as more functionality). To implement this, your configuration file would look something like this:
config.ini (no Durability, Scalability or HA)
[ndbd default] NoOfReplicas=1 datadir=E:\am233268\Documents\MySQL_Cluster\My_Cluster\data [ndbd] hostname=localhost [ndb_mgmd] hostname=localhost [mysqld] hostname=localhost
By setting NoOfReplicas to 1, you are indicating that data should not be duplicated on a second data node. By only having one [ndbd] section you are specifying that there should be only 1 data node.
To indicate that the data should not be persisted to disk, make the following change:
mysql> SET ndb_table_no_logging=1;Once ndb_table_no_logging has been set to 1, any Cluster tables that are subsequently created will be purely in-memory (and hence the contents will be volatile).
Durability can be added as an option. In this case, the changes to the in-memory data is persisted to disk asynchronously (thus minimizing any increase in transaction latency). Persistent is implemented using 2 mechanisms in combination:
- Periodically a snapshot of the in-memory data in the data node is written to disk – this is referred to as a Local Checkpoint (LCP)
- Each change is written to a Redo log buffer and then periodically these buffers are flushed to a disk-based Redo log file – this is coordinated across all data nodes in the Cluster and is referred to as a Global Checkpoint (GCP)
This checkpointing to disk is enabled by default but if you’ve previously turned it off then you can turn it back on with:
mysql> SET ndb_table_no_logging=0;Following this change, any new Cluster tables will be asynchronously persisted to disk. If you have existing, volatile MySQL Cluster tables then you can now make them persistent:
mysql> ALTER TABLE tab1 ENGINE=ndb;High Availability can be implemented by including extra data node(s) in the Cluster and increasing the value of NoOfReplicas (2 is the normal value so that all data is held in 2 data nodes). The set (pair) of data nodes storing the same set of data is referred to as a node group. Data is synchronously replicated between the data nodes in the node group and so changes cannot be lost unless both data nodes fail at the same time. If the 2 data nodes making up a node group are run on different servers then the data can remain available for use even if one of the servers fails. The configuration file for single, 2 data node node group Cluster would look something like:
config.ini (HA but no scalability)
[ndbd default] NoOfReplicas=2 datadir=E:\am233268\Documents\MySQL_Cluster\My_Cluster\data
[ndbd] hostname=192.168.0.1
[ndbd] hostname=192.168.0.2
[ndb_mgmd] hostname=192.168.0.3
[mysqld] hostname=192.168.0.1
[mysqld] hostname=192.168.0.2
If you exceed the capacity or performance of a single node group then you can add extra data node(s) to add 1 or more extra node groups. An example configuration where we want scalability but not High Availability would have multiple node groups but each made up of a single data node. The configuration file would look something like this:config.ini (scalability but not HA)[ndbd default] NoOfReplicas=1 datadir=E:\am233268\Documents\MySQL_Cluster\My_Cluster\data
[ndbd] hostname=192.168.0.1
[ndbd] hostname=192.168.0.2
[ndb_mgmd] hostname=192.168.0.1
[mysqld] hostname=192.168.0.1
[mysqld] hostname=192.168.0.2
New node groups can be added to a Cluster without taking the database off-line (see MySQL Cluster 7.1 New Features White Paper).As shown in the diagram at the start of this post it is also possible to implement any combination of Durability, Scalability and High Availability. A typical configuration that has scalability (in this case 2 node-groups), HA (2 data nodes in each node group) and durability (there by default) could be implemented with this configuration file:config.ini (Scalability, HA & Durability)[ndbd default] NoOfReplicas=2 datadir=E:\am233268\Documents\MySQL_Cluster\My_Cluster\data
[ndbd] hostname=192.168.0.1
[ndbd] hostname=192.168.0.2
[ndbd] hostname=192.168.0.3[ndbd] hostname=192.168.0.4[ndb_mgmd] hostname=192.168.0.5
[ndb_mgmd] hostname=192.168.0.6[mysqld] hostname=192.168.0.5
[mysqld] hostname=192.168.0.6
While that solution now provides you with scalability, durability and HA you are still vulnerable to the loss of the entire Cluster (for example, a catastrophic power failure for the whole data center) – to avoid this, asynchronous replication (Geo Replication) can be setup between 2 (or more) Clusters running at 2 different locations. There is no limit to the distance between the 2 sites. As with the nodal topology, Geo Replication can be used between Clusters deploying any combination of the features described here and there is no requirement for both sites to be using the same Cluster configuration (or even for the second site to store data in MySQL Cluster at all!). More details on Geo Replication scenarios can be found at http://www.clusterdb.com/mysql-cluster/setting-up-mysql-asynchronous-replication-for-high-availability/ -
Free webinar – Scaling web apps with MySQL (an alternative to the MEMORY storage engine)
Posted on July 9th, 2010 No commentsMat Keep and I will be presenting this free webinar on Wednesday 14 July.

The MEMORY storage engine has been widely adopted by MySQL users to provide near-instant responsiveness with use cases such as caching and web session management. As these services evolve to support more users, so the scalability and availability demands can start to exceed the capabilities of the MEMORY storage engine.
The MySQL Cluster database, which itself can be implemented as a MySQL storage engine, is a viable alternative to address these evolving web service demands. MySQL Cluster can be configured and run in the same way as the MEMORY storage engine (ie on a single host with no replication and no persistence). As web services evolve, any of these attributes can then be added in any combination to deliver higher levels of scalability, availability and database functionality, especially for those workloads which predominately access data by the primary key.
As always, the webinar is free of charge but you will need to register here.
Time:
- Wed, Jul 14: 06:00 Hawaii time
- Wed, Jul 14: 09:00 Pacific time (America)
- Wed, Jul 14: 10:00 Mountain time (America)
- Wed, Jul 14: 11:00 Central time (America)
- Wed, Jul 14: 12:00 Eastern time (America)
- Wed, Jul 14: 16:00 UTC
- Wed, Jul 14: 17:00 Western European time
- Wed, Jul 14: 18:00 Central European time
- Wed, Jul 14: 19:00 Eastern European time
If you can’t make the live webinar then register anyway and you’ll get sent a link to the recording after the event.
-
MySQL Workbench 5.2 goes GA – partial support for MySQL Cluster
Posted on July 1st, 2010 No commentsThe new version of MySQL Workbench (5.2.25) has just gone GA – see the Workbench BLOG for details.
So what’s the relevance to MySQL Cluster? If you have a Cluster that uses MySQL Servers to provide SQL access then you can now use MySQL Workbench to manage those nodes:
- Start & stop the mysqld processes
- Configure the per-mysqld configuration data held in my.cnf or my.ini
The reason that I describe the support as ‘partial’ is that these MySQL Servers are treated as independent entities (no concept of them being part of a Cluster) and there is currently no way to use it to configure or manage the other Cluster processes (data and management nodes). Having said that, what is there provides a lot of value and Workbench is designed to be very extensible and so hopefully there can be further MySQL Cluster support in the future.
In addition to MySQL Cluster-specific configuration parameters, you can also access the Cluster-specific status variables (these are the ones starting with ndb).
While I’ve focussed on what’s unique to MySQL Cluster, you can of course use the other Workbench features with MySQL Cluster – for example:
- Creating (or reverse-engineering) your data model
- Define your schema
- View/write data to your tables
- Create your SQL queries
-
Using Syslog with MySQL Cluster
Posted on June 28th, 2010 No commentsBy default, MySQL Cluster sends log data to a file but you can also send it to the console or to Syslog; this article explains how to send it to Syslog. The example given here is for LINUX.
In this example, I’ll use the “user” syslog facility name and so the first step is to make sure that syslog is configured to route those messages. If this hasn’t already been configured then add the following lines to /etc/rsyslog.conf:
# Log user messages to local files user.* /var/log/user
For the changes to take effect, restart the syslog service:
[root@ws1 etc]# service rsyslog restart Shutting down system logger: [ OK ] Starting system logger: [ OK ]
Note that you should make those changes as root.
Still as root, start up a stream of any additions to the new log file:
[root@ws1 etc]# tail -f /var/log/userTo tell Cluster to use Syslog, add this line into the [ndb_mgmd] section in config.ini:
LogDestination=SYSLOG:facility=userand then start up your Cluster as normal.
You should now be able to see that MySQL Cluster information is being logged to /var/log/user.
You can adjust how much information is logged either through the config file or from the ndb_mgm tool, for example – to see when global checkpoints are written:
ndb_mgm> all clusterlog checkpoint=15 Executing CLUSTERLOG CHECKPOINT=15 on node 3 OK! Executing CLUSTERLOG CHECKPOINT=15 on node 4 OK!
Note that a log-level of 15 will show all logs and 0 will show none. Other log categories besides CHECKPOINT are STARTUP, SHUTDOWN, STATISTICS, NODERESTART, CONNECTION, INFO, ERROR, CONGESTION, DEBUG and BACKUP. -
MySQL Cluster presentation at Oracle Open World 2010
Posted on June 28th, 2010 No comments
As part of “MySQL Sunday” at this year’s Oracle Open World, Mat Keep and I will be presenting on the latest MySQL Cluster features. We’ll be presenting at 15:30 (Pacific Time) on 19th September (the event starts with a key note at 12:30).If you’re attending Oracle Open World then please indicate that you’d like to attend the MySQL Sunday when you register. If you aren’t planning to go to Oracle Open World but will be in the San Francisco area then buying a Discover pass (only $50 if you register by 16 July) will get you into the MySQL Sunday sessions. Register here.











