-
MySQL Cluster Powers Leading Document Management Web Service
Posted on May 24th, 2010 No comments
A new customer case-study is available for download from http://www.mysql.com/why-mysql/case-studies/mysql_cs-cluster_docudesk_WebServices.php
The DocQ web service eliminates the limitations of sharing physical documents by offering a complete paperless business solution, providing a single place where customers can manage, archive, and send their important documents. DocQ supports secure business transactions and the services to store, edit, collaborate on, and publish business documents.
- The database needed to deliver the high levels of write throughput, low latency responsiveness and continuous availability demanded by the service
- A sharded, multi-master MySQL solution with memcached was rejected due to the complexity of integration and management
- MySQL Cluster was selected as it met all of the requirements of the service with one, integrated solution out of the box
- MySQL Cluster is handling on average 1 million queries per day across both in-memory and disk-based tables, with the database growing at up to 2% daily
- MySQL Cluster handles document metadata and text, PHP session state, ACLs, job queues and tracking of document actions for billing
-
Configure MySQL Enterprise Monitor to monitor MySQL Cluster
Posted on May 20th, 2010 No comments
MySQL Cluster 7.1 introduced the ndbinfo database, which contains views giving real-time access to a whole host of information that helps you monitor and tune your MySQL Cluster deployment. Because this data can be accessed through regular SQL, various systems can be configured to monitor the Cluster. This post gives one example, extending MySQL Enterprise Monitor to keep an eye on the amount of free memory on the data nodes (through a graph) and then raise an alarm when it starts to run low – even generating SNMP traps if that’s what you need.
One of the features of MySQL Enterprise Monitor is that you can define custom data collectors, and those data collectors can run SQL queries to get the data. The information retrieved by those custom data collectors can then be used with rules that the user defines through the MySQL Enterprise Monitor GUI to create warnings and alarms.
In this example, I create two new data collectors in the file ”<MySQL Enterprise Monitor installation directory>/agent/share/mysql-proxy/items/cluster.xml” before starting up the MySQL Enterprise Monitor agent. Note that these should be created for the agent of each MySQL Server in the Cluster that you would like to use to present the information from the data nodes:
cluster.xml:
<?xml version="1.0" encoding="utf-8"?>
<classes>
  <class>
    <namespace>mysql</namespace>
    <classname>cluster_max_used</classname>
    <query><![CDATA[SELECT MAX(used) AS Used FROM ndbinfo.memoryusage WHERE memory_type = 'Data Memory';]]></query>
  </class>
  <class>
    <namespace>mysql</namespace>
    <classname>cluster_min_avail</classname>
    <query><![CDATA[SELECT MIN(total) AS Total FROM ndbinfo.memoryusage WHERE memory_type = 'Data Memory';]]></query>
  </class>
</classes>
So that the agent picks up this file, it should be referenced within <MySQL Enterprise Monitor installation directory>/agent/mysql-monitor-agent.ini:
agent-item-files = share/mysql-monitor-agent/items/quan.lua,share/mysql-monitor-agent/items/items-mysql-monitor.xml,share/mysql-monitor-agent/items/custom.xml,share/mysql-monitor-agent/items/cluster.xml
In MySQL Enterprise Monitor, events are raised by rules. Rules are grouped together into Advisors, so I create a new Advisor called “MySQL Cluster” and then create just one new rule within that Advisor group.
As shown in Fig. 1, the rule is called “Data Node Low Memory”. The “Variable Assignment” section is used to define 2 variables, %used_mem% and %config_mem%, which are populated from the Used and Total results from the 2 new data collectors. The “Expression” section is used to test “((Total - Used)/Total) x 100 < THRESHOLD”, and the values to be substituted for THRESHOLD are defined in the “Thresholds” section, indicating at what points the Info, Warning and Critical alerts should be raised.
There are then a number of optional sections that you can use to add useful information to the person investigating the alert.
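To make the rule's expression concrete, here is a minimal Python sketch of the same calculation. It is only an illustration: the function names and the threshold percentages are my own assumptions (the post doesn't state the threshold values that were configured), and it says nothing about how MySQL Enterprise Monitor evaluates rules internally. Because the collectors return MAX(used) and MIN(total) across the data nodes, the result is a conservative (worst-case) estimate of free memory.

```python
def free_memory_percent(used, total):
    """Percentage of data memory still free: ((Total - Used) / Total) x 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (total - used) / total * 100.0


# Hypothetical thresholds (percent free), ordered most severe first.
# These values are assumptions for illustration, not taken from the post.
THRESHOLDS = [("Critical", 5.0), ("Warning", 15.0), ("Info", 25.0)]


def alert_level(used, total, thresholds=THRESHOLDS):
    """Return the most severe alert whose threshold the free percentage is below."""
    free = free_memory_percent(used, total)
    for level, limit in thresholds:
        if free < limit:
            return level
    return None  # enough free memory, no alert


if __name__ == "__main__":
    # Used and Total would come from the two data collectors (values in bytes).
    print(alert_level(used=90 * 1024 * 1024, total=100 * 1024 * 1024))
```

With these assumed thresholds, a Cluster whose fullest data node has used 90 MB of a 100 MB DataMemory allocation has 10% free and would raise the Warning alert.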
Once the rule has been created, the next step is to schedule it and (if desired) tag that the alerts should also result in SNMP traps being raised. This is standard MySQL Enterprise Monitor practice and so it isn’t explained here, except to point out one thing: this rule monitors information from the data nodes, but MySQL Enterprise Monitor has no idea what a data node is, so the rule has to be scheduled against one or more arbitrary MySQL Server instances in the Cluster.
To test the functionality, start adding more data to your MySQL Cluster until the Warning alert is triggered as shown in Fig. 2. As you can see, the optional information we included is shown – including values from Used and Total.
I then add more data to the database until the critical alert is raised and confirm that it’s displayed on the main monitoring panel of the MySQL Enterprise Monitor dashboard. Note that if you requested these alerts be included with the SNMP feed then SNMP traps will also be raised.
Please note that this example is intended to illustrate the mechanics of setting up monitoring on an arbitrary piece of data from ndbinfo. In the real world you would want to monitor more than just the memory, and even for the memory you might want to use a more sophisticated rule.
It is sometimes more useful to see how a value changes over time. For this, MySQL Enterprise Monitor provides graphs. The data collectors created for the rule can also be used to add a new graph to MySQL Enterprise Monitor. The graph is defined by creating the following file:
<com_mysql_merlin_server_graph_Design>
  <version>1.0</version>
  <uuid>b0bc2bba-ea9b-102b-b396-94aca32b0b28</uuid>
  <tag></tag>
  <name>Per Data Node Data Memory Use</name>
  <rangeLabel>MB</rangeLabel>
  <frequency>00:01:00</frequency>
  <series>
    <label>Used</label>
    <expression>cluster_data_node_used_data_memory/1024/1024</expression>
  </series>
  <series>
    <label>Avail</label>
    <expression>cluster_data_node_config_data_memory/1024/1024</expression>
  </series>
  <variables>
    <name>cluster_data_node_used_data_memory</name>
    <dcItem>
      <nameSpace>mysql</nameSpace>
      <className>cluster_max_used</className>
      <attribName>Used</attribName>
    </dcItem>
    <instance>local</instance>
  </variables>
  <variables>
    <name>cluster_data_node_config_data_memory</name>
    <dcItem>
      <nameSpace>mysql</nameSpace>
      <className>cluster_min_avail</className>
      <attribName>Total</attribName>
    </dcItem>
    <instance>local</instance>
  </variables>
</com_mysql_merlin_server_graph_Design>
Click on Import/Export in the Graphs tab in Enterprise Monitor (2.2) and then import the file defining the graph.
The graph will then appear on the Graphs tab and can also be configured to appear on the main dashboard as shown in Fig. 5.
-
MySQL Cluster 6.3.33 binaries released
Posted on May 17th, 2010 No comments
The binary version for MySQL Cluster 6.3.33 has now been made available at http://www.mysql.com/downloads/cluster/6.3.html#downloads
A description of all of the changes (fixes) that have gone into MySQL Cluster 6.3.33 (compared to 6.3.32) can be found in the MySQL Cluster 6.3.33 ChangeLog.
-
Trying out MySQL Push-Down-Join (SPJ) preview
Posted on April 29th, 2010 No comments
At the 2010 MySQL User Conference, Jonas Oreland presented on the work he’s been doing on improving the performance of joins when using MySQL Cluster – the slides are available for download. While not ready for production systems, a preview version is available for you to try out. The purpose of this blog is to step through testing an example query as well as presenting the results (SPOILER: In one configuration, I got a 50x speedup!).
SPJ is by no means complete and there are a number of constraints as to which queries benefit (and I’ll give an example of one that didn’t). For details of the current (April 2010) software and limitations, check out Jonas’s slides and then keep up to date by following his blog.
We’re anxious to get feedback – please feel free to post results as comments to this blog but also make sure that you send them to spj-feedback@sun.com – describing your schema, the query or queries you tested, the output from EXPLAIN and your before and after timings.
Joins in MySQL Cluster are implemented as nested-loop joins within the MySQL Server; this can be inefficient as it results in many trips to the data nodes to fetch the required data. SPJ works by pushing the join (actually a spec of the needed data) down into the data nodes where the data can be collected and sent back up to the MySQL Server much more efficiently.
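To illustrate why the number of trips matters, here is a toy Python model that counts requests from the "MySQL Server" to the "data nodes" for a three-table join like the one tested later in this post. Everything in it (the class, the function names, the tiny data set) is hypothetical and heavily simplified; it ignores batching, partitioning and everything else the real NDB implementation does, so treat it as a sketch of the idea rather than a description of SPJ internals.

```python
class ToyDataNodes:
    """Hypothetical stand-in for the data nodes; counts requests received."""

    def __init__(self, subs, department, roles):
        self.tables = {"subs": subs, "department": department, "roles": roles}
        self.requests = 0

    def scan_subs(self, country):
        # One request: scan subs with the condition pushed down.
        self.requests += 1
        return [r for r in self.tables["subs"].values() if r["country"] == country]

    def lookup(self, table, key):
        # One request per primary-key lookup.
        self.requests += 1
        return self.tables[table].get(key)

    def pushed_join(self, country):
        # One request: the whole join is evaluated next to the data.
        self.requests += 1
        joined = []
        for sub in self.tables["subs"].values():
            if sub["country"] != country:
                continue
            dept = self.tables["department"].get(sub["dept"])
            if dept is None:
                continue
            role = self.tables["roles"].get(dept["name"])
            if role is not None:
                joined.append((sub, dept, role))
        return joined


def nested_loop_count(nodes, country):
    """Join driven from the server: two lookups per qualifying subs row."""
    count = 0
    for sub in nodes.scan_subs(country):
        dept = nodes.lookup("department", sub["dept"])
        if dept is not None and nodes.lookup("roles", dept["name"]) is not None:
            count += 1
    return count


def pushed_count(nodes, country):
    """Join pushed down: the server only counts the rows it gets back."""
    return len(nodes.pushed_join(country))


def make_nodes():
    # Tiny made-up data set: 6 subscribers, 3 departments, 3 roles.
    subs = {i: {"dept": i % 3, "country": 44 if i % 2 == 0 else 1} for i in range(6)}
    department = {d: {"name": 10 + d} for d in range(3)}
    roles = {10 + d: {"role": f"role-{d}"} for d in range(3)}
    return ToyDataNodes(subs, department, roles)


nlj_nodes, spj_nodes = make_nodes(), make_nodes()
nlj_count = nested_loop_count(nlj_nodes, 44)
spj_count = pushed_count(spj_nodes, 44)
print(f"nested loop: count={nlj_count}, requests={nlj_nodes.requests}")
print(f"pushed join: count={spj_count}, requests={spj_nodes.requests}")
```

In this toy model the request count for the nested-loop version grows with the number of qualifying rows, while the pushed version stays at a single request. That is the intuition behind expecting a bigger benefit when the MySQL Server and data nodes are separated by a network.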
For my tests, I used 2 different configurations. In both cases there are 2 data nodes running on 2 physical hosts. In the first configuration the MySQL Server resides on one of those 2 hosts. In the second configuration, the MySQL Server is moved to a virtual machine running on a 3rd host.
Setting up the Cluster
On each of the 3 hosts, I downloaded the software from ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.44-ndb-7.1.3-spj-preview/ and then compiled and installed it. If you’re not comfortable with that then you can find instructions in this earlier blog or if you’re used to using the tools from severalnines then check out the SPJ instructions on Johan’s blog.
Create the schema
The 3 tables I used can be created with these commands from the mysql client:
mysql> create database clusterdb; use clusterdb;
mysql> create table subs (sub_id int not null primary key, dept int, country int) engine=ndb;
mysql> create table department (id int not null primary key, name int) engine=ndb;
mysql> create table roles (dept int not null primary key, role varchar (30)) engine=ndb;
Each of these tables is then populated with 100,000 rows (the files can be downloaded from here).
Once extracted, the data should be loaded into the database:
mysql> use clusterdb;
mysql> load data local infile "/home/billy/Dropbox/LINUX/projects/SPJ/subs.csv" replace into table subs fields terminated by ',';
mysql> load data local infile "/home/billy/Dropbox/LINUX/projects/SPJ/dept.csv" replace into table department fields terminated by ',';
mysql> load data local infile "/home/billy/Dropbox/LINUX/projects/SPJ/roles.csv" replace into table roles fields terminated by ',';
Running the tests (Config 1 – local mysqld)
To get a baseline, ensure that SPJ is turned off:
mysql> set ndb_join_pushdown=off;
and then get the output from EXPLAIN:
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                             |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                      | 100000 | Using where with pushed condition |
|  1 | SIMPLE      | department | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.subs.dept       |      1 |                                   |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.name |      1 |                                   |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
and then execute the query:
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (9.08 sec)
Now to see the benefits of SPJ, turn it on:
mysql> set ndb_join_pushdown=on;
Check the output from EXPLAIN again:
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                                                        |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                      | 100000 | Parent of 3 pushed join@1; Using where with pushed condition |
|  1 | SIMPLE      | department | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.subs.dept       |      1 | Child of pushed join@1                                       |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.name |      1 | Child of pushed join@1                                       |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
and then re-run the query:
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (0.77 sec)
In this test, the query ran almost 12x faster!
Running the tests (Config 2 – separate mysqld)
The test was then repeated with the MySQL Server running within a VM on a 3rd host – the purpose of this is to represent the more normal configuration where the MySQL servers must communicate over the network to the data nodes. As the purpose of SPJ is to reduce the messaging between the MySQL Server and the data nodes, it’s reasonable to expect the benefits from SPJ to be more pronounced with this configuration.
Again, to get a baseline, ensure that SPJ is turned off:
mysql> set ndb_join_pushdown=off;
and then get the output from EXPLAIN:
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                             |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                      | 100000 | Using where with pushed condition |
|  1 | SIMPLE      | department | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.subs.dept       |      1 |                                   |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.name |      1 |                                   |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+-----------------------------------+
and then execute the query:
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (1 min 2.12 sec)
Now to see the benefits of SPJ, turn it back on:
mysql> set ndb_join_pushdown=on;
Check the output from EXPLAIN again:
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                       | rows   | Extra                                                        |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                      | 100000 | Parent of 3 pushed join@1; Using where with pushed condition |
|  1 | SIMPLE      | department | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.subs.dept       |      1 | Child of pushed join@1                                       |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.name |      1 | Child of pushed join@1                                       |
+----+-------------+------------+--------+---------------+---------+---------+---------------------------+--------+--------------------------------------------------------------+
and then re-run the query:
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND department.id=subs.dept AND roles.dept=department.name;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (1.26 sec)
In this test, the query ran almost 50x faster!
Do all queries benefit from SPJ?
No, and that’s why it’s especially important to get feedback from real users with representative schemas so that SPJ can be extended to cover as many of the significant use cases as possible.
As an example, using the following query I saw no speedup at all (using the local mysqld configuration):
mysql> set ndb_join_pushdown=off;
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND subs.dept=department.name AND department.id=roles.dept;
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                     | rows   | Extra                             |
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                    | 100000 | Using where with pushed condition |
|  1 | SIMPLE      | department | ALL    | PRIMARY       | NULL    | NULL    | NULL                    | 100000 | Using where; Using join buffer    |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.id |      1 |                                   |
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------+
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND subs.dept=department.name AND department.id=roles.dept;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (3 min 56.26 sec)
mysql> set ndb_join_pushdown=on;
mysql> EXPLAIN SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND subs.dept=department.name AND department.id=roles.dept;
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                     | rows   | Extra                                                     |
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------------------------------+
|  1 | SIMPLE      | subs       | ALL    | NULL          | NULL    | NULL    | NULL                    | 100000 | Using where with pushed condition                         |
|  1 | SIMPLE      | department | ALL    | PRIMARY       | NULL    | NULL    | NULL                    | 100000 | Parent of 2 pushed join@1; Using where; Using join buffer |
|  1 | SIMPLE      | roles      | eq_ref | PRIMARY       | PRIMARY | 4       | clusterdb.department.id |      1 | Child of pushed join@1                                    |
+----+-------------+------------+--------+---------------+---------+---------+-------------------------+--------+-----------------------------------------------------------+
mysql> SELECT count(*) FROM subs, department, roles WHERE subs.country=44 AND subs.dept=department.name AND department.id=roles.dept;
+----------+
| count(*) |
+----------+
|    33334 |
+----------+
1 row in set (3 min 57.76 sec)
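For reference, the speedup figures quoted in this post are simply the elapsed time with ndb_join_pushdown=off divided by the elapsed time with it turned on. The short Python sketch below redoes that arithmetic from the timings reported by the mysql client above; the helper names and the labels in the dictionary are my own, for illustration only.

```python
def to_seconds(minutes, seconds):
    """Convert a 'M min S sec' timing from the mysql client into seconds."""
    return minutes * 60 + seconds


def speedup(before, after):
    """How many times faster the query ran with SPJ on (before / after)."""
    return before / after


# (pushdown off, pushdown on) timings in seconds, copied from the runs above.
timings = {
    "config 1, local mysqld": (9.08, 0.77),
    "config 2, separate mysqld": (to_seconds(1, 2.12), 1.26),
    "second query, local mysqld": (to_seconds(3, 56.26), to_seconds(3, 57.76)),
}

results = {name: speedup(off, on) for name, (off, on) in timings.items()}

for name, factor in results.items():
    print(f"{name}: {factor:.2f}x")
```

A factor close to 1 (as for the second query) means SPJ made no measurable difference, which matches the observation that this query saw no speedup.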
-
Free webinar – learn about MySQL Cluster 7.1
Posted on April 29th, 2010 No comments
MySQL Cluster 7.1 was declared GA earlier this month and today (29 April) you have the chance to learn all about it by registering for this free webinar.
At blazing speed we will cover the most important features of MySQL Cluster 7.1: NDB$INFO, MySQL Cluster Connector/Java, and other features that push the limits of MySQL Cluster into new workloads and communities.
NDB$INFO presents real-time usage statistics from the MySQL Cluster data nodes as a series of SQL tables, enabling developers and administrators to monitor database performance and optimize their applications.
Designed for Java developers, the MySQL Cluster Connector for Java implements an easy-to-use and high performance native Java interface and OpenJPA plug-in that maps Java classes to tables stored in the MySQL Cluster database.
It’s worth registering even if you can’t attend as you should then receive a link to the replay and the charts.
It starts at 9:00 am Pacific / 5:00 pm UK / 6:00 pm CET.
-
Charts from LDAP Con on LDAP access to MySQL Cluster
Posted on April 11th, 2010 No comments
At last year’s LDAP-Con event, Ludo from OpenDS and Howard from OpenLDAP presented on the work that they’d done on using MySQL Cluster as the scalable, real-time data store for LDAP directories (going directly to the NDB API rather than using SQL). Symas now provide their implementation (back-ndb) for OpenLDAP.
You can view the charts at http://www.mysql.com/customers/view/?id=1041
-
MySQL Cluster 7.1.2a binaries released
Posted on March 24th, 2010 No comments
The binary version for MySQL Cluster 7.1.2a has now been made available at http://dev.mysql.com/downloads/cluster/ under the Development tab.
Note that this beta load contains the latest NDBINFO and MySQL Cluster Connector for Java (ClusterJ) enhancements – please try them out and provide feedback (any bugs should be reported through bugs.mysql.com).
A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.2a (compared to 7.1.1) can be found in the MySQL Cluster 7.1.2a Change Log.
-
Build MySQL Cluster 7.1 from source – including MySQL Cluster Connector for Java
Posted on March 19th, 2010 2 comments
If you want to try out the beta features in MySQL Cluster 7.1 then you can either use the appropriate binaries or build it yourself from source. Here I explain how to do this on Linux.
Note that if you want to make use of OpenJPA then you first need to install OpenJPA and Connector/J.
The example here was on Fedora 12 with the MySQL Cluster 7.1.2 source:
CFLAGS="-O3" CXX=gcc CXXFLAGS="-O3 -felide-constructors -fno-exceptions -fno-rtti" ./configure --prefix=/usr/local/mysql --enable-assembler --with-mysqld-ldflags=-all-static --with-plugins=max --with-openjpa --with-classpath=/usr/local/openjpa/openjpa-1.2.1.jar:/usr/local/openjpa/lib/geronimo-jpa_3.0_spec-1.0.jar:/usr/local/openjpa/lib/geronimo-jta_1.1_spec-1.1.jar --with-extra-charsets=all
make
make install
That’s it! Obviously, the exact location of the OpenJPA jars will depend on where you installed it. Note that ‘make install’ needs to be run from an account that has access to /usr/local.
I’ll follow up a little later with a post with example applications (in the meantime refer to this tutorial or the MySQL Cluster for Java on-line documentation) but FYI these are the options I use to compile and run my test apps:
ClusterJ:
javac -classpath /usr/local/mysql/share/mysql/java/clusterj-api.jar:. Main.java Employee.java
java -classpath /usr/local/mysql/share/mysql/java/clusterj.jar:. -Djava.library.path=/usr/local/mysql/lib/mysql/ Main
ClusterJPA:
javac -classpath /usr/local/mysql/share/mysql/java/clusterjpa.jar:/usr/local/openjpa/openjpa-1.2.1.jar:/usr/local/openjpa/lib/geronimo-jpa_3.0_spec-1.0.jar:. Main.java Employee.java Department.java
java -Djava.library.path=/usr/local/mysql/lib/mysql/ -classpath /usr/local/mysql/share/mysql/java/clusterjpa.jar:/usr/local/openjpa/openjpa-1.2.1.jar:/usr/local/openjpa/lib/*:/usr/local/connectorj/mysql-connector-java-5.1.12-bin.jar:. Main
-
MySQL Cluster 7.1.2 beta binaries released
Posted on March 16th, 2010 No comments
The binary version for MySQL Cluster 7.1.2 has now been made available at http://dev.mysql.com/downloads/cluster/ under the Development tab.
Note that this beta load contains the latest NDBINFO and MySQL Cluster Connector for Java (ClusterJ) enhancements – please try them out and provide feedback (any bugs should be reported through bugs.mysql.com).
A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.2 (compared to 7.1.1) can be found in the MySQL Cluster 7.1.2 Change Log.
-
MySQL Cluster 7.0.13 binaries released
Posted on March 16th, 2010 No comments
The binary version for MySQL Cluster 7.0.13 has now been made available at http://dev.mysql.com/downloads/cluster/ under the GA tab.
A description of all of the changes (fixes) that have gone into MySQL Cluster 7.0.13 (compared to 7.0.12) can be found in the MySQL Cluster 7.0.13 Change Log.







