Tag Archive for NoSQL

MySQL & NoSQL – Best of Both Worlds. Upcoming webinar

MySQL Cluster LogoOn Thursday 22nd May I’ll be hosting a webinar explaining how you can get the best from the NoSQL world while still getting all of the benefits of a proven RDBMS. As always the webinar is free but please register here.

There’s often a lot of excitement around NoSQL Data Stores with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside can come in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds?

This webinar shows how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data, with a simple key-value API (Memcached), REST, JavaScript, Java or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance, 99.999% availability, on-line maintenance and linear, horizontal scalability through transparent auto-sharding.

This is also an opportunity to pick the brains of the MySQL Cluster engineering team and get your technical questions answered.

Times:

  • Thu, May 22: 09:00 Pacific time (America)
  • Thu, May 22: 10:00 Mountain time (America)
  • Thu, May 22: 11:00 Central time (America)
  • Thu, May 22: 12:00 Eastern time (America)
  • Thu, May 22: 13:00 São Paulo time
  • Thu, May 22: 16:00 UTC
  • Thu, May 22: 17:00 Western European time
  • Thu, May 22: 18:00 Central European time
  • Thu, May 22: 19:00 Eastern European time
  • Thu, May 22: 21:30 India, Sri Lanka
  • Fri, May 23: 00:00 Singapore/Malaysia/Philippines time
  • Fri, May 23: 00:00 China time
  • Fri, May 23: 01:00 日本
  • Fri, May 23: 02:00 NSW, ACT, Victoria, Tasmania (Australia)

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.





Webinar Replay + Q&A – Developing JavaScript Applications for Node.js with MySQL and NoSQL

MySQL Cluster driver for JavaScript/Node.jsOn Thursday 12th September I co-presented a webinar on how MySQL Cluster delivers the key benefits of NoSQL Data Stores without having to give up the features that people rely on from relational databases (consistency, SQL etc.). There was a particular focus on how to use the new node.js JavaScript API which was recently released as part of MySQL Cluster 7.3. If you weren’t able to attend the live event then the webinar replay is available here. If you scroll down to the bottom of this post you’ll see a summary of the Questions and Answers from the session.

The new MySQL JavaScript Connector for Node.js is designed to deliver simplicity and high performance for JavaScript users.

Enables end-to-end JavaScript development, from the browser to the server and now to the world’s most popular open source database.

Provides native NoSQL access to the MySQL Cluster storage engine without first going through SQL transformations and parsing

This session gives you everything you need in order to get started on building highly scalable, event-driven Web, mobile, and social applications.

It discusses:

  •  Implementation of the connector
  • Database connection and metadata management
  • Creating and executing queries
  • Use with InnoDB (via an internal SQL conversion) and MySQL Cluster (direct)

WHO:

  • Andrew Morgan, Principal Product Manager
  • John Duncan, Senior Software Developer
  • Craig Russell, Principal Software Developer

Q & A

  • What is the benefit of using an asynchrous run-time model? The async everywhere idea in Node.JS means the one execution thread is always doing work for a user. Never waiting on I/O. It allows you to minimize the cost of waiting on the network for i/o, and that’s a big part of how it can handle so many simultaneous requests.
  • Can you please ellaborate more about multi-threading in node.js? Node.js has a just one thread that runs JavaScript. Then there is also a small thread pool that handles some background i/o tasks.
  • Why can’t you use a synchrous model? Would there be any drawbacks? The async programming style can be hard to get used to … but in JavaScript, async is really “how things are done”. Just like browser-side JavaScript responding to a mouse event, this is server-side JS responding to a database event or a network event.
  • Is there also a synchronous Session function? There is a synchronous version of connect(). There are some “immediate” calls on session (which don’t have to do any waiting), but there are no synchronous calls on session that wait for anything
  • Most applications run as responses to requests and so most of the logic needs to be executed sequentially. How are asynchronous calls handled? Is there any mechanism to wait and join multiple asynch call for a particular section of logic? If you have multiple database operations to execute, you can use the batch feature which will wait for all operations to complete and then call back. If you have other operations you need to have your own synchronization mechanism.
  • It was mentioned that you can use this APi to access the MySQL Cluster data nodes directly or MyISAM/InnoDB data via SQL and a MySQL server – how do I configure that? In the connection properties for your database connection, you use adapter=ndb for a native NDB connection, or adapter=mysql for a MySQL connection.
  • Are results mapped into objects? Yes. When the inner layer reads a row from the database, it creates a JavaScript object (according to the mapping) and it calls your Constructor with no arguments and with this set to the newly created object.
  • So there is seperate mapping for tables and for results? No, a single mapping.
  • Does the object mapping support the entity relationships like in JPA implementations? Can it be referenced with JSON ojects with one-to-one, one-to-many, etc relationships? The current release does not support relationships/JOINs.
  • JavaScript is weakly typed. How are different data types handled with the same reffrence? Dynamic typing means that values (objects, exspanssions) have types (i.e. a set of available operations and properties) but not variables, so a variable can, over time, refer to values of different types.
  • Are there sql like transaction through the Node.js NoSQL API? Yes, the API looks something like this: session.currentTransaction().begin() … define operations … session.currentTransaction.commit()
  • So, we can use session to track any transaction? Right, a session owns its single transaction. If you don’t call begin() on the transaction, then your operation will run “auto-commit”.
  • Does adapter=mysql option use https://github.com/felixge/node-mysql‎ adapter adapter? Yes.
  • Is this library similar to what mongoose is to mongodb? I’m not very familiar with mongoose — but it is broadly similar to many object-to-database mapping layers.
  • Is there a working “Hello World” example showing all of these technologies as a big happy family? You could start with this… Using JavaScript and Node.js with MySQL Cluster – First steps
  • So, just for clarification, the chief advantage of this is it’s moving the heavy lifting from the server side (like with PHP) to the client side? Not quite, node.js is server-side javascript. The main advantage is an asynchronous processing of requests that never block. In contrast, a large number of server-threads need to be provided with requests that can block.
  • Node.js runs on V8 which developed by google, its run fast any browser or only in Chrome? Node.js actually runs in the app server rather than in your browser and so it will play nicely when accessed via any browser (or other client)




Webinar – Developing JavaScript Applications for Node.js with MySQL and NoSQL

Note that the webinar replay + transcript of the Questions and Answers is now available from here.

MySQL Cluster driver for JavaScript/Node.jsOn Thursday 12th September I’ll be co-presenting a free webinar on how MySQL Cluster delivers the key benefits of NoSQL Data Stores without having to give up the features that people rely on from relational databases (consistency, SQL etc.). There will be particular focus on how to use the new node.js JavaScript API which was recently released as part of MySQL Cluster 7.3.  As usual the webinar is free but you should register in advance here.

The new MySQL JavaScript Connector for Node.js is designed to deliver simplicity and high performance for JavaScript users.

Enables end-to-end JavaScript development, from the browser to the server and now to the world’s most popular open source database.

Provides native NoSQL access to the MySQL Cluster storage engine without first going through SQL transformations and parsing

This session gives you everything you need in order to get started on building highly scalable, event-driven Web, mobile, and social applications.

It discusses:

  •  Implementation of the connector
  • Database connection and metadata management
  • Creating and executing queries
  • Use with InnoDB (via an internal SQL conversion) and MySQL Cluster (direct)

WHO:

  • Andrew Morgan, Principal Product Manager
  • John Duncan, Senior Software Developer
  • Craig Russell, Principal Software Developer

WHEN:

  • Thu, Sep 12: 09:00 Pacific time (America)
  • Thu, Sep 12: 10:00 Mountain time (America)
  • Thu, Sep 12: 11:00 Central time (America)
  • Thu, Sep 12: 12:00 Eastern time (America)
  • Thu, Sep 12: 16:00 UTC
  • Thu, Sep 12: 17:00 Western European time
  • Thu, Sep 12: 18:00 Central European time
  • Thu, Sep 12: 19:00 Eastern European time
  • Thu, Sep 12: 21:30 India, Sri Lanka
  • Fri, Sep 13: 00:00 Singapore/Malaysia/Philippines time
  • Fri, Sep 13: 00:00 China time
  • Fri, Sep 13: 01:00 日本
  • Fri, Sep 13: 02:00 NSW, ACT, Victoria, Tasmania (Australia)

 





Using JavaScript and Node.js with MySQL Cluster – First steps

MySQL Cluster driver for JavaScript/Node.js

We’re very pleased to announce that MySQL Cluster 7.3 has gone GA; for a full run-down of the new features and enhancements, take a look at the "MySQL Cluster 7.3 New Features for Internet-Scale Performance with Carrier-Grade Availability" white paper but this post will focus on just one of the features – the MySQL Cluster JavaScript Driver for Node.js. The post will step you through setting everything up so that you can get your first Node.js code reading and writing from MySQL Cluster.

Background

MySQL Cluster is a distributed, shared nothing database that provides linear scalability, 99.999% availability and high read/write throughput with low latency. SQL is just one way of accessing data stored in MySQL Cluster – there are also a number of native NoSQL APIs that go straight to the data without any interim conversion to SQL. These APIs include C++, Memcached, Java, JPA and REST – in MySQL Cluster 7.3 JavaScript/Node.js is added to the list.

Node.js (typically referred to simply as "Node") is a platform for running fast, scalable applications using JavaScript. For maximum performance, Node.js is built around a non-blocking, event-driven architecture – just like MySQL Cluster and so they fit together very well.

The MySQL Cluster JavaScript Driver for Node.js is just what it sounds like it is – it’s a connector that can be called directly from your JavaScript code to read and write your data. As it accesses the data nodes directly, there is no extra latency from passing through a MySQL Server and need to convert from JavaScript code//objects into SQL operations. If for some reason, you’d prefer it to pass through a MySQL Server (for example if you’re storing tables in InnoDB) then that can be configured.

As with all of the other APIs that we offer, you’re able to mix & match access methods; while you access the database using JavaScript objects, these will be mapped to rows in database tables and this same data can simultaneously be accessed by other NoSQL API or through SQL.

MySQL Cluster Auto-Installer

MySQL Cluster Auto-Installer

Installing MySQL Cluster

Obviously, to try out our JavaScript code we’ll need a database to access. The simplest way to do this is to use the new auto-installer that’s part of MySQL Cluster 7.3; this provides a browser-based GUI that configures and runs a MySQL Cluster database that’s been configured for your application and environment. Using this installer is described in the "MySQL Cluster Auto-Installer" post.

 

 

 

Setting up the Database

Foreign Key constraints between tables

Tables with Foreign Key constraint

A simple database schema is being used – a single table would really be enough but we’ll actually create two in order to demonstrate that Foreign Key constraints (Foreign Keys are another new feature in MySQL Cluster 7,3) are still enforced when using the NoSQL APIs.

These two tables are set up as follows:

[billy@bob]$ mysql --protocol=tcp -u root
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 2
Server version: 5.6.10-ndb-7.3.1-cluster-gpl MySQL Cluster Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

mysql> CREATE TABLE test.counties (county VARCHAR(30) NOT NULL PRIMARY KEY,
country VARCHAR(30)) ENGINE=ndb;
Query OK, 0 rows affected (1.17 sec)

mysql> CREATE TABLE test.towns (town VARCHAR(30) NOT NULL PRIMARY KEY, county VARCHAR(30),
INDEX county_county_index (county), CONSTRAINT county_town FOREIGN KEY (county) REFERENCES counties(county) 
ON DELETE RESTRICT ON UPDATE RESTRICT) ENGINE=ndb;
Query OK, 0 rows affected (1.00 sec)

Installing Node.js

Node.js can be downloaded from Node.js.org.

If you’re using Windows then the MSI will guide you through the process – though when you’ve finished you’ll need to manually run the following from the cmd prompt:

npm install -g node-gyp

On Linux, you’ll need to manually build and install Node.js after you’ve downloaded it:

[root@bob node-v0.10.7]# ./configure
[root@bob node-v0.10.7]# make
[root@bob node-v0.10.7]# make install

Building mysql-js (the JavaScript driver for MySQL Cluster)

Regardless of what operating system is being used, the driver must be built before it can be used. Furthermore, to build the driver you need to have a local MySQL Cluster package (so that it can access the NDB API client library).

Depending on the exact Linux distro and version being used, some dependencies may need to be installed first:

[root@bob]$ yum install gcc
[root@bob]$ yum install gcc-c++
[root@bob]$ npm install mysql
[root@bob]$ npm install -g node-gyp

Once that has been done, the driver can be built – the following should be run from within the share/nodejs directory within the MySQL Cluster package:

[billy@bob nodejs]$ node configure.js 
[billy@bob nodejs]$ node-gyp configure build -d

Example Code

To begin with, the complete code is provided and then each component is explained.

var nosql = require('..'); 

var Town = function(name, county) {
    if (name) this.town = name;
    if (county) this.county = county;
};

// create basic object<->table mapping
var annotations = new nosql.TableMapping('towns').applyToClass(Town);

//check results of find
var onFind = function(err, result) {
    console.log('onFind.');
    if (err) {
        console.log(err);
    } else {
        console.log('Found: ' + JSON.stringify(result));
    }
    process.exit(0);
};

//check results of insert
var onInsert = function(err, object, session) {
    console.log('onInsert.');
    if (err) {
        console.log(err);
    } else {
        console.log('Inserted: ' + JSON.stringify(object));

        // Now read the data back out from the database
        session.find(Town, 'Maidenhead', onFind);
    }
};

// insert an object
var onSession = function(err, session) {
    console.log('onSession.');
    if (err) {
    console.log('Error onSession.');
        console.log(err);
        process.exit(0);
    } else {
        var data = new Town('Maidenhead', 'Berkshire');
        session.persist(data, onInsert, data, session);
    }
};

var dbProperties = nosql.ConnectionProperties('ndb');

console.log('Openning session');

// connect to the database
nosql.openSession(dbProperties, Town, onSession);

console.log('Openned session');

Step 1 Import API software

The first step is for your JavaScript application to specify that it will use the new API (mysql-js); this will work if this file is stored in a directory under cluster-install/share/nodejs:

var nosql = require('mysql-js');

Step 2 Define JavaScript class

A class should be defined that will be used by the application and will also be mapped to the towns table that has already been created. The class is defined through the specification of its constructor:

var Town = function(name, county) {
    if (name) this.town = name;
    if (county) this.county = county;
};

Step 3 Define schema mappings

MySQL Cluster is a relational database while JavaScript works with objects. To get the best of both worlds, mappings can be set up between MySQL tables (or individual columns) and JavaScript objects – in this case it’s kept very simple and the Town class is mapped to the towns table:

var annotations = new nosql.TableMapping('towns').applyToClass(Town);

Note that you can also define mappings at a finer level – only mapping to specific columns within a table and allowing the class members to have different names to the table columns.

Step 4 Connect to the database

In this step, the application connects to the database to indicate that it wants to access the data nodes directly rather than via the MySQL Server (and hidden conversion to SQL), the ConnectionProperties is set to ndb (rather than mysql). Additionally, a callback function (onSession) is passed as an argument. Recall that this is an event-driven, non-blocking, asynchronous API and so after calling openSession, control will be immediately returned to the application – without waiting for the session to actually be established – the application can then go away and perform other useful work. Once the session has been created, the callback function (onSession) will be invoked, allowing the application to take the next steps.

var dbProperties = nosql.ConnectionProperties('ndb');
nosql.openSession(dbProperties, annotations, onSession);

Step 5 Store some data

Note that the code in this step is the implementation of the call back function (onSession) that we passed in as a parameter to openSession in the previous step. This function simply instantiates a Town JavaScript object and then writes it to MySQL Cluster through the persist method. Once more, a callback function is passed in (onInsert) which will be invoked once the data has been stored in the database. Note that any objects specified after the callback function in the parameter list (in this case, the Town object data) will be provided as parameters to the callback function when it’s executed – note that the session parameter is also included and the framework will pass that as a parameter to the onInsert callback function.

var onSession = function(err, session) {
    if (err) {...} else {
        var data = new Town('Maidenhead', 'Berkshire');
        session.persist(data, onInsert, data, session);
    }
};

Step 6 Verify insert succeeded & initiate reading of data from the database

Next, our callback function (onInsert) is called and the object parameter contains the Town that had been passed into the persist method in the previous step, together with the session variable. This function simply checks that there wasn’t an error and then displays the Town that has been stored in the database. Again, a callback function is provided – onFind – that will be called once the data has been found.

var onInsert = function(err, object, session) {
    if (err) { ... } else {
        console.log('Inserted: ' + JSON.stringify(object));
        session.find(Town, 'Maidenhead', onFind);
    }
};

Step 7 Process the data read back from the database

Finally, onFind is called-back and it handles the processing of the results (whether it is an error or the Town object in the form of the result parameter).

var onFind = function(err, result) {
    console.log('onFind.');
    if (err) { ... } else {
        console.log('Found: ' + JSON.stringify(result));
    }
    process.exit(0);
};

The on-line documentation for mysql-js describes the API in full and includes some further examples.

Running the application

Before running the application, the location of the NDB API client library must be provided (this is with the MySQL Cluster package):

[billy@bob my_samples]$ export LD_LIBRARY_PATH=/home/billy/cluster_7_3/lib

The example code should be stored in a file called my-test.js in a directory under cluster-install/share/nodejs/my_samples. To run the application, just run the following (spoiler – there will be a database error!):

[billy@bob my_samples]$ node my-test.js 
Openning session
Openned session
Connected to cluster as node id: 50
onSession.
onInsert.
{ message: 'Error',
  sqlstate: '23000',
  ndb_error: null,
  cause: 
   { message: 'Foreign key constraint violated: No parent row found [255]',
     sqlstate: '23000',
     ndb_error: 
      { message: 'Foreign key constraint violated: No parent row found',
        handler_error_code: 151,
        code: 255,
        classification: 'ConstraintViolation',
        status: 'PermanentError' },
     cause: null } }

This error is seen because the code is attempting to add an object/row to the towns table where the county doesn’t already have an entry in the counties table – this breaks the Foreign Key constraint that was included when the tables were created.

To get around this, the required row should be added to the counties table:

[billy@bob my_samples]$ mysql --protocol=tcp -u root
Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 3
Server version: 5.6.10-ndb-7.3.1-cluster-gpl MySQL Cluster Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

mysql> INSERT INTO test.counties VALUES ('Berkshire', 'England');
Query OK, 1 row affected (0.25 sec)

The code can then be executed again – this time successfully:

[billy@drbd1 my_samples]$ node my-test.js 
Openning session
Openned session
Connected to cluster as node id: 50
onSession.
onInsert.
Inserted: {"town":"Maidenhead","county":"Berkshire"}
onFind.
Found: {"county":"Berkshire","town":"Maidenhead"}

What next?

Please give this a try – especially with your own code – and let us know how you get on. We’re especially interested in what features you’d like to see added – embedded objects, lists, joins….





MySQL Presentations from FOSDEM 2013

FOSDEM 2013

FOSDEM 2013

The Free and Open source Software Developers’ European Meeting (FOSDEM) is a two-day event organized by volunteers to promote the widespread use of Free and Open Source software.

I attended for the first time over the weekend and was really impressed by the number of people there, the energy and the quality of the content. The event really lives up to it’s name and is very developer-focused.

In the end, I got the opportunity to make 2 presentations. The first is a general introduction to MySQL Cluster….


The second illustrates how you can realise the benefits promised by NoSQL data stores wihtout losing the consistency and flexibility of relational databases…






NoSQL and MySQL – free webinar, replay now available

Schema-free NoSQL Data

Update – the webinar replay is now available from here.

On Thursday, I’ll be presenting a webinar on NoSQL (of course with a MySQL twist!) – as always it’s free to attend but you need to register here in advance. Even if you can’t attend, it’s worth registering as you’ll be sent a link to the replay and the charts. The session will introduce the concepts and motivations behind the NoSQL movement and then go on to explain how you can get most of the same benefits with MySQL (including MySQL Cluster) while still getting the RDBMS benefits such as ACID transactions.

The official description:

The ever increasing performance demands of web-based services has generated significant interest in providing NoSQL access methods to MySQL – enabling users to maintain all of the advantages of their existing relational database infrastructure, while providing blazing fast performance for simple queries, using an API to complement regular SQL access to their data. This session looks at the existing NoSQL access methods for MySQL as well as the latest developments for both the InnoDB and MySQL Cluster storage engines. See how you can get the best of both worlds – persistence, consistency, rich queries, high availability, scalability and simple, flexible APIs and schemas for agile development.

When:

  • Thursday, March 29, 2012: 09:00 Pacific time (America)
  • Thu, Mar 29: 06:00 Hawaii time
  • Thu, Mar 29: 10:00 Mountain time (America)
  • Thu, Mar 29: 11:00 Central time (America)
  • Thu, Mar 29: 12:00 Eastern time (America)
  • Thu, Mar 29: 16:00 UTC
  • Thu, Mar 29: 17:00 Western European time
  • Thu, Mar 29: 18:00 Central European time
  • Thu, Mar 29: 19:00 Eastern European time




Scalable, persistent, HA NoSQL Memcache storage using MySQL Cluster

Memcached API with Cluster Data Nodes

The native Memcached API for MySQL Cluster is now GA as part of MySQL Cluster 7.2

This post was first published in April 2011 when the first trial version of the Memcached API for MySQL Cluster was released; it was then up-versioned for the second MySQL Cluster 7.2 Development Milestone Release in October 2011. I’ve now refreshed the post based on the GA of MySQL Cluster 7.2 which includes the completed Memcache API.

There are a number of attributes of MySQL Cluster that make it ideal for lots of applications that are considering NoSQL data stores. Scaling out capacity and performance on commodity hardware, in-memory real-time performance (especially for simple access patterns), flexible schemas… sound familiar? In addition, MySQL Cluster adds transactional consistency and durability. In case that’s not enough, you can also simultaneously combine various NoSQL APIs with full-featured SQL – all working on the same data set. This post focuses on a new Memcached API that is now available to download, try out and deploy. This post steps through setting up Cluster with the Memcached API and then demonstrates how to read and write the same data through both Memcached and SQL (including for existing MySQL Cluster tables).

Download the community version from mysql.com or the commercial version from Oracle’s Software Delivery Cloud (note that there is not currently a Windows version).

Traditional use of Memcached

First of all a bit of background about Memcached. It has typically been used as a cache when the performance of the database of record (the persistent database) cannot keep up with application demand. When changing data, the application will push the change to the database, when reading data, the application first checks the Memached cache, if it is not there then the data is read from the database and copied into Memcached. If a Memcached instance fails or is restarted for maintenance reasons, the contents are lost and the application will need to start reading from the database again. Of course the database layer probably needs to be scaled as well so you send writes to the master and reads to the replication slaves.

This has become a classic architecture for web and other applications and the simple Memcached attribute-value API has become extremely popular amongst developers.

As an illustration of the simplicity of this API, the following example stores and then retrieves the string “Maidenhead” against the key “Test”:

telnet localhost 11211
set Test 0 0 10
Maidenhead!
END
get Test
VALUE Test 0 10
Maidenhead!
END

Note that if we kill and restart the memcached server, the data is lost (as it was only held in RAM):

get Test
END

New options for using Memcached API with MySQL Cluster

Architecture for Memcached NDB API

What we’re doing with MySQL Cluster is to offer a bunch of new ways of using this API but with the benefits of MySQL Cluster. The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs.

A quick diversion on how this is implemented. The application sends reads and writes to the memcached process (using the standard Memcached API). This in turn invokes the Memcached Driver for NDB (which is part of the same process) which in turn calls the NDB API for very quick access to the data held in MySQL Cluster’s data nodes (it’s the fastest way of accessing MySQL Cluster).

Because the data is now stored in MySQL Cluster, it is persistent and you can transparently scale out by adding more data nodes (this is an on-line operation).

Another important point is that the NDB API is already a commonly used, fully functional access method that the Memcached API can exploit. For example, if you make a change to a piece of data then the change will automatically be written to any MySQL Server that has its binary logging enabled which in turn means that the change can be replicated to a second site.

Memcached API with Cluster Data Nodes

So the first (and probably simplest) architecture is to co-locate the Memcached API with the data nodes.

The applications can connect to any of the memcached API nodes – if one should fail just switch to another as it can access the exact same data instantly. As you add more data nodes you also add more memcached servers and so the data access/storage layer can scale out (until you hit the 48 data node limit).

Memcached server with the Application

Another simple option is to co-locate the Memcached API with the application. In this way, as you add more application nodes you also get more Memcached throughput. If you need more data storage capacity you can independently scale MySQL Cluster by adding more data nodes. One nice feature of this approach is that failures are handled very simply – if one App/Memcached machine should fail, all of the other applications just continue accessing their local Memcached API.

Separate Memcached layer

For maximum flexibility, you can have a separate Memcached layer so that the application, the Memcached API & MySQL Cluster can all be scaled independently.

In all of the examples so far, there has been a single source for the data (it’s all held in MySQL Cluster).






Local Cache in Memcached

If you choose, you can still have all or some of the data cached within the memcached server (and specify whether that data should also be persisted in MySQL Cluster) – you choose how to treat different pieces of your data. If for example, you had some data that is written to and read from frequently then store it just in MySQL Cluster, if you have data that is written to rarely but read very often then you might choose to cache it in Memcached as well and if you have data that has a short lifetime and wouldn’t benefit from being stored in MySQL Cluster then only hold it in Memcached. The beauty is that you get to configure this on a per-key-prefix basis (through tables in MySQL Cluster) and that the application doesn’t have to care – it just uses the Memcached API and relies on the software to store data in the right place(s) and to keep everything in sync.

Of course if you want to access the same data through SQL then you’d make sure that it was configured to be stored in MySQL Cluster.

Enough of the theory, how to try it out…

Installing & configuarying the software

As this post is focused on API access to the data rather than testing High Availability, performance or scalability the Cluster can be kept extremely simple with all of the processes (nodes) running on a single server. The only thing to be careful of when you create your Cluster is to make sure that you define at least 5 API sections (e.g. [mysqld]) in your configuration file so you can access using SQL and 2 Memcached servers (each uses 2 connections) at the same time.

For further information on how to set up a single-host Cluster, refer to this post or just follow the next few steps.

Create a config.ini file for the Cluster configuration:

[ndb_mgmd]
hostname=localhost
datadir=/home/billy/my_cluster/ndb_data
NodeId=1

[ndbd default]
noofreplicas=2
datadir=/home/billy/my_cluster/ndb_data

[ndbd]
hostname=localhost
NodeId=3

[ndbd]
hostname=localhost
NodeId=4

[mysqld]
NodeId=50

[mysqld]
NodeId=51

[mysqld]
NodeId=52

[mysqld]
NodeId=53

[mysqld]
NodeId=54

and a my.cnf file for the MySQL server:

[mysqld]
ndbcluster
datadir=/home/billy/my_cluster/mysqld_data

Before starting the Cluster, install the standard databases for the MySQL Server (from wherever you have MySQL Cluster installed – typically /usr/local/mysql):

[billy@ws2 mysql]$ ./scripts/mysql_install_db
  --basedir=/usr/local/mysql
  --datadir=/home/billy/my_cluster/mysqld_data
  --user=billy
Start up the system

We are now ready to start up the Cluster processes:

[billy@ws2 my_cluster]$ ndb_mgmd -f conf/config.ini
  --initial --configdir=/home/billy/my_cluster/conf/
[billy@ws2 my_cluster]$ ndbd
[billy@ws2 my_cluster]$ ndbd
[billy@ws2 my_cluster]$ ndb_mgm -e show # Wait for data nodes to start
[billy@ws2 my_cluster]$ mysqld --defaults-file=conf/my.cnf &

If your version doesn’t already have the ndbmemcache database installed then that should be your next step:

[billy@ws2 ~]$ mysql -h 127.0.0.1 -P3306 -u root < /usr/local/mysql/share/memcache-api/ndb_memcache_metadata.sql

After that, start the Memcached server (with the NDB driver activated):

[billy@ws2 ~]$  memcached -E /usr/local/mysql/lib/ndb_engine.so -e "connectstring=localhost:1186;role=db-only" -vv -c 20

Notice the “connectstring” – this allows the primary Cluster to be on a different machine to the Memcached API. Note that you can actually use the same Memcached server to access multiple Clusters – you configure this within the ndbmemcached database in the primary Cluster. In a production system you may want to include reconf=false amogst the -e parameters in order to stop configuration changes being applied to running Memcached servers (you’d need to restart those servers instead).

Try it out!

Next the fun bit – we can start testing it out:

[billy@ws2 ~]$ telnet localhost 11211

set maidenhead 0 0 3 
SL6 
STORED 
get maidenhead 
VALUE maidenhead 0 3 
SL6 
END

We can now check that the data really is stored in the database:

mysql> SELECT * FROM ndbmemcache.demo_table;
   +------------------+------------+-----------------+--------------+
   | mkey             | math_value | cas_value       | string_value |
   +------------------+------------+-----------------+--------------+
   | maidenhead       |       NULL | 263827761397761 | SL6          |
   +------------------+------------+-----------------+--------------+

Of course, you can also modify this data through SQL and immediately see the change through the Memcached API:

mysql> UPDATE ndbmemcache.demo_table SET string_value='sl6 4' WHERE mkey='maidenhead';

[billy@ws2 ~]$ telnet localhost 11211

get maidenhead
VALUE maidenhead 0 5 
SL6 4 
END

By default, the normal limit of 14K per row still applies when using the Memcached API; however, the standard configuration treats any key-value pair with a key-pefix of “b:” differently and will allow the value to be up to 3 Mb (note the default limit imposed by the Memcached server is 1 Mb and so you’d also need to raise that). Internally the contents of this value will be split between 1 row in ndbmemcache.demo_table_large and one or more rows in ndbmemcache.external_values.

Note that this is completely schema-less, the application can keep on adding new key/value pairs and they will all get added to the default table. This may well be fine for prototyping or modest sized databases. As you can see this data can be accessed through SQL but there’s a good chance that you’ll want a richer schema on the SQL side or you’ll need to have the data in multiple tables for other reasons (for example you want to replicate just some of the data to a second Cluster for geographic redundancy or to InnoDB for report generation).

The next step is to create your own databases and tables (assuming that you don’t already have them) and then create the definitions for how the app can get at the data through the Memcached API. First let’s create a table that has a couple of columns that we’ll also want to make accessible through the Memcached API:

mysql> CREATE DATABASE clusterdb; USE clusterdb;
mysql> CREATE TABLE towns_tab (town VARCHAR(30) NOT NULL PRIMARY KEY,
  zip VARCHAR(10), population INT, county VARCHAR(10)) ENGINE=NDB;
mysql> INSERT INTO towns_tab VALUES ('Marlow', 'SL7', 14004, 'Berkshire');

Next we need to tell the NDB driver how to access this data through the Memcached API. Two ‘containers’ are created that identify the columns within our new table that will be exposed. We then define the key-prefixes that users of the Memcached API will use to indicate which piece of data (i.e. database/table/column) they are accessing:

mysql> USE ndbmemcache;
mysql> INSERT INTO containers VALUES ('towns_cnt', 'clusterdb',
'towns_tab', 'town', 'zip', 0, NULL, NULL, NULL, NULL);
mysql> INSERT INTO containers VALUES ('pop_cnt', 'clusterdb',
  'towns_tab', 'town', 'population', 0, NULL, NULL, NULL, NULL);
mysql> SELECT * FROM containers;
   +------------+-------------+------------------+-------------+----------------+-------+------------------+------------+--------------------+-----------------------------+
   | name       | db_schema   | db_table         | key_columns | value_columns  | flags | increment_column | cas_column | expire_time_column | large_values_table          |
   +------------+-------------+------------------+-------------+----------------+-------+------------------+------------+--------------------+-----------------------------+
   | demo_ext   | ndbmemcache | demo_table_large | mkey        | string_value   | 0     | NULL             | cas_value  | NULL               | ndbmemcache.external_values |
   | towns_cnt  | clusterdb   | towns_tab        | town        | zip            | 0     | NULL             | NULL       | NULL               | NULL                        |
   | demo_table | ndbmemcache | demo_table       | mkey        | string_value   | 0     | math_value       | cas_value  | NULL               | NULL                        |
   | pop_cnt    | clusterdb   | towns_tab        | town        | population     | 0     | NULL             | NULL       | NULL               | NULL                        |
   | demo_tabs  | ndbmemcache | demo_table_tabs  | mkey        | val1,val2,val3 | flags | NULL             | NULL       | expire_time        | NULL                        |
   +------------+-------------+------------------+-------------+----------------+-------+------------------+------------+--------------------+-----------------------------+
mysql> INSERT INTO key_prefixes VALUES (1, 'twn_pr:', 0,
  'ndb-only', 'towns_cnt');
mysql> INSERT INTO key_prefixes VALUES (1, 'pop_pr:', 0,
  'ndb-only', 'pop_cnt');
mysql> SELECT * FROM key_prefixes;
   +----------------+------------+------------+---------------+------------+
   | server_role_id | key_prefix | cluster_id | policy        | container  |
   +----------------+------------+------------+---------------+------------+
   |              1 | pop_pr:    |          0 | ndb-only      | pop_cnt    |
   |              0 | t:         |          0 | ndb-test      | demo_tabs  |
   |              3 |            |          0 | caching       | demo_table |
   |              0 |            |          0 | ndb-test      | demo_table |
   |              0 | mc:        |          0 | memcache-only | NULL       |
   |              1 | b:         |          0 | ndb-only      | demo_ext   |
   |              2 |            |          0 | memcache-only | NULL       |
   |              1 |            |          0 | ndb-only      | demo_table |
   |              0 | b:         |          0 | ndb-test      | demo_ext   |
   |              3 | t:         |          0 | caching       | demo_tabs  |
   |              1 | t:         |          0 | ndb-only      | demo_tabs  |
   |              4 |            |          0 | ndb-test      | demo_ext   |
   |              1 | twn_pr:    |          0 | ndb-only      | towns_cnt  |
   |              3 | b:         |          0 | caching       | demo_ext   |
   +----------------+------------+------------+---------------+------------+

At present it is necessary to restart the Memcached server in order to pick up the new key_prefix (and so you’d want to run multiple instances in order to maintain service):


[billy@ws2:~]$ memcached -E /usr/local/mysql/lib/ndb_engine.so -e "connectstring=localhost:1186;role=db-only" -vv -c 20
   07-Feb-2012 11:22:29 GMT NDB Memcache 5.5.19-ndb-7.2.4 started [NDB 7.2.4; MySQL 5.5.19]
   Contacting primary management server (localhost:1186) ...
   Connected to "localhost:1186" as node id 51.
   Retrieved 5 key prefixes for server role "db-only".
   The default behavior is that:
       GET uses NDB only
       SET uses NDB only
       DELETE uses NDB only.
   The 4 explicitly defined key prefixes are "b:" (demo_table_large), "pop_pr:" (towns_tab), 
      "t:" (demo_table_tabs) and "twn_pr:" (towns_tab)

Now these columns (and the data already added through SQL) are accessible through the Memcached API:

[billy@ws2 ~]$ telnet localhost 11211

get twn_pr:Marlow
VALUE twn_pr:Marlow 0 3 
SL7 
END 
set twn_pr:Maidenhead 0 0 3 
SL6 
STORED 
set pop_pr:Maidenhead 0 0 5 
42827 
STORED

and then we can check these changes through SQL:

mysql> SELECT * FROM clusterdb.towns_tab;
   +------------+------+------------+-----------+
   | town       | zip  | population | county    |
   +------------+------+------------+-----------+
   | Maidenhead | SL6  |      42827 | NULL      |
   | Marlow     | SL7  |      14004 | Berkshire |
   +------------+------+------------+-----------+

One final test is to start a second memcached server that will access the same data. As everything is running on the same host, we need to have the second server listen on a different port:

[billy@ws2 ~]$ memcached -E /usr/local/mysql/lib/ndb_engine.so 
   -e "connectstring=localhost:1186;role=db-only" -vv -c 20 
   -p 11212 -U 11212 
[billy@ws2 ~]$ telnet localhost 11212

get twn_pr:Marlow 
VALUE twn_pr:Marlow 0 3
SL7 
END

Memcached alongside NoSQL & SQL APIs

As mentioned before, there’s a wide range of ways of accessing the data in MySQL Cluster – both SQL and NoSQL. You’re free to mix and match these technologies – for example, a mission critical business application using SQL, a high-running web app using the Memcached API and a real-time application using the NDB API. And the best part is that they can all share the exact same data and they all provide the same HA infrastructure (for example synchronous replication and automatic failover within the Cluster and geographic replication to other clusters).

Finally, a reminder – please try this out and let us know what you think (or if you don’t have time to try it then let us now what you think anyway) by adding a comment to this post.





Direct access to MySQL Cluster through Memcached API – free webinar

Memcached access to MySQL Cluster

As described in an earlier post Memcached is an extremely popular caching layer used in most big web properties and we’re adding the ability to access MySQL Cluster directly using the familiar Memcached key-value/NoSQL API without needing to go through the MySQL Server. There is a huge amount of flexibility built into this solution – including:

  • Decide what data should be held only in the Memcached server; what should be written straight through to MySQL Cluster and then discarded  and what data should be cached in Memcached but persisted in MySQL Cluster
  • Where data is held both in Cluster and the Memcached server, they can automatically be kept in sync
  • By default it’s completely schema-less, all key-value pairs will be transparently stored in a single table within MySQL Cluster behind the scenes
  • Can map key-prefixes to columns in MySQL Cluster tables – allowing simultaneous access to the same data using SQL.
Mat Keep along with JD Duncan (lead developer for this functionality) will be hosting a free webinar on this topic (and I’ll be helping with the Q&A) on Wednesday (20th July 2011) at 9:00 am Pacific (17:00 UK, 18:00 CET). As always, please register for the event even if you can’t make this time as you’ll be sent a link to the recording.




Webinar: MySQL Cluster, Scaling Web Databases with Auto-Partitioning and SQL/NoSQL Access

Scale-out with MySQL Cluster

Update: webinar replay is now available from http://event.on24.com/r.htm?e=311660&s=1&k=3DCFE1CB3E1CF3F0FD0969DC66D93989

On Thursday 26th May Mat Keep and I will be presenting a webinar on how MySQL Cluster can deliver linear scalability – together with some tips on how to achieve it. As always the webinar is free but you need to register here.

 

The session starts on Thu, May 26 at 17:00 UK time, 18:00 Central European Time, 09:00 Pacific.

This webinar will discuss best practices in scaling services on-demand for high volumes of reads and writes, and provide insight on the range of NoSQL and SQL access methods available to developers, specifically covering:

  • Automatic partitioning (sharding) for high scalability
  • On-line scaling of the cluster across commodity hardware
  • SQL and NoSQL interfaces, and what should be used when
  • On-line updating of schema design to accommodate rapidly evolving applications
  • Resources to get started




Are you using NoSQL databases?

We’re interested in finding out what NoSQL databases you might be using (and we include MySQL Cluster in that list when using one of its NoSQL interfaces such as the NDB API or ClusterJ).

To figure this out we’ve posted a quick poll on the home page of dev.mysql.com (go straight to the bottom-right corner of the page) – please take 30 seconds to complete it (it shouldn’t take any longer than that) if you can. In return, you can also see the latest results from the poll.