Archive for MySQL Cluster

SQL/NoSQL and MySQL Cluster 7.4 Presentations now available

My 2 sessions from 2014’s MySQL Central at Oracle OpenWorld are now available:

NoSQL and SQL: The Best of Both Worlds [CON2853]

There’s a lot of excitement about NoSQL data stores, with the promise of simple access patterns, flexible schemas, scalability, and high availability. The downside comes in the form of losing ACID transactions, consistency, flexible queries, and data integrity checks. What if you could have the best of both worlds? This session shows how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data—whether it’s in a simple key-value API (memcached) or REST, JavaScript, Java, or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance; 99.999 percent availability; online maintenance; and linear, horizontal scalability through transparent autosharding.

MySQL Cluster: Dive into the Latest Developments [CON3815]

Wednesday, Oct 1, 3:30 PM – 4:15 PM – Moscone South – 250

I’ll be co-presenting this session with Bernd Ocklin – Director MySQL Cluster, Oracle

MySQL Cluster does more than scale beyond a billion transactions per minute. It’s also the in-memory database at the heart of mobile phone networks and online games. Scaling for the masses. A touch of your mobile phone’s green button likely has already gotten you in contact with MySQL Cluster. Driven by these extreme use cases, this session covers how to build business-critical scalable solutions with MySQL Cluster.





Active-Active Replication, Performance Improvements & Operational Enhancements – some of what’s available in the new MySQL Cluster 7.4.1 DMR

Oracle have just made available the new MySQL Cluster 7.4.1 Development Milestone Release – it can be downloaded from the development release tab here. Note that this is not a GA release and so we wouldn’t recommend using it in production.

There are three main focus areas for this DMR and the purpose of this post is to briefly introduce them:

  • Active-Active (Multi-Master) Replication
  • Performance
  • Operational improvements (speeding up of restarts; enhanced memory reporting)

Active-Active (Multi-Master) Replication

MySQL Cluster allows bi-directional replication between two (or more) clusters. Replication within each cluster is synchronous, but between clusters it is asynchronous, which means the following scenario is possible:

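Conflict with asynchronous replication

+---------+-------------+---------+
| Site A  | Replication | Site B  |
+---------+-------------+---------+
| x == 10 |             | x == 10 |
| x = 11  |             | x = 20  |
|         | -- x=11 --> | x == 11 |
| x == 20 | <-- x=20 -- |         |
+---------+-------------+---------+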

In this example a value (column for a row in a table) is set to 11 on site A and the change is queued for replication to site B. In the meantime, an application sets the value to 20 on site B and that change is queued for replication to site A. Once both sites have received and applied the replicated change from the other cluster, site A contains the value 20 while site B contains 11 – in other words the databases are now inconsistent.
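
The scenario can be replayed with a toy model. This is only a sketch in Python: the `Site` class, its queue and the `run_scenario` helper are made up for illustration and have nothing to do with how MySQL Cluster is implemented. It simply shows why applying each side's queued change on the other side leaves the two copies swapped.

```python
from collections import deque


class Site:
    """Hypothetical stand-in for one cluster holding a single value x."""

    def __init__(self, name, x):
        self.name = name
        self.x = x
        self.outbound = deque()  # changes queued for asynchronous replication

    def local_write(self, value):
        """An application write: applied locally, then queued for the peer."""
        self.x = value
        self.outbound.append(value)

    def apply_from(self, other):
        """Apply everything the peer has queued (not re-queued locally)."""
        while other.outbound:
            self.x = other.outbound.popleft()


def run_scenario():
    a, b = Site("A", 10), Site("B", 10)
    a.local_write(11)   # x = 11 on site A, queued for B
    b.local_write(20)   # x = 20 on site B, queued for A
    b.apply_from(a)     # B applies A's change
    a.apply_from(b)     # A applies B's change
    return a.x, b.x


if __name__ == "__main__":
    print(run_scenario())  # (20, 11): the two sites now disagree
```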

How MySQL Cluster implements eventual consistency

There are two phases to establishing consistency between both clusters after an inconsistency has been introduced:

  1. Detect that a conflict has happened
  2. Resolve the inconsistency

The following animation illustrates how MySQL Cluster 7.2 detects that an inconsistency has been introduced by the asynchronous, active-active replication:

Detecting conflicts

While we typically consider the 2 clusters in an active-active replication configuration to be peers, in this case we designate one to be the primary and the other the secondary. Reads and writes can still be sent to either cluster but it is the responsibility of the primary to identify that a conflict has arisen and then remove the inconsistency.

A logical clock is used to identify (in relative terms) when a change is made on the primary – for those who know something of the MySQL Cluster internals, we use the index of the Global Checkpoint that the update is contained in. For all tables that have this feature turned on, an extra, hidden column is automatically added on the primary – this represents the value of the logical clock when the change was made.

Once the change has been applied on the primary, there is a “window of conflict” for the affected row(s) during which, if a different change is made to the same row(s) on the secondary, there will be an inconsistency. Once the slave on the secondary has applied the change from the primary, it will send a replication event back to the slave on the primary, containing the primary’s clock value associated with the changes that have just been applied on the secondary. (Remember that the clock is actually the Global Checkpoint Index and so this feature is sometimes referred to as Reflected GCI.) Once the slave on the primary has received this event, it knows that all changes tagged with a clock value no later than the reflected GCI are now safe – the window of conflict has closed.

If an application modifies this same row on the secondary before the replication event from the primary has been applied, then the secondary will send an associated replication event to the slave on the primary before it reflects the new GCI. The slave on the primary will process this replication event and compare the clock value recorded with the affected rows with the latest reflected GCI; as the clock value for the conflicting row is higher, the primary recognises that a conflict has occurred and will launch the algorithm to resolve the inconsistency.
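
To make the comparison concrete, here is a heavily simplified sketch in Python. The `Primary` class and all of its method names are hypothetical, and a plain integer counter stands in for the Global Checkpoint Index; it models only the rule described above (a row whose clock value is higher than the latest reflected GCI is still inside its window of conflict) and is not the actual NDB code.

```python
class Primary:
    """Toy model of the conflict check performed on the primary cluster."""

    def __init__(self):
        self.gci = 0            # logical clock (stand-in for the Global Checkpoint Index)
        self.row_clock = {}     # plays the role of the hidden per-row clock column
        self.reflected_gci = 0  # highest clock value reflected back by the secondary

    def advance_clock(self):
        self.gci += 1

    def local_update(self, key):
        """A change made on the primary is tagged with the current clock."""
        self.row_clock[key] = self.gci

    def on_reflected(self, gci):
        """The secondary reports that it has applied changes up to this clock value."""
        self.reflected_gci = max(self.reflected_gci, gci)

    def on_secondary_change(self, key):
        """Return True if a change arriving from the secondary conflicts."""
        return self.row_clock.get(key, 0) > self.reflected_gci


if __name__ == "__main__":
    p = Primary()
    p.advance_clock()
    p.local_update("x")
    print(p.on_secondary_change("x"))  # window of conflict still open
    p.on_reflected(1)
    print(p.on_secondary_change("x"))  # window closed
```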

Options for MySQL Cluster replication conflict detection/resolution

After a conflict has been detected, you have the option of having the database simply report the conflict to the application or have it roll back just the conflicting row or the entire transaction and all subsequent transactions that were dependent on it.

So – what’s new in 7.4.1?

  • Detects conflicts between inserts and updates
  • Option to roll back entire transaction (and dependent transactions) rather than just the conflicting row
  • All conflicts are handled before switching primary – avoiding potential race conditions

As mentioned at the start of this post, this is pre-GA and there are some extra enhancements we plan on including in the final version:

  • Handle deletes which conflict with other operations
  • Roll back transactions that have read a row that had been rolled back due to a conflict

Performance

MySQL Cluster 7.4.1 Read-Write Performance
Being a scaled-out, in-memory, real-time database, MySQL Cluster performance has always been great but we continue to work on making it faster with each release. In particular, we want to keep pace with the trend of having more and more cores rather than faster ones. 7.4 continues along the path of better exploiting multiple cores – as can be seen from these benchmark results.
MySQL Cluster 7.4.1 Read Performance
Just make sure that you’re using the multi-threaded data node (ndbmtd rather than ndbd) and have configured how many threads it should use.

Faster Restarts

You can restart MySQL Cluster processes (nodes) without losing database service (for example if adding extra memory to a server) and so on the face of it, the speed of the restarts isn’t that important. Having said that, while the node is restarting you’ve lost some of your high-availability which for super-critical applications can make you nervous. Additionally, faster restarts mean that you can complete maintenance activities faster – for example, a software upgrade requires a rolling restart of all of the nodes – if you have 48 data nodes then you want each of the data nodes to restart as quickly as possible.

MySQL Cluster 7.4.1 includes a number of optimisations to the restart code and so if you’re already using MySQL Cluster, it might be interesting to see how much faster it gets for your application. We also have some extra optimisations in the works that you can expect to see in later 7.4 versions.

Extra Memory Reporting

MySQL Cluster presents a lot of monitoring information through the ndbinfo database and in 7.4 we’ve added some extra information on how memory is used for individual tables.

For example, to see how much memory is being used by each data node for a particular table…

mysql> CREATE DATABASE clusterdb; USE clusterdb;
mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=NDB;
mysql> SELECT node_id AS node, fragment_num AS frag,
    ->        fixed_elem_alloc_bytes AS alloc_bytes,
    ->        fixed_elem_free_bytes AS free_bytes,
    ->        fixed_elem_free_rows AS spare_rows
    ->        FROM ndbinfo.memory_per_fragment
    ->        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+

When you delete rows from a MySQL Cluster table, the memory is not actually freed up and so if you check the existing memoryusage table you won’t see a change. This memory will be reused when you add new rows to that same table. In MySQL Cluster 7.4, it’s possible to see how much memory is in that state for a table…

mysql> SELECT node_id AS node, fragment_num AS frag,
    ->        fixed_elem_alloc_bytes AS alloc_bytes,
    ->        fixed_elem_free_bytes AS free_bytes,
    ->        fixed_elem_free_rows AS spare_rows
    ->        FROM ndbinfo.memory_per_fragment
    ->        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+
mysql> DELETE FROM clusterdb.simples LIMIT 1;
mysql> SELECT node_id AS node, fragment_num AS frag,
    ->        fixed_elem_alloc_bytes AS alloc_bytes,
    ->        fixed_elem_free_bytes AS free_bytes,
    ->        fixed_elem_free_rows AS spare_rows
    ->        FROM ndbinfo.memory_per_fragment
    ->        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1312 |         41 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1312 |         41 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4288 |        134 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4288 |        134 |
+------+------+-------------+------------+------------+
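
If you save the output of the query before and after a delete, a few lines of script can show which fragments got memory back. The following is a minimal sketch in Python; the helper names `parse_mysql_table` and `free_bytes_delta` are made up for this example, and the sample text is just two rows copied from the output above.

```python
def parse_mysql_table(text):
    """Parse mysql client tabular output into a list of dicts of ints."""
    header, rows = None, []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, map(int, cells))))
    return rows


def free_bytes_delta(before, after):
    """Map (node, frag) -> change in free_bytes, for fragments that changed."""
    old = {(r["node"], r["frag"]): r["free_bytes"] for r in before}
    delta = {}
    for r in after:
        key = (r["node"], r["frag"])
        change = r["free_bytes"] - old[key]
        if change:
            delta[key] = change
    return delta


# Two rows copied from the "before" and "after" outputs shown above
BEFORE = """
| node | frag | alloc_bytes | free_bytes | spare_rows |
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
"""
AFTER = """
| node | frag | alloc_bytes | free_bytes | spare_rows |
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1312 |         41 |
"""

if __name__ == "__main__":
    print(free_bytes_delta(parse_mysql_table(BEFORE), parse_mysql_table(AFTER)))
```

For these two rows the only change is on node 1, fragment 2, where free_bytes grows by 32 as spare_rows goes from 40 to 41.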

As a final example, we can check whether a table is being evenly sharded across the data nodes (in this case a really bad sharding key was chosen)…

mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT,
    ->        species VARCHAR(20) DEFAULT "Human",
    ->        PRIMARY KEY(id, species)) ENGINE=NDB PARTITION BY KEY(species);

// Add some data

mysql> SELECT node_id AS node, fragment_num AS frag,
    ->        fixed_elem_alloc_bytes AS alloc_bytes,
    ->        fixed_elem_free_bytes AS free_bytes,
    ->        fixed_elem_free_rows AS spare_rows
    ->        FROM ndbinfo.memory_per_fragment
    ->        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |           0 |          0 |          0 |
|    1 |    2 |      196608 |      11732 |        419 |
|    2 |    0 |           0 |          0 |          0 |
|    2 |    2 |      196608 |      11732 |        419 |
|    3 |    1 |           0 |          0 |          0 |
|    3 |    3 |           0 |          0 |          0 |
|    4 |    1 |           0 |          0 |          0 |
|    4 |    3 |           0 |          0 |          0 |
+------+------+-------------+------------+------------+
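
The skew is easy to quantify once the numbers are out of the database. As a rough sketch (Python, with a made-up helper name `alloc_share_by_node`, and the node, frag and alloc_bytes values copied by hand from the output above), this works out each data node's share of the memory allocated to the table:

```python
from collections import defaultdict

# (node, frag, alloc_bytes) copied from the query output above
ROWS = [
    (1, 0, 0), (1, 2, 196608),
    (2, 0, 0), (2, 2, 196608),
    (3, 1, 0), (3, 3, 0),
    (4, 1, 0), (4, 3, 0),
]


def alloc_share_by_node(rows):
    """Return each node's fraction of the total allocated bytes."""
    totals = defaultdict(int)
    for node, _frag, alloc in rows:
        totals[node] += alloc
    grand_total = sum(totals.values())
    return {
        node: (total / grand_total if grand_total else 0.0)
        for node, total in sorted(totals.items())
    }


if __name__ == "__main__":
    for node, share in alloc_share_by_node(ROWS).items():
        print(f"node {node}: {share:.0%} of allocated bytes")
```

With an even spread each of the four nodes would hold about a quarter; here nodes 1 and 2 hold half each and nodes 3 and 4 hold nothing.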

If you get the chance to try out this new release then please let us know how you get on – either through a comment on this blog, a MySQL bug report or a post to the MySQL Cluster Forum.





MySQL Cluster latest developments – webinar replay + Q&A

I recently hosted a webinar which explained what MySQL Cluster is, what it can deliver and what the latest developments were. The “Discover the latest MySQL Cluster Developments” webinar is now available to view here. At the end of this article you’ll find a full transcript of the Q&A from the live session.

Details:

View this webinar to learn how MySQL Cluster 7.3, the latest GA release, enables developer agility by making it far simpler and faster to build your products and web-based applications with MySQL Cluster. You’ll also learn how MySQL Cluster and its linear scalability, 99.999% uptime, real-time responsiveness, and ability to perform over 1 BILLION Writes per Minute can help your products and applications meet the needs of the most demanding markets. MySQL Cluster combines these capabilities and the affordability of open source, making it well suited for use as an embedded database.

In this replay you’ll learn about the following MySQL Cluster capabilities, including the latest innovations in the 7.3 GA release:

  • Auto-sharding (partitioning) across commodity hardware for extreme read and write scalability
  • Cross-data center geographic synchronous and asynchronous replication
  • Online scaling and schema upgrades, now with improved Connection Thread Scalability
  • Real-time optimizations for ultra-low, predictable latency
  • Foreign Key Support for tight referential integrity
  • SQL and NoSQL interfaces, now with support for Node.js
  • Support for MySQL 5.6, allowing use of the latest InnoDB and NDB engines within one database
  • Integrated HA for 99.999% availability
  • Auto-Installer that installs, configures, provisions and tunes a production grade cluster in minutes

In addition, you will get a sneak preview of some of the new features planned in MySQL Cluster 7.4. Come and learn how MySQL Cluster can help you differentiate your products and extend their reach into new markets, as well as deliver highly demanding web-based applications, either on premises or in the cloud.

Q&A Transcript

  • When using the Memcached API, can I use my existing Memcached connector? Yes. The Memcached API actually uses the regular memcached protocol but then has a custom plugin that accesses the MySQL Cluster data nodes rather than using its local in-memory store.
  • If I’m replicating between 2 Clusters in 2 data centres and the WAN fails for a minute – what happens? Because the replication between MySQL Cluster instances is asynchronous – the application isn’t impacted (for example, there will be no extra errors or latency). The changes will be stored in the binary log of the Cluster to which they were sent and then replicated to the other site once the WAN returns.
  • Can I scale back down as well as up? It’s an online operation to reduce the number of MySQL Servers (or other application nodes) but that isn’t currently possible for the data nodes. In reality, it’s very rare that applications need to reduce the amount of data they store.
  • Are there any MySQL connectors that don’t work with MySQL Cluster? No, any connector that works with MySQL will work just as well with MySQL Cluster.
  • Do you have more details on the benchmark results? Yes – take a look at the MySQL Cluster Benchmarks page.
  • I’ve been hearing about MySQL Fabric – does that also allow queries and joins to span multiple shards? Currently, the only option for cross-shard queries is to use MySQL Cluster or implement them at the application layer.
  • Is the data partitioned over different cluster nodes or do all cluster nodes hold the full data set? Each node group stores a subset of the rows from each table. The 2 data nodes within the node group will store the exact same set of rows.
  • Where can I find a definition of those different kinds of Foreign Key constraints? The wikipedia definition for Foreign Keys is a good place to start.
  • What is the difference between ndbcluster and MySQL Cluster? None – they’re one and the same. When you hear any of “Cluster”, “MySQL Cluster”, “NDB” and “NDB Cluster” the meaning is the same.
  • Do I need to have a web server installed for the Auto-Installer to work? No – the MySQL Cluster auto-installer comes with a small web server built-in.
  • Are there any dependencies to meet before installing MySQL Cluster on RHEL Linux? It should work out of the box. My preferred way of working is to use the generic Linux tar ball for MySQL Cluster (get it from the MySQL Cluster download page) – extract it and then run the auto-installer or configure it manually.
  • Is there any guide available to migrate MySQL nodes to MySQL Cluster? Probably the closest we have is a white paper on how to get the best out of any PoC for MySQL Cluster (as it highlights what needs to be done differently in order to get the best results)… MySQL Cluster Evaluation Guide. Note that MySQL Cluster uses a different version of the mysqld binary and so you’ll need to stop your existing MySQL Server and start up the new one. To migrate a specific table to MySQL Cluster after that is done use “ALTER TABLE my_tab ENGINE=NDB;”.
  • Does Drupal support MySQL Cluster? I’ve heard of people doing it but I suspect that minor tweaks to the Drupal code may have been needed.
  • How do the NoSQL APIs map to the SQL database schemas? It varies slightly by API – in general, you provide some annotations or meta-data to specify how tables or columns should map to keys/objects/properties. With Memcached you have the option of being schema-less and having all data stored in one, big, generic table.
  • Where can I learn more about MySQL Fabric? The MySQL Fabric page is a good starting point; for an end-to-end example, take a look at this tutorial on adding HA and then sharding using MySQL Fabric.
  • What is the difference between MySQL Fabric and MySQL Cluster? MySQL Fabric provides server farm management on top of ‘regular’ MySQL Servers storing data with the InnoDB storage engine; it delivers HA and sharding. MySQL Cluster works below the MySQL Server, storing data in the NDB storage engine (on the data nodes). MySQL Cluster can deliver higher levels of High Availability, better application transparency, and cross-shard queries, joins and transactions, but it does mean using a different storage engine which of course comes with its own limitations (see the MySQL Cluster Evaluation Guide for details of those).
  • So, if I have any full table scans, should I forget about MySQL Cluster? Not necessarily. If every one of your high running operations is a full table scan then MySQL Cluster might not be ideal. However if most operations are simpler but you have some full table scans then that could be fine. The optimisations going into MySQL Cluster 7.4 should particularly benefit table scans.




Discover the latest MySQL Cluster Developments – Upcoming webinar

On Thursday 17th July I’ll be hosting a webinar which explains what MySQL Cluster is, what it can deliver and what the latest developments are. As always the webinar is free but please register here.

Details:

Join this technical webinar to learn how MySQL Cluster 7.3, the latest GA release, enables developer agility by making it far simpler and faster to build your products and web-based applications with MySQL Cluster. You’ll also learn how MySQL Cluster and its linear scalability, 99.999% uptime, real-time responsiveness, and ability to perform over 1 BILLION Writes per Minute can help your products and applications meet the needs of the most demanding markets. MySQL Cluster combines these capabilities and the affordability of open source, making it well suited for use as an embedded database.

In this webcast you’ll learn about the following MySQL Cluster capabilities, including the latest innovations in the 7.3 GA release:

  • Auto-sharding (partitioning) across commodity hardware for extreme read and write scalability
  • Cross-data center geographic synchronous and asynchronous replication
  • Online scaling and schema upgrades, now with improved Connection Thread Scalability
  • Real-time optimizations for ultra-low, predictable latency
  • Foreign Key Support for tight referential integrity
  • SQL and NoSQL interfaces, now with support for Node.js
  • Support for MySQL 5.6, allowing use of the latest InnoDB and NDB engines within one database
  • Integrated HA for 99.999% availability
  • Auto-Installer that installs, configures, provisions and tunes a production grade cluster in minutes

In addition, you will get a sneak preview of some of the new features planned in MySQL Cluster 7.4. Come and learn how MySQL Cluster can help you differentiate your products and extend their reach into new markets, as well as deliver highly demanding web-based applications, either on premises or in the cloud.

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.





MySQL Cluster 7.3.6 Released

The binary and source versions of MySQL Cluster 7.3.6 have now been made available at http://www.mysql.com/downloads/cluster/.

Release notes

MySQL Cluster NDB 7.3.6 is a new release of MySQL Cluster, based
on MySQL Server 5.6 and including features from version 7.3 of the
NDB storage engine, as well as fixing a number of recently
discovered bugs in previous MySQL Cluster releases.

Obtaining MySQL Cluster NDB 7.3. MySQL Cluster NDB 7.3 source
code and binaries can be obtained from
http://dev.mysql.com/downloads/cluster/.

For an overview of changes made in MySQL Cluster NDB 7.3, see
MySQL Cluster Development in MySQL Cluster NDB 7.3
(http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-development-5-6-ndb-7-3.html).

This release also incorporates all bugfixes and changes made in
previous MySQL Cluster releases, as well as all bugfixes and
feature changes which were added in mainline MySQL 5.6 through
MySQL 5.6.19 (see Changes in MySQL 5.6.19 (2014-05-30)
(http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-19.html)).

Functionality Added or Changed

  • Cluster API: Added as an aid to debugging the ability to
    specify a human-readable name for a given Ndb object and later
    to retrieve it. These operations are implemented,
    respectively, as the setNdbObjectName() and getNdbObjectName()
    methods.
    To make tracing of event handling between a user application
    and NDB easier, you can use the reference (from getReference())
    followed by the name (if provided) in printouts; the reference
    ties together the application Ndb object, the event buffer,
    and the NDB storage engine’s SUMA block. (Bug #18419907)

Bugs Fixed

  • Cluster API: When two tables had different foreign keys with
    the same name, ndb_restore considered this a name conflict and
    failed to restore the schema. As a result of this fix, a slash
    character (/) is now expressly disallowed in foreign key
    names, and the naming format parent_id/child_id/fk_name is now
    enforced by the NDB API. (Bug #18824753)
  • Processing a NODE_FAILREP signal that contained an invalid
    node ID could cause a data node to fail. (Bug #18993037, Bug
    #73015)
    References: This bug is a regression of Bug #16007980.
  • When building out of source, some files were written to the
    source directory instead of the build dir. These included the
    manifest.mf files used for creating ClusterJ jars and the
    pom.xml file used by mvn_install_ndbjtie.sh. In addition,
    ndbinfo.sql was written to the build directory, but marked as
    output to the source directory in CMakeLists.txt. (Bug
    #18889568, Bug #72843)
  • Adding a foreign key failed with NDB Error 208 if the parent
    index was the parent table’s primary key, the primary key was not
    on the table’s initial attributes, and the child table was not
    empty. (Bug #18825966)
  • When an NDB table served as both the parent table and a child
    table for 2 different foreign keys having the same name,
    dropping the foreign key on the child table could cause the
    foreign key on the parent table to be dropped instead, leading
    to a situation in which it was impossible to drop the
    remaining foreign key. This situation can be modelled using
    the following CREATE TABLE statements:
    CREATE TABLE parent (
        id INT NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=NDB;
    CREATE TABLE child (
        id INT NOT NULL,
        parent_id INT,
        PRIMARY KEY (id),
        INDEX par_ind (parent_id),
        FOREIGN KEY (parent_id)
            REFERENCES parent(id)
    ) ENGINE=NDB;
    CREATE TABLE grandchild (
        id INT,
        parent_id INT,
        INDEX par_ind (parent_id),
        FOREIGN KEY (parent_id)
            REFERENCES child(id)
    ) ENGINE=NDB;

    With the tables created as just shown, the issue occurred when
    executing the statement ALTER TABLE child DROP FOREIGN KEY
    parent_id, because it was possible in some cases for NDB to
    drop the foreign key from the grandchild table instead. When
    this happened, any subsequent attempt to drop the foreign key
    from either the child or from the grandchild table failed.
    (Bug #18662582)
  • ndbmtd supports multiple parallel receiver threads, each of
    which performs signal reception for a subset of the remote
    node connections (transporters) with the mapping of
    remote_nodes to receiver threads decided at node startup.
    Connection control is managed by the multi-instance TRPMAN
    block, which is organized as a proxy and workers, and each
    receiver thread has a TRPMAN worker running locally.
    The QMGR block sends signals to TRPMAN to enable and disable
    communications with remote nodes. These signals are sent to
    the TRPMAN proxy, which forwards them to the workers. The
    workers themselves decide whether to act on signals, based on
    the set of remote nodes they manage.
    The current issue arises because the mechanism used by the
    TRPMAN workers for determining which connections they are
    responsible for was implemented in such a way that each worker
    thought it was responsible for all connections. This resulted
    in the TRPMAN actions for OPEN_COMORD, ENABLE_COMREQ, and
    CLOSE_COMREQ being processed multiple times.
    The fix keeps TRPMAN instances (receiver threads) executing
    OPEN_COMORD, ENABLE_COMREQ and CLOSE_COMREQ requests. In
    addition, the correct TRPMAN instance is now chosen when
    routing from this instance for a specific remote connection.
    (Bug #18518037)
  • Executing ALTER TABLE … REORGANIZE PARTITION after
    increasing the number of data nodes in the cluster from 4 to
    16 led to a crash of the data nodes. This issue was shown to
    be a regression caused by a previous fix which added a new dump
    handler using a dump code that was already in use (7019),
    which caused the command to execute two different handlers
    with different semantics. The new handler was assigned a new
    DUMP code (7024). (Bug #18550318)
    References: This bug is a regression of Bug #14220269.
  • When running with a very slow main thread, and one or more
    transaction coordinator threads, on different CPUs, it was
    possible to encounter a timeout when sending a
    DIH_SCAN_GET_NODESREQ signal, which could lead to a crash of
    the data node. Now in such cases the timeout is avoided. (Bug
    #18449222)
  • During data node failure handling, the transaction coordinator
    performing takeover gathers all known state information for
    any failed TC instance transactions, determines whether each
    transaction has been committed or aborted, and informs any
    involved API nodes so that they can report this accurately to
    their clients. The TC instance provides this information by
    sending TCKEY_FAILREF or TCKEY_FAILCONF signals to the API
    nodes as appropriate to each affected transaction.
    In the event that this TC instance does not have a direct
    connection to the API node, it attempts to deliver the signal
    by routing it through another data node in the same node group
    as the failing TC, and sends a GSN_TCKEY_FAILREFCONF_R signal
    to TC block instance 0 in that data node. A problem arose in
    the case of multiple transaction coordinators, when this TC
    instance did not have a signal handler for such signals, which
    led it to fail.
    This issue has been corrected by adding a handler to the TC
    proxy block which in such cases forwards the signal to one of
    the local TC worker instances, which in turn attempts to
    forward the signal on to the API node. (Bug #18455971)
  • A local checkpoint (LCP) is tracked using a global LCP state
    (c_lcpState), and each NDB table has a status indicator which
    indicates the LCP status of that table (tabLcpStatus). If the
    global LCP state is LCP_STATUS_IDLE, then all the tables
    should have an LCP status of TLS_COMPLETED.
    When an LCP starts, the global LCP status is LCP_INIT_TABLES
    and the thread starts setting all the NDB tables to
    TLS_ACTIVE. If any tables are not ready for LCP, the LCP
    initialization procedure continues with CONTINUEB signals
    until all tables have become available and been marked
    TLS_ACTIVE. When this initialization is complete, the global
    LCP status is set to LCP_STATUS_ACTIVE.
    This bug occurred when the following conditions were met:

    • An LCP was in the LCP_INIT_TABLES state, and some but not
      all tables had been set to TLS_ACTIVE.
    • The master node failed before the global LCP state
      changed to LCP_STATUS_ACTIVE; that is, before the LCP
      could finish processing all tables.
    • The NODE_FAILREP signal resulting from the node failure
      was processed before the final CONTINUEB signal from the
      LCP initialization process, so that the node failure was
      processed while the LCP remained in the LCP_INIT_TABLES
      state.

    Following master node failure and selection of a new one, the
    new master queries the remaining nodes with a MASTER_LCPREQ
    signal to determine the state of the LCP. At this point, since
    the LCP status was LCP_INIT_TABLES, the LCP status was reset
    to LCP_STATUS_IDLE. However, the LCP status of the tables was
    not modified, so there remained tables with TLS_ACTIVE.
    Afterwards, the failed node is removed from the LCP. If the
    LCP status of a given table is TLS_ACTIVE, there is a check
    that the global LCP status is not LCP_STATUS_IDLE; this check
    failed and caused the data node to fail.
    Now the MASTER_LCPREQ handler ensures that the tabLcpStatus
    for all tables is updated to TLS_COMPLETED when the global LCP
    status is changed to LCP_STATUS_IDLE. (Bug #18044717)
  • When performing a copying ALTER TABLE operation, mysqld
    creates a new copy of the table to be altered. This
    intermediate table, which is given a name bearing the prefix
    #sql-, has an updated schema but contains no data. mysqld then
    copies the data from the original table to this intermediate
    table, drops the original table, and finally renames the
    intermediate table with the name of the original table.
    mysqld regards such a table as a temporary table and does not
    include it in the output from SHOW TABLES; mysqldump also
    ignores an intermediate table. However, NDB sees no difference
    between such an intermediate table and any other table. This
    difference in how intermediate tables are viewed by mysqld
    (and MySQL client programs) and by the NDB storage engine can
    give rise to problems when performing a backup and restore if
    an intermediate table existed in NDB, possibly left over from
    a failed ALTER TABLE that used copying. If a schema backup is
    performed using mysqldump and the mysql client, this table is
    not included. However, in the case where a data backup was
    done using the ndb_mgm client’s BACKUP command, the
    intermediate table was included, and was also included by
    ndb_restore, which then failed due to attempting to load data
    into a table which was not defined in the backed up schema.
    To prevent such failures from occurring, ndb_restore now by
    default ignores intermediate tables created during ALTER TABLE
    operations (that is, tables whose names begin with the prefix
    #sql-). A new option --exclude-intermediate-sql-tables is
    added that makes it possible to override the new behavior. The
    option’s default value is TRUE; to cause ndb_restore to revert
    to the old behavior and to attempt to restore intermediate
    tables, set this option to FALSE. (Bug #17882305)
  • The logging of insert failures has been improved. This is
    intended to help diagnose occasional issues seen when writing
    to the mysql.ndb_binlog_index table. (Bug #17461625)
  • The DEFINER column in the INFORMATION_SCHEMA.VIEWS table
    contained erroneous values for views contained in the ndbinfo
    information database. This could be seen in the result of a
    query such as SELECT TABLE_NAME, DEFINER FROM
    INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA='ndbinfo'. (Bug
    #17018500)
  • Employing a CHAR column that used the UTF8 character set as a
    table’s primary key column led to node failure when restarting
    data nodes. Attempting to restore a table with such a primary
    key also caused ndb_restore to fail. (Bug #16895311, Bug
    #68893)
  • Disk Data: Setting the undo buffer size used by
    InitialLogFileGroup to a value greater than that set by
    SharedGlobalMemory prevented data nodes from starting; the
    data nodes failed with Error 1504 Out of logbuffer memory.
    While the failure itself is expected behavior, the error
    message did not provide sufficient information to diagnose the
    actual source of the problem; now in such cases, a more
    specific error message Out of logbuffer memory (specify
    smaller undo_buffer_size or increase SharedGlobalMemory) is
    supplied. (Bug #11762867, Bug #55515)
  • Cluster Replication: When using NDB$EPOCH_TRANS, conflicts
    between DELETE operations were handled like conflicts between
    updates, with the primary rejecting the transaction and
    dependents, and realigning the secondary. This meant that
    their behavior with regard to subsequent operations on any
    affected row or rows depended on whether they were in the same
    epoch or a different one: within the same epoch, they were
    considered conflicting events; in different epochs, they were
    not considered in conflict.
    This fix brings the handling of conflicts between deletes by
    NDB$EPOCH_TRANS into line with that performed when using
    NDB$EPOCH for conflict detection and resolution. It also
    extends testing with NDB$EPOCH and NDB$EPOCH_TRANS to include
    “delete-delete” conflicts and to encapsulate the expected
    result. Transactional conflict handling is modified so that a
    conflict between DELETE operations alone is not sufficient to
    cause a transaction to be considered in conflict. (Bug #18459944)
  • Cluster API: When an NDB data node indicates a buffer overflow
    via an empty epoch, the event buffer places an inconsistent
    data event in the event queue. When this was consumed, it was
    not removed from the event queue as expected, causing
    subsequent nextEvent() calls to return 0. This caused event
    consumption to stall because the inconsistency remained
    flagged forever, while event data accumulated in the queue.
    Event data belonging to an empty inconsistent epoch can be
    found either at the beginning or somewhere in the middle.
    pollEvents() returns 0 for the first case. This fix handles
    the second case: calling nextEvent() dequeues the
    inconsistent event before it returns. In order to benefit from
    this fix, user applications must call nextEvent() even when
    pollEvents() returns 0. (Bug #18716991)
  • Cluster API: The pollEvents() method returned 1, even when
    called with a wait time equal to 0, and there were no events
    waiting in the queue. Now in such cases it returns 0 as
    expected. (Bug #18703871)
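
The advice in the Bug #18716991 entry above (call nextEvent() even when pollEvents() returns 0) can be illustrated with a toy model. This is only a sketch in Python, not the NDB API (which is C++): ToyEventBuffer, poll_events(), next_event(), consume() and the INCONSISTENT marker are all hypothetical stand-ins, and their behaviour is an assumption based on the description in that entry.

```python
from collections import deque

# Hypothetical stand-in for the "inconsistent data" event placed in the queue.
INCONSISTENT = object()


class ToyEventBuffer:
    """Toy model of an event queue; this is NOT the real NDB event buffer."""

    def __init__(self, items):
        self._queue = deque(items)

    def poll_events(self):
        # Assumed behaviour: report 0 when the queue is empty or when an
        # inconsistent marker sits at the head of the queue.
        if not self._queue or self._queue[0] is INCONSISTENT:
            return 0
        return 1

    def next_event(self):
        # Assumed behaviour: dequeue the head; an inconsistent marker is
        # discarded and reported as None, like "no event".
        if not self._queue:
            return None
        head = self._queue.popleft()
        return None if head is INCONSISTENT else head


def consume(buffer, rounds, gate_on_poll):
    """Run a fixed number of poll rounds and return the events handled."""
    handled = []
    for _ in range(rounds):
        ready = buffer.poll_events()
        if gate_on_poll and ready == 0:
            continue  # next_event() is never called, so the marker stays queued
        # Call next_event() until it reports that nothing more is available.
        for event in iter(buffer.next_event, None):
            handled.append(event)
    return handled


stalled = consume(ToyEventBuffer([INCONSISTENT, "a", "b"]), 3, gate_on_poll=True)
drained = consume(ToyEventBuffer([INCONSISTENT, "a", "b"]), 3, gate_on_poll=False)
print(stalled, drained)
```

In this model the loop that only calls next_event() after a non-zero poll never gets past the marker, while the loop that always calls it discards the marker and then handles the queued events.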




SQL & NoSQL, The Best of Both Worlds with MySQL Cluster – webinar replay now available

I recently presented a webinar explaining how you can enjoy the key benefits of NoSQL data stores without giving up all of the great features provided by a mature RDBMS.

In case you weren’t able to attend (or want to refresh your memory), the webinar replay and charts are now available.

There’s often a lot of excitement around NoSQL Data Stores with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside can come in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds?

This webinar showed how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data, with a simple key-value API (Memcached), REST, JavaScript, Java or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance, 99.999% availability, on-line maintenance and linear, horizontal scalability through transparent auto-sharding.

These webinars are always a good opportunity to get your questions answered; here’s a catch up of the Q&A from this session:

  • Would you suggest using MySQL Cluster to store graph data (150k writes per second)? For graph data, you always have to choose between a specialized graph database and a more general-purpose database. If it’s “almost” relational with a few graph-like connections, your decision might be different than if it’s purely graph-like. In any case, your write load of 150K writes per second can certainly be managed in MySQL Cluster. It only requires a little care to get an appropriate cluster configuration as far as number of data nodes, number of API nodes, memory, disk, and networking. Also, the total eventual size of the data is an important factor in the decision about whether to use Cluster, since indexes must always fit in the total distributed memory of the data nodes.
  • Can you please explain the RAM requirements for MySQL Cluster? For example, if my database is 10 GB in disk space, will it require 10 GB of RAM in MySQL Cluster? There is overhead in addition to the raw data. It’s tricky to summarize, but there is fixed overhead per row plus space for redo logs and indexes. Details are in the online documentation. All indexed columns must be in memory but other columns can be on disk if you choose. Remember that each row has to be stored on 2 data nodes, so you need to figure out your total memory requirement, double it and then divide by the number of data nodes to find how much memory is needed for each data node. The white paper “MySQL Cluster Evaluation Guide – Designing, Evaluating and Benchmarking MySQL Cluster” is a good reference for deciding whether MySQL Cluster is the right database for your application, as well as what you’ll need and what you should do to get the best results.
  • Is there a wizard to migrate InnoDB to MySQL Cluster? There’s not a “wizard” per se, but “ALTER TABLE x ENGINE=ndb” will convert a particular table. (It’s only tricky if you have foreign keys, which might have to be dropped at the beginning and re-created at the end of the process.)
  • Can this be deployed on EC2 instances, or is this for bare metal? MySQL Cluster has been successfully deployed on EC2 (e.g. by PayPal).
  • How difficult is it to do a hardware upgrade? Do you have to do it all at once or can you do each machine in turn? Both hardware and software upgrades are online operations. You can add nodes to a running cluster, and upgrade the software on nodes individually. If you use the MySQL Cluster Manager, many of the upgrade operations can be automated. You won’t be able to exploit some upgrades (e.g. extra hardware on a data node) until you’ve upgraded.
  • Does MySQL Cluster store all data in memory? What scenarios are available for swapping data to disk? Can we differentiate which tables/columns are stored in memory or on disk? All indexes are in memory. A table can be all in-memory, or it can have non-indexed columns stored on disk. That’s a per-column choice.
  • Can MySQL be subscribed/notified when some data is changed/updated? There is a notification API. It is currently only supported in the C++ NDB API (this is the “Event API”), not in the MySQL Server or other APIs. There are plans to also support it in Node.js, but there is no actual support at this time. If using SQL then triggers can be defined in the MySQL Server – just like for InnoDB tables.
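
The rule of thumb from the RAM question above (take the total memory requirement, double it, then divide by the number of data nodes) can be written out as a small calculation. This is only a sketch: the function name and the example figures are hypothetical, and the total is assumed to already include the per-row and index overhead mentioned in the answer.

```python
def memory_per_data_node_gb(total_memory_gb, data_nodes, copies=2):
    """Approximate memory needed on each data node.

    total_memory_gb: in-memory requirement including overhead (assumed to be
                     estimated separately, e.g. from the documentation).
    data_nodes:      number of data nodes in the cluster.
    copies:          number of data nodes each row is stored on (2 in the
                     answer above).
    """
    if data_nodes < copies:
        raise ValueError("need at least as many data nodes as copies of each row")
    return total_memory_gb * copies / data_nodes


# Hypothetical figures: 12 GB in total (raw data plus overhead), 4 data nodes.
per_node = memory_per_data_node_gb(12, 4)
print(f"{per_node:.1f} GB per data node")
```

Adding data nodes lowers the per-node figure, while the factor of two stays because every row is held on 2 data nodes.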




I’m speaking at OUG Scotland this week

If you’re going to be near Edinburgh this week then consider registering for OUG Scotland. I’ll be presenting on how to achieve the benefits of NoSQL (scalability, HA, ease of use, simple APIs) while at the same time still benefiting from the RDBMS features people have grown to rely on (ACID transactions, rich schemas, flexible access patterns) – the presentation will be at 11:25 on Wednesday as part of the developers’ track.

Hint for those that can’t make it – MySQL Cluster is the key :)





MySQL & NoSQL – Best of Both Worlds. Upcoming webinar

On Thursday 22nd May I’ll be hosting a webinar explaining how you can get the best from the NoSQL world while still getting all of the benefits of a proven RDBMS. As always, the webinar is free, but please register here.

There’s often a lot of excitement around NoSQL Data Stores with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside can come in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds?

This webinar shows how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data, with a simple key-value API (Memcached), REST, JavaScript, Java or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance, 99.999% availability, on-line maintenance and linear, horizontal scalability through transparent auto-sharding.

This is also an opportunity to pick the brains of the MySQL Cluster engineering team and get your technical questions answered.

Times:

  • Thu, May 22: 09:00 Pacific time (America)
  • Thu, May 22: 10:00 Mountain time (America)
  • Thu, May 22: 11:00 Central time (America)
  • Thu, May 22: 12:00 Eastern time (America)
  • Thu, May 22: 13:00 São Paulo time
  • Thu, May 22: 16:00 UTC
  • Thu, May 22: 17:00 Western European time
  • Thu, May 22: 18:00 Central European time
  • Thu, May 22: 19:00 Eastern European time
  • Thu, May 22: 21:30 India, Sri Lanka
  • Fri, May 23: 00:00 Singapore/Malaysia/Philippines time
  • Fri, May 23: 00:00 China time
  • Fri, May 23: 01:00 日本
  • Fri, May 23: 02:00 NSW, ACT, Victoria, Tasmania (Australia)

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.





MySQL Cluster Manager 1.3.1 released

MySQL Cluster Manager 1.3.1 is now available to download from My Oracle Support and soon from the Oracle Software Delivery Cloud.

Details are available in the MCM 1.3.1 Release Notes.

Documentation is available here.





MySQL Cluster 7.1.31 Released

The binary and source versions of MySQL Cluster 7.1.31 have now been made available at http://www.mysql.com/downloads/cluster/.

A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.31 (compared to 7.1.30) is available from the 7.1.31 Change log.