MySQL Cluster | Andrew Morgan on Databases

Tag Archive for MySQL Cluster

andrew | July 14, 2014

MySQL Cluster 7.3.6 Released

The binary and source versions of MySQL Cluster 7.3.6 have now been made available at http://www.mysql.com/downloads/cluster/.

Release notes

MySQL Cluster NDB 7.3.6 is a new release of MySQL Cluster, based
on MySQL Server 5.6 and including features from version 7.3 of the
NDB storage engine, as well as fixing a number of recently
discovered bugs in previous MySQL Cluster releases.

Obtaining MySQL Cluster NDB 7.3. MySQL Cluster NDB 7.3 source
code and binaries can be obtained from
http://dev.mysql.com/downloads/cluster/.

For an overview of changes made in MySQL Cluster NDB 7.3, see
MySQL Cluster Development in MySQL Cluster NDB 7.3
(http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-development-5-6-ndb-7-3.html).

This release also incorporates all bugfixes and changes made in
previous MySQL Cluster releases, as well as all bugfixes and
feature changes which were added in mainline MySQL 5.6 through
MySQL 5.6.19 (see Changes in MySQL 5.6.19 (2014-05-30)
(http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-19.html)).

Functionality Added or Changed

Cluster API: As an aid to debugging, it is now possible to
specify a human-readable name for a given Ndb object and to
retrieve it later. These operations are implemented,
respectively, as the setNdbObjectName() and getNdbObjectName()
methods.
To make tracing of event handling between a user application
and NDB easier, you can use the reference (from getReference())
followed by the name (if provided) in printouts; the reference
ties together the application Ndb object, the event buffer,
and the NDB storage engine’s SUMA block. (Bug #18419907)

Bugs Fixed

Cluster API: When two tables had different foreign keys with
the same name, ndb_restore considered this a name conflict and
failed to restore the schema. As a result of this fix, a slash
character (/) is now expressly disallowed in foreign key
names, and the naming format parent_id/child_id/fk_name is now
enforced by the NDB API. (Bug #18824753)
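
The naming rule above can be illustrated with a small Python sketch. This is not NDB API code: the function name and the integer IDs are hypothetical, and it assumes that parent_id and child_id identify the parent and child tables. It only shows why qualifying the name lets two tables reuse a foreign key name, and why a slash inside the name itself has to be rejected.

```python
def qualified_fk_name(parent_id: int, child_id: int, fk_name: str) -> str:
    """Build a name of the form parent_id/child_id/fk_name (hypothetical helper)."""
    if not fk_name:
        raise ValueError("foreign key name must not be empty")
    if "/" in fk_name:
        # A slash would make the three-part name ambiguous, so it is disallowed.
        raise ValueError("foreign key name must not contain '/'")
    return f"{parent_id}/{child_id}/{fk_name}"


# Two foreign keys with the same short name on different table pairs
# (the IDs 10, 11 and 12 are made up for illustration).
first = qualified_fk_name(10, 11, "fk_parent")
second = qualified_fk_name(11, 12, "fk_parent")
```

Because the table IDs are part of the qualified name, `first` and `second` differ even though both use the short name `fk_parent`.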
Processing a NODE_FAILREP signal that contained an invalid
node ID could cause a data node to fail. (Bug #18993037, Bug
#73015)
References: This bug is a regression of Bug #16007980.
When building out of source, some files were written to the
source directory instead of the build directory. These included the
manifest.mf files used for creating ClusterJ jars and the
pom.xml file used by mvn_install_ndbjtie.sh. In addition,
ndbinfo.sql was written to the build directory, but marked as
output to the source directory in CMakeLists.txt. (Bug
#18889568, Bug #72843)
Adding a foreign key failed with NDB Error 208 if the parent
index was the parent table’s primary key, the primary key was not
on the table’s initial attributes, and the child table was not
empty. (Bug #18825966)
When an NDB table served as both the parent table and a child
table for two different foreign keys having the same name,
dropping the foreign key on the child table could cause the
foreign key on the parent table to be dropped instead, leading
to a situation in which it was impossible to drop the
remaining foreign key. This situation can be modelled using
the following CREATE TABLE statements:

    CREATE TABLE parent (
        id INT NOT NULL,
        PRIMARY KEY (id)
    ) ENGINE=NDB;

    CREATE TABLE child (
        id INT NOT NULL,
        parent_id INT,
        PRIMARY KEY (id),
        INDEX par_ind (parent_id),
        FOREIGN KEY (parent_id) REFERENCES parent(id)
    ) ENGINE=NDB;

    CREATE TABLE grandchild (
        id INT,
        parent_id INT,
        INDEX par_ind (parent_id),
        FOREIGN KEY (parent_id) REFERENCES child(id)
    ) ENGINE=NDB;

With the tables created as just shown, the issue occurred when
executing the statement ALTER TABLE child DROP FOREIGN KEY
parent_id, because it was possible in some cases for NDB to
drop the foreign key from the grandchild table instead. When
this happened, any subsequent attempt to drop the foreign key
from either the child or from the grandchild table failed.
(Bug #18662582)
ndbmtd supports multiple parallel receiver threads, each of
which performs signal reception for a subset of the remote
node connections (transporters) with the mapping of
remote_nodes to receiver threads decided at node startup.
Connection control is managed by the multi-instance TRPMAN
block, which is organized as a proxy and workers, and each
receiver thread has a TRPMAN worker running locally.
The QMGR block sends signals to TRPMAN to enable and disable
communications with remote nodes. These signals are sent to
the TRPMAN proxy, which forwards them to the workers. The
workers themselves decide whether to act on signals, based on
the set of remote nodes they manage.
The current issue arises because the mechanism used by the
TRPMAN workers for determining which connections they are
responsible for was implemented in such a way that each worker
thought it was responsible for all connections. This resulted
in the TRPMAN actions for OPEN_COMORD, ENABLE_COMREQ, and
CLOSE_COMREQ being processed multiple times.
The fix keeps TRPMAN instances (receiver threads) executing
OPEN_COMORD, ENABLE_COMREQ and CLOSE_COMREQ requests. In
addition, the correct TRPMAN instance is now chosen when
routing from this instance for a specific remote connection.
(Bug #18518037)
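
The proxy-and-workers arrangement described above can be modelled with a short Python sketch. It is a toy model rather than NDB kernel code: the class and function names, the `buggy` flag and the node IDs are all assumptions made for illustration. It shows only how a worker that wrongly believes it owns every connection causes one signal to be acted on once per worker.

```python
from dataclasses import dataclass, field


@dataclass
class TrpmanWorker:
    """Toy stand-in for one TRPMAN worker (one per receiver thread)."""
    name: str
    owned_nodes: frozenset
    handled: list = field(default_factory=list)

    def is_responsible_for(self, node_id, buggy=False):
        # Buggy variant: every worker thinks it owns every connection.
        return True if buggy else node_id in self.owned_nodes

    def on_signal(self, signal, node_id, buggy=False):
        # A worker acts on a signal only for remote nodes it manages.
        if self.is_responsible_for(node_id, buggy):
            self.handled.append((signal, node_id))


def proxy_forward(workers, signal, node_id, buggy=False):
    """Forward a signal to every worker; return how many workers acted on it."""
    before = sum(len(w.handled) for w in workers)
    for worker in workers:
        worker.on_signal(signal, node_id, buggy)
    return sum(len(w.handled) for w in workers) - before


def make_workers():
    # Hypothetical mapping of remote nodes to two receiver threads.
    return [
        TrpmanWorker("recv-0", frozenset({1, 2})),
        TrpmanWorker("recv-1", frozenset({3, 4})),
    ]


buggy_count = proxy_forward(make_workers(), "ENABLE_COMREQ", 3, buggy=True)
fixed_count = proxy_forward(make_workers(), "ENABLE_COMREQ", 3)
```

In this model the buggy ownership check makes both workers process the request for node 3, while the correct check leaves exactly one worker acting on it.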
Executing ALTER TABLE … REORGANIZE PARTITION after
increasing the number of data nodes in the cluster from 4 to
16 led to a crash of the data nodes. This issue was shown to
be a regression caused by a previous fix which added a new dump
handler using a dump code that was already in use (7019),
which caused the command to execute two different handlers
with different semantics. The new handler was assigned a new
DUMP code (7024). (Bug #18550318)
References: This bug is a regression of Bug #14220269.
When running with a very slow main thread, and one or more
transaction coordinator threads, on different CPUs, it was
possible to encounter a timeout when sending a
DIH_SCAN_GET_NODESREQ signal, which could lead to a crash of
the data node. Now in such cases the timeout is avoided. (Bug
#18449222)
During data node failure handling, the transaction coordinator
performing takeover gathers all known state information for
any failed TC instance transactions, determines whether each
transaction has been committed or aborted, and informs any
involved API nodes so that they can report this accurately to
their clients. The TC instance provides this information by
sending TCKEY_FAILREF or TCKEY_FAILCONF signals to the API
nodes as appropriate for each affected transaction.
In the event that this TC instance does not have a direct
connection to the API node, it attempts to deliver the signal
by routing it through another data node in the same node group
as the failing TC, and sends a GSN_TCKEY_FAILREFCONF_R signal
to TC block instance 0 in that data node. A problem arose in
the case of multiple transaction coordinators, when this TC
instance did not have a signal handler for such signals, which
led it to fail.
This issue has been corrected by adding a handler to the TC
proxy block which in such cases forwards the signal to one of
the local TC worker instances, which in turn attempts to
forward the signal on to the API node. (Bug #18455971)
A local checkpoint (LCP) is tracked using a global LCP state
(c_lcpState), and each NDB table has a status indicator which
indicates the LCP status of that table (tabLcpStatus). If the
global LCP state is LCP_STATUS_IDLE, then all the tables
should have an LCP status of TLS_COMPLETED.
When an LCP starts, the global LCP status is LCP_INIT_TABLES
and the thread starts setting all the NDB tables to
TLS_ACTIVE. If any tables are not ready for LCP, the LCP
initialization procedure continues with CONTINUEB signals
until all tables have become available and been marked
TLS_ACTIVE. When this initialization is complete, the global
LCP status is set to LCP_STATUS_ACTIVE.
This bug occurred when the following conditions were met:
- An LCP was in the LCP_INIT_TABLES state, and some but not
  all tables had been set to TLS_ACTIVE.
- The master node failed before the global LCP state
  changed to LCP_STATUS_ACTIVE; that is, before the LCP
  could finish processing all tables.
- The NODE_FAILREP signal resulting from the node failure
  was processed before the final CONTINUEB signal from the
  LCP initialization process, so that the node failure was
  processed while the LCP remained in the LCP_INIT_TABLES
  state.
Following master node failure and selection of a new one, the
new master queries the remaining nodes with a MASTER_LCPREQ
signal to determine the state of the LCP. At this point, since
the LCP status was LCP_INIT_TABLES, the LCP status was reset
to LCP_STATUS_IDLE. However, the LCP status of the tables was
not modified, so there remained tables with TLS_ACTIVE.
Afterwards, the failed node is removed from the LCP. If the
LCP status of a given table is TLS_ACTIVE, there is a check
that the global LCP status is not LCP_STATUS_IDLE; this check
failed and caused the data node to fail.
Now the MASTER_LCPREQ handler ensures that the tabLcpStatus
for all tables is updated to TLS_COMPLETED when the global LCP
status is changed to LCP_STATUS_IDLE. (Bug #18044717)
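
The sequence of states described above can be replayed in a small Python model. This is a sketch for illustration only, not the data node implementation: the class, its method names and the table names are hypothetical, while the state names are the ones used in the release note. It checks the stated invariant (a global state of LCP_STATUS_IDLE implies that every table is TLS_COMPLETED) with and without the fix.

```python
from enum import Enum


class GlobalLcp(Enum):
    LCP_STATUS_IDLE = 0
    LCP_INIT_TABLES = 1
    LCP_STATUS_ACTIVE = 2


class TableLcp(Enum):
    TLS_COMPLETED = 0
    TLS_ACTIVE = 1


class LcpModel:
    """Toy model of the global LCP state plus one status per table."""

    def __init__(self, table_names):
        self.c_lcp_state = GlobalLcp.LCP_STATUS_IDLE
        self.tab_lcp_status = {t: TableLcp.TLS_COMPLETED for t in table_names}

    def start_lcp(self, tables_marked_so_far):
        # Initialization has begun, but only some tables are marked active yet.
        self.c_lcp_state = GlobalLcp.LCP_INIT_TABLES
        for table in tables_marked_so_far:
            self.tab_lcp_status[table] = TableLcp.TLS_ACTIVE

    def on_master_lcpreq(self, fixed):
        # The new master finds the LCP still initializing and resets it.
        if self.c_lcp_state is GlobalLcp.LCP_INIT_TABLES:
            self.c_lcp_state = GlobalLcp.LCP_STATUS_IDLE
            if fixed:
                # The fix: reset the per-table status together with the global one.
                for table in self.tab_lcp_status:
                    self.tab_lcp_status[table] = TableLcp.TLS_COMPLETED

    def invariant_holds(self):
        # Idle global state implies every table is TLS_COMPLETED.
        if self.c_lcp_state is not GlobalLcp.LCP_STATUS_IDLE:
            return True
        return all(s is TableLcp.TLS_COMPLETED for s in self.tab_lcp_status.values())


def run(fixed):
    model = LcpModel(["t1", "t2", "t3"])
    model.start_lcp(["t1"])          # master fails while tables are being marked
    model.on_master_lcpreq(fixed)    # new master asks for the LCP state
    return model.invariant_holds()


old_behaviour_ok = run(fixed=False)
new_behaviour_ok = run(fixed=True)
```

Without the per-table reset, the model ends with an idle global state and one table still marked TLS_ACTIVE, which is the condition that tripped the check in the data node.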
When performing a copying ALTER TABLE operation, mysqld
creates a new copy of the table to be altered. This
intermediate table, which is given a name bearing the prefix
#sql-, has an updated schema but contains no data. mysqld then
copies the data from the original table to this intermediate
table, drops the original table, and finally renames the
intermediate table with the name of the original table.
mysqld regards such a table as a temporary table and does not
include it in the output from SHOW TABLES; mysqldump also
ignores an intermediate table. However, NDB sees no difference
between such an intermediate table and any other table. This
difference in how intermediate tables are viewed by mysqld
(and MySQL client programs) and by the NDB storage engine can
give rise to problems when performing a backup and restore if
an intermediate table existed in NDB, possibly left over from
a failed ALTER TABLE that used copying. If a schema backup is
performed using mysqldump and the mysql client, this table is
not included. However, in the case where a data backup was
done using the ndb_mgm client’s BACKUP command, the
intermediate table was included, and was also included by
ndb_restore, which then failed due to attempting to load data
into a table which was not defined in the backed up schema.
To prevent such failures from occurring, ndb_restore now by
default ignores intermediate tables created during ALTER TABLE
operations (that is, tables whose names begin with the prefix
#sql-). A new option --exclude-intermediate-sql-tables is
added that makes it possible to override the new behavior. The
option’s default value is TRUE; to cause ndb_restore to revert
to the old behavior and to attempt to restore intermediate
tables, set this option to FALSE. (Bug #17882305)
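
The effect of the new default can be sketched in a few lines of Python. This is not how ndb_restore is implemented; the function, its parameter and the sample table names are hypothetical, and the only detail taken from the release note is that intermediate tables are recognised by the #sql- name prefix.

```python
INTERMEDIATE_PREFIX = "#sql-"


def tables_to_restore(backup_tables, exclude_intermediate_sql_tables=True):
    """Return the tables a restore would attempt to load (toy model)."""
    if not exclude_intermediate_sql_tables:
        # Old behaviour: every table found in the backup is restored.
        return list(backup_tables)
    # New default: skip leftovers from copying ALTER TABLE operations.
    return [t for t in backup_tables if not t.startswith(INTERMEDIATE_PREFIX)]


# Made-up backup contents, including one leftover intermediate table.
backup = ["customers", "#sql-1a2b_3", "orders"]
default_selection = tables_to_restore(backup)
legacy_selection = tables_to_restore(backup, exclude_intermediate_sql_tables=False)
```

With the default, only `customers` and `orders` are selected, so the restore no longer tries to load data into a table that the mysqldump schema backup never defined.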
The logging of insert failures has been improved. This is
intended to help diagnose occasional issues seen when writing
to the mysql.ndb_binlog_index table. (Bug #17461625)
The DEFINER column in the INFORMATION_SCHEMA.VIEWS table
contained erroneous values for views contained in the ndbinfo
information database. This could be seen in the result of a
query such as SELECT TABLE_NAME, DEFINER FROM
INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA='ndbinfo'. (Bug
#17018500)
Employing a CHAR column that used the UTF8 character set as a
table’s primary key column led to node failure when restarting
data nodes. Attempting to restore a table with such a primary
key also caused ndb_restore to fail. (Bug #16895311, Bug
#68893)
Disk Data: Setting the undo buffer size used by
InitialLogFileGroup to a value greater than that set by
SharedGlobalMemory prevented data nodes from starting; the
data nodes failed with Error 1504 Out of logbuffer memory.
While the failure itself is expected behavior, the error
message did not provide sufficient information to diagnose the
actual source of the problem; now in such cases, a more
specific error message Out of logbuffer memory (specify
smaller undo_buffer_size or increase SharedGlobalMemory) is
supplied. (Bug #11762867, Bug #55515)
Cluster Replication: When using NDB$EPOCH_TRANS, conflicts
between DELETE operations were handled like conflicts between
updates, with the primary rejecting the transaction and
dependents, and realigning the secondary. This meant that
their behavior with regard to subsequent operations on any
affected row or rows depended on whether they were in the same
epoch or a different one: within the same epoch, they were
considered conflicting events; in different epochs, they were
not considered in conflict.
This fix brings the handling of conflicts between deletes by
NDB$EPOCH_TRANS into line with that performed when using
NDB$EPOCH for conflict detection and resolution. It also
extends testing with NDB$EPOCH and NDB$EPOCH_TRANS to include
“delete-delete” conflicts and to encapsulate the expected
result. Transactional conflict handling is modified so that a
conflict between DELETE operations alone is not sufficient to
cause a transaction to be considered in conflict. (Bug #18459944)
Cluster API: When an NDB data node indicates a buffer overflow
via an empty epoch, the event buffer places an inconsistent
data event in the event queue. When this was consumed, it was
not removed from the event queue as expected, causing
subsequent nextEvent() calls to return 0. This caused event
consumption to stall because the inconsistency remained
flagged forever, while event data accumulated in the queue.
Event data belonging to an empty inconsistent epoch can be
found either at the beginning or somewhere in the middle.
pollEvents() returns 0 for the first case. This fix handles
the second case: calling nextEvent() dequeues the
inconsistent event before it returns. In order to benefit from
this fix, user applications must call nextEvent() even when
pollEvents() returns 0. (Bug #18716991)
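
The advice to keep calling nextEvent() can be illustrated with a toy Python model of the event queue. Nothing here is the NDB API: FakeEventBuffer, its snake_case methods and the INCONSISTENT marker are stand-ins invented for this sketch, and the model assumes only what the note states (polling reports 0 when an inconsistent epoch is at the head of the queue, and fetching the next event removes the inconsistent entry).

```python
from collections import deque

INCONSISTENT = object()  # marker for an empty, inconsistent epoch


class FakeEventBuffer:
    """Toy stand-in for an event queue; not the real Ndb class."""

    def __init__(self, items):
        self._queue = deque(items)

    def poll_events(self):
        # Report 1 only when a data event is at the head of the queue.
        if self._queue and self._queue[0] is not INCONSISTENT:
            return 1
        return 0

    def next_event(self):
        # Dequeue the head item; an inconsistent marker is dropped and
        # reported as "no event", mirroring the post-fix behaviour.
        if not self._queue:
            return None
        item = self._queue.popleft()
        return None if item is INCONSISTENT else item


def consume(buffer, rounds, always_call_next_event):
    seen = []
    for _ in range(rounds):
        ready = buffer.poll_events()
        if ready == 0 and not always_call_next_event:
            continue  # this pattern never clears an inconsistent head entry
        while (event := buffer.next_event()) is not None:
            seen.append(event)
    return seen


stalled = consume(FakeEventBuffer([INCONSISTENT, "e1", "e2"]), 5,
                  always_call_next_event=False)
drained = consume(FakeEventBuffer(["e1", INCONSISTENT, "e2"]), 5,
                  always_call_next_event=True)
```

In this model the consumer that skips the fetch whenever polling returns 0 never sees `e1` or `e2`, whereas the consumer that always attempts the fetch drains both events.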
Cluster API: The pollEvents() method returned 1, even when
called with a wait time equal to 0, and there were no events
waiting in the queue. Now in such cases it returns 0 as
expected. (Bug #18703871)

andrew | June 9, 2014

I’m speaking at OUG Scotland this week

If you’re going to be near Edinburgh this week then consider registering for OUG Scotland. I’ll be presenting on how to achieve the benefits of NoSQL (scalability, HA, ease of use, simple APIs) while still benefiting from the RDBMS features people have grown to rely on (ACID transactions, rich schemas, flexible access patterns) – the presentation will be at 11:25 on Wednesday as part of the developers’ track.

Hint for those that can’t make it – MySQL Cluster is the key 🙂

Category: MySQL Cluster | Tags: MySQL, MySQL Cluster, NoSQL

andrew | May 16, 2014

MySQL & NoSQL – Best of Both Worlds. Upcoming webinar

On Thursday 22nd May I’ll be hosting a webinar explaining how you can get the best from the NoSQL world while still getting all of the benefits of a proven RDBMS. As always the webinar is free but please register here.

There’s often a lot of excitement around NoSQL Data Stores with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside can come in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds?

This webinar shows how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data, with a simple key-value API (Memcached), REST, JavaScript, Java or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance, 99.999% availability, on-line maintenance and linear, horizontal scalability through transparent auto-sharding.

This is also an opportunity to pick the brains of the MySQL Cluster engineering team and get your technical questions answered.

Times:

Thu, May 22: 09:00 Pacific time (America)
Thu, May 22: 10:00 Mountain time (America)
Thu, May 22: 11:00 Central time (America)
Thu, May 22: 12:00 Eastern time (America)
Thu, May 22: 13:00 São Paulo time
Thu, May 22: 16:00 UTC
Thu, May 22: 17:00 Western European time
Thu, May 22: 18:00 Central European time
Thu, May 22: 19:00 Eastern European time
Thu, May 22: 21:30 India, Sri Lanka
Fri, May 23: 00:00 Singapore/Malaysia/Philippines time
Fri, May 23: 00:00 China time
Fri, May 23: 01:00 Japan
Fri, May 23: 02:00 NSW, ACT, Victoria, Tasmania (Australia)

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.

Category: MySQL Cluster | Tags: MySQL Cluster, NoSQL, webinar

andrew | April 30, 2014

MySQL Cluster Manager 1.3.1 released

MySQL Cluster Manager 1.3.1 is now available to download from My Oracle Support and soon from the Oracle Software Delivery Cloud.

Details are available in the MCM 1.3.1 Release Notes.

Documentation is available here.

Category: MySQL Cluster | Tags: MCM, MySQL, MySQL Cluster, MySQL Cluster Manager

andrew | April 30, 2014

MySQL Cluster 7.1.31 Released

The binary and source versions of MySQL Cluster 7.1.31 have now been made available at http://www.mysql.com/downloads/cluster/.

A description of all of the changes (fixes) that have gone into MySQL Cluster 7.1.31 (compared to 7.1.30) is available from the 7.1.31 Change log.

Category: MySQL Cluster | Tags: MySQL, MySQL Cluster, MySQL Cluster 7.1

andrew | March 31, 2014

MySQL Cluster 7.4.0 Labs Release

The first version of MySQL Cluster 7.4 has now been released on MySQL Labs. Note that labs loads are not suitable for production use (in fact they’re even less mature than Development Milestone Releases); their purpose is to give users a chance to see what’s in the works, try it for themselves and then provide feedback. Having read that, if you’d like to try it out then Download MySQL Cluster 7.4 from MySQL Labs.

The focus of this first Cluster 7.4 load is performance and data node restart times.

Performance

MySQL Cluster was designed from the outset to be a distributed, in-memory database and has been deployed that way for many, many years (it’s interesting to see that the idea of in-memory databases has now really come into vogue with excitement around new arrivals on the scene such as Hekaton). Not surprisingly, when people are considering MySQL Cluster, performance and scalability are key features (High Availability is another), so performance improvements are a key focus of every release, and MySQL Cluster 7.4 is no exception.

The graphs show what’s already been achieved, with Read Only Sysbench showing a 47% increase in throughput and a 38% improvement for the Read/Write benchmark. Even better improvements are seen when configuring the data nodes to use even more threads. For those not familiar with Sysbench, you should realise that each of the transactions involves quite a lot of work: 10 Primary Key lookups and 5 different types of scans where we fetch 100 records (normal select through ordered index followed by order by, group by and so forth).

Restart Times

While less glamorous than performance, the time taken for a data node to restart can make a huge difference to how easy it is to manage your cluster. As the size and activity of the database increases, the restart time for a single data node will go up; if you then multiply that time by the number of data nodes you have, maintenance activities can start to take longer than you’d like.

This first MySQL Cluster 7.4 labs release makes some significant improvements to the restart times – mostly by allowing more of the work to be done in parallel.