Posted on March 26th, 2013 No comments
Update – the recording of this webinar is now available here.
This Wednesday (27th March) Mat Keep and I will be presenting a free, live webinar on MySQL 5.6 Replication. You need to register here ahead of the webinar – worth doing even if you can’t attend as you’ll then be sent a link to the replay when it’s available. We’ll also have some of the key MySQL replication developers on-line to answer your questions, so it’s a great chance to get some free consultancy.
Join this session to learn how the new replication features in MySQL 5.6 enable developers and DBAs to build and scale next generation services using the world’s most popular open source database. MySQL 5.6 delivers new replication capabilities which we will discuss and demonstrate in the webinar:
- High performance with Binary Log Group Commit, Multi-Threaded Slaves and Optimized Row Based Replication
- High availability with Global Transaction Identifiers, Failover Utilities and Crash Safe Slaves & Binlog
- Data integrity with Replication Event Checksums
- Dev/Ops agility with new Replication Utilities, Time Delayed Replication and more
The session will wrap up with resources to get started with MySQL 5.6.
Wed, Mar 27: 07:00 Pacific time (America)
Wed, Mar 27: 08:00 Mountain time (America)
Wed, Mar 27: 09:00 Central time (America)
Wed, Mar 27: 10:00 Eastern time (America)
Wed, Mar 27: 14:00 UTC
Wed, Mar 27: 14:00 Western European time
Wed, Mar 27: 15:00 Central European time
Wed, Mar 27: 16:00 Eastern European time
Wed, Mar 27: 19:30 India, Sri Lanka
Wed, Mar 27: 22:00 Singapore/Malaysia/Philippines time
Wed, Mar 27: 22:00 China time
Wed, Mar 27: 23:00 Japan time
Thu, Mar 28: 01:00 NSW, ACT, Victoria, Tasmania (Australia)
The presentation will be approximately 60 minutes long.
Posted on February 5th, 2013 1 comment
MySQL 5.6 has now been declared Generally Available (i.e. suitable for production use). This is a very exciting release from a MySQL replication perspective with some big new features. These include:
- Global Transaction Identifiers (GTIDs) – a unique identifier that is used across your replication topology to identify a transaction. Makes setting up and managing your cluster (including the promotion of a new master) far simpler and more reliable.
- Multi-threaded slaves (MTS) – Increases the performance of replication on the slave; different threads will handle applying events to different databases.
- Binary Log Group Commit – Improves replication performance on the master.
- Optimized Row Based Replication – reduces the amount of data that needs to be replicated; reducing network usage and potentially speeding up replication.
- Crash-Safe Replication – makes replication transactional. The data and replication positioning information are both updated as part of the same transaction.
- Replication Event Checksums – checks to ensure that the data being replicated hasn’t become corrupted, avoids writing corrupted data to the slave.
- Time-Delayed Replication – configure one or more of your slaves to only apply replicated events after a configured delay. This can be useful to protect against DBA mistakes (see the sketch just after this list).
- Informational Logs – includes the original statement in the binary log when using row-based-replication to aid in debugging.
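As a taster, this is all it takes to turn an existing slave into a time-delayed one – a minimal sketch, with the 3600 second (1 hour) delay just an example value:

slave> STOP SLAVE;
slave> CHANGE MASTER TO MASTER_DELAY = 3600;
slave> START SLAVE;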
To coincide with the GA announcement, a new white paper has been released – MySQL 5.6 Replication An Introduction – that provides a lot more details on these new features. We’ve also released a second white paper – MySQL 5.6 Replication, Enhancing Scalability and Availability – A Tutorial that steps you through using MySQL Replication, with a particular focus on the new features.
If you’re already familiar with using MySQL Replication, here are a few pointers on what to do differently for the new MySQL 5.6 features but you should refer to the full tutorial to see these changes in context.
First of all, here are some extra configuration parameters to include…
[mysqld]
gtid-mode=on                     # GTID
enforce-gtid-consistency=true    # GTID
master-info-repository=TABLE     # Crash-safe replication
relay-log-info-repository=TABLE  # Crash-safe replication
slave-parallel-workers=2         # MTS
binlog-checksum=CRC32            # Checksums
master-verify-checksum=1         # Checksums
slave-sql-verify-checksum=1      # Checksums
binlog-rows-query-log-events=1   # Informational logs
When initiating (or restarting) replication on the slave, it is no longer necessary to include the binary log positioning information as the master and slave will automatically negotiate what outstanding events need to be replicated based on any GTIDs previously received by the slave…
slave> CHANGE MASTER TO MASTER_HOST='black', MASTER_USER='repl_user', MASTER_PASSWORD='billy', MASTER_AUTO_POSITION=1;
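If you want to confirm that auto-positioning has done its job, a quick sanity check is to compare the GTID sets on the two servers and to look at the Retrieved_Gtid_Set and Executed_Gtid_Set fields on the slave – for example (the exact sets will obviously vary):

master> SELECT @@GLOBAL.gtid_executed;
slave> SELECT @@GLOBAL.gtid_executed;
slave> SHOW SLAVE STATUS\G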
Please try out these new features and let us know what you think.
Posted on September 29th, 2012 2 comments
Oracle has announced that it now provides support for DRBD with MySQL – this means a single point of support for the entire MySQL/DRBD/Pacemaker/Corosync/Linux stack! As part of this, we’ve released a new white paper which steps you through everything you need to do to configure this High Availability stack. The white paper provides a step-by-step guide to installing, configuring, provisioning and testing the complete MySQL and DRBD stack, including:
- MySQL Database
- DRBD kernel module and userland utilities
- Pacemaker and Corosync cluster messaging and management processes
- Oracle Linux operating system
DRBD is an extremely popular way of adding a layer of High Availability to a MySQL deployment – especially when the 99.999% availability levels delivered by MySQL Cluster aren’t needed. It can be implemented without the shared storage required for typical clustering solutions (MySQL Cluster doesn’t require shared storage either) and so it can be a very cost effective solution for Linux environments.
Introduction to MySQL on DRBD/Pacemaker/Corosync/Oracle Linux
Figure 1 illustrates the stack that can be used to deliver a level of High Availability for the MySQL service.
At the lowest level, 2 hosts are required in order to provide physical redundancy; if using a virtual environment, those 2 hosts should be on different physical machines. It is an important feature that no shared storage is required. At any point in time, the services will be active on one host and in standby mode on the other.
Pacemaker and Corosync combine to provide the clustering layer that sits between the services and the underlying hosts and operating systems. Pacemaker is responsible for starting and stopping services – ensuring that they’re running on exactly one host, delivering high availability and avoiding data corruption. Corosync provides the underlying messaging infrastructure between the nodes that enables Pacemaker to do its job; it also handles cluster membership for the nodes and informs Pacemaker of any changes.
The core Pacemaker process does not have built-in knowledge of the specific services to be managed; instead, agents are used which provide a wrapper for the service-specific actions. For example, in this solution we use agents for Virtual IP Addresses, MySQL and DRBD – these are all existing agents and come packaged with Pacemaker. This white paper will demonstrate how to configure Pacemaker to use these agents to provide a High Availability stack for MySQL.
The essential services managed by Pacemaker in this configuration are DRBD, MySQL and the Virtual IP Address that applications use to connect to the active MySQL service.
DRBD synchronizes data at the block-device level (typically a spinning or solid state disk) – transparent to the application, database and even the file system. DRBD requires the use of a journaling file system such as ext3 or ext4. For this solution it acts in an active-standby mode – this means that at any point in time the directories being managed by DRBD are accessible for reads and writes on exactly one of the two hosts and inaccessible (even for reads) on the other. Any changes made on the active host are synchronously replicated to the standby host by DRBD.
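To make this concrete, here’s roughly what the DRBD resource definition for such a setup looks like – a hedged sketch rather than the white paper’s exact configuration (the host names, the /dev/sdb1 backing disk and the host IP addresses are all illustrative):

# /etc/drbd.d/clusterdb_res.res – illustrative only
resource clusterdb_res {
  protocol C;                      # synchronous replication between the two hosts
  on host1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;           # local backing disk (example)
    address   192.168.5.100:7789;
    meta-disk internal;
  }
  on host2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.5.101:7789;
    meta-disk internal;
  }
}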
Setting up MySQL with DRBD/Pacemaker/Corosync/Oracle Linux
Figure 2 shows the network configuration used in this paper – note that for simplicity a single network connection is used but for maximum availability in a production environment you should consider redundant network connections.
A single Virtual IP (VIP) is shown in the figure (192.168.5.102) and this is the address that the application will connect to when accessing the MySQL database. Pacemaker will be responsible for migrating this between the 2 physical IP addresses.
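To give a flavour of how the VIP and the MySQL service are described to Pacemaker, here is a hedged sketch using the crm shell (resource names, operation timings and the my.cnf path are illustrative – the white paper contains the full, tested configuration, including the DRBD master/slave resource and the ordering/colocation constraints):

crm configure primitive p_ip ocf:heartbeat:IPaddr2 \
    params ip=192.168.5.102 cidr_netmask=24 \
    op monitor interval=30s
crm configure primitive p_mysql ocf:heartbeat:mysql \
    params config=/var/lib/mysql_drbd/my.cnf \
    op monitor interval=20s timeout=30s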
One of the final steps in configuring Pacemaker is to add network connectivity monitoring in order to have an isolated host stop its MySQL service, avoiding a “split-brain” scenario. This is achieved by having each host ping an external (not part of the cluster) IP address – in this case the network router (192.168.5.1).
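With Pacemaker this is typically done with the ping resource agent, cloned so that it runs on every node – again a sketch with illustrative resource names:

crm configure primitive p_ping ocf:pacemaker:ping \
    params host_list=192.168.5.1 multiplier=1000 \
    op monitor interval=10s
crm configure clone cl_ping p_ping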
Figure 3 shows where the MySQL files will be stored. The MySQL binaries as well as the socket (mysql.sock) and process-id (mysql.pid) files are stored in a regular partition – independent on each host (under /var/lib/mysql/). The MySQL Server configuration file (my.cnf) and the database files (data/*) are stored in a DRBD controlled file system that at any point in time is only available on one of the two hosts – this file system is controlled by DRBD and mounted under /var/lib/mysql_drbd/.
The white paper steps through setting all of this up as well as the resources in Pacemaker/Corosync that allow detection of a problem and the failover of the storage (DRBD), database (MySQL) and the Virtual IP address used by the application to access the database – all in a coordinated way of course. As you’ll notice in Figure 4 this involves setting up quite a few entities and relationships – the paper goes through each one.
Posted on September 24th, 2012 22 comments
If you’re a user of MySQL Workbench then you may have noticed a pocket knife icon appear in the top right hand corner – click on that and a terminal opens which gives you access to the MySQL utilities. In this post I’m focussing on the replication utilities but you can also refer to the full MySQL Utilities documentation.
What I’ll step through is how to use these utilities to:
- Set up replication from a single master to multiple slaves
- Automatically detect the failure of the master and promote one of the slaves to be the new master
- Introduce the old master back into the topology as a new slave and then promote it to be the master again
Before going through the steps in detail here’s a demonstration of the replication utilities in action…
To get full use of these utilities you should use the InnoDB storage engine together with the Global Transaction ID functionality from the latest MySQL 5.6 DMR.
Do you really need/want auto-failover?
For many people, the instinctive reaction is to deploy a fully automated system that detects when the master database fails and then fails over (promotes a slave to be the new master) without human intervention. For many applications this may be the correct approach.
There are inherent risks to this though – What if the failover implementation has a flaw and fails (after all, we probably don’t test this out in the production system very often)? What if the slave isn’t able to cope with the workload and makes things worse? Is it just a transitory glitch and would the best approach have been just to wait it out?
Following a recent, high profile outage there has been a great deal of debate on the topic between those that recommend auto-failover and those that believe it should only ever be entrusted to a knowledgeable (of the application and the database architecture) and well informed (of the state of the database nodes, application load etc.) human. Of course, if the triggering of the failover is to be left to a human then you want that person to have access to the information they need and an extremely simple procedure (ideally a single command) to execute the failover. Probably the truth is that it all depends on your specific circumstances.
The MySQL replication utilities aim to support you whichever camp you belong to:
- In the fully automated mode, the utilities will continually monitor the state of the master and in the event of its failure identify the best slave to promote – by default it will select the one that is most up-to-date and then apply any changes that are available on other slaves but not on this one before promoting it to be the new master. The user can override this behaviour (for example by limiting which of the slaves are eligible for promotion). The user is also able to bind in their own programs to be run before and after the failover (for example, to inform the application).
- In the monitoring mode, the utility still continually checks the availability of the master, and informs the user if it should fail. The user then executes a single command to fail over to their preferred slave.
Step 1. Make sure MySQL Servers are configured correctly
For some of the utilities, it’s important that you’re using Global Transaction IDs; binary logging needs to be enabled; and you may as well use the new crash-safe slave functionality… It’s beyond the scope of this post to go through all of those and so instead I’ll just give example configuration files for the 5 MySQL Servers that will be used:
[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data1
server-id=1
log-bin=util11-bin.log
report-host=utils1
report-port=3306
socket=/home/billy/mysql/sock1
port=3306

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data2
server-id=2
log-bin=util12-bin.log
report-host=utils1
report-port=3307
socket=/home/billy/mysql/sock2
port=3307

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
datadir=/home/billy/mysql/data3
server-id=3
log-bin=util2-bin.log
report-host=utils2
report-port=3306
socket=/home/billy/mysql/sock3
port=3306

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
master-info-repository=TABLE
relay-log-info-repository=TABLE
master-info-file=/home/billy/mysql/master4.info
datadir=/home/billy/mysql/data4
server-id=4
log-bin=util4-bin.log
report-host=utils2
report-port=3307
socket=/home/billy/mysql/sock4
port=3307

[mysqld]
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
disable-gtid-unsafe-statements=true # Use enforce-gtid-consistency from 5.6.9+
datadir=/home/billy/mysql/data5
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
#master-info-file=/home/billy/mysql/master5.info
server-id=5
log-bin=util5-bin.log
report-host=utils2
report-port=3308
socket=/home/billy/mysql/sock5
port=3308
The utilities are actually going to be run from a remote host, so that host must be able to access each of the MySQL Servers; for that, a user has to be granted remote access (note that the utilities will automatically create the replication user):
[billy@utils1 ~]$ mysql -h 127.0.0.1 -P3306 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils1 ~]$ mysql -h 127.0.0.1 -P3307 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3306 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3307 -u root -e "grant all on *.* to root@'%' with grant option;"
[billy@utils2 ~]$ mysql -h 127.0.0.1 -P3308 -u root -e "grant all on *.* to root@'%' with grant option;"
OK – that’s the most painful part of the whole process out of the way!
Set up replication
While there are extra options (such as specifying what username/password to use for the replication user or providing a password for the root user) I’m going to keep things simple and use the defaults as much as possible. The following commands are run from the MySQL Utilities terminal – just click on the pocket-knife icon in MySQL Workbench.
mysqlreplicate --master=root@utils1:3306 --slave=root@utils1:3307
# master on utils1: ... connected.
# slave on utils1: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3306
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3307
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.

mysqlreplicate --master=root@utils1:3306 --slave=root@utils2:3308
# master on utils1: ... connected.
# slave on utils2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
That’s it, replication has now been set up from one master to four slaves.
You can now check that the replication topology matches what you intended:
mysqlrplshow --master=root@utils1
# master on utils1: ... connected.
# Finding slaves for master: utils1:3306
# Replication Topology Graph
utils1:3306 (MASTER)
   |
   +--- utils1:3307 - (SLAVE)
   |
   +--- utils2:3306 - (SLAVE)
   |
   +--- utils2:3307 - (SLAVE)
   |
   +--- utils2:3308 - (SLAVE)
You can also check that any of the replication relationships is correctly configured:
mysqlrplcheck --master=root@utils1 --slave=root@utils2:3307
# master on utils1: ... connected.
# slave on utils2: ... connected.
Test Description                                                     Status
---------------------------------------------------------------------------
Checking for binary logging on master                                [pass]
Are there binlog exceptions?                                         [pass]
Replication user exists?                                             [pass]
Checking server_id values                                            [pass]
Is slave connected to master?                                        [pass]
Check master information file                                        [pass]
Checking InnoDB compatibility                                        [pass]
Checking storage engines compatibility                               [pass]
Checking lower_case_table_names settings                             [pass]
Checking slave delay (seconds behind master)                         [pass]
# ...done.
Adding the -s option would have also shown the output that you’d expect to see from SHOW SLAVE STATUS\G on the slave.
Automated monitoring and failover
The previous section showed how you can save some serious time (and opportunity for user-error) when setting up MySQL replication. We now look at using the utilities to automatically monitor the state of the master and then automatically promote a new master from the pool of slaves. For simplicity I'll stick with default values wherever possible but note that there are a number of extra options available to you such as:
- Constraining which slaves are eligible for promotion to master; the default is to take the most up-to-date slave
- Binding in your own scripts to be run before or after the failover (e.g. inform your application to switch master?)
- Have the utility monitor the state of the servers but don't automatically initiate failover
Here is how to set it up:
mysqlfailover --master=root@utils1:3306 --discover-slaves-login=root --rediscover

MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Wed Aug 15 13:19:30 2012

Master Information
------------------
Binary Log File    Position  Binlog_Do_DB  Binlog_Ignore_DB
util11-bin.000001  2586

Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host    | port  | role    | state  | gtid_mode  | health  |
+---------+-------+---------+--------+------------+---------+
| utils1  | 3306  | MASTER  | UP     | ON         | OK      |
| utils1  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3306  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3308  | SLAVE   | UP     | ON         | OK      |
+---------+-------+---------+--------+------------+---------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
mysqlfailover will then continue to run, refreshing the state - just waiting for something to go wrong.
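If you want the non-default behaviours listed earlier, they map onto mysqlfailover options along these lines – a sketch only (the script path is hypothetical, and you should check mysqlfailover --help for the exact option names in your version):

# --candidates restricts which slaves are eligible for promotion
# --exec-after runs a script (hypothetical path) once failover completes
# --failover-mode=fail monitors and reports a failed master without promoting automatically
mysqlfailover --master=root@utils1:3306 --discover-slaves-login=root \
    --candidates=root@utils2:3306,root@utils2:3307 \
    --exec-after=/home/billy/scripts/notify_app.sh \
    --failover-mode=fail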
Rather than waiting, I kill the master MySQL Server:
mysqladmin -h utils1 -P3306 -u root shutdown
Checking with the still-running mysqlfailover we can see that it has promoted utils1:3307.
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Wed Aug 15 13:21:13 2012

Master Information
------------------
Binary Log File    Position  Binlog_Do_DB  Binlog_Ignore_DB
util12-bin.000001  7131

Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host    | port  | role    | state  | gtid_mode  | health  |
+---------+-------+---------+--------+------------+---------+
| utils1  | 3307  | MASTER  | UP     | ON         | OK      |
| utils2  | 3306  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3308  | SLAVE   | UP     | ON         | OK      |
+---------+-------+---------+--------+------------+---------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
Add the recovered MySQL Server back into the topology
After restarting the failed MySQL Server, it can be added back into the mix as a slave to the new master:
mysqlreplicate --master=root@utils1:3307 --slave=root@utils1:3306 # master on utils1: ... connected. # slave on utils1: ... connected. # Checking for binary logging on master... # Setting up replication... # ...done.
The output from mysqlfailover (still running) confirms the addition:
MySQL Replication Failover Utility
Failover Mode = auto     Next Interval = Wed Aug 15 13:24:38 2012

Master Information
------------------
Binary Log File    Position  Binlog_Do_DB  Binlog_Ignore_DB
util12-bin.000001  7131

Replication Health Status
+---------+-------+---------+--------+------------+---------+
| host    | port  | role    | state  | gtid_mode  | health  |
+---------+-------+---------+--------+------------+---------+
| utils1  | 3307  | MASTER  | UP     | ON         | OK      |
| utils1  | 3306  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3306  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3308  | SLAVE   | UP     | ON         | OK      |
+---------+-------+---------+--------+------------+---------+

Q-quit R-refresh H-health G-GTID Lists U-UUIDs
If it were important that the recovered MySQL Server be restored as the master then it is simple to manually trigger the promotion (after quitting out of mysqlfailover):
mysqlrpladmin --master=root@utils1:3307 --new-master=root@utils1:3306 --demote-master --discover-slaves-login=root switchover
# Discovering slaves for master at utils1:3307
# Checking privileges.
# Performing switchover from master at utils1:3307 to slave at utils1:3306.
# Checking candidate slave prerequisites.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
#
# Replication Topology Health:
+---------+-------+---------+--------+------------+---------+
| host    | port  | role    | state  | gtid_mode  | health  |
+---------+-------+---------+--------+------------+---------+
| utils1  | 3306  | MASTER  | UP     | ON         | OK      |
| utils1  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3306  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3307  | SLAVE   | UP     | ON         | OK      |
| utils2  | 3308  | SLAVE   | UP     | ON         | OK      |
+---------+-------+---------+--------+------------+---------+
# ...done.
As always, we'd really appreciate people trying this out and giving us feedback!
Posted on May 29th, 2012 No comments
On Wednesday (16th May 2012), Mat Keep and I presented on the new replication features that are previewed as part of the latest MySQL 5.6 Development Release.
In addition, there were a huge number of great questions raised and we had a couple of key engineers answering them on-line – view the Q&A transcript here.
A reminder of the topics covered in the webinar…
MySQL 5.6 delivers new replication capabilities which we will discuss in the webinar:
- High performance with Multi-Threaded Slaves and Optimized Row Based Replication
- High availability with Global Transaction Identifiers, Failover Utilities and Crash Safe Slaves & Binlog
- Data integrity with Replication Event Checksums
- Dev/Ops agility with new Replication Utilities, Time Delayed Replication and more
Posted on May 10th, 2012 No comments
On Wednesday (16th May 2012), Mat Keep and I will be presenting the new replication features that are previewed as part of the latest MySQL 5.6 Development Release. If you’d like to attend then register here.
MySQL 5.6 delivers new replication capabilities which we will discuss in the webinar:
- High performance with Multi-Threaded Slaves and Optimized Row Based Replication
- Data integrity with Replication Event Checksums
- Dev/Ops agility with new Replication Utilities, Time Delayed Replication and more
The session will wrap up with resources to get started with MySQL 5.6 and an opportunity to ask questions.
The webinar will last 45-60 minutes and will start on Wednesday, May 16, 2012 at 09:00 Pacific time (America); start times in other time zones:
- Wed, May 16: 06:00 Hawaii time
- Wed, May 16: 10:00 Mountain time (America)
- Wed, May 16: 11:00 Central time (America)
- Wed, May 16: 12:00 Eastern time (America)
- Wed, May 16: 16:00 UTC
- Wed, May 16: 17:00 Western European time
- Wed, May 16: 18:00 Central European time
- Wed, May 16: 19:00 Eastern European time
As always, it’s worth registering even if you can’t make the live webcast as you’ll be emailed a link to the replay as soon as it’s available.
Posted on December 12th, 2011 No comments
There will be a webinar this Wednesday where you can get the latest information on all of the great new content that has been included in the MySQL 5.6 Development Releases as well as some features that are still being developed. As always, the webinar is free but you need to register here in advance. Even if you can’t attend the live event it’s worth registering so that you get sent the replay.
The event starts on Wednesday, December 14, 2011: 09:00 Pacific time; 17:00 UK; 18:00 CET.
Some of the topics we’ll be discussing are:
- Enhanced data integrity: Global Transaction Identifiers, Crash-Safe Slaves and Replication Event Checksums;
- High performance: Multi-Threaded Slaves, Binlog Group Commit and Optimized Row-Based Replication;
- Improved flexibility: Time Delayed Replication, Multi-Source Replication, Binlog API and Informational Log Events
Posted on December 7th, 2011 No comments
If you were unable to attend the live webinar (or you want to go back and listen to it again) then it’s now available to view on-line here.
Databases are the center of today’s web and enterprise applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database high availability is therefore a top priority for any organization. Tune into this webcast to learn more.
The session discusses:
- Causes, effects and impact of downtime
- Methodologies for mapping applications to the right HA solution
- Overview of MySQL HA solutions
- Operational best practices to ensure business continuity
Posted on September 6th, 2011 8 comments
Oracle has announced support for running MySQL on Windows Server Failover Clustering (WSFC); with so many people developing and deploying MySQL on Windows, this offers a great option to add High Availability to MySQL deployments if you don’t want to go as far as deploying MySQL Cluster.
This post will give a brief overview of how to set things up but for all of the gory details a new white paper MySQL with Windows Server 2008 R2 Failover Clustering is available – please give me any feedback. I will also be presenting on this at a free webinar on Thursday 15th September (please register in advance) as well at an Oracle OpenWorld session in San Francisco on Tuesday 4th October (Tuesday, 01:15 PM, Marriott Marquis – Golden Gate C2) – a good opportunity to get more details and get your questions answered.
Probably the most common form of High Availability for MySQL is MySQL Replication (asynchronous or semi-synchronous) and the option for the highest levels of availability is MySQL Cluster. We are in the process of rolling out a number of solutions that provide levels of availability somewhere in between MySQL Replication and MySQL Cluster; the Oracle VM Template for MySQL Enterprise Edition was the first (overview, webinar replay, white paper) and WSFC is the second.
It sometimes surprises people just how much MySQL is used on Windows; here are a few of the reasons:
- Lower TCO
  - 90% savings over Microsoft SQL Server
  - If you’re a little skeptical about this then try it out for yourself with the MySQL TCO Savings Calculator
- Broad platform support
  - No lock-in
  - Windows, Linux, MacOS, Solaris
- Ease of use and administration
  - < 5 mins to download, install & configure
  - MySQL Enterprise Monitor & MySQL Workbench
- 24×7 Uptime
  - Field Proven
- Performance and scalability
  - MySQL 5.5 delivered a performance boost of over 500% on Windows
- Integration into Windows environment
  - ADO.NET, ODBC & Microsoft Access Integration
- And now, support for Windows Server Failover Clustering!
MySQL with Windows Failover Clustering requires at least 2 servers within the cluster together with some shared storage (for example FCAL SAN or iSCSI disks). For redundancy, 2 LANs should be used for the cluster to avoid a single point of failure and typically one would be reserved for the heartbeats between the cluster nodes.
The MySQL binaries and data files are stored in the shared storage and Windows Failover Clustering ensures that at most one of the cluster nodes will access those files at any point in time (hence avoiding file corruptions).
Clients connect to the MySQL service through a Virtual IP Address (VIP) and so in the event of failover they experience a brief loss of connection but otherwise do not need to be aware that the failover has happened other than to handle the failure of any in-flight transactions.
This post will briefly step through how to set up and use a cluster, and this diagram shows how this is mapped onto physical hardware and network addresses for the lab used later in this post. In this case, iSCSI is used for the shared storage. Note that ideally there would be an extra subnet for the heartbeat connection between ws1 and ws3.
This is only intended to be an overview and the steps have been simplified – refer to the white paper for more details on each of the steps.
- MySQL 5.5 & InnoDB must be used for the database (note that MyISAM is not crash-safe and so failover may result in a corrupt database)
- Windows Server 2008 R2
- Redundant network connections between nodes and storage
- WSFC cluster validation must pass
- iSCSI or FCAL SAN should be used for the shared storage
Step 1. Configure iSCSI in software (optional)
This post does not attempt to describe how to configure a highly available, secure and performant SAN but in order to implement the subsequent steps shared storage is required and so in this step we look at one way of using software to provide iSCSI targets without any iSCSI/SAN hardware (just using the server’s internal disk). This is a reasonable option to experiment with but probably not what you’d want to deploy with for a HA application. If you already have shared storage set up then you can skip this step and use that instead.
As part of this process you’ll create at least two virtual disks within the iSCSI target; one for the quorum file and one for the MySQL binaries and data files. The quorum file is used by Windows Failover Clustering to avoid “split-brain” behaviour.
Step 2. Ensure Windows Failover Clustering is enabled
To confirm that Windows Failover Clustering is installed on ws1 and ws3, open the “Features” branch in the Server Manager tool and check if “Failover Cluster Manager” is present.
If Failover Clustering is not installed then it is very simple to add it. Select “Features” within the Server Manager, click on “Add Features”, select “Failover Clustering” and then “Next”.
Step 3. Install MySQL as a service on both servers
If MySQL is already installed as a service on both ws1 and ws3 then this step can be skipped.
The installation is very straight-forward using the MySQL Windows Installer and selecting the default options is fine.
Within the MySQL installation wizard, sticking with the defaults is fine for this exercise. When you reach the configuration step, check “Create Windows Service”.
The installation and configuration must be performed on both ws1 and ws3, if necessary.
Step 4. Migrate MySQL binaries & data to shared storage
If the MySQL Service is running on either ws1 or ws3 then stop it – open the Task Manager using ctrl-shift-escape, select the “Services” tab and then right-click on the MySQL service and choose “Stop Service”.
As the iSCSI disks were enabled on ws1 you can safely access them in order to copy across the MySQL binaries and data files to the shared disk.
Step 5. Create Windows Failover Cluster
From the Server Manager on either ws1 or ws3 navigate to “Features -> Failover Cluster Manager” and then select “Validate a Configuration”. When prompted enter ws1 as one name and then ws3 as the other.
In the “Testing Options” select “Run all tests” and continue. If the tests report any errors then these should be fixed before continuing.
Now that the system has been verified, select “Create a Cluster” and provide the same server names as used in the validation step. In this example, “MySQL” is provided as the “Cluster Name” and then the wizard goes on to create the cluster.
Step 6. Create Cluster of MySQL Servers within Windows Cluster
Adding the MySQL service to the new Cluster is very straight-forward. Right-click on “Services and applications” in the Server Manager tree and select “Configure a Service or Application…”. When requested by the subsequent wizard, select “Generic Service” from the list and then “MySQL” from the offered list of services. Our example name was “ClusteredMySQL”. Please choose an appropriate name for your cluster. The wizard will then offer the shared disk that has not already been established as the quorum disk for use with the Clustered service – make sure that it is selected.
Once the wizard finishes, it starts up the MySQL Service. Click on the “ClusteredMySQL” service branch to observe that the service is up and running. You should also make a note of the Virtual IP (VIP) assigned, in this case 192.168.2.18.
Step 7. Test the cluster
As described in Step 6, the VIP should be used to connect to the clustered MySQL service:
C:\> mysql -u root -h 192.168.2.18 -P3306 -pbob
From there create a database and populate some data.
mysql> CREATE DATABASE clusterdb;
mysql> USE clusterdb;
mysql> CREATE TABLE simples (id int not null primary key) ENGINE=innodb;
mysql> INSERT INTO simples VALUES (1);
mysql> SELECT * FROM simples;
+----+
| id |
+----+
|  1 |
+----+
The MySQL service was initially created on ws1 but it can be forced to migrate to ws3 by right-clicking on the service and selecting “Move this service or application to another node”.
As the MySQL data is held in the shared storage (which has also been migrated to ws3), it is still available and can still be accessed through the existing mysql client which is connected to the VIP:
mysql> select * from simples;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    1
Current database: clusterdb

+----+
| id |
+----+
|  1 |
+----+
Note the error shown above – the mysql client loses the connection to the MySQL service as part of the migration and so it automatically reconnects and completes the query. Any application using MySQL with Windows Failover Cluster should also expect to have to cope with these “glitches” in the connection.
More users develop and deploy MySQL on Windows than on any other single platform. Enhancements in MySQL 5.5 increased performance by over 5x compared to previous MySQL releases. With certification for Windows Server Failover Clustering, MySQL can now be deployed to support business critical workloads demanding high availability, enabling organizations to better meet demanding service levels while also reducing TCO and eliminating single vendor lock-in.
Please let me know how you get on by leaving comments on this post.
Posted on September 6th, 2011 No comments
I will be presenting on MySQL Cluster and MySQL Replication at the Oracle Technical Network MySQL Developer day in London on Tuesday, 18 October 2011 (8:30 AM – 4:00 PM). It’s free but you need to register here while there are still places (attendance has been extremely high at other locations).
The MySQL Developer Day is a one-stop shop for you to learn all the essential MySQL skills. In this free, one-day seminar, we will cover everything you need to know to successfully design, develop, and manage your MySQL databases. You’ll also learn the guidelines and best practices in performance tuning and scalability.
Attend this event and gain the knowledge to:
- Develop your new applications cost-effectively using MySQL
- Improve performance of your existing MySQL databases
- Manage your MySQL environment more efficiently
8:30 a.m. – 9:30 a.m.    Registration
9:30 a.m. – 10:30 a.m.   Keynote: MySQL Essentials
10:30 a.m. – 11:30 a.m.  Session: MySQL Replication and Scalability
11:30 a.m. – 11:45 a.m.  Break
11:45 a.m. – 12:45 p.m.  Session: MySQL Cluster
12:45 p.m. – 1:30 p.m.   Lunch
1:30 p.m. – 2:30 p.m.    Session: MySQL Administration and Management Tools
2:30 p.m. – 3:30 p.m.    Session: MySQL Performance Tuning
3:30 p.m. – 4:00 p.m.    Close
If you’re going to be in the area, please register and come along – would be good to meet as many of you as possible.
Posted on September 5th, 2011 1 comment
Slight adjustment to some of the times + added the MySQL community reception (read vodka!). Oracle OpenWorld (San Francisco) starts on Sunday 2nd October (including some MySQL community sessions) through Thursday 6th October. MySQL has a lot of sessions this year as well as 3 demo booths.
This year I’m going to be involved in 3 public sessions – if you’re attending, please come along and say hello!
In addition I’ll be spending as much time as I can at the MySQL demo booths in the exhibition hall. Come and visit us for demos of MySQL Cluster, MySQL Enterprise Edition and MySQL Workbench.
- Getting the Most Out of MySQL on Windows – 13:15 on Tuesday (Marriott Marquis – Golden Gate C2)
- Building Highly Available and Scalable Real-Time Services with MySQL Cluster – 10:15 on Wednesday (Marriott Marquis – Golden Gate C1)
- NoSQL Access to MySQL: The Best of Both Worlds – 11:45 on Wednesday (Marriott Marquis – Golden Gate C1)
- MySQL Community Reception – 19:00 on Wednesday (San Francisco Marriott Marquis – Foothill G)
Posted on July 11th, 2011 No comments
Join us for a webinar this Wednesday (13th July) to understand more about the benefits of using the new Oracle VM Template for MySQL Enterprise Edition as well as how to get started with it. As always the webinar is free but please register here. The webinar starts at 9:00 am Pacific (5:00 pm UK, 6:00 pm CET) and even if you can’t make that time register anyway and you’ll be sent a link to the charts and replay.
As a reminder, a new white paper is available that goes through some of the details – if you have time then take a look at this paper before the webinar and then get any of your questions answered. Get the white paper here.
The official description…
Virtualization is a key technology to enable data center efficiency and provides the foundation for cloud computing. Integrating MySQL Enterprise Edition with Oracle Linux, the Oracle VM Template is the fastest, easiest and most reliable way to provision virtualized MySQL instances, enabling users to meet the explosive demand for web-based services. This webinar will demonstrate how the Oracle VM Template for MySQL Enterprise Edition ensures rapid deployment and helps eliminate configuration efforts and risks by providing a pre-installed and pre-configured virtualized software image, also taking advantage of Oracle VM’s mechanisms to deliver high availability. The webinar will cover:
- An overview of the technologies packaged into the template
- How to configure and deploy the template
- How to use, modify and save an adapted template that supports specific application requirements
By attending the webinar, you will learn how the Oracle VM Template for MySQL Enterprise Edition provides the foundation for a more agile and highly available MySQL environment. This session will be approximately 1 hour in length and will include interactive Q&A throughout. Please join us for this informative webinar!
Posted on June 28th, 2011 No comments
A new white paper is available that steps you through the benefits and the use of the Oracle VM Template for MySQL Enterprise Edition (also see yesterday’s press announcement). Get the white paper here.
As a reminder, Oracle VM provides a way to add High Availability to your MySQL deployment (this is separate from MySQL Replication and MySQL Cluster). The new OVM template provides a very convenient way to create one or more Virtual Machines that come pre-provisioned with OVM, Oracle Linux and MySQL Enterprise Edition, as well as the scripts required to integrate MySQL into OVM’s HA features.
Posted on June 27th, 2011 2 comments
Oracle today announced the release of the “Oracle VM Template for MySQL Enterprise Edition” – you can read the press-release here.
There are a couple of ways to look at the benefits:
- Provides a simple alternate High Availability solution for MySQL (i.e. rather than MySQL Replication or MySQL Cluster – although you can still set up replication to/from a MySQL Server installed this way)
- A new, very simple way to get up and running with MySQL Enterprise – running on a complete stack that Oracle can provide support for
As the focus of this blog is normally MySQL Cluster and Replication, the HA attributes of this solution are probably those of most interest. When you download the template and then load it into Oracle Virtual Machine (OVM) Manager and fire it up you get the chance to flag that you want it to be run in a HA mode. What this means is that:
- If the hardware, virtual machine, operating system or MySQL Server fail then the OVM Manager will recreate the VM on another physical server from the OVM Server Pool
- For routine hardware maintenance you can tell OVM Manager to migrate a running VM to another physical server
Note that the data files are actually stored on shared storage under Oracle Cluster File System 2 (OCFS2) which is how a newly instantiated replacement VM can get up and running quickly. As the original hardware (or the MySQL Server instance, OS or VM) could still fail without warning the MySQL Storage Engine must be crash-safe and so this should be used with InnoDB rather than MyISAM.
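A quick way to check an instance for tables that would break that assumption is to query information_schema (a plain query, nothing specific to the template):

mysql> SELECT table_schema, table_name
    ->   FROM information_schema.tables
    ->  WHERE engine = 'MyISAM'
    ->    AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');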
The complete software line-up looks like this:
- Oracle Linux 5 Update 6 with the Unbreakable Enterprise Kernel
- Oracle VM 2.2.1
- Oracle VM Manager 2.1.5
- Oracle Cluster File System 2 (OCFS2)
- MySQL Database 5.5.10 (Enterprise Edition) – InnoDB
Posted on November 17th, 2009 No comments
I will be presenting a free Webinar on Geographic Replication for MySQL Cluster at 9:00 am (UK time) on Tuesday 24 November.
MySQL Cluster has been deployed into some of the most demanding web, telecoms and enterprise/government workloads, supporting 99.999% availability with real time performance and linear write scalability.
You can register on-line here.
Tune into this webinar to hear the MySQL Cluster product management team provide a detailed “deep dive” into one of MySQL Cluster’s key capabilities – Geographic Replication.
In this session, you will learn how using Geographic Replication enables your applications to:
- Achieve higher levels of availability within a data center or across a WAN
- Locate data closer to users, providing lower latency access
- Replicate to other MySQL storage engines for complex data analysis and reporting of real time data
- How to get started with Geographic Replication
Tuesday, November 24, 2009: 10:00 Central European time
- Tue, Nov 24: 09:00 Western European time
- Tue, Nov 24: 11:00 Eastern European time
The presentation will be approximately 1 hour long, including on-line Q&A.
Posted on November 16th, 2009 15 comments
MySQL Cluster is designed to be a High Availability, Fault Tolerant database where no single failure results in any loss of service.
This is however dependent on how the user chooses to architect the configuration – in terms of which nodes are placed on which physical hosts, and which physical resources each physical host is dependent on (for example if the two blades containing the data nodes making up a particular node group are cooled by the same fan then the failure of that fan could result in the loss of the whole database).
Of course, there’s always the possibility of an entire data center being lost due to earthquake, sabotage etc. and so for a fully available system, you should consider using asynchronous replication to a geographically remote Cluster.
Fig 1. illustrates a typical small configuration with one or more data nodes from different node groups being stored on two different physical hosts and a management node on an independent machine (probably co-located with other applications as its resource requirements are minimal). If any single node (process) or physical host is lost then service can continue.
The basics of MySQL Cluster fault tolerance
Data held within MySQL Cluster is partitioned into fragments, and all of the data nodes responsible for the same fragments form a Node Group (NG). If configured correctly, any single data node can be lost and the other data nodes within its node group will continue to provide service.
The management node (ndb_mgmd process) is required when adding nodes to the cluster – either when it was initially configured or when a node has been lost and restarted.
A heart-beat protocol is used between the data nodes in order to identify when a node has been lost. In many cases, the community of surviving data nodes can reconfigure themselves but in some cases they need help from the management node – much of this article focuses on how to identify these cases so that you can decide what level of redundancy is required for the management node.
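Node failure is detected when heartbeats stop arriving; the heartbeat period is tunable in config.ini if the defaults don’t suit your network (1500 ms is the default value – this fragment is purely illustrative):

[ndbd default]
HeartbeatIntervalDbDb=1500   # ms between data node heartbeats; a node is declared dead after 4 missed beats
HeartbeatIntervalDbApi=1500  # ms between data node and API/SQL node heartbeats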
Goals of the algorithm
The algorithm used by MySQL Cluster has 2 priorities (in order):
- Prevent database inconsistencies through “split brain” syndrome
- Keep the database up and running, serving the application
Split brain would occur if 2 data nodes within a node group lost contact with each other and independently decided that they should be master for the fragments controlled by their node group. This could lead to them independently applying conflicting changes to the data – making it very hard to recover the database (which would include undoing the changes that the application believes to have been safely committed). Note that a particular node doesn’t know whether its peer(s) has crashed or if it has just lost its connection to it. If the algorithm is not confident of avoiding a split brain situation then all of the data nodes are shut down – obviously that isn’t an ideal result and so it’s important to understand how to configure your cluster so that doesn’t happen.
If all of the data nodes making up a node group are lost then the cluster shuts down.
When data nodes lose contact with each other (could be failure of a network connection, process or host) then all of the data nodes that can still contact each other form a new community. Each community must decide whether its data nodes should stay up or shut down:
- If the community doesn’t contain at least 1 data node from each node group then it is not viable and its data nodes should shut down.
- If this community is viable and it can determine that it contains enough of the data nodes such that there can be no other viable community out there (one with at least 1 data node from each node group) then it will decide (by itself) to keep up all of its data nodes.
- If the community is viable but there is the possibility of another viable community then it contacts the arbitrator which decides which amongst all viable communities should be allowed to stay up. If the community can not connect to the arbitrator then its data nodes shut down.
In this way, at most one community of data nodes will survive and there is no chance of split brain.
The arbitrator will typically run on a management node. As you’ll see from the algorithm and the following examples, the cluster can sometimes survive a failure without needing a running management node but sometimes it can’t. In most of the examples, a single management node is used but you may well decide to have a second for redundancy so that more multiple-point-of-failure scenarios can be handled. At any point in time, just one of the management nodes acts as the active arbitrator; if the active one is lost then a majority community of data nodes can hand control over to the other management node.
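As a sketch of what that dual-management-node setup might look like in config.ini (the NodeIds and the second host’s address are illustrative), the ArbitrationRank parameter controls which management node is preferred as the active arbitrator:

[ndb_mgmd]
NodeId=1
HostName=192.168.0.19
ArbitrationRank=1       # preferred arbitrator

[ndb_mgmd]
NodeId=2
HostName=192.168.0.20   # illustrative second host
ArbitrationRank=2       # takes over if the preferred arbitrator is lost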
Note that the management node consumes very little resource and so can be co-located with other functions/applications but as you’ll see from the examples, you would normally avoid running it on the same host as a data node.
Example 1: Simplest cluster – loss of management node followed by a data node
Fig 2. shows a very simple cluster with 3 hosts, the management node running on 192.168.0.19 and then a single node group (NG1) made up of 2 data nodes split between 192.168.0.3 and 192.168.0.4. In this scenario, the management node is ‘lost’ (could be process, host or network failure) followed by one of the data nodes.
The surviving data node forms a community of 1. As it can’t know whether the other data node from NG1 is still viable and it can’t contact the arbitrator (the management node) it must shut itself down and so service is lost.
Note that the 2 nodes could be lost instantaneously or the management node might be lost first followed some time later by a data node.
To provide a truly HA solution there are 2 steps:
- Ensure that there is no single point of failure that could result in 192.168.0.19 and either of the other 2 hosts being lost.
- Run a second management node on a 4th host that can take over as arbitrator if 192.168.0.19 is lost
Example 2: Half of data nodes isolated but management node available
In Fig 3. host 192.168.0.3 and its 2 data nodes remain up and running but become isolated from the management node and the other data nodes. 2 communities of connected data nodes are formed. As each of these communities is viable but recognizes that there could be another viable surviving community (one containing a data node from each node group), they must defer to the arbitrator. As 192.168.0.3 has lost its connection to the management node, the community of data nodes hosted there shuts itself down. The community hosted on 192.168.0.4 can contact the management node which, as this is the only community it can see, allows its data nodes to stay up and so service is maintained.
Example 3: Half of data nodes isolated and management node lost
The scenario shown in Fig 4. builds upon Example 2 but in this case, the management node is lost before one of the data node hosts loses its connection to the other.
In this case, both communities defer to the management node but as that has been lost they both shut themselves down and service is lost.
Refer to Example 1 to see what steps could be taken to increase the tolerance to multiple failures.
Example 4: Management node co-located with data nodes
Fig 5. shows a common, apparent short-cut that people may take with just 2 hosts available, hosting the management node on the same machine as some of the data nodes. In this example, the connection between the 2 hosts is lost. As each community is viable, they each attempt to contact the arbitrator – the data nodes on 192.168.0.3 are allowed to stay up while those on 192.168.0.4 shut down as they can’t contact the management node.
However this configuration is inherently unsafe, if 192.168.0.3 failed then there would be a complete loss of service as the data nodes on 192.168.0.4 would form a viable community but be unable to confirm that they represent the only viable community.
It would be tempting to make this more robust by running a second management node on 192.168.0.4 – in that case, when each host becomes isolated from the other, the data nodes local to the management node that’s the current arbitrator will stay up. However, if the entire host on which the active arbitrator is running failed then you would again lose service. The management node must be run on a 3rd host for a fault-tolerant solution.
Example 5: Odd number of data node hosts
Fig 6. shows a configuration where running the management node on the same host as some of the data nodes does provide a robust solution.
In this example, 192.168.0.3 becomes isolated; its data nodes form a community of 2 which doesn’t include a data node from NG2 and so they shut themselves down. 192.168.0.4 and 192.168.0.19 are still connected and so they form a community of 4 data nodes; they recognize that their community holds all of the data nodes from NG2 and so there can be no other viable community, and they are kept up without having to defer to the arbitrator.
Note that as there was no need to consult the management node, service would be maintained even if it was the machine hosting the management node that became isolated.
Example 6: Loss of all data nodes in a node group
Fig 7. illustrates the case where there are multiple data node failures such that all of the data nodes making up a node group are lost. In this scenario, the cluster has to shut down as that node group is no longer viable and the cluster would no longer be able to provide access to all data.
Example 7: Loss of multiple data nodes from different node groups
Fig 8. is similar to Example 6 but in this case, there is still a surviving data node from each node group. The surviving data node from NG1 forms a community with the one from NG2. As there could be another viable community (containing a data node from NG1 and NG2), they defer to the management node and as they form the only viable community they’re allowed to stay up.
Example 8: Classic, fully robust configuration
Fig 9. shows the classic, robust configuration. 2 independent machines both host management nodes. These are in turn connected by 2 independent networks to each of the data nodes (which are in turn all connected to each other via duplicated network connections).
Posted on May 26th, 2009 9 comments
One of the better kept secrets about MySQL Cluster appears to be the flexibility available when setting up replication. Rather than being constrained to implementing a single replication scheme, you can mix and match approaches.
Just about every Cluster deployment will use synchronous replication between the data nodes within a node group to implement High Availability (HA) by making sure that at the point a transaction is committed, the new data is stored in at least 2 physical hosts. Given that MySQL Cluster is usually used to store the data in main memory rather than on disk, this is pretty much mandatory (note that the data changes are still written to disk but that’s done asynchronously to avoid slowing down the database).
MySQL asynchronous replication is often used for MySQL Cluster deployments in order to provide Geographic Redundancy. At the same time as the synchronous replication within a Cluster, the changes can be replicated asynchronously to a second Cluster (or to more than one) at a remote location. Asynchronous rather than synchronous replication is used so that the transaction commit is not delayed while waiting for the remote (could be thousands of miles away, connected by a high latency WAN) Cluster to receive, apply and acknowledge the change. A common misconception is that changes being made through the NDB API will not be replicated to the remote site as this replication is handled by a MySQL Server instance – the reality is that the MySQL Replication implementation will pick up the changes even when they’re written directly to the data nodes through the NDB API.
A third use of replication is to store the Cluster’s data in a separate database – for example to have a read-only, up-to-date copy of the data stored within the MyISAM storage engine so that complex reports can be generated from it. And the best news is that this can be done at the same time as the local HA and remote Geographic Redundancy replication!
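There’s nothing exotic needed to configure this: a mysqld attached to the Cluster acts as the replication master, and the reporting slave is a standalone MySQL Server whose tables end up in the target engine. A minimal sketch of the two my.cnf files, assuming the slave runs without NDB support so that replicated CREATE TABLE ... ENGINE=NDBCLUSTER statements fall back to its default engine (server-ids and log names are illustrative):

# Replication master – a mysqld attached to the Cluster
[mysqld]
ndbcluster                      # connect this MySQL Server to the data nodes
log-bin=cluster1-bin            # changes (including those made via the NDB API) are written to the binary log
server-id=101

# Reporting slave – a standalone (non-Cluster) MySQL Server
[mysqld]
server-id=201
default-storage-engine=MyISAM   # replicated tables default to MyISAM here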
Johan’s Blog provides the technical details around configuring replication in order to provide some extra scaling by setting up non-Cluster slave databases that pick up all changes from the Cluster database.