Sunday 4 September 2016

Multi-master write/write MySQL cluster troubleshooting

Multi-master write/write MySQL cluster

Background

So I have four payment servers located around Europe:
London, Paris, the Netherlands and Frankfurt. My customers are based around Europe, so these locations are best suited for us.

I operate automatic failover via Amazon's DNS (Route 53), and using its health checks is very cost-effective. If a server is unresponsive, AWS takes it out of rotation and routes traffic to another payment server.
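Roughly, the record for one location looks like this via the AWS CLI (a sketch: the zone ID, record name, IP and health-check ID are placeholders, not my real values). Each payment server gets a latency-based record tied to a health check, so an unhealthy endpoint is dropped from DNS answers:

aws route53 change-resource-record-sets --hosted-zone-id ZEXAMPLE12345 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "payments.example.com",
        "Type": "A",
        "SetIdentifier": "frankfurt",
        "Region": "eu-central-1",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "203.0.113.10" }],
        "HealthCheckId": "11111111-2222-3333-4444-555555555555"
      }
    }]
  }'

Repeat for each location with its own SetIdentifier, Region and health check.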

Communication between the servers has to be encrypted, and I will eventually want to pass more traffic between them (MySQL, WDDX, API calls, etc.), so I decided to skip SSL between the MySQL processes themselves.

Instead, I decided to establish what is effectively a site-to-site VPN using OpenVPN.
Because this relies on a 'central' node, as in a star topology, having that central node go down would be a problem. So I decided to run at least two central nodes, acting as 'servers' for the client VPNs to connect to.

With this method I only have to maintain two sets of VPN connections, rather than the four (one for each payment server) a full mesh would need, while keeping the setup truly resilient. It would also take 50% of the payment servers going down before a split-brain situation could occur. A sketch of the configs is below.
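A minimal sketch of the OpenVPN side (paths, hostnames and subnets are placeholders; a real config would carry more hardening). The key part is that each client lists both central nodes as remotes, so it fails over if one hub dies:

# On each of the two central nodes: /etc/openvpn/server.conf
port 1194
proto udp
dev tun
ca /etc/openvpn/ca.crt
cert /etc/openvpn/server.crt
key /etc/openvpn/server.key
dh /etc/openvpn/dh2048.pem
server 10.8.0.0 255.255.255.0   # give the second hub its own subnet, e.g. 10.9.0.0/24
client-to-client                # let the payment servers reach each other via the hub
keepalive 10 60
persist-key
persist-tun

# On each payment server: /etc/openvpn/client.conf
client
dev tun
proto udp
remote vpn1.example.com 1194    # first central node
remote vpn2.example.com 1194    # second central node, tried if the first is down
resolv-retry infinite
ca /etc/openvpn/ca.crt
cert /etc/openvpn/payments1.crt
key /etc/openvpn/payments1.key
persist-key
persist-tun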

For MySQL, I decided to use Galera Cluster for master/master (active/active) replication. This has been working very well, and the ~600 MB of state data transferred on initial node setup takes very little time. If you find that the initial sync takes longer than 30 minutes, I would suspect a networking issue (rate limiting, etc.). This happened to me with the Paris payments server, and sadly it took around 6 hours to sync the node before it could take on incrementals.
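The Galera side is just a handful of wsrep settings in my.cnf. A sketch, with placeholder node names and VPN tunnel addresses rather than my exact config:

[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name=payments
# addresses are the VPN tunnel IPs, so replication stays inside the VPN
wsrep_cluster_address=gcomm://10.8.0.1,10.8.0.2,10.8.0.3,10.8.0.4
wsrep_node_name=payments1
wsrep_node_address=10.8.0.1
wsrep_sst_method=rsync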

Configuration

They are all connected to each other through the VPN tunnels, forming a mesh for secure communication (it was easier than messing around with SSL certs), and are running Galera.
This works very well: any change on one server is replicated to the others within a couple of seconds.

Troubleshooting

Node re-syncing

If a node in the cluster drops off, you can tell with the query:
SHOW STATUS LIKE 'wsrep_cluster_size';
which in my case shows 4.
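On a healthy cluster the output looks like this (illustrative):

mysql> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 4     |
+--------------------+-------+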

If for any reason something does drop off, this will show less than 4.

To resync the node that's disconnected, you can simply restart MySQL, or, if the chosen donor is too slow, you can select another donor with:
/etc/init.d/mysql restart --wsrep_sst_donor=payments3
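If one donor is consistently faster (e.g. the node nearest on the network), you can also pin it in my.cnf so every state transfer uses it; assuming the node name above:

wsrep_sst_donor=payments3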

Identifying the most advanced node

This is done by checking the Global Transaction ID (the cluster UUID plus a sequence number) in the grastate.dat file (usually in /var/lib/mysql/) on each node.
Check the latest sequence number (seqno): the node with the highest seqno is the most advanced, and is the one to bootstrap the cluster from.
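For example, on each node (the uuid and seqno values here are illustrative):

cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    5ee99582-bb8d-11e2-b8e3-23de375c1d30
seqno:   8204503945773

A seqno of -1 means the node shut down uncleanly, and its position has to be recovered first (e.g. with mysqld --wsrep-recover) before you can compare it.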