
New ClusterControl subscriptions for managing MySQL, MongoDB and PostgreSQL

We’ve got your databases covered: check out our new pricing plans for ClusterControl, the single console to deploy, monitor and manage your entire database infrastructure.

Whether you’re looking to manage standalone instances, need high availability or have 24/7 SLA requirements for your databases, ClusterControl now comes with three enhanced options for you to choose from, in addition to its Community Edition.

Standalone

Do you have standalone database servers to manage? Then this is the best plan for you. From real-time monitoring and performance advisors, to analyzing historical query data and making sure all your servers are backed up, ClusterControl Standalone has you covered.

Advanced

As our company name indicates, we’re all about achieving high availability. With ClusterControl Advanced, you can take the guesswork out of managing your high availability database setups - automate failover and recovery of your databases, add load balancers with read-write splits, add nodes or read replicas - all with a couple of clicks.

Enterprise

If you’re looking for all of the above in a 24/7 secure service environment, then look no further. From high-spec operational reports to role-based access control and SSL encryption, this is our most advanced plan aimed at mission-critical environments.

Here is a summary view of the new subscriptions:

Note that ClusterControl can be downloaded for free, and each download includes an initial 30-day trial of ClusterControl Enterprise, so that you can test the full feature set of our product. It then becomes ClusterControl Community, should you decide not to purchase a plan. With ClusterControl Community, you can deploy and monitor MySQL, MongoDB and PostgreSQL.

Happy Clustering!


Load balanced MySQL Galera setup - Manual Deployment vs ClusterControl

If you have deployed databases with high availability before, you will know that a deployment does not always go your way, even though you’ve done it a zillion times. You could spend a full day setting everything up and may still end up with a non-functioning cluster. It is not uncommon to start over, as it’s really hard to figure out what went wrong.

So, deploying a MySQL Galera Cluster with redundant load balancing takes a bit of time. This blog looks at how long it would take to do it manually, versus using ClusterControl to perform the task. For those who have not used it before, ClusterControl is an agentless management and automation software for databases. It supports MySQL (Oracle and Percona Server), MariaDB, MongoDB (MongoDB Inc. and Percona), and PostgreSQL.

For manual deployment, we’ll be using the popular “Google university” to search for how-to’s and blogs that provide deployment steps.

Database Deployment

Deployment of a database consists of several parts. These include getting the hardware ready, software installation, configuration tweaking and a bit of tuning and testing. Now, let’s assume the hardware is ready, the OS is installed and it is up to you to do the rest. We are going to deploy a three-node Galera cluster as shown in the following diagram:

Manual

Googling on “install mysql galera cluster” led us to this page. By following the steps explained plus some additional dependencies, the following is what we should run on every DB node:

$ semanage permissive -a mysqld_t
$ systemctl stop firewalld
$ systemctl disable firewalld
$ vim /etc/yum.repos.d/galera.repo # setting up Galera repository
$ yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
$ yum install mysql-wsrep-5.6 galera3 percona-xtrabackup
$ vim /etc/my.cnf # setting up wsrep_* variables
$ systemctl start mysql --wsrep-new-cluster # ‘systemctl start mysql’ on the remaining nodes
$ mysql_secure_installation

The above commands took around 18 minutes to finish on each DB node. Total deployment time was 54 minutes.
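
For reference, the my.cnf step above (“setting up wsrep_* variables”) boils down to something like the excerpt below. This is only a minimal sketch; the cluster name, SST credentials and provider library path are assumptions that need to be adapted to your environment (the IP addresses are the DB nodes used in this example).

# /etc/my.cnf (excerpt) - illustrative wsrep settings
[mysqld]
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera-3/libgalera_smm.so   # library path may differ per distribution
wsrep_cluster_name=my_galera_cluster
wsrep_cluster_address=gcomm://10.0.0.217,10.0.0.218,10.0.0.219
wsrep_node_address=10.0.0.217                         # set per node
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:sstpassword                    # SST user you create yourself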

ClusterControl

Using ClusterControl, here are the steps we took to first install ClusterControl (5 minutes):

$ wget http://severalnines.com/downloads/cmon/install-cc
$ chmod 755 install-cc
$ ./install-cc

Login to the ClusterControl UI and create the default admin user.

Setup passwordless SSH to all DB nodes on ClusterControl node (1 minute):

$ ssh-keygen -t rsa
$ ssh-copy-id 10.0.0.217
$ ssh-copy-id 10.0.0.218
$ ssh-copy-id 10.0.0.219

In the ClusterControl UI, go to Create Database Cluster -> MySQL Galera and enter the following details (4 minutes):

Click Deploy and wait until the deployment finishes. You can monitor the deployment progress under ClusterControl -> Settings -> Cluster Jobs and once deployed, you will notice it took around 15 minutes:

To sum it up, the total deployment time including installing ClusterControl is 15 + 4 + 1 + 5 = 25 minutes.

The following table summarizes the above deployment actions:

Area           Manual                          ClusterControl
Total steps    8 steps x 3 servers + 1 = 25    8
Duration       18 x 3 = 54 minutes             25 minutes

To summarize, we needed fewer steps and less time with ClusterControl to achieve the same result. Three nodes is about the minimum cluster size, and the difference would only get bigger with clusters of more nodes.

Load Balancer and Virtual IP Deployment

Now that we have our Galera cluster running, the next thing is to add a load balancer in front. This provides one single endpoint to the cluster, thus reducing the complexity for applications to connect to a multi-node system. Applications would not need to have knowledge of the topology and any changes caused by failures or admin maintenance would be masked. For fault tolerance, we would need at least 2 load balancers with a virtual IP address.

By adding a load balancer tier, our architecture will look something like this:

Manual Deployment

Googling on “install haproxy virtual ip galera cluster” led us to this page. We followed the steps:

On each HAProxy node (2 times):

$ yum install epel-release
$ yum install haproxy keepalived
$ systemctl enable haproxy
$ systemctl enable keepalived
$ vi /etc/haproxy/haproxy.cfg # configure haproxy
$ systemctl start haproxy
$ vi /etc/keepalived/keepalived.conf # configure keepalived
$ systemctl start keepalived
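
The haproxy.cfg step above is where the Galera backends are defined. As a rough sketch (the listener port, backend names and the 9200 health check port are assumptions that must match the clustercheck/xinetd setup on the DB nodes below), it could look like this:

# /etc/haproxy/haproxy.cfg (excerpt) - example listener, adjust IPs and ports to your setup
listen galera_cluster
    bind *:3307
    mode tcp
    option httpchk                       # HTTP health check served by clustercheck via xinetd
    balance leastconn
    server db1 10.0.0.217:3306 check port 9200
    server db2 10.0.0.218:3306 check port 9200
    server db3 10.0.0.219:3306 check port 9200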

On each DB node (3 times):

$ wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
$ chmod +x clustercheck
$ mv clustercheck /usr/bin/
$ vi /etc/xinetd.d/mysqlchk # configure mysql check user
$ vi /etc/services # setup xinetd port
$ systemctl start xinetd
$ mysql -uroot -p
mysql> GRANT PROCESS ON *.* TO 'clustercheckuser'@'localhost' IDENTIFIED BY 'clustercheckpassword!';
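
The mysqlchk xinetd service referenced above is what exposes the clustercheck script over HTTP for HAProxy. A sketch of that file is shown below; the port and user are assumptions and must match the /etc/services entry and the GRANT above.

# /etc/xinetd.d/mysqlchk (sketch)
service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}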

The total deployment time for this was around 42 minutes.

ClusterControl

For the ClusterControl host, here are the steps taken (1 minute):

$ ssh-copy-id 10.0.0.229
$ ssh-copy-id 10.0.0.230

Then, go to ClusterControl -> select the database cluster -> Add Load Balancer and enter the IP addresses of the HAProxy hosts, one at a time:

Once both HAProxy instances are deployed, we can add Keepalived to provide a floating IP address and perform failover:

Go to ClusterControl -> select the database cluster -> Logs -> Jobs. The total deployment took about 5 minutes, as shown in the screenshot below:

Thus, total deployment for load balancers plus virtual IP address and redundancy is 1 + 5 = 6 minutes.

The following table summarizes the above deployment actions:

Area           Manual                                           ClusterControl
Total steps    (8 x 2 HAProxy nodes) + (8 x 3 DB nodes) = 40    6
Duration       42 minutes                                       6 minutes

ClusterControl also manages and monitors the load balancers:

Adding a Read Replica

Our setup is now looking pretty decent, and the next step is to add a read replica to Galera. What is a read replica, and why do we need it? A read replica is an asynchronous slave, replicating from one of the Galera nodes using standard MySQL replication. There are a few good reasons to have this. Long-running reporting/OLAP type queries on a Galera node might slow down an entire cluster, if the reporting load is so intensive that the node has to spend considerable effort coping with it. So reporting queries can be sent to a standalone server, effectively isolating Galera from the reporting load. An asynchronous slave can also serve as a remote live backup of our cluster in a DR site, especially if the link is not good enough to stretch one cluster across 2 sites.

Our architecture is now looking like this:

Manual Deployment

Googling on “mysql galera with slave” brought us to this page. We followed the steps:

On master node:

$ vim /etc/my.cnf # setting up binary log and gtid
$ systemctl restart mysql
$ mysqldump --single-transaction --skip-add-locks --triggers --routines --events > dump.sql
$ mysql -uroot -p
mysql> GRANT REPLICATION SLAVE ON .. ;

On slave node (we used Percona Server):

$ yum install http://www.percona.com/downloads/percona-release/redhat/0.1-3/percona-release-0.1-3.noarch.rpm
$ yum install Percona-Server-server-56
$ vim /etc/my.cnf # setting up server id, gtid and stuff
$ systemctl start mysql
$ mysql_secure_installation
$ scp root@master:~/dump.sql /root
$ mysql -uroot -p < /root/dump.sql
$ mysql -uroot -p
mysql> CHANGE MASTER ... MASTER_AUTO_POSITION=1;
mysql> START SLAVE;
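
For context, the my.cnf edits in the two blocks above mainly concern binary logging and GTID. A minimal sketch of what they involve (server ids and log file names are placeholders):

# /etc/my.cnf (excerpt) on the Galera node acting as replication master
server_id=101
log_bin=binlog
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=ON

# /etc/my.cnf (excerpt) on the asynchronous slave
server_id=201
relay_log=relay-bin
log_slave_updates=1
gtid_mode=ON
enforce_gtid_consistency=ON
read_only=1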

The total time spent for this manual deployment was around 40 minutes (with a 1GB database).

ClusterControl

With ClusterControl, here is what we should do. Firstly, configure passwordless SSH to the target slave (0.5 minute):

$ ssh-copy-id 10.0.0.231 # setup passwordless ssh

Then, on one of the MySQL Galera nodes, we have to enable binary logging to become a master (2 minutes):

Click Proceed to start enabling binary log for this node. Once completed, we can add the replication slave by going to ClusterControl -> choose the Galera cluster -> Add Replication Slave and specify as per below (6 minutes including streaming 1GB of database to slave):

Click “Add node” and you are set. Total deployment time for adding a read replica complete with data is 6 + 2 + 0.5 = 8.5 minutes.

The following table summarizes the above deployment actions:

Area           Manual        ClusterControl
Total steps    15            3
Duration       40 minutes    8.5 minutes

We can see that ClusterControl automates a number of time-consuming tasks, including slave installation, backup streaming and setting up replication from the master. Note that ClusterControl will also handle master failover, so that replication does not break if the Galera master fails.

Conclusion

A good deployment is important, as it is the foundation of an upcoming database workload. Speed matters too, especially in agile environments where a team frequently deploys entire systems and tears them down after a short time. You’re welcome to try ClusterControl to automate your database deployments; it comes with a free 30-day trial of the full enterprise features. Once the trial ends, it will default to the community edition (free forever).

ClusterControl Tips & Tricks - Custom graphs to monitor your MySQL, MariaDB, MongoDB and PostgreSQL systems

Graphs are important, as they’re your window onto your monitored systems. ClusterControl comes with a predefined set of graphs for you to analyze; these are built on top of the metric sampling done by the controller. They are designed to give you, at first glance, as much information as possible about the state of your database cluster. You might have your own set of metrics you’d like to monitor though. Therefore, ClusterControl allows you to customize the graphs available in the cluster overview section and in the Nodes -> DB Performance tab. Multiple metrics can be overlaid on the same graph.

Overview tab

Let’s take a look at the cluster overview - it shows the most important information aggregated under different tabs.

There you can see graphs like “Cluster Load” and “Galera - Flow Ctrl”, along with a couple of others. If this is not enough for you, you can click on “Dash Settings” and then pick the “Create Board” option. This is also where you can later manage existing graphs - you can edit a graph by double-clicking on it, or delete it from the tab list.

When you decide to create a new graph, you’ll be presented with an option to pick metrics that you’d like to monitor. Let’s assume we are interested in monitoring temporary objects - tables, files and tables on disk. We just need to pick all three metrics we want to follow and add them to our new graph.

Next, pick a name for the new graph and choose a scale. Most of the time you want the scale to be linear, but in some rare cases, such as when you mix metrics containing both large and small values, you may want to use a logarithmic scale instead.

Finally, you can choose whether your template should be presented as the default graph. If you tick this option, this is the graph you will see by default when you enter the “Overview” tab.

Once we save the new graph, you can enjoy the result:

DB Performance tab

When you take a look at a node and then follow into the DB Performance tab, you’ll be presented, by default, with eight different MySQL metrics. You can change them or add new ones. To do that, you need to use the “Choose Graph” button:

You’ll be presented with a new window that allows you to configure the layout and the metrics graphed.

Here you can pick the layout - two or three columns of graphs - and the number of graphs, up to 20. Then, you may want to modify which metrics are plotted - use the drop-down dialog boxes to pick whichever metric you’d like to add. Once you are ready, save the graphs and enjoy your new metrics.

Database Cluster Management - Manual vs Automation via ClusterControl

In a previous blog, we looked at the efficiency gains when automating the deployment of a Galera cluster for MySQL. In this post, we are going to dive into cluster management. Should we manage our cluster manually, or does it make sense to automate the procedures? What do we gain with automation?

Cluster management involves a number of tasks:

  • Node management:
    • Restart service (per node or rolling restart)
    • Bootstrap cluster
    • Stop cluster
    • Reboot server
    • Rebuild node
    • Recover cluster/node
    • Find most advanced node (i.e. node with latest data)
  • Configuration management:
    • Centrally manage configuration files
    • Change configurations
    • Rolling restarts
  • Upgrades:
    • Rolling upgrades
  • Backup management:
    • Create backup (physical or logical, full or incremental)
    • Schedule backup
    • Restore backup
  • Schema and users management:
    • Create schema
    • Manage user privileges
  • Private keys and certificates management:
    • Manage private keys and certificates
    • Generate new private keys and certificates
    • Manage CA bundles
    • Import existing private keys and certificates

Let’s take a look at the different steps involved in some of the above, and see if it would make sense to automate them.

Node Management

Restarting the MySQL service is simple in a standalone setup, but in a Galera cluster, performing a rolling restart can be tricky. You have to ensure each node reaches the Primary state before proceeding to the next node. On some occasions, a node restart could bring chaos, as described in detail in this blog post.

A rolling restart is one of the common tasks that you need to do properly. In an environment without a cluster manager, you would:

  1. Login to a DB node
  2. Restart MySQL server
  3. Verify the node state using SHOW STATUS LIKE 'wsrep_cluster_status' and ensure it reaches Primary
  4. Repeat steps #1 to #3 for every node in the cluster (a rough shell sketch of this loop follows below)
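
As an illustration of what that loop looks like when scripted by hand (assuming passwordless SSH and MySQL credentials in ~/.my.cnf on each node; the host names are placeholders):

#!/bin/bash
# Manual rolling restart sketch - restart one node and wait for it to rejoin the Primary component
for node in db1 db2 db3; do
    echo "Restarting MySQL on ${node}"
    ssh "${node}" "systemctl restart mysql"
    until [ "$(ssh "${node}" "mysql -NBe \"SHOW STATUS LIKE 'wsrep_cluster_status'\"" | awk '{print $2}')" = "Primary" ]; do
        sleep 5
    done
done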

In ClusterControl, you can achieve this in two ways: using the “Rolling Restart” function, or restarting the nodes one at a time under the Nodes tab. To perform a rolling restart, go to ClusterControl -> Manage -> Upgrades -> Rolling Restart -> Proceed.

You can then monitor the restart progress under ClusterControl -> Logs -> Jobs, similar to the following screenshot:

Although the duration of the rolling restart is not much different, one would have to be on standby to supervise the whole manual procedure, versus triggering a job through ClusterControl and getting notified when the procedure has completed.

Configuration Management

Tuning the database configuration is a continuous process. There are plenty of configuration variables available, for instance in MySQL, and it is not trivial to remember which ones are dynamic variables (that you can change at runtime without a restart) and which are non-dynamic (where a MySQL restart is required). Knowing this in advance is important, so that we are aware of the next step to take to get our new configuration loaded correctly.
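
For example, a dynamic variable can be changed on the fly from the MySQL client, while a non-dynamic one (such as innodb_buffer_pool_size on the MySQL 5.6 series this post assumes) is rejected at runtime and only takes effect after editing my.cnf and restarting:

mysql> SET GLOBAL max_connections = 500;             -- dynamic, takes effect immediately
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL innodb_buffer_pool_size = 2147483648;
ERROR 1238 (HY000): Variable 'innodb_buffer_pool_size' is a read-only variable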

Let’s say we want to adjust the innodb_buffer_pool_size and change log_error (both are non-dynamic variables) in a three-node Galera cluster:

  1. Login to first DB node via SSH
  2. Make configuration changes inside my.cnf
  3. Restart the node
  4. Verify the node state using SHOW STATUS LIKE 'wsrep_cluster_status' and ensure it reaches Primary
  5. Repeat step #1 to #4 for each node in the cluster

ClusterControl provides a centralised interface from which to manage your configurations. It knows which variable changes require a restart and which do not. We covered this in detail in this blog post. Let’s say you want to change the same configuration variables as described above; here is what you would do via the ClusterControl UI:

  1. Go to ClusterControl -> Manage -> Configuration -> Change Parameters.

  2. Then specify the variable that we want to change:

    ClusterControl will then perform the configuration change and advise the next step:

  3. From the log window above, we can see that a rolling restart is needed for the change to take effect. Perform the rolling restart as described in the first section, Node Management.

Backup Management

Backups are critical when managing a database cluster. For MySQL, the two main methods are mysqldump and xtrabackup. There are a number of commands and options depending on the chosen method, whether you want to perform a local or remote backup, compressed or plain output, per database or all databases. Once you have figured out the ideal command, you can schedule it using cron. The main steps to schedule a backup are:

  1. Create the backup destination path.
  2. Experiment with backup commands (this is the most time consuming part).
  3. Create a cron job for the backup.
  4. Run the housekeeping command to test that it works.
  5. Create a cron job for backup housekeeping (a sketch of such cron entries follows below).
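
As an illustration of steps 3 to 5, a nightly xtrabackup schedule with simple housekeeping could end up looking like the sketch below; the paths, times and 7-day retention are assumptions, and MySQL credentials are expected in /root/.my.cnf.

# /etc/cron.d/mysql-backup (sketch)
# Nightly full backup at 02:00, keep 7 days of backups
0 2 * * * root /usr/bin/innobackupex --no-timestamp /backups/$(date +\%F) >> /var/log/mysql-backup.log 2>&1
0 5 * * * root find /backups -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;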

Alternatively, it is possible to schedule backups via ClusterControl. Backup files can be stored on the DB node where the backup is taken, or it can be streamed to the ClusterControl node so it does not take extra space on the database server. Other interesting options are parallel copy threads, throttle rate, network streaming throttle rate, as well as desync of the target node when the backup is running. Older backups are also automatically purged based on a configurable backup rotation interval.

Creating a backup schedule via ClusterControl:

All created backups are available under ClusterControl -> Backups -> Reports:

Backup sets include a full backup and its related incremental backups; restoration of a backup set can also be performed from ClusterControl.

Finally, if the scheduled Galera node is down for maintenance or unavailable for some reason, it is possible to configure ClusterControl so that it schedules the backup on the next available node.

MySQL User Management

A common misconception when adding users to a Galera cluster is that they can be created using an “INSERT INTO mysql.user” statement, which is plain wrong. The suggested way is to use DDL statements such as GRANT or CREATE USER, since the mysql.user table is a MyISAM table (Galera replicates DDL statements regardless of the storage engine).

Creating a MySQL user can be a simple or tricky task depending on your authentication, authorization and accounting policy. Modifying a user’s complex privileges usually requires a long GRANT statement, which can be error-prone. For example, one would do:

$ mysql -uroot -p
mysql> CREATE USER 'userx'@'%' IDENTIFIED BY 'mypassword';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,INDEX,CREATE TEMPORARY TABLES,SHOW VIEW,ALTER ROUTINE,CREATE VIEW,EVENT ON app1.* TO 'userx'@'%' WITH MAX_QUERIES_PER_HOUR 50 MAX_CONNECTIONS_PER_HOUR 5 MAX_USER_CONNECTIONS 5;

With ClusterControl, the above CREATE and GRANT statements can be done through a wizard:

ClusterControl also provides an overview of inactive users, where it detects accounts that have not been used since the last server restart. This is useful for spotting unnecessary accounts, for instance ones an admin created to minimize client authentication problems.

BONUS: Private Keys and Certificates Management

In this last chapter, we are going to touch upon security and encryption for Galera Cluster. ClusterControl supports enabling SSL encryption (client-server encryption) as well as encryption of replication traffic between the Galera nodes. This is especially important when deploying on public clouds, or across different data centers.

Enabling full SSL encryption (client-server plus Galera replication) requires many steps, as we’ve shown in this blog post. Starting with ClusterControl v1.3.x, you can perform these tasks in fewer than 5 clicks:

  1. Click on Enable SSL Encryption and choose “Create Certificate”:

  2. Then, enable the Galera replication encryption:

  3. Once enabled, go to Settings -> Key Management tab and you can manage the generated private keys and certificates there, similar to the screenshot below:

That’s it. Setting up a cluster with full SSL encryption is no longer a difficult job with ClusterControl. The above is a subset of the management functionality that can be automated via ClusterControl; you are very welcome to give it a try and let us know what you think.

ClusterControl Developer Studio: Custom database alerts by combining metrics

In the previous blog posts, we gave a brief introduction to the ClusterControl Developer Studio and the ClusterControl Domain Specific Language. We covered some useful examples, e.g., how to extract information from the Performance Schema, how to automatically have advisors scale your database clusters and how to create an advisor that keeps an eye on the MongoDB replication lag. ClusterControl Developer Studio is free to use, and included in the community version of ClusterControl. It allows you to write your own scriptlets, advisors and alerts. With just a few lines of code, you can already automate your clusters. All advisors are open source on Github, so anyone can contribute back to the community!

In this blog post, we will make things a little bit more complex than in our previous posts. Today we will be using our MongoDB replication window advisor, which has recently been added to the Advisors Github repository. Our advisor will not only check the length of the replication window, but also calculate the lag of the secondaries and warn us if a node is at risk. For extra complexity, we will make this advisor compatible with a sharded MongoDB environment, and take a couple of edge cases into account to prevent false positives.

MongoDB Replication window

The MongoDB replication window advisor complements the MongoDB lag advisor. The lag advisor informs us of the number of seconds a secondary node is behind the primary/master. As the oplog is limited in size, having slave lag imposes the following risks:

  1. If a node lags too far behind, it may not be able to catch up anymore as the transactions necessary are no longer in the oplog of the primary.
  2. A lagging secondary node is less favoured in a MongoDB election for a new primary. If all secondaries are lagging behind in replication, you will have a problem, and the one with the least lag will be made primary.
  3. Secondaries lagging behind are less favoured by the MongoDB driver when scaling out reads with MongoDB, which also adds a higher workload on the remaining secondaries.

If we have a secondary node lagging behind by a few minutes (or hours), it would be useful to have an advisor that informs us how much time we have left before our next transaction will be dropped from the oplog. The time difference between the first and last entry in the oplog is called the Replication Window. This metric can be created by fetching the first and last items from the oplog, and calculating the difference of their timestamps.

Calculating the MongoDB Replication Window

In the MongoDB shell, there is already a function available that calculates the replication window for you. However, this function is built into the command line shell only, so any connection made outside of it will not have access to it:

mongo_replica_2:PRIMARY> db.getReplicationInfo()
{
           "logSizeMB" : 1894.7306632995605,
           "usedMB" : 501.17,
           "timeDiff" : 91223,
           "timeDiffHours" : 25.34,
           "tFirst" : "Wed Oct 12 2016 22:48:59 GMT+0000 (UTC)",
           "tLast" : "Fri Oct 14 2016 00:09:22 GMT+0000 (UTC)",
           "now" : "Fri Oct 14 2016 12:32:51 GMT+0000 (UTC)"
}

As you can see, this function has a rich output of useful information: the size of the oplog and how much of that has been used already. It also displays the time difference of the first and last items in the oplog.

It is easy to replicate this function by retrieving the first and last items from the oplog. Making use of the MongoDB aggregate function in a single query is tempting; however, the oplog does not have indexes on any of its fields. Running an aggregation on a collection without indexes would require a full collection scan, which would become very slow on an oplog with a couple of million entries.

Instead, we are going to send two individual queries: fetching the first record of the oplog in forward order and in reverse order. As the oplog is already a sorted collection, sorting it naturally (or in reverse) is cheap.

mongo_replica_2:PRIMARY> use local
switched to db local
mongo_replica_2:PRIMARY> db.oplog.rs.find().limit(1);
{ "ts" : Timestamp(1476312539, 1), "h" : NumberLong("-3302015507277893447"), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
mongo_replica_2:PRIMARY> db.oplog.rs.find().sort({$natural: -1}).limit(1);
{ "ts" : Timestamp(1476403762, 1), "h" : NumberLong("3526317830277016106"), "v" : 2, "op" : "n",  "ns" : "ycsb.usertable", "o" : { "_id" : "user5864876345352853020",
…
}

The overhead of both queries is very low and will not interfere with the functioning of the oplog.

In the example above, the replication window would be 91223 seconds (the difference of 1476403762 and 1476312539).

Intuitively you may think it only makes sense to do this calculation on the primary node, as this is the source of all write operations. However, MongoDB is a bit smarter than just serving out the oplog to all secondaries. Even though secondary nodes copy oplog entries from the primary, MongoDB will, where possible, offload the catching up of joining members to the secondaries. Secondary nodes may also prefer to fetch oplog entries from other secondaries with low latency, rather than from a primary with high latency. So it is better to perform this calculation on all nodes in the cluster.

As the replication window will be calculated per node, and we like to keep our advisor as readable as possible, we will abstract the calculation into a function:

function getReplicationWindow(host) {
  var replwindow = {};
  // Fetch the first and last record from the Oplog and take its timestamp
  var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
  replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: -1}, limit: 1}');
  replwindow['last'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  replwindow['replwindow'] = replwindow['last'] - replwindow['first'];
  return replwindow;
}

The function returns the timestamp of the first and last items in the oplog and the replication window as well. We will actually need all three in a later phase.

Calculating the MongoDB Replication lag

We covered the calculation of MongoDB replication lag in the previous Developer Studio blog post. You can calculate the lag by simply subtracting the secondary optimeDate (or optime timestamp) from the primary optimeDate. This will give you the replication lag in seconds.
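
In Developer Studio terms, and using the fields gathered by the function below, this boils down to a one-liner (the variable names here are illustrative, not part of the final advisor):

// Sketch: replication lag in seconds of one secondary against the primary
lag = node_status[primary_id]["optime"] - node_status[secondary_id]["optime"];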

The information necessary for the replication lag will be retrieved from only the primary node, using the replSetGetStatus command. Similarly to the replication window, we will abstract this into a function:

function getReplicationStatus(host, primary_id) {
    var node_status = {};
    var res = host.executeMongoQuery("admin", "{ replSetGetStatus: 1 }");
    // Fetch the optime and uptime per host
    for(i = 0; i < res["result"]["members"].size(); i++)
    {
        tmp = res["result"]["members"][i];
        host_id = tmp["name"];
        node_status[host_id] = {};
        node_status[host_id]["name"] = host_id;
        node_status[host_id]["primary"] = primary_id;
        node_status[host_id]["setname"] = res["result"]["set"];
        node_status[host_id]["uptime"] = tmp["uptime"];
        node_status[host_id]["optime"] = tmp["optime"]["ts"]["$timestamp"]["t"];
    }
    return node_status;
}

We keep a little bit more information than necessary here, but you will see why in the next paragraphs.

Calculating the time left per node

Now that we have calculated both the replication window and the replication lag per node, we can calculate the time left per node in which it can theoretically still catch up with the primary. Here we subtract the timestamp of the first entry in the oplog (of the primary) from the timestamp of the last executed transaction (optime) of the node.

// Calculate the replication window of the primary against the node's last transaction
replwindow_node = replstatus[host]['optime'] - replwindow_primary['first'];

But we are not done yet!

Adding another layer of complexity: sharding

In ClusterControl we see everything as a cluster. Whether you would run a single MongoDB server, a replicaSet or a sharded environment with config servers and shard replicaSets: they are all identified and administered as a single cluster. This means our advisor has to cope with both replicaSet and shard logic.

In the example code above, where we calculated the time left per node, our cluster had a single primary. In the case of a sharded cluster, we have to take into account that there is one replicaSet for the config servers and one for each shard. This means we have to store, for each node, its primary node and use that one in the calculation.

The corrected line of code would be:

host_id = host.hostName() + ":" + host.port();
primary_id = replstatus_per_node[host_id]['primary'];
...
// Calculate the replication window of the primary against the node's last transaction
replwindow_node = replstatus['optime'] - replwindow_per_node[primary_id]['first'];

For readability, we also need to include the replicaSet name in the messages our advisor outputs. If we did not do this, it would become quite hard to distinguish hosts, replicaSets and shards in large sharded environments.

msg = "The replication window for node " + host_id + " (" + replstatus["setname"] + ") is long enough.";

Take the uptime into account

Another possible impediment with our advisor is that it will start warning us immediately on a freshly deployed cluster:

  1. The first and last entry in the oplog will be within seconds of each other after initiating the replicaSet.
  2. Also on a newly created replicaSet, the probability of it being immediately used would be very low, so there is not much use in alerting on a (too) short replication window in this case either.
  3. At the same time, a newly created replicaSet may also receive a huge write workload when a logical backup is restored, compressing all entries in the oplog into a very short timeframe. This especially becomes an issue if, after this burst of writes, no more writes happen for a long time, as the replication window then becomes very short and also outdated.

There are three possible solutions for these problems to make our advisor more reliable:

  1. We parse the first item in the oplog. If it contains the replicaset initiating document we will ignore a too short replication window. This can easily be done alongside parsing the first record in the oplog.
  2. We also take the uptime into consideration. If the host’s uptime is shorter than our warning threshold, we will ignore a too short replication window.
  3. If the replication window is too short, but the newest item in the oplog is older than our warning threshold, this should be considered a false positive.

To parse the first item in the oplog we have to add these lines, where we identify it as a new set:

var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];

if (res["result"]["cursor"]["firstBatch"][0]["o"]["msg"] == "initiating set") {
    replwindow['newset'] = true;
}

Then later in the check for our replication window we can verify all three exceptions together:

if(replwindow['newset'] == true) {
    msg = "Host " + host_id + " (" + replstatus["setname"] + ") is a new replicaSet. Not enough entries in the oplog to determine the replication window.";
    advice.setSeverity(Ok);
    advice.setJustification("");
} else if (replstatus["uptime"] < WARNING_REPL_WINDOW) {
    msg = "Host " + host_id + " (" + replstatus["setname"] + ") only has an uptime of " + replstatus["uptime"] + " seconds. Too early to determine the replication window.";
    advice.setSeverity(Ok);
    advice.setJustification("");
}
else if (replwindow['last'] < (CmonDateTime::currentDateTime().toString("%s") - WARNING_REPL_WINDOW)) {
    msg = "Latest entry in the oplog for host " + host_id + " (" + replstatus["setname"] + ") is older than " + WARNING_REPL_WINDOW + " seconds. Determining the replication window would be unreliable.";
    advice.setSeverity(Ok);
    advice.setJustification("");
}

After this we can finally warn / advise on the replication window of the node. The output would look similar to this:

Conclusion

That’s it! We have created an advisor that gathers information from various hosts / replicaSets and combines this to get an exact view of the replication status of the entire cluster. We have also shown how to make things more readable by abstracting code into functions. And the most important aspect is that we have shown how to also think bigger than just the standard replicaSet.

For completeness, here is the full advisor:

#include "common/helpers.js"
#include "cmon/io.h"
#include "cmon/alarms.h"

// It is advised to have a replication window of at least 24 hours, critical is 1 hour
var WARNING_REPL_WINDOW = 24*60*60;
var CRITICAL_REPL_WINDOW = 60*60;
var TITLE="Replication window";
var ADVICE_WARNING="Replication window too short. ";
var ADVICE_CRITICAL="Replication window too short for one hour of downtime / maintenance. ";
var ADVICE_OK="The replication window is long enough.";
var JUSTIFICATION_WARNING="It is advised to have a MongoDB replication window of at least 24 hours. You could try to increase the oplog size. See also: https://docs.mongodb.com/manual/tutorial/change-oplog-size/";
var JUSTIFICATION_CRITICAL=JUSTIFICATION_WARNING;


function main(hostAndPort) {

    if (hostAndPort == #N/A)
        hostAndPort = "*";

    var hosts   = cluster::mongoNodes();
    var advisorMap = {};
    var result= [];
    var k = 0;
    var advice = new CmonAdvice();
    var msg = "";
    var replwindow_per_node = {};
    var replstatus_per_node = {};
    var replwindow = {};
    var replstatus = {};
    var replwindow_node = 0;
    var host_id = "";
    for (i = 0; i < hosts.size(); i++)
    {
        // Find the primary and execute the queries there
        host = hosts[i];
        host_id = host.hostName() + ":" + host.port();

        if (host.role() == "shardsvr" || host.role() == "configsvr") {
            // Get the replication window of each nodes in the cluster, and store it for later use
            replwindow_per_node[host_id] = getReplicationWindow(host);

            // Only retrieve the replication status from the master
            res = host.executeMongoQuery("admin", "{isMaster: 1}");
            if (res["result"]["ismaster"] == true) {
                //Store the result temporary and then merge with the replication status per node
                var tmp = getReplicationStatus(host, host_id);
                for(o=0; o < tmp.size(); o++) {
                    replstatus_per_node[tmp[o]['name']] = tmp[o];
                }

                //replstatus_per_node = 
            }
        }
    }

    for (i = 0; i < hosts.size(); i++)
    {
        host = hosts[i];
        if (host.role() == "shardsvr" || host.role() == "configsvr") {
            msg = ADVICE_OK;

            host_id = host.hostName() + ":" + host.port();
            primary_id = replstatus_per_node[host_id]['primary'];
            replwindow = replwindow_per_node[host_id];
            replstatus = replstatus_per_node[host_id];
    
            // Calculate the replication window of the primary against the node's last transaction
            replwindow_node = replstatus['optime'] - replwindow_per_node[primary_id]['first'];
            // First check uptime. If the node is up less than our replication window it is probably no use warning
            if(replwindow['newset'] == true) {
              msg = "Host " + host_id + " (" + replstatus["setname"] + ") is a new replicaSet. Not enough entries in the oplog to determine the replication window.";
              advice.setSeverity(Ok);
              advice.setJustification("");
            } else if (replstatus["uptime"] < WARNING_REPL_WINDOW) {
                msg = "Host " + host_id + " (" + replstatus["setname"] + ") only has an uptime of " + replstatus["uptime"] + " seconds. Too early to determine the replication window.";
                advice.setSeverity(Ok);
                advice.setJustification("");
            }
            else if (replwindow['last'] < (CmonDateTime::currentDateTime().toString("%s") - WARNING_REPL_WINDOW)) {
              msg = "Latest entry in the oplog for host " + host_id + " (" + replstatus["setname"] + ") is older than " + WARNING_REPL_WINDOW + " seconds. Determining the replication window would be unreliable.";
              advice.setSeverity(Ok);
              advice.setJustification("");
            }
            else {
                // Check if any of the hosts is within the oplog window
                if(replwindow_node < CRITICAL_REPL_WINDOW) {
                    advice.setSeverity(Critical);
                    msg = ADVICE_CRITICAL + "Host " + host_id + " (" + replstatus["setname"] + ") has a replication window of " + replwindow_node + " seconds.";
                    advice.setJustification(JUSTIFICATION_CRITICAL);
                } else {
                    if(replwindow_node < WARNING_REPL_WINDOW)
                    {
                        advice.setSeverity(Warning);
                        msg = ADVICE_WARNING + "Host " + host_id + " (" + replstatus["setname"] + ") has a replication window of " + replwindow_node + " seconds.";
                        advice.setJustification(JUSTIFICATION_WARNING);
                    } else {
                        msg = "The replication window for node " + host_id + " (" + replstatus["setname"] + ") is long enough.";
                        advice.setSeverity(Ok);
                        advice.setJustification("");
                    }
                }
            }
    
            advice.setHost(host);
            advice.setTitle(TITLE);
            advice.setAdvice(msg);
            advisorMap[i]= advice;
        }
    }
    return advisorMap;
}

function getReplicationStatus(host, primary_id) {
    var node_status = {};
    var res = host.executeMongoQuery("admin", "{ replSetGetStatus: 1 }");
    // Fetch the optime and uptime per host
    for(i = 0; i < res["result"]["members"].size(); i++)
    {
        tmp = res["result"]["members"][i];
        node_status[i] = {};
        node_status[i]["name"] = tmp["name"];;
        node_status[i]["primary"] = primary_id;
        node_status[i]["setname"] = res["result"]["set"];
        node_status[i]["uptime"] = tmp["uptime"];
        node_status[i]["optime"] = tmp["optime"]["ts"]["$timestamp"]["t"];
    }
    return node_status;
}

function getReplicationWindow(host) {
  var replwindow = {};
  replwindow['newset'] = false;
  // Fetch the first and last record from the Oplog and take it's timestamp
  var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
  replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  if (res["result"]["cursor"]["firstBatch"][0]["o"]["msg"] == "initiating set") {
      replwindow['newset'] = true;
  }
  res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: -1}, limit: 1}');
  replwindow['last'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  replwindow['replwindow'] = replwindow['last'] - replwindow['first'];
  return replwindow;
}

High Availability on a Shoestring Budget - Deploying a Minimal Two Node MySQL Galera Cluster

We regularly get questions about how to set up a Galera cluster with just 2 nodes. The documentation clearly states you should have at least 3 Galera nodes to avoid network partitioning. But there are some valid reasons for considering a 2-node deployment, e.g., if you want to achieve database high availability but have a limited budget to spend on a third database node. Or perhaps you are running Galera in a development/sandbox environment and prefer a minimal setup.

Galera implements a quorum-based algorithm to select a primary component through which it enforces consistency. The primary component needs to have a majority of votes, so in a 2-node system, a node failure or network partition would leave no majority, resulting in split brain. Fortunately, it is possible to add garbd (the Galera Arbitrator Daemon), a lightweight stateless daemon that can act as the odd node. Arbitrator failure does not affect cluster operations, and a new instance can be reattached to the cluster at any time. There can be several arbitrators in the cluster.

ClusterControl has support for deploying garbd on non-database hosts.
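
For reference, outside of ClusterControl you could also start the arbitrator by hand along these lines; the addresses and group name below are placeholders and must match the cluster's wsrep_cluster_address and wsrep_cluster_name:

$ garbd --address "gcomm://10.0.0.10:4567,10.0.0.11:4567" --group my_galera_cluster --daemon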

Normally a Galera cluster needs at least three hosts to be fully functional; however, at deploy time, two nodes suffice to create a primary component. Here are the steps:

  1. Deploy a Galera cluster of two nodes,
  2. After the cluster has been deployed by ClusterControl, add garbd on the ClusterControl node.

You should end up with the below setup:

Deploy the Galera Cluster

Go to the ClusterControl deploy wizard to deploy the cluster.

Even though ClusterControl warns you a Galera cluster needs an odd number of nodes, only add two nodes to the cluster.

Deploying a Galera cluster will trigger a ClusterControl job which can be monitored at the Jobs page.

Install Garbd

Once deployment is complete, install garbd on the ClusterControl host. You will find it under Manage -> Load Balancer:

Installing garbd will trigger a ClusterControl job which can be monitored at the Jobs page. Once completed, you can verify garbd is running with a green tick icon at the top bar:

That’s it. Our minimal two-node Galera cluster is now ready!

We’re keeping the tills ringing at eCommerce platform vidaXL

ClusterControl helps vidaXL compete with the world's largest e-commerce platforms by managing its MongoDB & MySQL databases.

Press Release: everywhere around the world, November 9th 2016 - today we announced vidaXL, an international eCommerce platform where you can “live it up for less”, as our latest customer. ClusterControl was deployed to help manage vidaXL’s polyglot database architecture, which consists of SQL and NoSQL database solutions to handle specific tasks within the enterprise.

vidaXL caters to product hunters, offering items for inside and outside the home at competitive prices. With a catalogue of currently over 20,000 products to choose from and selling directly in 29 countries, it has a huge task of managing and updating the database its consumers rely on to fulfil their orders. With 200,000 orders monthly, vidaXL is one of the largest international e-retailers.

The eCommerce company is growing and aims to expand its product catalogue to over 10,000,000 items within the next 12 months. This extremely large selection of goods creates a wealth of new data; images alone in the catalogue account for roughly 100 terabytes of data, and the product rows for between one and two terabytes. The increase in data originally required vidaXL to hire more database administrators (DBAs), but it searched for a cost-effective solution.

ClusterControl was deployed to manage the database systems. As scaling was an issue for vidaXL, particularly the horizontal scaling of its servers, ClusterControl as a single platform replaced the need for a combination of tools and the sometimes unreliable command line control. The ClusterControl deployment took around one week to implement, with no extra support required from Severalnines.

ClusterControl is easily integrated within a polyglot framework, managing different databases with the same efficiency. vidaXL is using several different databases, MongoDB and MySQL for product and customer listings, along with ElasticSearch, for its real-time search capabilities; ClusterControl was plugged in to automate management and give control over scaling of MongoDB and MySQL. The operations team also leveraged it for proactive reporting.

Zeger Knops, Head of Business Technology, vidaXL said, “We’re looking to grow exponentially in the near future with the products we offer and maintain our position as the world’s largest eCommerce operator. This means we cannot suffer any online outages which lead to a loss of revenue. Scaling from thousands to millions of products is a giant leap and that will require us to have a strong infrastructure foundation. Our back-end is reliant on different databases to tackle different tasks. Using several different tools, rather than a one-stop shop, was detrimental to our productivity. Severalnines is that “shop” and we haven’t looked back. It’s an awesome solution like no other.”

Vinay Joosery, Severalnines CEO, added, “As we head towards the busy end of the year for retailers with Cyber Monday just around the corner, a product catalogue of VidaXL’s size requires strong database management skills and technologies. Keeping operations online and supplying people with their required orders is key. We trust that VidaXL will continue to reap the benefits of ClusterControl as it grows.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skill levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular ClusterControl product, and currently counts BT, Orange, Cisco, CNRS, Technicolor, AVG, Ping Identity and Paytrail as customers. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today, visit http://www.severalnines.com/company.

Tips and Tricks: Receive email notifications from ClusterControl

As sysadmins and DBAs, we need to be notified whenever something critical happens to our database. But would it not be nicer if we were informed upfront and still had time to perform pre-emptive maintenance and retain high availability? Being informed about anomalies or anything that may degrade cluster health and performance is key. In this tips and tricks post, we will explain how you can set up email notifications in ClusterControl and stay up to date with your cluster state.

Email notification types in ClusterControl

First we will explain the two types of email notifications that ClusterControl can send. The normal notifications will be sent instantly, once an alert is triggered or an important event occurs. This instant mail type (deliver) is necessary if you wish to immediately receive critical or warning notifications that require swift action.

The other type is called digest, where ClusterControl accumulates all notifications and sends them once a day in a single email at a preset time. Informational and warning notifications that do not need immediate action are best sent via the digest email.

Then there is a third option: not to send a notification and ignore the message. This, obviously, should only be configured if you are absolutely certain you don’t wish to receive this type of notification.

Setting up email notifications per user

There are two methods for setting up email notifications in ClusterControl. The first one, described below, sets the email notifications on a user level. Go to Settings > Email Notifications.

Here you can select an existing user and load their current settings. You can change the time at which digest emails are sent and, to prevent ClusterControl from sending too many emails, the limit for non-digest emails. Be careful: if you set this too low, you will no longer receive notifications for the remainder of the day! Setting it to -1 means unlimited. Per alarm/event category, the email notifications can be set to the notification type necessary.

Keep in mind that this setting is on a global level, so this accounts for all clusters.

Setting up email notifications per cluster

On the cluster level, the notifications can be set for both users and additional email addresses. This interface can be found via Cluster > Settings > General Settings > Email Notifications.

Here you can select an existing user/email address and load its current settings. You can change the time at which digest emails are sent and, to prevent ClusterControl from sending too many emails, the limit for non-digest emails. Again, if you set this too low, you will no longer receive notifications for the remainder of the day! Setting it to -1 means unlimited. Per alarm/event category, the email notifications can be set to the notification type necessary.

Keep in mind these settings are on a cluster-specific level, so this only changes settings for the selected cluster.

Adding and removing email addresses

Apart from defining the email notification settings, you can also add new email addresses by clicking on the plus (+) button. This can be handy if you wish to send notifications to, for example, a distribution list inside your company.

Removing an email address can be done by selecting the email address that needs removal and clicking the minus (-) button.

Configuring the mail server

To be able to send email, you need to tell ClusterControl how to send emails. There are two options: via sendmail or via an SMTP server.

When you make use of sendmail, the server where you have installed ClusterControl should have a local command line mail client installed. ClusterControl will send its email using the -r option to set the from-address. As sendmail may not deliver your email reliably, the recommended method of sending email is via SMTP.
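
On RHEL/CentOS based hosts, the mail command typically comes from the mailx package, so a quick sanity check could look like the following (a hedged example; package names differ per distribution):

$ yum install mailx
$ which mail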

If you decide to use an SMTP server instead, you may need to authenticate against this server. Check with your hosting provider if this is required.

Once set in the first cluster, the mail server settings will be carried over to any new cluster created.

Sending a test email

In the Configure Mail Server interface, you can also send a test email. This will create a backend job that sends an email to all configured recipients for this cluster under Email Notification Settings.

Troubleshooting

If your test email is not arriving and you have configured your mail server settings to use sendmail, you can check how it works from the ClusterControl host.

CMON log files

You can check your CMON log files to see whether the email has been sent.

In /var/log/cmon_<clusterid>.log, you should see something similar to this:

2016-12-09 12:44:11 : (INFO) Executing email job.

If you see a log line like this, you may want to increase the daily message limit:

2016-12-09 12:44:47 : (WARNING) Refusing to send more than 10 messages daily to 'mailto://you@yourcompany.com'

As said earlier: if the message limit has been reached, you will no longer receive notifications.

A message about the -r option indicates your mail client does not support the from-header:

2016-12-09 12:44:17 : (WARNING) mail command doesn't support -r SENDER argument, retrying without that.

You can follow this support article to learn which packages to install.

Sendmail log files

You can also check the local sendmail log files (/var/log/maillog) and see if your email gets delivered. A typical sendmail connection flow looks like the following:

Dec  9 17:36:41 localhost sendmail[24529]: uB9HafLM024529: from=clustercontrol@yourcompany.com, size=326, class=0, nrcpts=1, msgid=<584aeba9.9LBxfOatDgnTC+vm%clustercontrol@yourcompany.com>, relay=root@localhost
Dec  9 17:36:41 localhost postfix/smtpd[24530]: connect from n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/smtpd[24530]: 2C0AF4094CF9: client=n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/cleanup[24533]: 2C0AF4094CF9: message-id=<584aeba9.9LBxfOatDgnTC+vm%clustercontrol@yourcompany.com>
Dec  9 17:36:41 localhost sendmail[24529]: uB9HafLM024529: to=you@yourcompany.com, ctladdr=clustercontrol@yourcompany.com (0/0), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30326, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (Ok: queued as 2C0AF4094CF9)
Dec  9 17:36:41 localhost postfix/qmgr[1256]: 2C0AF4094CF9: from=<clustercontrol@yourcompany.com>, size=669, nrcpt=1 (queue active)
Dec  9 17:36:41 localhost postfix/smtpd[24530]: disconnect from n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/smtp[24534]: 2C0AF4094CF9: to=<you@yourcompany.com>, relay=mail.yourcompany.com[94.142.240.10]:25, delay=0.38, delays=0.05/0.02/0.08/0.24, dsn=2.0.0, status=sent (250 OK id=1cFP69-0002Ns-Db)

If these entries cannot be found in the log file, you can increase Sendmail's log level.

Command line email

A final check would be to run the mail command and see if that arrives:

echo "test message" | mail -r youremail@yourcompany.com -s "test subject" youremail@yourcompany.com

If the message from the command line arrives, but the ClusterControl message does not, it may be related to not having set the from-email address in ClusterControl. ClusterControl will then send the email from the default user on the system. If the hostname of the ClusterControl host is not set to a fully qualified domain name, your email server may reject emails from an unqualified domain name or a non-existent user.
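
A quick way to check and, on systemd-based hosts, correct the hostname (the FQDN below is a placeholder):

$ hostname --fqdn
$ hostnamectl set-hostname clustercontrol.yourcompany.com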

We hope these tips help you configure notifications in ClusterControl.


New whitepaper - the DevOps Guide to database backups for MySQL and MariaDB

This week we’re happy to announce that our new DevOps Guide to Database Backups for MySQL & MariaDB is now available for download (free)!

This guide discusses in detail the two most popular backup utilities available for MySQL and MariaDB, namely mysqldump and Percona XtraBackup.

It covers topics such as how database features like binary logging and replication can be leveraged in backup strategies, and it provides best practices that can be applied to high availability topologies in order to make database backups reliable, secure and consistent.

Ensuring that backups are performed, so that a database can be restored if disaster strikes, is a key operational aspect of database management. The DBA or System Administrator is usually the responsible party to ensure that the data is protected, consistent and reliable. Ever more crucially, backups are an important part of any disaster recovery strategy for businesses.

So if you’re looking for insight into how to perform database backups efficiently, want to understand the impact of the storage engine on MySQL or MariaDB backup procedures, or need some tips & tricks on MySQL / MariaDB backup management … our new DevOps Guide has you covered.

Tips and Tricks - How to shard MySQL with ProxySQL in ClusterControl


Having too large a (write) workload on a master is dangerous. If the master collapses and a failover happens to one of its slave nodes, the slave node could collapse under the write pressure as well. To mitigate this problem you can shard horizontally across more nodes.

Sharding increases the complexity of data storage though, and very often it requires an overhaul of the application. In some cases, it may be impossible to make changes to an application. Luckily there is a simpler solution: functional sharding. With functional sharding you move a schema or table to another master, thus relieving the original master of the workload generated by these schemas or tables.

In this Tips & Tricks post, we will explain how you can functionally shard your existing master and offload some of its workload to another master. We will use ClusterControl, MySQL replication and ProxySQL to make this happen, and the whole procedure should not take longer than 15 minutes. Mission impossible? :-)

The example database

In our example we have a serious issue with the workload on our simple order database, accessed by the so_user. The majority of the writes are happening on two tables: orders and order_status_log. Every change to an order will write to both the order table and the status log table.

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `total_vat` decimal(15,2) DEFAULT '0.00',
  `total` decimal(15,2) DEFAULT '0.00',
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `customers` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `firstname` varchar(15) NOT NULL,
  `surname` varchar(80) NOT NULL,
  `address` varchar(255) NOT NULL,
  `postalcode` varchar(6) NOT NULL,
  `city` varchar(50) NOT NULL,
  `state` varchar(50) NOT NULL,
  `country` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

What we will do is to move the order_status_log table to another master.

As you might have noticed, there is no foreign key defined on the order_status_log table. Foreign keys simply would not work across functional shards: joining the order_status_log table with any other table would no longer work, as it will physically be on a different server than the other tables. And if you write transactional data to multiple tables, a rollback will only work on one of these masters. If you wish to retain these properties, you should consider using homogeneous sharding instead, where related data is kept grouped together in the same shard.

Installing the Replication setups

First, we will install a replication setup in ClusterControl. The topology in our example is really basic: we deploy one master and one replica:

But you could import your own existing replication topology into ClusterControl as well.

After the setup has been deployed, deploy the second setup:

While waiting for the second setup to be deployed, we will add ProxySQL to the first replication setup:

Adding the second setup to ProxySQL

After ProxySQL has been deployed, we can connect to it via the command line and see its currently configured servers and settings:

MySQL [(none)]> select hostgroup_id, hostname, port, status, comment from mysql_servers;
+--------------+-------------+------+--------+-----------------------+
| hostgroup_id | hostname    | port | status | comment               |
+--------------+-------------+------+--------+-----------------------+
| 20           | 10.10.36.11 | 3306 | ONLINE | read server           |
| 20           | 10.10.36.12 | 3306 | ONLINE | read server           |
| 10           | 10.10.36.11 | 3306 | ONLINE | read and write server |
+--------------+-------------+------+--------+-----------------------+
MySQL [(none)]> select rule_id, active, username, schemaname, match_pattern, destination_hostgroup from mysql_query_rules;
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| rule_id | active | username | schemaname | match_pattern                                           | destination_hostgroup |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| 100     | 1      | NULL     | NULL       | ^SELECT .* FOR UPDATE                                   | 10                    |
| 200     | 1      | NULL     | NULL       | ^SELECT .*                                              | 20                    |
| 300     | 1      | NULL     | NULL       | .*                                                      | 10                    |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+

As you can see, ProxySQL has been configured with the ClusterControl default read/write splitter for our first cluster. Any basic select query will be routed to hostgroup 20 (read pool) while all other queries will be routed to hostgroup 10 (master). What is missing here is the information about the second cluster, so we will add the hosts of the second cluster first:

MySQL [(none)]> INSERT INTO mysql_servers VALUES (30, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server'), (30, '10.10.36.14', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server');
Query OK, 2 rows affected (0.00 sec) 
MySQL [(none)]> INSERT INTO mysql_servers VALUES (40, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read and write server');
Query OK, 1 row affected (0.00 sec)

After this we need to load the servers to ProxySQL runtime tables and store the configuration to disk:

MySQL [(none)]> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.01 sec)

As ProxySQL also handles authentication for the clients, we need to add the so_user user to ProxySQL to allow the application to connect through ProxySQL:

MySQL [(none)]> INSERT INTO mysql_users (username, password, active, default_hostgroup, default_schema) VALUES ('so_user', 'so_pass', 1, 10, 'simple_orders');
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

Now we have added the second cluster and the user to ProxySQL. Keep in mind that normally in ClusterControl the two clusters are considered two separate entities. ProxySQL will remain part of the first cluster. Even though it is now configured for the second cluster, it will only be displayed under the first cluster.

Mirroring the data

Keep in mind that mirroring queries in ProxySQL is still a beta feature, and it doesn’t guarantee the mirrored queries will actually be executed. We found it to work fine within the boundaries of this use case. Also, there are (better) alternatives to our example here, where you would make use of a restored backup on the new cluster and replicate from the master until you make the switch. We will describe this scenario in a follow-up Tips & Tricks blog post.

Now that we have added the second cluster, we need to create the simple_orders database, the order_status_log table and the appropriate users on the master of the second cluster:

mysql> create database simple_orders;
Query OK, 1 row affected (0.01 sec)
mysql> use simple_orders;
Database changed
mysql> CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> create user 'so_user'@'10.10.36.15' identified by 'so_pass';
Query OK, 0 rows affected (0.00 sec)
mysql> grant select, update, delete, insert on simple_orders.* to 'so_user'@'10.10.36.15';
Query OK, 0 rows affected (0.00 sec)

This enables us to start mirroring the queries executed against the first cluster onto the second cluster. This requires an additional query rule to be defined in ProxySQL:

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, mirror_hostgroup, apply) VALUES (50, 1, 'so_user', 'simple_orders', '(^INSERT INTO|^REPLACE INTO|^UPDATE|INTO TABLE) order_status_log', 20, 40, 1);
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)

With this rule, ProxySQL will match everything that writes to the order_status_log table and, in addition, send it to hostgroup 40 (the write server of the second cluster).

Now that we have started mirroring the queries, the backfill of the data from the first cluster can take place. You can use the timestamp of the first entry in the new order_status_log table on the second cluster to determine when the mirroring started.
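
A rough sketch of such a backfill (the cutoff timestamp, file name and use of the root account are assumptions; adjust them to your environment) is to dump only the rows written before mirroring started from the first cluster’s master, and load them into the second cluster’s master (10.10.36.13):

# On the first cluster's master: export only rows written before the mirroring started
mysqldump -u root -p --no-create-info \
  --where="changeTime < '2017-01-10 12:00:00'" \
  simple_orders order_status_log > order_status_log_backfill.sql

# Load them into the second cluster's master
mysql -u root -p -h 10.10.36.13 simple_orders < order_status_log_backfill.sql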

Once the data has been backfilled, we can reconfigure ProxySQL to perform all actions on the order_status_log table on the second cluster. This is a two-step approach: first add new rules that send the read queries for this table to the second cluster’s read servers, with the exception of SELECT … FOR UPDATE queries, which go to the second cluster’s master. Then modify our mirroring rule so it stops mirroring and only writes to the second cluster.

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, apply) VALUES (70, 1, 'so_user', 'simple_orders', '^SELECT .* FROM order_status_log', 30, 1), (60, 1, 'so_user', 'simple_orders', '^FROM order_status_log .* FOR UPDATE', 40, 1);
Query OK, 2 rows affected (0.00 sec)
MySQL [(none)]> UPDATE mysql_query_rules SET destination_hostgroup=40, mirror_hostgroup=NULL WHERE rule_id=50;
Query OK, 1 row affected (0.00 sec)

And don’t forget to activate and persist the new query rules:

MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.05 sec)
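
Optionally, you can verify that the new rules are being matched by checking ProxySQL’s hit counters; the stats_mysql_query_rules table keeps a running hit count per rule:

MySQL [(none)]> SELECT rule_id, hits FROM stats_mysql_query_rules ORDER BY rule_id;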

After this final step we should see the workload drop on the first cluster, and increase on the second cluster. Mission possible and accomplished. Happy clustering!

Secure MongoDB and Protect Yourself from the Ransom Hack


In this blogpost we look at the recent concerns around MongoDB ransomware and security issues, and how to mitigate this threat to your own MongoDB instance.

Recently, various security blogs raised concern that a hacker is hijacking MongoDB instances and demanding a ransom for the stored data. It is not the first time unprotected MongoDB instances have been found vulnerable, and this has stirred up the discussion around MongoDB security again.

What is the news about?

About two years ago, the University of Saarland in Germany warned that it had discovered around 40,000 MongoDB servers that were easily accessible on the internet. This meant anyone could open a connection to such a MongoDB server via the internet. How did this happen?

Default binding

In the past, the MongoDB daemon bound itself to all interfaces. This means anyone who has access to any of the interfaces on the host where MongoDB is installed will be able to connect to MongoDB. If the server is directly connected to a public IP address on one of these interfaces, it may be vulnerable.

Default ports

By default, MongoDB binds to standard ports: 27017 for MongoDB replicaSets or Shard Routers, 27018 for shards and 27019 for Configservers. By scanning a network for these ports, it is easy to determine whether a host is running MongoDB.

Authentication

By default, MongoDB configures itself without any form of authentication enabled. This means MongoDB will not prompt for a username and password, and anyone connecting to MongoDB will be able to read and write data. Authentication has been part of the product since MongoDB 2.0, but it has never been part of the default configuration.

Authorization

Part of enabling authorization is the ability to define roles. Without authentication enabled, there will also be no authorization. This means anyone connecting to a MongoDB server without authentication enabled will have administrative privileges too. Administrative privileges range from defining users to configuring the MongoDB runtime.

Why is all this an issue now?

In December 2016 a hacker exploited these vulnerabilities for personal enrichment. The hacker steals and removes your data, and leaves the following message in the WARNING collection:

{
     "_id" : ObjectId("5859a0370b8e49f123fcc7da"),
     "mail" : "harak1r1@sigaint.org",
     "note" : "SEND 0.2 BTC TO THIS ADDRESS 13zaxGVjj9MNc2jyvDRhLyYpkCh323MsMq AND CONTACT THIS EMAIL WITH YOUR IP OF YOUR SERVER TO RECOVER YOUR DATABASE !"
}

Demanding 0.2 bitcoin (around $200 at the moment of writing) may not sound like a lot if you really want your data back. However, in the meantime your website/application is not able to function normally and may be defaced, which could potentially cost far more than the 0.2 bitcoin.

A MongoDB server is vulnerable when it has a combination of the following:

  • Bound to a public interface
  • Bound to a default port
  • No (or weak) authentication enabled
  • No firewall rules or security groups in place

The default port is debatable: any port scanner would also be able to identify MongoDB if it were running on an obscured port number.

The combination of all four factors means any attacker may be able to connect to the host. Without authentication (and authorization) the attacker can do anything with the MongoDB instance. And even if authentication has been enabled on the MongoDB host, it could still be vulnerable.

Using a network port scanner (e.g. nmap) would reveal the MongoDB build info to the attacker. This means he/she is able to find potential (zero-day) exploits for your specific version, and still manage to compromise your setup. Also weak passwords (e.g. admin/admin) could pose a threat, as the attacker would have an easy point of entry.

How can you protect yourself against this threat?

There are various precautions you can take:

  • Put firewall rules or security groups in place
  • Bind MongoDB only to necessary interfaces and ports
  • Enable authentication, users and roles
  • Backup often
  • Security audits

For new deployments performed from ClusterControl, we enable authentication by default, create a separate administrator user and allow MongoDB to listen on a different port than the default. The only part ClusterControl can’t set up is whether the MongoDB instance is reachable from outside your network.


Securing MongoDB

The first step to secure your MongoDB server would be to put firewall rules or security groups in place. These ensure that only the necessary client hosts/applications are able to connect to MongoDB. Also make sure MongoDB only binds to the interfaces that are really necessary, in mongod.conf:

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,172.16.1.154

Enabling authentication and setting up users and roles would be the second step. MongoDB has an easy to follow tutorial for enabling authentication and setting up your admin user. Keep in mind that users and passwords are still the weakest link in the chain, so make sure those are secure!
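
A minimal sketch of both steps (the user name and password are placeholders): enable authorization in mongod.conf and create an administrative user from the mongo shell.

# mongod.conf
security:
  authorization: enabled

// mongo shell: create the admin user (use the localhost exception, or create it before enabling authorization)
use admin
db.createUser({
  user: "admin",
  pwd: "pick-a-strong-password",
  roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})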

After securing the instance, you should ensure you always have a backup of your data. Even if the hacker manages to hijack your data, with a backup and a big enough oplog you would be able to perform a point-in-time restore. Scheduling (shard consistent) backups can easily be set up in our database clustering, management and automation software called ClusterControl.

Perform security audits often: scan for any open ports from outside your hosting environment. Verify that authentication has been enabled for MongoDB, and ensure the users don’t have weak passwords and/or excessive roles. For ClusterControl we have developed two advisors that will verify all this. ClusterControl advisors are open source, and the advisors can be run for free using ClusterControl community edition.
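
For the port scan part of such an audit, a tool like nmap run from outside your hosting environment will do (the hostname below is a placeholder):

nmap -p 27017-27019 mongodb.yourcompany.com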

Will this be enough to protect myself against any threat?

With all these precautions in place, you will be protected against any direct threat from the internet. However, keep in mind that any machine compromised in your hosting environment may still become a stepping stone to your now protected MongoDB servers. Be sure to upgrade MongoDB to the latest (patch) releases to stay protected against known vulnerabilities.

How to use the ClusterControl Query Monitor for MySQL, MariaDB and Percona Server


The MySQL database workload is determined by the number of queries that it processes. There are several situations in which MySQL slowness can originate. The first possibility is that there are queries not using proper indexes. When a query cannot make use of an index, the MySQL server has to use more resources and time to process it. By monitoring queries, you can pinpoint the SQL code that is the root cause of a slowdown.

By default, MySQL provides several built-in tools to monitor queries, namely:

  • Slow Query Log - Captures queries that exceed a defined threshold, or queries that do not use indexes.
  • General Query Log - Captures all queries that happen on a MySQL server.
  • SHOW FULL PROCESSLIST statement (or through the mysqladmin command) - Monitors live queries currently being processed by the MySQL server.
  • PERFORMANCE_SCHEMA - Monitors MySQL Server execution at a low level.

There are also open-source tools out there that can achieve similar results, like mtop and Percona’s pt-query-digest.
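
For instance, pt-query-digest can summarize a slow query log straight from the command line (the log path is an assumption; adjust it to your configuration):

pt-query-digest /var/log/mysql/mysql-slow.log > slow-query-report.txt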

How ClusterControl monitors queries

ClusterControl does not only monitor your hosts and database instances, it also monitors your database queries. It gets the information in two different ways:

  • Queries are retrieved from PERFORMANCE_SCHEMA
  • If PERFORMANCE_SCHEMA is disabled or unavailable, ClusterControl will parse the content of the Slow Query Log

ClusterControl starts reading from the PERFORMANCE_SCHEMA tables immediately when the query monitor is enabled, and the following tables are used by ClusterControl to sample the queries:

  • performance_schema.events_statements_summary_by_digest
  • performance_schema.events_statements_current
  • performance_schema.threads

In older versions of MySQL (5.5), having PERFORMANCE_SCHEMA (P_S) enabled might not be an option since it can cause significant performance degradation. With MySQL 5.6 the overhead is reduced, and even more so in 5.7. P_S offers great introspection of the server at an overhead of a few percent (1-3%). If the overhead is a concern, ClusterControl can parse the Slow Query log remotely to sample queries. Note that no agents are required on your database servers. It uses the following flow:

  1. Start slow log (during MySQL runtime).
  2. Run it for a short period of time (a second or couple of seconds).
  3. Stop log.
  4. Parse log.
  5. Truncate log (ClusterControl creates new log file).
  6. Go to 1.

As you can see, ClusterControl does the above trick when pulling and parsing the Slow Query log to overcome the problems with offsets. The drawback of this method is that the continuous sampling might miss some queries during steps 3 to 5. Hence, if continuous query sampling is vital for you and part of your monitoring policy, the best way is to use P_S. If enabled, ClusterControl will automatically use it.

The collected queries are hashed, calculated and digested (normalized, averaged, counted, sorted) and then stored in ClusterControl.

Enabling Query Monitoring

As mentioned earlier, ClusterControl monitors MySQL query via two ways:

  • Fetch the queries from PERFORMANCE_SCHEMA
  • Parse the content of MySQL Slow Query

Performance Schema (Recommended)

First of all, if you would like to use Performance Schema, enable it on all MySQL servers (MySQL/MariaDB v5.5.3 and later). Enabling this requires a MySQL restart. Add the following line to your MySQL configuration file:

performance_schema = ON

Then, restart the MySQL server. For ClusterControl users, you can use the configuration management feature at Manage -> Configurations -> Change Parameter and perform a rolling restart at Manage -> Upgrades -> Rolling Restart.

Once enabled, ensure at least events_statements_current is enabled:

mysql> SELECT * FROM performance_schema.setup_consumers WHERE NAME LIKE 'events_statements%';
+--------------------------------+---------+
| NAME                           | ENABLED |
+--------------------------------+---------+
| events_statements_current      | YES     |
| events_statements_history      | NO      |
| events_statements_history_long | NO      |
+--------------------------------+---------+

Otherwise, run the following statement to enable it:

UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'events_statements_current';

MySQL Slow Query

If Performance Schema is disabled, ClusterControl will default to the Slow Query log. Hence, you don’t have to do anything, since it can be turned on and off dynamically at runtime via the SET statement.
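
For reference, this is roughly what toggling slow query logging at runtime looks like; the threshold value below is only an example:

SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.5;
SET GLOBAL log_queries_not_using_indexes = 'ON';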

The Query Monitoring function must be toggled to on under ClusterControl -> Query Monitor -> Top Queries. ClusterControl will monitor queries on all database nodes under this cluster:

Click on “Settings” and configure “Long Query Time”, and toggle “Log queries not using indexes” to On. If you have defined the two parameters (long_query_time and log_queries_not_using_indexes) inside my.cnf and you would like to use those values instead, toggle “MySQL Local Query Override” to On. Otherwise, ClusterControl will use the values configured in its UI.

Once enabled, you just need to wait a couple of minutes before you can see data under Top Queries and Query Histogram.

How ClusterControl visualizes the queries

Under the Query Monitor tab, you should see the following three items:

  • Top Queries

  • Running Queries

  • Query Histogram

We’ll have a quick look at these here, but remember that you can always find more details in the ClusterControl documentation.

Top Queries

Top Queries is an aggregated list of all your top queries running on all the nodes of your cluster. The list can be ordered by “Occurrence” or “Execution Time”, to show the most common or slowest queries respectively. You don’t have to log in to each of the servers to see the top queries. The UI provides an option to filter based on MySQL server.

If you are using the Slow Query log, only queries that exceed the “Long Query Time” will be listed here. If the data is not populated correctly and you believe that there should be something in there, it could be:

  • ClusterControl did not collect enough queries to summarize and populate data. Try to lower the “Long Query Time”.
  • You have configured Slow Query Log options in the my.cnf of the MySQL server, and “MySQL Local Query Override” is turned off. If you really want to use the values defined inside my.cnf, you probably have to lower the long_query_time value so ClusterControl can calculate a more accurate result.
  • You have another ClusterControl node pulling the Slow Query log as well (in case you have a standby ClusterControl server). Only allow one ClusterControl server to do this job.

The “Long Query Time” value can be specified to a resolution of microseconds, for example 0.000001 (1 x 10^-6). The following shows a screenshot of what’s under Top Queries:

Clicking on each query will show the query plan executed, similar to EXPLAIN command output:

Running Queries

Running Queries provides an aggregated view of current running queries across all nodes in the cluster, similar to SHOW FULL PROCESSLIST command in MySQL. You can stop a running query by selecting to kill the connection that started the query. The process list can be filtered out by host.

Use this feature to monitor live queries currently running on MySQL servers. By clicking on each row that contains “Info”, you can see the extended information containing the full query statement and the query plan:

Query Histogram

The Query Histogram shows you queries that are outliers. An outlier is a query that takes longer than a normal query of its type. Use this feature to filter out the outliers for a certain time period. This feature is dependent on the Top Queries feature above. If Query Monitoring is enabled and Top Queries are captured and populated, the Query Histogram will summarize these and provide a filter based on timestamp.

That’s all folks! Monitoring queries is as important as monitoring your hosts or MySQL instances, to make sure your database is performing well.

Announcing ClusterControl 1.4 - the MySQL Replication & MongoDB Edition


Today we are pleased to announce the 1.4 release of ClusterControl - the all-inclusive database management system that lets you easily deploy, monitor, manage and scale highly available open source databases in any environment; on-premise or in the cloud.

This release contains key new features for MongoDB and MySQL Replication in particular, along with performance improvements and bug fixes.

Release Highlights

For MySQL

MySQL Replication

  • Enhanced multi-master deployment
  • Flexible topology management & error handling
  • Automated failover

MySQL Replication & Load Balancers

  • Deploy ProxySQL on MySQL Replication setups and monitor performance
  • HAProxy Read-Write split configuration support for MySQL Replication setups

Experimental support for Oracle MySQL Group Replication

  • Deploy Group Replication Clusters

And support for Percona XtraDB Cluster 5.7


For MongoDB

MongoDB & sharded clusters

  • Convert a ReplicaSet to a sharded cluster
  • Add or remove shards
  • Add Mongos/Routers

More MongoDB features

  • Step down or freeze a node
  • New Severalnines database advisors for MongoDB


View release details and resources


New MySQL Replication Features

ClusterControl 1.4 brings a number of new features to better support replication users. You are now able to deploy a multi-master replication setup in active - standby mode. One master will actively take writes, while the other one is ready to take over writes should the active master fail. From the UI, you can also easily add slaves under each master and reconfigure the topology by promoting new masters and failing over slaves.

Topology reconfigurations and master failovers are not usually possible in case of replication problems, for instance errant transactions. ClusterControl will check for issues before any failover or switchover happens. The admin can define whitelists and blacklists of which slaves to promote to master (and vice versa). This makes it easier for admins to manage their replication setups and make topology changes when needed. 

Deploy ProxySQL on MySQL Replication clusters and monitor performance

Load balancers are an essential component in database high availability. With this new release, we have extended ClusterControl with the addition of ProxySQL, created for DBAs by René Cannaò, himself a DBA trying to solve issues when working with complex replication topologies. Users can now deploy ProxySQL on MySQL Replication clusters with ClusterControl and monitor its performance.

By default, ClusterControl deploys ProxySQL in read/write split mode - your read-only traffic will be sent to slaves while your writes will be sent to a writable master. ProxySQL will also work together with the new automatic failover mechanism. Once failover happens, ProxySQL will detect the new writable master and route writes to it. It all happens automatically, without any need for the user to take action.

MongoDB & sharded clusters

MongoDB is the rising star among open source databases, and extending our support for this database has brought sharded clusters in addition to replica sets. This meant we had to add more metrics to our monitoring, add advisors, and provide consistent backups for sharded setups. With this latest release, you can now convert a ReplicaSet cluster to a sharded cluster, add or remove shards from a sharded cluster, as well as add Mongos/routers to a sharded cluster.

New Severalnines database advisors for MongoDB

Advisors are mini programs that provide advice on specific database issues and we’ve added three new advisors for MongoDB in this ClusterControl release. The first one calculates the replication window, the second watches over the replication window, and the third checks for un-sharded databases/collections. In addition to this we also added a generic disk advisor. The advisor verifies if any optimizations can be done, like noatime and noop I/O scheduling, on the data disk that is being used for storage.

There are a number of other features and improvements that we have not mentioned here. You can find all details in the ChangeLog.

We encourage you to test this latest release and provide us with your feedback. If you’d like a demo, feel free to request one.

Thank you for your ongoing support, and happy clustering!

PS.: For additional tips & tricks, follow our blog: http://www.severalnines.com/blog/

Automating MySQL Replication with ClusterControl 1.4.0 - what’s new


With the recent release of ClusterControl 1.4.0, we added a bunch of new features to better support MySQL replication users. In this blog post, we’ll give you a quick overview of the new features.

Enhanced multi-master deployment

A simple master-slave replication setup is usually good enough in a lot of cases, but sometimes, you might need a more complex topology with multiple masters. With 1.4.0, ClusterControl can help provision such setups. You are now able to deploy a multi-master replication setup in active - standby mode. One of the masters will actively take writes, while the other one is ready to take over writes should the active master fail. You can also easily add slaves under each master, right from the UI.

Enhanced flexibility in replication topology management

With support for multi-master setups comes improved support for managing replication topology changes. Do you want to re-slave a slave off the standby master? Do you want to create a replication chain, with an intermediate master in-between? Sure! You can use a new job for that: “Change Replication Master”. Just go to one of the nodes and pick that job (not only on the slaves, you can also change replication master for your current master, to create a multi-master setup). You’ll be presented with a dialog box in which you can pick the master from which to slave your node off. As of now, only GTID-enabled replication is supported, both Oracle and MariaDB implementations.

Replication error handling

You may ask - what about issues like errant transactions, which can be a serious problem for MySQL replication? Well, for starters, ClusterControl always sets slaves to read_only mode, so only a superuser can create an errant transaction. It still may happen, though. That’s why we added replication error handling in ClusterControl.

Errant transactions are common and they are handled separately: they are checked for before any failover or switchover happens. The user can then fix the problem before triggering the topology change once more. If, for some reason (high availability requirements, for example), a user wants to perform a failover anyway, no matter whether it is safe or not, this can be done by setting:

replication_stop_on_error=0

This is set in the cmon configuration file of the replication setup ( /etc/cmon.d/cmon_X.cnf, where X is the cluster ID of the replication setup). In such cases, failover will be performed even if there’s a possibility that replication will break.

To handle such cases, we added experimental support for slave rebuilding. If you enable replication_auto_rebuild_slave in the cmon configuration and if your slave is marked as down with the following error in MySQL:

Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'

ClusterControl will attempt to rebuild the slave using data from the master. Such a setting may be dangerous, as the rebuilding process induces an increased load on the master. It may also be that your dataset is very large and a regular rebuild is not an option; that’s why this behavior is disabled by default. Feel free to try it out, though, and let us know what you think about it.
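
As a sketch, enabling this in the cmon configuration file would look like the following (cluster ID 1 is only an example):

# /etc/cmon.d/cmon_1.cnf
replication_auto_rebuild_slave=1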

Automated failover

Handling replication errors is not enough to maintain high availability with MySQL replication - you need also to handle crashes of MySQL instances. Until now, ClusterControl alerted the user and let her perform a manual failover. With ClusterControl version 1.4.0 comes support for automated failover handling. It is enough to have cluster recovery enabled for your replication cluster and ClusterControl will try to recover your replication cluster in the best way possible. You must explicitly enable "Cluster Auto Recovery" in the UI in order for automatic failover to be activated.

Once a master failure is detected, ClusterControl starts to look for the most up-to-date slave available. Once it’s been found, ClusterControl checks the remaining slaves and looks for additional, missing transactions. If such transactions are found on some of the slaves, the master candidate is configured to replicate from each of those slaves and apply any missing transactions.

If, for any reason, you’d rather not wait for a master candidate to get all missing transactions (maybe because you are 100% sure there won’t be any), you can disable this step by enabling the replication_skip_apply_missing_txs setting in cmon configuration.

For MariaDB setups, the behavior is different - ClusterControl picks the most advanced slave and promotes it to become master.

Getting missing transactions is one thing. Applying them is another. ClusterControl, by default, does not fail over to a slave if the slave has not applied all missing transactions - you could lose data. Instead, it will wait indefinitely to allow slaves to catch up. Of course, if the master candidate becomes up to date, ClusterControl will fail over immediately after. This behavior can be configured using the replication_failover_wait_to_apply_timeout setting in the cmon configuration file. The default value (-1) prevents any failover if the master candidate is lagging behind. If you’d like to execute a failover anyway, you can set it to 0. You can also set a timeout in seconds: the amount of time ClusterControl will wait for a master candidate to catch up before performing a failover.

Once a master candidate is brought up to date, it is promoted to master and the remaining slaves are slaved off it. The exact process differs depending on which host failed (the active or standby master in a multi-master setup) but the final outcome is that all slaves are again replicating from the working master. Combined with proxies such as HAProxy, ProxySQL or MaxScale, this lets you build an environment where a master failure is handled in an automated and transparent way.

Additional control over failover behavior is granted through the replication_failover_whitelist and replication_failover_blacklist lists in the cmon configuration file. These let you configure a list of slaves which should be treated as candidates to become master, and a list of slaves which should not be promoted to master by ClusterControl. There are numerous reasons you may want to use those variables. Maybe you have some backup or OLAP/reporting slaves which are not suitable to become a master? Maybe some of your slaves use weaker hardware, or maybe they are located in a different datacenter? In this case, you can prevent them from being promoted by adding those slaves to the replication_failover_blacklist variable.

Likewise, maybe you want to limit the promotable slaves to a particular set of hosts which are the closest to the current master? Or maybe you use a master - master, active - passive setup and you want only your standby master to be considered for promotion? Then specify the IPs of the master candidates in the replication_failover_whitelist variable. Please keep in mind that a restart of the cmon process is required to reload such configuration. By executing cmon --help-config on the controller, you will get more detailed information about these (and other) parameters.
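
To illustrate, a hypothetical failover section of the cmon configuration could look like this (the IPs, cluster ID and comma-separated list format are assumptions; run cmon --help-config for the authoritative syntax):

# /etc/cmon.d/cmon_1.cnf
replication_failover_wait_to_apply_timeout=60
replication_failover_whitelist=10.0.0.11,10.0.0.12
replication_failover_blacklist=10.0.0.21,10.0.0.22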

Finally, you might prefer to handle failover and restore replication manually. If you do not want ClusterControl to perform any automated failover in your replication topology, you can disable cluster recovery from the ClusterControl UI.

So, there is lots of good stuff to try out here for MySQL replication users. Do give it a try, and let us know how we’re doing.

ClusterControl Developer Studio: Custom database alerts by combining metrics


In the previous blog posts, we gave a brief introduction to the ClusterControl Developer Studio and the ClusterControl Domain Specific Language. We covered some useful examples, e.g., how to extract information from the Performance Schema, how to automatically have advisors scale your database clusters and how to create an advisor that keeps an eye on the MongoDB replication lag. ClusterControl Developer Studio is free to use, and included in the community version of ClusterControl. It allows you to write your own scriptlets, advisors and alerts. With just a few lines of code, you can already automate your clusters. All advisors are open source on Github, so anyone can contribute back to the community!

In this blog post, we will make things a little more complex than in our previous posts. Today we will be using our MongoDB replication window advisor that has recently been added to the Advisors Github repository. Our advisor will not only check the length of the replication window, but also calculate the lag of the secondaries and warn us if a node is at risk. For extra complexity, we will make this advisor compatible with a sharded MongoDB environment, and take a couple of edge cases into account to prevent false positives.

MongoDB Replication window

The MongoDB replication window advisor complements the MongoDB lag advisor. The lag advisor informs us of the number of seconds a secondary node is behind the primary/master. As the oplog is limited in size, having slave lag imposes the following risks:

  1. If a node lags too far behind, it may not be able to catch up anymore as the transactions necessary are no longer in the oplog of the primary.
  2. A lagging secondary node is less favoured in a MongoDB election for a new primary. If all secondaries are lagging behind in replication, you will have a problem, and the one with the least lag will be made primary.
  3. Secondaries lagging behind are less favoured by the MongoDB driver when scaling out reads with MongoDB; this also adds a higher workload on the remaining secondaries.

If we would have a secondary node lagging behind a few minutes (or hours), it would be useful to have an advisor that informs us how much time we have left before our next transaction will be dropped from the oplog. The time difference between the first and last entry in the oplog is called the Replication Window. This metric can be created by fetching the first and last items from the oplog, and calculating the difference of their timestamps.

Calculating the MongoDB Replication Window

In the MongoDB shell, there is already a function available that calculates the replication window for you. However this function is built into the command line shell, so any outside connection not using the command line shell will not have this built-in function:

mongo_replica_2:PRIMARY> db.getReplicationInfo()
{"logSizeMB" : 1894.7306632995605,"usedMB" : 501.17,"timeDiff" : 91223,"timeDiffHours" : 25.34,"tFirst" : "Wed Oct 12 2016 22:48:59 GMT+0000 (UTC)","tLast" : "Fri Oct 14 2016 00:09:22 GMT+0000 (UTC)","now" : "Fri Oct 14 2016 12:32:51 GMT+0000 (UTC)"
}

As you can see, this function has a rich output of useful information: the size of the oplog and how much of that has been used already. It also displays the time difference of the first and last items in the oplog.

It is easy to replicate this function by retrieving the first and last items from the oplog. Making use of the MongoDB aggregate function in a single query is tempting; however, the oplog does not have indexes on any of its fields. Running an aggregate function on a collection without indexes would require a full collection scan, which becomes very slow on an oplog with a couple of million entries.

Instead, we are going to send two individual queries: fetching the first record of the oplog in forward and in reverse order. As the oplog is already a naturally ordered (capped) collection, sorting it in reverse natural order is cheap.

mongo_replica_2:PRIMARY> use local
switched to db local
mongo_replica_2:PRIMARY> db.oplog.rs.find().limit(1);
{ "ts" : Timestamp(1476312539, 1), "h" : NumberLong("-3302015507277893447"), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
mongo_replica_2:PRIMARY> db.oplog.rs.find().sort({$natural: -1}).limit(1);
{ "ts" : Timestamp(1476403762, 1), "h" : NumberLong("3526317830277016106"), "v" : 2, "op" : "n",  "ns" : "ycsb.usertable", "o" : { "_id" : "user5864876345352853020",
…
}

The overhead of both queries is very low and will not interfere with the functioning of the oplog.

In the example above, the replication window would be 91223 seconds (the difference of 1476403762 and 1476312539).

Intuitively you may think it only makes sense to do this calculation on the primary node, as this is the source of all write operations. However, MongoDB is a bit smarter than just serving out the oplog to all secondaries. Even though the secondary nodes copy entries of the oplog from the primary, joining members will, where possible, load the delta of transactions from secondaries instead. Also, secondary nodes may prefer to fetch oplog entries from other secondaries with low latency, rather than from a primary with high latency. So it is better to perform this calculation on all nodes in the cluster.

As the replication window will be calculated per node, and we like to keep our advisor as readable as possible, we will abstract the calculation into a function:

function getReplicationWindow(host) {
  var replwindow = {};
  // Fetch the first and last record from the Oplog and take its timestamp
  var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
  replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: -1}, limit: 1}');
  replwindow['last'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  replwindow['replwindow'] = replwindow['last'] - replwindow['first'];
  return replwindow;
}

The function returns the timestamp of the first and last items in the oplog and the replication window as well. We will actually need all three in a later phase.

Calculating the MongoDB Replication lag

We covered the calculation of MongoDB replication lag in the previous Developer Studio blog post. You can calculate the lag by simply subtracting the secondary optimeDate (or optime timestamp) from the primary optimeDate. This will give you the replication lag in seconds.

The information necessary for the replication lag will be retrieved from only the primary node, using the replSetGetStatus command. Similarly to the replication window, we will abstract this into a function:

function getReplicationStatus(host, primary_id) {
    var node_status = {};
    var res = host.executeMongoQuery("admin", "{ replSetGetStatus: 1 }");
    // Fetch the optime and uptime per host
    for(i = 0; i < res["result"]["members"].size(); i++)
    {
        tmp = res["result"]["members"][i];
        host_id = tmp["name"];
        node_status[host_id] = {};
        node_status[host_id]["name"] = host_id;
        node_status[host_id]["primary"] = primary_id;
        node_status[host_id]["setname"] = res["result"]["set"];
        node_status[host_id]["uptime"] = tmp["uptime"];
        node_status[host_id]["optime"] = tmp["optime"]["ts"]["$timestamp"]["t"];
    }
    return node_status;
}

We keep a little bit more information than necessary here, but you will see why in the next paragraphs.

Calculating the time left per node

Now that we have calculated both the replication window and the replication lag per node, we can calculate the time left per node during which it can theoretically still catch up with the primary. Here we subtract the timestamp of the first entry in the oplog (of the primary) from the timestamp of the last executed transaction (optime) of the node.

// Calculate the replication window of the primary against the node's last transaction
replwindow_node = replstatus[host]['optime'] - replwindow_primary['first'];

But we are not done yet!

Adding another layer of complexity: sharding

In ClusterControl we see everything as a cluster. Whether you would run a single MongoDB server, a replicaSet or a sharded environment with config servers and shard replicaSets: they are all identified and administered as a single cluster. This means our advisor has to cope with both replicaSet and shard logic.

In the example code above, where we calculated the time left per node, our cluster had a single primary. In the case of a sharded cluster, we have to take into account that we have one replicaSet for the config servers and one for each shard. This means we have to store, for each node, its primary node and use that one in the calculation.

The corrected line of code would be:

host_id = host.hostName() + ":" + host.port();
primary_id = replstatus_per_node[host_id]['primary'];
...
// Calculate the replication window of the primary against the node's last transaction
replwindow_node = replstatus['optime'] - replwindow_per_node[primary_id]['first'];

For readability, we also need to include the replicaSet name in the messages our advisor outputs. If we did not do this, it would become quite hard to distinguish hosts, replicaSets and shards in large sharded environments.

msg = "The replication window for node " + host_id + " (" + replstatus["setname"] + ") is long enough.";

Take the uptime into account

Another possible impediment with our advisor is that it will start warning us immediately on a freshly deployed cluster:

  1. The first and last entry in the oplog will be within seconds of each other after initiating the replicaSet.
  2. Also on a newly created replicaSet, the probability of it being immediately used would be very low, so there is not much use in alerting on a (too) short replication window in this case either.
  3. At the same time, a newly created replicaSet may also receive a huge write workload when a logical backup is restored, compressing all entries in the oplog into a very short timeframe. This especially becomes an issue if, after this burst of writes, no more writes happen for a long time: the replication window is then very short but also outdated.

There are three possible solutions for these problems to make our advisor more reliable:

  1. We parse the first item in the oplog. If it contains the replicaset initiating document we will ignore a too short replication window. This can easily be done alongside parsing the first record in the oplog.
  2. We also take the uptime into consideration. If the host’s uptime is shorter than our warning threshold, we will ignore a too short replication window.
  3. If the replication window is too short, but the newest item in the oplog is older than our warning threshold, this should be considered a false positive.

To parse the first item in the oplog we have to add these lines, where we identify it as a new set:

var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];

if (res["result"]["cursor"]["firstBatch"][0]["o"]["msg"] == "initiating set") {
    replwindow['newset'] = true;
}

Then later in the check for our replication window we can verify all three exceptions together:

if(replwindow['newset'] == true) {
    msg = "Host " + host_id + " (" + replstatus["setname"] + ") is a new replicaSet. Not enough entries in the oplog to determine the replication window.";
    advice.setSeverity(Ok);
    advice.setJustification("");
} else if (replstatus["uptime"] < WARNING_REPL_WINDOW) {
    msg = "Host " + host_id + " (" + replstatus["setname"] + ") only has an uptime of " + replstatus["uptime"] + " seconds. Too early to determine the replication window.";
    advice.setSeverity(Ok);
    advice.setJustification("");
}
else if (replwindow['last'] < (CmonDateTime::currentDateTime().toString("%s") - WARNING_REPL_WINDOW)) {
    msg = "Latest entry in the oplog for host " + host_id + " (" + replstatus["setname"] + ") is older than " + WARNING_REPL_WINDOW + " seconds. Determining the replication window would be unreliable.";
    advice.setSeverity(Ok);
    advice.setJustification("");
}

After this we can finally warn / advise on the replication window of the node. The output would look similar to this:

Conclusion

That’s it! We have created an advisor that gathers information from various hosts / replicaSets and combines this to get an exact view of the replication status of the entire cluster. We have also shown how to make things more readable by abstracting code into functions. And the most important aspect is that we have shown how to also think bigger than just the standard replicaSet.

For completeness, here is the full advisor:

#include "common/helpers.js"
#include "cmon/io.h"
#include "cmon/alarms.h"

// It is advised to have a replication window of at least 24 hours, critical is 1 hour
var WARNING_REPL_WINDOW = 24*60*60;
var CRITICAL_REPL_WINDOW = 60*60;
var TITLE="Replication window";
var ADVICE_WARNING="Replication window too short. ";
var ADVICE_CRITICAL="Replication window too short for one hour of downtime / maintenance. ";
var ADVICE_OK="The replication window is long enough.";
var JUSTIFICATION_WARNING="It is advised to have a MongoDB replication window of at least 24 hours. You could try to increase the oplog size. See also: https://docs.mongodb.com/manual/tutorial/change-oplog-size/";
var JUSTIFICATION_CRITICAL=JUSTIFICATION_WARNING;


function main(hostAndPort) {

    if (hostAndPort == #N/A)
        hostAndPort = "*";

    var hosts   = cluster::mongoNodes();
    var advisorMap = {};
    var result= [];
    var k = 0;
    var advice = new CmonAdvice();
    var msg = "";
    var replwindow_per_node = {};
    var replstatus_per_node = {};
    var replwindow = {};
    var replstatus = {};
    var replwindow_node = 0;
    var host_id = "";
    for (i = 0; i < hosts.size(); i++)
    {
        // Find the primary and execute the queries there
        host = hosts[i];
        host_id = host.hostName() + ":" + host.port();

        if (host.role() == "shardsvr" || host.role() == "configsvr") {
            // Get the replication window of each nodes in the cluster, and store it for later use
            replwindow_per_node[host_id] = getReplicationWindow(host);

            // Only retrieve the replication status from the master
            res = host.executeMongoQuery("admin", "{isMaster: 1}");
            if (res["result"]["ismaster"] == true) {
                //Store the result temporary and then merge with the replication status per node
                var tmp = getReplicationStatus(host, host_id);
                for(o=0; o < tmp.size(); o++) {
                    replstatus_per_node[tmp[o]['name']] = tmp[o];
                }

                //replstatus_per_node =
            }
        }
    }

    for (i = 0; i < hosts.size(); i++)
    {
        host = hosts[i];
        if (host.role() == "shardsvr" || host.role() == "configsvr") {
            msg = ADVICE_OK;

            host_id = host.hostName() + ":" + host.port();
            primary_id = replstatus_per_node[host_id]['primary'];
            replwindow = replwindow_per_node[host_id];
            replstatus = replstatus_per_node[host_id];

            // Calculate the replication window of the primary against the node's last transaction
            replwindow_node = replstatus['optime'] - replwindow_per_node[primary_id]['first'];
            // First check uptime. If the node is up less than our replication window it is probably no use warning
            if(replwindow['newset'] == true) {
              msg = "Host " + host_id + " (" + replstatus["setname"] + ") is a new replicaSet. Not enough entries in the oplog to determine the replication window.";
              advice.setSeverity(Ok);
              advice.setJustification("");
            } else if (replstatus["uptime"] < WARNING_REPL_WINDOW) {
                msg = "Host " + host_id + " (" + replstatus["setname"] + ") only has an uptime of " + replstatus["uptime"] + " seconds. Too early to determine the replication window.";
                advice.setSeverity(Ok);
                advice.setJustification("");
            }
            else if (replwindow['last'] < (CmonDateTime::currentDateTime().toString("%s") - WARNING_REPL_WINDOW)) {
              msg = "Latest entry in the oplog for host " + host_id + " (" + replstatus["setname"] + ") is older than " + WARNING_REPL_WINDOW + " seconds. Determining the replication window would be unreliable.";
              advice.setSeverity(Ok);
              advice.setJustification("");
            }
            else {
                // Check if any of the hosts is within the oplog window
                if(replwindow_node < CRITICAL_REPL_WINDOW) {
                    advice.setSeverity(Critical);
                    msg = ADVICE_CRITICAL + "Host " + host_id + " (" + replstatus["setname"] + ") has a replication window of " + replwindow_node + " seconds.";
                    advice.setJustification(JUSTIFICATION_CRITICAL);
                } else {
                    if(replwindow_node < WARNING_REPL_WINDOW)
                    {
                        advice.setSeverity(Warning);
                        msg = ADVICE_WARNING + "Host " + host_id + " (" + replstatus["setname"] + ") has a replication window of " + replwindow_node + " seconds.";
                        advice.setJustification(JUSTIFICATION_WARNING);
                    } else {
                        msg = "The replication window for node " + host_id + " (" + replstatus["setname"] + ") is long enough.";
                        advice.setSeverity(Ok);
                        advice.setJustification("");
                    }
                }
            }

            advice.setHost(host);
            advice.setTitle(TITLE);
            advice.setAdvice(msg);
            advisorMap[i]= advice;
        }
    }
    return advisorMap;
}

function getReplicationStatus(host, primary_id) {
    var node_status = {};
    var res = host.executeMongoQuery("admin", "{ replSetGetStatus: 1 }");
    // Fetch the optime and uptime per host
    for(i = 0; i < res["result"]["members"].size(); i++)
    {
        tmp = res["result"]["members"][i];
        node_status[i] = {};
        node_status[i]["name"] = tmp["name"];;
        node_status[i]["primary"] = primary_id;
        node_status[i]["setname"] = res["result"]["set"];
        node_status[i]["uptime"] = tmp["uptime"];
        node_status[i]["optime"] = tmp["optime"]["ts"]["$timestamp"]["t"];
    }
    return node_status;
}

function getReplicationWindow(host) {
  var replwindow = {};
  replwindow['newset'] = false;
  // Fetch the first and last records from the Oplog and take their timestamps
  var res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: 1}, limit: 1}');
  replwindow['first'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  if (res["result"]["cursor"]["firstBatch"][0]["o"]["msg"] == "initiating set") {
      replwindow['newset'] = true;
  }
  res = host.executeMongoQuery("local", '{find: "oplog.rs", sort: { $natural: -1}, limit: 1}');
  replwindow['last'] = res["result"]["cursor"]["firstBatch"][0]["ts"]["$timestamp"]["t"];
  replwindow['replwindow'] = replwindow['last'] - replwindow['first'];
  return replwindow;
}

High Availability on a Shoestring Budget - Deploying a Minimal Two Node MySQL Galera Cluster


We regularly get questions about how to set up a Galera cluster with just 2 nodes. The documentation clearly states you should have at least 3 Galera nodes to avoid network partitioning. But there are some valid reasons for considering a 2 node deployment, e.g., if you want to achieve database high availability but have a limited budget to spend on a third database node. Or perhaps you are running Galera in a development/sandbox environment and prefer a minimal setup.

Galera implements a quorum-based algorithm to select a primary component through which it enforces consistency. The primary component needs to have a majority of votes, so in a 2 node system there would be no majority, resulting in split brain. Fortunately, it is possible to add a garbd (Galera Arbitrator Daemon), which is a lightweight stateless daemon that can act as the odd node. Arbitrator failure does not affect cluster operations and a new instance can be reattached to the cluster at any time. There can be several arbitrators in the cluster.
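
For reference, garbd can also be started by hand on any third host outside ClusterControl. A minimal sketch is shown below; the node addresses, the cluster name my_galera and the log path are illustrative assumptions, so adjust them to your own wsrep_cluster_address and wsrep_cluster_name:

$ garbd --address gcomm://192.168.1.11:4567,192.168.1.12:4567 \
        --group my_galera \
        --log /var/log/garbd.log \
        --daemon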

ClusterControl has support for deploying garbd on non-database hosts.

Normally a Galera cluster needs at least three hosts to be fully functional; however, at deploy time two nodes suffice to create a primary component. Here are the steps:

  1. Deploy a Galera cluster of two nodes,
  2. After the cluster has been deployed by ClusterControl, add garbd on the ClusterControl node.

You should end up with the below setup:

Deploy the Galera Cluster

Go to the ClusterControl deploy wizard to deploy the cluster.

Even though ClusterControl warns you a Galera cluster needs an odd number of nodes, only add two nodes to the cluster.

Deploying a Galera cluster will trigger a ClusterControl job which can be monitored at the Jobs page.

Install Garbd

Once deployment is complete, install garbd on the ClusterControl host. You will find it under Manage -> Load Balancer:

Installing garbd will trigger a ClusterControl job which can be monitored at the Jobs page. Once completed, you can verify garbd is running with a green tick icon at the top bar:
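
If you prefer to double-check from the command line, the checks below should show a garbd process on the ClusterControl host and a cluster size of 3 (two data nodes plus the arbitrator); the database node address and root credentials are placeholders:

$ ps -C garbd -o pid,args
$ mysql -h 192.168.1.11 -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'"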

That’s it. Our minimal two-node Galera cluster is now ready!

We’re keeping the tills ringing at eCommerce platform vidaXL


ClusterControl helps vidaXL compete with the world's largest e-commerce platforms by managing its MongoDB & MySQL databases.

Press Release: everywhere around the world, November 9th 2016 - today we announced vidaXL, an international eCommerce platform where you can “live it up for less”, as our latest customer. ClusterControl was deployed to help manage vidaXL’s polyglot database architecture, which consists of SQL and NoSQL database solutions to handle specific tasks within the enterprise.

vidaXL caters to the product hunters, offering items for inside and outside the home at competitive prices. With a catalogue of currently over 20,000 products to choose from and selling directly in 29 countries, it has a huge task of managing and updating the database its consumers rely on to fulfil their orders. With 200,000 orders monthly, vidaXL is one of the largest international e-retailers.

The eCommerce company is growing and it has an aim of expanding its product catalogue to over 10,000,000 items within the next 12 months. This extremely large selection of goods creates a wealth of new data; images alone in the catalogue account for roughly 100 terabytes of data, and the product rows for between one and two terabytes. The increase in data would originally have required vidaXL to hire more database administrators (DBAs), but it searched for a cost-effective solution instead.

ClusterControl was deployed to manage the database systems. As scaling was an issue for vidaXL, particularly the horizontal scaling of its servers, ClusterControl as a single platform replaced the need for a combination of tools and the sometimes unreliable command line control. The ClusterControl deployment took around one week to implement, with no extra support required from Severalnines.

ClusterControl is easily integrated within a polyglot framework, managing different databases with the same efficiency. vidaXL is using several different databases: MongoDB and MySQL for product and customer listings, along with ElasticSearch for its real-time search capabilities. ClusterControl was plugged in to automate management and give control over scaling of MongoDB and MySQL. The operations team also leveraged it for proactive reporting.

Zeger Knops, Head of Business Technology, vidaXL said, “We’re looking to grow exponentially in the near future with the products we offer and maintain our position as the world’s largest eCommerce operator. This means we cannot suffer any online outages which lead to a loss of revenue. Scaling from thousands to millions of products is a giant leap and that will require us to have a strong infrastructure foundation. Our back-end is reliant on different databases to tackle different tasks. Using several different tools, rather than a one-stop shop, was detrimental to our productivity. Severalnines is that “shop” and we haven’t looked back. It’s an awesome solution like no other.”

Vinay Joosery, Severalnines CEO, added, “As we head towards the busy end of the year for retailers with Cyber Monday just around the corner, a product catalogue of VidaXL’s size requires strong database management skills and technologies. Keeping operations online and supplying people with their required orders is key. We trust that VidaXL will continue to reap the benefits of ClusterControl as it grows.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skill levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular ClusterControl product. Its customers currently include BT, Orange, Cisco, CNRS, Technicolor, AVG, Ping Identity and Paytrail. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today, visit http://www.severalnines.com/company.

Tips and Tricks: Receive email notifications from ClusterControl


As sysadmins and DBAs, we need to be notified whenever something critical happens to our database. But would it not be nicer if we were informed in advance, and still had time to perform pre-emptive maintenance and retain high availability? Being informed about anomalies or anything that may degrade cluster health and performance is key. In this tips and tricks post, we will explain how you can set up email notifications in ClusterControl and stay up to date with your cluster state.

Email notification types in ClusterControl

First we will explain the two types of email notifications that ClusterControl can send. The normal notifications will be sent instantly, once an alert is triggered or an important event occurs. This instant mail type (deliver) is necessary if you wish to immediately receive critical or warning notifications that require swift action.

The other type is called digest, where ClusterControl accumulates all notifications and sends them once a day in a single email at a preset time. Informational and warning notifications that do not need immediate action are best sent via the digest email.

Then there is a third option: not to send a notification and ignore the message. This, obviously, should only be configured if you are absolutely certain you don’t wish to receive this type of notification.

Setting up email notifications per user

There are two methods for setting up email notifications in ClusterControl. The first one lets you set email notifications at the user level. Go to Settings > Email Notifications.

Here you can select an existing user and load its current settings. You can change the time at which digest emails are sent and, to prevent ClusterControl from sending too many emails, set a limit for non-digest emails. Be careful: if you set this limit too low, you will no longer receive notifications for the remainder of the day! Setting it to -1 makes it unlimited. Per alarm/event category, the email notifications can be set to the notification type necessary.

Keep in mind that this setting is at a global level, so it applies to all clusters.

Setting up email notifications per cluster

On the cluster level, the notifications can be set for both users and additional email addresses. This interface can be found via Cluster > Settings > General Settings > Email Notifications.

Here you can select an existing user/email address and load its current settings. You can change the time at which digest emails are sent and, to prevent ClusterControl from sending too many emails, set a limit for non-digest emails. Again, if you set this limit too low, you will no longer receive notifications for the remainder of the day! Setting it to -1 makes it unlimited. Per alarm/event category, the email notifications can be set to the notification type necessary.

Keep in mind that these settings are cluster-specific, so they only change settings for the selected cluster.

Adding and removing email addresses

Apart from defining the email notification settings, you can also add new email addresses by clicking on the plus (+) button. This can be handy if you wish to send notifications to, for example, a distribution list inside your company.

To remove an email address, select the address that needs removing and click the minus (-) button.

Configuring the mail server

To be able to send email, you need to tell ClusterControl how to send emails. There are two options: via sendmail or via an SMTP server.

When you make use of sendmail, the server where you have installed ClusterControl should have a local command line mail client installed. ClusterControl will send its email using the -r option to set the from-address. As sendmail may not deliver your email reliably, the recommended method of sending email is via an SMTP server.

If you decide to use an SMTP server instead, you may need to authenticate against this server. Check with your hosting provider if this is required.

Once set in the first cluster, the mail server settings will be carried over to any new cluster created.

Sending a test email

In the Configure Mail Server interface, you can also send a test email. This will create a backend job that sends an email to all recipients configured for this cluster under Email Notification Settings.

Troubleshooting

If your test email is not arriving and you have set your mail server settings to sendmail, you can verify that sendmail works from the ClusterControl host.

CMON log files

You can check the CMON log files to see whether the email has been sent.

In /var/log/cmon_<clusterid>.log, you should see something similar to this:

2016-12-09 12:44:11 : (INFO) Executing email job.
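
A quick way to filter for mail-related lines is to grep the log file; the cluster id (1) in the file name below is just an example:

$ grep -i mail /var/log/cmon_1.log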

If you see a log line like this, you may want to increase the daily message limit:

2016-12-09 12:44:47 : (WARNING) Refusing to send more than 10 messages daily to 'mailto://you@yourcompany.com'

As said earlier: if the message limit has been reached, you will no longer receive notifications.

A message about the -r option indicates that your mail client does not support setting the from-header:

2016-12-09 12:44:17 : (WARNING) mail command doesn't support -r SENDER argument, retrying without that.

You can follow this support article to learn which packages to install.

Sendmail log files

You can also check the local sendmail log files (/var/log/maillog) and see if your email gets delivered. A typical sendmail connection flow looks like the following:

Dec  9 17:36:41 localhost sendmail[24529]: uB9HafLM024529: from=clustercontrol@yourcompany.com, size=326, class=0, nrcpts=1, msgid=<584aeba9.9LBxfOatDgnTC+vm%clustercontrol@yourcompany.com>, relay=root@localhost
Dec  9 17:36:41 localhost postfix/smtpd[24530]: connect from n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/smtpd[24530]: 2C0AF4094CF9: client=n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/cleanup[24533]: 2C0AF4094CF9: message-id=<584aeba9.9LBxfOatDgnTC+vm%clustercontrol@yourcompany.com>
Dec  9 17:36:41 localhost sendmail[24529]: uB9HafLM024529: to=you@yourcompany.com, ctladdr=clustercontrol@yourcompany.com (0/0), delay=00:00:00, xdelay=00:00:00, mailer=relay, pri=30326, relay=[127.0.0.1] [127.0.0.1], dsn=2.0.0, stat=Sent (Ok: queued as 2C0AF4094CF9)
Dec  9 17:36:41 localhost postfix/qmgr[1256]: 2C0AF4094CF9: from=<clustercontrol@yourcompany.com>, size=669, nrcpt=1 (queue active)
Dec  9 17:36:41 localhost postfix/smtpd[24530]: disconnect from n1[127.0.0.1]
Dec  9 17:36:41 localhost postfix/smtp[24534]: 2C0AF4094CF9: to=<you@yourcompany.com>, relay=mail.yourcompany.com[94.142.240.10]:25, delay=0.38, delays=0.05/0.02/0.08/0.24, dsn=2.0.0, status=sent (250 OK id=1cFP69-0002Ns-Db)

If these entries are not found in the log file, you can increase Sendmail's log level.

Command line email

A final check would be to run the mail command and see if that arrives:

echo "test message" | mail -r youremail@yourcompany.com -s "test subject" youremail@yourcompany.com

If the message sent from the command line arrives but the ClusterControl message does not, it may be because the from-email address has not been set in ClusterControl. ClusterControl will then send the email from the default user on the system. If the hostname of the ClusterControl host is not set to a fully qualified domain name, your email server may refuse emails coming from an unqualified domain name or a non-existent user.
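
A quick way to check the hostname, and on systemd-based systems to correct it, is sketched below; the FQDN used here is a placeholder:

$ hostname --fqdn
$ hostnamectl set-hostname clustercontrol.yourcompany.com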

We hope these tips help you configure notifications in ClusterControl.

New whitepaper - the DevOps Guide to database backups for MySQL and MariaDB


This week we’re happy to announce that our new DevOps Guide to Database Backups for MySQL & MariaDB is now available for download (free)!

This guide discusses in detail the two most popular backup utilities available for MySQL and MariaDB, namely mysqldump and Percona XtraBackup.
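
To give a flavour of the two tools, typical invocations look roughly like the following; the backup paths, user and password are placeholders, and the options you actually need are discussed in the guide:

$ mysqldump --single-transaction --routines --triggers --events --all-databases > /backups/full_dump.sql
$ xtrabackup --backup --user=backupuser --password=secret --target-dir=/backups/base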

The guide covers topics such as how database features like binary logging and replication can be leveraged in backup strategies, and it provides best practices that can be applied to high availability topologies in order to make database backups reliable, secure and consistent.

Ensuring that backups are performed, so that a database can be restored if disaster strikes, is a key operational aspect of database management. The DBA or System Administrator is usually the party responsible for ensuring that the data is protected, consistent and reliable. More crucially still, backups are an important part of any disaster recovery strategy for businesses.

So if you’re looking for insight into how to perform database backups efficiently, want to understand the impact of the storage engine on MySQL or MariaDB backup procedures, or need some tips & tricks on MySQL / MariaDB backup management … our new DevOps Guide has you covered.

Tips and Tricks - How to shard MySQL with ProxySQL in ClusterControl


Having too large a (write) workload on a master is dangerous. If the master collapses and a failover happens to one of its slave nodes, the slave node could collapse under the write pressure as well. To mitigate this problem you can shard horizontally across more nodes.

Sharding increases the complexity of data storage though, and very often it requires an overhaul of the application. In some cases, it may be impossible to make changes to an application. Luckily there is a simpler solution: functional sharding. With functional sharding, you move a schema or table to another master, thus relieving the original master of the workload generated by these schemas or tables.

In this Tips & Tricks post, we will explain how you can functionally shard your existing master and offload some of its workload to another master. We will use ClusterControl, MySQL replication and ProxySQL to make this happen, and the whole exercise should not take longer than 15 minutes. Mission impossible? :-)

The example database

In our example we have a serious issue with the workload on our simple order database, accessed by the so_user. The majority of the writes are happening on two tables: orders and order_status_log. Every change to an order will write to both the orders table and the status log table.

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `customer_id` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `total_vat` decimal(15,2) DEFAULT '0.00',
  `total` decimal(15,2) DEFAULT '0.00',
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `customers` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `firstname` varchar(15) NOT NULL,
  `surname` varchar(80) NOT NULL,
  `address` varchar(255) NOT NULL,
  `postalcode` varchar(6) NOT NULL,
  `city` varchar(50) NOT NULL,
  `state` varchar(50) NOT NULL,
  `country` varchar(50) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

What we will do is to move the order_status_log table to another master.

As you might have noticed, there is no foreign key defined on the order_status_log table. Foreign keys simply would not work across functional shards. Joining the order_status_log table with any other table would no longer work either, as it will physically be on a different server than the other tables. And if you write transactional data to multiple tables, a rollback will only work on one of the masters. If you wish to retain these properties, you should consider using homogeneous sharding instead, where you keep related data grouped together in the same shard.

Installing the Replication setups

First, we will install a replication setup in ClusterControl. The topology in our example is really basic: we deploy one master and one replica:

But you could import your own existing replication topology into ClusterControl as well.

After the setup has been deployed, deploy the second setup:

While waiting for the second setup to be deployed, we will add ProxySQL to the first replication setup:

Adding the second setup to ProxySQL

After ProxySQL has been deployed, we can connect to it via the command line and see its currently configured servers and settings.
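
The ProxySQL admin interface typically listens on port 6032; the credentials below are ProxySQL's defaults (admin/admin) and may differ in a ClusterControl deployment:

$ mysql -u admin -padmin -h 127.0.0.1 -P 6032

Once connected, the currently configured servers and query rules look like this: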

MySQL [(none)]> select hostgroup_id, hostname, port, status, comment from mysql_servers;
+--------------+-------------+------+--------+-----------------------+
| hostgroup_id | hostname    | port | status | comment               |
+--------------+-------------+------+--------+-----------------------+
| 20           | 10.10.36.11 | 3306 | ONLINE | read server           |
| 20           | 10.10.36.12 | 3306 | ONLINE | read server           |
| 10           | 10.10.36.11 | 3306 | ONLINE | read and write server |
+--------------+-------------+------+--------+-----------------------+
MySQL [(none)]> select rule_id, active, username, schemaname, match_pattern, destination_hostgroup from mysql_query_rules;
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| rule_id | active | username | schemaname | match_pattern                                           | destination_hostgroup |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+
| 100     | 1      | NULL     | NULL       | ^SELECT .* FOR UPDATE                                   | 10                    |
| 200     | 1      | NULL     | NULL       | ^SELECT .*                                              | 20                    |
| 300     | 1      | NULL     | NULL       | .*                                                      | 10                    |
+---------+--------+----------+------------+---------------------------------------------------------+-----------------------+

As you can see, ProxySQL has been configured with the ClusterControl default read/write splitter for our first cluster. Any basic select query will be routed to hostgroup 20 (read pool) while all other queries will be routed to hostgroup 10 (master). What is missing here is the information about the second cluster, so we will add the hosts of the second cluster first:

MySQL [(none)]> INSERT INTO mysql_servers VALUES (30, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server'), (30, '10.10.36.14', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read server');
Query OK, 2 rows affected (0.00 sec) 
MySQL [(none)]> INSERT INTO mysql_servers VALUES (40, '10.10.36.13', 3306, 'ONLINE', 1, 0, 100, 10, 0, 0, 'Second repl setup read and write server');
Query OK, 1 row affected (0.00 sec)

After this, we need to load the servers into the ProxySQL runtime tables and store the configuration to disk:

MySQL [(none)]> LOAD MYSQL SERVERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL SERVERS TO DISK;
Query OK, 0 rows affected (0.01 sec)

As ProxySQL also handles authentication for the clients, we need to add the so_user user to ProxySQL to allow the application to connect through ProxySQL:

MySQL [(none)]> INSERT INTO mysql_users (username, password, active, default_hostgroup, default_schema) VALUES ('so_user', 'so_pass', 1, 10, 'simple_orders');
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL USERS TO RUNTIME;
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL USERS TO DISK;
Query OK, 0 rows affected (0.00 sec)

Now we have added the second cluster and the user to ProxySQL. Keep in mind that normally in ClusterControl the two clusters are considered two separate entities. ProxySQL will remain part of the first cluster: even though it is now configured for the second cluster, it will only be displayed under the first cluster.

Mirroring the data

Keep in mind that mirroring queries in ProxySQL is still a beta feature, and it doesn’t guarantee the mirrored queries will actually be executed. We have found it to work fine within the boundaries of this use case. There are also (better) alternatives to our example here, where you would restore a backup on the new cluster and replicate from the master until you make the switch. We will describe this scenario in a follow-up Tips & Tricks blog post.

Now that we have added the second cluster, we need to create the simple_orders database, the order_status_log table and the appropriate users on the master of the second cluster:

mysql> create database simple_orders;
Query OK, 1 row affected (0.01 sec)
mysql> use simple_orders;
Database changed
mysql> CREATE TABLE `order_status_log` (
  `orderId` int(11) NOT NULL,
  `status` varchar(14) DEFAULT 'created',
  `changeTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `logline` text,
  PRIMARY KEY (`orderId`, `status`, `changeTime` )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> create user 'so_user'@'10.10.36.15' identified by 'so_pass';
Query OK, 0 rows affected (0.00 sec)
mysql> grant select, update, delete, insert on simple_orders.* to 'so_user'@'10.10.36.15';
Query OK, 0 rows affected (0.00 sec)

This enables us to start mirroring the queries executed against the first cluster onto the second cluster. This requires an additional query rule to be defined in ProxySQL:

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, mirror_hostgroup, apply) VALUES (50, 1, 'so_user', 'simple_orders', '(^INSERT INTO|^REPLACE INTO|^UPDATE|INTO TABLE) order_status_log', 10, 40, 1);
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)

With this rule, ProxySQL will match everything that writes to the order_status_log table and, in addition to sending it to its normal destination, mirror it to hostgroup 40 (the write server of the second cluster).

Now that we have started mirroring the queries, the backfill of the data from the first cluster can take place. You can use the timestamp of the first entry in the new order_status_log table on the second cluster to determine the time we started to mirror.
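
One way to do the backfill is sketched below, assuming the first cluster's master is 10.10.36.11 and the second cluster's master is 10.10.36.13 as in this example; the cut-off timestamp and the credentials are placeholders. First read the earliest mirrored entry on the second cluster, then dump only the older rows from the first cluster and load them into the second:

$ mysql -h 10.10.36.13 -u so_user -p -e "SELECT MIN(changeTime) FROM simple_orders.order_status_log"
$ mysqldump -h 10.10.36.11 -u so_user -p --no-create-info \
    --where="changeTime < '2016-12-09 12:00:00'" simple_orders order_status_log | \
  mysql -h 10.10.36.13 -u so_user -p simple_orders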

Once the data has been backfilled, we can reconfigure ProxySQL to perform all actions on the order_status_log table on the second cluster. This is a two-step approach: first add new rules that route read queries on this table to the second cluster’s read servers, with the exception of SELECT … FOR UPDATE queries, which must go to its master. Then modify our mirroring rule to stop mirroring and only write to the second cluster.

MySQL [(none)]> INSERT INTO mysql_query_rules (rule_id, active, username, schemaname, match_pattern, destination_hostgroup, apply) VALUES (70, 1, 'so_user', 'simple_orders', '^SELECT .* FROM order_status_log', 30, 1), (60, 1, 'so_user', 'simple_orders', '^SELECT .* FROM order_status_log .* FOR UPDATE', 40, 1);
Query OK, 2 rows affected (0.00 sec)
MySQL [(none)]> UPDATE mysql_query_rules SET destination_hostgroup=40, mirror_hostgroup=NULL WHERE rule_id=50;
Query OK, 1 row affected (0.00 sec)

And don’t forget to activate and persist the new query rules:

MySQL [(none)]> LOAD MYSQL QUERY RULES TO RUNTIME;
Query OK, 1 row affected (0.00 sec)
MySQL [(none)]> SAVE MYSQL QUERY RULES TO DISK;
Query OK, 0 rows affected (0.05 sec)

After this final step we should see the workload drop on the first cluster, and increase on the second cluster. Mission possible and accomplished. Happy clustering!
