MySQL replication status alerts with bash script

In theory, MySQL replication should work without problems even if the servers are restarted, databases created and removed, data flowing in and out… Sometimes, however, there are errors that are breaking the replication process. It is good to know once such situation appears.

Understanding the basics of replication status messages

The main method to check the replication status is to execute SHOW SLAVE STATUS \G on the replication slave server. The “\G” switch changes the result displaying layout – from the horizontal table to the vertical one (there is one row only, so it is much easier to review this way).

The main elements of the output we should check are Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master. The first two should read “Yes” and the third one should be a reasonable value (in seconds). If Slave_IO_Running or Slave_SQL_Running are not set to “Yes”, you can review values in Last_Error, Last_IO_Error, and Last_SQL_Error. This should give you the clue on what is going on.

When talking of Seconds_Behind_Master – this value may vary. For most of the small installations, it should be almost always zero or near zero. For bigger servers or a significant amount of data being inserted or adjusted, it can reach higher values. The number is provided in seconds. In most cases, I’m using 300 as the border value between “OK” state and “something wrong”. We will use this in the script below.

Security

If your MySQL server is allowing you to connect from the command line without a password, you can skip this step (but it is still recommended from the security point of view). For remote servers or the servers that require a password to connect from localhost, you should create a separate user only for this purpose. Such a user will be granted only necessary privileges.

To create such user execute the commands in MySQL:

CREATE USER 'replicationStatus'@'localhost' IDENTIFIED BY 'password';

GRANT SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'replicationStatus'@'localhost';

The new user will be able to check the status of the replication, show database names but will not be able to retrieve or remove the data. Of course, in the example above you should replace the password with something more complicated.

Checking replication status

Here is the script:

#!/bin/bash
#
# Set the maximum number of seconds behind master that will be ignored. 
# If the slave is be more than maximumSecondsBehind, an email will be sent. 
#
maximumSecondsBehind=300

#
# Checking MySQL replication status...
#
mysql -u replicationStatusLocal -p'password' -e 'SHOW SLAVE STATUS \G' | grep 'Running:\|Master:\|Error:' > replicationStatus.txt


#
# displaying results, just in case you want to see them 
#
echo "Results:"
cat replicationStatus.txt


#
# checking parameters
#
slaveRunning="$(cat replicationStatus.txt | grep "Slave_IO_Running: Yes" | wc -l)"
slaveSQLRunning="$(cat replicationStatus.txt | grep "Slave_SQL_Running: Yes" | wc -l)"
secondsBehind="$(cat replicationStatus.txt | grep "Seconds_Behind_Master" | tr -dc '0-9')"


#
# Sending email if needed
#
if [[ $slaveRunning != 1 || $slaveSQLRunning != 1 || $secondsBehind -gt $maximumSecondsBehind ]]; then
  echo ""
  echo "Sending email"
  mail -s "MyServer.com - replication issue found" [email protected] < replicationStatus.txt 
else
  echo ""
  echo "Replication looks fine."
fi

You can adjust the maximumSecondsBehind to the value of your choice. If you will notice that the alert is triggered too frequently, you can change this value to a higher number. You should also set the proper password in the MySQL execution string. As the last step, you should adjust an email subject and address in mail execution part.

The results of the status check are saved to the text file because I’m using it also for other reports. This allows me also to easily manipulate the values and check them using simple grep. If you need a more complex report or other values to be checked, you can simply change the grep filter on the MySQL execution string and the processing of the values.

25 Replies to “MySQL replication status alerts with bash script”

  1. $secondsBehind > $maximumSecondsBehind

    should be

    $secondsBehind -gt $maximumSecondsBehind

    1. Hello BashProfi,

      Interesting thought. Why do you suggest such a change? I used double brackets [[ ]] on purpose so I can use > sign safely. Maybe there is something I’m not aware of, can you explain it to me?

      1. Hi dulare, first: great post! many thanks.
        Secondly: the > does not work to me, be it does the -gt

        1. Daniel, thank you for the comment. I will adjust the code above to use -gt instead of >

  2. working for me just not able to execute using crontab.. did all the environment settings but no luck;

    Thanks for the script;

  3. I am not getting emails
    Slave_IO_Running: No
    Slave_SQL_Running: No
    Seconds_Behind_Master: NULL
    Last_SQL_Error:
    notification.sh: 25: [[: not found
    notification.sh: 25: 0: not found
    notification.sh: 25: -gt: not found

    1. Can you please paste the contents of your “notification.sh” file EXCLUDING sensitive information such as password? It looks like there is an error in the script.

      1. #
        # Set the maximum number of seconds behind master that will be ignored.
        # If the slave is be more than maximumSecondsBehind, an email will be sent.
        #
        maximumSecondsBehind=300
        #
        # Checking MySQL replication status…
        #
        mysql -u root -p’*****’ -e ‘SHOW SLAVE STATUS FOR CHANNEL “source_1″\G’ | grep ‘Running:\|Master:\| Last_SQL_Error:’ > replicationStatus.txt
        #
        # displaying results, just in case you want to see them
        #
        echo “Results:”
        cat replicationStatus.txt
        #
        # checking parameters
        #
        slaveRunning=”$(cat replicationStatus.txt | grep “Slave_IO_Running: Yes” | wc -l)”
        slaveSQLRunning=”$(cat replicationStatus.txt | grep “Slave_SQL_Running: Yes” | wc -l)”
        secondsBehind=”$(cat replicationStatus.txt | grep “Seconds_Behind_Master” | tr -dc ‘0-9’)”
        #
        # Sending email if needed
        #
        if [[ $slaveRunning != 1 || $slaveSQLRunning != 1 || $secondsBehind -gt $maximumSecondsBehind ]]; then
        echo “”
        echo “Sending email”
        mail -s “MyServer.com – replication issue found” [email protected] < replicationStatus.txt
        else
        echo ""
        echo "Replication looks fine."
        fi

        1. It looks like you are using various types of apostrophes and quotes.
          Can you please change all apostrophes to ‘ and quotes to ” and check again?

          1. I have simply copied your content and changed the values. Nothing else I did. Please help

    1. I sent you an email, please take a look also at spam folder.

  4. I am getting this after your suggestion to
    change all apostrophes to ‘ and quotes to ”

    grep: Last_SQL_Error:‘: No such file or directory
    mysql: [Warning] Using a password on the command line interface can be insecure.
    ”Results:”
    grep: Yes”: No such file or directory
    grep: Yes”: No such file or directory
    notification.sh: 25: [[: not found
    notification.sh: 25: ”0”: not found
    notification.sh: 25: ””: not found
    ””
    ”Replication looks fine.”

  5. Hey,
    Thanks for helping me out. Code is perfect.. however the issue was :
    – using sh instead of bash
    – mail engine was not installed
    Sorry for trouble.
    Best,
    Varsha

    1. I’m glad that it worked 🙂 no need to be sorry, we are all learning 🙂

  6. FYI – a new (10.5.9 birthversion) SLAVE MONITOR privilege allows SHOW SLAVE STATUS, SHOW RELAYLOG EVENTS

Comments are closed.