We use DRBD at work on several CentOS 5.x nodes to replicate data between our two computer rooms (in different buildings but linked with Gigabit fiber). You can already find out when something goes wrong at the DRBD level if you have configured the correct 'handlers' and the appropriate notification scripts (have a look, for example, at the Split Brain notification script). Those scripts are 'cool', but what if you could 'plumb' the DRBD status into your actual monitoring solution?

We use Zabbix at $work and I was asked to centralize events from different sources, but Zabbix doesn't directly support monitoring DRBD devices. One of the cool things with Zabbix is that it's like a Lego system: you can extend what it does if you know what to query and how to do it. If you want to monitor DRBD devices, the best that Zabbix can do (on the agent side, when the Zabbix agent runs as a simple zabbix user with /sbin/nologin as shell) is to query and parse /proc/drbd. So here we go: we need to modify the Zabbix agent configuration to use Flexible User Parameters, like this (in /etc/zabbix/zabbix_agentd.conf):
UserParameter=drbd.cstate[*],cat /proc/drbd |grep $1:|tr [:blank:] \\n|grep cs|cut -f 2 -d ':'|grep Connected |wc -l
UserParameter=drbd.dstate[*],cat /proc/drbd |grep $1:|tr [:blank:] \\n|grep ds|cut -f 2 -d ':'|cut -f 1 -d '/'|grep UpToDate|wc -l
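If you want to see what those two keys would return before touching the agent, you can run the same pipelines by hand. Here is a minimal sketch, not the agent's own code: the function names (sample_status, drbd_cstate, drbd_dstate) are made up for this example, the status text is a hand-written imitation of what /proc/drbd looks like on DRBD 8.x (your output may differ), and `grep -c` is used in place of `grep ... | wc -l` to get the same count.

```shell
#!/bin/sh
# Assumed sample of /proc/drbd (hand-written, DRBD 8.x style): one healthy
# device (0) and one device waiting for its peer (1).
sample_status() {
cat <<'EOF'
version: 8.3.8 (api:88/proto:86-94)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
EOF
}

# Hypothetical helper: prints 1 if device $1 has cs:Connected, 0 otherwise.
# Reads the status text on stdin.
drbd_cstate() {
    grep "$1:" | tr '[:blank:]' '\n' | grep cs \
        | cut -f 2 -d ':' | grep -c Connected
}

# Hypothetical helper: prints 1 if the local disk of device $1 is UpToDate
# (the part of ds: before the '/'), 0 otherwise. Reads stdin as well.
drbd_dstate() {
    grep "$1:" | tr '[:blank:]' '\n' | grep ds \
        | cut -f 2 -d ':' | cut -f 1 -d '/' | grep -c UpToDate
}

echo "drbd.cstate[0] = $(sample_status | drbd_cstate 0)"
echo "drbd.cstate[1] = $(sample_status | drbd_cstate 1)"
echo "drbd.dstate[0] = $(sample_status | drbd_dstate 0)"
echo "drbd.dstate[1] = $(sample_status | drbd_dstate 1)"
```

On a real node you would feed the live file instead, for example `drbd_cstate 0 < /proc/drbd`. Once the agent has been restarted with the UserParameter lines, the same values should come back from the server side with something like `zabbix_get -s <agent-host> -k 'drbd.cstate[0]'` (assuming zabbix_get is installed there).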
With these two keys, the agent reports the actual Connection State (cs) and Disk State (ds) to the Zabbix server: each key returns 1 when the state is the expected one (Connected for cs, UpToDate for the local disk in ds) and 0 otherwise. On the server side we then just need to create the Application, Items and Triggers .. but what if we could create a Zabbix Template instead, so that we can simply link that template to a DRBD host? I attach to this post the DRBD Zabbix template (an XML file that you can import into your Zabbix setup), and you can just link it to your DRBD hosts. Here is the link. That XML file contains two Items (cstate and dstate) and the associated triggers. Of course you can extend it, especially if you use multiple resources / DRBD disks. Because we used Flexible parameters, you can, for example, create a new Zabbix item (based on the ones in the template) and monitor the /dev/drbd1 device just by using the drbd.dstate[1] key in that item.
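If a node carries several DRBD devices, it can help to list which item keys you would have to create. The following is only a sketch under assumptions: drbd_minors and drbd_item_keys are hypothetical helper names, and the sample text is again a hand-written imitation of /proc/drbd rather than real output.

```shell
#!/bin/sh
# Assumed sample of /proc/drbd (hand-written, DRBD 8.x style) with two devices.
sample_status() {
cat <<'EOF'
version: 8.3.8 (api:88/proto:86-94)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
EOF
}

# Hypothetical helper: prints the minor number of every device line
# (lines shaped like " N: cs:...") found on stdin, one per line.
drbd_minors() {
    sed -n 's/^ *\([0-9][0-9]*\): cs:.*/\1/p'
}

# Hypothetical helper: prints the two Zabbix item keys for each device.
drbd_item_keys() {
    drbd_minors | while read -r minor; do
        printf 'drbd.cstate[%s]\ndrbd.dstate[%s]\n' "$minor" "$minor"
    done
}

sample_status | drbd_item_keys
```

Run against the live file (`drbd_item_keys < /proc/drbd`), this gives you the list of keys to paste into new items based on the template ones.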
Happy Monitoring and DRBD'ing ...