Archive for March, 2010
More RAID tidbits – Monitoring all RAID events and changing the default email template
A geek really knows the importance of data, and of the backups that save you from pulling your hair out! When one of the hard drives in my server died after a well-served life of 6000+ hours, I counted myself lucky that the other member of the RAID1 array came to the rescue. The cause was perhaps a short circuit, which could have cost me the biggest data loss of my life, so a blazing smile was well deserved. Electric power is one of the countless things that doesn’t work here the way it should (oh, it’s a long story; I should tell some of it sometime later)!
I got an email from mdmonitor telling me about a DegradedArray event. While I was rebuilding the array, though, I noticed that I got no alerts about the rebuild progress or array status changes, which I wanted to investigate. Until then, I didn’t even know that ‘mdadm --monitor’ only emails you about critical events. So I pulled up the man page and saw that these are the critical events:
- DeviceDisappeared
- Fail
- FailSpare
- DegradedArray
The rest of the events are not reported at all! Also, RHEL5’s mdadm package has a compiled-in template for the email that mdadm sends when a critical event occurs, which I wanted to change as well because it looks pretty immature:
This is an automatically generated mail message from mdadm running on HOSTNAME

A DegradedArray event had been detected on md device /dev/md1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

bla bla bla
Seriously, it says “faithfully”… wth? Lol. We know that all machines are faithful to a human unless they’re broken!
It definitely needed to be changed. Checking /etc/init.d/mdmonitor at least made it clear that the template is not something you can change: mdadm uses the default template whenever MAILADDR is specified. When the PROGRAM parameter is used in /etc/mdadm.conf instead, mdadm runs the given script and passes it the event name and the RAID array (plus the component device, where relevant) as arguments.
So this is what I did:
# mdadm --detail --scan >> /etc/mdadm.conf
# echo "PROGRAM /etc/raidalerter" >> /etc/mdadm.conf
# sed -e '1i\DEVICE partitions' -i /etc/mdadm.conf
# cat /etc/raidalerter (create this file with below script)
#!/bin/bash
echo -e "Likely an unfavourable or a bad thing just happened to your RAID. Even if its recovering, it was a bad thing which caused this! \n\n\n" $(cat -A /proc/mdstat | sed 's/\$/\\n/g') | mail -s "$1 on $2 $3 at $HOSTNAME" some-mail-address@example.com
# chmod +x /etc/raidalerter
# service mdmonitor restart
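For anyone who wants a tidier handler, here is a rough sketch of the same idea, not a drop-in replacement. It assumes mdadm calls the PROGRAM with the event as $1, the md device as $2 and, for some events, the component device as $3. The function names alert_subject and alert_body, and the reworded message, are my own inventions for illustration; the mail address is the same placeholder as above.

```shell
#!/bin/bash
# Hypothetical variant of /etc/raidalerter; function names are illustrative only.
# Assumed arguments from mdadm: $1 = event, $2 = md device, $3 = component device (optional).

MAILTO="some-mail-address@example.com"   # placeholder address

# Build the mail subject; the component device is appended only when present.
alert_subject() {
    local event=$1 array=$2 component=${3:-}
    printf '%s on %s%s at %s\n' "$event" "$array" \
        "${component:+ $component}" "${HOSTNAME:-$(hostname)}"
}

# Build the mail body: a short note, then the status file exactly as it is.
alert_body() {
    local mdstat=${1:-/proc/mdstat}
    printf 'A RAID event was reported. Even if the array is recovering, something caused it.\n\n'
    cat "$mdstat"
}

# Send mail only when called with at least an event and a device,
# which is how mdadm is assumed to invoke the script.
if [ "$#" -ge 2 ]; then
    alert_body | mail -s "$(alert_subject "$@")" "$MAILTO"
fi
```

Piping the file through cat as-is keeps the line breaks of /proc/mdstat, so no cat -A and sed trick is needed. If your mdadm supports the --oneshot and --test options, running 'mdadm --monitor --scan --oneshot --test' should fire a TestMessage event for a quick check.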
Provided that you have a working MTA, mail will be delivered for every RAID event, with the maximum verbosity possible. I don’t think any hardware RAID controller does that?!
I then tested it on a small array to make sure that the alerts actually get delivered.
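If a mail for every single event turns out to be too noisy, the handler could tag each mail with a severity. Below is a minimal sketch, assuming the four critical events listed earlier in this post are the ones you care most about. event_severity and tagged_subject are hypothetical helper names, not something mdadm provides.

```shell
#!/bin/bash
# Hypothetical helpers for a PROGRAM script; not part of mdadm.
# The CRITICAL list simply mirrors the events named earlier in this post.
event_severity() {
    case "$1" in
        DeviceDisappeared|Fail|FailSpare|DegradedArray) echo "CRITICAL" ;;
        *) echo "INFO" ;;
    esac
}

# Prefix a mail subject with the severity of the event.
# $1 = event name, $2 = md device
tagged_subject() {
    printf '[%s] %s on %s\n' "$(event_severity "$1")" "$1" "$2"
}
```

With this, a mail filter can sort on the [CRITICAL] or [INFO] prefix instead of parsing event names.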
# mdadm /dev/md0 -f /dev/sdb1 -r /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0
mdadm: hot removed /dev/sdb1
# mdadm /dev/md0 -a /dev/sdb1
mdadm: re-added /dev/sdb1
Preview:
Subject: RebuildFinished on /dev/md0 at ToughGuy
Likely an unfavorable or a bad thing just happened to your RAID. Even if its recovering, it was a bad thing which caused this!

Personalities : [raid1]
md1 : active raid1 sdb3[1] sda3[0]
724555520 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
4008064 blocks [2/2] [UU]

unused devices: <none>
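In the status above, [UU] means both members of the array are up; a missing member shows up as an underscore, for example [U_]. As a rough sketch, a script could use that to list degraded arrays. degraded_arrays is a hypothetical helper of mine, and it assumes array names of the form md0, md1 and so on.

```shell
#!/bin/bash
# Hypothetical helper, not part of mdadm.
# Prints the name of every array whose member map (e.g. [U_]) contains "_".
# $1 = status file to read, defaults to /proc/mdstat
degraded_arrays() {
    local mdstat=${1:-/proc/mdstat}
    awk '
        /^md[0-9]+ :/   { name = $1 }                      # remember the current array
        /\[U*_[U_]*\]/  { if (name != "") print name }     # member map with a missing device
    ' "$mdstat"
}
```

Called without arguments it reads the live /proc/mdstat; passing a saved copy of the file makes it easy to try out safely.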
Linux System Variables
Posted by Abbas in Uncategorized on March 9, 2010
Ever wanted to list all of the variables your shell knows about, both global and local? Well, you can, with the ‘env‘ and ‘set‘ commands.
env lists the global (environment) variables, while set lists every variable known to the shell, local ones included. The difference between the two kinds is that environment variables are exported and therefore inherited by every program started from the shell, while local (shell) variables exist only inside the current shell. For example, MAILCHECK (which controls how often the shell checks for new mail before telling you at the prompt) only appears in the output of ‘set’.
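A quick way to see the difference for yourself is the small bash sketch below. DEMO_LOCAL and DEMO_EXPORTED are made-up variable names used only for this demo, and in_env and in_set are hypothetical helpers.

```shell
#!/bin/bash
# DEMO_LOCAL and DEMO_EXPORTED are made-up names used only for this demo.
DEMO_LOCAL="shell only"                        # plain shell variable, not exported
export DEMO_EXPORTED="inherited by children"   # environment variable

# Succeeds if the named variable appears in the output of env.
in_env() { env | grep -q "^$1="; }

# Succeeds if the named variable appears in the output of set.
in_set() { set | grep -q "^$1="; }

for name in DEMO_LOCAL DEMO_EXPORTED; do
    in_set "$name" && s=yes || s=no
    in_env "$name" && e=yes || e=no
    echo "$name: set=$s env=$e"
done
```

Run with bash, this should report set=yes env=no for DEMO_LOCAL and set=yes env=yes for DEMO_EXPORTED, since only the exported variable is passed on to the env command.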





