First I got an email telling me a fail event had been detected on my RAID
device. Twelve minutes later came a second email, this time from SMART
monitoring, complaining that one of my hard drives could not be opened.
There is a difference between a device that failed to open and a device
that has failed.
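A side note on where these emails come from: the first is presumably mdadm running in monitor mode, the second the smartd daemon. Wiring both up is roughly two config lines; the paths and the address here are examples, they vary by distribution:
/etc/mdadm/mdadm.conf: MAILADDR you@example.com
/etc/smartd.conf: DEVICESCAN -m you@example.com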
So I verified both reports from the terminal. First the SMART report:
#smartctl -a /dev/questionable-device
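In that wall of output the first things to look for are the overall verdict and the relocation counters (Reallocated_Sector_Ct, Current_Pending_Sector, Offline_Uncorrectable). If you only want the verdict, there is a short form:
#smartctl -H /dev/questionable-device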
I verified the status of the RAID array.
#mdadm --detail /dev/md0
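The lines that matter in that report are State, Failed Devices and the per-device status table at the bottom. Something like this (the pattern is just my guess at what is worth grepping for) cuts straight to them:
#mdadm --detail /dev/md0 | grep -iE 'state|failed|removed'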
I had a degraded array: one device was marked faulty and removed. The
device was /dev/sdd, the 1TB hard drive and the oldest of the 4 I'm
using.
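Before touching any hardware it pays to be sure which physical drive /dev/sdd actually is; the drive letter alone won't tell you. Matching serial numbers does:
#smartctl -i /dev/sdd
#ls -l /dev/disk/by-id/
The first prints the serial number that is also on the drive's label; the second shows which by-id name (which embeds that serial) points at which sdX device.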
I opened the black box, pulled the SATA cables, examined them for
anything out of the ordinary and reseated them. Then I went back to the
terminal to check that I still had access to all the hard drives.
#smartctl -a /dev/all-the-drives
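Checking them one by one gets old, so a small shell loop over all of them works too; the glob is whatever matches your drives:
#for d in /dev/sd?; do smartctl -H "$d"; done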
I added the faulty drive back to the RAID array with:
#mdadm --manage /dev/md0 --add /dev/faulty-drive
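One caveat worth knowing: if the kernel still lists the drive as a faulty member, mdadm wants it removed from the array before it will accept it again. And for a drive that was only briefly disconnected from an array with a write-intent bitmap, --re-add can resync just the blocks that changed instead of rebuilding everything:
#mdadm --manage /dev/md0 --remove /dev/faulty-drive
#mdadm --manage /dev/md0 --re-add /dev/faulty-drive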
I checked the status of the RAID array.
#cat /proc/mdstat
donato@desktop:~$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 sdd[3] sdb1[0] sdc1[1]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [=====>...............]  recovery = 26.4% (257972512/976630272) finish=104.2min speed=114919K/sec
      bitmap: 6/8 pages [24KB], 65536KB chunk

unused devices: <none>
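Decoding that: [3/2] means the array wants 3 devices but only 2 are in service, and [UU_] marks the third slot as down, the one being rebuilt onto sdd. To watch the progress without retyping:
#watch -n 5 cat /proc/mdstat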
I guess this time it's for real. The RAID array is rebuilding itself.
Back to my monitoring station then.