I am a bit behind on this issue, but essentially, Seagate Barracuda 7200.11 hard drives shipped with certain firmware versions that could brick the drive. Once the error appeared, the drive was no longer recognized by the BIOS, making subsequent firmware upgrades impossible. Although I had been running a RAID-5 array of five of these drives for over a year, I was finally caught by the bug after a reboot, when my RAID-5 array came up in degraded mode. Seagate’s warranty process was relatively straightforward; I paid the $20 fee to have a new drive sent via 2-Day UPS in a hard drive shipping container with a return-postage label. I was a bit ticked about paying the fee, but it saved me the headache of finding a suitable shipping container. It also meant I did not have to wait for them to receive the defective drive before they mailed out the new one. My goal was to keep the array in degraded mode for as short a period as possible, so that a second drive would not fail before the array was repaired.
As I noted above, this issue occurs when the computer is rebooted; the drives won’t just drop off of your SATA controller while the machine is running. As such, I wanted to repair the array without rebooting the machine. First, I had to determine which drive was actually down. To do so, I used the following command on each of the drives in the RAID array:
dd if=/dev/sdX of=/dev/null
I ran this command for each drive, while watching the hard drive activity lights on my RAID enclosure. Alternatively, I could have run a similar command for the RAID array itself:
dd if=/dev/mdX of=/dev/null
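The per-drive read test above can be wrapped in a small helper so that each drive is read for only a few seconds instead of end to end. This is a minimal sketch, not what I actually ran: probe_drive is a made-up name, and the 512 MiB default is an arbitrary amount chosen to keep the activity light on long enough to spot.

```shell
#!/bin/sh
# probe_drive DEVICE [MEBIBYTES]
# Hypothetical helper: read a bounded amount from DEVICE and discard it,
# so the drive's activity light blinks for a moment.
probe_drive() {
    dev="$1"
    mb="${2:-512}"    # assumed default; adjust to taste
    echo "Reading ${mb} MiB from ${dev}; watch the enclosure lights"
    # bs is given in bytes (1 MiB) so the call does not rely on size suffixes
    dd if="$dev" of=/dev/null bs=1048576 count="$mb" 2>/dev/null
}

# Usage (as root), substituting your own member drives:
#   for d in /dev/sda /dev/sdb /dev/sdc; do probe_drive "$d"; sleep 2; done
```

If a drive was read recently, part of the data may be served from cache rather than the disk, so a larger amount may be needed before the light shows anything.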
The drive that doesn’t show activity is the one that needs to be replaced. After pulling the drive and swapping in the replacement from Seagate, I hot-added it to the system, once again to avoid rebooting the machine and potentially having another drive fail. To do so, I used the scsiadd command, for which you can find an article here. I stepped through each channel, trying to add the drive; note that selecting an empty channel won’t break anything. To verify that the drive has been added back in, you can run:
ls /dev/sd*
If the drive was added by scsiadd, you should have a new entry. After this I was able to format the drive and add it back into the array. As for the other drives, I needed to upgrade the firmware from SD15 to SD1A, with the exception of the replacement drive, which already had SD1A installed. I used the following guide, which explains everything much better than I could. Once again, I used the tricks above to figure out which drive was which prior to using scsiadd to remove it from the system. At this point, all of my drives have been updated to SD1A and I am **hopefully** safe.
I was late to update the firmware on these drives, and in many ways I was lucky. I didn’t do it sooner because I had been running them for more than a year without issue. I believe that I got by for so long because I use the drives in my server, which is not rebooted often, at least relative to your standard desktop. Moral of the story: if you haven’t done so yet, update your firmware to SD1A.
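For the step of adding the replacement back into the array, it helps to confirm from /proc/mdstat that the array is still degraded beforehand and healthy once the rebuild finishes. The sketch below is an assumption-laden illustration: degraded_arrays is a made-up helper name, and it relies on the usual mdstat status field where a missing member shows up as an underscore, as in [UUUU_].

```shell
#!/bin/sh
# degraded_arrays [MDSTAT_FILE]
# Hypothetical helper: print the name of every md array whose member
# status field (e.g. [UUUU_]) contains "_", meaning a member is missing.
degraded_arrays() {
    awk '
        /^md[^ ]* :/          { name = $1 }    # remember the current array
        /\[[U_]*_[U_]*\]/     { print name }   # status shows a missing member
    ' "${1:-/proc/mdstat}"
}

# Usage (as root), with placeholders for your own array and partition:
#   degraded_arrays                           # lists degraded arrays, if any
#   mdadm --manage /dev/mdX --add /dev/sdX1   # add the replacement member
#   cat /proc/mdstat                          # watch the rebuild progress
```

The helper prints nothing once every member is back, which makes it easy to use in a loop while waiting for the rebuild.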
If you are uncertain about which firmware you are running, various tools will display it:
scsiadd -p
smartctl --all /dev/sdX
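With five drives to check, pulling just the firmware field out of the smartctl output saves some scrolling. This is a rough sketch that assumes the output contains a "Firmware Version:" line, as the information section printed by smartctl normally does for ATA drives; firmware_version is a made-up helper name.

```shell
#!/bin/sh
# firmware_version
# Hypothetical helper: read smartctl output on stdin and print only the
# value of the "Firmware Version:" line.
firmware_version() {
    awk -F': *' '/^Firmware Version:/ { print $2 }'
}

# Usage (as root), substituting your own drives:
#   for d in /dev/sda /dev/sdb /dev/sdc; do
#       printf '%s: ' "$d"; smartctl -i "$d" | firmware_version
#   done
```

Any drive that still reports SD15 is one that has not been updated yet.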
October 13th, 2009 | Tags: barracuda, firmware, Linux, mdadm, sd15, sd1a, seagate | Category: Hardware, Technology