Technicus stultissimus: fixing a RAID superblock

Wednesday, September 15, 2010

fixing a RAID superblock

Source
GRUB won't fix a bad superblock; the advice above only applies if you are using RAID on your / partition.
1: View the contents of /etc/raid.blah.conf or mdadm.conf
2: Run fsck on the partitions
3: mdadm -E -s (scans member devices for superblocks) or mdadm -D -s (scans running arrays)
To reassemble the array from its existing superblocks, if your .conf hasn't changed:
# mdadm -A -s
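
Before reassembling, it can help to list which arrays the configuration file declares, so they can be compared with what mdadm -E -s reports. What follows is a minimal sketch, assuming ARRAY lines in the usual mdadm.conf layout with an uppercase keyword and an optional UUID= field. The function name list_arrays and the default path are illustrative and not part of mdadm.

```shell
#!/bin/sh
# list_arrays [CONF]
# Prints "md-device uuid" for every ARRAY line in an mdadm.conf-style file.
# A "-" is printed when the line carries no UUID= field.
# Assumption: CONF defaults to /etc/mdadm.conf (some distributions use
# /etc/mdadm/mdadm.conf instead).
list_arrays() {
    conf="${1:-/etc/mdadm.conf}"
    awk '$1 == "ARRAY" {
        uuid = "-"
        for (i = 3; i <= NF; i++)
            if ($i ~ /^UUID=/) uuid = substr($i, 6)
        print $2, uuid
    }' "$conf"
}

# Usage:
#   list_arrays /etc/mdadm.conf
#   mdadm -E -s        # prints ARRAY lines in the same format, for comparison
```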
----------------
Source
Recovering from a real hardware failure
This process is similar to recovering from a "simulated failure".
To recover from a real hardware failure:

make sure that the partitions on the new device are the same as on the old one:
- create them with fdisk (fdisk -l shows the partitions on a good disk; remember to use the same start/end blocks and to set each partition's system id to "Linux raid autodetect")
- consult /etc/mdadm.conf, which describes which partitions are used for md devices
add the new device to the array:

# mdadm /dev/md0 -a /dev/sda1
mdadm: hot added /dev/sda1

Then, you can consult mdadm --detail /dev/md0 and/or /proc/mdstat to see how long the reconstruction will take.
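
The recovery steps above can be sketched as two small shell functions. This is a sketch under stated assumptions, not a definitive procedure: sfdisk -d is swapped in for the manual fdisk step to copy the partition table, /dev/sdb stands in for the surviving disk (the source does not name it), and the function names replace_plan and rebuild_progress are made up for illustration. replace_plan only prints commands so they can be reviewed before running; rebuild_progress reads /proc/mdstat-style text.

```shell
#!/bin/sh
# replace_plan GOOD_DISK NEW_DISK ARRAY NEW_PART
# Prints (does not run) the commands for bringing a replacement disk in.
replace_plan() {
    good="$1"; new="$2"; array="$3"; part="$4"
    # Dump the partition table of the surviving disk and replay it on the new one.
    echo "sfdisk -d $good | sfdisk $new"
    # Hot-add the new partition; the kernel then starts the reconstruction.
    echo "mdadm $array -a $part"
    # Show array state, including rebuild status.
    echo "mdadm --detail $array"
}

# rebuild_progress [MDSTAT_FILE]
# Prints "percent eta" for every array that is currently rebuilding.
# Assumption: progress lines look like
#   [=>....]  recovery =  8.5% (x/y) finish=95.3min speed=77952K/sec
rebuild_progress() {
    src="${1:-/proc/mdstat}"
    awk '/(recovery|resync) *=/ {
        pct = ""; eta = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /%$/) pct = $i
            if ($i ~ /^finish=/) eta = substr($i, 8)
        }
        if (pct != "") print pct, eta
    }' "$src"
}

# Usage:
#   replace_plan /dev/sdb /dev/sda /dev/md0 /dev/sda1
#   rebuild_progress
```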
Make sure you run lilo when the reconstruction is complete (see below).

RAID boot CD-ROM
It's always a good idea to have a CD-ROM from which you can boot your system (in case lilo was removed, etc.).
It can be created with the mkbootdisk tool:

# mkbootdisk --iso --device /root/raid-boot.iso `uname -r`

Then, just burn the created ISO.

If everything fails...
....................................................

Source
If you've done a reinstall, you've probably lost your /etc/mdadm.conf. Try running mdadm --assemble --scan to see if it picks up the drives again.
It's highly unlikely that you've lost the superblocks on both drives, unless it was done intentionally. If the blind assemble doesn't work, try assembling with one drive. If that works, add the other drive as a hot spare and the array will automatically rebuild.
If you have lost the superblocks on both drives, you can try mdadm --build to bring up an array.
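
The "try assembling with one drive" step can be sketched as follows. This is a hedged example, assuming that mdadm --examine exits non-zero when it finds no md superblock on a device; the function names has_superblock and surviving_members are invented for illustration, and the device names are examples only.

```shell
#!/bin/sh
# has_superblock DEVICE
# Succeeds when "mdadm --examine" finds an md superblock on DEVICE.
# Assumption: mdadm exits non-zero when no superblock is detected.
has_superblock() {
    mdadm --examine "$1" >/dev/null 2>&1
}

# surviving_members DEVICE...
# Prints each candidate device that still carries a superblock.
surviving_members() {
    for dev in "$@"; do
        if has_superblock "$dev"; then
            echo "$dev"
        fi
    done
}

# Example (as root; device names are examples):
#   surviving_members /dev/sda1 /dev/sdb1
# If only /dev/sda1 is printed, start the array degraded from that drive,
# then add the other one so the mirror rebuilds:
#   mdadm --assemble --run /dev/md0 /dev/sda1
#   mdadm /dev/md0 --add /dev/sdb1
```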
//
After more searching on the internet, I found that one of the new features of Debian 5.0.5 (I had 5.0.4) was "experimental support for software raid...", which seemed to imply that support was not present in the version I was using.
So I installed 5.0.5 with software RAID 1, which seemed to require a mount point for the two SATA RAID drives. RAID 1 was not working after reboot, so I tried the following command:
mdadm --create /dev/md0 --metadata=1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
This did not work at first, but it did after unmounting the two drives,
and it gave the following result:
size=488383936K mtime=Thu Jul 22 01:21:15 2010
mdadm: /dev/sdb1 appears to contain an ext2fs file system
size=488383936K mtime=Thu Jul 22 01:21:15 2010
Continue creating array? y
mdadm: array /dev/md0 started.
Does this mean the array is now working, or do I still need to test it?
Also, the two drives have ext3 filesystems; should the drives be reformatted to ext2? How?
(It seems experimental support for raid.. is intended to work on ext2 filesystems only!)
cat /proc/mdstat gave the following result:
Personalities : [raid1]
md0 : active (auto-read-only) raid1 sdb1[1] sda1[0]
487307508 blocks super 1.0 [2/2] [UU]
resync=PENDING
unused devices: <none>
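
The mdstat output above can be summarised per array. In it, [UU] means both mirror members are up, and (auto-read-only) means the array has been started but not yet written to, which is why the resync is still pending; the first write, or mdadm --readwrite /dev/md0, should start it. A minimal sketch, assuming a redundant RAID level whose status line ends in a member map such as [UU]; the function name mdstat_summary is hypothetical.

```shell
#!/bin/sh
# mdstat_summary [MDSTAT_FILE]
# Prints "array mode member-map" for each md array, for example:
#   md0 auto-read-only [UU]
# mode is "-" unless the array is flagged (auto-read-only).
# member-map is "?" when the blocks line does not end in a [U_] map.
mdstat_summary() {
    src="${1:-/proc/mdstat}"
    awk '
        /^md[0-9]+ :/ {
            name = $1
            mode = ($0 ~ /\(auto-read-only\)/) ? "auto-read-only" : "-"
        }
        /blocks/ && name != "" {
            map = ($NF ~ /^\[[U_]+\]$/) ? $NF : "?"
            print name, mode, map
            name = ""
        }
    ' "$src"
}

# Usage:
#   mdstat_summary
#   mdadm --readwrite /dev/md0   # leave auto-read-only so the resync can start
```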

-----------------
4. Error Recovery

Q: I have a RAID-1 (mirroring) setup, and lost power while there was disk activity. Now what do I do?
A: The redundancy of RAID levels is designed to protect against a disk failure, not against a power failure. There are several ways to recover from this situation.
- Method (1): Use the raid tools. These can be used to sync the raid arrays. They do not fix file-system damage; after the raid arrays are synced, the file system still has to be fixed with fsck. Raid arrays can be checked with ckraid /etc/raid1.conf (for RAID-1; otherwise /etc/raid5.conf, etc.). Calling ckraid /etc/raid1.conf --fix will pick one of the disks in the array (usually the first), use it as the master copy, and copy its blocks to the others in the mirror. To designate which of the disks should be used as the master, use the --force-source flag, for example: ckraid /etc/raid1.conf --fix --force-source /dev/hdc3. The ckraid command can safely be run without the --fix option to verify the inactive RAID array without making any changes. When you are comfortable with the proposed changes, supply the --fix option.
- Method (2): Paranoid, time-consuming, and not much better than the first way. Let's assume a two-disk RAID-1 array consisting of partitions /dev/hda3 and /dev/hdc3. You can try the following:
  1. fsck /dev/hda3
  2. fsck /dev/hdc3
  3. Decide which of the two partitions had fewer errors, was more easily recovered, or recovered the data that you wanted. Pick one, either one, to be your new "master" copy. Say you picked /dev/hdc3.
  4. dd if=/dev/hdc3 of=/dev/hda3
  5. mkraid raid1.conf -f --only-superblock
  Instead of the last two steps, you can run ckraid /etc/raid1.conf --fix --force-source /dev/hdc3, which should be a bit faster.
- Method (3): Lazy man's version of the above. If you don't want to wait for long fscks to complete, it is perfectly fine to skip the first three steps above and move directly to the last two. Just be sure to run fsck /dev/md0 after you are done. Method (3) is actually just method (1) in disguise.
In any case, the above steps will only sync up the raid arrays. The file system probably needs fixing as well: for this, fsck needs to be run on the active, unmounted md device. With a three-disk RAID-1 array, there are more possibilities, such as using two disks to "vote" a majority answer. Tools to automate this do not currently (September 97) exist.
