Bienvenido! - Willkommen! - Welcome!

Technical Blog of Tux&Cía., Santa Cruz de la Sierra, Bolivia
Main blog: Tux&Cía.
Advanced-information blog: Tux&Cía.-Información
May the source be with you!

Saturday, November 12, 2011

AMD SB600 - SB700 RAID

Problems rebuilding the array
If the RaidXpert/WebPAM utility is actually from Promise, maybe the Promise drivers would work on an AMD southbridge. Or even the Silicon Image drivers, since the Promise drivers (at least some of them) are actually from Silicon Image. Of course, I'd have to hack the .inf files to get them to install.
I have been unable to get the AMD SB700 drivers to work in SATA-RAID mode. In AHCI mode the driver works fine, but RAID is not supported.
RAID 1 is fine, but I limit it to system-disk mirroring and store/back up my data some other way on a separate disk. It makes system hopping easier, too.
If your disk controller dies, you need the exact same motherboard to restore your system, or you need a PCI-based RAID controller to start with. Ick...
I thought a couple of hours was fairly fast for 160 GB.
A 2 TB RAID 5 array initialization took around 6 hours, and a verify about 3 hours.
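As a sanity check on those timings, here is a quick back-of-the-envelope calculation (a sketch; the 2 TB / 6 h / 3 h figures come from the post above, the rest is plain arithmetic):

```python
# Rough average throughput implied by the array times quoted above:
# a 2 TB RAID 5 initialized in ~6 h and verified in ~3 h.

def throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average rate (MB/s) needed to touch size_gb in the given time."""
    return size_gb * 1000 / (hours * 3600)  # decimal GB -> MB, hours -> s

init = throughput_mb_s(2000, 6)    # initialization pass
verify = throughput_mb_s(2000, 3)  # verify pass
print(f"init ~{init:.0f} MB/s, verify ~{verify:.0f} MB/s")
```

Roughly 90 MB/s for the initialization pass is plausible sustained throughput for 2011-era 7200 rpm drives, which is why multi-hour initialization and rebuild times are normal, not a sign of trouble.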
For my continuing trials on the SB700 motherboard, I split the IDE-SATA software mirror and switched the BIOS SATA mode to RAID. Then I booted Windows and mirrored the two SATA 160 GB drives, using RaidXpert this time instead of the BIOS utility.
After that, I mirrored the IDE drive to the SATA mirror in software (Windows Server 2008), to move the system drive back to SATA. This re-mirroring took over 5 hours; that's when I went to bed. To be fair, Windows was also synchronizing a 500 GB mirror during the same time.
Now I'll break the mirror, boot from the SATA-RAID mirror, and see how stable Windows is.
Note: the current version of ahcix64s.sys is 3.1.1540.64.
I have the same problem again, even worse, because the old DOS program doesn't work with my new SB750 southbridge! I couldn't rebuild my RAID at all!
In the end I used my second motherboard with the SB600 to reinitialize the RAID and rebuilt it with RaidXpert; it took approximately 2.30.
Issue: the AMD RaidXpert service reports a RAID 1 (or RAID 5, as reported on other forums) array "error", stating a single-disk failure. One of the two drives (or maybe more in your case, but if so you are beyond screwed) in the RAID 1 array becomes a standalone JBOD drive, basically detached from the array for whatever reason. The drive is still listed as healthy, and a SMART scan does not show any issue either. It is simply "assigned as JBOD" under "Logical Drive View" in RaidXpert.
Or you really do have a dead drive and have inserted a new one, but even after you have initialized it, the Rebuild tab is still greyed out.
The "reasonable" procedure, by intuition (since there is no manual), is to start RaidXpert in the browser (the default username and password are both "admin"), go to "Logical Drive View", and do a "Rebuild". The problem is that the "Rebuild" tab is greyed out and unusable. You can only see the failed array listed as "Critical"; one of the two drives (in my case) is still mounted as part of the RAID 1 array, but the other drive is now JBOD. This can also happen if the drive is new: it is simply "assigned" as JBOD, and the "Rebuild" tab is still greyed out.
You basically cannot rebuild.
This problem has been reported on many forums, and no one has provided a step-by-step solution. The AMD help for the software is inaccurate and unhelpful.
Solution: if the array was your system drive, make sure that you have booted from the remaining RAID 1 drive and NOT the JBOD drive. You can double-check this by going into the BIOS during POST and confirming that the boot sequence boots through the RAID array. If the array is not your system drive, this precaution is not necessary.
Start RaidXpert in Windows. After you have logged in, click "Logical Drive View" under "AMD RAIDXpert". You can check the serial numbers of the working drive and the "JBOD" drive under "Physical Drive View"; there you will see which drive is assigned to the array and which one is assigned as JBOD. If you are using a new drive, you should see its serial number associated with the JBOD assignment. Make a note of it, and make sure you are not replacing the wrong drive.
Under "Logical Drive View", to the right of the screen, you will see "Delete" as a usable tab. Click on it. Do not worry, nothing will happen yet.
Under the "Delete" tab, you will find check boxes next to each of your arrays and JBOD drives. Check the JBOD drive and click "Submit". You will see two warning messages telling you that the data on the JBOD disk will be erased. If this is your new drive, no problem, just hit "OK", and the drive will disappear from "Logical Drive View". If this was one of the two system drives, this is your last chance to make sure it is not the drive you booted from. If you are sure, hit "OK", and the drive will disappear from "Logical Drive View".
Once that is done, the Rebuild tab will appear. You can also see under "Physical Drive View" that this drive is no longer assigned, and is therefore "Spare and free" and can be used for rebuilding.
Now hit "Rebuild" and the process will begin. You are set.
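The manual steps above can be sketched as a small decision routine. This is illustrative only: RaidXpert is a web GUI with no public scripting interface, so the `Drive` class, the serial numbers, and `plan_rebuild` are hypothetical stand-ins that just model the decision flow, including the critical "never delete the drive you booted from" check.

```python
# Illustrative sketch of the manual RaidXpert procedure described above.
# There is no RaidXpert API; these names are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class Drive:
    serial: str
    assignment: str  # "ARRAY" or "JBOD", as shown in Logical Drive View

def plan_rebuild(drives, booted_serial):
    """Return the ordered manual steps, refusing to delete the boot drive."""
    jbod = [d for d in drives if d.assignment == "JBOD"]
    if not jbod:
        return ["Nothing to do: no drive is stuck as JBOD."]
    victim = jbod[0]
    if victim.serial == booted_serial:
        # The one fatal mistake: deleting the JBOD entry you booted from.
        raise RuntimeError("You are booted from the JBOD drive; stop!")
    return [
        f"Note the serial of the JBOD drive: {victim.serial}",
        "Logical Drive View -> Delete -> check the JBOD drive -> Submit",
        "Confirm both warnings (data on the JBOD drive is erased)",
        "Rebuild tab appears; select the now-free drive and start the rebuild",
    ]
```

The point of the sketch is the ordering: the Rebuild tab only becomes usable after the stale JBOD assignment is deleted, so the serial-number check has to come first.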

Hope this helps.
My system information:
ASUS M3A78-T motherboard with the SB750 (the South Bridge responsible for the RAID)
AMD Phenom II 940 X4 Quad Core
System drive (Raid 1): Western Digital Raptor 150Gb model: WD1500HLFS (two of them)
Scratch disk (for Photoshop) (Raid 0): Hitachi Deskstar 500Gb P7K500 (three of them)
OS: Windows Vista 64-bit
The rest is not relevant to this problem. The array that reported failure was the Raid 1 array.
This tutorial/Standard Operating Procedure was written by Leo Lam © 2009. All rights reserved. No right is granted to Facebook Inc or any other entity for reproduction. But do feel free to seed this or post a link to this. 
I have tried the directions above and still have the same problem. I have an MA378-T board with the SB700 chipset, running two 500 GB disks. They ran fine for two months; then all of a sudden the RAID went critical.
However, after a reboot the drive reappeared as JBOD. Thinking the drive was on its way out, but surprised that it reappeared, I did a media check. In fact, I have now done several with no problems. I have also pulled the drive, put it in another computer, and scanned it: it is fine. I put it back into the machine it came from and did a low-level security format; it is fine, and yet I cannot add it back to the RAID 1. I constantly have the same problem.
No Rebuild tab. The drive appears after every reboot as a JBOD disk; after removing the JBOD logical disk, however (as above), the Rebuild tab still does not show up.
One thing I do notice that may be the problem, as mentioned in other posts: the second disk reports as 500.10 but only has 500.04 available. I tried with the SB600 floppy, but as we know it does not see the disk.
Maybe it's the size difference, I'm not sure, but this is really frustrating. I cannot do software RAID, as I also have Linux partitions on the logical disk (currently, by the way, I'm running Vista 64). I want/need to do hardware RAID; it was one of the reasons behind choosing the board.
Use a motherboard with the SB600 and reset the disk size to 500.10; that is all.
Also try this:
Disk A = 500.10, disk B = 500.04.
RAID no. 1 (critical) has only disk A.
Make a new RAID no. 2 (critical) with disk B.
Duplicate RAID no. 1 to no. 2 using Acronis True Image (free trial for 15 days); both are still critical.
Delete RAID no. 1, clean disk A, and join it to RAID no. 2; it should be set to 500.04 as well.
Job done.
The bug is in the process of defining the RAID. It asks about the size, and ENTER means full disk (or something like that). ENTER causes no problem: the size automatically goes from 500.10 to 500.04, and a future rebuild brings no harm. Entering the value 500.10 yourself works as well, but for some reason it is then very hard to get the full size of the disk back again; it will show 500.04 in the future, which means it is too small for the second disk to rebuild.
Since the setup program from 2006 has a resize-disk option, this must have been a well-known problem years ago. Why didn't they fix it? Ask AMD...
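The size mismatch in these reports reduces to a simple comparison: the controller reserves a sliver of the disk for metadata (500.10 GB raw becomes 500.04 GB usable), and a spare that is exposed at the smaller size cannot join an array that was defined at the larger one. A minimal sketch of that check, using the figures from the posts above:

```python
# Sketch of the size check that appears to block these rebuilds.
# A replacement drive can only join a mirror if its usable capacity is
# at least the size at which the array member was defined.
def can_rebuild(array_member_gb: float, spare_gb: float) -> bool:
    """True if a spare of spare_gb can rebuild onto an array defined at array_member_gb."""
    return spare_gb >= array_member_gb

# Defaulting with ENTER defines the array at 500.04, so any 500.10 (or
# equal 500.04) drive fits; defining it manually at 500.10 means a spare
# truncated to 500.04 is "too small" and the Rebuild tab stays greyed out.
assert can_rebuild(500.04, 500.10)
assert not can_rebuild(500.10, 500.04)
```

This is why the advice above is to accept the default (smaller) size when defining the array, or to restore the disk to its original 500.10 size before rebuilding.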
I managed to fix my problem and rebuild my RAID array using the solution described above. The biggest problem I had was that I could not work out whether I was booting from the disk assigned to the RAID 1 array or the one assigned to JBOD. In the end, I unplugged each drive individually and tried to boot from each. Only one would boot at all, and that drive was assigned to RAID 1 in RaidXpert (but was listed as 'Single Disk 1' in the BIOS), so I was pretty sure I was booting from the right disk. I deleted the JBOD drive in RaidXpert as instructed, and sure enough I was able to select Rebuild; it was fixed in less than an hour.
In my case, on an ASUS M3A78-EM with the latest BIOS (2003), once the array went critical it became unbootable. I confirmed the boot order was the same as before and that it was booting from the critical RAID and not the corrupted drive, both in the BIOS and via manual boot selection with F8.
It turns out that once the RAID 1 breaks and the failed drive is no longer assigned to the main array, the BIOS becomes mixed up: the boot-order options "RAID Array 1" and "RAID Single Drive 1" (or something similar) are actually labelled in reverse. When you choose to boot from the drive assigned as the array, the system is unbootable. I noticed that when I disconnected the known corrupt disk (which had not failed, it was just corrupted somehow) the PC would boot fine, but as soon as I connected the corrupted drive the system was unbootable. Obviously, in this case I could not boot with both drives active, so I could not initiate the rebuild in Windows.
I assumed the corrupted, unlinked drive was somehow still being seen and used as part of the array, thus causing the problems. So I set the controller back to SATA mode, fully SMART-tested the drive, and then zeroed it using Seagate SeaTools for DOS. But after all of this (it would now be equivalent to a new disk off the shelf), the problem of booting with both drives connected persisted. At this point I tried selecting a boot from the blanked drive using F8 ("RAID Single Disk" vs. "RAID Array 1"), and it booted up fine. This is how I realized that the motherboard BIOS (not the RAID BIOS) was mislabelling the drives in its boot selections when the controller was in RAID mode. Now that it had booted normally with both drives active, I ran the RaidXpert utility and initiated the array rebuild. There were no performance issues; the 1.5 TB array rebuild completed in about 3 hours.
One of my hard drives crashed, and I got the critical message at startup. I just replaced the hard drive with a new one. What do I do now?

Here is what I have in RaidXpert:

Delete the JBOD disk and create a scheduled rebuild onto the disk.
I have had a similar problem. I changed RAID to IDE for the same reason, and when I changed it back to RAID the array went critical.
After reading this:
I waited 2 days for nothing...
I made a spare disk and deleted it... and nothing.
I read that also.
So, at the end of the day, I found this (from Asus):
Try AHCI SBSETUP V2.5.1540.4 for DOS.
It is the DOS version of the RAID FastBuild BIOS for the SB600 controller, OK?
An early version, 2.5. It has these options:
1 - View Drive Assignments
2 - Define Array
3 - Delete Array
4 - Controller Configuration
I prepared a clean DOS floppy disk, copied it over, and rebooted.
Wow. I tried Rebuild and got 'spare drive not exist or is too small'!
I checked option 1: one disc was 500107 kb and the other 500108 kb!
The first was empty; the second had the RAID array!
Then I went to option G and used the "restore original size" option. I got 500108. When I went to option 1, RAID 1 was back automatically!
Then I went to Rebuild; it showed me my array disc and the new one and asked whether to continue...
I stopped. I wanted to see what RaidXpert would tell me.
After Windows XP started, I saw the REBUILD option activated in AMD RaidXpert!
I chose it, and it started rebuilding with a nice progress bar. Now I'm waiting for it to finish, but at this stage I don't expect any problems. Hopefully.
SB700-based RAID 1 has some quirks, as does SATA in general on these SB700 (and SB600) based boards. I've been through 4 different Gigabyte 780G boards; ALL had RAID 1 problems of some sort. More often than not, the drives are FINE!!!! (I don't know in your case.)
It seems that what happens is that the SATA/SB/RAID controller (not sure which exact piece of logic) loses its marbles and cannot understand what is going on with the RAID set. It could also be the driver, but I don't think so. I believe it is, unfortunately, an inconsistency in the southbridge silicon. It could be firmware too (DFI has been one of the few vendors furiously putting out AMD RAID BIOS updates for the past year, but no one else seems to care).
As Overmind pointed out in the post above, the specific port matters. First, make sure that both drives are connected to numerically adjacent SATA ports on the motherboard. On the Gigabyte, they are 0/2 or 1/3. If you connect them to 0/1 or 2/3, the thing falls apart and exhibits those characteristics. I have no idea why. I'm not sure how it's laid out on the J&W; it could be different. It seems to have some relationship to how "Master/Slave" is defined by the motherboard, although in SATA there is no such thing (go figure).
In RaidXpert you can do a SMART scan, although if your array is showing as broken, you most likely cannot see it. It might be worth booting the machine with the Western Digital CD and doing a disk health check, or whatever they call it, to see the condition of the disks. You may be able to boot with the disks in SATA mode; Windows will see them as individual disks, and you can use something like SpeedFan's SMART scanner to see what is going on. You can put them back into RAID mode afterwards, as the signature remains intact (or so it has for me).
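Another way to check drive health outside RaidXpert is smartmontools' `smartctl` (a real, widely available tool; the wrapper and parsing helper below are my own sketch, not part of smartmontools, and the device names are examples):

```python
# Sketch: query a drive's overall SMART health via `smartctl -H` from
# smartmontools. Only the helper functions here are mine; smartctl itself
# is a standard tool. Device naming varies (e.g. /dev/sda on Linux).
import subprocess

def smart_ok(output: str) -> bool:
    """Parse `smartctl -H` text output for the overall health verdict."""
    # ATA drives report "PASSED"; SCSI-style reports say "OK".
    return "PASSED" in output or "OK" in output

def check_drive(device: str) -> bool:
    """Run smartctl against a device and return True if it reports healthy."""
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return smart_ok(result.stdout)
```

As the posts above note, a drive kicked out of one of these arrays usually still passes its SMART health check; the failure is in the controller's bookkeeping, not the disk.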
When the thing refuses to rebuild, it may unfortunately be the SB700 finally lobotomizing itself. I've had the same thing on SB600-based boards (the Asus M2A-VM is famous for puking its RAID volumes).
