## Bienvenido! - Willkommen! - Welcome!

Technical Log of Tux&Cía., Santa Cruz de la Sierra, BO
Central Log: Tux&Cía.
May the source be with you!

## Friday, July 31, 2009

### RAID 1

A RAID 1 array creates an exact copy (or mirror) of a set of data on two or more disks. This is useful when read performance or reliability is more important than data storage capacity. Such an array can only be as big as the smallest member disk. A classic RAID 1 mirrored pair contains two disks, which makes data loss far less likely than with a single disk. Since each member contains a complete copy of the data and can be addressed independently, the probability of losing all data through ordinary wear and tear is the single-disk failure probability raised to the power of the number of self-contained copies, provided the failures are independent.
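
The capacity rule above can be illustrated with a minimal sketch. The function name `raid1_capacity` and the disk sizes are hypothetical, chosen only for illustration; real RAID implementations may also reserve some space for metadata, which this sketch ignores.

```python
def raid1_capacity(disk_sizes_gb):
    """Usable capacity of a mirror: limited by the smallest member disk.

    Hypothetical helper for illustration; ignores any metadata overhead.
    """
    if len(disk_sizes_gb) < 2:
        raise ValueError("a mirror needs at least two disks")
    return min(disk_sizes_gb)


# A 500 GB disk mirrored with a 750 GB disk yields 500 GB of usable space.
print(raid1_capacity([500, 750]))
```
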
#### RAID 1 failure rate

As a trivial example, consider a RAID 1 array with two disk drives of the same model, each with a 5% probability of failing within three years. Provided that the failures are statistically independent, the probability of both disks failing during the three-year lifetime is
$P(\mathrm{both\ fail}) = \left(0.05\right)^2 = 0.0025 = 0.25\,\%$.

Thus, the probability of losing all data is 0.25% if the first failed disk is never replaced. If only one of the disks fails, no data would be lost, assuming the failed disk is replaced before the second disk fails.
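
The calculation above can be reproduced with a short sketch. It assumes, as the example does, that disk failures are statistically independent and that a failed disk is never replaced; the function name `raid1_loss_probability` is hypothetical.

```python
def raid1_loss_probability(p_disk, n_disks=2):
    """Probability that every mirror member fails within the period.

    Assumes independent failures and no replacement of failed disks.
    """
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be a probability between 0 and 1")
    return p_disk ** n_disks


# Two mirrored disks, each with a 5% chance of failing within three years.
print(f"{raid1_loss_probability(0.05):.2%}")
```

With a third mirror member, the same assumptions give `raid1_loss_probability(0.05, 3)`, which is 0.05 cubed.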

However, since the two disks are identical and their usage patterns are also identical, their failures cannot be assumed to be independent. Thus, the probability of losing all data, if the first failed disk is not replaced, is considerably higher than 0.25% but still below 5%.

#### RAID 0 failure rate

Reliability of a given RAID 0 set is equal to the average reliability of each disk divided by the number of disks in the set:
$\mathrm{MTTF}_{\mathrm{group}} \approx \frac{\mathrm{MTTF}_{\mathrm{disk}}}{\mathrm{number}}$

That is, reliability (as measured by mean time to failure (MTTF) or mean time between failures (MTBF)) is roughly inversely proportional to the number of members, so a set of two disks is roughly half as reliable as a single disk. If there were a probability of 5% that a disk would fail within three years, then in a two-disk array the probability of losing the array would rise to $\mathbb{P}(\mbox{at least one fails}) = 1 - \mathbb{P}(\mbox{neither fails}) = 1 - (1 - 0.05)^2 = 0.0975 = 9.75\,\%$.
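
Both RAID 0 estimates above can be sketched in a few lines. The sketch assumes independent disk failures and identical disks; the function names and the 600,000-hour MTTF figure are hypothetical, used only to show the arithmetic.

```python
def raid0_failure_probability(p_disk, n_disks):
    """Probability that at least one member fails, which loses the array.

    Assumes independent failures of identical disks.
    """
    return 1 - (1 - p_disk) ** n_disks


def raid0_mttf(mttf_disk, n_disks):
    """Approximate MTTF of the whole stripe set: MTTF_disk / number."""
    return mttf_disk / n_disks


# Two striped disks, each with a 5% chance of failing within three years.
print(f"{raid0_failure_probability(0.05, 2):.2%}")

# Hypothetical per-disk MTTF of 600,000 hours in a two-disk set.
print(raid0_mttf(600_000, 2))
```

Note that the MTTF formula is an approximation, while the probability formula is exact under the independence assumption.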

The reason for this is that the file system is distributed across all disks. When a drive fails, the file system cannot cope with such a large loss of data and coherency, because the data is "striped" across all drives and cannot be reconstructed without the missing disk. Some data can be recovered using special tools, but it will be incomplete and most likely corrupt, and recovery of drive data is very costly and not guaranteed.
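
To see why a single failure is so damaging, here is a toy sketch of striping. It is not how any particular RAID implementation lays out data; the 4-byte chunk size and the helper names `stripe` and `surviving_chunks` are assumptions made for illustration.

```python
def stripe(data, n_disks, chunk_size=4):
    """Distribute data round-robin across n_disks in fixed-size chunks.

    Toy model only: real stripe sizes are far larger than 4 bytes.
    """
    disks = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), chunk_size):
        disk_index = (offset // chunk_size) % n_disks
        disks[disk_index].append(data[offset:offset + chunk_size])
    return disks


def surviving_chunks(disks, failed_disk):
    """Return the chunks still readable after one disk is lost."""
    return [chunk
            for index, chunks in enumerate(disks) if index != failed_disk
            for chunk in chunks]


disks = stripe(b"ABCDEFGHIJKLMNOP", n_disks=2)
print(disks)                        # each disk holds every other chunk
print(surviving_chunks(disks, 1))   # only non-contiguous fragments remain
```

After the failure, what remains is a set of fragments with every other chunk missing, so any file larger than one chunk is left with holes.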

#### RAID 1 performance