RAID RECOVERY LABS - MASS STORAGE DATA RECOVERY
 

STANDARD RAID LEVELS




The standard RAID levels are a basic set of RAID configurations and employ striping, mirroring, or parity. The standard RAID levels can be nested for other benefits.



Concatenation (JBOD or SPAN)




Concatenation or Spanning of disks is not one of the numbered RAID levels, but it is a popular method for combining multiple physical disk drives into a single virtual disk. It provides no data redundancy. As the name implies, disks are merely concatenated together, end to beginning, so they appear to be a single large disk. This mode is sometimes called JBOD, or "Just a Bunch Of Disks".


Concatenation may be thought of as the reverse of partitioning. Whereas partitioning takes one physical drive and creates two or more logical drives, JBOD uses two or more physical drives to create one logical drive.


Because concatenation consists of an array of independent disks, it can be thought of as a distant relation to RAID. Concatenation is sometimes used to turn several odd-sized drives into one larger useful drive, which cannot be done with RAID 0. For example, JBOD could combine 3 GB, 15 GB, 5.5 GB, and 12 GB drives into a logical drive of 35.5 GB, which is often more useful than the individual drives separately.


In the diagram to the right, data are concatenated from the end of disk 0 (block A63) to the beginning of disk 1 (block A64); end of disk 1 (block A91) to the beginning of disk 2 (block A92). If RAID 0 were used, then disk 0 and disk 2 would be truncated to 28 blocks, the size of the smallest disk in the array (disk 1) for a total size of 84 blocks.
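
The block numbering described above can be sketched in a few lines of Python. This is only an illustration, not any controller's actual code: the function names are made up for this example, and the size of disk 2 (64 blocks) is an assumption because the text does not state it.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_block(disk_sizes, logical_block):
    """Map a logical block number to (disk index, block offset on that disk)."""
    ends = list(accumulate(disk_sizes))  # running total = first block past each disk
    if not 0 <= logical_block < ends[-1]:
        raise IndexError("logical block is outside the concatenated volume")
    disk = bisect_right(ends, logical_block)  # first disk whose end lies beyond the block
    first_block_on_disk = ends[disk] - disk_sizes[disk]
    return disk, logical_block - first_block_on_disk


def concatenated_capacity(disk_sizes):
    """Concatenation uses every block of every disk."""
    return sum(disk_sizes)


def raid0_capacity(disk_sizes):
    """RAID 0 truncates every member to the size of the smallest one."""
    return min(disk_sizes) * len(disk_sizes)


# Disk 0 holds A0..A63 (64 blocks) and disk 1 holds A64..A91 (28 blocks), as in the text.
# The size of disk 2 is not stated, so 64 blocks is an assumption for this example.
sizes = [64, 28, 64]
for block in (63, 64, 91, 92):
    print(block, locate_block(sizes, block))
print(concatenated_capacity(sizes), raid0_capacity(sizes))
```

With these sizes, block 63 is the last block of disk 0 and block 64 is the first block of disk 1, while the RAID 0 capacity works out to the 84 blocks mentioned above.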


Some RAID controllers use JBOD to refer to configuring drives without RAID features. Each drive shows up separately in the OS. This JBOD is not the same as concatenation.


Many Linux distributions use the terms incorrectly and refer to JBOD as "linear mode" or "append mode". The Mac OS X 10.4 implementation — called a "Concatenated Disk Set" — does not leave the user with any usable data on the remaining drives if one drive fails in a concatenated disk set, although the disks otherwise operate as described above.


Concatenation is one of the uses of the Logical Volume Manager in Linux, which can be used to create virtual drives spanning multiple physical drives and/or partitions.


RAID-0




A striped set (minimum 2 disks) without parity. Provides improved performance and additional storage but no fault tolerance. Any single disk failure destroys the entire array, and such a failure becomes more likely as disks are added. The array is lost because data written to a RAID 0 array is broken into fragments, and the number of fragments is dictated by the number of disks in the array.


The fragments are written to their respective disks simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off the drive in parallel, giving this type of arrangement huge bandwidth. When one sector on one of the disks fails, however, the corresponding sector on every other disk is rendered useless because part of the data is now corrupted. RAID 0 does not implement error checking so any error is unrecoverable (not at RAID Recovery Labs!). More disks in the array means higher bandwidth, but greater risk of data loss.
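
As a rough sketch of how striping spreads data, the following Python maps logical blocks onto member disks in simple round-robin order. The function names are invented for this example, and real controllers also apply a configurable stripe size that is left out here.

```python
def stripe_location(logical_block, num_disks):
    """Return (disk index, stripe row) for a logical block in a RAID 0 set."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return logical_block % num_disks, logical_block // num_disks


def disks_touched(first_block, block_count, num_disks):
    """Set of member disks that hold some part of a run of consecutive blocks."""
    return {stripe_location(b, num_disks)[0]
            for b in range(first_block, first_block + block_count)}


# A run of four blocks on a three-disk array lands on every member,
# which is why losing any one member damages the whole run.
print([stripe_location(b, 3) for b in range(4)])
print(disks_touched(0, 4, 3))
```

Any file longer than one stripe row touches every member, so it can be read in parallel but is also damaged by the loss of any single disk.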


Advantage: RAID 0 uses a very simple design and is easy to implement with a HUGE performance advantage. RAID 0 implements a striped disk array: the data is broken down into blocks and each block is written to a separate disk drive. I/O performance is greatly improved by spreading the I/O load across many channels and drives; the best performance is achieved when data is striped across multiple controllers with only one drive per controller. No parity calculation overhead is involved.


Disadvantage: Not a "True" RAID because it is NOT fault-tolerant. The failure of just one drive will result in all data in an array being lost. Should never be used in mission critical environments such as a VOB storage without other backup solutions. RAID Recovery Labs is one of the only recovery labs that has developed tools and techniques to recover RAID 0 arrays.
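
To put a number on how the risk grows with the number of drives, here is a small Python sketch. It assumes drive failures are statistically independent and borrows the 1-in-500 weekly failure figure used in the RAID 1 example on this page; both are illustrative assumptions rather than measured values.

```python
def raid0_failure_probability(p_disk, num_disks):
    """Chance that at least one member fails, which loses the whole RAID 0 set."""
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be a probability between 0 and 1")
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    # The array survives only if every member survives.
    return 1.0 - (1.0 - p_disk) ** num_disks


weekly_disk_failure = 1 / 500  # assumed per-drive figure, for illustration only
for n in (1, 2, 4, 8):
    print(n, raid0_failure_probability(weekly_disk_failure, n))
```

Under these assumptions the array's failure probability rises almost in proportion to the number of member drives.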


NOTE: Since its inception, RAID 0 has been known as the "unrecoverable" array type. RAID Recovery Labs is one of the only mass storage recovery labs that has developed tools and techniques to successfully recover RAID 0 arrays. This is not an endorsement of RAID 0 by RAID Recovery Labs unless it is implemented for the proper scenario. Consider RAID 1+0, 0+1 or an array type that offers both performance and some form of fault tolerance.


RAID-1




A RAID 1 creates an exact copy (or mirror) of a set of data on two or more disks. This is useful when read performance or reliability are more important than data storage capacity. Such an array can only be as big as the smallest member disk. A classic RAID 1 mirrored pair contains two disks (see diagram), which increases reliability geometrically over a single disk. Since each member contains a complete copy of the data, and can be addressed independently, ordinary wear-and-tear reliability is raised by the power of the number of self-contained copies.


To understand RAID 1 failure rates consider a RAID 1 with two identical models of a disk drive with a weekly probability of failure of 1:500. Assuming defective drives are replaced weekly, the installation would carry a 1:250,000 probability of failure for a given week. That is, the likelihood that the RAID array is down due to mechanical failure during any given week is the product of the likelihoods of failure of both drives. In other words, if the probability of failure is 1 in 500 and if the failures are statistically independent then the probability of both drives failing is

$$\left(\frac{1}{500}\right)^{2} = \frac{1}{250{,}000}$$

Additionally, since all the data exist in two or more copies, each with its own hardware, the read performance can go up roughly as a linear multiple of the number of copies. That is, a RAID 1 array of two drives can be reading in two different places at the same time, though not all implementations of RAID 1 do this.[4] To maximize performance benefits of RAID 1, independent disk controllers are recommended, one for each disk. Some refer to this practice as splitting or duplexing.


When reading, both disks can be accessed independently and requested sectors can be split evenly between the disks. For the usual mirror of two disks, this would, in theory, double the transfer rate when reading. The apparent access time of the array would be half that of a single drive. Unlike RAID 0, this would be for all access patterns, as all the data are present on all the disks. In reality, the need to move the drive heads to the next block (to skip unread blocks) can largely cancel the speed advantage for sequential access. Read performance can be further improved by adding drives to the mirror.


Many older IDE RAID 1 controllers read only from one disk in the pair, so their read performance is always that of a single disk. Some older RAID 1 implementations would also read both disks simultaneously and compare the data to detect errors. The error detection and correction on modern disks makes this less useful in environments requiring normal availability.


When writing, the array performs like a single disk, as all mirrors must be written with the data. Note that these performance scenarios are in the best case with optimal access patterns.


RAID 1 has many administrative advantages. For instance, in some environments, it is possible to "split the mirror": declare one disk as inactive, do a backup of that disk, and then "rebuild" the mirror. This is useful in situations where the file system must be constantly available. This requires that the application supports recovery from the image of data on the disk at the point of the mirror split. This procedure is less critical in the presence of the "snapshot" feature of some file systems, in which some space is reserved for changes, presenting a static point-in-time view of the file system. Alternatively, a set of disks can be kept in much the same way as traditional backup.


Advantage: For highest performance, the controller must be able to perform two concurrent separate Reads per mirrored pair or two duplicate Writes per mirrored pair. One Write or two Reads are possible per mirrored pair. Twice the Read transaction rate of single disks, with the same Write transaction rate as single disks. 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk. Transfer rate per block is equal to that of a single disk. Under certain circumstances, RAID 1 can sustain multiple simultaneous drive failures. Simplest RAID storage subsystem design.


Disadvantage: Highest disk overhead of all RAID types (100%) - inefficient due to the duplication of Write tasks. Typically the RAID function is done by system software, loading the CPU/Server and possibly degrading throughput at high activity levels. Hardware implementation is strongly recommended. May not support hot swap of failed disk when implemented in "software".


RAID-2





RAID Level 2 is an implementation of the Hamming Code ECC. What this means is that each data word written to the disk has a corresponding error checking code written to another disk. On a Read, the ECC verifies that the data was read correctly or corrects a single disk error. Level 2 stripes data at the bit level rather than the block level.
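
To illustrate how a Hamming code can correct a single bad bit, here is a minimal Python sketch using the classic Hamming(7,4) code. The choice of a 4-bit data word, the bit ordering, and the idea of placing each of the seven bits on its own disk are assumptions made for this example; the text above does not specify a word size.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits, position of the corrected bit, or 0 if none was bad)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    bad_position = s1 + 2 * s2 + 4 * s4  # 1-based position of the flipped bit
    if bad_position:
        c[bad_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], bad_position


# Treat each of the 7 bits as living on its own disk and corrupt the fifth one.
word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1
print(hamming74_correct(stored))
```

The syndrome bits point directly at the position that was flipped, which is what lets the word be corrected on the fly during a read.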


RAID Level 2 requires a minimum of 3 drives to implement.


Advantage: Dynamic data error correction. Extremely high data transfer rates possible; the higher the data transfer rate required, the better the ratio of data disks to ECC disks. Relatively simple controller design compared to RAID levels 3, 4 & 5.


Disadvantage: Very high ratio of ECC disks to data disks with smaller word sizes - inefficient. The entry level cost is very high, and only a very high transfer rate requirement can justify it. The transaction rate is equal to that of a single disk at best (with spindle synchronization), and there are currently no commercial implementations.


RAID-3




A RAID 3 uses byte-level striping with a dedicated parity disk. RAID 3 is very rare in practice. One of the side-effects of RAID 3 is that it generally cannot service multiple requests simultaneously. This comes about because any single block of data will, by definition, be spread across all members of the set and will reside in the same location. So, any I/O operation requires activity on every disk.


In our example, a request for block "A" consisting of bytes A1-A6 would require all three data disks to seek to the beginning (A1) and reply with their contents. A simultaneous request for block B would have to wait.
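
The byte-level layout can be sketched in Python as follows. The function name and the use of XOR for the dedicated parity disk are assumptions for illustration; the six bytes stand in for A1-A6 in the example above.

```python
from functools import reduce
from operator import xor


def raid3_stripe(block, data_disks):
    """Deal the bytes of a block round-robin across the data disks.

    Returns (per-disk byte strings, parity bytes for the dedicated parity disk).
    """
    if len(block) % data_disks:
        raise ValueError("block length must be a multiple of the number of data disks")
    lanes = [block[i::data_disks] for i in range(data_disks)]
    # Each parity byte is the XOR of the bytes stored at the same offset on every data disk.
    parity = bytes(reduce(xor, column) for column in zip(*lanes))
    return lanes, parity


# Six bytes standing in for A1..A6, spread over three data disks.
lanes, parity = raid3_stripe(b"ABCDEF", 3)
print(lanes, parity)
```

Because every disk ends up holding part of the block, reading the block back needs all data disks at once, which is why a second request has to wait.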


Advantage: Very high Read data transfer rate. Very high Write data transfer rate. Disk failure has an insignificant impact on throughput. Low ratio of ECC (Parity) disks to data disks means high efficiency.


Disadvantage: Transaction rate equal to that of a single disk drive at best (if spindles are synchronized). Controller design is fairly complex. Very difficult and resource intensive to do as a "software" RAID because of the parity generation and checking.


RAID-4




A RAID 4 uses block-level striping with a dedicated parity disk. This allows each member of the set to act independently when only a single block is requested. If the disk controller allows it, a RAID 4 set can service multiple read requests simultaneously. RAID 4 looks similar to RAID 5 except that it does not use distributed parity, and similar to RAID 3 except that it stripes at the block level, rather than the byte level. Generally, RAID 4 is implemented with hardware support for parity calculations, and a minimum of 3 disks is required for a complete RAID 4 configuration.


In the example above, a read request for block "A1" would be serviced by disk 0. A simultaneous read request for block B1 would have to wait, but a read request for B2 could be serviced concurrently by disk 1.
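
The rule in this example can be written as a short Python sketch. It assumes the labelling used above, where the letter names the stripe and the number is the block's position within it, with disks numbered from 0; the helper names are invented for this illustration.

```python
def data_disk_for(label):
    """'B2' -> 1: the second block of stripe B sits on data disk 1."""
    position_in_stripe = int(label[1:])
    return position_in_stripe - 1


def can_read_concurrently(label_a, label_b):
    """Two single-block reads can overlap only if they land on different disks."""
    return data_disk_for(label_a) != data_disk_for(label_b)


print(can_read_concurrently("A1", "B1"))  # both on disk 0
print(can_read_concurrently("A1", "B2"))  # disk 0 and disk 1
```

Single-block reads never involve the dedicated parity disk, so it does not enter into this check.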


Advantage: Very high Read data transaction rate. Low ratio of ECC (Parity) disks to data disks means high efficiency. Has a high aggregate Read transfer rate.


Disadvantage: Requires a complex controller design. The worst Write transaction rate and Write aggregate transfer rate. It is difficult and inefficient to rebuild data in the event of disk failure. Block Read transfer rate is equal to that of a single disk.


RAID-5




A RAID 5 uses block-level striping with parity data distributed across all member disks. RAID 5 has achieved popularity due to its low cost of redundancy. Generally, RAID 5 is implemented with hardware support for parity calculations. A minimum of 3 disks is generally required for a complete RAID 5 configuration. A RAID 5 two disk set is possible, but many implementations do not allow for this. In some implementations a degraded disk set can be made (3 disk set of which 2 are online).


In the example above, a read request for block "A1" would be serviced by disk 0. A simultaneous read request for block B1 would have to wait, but a read request for B2 could be serviced concurrently by disk 1.


RAID 5 parity handling - A series of blocks (one on each of the disks in an array) is collectively called a "stripe". If another block, or some portion of a block, is written on that same stripe, the parity block (or some portion of the parity block) is recalculated and rewritten. For small writes, this requires reading the old data and the old parity, then writing the new data and the new parity. The disk used for the parity block is staggered from one stripe to the next, hence the term "distributed parity blocks". RAID 5 writes are expensive in terms of disk operations and traffic between the disks and the controller.
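
A minimal Python sketch of the small-write parity update, assuming XOR parity (the usual choice for RAID 5). The function names are illustrative. The point is that the new parity can be derived from the old data, the new data, and the old parity alone, without reading the other data blocks in the stripe.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def small_write_parity(old_data, new_data, old_parity):
    """Parity after overwriting one data block, using only that block and the old parity."""
    # XOR-ing out the old data and XOR-ing in the new data updates the parity in place.
    return xor_blocks([old_parity, old_data, new_data])


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(stripe)

new_block = b"\xaa\xbb"
updated = small_write_parity(stripe[1], new_block, parity)
recomputed = xor_blocks([stripe[0], new_block, stripe[2]])
print(updated == recomputed)
```

The shortcut gives the same parity as recomputing it from the whole stripe, at the cost of two reads and two writes for every small write.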


The parity blocks are not read on data reads, since this would be unnecessary overhead and would diminish performance. The parity blocks are read, however, when a read of a data sector results in a cyclic redundancy check (CRC) error. In this case, the sector in the same relative position within each of the remaining data blocks in the stripe and within the parity block in the stripe are used to reconstruct the errant sector. The CRC error is thus hidden from the main computer. Likewise, should a disk fail in the array, the parity blocks from the surviving disks are combined mathematically with the data blocks from the surviving disks to reconstruct the data on the failed drive "on the fly".
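
The reconstruction step can be sketched in Python as below, again assuming XOR parity. The names are made up for this example; a real array repeats this for every stripe, taking each stripe's parity block from whichever disk holds it.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks):
    """Recover the one missing block of a stripe.

    surviving_blocks must hold every remaining data block plus the parity
    block of the same stripe; XOR-ing them cancels everything except the
    missing block.
    """
    return xor_blocks(surviving_blocks)


data = [b"disk0 blk", b"disk1 blk", b"disk2 blk"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed.
survivors = [data[0], data[2], parity]
print(rebuild_missing_block(survivors))
```

The same call also repairs a single unreadable sector, since only the matching sector range from each surviving block is needed.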


This is sometimes called Interim Data Recovery Mode. The computer knows that a disk drive has failed, but this is only so that the operating system can notify the administrator that a drive needs replacement; applications running on the computer are unaware of the failure. Reading and writing to the drive array continues seamlessly, though with some performance degradation. The difference between RAID 4 and RAID 5 is that in interim data recovery mode, RAID 5 might be slightly faster than RAID 4: When the CRC and parity are in the disk that failed, the calculation does not have to be performed, while with RAID 4, if one of the data disks fails, the calculations have to be performed with each access.


RAID 5 disk failure rate - The maximum number of drives in a RAID 5 redundancy group is theoretically unlimited, but it is common practice to limit the number of drives. The tradeoffs of larger redundancy groups are greater probability of a simultaneous double disk failure, the increased time to rebuild a redundancy group, and the greater probability of encountering an unrecoverable sector during RAID reconstruction. As the number of disks in a RAID 5 group increases, the Mean Time Between Failures (MTBF, the reciprocal of the failure rate) can become lower than that of a single disk. This happens when the likelihood of a second disk failing out of (N-1) dependent disks, within the time it takes to detect, replace and recreate a first failed disk, becomes larger than the likelihood of a single disk failing. RAID 6 is an alternative that provides dual parity protection thus enabling larger numbers of disks per RAID group.
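
The tradeoff can be given a rough scale with a short Python sketch. It uses a deliberately simple model that is an assumption of this example, not something stated above: each surviving disk fails independently at a constant rate of 1/MTBF. The 24-hour rebuild time and 100,000-hour MTBF are likewise made-up inputs.

```python
import math


def second_failure_probability(num_disks, rebuild_hours, disk_mtbf_hours):
    """Chance that another disk fails before a failed disk has been rebuilt.

    Assumes independent failures at a constant rate of 1 / disk_mtbf_hours.
    """
    if num_disks < 2:
        raise ValueError("need at least two disks")
    survivors = num_disks - 1
    return 1.0 - math.exp(-survivors * rebuild_hours / disk_mtbf_hours)


for n in (3, 8, 16):
    print(n, second_failure_probability(n, rebuild_hours=24, disk_mtbf_hours=100_000))
```

Under this model the risk grows with both the group size and the rebuild time, which is the reasoning behind limiting the number of drives per redundancy group.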


Some RAID vendors will avoid placing disks from the same manufacturing lot in a redundancy group to minimize the odds of simultaneous early life and end of life failures as evidenced by the Bathtub curve.


RAID 5 performance - RAID 5 implementations suffer from poor performance when faced with a workload which includes many writes which are smaller than the capacity of a single stripe; this is because parity must be updated on each write, requiring read-modify-write sequences for both the data block and the parity block. More complex implementations often include non-volatile write back cache to reduce the performance impact of incremental parity updates.


The read performance of RAID 5 is almost as good as RAID 0 for the same number of disks. Except for the parity blocks, the distribution of data over the drives follows the same pattern as RAID 0. The reason RAID 5 is slightly slower is that the disks must skip over the parity blocks.


In the event of a system failure while there are active writes, the parity of a stripe may become inconsistent with the data. If this is not detected and repaired before a disk or block fails, data loss may ensue as incorrect parity will be used to reconstruct the missing block in that stripe. This potential vulnerability is sometimes known as the "write hole". Battery-backed cache and similar techniques are commonly used to reduce the window of opportunity for this to occur.


RAID 5 usable size - Parity data use up the capacity of one drive in the array. (This can be seen by comparing it with RAID 4: RAID 5 distributes the parity data across the disks, while RAID 4 centralizes it on one disk, but the amount of parity data is the same.) If the drives vary in capacity, the smallest of them sets the bar. Therefore, the usable capacity of a RAID 5 array is

$$(N - 1) \times S_{\min}$$

where $N$ is the total number of drives in the array and $S_{\min}$ is the capacity of the smallest drive in the array.
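
As a quick worked example in Python, the usable capacity is one drive fewer than the total, multiplied by the smallest drive. The function name is illustrative, and the drive sizes are the odd-sized set from the concatenation example earlier on this page.

```python
def raid5_usable_capacity(drive_sizes):
    """Usable capacity of a RAID 5 array: (N - 1) times the smallest drive."""
    n = len(drive_sizes)
    if n < 2:
        raise ValueError("need at least two drives")
    return (n - 1) * min(drive_sizes)


# The 3 GB, 15 GB, 5.5 GB and 12 GB drives from the concatenation example.
mixed = [3, 15, 5.5, 12]
print(raid5_usable_capacity(mixed), "GB usable out of", sum(mixed), "GB raw")
```

With mismatched drives most of the raw capacity is lost, which is why RAID 5 sets are normally built from drives of the same size.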


The number of hard drives that can belong to a single array is theoretically unlimited (although the time required for initial construction of the array as well as that for reconstruction of a failed disk increases with the number of drives in an array).


Advantage: It has the highest Read data transaction rate and a medium Write data transaction rate. A low ratio of ECC (Parity) disks to data disks means high efficiency along with a good aggregate transfer rate.


Disadvantage: Disk failure has a medium impact on throughput. It also has the most complex controller design. It is often difficult to rebuild in the event of a disk failure (as compared to RAID level 1), and the individual block data transfer rate is the same as that of a single disk.


RAID-6




Striped set with dual distributed parity. Provides fault tolerance from two drive failures; array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high availability systems. This becomes increasingly important because large-capacity drives lengthen the time needed to recover from the failure of a single drive.


Single parity RAID levels are vulnerable to data loss until the failed drive is rebuilt: the larger the drive, the longer the rebuild will take. Dual parity gives time to rebuild the array without the data being exposed to loss from a second failure while the failed drive is being recovered.


Advantage: RAID 6 is essentially an extension of RAID level 5 which allows for additional fault tolerance by using a second independent distributed parity scheme (two-dimensional parity). Data is striped on a block level across a set of drives, just like in RAID 5, and a second set of parity is calculated and written across all the drives. RAID 6 provides extremely high data fault tolerance and can sustain two simultaneous drive failures, which typically makes it a perfect solution for mission critical applications.


Disadvantage: Very complex controller design, with extremely high controller overhead to compute parity addresses. It has very poor Write performance and requires N+2 drives to implement because of the two-dimensional parity scheme.



2007 RAID RECOVERY LABS, Inc. All Rights Reserved.