RAID - The Basics of RAID
What is RAID?
RAID stands for "Redundant Array of Independent Disks".
There are six levels of RAID, level 0 through level 5. Each level supports a different
storage layout scheme on the disk drives, from mirroring to parity striping. All drives in
a RAID set MUST BE identical (the same model and capacity).
What is meant by redundancy, parity or parity data?
Redundancy means that there is protection against any single disk
failure. Parity data is information used by a RAID system to rebuild the data on a disk in
the event of a failure. Parity data is created by using a logical exclusive-OR (XOR) on
actual user data and storing the result on disk. Example: If an array of 5 drives exists,
then 4 drives are used as the storage devices and the 5th as the parity drive. Data on the
first sector of each of the 4 data drives is XORed, creating parity data that is stored on
the first sector of the parity drive. The same holds true for the second sector, and so on.
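The XOR scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name xor_parity and the two-byte "sectors" are made up for the example, and a real controller works on full sectors in hardware or driver code.

```python
from functools import reduce

def xor_parity(sectors):
    """Return the byte-wise XOR of a list of equally sized sectors."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*sectors))

# Four hypothetical data drives, each contributing one (tiny) sector.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\x11", b"\xf0\x02"]

# The fifth drive stores the XOR of the four data sectors.
parity = xor_parity(data)

# Simulate losing the third data drive: XOR the survivors with the parity
# sector to get the missing sector back.
survivors = data[:2] + data[3:]
rebuilt = xor_parity(survivors + [parity])
print(rebuilt == data[2])
```

Because XOR is its own inverse, the same function serves both to create the parity and to rebuild a lost sector.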
What are the different RAID levels and what do they support?
Level 0 : Disk Striping - data is transferred in parallel
across an array of disks. Redundancy is not provided in this level.
Level 1 : Disk Mirroring - duplicate contents of one disk
are written onto another disk.
Level 2 : Bit interleaving data across multiple disks with
parity information created using a Hamming code. A Hamming code detects errors that occur
and determines which part is in error. RAID level 2 specifies 39 disks with 32 disks of
user storage and 7 disks of error recovery coding.
Level 3 : Data is striped across multiple drives and parity
is written to a dedicated drive. Level 3 is typically implemented at the BYTE level.
Level 4 : Data is striped across multiple drives and parity
is written to a dedicated drive. Level 4 is typically implemented at the BLOCK level.
Level 5 : Error correction data is striped at the block
level across all the drives in the array. Reads and writes may be performed concurrently.
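The 39-disk figure quoted for level 2 is consistent with a Hamming code extended by one overall parity bit (single-error correction plus double-error detection). That reading is an assumption and is not stated in the text above; the sketch below only checks the arithmetic. The function name hamming_check_bits is illustrative.

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r

data_disks = 32
correction_disks = hamming_check_bits(data_disks)
# Assumption: one extra disk holds an overall parity bit for
# double-error detection.
total_disks = data_disks + correction_disks + 1
print(correction_disks, total_disks)
```

With 32 data disks this gives 6 correction disks, and the extra parity disk brings the total to 39.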
What are the minimum requirements to run a RAID set?
If you are implementing RAID level 0, you will need a minimum of 2
disk drives to create a rank, a RAID controller card, and Windows NT. If you are
implementing RAID level 3 or 5, then you will need a minimum of 3 disks to create a rank.
RAID is also supported in some Novell and Unix environments.
Detailed RAID - From the Linux high performance SCSI & RAID page.
In 1987, Patterson, Gibson and Katz at the University of California,
Berkeley, published a paper entitled "A Case for Redundant Arrays of Inexpensive
Disks (RAID)". This paper described various types of disk arrays, referred to by the
acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk
drives into an array of disk drives which yields performance exceeding that of a Single
Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer
as a single logical storage unit or drive.
The Mean Time Between Failure (MTBF) of the array will be equal to
the MTBF of an individual drive, divided by the number of drives in the array. Because of
this, the MTBF of an array of drives would be too low for many application requirements.
However, disk arrays can be made fault-tolerant by redundantly storing information in
various ways.
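The MTBF rule above is simple enough to check by hand, but a small sketch shows how quickly the figure drops. The 500,000 hour drive rating is a hypothetical number chosen for the example, and array_mtbf is an illustrative name.

```python
def array_mtbf(drive_mtbf_hours, drive_count):
    """MTBF of a non-redundant array: drive MTBF divided by drive count."""
    return drive_mtbf_hours / drive_count

# Hypothetical drive rated at 500,000 hours MTBF.
for drive_count in (1, 2, 5, 10):
    print(drive_count, array_mtbf(500_000, drive_count))
```

Under this rule, five such drives without redundancy give an array MTBF of 100,000 hours.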
Five types of array architectures, RAID-1 through RAID-5, were
defined by the Berkeley paper, each providing disk fault-tolerance and each offering
different trade-offs in features and performance. In addition to these five redundant
array architectures, it has become popular to refer to a non-redundant array of disk
drives as a RAID-0 array.
Data Striping
Fundamental to RAID is "striping", a method of
concatenating multiple drives into one logical storage unit. Striping involves
partitioning each drive's storage space into stripes which may be as small as one sector
(512 bytes) or as large as several megabytes. These stripes are then interleaved
round-robin, so that the combined space is composed alternately of stripes from each
drive. In effect, the storage space of the drives is shuffled like a deck of cards. The
type of application environment, I/O or data intensive, determines whether large or small
stripes should be used.
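The round-robin interleaving can be expressed as a mapping from a logical sector number to a drive and a sector on that drive. The following is a minimal sketch, assuming stripes of a fixed number of sectors and drives numbered from 0; the name locate and its parameters are made up for the example.

```python
def locate(logical_sector, drive_count, stripe_sectors):
    """Map a logical sector to (drive, sector on that drive)."""
    # Which stripe the sector falls in, and where inside that stripe.
    stripe_index, offset = divmod(logical_sector, stripe_sectors)
    # Stripes are dealt out to the drives round-robin, row by row.
    row, drive = divmod(stripe_index, drive_count)
    return drive, row * stripe_sectors + offset

# Four drives, stripes of 8 sectors (4 KB with 512-byte sectors).
for sector in (0, 7, 8, 31, 32):
    print(sector, locate(sector, drive_count=4, stripe_sectors=8))
```

Logical sectors 0 to 7 land on drive 0, sectors 8 to 15 on drive 1, and sector 32 wraps around to drive 0 again, one stripe further in.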
Most multi-user operating systems today, like NT, Unix and Netware,
support overlapped disk I/O operations across multiple drives. However, in order to
maximize throughput for the disk subsystem, the I/O load must be balanced across all the
drives so that each drive can be kept busy as much as possible. In a multiple drive system
without striping, the disk I/O load is never perfectly balanced. Some drives will contain
data files which are frequently accessed and some drives will only rarely be accessed. In
I/O intensive environments, performance is optimized by striping the drives in the array
with stripes large enough so that each record potentially falls entirely within one
stripe. This ensures that the data and I/O will be evenly distributed across the array,
allowing each drive to work on a different I/O operation, and thus maximize the number of
simultaneous I/O operations which can be performed by the array.
In data intensive environments and single-user systems which access
large records, small stripes (typically one 512-byte sector in length) can be used so that
each record will span across all the drives in the array, each drive storing part of the
data from the record. This causes long record accesses to be performed faster, since the
data transfer occurs in parallel on multiple drives. Unfortunately, small stripes rule out
multiple overlapped I/O operations, since each I/O will typically involve all drives.
However, operating systems like DOS, which do not allow overlapped disk I/O, will not be
negatively impacted. Applications such as on-demand video/audio, medical imaging and data
acquisition, which utilize long record accesses, will achieve optimum performance with
small stripe arrays.
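The trade-off between large and small stripes comes down to how many drives a single record touches. A rough sketch, assuming the same round-robin layout as described above; drives_touched is an illustrative name, and the 8-sector record is a made-up example.

```python
def drives_touched(first_sector, sector_count, drive_count, stripe_sectors):
    """Return the set of drives holding any sector of the record."""
    return {
        (sector // stripe_sectors) % drive_count
        for sector in range(first_sector, first_sector + sector_count)
    }

# An 8-sector (4 KB) record starting at logical sector 16, on 4 drives.
large = drives_touched(16, 8, drive_count=4, stripe_sectors=64)
small = drives_touched(16, 8, drive_count=4, stripe_sectors=1)
print(sorted(large))
print(sorted(small))
```

With 64-sector stripes the record sits on one drive, leaving the other three free for other I/O operations. With one-sector stripes all four drives take part in the transfer.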
A potential drawback to using small stripes is that synchronized
spindle drives are required in order to keep performance from being degraded when short
records are accessed. Without synchronized spindles, each drive in the array will be at
different random rotational positions. Since an I/O cannot be completed until every drive
has accessed its part of the record, the drive which takes the longest will determine when
the I/O completes. The more drives in the array, the more the average access time for the
array approaches the worst case single-drive access time. Synchronized spindles assure
that every drive in the array reaches its data at the same time. The access time of the
array will thus be equal to the average access time of a single drive rather than
approaching the worst case access time.
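The effect of unsynchronized spindles can be estimated with a simple model. Assume, as an idealization not taken from the text above, that each drive's rotational delay is independent and uniformly distributed over one rotation, and ignore seek time. The expected delay of the slowest of n drives is then n/(n+1) of a rotation. The 8 ms rotation time and the name expected_wait are hypothetical.

```python
def expected_wait(rotation_ms, drive_count):
    """Expected rotational delay of the slowest of drive_count drives.

    Assumes independent delays, uniform over one rotation.
    """
    return rotation_ms * drive_count / (drive_count + 1)

# Hypothetical drives with an 8 ms rotation time.
for drive_count in (1, 2, 3, 7):
    print(drive_count, expected_wait(8.0, drive_count))
```

A single drive waits half a rotation on average, while seven unsynchronized drives wait seven eighths of a rotation, close to the worst case of one full rotation.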
The different RAID levels
- RAID-0
- RAID Level 0 is not redundant, hence does not truly fit the
"RAID" acronym. In level 0, data is split across drives, resulting in higher
data throughput. Since no redundant information is stored, performance is very good, but
the failure of any disk in the array results in data loss. This level is commonly referred
to as striping.
- RAID-1
- RAID Level 1 provides redundancy by writing all data to two or more
drives. The performance of a level 1 array tends to be faster on reads and slower on
writes compared to a single drive, but if either drive fails, no data is lost. This is a
good entry-level redundant system, since only two drives are required; however, since one
drive is used to store a duplicate of the data, the cost per megabyte is high. This level
is commonly referred to as mirroring.
- RAID-2
- RAID Level 2, which uses Hamming error correction codes, is intended
for use with drives which do not have built-in error detection. All SCSI drives support
built-in error detection, so this level is of little use when using SCSI drives.
- RAID-3
- RAID Level 3 stripes data at a byte level across several drives, with
parity stored on one drive. It is otherwise similar to level 4. Byte-level striping
requires hardware support for efficient use.
- RAID-4
- RAID Level 4 stripes data at a block level across several drives,
with parity stored on one drive. The parity information allows recovery from the failure
of any single drive. The performance of a level 4 array is very good for reads (the same
as level 0). Writes, however, require that parity data be updated each time. This slows
small random writes, in particular, though large writes or sequential writes are fairly
fast. Because only one drive in the array stores redundant data, the cost per megabyte of
a level 4 array can be fairly low.
- RAID-5
- RAID Level 5 is similar to level 4, but distributes parity among the
drives. This can speed small writes in multiprocessing systems, since the parity disk does
not become a bottleneck. Because parity data must be skipped on each drive during reads,
however, the performance for reads tends to be considerably lower than a level 4 array.
The cost per megabyte is the same as for level 4.
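The write penalty of levels 4 and 5, and the parity bottleneck of level 4, can be illustrated as follows. This is a sketch under assumptions: the parity update shown (old parity XOR old data XOR new data) is a commonly used technique that the text above does not spell out, and the level 5 rotation in parity_drive is just one possible layout. All function names are illustrative.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def updated_parity(old_parity, old_block, new_block):
    """Parity after a small write, without reading the other data drives."""
    # Remove the old block's contribution, then add the new block's.
    return xor_bytes(xor_bytes(old_parity, old_block), new_block)

def parity_drive(row, drive_count, level):
    """Drive holding the parity block for a given stripe row."""
    if level == 4:
        return drive_count - 1                     # dedicated parity drive
    return (drive_count - 1 - row) % drive_count   # level 5: assumed rotation

blocks = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

# Overwrite the second block and update the parity incrementally.
new_block = b"\xff\x00"
parity_after = updated_parity(parity, blocks[1], new_block)
full_recompute = xor_bytes(xor_bytes(blocks[0], new_block), blocks[2])
print(parity_after == full_recompute)

# Parity drive used by small writes to four consecutive rows of a 4-drive array.
print([parity_drive(row, 4, level=4) for row in range(4)])
print([parity_drive(row, 4, level=5) for row in range(4)])
```

Each small write costs two reads (old data, old parity) and two writes (new data, new parity). In level 4 every such write goes through the same parity drive, while in level 5 the parity accesses are spread over all drives.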
RAID-0 is the fastest and most efficient array type but offers no
fault-tolerance.
RAID-1 is the array of choice for performance-critical,
fault-tolerant environments. In addition, RAID-1 is the only choice for fault-tolerance if
no more than two drives are desired.
RAID-2 is seldom used today since ECC is embedded in almost all
modern disk drives.
RAID-3 can be used in data intensive or single-user environments
which access long sequential records to speed up data transfer. However, RAID-3 does not
allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in
order to avoid performance degradation with short records.
RAID-4 offers no advantages over RAID-5 and does not support
multiple simultaneous write operations.
RAID-5 is the best choice in multi-user environments which are not
write performance sensitive. However, at least three, and more typically five, drives are
required for RAID-5 arrays.
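The cost per megabyte remarks above follow from how many drives' worth of space is left for user data. A minimal sketch, assuming all drives in the array have the same capacity and that a level 1 array mirrors one drive's contents onto every other drive; usable_drives is an illustrative name.

```python
def usable_drives(level, drive_count):
    """Number of drives' worth of capacity available for user data."""
    if level == 0:
        return drive_count          # striping only, nothing set aside
    if level == 1:
        return 1                    # every drive holds the same data
    if level in (3, 4, 5):
        return drive_count - 1      # one drive's worth of parity
    raise ValueError("level not covered by this sketch")

for level, drive_count in ((0, 5), (1, 2), (4, 5), (5, 5)):
    usable = usable_drives(level, drive_count)
    print(level, drive_count, usable, usable / drive_count)
```

A two-drive mirror leaves half of the raw capacity usable, while a five-drive level 4 or level 5 array leaves four fifths.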
Possible approaches to RAID
Hardware RAID
The hardware based system manages the RAID subsystem independently from the host and
presents to the host only a single disk per RAID array. This way the host doesn't have to
be aware of the RAID subsystem(s).
- The controller based hardware solution
DPT's SCSI controllers are a good example of a controller based RAID solution.
The intelligent controller manages the RAID subsystem independently from the host. The
advantage over an external SCSI---SCSI RAID subsystem is that the controller is able to
span the RAID subsystem over multiple SCSI channels and thereby remove the limiting
factor external RAID solutions have: the transfer rate over the SCSI bus.
- The external hardware solution (SCSI---SCSI RAID)
An external RAID box moves all RAID handling "intelligence" into a controller
that is sitting in the external disk subsystem. The whole subsystem is connected to the
host via a normal SCSI controller and appears to the host as a single disk.
This solution has drawbacks compared to the controller based solution: The single SCSI
channel used in this solution creates a bottleneck.
4 SCSI drives can already completely flood a SCSI bus, since the average transfer size is
around 4KB and the command transfer overhead - which even in Ultra SCSI is still done
asynchronously - takes most of the bus time.
Software RAID
- The MD driver in the Linux kernel is an example of a RAID solution
that is completely hardware independent.
However, its application is limited, since it only provides RAID level 0, but not
levels 1 and 5. The author stopped working on this.
- Adaptec's RAID controllers are another example. They have no RAID
functionality whatsoever on the controller; they depend on external drivers to provide all
RAID functionality.
They are basically only multiple single AHA2940 controllers which have been integrated on
one card. Linux detects them as AHA2940 and treats them accordingly.
Every OS needs its own special driver for this type of RAID solution, which is error prone
and not very compatible.
Hardware vs. Software RAID
Just like any other application, software-based arrays occupy host system memory, consume
CPU cycles and are operating system dependent. By contending with other applications that
are running concurrently for host CPU cycles and memory, software-based arrays degrade
overall server performance. Also, unlike hardware-based arrays, the performance of a
software-based array is directly dependent on server CPU performance and load.
Except for the array functionality, hardware-based RAID schemes have
very little in common with software-based implementations. Since the host CPU can execute
user applications while the array adapter's processor simultaneously executes the array
functions, the result is true hardware multi-tasking. Hardware arrays also do not occupy
any host system memory, nor are they operating system dependent.
Hardware arrays are also highly fault tolerant. Since the array
logic is based in hardware, software is NOT required to boot. Some software arrays,
however, will fail to boot if the boot drive in the array fails. For example, an array
implemented in software can only be functional when the array software has been read from
the disks and is memory-resident. What happens if the server can't load the array software
because the disk that contains the fault tolerant software has failed? Software-based
implementations commonly require a separate boot drive, which is NOT included in the
array.
Reasons why you should use RAID:
- Speed
- Increased Storage capacity
- The economic costs of disk failure
- In addition to downtime, consider...
- Emergency service cost
- Cost of restoring data
- Immediate lost productivity
- Long term lost sales
- Lost repeat sales
- Lost word-of-mouth advertising
- In a commercial enterprise, the cost of a disk failure when there is
no mirroring or RAID is much larger than usually recognized.
- Unexpectedly, the largest cost is the accumulated lost sales over a
long period of time.