Defend a selection of a particular kind of RAID configuration, given a particular application that you select.
Introduction to RAID
RAID stands for Redundant Array of Independent Disks; the acronym was originally expanded as Redundant Array of Inexpensive Disks. A RAID is a group of hard drives arranged so that storage is optimized for a particular need, such as performance, capacity, or fault tolerance. The arrangement can be implemented in software or in hardware. The hard drives configured in a RAID setup are collectively known as an array.
Many operating systems provide a software implementation for setting up RAID. In this case, a software layer sits above the disk device drivers and provides an abstraction layer between the logical drives and the physical drives (Vadala, 2003).
Different available RAID levels and their pros & cons
There are many types of RAID array, such as RAID 0, RAID 1, RAID 5, RAID 6, and JBOD; combined (nested) arrays are referred to by two-digit numbers. RAID 0 is often considered the most common level, followed by RAID 1+0, RAID 5, and RAID 0+1. Operating system support varies. Apple’s Mac OS Server supports RAID 0 and RAID 1. FreeBSD supports RAID 0, RAID 1, RAID 3, and RAID 5, with all layering done through GEOM modules. Linux supports RAID 0, RAID 1, RAID 4, RAID 5, and RAID 6. Microsoft’s server operating systems support three RAID levels, and some Microsoft desktop operating systems, such as Windows XP, also support RAID. RAID functionality in Windows is slower than hardware RAID, but it allows a RAID array to be moved from one machine to another.
Software RAID runs on the host server to which the storage is attached, so the server’s processor must dedicate processing time to running the RAID software. The additional processing capacity needed for RAID 0 and RAID 1 is low, while parity-based levels demand more. Software RAID implementations may employ more sophisticated algorithms than hardware RAID implementations (Ike, 2009).
The pros and cons of the different RAID types are given in the table below:
Storage capacity and efficiency
Fault tolerance
Combined value (overall)
Workstation applications such as image, video, and engineering work
Average write, good read
Desktops and entry-level servers
Average write, good read
High-capacity storage or mid-range servers, particularly DAS and NAS
High-performance enterprise servers, e.g. database servers
Good write, excellent read
Large-volume, non-critical storage
Best RAID for a given application – Case of Database Servers
This section discusses the use of RAID10 for database servers. RAID10 is implemented as a striped array whose segments are RAID1 arrays. Its level of fault tolerance is the same as that of RAID1, and its fault-tolerance overhead is the same as for mirroring alone. The advantages of RAID10 include a high I/O rate and, under certain circumstances, the ability to sustain multiple simultaneous drive failures. These advantages make RAID10 the best solution for applications that would otherwise have used RAID1, since it adds a performance boost that RAID1 alone cannot provide. Implementing RAID10 requires a minimum of 4 drives. The following diagram gives the schematic of RAID10:
Diagram 1: Schematic diagram of RAID10 disk arrangement
As can be seen from the diagram above, the robust RAID10 structure combines several RAID1 mirrors with a RAID0 stripe laid over the top. This gives good replication and sharing of data between disks, with simple read and seek operations, and only a simple rebuild is required whenever a hard drive fails, which makes data recovery on a RAID10 arrangement easy. The combination of RAID0, with its high speed, and RAID1, with its data redundancy, suits database servers very well and needs no parity calculations. Arrays that combine mirroring and striping come in two types, RAID0+1 and RAID1+0. In RAID0+1, data is striped across several disks and the striped disk sets are then mirrored. In RAID1+0, the arrangement usually meant by RAID10, data is mirrored first and the mirrors are then striped (Ike, 2009).
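The stripe-over-mirrors layout described above can be illustrated with a short sketch. It is only an illustration under simplifying assumptions: the stripe unit is a single block, disks are paired as (0, 1), (2, 3) and so on, and the function name raid10_locate is hypothetical rather than taken from any real RAID implementation.

```python
def raid10_locate(block: int, num_disks: int) -> tuple[int, tuple[int, int]]:
    """Return (row on disk, (primary disk, mirror disk)) for a logical block.

    Assumes a one-block stripe unit and adjacent disks forming mirror pairs.
    """
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch assumes an even number of disks, at least 4")
    pairs = num_disks // 2
    pair = block % pairs   # RAID0 layer: blocks rotate across the mirror pairs
    row = block // pairs   # position of the block within each pair
    return row, (2 * pair, 2 * pair + 1)  # RAID1 layer: both disks hold a copy


# Show where the first four logical blocks land on a 4-disk array.
for b in range(4):
    print(b, raid10_locate(b, 4))
```

Every block is written to both disks of one pair, and consecutive blocks go to different pairs, which is why reads and writes spread across several drives without any parity being computed.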
Why RAID10 is preferred for Database Servers
The case of database servers clearly highlights the importance of the RAID10 disk arrangement for storage, backup, and fault tolerance. RAID10 is mainly used for heavily loaded applications such as MS Exchange or Oracle databases (Vadala, 2003). The following points describe the advantages of RAID10:
Sturdy in the event of data loss & recovery: As explained earlier, RAID10 uses an even number of disks arranged as mirrored pairs, so every disk has a mirror. If a disk fails, its mirror and all the remaining disks keep the array in service. The array keeps working, without any severe or visible impact on the storage and backup capability of the server, as long as at least one disk of every mirrored pair survives; in the best case, half of the disks can fail. The database server of any organization holds critical data. RAID10 therefore gives peace of mind to the DBA and the business, as it can survive a large number of disk failures.
Although RAID10 can theoretically handle many drives failing, the database administrator must rectify failures immediately, as leaving a degraded RAID10 array running until further drives fail is risky. In certain unforeseen situations involving physical damage, such as flooding, fire, and other natural disasters, the drives can fail catastrophically. In such situations, the solution is to have a disaster recovery center away from the production location. The disaster recovery center would receive data from the production location every day and create a daily backup. Failure of both disks in the same mirror set is extremely rare, but it can still occur. In such scenarios, the disaster recovery backup is very useful.
The simple process of recovering data from RAID10 begins with imaging the entire volume, so that the whole array can be processed as image files. These steps keep all of the data safe and ease the recovery. The next step is to evaluate the images to identify whether corruption is actually present. If a corrupt disk or sector is found, the next step is to repair it. Many data recovery companies use their own software to reconstruct or extract the data, in a process similar to the original rebuild. While the rebuild is happening, a root cause analysis of the problem is carried out to prevent such problems in future.
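The point about failures within the same mirror set can be made concrete with a small sketch. It assumes disks are paired as (0, 1), (2, 3) and so on; the function name raid10_survives is hypothetical and the check ignores real-world factors such as rebuild time.

```python
def raid10_survives(num_disks: int, failed: set[int]) -> bool:
    """True if every mirror pair still has at least one working disk.

    Assumes adjacent disks form the mirror pairs: (0, 1), (2, 3), ...
    """
    pairs = [(d, d + 1) for d in range(0, num_disks, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)


print(raid10_survives(8, {0, 2, 4, 6}))  # one disk lost from each pair
print(raid10_survives(8, {0, 1}))        # both disks of the same pair lost
```

Four failures spread across four pairs leave the array readable, while two failures in one pair lose data, which is why a degraded array should be repaired promptly rather than left to run.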
Ease of rebuild: As the data loss and recovery capability of the RAID10 arrangement is excellent, a rebuild can be done very easily. This does not mean that everyone with some IT exposure should attempt the rebuild. In situations where the RAID10 array has failed, ordinary operators are likely to be confused by the complex process involved in recovery and rebuild, and so they must not attempt the recovery. Especially in the case of server registry loss or controller failure, normal IT support staff are not able to perform the RAID repair, as they lack the training, experience, and basic tools to carry it out. Thus, when a RAID10 array fails, it is best to consult experts in this domain, as there is a risk of losing precious data.
I/O performance: RAID10 is known for its high-performance write operations. This makes it an ideal choice for applications that need high write performance, such as database servers. By contrast, parity-based arrangements such as RAID5 are not preferred for write-heavy database systems, because their write performance is less impressive. When several hundred transactions enter the system at once, RAID10 is found to respond better than other RAID arrangements. In such scenarios, even the best cache on another RAID arrangement cannot absorb the very high volume of transactions, and a bottleneck occurs. This is where the advantage of RAID10 becomes apparent, because RAID10 does not have to perform the parity calculations required by arrangements such as RAID5. The better I/O performance comes at a price, however: a RAID10 arrangement has less usable capacity, as half of the disks store data and the other half mirror it.
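A rough way to see the effect of parity on write-heavy workloads is the commonly quoted write-penalty rule of thumb: each host write costs about 2 disk operations on RAID10 (one per mirror) and about 4 on RAID5 (read data, read parity, write data, write parity). The sketch below applies that rule. The disk count, per-disk IOPS figure, and 50% write mix are hypothetical values chosen only for illustration, and the model ignores caching.

```python
def effective_iops(raw_iops: float, write_fraction: float, write_penalty: int) -> float:
    """Host-visible IOPS when every write costs `write_penalty` disk operations."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    read_fraction = 1.0 - write_fraction
    return raw_iops / (read_fraction + write_fraction * write_penalty)


RAW = 8 * 150  # hypothetical array: 8 disks at 150 IOPS each

raid10 = effective_iops(RAW, write_fraction=0.5, write_penalty=2)
raid5 = effective_iops(RAW, write_fraction=0.5, write_penalty=4)
print(f"RAID10: {raid10:.0f} IOPS, RAID5: {raid5:.0f} IOPS")
```

Under these assumptions the same eight disks deliver noticeably more host I/O as RAID10 than as RAID5, and the gap widens as the share of writes grows.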
High Data Redundancy: The RAID10 array arrangement has high data redundancy compared with other RAID arrangements. As database servers need high data redundancy, RAID10 is the ideal choice in such cases.
Flexibility in the Architecture of RAID10: RAID10 employs the best combination of RAID1 and RAID0. This keeps the free space in the RAID10 storage scheme to a minimum.
Requirement of Controller: Unlike some other RAID arrangements, RAID10 works with almost any hardware controller. RAID5, for instance, needs a high-end card for good storage performance, and if a software controller is used for RAID5, the computer's performance suffers.
Disk operation: RAID10 consists of many sets of mirrored devices. As mentioned earlier, these mirrored disk devices are then striped to create the final drive. The resulting arrangement can be scaled and handles faster reads and writes, because disk operations are spread across multiple drive heads. This distributes the read and write load effectively and hence speeds up operations. In RAID5, every write must also update parity, which requires extra disk operations and a parity calculation while writing; hence it is slow compared with RAID10. Under a heavy volume of write transactions, RAID5 becomes very slow, while RAID10 performs extremely well. Thus, RAID10 is preferred for database servers in organizations.
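The parity work that RAID5 performs on each write, and that RAID10 avoids, can be sketched with XOR parity. This is a simplified illustration of a single stripe with three data blocks; the helper names xor_blocks and raid5_small_write are hypothetical, and real controllers work on much larger blocks with rotating parity placement.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (the parity of a stripe)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one block: old parity XOR old data XOR new data.

    The old data and old parity must first be read from disk, which is the
    extra work a mirrored write does not need.
    """
    return xor_blocks(old_parity, old_data, new_data)


d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\xbb"
parity = xor_blocks(d0, d1, d2)

new_d1 = b"\xff\x00"
new_parity = raid5_small_write(d1, parity, new_d1)

# The shortcut matches a full recomputation over the updated stripe.
print(new_parity == xor_blocks(d0, new_d1, d2))
```

A RAID10 write, by comparison, simply sends the new block to both disks of a mirror pair with no reads and no calculation.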
Reduction in Space by 50%: This is not an advantage of RAID10, but it needs to be discussed because it is the main disadvantage. Usable space is reduced by 50% in a RAID10 arrangement, as half of the disks are used to mirror the other half. A cost-benefit analysis, taken together with the need for data protection, justifies the RAID10 arrangement. The increase in I/O speed justifies the economic burden of purchasing 2 disks and effectively using only 1 for data storage.
In summary, if an organization takes into account the cost of recovering data, cannot afford any server downtime, and attaches value to its data, then it should choose a RAID10 arrangement for its database servers. The organization must not fall into the trap of assuming that, so long as everything is fine, there is no issue. As IT and servers are today the backbone of any industry, any disk failure can easily bring the operations of the organization to a halt.
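The capacity cost can be worked out directly. The sketch below compares usable capacity for RAID10 (half of the disks) with RAID5 (all disks minus one disk's worth of parity); the function name usable_capacity_tb and the 8 disks of 2 TB each are hypothetical values used only for illustration.

```python
def usable_capacity_tb(level: str, num_disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for equally sized disks (simplified, no hot spares)."""
    if level == "raid10":
        if num_disks < 4 or num_disks % 2:
            raise ValueError("RAID10 sketch assumes an even number of disks, at least 4")
        return num_disks / 2 * disk_tb      # half the disks hold mirror copies
    if level == "raid5":
        if num_disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        return (num_disks - 1) * disk_tb    # one disk's worth of space holds parity
    raise ValueError(f"unsupported level: {level}")


for level in ("raid10", "raid5"):
    print(level, usable_capacity_tb(level, num_disks=8, disk_tb=2.0), "TB")
```

With these figures RAID10 offers 8 TB against 14 TB for RAID5, which is the space premium the organization pays for mirroring and faster writes.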
Vadala, Derek (2003), Managing RAID on Linux, O’Reilly Media, Inc., ISBN: 1565927303, 9781565927308
Ike Antkare (2009), Deconstructing RAID using Shern. In Proceedings of the Conference on Scalable, Embedded Configurations, April 2009.