
RAID arrays: types and the creation process

The shift of emphasis from processor-oriented to data-oriented applications has increased the importance of storage systems. At the same time, the low bandwidth and limited fault tolerance characteristic of such systems have always been serious problems that demanded a solution.

In the modern computer industry, magnetic disks are universally used as secondary storage because, despite all their drawbacks, they offer the best characteristics for this type of device at an affordable price.

The peculiarities of magnetic disk technology have led to a significant gap between the growth in performance of processor modules and of the magnetic disks themselves. If in 1990 the best production drives were 5.25" disks with an average access time of 12 ms and a latency of 5 ms (at a spindle speed of about 5,000 rpm 1), then today the palm belongs to 3.5" disks with an average access time of 5 ms and a latency of 1 ms (at 10,000 rpm). Here we see an improvement in technical characteristics of roughly 100%. Over the same period, processor speed increased by more than 2,000%. This became possible largely because processors benefit directly from VLSI (very large-scale integration). VLSI not only makes it possible to raise the clock frequency, but also to increase the number of components that can be integrated into a chip, which allows architectural advantages that enable parallel computation.

1 - averaged data.

The current situation can be described as an I/O crisis of the secondary storage system.

Increasing speed

The inability to significantly improve the technological parameters of magnetic disks forces a search for other paths, one of which is parallel processing.

If a block of data is spread across N disks of an array, and this placement is organized so that the information can be read simultaneously, then the block can be read N times faster (excluding the time needed to form the block). Since all data is transferred in parallel, this architectural solution is called a parallel-access array.

Parallel-access arrays are commonly used for applications that require the transfer of large amounts of data.

Some tasks, on the contrary, are characterized by a large number of small requests; database processing is a typical example. By distributing database records across the disks of the array, the load can be spread out, with the disks positioning independently of each other. This architecture is usually called an independent-access array.

Increasing fault tolerance

Unfortunately, as the number of disks in an array grows, the reliability of the whole array decreases. Assuming independent failures and an exponential failure distribution, the MTTF of the array (Mean Time To Failure) is calculated by the formula MTTF array = MTTF HDD / N HDD (where MTTF HDD is the mean time to failure of a single disk and N HDD is the number of disks).
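As a quick illustration, here is a minimal sketch of that estimate (the 500,000-hour disk MTTF and the 8-disk array are assumed figures, not taken from the article):

```python
def array_mttf(mttf_disk_hours: float, n_disks: int) -> float:
    """MTTF of the whole array = MTTF of one disk / number of disks,
    assuming independent failures and an exponential distribution."""
    return mttf_disk_hours / n_disks

# Hypothetical example: 500,000-hour disks in an 8-disk array.
print(array_mttf(500_000, 8))  # 62500.0 hours, i.e. roughly 7 years
```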

Thus, there is a need to increase the fault tolerance of disk arrays. This is done with redundant coding. There are two main types of coding used in redundant disk arrays: duplication and parity.

Duplication, or mirroring, is the most commonly used approach in disk arrays. Simple mirror systems keep two copies of the data, each copy placed on separate disks. This scheme is quite simple and does not require extra computation, but it has one significant disadvantage: it spends 50% of the disk space on storing the copy of the information.

The second way to implement a redundant disk array is redundant coding based on parity. Parity is calculated as the XOR of all the symbols in the data word. Using parity in redundant disk arrays reduces the overhead to the value given by the formula: overhead = 1 / N HDD (where N HDD is the number of disks in the array).
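A minimal sketch of both formulas, using the 4-bit blocks that appear later in the article as sample data:

```python
from functools import reduce

def parity(blocks: list[int]) -> int:
    """Parity symbol = XOR of all data symbols in the code word."""
    return reduce(lambda a, b: a ^ b, blocks)

def overhead(n_disks: int) -> float:
    """Fraction of total capacity spent on parity: 1 / N."""
    return 1 / n_disks

data = [0b1101, 0b0011, 0b1100, 0b1011]
print(bin(parity(data)))   # 0b1001 - the parity block for these four data blocks
print(overhead(5))         # 0.2 - 20% overhead on a five-disk array
```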

History and development of RAID

Although storage systems based on magnetic disks have been produced for 40 years, mass production of fault-tolerant systems began quite recently. Disk arrays with data redundancy, called RAID (Redundant Array of Inexpensive Disks), were introduced by researchers (Patterson, Gibson and Katz) from the University of California at Berkeley in 1987. But RAID systems became widespread only when disks suitable for use in redundant arrays became available and sufficiently productive. Since the official RAID report in 1988, research into redundant disk arrays has grown rapidly, in an attempt to provide a wide spectrum of price-performance-reliability trade-offs.

There is a curious story connected with the RAID abbreviation. At the time the original article was written, all disks used in PCs were considered inexpensive, in contrast to the expensive disks for mainframes (universal computers). But RAID arrays required rather expensive equipment compared to the rest of a PC configuration, so RAID came to be decoded as Redundant Array of Independent Disks 2.

2 - definition of the RAID Advisory Board

RAID 0 was introduced by the industry as the designation for a non-fault-tolerant disk array. In Berkeley, RAID 1 was defined as a mirrored disk array. RAID 2 is reserved for arrays that use Hamming codes. RAID levels 3, 4 and 5 use parity to protect data against single failures. It was these levels, up to and including the 5th, that were presented in Berkeley, and this RAID taxonomy was adopted as a de facto standard.

RAID levels 3, 4 and 5 are quite popular and have a good disk space utilization ratio, but they have one significant drawback: they tolerate only single failures. This is especially relevant when a large number of disks is used, since the probability of more than one device failing at the same time increases. In addition, they are characterized by long recovery times, which also imposes some restrictions on their use.

To date, quite a few architectures have been developed that keep the array operational when any two disks fail simultaneously, without losing data. Among them it is worth noting two-dimensional parity and EVENODD, which use parity for coding, and RAID 6, which uses Reed-Solomon coding.

In the two-dimensional parity scheme, each data block participates in the construction of two independent code words. Thus, if a second disk fails within the same code word, the other code word is used to reconstruct the data.

The minimum redundancy in such an array is achieved with an equal number of columns and rows, and equals 2 x sqrt(N disk) (a "square" arrangement).

If the two-dimensional array is not organized as a "square", the redundancy of the above scheme will be higher.

The EVENODD architecture has a fault tolerance scheme similar to two-dimensional parity, but a different placement of information blocks, which guarantees minimal redundant use of capacity. As in two-dimensional parity, each data block participates in the construction of two independent code words, but the words are placed in such a way that the redundancy coefficient is constant (in contrast to the previous scheme) and equals 2 x sqrt(N disk).

Using two check symbols, parity and a Reed-Solomon code symbol, a data word can be constructed so as to tolerate a double failure. This scheme is known as RAID 6. The Reed-Solomon code is usually calculated using lookup tables or as an iterative process using linear feedback shift registers, which is a relatively complex operation requiring specialized hardware.
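The article does not give the arithmetic, but a minimal sketch of how RAID 6 style P (parity) and Q (Reed-Solomon) check symbols can be computed in GF(2^8) looks roughly like this; real controllers do the same thing with tables or LFSR hardware:

```python
# Build exp/log tables for GF(2^8) with the primitive polynomial 0x11d.
exp = [0] * 512
log = [0] * 256
x = 1
for i in range(255):
    exp[i] = x
    log[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 512):
    exp[i] = exp[i - 255]

def gf_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return exp[log[a] + log[b]]

def pq_syndromes(data_bytes: list[int]) -> tuple[int, int]:
    """P and Q check symbols for one byte taken from each data disk."""
    p = q = 0
    for i, d in enumerate(data_bytes):
        p ^= d                      # P is plain parity
        q ^= gf_mul(exp[i], d)      # Q weights disk i by the generator power g^i
    return p, q

print(pq_syndromes([0x12, 0x34, 0x56, 0x78]))
```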

Given that classic RAID variants, while providing sufficient fault tolerance for many applications, often have unacceptably low speed, researchers from time to time propose various approaches that help increase the speed of RAID systems.

In 1996, Savage and Wilkes proposed AFRAID (A Frequently Redundant Array of Independent Disks). This architecture trades some fault tolerance for speed. In an attempt to compensate for the small-write problem characteristic of RAID 5 arrays, it allows a stripe to be left without an up-to-date parity for a certain period of time. If the disk intended for the parity write is busy, the write is postponed. It has been shown theoretically that a 25% reduction in fault tolerance can increase speed by 97%. AFRAID effectively changes the failure model of single-fault-tolerant arrays, since a code word that does not have up-to-date parity is vulnerable to disk failures.

Instead of sacrificing fault tolerance, traditional ways of increasing speed, such as caching, can be used. Given that disk traffic is bursty, a write-back cache can be used to hold data while the disks are busy. If the cache is implemented in non-volatile memory, the data will survive a power failure. In addition, deferred disk operations make it possible to combine small blocks, arriving in arbitrary order, into larger, more efficient disk operations.

There are also many architectures that, sacrificing the volume, increase the speed. Among them are a deferred modification on the log disk and a variety of modification schemes for logical data placement in physical, which allow you to distribute operations in the array more efficiently.

One of the options is parity logging, which aims to solve the small-write problem and use the disks more efficiently. Parity logging defers the change of parity in RAID 5, writing it to a FIFO log kept partly in controller memory and partly on disk. Given that access to a full track is about 10 times more efficient than access to a single sector, parity logging collects large amounts of modified parity data, which are then written together to a disk dedicated to storing parity, a whole track at a time.

The floating data and parity architecture allows the physical placement of disk blocks to be redistributed. Free sectors are reserved on each cylinder to reduce rotational latency, and data and parity are placed in these free locations. To survive a power failure, the data and parity map must be kept in non-volatile memory; if the placement map is lost, all data in the array is lost.

Virtual striping is a floating data and parity architecture that uses a write-back cache, naturally combining the positive sides of both.

In addition, there are other ways to increase speed, such as distributing RAID operations. At one time, Seagate built support for RAID operations into its disks with Fibre Channel and SCSI interfaces, which made it possible to reduce traffic between the central controller and the disks in RAID 5 systems. This was a fundamental innovation in the field of RAID implementations, but the technology never caught on, since certain features of the Fibre Channel and SCSI standards weaken the failure model for disk arrays.

For the same RAID 5, the TickerTAIP architecture was proposed. It looks like this: a central control mechanism, the originator node, receives user requests, selects a processing algorithm, and then hands the disk and parity work over to worker nodes. Each worker node handles some subset of the disks in the array. As in the Seagate model, the worker nodes transfer data among themselves without the participation of the originator node. If a worker node fails, the disks it served become inaccessible. But if the code word is constructed so that each of its symbols is handled by a separate worker node, then the fault tolerance scheme repeats RAID 5. To prevent failures of the originator node, it is duplicated, so we get an architecture tolerant of the failure of any of its nodes. With all its positive features, this architecture suffers from the "write hole" problem, which implies an error when a code word is modified by several users simultaneously and a node fails.

It is also necessary to mention a fairly popular way of quickly restoring a RAID: using a spare disk. If one of the disks of the array fails, the RAID can be rebuilt using the spare disk in place of the failed one. The main feature of such an implementation is that the system returns to its previous (fault-tolerant) state without external intervention. With the distributed sparing architecture, the logical blocks of the spare disk are distributed physically across all the disks of the array, removing the need to restructure the array when a disk fails.

To avoid the recovery problem characteristic of classic RAID levels, an architecture called parity declustering is also used. It involves mapping a smaller number of large logical disks onto a larger number of smaller physical disks. With this technique, the system's response time to requests during reconstruction improves more than twofold, and the reconstruction time itself is significantly reduced.

The architecture of the main levels of RAID

Now let's look at the architecture of the basic RAID levels in more detail. Before doing so, let's make some assumptions. To demonstrate the principles of constructing RAID systems, we consider a set of N disks (for simplicity, N will be considered even), each of which consists of M blocks.

Data will be denoted Dm,n, where m is the number of the data block and n is the number of the sub-block into which the data block is divided.

Disks can be connected to one or to several data transfer channels. Using more channels increases the system bandwidth.

RAID 0. Disk array without fault tolerance (Striped Disk Array without Fault Tolerance)

This is a disk array in which data is divided into blocks, and each block is written (or read) to a separate disk. Thus, several I/O operations can be carried out simultaneously.
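A minimal sketch of the underlying address mapping (the function name and block numbering are illustrative, not from the article):

```python
def raid0_map(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Translate a logical block number into (disk index, block index on that disk)."""
    disk = logical_block % n_disks        # consecutive blocks alternate across disks
    offset = logical_block // n_disks     # position of the block within that disk
    return disk, offset

# With 4 disks, consecutive logical blocks land on different disks,
# so up to 4 of them can be transferred at the same time.
for lb in range(8):
    print(lb, raid0_map(lb, 4))
```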

Benefits:

  • the highest performance for applications requiring intensive processing of I/O requests and large data transfers;
  • simplicity;
  • low cost per unit volume.

disadvantages:

  • not a fault tolerant solution;
  • the failure of one disk entails the loss of all data in the array.

RAID 1. Disk array with duplication, or mirroring (Mirroring)

Mirroring is the traditional way to increase the reliability of a small disk array. In the simplest variant, two disks are used to record the same information, and if one of them fails, its duplicate remains and continues to operate in the same mode.

Benefits:

  • simplicity;
  • easy to restore the array in case of failure (copying);
  • quite high speed for applications with a large query intensity.

disadvantages:

  • high cost per unit volume - 100% redundancy;
  • low data transfer rate.

RAID 2. Fault-tolerant disk array using Hamming codes (Hamming Code ECC)

The redundant coding used in RAID 2 is called the Hamming code. The Hamming code allows single failures to be corrected and double failures to be detected. Today it is actively used for data coding in ECC RAM and for encoding data on magnetic disks.

Here, an example with a fixed number of disks is shown because of the bulkiness of a general description (the data word consists of 4 bits and the ECC code of 3 bits).
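For reference, a minimal sketch of the Hamming (7,4) code implied by this example: 4 data bits plus 3 check bits, enough to locate and correct any single-bit error (the bit layout below is the conventional one, assumed here rather than taken from the article):

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """d = [d1, d2, d3, d4]; returns the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_syndrome(code: list[int]) -> int:
    """A non-zero syndrome gives the 1-based position of a single flipped bit."""
    p1, p2, d1, p3, d2, d3, d4 = code
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    return s1 * 1 + s2 * 2 + s3 * 4

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                       # corrupt one bit (codeword position 5)
print(hamming74_syndrome(word))    # -> 5, pointing at the corrupted bit
```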

Benefits:

  • fast error correction ("on the fly");
  • very high speed of data transfer of large volumes;
  • with increasing number of disks, overhead costs are reduced;
  • quite simple implementation.

disadvantages:

  • high cost with a small amount of disks;
  • low query processing speed (not suitable for transaction-oriented systems).

RAID 3. Fault-tolerant array with parallel data transfer and parity (Parallel Transfer Disks with Parity)

Data is divided into sub-blocks at the byte level and written simultaneously to all disks of the array except one, which is used for parity. Using RAID 3 solves the large redundancy problem of RAID 2. Most of the check disks used in RAID level 2 are needed to determine the position of the faulty bit. But this is unnecessary, since most controllers are able to determine that a disk has failed using special signals, or with the additional coding of the information written to the disk that is used to correct random failures.

Benefits:

  • very high data transfer rate;
  • a disk failure has little effect on the speed of the array;

disadvantages:

  • fairly complex implementation;
  • low performance under a high intensity of small data requests.

RAID 4. Fault-tolerant array of independent disks with a shared parity disk (Independent Data Disks with Shared Parity Disk)

Data is divided at the block level. Each data block is written to a separate disk and can be read separately. Parity for a group of blocks is generated on writes and checked on reads. RAID level 4 improves performance on small data transfers through parallelism, allowing more than one I/O operation to be performed simultaneously. The main difference between RAID 3 and RAID 4 is that in the latter the data is split at the level of sectors, and not at the level of bits or bytes.
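The price of a dedicated parity disk is the small-write update: every write to a single block must also update the parity block. A minimal sketch of that read-modify-write step (variable names are illustrative):

```python
def update_parity(old_parity: int, old_data: int, new_data: int) -> int:
    """New parity = old parity XOR old data XOR new data: the rest of the
    stripe does not have to be read."""
    return old_parity ^ old_data ^ new_data

stripe = [0b1010, 0b0110, 0b1111]                   # three data blocks
parity = stripe[0] ^ stripe[1] ^ stripe[2]          # initial parity block
new_block = 0b0001
parity = update_parity(parity, stripe[1], new_block)
stripe[1] = new_block
assert parity == stripe[0] ^ stripe[1] ^ stripe[2]  # parity is still consistent
```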

Benefits:

  • very high read speed for large data volumes;
  • high performance under a high intensity of read requests;
  • small overhead for implementing redundancy.

disadvantages:

  • very low performance when writing data;
  • low read speed for small data volumes with single requests;
  • asymmetric performance of reads and writes.

RAID 5. Fault-tolerant array of independent disks with distributed parity (Independent Data Disks with Distributed Parity Blocks)

This level is similar to RAID 4, but unlike the previous one the parity is distributed cyclically across all the disks of the array. This change improves the performance of small writes in multitasking systems. If the write operations are planned properly, it is possible to process up to N/2 blocks in parallel, where N is the number of disks in the group.
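A minimal sketch of one possible rotation of the parity block (a left-symmetric style layout; actual controllers may place parity differently):

```python
def parity_disk(stripe: int, n_disks: int) -> int:
    """Disk that holds the parity block for a given stripe: the parity
    position rotates so no single disk becomes a write bottleneck."""
    return (n_disks - 1 - stripe) % n_disks

for stripe in range(6):
    print("stripe", stripe, "-> parity on disk", parity_disk(stripe, 4))
```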

Benefits:

  • high data recording speed;
  • fairly high data read speed;
  • high performance under a high intensity of read/write requests;
  • small overhead for implementing redundancy.

disadvantages:

  • data reading speed is lower than in RAID 4;
  • low read/write speed for small data volumes with single requests;
  • quite complicated implementation;
  • complex data recovery.

RAID 6. Fault-tolerant array of independent disks with two independent distributed parity schemes (Independent Data Disks with Two Independent Distributed Parity Schemes)

Data is divided at the block level, similarly to RAID 5, but in addition to the previous architecture a second scheme is used to increase fault tolerance. This architecture tolerates double failures. However, when performing a logical write, six disk accesses actually take place, which greatly increases the processing time of a single request.

Benefits:

  • high fault tolerance;
  • fairly high request processing speed;
  • relatively small overhead for implementing redundancy.

disadvantages:

  • very complicated implementation;
  • complex data recovery;
  • very low data recording speed.

Modern RAID controllers allow different RAID levels to be combined. In this way it is possible to build systems that combine the advantages of different levels, as well as systems with a large number of disks. Usually this is a combination of level zero (striping) and some fault-tolerant level.

RAID 10. Fault-tolerant array with duplication and parallel processing

This architecture is a RAID 0 array whose segments are RAID 1 arrays. It combines very high fault tolerance and performance.

Benefits:

  • high fault tolerance;
  • high performance.

disadvantages:

  • very high cost;
  • limited scaling.

RAID 30. Fault-tolerant array with parallel data transfer and increased performance

This is a RAID 0 array whose segments are RAID 3 arrays. It combines fault tolerance and high performance. It is usually used for applications requiring sequential transfer of large volumes of data.

Benefits:

  • high fault tolerance;
  • high performance.

disadvantages:

  • high price;
  • limited scaling.

RAID 50. Fault-tolerant array with distributed parity and high performance

This is a RAID 0 array whose segments are RAID 5 arrays. It combines fault tolerance and high performance for applications with a large request intensity and high data transfer rates.

Benefits:

  • high fault tolerance;
  • high data transfer rate;
  • high speed processing.

disadvantages:

  • high price;
  • limited scaling.

RAID 7. Fault-tolerant array optimized for performance (Optimized Asynchrony for High I/O Rates as well as High Data Transfer Rates). RAID 7® is a registered trademark of Storage Computer Corporation (SCC)

To understand the architecture of RAID 7, consider it features:

  1. All data transfer requests are processed asynchronously and independently.
  2. All read/write operations are cached via a high-speed X-bus.
  3. The parity disk can be placed on any channel.
  4. The microprocessor of the array controller runs a real-time operating system to manage processes.
  5. The system has good scalability: up to 12 host interfaces and up to 48 disks.
  6. The operating system controls the communication channels.
  7. Standard SCSI disks, buses, motherboards and memory modules are used.
  8. A high-speed X-bus is used for work with the internal cache memory.
  9. The parity generation procedure is integrated into the cache.
  10. Disks attached to the system can be declared as standalone.
  11. An SNMP agent can be used to manage and monitor the system.

Benefits:

  • high data transfer rate and high request processing speed (1.5 to 6 times higher than other standard RAID levels);
  • high scalability of host interfaces;
  • data recording speed increases with an increase in the number of disks in the array;
  • to calculate parity, there is no need for additional data transmission.

disadvantages:

  • proprietary to a single manufacturer;
  • very high cost per unit of capacity;
  • short warranty period;
  • cannot be serviced by the user;
  • an uninterruptible power supply must be used to prevent data loss from cache memory.

Now let's consider the standard levels together and compare their characteristics. The comparison is made within the architectures mentioned above and is summarized in the table.

| RAID | Minimum disks | Disks required | Fault tolerance | Data transfer rate | Request processing intensity | Practical use |
|------|---------------|----------------|-----------------|--------------------|------------------------------|---------------|
| 0 | 2 | N | no | very high | up to N x single disk | graphics, video |
| 1 | 2 | 2N * | yes | R > 1 disk; W = 1 disk | up to 2 x single disk; W = 1 disk | small file servers |
| 2 | 7 | 2N | yes | ~ RAID 3 | low | mainframes |
| 3 | 3 | N + 1 | yes | very high | low | graphics, video |
| 4 | 3 | N + 1 | yes | R: high; W: low | R ~ RAID 0; W: low | file servers |
| 5 | 3 | N + 1 | yes | R: high; W: low | R ~ RAID 0; W: low | database servers |
| 6 | 4 | N + 2 | highest | low | R > 1 disk; W: very low | used extremely rarely |
| 7 | 12 | N + 1 | yes | highest | highest | various types of applications |

Legend:

  • * - the most commonly used variant is shown;
  • N - the number of disks in the array;
  • R - read;
  • W - write.

Some aspects of the implementation of RAID systems

Consider three main options for implementing RAID systems:

  • software (software-based);
  • hardware, bus-based;
  • hardware, standalone subsystem (subsystem-based).

It cannot be said unambiguously that one implementation is better than another. Each variant of organizing an array satisfies particular user needs, depending on financial capabilities, the number of users and the applications used.

Each of the above implementations is based on executing program code. They differ only in where this code is executed: in the computer's central processor (software implementation) or in a specialized processor on the RAID controller (hardware implementation).

The main advantage of a software implementation is its low cost. But it has many drawbacks: low performance, an additional load on the central processor, and increased bus traffic. Only the simple RAID levels, 0 and 1, are usually implemented in software, as they do not require significant computation. Given these features, RAID systems with software implementation are used in entry-level servers.

Hardware RAID implementations cost more than software ones, since they use additional hardware to perform I/O operations. At the same time they unload or free the central processor and the system bus and, accordingly, allow speed to be increased.

Bus-oriented implementations are RAID controllers that use the high-speed bus of the computer into which they are installed (usually the PCI bus). In turn, bus-oriented implementations can be divided into low-level and high-level. The former usually have no SCSI chips of their own and use the so-called RAID port on a motherboard with a built-in SCSI controller. In this case, the functions of executing the RAID code and performing I/O operations are distributed between the processor on the RAID controller and the SCSI chips on the motherboard. Thus, the central processor is freed from processing additional code, and bus traffic decreases compared to the software variant. The cost of such boards is usually small, especially if they target RAID 0 or 1 systems (there are also implementations of RAID 3, 5, 10, 30, 50, but they are more expensive), thanks to which they are gradually pushing software implementations out of the entry-level server market. High-level bus-oriented controllers have a somewhat different structure from their younger brothers. They take on all the functions associated with I/O and the execution of RAID code. In addition, they are not as dependent on the motherboard implementation and, as a rule, have more capabilities (for example, the ability to attach a module for preserving the contents of the cache in case of a motherboard failure or power loss). Such controllers are usually more expensive than low-level ones and are used in mid- and high-level servers. They, as a rule, implement RAID levels 0, 1, 3, 5, 10, 30, 50. Given that bus-oriented implementations are connected directly to the computer's internal PCI bus, they are the most productive of the systems under consideration (in single-host configurations). The maximum speed of such systems can reach 132 MB/s (32-bit PCI) or 264 MB/s (64-bit PCI) at a bus frequency of 33 MHz.

Along with the advantages listed above, the bus-oriented architecture has the following drawbacks:

  • dependence on the operating system and the platform;
  • limited scalability;
  • limited capabilities on the organization of fault-tolerant systems.

All these drawbacks can be avoided by using standalone subsystems. These systems are completely autonomous externally and are, in essence, a separate computer used to organize information storage systems. Moreover, with the successful development of fibre-optic channel technology, the speed of standalone systems will not be inferior to that of bus-oriented systems.

Usually an external controller is placed in a separate rack and, unlike systems with a bus organization, can have a large number of I/O channels, including host channels, which makes it possible to connect several host computers to the system and organize cluster systems. In systems with a standalone controller, hot-standby controllers can be implemented.

One of the disadvantages of standalone systems remains their high cost.

Considering the foregoing, we note that autonomous controllers are commonly used to implement high-end data warehouses and cluster systems.

Many users have heard of the concept of RAID disk arrays, but in practice few people know what it really is. As it turns out, there is nothing complicated here. Let us analyze the essence of this term, as they say, on our fingers, based on an explanation aimed at the ordinary user.

What are RAID disk arrays?

To begin with, consider the general interpretation offered by Internet publications. Disk arrays are whole information storage systems consisting of a bundle of two or more hard drives, serving either to increase the speed of access to stored information or to duplicate it, for example, when saving backup copies.

In such a bundle there is theoretically no restriction on the number of hard drives that can be installed. It all depends only on how many connections the motherboard supports. Why, in fact, are RAID disk arrays used? It is worth noting that the development of hard drive technology froze at one point long ago (a spindle speed of 7200 rpm, the size of the cache, etc.). The exception in this regard is only SSD models, but even they mainly deliver an increase in volume. At the same time, progress in processors and RAM is much more tangible. Thus, the use of RAID arrays provides a gain in performance when accessing hard drives.

RAID disk arrays: species, purpose

As for the arrays themselves, they are distinguished by the numbering used (0, 1, 2, etc.). Each such number corresponds to one of the declared functions.

The main ones in this classification are the disk arrays numbered 0 and 1 (it will become clear later why), since the main tasks fall precisely on them.

When creating arrays of several hard drives, the BIOS settings are used first, where the RAID value is set in the SATA configuration section. It is important to note that the connected disks must have absolutely identical parameters in terms of volume, interface, connection, cache, etc.

RAID 0 (Striping)

Zero-level disk arrays are essentially intended to accelerate access to stored information (writing or reading). As a rule, they bundle from two to four hard drives.

But here lies the most important problem: if information is lost on one of the disks, it disappears on the others too. The information is written in blocks alternately to each disk, and the increase in performance is directly proportional to the number of hard drives (that is, four disks are twice as fast as two). The loss of information is precisely due to the fact that the blocks of a file may lie on different disks, even though the user sees the files displayed normally in the same Explorer window.

RAID 1.

Disk arrays designated with a one belong to the mirroring category (mirror image) and serve to preserve data by duplicating it.

Roughly speaking, in this state of affairs the user loses somewhat in performance, but can be absolutely sure that if the data disappears from one partition, it will be preserved on the other.

RAID 2 and higher

Arrays with numbers 2 and above have a dual purpose. On the one hand, they are designed to record information, on the other, are used to correct errors.

In other words, disk arrays of this type combine the capabilities of RAID 0 and RAID 1, but they have not gained much popularity among computer users, although their operation is based on the use of error-correction codes and parity.

What is better to use in practice?

Of course, if the computer is expected to run resource-intensive programs, such as modern games, it is better to use RAID 0 arrays. When working with important information that must be preserved at all costs, you will have to turn to RAID 1 arrays. Since the bundles numbered two and higher have never been popular, their use is determined solely by the user's wishes. Incidentally, using zero-level arrays also makes practical sense if the user frequently downloads multimedia files, say, movies, or music with a high bitrate in MP3 format or in the FLAC standard.

For the rest, you will have to rely on your own preferences and needs; it is on these that the choice of one array or another will depend. And, of course, when building a bundle it is better to give preference to SSD drives, since compared to conventional hard drives they initially have much higher write and read speeds. But they must be absolutely identical in their characteristics and parameters, otherwise the connected combination simply will not work. This is precisely one of the most important conditions, so you will have to pay attention to this aspect.

Hello to all readers of the site! Friends, I have long wanted to talk about how to create a RAID array (a redundant array of independent disks) on a computer. Despite the apparent complexity of the question, everything is in fact very simple, and I am sure that many readers, immediately after reading this article, will adopt and happily use this very useful technology related to the safety of your data.

How to create a RAID array and why you need it

It is no secret that our information on the computer is practically uninsured, since it sits on an ordinary hard disk, which has a habit of failing at the most inopportune moment. It has long been recognized that the HDD is the weakest and least reliable part of our system unit, as it has mechanical parts. Those users who have ever lost important data because of a failed "screw" (myself included) wondered for a while afterwards how to avoid such trouble in the future, and the first thing that comes to mind is creating a RAID array.

The whole point of a redundant array of independent disks is to preserve your files if a hard disk fails completely! How to do this, you ask? It is very simple: you just need two hard disks (they can even differ in size).

In today's article, we will use the Windows 8.1 operating system to create the simplest and most popular array of hard drives: RAID 1, also called Mirroring. The point of the "mirror" is that information is duplicated (written in parallel) to both disks, and the two hard drives are exact copies of each other.

If you copy a file to the first hard disk, exactly the same file appears on the second, and as you have already understood, if one hard disk fails, all your data will remain intact on the second drive (the mirror). The probability of both hard drives failing at once is negligibly small.

The only minus of a RAID 1 array is that you need two hard drives, and they will work as a single one: that is, if you install two 500 GB hard drives in the system unit, the same 500 GB, not 1 TB, will be available for storing files.

If one of the two hard disks fails, you simply take it out and replace it, adding the new disk as a mirror to the already installed hard drive with the data, and that's it.

Personally, I have been using a RAID 1 array of two 1 TB hard drives at work for many years. A year ago a mishap occurred: one of the "hard" drives gave up the ghost and I had to replace it right away. I then thought with horror what would have happened if I had not had a RAID array; a slight chill ran down my spine, because the data accumulated over several years of work would have been gone. So I simply replaced the defective "terabyte" and continued working. By the way, I also have a small RAID array of two 500 GB hard drives.

Creating a software RAID 1 array from two empty hard disks in Windows 8.1

First of all, we install two blank hard drives in our system unit. For example, I will take two 250 GB hard disks.

What to do if the hard drives differ in size, or if one of them already contains information, is covered in the following article.

Open Disk Management.

Disk 0 is an SSD solid-state drive with the Windows 8.1 operating system installed on partition (C:).

Disk 1 and Disk 2 are the 250 GB hard disks from which we will build the RAID 1 array.

Right-click on either hard disk and choose "New Mirrored Volume".

Add the disk that will be the mirror for the previously selected disk. We chose Disk 1 as the first mirrored volume, so in the left part we select Disk 2 and click the "Add" button.

Choose a drive letter for the software RAID 1 array; I leave the letter (D:). Next.

I check the quick format checkbox and click Next.

In Disk Management, the mirrored volumes are shown in maroon and have a single drive letter, in our case (D:). Copy any files to either disk and they will immediately appear on the other disk.

In the "This PC" window, the software RAID 1 array is displayed as a single disk.

If one of the two hard drives fails, the RAID array will be marked as failed in Disk Management, but all the data will remain safe on the second hard disk.


Description of RAID arrays

Description of RAID 0


Disk array of increased performance without fault tolerance
Striped Disk Array Without Fault Tolerance

A RAID 0 array is the most productive and least protected of all RAID levels. Data is divided into blocks distributed across the disks, which leads to higher bandwidth. The high performance of this structure is ensured by parallel writes and the absence of redundancy. The failure of any disk in the array leads to the loss of all data. This level is called striping.

Benefits:
- · the highest performance for applications that require intensive processing of I/O requests and large data transfers;
- · Easy to implement;
- · Low cost per unit volume.
Disadvantages:
- · not a fault tolerant solution;
- · The failure of one disk entails the loss of all the data of the array.

Description of RAID 1


Disk array with duplication or mirroring
Duplexing & Mirroring
RAID 1 (mirroring) is a mirror image of a pair of disks. The redundancy of this array's structure ensures its high fault tolerance. The array is characterized by high cost and low performance.

Benefits:
- · Easy to implement;
- · Easy to restore the array in case of failure (copying);
- · Quite high speed for applications with a large query intensity.
Disadvantages:
- · High cost per unit volume - 100% redundancy;
- · Low data transfer rate.

Description of RAID 2


Fault-tolerant disk array using the Hamming code
Hamming Code ECC
RAID 2 uses Hamming error-correcting codes (Hamming Code ECC). These codes allow single failures to be corrected and double failures to be detected.

Benefits:
- · Fast correction of errors ("on the fly");
- · Very high speed of data transfer of large volumes;
- · with an increase in the number of disks, overheads are reduced;
- · Quite simple implementation.
Disadvantages:
- · High cost with a small amount of disks;
- · Low request processing speed (not suitable for transaction-oriented systems).

Description of RAID 3


Failover array with parallel data transmission and parity
Parallel Transfer Disks With Parity

RAID 3 - data is striped at the byte level, with the checksum stored on one of the disks. This array does not have the redundancy problem of level 2 RAID. Most of the check disks used in RAID 2 are needed to determine the position of the erroneous bit. However, most modern controllers are able to determine that a disk has failed using special signals or additional coding of the information written to the disk and used to correct random failures.

Benefits:
- · very high data transfer rate;
- · Disk failure little affects the speed of the array;
- · Small overhead for implementing redundancy.
Disadvantages:
- · fairly complex implementation;
- · Low performance under a high intensity of small data requests.

All modern motherboards are equipped with an integrated RAID controller, and top models even have several integrated RAID controllers. How much integrated RAID controllers are in demand by home users is a separate question. In any case, a modern motherboard gives the user the ability to create a RAID array from several disks. However, not every home user knows how to create a RAID array, which array level to choose, or has a clear idea of the advantages and drawbacks of using RAID arrays.
In this article we will give brief recommendations on creating RAID arrays on home PCs and, using a specific example, demonstrate how you can independently test the performance of a RAID array.

History of creation

The term "RAID array" first appeared in 1987, when the American researchers Patterson, Gibson and Katz from the University of California at Berkeley, in their article "A Case for Redundant Arrays of Inexpensive Disks (RAID)", described how several cheap hard drives can be combined into one logical device so that the capacity and speed of the system increase while the failure of individual disks does not lead to the failure of the entire system.

More than 20 years have passed since that article was published, but the technology of building RAID arrays has not lost its relevance today. The only thing that has changed since then is the decoding of the RAID acronym. The fact is that initially RAID arrays were not built on cheap disks at all, so the word Inexpensive was changed to Independent, which was closer to reality.

Operating principle

So, RAID is a redundant array of independent disks (Redundant Arrays of Independent Disks), whose task is to ensure fault tolerance and improve performance. Fault tolerance is achieved through redundancy. That is, part of the disk capacity is set aside for service purposes and becomes inaccessible to the user.

Improving the performance of the disk subsystem is ensured by the simultaneous operation of several disks, and in this sense, the more drives in the array (up to a certain limit), the better.

The joint operation of the disks in an array can be organized using either parallel or independent access. With parallel access, disk space is divided into blocks (stripes) for recording data. Similarly, information to be written to the disk is divided into the same kind of blocks. When writing, individual blocks are written to different disks, and several blocks are written to different disks simultaneously, which leads to higher performance on write operations. The necessary information is likewise read in separate blocks simultaneously from several disks, which also contributes to performance growth in proportion to the number of disks in the array.

It should be noted that the parallel-access model is only realized if the size of a write request is larger than the size of the block itself. Otherwise it is practically impossible to write several blocks in parallel. Imagine a situation where the size of an individual block is 8 KB and the size of a write request is 64 KB. In this case, the source information is cut into eight blocks of 8 KB each. If there is an array of four disks, four blocks, or 32 KB, can be written at a time. Obviously, in this example the write speed and read speed will be four times higher than when using a single disk. This is true only for the ideal situation; in reality the request size is not always a multiple of the block size and the number of disks in the array.
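A minimal sketch of the arithmetic in that example (the numbers are the ones from the text above):

```python
REQUEST_KB, BLOCK_KB, N_DISKS = 64, 8, 4

blocks = REQUEST_KB // BLOCK_KB             # 8 blocks of 8 KB each
passes = -(-blocks // N_DISKS)              # ceil(8 / 4) = 2 parallel passes
per_pass_kb = N_DISKS * BLOCK_KB            # 32 KB written per pass

print(blocks, passes, per_pass_kb)          # 8 2 32
```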

If the size of the data being written is smaller than the block size, a fundamentally different model is implemented: independent access. Moreover, this model can also be used when the size of the data being written is larger than the size of one block. With independent access, all the data of an individual request is written to a separate disk, that is, the situation is identical to working with a single disk. The advantage of the independent-access model is that when several write (or read) requests arrive simultaneously, they will all be executed on separate disks independently of each other. This situation is typical, for example, for servers.

In accordance with various types of access, there are different types of RAID arrays, which are accepted to characterize RAID levels. In addition to the type of access, RAID levels differ in the method of placing and forming redundant information. Excess information can either be placed on a specially dedicated disk, or distributed between all the disks. There are quite a lot of ways to form this information. The simplest of them is complete duplication (100 percent redundancy), or mirroring. In addition, codes with error correction are used, as well as calculating parity.

RAID arrays levels

Currently, there are several RAID levels that can be considered standardized - it is RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5 and RAID 6.

Various combinations of RAID levels are also used, which allows you to combine their advantages. This is usually a combination of any fault-tolerant level and zero level used to increase productivity (RAID 1 + 0, RAID 0 + 1, RAID 50).

Note that all modern RAID controllers support the JBOD (Just a Bunch Of Disks) function, which is not intended for creating arrays but provides the ability to connect individual disks to the RAID controller.

It should be noted that RAID controllers integrated on motherboards for home PCs do not support all RAID levels. Two-port RAID controllers support only levels 0 and 1, while RAID controllers with more ports (for example, the 6-port RAID controller integrated into the southbridge of the ICH9R/ICH10R chipset) also support levels 10 and 5.

In addition, if we talk about motherboards on Intel chipsets, the Intel Matrix RAID function is also implemented, which allows RAID matrices of several levels to be created on a few hard drives, allocating part of the disk space to each of them.

RAID 0.

RAID level 0, strictly speaking, is not a redundant array and, accordingly, does not ensure reliable data storage. Nevertheless, this level is actively used in cases where high performance of the disk subsystem is needed. When a RAID 0 array is created, the information is divided into blocks (sometimes these blocks are called stripes), which are written to separate disks, that is, a system with parallel access is created (provided, of course, the block size allows it). Thanks to the possibility of simultaneous I/O with several disks, RAID 0 provides maximum data transfer speed and maximum efficiency of disk space use, since no space is required for storing checksums. The implementation of this level is very simple. RAID 0 is mainly used in areas where a fast transfer of a large amount of data is required.

RAID 1 (Mirrored Disk)

RAID level 1 is an array of two disks with 100 percent redundancy. That is, the data is simply completely duplicated (mirrored), thanks to which a very high level of reliability (as well as cost) is achieved. Note that to implement level 1 it is not necessary to first partition the disks and data into blocks. In the simplest case, two disks contain the same information and form one logical disk. If one disk fails, its function is performed by the other (which is absolutely transparent to the user). The array is restored by simple copying. In addition, this level doubles the read speed, since the read operation can be performed from two disks simultaneously. Such a storage scheme is used mainly in cases where the cost of data security is much higher than the cost of implementing the storage system.

RAID 5.

RAID 5 is a fault-tolerant disk array with distributed storage of checksums. When writing, the data stream is divided into blocks (stripes), which are written to all the disks of the array in cyclic order.

Suppose the array contains n disks and the stripe size is d. For each portion of n-1 stripes, a checksum p is calculated.

Stripe d1 is written to the first disk, stripe d2 to the second, and so on up to stripe dn-1, which is written to the (n-1)-th disk. Then the checksum pn is written to the n-th disk, and the process repeats cyclically from the first disk, to which stripe dn is written.

The writing of the (n-1) stripes and their checksum is performed simultaneously to all n disks.

The checksum is calculated using the "exclusive or" (XOR) operation applied to the data blocks being written. So if there are n hard drives and d is a data block (stripe), the checksum is calculated by the following formula:

pn = d1 ⊕ d2 ⊕ ... ⊕ dn-1

If any disk fails, the data on it can be restored from the checksum and from the data remaining on the healthy disks.

As an illustration, consider blocks four bits in size. Let there be only five disks for storing data and recording checksums. If there is a sequence of bits 1101 0011 1100 1011, split into four-bit blocks, then to calculate the checksum the following bitwise operation must be performed:

1101 ⊕ 0011 ⊕ 1100 ⊕ 1011 = 1001.

Thus, the checksum recorded on the fifth disk is 1001.

If one of the disks, for example the fourth, fails, then the block d4 = 1011 becomes unavailable for reading. However, its value is easy to restore from the checksum and from the values of the remaining blocks using the same "exclusive or" operation:

d4 = d1 ⊕ d2 ⊕ d3 ⊕ p5

In our example, we get:

d4 = (1101) ⊕ (0011) ⊕ (1100) ⊕ (1001) = 1011.
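The same worked example, checked in code:

```python
d = [0b1101, 0b0011, 0b1100, 0b1011]    # the four data blocks from the example
p = d[0] ^ d[1] ^ d[2] ^ d[3]
print(f"{p:04b}")                        # 1001 - checksum written to the fifth disk

d4_recovered = d[0] ^ d[1] ^ d[2] ^ p    # XOR of the surviving blocks and the parity
print(f"{d4_recovered:04b}")             # 1011 - the lost block is restored
```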

In the case of RAID 5, all the disks of the array have the same size, but the total capacity of the disk subsystem available for writing becomes smaller by exactly one disk. For example, if five disks have a size of 100 GB each, the actual size of the array is 400 GB, since 100 GB is set aside for the parity information.

RAID 5 can be built on three and more hard drives. With an increase in the number of hard drives in the array, its redundancy decreases.

RAID 5 has an independent access architecture, which provides the ability to simultaneously perform multiple read or recording operations.

RAID 10.

The RAID 10 level is a combination of levels 0 and 1. This level requires a minimum of four disks. In a RAID 10 array of four disks, they are combined in pairs into level 0 arrays, and both of these arrays, as logical disks, are combined into a level 1 array. Another approach is also possible: the disks are first combined into mirrored level 1 arrays, and then the logical disks based on these arrays are combined into a level 0 array.

Intel Matrix Raid.

The RAID arrays of levels 5 and 1 considered above are rarely used at home, which is primarily due to the high cost of such solutions. Most often, for home PCs, a level 0 array on two disks is used. As we have already noted, RAID level 0 does not provide reliable data storage, and therefore end users face a choice: create a fast but unreliable level 0 RAID array or, doubling the cost of disk space, a level 1 RAID array, which ensures reliable data storage but does not offer a significant gain in performance.

To resolve this difficult problem, Intel developed Intel Matrix Storage technology, which allows the advantages of level 0 and level 1 arrays to be combined on just two physical disks. And in order to emphasize that this is not just a RAID array, but an array combining both physical and logical disks, the word "matrix" is used in the name of the technology instead of the word "array".

So, what is a two-disk RAID matrix using Intel Matrix Storage technology? The main idea is that, with several hard drives and a motherboard with an Intel chipset supporting Intel Matrix Storage, the disk space can be divided into several parts, each of which will function as a separate RAID array.

Consider a simple example of a RAID matrix of two disks of 120 GB each. Each of the disks can be divided into two logical disks, for example 40 and 80 GB. Then two logical disks of the same size (for example, 40 GB) can be combined into a level 1 RAID matrix, and the remaining logical disks into a level 0 RAID matrix.
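A minimal sketch of the capacity arithmetic for this layout (the helper function is hypothetical, just to illustrate the example above):

```python
def matrix_capacity(disk_gb: int, mirrored_gb: int) -> dict[str, int]:
    """Usable capacity when two identical disks are split into a mirrored
    (RAID 1) part and a striped (RAID 0) part."""
    striped_gb = disk_gb - mirrored_gb
    return {"RAID 1 volume": mirrored_gb, "RAID 0 volume": 2 * striped_gb}

print(matrix_capacity(120, 40))   # {'RAID 1 volume': 40, 'RAID 0 volume': 160}
```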

In principle, using two physical disks, you can also create only one or two RAID-matrices of level 0, but it is impossible to get only level 1 matrices. That is, if there are only two disks in the system, the Intel Matrix Storage technology allows you to create the following types of RAID matrices:

  • one level matrix 0;
  • two matrices of level 0;
  • level 0 matrix and level 1 matrix.

If three hard disks are installed in the system, then it is possible to create the following types of RAID matrices:

  • one level matrix 0;
  • one level 5 matrix;
  • two matrices of level 0;
  • two matrices of level 5;
  • level 0 matrix and level 5 matrix.

If four hard drives are installed in the system, then it is additionally possible to create a level 10 RAID matrix, as well as a combination of level 10 and level 0 or 5.

From theory to practice

If we talk about home computers, the most popular and in-demand are RAID arrays of levels 0 and 1. Using RAID arrays of three or more disks in a home PC is rather the exception to the rule. This is because, on the one hand, the cost of a RAID array grows in proportion to the number of disks involved, and on the other hand, for home computers the capacity of the disk array, rather than its performance and reliability, is the priority.

Therefore, in what follows we consider RAID arrays of levels 0 and 1 based on only two disks. Our study will compare the performance and functionality of level 0 and level 1 RAID arrays created on the basis of several integrated RAID controllers, and examine how the speed characteristics of a RAID array depend on the stripe size.

The fact is that although in theory the read and write speed of a level 0 RAID array should double, in practice the gain in speed is much more modest and differs between RAID controllers. The same applies to a level 1 RAID array: despite the fact that in theory the read speed should double, in practice things are not so smooth.

For our comparative testing of RAID controllers we used the Gigabyte GA-EX58A-UD7 motherboard. This board is based on the Intel X58 Express chipset with the ICH10R southbridge, which has an integrated RAID controller for six SATA II ports supporting RAID arrays of levels 0, 1, 10 and 5 with the Intel Matrix RAID function. In addition, the Gigabyte GA-EX58A-UD7 board integrates the Gigabyte SATA2 RAID controller, which provides two SATA II ports with the ability to organize RAID arrays of levels 0, 1 and JBOD.

Also integrated on the GA-EX58A-UD7 board is the Marvell 9128 SATA III controller, which provides two SATA III ports with the ability to organize RAID arrays of levels 0, 1 and JBOD.

Thus, the Gigabyte GA-EX58A-UD7 board has three separate RAID controllers, on the basis of which RAID arrays of levels 0 and 1 can be created and compared with each other. Recall that the SATA III standard is backward compatible with SATA II, so on the basis of the Marvell 9128 controller, which supports disks with the SATA III interface, RAID arrays can also be created using disks with the SATA II interface.

Stand for testing had the following configuration:

  • processor - Intel Core i7-965 Extreme Edition;
  • motherboard - Gigabyte GA-EX58A-UD7;
  • BIOS version - F2A;
  • hard drives - two Western Digital WD1002FBYS disks, one Western Digital WD3200AAKS disk;
  • integrated RAID controllers:
  • ICH10R,
  • Gigabyte SATA2,
  • Marvell 9128;
  • memory - DDR3-1066;
  • memory amount - 3 GB (three modules of 1024 MB);
  • memory mode - DDR3-1333, three-channel operation mode;
  • video card - Gigabyte GeForce GTS295;
  • power supply - Tagan 1300W.

Testing was conducted under the Microsoft Windows 7 Ultimate operating system (32-bit). The operating system was installed on the WESTERN DIGITAL WD3200AAKS disk, which was connected to the SATA II controller port integrated into the ICH10R south bridge. The RAID array was assembled on two WD1002FBYS disks with the SATA II interface.

To measure the speed characteristics of the created RAID arrays we used the IOmeter utility, which is an industry standard for measuring the performance of disk systems.

The IOmeter utility

Since we have conceived this article as a kind of user guide to creating and testing RAID arrays, it is logical to start with a description of the IOmeter (Input/Output Meter) utility, which, as we have already noted, is a kind of industry standard for measuring the performance of disk systems. This utility is free and can be downloaded from http://www.iometer.org.

The IOmeter utility is a synthetic test and can work with hard disks that are not partitioned into logical partitions, so you can test drives regardless of the file structure and minimize the influence of the operating system.

When testing, it is possible to create a specific access model, or "pattern", which specifies exactly which operations the hard disk will perform. When creating an access model, the following parameters can be changed (a small sketch after the list illustrates them):

  • the size of the data transfer request;
  • the random/sequential distribution (in %);
  • the distribution of read/write operations (in %);
  • the number of individual I/O operations running in parallel.
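
To make these four parameters more tangible, here is a minimal Python sketch of one access pattern expressed as a plain data structure. It is purely illustrative: IOmeter itself is configured through its GUI or saved *.icf files, not through code like this.

```python
from dataclasses import dataclass

@dataclass
class AccessPattern:
    """One IOmeter-style access pattern (an illustrative structure, not IOmeter's own format)."""
    name: str                 # scenario name, e.g. "Sequential_Read_512"
    request_size_bytes: int   # size of the data transfer request
    percent_random: int       # 0 = purely sequential, 100 = purely random
    percent_read: int         # 100 = read only, 0 = write only
    outstanding_ios: int = 1  # number of I/O operations running in parallel

seq_read_512 = AccessPattern("Sequential_Read_512", 512, percent_random=0, percent_read=100)
print(seq_read_512)
```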

The IOmeter utility does not require installation on the computer and consists of two parts: IOmeter itself and Dynamo.

IOmeter is the controlling part of the program, with a graphical user interface that allows you to make all the necessary settings. Dynamo is a load generator that has no interface of its own. Each time you start the IOmeter.exe file, the Dynamo.exe load generator is started automatically.

To start working with the IOmeter program, it is enough to run the IOmeter.exe file. This opens the main window of the IOmeter program (Fig. 1).

Fig. 1. The main window of the IOmeter program

It should be noted that the IOmeter utility allows testing not only local disk systems (DAS), but also network drives (NAS). For example, it can be used to test the performance of a server's disk subsystem (file server) using several network clients. Therefore, some of the tabs and tools in the IOmeter window relate to the network settings of the program. Clearly, when testing disks and RAID arrays these features of the program are not needed, so we will not describe the purpose of all tabs and tools.

So, when the IOmeter program starts, the tree structure of all running load generators (Dynamo instances) is displayed in the left part of the main window (the Topology window). Each running instance of the Dynamo load generator is called a Manager. In addition, the Dynamo load generator is multithreaded, and each individual thread of a running Dynamo instance is called a Worker. The number of running Workers always corresponds to the number of logical processor cores.

In our example, only one computer with a quad-core processor supporting Hyper-Threading technology is used, so only one Manager (one instance of Dynamo) and eight Workers (one per logical processor core) are started.
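
As a side note (not part of the IOmeter workflow itself), the expected number of Workers can be checked in advance, since it equals the number of logical processor cores:

```python
import os

# IOmeter starts one Worker per logical processor core; on a quad-core CPU with
# Hyper-Threading that means 4 cores x 2 threads = 8 Workers.
print(f"Logical cores, hence expected Workers: {os.cpu_count()}")
```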

For disk testing, there is no need to change or add anything in this window.

If you select the computer name in the tree of running Dynamo instances, then in the Target window, on the Disk Targets tab, all disks, disk arrays and other drives (including network drives) installed in the computer are displayed. These are the drives that IOmeter can work with. Media can be marked in yellow or blue. Logical partitions are marked in yellow, and physical devices without logical partitions on them are marked in blue. A logical partition may or may not be crossed out. The point is that before the program can work with a logical partition, the partition must be prepared: a special file equal in size to the capacity of the entire logical partition must be created on it. If a logical partition is crossed out, it has not yet been prepared for testing (it will be prepared automatically at the first stage of testing); if it is not crossed out, the file has already been created on it and the partition is fully ready for testing.

Note that, despite the supported ability to work with logical partitions, it is best to test drives that are not partitioned into logical partitions. Deleting a logical partition is very simple - through the Disk Management snap-in. To access it, right-click the Computer icon on the desktop and select Manage in the menu that opens. In the Computer Management window that opens, select Storage in the left part, and within it Disk Management. After that, all connected drives will be displayed in the right part of the Computer Management window. By right-clicking the desired drive and selecting Delete Volume... in the menu that opens, you can delete the logical partition on the physical disk. Recall that when a logical partition is deleted from a disk, all information on it is deleted without the possibility of recovery.

In general, only clean disks or disk arrays can be tested with the IOmeter utility. That is, you cannot test a disk or disk array on which the operating system is installed.

So, back to the description of the IOmeter utility. In the Target window, on the Disk Targets tab, you must select the disk (or disk array) that will be tested. Next, open the Access Specifications tab (Fig. 2), where you can define the test scenario.

Fig. 2. The Access Specifications tab of the IOmeter utility

The Global Access Specifications window contains a list of preset test scenarios that can be assigned to the load manager. We will not need these scenarios, so they can all be selected and deleted (the Delete button is provided for this). Then click the New button to create a new test scenario. In the Edit Access Specification window that opens, you can define the load scenario for a disk or RAID array.

Suppose we want to find out how the sequential (linear) read and write speed depends on the size of the data request block. To do this, we need to form a sequence of load scenarios in sequential read mode with different block sizes, and then a sequence of load scenarios in sequential write mode with different block sizes. Typically, the block sizes are chosen as a series in which each member is twice as large as the previous one, with the first member equal to 512 bytes. That is, the block sizes form the following series: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB. There is no point in making the block size larger than 1 MB for sequential operations, since with such large data blocks the speed of sequential operations no longer changes.

So, let us form a load scenario in sequential read mode for a 512-byte block.

In the Name field of the Edit Access Specification window, enter the name of the load scenario, for example Sequential_Read_512. Next, in the Transfer Request Size field, set the data block size to 512 bytes. Move the Percent Random/Sequential Distribution slider (the percentage ratio between sequential and random operations) all the way to the left, so that all our operations are sequential only. Then move the slider that defines the percentage ratio between read and write operations all the way to the right, so that all our operations are read only. The remaining parameters in the Edit Access Specification window do not need to be changed (Fig. 3).

Fig. 3. The Edit Access Specification window when creating a sequential read load scenario with a 512-byte data block

Click OK, and the first scenario we created will appear in the Global Access Specifications window on the Access Specifications tab of the IOmeter utility.

Similarly, you need to create scenarios for the other data block sizes. To make the work easier, it is better not to create each scenario from scratch with the New button, but to select the last created scenario and click the Edit Copy button. The Edit Access Specification window then opens again with the settings of the last scenario we created, and only the name and the block size need to be changed. Having done the same for all other block sizes, you can proceed to creating scenarios for sequential write, which is done in exactly the same way, except that the Percent Read/Write Distribution slider, which sets the percentage ratio between read and write operations, must be moved all the way to the left.

Similarly, you can create scenarios for random read and random write.
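
For bookkeeping purposes, the full set of scenarios described above (sequential and random, read and write, over the doubling series of block sizes) can be enumerated programmatically. The sketch below only mirrors the naming convention we use (Sequential_Read_512 and so on) and helps plan the runs; the scenarios themselves are still created in the IOmeter GUI.

```python
# Enumerate the planned load scenarios: 4 modes x 12 block sizes = 48 runs.
block_sizes = [512 * 2 ** i for i in range(12)]           # 512 B ... 1 MB, doubling
modes = {
    "Sequential_Read":  {"percent_random": 0,   "percent_read": 100},
    "Sequential_Write": {"percent_random": 0,   "percent_read": 0},
    "Random_Read":      {"percent_random": 100, "percent_read": 100},
    "Random_Write":     {"percent_random": 100, "percent_read": 0},
}

scenarios = []
for mode, params in modes.items():
    for size in block_sizes:
        label = str(size) if size < 1024 else f"{size // 1024}K"
        scenarios.append({"name": f"{mode}_{label}", "request_size": size, **params})

print(len(scenarios))          # 48
print(scenarios[0]["name"])    # Sequential_Read_512
```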

After all the scenarios are ready, they need to be assigned to the load manager, that is, you must specify which scenarios Dynamo will run.

To do this, check once again that in the Topology window the computer name (that is, the load manager on the local PC) is selected, and not an individual Worker. This ensures that the load scenarios will be assigned to all Workers at once. Next, in the Global Access Specifications window, select all the load scenarios we created and click the Add button. All selected load scenarios will be added to the Assigned Access Specifications window (Fig. 4).

Fig. 4. Assigning the created load scenarios to the load manager

After that, go to the Test Setup tab (Fig. 5), where you can set the execution time of each created scenario. To do this, set the scenario execution time in the Run Time group. Setting the time to 3 minutes is sufficient.

Fig. 5. Setting the scenario execution time

In addition, in the Test Description field you must specify the name of the entire test. In principle, this tab has many other settings, but they are not needed for our tasks.

After all the necessary settings are made, it is recommended to save the created test by clicking the button with the floppy disk image on the toolbar. The test is saved with the *.icf extension. Later, you can reuse the created load scenario by running not the IOmeter.exe file, but the saved file with the *.icf extension.

Now you can proceed directly to testing by clicking the button with the flag image. You will be asked to specify the name of the file with the test results and to select its location. The test results are saved in a CSV file, which is then easy to export to Excel; by setting a filter on the first column, you can select the desired rows with the test results.
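
Those who prefer not to work in Excel can pull the needed rows out of the CSV file with a few lines of Python. This is only a sketch: the results file name and the record label in the first column are assumptions, so check them against the header rows of your own results file before relying on them.

```python
import csv

RESULTS_FILE = "results.csv"   # whatever name you chose when starting the test
WANTED = "ALL"                 # assumed label of the summary rows in the first column

# Keep only the rows whose first cell matches the wanted record type.
with open(RESULTS_FILE, newline="") as f:
    rows = [row for row in csv.reader(f) if row and row[0] == WANTED]

for row in rows:
    print(row[:6])             # print the first few columns of each summary row
```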

During testing, intermediate results can be observed on the Results Display tab, and you can determine which load scenario they relate to on the Access Specifications tab. In the Assigned Access Specifications window, the scenario currently being executed is shown in green, completed scenarios in red, and scenarios not yet executed in blue.

So, we have reviewed the basic techniques for working with the IOmeter utility that are required for testing individual disks or RAID arrays. Note that we have not covered all the capabilities of the IOmeter utility; a description of all of them is beyond the scope of this article.

Creating a RAID array based on the Gigabyte SATA2 controller

So, we start by creating a RAID array based on two disks using the Gigabyte SATA2 RAID controller. Of course, Gigabyte itself does not produce chips, so the Gigabyte SATA2 chip is actually a relabeled chip from another company. As can be found out from the driver's INF file, it is a controller of the JMicron JMB36x series.

Access to the controller setup menu is possible at the system boot stage: press the Ctrl+G key combination when the corresponding prompt appears on the screen. Naturally, before that, in the BIOS settings you must set the operating mode of the two SATA ports belonging to the Gigabyte SATA2 controller to RAID (otherwise access to the RAID array configurator menu will be impossible).

The Gigabyte SATA2 RAID controller setup menu is quite simple. As we have already noted, the controller has two ports and allows you to create RAID arrays of level 0 or 1. Through the controller setup menu you can create or delete a RAID array. When creating a RAID array, you can specify its name, select the array level (0 or 1), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB), and determine the size of the array.

Once an array has been created, no changes to it are possible. That is, for an existing array you cannot change, for example, its level or stripe size. To do this, you must first delete the array (losing the data) and then create it anew. This is characteristic not only of the Gigabyte SATA2 controller: the inability to change the parameters of already created RAID arrays is a feature of all controllers and follows from the very principle of RAID array implementation.

After an array based on the Gigabyte SATA2 controller has been created, the current information about it can be viewed using the Gigabyte RAID Configurer utility, which is installed automatically along with the driver.

Creating a RAID array based on the Marvell 9128 controller

Configuring the Marvell 9128 RAID controller is possible only through the BIOS settings of the Gigabyte GA-EX58A-UD7 board. In general, it must be said that the Marvell 9128 controller configurator menu is somewhat raw and can mislead inexperienced users. However, we will describe these minor flaws a little later; for now, let us consider the basic functionality of the Marvell 9128 controller.

So, although this controller supports drives with the SATA III interface, it is also fully compatible with drives with the SATA II interface.

The Marvell 9128 controller allows you to create RAID arrays of levels 0 and 1 based on two disks. For a level 0 array, you can set a stripe size of 32 or 64 KB and specify the name of the array. In addition, there is an option called Gigabyte Rounding, which needs explanation. Despite the name, which echoes the name of the board manufacturer, the Gigabyte Rounding function has nothing to do with it. Moreover, it is not related to the level 0 RAID array, although in the controller settings it can be set precisely for an array of this level. This is the first of the flaws of the Marvell 9128 controller configurator that we mentioned. The Gigabyte Rounding function is meaningful only for a level 1 RAID array: it allows two disks whose capacities differ slightly from each other (for example, disks from different manufacturers or of different models) to be used to create a level 1 RAID array. In essence, the Gigabyte Rounding function sets the allowable difference in the sizes of the two disks used to create the level 1 RAID array. In the Marvell 9128 controller, the Gigabyte Rounding function allows you to set this difference to 1 or 10 GB.
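
To make our reading of this option a little more concrete, here is a purely illustrative Python sketch of how such a tolerance can be understood (our interpretation, not vendor code): the mirror is allowed if the capacity difference between the two disks does not exceed the configured rounding value.

```python
# Illustrative interpretation of Gigabyte Rounding for RAID 1 (assumption, not vendor logic):
# the mirror can be built if the capacity difference of the two disks stays within the tolerance.
def mirror_fits(disk_a_gb: float, disk_b_gb: float, rounding_gb: int) -> bool:
    return abs(disk_a_gb - disk_b_gb) <= rounding_gb

print(mirror_fits(1000.2, 999.7, rounding_gb=1))   # True: difference is within 1 GB
print(mirror_fits(1000.0, 980.0, rounding_gb=10))  # False: difference exceeds 10 GB
```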

Another flaw of the Marvell 9128 controller configurator is that when creating a level 1 RAID array, the user can choose a stripe size (32 or 64 KB), although the concept of a stripe is not defined at all for a level 1 RAID array.

Creating a RAID array based on the controller integrated into ICH10R

The RAID controller integrated into the ICH10R south bridge is the most widespread. As already noted, this RAID controller has six ports and supports not only the creation of RAID 0 and RAID 1 arrays, but also RAID 5 and RAID 10.

Access to the controller setup menu is possible at the system boot stage: press the Ctrl+I key combination when the corresponding prompt appears on the screen. Naturally, before that, in the BIOS settings the operating mode of this controller must be set to RAID (otherwise access to the RAID array configurator menu will be impossible).

The RAID controller setup menu is quite simple. Through the controller setup menu you can create or delete a RAID array. When creating a RAID array, you can specify its name, select the array level (0, 1, 5 or 10), set the stripe size for RAID 0 (128, 64, 32, 16, 8 or 4 KB), and determine the size of the array.

Comparison of the performance of RAID arrays

To test the RAID arrays with the IOmeter utility, we created load scenarios for sequential read, sequential write, random read and random write. The data block sizes in each load scenario formed the following sequence: 512 bytes, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 KB, 1 MB.

On each of the RAID controllers, a RAID 0 array was created with all allowable stripe sizes, as well as a RAID 1 array. In addition, to be able to estimate the performance gain obtained from using a RAID array, we also tested a single disk on each RAID controller.

So, let us turn to the results of our testing.

Gigabyte SATA2 controller

First of all, let us consider the results of testing RAID arrays based on the Gigabyte SATA2 controller (Fig. 6-13). In general, this controller turned out to be literally puzzling, and its performance was simply disappointing.

Fig. 6. Speed of sequential and random operations for the Western Digital WD1002FBYS disk (Gigabyte SATA2 controller)

Fig. 7. Speed of sequential and random operations for RAID 0 with a stripe size of 128 KB (Gigabyte SATA2 controller)

Fig. 12. Speed of sequential and random operations for RAID 0 with a stripe size of 4 KB (Gigabyte SATA2 controller)

Fig. 13. Speed of sequential and random operations for RAID 1 (Gigabyte SATA2 controller)

If we look at the speed characteristics of one disk (without a RAID array), the maximum sequential read speed is 102 MB/s and the maximum sequential write speed is 107 MB/s.

When a RAID 0 array with a stripe size of 128 KB is created, the maximum sequential read and write speed rises to 125 MB/s, that is, by about 22%.

With a stripe size of 64, 32 or 16 KB, the maximum sequential read speed is 130 MB/s and the maximum sequential write speed is 141 MB/s. That is, with these stripe sizes, the maximum sequential read speed increases by 27% and the maximum sequential write speed by 31%.
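
For reference, these gains are obtained simply by relating the array speed to the single-disk speed; a throwaway Python check:

```python
# Sanity check of the quoted gains: 102 -> 130 MB/s for reads, 107 -> 141 MB/s for writes.
for label, single, array in (("read", 102, 130), ("write", 107, 141)):
    gain = (array / single - 1) * 100
    print(f"{label}: {gain:.1f}%")   # roughly 27% and 32%, matching the quoted figures within rounding
```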

Frankly, this is not much for a level 0 array, and one would like the maximum speed of sequential operations to be higher.

With a stripe size of 8 KB, the maximum speed of sequential operations (read and write) remains roughly the same as with a stripe size of 64, 32 or 16 KB, but there are obvious problems with random read. As the data block size grows up to 128 KB, the random read speed increases in proportion to the block size (as it should). However, with a data block size of more than 128 KB, the random read speed drops almost to zero (to about 0.1 MB/s).

With a stripe size of 4 KB, not only does the random read speed drop at block sizes above 128 KB, but the sequential read speed also drops at block sizes above 16 KB.

Using a RAID 1 array on the Gigabyte SATA2 controller leaves the sequential read speed practically unchanged (compared with a single disk), but the maximum sequential write speed drops to 75 MB/s. Recall that for a RAID 1 array the read speed should increase, and the write speed should not decrease, compared with the read and write speed of a single disk.

Based on the test results of the Gigabyte SATA2 controller, only one conclusion can be drawn: using this controller to create RAID 0 and RAID 1 arrays makes sense only when all the other RAID controllers (Marvell 9128, ICH10R) are already occupied. Although such a situation is rather hard to imagine.

Marvell 9128 controller

The Marvell 9128 controller demonstrated much higher speed characteristics than the Gigabyte SATA2 controller (Fig. 14-17). The differences show up even when the controller works with a single disk. Whereas for the Gigabyte SATA2 controller the maximum sequential read speed is 102 MB/s and is achieved with a data block size of 128 KB, for the Marvell 9128 controller the maximum sequential read speed is 107 MB/s and is achieved with a data block size of 16 KB.

When a RAID 0 array with a stripe size of 64 or 32 KB is created, the maximum sequential read speed rises to 211 MB/s and the sequential write speed to 185 MB/s. That is, with these stripe sizes, the maximum sequential read speed increases by 97% and the maximum sequential write speed by 73%.

There is no significant difference in the speed indicators of the RAID 0 array between stripe sizes of 32 and 64 KB; however, using a 32 KB stripe is preferable, since in this case the speed of sequential operations with block sizes of less than 128 KB is slightly higher.

When a RAID 1 array is created on the Marvell 9128 controller, the maximum speed of sequential operations remains practically unchanged compared with a single disk. So, if for a single disk the maximum speed of sequential operations is 107 MB/s, for RAID 1 it is 105 MB/s. We also note that for RAID 1 the random read speed deteriorates slightly.

In general, it should be noted that the Marvell 9128 controller has good speed characteristics and can be used both for creating RAID arrays and for connecting single disks.

ICH10R controller

The RAID controller built into the ICH10R turned out to be the fastest of all those we tested (Fig. 18-25). When working with a single disk (without creating a RAID array), its performance is practically the same as that of the Marvell 9128 controller. The maximum sequential read and write speed is 107 MB/s and is achieved with a data block size of 16 KB.

Fig. 18. Speed of sequential and random operations for the Western Digital WD1002FBYS disk (ICH10R controller)

If we talk about the RAID 0 array on the ICH10R controller, its maximum sequential read and write speed does not depend on the stripe size and is 212 MB/s. Only the data block size at which this maximum sequential read and write speed is reached depends on the stripe size. As the test results show, for RAID 0 based on the ICH10R controller it is optimal to use a 64 KB stripe: in this case, the maximum sequential read and write speed is achieved with a data block size of only 16 KB.

So, summing up, we emphasize once again that the RAID controller built into the ICH10R significantly outperforms all the other integrated RAID controllers. And given that it also has greater functionality, it is optimal to use this particular controller and simply forget about the existence of the others (unless, of course, SATA III drives are used in the system).