
Notes from a Linux Software RAID Installer by Ian Barry, April 2002
Introduction

This is a document in progress!

This document attempts to explain how to easily install, configure and test a Linux software RAID solution. Specifically,

  1. booting from mirrored RAID (RAID1) and
  2. having striped RAID (eg, RAID0, RAID10 etc) as a root partition.

I will also document my RAID testing - explaining, for example, that swapping supposedly identical drives between stripes in a RAID10 array does not work. I will run through my experiences and point out anything that I think would have been useful to know at the time. If you haven't already read the Linux Software RAID HOWTO and the Linux Boot+Root+Raid+LILO HOWTO, read them now, because this document follows on from them.

The Beginning
I started with the following:
  • Gigabyte GA 7VTHX+ Motherboard with AMD Athlon XP 1800+ and 1024MB CAS2.0 DDR-SDRAM
  • 4 x Seagate Barracuda IV 80GB IDE ATA/100 2MB-cache Hard Drives
  • 2 x Promise Ultra TX2 ATA/100 Hard Disk Controllers (2 IDE channels/cables per card)
  • A Linux Redhat 7.2 CD (Enigma)
  • Several other bits and pieces and a lot of patience.
Hard Disk Controller Layout

Now, first of all, a note about disk layout. Having read that you shouldn't put multiple disks on a single channel/cable, for both performance and redundancy (cable failure) reasons, I decided to give each of the 4 disks its own channel, using the 4 channels provided by the 2 Promise cards. I will refer to the disks by their device names: hde, hdg, hdi and hdk.
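
As a quick sanity check (assuming a 2.4 kernel with /proc/ide support, as used here), you can confirm that the kernel sees one disk per channel before going any further:

ls /proc/ide                 # should list hde, hdg, hdi and hdk (plus the motherboard channels)
cat /proc/ide/hde/model      # should print the drive's model string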

Hard Disk Partitioning

Okay, so with 320GB of disk space to play with I decided to go with a mainly RAID10 setup. Let's clear one thing up before we go any further - I am using the standard RAID10 (RAID1+0) definition which means striped mirrors (i.e. striping on top of mirroring) - the confusion arises from some HOWTOs incorrectly referring to RAID0+1 as RAID1+0.

I decided to go for RAID10 because I thought (after studying the relevant Usenet discussions) it offered excellent redundancy with reasonable performance. This document doesn't debate the performance/redundancy trade-off between the various RAID levels; you may interpret my performance tests as you wish. It is what it is.

Now, I read that you need to have a small mirrored (RAID1) /boot partition to be able to boot onto a RAID drive. The main / partition, /home and /data are RAID10. I set /tmp to RAID0 (striped) as it is relatively unimportant and can be used for speedy transfer of CD images and the like. Swap is JBOD (Just a Bunch of Disks - i.e. not RAID at all), because the Linux kernel already stripes swap over disk partitions - see the HOWTOs. So the desired partition table (for each disk) is as follows:

[root@dragon /]# fdisk -l /dev/hdg

Disk /dev/hdg: 16 heads, 63 sectors, 155061 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hdg1   *         1        51     25672+  fd  Linux raid autodetect
/dev/hdg2            52    155061  78125040    5  Extended
/dev/hdg5            52     10454   5243080+  fd  Linux raid autodetect
/dev/hdg6         10455     81566  35840416+  fd  Linux raid autodetect
/dev/hdg7         81567    149061  34017448+  fd  Linux raid autodetect
/dev/hdg8        149062    154061   2519968+  fd  Linux raid autodetect
/dev/hdg9        154062    155061    503968+  82  Linux swap
[root@dragon /]#

This should create file-systems as follows (and a 2GB swap area):

[root@dragon /]# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/md4              10320888   1778824   8017772  19% /
/dev/md0                 24785     13111     10394  56% /boot
/dev/md7              70555568   1414028  65557516   3% /home
/dev/md10             66965128     32828  63530572   1% /data
/dev/md11              9920848     32836   9384052   1% /tmp
[root@dragon /]#
How?

So how did I get to this state? Well, it's not possible to install Linux onto a disk and then merge that disk into a striped RAID array using the failed-disk method - that only works with mirrored arrays. And the RedHat 7.2 installer, good though it is, doesn't allow you to set up anything more complex than simple RAID0 or RAID1 before installation. So I decided the best way to get Linux onto a striped RAID root partition was to install it on one disk and then copy it over to the RAID array once I got that up and running. The plan of action was as follows (each step is explained in much more detail in the following sections):

  1. First hard disk (hde) dedicated to Linux installation. During installation, partition hde1 for /boot and hde2 for /. Setup LILO on hde1's boot sector as you would usually when doing a simple one-disk Linux install.
  2. Boot into Linux. Create hdg1, hdi1 and hdk1 partitions for the new /boot. Create a mirrored RAID1 array (called /dev/md0) using hde1, hdg1, hdi1 and hdk1 (set hde1 as failed-disk).
  3. Unmount /dev/hde1 (the current /boot partition) (you may need to go to init level 1 for this) and mount /dev/md0 (the new /boot RAID1 partition) in its place. Setup LILO on hdg1, hdi1 and hdk1 setting /dev/hdk2 as the root device. Alter /etc/fstab to have /dev/md0 as /boot.
  4. Power down and swap hde and hdk (this is why we set the root device to /dev/hdk2 in the previous step: /dev/hde2 is now /dev/hdk2). Boot up and the system should boot off /dev/hde1 (was /dev/hdk1), which is part of our /dev/md0 RAID1 array. /boot should be /dev/md0; / should be /dev/hdk2 (which was /dev/hde2). In /etc/raidtab, under raiddev /dev/md0, you can now set /dev/hde1 from failed-disk to raid-disk and /dev/hdk1 from raid-disk to failed-disk.
  5. Partition hde, hdg and hdi for the desired RAID setup. Don't alter the hde1, hdg1 and hdi1 partitions that we created in step 2. Definitely don't touch /dev/hdk2, which is currently the / partition. Reboot to re-load the partition table - this is mandatory!
  6. Setup RAID arrays on the new partitions marking all hdk references as failed-disk.
  7. Copy the root partition from /dev/hdk2 to the new RAID partition which is dedicated as the new /. In my setup, the new / RAID partition is /dev/md4.
  8. Create an initrd boot RAM disk to start /dev/md4 and mount it read-only. To do this you will probably want to use mkinitrd. You'll also need to put the raidstart binary in this RAM disk (either as a statically linked binary or with its libraries).
  9. Add the new initrd boot RAM disk to LILO on hde1, hdg1 and hdi1 with /dev/md4 as the root partition. Keep the /dev/hdk2 entry in case this doesn't work (it inevitably doesn't work first time!)
  10. Reboot. Now you have a striped RAID array (/dev/md4) as your root partition and a mirrored RAID array (/dev/md0) as your boot partition.
  11. Partition hdk the same as hde, hdg and hdi. Set the failed-disk parameters which affect hdk in /etc/raidtab to raid-disk and use raidhotadd to add the hdk partitions to the RAID arrays. Let it reconstruct the arrays - may take some time.
  12. Setup LILO on /dev/hdk1 as in step 9.

This outline is expanded in the following sections, which form the main focus of this document.

1.   Install Linux

Summary: First hard disk (hde) dedicated to Linux installation. During installation, partition hde1 for /boot and hde2 for /. Setup LILO on hde1's boot sector as you would usually when doing a simple one-disk Linux install.

So, we plug one hard disk, on a single IDE cable, into the first Promise IDE controller's first slot (IDE1). This disk becomes hde. We also install a CD-ROM drive on one of the motherboard's IDE channels (hda-hdd) and make it the primary boot device in the motherboard BIOS.

Insert the RedHat 7.2 CD 1 (or whatever distribution you're using) and boot the PC. The Linux installer should run; proceed through the installation process. I recommend using the following (or similar) partitions on hde during installation:

partition   size    mount point
=========   ====    ===========
/dev/hde1   25MB+   /boot
/dev/hde2   2GB+    /

Swap is not required at this stage unless you are really low on RAM (in which case you probably shouldn't be installing software RAID anyway). Set up LILO, when prompted, to run from /dev/hde1.

RedHat should now be installed. You should eject the CD-ROM and power down. Now install the remaining hard disks (if you haven't done so already). Install each disk on its own cable plugged into each of the four Promise IDE slots. These disks become hde (already installed), hdg, hdi and hdk.

2.   Create RAID1 /boot

Summary: Boot into Linux. Create hdg1, hdi1 and hdk1 partitions for the new /boot. Create a mirrored RAID1 array (called /dev/md0) using hde1, hdg1, hdi1 and hdk1 (set hde1 as failed-disk).

Boot up the system and you should find yourself in your fresh Linux installation. We are now going to create hard disk partitions in order to set up a small RAID1 (mirrored) array that will be used for our new /boot. We are going to create this array using the first partition of every disk. However, the first disk, hde, is currently providing the existing /boot and / - we cannot alter hde's partition table yet because we do not want to destroy them. To get around this, we use a common trick called the failed-disk technique. See appendix 1.

So, on each of hdg, hdi and hdk, create a primary partition, hdx1, of equal size (minimum 25MB). We will create hde1 later. I recommend using fdisk for editing partition tables. Repeat the following procedure, as root, for hdg, hdi and hdk. In the following transcripts, the bold text indicates what the user types; the normal text is what the computer prints.

[root@dragon /]# fdisk /dev/hdg

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-155061, default 1): 1
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-155061, default 155061): +25M

Command (m for help): a
Partition number (1-4): 1

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): p

Disk /dev/hdg: 16 heads, 63 sectors, 155061 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hdg1   *         1        51     25672+  fd  Linux raid autodetect

Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@dragon /]#
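
Rather than repeating the interactive fdisk session on hdi and hdk by hand, you could clone hdg's partition table with sfdisk - a sketch, assuming all the drives are identical (as they are here):

sfdisk -d /dev/hdg > /tmp/hdg.layout   # dump hdg's partition table
sfdisk /dev/hdi < /tmp/hdg.layout      # replicate it onto hdi
sfdisk /dev/hdk < /tmp/hdg.layout      # and onto hdk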

Now we have created the partitions that will form our new /boot RAID1 array, let's create the array itself. First we write the configuration for the new array; then we can create it. So, create the following entry in /etc/raidtab for our new /boot RAID1 array. In this example, our new /boot RAID1 array is to be known as /dev/md0.

raiddev               /dev/md0
raid-level            1
nr-raid-disks         4
chunk-size            4
persistent-superblock 1
nr-spare-disks        0
  device              /dev/hdg1
  raid-disk           0
  device              /dev/hdi1
  raid-disk           1
  device              /dev/hdk1
  raid-disk           2
  device              /dev/hde1
  failed-disk         3

Now we've created the configuration for /dev/md0, we can actually create it. Type "mkraid /dev/md0". If that succeeds, you can type "cat /proc/mdstat" to see the RAID status. It should say something like the following:

[root@dragon /]# cat /proc/mdstat
Personalities : [raid0] [raid1] 
read_ahead 1024 sectors

md0 : active raid1 hdg1[0] hdi1[1] hdk1[2] hde1[3]
      25600 blocks [4/3] [UUU_]
      
unused devices: <none>
[root@dragon /]#

Now we can format our new drive. Nothing more complex than ext2 is required here: type "mke2fs /dev/md0".

3.   Mount RAID1 /boot and Setup LILO

Summary: Unmount /dev/hde1 (the current /boot partition) (you may need to go to init level 1 for this) and mount /dev/md0 (the new /boot RAID1 partition) in its place. Setup LILO on hdg1, hdi1 and hdk1 setting /dev/hdk2 as the root device. Alter /etc/fstab to have /dev/md0 as /boot.

Now we need to mount our new /boot array in place of the current /boot partition, so we need to type "umount /dev/hde1". This command will, more than likely, fail with a "device is busy" error. If this is the case then go to the console, drop to init level 1 (type "init 1" as root), change to the root directory (type "cd /") and try again.
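
In other words, something like:

init 1             # drop to single-user mode so nothing holds /boot open
cd /
umount /dev/hde1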

Once this has succeeded we can mount our new /boot array in its place. Edit /etc/fstab to replace the line referring to /boot with this one:

/dev/md0   /boot   ext2   defaults   1 1

Now we should be able to mount our new raid array by typing "mount /boot". Check that it has mounted the correct volume by typing "mount". You should have something like the following:

[root@dragon /]# mount
/dev/hde2 on / type ext3 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/md0 on /boot type ext2 (rw)
[root@dragon /]# 

Now we have to set up LILO (the Linux boot loader) to make Linux boot from our new RAID1 array. We want to set up LILO to boot from hdg1, hdi1 and hdk1, so that any of the drives can be used to boot from. Normally, the BIOS / Promise cards will only boot from the first drive (hde) - so in future, if hde breaks, we can swap it with hdg and still be able to boot.

We want to set the root device to /dev/hdk2. This is because, after we have completed this step, we are going to physically swap the hde and hdk drives. Our current root partition, /dev/hde2, will become /dev/hdk2 after this swap. Although the configuration below looks incorrect at the moment, it will become correct once we have swapped the drives.

Create a /boot/lilo.conf.hdx for each drive (substituting g, i and k for x). Each file should be similar to the following example, which is for hdg, except that: the "boot=/dev/hdg1" line should name the boot partition in question; the image/initrd lines may need changing for the correct kernel version (take them from the current /etc/lilo.conf); and the cylinders/sectors/heads/start lines need checking (take them from "fdisk -ul /dev/hdg").

[root@dragon /]# fdisk -ul /dev/hdg

Disk /dev/hdg: 16 heads, 63 sectors, 155061 cylinders
Units = sectors of 1 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hdg1   *        63     51407     25672+  fd  Linux raid autodetect

[root@dragon /]# cat /boot/lilo.conf.hdg

disk=/dev/md0
sectors=63
heads=16
cylinders=155061
partition=/dev/md1
start=63
boot=/dev/hdg1
map=/boot/map
install=/boot/boot.b
message=/boot/message
linear
prompt
timeout=50
image=/boot/vmlinuz-2.4.9-31
        label=boot-raid-to-hdk2
        initrd=/boot/initrd-2.4.9-31.img
        read-only
        root=/dev/hdk2

[root@dragon /]# 

Now you have created your LILO configuration files, you can execute them by typing "lilo -v -C /boot/lilo.conf.hdx" (substituting g, i and k for x). The output should be similar to the following:

[root@dragon /]# lilo -v -C /boot/lilo.conf.hdg

LILO version 21.4-4, Copyright (C) 1992-1998 Werner Almesberger
'lba32' extensions Copyright (C) 1999,2000 John Coffman

Reading boot sector from /dev/hdg1
Merging with /boot/boot.b
Mapping message file /boot/message
Boot image: /boot/vmlinuz-2.4.9-31
Mapping RAM disk /boot/initrd-2.4.9-31.img
Added boot-raid-to-hdk2 *
/boot/boot.2201 exists - no backup copy made.
Writing boot sector.

[root@dragon /]#
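
If you prefer, the three invocations can be wrapped in a small shell loop (exactly equivalent to running the command three times by hand):

for x in g i k; do
    lilo -v -C /boot/lilo.conf.hd$x
done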

4.   Boot from RAID /boot

Summary: Power down and swap hde and hdk (this is why we set the root device to /dev/hdk2 in the previous step: /dev/hde2 is now /dev/hdk2). Boot up and the system should boot off /dev/hde1 (was /dev/hdk1), which is part of our /dev/md0 RAID1 array. /boot should be /dev/md0; / should be /dev/hdk2 (which was /dev/hde2). In /etc/raidtab, under raiddev /dev/md0, you can now set /dev/hde1 from failed-disk to raid-disk and /dev/hdk1 from raid-disk to failed-disk.

Now we can power down the system and swap the hde and hdk drives as previously mentioned. By swapping hde and hdk, I mean physically swapping the drives over - take the IDE cable plugged into IDE2 on the 2nd Promise controller (furthest from the processor) and swap it with the IDE cable plugged into IDE1 on the 1st Promise controller (nearest the processor). When the system boots up, it should boot from our new /boot RAID1 array, /dev/md0, which encompasses /dev/hde1, /dev/hdg1 and /dev/hdi1. Remember hde was hdk and vice-versa. The root partition should be /dev/hdk2.

[root@dragon /]$ mount
/dev/hdk2 on / type ext3 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/md0 on /boot type ext2 (rw)
[root@dragon /]$ 

We now need to alter the configuration files to reflect the swap of hde and hdk. Swap the "device /dev/hde1" and "device /dev/hdk1" lines under /dev/md0 in /etc/raidtab. We should also alter the / line in /etc/fstab from /dev/hde2 to /dev/hdk2.
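
For clarity, here is what the /dev/md0 stanza in /etc/raidtab should look like after swapping those two device lines (following the appendix's advice to keep the raid-disks above the failed-disk):

raiddev               /dev/md0
raid-level            1
nr-raid-disks         4
chunk-size            4
persistent-superblock 1
nr-spare-disks        0
  device              /dev/hdg1
  raid-disk           0
  device              /dev/hdi1
  raid-disk           1
  device              /dev/hde1
  raid-disk           2
  device              /dev/hdk1
  failed-disk         3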

5.   Create Striped Root Partition

Summary: Partition hde, hdg and hdi for the desired RAID setup. Don't alter the hde1, hdg1 and hdi1 partitions that we created in step 2. Definitely don't touch /dev/hdk2, which is currently the / partition. Reboot to re-load the partition table - this is mandatory!

Now we get to the fun bit. We are going to start creating the partitions that will form our final root partition. At this stage we can also create any other system partitions, if applicable, such as /home, /usr or /tmp. In this example I have four 80GB drives, so I decided to go for a 10GB RAID10 array (mirrored then striped) for root, two RAID10 arrays built from 35GB mirrored pairs (roughly 70GB each) for /home and /data, and a 10GB RAID0 (striped) array for /tmp, which (roughly) leaves four 500MB partitions for a total of 2GB of swap. We already have a 25MB partition for /boot, so I wanted the partition table looking a bit like the following:

device    mount  fs    size  array
========  =====  ====  ====  ====================================================
/dev/md0  /boot  ext2  25MB  RAID1 { hde1, hdg1, hdi1, hdk1 }
/dev/mdx  /      ext3  10GB  RAID0 { RAID1 { hde5, hdg5 }, RAID1 { hdi5, hdk5 } }
/dev/mdx  /home  ext3  70GB  RAID0 { RAID1 { hde6, hdg6 }, RAID1 { hdi6, hdk6 } }
/dev/mdx  /data  ext3  68GB  RAID0 { RAID1 { hde7, hdg7 }, RAID1 { hdi7, hdk7 } }
/dev/mdx  /tmp   ext2  10GB  RAID0 { hde8, hdg8, hdi8, hdk8 }
swap      swap   swap  2GB   JBOD { hde9, hdg9, hdi9, hdk9 }

In this example, I am only going to explain creating the 10GB RAID10 array for the root partition. These instructions are equally applicable to a RAID0 (striped) root partition as they are to a RAID10 (mirrored then striped) one; in fact they can be applied in principle to any non-linear RAID root array.

So, run fdisk on hde, hdg and hdi to create the partitions required for a root RAID array. In my example, a 10GB RAID10 array needs a 5GB partition on each disk. These four 5GB partitions, hde5, hdg5, hdi5 and hdk5, will form the 10GB root RAID10 array. We will only create 3 of these partitions at this stage - hde5, hdg5 and hdi5. In total my partition tables for hde, hdg and hdi looked a bit like the following after this stage:

[root@dragon /]# fdisk -l /dev/hde

Disk /dev/hde: 16 heads, 63 sectors, 155061 cylinders
Units = cylinders of 1008 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hde1   *         1        51     25672+  fd  Linux raid autodetect
/dev/hde2            52    155061  78125040    5  Extended
/dev/hde5            52     10454   5243080+  fd  Linux raid autodetect
/dev/hde6         10455     81566  35840416+  fd  Linux raid autodetect
/dev/hde7         81567    149061  34017448+  fd  Linux raid autodetect
/dev/hde8        149062    154061   2519968+  fd  Linux raid autodetect
/dev/hde9        154062    155061    503968+  82  Linux swap
[root@dragon /]#

Once we have created the necessary root partitions (hde5, hdg5 and hdi5 in my example) we will need to reboot the computer, because rebooting is the only way to get the kernel to re-read the partition tables here: the RAID subsystem loaded them at boot, and we cannot stop and restart RAID because /boot requires it.
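
After the reboot, a quick way to confirm that the kernel has picked up the new partitions:

grep 'hd[egi]' /proc/partitions   # the new hde5, hdg5 and hdi5 entries should all be listed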

6.   Create Striped Root RAID Array

Summary: Setup RAID arrays on the new partitions marking all hdk references as failed-disk.

Okay, so you've rebooted to flush the partition tables. We are now ready to create the RAID arrays for the root partition (and other system partitions as necessary). This stage involves creating the RAID configuration (/etc/raidtab), creating the RAID arrays (using mkraid) and finally formatting the RAID partitions (using mke2fs).

Firstly we need to add the relevant entries to /etc/raidtab for our root (and other) RAID arrays. We will be using the failed-disk technique again, which involves creating the arrays on 3 out of the 4 disks, marking the 4th disk (hdk) as failed and importing it into the arrays at a later stage. My /etc/raidtab looked like this after this stage. NB. /dev/md0, md2, md3 and md4 are the bare minimum; I have included /dev/md8, md9 and md10 as an extra example. I set the chunk-size of the new root RAID array to 4k as there will be many small files on this disk, and the chunk-size of the /data array (/dev/md10) to 128k as it will hold bigger files.

[root@dragon /]# cat /etc/raidtab
#
# 25MB boot partition
# raid1 (mirrored)
#
raiddev		    /dev/md0
raid-level		    1
nr-raid-disks		    4
chunk-size		    64k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/hdg1
    raid-disk       0
    device	    /dev/hdi1
    raid-disk       1
    device	    /dev/hdk1
    raid-disk       2
    device	    /dev/hde1
    raid-disk       3

#
# 10GB root partition
# raid10 (mirrored then striped)
#
raiddev		    /dev/md2
raid-level		    1
nr-raid-disks		    2
chunk-size		    64k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/hde5
    raid-disk       0
    device	    /dev/hdg5
    raid-disk       1

raiddev		    /dev/md3
raid-level		    1
nr-raid-disks		    2
chunk-size		    64k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/hdi5
    raid-disk       0
    device	    /dev/hdk5
    failed-disk     1

raiddev		    /dev/md4
raid-level		    0
nr-raid-disks		    2
chunk-size		    4k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/md2
    raid-disk       0
    device	    /dev/md3
    raid-disk       1

#
# 35GB data partition #2
# raid10 (mirrored then striped)
#
raiddev		    /dev/md8
raid-level		    1
nr-raid-disks		    2
chunk-size		    64k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/hde7
    raid-disk       0
    device	    /dev/hdg7
    raid-disk       1

raiddev		    /dev/md9
raid-level		    1
nr-raid-disks		    2
chunk-size		    64k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/hdi7
    raid-disk       0
    device	    /dev/hdk7
    failed-disk     1

raiddev		    /dev/md10
raid-level		    0
nr-raid-disks		    2
chunk-size		    128k 
persistent-superblock	    1
nr-spare-disks		    0
    device	    /dev/md8
    raid-disk       0
    device	    /dev/md9
    raid-disk       1
[root@dragon /]#

Now we've created the RAID configuration for our root RAID array we can actually create the RAID array. Create the RAID arrays using the mkraid command. In the above example, you would type "mkraid /dev/md2" for /dev/md2, md3, md4, md8, md9 and md10 in that order. Obviously you cannot create /dev/md4 before you create /dev/md2 and /dev/md3 because /dev/md4 comprises /dev/md2 and /dev/md3. The following output or similar should be produced - this example is for /dev/md8, md9 and md10 which form a 35GB RAID10 array:

[root@dragon /]# mkraid /dev/md8 
handling MD device /dev/md8
analyzing super-block
disk 0: /dev/hde7, 34017448kB, raid superblock at 34017344kB
disk 1: /dev/hdg7, 34017448kB, raid superblock at 34017344kB
[root@dragon /]# mkraid /dev/md9
handling MD device /dev/md9
analyzing super-block
disk 0: /dev/hdi7, 34017448kB, raid superblock at 34017344kB
disk 1: /dev/hdk7, failed
[root@dragon /]# mkraid /dev/md10
handling MD device /dev/md10
analyzing super-block
disk 0: /dev/md8, 34017344kB, raid superblock at 34017280kB
disk 1: /dev/md9, 34017344kB, raid superblock at 34017280kB
[root@dragon /]# 

Once we have created the RAID arrays we can verify their creation by looking at the raid status file (/proc/mdstat). It should display an output similar to the following:

[root@dragon /]# cat /proc/mdstat
Personalities : [raid0] [raid1] 
read_ahead 1024 sectors
md10 : active raid0 md9[1] md8[0]
      68034560 blocks 128k chunks
      
md9 : active raid1 hdi7[0]
      34017344 blocks [2/1] [_U]

md8 : active raid1 hdg7[1] hde7[0]
      34017344 blocks [2/2] [UU]
        resync=DELAYED
md4 : active raid0 md2[1] md3[0]
      10485888 blocks 4k chunks
      
md3 : active raid1 hdi5[0]
      5243008 blocks [2/1] [_U]

md2 : active raid1 hdg5[1] hde5[0]
      5243008 blocks [2/2] [UU]
      [=>...................]  resync =  6.9% (723536/10486016) finish=6.8min speed=24095K/sec
md0 : active raid1 hdk1[3] hdi1[2] hdg1[0] hde1[1]
      25600 blocks [4/4] [UUUU]

[root@dragon /]#

Now we have created the RAID arrays, we need to format them. In the above example, the following commands need to be run for /dev/md4 and /dev/md10. The stride is calculated by dividing the chunk-size by the block-size: in the case of /dev/md10, 128k / 4k = 32, and for /dev/md4, 4k / 4k = 1.

mke2fs -j -R stride=1 /dev/md4
mke2fs -j -R stride=32 /dev/md10
7.   Copy the Current Root Partition to the New Root RAID Array

Summary: Copy the root partition from /dev/hdk2 to the new RAID partition which is dedicated as the new /. In my setup, the new / RAID partition is /dev/md4.

Now we have created and formatted the new root RAID array (/dev/md4) we need to copy Linux onto our new root partition. This process involves mounting our new RAID array at a temporary location (/mnt/tmp), copying the current root partition (/dev/hdk2) to the temporary mount-point and unmounting the RAID array :-

[root@dragon /]# mkdir /mnt/tmp
[root@dragon /]# mount -t ext3 /dev/md4 /mnt/tmp
[root@dragon /]# cp -dpR / /mnt/tmp 
[root@dragon /]# umount /dev/md4
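
Note that a straight "cp -dpR / /mnt/tmp" will also descend into /proc and into /mnt itself. A sketch of a more selective copy, listing the top-level directories explicitly (adjust the list to your system - these names are just my assumption of a standard RedHat 7.2 layout):

mount -t ext3 /dev/md4 /mnt/tmp
cd /
cp -dpR bin boot dev etc home lib root sbin tmp usr var /mnt/tmp
mkdir -p /mnt/tmp/proc /mnt/tmp/mnt /mnt/tmp/initrd   # recreate empty mount points (/initrd is needed by pivot_root later)
umount /dev/md4
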
8.   Create initrd Boot Ramdisk

Summary: Create an initrd boot RAM disk to start /dev/md4 and mount it read-only. To do this you will probably want to use mkinitrd. You'll also need to put the raidstart binary in this RAM disk (either as a statically linked binary or with its libraries).

Okay, so we need to create a special ramdisk to boot our non-linear root RAID array. This is basically because we need to load the Linux RAID modules prior to mounting the root file-system. We can use "mkinitrd" to create most of our ramdisk, but we also need to add the raidstart binary to it. To do this we need to compile a statically-linked raidstart binary (statically-linked basically means that it depends on no other files). Firstly, let's get a copy of the raidtools package as source. For me, the easiest place to grab this was updates.redhat.com (or, failing that, ftp.redhat.com); the source can also be downloaded from people.redhat.com/mingo/raidtools. These instructions show how to build a statically-linked raidstart binary from the RedHat 7.2 updates SRPM, raidtools version 0.90-24:

[root@dragon /]# rpm -U /tmp/raidtools-0.90-24.src.rpm 
[root@dragon /]# mkdir /tmp/raidtools
mkdir: cannot create directory `/tmp/raidtools': File exists
[root@dragon /]# cd /tmp/raidtools
[root@dragon raidtools]# tar xzf /usr/src/redhat/SOURCES/raidtools-multipath-2.4.2-curr-CVS.tar.gz 
[root@dragon raidtools]# ./configure
creating cache ./config.cache
checking for gcc... gcc
checking whether the C compiler (gcc  ) works... yes
checking whether the C compiler (gcc  ) is a cross-compiler... no
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking how to run the C preprocessor... gcc -E
updating cache ./config.cache
creating ./config.status
creating Makefile
creating config.h
[root@dragon raidtools]# 

Next we need to make the Makefile generate static executables. So edit Makefile in your favourite editor and add " -static" to the end of the line that begins with "LDFLAGS =". In the Makefile for this version of raidtools, I ended up with the line looking like "LDFLAGS = /usr/lib/libpopt.a -static". Now we can actually make the raidtools.

[root@dragon raidtools]# make
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o raidstart.o raidstart.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o parser.o parser.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o raidlib.o raidlib.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o version.o version.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o raid_io.o raid_io.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o scsi.o scsi.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o mkraid.o mkraid.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o mkpv.o mkpv.c
gcc -O2 -Wall -DMD_VERSION=\""raidtools-0.90"\" -c -o detect_multipath.o detect_multipath.c
gcc -o raidstart raidstart.o parser.o raidlib.o version.o raid_io.o scsi.o /usr/lib/libpopt.a -static
gcc -o mkraid mkraid.o parser.o raidlib.o version.o raid_io.o scsi.o /usr/lib/libpopt.a -static
gcc -o mkpv mkpv.o parser.o raidlib.o version.o raid_io.o scsi.o /usr/lib/libpopt.a -static
gcc -o detect_multipath detect_multipath.o parser.o raidlib.o version.o raid_io.o scsi.o /usr/lib/libpopt.a -static
[root@dragon raidtools]# ldd raidstart 
        not a dynamic executable
[root@dragon raidtools]# 

Now we have a static raidstart binary, we can get on with creating our ramdisk. I executed the following commands to create a ramdisk as /boot/img.gz and then mount it at /mnt/img :-

[root@dragon raidtools]# uname -a
Linux dragon.northlodge 2.4.9-31 #1 Tue Feb 26 06:23:51 EST 2002 i686 unknown
[root@dragon raidtools]# mkinitrd /boot/img.gz 2.4.9-31
[root@dragon raidtools]# gunzip /boot/img.gz
[root@dragon raidtools]# mkdir /mnt/img
[root@dragon raidtools]# mount -v -o loop -t ext2 /boot/img /mnt/img
mount: going to use the loop device /dev/loop1
/boot/img on /mnt/img type ext2 (rw,loop=/dev/loop1)
[root@dragon raidtools]# cd /mnt/img
[root@dragon img]# # copy the statically-linked raidstart binary
[root@dragon img]# cp /tmp/raidtools/raidstart bin 
[root@dragon img]# # copy the RAID configuration file
[root@dragon img]# cp /etc/raidtab etc
[root@dragon img]# # create the necessary devices for /boot and /
[root@dragon img]# cp -dpR /dev/md[0-9] dev
[root@dragon img]# cp -dpR /dev/hd[egik][0-9] dev

The "/etc/fstab" file needs to be created for our ramdisk. Mine looked like the following:

[root@dragon img]# cat /mnt/img/etc/fstab
none      /proc  proc  defaults  0 0
/dev/md0  /boot  ext2  defaults  1 1
/dev/md4  /      ext3  defaults  1 2
[root@dragon img]#

Now the "linuxrc" file in the root of our ramdisk is the first script to be run. It needs to load the RAID modules, start the RAID devices for /boot and /. It can also either tell the system which device to load as root, or just load it itself (I found the latter easier). I ended up with the following linuxrc which starts /dev/md0 for /boot and /dev/md2, md3 and md4 for /. It then mounts /dev/md4 as the root partition.

[root@dragon img]# cat /mnt/img/linuxrc
#!/bin/nash

echo "Loading raid0 module"
insmod /lib/raid0.o
echo "Loading raid1 module"
insmod /lib/raid1.o
echo "Loading jbd module"
insmod /lib/jbd.o
echo "Loading ext3 module"
insmod /lib/ext3.o
echo Mounting /proc filesystem
mount -t proc /proc /proc
/bin/raidstart /dev/md0
/bin/raidstart /dev/md2
/bin/raidstart /dev/md3
/bin/raidstart /dev/md4
echo 0x0100 > /proc/sys/kernel/real-root-dev
umount /proc
echo Mounting root filesystem
mount --ro -t ext3 /dev/md4 /sysroot
pivot_root /sysroot /sysroot/initrd
[root@dragon img]# 

Now our ramdisk is complete. We can unmount it and compress it:

[root@dragon img]# cd /
[root@dragon /]# umount /mnt/img
[root@dragon /]# gzip /boot/img
[root@dragon /]#
9.   Add initrd Boot RAM Image to LILO

Summary: Add the new initrd boot RAM disk to LILO on hde1, hdg1 and hdi1 with /dev/md4 as the root partition. Keep the /dev/hdk2 entry in case this doesn't work (it inevitably doesn't work first time!)

Add the following section to LILO to add our new RAID ramdisk. This section needs to be added to /boot/lilo.conf.hde, lilo.conf.hdg and lilo.conf.hdi (create lilo.conf.hde in the same way as the others if it doesn't already exist - after the drive swap in step 4 it is hde, hdg and hdi that carry our boot mirror). Remember we are keeping our LILO configuration files in /boot because we are soon going to trash our current root partition.

image=/boot/vmlinuz-2.4.9-31
        label=hde:2.4.9-31-rd
        initrd=/boot/img.gz
        read-only
        root=/dev/md4

Re-run LILO :-

[root@dragon /]# lilo -v -C /boot/lilo.conf.hde
[root@dragon /]# lilo -v -C /boot/lilo.conf.hdg
[root@dragon /]# lilo -v -C /boot/lilo.conf.hdi

Now it's simply a case of rebooting the PC, selecting the new 2.4.9-31-rd entry and seeing if it works. We are keeping the /dev/hdk2 entry in the LILO configuration so that we can get back to the current setup if the new setup doesn't work.

10.   Reboot

Summary: Reboot. Now you have a striped RAID array (/dev/md4) as your root partition and a mirrored RAID array (/dev/md0) as your boot partition.

not yet written...

11.   Merge Old Root Partition into RAID System

Summary: Partition hdk the same as hde, hdg and hdi. Set the failed-disk parameters which affect hdk in /etc/raidtab to raid-disk and use raidhotadd to add the hdk partitions to the RAID arrays. Let it reconstruct the arrays - may take some time.

not yet written...
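
Based on the summary above and on appendix 1, the procedure will presumably look something like the following sketch (the md numbers come from my /etc/raidtab - double-check them against yours; the /home arrays are not shown in my raidtab excerpt above):

sfdisk -d /dev/hde | sfdisk /dev/hdk   # clone hde's partition table onto hdk
# edit /etc/raidtab: change each failed-disk line that names an hdk partition to raid-disk
raidhotadd /dev/md0 /dev/hdk1          # /boot mirror
raidhotadd /dev/md3 /dev/hdk5          # second half of the root RAID10
raidhotadd /dev/md9 /dev/hdk7          # second half of the /data RAID10
cat /proc/mdstat                       # watch the reconstruction progress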

12.   Setup LILO on Old Root Partition

Summary: Setup LILO on /dev/hdk1 as in step 9.

not yet written...
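
From the summary, this should just be a matter of creating or updating /boot/lilo.conf.hdk (as in step 3, but with boot=/dev/hdk1 and the ramdisk section from step 9) and running it:

lilo -v -C /boot/lilo.conf.hdk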

Appendix 1.   Failed-disk Technique for RAID1 Installation

The failed-disk technique is used to create RAID1 (mirrored) arrays when not all the disks/partitions that are required in the final array are available at creation time.

To create a RAID1 array over n disks, where only x disks are available at creation time (0<x<n): Create the necessary partitions on the available disks. Create a RAID1 array configuration (in /etc/raidtab) which includes all the disks. Mark the available disks as "raid-disk" and the non-available disks as "failed-disk". (Make sure you put the raid-disks at the top and the failed-disks at the bottom).

For example, to create a RAID1 array over 4 partitions (hde1, hdg1, hdi1 and hdk1) where only hdg1, hdi1 and hdk1 are currently available, the following raidtab entry would be used:

raiddev               /dev/md0
raid-level            1
nr-raid-disks         4
chunk-size            4
persistent-superblock 1
nr-spare-disks        0
  device              /dev/hdg1
  raid-disk           0
  device              /dev/hdi1
  raid-disk           1
  device              /dev/hdk1
  raid-disk           2
  device              /dev/hde1
  failed-disk         3

Create the RAID1 array using the mkraid command. In the above example, you would type "mkraid /dev/md0".

Later, when the currently unavailable disks/partitions become available, you can insert the disks into the array using the raidhotadd command. In the above example, you would first edit /etc/raidtab and change the "failed-disk 3" line to "raid-disk 3". You would then type "raidhotadd /dev/md0 /dev/hde1". Done.
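
Putting that last step together (device names from the example above):

# after changing "failed-disk 3" to "raid-disk 3" in /etc/raidtab:
raidhotadd /dev/md0 /dev/hde1   # insert the newly-available partition into the array
cat /proc/mdstat                # the array rebuilds; [UUU_] becomes [UUUU] when it finishes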
