How to Replace a Faulty Linux RAID Disk

When a disk in a RAID array fails, you need to replace it. In an earlier post I showed how to create a RAID 1 array; here I have created a RAID 5 array the same way, using the command below, so that I can demonstrate how to replace a faulty Linux RAID disk.

[root@rhel1 ~]# mdadm -Cv /dev/md1 --level=5 -n3 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdd1 appears to be part of a raid array:
level=raid0 devices=0 ctime=Thu Jan 1 06:00:00 1970
mdadm: partition table exists on /dev/sdd1 but will be lost or
meaningless after creating array
mdadm: layout defaults to left-symmetric
mdadm: size set to 1043968K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
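
Right after creation, md builds the parity in the background. You can watch the initial sync (and any later rebuild) at any time via /proc/mdstat; while a sync is running, the md1 line shows a progress bar and an estimated finish time:

[root@rhel1 ~]# cat /proc/mdstat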

Now recheck the array details:

[root@rhel1 ~]# mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Dec 7 19:05:53 2016
Raid Level : raid5
Array Size : 2087936 (2039.34 MiB 2138.05 MB)
Used Dev Size : 1043968 (1019.67 MiB 1069.02 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Wed Dec 7 19:05:58 2016
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : rhel1.lab.com:1 (local to host rhel1.lab.com)
UUID : fcef1f48:223e87c7:53e9eecc:d5f55e79
Events : 18

Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
3 8 65 2 active sync /dev/sde1

For demonstration purposes, you can manually mark a disk as failed to simulate what happens when a disk really fails in the array:

[root@rhel1 ~]# mdadm /dev/md1 -f /dev/sdd1
mdadm: set /dev/sdd1 faulty in /dev/md1
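
Here the failure is injected by hand. On a production system you would want to be told when a disk actually drops out, and mdadm ships a monitor mode for exactly that. A minimal sketch, assuming local mail delivery works (the address is a placeholder; adjust it to your environment):

[root@rhel1 ~]# mdadm --monitor --scan --mail=root@localhost --daemonise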

Check the status of /dev/md1:

[root@rhel1 ~]# mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Dec 7 19:05:53 2016
Raid Level : raid5
Array Size : 2087936 (2039.34 MiB 2138.05 MB)
Used Dev Size : 1043968 (1019.67 MiB 1069.02 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Wed Dec 7 19:07:26 2016
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : rhel1.lab.com:1 (local to host rhel1.lab.com)
UUID : fcef1f48:223e87c7:53e9eecc:d5f55e79
Events : 19

Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 0 0 1 removed
3 8 65 2 active sync /dev/sde1

1 8 49 - faulty spare /dev/sdd1

You can see that /dev/sdd1 is now marked as faulty. Remove it from the array with the following command:

[root@rhel1 ~]# mdadm /dev/md1 -r /dev/sdd1
mdadm: hot removed /dev/sdd1 from /dev/md1
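
Before pulling the physical drive, double-check that you are pulling the right one. One approach, assuming the drive still responds, is to match the kernel name against the persistent names under /dev/disk/by-id/, which typically contain the model and serial number printed on the drive label:

[root@rhel1 ~]# ls -l /dev/disk/by-id/ | grep sdd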

Once you have physically replaced the faulty disk, you need to make it active in the array. First, partition the new disk the same way you did originally when setting up the RAID array, for example as sketched below.
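
If the surviving members are partitioned identically, a quick way to do this is to copy the partition table from a healthy member, here /dev/sdc, onto the replacement (this assumes the new disk is at least as large and really is /dev/sdd; sfdisk handles MBR tables, while sgdisk offers the equivalent for GPT):

[root@rhel1 ~]# sfdisk -d /dev/sdc | sfdisk /dev/sdd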

When the disk is partitioned, you can add it back to the array as follows:

[root@rhel1 ~]# mdadm /dev/md1 -a /dev/sdd1
mdadm: added /dev/sdd1

Verify that it has been added properly; you can see that it is in the rebuilding phase:

[root@rhel1 ~]# mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Dec 7 19:05:53 2016
Raid Level : raid5
Array Size : 2087936 (2039.34 MiB 2138.05 MB)
Used Dev Size : 1043968 (1019.67 MiB 1069.02 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Wed Dec 7 19:09:35 2016
State : clean, degraded, recovering
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 512K

Rebuild Status : 10% complete

Name : rhel1.lab.com:1 (local to host rhel1.lab.com)
UUID : fcef1f48:223e87c7:53e9eecc:d5f55e79
Events : 50

Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
4 8 49 1 spare rebuilding /dev/sdd1
3 8 65 2 active sync /dev/sde1
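
The same rebuild progress is easy to poll from /proc/mdstat while you wait. If the resync is crawling, you can also raise the kernel's rebuild speed floor; the value is in KiB/s, and 50000 is only an example:

[root@rhel1 ~]# watch -n 5 cat /proc/mdstat
[root@rhel1 ~]# echo 50000 > /proc/sys/dev/raid/speed_limit_min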

After some time, all devices return to the active sync state, as shown below:

[root@rhel1 ~]# mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Dec 7 19:05:53 2016
Raid Level : raid5
Array Size : 2087936 (2039.34 MiB 2138.05 MB)
Used Dev Size : 1043968 (1019.67 MiB 1069.02 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Wed Dec 7 19:09:39 2016
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : rhel1.lab.com:1 (local to host rhel1.lab.com)
UUID : fcef1f48:223e87c7:53e9eecc:d5f55e79
Events : 68

Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
4 8 49 1 active sync /dev/sdd1
3 8 65 2 active sync /dev/sde1
[root@rhel1 ~]#
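
Once the array is clean again, make sure its definition is recorded so it assembles the same way on boot. On RHEL-family systems the file is /etc/mdadm.conf (Debian derivatives use /etc/mdadm/mdadm.conf):

[root@rhel1 ~]# mdadm --detail --scan >> /etc/mdadm.conf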
