Problem with recovery of RAID 5 on NAS 542
xstor
Posts: 3
Hello,
I have a problem with RAID 5 recovery on a NAS 542. I had a RAID 5 array consisting of four 2TB disks. First I replaced one 2TB drive with a 4TB one, waited for the data to resynchronize, and the NAS worked. Then I took another 2TB drive and replaced it with a second 4TB drive. (I had already formatted the first removed 2TB disk, which was a mistake.) I waited a few days, but the NAS never got back into a working state.
The NAS started raising alerts that the RAID is degraded and that the recovery fails.
The original set was 4x 2TB disks; now I have 2x 2TB and 2x 4TB. The data on the NAS is corrupted and cannot be read.
When I try to repair the RAID, I see that it is working with the disks in positions 1, 2 and 4, and when I add the third disk, the repair fails and the RAID falls back into a degraded state.
Note: what seems strange to me is that I exchanged the disks in positions 4 and 3, yet when I look at the logs, I see a DISK 2 I/O error there.
Is there a way to recover the RAID?
Thanks in advance.
All Replies
Mijzelf
"The data on the NAS is corrupted."
What do you mean by that?
"And when I look at the logs, I see a DISK 2 IO error there."
A disk I/O error while recovering a degraded array is always fatal for the recovery. The RAID manager stops because there is no way to continue.
After you exchanged the first disk (which one?), was the array healthy? I suppose yes, else you wouldn't have exchanged the 2nd one.
Anyway, can you log in over ssh and post the output of
su
mdadm --examine /dev/sd[abcd]3
cat /proc/mdstat
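For reference, a rough sketch of such a session (the IP address is a placeholder and the exact login prompt depends on the firmware; on the NAS542 you normally log in as admin and become root with su):

ssh admin@192.168.1.100          # replace with the NAS's IP address
su                               # become root
mdadm --examine /dev/sd[abcd]3   # print the RAID metadata of the data partitions
cat /proc/mdstat                 # show the current state of all md arrays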
xstor
Mijzelf said:
After you exchanged the first disk (which one?), was the array healthy? [...] Anyway, can you log in over ssh and post the output of su, mdadm --examine /dev/sd[abcd]3 and cat /proc/mdstat.
First I replaced the disk in position 4, let it synchronize, and the RAID was healthy. So I formatted the removed disk and continued with the disk in position 3. After I replaced the disk in position 3, I let it synchronize and formatted the old disk 3 as well (mistake). After some time the synchronization finished with an error.
So I tried to repair it, but it always fails. I checked the log and saw: "Detected Disk2 I/O error".
Output of the commands over ssh:

~ # mdadm --examine /dev/sd[abcd]3
/dev/sda3:
   Magic : a92b4efc
   Version : 1.2
   Feature Map : 0x0
   Array UUID : b3e06031:85e1e9bb:53ade68f:efaf9298
   Name : NAS542:2
   Creation Time : Fri Sep 22 16:08:07 2017
   Raid Level : raid5
   Raid Devices : 4
   Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
   Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
   Data Offset : 262144 sectors
   Super Offset : 8 sectors
   State : clean
   Device UUID : 802bdbf8:d7919097:4f90f1db:4cd1d776
   Update Time : Fri Nov 26 14:35:20 2021
   Checksum : 95a03e9b - correct
   Events : 82655
   Layout : left-symmetric
   Chunk Size : 64K
   Device Role : Active device 0
   Array State : A..A ('A' == active, '.' == missing)
/dev/sdb3:
   Magic : a92b4efc
   Version : 1.2
   Feature Map : 0x0
   Array UUID : b3e06031:85e1e9bb:53ade68f:efaf9298
   Name : NAS542:2
   Creation Time : Fri Sep 22 16:08:07 2017
   Raid Level : raid5
   Raid Devices : 4
   Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
   Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
   Data Offset : 262144 sectors
   Super Offset : 8 sectors
   State : active
   Device UUID : 2bead52a:476c845b:52b6e5e1:1d0787e4
   Update Time : Fri Nov 26 14:31:09 2021
   Checksum : ff35320c - correct
   Events : 82525
   Layout : left-symmetric
   Chunk Size : 64K
   Device Role : Active device 1
   Array State : AA.A ('A' == active, '.' == missing)
mdadm: cannot open /dev/sdc3: No such device or address
/dev/sdd3:
   Magic : a92b4efc
   Version : 1.2
   Feature Map : 0x0
   Array UUID : b3e06031:85e1e9bb:53ade68f:efaf9298
   Name : NAS542:2
   Creation Time : Fri Sep 22 16:08:07 2017
   Raid Level : raid5
   Raid Devices : 4
   Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
   Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
   Used Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
   Data Offset : 262144 sectors
   Super Offset : 8 sectors
   State : clean
   Device UUID : bca2c55b:38f565ac:e95b58ff:8812835b
   Update Time : Fri Nov 26 14:35:20 2021
   Checksum : fe525612 - correct
   Events : 82655
   Layout : left-symmetric
   Chunk Size : 64K
   Device Role : Active device 3
   Array State : A..A ('A' == active, '.' == missing)

~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda3[0] sdd3[4] sdb3[1](F)
      5848151040 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/2] [U__U]
md1 : active raid1 sda2[6] sdd2[4] sdb2[5]
      1998784 blocks super 1.2 [4/3] [UU_U]
md0 : active raid1 sda1[6] sdd1[4] sdb1[5]
      1997760 blocks super 1.2 [4/3] [UU_U]
unused devices: <none>
~ #
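For readability, the same data can be filtered down to the fields that matter for the diagnosis (identical command, only piped through grep):

mdadm --examine /dev/sd[abd]3 | grep -E '^/dev|Events|Device Role|Array State'
# From the dump above this boils down to:
#   sda3: Events 82655, Active device 0, Array State A..A
#   sdb3: Events 82525, Active device 1, Array State AA.A
#   sdd3: Events 82655, Active device 3, Array State A..A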
Thanks for your time.
Mijzelf
That doesn't look nice. It seems your disk 2 (sdb) has developed an I/O problem after you exchanged the first disk. When an I/O error occurs, the disk is dropped from the array. When the array was already degraded, it will be down. And yours is down; only 2 disks are left in the array.
With a trick it is possible to add disk 2 again; the problem is that it will be dropped again as soon as the I/O error reoccurs. So adding a 4th disk is not possible.
The clean solution is to make a bit-by-bit copy of disk 2 to a new disk, using something like ddrescue. The copy will contain soft error(s), as at least one sector of disk 2 is not readable, but no longer an I/O error. So this disk can be re-inserted in the array, using some command-line magic, after which the 4th disk can be added to regain redundancy.
However, there is something I don't understand. You write the filesystem is corrupted, but as the array is down, there is no volume, and so no filesystem. If the corruption showed up before the I/O error, the disk may have failed silently, as in without telling upstream that it couldn't read its sector anymore, which is very bad. If the corruption showed up after the I/O error, you were either looking at some local cache on your client, or I misinterpreted the data I have got.
Do I understand correctly that you formatted both the original disk 3 and 4, and that only the new disk 4 completed the rebuild?
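For orientation, a rough sketch of how such a bit-by-bit copy plus forced re-assembly can look. The device names, the map file location and the slot layout are assumptions, and this is not necessarily the exact procedure meant above; verify everything with lsblk and mdadm before running a single command:

# On a separate Linux PC: clone the failing disk 2 onto a new disk of at
# least the same size. sdX = failing source, sdY = new target. The map file
# must live on a third disk.
ddrescue -f -n /dev/sdX /dev/sdY /root/disk2.map     # fast first pass, skip bad areas
ddrescue -f -r3 /dev/sdX /dev/sdY /root/disk2.map    # then retry the bad areas 3 times

# Back in the NAS, with the clone in slot 2: stop the broken array and
# force-assemble it from the three members that carry the same metadata.
mdadm --stop /dev/md2
mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdb3 /dev/sdd3

# If md2 comes up degraded but running (check cat /proc/mdstat), regain
# redundancy by adding the 4th disk - via the web UI's repair function, or
# manually once its partition exists:
mdadm --add /dev/md2 /dev/sdc3

Because sdb3's event counter is behind the others, a filesystem check of the volume afterwards is advisable.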
xstor
Mijzelf said:
Do I understand correctly that you formatted both the original disk 3 and 4, and that only the new disk 4 completed the rebuild? [...] You write the filesystem is corrupted, but as the array is down, there is no volume, and so no filesystem.
There is a system log from the NAS.
Mijzelf said:
With a trick it is possible to add disk 2 again [...] The clean solution is to make a bit-by-bit copy of disk 2 to a new disk, using something like ddrescue. [...] So this disk can be re-inserted in the array, using some command-line magic, after which the 4th disk can be added to regain redundancy.
Thank you for your time.
I'll let you know when the bit-by-bit copy is ready.