NAS542 RAID5 Degraded
My RAID is degraded. I replaced disk 2 and followed the steps to repair it, but after a few minutes of loading it took me back to the repair page. I checked, and this is my status:
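(I read the status over SSH with commands along these lines; the /dev/sd[a-d]3 data-partition names are the usual layout on this box, but may differ:)

cat /proc/mdstat
mdadm --examine /dev/sd[abcd]3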
Some advice, please?
All Replies
-
You have got a problem. Either you exchanged the wrong disk, or 2 disks were already dropped when you exchanged the disk.

/dev/sda3:
Update Time : Fri Jan 28 20:47:48 2022
Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb3:
Update Time : Sat Jan 29 11:05:35 2022
Array State : ..AA ('A' == active, '.' == missing)
/dev/sdc3:
Update Time : Sat Jan 29 11:05:35 2022
Array State : ..AA ('A' == active, '.' == missing)
/dev/sdd3:
Update Time : Sat Jan 29 11:05:35 2022
Array State : ..AA ('A' == active, '.' == missing)

Disk 1 was last updated Jan 28 20:47, and at that moment the array was healthy. Disks 2, 3 and 4 were updated on Jan 29 11:05, and by then the array had only 2 members left. So disk 2 was added as a spare, as 2 disks are not enough to add an active member. That means disk 1 failed first, as its 'Array State' was never updated. Maybe disk 2 also failed, maybe not. Was the array degraded or down when you exchanged the disk?
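A quick way to pull just these two fields from every member for comparison (a sketch; adjust the device names if your disks enumerate differently):

for d in /dev/sd[abcd]3; do
    echo "== $d"
    mdadm --examine "$d" | grep -E 'Update Time|Array State'
done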
-
On the 27th I changed disk 2. I read some articles here on the forum, and on the 28th I proceeded to repair the raid from the web interface. After about 30 minutes I started to hear the beeps again; I was thrown out of the web interface, and in the storage manager I had the option to repair the raid again. I left it like that overnight, did some more searching on the 29th, and started running some SSH commands to see the raid status. Is it possible that disk 1 also crashed while repairing the raid? What can I do in this situation? I'm thinking of inserting disk 2 (the old one), and maybe I have a chance to repair the raid with disk 1. What would be the solution? I have important data there.
-
So the rebuild for disk 2 started on the 27th and completed. Then disk 2 was dropped again, and while rebuilding, disk 1 was dropped after Fri Jan 28 20:47 (UTC, I think), leaving you in your current situation.

It is important to know that the disks are not crashed; they just have one or more unreadable sectors. The raid manager drops a member as soon as an I/O error occurs, which is in many cases an unreadable sector.

It is possible to recreate your array using the original 4 disks. The problem is that the unreadable sector is still unreadable, so sooner or later this will hit you again. The solution is to create a bit-by-bit copy on a new disk. The unreadable sector cannot be copied, so it will be filled with zeros on the copy. Whether that is a problem depends on the function of that sector.

You have got 5 disks: A, B1, B2, C and D, where A failed during the 2nd rebuild, B1 was dropped first, and B2 was dropped soon after the 1st rebuild. C and D are healthy, as far as we know. It's a bit strange that B2 was dropped soon after the 1st rebuild. Is that a new disk? Have you looked at its SMART values?

Anyway, I think B1 is most out of sync. It was dropped on the 27th, and all changes to the filesystem after that are not on B1. When A was dropped, the array was down, so A should be up-to-date. I think you should try to create a bit-by-bit backup of A, and then create a degraded array of A, C and D. Then you can add a 4th disk to get redundancy back.

The procedure to create the bit-by-bit copy: remove all disks except A and plug a new disk in. Then execute

cat /proc/partitions

or

mdadm --examine /dev/sd[ab]3

to make sure disk A is still /dev/sda and the new disk is /dev/sdb. Download these 3 files and put them on a USB stick, and plug it in. Execute

cd /e-data/<some-hex-code>/
./fix-ld-linux.sh
./screen        <enter>
./ddrescue /dev/sda /dev/sdb ./logfile

This will copy disk /dev/sda to /dev/sdb, and skip unreadable sectors. Make sure sda is disk A, and sdb is the new disk. While the copy is running, you can close your ssh session. Later you can get your session back with

cd /e-data/<some-hex-code>/
./screen -r

(That is the function of screen.) When copying is done,

mdadm --examine /dev/sd[ab]3

should show 2 identical headers. When that is completed, let's talk about recreating the array.

I have important data there.

By now it's clear that you should have a backup. And raid is not a backup.
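A minimal way to compare those two headers side by side (assuming A is still /dev/sda and the copy is /dev/sdb; since ddrescue makes a raw byte copy, everything should match except the device name in the first line of each output):

mdadm --examine /dev/sda3 > /tmp/hdr-a
mdadm --examine /dev/sdb3 > /tmp/hdr-b
diff /tmp/hdr-a /tmp/hdr-b        # ideally only the /dev/sdX3: title lines differ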
-
Thanks. There is a lot of information, and I'm trying to understand the steps. What I've done in the meantime: yesterday I connected the NAS to a network where I had space to back up data. I turned the NAS on and let it work. I'll wait until it is fully initialized and reread the status. If I can't access the data, I will reinsert disk 2 (the one that was replaced) and initialize it, check the status, and try to copy all the data if I can access it (or at least the critical files). If I still don't have a solution, I will use the steps presented by you.

Do you think it's okay to continue? Are there any risks that I don't anticipate at this time, due to lack of experience in such issues? Or do you think I should go straight to the steps? Thank you very much for your time and information. I'll be back with a status.

PS Related to backup: now I realize. I relied on the redundancy of one disk.
-
and initialize it,
What do you mean by that?
What you are proposing is pretty harmless, as long as you don't delete and/or recreate volumes using the web interface. I don't think you will be able to get any data from the NAS, as the array is down, and won't automagically come up again. The raid headers tell the raid manager they don't belong to the same array anymore.
From the command line it is possible to bring the array up (by re-creating it without touching the content), but as long as the original unreadable sector is there, the array will go down again as soon as it's accessed. Very inconvenient when you are trying to back up.
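For completeness, "re-creating without touching the content" is usually done along these lines. This is only a sketch: the md device name (/dev/md2 is typical for the data volume on this box), member order, chunk size and metadata version are assumptions that must be checked against the mdadm --examine output first, because wrong parameters here overwrite the raid headers and can make things worse:

mdadm --stop /dev/md2
mdadm --create /dev/md2 --level=5 --raid-devices=4 --metadata=1.2 \
    --assume-clean /dev/sda3 missing /dev/sdc3 /dev/sdd3

Here 'missing' keeps the slot of the dropped disk open so no rebuild starts, and --assume-clean tells mdadm not to resync over the existing data.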