How do I recover a volume after repair process silently failed?
bjorn
Posts: 8 Freshman Member
My NAS540 started beeping two weeks ago. The volume was degraded because drive 2 had failed. Following the web interface's instructions, I replaced it and initiated a repair. A day later, I tried to log in to the web interface and it just hung forever. A few posts said the repair may take a while, so I just waited...
...for five days.
At that point, I logged in via SSH. CPU usage was low, and there were no processes that looked like they might be performing a repair.
I restarted the device.
Now when I log in, I get a "before you start using your NAS" greeting.
My old Shared Folders are still visible in the control panel, but they all say "lost." The file browser shows no results.
The storage manager shows no volumes or disk groups.
All drives show as healthy.
Did the web GUI repair process fail and destroy the volume? And, is there any way to reinitiate recovery?
All Replies
That doesn't look good. Can you log in over SSH as root and post the output of:

fdisk -l
cat /proc/partitions
mdadm --examine /dev/sd[abcd]3
~ # fdisk -l
Disk /dev/loop0: 144 MiB, 150994944 bytes, 294912 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock0: 256 KiB, 262144 bytes, 512 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock1: 512 KiB, 524288 bytes, 1024 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock2: 256 KiB, 262144 bytes, 512 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock3: 10 MiB, 10485760 bytes, 20480 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock4: 10 MiB, 10485760 bytes, 20480 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock5: 110 MiB, 115343360 bytes, 225280 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock6: 10 MiB, 10485760 bytes, 20480 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock7: 110 MiB, 115343360 bytes, 225280 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mtdblock8: 6 MiB, 6291456 bytes, 12288 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 21EE10AA-752B-4744-9421-343874E5EE0B

Device       Start        End     Sectors  Size Type
/dev/sda1     2048    3999743     3997696  1.9G Linux RAID
/dev/sda2  3999744    7999487     3999744  1.9G Linux RAID
/dev/sda3  7999488 7814035455  7806035968  3.7T Linux RAID

Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 7B0BC181-BA8F-410B-BB00-12B62214BE8A

Device       Start        End     Sectors  Size Type
/dev/sdb1     2048    3999743     3997696  1.9G Linux RAID
/dev/sdb2  3999744    7999487     3999744  1.9G Linux RAID
/dev/sdb3  7999488 7814035455  7806035968  3.7T Linux RAID

Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 812FBB8D-F40F-4685-852C-BFBF2DC2A8E0

Device       Start        End     Sectors  Size Type
/dev/sdc1     2048    3999743     3997696  1.9G Linux RAID
/dev/sdc2  3999744    7999487     3999744  1.9G Linux RAID
/dev/sdc3  7999488 7814035455  7806035968  3.7T Linux RAID

Disk /dev/sdd: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 55DC49C1-620A-4F4C-B96D-5C2959CC8F07

Device       Start        End     Sectors  Size Type
/dev/sdd1     2048    3999743     3997696  1.9G Linux RAID
/dev/sdd2  3999744    7999487     3999744  1.9G Linux RAID
/dev/sdd3  7999488 7814035455  7806035968  3.7T Linux RAID

Disk /dev/md0: 1.9 GiB, 2045706240 bytes, 3995520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/md1: 1.9 GiB, 2046754816 bytes, 3997568 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
~ # cat /proc/partitions
major minor  #blocks  name

   7        0     147456 loop0
  31        0        256 mtdblock0
  31        1        512 mtdblock1
  31        2        256 mtdblock2
  31        3      10240 mtdblock3
  31        4      10240 mtdblock4
  31        5     112640 mtdblock5
  31        6      10240 mtdblock6
  31        7     112640 mtdblock7
  31        8       6144 mtdblock8
   8        0 3907018584 sda
   8        1    1998848 sda1
   8        2    1999872 sda2
   8        3 3903017984 sda3
   8       16 3907018584 sdb
   8       17    1998848 sdb1
   8       18    1999872 sdb2
   8       19 3903017984 sdb3
   8       32 3907018584 sdc
   8       33    1998848 sdc1
   8       34    1999872 sdc2
   8       35 3903017984 sdc3
   8       48 3907018584 sdd
   8       49    1998848 sdd1
   8       50    1999872 sdd2
   8       51 3903017984 sdd3
  31        9     102424 mtdblock9
   9        0    1997760 md0
   9        1    1998784 md1
  31       10       4464 mtdblock10
~ # mdadm --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660160 (11166.25 GiB 11989.67 GB)
  Used Dev Size : 7805773440 (3722.08 GiB 3996.56 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9d8a6c1f:bb790cfd:5c04a459:9213646c

    Update Time : Sat Nov 21 15:26:51 2020
       Checksum : b9ad081d - correct
         Events : 533

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 0
    Array State : AAAA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660736 (11166.25 GiB 11989.67 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 8594f107:efdda91d:618b64fb:ba5b1ec8

    Update Time : Wed Nov 25 13:39:37 2020
       Checksum : c051785a - correct
         Events : 1707

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : spare
    Array State : ..AA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660736 (11166.25 GiB 11989.67 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9232419a:22d8698c:e5ca4ca7:00915b5a

    Update Time : Wed Nov 25 13:39:37 2020
       Checksum : ff8684e2 - correct
         Events : 1707

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 2
    Array State : ..AA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660736 (11166.25 GiB 11989.67 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b754d116:8d0e00cd:9507af1a:410b6d87

    Update Time : Wed Nov 25 13:39:37 2020
       Checksum : 5d209464 - correct
         Events : 1707

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 3
    Array State : ..AA ('A' == active, '.' == missing)
I have some bad news, I'm afraid. The command 'mdadm --examine /dev/sd[abcd]3' shows the headers of the 4 RAID members of the data array, and it seems you exchanged the wrong disk.

/dev/sda3 (the first disk, I think the left one) shows Sat Nov 21 15:26:51 as its latest update, and AAAA as its array state, which means that member thought all members were still active at that time. It has not been updated since, which means this disk was dropped from the array.

The others were updated at Wed Nov 25 13:39:37, and their array state is ..AA, which means they 'know' sda3 is dropped and sdb3 is missing. Of course sdb3 is not really missing, but its role is 'spare', which means it was never added to the array. That was not possible, as only 2 active members were available after you exchanged sdb.

Yet it might be possible to restore the array. Do you still have the old disk?
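By the way, if you only want to pull out the fields that matter here (update time, events, role, array state), something like the line below should work, assuming the BusyBox grep on the box supports -E (it normally does):

mdadm --examine /dev/sd[abcd]3 | grep -E '^/dev/|Update Time|Events|Device Role|Array State'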
Hrm. I'm vaguely following, although I'm unsure how I could have swapped the wrong disk, going by the restoration screen's illustration.
I do still have the failing disk, though.
> although I'm unsure how I could have swapped the wrong disk, going by the restoration screen's illustration

I'm not pretending the firmware did it right. I have never seen the 'exchange disk' illustrations myself. It's easy to see which disk failed in some status file, and I would think translating that into instructions shouldn't be difficult. But who knows?

Anyway, can you post the output of

cat /proc/mdstat

then exchange the old sdb back (and power cycle), and post again

cat /proc/mdstat
mdadm --examine /dev/sd[abcd]3
Fair. I appreciate your time. Latest executions below. Before swapping:
~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sda2[0] sdd2[3] sdc2[2] sdb2[4]
      1998784 blocks super 1.2 [4/4] [UUUU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[4]
      1997760 blocks super 1.2 [4/4] [UUUU]

unused devices: <none>
After swapping the old drive back:

~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2[4] sda2[0] sdd2[3] sdc2[2]
      1998784 blocks super 1.2 [4/4] [UUUU]

md0 : active raid1 sdb1[4] sda1[0] sdd1[3] sdc1[2]
      1997760 blocks super 1.2 [4/4] [UUUU]

unused devices: <none>
~ # mdadm --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660160 (11166.25 GiB 11989.67 GB)
  Used Dev Size : 7805773440 (3722.08 GiB 3996.56 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9d8a6c1f:bb790cfd:5c04a459:9213646c

    Update Time : Sat Nov 21 15:26:51 2020
       Checksum : b9ad081d - correct
         Events : 533

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 0
    Array State : AAAA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660160 (11166.25 GiB 11989.67 GB)
  Used Dev Size : 7805773440 (3722.08 GiB 3996.56 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 0a5c35b6:3bd8a182:5030b8be:51bbe238

    Update Time : Thu Oct 22 21:21:39 2020
       Checksum : 77ae1fd - correct
         Events : 47

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 1
    Array State : AAAA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660736 (11166.25 GiB 11989.67 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9232419a:22d8698c:e5ca4ca7:00915b5a

    Update Time : Wed Nov 25 13:39:37 2020
       Checksum : ff8684e2 - correct
         Events : 1707

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 2
    Array State : ..AA ('A' == active, '.' == missing)

~ # mdadm --examine /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : ad82b6f7:6aacc5f3:c7a86a8b:25240df4
           Name : NAS540:2 (local to host NAS540)
  Creation Time : Thu Jul 27 13:12:32 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7805773824 (3722.08 GiB 3996.56 GB)
     Array Size : 11708660736 (11166.25 GiB 11989.67 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b754d116:8d0e00cd:9507af1a:410b6d87

    Update Time : Wed Nov 25 13:39:37 2020
       Checksum : 5d209464 - correct
         Events : 1707

         Layout : left-symmetric
     Chunk Size : 64K

    Device Role : Active device 3
    Array State : ..AA ('A' == active, '.' == missing)
You didn't swap the wrong disk. The old disk was dropped from the array at Thu Oct 22 21:21:39.

So what happened, I think, is that disk sdb was dropped on Oct 22. I don't know why you weren't notified before ~Nov 12. At Oct 22 the state of sd[acd]3 was changed to A.AA. You exchanged disk sdb, and the state changed for all disks to AAAA while sdb3 was rebuilding. I think that should have taken around 24 hours (4TB @ ~50MB/sec).

Before rebuilding was finished, sda was dropped. It kept state AAAA (as it was not written to after being dropped), while the other disks were updated to ..AA. Partition sdb3 lost its active state because it was not fully rebuilt, and the array was down, so further rebuilding was no longer possible. According to your story you started rebuilding ~Nov 12, and it should have taken around 24 hours, so the array went down ~Nov 13. I don't know why sd[bcd]3 have an update stamp of Nov 25; AFAIK there should have been no updates after the array went down.

sd[acd]3 should contain your data, except for a (maybe small) error on sda3, which caused it to be dropped. The old sdb3 is probably not usable anymore, as its content is around 3 weeks older than the rest of the array, so an array built with this disk almost certainly has a corrupt filesystem, but if everything else fails, it can be tried. The new sdb3 has an unknown status. It is possible that it is mainly empty; it is also possible that it's almost completely rebuilt.

In most cases a disk is dropped because it has an unreadable sector. Unwritable is less obvious, because the disk will transparently swap in a spare sector. When we rebuild the array from the current sd[acd]3, it is possible that you can access all your data, as the error may be on a part of the disk that is not in use by a file. But if you add a 4th disk to be rebuilt, the whole surface is read, as the RAID array sits below the filesystem and doesn't know about files. So in that case sda might be dropped on the same unreadable sector.

The options are:
1) Rebuild the current sd[acd]3 array and copy away the data, in the hope that the unreadable sector is not in use. The odds are better if there is a lot of free space.
2) Rebuild the current sd[acd]3 array, add an sdb, and hope for the best.
3) Clone the current sda to a new disk, to get rid of that sector, rebuild sd[acd]3, and add an sdb.
4) Combine 1 and 3.

I would go for 1, assuming there is not too much data. The reason is that if sdb and sda were both dropped due to an unreadable sector, what are the odds that sdc and sdd, which I suppose are from the same batch, will also have unreadable sectors? So save your data to an independent disk. You need a backup anyway.

Having said that, the command to rebuild the array from sd[acd]3 is

mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric --bitmap=none /dev/md2 /dev/sda3 missing /dev/sdc3 /dev/sdd3

That is a single line. As you can see, I skipped /dev/sdb3 in the command and used 'missing', so the array will be built degraded. After rebuilding you can re-enable your shares.
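If you want a quick sanity check after running it, something like this sketch should do, assuming the data volume is an ext4 filesystem directly on /dev/md2 (if your volume was built on LVM instead, the mount step won't apply as written):

cat /proc/mdstat                    # md2 should show as active raid5, degraded, e.g. [U_UU]
mdadm --detail /dev/md2             # check: level raid5, chunk 64K, 4 devices with one missing
mkdir -p /mnt/recover
mount -o ro /dev/md2 /mnt/recover   # mount read-only first, then copy the data to another disk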
This makes sense. The NAS is in a second bedroom that we've kept shut to conserve heat. I may have just not heard the beeping for a while. (And there are relatively few writes to it.)
I tried to execute the mdadm command, but got:

mdadm: '--bitmap none' only support for --grow
You can omit the --bitmap=none. Nice that each version of mdadm is slightly different.
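That is, the same create line with that option dropped:

mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric /dev/md2 /dev/sda3 missing /dev/sdc3 /dev/sdd3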
I ran the command, and this is the output:
BusyBox v1.19.4 (2020-03-18 07:23:22 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

~ # mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric /dev/md2 /dev/sda3 missing /dev/sdc3 /dev/sdd3
mdadm: /dev/sda3 appears to be part of a raid array:
       level=raid5 devices=4 ctime=Thu Jul 27 13:12:32 2017
mdadm: /dev/sdc3 appears to be part of a raid array:
       level=raid5 devices=4 ctime=Thu Jul 27 13:12:32 2017
mdadm: /dev/sdd3 appears to be part of a raid array:
       level=raid5 devices=4 ctime=Thu Jul 27 13:12:32 2017
Continue creating array? y
mdadm: array /dev/md2 started.
Logging in now, this is what I see:
I assume this puts me in the bad outcome condition?