NAS540 with invalid superblock metadata

s21251 Posts: 3  Freshman Member

Hi,

I have a NAS540 with 4 HDDs in a RAID 5 configuration. The volume became inaccessible one day, and I've since done a lot of things on my own, almost certainly incorrectly, to try to repair it. I wasn't able to auto-repair the volume through the UI (it simply shows a red "crashed" state). I'd just like to be able to mount the volume and recover the data on it.

First, here's the current output of mdadm --examine:

BusyBox v1.19.4 (2024-01-02 11:31:49 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

~ # mdadm --examine /dev/sd[abcdef]3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3014c29a:f2f02714:f899863e:15183df1
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Jun 11 13:39:11 2025
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848150464 (5577.23 GiB 5988.51 GB)
  Used Dev Size : 3898766976 (1859.08 GiB 1996.17 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : acddab92:079d5458:b960a2a7:1306373c

    Update Time : Wed Jun 11 13:39:11 2025
       Checksum : 5f9f75c7 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3014c29a:f2f02714:f899863e:15183df1
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Jun 11 13:39:11 2025
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848150464 (5577.23 GiB 5988.51 GB)
  Used Dev Size : 3898766976 (1859.08 GiB 1996.17 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : a3c1c99a:1703ca51:22c84f0f:17cc7c0c

    Update Time : Wed Jun 11 13:39:11 2025
       Checksum : 9925ed3b - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 3014c29a:f2f02714:f899863e:15183df1
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Jun 11 13:39:11 2025
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848150464 (5577.23 GiB 5988.51 GB)
  Used Dev Size : 3898766976 (1859.08 GiB 1996.17 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2f3e3324:246a273f:a9723e9c:72ef33fb

    Update Time : Wed Jun 11 13:39:11 2025
       Checksum : 8b929eb8 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdd3.
mdadm: cannot open /dev/sde3: No such device or address
mdadm: cannot open /dev/sdf3: No such device or address

In the process of trying to recover, I've run a lot of commands, from recreating partitions to zeroing out superblocks and recreating the array from scratch. Here is the output of /proc/mdstat from before I messed everything up:

~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sda3[0](F) sdc3[2] sdb3[5]
      5848151040 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/2] [_UU_]

md1 : active raid1 sda2[0] sdd2[4] sdc2[2] sdb2[5]
      1998784 blocks super 1.2 [4/4] [UUUU]

md0 : active raid1 sdd1[4] sda1[0] sdc1[2] sdb1[5]
      1997760 blocks super 1.2 [4/4] [UUUU]

~ # mdadm --examine /dev/sd[abcdef]3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 99545e30:aeecb506:7cc2e30b:3d610e47
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Sep 13 08:36:34 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : a0d3cf94:05f63236:82af1e22:5e088985

    Update Time : Sat Jun  7 11:55:02 2025
       Checksum : 9fefda44 - correct
         Events : 3235

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAA. ('A' == active, '.' == missing)
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 99545e30:aeecb506:7cc2e30b:3d610e47
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Sep 13 08:36:34 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 05e17a13:d73cc4c8:ffbffc79:cb193407

    Update Time : Sat Jun  7 12:04:41 2025
       Checksum : 8ab652b7 - correct
         Events : 3247

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 99545e30:aeecb506:7cc2e30b:3d610e47
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Sep 13 08:36:34 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 8118ef94:89977da0:25be3321:0b956784

    Update Time : Sat Jun  7 12:04:41 2025
       Checksum : 84e5e39 - correct
         Events : 3247

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : .AA. ('A' == active, '.' == missing)
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 99545e30:aeecb506:7cc2e30b:3d610e47
           Name : NAS540:2  (local to host NAS540)
  Creation Time : Wed Sep 13 08:36:34 2017
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
     Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 703927dd:f56dd1f6:05a94b5b:e0be3252

    Update Time : Fri May 30 13:11:50 2025
       Checksum : aeb0ecda - correct
         Events : 2938

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)
mdadm: cannot open /dev/sde3: No such device or address
mdadm: cannot open /dev/sdf3: No such device or address

As you can see from the stale event counts, /dev/sdd3 fell out of the array quite a while ago (2938 events vs. 3247), and /dev/sda3 is slightly behind as well (3235).
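
(For reference, the relevant fields can be pulled out of a dump like the one above with plain grep; I'm assuming the BusyBox grep here supports -E, otherwise repeated -e patterns do the same thing:)

~ # mdadm --examine /dev/sd[abcd]3 | grep -E '/dev/sd|Update Time|Events'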

The first thing I tried was simply recreating the array with all 4 drives. It fails due to an I/O error on /dev/sdd3:

~ # mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric /dev/md2 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
mdadm: /dev/sda3 appears to be part of a raid array:
level=raid5 devices=4 ctime=Wed Jun 11 13:39:11 2025
mdadm: /dev/sdb3 appears to be part of a raid array:
level=raid5 devices=4 ctime=Wed Jun 11 13:39:11 2025
mdadm: /dev/sdc3 appears to be part of a raid array:
level=raid5 devices=4 ctime=Wed Jun 11 13:39:11 2025
Continue creating array? yes
mdadm: Failed to write metadata to /dev/sdd3
~ # dmesg | tail
[ 2422.428266] sd 3:0:0:0: [sdd] CDB: cdb[0]=0x28: 28 00 00 3d 07 f0 00 00 08 00
[ 2422.435562] end_request: I/O error, dev sdd, sector 3999728
[ 2422.441307] sd 3:0:0:0: [sdd] Unhandled error code
[ 2422.446126] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00
[ 2422.452511] sd 3:0:0:0: [sdd] CDB: cdb[0]=0x28: 28 00 00 00 08 00 00 00 01 00
[ 2422.459813] end_request: I/O error, dev sdd, sector 2048
[ 2422.465249] sd 3:0:0:0: [sdd] Unhandled error code
[ 2422.470066] sd 3:0:0:0: [sdd] Result: hostbyte=0x04 driverbyte=0x00
[ 2422.476457] sd 3:0:0:0: [sdd] CDB: cdb[0]=0x28: 28 00 00 00 08 00 00 00 01 00
[ 2422.483749] end_request: I/O error, dev sdd, sector 2048
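
I assume the drive itself is dying. smartmontools isn't part of the stock firmware as far as I know, so this is untested here, but if it were installed (e.g. via Entware) or the disk were moved to a PC, a SMART report should confirm it:

~ # smartctl -H /dev/sdd    # overall health verdict
~ # smartctl -a /dev/sdd    # full report; Reallocated_Sector_Ct and Current_Pending_Sector are the attributes to watch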

Next, I tried to recreate the array with the failing 4th drive excluded. However, mounting fails:

~ # mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric /dev/md2 /dev/sda3 /dev/sdb3 /dev/sdc3 missing
mdadm: /dev/sda3 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Wed Jun 11 13:50:24 2025
mdadm: /dev/sdb3 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Wed Jun 11 13:50:24 2025
mdadm: /dev/sdc3 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Wed Jun 11 13:50:24 2025
Continue creating array? yes
mdadm: array /dev/md2 started.
~ # mkdir /mnt/recovery
~ # mount /dev/md2 /mnt/recovery
mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
~ # dmesg | tail
[ 2533.802943] md/raid:md2: allocated 4220kB
[ 2533.807051] md/raid:md2: raid level 5 active with 3 out of 4 devices, algorithm 2
[ 2533.814554] RAID conf printout:
[ 2533.814559]  --- level:5 rd:4 wd:3
[ 2533.814566]  disk 0, o:1, dev:sda3
[ 2533.814571]  disk 1, o:1, dev:sdb3
[ 2533.814577]  disk 2, o:1, dev:sdc3
[ 2533.814682] md2: detected capacity change from 0 to 5988506075136
[ 2533.824985]  md2: unknown partition table
[ 2556.944048] EXT4-fs (md2): bad geometry: block count 1462037760 exceeds size of device (1462037616 blocks)

There appears to be a discrepancy between the filesystem size recorded in the superblock and the physical size of the device. Running e2fsck shows the following:

~ # e2fsck -n /dev/md2
e2fsck 1.42.12 (29-Aug-2014)
The filesystem size (according to the superblock) is 1462037760 blocks
The physical size of the device is 1462037616 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort? no

/dev/md2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error reading block 628 (Attempt to read block from filesystem resulted in short read).  Ignore error? no

Error while iterating over blocks in inode 7: Attempt to read block from filesystem resulted in short read
e2fsck: aborted
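
The two numbers can be cross-checked directly as well, assuming blockdev and dumpe2fs exist in this firmware (I haven't verified that):

~ # blockdev --getsize64 /dev/md2                   # size of the md device in bytes
~ # dumpe2fs -h /dev/md2 | grep -i 'block count'    # block count recorded in the ext4 superblock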

I'm not really sure how to proceed from here, and any guidance would be much appreciated. I don't need full recovery; even partial recovery of the data would go a long way.

All Replies

  • Mijzelf Posts: 2,930  Guru Member

    This raises some questions. Your current array has array state AAAA, while /dev/sdd3 has no superblock and mdadm got an I/O error on sdd. Any idea how that is possible?

    And then on to the current problem. The original array had:

    Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
    Array Size : 5848151040 (5577.23 GiB 5988.51 GB)

    The current array has:

    Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
    Array Size : 5848150464 (5577.23 GiB 5988.51 GB)
    Used Dev Size : 3898766976 (1859.08 GiB 1996.17 GB)

    So the array size shrank from 5848151040 KiB to 5848150464 KiB, because the Dev Size is no longer fully used.
    Then the mount:
    EXT4-fs (md2): bad geometry: block count 1462037760 exceeds size of device (1462037616 blocks)

    One block is 4096 bytes, which is 4KiB, so the message translates to
    bad geometry: 5848151040 KiB exceeds size of device (5848150464 KiB)

    Look familiar? So to solve this issue, you have to force the array to use the full available Dev Size. Maybe
    mdadm --grow /dev/md2 --size=max

    will do the trick.
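
    After growing, it would be worth checking the reported size and mounting read-only first, so nothing gets written to a possibly damaged filesystem (standard mdadm/mount usage, nothing NAS540-specific):

    ~ # mdadm --detail /dev/md2 | grep 'Array Size'   # should again match the original 5848151040 KiB
    ~ # mount -o ro /dev/md2 /mnt/recovery            # read-only mount while you copy the data off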

    Error reading block 628

    This is worrying. Did you look at the kernel log? This is probably an I/O error on one of the disks, but I can't predict which one. It may be necessary to make a low-level copy of the faulty disk in order to be able to copy the data off (an unreadable block in a key position can be very disruptive). Do not run e2fsck against bad disks; it could make things worse.
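
    If a low-level copy is needed, here is a sketch with GNU ddrescue (not in the NAS firmware, so assume the disks are attached to a PC; /dev/sdX is the failing source and /dev/sdY an equal-or-larger target, both placeholders):

    # first pass skips unreadable areas quickly, second pass retries them
    ddrescue -f -n /dev/sdX /dev/sdY rescue.map
    ddrescue -f -r3 /dev/sdX /dev/sdY rescue.map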

  • s21251 Posts: 3  Freshman Member

    mdadm --grow /dev/md2 --size=max did the trick! I'm able to mount the volume with the 3 drives and copy the files off. Thank you!
