NAS540 - RAID5 Crash --> Volume lost

Detlef_M
Detlef_M Posts: 4  Freshman Member
edited April 2018 in Personal Cloud Storage
Hi there,

I desperately need help recovering my RAID5 volume. When the copying process to the NAS stopped unexpectedly and I could no longer access the device, I restarted the NAS. Afterwards I received the following error message in the GUI. The individual HDDs do not show any defects, and the SMART values look fine as well. However, there is no option to "repair" this error. What do I have to do in order not to lose the data permanently?

Thanks in advance for your help.
Detlef

[screenshot: error message shown in the NAS540 GUI]

#NAS_April

Comments

  • Mijzelf
    Mijzelf Posts: 2,598  Guru Member
    Can you enable the ssh server (Control panel->Network->Terminal), login as admin over ssh (on Windows you can use PuTTY for that) and post the output of
    cat /proc/mdstat
    cat /proc/partitions
    cat /proc/mounts
    su   # you'll have to provide your admin password again
    mdadm --examine /dev/sd[abcd]3

  • Detlef_M
    Detlef_M Posts: 4  Freshman Member
    Hi Mijzelf,

    following the output of the commands:
    $ cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md2 : inactive sdb3[1](S) sda3[4](S) sdd3[3](S)
          5848151040 blocks super 1.2

    md1 : active raid1 sda2[4] sdd2[3] sdc2[2] sdb2[1]
          1998784 blocks super 1.2 [4/4] [UUUU]

    md0 : active raid1 sda1[4] sdd1[3] sdc1[5] sdb1[1]
          1997760 blocks super 1.2 [4/4] [UUUU]

    unused devices: <none>

    $ cat /proc/partitions
    major minor  #blocks  name

       7        0     146432 loop0
      31        0        256 mtdblock0
      31        1        512 mtdblock1
      31        2        256 mtdblock2
      31        3      10240 mtdblock3
      31        4      10240 mtdblock4
      31        5     112640 mtdblock5
      31        6      10240 mtdblock6
      31        7     112640 mtdblock7
      31        8       6144 mtdblock8
       8        0 1953514584 sda
       8        1    1998848 sda1
       8        2    1999872 sda2
       8        3 1949514752 sda3
       8       16 1953514584 sdb
       8       17    1998848 sdb1
       8       18    1999872 sdb2
       8       19 1949514752 sdb3
       8       32 1953514584 sdc
       8       33    1998848 sdc1
       8       34    1999872 sdc2
       8       35 1949514752 sdc3
       8       48 1953514584 sdd
       8       49    1998848 sdd1
       8       50    1999872 sdd2
       8       51 1949514752 sdd3
      31        9     102424 mtdblock9
       9        0    1997760 md0
       9        1    1998784 md1
      31       10       4464 mtdblock10

    $ cat /proc/mounts
    rootfs / rootfs rw 0 0
    /proc /proc proc rw,relatime 0 0
    /sys /sys sysfs rw,relatime 0 0
    none /proc/bus/usb usbfs rw,relatime 0 0
    devpts /dev/pts devpts rw,relatime,mode=600 0 0
    ubi5:ubi_rootfs1 /firmware/mnt/nand ubifs ro,relatime 0 0
    /dev/md0 /firmware/mnt/sysdisk ext4 ro,relatime,user_xattr,barrier=1,data=ordered 0 0
    /dev/loop0 /ram_bin ext2 ro,relatime,user_xattr,barrier=1 0 0
    /dev/loop0 /usr ext2 ro,relatime,user_xattr,barrier=1 0 0
    /dev/loop0 /lib/security ext2 ro,relatime,user_xattr,barrier=1 0 0
    /dev/loop0 /lib/modules ext2 ro,relatime,user_xattr,barrier=1 0 0
    /dev/loop0 /lib/locale ext2 ro,relatime,user_xattr,barrier=1 0 0
    /dev/ram0 /tmp/tmpfs tmpfs rw,relatime,size=5120k 0 0
    /dev/ram0 /usr/local/etc tmpfs rw,relatime,size=5120k 0 0
    ubi3:ubi_config /etc/zyxel ubifs rw,relatime 0 0
    configfs /sys/kernel/config configfs rw,relatime 0 0
    # mdadm --examine /dev/sd[abcd]3
    /dev/sda3:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 281cfbe0:a4fbd20c:8e923354:f735754c
               Name : NAS540:2  (local to host NAS540)
      Creation Time : Thu Sep  3 18:27:14 2015
         Raid Level : raid5
       Raid Devices : 4

     Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
         Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 2710d035:f2557186:234e4584:01a41e2e

        Update Time : Sun Apr 22 00:16:07 2018
           Checksum : 8f46918a - correct
             Events : 1851992

             Layout : left-symmetric
         Chunk Size : 64K

       Device Role : spare
       Array State : .A.A ('A' == active, '.' == missing)
    /dev/sdb3:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 281cfbe0:a4fbd20c:8e923354:f735754c
               Name : NAS540:2  (local to host NAS540)
      Creation Time : Thu Sep  3 18:27:14 2015
         Raid Level : raid5
       Raid Devices : 4

     Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
         Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 845e39ea:e8de200b:ec9037b2:b128b056

        Update Time : Sun Apr 22 00:16:07 2018
           Checksum : 1ee33054 - correct
             Events : 1851992

             Layout : left-symmetric
         Chunk Size : 64K

       Device Role : Active device 1
       Array State : .A.A ('A' == active, '.' == missing)
    /dev/sdc3:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 281cfbe0:a4fbd20c:8e923354:f735754c
               Name : NAS540:2  (local to host NAS540)
      Creation Time : Thu Sep  3 18:27:14 2015
         Raid Level : raid5
       Raid Devices : 4

     Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
         Array Size : 5848150464 (5577.23 GiB 5988.51 GB)
      Used Dev Size : 3898766976 (1859.08 GiB 1996.17 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 1b0ac5ea:4e65e7d9:226389e4:fa1d9c94

        Update Time : Sat Apr 21 22:49:48 2018
           Checksum : 5e7113ec - correct
             Events : 1851944

             Layout : left-symmetric
         Chunk Size : 64K

       Device Role : Active device 2
       Array State : AAAA ('A' == active, '.' == missing)
    /dev/sdd3:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 281cfbe0:a4fbd20c:8e923354:f735754c
               Name : NAS540:2  (local to host NAS540)
      Creation Time : Thu Sep  3 18:27:14 2015
         Raid Level : raid5
       Raid Devices : 4

     Avail Dev Size : 3898767360 (1859.08 GiB 1996.17 GB)
         Array Size : 5848151040 (5577.23 GiB 5988.51 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : f76a5985:ea92a351:471f2656:db937cd2

        Update Time : Sun Apr 22 00:16:07 2018
           Checksum : 2040ea50 - correct
             Events : 1851992

             Layout : left-symmetric
         Chunk Size : 64K

       Device Role : Active device 3
       Array State : .A.A ('A' == active, '.' == missing)

    Thanks for your efforts so far.
    Detlef
  • Mijzelf
    Mijzelf Posts: 2,598  Guru Member
    The output of 'mdadm --examine' is interesting. Here the 'Array State' tells what the separate members 'think' about the state of their array, and 'Update Time' tells *when* they thought so.
    As you can see, 3 members are saying .A.A, which means only 2 valid members, and thus the array is down, as it needs at least 3 members.
    Sdc3 says AAAA, which means up and redundant. But sdc3 has an 'Update Time' of 2018-04-21 22:49:48, while the others have 2018-04-22 00:16:07.
    So somewhere after 04-21 22:49:48 sdc3 was dropped from the array, and that member is no longer updated. The array was degraded.
    At 04-22 00:16:07 the array was down, because for some reason the 'Device Role' of sda3 changed from 'Active device 0' to 'spare'.
    When the copying process to the NAS stopped unexpectedly
    I guess that was at 04-22 00:16:07 UTC?
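
    (If you want to put those fields side by side yourself, a one-liner like this should do it; it only reads the superblocks, nothing is written. Run it as root, like the --examine above:)
    mdadm --examine /dev/sd[abcd]3 | grep -E '^/dev|Update Time|Events|Device Role|Array State'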

    Anyway, as you were actively copying to the NAS, the contents of sdc3 are no longer usable in this array, except in case of emergency.

    It's not clear to me why sda3 became a spare member, but as the array went down immediately, sd[abd]3 should contain a valid filesystem, except for the last write.

    You can re-create the array, using the same settings that were originally used:
    mdadm --stop /dev/md2
    mdadm --create --assume-clean --level=5 --raid-devices=4 --metadata=1.2 --chunk=64K --layout=left-symmetric /dev/md2 /dev/sda3 /dev/sdb3 missing /dev/sdd3
    (those are two commands, each starting with mdadm)

    Some of these settings are the defaults according to https://linux.die.net/man/8/mdadm, but I added them explicitly for safety and completeness.
    Here --assume-clean tells mdadm that the partitions already contain a valid array, and the 'missing' keyword tells it that the device with role '2' is missing. The roles 0-3 are assigned in the order in which you specify the partitions here.

    The sequence of the other arguments is important for mdadm, and I don't know if this is the right sequence. Fortunately it will tell you if it's wrong, and it will also tell you what should be right.
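
    After the create command, a few read-only checks can confirm that the array came back in the right shape before anything is written to it. This is only a sketch: it assumes the data volume is an ext4 filesystem directly on /dev/md2, that e2fsck is available in the firmware, and /tmp/check is just an example mount point.
    cat /proc/mdstat                   # md2 should now be an active raid5 with 3 of 4 members
    mdadm --detail /dev/md2            # double-check level, chunk size and device order
    e2fsck -n /dev/md2                 # read-only filesystem check, no repairs are made
    mkdir -p /tmp/check
    mount -o ro /dev/md2 /tmp/check    # read-only test mount; umount /tmp/check when done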


  • Detlef_M
    Detlef_M Posts: 4  Freshman Member
    edited April 2018
    I guess that was at 04-22 00:16:07 UTC?
    I don't know the exact time but the time frame fits.

    OK, I understand most of the points you mentioned. Before I execute the given instructions, however, I have a few questions.
    1. If it doesn't work, will the data be lost forever?
    2. What happens with the "missing" sdc3? Will it be integrated back into the array?
    3. Does the status of sda3 change from "spare" to "active device x" automatically after execution?
    4. cat /proc/mdstat shows a different sequence of partitions. How do I know that the sequence (in your mdadm --create ... line) is correct? Would there be an error if it isn't, or would it directly cause data loss?
    I'm sorry, but I'm not a specialist in this field.

    Thank you for your competent support!
  • Mijzelf
    Mijzelf Posts: 2,598  Guru Member
    If it doesn't work, will the data be lost forever?
    If you (or somebody you pay to do it) fail in re-assembling or re-creating the array, and you have no backups, the data is lost.
    What happens with the "missing" sdc3? Will it be integrated back into the array?
    *A* disk can be integrated, which means that all parities are calculated, and redundancy is restored. If you add this disk to the array, it will be treated as a new, empty disk. The data on the disk is useless, because it's out of sync, and the raid manager has no way to check which parts of the data are still usable.
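
    Once the re-created array mounts and the data looks good, re-integrating that disk would look roughly like this (a sketch, not something to run before you are sure the rest of the array is fine, because it wipes the stale raid header on sdc3 and starts a full resync):
    mdadm --zero-superblock /dev/sdc3    # discard the out-of-sync raid header
    mdadm --add /dev/md2 /dev/sdc3       # add it back as a fresh member; a rebuild starts
    cat /proc/mdstat                     # shows the rebuild progress
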
    Does the status of sda3 change from "spare" to "active device x" automatically after execution?

    The command builds a new array, without touching the data on the array. So it does not actually switch the status back, it generates a new 'active device x'.

    In theory it's possible to only change the status of sda3, but I don't know how. (Only changing the status of that member is not enough BTW, you also have to change the 'Array state' on all members sd[abd]3. If the members don't agree on the status of the array, it will not be assembled.)

    cat /proc/mdstat shows a different sequence of partitions. How do I know that the sequence (in your mdadm --create ... line) is correct? Would there be an error if it isn't, or would it directly cause data loss?
    mdstat says 'sdb3[1](S) sda3[4](S) sdd3[3](S)', which means sdb3 is disk 1 (counting from 0), sdd3 is disk 3, and sda3 is disk 4. So sda3 is the 5th disk in a 4-disk array, which automatically means it's a (hot)spare.
    As far as I see that is the same info as 'mdadm --examine' gives for the sequence. But correct me if I'm wrong.

    If you create the array with the wrong sequence, it simply doesn't produce a valid filesystem, so it can't be mounted, and nothing will be written to it.
    You can retry with another sequence.
    Only if you specify a wrong metadata version can you get into bigger trouble. The header of a raid member can be located at the start or at the end of the partition, depending on the metadata version. So if you specify the wrong one, the newly created raid header will overwrite part of the filesystem on the array.
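
    If in doubt about the metadata version, the existing superblocks already tell you what is in use, so you can check it read-only before creating anything:
    mdadm --examine /dev/sdb3 | grep -E 'Version|Super Offset|Data Offset'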
  • Detlef_M
    Detlef_M Posts: 4  Freshman Member
    Thanks a lot for the detailed information.

    I executed the commands. There were no problems. After this I could mount the new array (read-only).

    Then I ran --zero-superblock on /dev/sdc3 and added it to the new array again as a "clean" disk. The recovery phase is currently running.
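
    (The rebuild progress can be watched over the same ssh session, e.g.:)
    cat /proc/mdstat           # the md2 line shows a progress bar and percentage during recovery
    mdadm --detail /dev/md2    # 'State' and 'Rebuild Status' show how far the resync is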



    I hope that everything now goes through and that I can set up a share at the end. Alternatively, I will copy the data to another disk via USB.

    Thank you very much for your extensive support.
    Detlef
