NAS540 hard disk failure reconstruction abnormality


All Replies

  • Mijzelf
    Mijzelf Posts: 2,815  Guru Member
    250 Answers 2500 Comments Friend Collector Seventh Anniversary
    OK. You have to pull all disks except sdd, so that only sdd and the new disk to which you want to copy are in the NAS.

    Download the dd_rescue package here: https://zyxel.diskstation.eu/Users/Mijzelf/Tools/ , and put it on the nas, in /bin/. You can use WinSCP for that.
    Then open a shell on the nas, and execute

    cd /bin/
    tar xf *.tgz

    If you run dd_rescue now, you should get a warning that you have to specify the in- and output.

    Run
    mdadm --examine /dev/sd[ab]3

    This should show the new name of sdd: it's the one with 'Device Role : Active device 2'. The other one is the new disk.

    Let's assume the old sdd now is sda, and the new disk is sdb, then the command is

    dd_rescue /dev/sda /dev/sdb

    This will take several hours, maybe days, depending on the quality of sdd. You'll have to keep the terminal open all that time.

    After that, remove sdd, and put the other original 2 disks back, and repost the output of

    mdadm --examine /dev/sd[abcd]3
    cat /proc/mdstat
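
Picking the old disk out of the `mdadm --examine` output can be done by eye, but a small filter makes the rule explicit. The following is only a sketch: `find_role` is a hypothetical helper name, it assumes the usual `--examine` layout (a `/dev/...:` header line followed by a `Device Role :` line), and it only reads text from stdin, so it never touches a disk.

```shell
# Print the partition whose md superblock reports the given role number.
# Input: the text output of `mdadm --examine` on stdin.
find_role() {
    awk -v want="Active device $1" '
        # Header line of each examined partition, e.g. "/dev/sda3:"
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # Role line, e.g. "   Device Role : Active device 2"
        /Device Role/ {
            sub(/^.*Device Role[ \t]*:[ \t]*/, "")
            sub(/[ \t]+$/, "")
            if ($0 == want) print dev
        }
    '
}

# On the NAS (as root) this would be used as:
#   mdadm --examine /dev/sd[ab]3 | find_role 2
```

The partition it prints belongs to the source disk for dd_rescue; the other disk is the target. Swapping the two would overwrite the only copy of that member, so it is worth double-checking before starting the copy.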

  • RiceC
    RiceC Posts: 10  Freshman Member
    Dear Sir,
    I ran dd_rescue a few days ago. My current situation is this: with the two original disks and the bit-copied disk installed,
    I can see the RAID when I boot, but I still have to insert a new hard disk to make it a normal four-disk RAID5 again.
    When I do, the error that occurred at the beginning occurs again. The new hard disk that I copied using dd_rescue still has the disk sector error.
    Is there a command to skip the error and allow the new hard disk to rebuild the RAID?
    Thank you.


    As you said above:
    So it is possible that if you re-create this (degraded) array from the command line, using --assume-clean, that you can copy away all your files, without triggering this error again.
    
  • Mijzelf
    Mijzelf Posts: 2,815  Guru Member
    OK. According to your post on 21 November, your array status went from [_UUU] when the rebuild started to [_U_U] when the hardware failure occurred. So the array has to be rebuilt from the 'Active devices' 1..3, as Active Device 0 was never completely synced.
    According to your post on 5 December, the 'Active devices' 1..3 are the partitions sdb3, sdd3 and sdc3.

    The command to recreate the array with these 3 members on these roles is

    mdadm --stop /dev/md2
    mdadm --create --assume-clean --level=5  --raid-devices=4 --metadata=1.2 --chunk=64K  --layout=left-symmetric /dev/md2 missing /dev/sdb3 /dev/sdd3 /dev/sdc3

    Those are 2 lines, both starting with mdadm.
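
The order of the members on the `--create` line is what maps each partition to its role, so it can help to generate the line from the roles instead of typing it. A minimal sketch, assuming the same options as the command above; `recreate_cmd` is a hypothetical helper that only prints the command and never runs mdadm.

```shell
# Print (do not run) the mdadm --create line for a 4-disk RAID5 whose
# slot 0 is absent. Arguments: the partitions holding roles 1, 2 and 3,
# in that order.
recreate_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: recreate_cmd ROLE1_DEV ROLE2_DEV ROLE3_DEV" >&2
        return 1
    fi
    echo "mdadm --create --assume-clean --level=5 --raid-devices=4" \
         "--metadata=1.2 --chunk=64K --layout=left-symmetric" \
         "/dev/md2 missing $1 $2 $3"
}

# Roles 1..3 as reported in this thread: sdb3, sdd3, sdc3.
recreate_cmd /dev/sdb3 /dev/sdd3 /dev/sdc3
```

Printing first and pasting afterwards gives a chance to compare the device order against a fresh `mdadm --examine` before anything is written to the disks.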

  • RiceC
    RiceC Posts: 10  Freshman Member
    edited December 2019
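    Dear Sir,


    The following messages appeared when I ran the commands. What should I do? Thank you.

    ~ $ mdadm --stop /dev/md2
    mdadm: must be super-user to perform this action
    ~ $ sudo mdadm --stop /dev/md2
    -sh: sudo: not found
    ~ #
    ~ $ su root
    Password:


    BusyBox v1.19.4 (2019-09-04 14:33:19 CST) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    ~ #
    ~ # mdadm --stop /dev/md2
    mdadm: Cannot get exclusive access to /dev/md2:Perhaps a running process, mounted filesystem or active volume group?
    ~ #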
    /dev/mapper # pvdisplay
      --- Physical volume ---
      PV Name               /dev/md2
      VG Name               vg_28524431
      PV Size               16.36 TiB / not usable 3.81 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              4289348
      Free PE               0
      Allocated PE          4289348
      PV UUID               2L3zxx-baO6-JlSj-Y88b-Jr5I-hBo3-i20If6

  • Mijzelf
    Mijzelf Posts: 2,815  Guru Member
    Does the NAS support some eastern language? Amazing.

    Anyway, the command to deactivate the logical volume is

    vgchange -an

    which has to be executed before the mdadm --stop.
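
The layers have to be released from the top down: filesystems first, then the volume group, then the array. The sketch below prints that sequence for review instead of executing it. `teardown_plan` is a hypothetical helper; it assumes the mounts show up as `/dev/mapper/<vg>-<lv>` (as in the df output in this thread) and that the volume group name contains no hyphen.

```shell
# Read /proc/mounts-style lines on stdin and print, without running them,
# the commands that release volume group $1 and then stop md array $2.
teardown_plan() {
    awk -v vg="$1" -v md="$2" '
        # Field 1 is the device, field 2 the mount point.
        index($1, "/dev/mapper/" vg "-") == 1 { mp[++n] = $2 }
        END {
            # Unmount in reverse mount order, latest mount first.
            for (i = n; i >= 1; i--) print "umount " mp[i]
            print "vgchange -an " vg
            print "mdadm --stop " md
        }
    '
}

# On the NAS this would be used as:
#   teardown_plan vg_28524431 /dev/md2 < /proc/mounts
```

If any of the printed umount commands fails with "target is busy", the later steps will fail as well, because the logical volume is still open.
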
  • RiceC
    RiceC Posts: 10  Freshman Member
    edited December 2019
    Dear Sir,
    Still some errors, can you please help?
    ~ # vgchange -an
      Logical volume vg_28524431/vg_info_area contains a filesystem in use.
      Can't deactivate volume group "vg_28524431" with 3 open logical volume(s)
    ~ # df -h
    Filesystem                Size      Used Available Use% Mounted on
    ubi7:ubi_rootfs2         90.7M     48.7M     37.4M  57% /firmware/mnt/nand
    /dev/md0                  1.9G    178.9M      1.6G  10% /firmware/mnt/sysdisk
    /dev/loop0              139.5M    123.1M     16.4M  88% /ram_bin
    /dev/loop0              139.5M    123.1M     16.4M  88% /usr
    /dev/loop0              139.5M    123.1M     16.4M  88% /lib/security
    /dev/loop0              139.5M    123.1M     16.4M  88% /lib/modules
    /dev/loop0              139.5M    123.1M     16.4M  88% /lib/locale
    /dev/ram0                 5.0M      4.0K      5.0M   0% /tmp/tmpfs
    /dev/ram0                 5.0M      4.0K      5.0M   0% /usr/local/etc
    ubi3:ubi_config           2.4M    160.0K      2.0M   7% /etc/zyxel
    /dev/mapper/vg_28524431-lv_7f8dcf8b
                            366.3G    194.5M    366.1G   0% /i-data/7f8dcf8b
    /dev/mapper/vg_28524431-lv_7ec47419
                             15.9T     10.3T      5.5T  65% /i-data/7ec47419
    /dev/mapper/vg_28524431-vg_info_area
                             96.9M      4.1M     92.8M   4% /mnt/vg_info_area/vg_28524431
    /dev/mapper/vg_28524431-lv_7ec47419
                             15.9T     10.3T      5.5T  65% /usr/local/apache/htdocs/desktop,/pkg
    /dev/mapper/vg_28524431-lv_7ec47419
                             15.9T     10.3T      5.5T  65% /usr/local/mysql
    ~ # umount /i-data/7f8dcf8b
    ~ # umount /mnt/vg_info_area/vg_28524431
    ~ # umount /usr/local/apache/htdocs/desktop,/pkg
    ~ # umount /usr/local/mysql
    ~ # umount /i-data/7ec47419
    umount: /i-data/7ec47419: target is busy
            (In some cases useful info about processes that
             use the device is found by lsof(8) or fuser(1).)
    ~ # vgchange -an
      Logical volume vg_28524431/lv_7ec47419 contains a filesystem in use.
      Can't deactivate volume group "vg_28524431" with 1 open logical volume(s)
    ~ #


    ~ # vgdisplay
      --- Volume group ---
      VG Name               vg_28524431
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  5
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                3
      Open LV               1
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size               16.36 TiB
      PE Size               4.00 MiB
      Total PE              4289348
      Alloc PE / Size       4289348 / 16.36 TiB
      Free  PE / Size       0 / 0
      VG UUID               B2vAgC-DwH6-jxC5-aHqz-sUmH-PZEv-ekuduP
     
    ~ # pvdisplay
      --- Physical volume ---
      PV Name               /dev/md2
      VG Name               vg_28524431
      PV Size               16.36 TiB / not usable 3.81 MiB
      Allocatable           yes (but full)
      PE Size               4.00 MiB
      Total PE              4289348
      Free PE               0
      Allocated PE          4289348
      PV UUID               2L3zxx-baO6-JlSj-Y88b-Jr5I-hBo3-i20If6
     

    ~ # lvdisplay
      --- Logical volume ---
      LV Path                /dev/vg_28524431/vg_info_area
      LV Name                vg_info_area
      VG Name                vg_28524431
      LV UUID                Mxq7Cr-NnAg-fKTi-WwWT-j0zh-ltOU-MKPtoX
      LV Write Access        read/write
      LV Creation host, time NAS540, 2015-11-10 15:29:21 +0800
      LV Status              available
      # open                 0
      LV Size                100.00 MiB
      Current LE             25
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     3072
      Block device           253:0
     
      --- Logical volume ---
      LV Path                /dev/vg_28524431/lv_7ec47419
      LV Name                lv_7ec47419
      VG Name                vg_28524431
      LV UUID                uLjVxB-x3NU-l70j-tdT1-Gi26-JgLx-5jNzMc
      LV Write Access        read/write
      LV Creation host, time NAS540, 2015-11-10 15:29:22 +0800
      LV Status              available
      # open                 1
      LV Size                16.00 TiB
      Current LE             4194048
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     3072
      Block device           253:1
     
      --- Logical volume ---
      LV Path                /dev/vg_28524431/lv_7f8dcf8b
      LV Name                lv_7f8dcf8b
      VG Name                vg_28524431
      LV UUID                c0VVFV-imCI-5qLH-BFmV-oxPR-Xfpe-pMCvcJ
      LV Write Access        read/write
      LV Creation host, time NAS540, 2016-01-24 15:18:37 +0800
      LV Status              available
      # open                 0
      LV Size                372.17 GiB
      Current LE             95275
      Segments               1
      Allocation             inherit
      Read ahead sectors     auto
      - currently set to     3072
      Block device           253:2
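
The "target is busy" message above points at lsof(8) or fuser(1) to find the process that keeps /i-data/7ec47419 open. Where neither tool is installed, a rough equivalent can be sketched with plain shell and `readlink`. `holders` is a hypothetical helper; it only looks at each process's working directory and open file descriptors, so it can miss other kinds of use, such as memory-mapped files.

```shell
# List PIDs whose working directory or open files live under mount point $2.
# $1 is the proc root (normally /proc); it is a parameter so the function
# can also be tried against a scratch directory.
holders() {
    proc=$1
    mnt=$2
    for dir in "$proc"/[0-9]*; do
        [ -d "$dir" ] || continue
        for link in "$dir"/cwd "$dir"/fd/*; do
            # Unreadable or non-existent links are skipped silently.
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$mnt"|"$mnt"/*) echo "${dir##*/}"; break ;;
            esac
        done
    done
}

# On the NAS (as root) this would be used as:
#   holders /proc /i-data/7ec47419
```

Each printed PID can then be looked up in the output of `ps` to decide whether the service behind it can be stopped before retrying the umount.
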
  • Mijzelf
    Mijzelf Posts: 2,815  Guru Member
    We are drifting away.

    • There was a raid5 array ABDC.
    • Disk A failed, degrading the array,
    • and was replaced by a new disk A'.
    • On resync disk D appeared to have a hardware error,
    • causing the sync to stop,
    • and drop D from the array, which is now down.

    2 possible solutions:
    1. Recreate the (degraded) array _BDC, so that the content can be copied away, hoping the hardware error is in slack space, so it won't be triggered.
    2. Make a bitwise copy of D to a new disk D', recreate the array _BD'C, and resync to A'BD'C.

    Now you are trying to apply solution 1, but that fails because the array cannot be stopped: it contains a logical volume which is mounted.

    If that is the case, the array is not down, but contains a valid filesystem, which, according to your df output, contains 10.3 TB of data. So you can copy away your data. Re-creating the array will change nothing.
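
The states named in the summary above correspond to the member maps quoted earlier in the thread ([_UUU], [_U_U]): a RAID5 array survives one missing member and no more. A minimal sketch of that rule, with `raid5_state` as a hypothetical helper that only inspects the string it is given:

```shell
# Classify a RAID5 member map as printed in /proc/mdstat, e.g. "[_UUU]".
# "U" is an active member, "_" a missing one.
raid5_state() {
    missing=$(printf '%s' "$1" | tr -cd '_' | wc -c)
    case $((missing)) in
        0) echo healthy ;;
        1) echo degraded ;;   # still readable, no redundancy left
        *) echo down ;;       # two or more members missing
    esac
}

raid5_state '[_UUU]'   # prints "degraded"
raid5_state '[_U_U]'   # prints "down"
```

The map in the current `cat /proc/mdstat` output shows which of these states applies. A mounted filesystem reporting 10.3T of used space fits a degraded array, not a down one.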
