NSA320 process to replace a bad disk in RAID 1

All Replies

  • Mijzelf
    edited November 2021
    The white screen is normal; there is no further confirmation. Now you can use a telnet client (PuTTY will do; make sure you select Telnet as the protocol) to get shell access to your NAS.
    You may have to re-open the Telnet backdoor first; I don't know whether it closes after a period of inactivity.
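    For example, instead of PuTTY you can also use a command-line telnet client (the address 192.168.1.100 is just a placeholder; use the actual IP address of your NAS):

    telnet 192.168.1.100

    Then log in as 'admin' with your admin password.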

  • Thanks for the help - I believe I've connected with telnet. (Sorry, all this is very unfamiliar to me.)

    Here's the entire session with the commands you listed; after "su" I typed in the admin password. Thanks in advance for your suggestions.


    NSA320 login: admin
    Password:


    BusyBox v1.17.2 (2016-03-11 16:40:37 CST) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    ~ $ cat /proc/partitions
    major minor  #blocks  name

       7        0     140288 loop0
       8        0 2930266584 sda
       8        1     498688 sda1
       8        2 2929766400 sda2
       8       16 2930266584 sdb
       8       17     498688 sdb1
       8       18 2929766400 sdb2
      31        0       1024 mtdblock0
      31        1        512 mtdblock1
      31        2        512 mtdblock2
      31        3        512 mtdblock3
      31        4      10240 mtdblock4
      31        5      10240 mtdblock5
      31        6      48896 mtdblock6
      31        7      10240 mtdblock7
      31        8      48896 mtdblock8
       9        0 2929766264 md0
    ~ $ cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1]
    md0 : active raid1 sda2[2] sdb2[1]
          2929766264 blocks super 1.0 [2/1] [_U]

    unused devices: <none>
    ~ $ su
    Password:


    BusyBox v1.17.2 (2016-03-11 16:40:37 CST) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    ~ # mdadm --examine /dev/sd[ab]2
    /dev/sda2:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x2
         Array UUID : 630ad246:af25be64:c3565d38:6dd85ddb
               Name : 0
      Creation Time : Sun Sep 22 01:50:31 2013
         Raid Level : raid1
       Raid Devices : 2

     Avail Dev Size : 2929766264 (2794.04 GiB 3000.08 GB)
         Array Size : 2929766264 (2794.04 GiB 3000.08 GB)
       Super Offset : 5859532784 sectors
    Recovery Offset : 5859532528 sectors
              State : clean
        Device UUID : 7346cb76:5a16c5d3:93ba274d:a613f4e0

        Update Time : Tue Nov 30 09:33:52 2021
           Checksum : a43bb60 - correct
             Events : 3509251


       Device Role : Active device 0
       Array State : AA ('A' == active, '.' == missing)
    /dev/sdb2:
              Magic : a92b4efc
            Version : 1.0
        Feature Map : 0x0
         Array UUID : 630ad246:af25be64:c3565d38:6dd85ddb
               Name : 0
      Creation Time : Sun Sep 22 01:50:31 2013
         Raid Level : raid1
       Raid Devices : 2

     Avail Dev Size : 2929766264 (2794.04 GiB 3000.08 GB)
         Array Size : 2929766264 (2794.04 GiB 3000.08 GB)
       Super Offset : 5859532784 sectors
              State : clean
        Device UUID : 9d3c2c12:358ea491:6104fcc1:4296fc9d

        Update Time : Tue Nov 30 09:33:52 2021
           Checksum : 381f96da - correct
             Events : 3509251


       Device Role : Active device 1
       Array State : AA ('A' == active, '.' == missing)
    ~ #
    ~ #



  • Mijzelf
    OK, according to your /proc/partitions both disks are exactly the same size, and so are the data partitions. Your /proc/mdstat ("[2/1] [_U]") also shows that only one of the two array members is currently up.
    I'm not sure what is happening here, but at least this is strange:
       Super Offset : 5859532784 sectors
    Recovery Offset : 5859532528 sectors
    This RAID metadata version (1.0) has its header at the end of the partition, so 'Super Offset' points to the end of the usable space. The recovery stopped 256 sectors before the end of the array. That is a suspicious number, as it is a power of 2. Moreover, if this really happened in 4 hours, it was recovering at more than 200 MB/sec, which seems impossible to me.
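    For reference, the gap between those two offsets is easy to check in the shell (pure arithmetic, nothing is written to disk):

    echo $((5859532784 - 5859532528))   # prints 256; 256 sectors * 512 bytes = 128 KiB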
    Let's try to continue the recovery and see what the kernel log gives.
    su
    dmesg -c >/dev/null
    echo "recovery" >/sys/block/md0/md/sync_action<br>

    wait some time

    dmesg

    'su' is an elevation command; the 'admin' account by itself doesn't have sufficient rights. The first 'dmesg' command clears the kernel log (the '>/dev/null' just discards its output). The 'echo' line is supposed to continue the recovery. The second 'dmesg' shows the new log lines.
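    While you wait, you can also check whether the rebuild is actually running:

    cat /proc/mdstat

    A running recovery shows a progress line (percentage, estimated finish time and speed) below the md0 entry; if that line is absent, the array is not resyncing.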
  • ~ # su


    BusyBox v1.17.2 (2016-03-11 16:40:37 CST) built-in shell (ash)
    Enter 'help' for a list of built-in commands.

    ~ # dmesg -c >/dev/null
    ~ # echo "recovery" > /sys/block/md0/md/sync_action<br>
    sh: syntax error: unexpected newline
    ~ # echo "recovery" >/sys/block/md0/md/sync_action<br>
    sh: syntax error: unexpected newline
    ~ #

    I'm afraid I don't know where the syntax error is. I tried adding a space between the '>' and '/', to no avail. I also tried removing the <br>, which gave no error, but no other message either.

    I tried logging in as NsaRescueAngel also, but same results.

  • Mijzelf
    This forum software is really ... . That <br> shouldn't be there, and I didn't type it. So retry without the <br>. If you copy & paste, paste into Notepad first and copy it from there; that way no invisible markup gets pasted along.
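    Typed by hand, the line should be exactly this (whitespace around the '>' makes no difference to the shell):

    echo "recovery" >/sys/block/md0/md/sync_action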
  • Got it - done and done.

    PuTTY went back to a command prompt; nothing else happened.

    I waited 10 minutes, then

    ~ # dmesg
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2

    etc

    After 20 minutes I got this:


    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    Uncached vma dbc6d758 (addr 40b3c000 flags 080000ff phy 1c370000) from pid 18342
    Uncached vma dbc6d650 (addr 40b3f000 flags 080000ff phy 1c370000) from pid 18342
    Uncached vma dbc6d650 (addr 406fe000 flags 080000ff phy 1c370000) from pid 1341
    Uncached vma dbc6d650 (addr 406fe000 flags 080000ff phy 1c370000) from pid 1341
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2
    ~ #
    ~ # RAID1 conf printout:
    sh: RAID1: not found
    ~ #  --- wd:1 rd:2
    sh: ---: not found
    ~ #  disk 0, wo:1, o:1, dev:sda2
    sh: disk: not found
    ~ #  disk 1, wo:0, o:1, dev:sdb2
    sh: disk: not found
    ~ # Uncached vma dbc6d758 (addr 40b3c000 flags 080000ff phy 1c370000) from pid 1
    8342
    sh: syntax error: unexpected "("
    ~ # Uncached vma dbc6d650 (addr 40b3f000 flags 080000ff phy 1c370000) from pid 1
    8342
    sh: syntax error: unexpected "("
    ~ # Uncached vma dbc6d650 (addr 406fe000 flags 080000ff phy 1c370000) from pid 1
    341
    sh: syntax error: unexpected "("
    ~ # Uncached vma dbc6d650 (addr 406fe000 flags 080000ff phy 1c370000) from pid 1
    341
    sh: syntax error: unexpected "("
    ~ # RAID1 conf printout:
    sh: RAID1: not found
    ~ #  --- wd:1 rd:2
    sh: ---: not found
    ~ #  disk 0, wo:1, o:1, dev:sda2
    sh: disk: not found
    ~ #  disk 1, wo:0, o:1, dev:sdb2
    sh: disk: not found
    ~ # RAID1 conf printout:
    sh: RAID1: not found


  • ngibson
    edited December 2021

  • Mijzelf
    Another déjà vu:

    RAID1 conf printout:
     --- wd:1 rd:2
     disk 0, wo:1, o:1, dev:sda2
     disk 1, wo:0, o:1, dev:sdb2

    This time I could find it again: Link. That thread describes your problem, with the difference that there the recovery stops at ~70%, while yours stops at ~99.999%.
    I don't know what to suggest now. In theory it's possible to read the last 256 sectors of the source disk and write them back, to re-initialize the broken sector. But still, 256 sectors is a suspicious number. It is probably less error-prone to remove the source disk, create a new volume on the new disk, put the source disk back, copy the data over, and finally add the old disk to the new volume.
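    For completeness, a minimal sketch of that re-init route (the sector numbers are derived from the mdadm output above, and the file name /tmp/tail.bin is just an example; double-check the offsets before writing anything, because a wrong seek value overwrites good data):

    # read the suspect last 256 sectors of the usable area from the healthy disk
    dd if=/dev/sdb2 of=/tmp/tail.bin bs=512 skip=5859532528 count=256
    # write the same data back in place, so the drive can remap a weak sector
    dd if=/tmp/tail.bin of=/dev/sdb2 bs=512 seek=5859532528 count=256

    If the first dd aborts on an unreadable sector, adding 'conv=noerror,sync' makes it continue and pad the bad part with zeros, at the cost of losing whatever was stored there.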
  • Thanks for the suggestion.
    • What is considered the "source disk"? I have the original two disks ("disk1", which it seems was the bad one, and "disk2") plus the third "NewDisk". Currently the NAS has the NewDisk in slot 1 and disk2 in slot 2.
    • Will creating a new volume on any disk delete the current contents? Do I need to back up the whole volume onto yet another disk?
    • Is there a method or specific steps to "copy the data over, and finally add the old disk to the new volume"?
