NSA320 process to replace a bad disk in RAID 1

The Zyxel status monitor has been reading "Degraded" and listing Disk 2. I have a brand new disk, identical in size, ready to replace it.

Do I just shut it down, take out Disk 2, and put in the new one? Is that all that needs to happen to fix the problem? Or are there other steps?

Answers

  • Mijzelf
    Mijzelf Posts: 1,799  Guru Member
    AFAIK you'll need to initiate the recovery in the web interface.
  • ngibson
    ngibson Posts: 12
    edited November 27
    @Mijzelf Looking at the web interface, I don't see any options for that.
    Since I replaced the bad disk, the Storage tab only says that the entire RAID volume is "inactive" and now indicates only "Disk 1".

    Do you know what steps need to be taken to initiate recovery?

    Here's the current situation: Original disk is Disk 1 - New disk is Disk 2

  • Mijzelf
    Mijzelf Posts: 1,799  Guru Member
    I don't see a disk 2 in your screenshot? Are you sure you pulled the right disk?
  • ngibson
    ngibson Posts: 12
    edited November 27
    @Mijzelf No, I'm not sure. With the 2 original disks, when it says "Degraded", the Disk(s) column reads "disk2". Zyxel doesn't say whether that indication is the good disk or the bad disk. I will try swapping the disks and see if it's the other one that's bad.

    Edit: Have now replaced disk1 with the new disk and kept disk2 from the original RAID volume.
    The Status still says "Degraded" but now lists both disks. I clicked the "Repair" icon, and the repair is now in progress. Status is "Recovering".

    Edit 2: After 4 hours, Status is back to "Degraded". Both disks are listed. Did recovery fail?


  • Mijzelf
    Mijzelf Posts: 1,799  Guru Member
    ngibson wrote: "Edit 2: After 4 hours, Status is back to "Degraded". Both disks are listed. Did recovery fail?"

    I think so. 4 hours is not enough. I think a '320 can recover at ~50MB/sec, so a recovery of 2TB would take around 11 hours. Anything in the logs?
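The estimate above is simple arithmetic, and can be sketched as follows. This is only a rough check, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant rebuild rate; the ~50MB/sec figure is the guess from the post, not a measured value, and `rebuild_hours` is a made-up helper name. The 3 TB case is included because the 2.73TB capacity reported later in the thread would be consistent with 3 TB drives.

```python
TB = 10**12  # decimal terabyte, as disk vendors count it
MB = 10**6   # decimal megabyte


def rebuild_hours(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to copy a whole disk at a constant rate."""
    return capacity_bytes / rate_bytes_per_s / 3600


# Assumed rebuild rate, taken from the post above (a guess, not a measurement).
rate = 50 * MB

two_tb = rebuild_hours(2 * TB, rate)
three_tb = rebuild_hours(3 * TB, rate)

print(f"2 TB at 50 MB/s: {two_tb:.1f} hours")
print(f"3 TB at 50 MB/s: {three_tb:.1f} hours")
```

Either way, a rebuild that ends after 4 hours is well short of a full pass over the disk at that rate.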

  • ngibson
    ngibson Posts: 12
    @Mijzelf No, nothing in the log tab that goes beyond about 24 hours ago, by which time the process was already completed. If there is another log somewhere, I don't know where it would be.

    A scan of the volume, without selecting file repair, returns only this:
    Scan Result
    e2fsck 1.41.14 (22-Dec-2010)
    /dev/md0: clean 208496/183115776 files 496579882/732441566 blocks


    A scan with file repair returns this:

    Scan Result
    e2fsck 1.41.14 (22-Dec-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/md0: 208496/183115776 files (1.5% non-contiguous) 496579883/732441566 blocks
    e2fsck -f -y return value:0

    Status is still "Degraded" so I click "Repair", and get this message at the bottom of the screen:

    Disk capacity must be equal to or greater than the smallest disk in the RAID.

    ???

    Both disks are identical and showing a capacity of 2.73TB, with 1.81TB used.

    Now checking the log tab again, here is all the recent activity:


    What to make of this? Further help is much appreciated!
  • Mijzelf
    Mijzelf Posts: 1,799  Guru Member
    A filesystem scan of a degraded array should show no problems. The filesystem should be healthy.
    ngibson wrote: "Disk capacity must be equal to or greater than the smallest disk in the RAID."
    I have seen that before. Unfortunately I can't remember what caused it. So let's start with the basics. Can you open the Telnet backdoor, log in over telnet, and post the output of:
    cat /proc/partitions
    cat /proc/mdstat
    su
    mdadm --examine /dev/sd[ab]2
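For anyone reading along, here is a minimal sketch of what to look for in the `/proc/mdstat` output. On Linux software RAID, each array's status line carries a pair like `[2/1] [U_]`: the first number is how many members the array expects, the second how many are active, and each `_` marks a missing member. The sample text below is made up for illustration (it is not output from this NAS), and `degraded_arrays` is a hypothetical helper name.

```python
import re

# Made-up sample in the usual /proc/mdstat layout; NOT output from this NAS.
SAMPLE_MDSTAT = """\
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sda2[0]
      2929766264 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Map each md array name to True if fewer members are active than expected."""
    results = {}
    current = None
    for line in mdstat_text.splitlines():
        name = re.match(r"^(md\d+)\s*:", line)
        if name:
            current = name.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected = int(status.group(1))
            active = int(status.group(2))
            results[current] = active < expected
            current = None
    return results


print(degraded_arrays(SAMPLE_MDSTAT))
```

On the NAS itself you would just read the `cat /proc/mdstat` output by eye; the sketch only spells out which part of the status line indicates a degraded array.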

  • ngibson
    ngibson Posts: 12
    Unfortunately I can't open the telnet backdoor using the instructions provided. My NSA320 firmware is V4.70(AFO.3).
    I logged in as admin, then entered the address https://192.168.0.100/zyxel/cgi-bin/remote_help-cgi?type=backdoor
    I got:

    404 Not Found
    The requested URL /zyxel/cgi-bin/remote_help-cgi was not found on this server.

    I tried the reset button method, but got only 1 beep after more than 20 seconds, and telnet still didn't become available.

    Are there any other ways to open the telnet interface?

  • Mijzelf
    Mijzelf Posts: 1,799  Guru Member
    Read the 'Update NSA-300 series Firmware 4.60'. There should be an 'r number' in the URL.

  • ngibson
    ngibson Posts: 12
    After some struggle I may have succeeded. I replaced the 'r' number with the number from the admin web interface page. I no longer get a 404 error, just a white screen.
    I'm afraid I don't understand what to do at this point.