NSA325 RAID1 disk replacement

Hi,
I have an NSA325 NAS in which one of the disks failed. It was in a RAID1 configuration so I went ahead and replaced the failed drive. (The original drives were 1.5TB, the new one 4TB). I actually bought two identical drives so that I could take advantage of the larger capacity. 
After replacing the failed drive, the web UI reported it was recovering or rebuilding (I can't recall the exact wording). The RAID1 rebuild finished successfully, so, continuing the upgrade, I replaced the second 1.5TB drive. When I rebooted the NAS, the RAID1 array appeared down, and only the already-replaced drive was visible. 

I tried rebooting (and even replacing disk1 with the original 1.5TB disk) with no improvement.
I logged in using the telnet backdoor, and checking the md device it appears that disk1 was removed from the array:
~ # mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sun Dec 17 17:26:49 2017
     Raid Level : raid1
     Array Size : 1464620792 (1396.77 GiB 1499.77 GB)
  Used Dev Size : 1464620792 (1396.77 GiB 1499.77 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Mon Sep 27 18:07:39 2021
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : Zyxel:0  (local to host Zyxel)
           UUID : 8f840f8c:54479b1a:ef55e4dc:a0898040
         Events : 397683

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       3       8       18        1      active sync   /dev/sdb2

I can manually mount /dev/md0 and my files are there, but I cannot get the NAS to add /dev/sda back to the RAID. 
I tried creating the same partition layout as on /dev/sdb, but it did not help. 
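For reference, a common way to replicate an MBR layout between disks is sfdisk's dump/restore. This is only a sketch: sfdisk is not part of the NSA325 firmware, so it is assumed to run on a separate Linux box, the device names are the ones from this thread, and the restore step is destructive. With DRY_RUN=1 the commands are merely printed.

```shell
# Sketch: copy sdb's MBR partition table onto sda with sfdisk (assumed to be
# available on a Linux box; device names from this thread).
# DRY_RUN=1 only prints the commands; set to 0 only after double-checking devices.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "would run: $*"; else "$@"; fi
}
run sh -c 'sfdisk -d /dev/sdb > layout.dump'   # dump the source layout
run sh -c 'sfdisk /dev/sda < layout.dump'      # replay it onto sda (destroys sda!)
```

The dump file is plain text, so it can be inspected (and the sizes sanity-checked) before replaying it.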
~ # fdisk -l

Disk /dev/sda: 4000.7 GB, 4000787030016 bytes
255 heads, 63 sectors/track, 486401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x78562c44

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1          64      514048+  83  Linux
/dev/sda2              65      267349  2146966762+  20  Unknown

Disk /dev/sdb: 4000.7 GB, 4000787030016 bytes
255 heads, 63 sectors/track, 486401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x68a46302

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1          64      514048+  83  Linux
/dev/sdb2              65      267349  2146966762+  20  Unknown

Originally sda didn't have a partition table (sealed disk). 
Could you suggest what to do so that disk1 would also join the array? 
Of course, I could use mdadm to add /dev/sda2 to the RAID but I have a feeling that I would miss something important. 
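For what it's worth, the manual route mentioned here would be just two commands. The sketch below only prints them (dry-run); the device names are the ones from this thread, so double-check them before running anything for real.

```shell
# Sketch: manually re-add sda2 to the degraded md0 array (names from this thread).
# With DRY_RUN=1 the commands are only printed; nothing touches the disks.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "would run: $*"; else "$@"; fi
}
run mdadm /dev/md0 --add /dev/sda2   # re-add the missing member
run cat /proc/mdstat                 # then watch the rebuild progress
```

Note that mdadm itself is happy to accept a larger member partition; the concern raised in the thread is about what the NAS firmware expects, not about md.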

Thanks in advance,
Seniorsamu

Accepted Solution

  • Mijzelf
    Mijzelf Posts: 2,001  Guru Member
    Answer ✓
    seniorsamu said:
    Thanks for the suggestion. The 4TB disk has all the data on it (in an ext4 filesystem), so I can use the currently unused disk for this. What I do not see now is whether there is an option to convert a single disk volume to a RAID1 volume.
    And what if I convert the partition table of the 4TB disk to GPT? Either on a linux box or from within the Zyxel NAS (if there are tools for that; gdisk is not available, I checked).
    There is an option to convert a single disk to RAID1. Yet I don't know what it looks like, and I don't know if you have to choose a special type. I know it exists because I have several times answered the question of people who had a single disk which was full, who added another disk, which resulted in a RAID1 array, which was still full, of course.

    I don't think conversion will work. I don't know if you can read scripts, but the script /bin/storage_gen_mntfw.sh decides whether it is a 'zyxel disk', and it seems a GPT disk has a different-sized first partition than an MBR one. But the script is hard to read.



All Replies

  • Mijzelf
    Mijzelf Posts: 2,001  Guru Member
    There is some script which has to decide if a disk is a 'zyxel disk' or just 'any disk'. It's quite complicated: there has to be a first (primary) partition with some fixed start and size, and a second partition with a fixed start, and I don't remember if it has a size restriction too.
    I *think* this script choked on your 4TB disk with its 1.5TB 2nd partition, or some similar problem.
    Anyway, if you manage to repair the array, you will find that you can't resize the data volume to 4TB; 2TB is the max. The reason is that the partition table of your 1.5TB disk was cloned, and that is an MBR one, which doesn't support partitions with a start or end beyond 2TiB.
    So the most reasonable action (in my opinion) is to create a new volume on one of the 4TB disks (which will get a GPT table), then plug in the 1.5TB disk, copy the data over, and finally add the 2nd 4TB disk.
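The 2TiB figure isn't arbitrary: MBR partition entries store a partition's start and size as 32-bit sector counts, so with 512-byte sectors the format tops out at exactly 2TiB. A quick shell check:

```shell
# MBR partition entries hold 32-bit LBA start/size fields, counted in
# 512-byte sectors, which gives the 2 TiB ceiling mentioned above.
max_sectors=4294967296                   # 2^32
max_bytes=$(( max_sectors * 512 ))
echo "MBR addressable limit: $max_bytes bytes"   # 2199023255552 = 2 TiB
```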
  • seniorsamu
    seniorsamu Posts: 11
    edited September 2021
    Thanks for the suggestion. The 4TB disk has all the data on it (in an ext4 filesystem), so I can use the currently unused disk for this. What I do not see now is whether there is an option to convert a single disk volume to a RAID1 volume.
    And what if I convert the partition table of the 4TB disk to GPT? Either on a linux box or from within the Zyxel NAS (if there are tools for that; gdisk is not available, I checked).
  • seniorsamu
    Mijzelf said:
    Thanks for the suggestion. The 4TB disk has all the data on it (in an ext4 filesystem), so I can use the currently unused disk for this. What I do not see now is whether there is an option to convert a single disk volume to a RAID1 volume.
    And what if I convert the partition table of the 4TB disk to GPT? Either on a linux box or from within the Zyxel NAS (if there are tools for that; gdisk is not available, I checked).
    There is an option to convert a single disk to RAID1. Yet I don't know what it looks like, and I don't know if you have to choose a special type. I know it exists because I have several times answered the question of people who had a single disk which was full, who added another disk, which resulted in a RAID1 array, which was still full, of course.

    I don't think conversion will work. I don't know if you can read scripts, but the script /bin/storage_gen_mntfw.sh decides whether it is a 'zyxel disk', and it seems a GPT disk has a different-sized first partition than an MBR one. But the script is hard to read.


    Ok. I went through the script you mentioned and found a couple of ambiguities. 
    - The script doesn't explicitly distinguish between MBR and GPT partition tables. Rather, it checks the size of the disk and of the partitions. 
    - When calculating the size of the first partition it uses sdx1Size=`cat /sys/block/sdb/sdb1/size`, which in my case is 1028097. The partition is 514048 blocks according to fdisk, so at first the numbers don't seem to add up; but /sys/block/sdb/sdb1/size gives the size in 512-byte sectors while fdisk counts 1 KiB blocks, so they actually match.
    - When the script checks the partition size, it compares the above result (1028097) with the value of fwPart1SizUseParted (997376). I don't really know where this number comes from: a comment in the script refers to it as 512MB from /sys/block/sdx/sdx1, but 997376 512-byte sectors is not 512MB. 
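To make the unit mix-up concrete, here is the arithmetic with the numbers from this thread (fdisk's Blocks column is in 1 KiB units, /sys/block/.../size is in 512-byte sectors):

```shell
# Redo the size arithmetic with the values quoted in this thread.
fdisk_blocks=514048                       # sdb1 per fdisk, in 1 KiB blocks
sys_sectors=$(( fdisk_blocks * 2 + 1 ))   # fdisk's trailing '+' = one extra sector
echo "sdb1 in 512-byte sectors: $sys_sectors"              # 1028097, matches /sys

fwPart1SizUseParted=997376                # the script's magic constant, in sectors
echo "constant in MB: $(( fwPart1SizUseParted / 2048 ))"   # 487 MB, not 512
echo "512MB in sectors: $(( 512 * 2048 ))"                 # 1048576
```

So the constant corresponds to a 487MB first partition, which is why a genuinely 512MB-sized sdb1 fails the comparison.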
    So what is happening when the script runs:
    1. It finds that my sda is not partitioned correctly (or at all). That's ok. 
    2. It locates sdb and checks the partitions
        a) sdb1 size is 1028097
        b) checks if it matches 512MB, aka 997376, which it doesn't. 
    3. Failing the partition checks it does not mount.
    I don't understand a couple of things:
    - When I replaced the failed drive with this 4TB one, presumably the firmware created the same partition table on the new drive as I had on the old one (I'll check it on the old drive). If so, how did that pass the partition size check? 
    - Partition 1 is referred to as the swap partition. Why does it even matter whether its size is 500, 512 or 520MB? 
    - Near the end of the script there is a fragment that sizes the second partition as full disk size - 512MB. Ok, this corresponds to the earlier check of the size of partition 1. However, when actually doing the calculation, the script compares the leftover with 520MB:
        sda=`cat /sys/block/${sdxnodev}/size`                                    
        #sector to MB                                                   
        let "sdamb=$sda/2048"                                   
        sda2=`cat /sys/block/${sdxnodev}/${sdxnodev}2/size`   
        sdaleft=$(( sda - sda2 ))                             
        #520MB=1064960                                     
        if [ ${sdaleft} -gt 1064960 ]; then                  
                echo "${bsname}: no internal volume available"
                exit 1                                                            
        fi    
    and if more than 520MB is left for the first partition (i.e. the second partition stops too far short of the end of the disk), the script fails. 
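Plugging this thread's 4TB fdisk numbers into that fragment shows why the check trips: partition 2, cloned from the 1.5TB layout and MBR-capped, stops far short of the end of the disk, leaving vastly more than 520MB over.

```shell
# Replay the script's leftover check with the 4TB disk's fdisk numbers above.
sda=$(( 4000787030016 / 512 ))     # whole disk in 512-byte sectors: 7814037168
sda2=$(( 2146966762 * 2 ))         # sda2: fdisk 1 KiB blocks -> sectors
sdaleft=$(( sda - sda2 ))
echo "leftover sectors: $sdaleft"  # roughly 1.8TB, vastly more than 520MB
if [ "$sdaleft" -gt 1064960 ]; then   # 520MB = 1064960 sectors
    echo "no internal volume available"
fi
```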

    Now about how to proceed:
    1. Check the partition sizes on the original 1.5TB disk, and check whether the data is still available there.
    2. Create a JBOD volume on one of the 4TB disks (currently sda) without sdb present in the system. 
    3. Manually mount the data partition from sdb (actually md0, as part of the recovery created the md on it) and copy all files from there to sda2.
    4. Convert the JBOD to RAID1 using sdb. (I might have to delete the partition table on sdb for that.)
    Does that sound like a good plan?
  • Mijzelf
    Mijzelf Posts: 2,001  Guru Member
    Sounds like a plan. As far as I can see your data is never in danger, as you still have the 1.5TB disk.

    And yes, the script is strange. It's not designed, it's grown, I think. And edited by several persons who didn't know about the thoughts of their predecessors. 
    Why does it even matter whether the size of the partition is 500, 512 or 520MB?
    It doesn't. But ZyXEL decided to use it as a tag to recognize a ZyXEL-partitioned disk. There must be better ways to do so, but this is how it is.
  • Ok, I finally reached the point where I had time to try this. I created the JBOD on one of the new disks and copied all files from the old one. Then I inserted the second 4TB disk (after deleting the leftover partitions) and tried to find the option to convert it from JBOD to RAID1. Unfortunately there is no such option. I can create a new JBOD on the second disk and expand the existing JBOD by adding the capacity of the second drive, but there is no option to convert to RAID1. 
    Any ideas?

  • [Screenshot: the Expand option is not available.]

    [Screenshot: the "create internal volume" options.]
  • Mijzelf
    Mijzelf Posts: 2,001  Guru Member
    I'm sorry. What were the choices when you created the volume on a single disk?
  • I have good news. I tried to find information about migrating JBOD to RAID1 on this NAS and found that it should be possible. I had guessed that the option for that would be expand (which was not available for the JBOD volume), but I was wrong. I tried migrate, and it actually is converting the JBOD to RAID1. I could follow in the command line that the RAID level had changed from linear to raid1, and now an md0_resync process is running (I guess it takes some time to sync all those gigabytes between the drives).
    Right now the web UI is inaccessible and I cannot get mdadm --detail /dev/md0 to respond (or I haven't been patient enough), but I will start worrying if it does not finish by morning. 

    Thanks for all the help. I'll let you know how it worked out. 
  • I was worried in the morning, for the md0_resync operation still hadn't finished after running for more than 16 hours. I tried and failed to log in using the web UI. Telnet was ok, but top showed no activity. I tried and failed to reboot the NAS, and realized that I wouldn't be able to restart it normally. Long-pressing the power button helped. Now I can see that the RAID1 is configured and recovering. It is progressing slowly, but given its current state, I'd say it should finish this afternoon.  
