NAS326 problem


All Replies

  • Janikovo
    Janikovo Posts: 24  Freshman Member
    edited June 2020
    @Mijzelf

    First I disassembled the NAS, removed both disks, and checked them one by one to see whether they were working (platters rotating). Both seem to be working as they should.

    Then I tried again
    cat /proc/mdstat

     /proc/mdstat shows the status of the 'multi device' devices (raid arrays) - well, there is a change...now it shows 3 of them

    cat /proc/partitions



    cat /proc/mounts


    mdadm --examine /dev/sda3


    a big change compared to the result of this command two days ago

    mdadm --examine /dev/sdb3






    Well, now, could anybody please explain if my NAS326 is OK ?


    Because this one here md2: seems to be inactive...

    Should I somehow activate it ? and how ?

    Thanks in advance
  • Janikovo
    Janikovo Posts: 24  Freshman Member
    Well,

    I am really confused, so I did try this



    and then mdadm --examine /dev/sda1


    mdadm --examine /dev/sda2


    mdadm --examine /dev/sda3



    What am I missing ?
    I am not an expert, but sda1 is alive as Homesan:0, sda2 is alive as Homesan:1 and sda3 is alive as Homesan:2.

    But I have only two HDDs there.

    Could anybody help me understand this and try to rebuild the SAN with all the data that seems to be on both drives?

    Thanks
  • Mijzelf
    Mijzelf Posts: 2,828  Guru Member
    250 Answers 2500 Comments Friend Collector Seventh Anniversary
    What am I missing ?
    I am not an expert, but sda1 is alive as Homesan:0, sda2 is alive as Homesan:1 and sda3 is alive as Homesan:2.
    A firmware 5 ZyXEL NAS creates two raid1 arrays for its own use: one for the system and one for swap. They are raid1 so that the NAS won't go down when a disk fails. So the two small partitions are for internal use, and the big partition is for data.

    Anyway, it seems the raid headers are suddenly readable after reinserting the disks. Did you exchange them? In your first post sdb3 was 'Active device 1', and now it's 'Active device 0'. That data is written in the raid header, while the name of the disk is assigned by the disk slot. If you didn't exchange them, the detection order has changed, which might be caused by a failing sata port, or something like that.

    If you did exchange them, it's okay. The array manager can handle that. It's not clear to me why md2 is inactive. What does this say?

    mdadm --detail /dev/md2

    BTW, you can copy text from PuTTY just by selecting it.
  • Janikovo
    Janikovo Posts: 24  Freshman Member
    edited June 2020
    This:

    ~ # mdadm --detail /dev/md2
    mdadm: md device /dev/md2 does not appear to be active.

    ...and...

    ~ # cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
    md2 : inactive sda3[1](S)
          1949514744 blocks super 1.2

    md1 : active raid1 sdb2[2](F) sda2[1]
          1998784 blocks super 1.2 [2/1] [_U]

    md0 : active raid1 sdb1[2](F) sda1[1]
          1997760 blocks super 1.2 [2/1] [_U]



    And system says again:



    And what I found is also this:


    So what now please ?
    e2fsck? Does running this command require the disks to be unmounted?

    In fact, how can I know which one is Disk2?
    Is it the one placed in slot number 2?



    And thanks, Mijzelf, for taking care of this. Many, many thanks.
  • Janikovo
    Janikovo Posts: 24  Freshman Member
    @Mijzelf

    Well, now I am totally lost...
    found this in dmesg:
    ...
    md/raid1:md1: Disk failure on sdb2, disabling device.
    md/raid1:md1: Operation continuing on 1 devices.
    ...

    and then this with fdisk -l:

    Disk /dev/loop0: 143 MiB, 149946368 bytes, 292864 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock0: 2 MiB, 2097152 bytes, 4096 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock1: 2 MiB, 2097152 bytes, 4096 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock2: 10 MiB, 10485760 bytes, 20480 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock3: 15 MiB, 15728640 bytes, 30720 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock4: 106 MiB, 111149056 bytes, 217088 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock5: 15 MiB, 15728640 bytes, 30720 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/mtdblock6: 106 MiB, 111149056 bytes, 217088 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disklabel type: gpt
    Disk identifier: BA8CDAD3-A2FC-4169-84F4-2DF43580FC17

    Device       Start        End    Sectors  Size Type
    /dev/sda1     2048    3999743    3997696  1.9G Linux RAID
    /dev/sda2  3999744    7999487    3999744  1.9G Linux RAID
    /dev/sda3  7999488 3907028991 3899029504  1.8T Linux RAID

    Disk /dev/md0: 1.9 GiB, 2045706240 bytes, 3995520 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disk /dev/md1: 1.9 GiB, 2046754816 bytes, 3997568 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes

    tried this:
    ~ # e2fsck /dev/sdb1
    e2fsck 1.42.12 (29-Aug-2014)
    /dev/sdb1 is in use.
    e2fsck: Cannot continue, aborting.


    ~ # e2fsck /dev/sdb2
    e2fsck 1.42.12 (29-Aug-2014)
    /dev/sdb2 is in use.
    e2fsck: Cannot continue, aborting.


    ~ # e2fsck /dev/sda3
    e2fsck 1.42.12 (29-Aug-2014)
    /dev/sda3 is in use.
    e2fsck: Cannot continue, aborting.



    and I cannot see how to unmount them or why they are "in use".

    Any advice pls ?


  • Mijzelf
    Mijzelf Posts: 2,828  Guru Member
    The disk which is now sdb is dying, and seems to be intermittently readable, as first the raid headers were gone, and later they were readable.

    You should not try to repair the filesystem on this disk, as each write can add further damage. The 'classic' way to solve this kind of problem is to create a byte-by-byte copy of that disk, and to do the filesystem repair on the copy.

    In Linux there is a tool for that, ddrescue. The syntax is

    ddrescue /dev/sda /dev/sdb [ /path/to/logfile ]

    where sda is the source disk, sdb is the target, and logfile is a file in which ddrescue keeps track of the sectors it has copied. Using that logfile you can do several passes, where each pass tries to fill in the sectors which failed last time. Without a logfile ddrescue cannot continue from an earlier run; it just works through the whole disk again, retrying where reads fail, and a sector that ultimately cannot be read is simply not written on the target. On a new, blank target disk that normally means it reads back as zeros.
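
    The multi-pass use of the logfile could be sketched like this. It is only a dry run that prints the commands instead of executing them, so the device names can be checked first. The function name plan_rescue and the names /dev/sdX, /dev/sdY and rescue.map are placeholders, and the flags assume GNU ddrescue (-f to allow writing to a device, -n to skip the slow scraping phase, -d for direct disc access, -r3 for three retry passes).

```shell
#!/bin/sh
# Dry-run sketch: prints the ddrescue commands for a two-pass rescue.
# Nothing is executed; copy a line only after checking source and target.
# plan_rescue is a hypothetical helper name, not part of ddrescue.
plan_rescue() {
    src=$1 dst=$2 map=$3
    # Pass 1: copy everything that reads easily, skip the difficult areas.
    echo "ddrescue -f -n $src $dst $map"
    # Pass 2: return to the areas the logfile marks as unread or bad,
    # reading with direct access and retrying up to three times.
    echo "ddrescue -f -d -r3 $src $dst $map"
}
```

    Calling plan_rescue /dev/sdX /dev/sdY rescue.map prints both lines. Both passes must use the same logfile, otherwise the second pass starts from scratch.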

    Maybe there are tools like this for other OSes, but I'm not aware of any.

    So, if your data is valuable, get a new disk of at least the same size, connect it, together with the failing disk to a Linux box, and run ddrescue. Make sure and doublecheck that you have the target disk right. ddrescue will happily overwrite everything you offer as target. If that is your failing disk, your data is lost forever.
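
    A small guard along these lines can help with the double check. It is a sketch, not a complete safety net: check_target is a made-up helper name, and the sizes are passed in by hand so that nothing here touches a disk.

```shell
#!/bin/sh
# check_target SRC DST SRC_BYTES DST_BYTES
# Refuses an obviously wrong source/target pair before anything is written.
# Hypothetical helper; sizes are given in bytes.
check_target() {
    if [ "$1" = "$2" ]; then
        echo "refuse: source and target are the same device"
        return 1
    fi
    if [ "$4" -lt "$3" ]; then
        echo "refuse: target is smaller than source"
        return 1
    fi
    echo "ok: $1 -> $2"
}
```

    On a Linux box the byte sizes can be read with blockdev --getsize64 /dev/sdX (as root), and lsblk -o NAME,SIZE,MODEL,SERIAL lists model and serial number, which can be compared with the label on the physical disk.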

    ddrescue is in most cases not installed by default. So you'll have to install it using the package manager of your Linux flavour.

    The copying can take days, as ddrescue will try hard to read each sector. So a lot of unreadable sectors will slow down the process. On a healthy disk you can expect something like 100MB/sec
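
    As a rough feeling for the best case: the arithmetic for a disk of the size fdisk reported for /dev/sda, at a constant 100MB/sec, works out as below. The helper name copy_time is made up for this example, and a disk with bad sectors will take far longer.

```shell
#!/bin/sh
# copy_time BYTES MB_PER_SEC
# Best-case copy time at a constant rate (1 MB = 1000000 bytes).
copy_time() {
    secs=$(( $1 / ($2 * 1000000) ))
    echo "$(( secs / 3600 ))h$(( secs % 3600 / 60 ))m"
}

copy_time 2000398934016 100   # prints 5h33m
```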

    In fact, how can I know which one is Disk2?
    Is it the one placed in slot number 2?
    I think so. But it's easy to know. Just pull that disk, boot the NAS and run

    mdadm --examine /dev/sda3

    If it shows 'Device Role : Active device 1', then the healthy disk is still in the NAS.
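
    To pick just that line out of the examine output, a filter like this could be used. It is a sketch that assumes the 'Device Role : ...' line format quoted above; device_role is a made-up name.

```shell
#!/bin/sh
# device_role: reads `mdadm --examine` output on stdin and prints the
# value of the "Device Role" line, for example "Active device 1".
device_role() {
    sed -n 's/^[[:space:]]*Device Role[[:space:]]*:[[:space:]]*//p'
}
```

    Usage on the NAS would be: mdadm --examine /dev/sda3 | device_role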

    and I cannot see how to unmount them or why they are "in use".
    They are in use because they are part of a raid array. If you want to check them, you'll have to check the md (multi-disk) device instead. That cannot be done for sdb2, as it's swap and has no filesystem, nor for sdb3, as its array is inactive due to disk errors.
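
    To see which md device a partition belongs to, /proc/mdstat can be searched. A minimal sketch, assuming the mdstat layout posted earlier in this thread (md_of is a made-up helper name):

```shell
#!/bin/sh
# md_of PART: reads /proc/mdstat-style text on stdin and prints the md
# array that lists PART (for example sdb1) as a member.
md_of() {
    awk -v part="$1" '
        $1 ~ /^md[0-9]+$/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                name = $i
                sub(/\[.*/, "", name)   # strip suffixes like "[2](F)"
                if (name == part) print $1
            }
        }'
}
```

    For example, md_of sdb1 < /proc/mdstat prints md0 for the output above, which is why e2fsck reports /dev/sdb1 as in use.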
  • Janikovo
    Janikovo Posts: 24  Freshman Member
    @Mijzelf

    Thank you, I have one more question.
    There were two HDDs in RAID1, so I understand that Disk1 has the same data as Disk2.
    They are copies: RAID1 consists of an exact copy (or mirror) of a set of data on two disks.

    Should I understand that all data are still alive on the healthy HDD?

    You mentioned that I can lose all data. My question is: when I buy a second healthy disk, isn't the NAS326 capable of repairing itself?

    The second question: why did I lose "the volume"? As I understood it, when disk1 is failing, the NAS software disconnects it and the NAS keeps working on the one healthy disk with all the data.

    I was planning to buy a new 2TB disk and simply put it into the NAS326, create a "Disk group", and the volume would repair itself.

    Or use the mdadm --assemble command.

    As I don't have any Linux PC, I have to buy something like this:

    https://www.alza.de/premiumcord-usb-2-0-konverter-gt-40-44-ide-oder-sata-2-5-und-3-5-hdd-netzteil-d66597.htm?o=7

    or
    https://www.alza.de/premiumcord-converter-usb-3-0-gt-sata-2-5-und-3-5-gerate-netzteil-d191813.htm?o=1

    or 

    https://www.alza.de/premiumcord-usb-3-0-sata-iii-d251654.htm?o=2

     
    Then try some "livecd" distribution. I don't know which is best for this.
    SystemRescueCD ?, Arch Linux ? or Kali Linux ?

    And then I can try those steps you mentioned - run ddrescue.

    But again: if I buy a new Western Digital Blue 2TB and put it into the second slot of the NAS326, next to the healthy HDD, isn't the NAS326 able to repair itself automatically?

    Thanks,
    Jan


  • Mijzelf
    Mijzelf Posts: 2,828  Guru Member
    There were two HDDs in RAID1, so I understand that Disk1 has the same data as Disk2.
    They are copies: RAID1 consists of an exact copy (or mirror) of a set of data on two disks.
    You didn't have a raid1 array. The examine output shows the raid level is 'linear'. That level doesn't have redundancy. Your data volume had the size of the 2 disks totaled, instead of the size of a single disk.
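
    The size difference between the two levels can be illustrated with the member size from /proc/mdstat (1949514744 blocks for sda3). This is only a sketch: usable_blocks is a made-up name, and it assumes the second member has the same size, which isn't shown in this thread.

```shell
#!/bin/sh
# usable_blocks LEVEL SIZE_A SIZE_B
# Approximate usable size of a two-member array, ignoring metadata overhead.
usable_blocks() {
    case $1 in
        linear)
            # Members are concatenated: sizes add up, no redundancy.
            echo $(( $2 + $3 ))
            ;;
        raid1)
            # Members mirror each other: limited by the smaller member.
            if [ "$2" -lt "$3" ]; then echo "$2"; else echo "$3"; fi
            ;;
        *)
            echo "unsupported level: $1" >&2
            return 1
            ;;
    esac
}

usable_blocks linear 1949514744 1949514744   # prints 3899029488
usable_blocks raid1  1949514744 1949514744   # prints 1949514744
```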

    If you had had a raid1 array, then yes, the NAS would have repaired it when you simply inserted a new, blank disk. Although that's not actually repairing, but re-adding redundancy. A degraded raid1 array isn't 'broken', it has just lost its redundancy. But that's nitpicking.

    As I don't have any Linux PC, I have to buy something like this
    That has nothing to do with Linux, but rather with not having a PC with available sata ports, I think? Otherwise I don't know what you mean.
    And yes, a USB-sata converter is a suitable way to connect the disk. But I don't think a 3.5" disk can be bus-powered by USB, although I'm not sure about USB3. If you don't power the disk externally, check the specifications of the USB3 port you use. It should be able to supply at least 15W, or something like that.
    SystemRescueCD ?, Arch Linux ? or Kali Linux ?
    Not Kali. That is a distro for vulnerability testing, and I wouldn't know if it has ddrescue in its repo. Arch is bleeding edge. It will do, but it wouldn't be my choice when I need reliable functioning out of the box. SystemRescueCD sounds good. According to its website it has ddrescue out of the box, which is actually the only tool you need.
  • Janikovo
    Janikovo Posts: 24  Freshman Member
    edited June 2020
    @Mijzelf and all others,

    the SAN was working for one hour unexpectedly. I tried to back up, but it stopped at 5% of the data. Then it failed again: "No Volume"

    What I want to ask, how can I run e2fsck on that faulty disk ?

    sd(a,b) 1,2,3 : not mounted

    running e2fsck always says:
    /dev/sd(a,b)1,2,3 is in use.
    e2fsck: Cannot continue, aborting.

    Thanks,
    Jan




  • Mijzelf
    Mijzelf Posts: 2,828  Guru Member
    the SAN was working for one hour unexpectedly.

    That was after a reboot, I guess?

    What I want to ask, how can I run e2fsck on that faulty disk ?
    The filesystem is not on the raw partitions. The partition contains a raid container, and you'll have to assemble the array to be able to run fsck on it.
    The array can be assembled (as you saw, the SAN was working for one hour unexpectedly), but it's not stable. On the first I/O error the array becomes inactive. So you can't feasibly run fsck on a non-redundant md array which contains a disk with hardware errors.
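
    Whether the array is currently assembled can be read from /proc/mdstat before trying anything on the md device. A minimal sketch, with md_state as a made-up helper name and the mdstat layout from earlier in this thread:

```shell
#!/bin/sh
# md_state NAME: prints the state word ("active" or "inactive") that
# /proc/mdstat-style text on stdin reports for array NAME.
md_state() {
    awk -v md="$1" '$1 == md && $2 == ":" { print $3 }'
}
```

    With the output posted above, md_state md2 < /proc/mdstat prints inactive. Only when it prints active does a check of the md device make sense at all, and then a read-only one such as e2fsck -n /dev/md2 (assuming an ext filesystem sits directly on md2, which this thread doesn't confirm), preferably on a copy of the disk rather than on the failing disk itself.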

