RAID1 degraded on NSA325 v2 after file deletion

All Replies

  • siraph
    siraph Posts: 14  Freshman Member
    First Comment

    Many thanks! I'll give a try as soon as I've completed a data backup.

  • siraph
    edited December 2024

    So, here we are again :)

    After completing a backup, first tried:

    root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2
    mdadm: set device faulty failed for /dev/sdb2:  No such device
    

    So I checked:

    root@oniNas:/home/shares# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] 
    md0 : active raid1 sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
    

    So I added sdb2 back:

    root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --add /dev/sdb2
    mdadm: added /dev/sdb2
    

    Checking md0:

    root@oniNas:/home/shares# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] 
    md0 : active raid1 sdb2[2] sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
          [>....................]  recovery =  0.0% (1364352/1952996792) finish=214.5min speed=151594K/sec
    

    But recovery, again, failed to go past 0.4%…

    Now I have this:

    root@oniNas:/home/shares# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] 
    md0 : active raid1 sdb2[2](F) sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
    unused devices: <none>
    

    Tried a couple of times, same result. When I try a repair/recovery from the web interface, sdb2 is again removed from the RAID.
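
    To see at a glance which member the kernel kicked out, the "(F)" marker in /proc/mdstat can be picked out with a small helper. This is only a sketch: failed_members is a made-up name, and it assumes the mdstat layout shown above.

```shell
# Print the members of any md array that the kernel has flagged as failed.
# Reads /proc/mdstat-style text on stdin; failed members carry an "(F)" suffix.
# NOTE: failed_members is a hypothetical helper name, not an mdadm feature.
failed_members() {
    awk '/^md[0-9]+ :/ {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                sub(/\[.*/, "", $i)   # strip the "[2](F)" part, keep "sdb2"
                print $i
            }
    }'
}
```

    On the NAS it would be run as: failed_members < /proc/mdstat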

    Reading the post you linked, I tried --query:

    root@oniNas:/home/shares# mdadm --query --detail /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Wed May 13 13:10:05 2015
         Raid Level : raid1
         Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
      Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
       Raid Devices : 2
      Total Devices : 2
        Persistence : Superblock is persistent

        Update Time : Fri Dec 27 11:14:30 2024
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0

               Name : oniNas:0  (local to host oniNas)
               UUID : adcf3210:50eaa529:cec15b96:3edc8d16
             Events : 8258202

        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           1       0        0        1      removed
           2       8       18        -      faulty spare   /dev/sdb2
    

    What should I do now?

    Thanks :)

  • Mijzelf
    Mijzelf Posts: 2,811  Guru Member
    250 Answers 2500 Comments Friend Collector Seventh Anniversary

    What should I do now?

    Read the kernel log after the syncing is aborted. I suppose some kind of I/O error occurs.

  • siraph
    edited December 2024

    Tried:

    su
    cp /proc/kmsg /tmp/kmsg2.txt
    

    After a fresh reboot. I'm attaching the output to this message.

    It seems the same as before.

  • siraph
    edited December 2024

    Found this thread, can it be of some help?

    Now, this happened:

    root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --add /dev/sdb2
    mdadm: cannot find /dev/sdb2: No such device or address
    

    I still have this:

    root@oniNas:/home/shares# mdadm --query --detail /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Wed May 13 13:10:05 2015
         Raid Level : raid1
         Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
      Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
       Raid Devices : 2
      Total Devices : 1
        Persistence : Superblock is persistent
    
        Update Time : Sat Dec 28 16:44:19 2024
              State : clean, degraded
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 0
      Spare Devices : 0
    
               Name : oniNas:0  (local to host oniNas)
               UUID : adcf3210:50eaa529:cec15b96:3edc8d16
             Events : 8260622
    
        Number   Major   Minor   RaidDevice State
           0       8        2        0      active sync   /dev/sda2
           1       0        0        1      removed
    

    I'm curious why number 1 is "removed". Also, when I was able to add sdb2, it was added as number 2, and not as number 1.

    Should I replace the sdb2 disk? Or re-initialize it in some way?

    Many thanks

    PS: some more details.

    root@oniNas:/home/shares# fdisk -l
    
    Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
    255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x6178ba06
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1              63     1028159      514048+   8  AIX
    /dev/sda2         1028160  3907024064  1952997952+  20  Unknown
    
    Disk /dev/md0: 1999.9 GB, 1999868715008 bytes
    2 heads, 4 sectors/track, 488249198 cylinders, total 3905993584 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    Disk /dev/md0 doesn't contain a valid partition table

    The web interface is not responding anymore; it gives "connection refused". But I'm still able to connect via ssh.

  • Mijzelf

    Read the kernel log after the syncing is aborted.

    After a fresh reboot. I'm attaching the output to this message.

    You misunderstood me. The idea was to initiate a sync, wait until it was aborted, and then read the kernel log, without rebooting. The kernel log is volatile. You can only read the last 16kB or something like that, and it doesn't survive a reboot.
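
    The "wait until it was aborted" step could be scripted roughly like this. It is a sketch only: wait_for_sync_end is a made-up name, and it assumes a running sync shows up in /proc/mdstat as a "recovery =" or "resync =" progress line, as in the output earlier in this thread.

```shell
# Block until the mdstat file no longer shows a recovery or resync in progress.
# $1: path to the mdstat file (default /proc/mdstat)
# $2: poll interval in seconds (default 10)
# NOTE: wait_for_sync_end is a hypothetical helper name.
wait_for_sync_end() {
    mdstat=${1:-/proc/mdstat}
    interval=${2:-10}
    while grep -Eq '(recovery|resync) *=' "$mdstat"; do
        sleep "$interval"
    done
}
```

    After adding sdb2 back, run wait_for_sync_end, and as soon as it returns (without rebooting) save the kernel log with the cp /proc/kmsg command, using a fresh filename.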

    But meanwhile there is another problem.

    mdadm: cannot find /dev/sdb2: No such device or address
    

    No matter if it's part of a raid array or not, mdadm should be able to find sdb2. And

    # fdisk -l
    

    should show sdb and its partitions. So it seems your 2nd disk is gone. It died, or maybe the SATA bus died. Again, the kernel log might tell, especially if you can still dump it from the moment the disk vanished.
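
    A quick way to check whether the kernel still sees the disk at all, without going through mdadm or fdisk, is to look it up in /proc/partitions. A minimal sketch; disk_visible is a made-up name, and it relies on the device name being the fourth column of that file.

```shell
# Succeed (exit 0) if the kernel currently lists the given block device.
# $1: device name without /dev/ (for example sdb)
# $2: partitions file to read (default /proc/partitions)
# NOTE: disk_visible is a hypothetical helper name.
disk_visible() {
    awk -v dev="$1" '$4 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/partitions}"
}
```

    For example: disk_visible sdb && echo "sdb present" || echo "sdb missing"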

  • siraph

    Mijzelf many thanks for your replies.

    I think I'll get another disk…

    In the meantime, how can I check kernel logs? If you remember, "dmesg" doesn't seem to work for my installation, all I know is what you wrote before:

    cp /proc/kmsg /tmp/kmsg2.txt
    

  • Mijzelf

    You can check kernel logs with 'cp /proc/kmsg /tmp/kmsg2.txt'. That erases the log, but new lines can be added, and read with the same command. (Assuming you copied away /tmp/kmsg2.txt, or use another filename). dmesg should work, but for some strange reason it doesn't work for you.
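
    Once a dump has been saved, the interesting lines can be filtered out instead of reading the whole file. A sketch, assuming the usual "<N>" priority prefix at the start of each /proc/kmsg line (3 and lower mean error or worse); kmsg_for_disk is a made-up name.

```shell
# Print kernel log lines that are errors (priority <0>..<3>), that mention
# the given disk, or that report a device going offline.
# $1: disk name (for example sdb); remaining args: dump files (default: stdin)
# NOTE: kmsg_for_disk is a hypothetical helper name.
kmsg_for_disk() {
    dev=$1
    shift
    grep -E "^<[0-3]>|$dev|offline" "$@"
}
```

    For example: kmsg_for_disk sdb /tmp/kmsg2.txt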

    I think I'll get another disk…

    I recommend waiting until you know what the problem is. That NAS is at least 10 years old, I think. The SATA bus could be failing. Or the power supply. Can you test the 'defective' disk in another system? Do you have another disk to put in that slot, just to see if it's recognized?

  • siraph
    edited December 2024

    Ok, tried another cp /proc/kmsg /tmp/kmsg4.txt just after a power on, attaching output here.

    And I can see an I/O error on sdb:

    <6>sd 1:0:0:0: Device offlined - not ready after error recovery
    <6>sd 1:0:0:0: [sdb] Unhandled error code
    <6>sd 1:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
    <3>end_request: I/O error, dev sdb, sector 3907029160
    <3>Buffer I/O error on device sdb, logical block 488378645
    <3>sd 1:0:0:0: rejecting I/O to offline device
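
    The two numbers in that log can be related with a bit of arithmetic. This is a sketch under one loud assumption: sdb does not appear in the fdisk output, so the total sector count below is the one reported for sda, on the guess that both disks are the same size. sector_math is a made-up name.

```shell
# Relate the failing sector, the "logical block" and the disk size.
# NOTE: sector_math is a hypothetical helper name.
sector_math() {
    awk 'BEGIN {
        sector = 3907029160   # failing 512-byte sector reported for sdb
        block  = 488378645    # "logical block" from the Buffer I/O error line
        total  = 3907029168   # total sectors of sda; ASSUMED equal for sdb
        printf "sectors per buffer block: %d\n", sector / block
        printf "sectors from failing sector to end of disk: %d\n", total - sector
    }'
}
sector_math
```

    Both values come out as 8, so the logical block is a 4096-byte block, and if the size assumption holds, the failed read was for the very last 4 KiB of the disk. That alone does not say whether the disk, the SATA bus or the power supply is at fault.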
    

  • siraph

    At the risk of going OT: I just remembered (and checked) that I have MetaRepository installed, I think from your repo. I'm sending a couple of screenshots, in case you can tell me whether I can or should upgrade something.

    You are currently using firmware version : V4.81(AALS.1)
    

    Many thanks :)
