RAID1 degraded on NSA325 v2 after file deletion

siraph · December 2024

Many thanks! I'll give it a try as soon as I've completed a data backup.

siraph · December 2024

So, here we are again :)

After completing a backup, I first tried:

root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2
mdadm: set device faulty failed for /dev/sdb2:  No such device

So I checked:

root@oniNas:/home/shares# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] 
md0 : active raid1 sda2[0]
      1952996792 blocks super 1.2 [2/1] [U_]

So I added sdb2 back:

root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --add /dev/sdb2
mdadm: added /dev/sdb2

Checking md0:

root@oniNas:/home/shares# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] 
md0 : active raid1 sdb2[2] sda2[0]
      1952996792 blocks super 1.2 [2/1] [U_]
         [>....................]  recovery =  0.0% (1364352/1952996792) finish=214.5min speed=151594K/sec

But recovery, again, failed to go past 0.4%…

Now I have this:

root@oniNas:/home/shares# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] 
md0 : active raid1 sdb2[2](F) sda2[0]
      1952996792 blocks super 1.2 [2/1] [U_]
unused devices: <none>

I tried a couple of times, with the same result. When I start a repair/recovery from the web interface, sdb2 is again removed from the RAID.

Reading the post you linked, I tried --query:

root@oniNas:/home/shares# mdadm --query --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed May 13 13:10:05 2015
     Raid Level : raid1
     Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
  Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Fri Dec 27 11:14:30 2024
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           Name : oniNas:0  (local to host oniNas)
           UUID : adcf3210:50eaa529:cec15b96:3edc8d16
         Events : 8258202

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed
       2       8       18        -      faulty spare   /dev/sdb2

What should I do now?

Thanks :)

Mijzelf · December 2024

> What should I do now?

Read the kernel log after the syncing is aborted. I suppose some kind of I/O error occurs.

siraph · December 2024

Tried:

su
cp /proc/kmsg /tmp/kmsg2.txt

After a fresh reboot. I'm attaching the output to this message.

It seems to be the same as before.

kmsg2.txt

siraph · December 2024

Found this thread, can it be of some help?

Now, this happened:

root@oniNas:/home/shares# /sbin/mdadm /dev/md0 --add /dev/sdb2
mdadm: cannot find /dev/sdb2: No such device or address

I still have this:

root@oniNas:/home/shares# mdadm --query --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed May 13 13:10:05 2015
     Raid Level : raid1
     Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
  Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Sat Dec 28 16:44:19 2024
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : oniNas:0  (local to host oniNas)
           UUID : adcf3210:50eaa529:cec15b96:3edc8d16
         Events : 8260622

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       0        0        1      removed

I'm curious why number 1 is "removed". Also, when I was able to add sdb2, it was added as number 2, and not as number 1.

Should I replace the sdb2 disk? Or re-initialize it in some way?

Many thanks

PS: some more details.

root@oniNas:/home/shares# fdisk -l

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x6178ba06

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              63     1028159      514048+   8  AIX
/dev/sda2         1028160  3907024064  1952997952+  20  Unknown

Disk /dev/md0: 1999.9 GB, 1999868715008 bytes
2 heads, 4 sectors/track, 488249198 cylinders, total 3905993584 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/md0 doesn't contain a valid partition table

The web interface is not responding anymore; it gives "connection refused". But I'm still able to connect via SSH.

Mijzelf · December 2024

> Read the kernel log after the syncing is aborted.

> After a fresh reboot. I'm attaching the output to this message.

You misunderstood me. The idea was to initiate a sync, wait until it was aborted, and then read the kernel log, without rebooting. The kernel log is volatile. You can only read the last 16kB or something like that, and it doesn't survive a reboot.
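To make the "wait until it was aborted" step less of a guessing game, the array state can be read out of /proc/mdstat. The following is only a rough sketch: the function name md_sync_state is made up for illustration, and it assumes an awk is available on the box (busybox normally ships one). It classifies one array as syncing, failed, degraded or clean, based on the same markers visible in the mdstat dumps earlier in this thread: a "recovery =" progress line, an (F) flag, and an underscore in the [U_] map.

```shell
#!/bin/sh
# Classify the state of one md array from mdstat-formatted text.
# Usage: md_sync_state ARRAY [FILE]   (FILE defaults to /proc/mdstat)
# Illustrative helper, not part of mdadm or the NAS firmware.
md_sync_state() {
    awk -v a="$1" '
        # Header line of the requested array, e.g. "md0 : active raid1 ..."
        $1 == a && $2 == ":" { in_blk = 1; if ($0 ~ /\(F\)/) failed = 1; next }
        # Any line that is not indented ends the array stanza
        in_blk && !/^[ \t]/ { in_blk = 0 }
        # Progress line shown while a rebuild is running
        in_blk && /(recovery|resync) *=/ { syncing = 1 }
        # Member map such as [U_]: an underscore means a missing member
        in_blk && /\[[U_]*_[U_]*\]/ { degraded = 1 }
        END {
            if (syncing)       print "syncing"
            else if (failed)   print "failed"
            else if (degraded) print "degraded"
            else               print "clean"
        }
    ' "${2:-/proc/mdstat}"
}
```

Called in a loop with a sleep in between (md_sync_state md0), the moment the answer changes from syncing to failed is the moment to dump the kernel log, before any reboot.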

But meanwhile there is another problem.

> mdadm: cannot find /dev/sdb2: No such device or address

Whether or not it's part of a RAID array, mdadm should be able to find sdb2. And

# fdisk -l

should show sdb and its partitions. So it seems your 2nd disk is gone. It died, or maybe the SATA bus died. Again, the kernel log might tell, especially if you can still dump it from the moment the disk vanished.
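A quick way to confirm whether the kernel still sees the disk at all is to look the name up in /proc/partitions, which lists every block device the kernel currently knows, with the device name in the fourth column. A minimal sketch, assuming awk is available; has_blockdev is a made-up helper name:

```shell
#!/bin/sh
# Succeed if NAME is listed in the kernel's partition table.
# Usage: has_blockdev NAME [FILE]   (FILE defaults to /proc/partitions)
# Illustrative helper; the name is not a standard tool.
has_blockdev() {
    # Column 4 of /proc/partitions holds the device name (sda, sda2, md0, ...)
    awk -v n="$1" '$4 == n { found = 1 } END { exit !found }' \
        "${2:-/proc/partitions}"
}
```

For example, has_blockdev sdb || echo "sdb is not visible to the kernel" would print the message in the situation shown by the fdisk output above, where only sda and md0 are listed.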

siraph · December 2024

Mijzelf many thanks for your replies.

I think I'll get another disk…

In the meantime, how can I check the kernel logs? If you remember, "dmesg" doesn't seem to work on my installation. All I know is what you wrote before:

cp /proc/kmsg /tmp/kmsg2.txt

Mijzelf · December 2024

You can check the kernel log with 'cp /proc/kmsg /tmp/kmsg2.txt'. Reading it that way empties the log, but new lines keep being added and can be read with the same command (assuming you first copied /tmp/kmsg2.txt away, or use another filename). dmesg should work too, but for some strange reason it doesn't work for you.
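Because every read of /proc/kmsg consumes what it returns, it helps to write each dump to its own file. Also, a read of /proc/kmsg normally blocks when nothing is pending, so a plain cp keeps waiting until it is interrupted. The sketch below works around both; it is only an illustration, the name snapshot_klog and the file names are made up, and it assumes the shell supports background jobs and kill (busybox ash normally does).

```shell
#!/bin/sh
# Append pending kernel messages to OUT, giving up after SECS seconds.
# Usage: snapshot_klog OUT [SECS] [SRC]   (SRC defaults to /proc/kmsg)
# Illustrative helper; the name is not a standard tool.
snapshot_klog() {
    out="$1"
    secs="${2:-3}"
    src="${3:-/proc/kmsg}"
    # Read in the background, because /proc/kmsg blocks once it is drained
    cat "$src" >> "$out" &
    pid=$!
    sleep "$secs"
    # Stop the reader; it may already have exited if SRC is a regular file
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
}
```

Something like snapshot_klog /tmp/kmsg_after_abort.txt run as root right after the sync is aborted would then leave the relevant messages in a separate file.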

> I think I'll get another disk…

I recommend waiting until you know what the problem is. That NAS is at least 10 years old, I think. The SATA bus could be failing. Or the power supply. Can you test the 'defective' disk in another system? Do you have another disk to put in that slot, just to see if it's recognized?

siraph · December 2024

OK, I ran another cp /proc/kmsg /tmp/kmsg4.txt just after powering on; the output is attached here.

And I can see an I/O error on sdb:

<6>sd 1:0:0:0: Device offlined - not ready after error recovery
<6>sd 1:0:0:0: [sdb] Unhandled error code
<6>sd 1:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
<3>end_request: I/O error, dev sdb, sector 3907029160
<3>Buffer I/O error on device sdb, logical block 488378645
<3>sd 1:0:0:0: rejecting I/O to offline device

kmsg4.txt
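The numbers in that log excerpt can be cross-checked with a bit of arithmetic. The failing sector divided by the failing logical block gives the buffer block size, and comparing the sector with the disk size shows where on the disk the read landed. Note the assumption: the total sector count below is the one fdisk reported for sda, and sdb is only assumed to be the same size. For what it's worth, hostbyte=0x07 corresponds to DID_ERROR in the Linux SCSI layer, a generic host-side failure that on its own does not say whether the disk, the cable or the controller is at fault.

```shell
#!/bin/sh
sector=3907029160         # from "end_request: I/O error, dev sdb, sector ..."
block=488378645           # from "Buffer I/O error on device sdb, logical block ..."
total_sectors=3907029168  # size of sda per fdisk -l; ASSUMED identical for sdb

# How many 512-byte sectors make up one buffer block
sectors_per_block=$(( sector / block ))
bytes_per_block=$(( sectors_per_block * 512 ))
# Distance of the failing sector from the end of the disk
sectors_from_end=$(( total_sectors - sector ))

echo "buffer block size: $bytes_per_block bytes ($sectors_per_block sectors)"
echo "failing sector is $sectors_from_end sectors before the end of the disk"
```

Under that assumption the failed read is the very last 4 KiB block of the disk, the kind of read that happens while the device is being probed, rather than anything tied to a particular file.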

siraph · December 2024

At the risk of going off-topic: I just remembered (and checked) that I have MetaRepository installed, I think from your repo. I'm sending a couple of screenshots, in case you can tell me whether I can or should upgrade something.

You are currently using firmware version : V4.81(AALS.1)

Many thanks :)
