RAID1 degraded on NSA325 v2 after file deletion
I've been using my NSA325 for many years now. I have never replaced a hard drive, and it has always worked fine. It is a two-bay unit, and I configured a RAID1 volume.
Recently I was running out of free disk space, so I moved many files (approx. 200GB of data) to other storage with rsync, and then deleted the files folder by folder.
A few days later I checked the RAID1 status in the web admin interface, and it reports the volume as degraded. The volume detail lists only disk1. I tried scan and repair: the scan shows no errors, and the repair stays stuck on "Recovering 0.0%" for a long time, after which the volume shows Degraded again. The disk LEDs on the outside of the NAS were both green, but now, after several scan and repair attempts, disk2 shows red…
Checking S.M.A.R.T., both disks are reported as healthy, and as far as I can read the SMART summaries, both disks are working well and neither is faulty.
How can I solve my problem? Thanks
All Replies
-
Reading various similar posts, I'm giving some additional information.
I installed two 2TB WD RED drives to form my RAID1.
Firmware: V4.81(AALS.1)
Here are some screens of the web interface
At times the section above showed only disk1. I tried a Repair, but it never got beyond "Recovering 0.04%". The LED turned red for some time, then went back to green. But the volume is always degraded…
The two drives in SMART page
disk 1 full summary
disk2 full summary
recent logs shown in web interface
Shell commands:
~$ cat /proc/partitions
major minor #blocks name
7 0 143360 loop0
8 0 1953514584 sda
8 1 514048 sda1
8 2 1952997952 sda2
31 0 1024 mtdblock0
8 16 1953514584 sdb
8 17 514048 sdb1
8 18 1952997952 sdb2
31 1 512 mtdblock1
31 2 512 mtdblock2
31 3 512 mtdblock3
31 4 10240 mtdblock4
31 5 10240 mtdblock5
31 6 48896 mtdblock6
31 7 10240 mtdblock7
31 8 48896 mtdblock8
9 0 1952996792 md0
~$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1]
md0 : active raid1 sdb22 sda2[0]
1952996792 blocks super 1.2 [2/1] [U_]
~$ mdadm --examine /dev/sda
-sh: mdadm: command not found
Maybe it's all OK… But if one (or both) of my drives is failing, I would like to know more before I buy another drive and/or NAS.
0 -
The Current_Pending_Sector with a raw value of 2 could cause the disk to be dropped from the array. But it cannot stop a running repair, and after repairing it should be gone.
Your log shows that repairing has started, but /proc/mdstat doesn't show that. It should show the repairing status, and a progression state. (Sorry, can't remember the exact layout, but you'll recognize it if you see it)
So apparently the repair has already stopped, and the firmware log hasn't noticed. Maybe the kernel log shows a reason. Execute 'dmesg' to read the kernel log.
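To make that check repeatable, here is a minimal sketch that summarises /proc/mdstat. The function name md_state is made up for illustration, and it assumes the usual mdstat layout: a "[U_]"-style flag group on the "blocks" line, and a line containing "recovery =" or "resync =" while a rebuild is running. It has not been tried on the NSA325 firmware.

```shell
#!/bin/sh
# Hypothetical helper: print "<array> <ok|degraded> <idle|rebuilding>" per md array.
# Usage: md_state [file]   (defaults to /proc/mdstat)
md_state() {
    awk '
        # Array header, e.g. "md0 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Status line, e.g. "1952996792 blocks super 1.2 [2/1] [U_]"
        /blocks/ && match($0, /\[[U_]+\]/) {
            flags = substr($0, RSTART, RLENGTH)
            # An underscore marks a missing member
            state[name] = (index(flags, "_") > 0) ? "degraded" : "ok"
            order[++n] = name
        }
        # Assumed progress line of a running rebuild
        /(recovery|resync) *=/ { busy[name] = 1 }
        END {
            for (i = 1; i <= n; i++) {
                d = order[i]
                printf "%s %s %s\n", d, state[d], (busy[d] ? "rebuilding" : "idle")
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

Running md_state without arguments reads the live /proc/mdstat. If the array is degraded and no rebuild is running, that matches the suspicion that the repair has already stopped.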
0 -
Thanks for the reply. I've attached the dmesg output to this post.
0 -
That dmesg output is far from complete. I'd expect at least a hundred lines.
0 -
I thought so… But I don't know where to look for some more information.
That's all I get executing "dmesg" on shell after connecting with SSH:
This, and the fact that "mdadm" is missing, makes me wonder whether some package is not installed.
0 -
O_o. According to this page, the next line should have been about the CPU interface. And this has nothing to do with packages not being installed. This is the very early kernel log; the disks are not even powered up yet, so it doesn't know about packages. (BTW, mdadm is not in a package. How could it be, if it's needed to assemble the RAID array on which packages are installed? mdadm is in the initramfs, a filesystem embedded in the kernel.)
How about
cat /dev/kmsg
which should give a similar output as dmesg.
0 -
No luck :)
0 -
Oh right. On this old kernel, it's in /proc/kmsg, but you can only read it once, and you need to be root. So you can execute
su
cp /proc/kmsg /tmp/kmsg
^C
chmod a+r /tmp/kmsg
And then you can examine /tmp/kmsg. That '^C' is really a Control-C: the copy blocks, as it keeps waiting for more log lines to arrive. The chmod allows an ordinary user to read the copy. Otherwise you could get into trouble when you try to fetch it using WinSCP or something like that.
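Once the copy exists, the RAID-related lines can be picked out of it. A small sketch, assuming the md driver's messages start with "md:" or "md/" right after the "<N>" priority prefix; the function name md_messages is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print only md-related kernel messages,
# with the leading <N> priority prefix stripped.
# Usage: md_messages [file]   (defaults to /tmp/kmsg)
md_messages() {
    sed -n 's|^<[0-9]*>\(md[/:].*\)$|\1|p' "${1:-/tmp/kmsg}"
}
```

Calling md_messages with no argument reads /tmp/kmsg, the copy made above.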
0 -
OK, thank you very much! Here is the output you asked for.
0 -
The disk (or actually partition) is dropped because it's 'non-fresh':
<6>md: md0 stopped.
<6>md: bind<sdb2>
<6>md: bind<sda2>
<4>md: kicking non-fresh sdb2 from array!
<6>md: unbind<sdb2>
<6>md: export_rdev(sdb2)
Had to google that, and found this page. I suppose the solution will also work for you, if you replace sda5 with sdb2.
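The linked page is not quoted in this thread, so the following is only an assumption about what it suggests: the usual remedy for a 'non-fresh' member is to add the partition back to the array with mdadm and let it resync. The sketch below assumes mdadm can be run as root on the box (it was not found from the ordinary user shell earlier). The helper name readd_member and the RUN switch are made up; by default it only prints the command so it can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: print (default) or run (RUN=1) the mdadm command
# that adds a kicked member back to an array.
# Usage: readd_member <array> <member>
readd_member() {
    array=$1
    member=$2
    cmd="mdadm $array --add $member"
    if [ "${RUN:-0}" = 1 ]; then
        # Needs root; afterwards /proc/mdstat should show the rebuild
        $cmd && cat /proc/mdstat
    else
        echo "$cmd"
    fi
}

# Device names taken from the kernel log above
readd_member /dev/md0 /dev/sdb2
```

With RUN=1 set, the command is actually executed. Since the array then resyncs onto sdb2, make sure the data on sda2 is backed up first.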