NAS540 Shows Healthy but RAID degraded.

jahmon · November 2022

When I look at my disks in storage manager they are all green. When I check them with SMART, two show up as BAD. So in theory I have Disk 1 and 4 BAD in RAID 5. Yet, I can still get access to all my data, no problem.

I ran 'repair' three times and the logs say it was successful, but I keep getting the RAID degraded message. My GUESS is that those two drives have a small number of sectors reallocated, so SMART is flagging them as bad, and that is triggering the degraded notification?

I've done restarts twice, same results.

TIA

Mijzelf · December 2022

jahmon said:

So I looked through it more carefully, right in the beginning of the "SMART Data Section" there is a line "SMART overall-health self-assessment test result:" Drive A shows "Passed", Drive D shows "Failed". As D is the RAID 5 parity drive, this explains why it's degraded, but I can still access the data and the rebuild fails while processing. Have I got it?

More or less. There is no parity drive in RAID5; the parity blocks are equally distributed over all disks. This is done to maximize the read speed (on a healthy raid array the parity blocks are not used for reading, so dedicating a whole disk and its bandwidth to parity would be a waste) and to minimize the penalty when a random disk fails.
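To make the distributed parity concrete, here is a minimal Python sketch. It assumes a 4-disk array and one possible rotation scheme (parity starts on the last disk and moves one disk to the left per stripe); the function names and the block contents are made up for illustration and this is not the NAS firmware's actual code.

```python
from functools import reduce

def parity_disk(stripe: int, n_disks: int) -> int:
    """Index of the disk holding parity for a stripe (assumed rotation)."""
    return (n_disks - 1 - stripe) % n_disks

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks, stripe: int, n_disks: int):
    """Lay out n_disks-1 data blocks plus their parity block across n_disks."""
    blocks = list(data_blocks)
    blocks.insert(parity_disk(stripe, n_disks), xor_blocks(data_blocks))
    return blocks

# Where the parity block lands for the first five stripes of a 4-disk array.
parity_positions = [parity_disk(s, 4) for s in range(5)]

# Lose one disk: its block is the XOR of everything that is left.
stripe0 = make_stripe([b"AAAA", b"BBBB", b"CCCC"], stripe=0, n_disks=4)
lost = 1
survivors = [block for i, block in enumerate(stripe0) if i != lost]
rebuilt = xor_blocks(survivors)
```

Because parity moves from stripe to stripe, no single disk is "the" parity disk, and losing any one disk costs the same amount of reconstruction work.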

The raid manager is pretty dumb. When rebuilding the array it simply calculates the content of the 'new' disk from the total surface of the 3 others (the raid manager doesn't know about filesystems, and so doesn't know whether a particular sector is used or not), and writes that to the disk. When a write error occurs the new disk is dropped, and the rebuild fails. And worse, if a read error occurs the relevant disk is dropped, bringing the array down.
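That rebuild behaviour can be sketched as a toy model in Python. The assumptions are loud ones: each disk is a list of integers, an unreadable sector is represented by None, and the names ReadError, read_sector and rebuild are hypothetical. It only illustrates the logic described above, not the real md driver.

```python
class ReadError(Exception):
    """Raised when a simulated sector cannot be read."""

def read_sector(disk, sector):
    """Return the sector's value, or raise if it is unreadable (None)."""
    value = disk[sector]
    if value is None:
        raise ReadError(f"unreadable sector {sector}")
    return value

def rebuild(survivors):
    """Recompute every sector of the replacement disk from the survivors.

    Returns (new_disk, None) on success, or (None, index_of_failed_disk)
    as soon as any surviving disk throws a read error.
    """
    new_disk = []
    for sector in range(len(survivors[0])):
        value = 0
        for idx, disk in enumerate(survivors):
            try:
                value ^= read_sector(disk, sector)
            except ReadError:
                return None, idx  # this disk would be dropped: array down
        new_disk.append(value)
    return new_disk, None

# Three healthy survivors: the rebuild completes.
ok_disk, ok_failed = rebuild([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# One unreadable sector on the second survivor: the rebuild aborts.
bad_disk, bad_failed = rebuild([[1, 2, 3], [4, None, 6], [7, 8, 9]])
```

Note that every sector is read, used or not, which is why a single unreadable sector in free space is enough to stop a rebuild.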

Mijzelf · November 2022

The raid manager will drop a disk as soon as it throws an I/O error on a read or write action. It doesn't keep track of bad sectors itself, but leaves that to the disks.

A reallocated sector is, well, reallocated, and so it's healthy. When a sector is found bad, the disk replaces it with a spare sector, so it won't throw a 'bad' error a second time. Yet it is possible that there is a Current_Pending_Sector, which is a sector that is physically healthy but has a wrong checksum, so it can't be read. It has to be written first. That kind of sector will throw a read error on each attempt to read it, until it is written once.

SMART won't call a disk bad for Current_Pending_Sectors.

Having said that, when SMART says a disk is bad, it has to be replaced. If you ever have an I/O error on a degraded raid array, that disk will be dropped leaving you with a 'down' array. In that case there is no easy way to restore your array.
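The difference between reallocated and pending sectors shows up as two separate rows in the attribute table that smartctl prints. Below is a minimal Python sketch that pulls the raw values out of such a table. It assumes the usual ten-column layout; the function name is hypothetical, and the sample rows and their numbers are invented for illustration, not taken from the drives in this thread.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value from a smartctl-style attribute table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            attrs[fields[1]] = int(raw) if raw.isdigit() else raw
    return attrs

# Invented sample rows, only to exercise the parser.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
"""

attrs = parse_smart_attributes(SAMPLE)
reallocated = attrs["Reallocated_Sector_Ct"]   # already replaced by spares
pending = attrs["Current_Pending_Sector"]      # will fail on read until rewritten
```

A non-zero pending count is the one to worry about before a rebuild, since those are the sectors that still throw read errors.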

jahmon · November 2022

Thanks.

jahmon · November 2022

A follow-up here. To repeat, I have 'healthy' showing everywhere on my NAS540 except for the 'RAID Degraded' and the SMART 'BAD' indications. I looked more carefully and both of the disks that are 'BAD' are Hitachi Ultrastar A7K2000, HUA722020ALA331. I can write and read data no problem. Is this a compatibility error? The compatibility list shows 'HDS' and my drives are 'HUA'. TIA!

Mijzelf · November 2022

Is this a compatibility error?

Probably not. There is no 'compatibility list', only a 'verified hard disk list'. SATA is SATA, and so all disks are supposed to work, but not all models are actually tested, of course.

Does SMART give any information about why the disk is bad?

jahmon · December 2022

Yes - see attached. These don't make any sense to me as written. If this is accurate, almost every parameter is over threshold, and some thresholds are nonsensical, for example temperature and operating hours.

Image: https://us.v-cdn.net/6029482/uploads/editor/hp/3bw2m4z4bwxn.png

Image: https://us.v-cdn.net/6029482/uploads/editor/61/r3bb1pccj7qi.png

jahmon · December 2022

...and it still shows all disks 'green'. Note that 1 and 4 are the Hitachis flagged as 'BAD' by SMART.

Image: https://us.v-cdn.net/6029482/uploads/editor/nx/z12slhlt8bpd.png

Mijzelf · December 2022

Indeed, the SMART info in the web interface is hardly usable. You'd better look at the output of the smartctl tool.

Log in to the NAS over ssh, and execute

su

smartctl -a /dev/sda

smartctl -a /dev/sdd

(I chose sda and sdd as I suppose those will be the device nodes of both your Hitachis. That can be different when you have a USB disk or SD card connected, or if the NAS just acts weird. Have a look with 'cat /proc/partitions' to see all device nodes in use.)
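If the list from 'cat /proc/partitions' is long, the whole-disk nodes can be picked out with a few lines of Python on a PC, after pasting the output into a string. This is only a sketch: it assumes the standard four-column layout and that the disks are named sdX, the function name is hypothetical, and the sample text below is invented, not from a real NAS540.

```python
import re

def whole_disks(partitions_text):
    """Return /dev nodes of whole sdX disks listed in /proc/partitions text."""
    names = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four columns: major, minor, #blocks, name.
        if len(fields) == 4 and fields[0].isdigit():
            name = fields[3]
            # Whole disks have no trailing partition number (sda, not sda1).
            if re.fullmatch(r"sd[a-z]+", name):
                names.append("/dev/" + name)
    return names

# Invented sample, only to exercise the parser.
SAMPLE = """\
major minor  #blocks  name

   8        0 1953514584 sda
   8        1    1998848 sda1
   8       16 1953514584 sdb
   8       17    1998848 sdb1
"""

disks = whole_disks(SAMPLE)
```

Each node it returns is a candidate for 'smartctl -a'; which of them are the Hitachis still has to be checked against the model name in the smartctl output.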

jahmon · December 2022

I tried to do this with PuTTY and WinSCP, but can't get it configured correctly. Do you have a guide on establishing an SSH session? Thanks.

Mijzelf · December 2022

Did you enable the ssh server in config->network->terminal?

jahmon · December 2022

That worked, thanks. Let me get the results and review. Sorry for the slow response, I'm sort of doing this between other priorities.
