NAS540 Shows Healthy but RAID degraded.
All Replies
-
There's a lot of info here. You are correct, my config is sda and sdd. I may be missing something, but this looks like it passes the self-test. Your help is appreciated!
Here are the logs from one drive only, as the output for both is too long to post.
~ # smartctl -a /dev/sdd
smartctl 6.3 2014-07-26 r3976 [armv7l-linux-3.2.54] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Ultrastar A7K2000
Device Model: Hitachi HUA722020ALA331
Serial Number: YBK0JV2F
LU WWN Device Id: 5 000cca 221ea85bb
Firmware Version: JKAOA3NH
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Thu Dec 8 15:36:06 2022 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (22624) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 377) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 083 083 016 Pre-fail Always - 131506
2 Throughput_Performance 0x0005 130 130 054 Pre-fail Offline - 112
3 Spin_Up_Time 0x0007 116 116 024 Pre-fail Always - 620 (Average 620)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 2108
5 Reallocated_Sector_Ct 0x0033 001 001 005 Pre-fail Always FAILING_NOW 1058
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 123 123 020 Pre-fail Offline - 34
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 6202
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 39
192 Power-Off_Retract_Count 0x0032 099 099 000 Old_age Always - 2110
193 Load_Cycle_Count 0x0012 099 099 000 Old_age Always - 2110
194 Temperature_Celsius 0x0002 120 120 000 Old_age Always - 50 (Min/Max 21/61)
196 Reallocated_Event_Count 0x0032 048 048 000 Old_age Always - 1127
197 Current_Pending_Sector 0x0022 044 044 000 Old_age Always - 1201
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
ATA Error Count: 19 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 19 occurred at disk power-on lifetime: 5921 hours (246 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 2c 8c bc 68 03 Error: UNC 44 sectors at LBA = 0x0368bc8c = 57195660
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 38 80 bc 68 e0 08 14d+16:29:55.130 READ DMA EXT
27 00 00 00 00 00 e0 08 14d+16:29:55.084 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 14d+16:29:55.076 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 14d+16:29:55.075 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 08 14d+16:29:55.074 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 18 occurred at disk power-on lifetime: 5921 hours (246 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 38 80 bc 68 03 Error: UNC 56 sectors at LBA = 0x0368bc80 = 57195648
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 38 80 bc 68 e0 08 14d+16:29:38.602 READ DMA EXT
27 00 00 00 00 00 e0 08 14d+16:29:38.555 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 14d+16:29:38.547 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 14d+16:29:38.546 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 08 14d+16:29:38.545 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 17 occurred at disk power-on lifetime: 5921 hours (246 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 38 80 bc 68 03 Error: UNC 56 sectors at LBA = 0x0368bc80 = 57195648
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 38 80 bc 68 e0 08 14d+16:29:22.061 READ DMA EXT
27 00 00 00 00 00 e0 08 14d+16:29:21.261 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 14d+16:29:21.253 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 14d+16:29:21.252 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 08 14d+16:29:21.251 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 16 occurred at disk power-on lifetime: 5921 hours (246 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 3f c1 bc 68 03 Error: UNC 63 sectors at LBA = 0x0368bcc1 = 57195713
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 48 b8 bc 68 e0 08 14d+16:28:37.109 READ DMA EXT
27 00 00 00 00 00 e0 08 14d+16:28:37.063 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 14d+16:28:37.055 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 14d+16:28:37.054 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 08 14d+16:28:37.053 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
Error 15 occurred at disk power-on lifetime: 5921 hours (246 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 45 bb bc 68 03 Error: UNC 69 sectors at LBA = 0x0368bcbb = 57195707
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 48 b8 bc 68 e0 08 14d+16:28:12.391 READ DMA EXT
27 00 00 00 00 00 e0 08 14d+16:28:12.344 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 14d+16:28:12.336 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 14d+16:28:12.335 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 08 14d+16:28:12.334 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 58737 -
# 2 Short offline Completed without error 00% 57416 -
# 3 Short offline Completed without error 00% 57414 -
# 4 Short offline Completed without error 00% 53020 -
# 5 Short offline Completed without error 00% 53017 -
# 6 Short offline Completed without error 00% 53013 -
# 7 Short offline Completed without error 00% 53011 -
# 8 Short offline Completed without error 00% 53009 -
# 9 Short offline Completed without error 00% 45977 -
#10 Short offline Completed without error 00% 40631 -
#11 Short offline Completed without error 00% 38232 -
#12 Short offline Completed without error 00% 38228 -
#13 Short offline Completed without error 00% 38226 -
#14 Short offline Completed without error 00% 37252 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
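For anyone reading the attribute table in the log above: as far as I understand smartctl, an attribute is reported as failed when its normalized VALUE is at or below THRESH. Here is a minimal Python sketch of that check. The three rows are copied from the output above; the helper names (parse_row, failing) are made up for illustration and the parsing assumes the whitespace-separated column layout shown in the log.

```python
# Sketch only: flag SMART attributes whose normalized VALUE <= THRESH.
# Rows copied from the smartctl output in this thread; helper names are
# hypothetical and not part of smartmontools.

ROWS = """\
1 Raw_Read_Error_Rate 0x000b 083 083 016 Pre-fail Always - 131506
5 Reallocated_Sector_Ct 0x0033 001 001 005 Pre-fail Always FAILING_NOW 1058
197 Current_Pending_Sector 0x0022 044 044 000 Old_age Always - 1201
"""


def parse_row(line):
    """Split one attribute row into the columns needed for the check."""
    parts = line.split(None, 9)  # RAW_VALUE (last column) may contain spaces
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "type": parts[6],
        "raw": parts[9],
    }


def failing(rows):
    """Return the rows whose normalized value is at or below the threshold."""
    return [r for r in rows if r["value"] <= r["thresh"]]


attrs = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
for a in failing(attrs):
    print(f'{a["name"]}: value {a["value"]} <= thresh {a["thresh"]} (raw {a["raw"]})')
```

With these rows only Reallocated_Sector_Ct is flagged (value 1 against a threshold of 5), which matches the FAILING_NOW marker in the log.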
0 -
Well, that is clear:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
<snip>
5 Reallocated_Sector_Ct 0x0033 001 001 005 Pre-fail Always FAILING_NOW 1058
So the problem is the Reallocated_Sector_Ct: its value (001) has dropped below the threshold (005). I have no idea why that looks different in the GUI.
The disk itself has logged 5 errors at 5921 power-on hours (almost 300 hours ago, as the disk is now at 6202 hours), at sectors 57195660, 57195648 (twice), 57195713 and 57195707. So that 24 hours is a bit exaggerated.
Assuming those are 4k sectors, that is around 218 GiB from the start of the disk, so that is well inside the data partition. (The log reports 512-byte sectors, though, which would put the errors at roughly 27 GiB from the start of the disk.)
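To make the sector arithmetic checkable, here is a small Python sketch. It assumes the LBA printed on the "Error: UNC" line is assembled from the register bytes shown next to it (low nibble of DH, then CH, CL, SN), which at least reproduces the value in the log for error 19. The function names are made up for illustration.

```python
# Sketch only: rebuild the LBA from the logged register bytes and convert
# it to a distance from the start of the disk for two sector sizes.
# Function names are hypothetical; register values are from error 19.


def lba_from_registers(sn, cl, ch, dh):
    """Combine SN, CL, CH and the low nibble of DH into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def offset_gib(lba, sector_size):
    """Distance from the start of the disk in GiB for a given sector size."""
    return lba * sector_size / 2**30


# Registers after error 19: ER ST SC SN CL CH DH = 40 51 2c 8c bc 68 03
lba = lba_from_registers(0x8C, 0xBC, 0x68, 0x03)
print(lba)

for size in (512, 4096):
    print(f"{size}-byte sectors: {offset_gib(lba, size):.1f} GiB from the start")
```

The decoded LBA is 57195660, as in the log. With 512-byte sectors that is about 27.3 GiB into the disk; with 4096-byte sectors it would be about 218.2 GiB.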
0 -
So I looked through it more carefully. Right at the beginning of the "SMART Data Section" there is a line "SMART overall-health self-assessment test result:". Drive A shows "Passed", Drive D shows "Failed". As D is the RAID 5 parity drive, this explains why it's degraded, but I can still access the data and the rebuild fails while processing. Have I got it?
-
jahmon said:
So I looked through it more carefully, right in the beginning of the "SMART Data Section" there is a line "SMART overall-health self-assessment test result:" Drive A shows "Passed", Drive D shows "Failed". As D is the RAID 5 parity drive, this explains why it's degraded, but I can still access the data and the rebuild fails while processing. Have I got it?
More or less. There is no parity drive in RAID5; the parity blocks are equally distributed over all disks. This is done to maximize the read speed (on a healthy raid array the parity blocks are not used for reading, so it would be a waste to leave a whole disk and its bandwidth unused) and to minimize the penalty when a random disk fails.
The raid manager is pretty dumb. When rebuilding the array it simply calculates the content of the 'new' disk from the total surface of the 3 others (the raid manager doesn't know about filesystems, and so doesn't know whether a particular sector is used or not), and writes that to the disk. When a write error occurs the new disk is dropped, and the rebuild fails. And worse, if a read error occurs on one of the other disks, that disk is dropped, bringing the array down.
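To illustrate the two points above (parity spread over all disks, and a rebuild that recomputes every block from the surviving disks), here is a toy Python sketch. It assumes a 4-disk array with XOR parity; the rotation rule and the function names are made up for illustration and are not the actual layout the NAS uses.

```python
# Toy sketch only: RAID5-style XOR parity with rotating parity position,
# and a rebuild that reads every stripe of the surviving disks.
# The rotation rule and names are illustrative, not the real md layout.
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def build_stripes(chunks, n_disks):
    """Place data chunks on n_disks, moving the parity block each stripe."""
    per_stripe = n_disks - 1
    stripes, parity_positions = [], []
    for k, start in enumerate(range(0, len(chunks), per_stripe)):
        data = chunks[start:start + per_stripe]
        parity_disk = (n_disks - 1 - k) % n_disks  # rotate over all disks
        stripe = list(data)
        stripe.insert(parity_disk, xor_blocks(data))
        stripes.append(stripe)
        parity_positions.append(parity_disk)
    return stripes, parity_positions


def rebuild_disk(stripes, lost):
    """Recompute every block of the lost disk from the other disks.

    Every stripe is read in full, whether or not a filesystem uses it,
    so one unreadable block on a surviving disk breaks the rebuild.
    """
    return [
        xor_blocks([blk for i, blk in enumerate(stripe) if i != lost])
        for stripe in stripes
    ]


chunks = [bytes([i] * 4) for i in range(1, 13)]  # 12 chunks -> 4 stripes
stripes, parity_positions = build_stripes(chunks, n_disks=4)
print(parity_positions)  # parity lands on a different disk in each stripe

rebuilt = rebuild_disk(stripes, lost=3)
print(rebuilt == [stripe[3] for stripe in stripes])
```

In this toy layout the parity block sits on disk 3, 2, 1 and 0 in successive stripes, and the content of any single lost disk can be recomputed from the other three.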
0 -
Thank you for the top-notch support and patience!