NAS540 reboots when RAID resync would finish

Hi,
Recently one of the 4 HDDs in my NAS540 failed and had to be replaced. I had issues with fixing and rebuilding the RAID 5 array, but long story short, the UI shows it is healthy now.
However, it keeps resyncing, and just as the resync is about to finish, the NAS reboots and the resync starts again.
How can I tell why the NAS reboots? Is there a log I can check? What can I do in this situation?

I searched the forums but found no applicable advice so far.


Accepted Solution

  • Mijzelf
    Answer ✓
    So the resync took almost 24 hours? That means it wrote around 23MB/sec to the new disk. A bit low.
    Can the filesystem errors be fixed?
    Possibly. One of the problems is that you can't repair a mounted filesystem. So it has to be unmounted first, and that is not easy. There is a work-around, by intercepting the shutdown. I wrote about that here. And sorry, this forum borks up everything, you'll have to remove the html tags yourself. And you can omit the resize2fs, the goal is the e2fsck.
    But it's possible that repairing cannot be done, or not completely, depending on the damage.
    Is this related to the filesystem errors?
    Could be. Filesystem errors can cause all kinds of corruptions.
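
The 23MB/sec estimate above can be reproduced with a little arithmetic. This is only a sketch: it assumes the per-disk size of 1949383680 KiB that the kernel reports for md2 elsewhere in this thread and a resync time of exactly 24 hours, and the function name is made up for illustration.

```shell
# resync_rate_mb SIZE_KIB HOURS
# Prints the average write rate in MB/s (10^6 bytes per second).
# Hypothetical helper, not part of the NAS firmware.
resync_rate_mb() {
    LC_ALL=C awk -v kib="$1" -v hours="$2" \
        'BEGIN { printf "%.1f\n", kib * 1024 / (hours * 3600) / 1000000 }'
}

# Per-disk member size from the md2 kernel log, resynced in about a day
resync_rate_mb 1949383680 24
```

A resync that takes "almost" 24 hours would come out slightly higher than this figure.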


All Replies

  • okimarukas
    What is the size of the disks?
    I'm not sure whether the symptom is related to the disk or the NAS. Maybe try rebuilding the RAID again with another disk.
  • poloschka
    Thanks for the reply!
    I have 2 TB disks.
    I'll take a look at recovery again.
  • poloschka
    I fiddled with it again. Since I don't have a spare 2 TB disk, I put the faulty one back just to have one last attempt, but made no progress. I left the NAS to resync for two days and it keeps doing it.
    Funny thing is, when I checked it today the status no longer said it was resyncing. I was hopeful that it had finally finished the resync and I could move on with my life. A few minutes later, the UI was not responding. When I managed to SSH in, the uptime was 1 min...

    Additionally, the Twonky media server is not running and the file browser returns an HTTP 500 error... I guess the volume is not "healthy" after all. Any suggestions?
  • Mijzelf
    The command
    cat /proc/mdstat
    gives the internal status of all raid arrays.
    When there are I/O errors, they should show up in the kernel log
    dmesg
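
To keep an eye on a long resync without reading the whole file each time, the progress line can be pulled out of /proc/mdstat. A minimal sketch, assuming the usual mdstat layout with a "resync = NN.N%" or "recovery = NN.N%" line under the array; the function name is hypothetical.

```shell
# resync_progress [FILE]
# Prints "<array> <percent>" for every array that is resyncing or recovering.
# Reads /proc/mdstat by default; pass a file name to parse saved output.
resync_progress() {
    awk '
        /^md[0-9]+ :/ { array = $1 }      # remember the current array name
        /resync|recovery/ {               # progress line of that array
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print array, $i
        }
    ' "${1:-/proc/mdstat}"
}

# Usage on the NAS:
#   resync_progress
```

Arrays that are idle produce no output, so an empty result means no resync is in progress.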
  • poloschka
    Hi Mijzelf,
    the output of cat /proc/mdstat should be normal I guess, since the array is resyncing.
    ~ # cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
    md2 : active raid5 sda3[0] sdd3[4] sdc3[2] sdb3[1]
          5848151040 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
          [==>..................]  resync = 13.2% (258868516/1949383680) finish=1824.3min speed=15443K/sec
    
    md1 : active raid1 sda2[0] sdd2[4] sdc2[2] sdb2[1]
          1998784 blocks super 1.2 [4/4] [UUUU]
    
    md0 : active raid1 sda1[0] sdd1[4] sdc1[2] sdb1[1]
          1997760 blocks super 1.2 [4/4] [UUUU]

    Thanks for the suggestion to check dmesg. It says the following (I copied only the relevant part):

    [   40.495474] md: md2 stopped.
    [   40.535683] md: bind<sdb3>
    [   40.538728] md: bind<sdc3>
    [   40.541870] md: bind<sdd3>
    [   40.544900] md: bind<sda3>
    [   40.551656] md/raid:md2: not clean -- starting background reconstruction
    [   40.558435] md/raid:md2: device sda3 operational as raid disk 0
    [   40.564400] md/raid:md2: device sdd3 operational as raid disk 3
    [   40.570345] md/raid:md2: device sdc3 operational as raid disk 2
    [   40.576282] md/raid:md2: device sdb3 operational as raid disk 1
    [   40.583255] md/raid:md2: allocated 4220kB
    [   40.587373] md/raid:md2: raid level 5 active with 4 out of 4 devices, algorithm 2
    [   40.594912] RAID conf printout:
    [   40.594919]  --- level:5 rd:4 wd:4
    [   40.594926]  disk 0, o:1, dev:sda3
    [   40.594932]  disk 1, o:1, dev:sdb3
    [   40.594938]  disk 2, o:1, dev:sdc3
    [   40.594943]  disk 3, o:1, dev:sdd3
    [   40.595042] md2: detected capacity change from 0 to 5988506664960
    [   40.604705] md: resync of RAID array md2
    [   40.608651] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
    [   40.614550] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
    [   40.624019] md: using 128k window, over a total of 1949383680k.
    [   40.629978] md: resuming resync of md2 from checkpoint.
    [   40.953065] ADDRCONF(NETDEV_CHANGE): egiga0: link becomes ready
    [   41.303610]  md2: unknown partition table
    [   42.044206] EXT4-fs (md2): warning: mounting fs with errors, running e2fsck is recommended
    [   42.083485] EXT4-fs (md2): mounted filesystem with ordered data mode. Opts: usrquota,data=ordered,barrier=1
    [   53.342295] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [   54.089389] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [   57.581948] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   57.799205] EXT4-fs warning (device md2): ext4_resize_begin:32: There are errors in the filesystem, so online resizing is not allowed
    [   57.799213]
    [   58.749225] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.779833] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.820400] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.845647] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.883205] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.944586] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   58.971148] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm mkdir: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   67.251870] EXT4-fs error (device md2): ext4_mb_generate_buddy:739: group 38273, 24648 clusters in bitmap, 24655 in gd
    [   70.161844] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #156762722: block 1254101046: comm chmod: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   70.220167] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm ipkg-cl: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   82.579226] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [   82.699713] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [   83.298249] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   85.849026] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [   97.472780] bz time = 1
    [   97.476031] bz status = 1
    [   97.479158] bz_timer_status = 0
    [   97.482325] start buzzer
    [   97.917640] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  110.625260] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  112.279109] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [  116.029196] JBD2: Spotted dirty metadata buffer (dev = md2, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
    [  118.196244] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #156762722:
    [  118.196443] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [  118.196961] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [  118.199477] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [  118.264617] block 1254101046: comm chmod: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  118.289864] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  118.315844] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  124.356303] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  127.490236] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762731: block 1254101050: comm python: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  137.318229] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  143.921077] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [  143.941875] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [  144.033360] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  148.383891] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #156762722: block 1254101046: comm cp: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
    [  173.899310] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(12288), inode=0, rec_len=0, name_len=0
    [  175.219087] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #91750404: block 734007419: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(16384), inode=0, rec_len=0, name_len=0
    [  177.226113] EXT4-fs error (device md2): add_dirent_to_buf:1273: inode #91750404: block 734007418: comm twonkyserver: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0



  • Mijzelf
    That log shows only logical errors, no hardware errors. Somehow the data partition has got some filesystem errors. But there is one strange line:
    [   57.799205] EXT4-fs warning (device md2): ext4_resize_begin:32: There are errors in the filesystem, so online resizing is not allowed
    
    While it is true that a filesystem with errors cannot be resized, why does it show up here? Like something is trying to initiate a filesystem resize. Is it possible that your new disk is slightly bigger or smaller than the old one? Can you post the output of
    cat /proc/partitions

  • poloschka
    Here is the output of that command:
    ~ # cat /proc/partitions
    major minor  #blocks  name
    
       7        0     146432 loop0
      31        0        256 mtdblock0
      31        1        512 mtdblock1
      31        2        256 mtdblock2
      31        3      10240 mtdblock3
      31        4      10240 mtdblock4
      31        5     112640 mtdblock5
      31        6      10240 mtdblock6
      31        7     112640 mtdblock7
      31        8       6144 mtdblock8
       8        0 1953514584 sda
       8        1    1998848 sda1
       8        2    1999872 sda2
       8        3 1949514752 sda3
       8       16 1953514584 sdb
       8       17    1998848 sdb1
       8       18    1999872 sdb2
       8       19 1949514752 sdb3
       8       32 1953514584 sdc
       8       33    1998848 sdc1
       8       34    1999872 sdc2
       8       35 1949514752 sdc3
       8       48 1953514584 sdd
       8       49    1998848 sdd1
       8       50    1999872 sdd2
       8       51 1949514752 sdd3
      31        9     102424 mtdblock9
       9        0    1997760 md0
       9        1    1998784 md1
      31       10       4464 mtdblock10
       9        2 5848151040 md2

    All disks appear the same size. The new disk is sdd.

    Also, I attached the full output of the most recent run of dmesg. It might contain useful info.
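
Rather than comparing the sizes in the partition table by eye, the check can be scripted. A rough sketch, assuming the /proc/partitions column layout shown above (major, minor, #blocks, name) and disks named sda, sdb and so on; the function name is invented for this example.

```shell
# distinct_disk_sizes [FILE]
# Prints how many different sizes the whole disks (sda, sdb, ...) have.
# A result of 1 means all disks report the same number of blocks.
distinct_disk_sizes() {
    awk '$4 ~ /^sd[a-z]+$/ { print $3 }' "${1:-/proc/partitions}" | sort -u | wc -l
}

# Usage on the NAS:
#   distinct_disk_sizes
```

Changing the pattern to `/^sd[a-z]+3$/` would compare the data partitions (sda3, sdb3, ...) instead of the whole disks.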
  • Mijzelf
    Unless the old disk was smaller, I don't see a reason why it would resize the filesystem.
    Your log contains a lot of filesystem errors, but as far as I can see it's mainly repeating itself. It might be a good idea to disable Twonky for now, as it constantly hits an error.
    Unfortunately the log doesn't contain a clue why the system would reboot. Maybe you can catch it if you run
    tail -f /dev/kmsg
    and leave that shell open. That should output the kernel messages 'live', so maybe it will catch the problem. But maybe not. There is a long way between the spawning of the message in the kernel, and getting it in your ssh shell on another system.

  • poloschka
    edited January 9
    I disabled Twonky, and far fewer log messages are generated now.
    Unfortunately tail -f gave me the error below:

    ~ # tail -f /dev/kmsg
    tail: read error: Invalid argument
    tail: read error: Invalid argument
    tail: read error: Invalid argument

    Do I need some additional switch to make the command usable?

    Anyway, as a workaround, I created a cron job to run
    dmesg -c >> /tmp/dmesg.log
    every minute, opened the log with tail -f and set PuTTY to save the terminal output to a file.

    I hope I catch something... the resync should be complete in 1.5 hours...
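
The polling workaround described above can also be written as a small loop that runs more often than once a minute and flushes after every round. This is only a sketch under assumptions: the function name and the output path are placeholders, and it assumes the log file lives on storage that survives a reboot (a file in /tmp is lost if /tmp is RAM-backed, which is common on embedded devices).

```shell
# capture_log OUTFILE INTERVAL ROUNDS COMMAND...
# Runs COMMAND every INTERVAL seconds and appends its output to OUTFILE.
# ROUNDS=0 means loop until interrupted. Hypothetical helper.
capture_log() {
    out=$1; interval=$2; rounds=$3
    shift 3
    n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        "$@" >> "$out" 2>&1   # e.g. dmesg -c (prints and clears the kernel ring buffer)
        sync                  # push the new lines to disk before the next round
        n=$((n + 1))
        sleep "$interval"
    done
}

# Usage on the NAS (the path is a placeholder):
#   capture_log /path/to/persistent/kmsg.log 5 0 dmesg -c
```

Messages printed between the last round and the reboot are still lost, so a shorter interval narrows the gap but cannot close it.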


  • poloschka
    Well, the crontab solution didn't work either, so I wrote a script to catch the logs.
    These messages repeat right before the NAS restarts when the resync completes:

    [83984.552585] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900400: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(188416), inode=0, rec_len=0, name_len=0
    [83984.553745] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900407: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(204800), inode=0, rec_len=0, name_len=0
    [83984.554153] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900415: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(237568), inode=0, rec_len=0, name_len=0
    [83984.554545] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900413: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(229376), inode=0, rec_len=0, name_len=0
    [83984.555249] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900404: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(196608), inode=0, rec_len=0, name_len=0
    [83984.555637] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900414: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(233472), inode=0, rec_len=0, name_len=0
    [83984.555933] EXT4-fs error (device md2): htree_dirblock_to_tree:587: inode #11864587: block 94900351: comm smbd: bad entry in directory: rec_len is smaller than minimal - offset=0(102400), inode=0, rec_len=0, name_len=0

    (I SSHed in from the wrong PC and forgot to have PuTTY save the whole terminal output to a file.)

