NAS540 crashes due to high CPU load (python?)

kimme
kimme Posts: 35  Freshman Member
edited January 2019 in Personal Cloud Storage
Hi All!

When I use Transmission to download data, the NAS crashes after a couple of minutes.
I tracked it down to the fact that my CPU load is locked at 100%.
The first thing I did was disable Twonky Media Server and the generation of thumbnails, as described in a post I found.
This helped, but I still get a crash now and then.
When monitoring the CPU load, Transmission doesn't take more than 50%, but there's a python service that sometimes takes up to 60%.
When idle, the python service takes 20% CPU and 130 MB RAM, which is about 20% of the NAS's capacity.

I checked the following:
/ $ ps | grep python | grep -v grep
 1569 root      127m S    python /usr/local/apache/web_framework/main_wsgi.pyc
 3588 root     26872 S    python /usr/local/apache/web_framework/job_queue_daemon.pyc
 3952 root      107m S N  python /usr/local/fileye/fileye.pyc
/ $

Is there something I can kill or disable to free up more resources?

I'm only using the NAS for Transmission and a Google Drive service to backup important data to the cloud.

Thanks in advance!

Kim

#NAS_Jan_2019


All Replies

  • Mijzelf
    Mijzelf Posts: 2,645  Guru Member
    What do you mean by crash? The system shouldn't crash due to high CPU load, although it might reboot when the watchdog daemon doesn't get enough cycles.
    The OOM killer might kill some daemon when the system runs out of memory, and if you need that daemon to connect to the NAS, the NAS can seem to be down.

    I know fileye.pyc used to use lots of memory, and I wrote a tool to disable it (link), but unfortunately AFAIK it's needed to upload to Google Drive.
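
One way to test the OOM-killer theory from the reply above is to search the kernel log for its messages. The sketch below is only an illustration: the two patterns are wordings that Linux kernels commonly print, the exact text differs between kernel versions, and the sample log lines are made up rather than taken from this NAS.

```python
import re

# Wordings the Linux kernel commonly logs when the OOM killer runs.
# Assumption: the firmware's kernel uses one of these forms.
OOM_PATTERNS = [
    re.compile(r"invoked oom-killer"),
    re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)"),
]


def find_oom_events(log_lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in log_lines
            if any(p.search(line) for p in OOM_PATTERNS)]


def killed_processes(log_lines):
    """Return (pid, name) for every process the log reports as killed."""
    victims = []
    for line in log_lines:
        match = OOM_PATTERNS[1].search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims


# Made-up sample log, for illustration only.
SAMPLE_LOG = [
    "[ 1234.5] python invoked oom-killer: gfp_mask=0x201da, order=0",
    "[ 1234.6] Out of memory: Kill process 3952 (python) score 101 or sacrifice child",
    "[ 1300.0] egiga0: link up",
]

if __name__ == "__main__":
    for pid, name in killed_processes(SAMPLE_LOG):
        print(f"OOM killer hit pid {pid} ({name})")
```

To use it on real data, save the output of `dmesg` to a file after an outage and pass the file's lines to `find_oom_events`. An empty result would point away from memory exhaustion and towards the watchdog or something else.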
  • kimme
    kimme Posts: 35  Freshman Member
    As the NAS wasn't accessible in any way, I just assumed it had crashed and was rebooting.
    So if I read your comment correctly, there's not much I can do about this.
    It's not that bad anyway; after the tweaks I already made, I think it only goes down once every couple of hours now. I was only hoping to fix it completely...
    Anyway, thanks for the quick reply!
  • kimme
    kimme Posts: 35  Freshman Member
    edited January 2019
    I used your tool to disable fileye.pyc, but I still have the same CPU and memory usage.

     1564 root      135m S    python /usr/local/apache/web_framework/main_wsgi.pyc
     3596 root     36316 S    python /usr/local/apache/web_framework/job_queue_dae

    Are these 2 processes necessary to run Transmission and access the GUI?

    If not, is there a way to disable them so I can free up more resources?

    EDIT1:

    I just started Transmission again, and every couple of minutes everything stops working for some time...

    When this happens, the Transmission web interface and the GUI of the NAS go offline for a couple of minutes. My shares also stop being available.

    When it's available again, my downloads restart but have lost any progress made before the "crash".

    EDIT2:

    After the last one I wanted to check the logs and I saw a red error icon at the bottom; apparently my RAID array has now started resyncing...

    I don't know what triggers all of this, but I hope my NAS isn't at the end of its life cycle after 5 years of very mild usage.

  • kimme
    kimme Posts: 35  Freshman Member
    OK, the resync just finished.
    To be sure, I've updated the firmware to the latest version (a minor update, I think).
    The problem remains.
    After testing some more, the problem seems to occur when a download finishes. I replicated the issue with different downloads.
    When a download is in its last 5 seconds, Transmission stops refreshing and immediately after that everything stops working (Transmission, NAS GUI, mounted shares, ...).

    Any ideas?
  • Mijzelf
    Mijzelf Posts: 2,645  Guru Member
    1569 root      127m S    python /usr/local/apache/web_framework/main_wsgi.pyc
    3588 root     26872 S    python /usr/local/apache/web_framework/job_queue_daemon.pyc
    Both processes are sleeping. And I checked on my NAS520:
    1331 root      123m S    python /usr/local/apache/web_framework/main_wsgi.pyc
    5544 root     17656 S    python /usr/local/apache/web_framework/job_queue_daemon.pyc

    The memory usage isn't diverging either. Have you used top to see which process is hogging the CPU?

  • kimme
    kimme Posts: 35  Freshman Member
    Hi,

    In the meantime I reinstalled the transmission app and for now everything seems to be working fine. Obviously it's only been a bit more than 1 hour without crashing but hey, it's better than a crash every minute.

    As for the python process, how can I check which process it might be? When I click the CPU/MEM graph in the GUI it only says "python", and it varies between 15 and 45% CPU usage and 125 to 145 MB RAM.
  • Mijzelf
    Mijzelf Posts: 2,645  Guru Member
    how can I check which process it might be?
    Run 'top' on the command line.
    the GUI it only says "python" and it varies between 15 and 45% of CPU
    Python is the GUI in this case, so by looking at it you are influencing the numbers. Very Heisenberg.
  • kimme
    kimme Posts: 35  Freshman Member

    Mem: 955576K used, 56060K free, 0K shrd, 21376K buff, 794132K cached
    CPU:  1.8% usr  1.6% sys  0.0% nic 96.4% idle  0.0% io  0.0% irq  0.0% sirq
    Load average: 1.12 1.11 1.07 1/144 28052
      PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
     1571     1 root     S     123m 12.4   0  1.5 python /usr/local/apache/web_framework/main_wsgi.pyc
     2759     1 root     S    80316  7.9   0  0.3 /i-data/4ebe74b4/.PKG/Transmission/bin/transmission-daemon -g /i-data/4ebe74b4/.PKG/Transmission/config/Transmission/
     4089     1 root     S N   2808  0.2   0  0.1 avahi-daemon: running [NAS540.local]
    27928 27919 admin    R     2756  0.2   1  0.1 top
    27608  3850 admin    S N  26312  2.5   0  0.0 /usr/sbin/smbd -D
     3850     1 root     S N  26028  2.5   1  0.0 /usr/sbin/smbd -D
     4118  3850 root     S N  26028  2.5   0  0.0 /usr/sbin/smbd -D
     4006     1 root     S <  20336  2.0   1  0.0 /usr/sbin/nsuagent
     3946     1 root     S N  18124  1.7   1  0.0 /usr/sbin/nmbd -D
     4957     1 root     S    17660  1.7   0  0.0 python /usr/local/apache/web_framework/job_queue_daemon.pyc
     6158     1 root     S    12868  1.2   1  0.0 /i-data/4ebe74b4/.PKG/myZyXELcloud-Agent/bin/zyxel_xmpp_client
     5982  5981 nobody   S N  11540  1.1   0  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     5983  5981 nobody   S N  11540  1.1   1  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     6669  5981 nobody   S N  11540  1.1   1  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     6678  5981 nobody   S N  11408  1.1   0  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     6679  5981 nobody   S N  11408  1.1   0  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     3880  3681 nobody   S    11400  1.1   1  0.0 /i-data/.system/zy-pkgs/pkg_httpd -f /etc/pkg_service_conf/httpd2.conf
     3879  3681 nobody   S    11400  1.1   1  0.0 /i-data/.system/zy-pkgs/pkg_httpd -f /etc/pkg_service_conf/httpd2.conf
     5981     1 root     S N   9892  0.9   1  0.0 /usr/sbin/httpd -f /etc/service_conf/httpd.conf
     3681     1 root     S     9888  0.9   1  0.0 /i-data/.system/zy-pkgs/pkg_httpd -f /etc/pkg_service_conf/httpd2.conf
     1533     1 root     S     6000  0.5   0  0.0 /usr/sbin/uamd
     4162  4113 root     S N   5908  0.5   1  0.0 /usr/sbin/afpd -d -F /etc/netatalk/afp.conf
    27904  1828 root     S     5484  0.5   1  0.0 {sshd} sshd: admin@pts/0
     1828     1 root     S     5484  0.5   1  0.0 /sbin/sshd -f /etc/ssh/sshd_config
     7701     1 root     S N   5280  0.5   1  0.0 /usr/bin/schedule_controller
     4163  4113 root     S N   4312  0.4   1  0.0 /usr/sbin/cnid_metad -d -F /etc/netatalk/afp.conf
     4113     1 root     S N   4152  0.4   1  0.0 /usr/sbin/netatalk -F /etc/netatalk/afp.conf
     1543     1 root     S N   3808  0.3   0  0.0 /usr/sbin/cupsd
     5799     1 root     S     3776  0.3   0  0.0 stunnel /etc/stunneld.conf
     1523     1 root     S     3696  0.3   1  0.0 /usr/sbin/zylogd
     2010  2008 root     S     3692  0.3   1  0.0 /usr/sbin/zylogger -r
     5771     1 root     S     3664  0.3   0  0.0 zysync --daemon --config /etc/zysyncd.conf --log-file /i-data/.system/zysyncd.log
     2009     1 root     S     3284  0.3   0  0.0 /usr/sbin/syslog-ng
     3074     1 root     S     2912  0.2   1  0.0 {Tweaks} /bin/sh /i-data/4ebe74b4/.PKG/Tweaks//etc/init.d/Tweaks daemon 3064
     5937  2928 root     S     2880  0.2   0  0.0 /i-data/4ebe74b4/.PKG/myZyXELcloud-Agent/bin/zyxel_enet_bridge
     4974     1 root     S N   2788  0.2   1  0.0 /usr/sbin/app_wd
    27919 27904 admin    S     2756  0.2   0  0.0 -sh
     7246     1 root     S N   2752  0.2   1  0.0 /sbin/crond -L /dev/null
        1     0 root     S     2644  0.2   0  0.0 {init} /bin/sh /init
     2928     1 root     S     2644  0.2   1  0.0 {xmpp_client_mon} /bin/sh /i-data/4ebe74b4/.PKG/myZyXELcloud-Agent/bin/xmpp_client_monitor.sh
     6055     1 root     S     2644  0.2   0  0.0 /bin/ifplugd -a -p -q -t3 -d0 -u0 -i egiga0 -r /sbin/egiga_link_up_down.sh
     6057     1 root     S     2644  0.2   0  0.0 /bin/ifplugd -a -p -q -t3 -d0 -u0 -i egiga1 -r /sbin/egiga_link_up_down.sh
     2827     1 root     S N   2644  0.2   0  0.0 {gdrive_update.s} /bin/sh /i-data/4ebe74b4/.PKG/GoogleDriveClient/bin/gdrive_update.sh startup
      855     1 root     S     2644  0.2   1  0.0 {linuxrc} init
     2008     1 root     S     2644  0.2   0  0.0 /bin/sh -c /usr/sbin/zylogger -r
     6255   855 root     S     2644  0.2   1  0.0 {linuxrc} init
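
A listing like the one above is easier to read once it is ranked by %CPU. The following is a minimal sketch, assuming the BusyBox-style column layout shown in this post (PID, PPID, USER, STAT, VSZ, %VSZ, CPU, %CPU, COMMAND); the function names are invented for the example, and the sample rows are copied from the listing.

```python
import re

# One process row. STAT may carry a second flag after a space ("S N", "S <"),
# so the line cannot simply be split on whitespace.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<ppid>\d+)\s+(?P<user>\S+)\s+"
    r"(?P<stat>\S+(?:\s[<N])?)\s+(?P<vsz>\d+[mg]?)\s+(?P<pvsz>[\d.]+)\s+"
    r"(?P<cpu>\d+)\s+(?P<pcpu>[\d.]+)\s+(?P<command>.+)$"
)


def parse_top(text):
    """Return one dict per process row; header and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "pid": int(match.group("pid")),
                "stat": match.group("stat"),
                "pcpu": float(match.group("pcpu")),
                "command": match.group("command").strip(),
            })
    return rows


def busiest(text, count=3):
    """Return the `count` rows with the highest %CPU, highest first."""
    return sorted(parse_top(text), key=lambda r: r["pcpu"], reverse=True)[:count]


SAMPLE = """\
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
 4006     1 root     S <  20336  2.0   1  0.0 /usr/sbin/nsuagent
 4089     1 root     S N   2808  0.2   0  0.1 avahi-daemon: running [NAS540.local]
 1571     1 root     S     123m 12.4   0  1.5 python /usr/local/apache/web_framework/main_wsgi.pyc
27928 27919 admin    R     2756  0.2   1  0.1 top
"""

if __name__ == "__main__":
    for row in busiest(SAMPLE):
        print(row["pid"], row["pcpu"], row["command"])
```

A single snapshot only shows one instant, so capturing several snapshots while a large download is finishing would say more about what is busy just before the NAS becomes unreachable.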


  • Mijzelf
    Mijzelf Posts: 2,645  Guru Member
    Well, as you can see the CPU load is less than 4%. Yet something fishy is going on:
    Load average: 1.12 1.11 1.07
    These are the load averages for the last minute, the last 5 minutes, and the last 15 minutes. The number itself roughly means the number of processes/threads that want CPU cycles at any given moment. A value of around 1 is very high for a system which is mostly idle; 0.01 is more to be expected. In a naive approach you could say a load average of 1.12 matches a CPU load of 112% (and as the box has 2 CPUs, you can go up to 200%). In reality this translation cannot be made, but 1.12 does not match 4%.
    Are you downloading/seeding a lot simultaneously with Transmission? And do you have a high bandwidth to the internet? In that case it's possible that you are massively waiting for the hard disks. The random access times of a mechanical hard disk just cannot cope with the characteristics of torrenting through a big pipe.
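
The mismatch described above can be put into numbers. This is only a sketch of the naive comparison, which the reply itself says is not a real translation: it parses a string in the `/proc/loadavg` layout (the same five fields that follow "Load average:" in top) and subtracts the CPU-equivalent of the busy percentage. On Linux the load average also counts tasks in uninterruptible sleep, typically waiting for disk I/O, so a large remainder is at least consistent with the disk-wait theory. The function names and the choice of 3.6% busy (100% minus the 96.4% idle from the posted top output) are assumptions of this example.

```python
def parse_loadavg(text):
    """Parse the five fields of a /proc/loadavg-style string."""
    one, five, fifteen, tasks, last_pid = text.split()
    running, total = tasks.split("/")
    return {
        "load": (float(one), float(five), float(fifteen)),
        "running": int(running),   # tasks currently runnable
        "total": int(total),       # tasks that exist on the system
        "last_pid": int(last_pid),
    }


def unexplained_load(load_1min, busy_percent, cpu_count):
    """Naive estimate of the load that the CPU usage does not account for.

    busy_percent is the aggregate busy share over all CPUs (100 - idle),
    so busy_percent / 100 * cpu_count is the number of CPUs kept busy.
    """
    cpu_equivalent = busy_percent / 100.0 * cpu_count
    return load_1min - cpu_equivalent


if __name__ == "__main__":
    info = parse_loadavg("1.12 1.11 1.07 1/144 28052")
    gap = unexplained_load(info["load"][0], busy_percent=3.6, cpu_count=2)
    print(f"1-minute load {info['load'][0]}, unexplained by CPU: {gap:.2f}")
```

If the remainder stays near 1 while nothing is downloading, looking for processes in state D in the top listing would be a reasonable next step.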
  • kimme
    kimme Posts: 35  Freshman Member
    At the moment of posting nothing was downloading.
    When I download, there are at most 4 or 5 simultaneous downloads at a maximum combined speed of 200 Mbit/s.
    After some more testing, smaller torrents download without any problems. Bigger downloads (3 GB and up) tend to crash more often.
