Issue with File Transfer Speeds using an ATP800

NEP
NEP Posts: 87  Ally Member
First Comment Friend Collector Third Anniversary
edited August 6 in Security

Hello,

I have a question about the performance of an ATP800. We are only getting file transfer speeds of about 40 MB/s on the network. Looking into this, the weak link appears to be the uplink to the ATP. We have 10G fiber throughout the company connecting all the switches, and speeds on the same subnet are 100 MB/s. However, the second you try to hop subnets (which is most traffic), the speed drops. I assume this is because the ATP only has 1G connections.

That said, I checked the Port Statistics monitor and it shows that we have spikes of ~650 Mbps but the average is more like 250 Mbps. All of the "throughput" specs for that ATP are at least 2.5 times higher than the max we are seeing. At the moment, all LAN connections pass through App Patrol and Web Content Filter. We then assumed this might be the issue, but I believe those fall under UTM and the throughput for that is 1900 Mbps. It seems that should more than cover our traffic.
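Because the figures above mix megabytes per second (file copies) with megabits per second (port statistics and datasheet values), it may help to put them on one scale. A minimal Python sketch, assuming decimal units (1 MB = 10^6 bytes) and ignoring protocol overhead; the function name is only illustrative:

```python
def mbytes_to_mbits(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (decimal units)."""
    return mb_per_s * 8


# Figures quoted in this post.
cross_subnet = mbytes_to_mbits(40)   # file copy routed through the ATP
same_subnet = mbytes_to_mbits(100)   # file copy within one subnet
uplink_mbps = 1000                   # nominal rate of the 1G uplink

print(f"cross-subnet: {cross_subnet:.0f} Mbps ({cross_subnet / uplink_mbps:.0%} of the uplink)")
print(f"same-subnet:  {same_subnet:.0f} Mbps ({same_subnet / uplink_mbps:.0%} of the uplink)")
```

On that scale, 40 MB/s is about 320 Mbps, which sits between the ~250 Mbps average and the ~650 Mbps spikes on the port monitor and is well below both the 1G link rate and the 1900 Mbps UTM figure.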

With that in mind I have a few questions. The ATP seems capable of it, so why can't we get higher than 40 MB/s transfer? I see that the USG Flex 700h has 10G support and that would be an option in the future. However, having searched the manual, I found that the ATP supports LAG. Would you recommend that we set up an LACP LAG to get more throughput or will that not help in this scenario?

If you need more information, let me know. Thank you for your time.

Update 1: I created a policy that allows traffic from one IP to another to bypass UTM by setting all profiles to None. At least I assume that is what is happening. I checked the logs and the policy is being used, but there was no difference in performance. Then a coworker found that iPerf with -P 5 (multiple streams) works at 100 MB/s. Because iPerf on the same subnet without -P and through the ATP with -P both reach 100 MB/s, it seems that the ATP is capable of that speed even with the 1G uplink. So the original question remains: what is causing the bottleneck?
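For anyone wanting to repeat the single-stream versus multi-stream comparison, here is a small Python sketch that only builds the two client command lines. It assumes the `iperf3` binary (the thread just says "iPerf"), a 30-second run, and a made-up server address; `-c`, `-t`, and `-P` are the standard client, duration, and parallel-stream options.

```python
import shlex


def iperf_client_cmd(server_ip, streams=1, seconds=30, binary="iperf3"):
    """Build an iPerf client command line; -P is added only for multiple streams."""
    cmd = [binary, "-c", server_ip, "-t", str(seconds)]
    if streams > 1:
        cmd += ["-P", str(streams)]
    return cmd


server = "192.0.2.10"  # hypothetical iPerf server on the other subnet
for n in (1, 5):
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(iperf_client_cmd(server, streams=n)))
```

Running both variants against the same server, back to back, keeps everything except the stream count constant.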

All Replies

  • PeterUK
    PeterUK Posts: 3,926  Guru Member
    100 Answers 2500 Comments Friend Collector Seventh Anniversary
    edited August 6

    The LAG on older Zyxel models limits the speed to 1 Gb no matter how many ports are in it; it is only for redundancy. I'm not sure whether the FLEX H has the same limitation.

    One thing I do know is that if Collect Statistics is enabled under Traffic Statistics, it can limit speed. The problem is that it is re-enabled on every reboot, so see what happens with it turned off.

    I'm not sure how many cores the ATP800 has, but because it runs the older ZLD, it works best with multiple streams, which the ATP800 tries to spread evenly across its cores for best throughput.
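As a rough illustration of why stream count can matter on a multi-core forwarder, the Python sketch below maps each flow to a core by hashing its addresses and ports. This is a generic flow-hashing model, not Zyxel's documented mechanism; the core count, IP addresses, and ports are all assumptions chosen for the example.

```python
import zlib
from collections import Counter

NUM_CORES = 4  # assumption: the thread does not establish the ATP800's core count


def core_for_flow(src_ip, src_port, dst_ip, dst_port, num_cores=NUM_CORES):
    """Pick a core from a hash of the flow's addresses and ports (illustrative only)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_cores


# One stream: every packet carries the same addresses and ports.
single = Counter(
    core_for_flow("10.0.1.5", 50000, "10.0.2.9", 5201) for _ in range(1000)
)

# Five parallel streams: only the source port differs between them.
multi = Counter(
    core_for_flow("10.0.1.5", 50000 + i, "10.0.2.9", 5201)
    for i in range(5)
    for _ in range(1000)
)

print("single stream, packets per core:", dict(single))
print("five streams, packets per core: ", dict(multi))
```

In this model a single stream always lands on one core, while parallel streams can land on different cores, so a single copy would be limited by what one core can forward.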

    When you're testing with iPerf, you should test through the ATP800, i.e. from one subnet to another. Testing within the same subnet does not route through the ATP800 and stays at the switch level, unless you use proxy ARP to redirect same-subnet traffic to the ATP800.
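A quick way to confirm that a test pair really crosses the router is to check whether the two addresses share a subnet. A minimal Python sketch using the standard `ipaddress` module; the addresses and the /24 prefix are made-up examples, and the proxy ARP case mentioned above is not modeled.

```python
import ipaddress


def crosses_router(src_ip, dst_ip, prefix_len=24):
    """Return True when dst_ip lies outside src_ip's subnet, so traffic must be routed."""
    src_net = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    return ipaddress.ip_address(dst_ip) not in src_net


print(crosses_router("192.168.10.20", "192.168.10.30"))  # False: stays at the switch
print(crosses_router("192.168.10.20", "192.168.20.30"))  # True: routed through the gateway
```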

  • NEP
    edited August 6

    Thanks for the info, @PeterUK. I figured that LAG probably would not improve the situation. Just wondered if having more connections might cause more of the device's hardware to be utilized, but then again it should be smart enough to load balance the connections to available hardware.

    The above comment is based on a thread that my coworker found. That discussion indicated that the USG1100 "will use one core to do the packet forwarding and this is the current design mechanism."

    Anyway, I looked and the ATP800 is double the specs of the USG1100 in most regards, so performance should be better. However, we are getting roughly the same as them. If what was stated there applies, it sounds like the CPU is the determining factor when it comes to transfer speed. It just boggles my mind that a multi-1G router is incapable of sustaining 1G transfer from one device to another. All because the CPU that the device has is incapable of sustaining 1G. That's crazy!

    As for disabling Traffic Statistics, I'll give that a shot at some point. I would like to hear from Zyxel on it. In my mind there is no way that turning stats off could cause a massive performance increase. Yes, there will be some gain because of the reduced overhead, but it should be near impossible to jump from 40 MB/s to 100 MB/s, or even 60 MB/s for that matter.

    Lastly, I don't like running iPerf with multiple streams because it doesn't give a real-world feel for what performance is like. If it runs better with multiple streams but nothing else on the network (namely Windows file copy) is able to do that, the test does nothing for us.

  • PeterUK

    Yes, that's why the FLEX H with uOS was overdue compared with the ZLD hardware design. That said, I would expect the ATP800 to do better when the traffic is not handled by UTM and such.

    You should test by two PC direct to ATP800 on different subnets to rule out anything else.

     

  • Zyxel_Melen
    Zyxel_Melen Posts: 3,610  Zyxel Employee
    Zyxel Certified Network Engineer Level 1 - Switch Zyxel Certified Network Administrator - Switch Zyxel Certified Network Administrator - Nebula Zyxel Certified Sales Associate

    Hi @NEP

    Could you provide some information so we can check this issue?

    1. Your test topology, including the link speed on each connection, when testing across subnets.
    2. Your test clients' OS information.
    3. Which security services did you enable? Or could you provide your configuration?
    4. Could you test the scenario that PeterUK mentioned? "test by two PC direct to ATP800 on different subnets"

    This will help us clarify the issue. Thanks for your cooperation!

    Zyxel Melen


  • NEP

    @Zyxel_Melen Thanks for getting back to me. I was talking with my coworker, trying to figure out what your response meant. It's quite ambiguous. I'm hoping the ambiguity means that you think the ATP800 is capable of 1 Gbps throughput but need more info to find out why ours is not.

    1. ATP is connected to the top switch with 1G Ethernet. Switches are all connected with 10G fiber. Devices are connected to the switches with 1G Ethernet.
    2. Yesterday's tests were all on Windows 10. Today's test (direct connection to the ATP) was done on Windows 11.
    3. Not sure how to answer this one. As stated before, all LAN connections are processed by UTM (App Patrol and Content Filter). Anti-Malware, IP Reputation, URL Threat Filter, and IPS are enabled. Not sure if they are simply enabled or are applied somewhere.
    4. As requested, I created two new subnets and applied them to two open ports. We tested with the devices going through UTM and also not (skip App Patrol and Content Filter via policy). The results were 21-37 MB/s. UTM on and off didn't make much difference. The range seems to be a result of other traffic.

    Not sure if it helps, but regular usage on the system seems low. CPU is usually around 10-15%, Memory 36%, and Flash 18%. A 30-second iPerf test with 1 stream bumps CPU up to ~30%, while a 30-second test with 5 streams brings it to ~48%. That is just based on a live view from the dashboard.
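One caveat when reading those dashboard numbers: if the percentage is an average across cores, a modest total can still hide one saturated core. The Python sketch below shows the arithmetic with entirely made-up per-core figures and an assumed four cores; the thread does not establish the ATP800's core count or how the dashboard computes its value.

```python
def aggregate_cpu_percent(per_core_percent):
    """Average per-core utilization into a single dashboard-style figure."""
    return sum(per_core_percent) / len(per_core_percent)


# Hypothetical: one core pegged by a single flow, the other three nearly idle.
one_core_busy = [100, 5, 5, 10]
print(aggregate_cpu_percent(one_core_busy))  # 30.0
```

Per-core utilization, if the device exposes it, would separate a single-core limit from a device that simply has headroom.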

    Let me know if you would like anything else. Thanks.

  • PeterUK

    SSH to the ATP800 and run:

    debug hardware fan-get

    Are the fans running and the temperatures good?

  • NEP

    Sensor[0]:43 degree
    Sensor[1]:21 degree
    Sensor[2]:30 degree
    FAN[0]:4927 RPM
    FAN[1]:4986 RPM
    FAN[2]:4856 RPM
    FAN[3]:4918 RPM

  • PeterUK

    And for the two-subnet test, were the test PCs connected directly to the ATP800 without a switch?

  • NEP

    That is correct. Both laptops were connected to a port on the ATP800 and the config on each port was a separate subnet. We double-checked that before running the iPerf test.

  • PeterUK

    Hmm, link the two laptops directly together with static IPs and test the speed.