Intermittent loss of internet access and cannot ping firewall

Dudley_Winchester Posts: 22  Freshman Member
First Comment Sixth Anniversary
edited April 2021 in Security
I'm starting this thread as I have seen this problem on more than one client site and I can't see other posts relating to this issue.
Our setup is Leased Line (static IP) > USG310 > ZyXEL GS Managed Switch > ZyXEL Smaller GS Managed Switch > Desktop PCs and Macs
About once a day a few random PCs will lose internet access and we can't ping the firewall. Unplugging the network cable and plugging it back in resolves the issue. Oddly, it seems "established" services like Slack still work, the PC still has a valid IP and the firewall can still ping the desktop. I thought it may be Avast on the desktops, but the Macs sometimes have the same problem.
Our USG and switches have a management VLAN (Untagged / PVID=1), a Data VLAN (tagged everywhere up to the switch port that the PC plugs into) and a Phone VLAN. The Phones go through a different gateway (ISP router with its own DHCP server) and never have a problem, hence I wonder if the issue is at the USG. The USG acts as a DHCP server for the management and Data VLANs.
Because it happens on user PCs we have limited time to fault find; next up we are going to see what the desktops can ping, e.g. the printers. We wouldn't be able to ping the switches as they are on a different VLAN.
We are also using Spanning Tree Protocol across our switches, but I suspect if that was the problem all desktops would stop, rather than random PCs at random times of day.
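The "see what the desktops can ping" check could be scripted so it takes seconds on a user's PC. This is only a minimal sketch: the addresses in TARGETS are hypothetical placeholders, and the ping return code is a rough indicator (on Windows it can report success for some "unreachable" replies).

```python
import platform
import subprocess
from typing import Callable, Dict

# Hypothetical addresses; substitute the real VLAN10 gateway, a printer, and a peer PC.
TARGETS: Dict[str, str] = {
    "usg_vlan10_gateway": "192.168.10.1",
    "printer": "192.168.10.50",
    "other_desktop": "192.168.10.101",
}


def ping_once(address: str, timeout_s: int = 5) -> bool:
    """Send one ICMP echo via the operating system's ping command."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def probe(targets: Dict[str, str], ping: Callable[[str], bool] = ping_once) -> Dict[str, bool]:
    """Return a name -> reachable mapping for every target."""
    return {name: ping(address) for name, address in targets.items()}


def summarize(results: Dict[str, bool], gateway_key: str = "usg_vlan10_gateway") -> str:
    """Classify the result so the symptom in this thread is easy to spot."""
    gateway_ok = results.get(gateway_key, False)
    peers_ok = any(ok for name, ok in results.items() if name != gateway_key)
    if gateway_ok:
        return "gateway reachable"
    if peers_ok:
        return "gateway unreachable, same-VLAN peers reachable"
    return "nothing reachable"
```

Running `print(summarize(probe(TARGETS)))` on an affected PC would show whether only the firewall is unreachable or the whole VLAN.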

All Replies

  • Alfonso
    Alfonso Posts: 257  Master Member
    5 Answers First Comment Friend Collector Second Anniversary
    Hi @Dudley_Winchester

    It really looks very strange.
    I have some questions about your scenario.

    Are the desktops connected to the IP phones, and are the IP phones connected to the switch?
    Or are both the phones and the desktops physically connected to the switch?

    I assume the second scenario.

    As with many network issues, it is not easy to give you a magic solution, but here is how I would proceed:

    - monitor the network interfaces of all network devices (USG, switches, ...) via SNMP, focusing on bandwidth, errors, ... If you want a free solution I would use Cacti
    - deploy a network sniffer; I would start by sniffing traffic to the USG
    - review all network device logs (the best way is to send all network logs to a centralized syslog server).
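
    For the centralized syslog idea, a very small UDP receiver can be enough for a short test. The sketch below is an assumption-laden illustration, not a replacement for a real syslog daemon: the port (5514) and log file name are hypothetical, and it only decodes the standard `<PRI>` prefix (facility * 8 + severity).

```python
import re
import socketserver
from typing import Optional, Tuple

# Hypothetical output file and port; 514 is the usual syslog port but needs privileges.
LOG_FILE = "network-devices.log"
LISTEN_PORT = 5514

SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def parse_pri(line: str) -> Optional[Tuple[int, str]]:
    """Decode the <PRI> prefix into (facility number, severity name)."""
    match = PRI_PATTERN.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    return pri // 8, SEVERITIES[pri % 8]


class SyslogHandler(socketserver.BaseRequestHandler):
    """Append every received datagram to LOG_FILE with sender and severity."""

    def handle(self) -> None:
        data, _socket = self.request  # UDP servers pass (data, socket)
        line = data.decode("utf-8", errors="replace").strip()
        parsed = parse_pri(line)
        severity = parsed[1] if parsed else "unknown"
        with open(LOG_FILE, "a", encoding="utf-8") as log:
            log.write(f"{self.client_address[0]} {severity} {line}\n")


def serve(host: str = "0.0.0.0", port: int = LISTEN_PORT) -> None:
    """Block forever, collecting syslog datagrams from the USG and switches."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

    Calling `serve()` on a spare machine and pointing the devices' remote syslog setting at it would gather all logs in one file for comparison around the time a PC drops off.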

    I hope these ideas will help you.

    If you have further information to share, please let us know.

    Regards
  • Dudley_Winchester
    Dudley_Winchester Posts: 22  Freshman Member
    First Comment Sixth Anniversary
    Hi Alfonso,
    Thanks so much for your reply. I should clarify.
    The phones and computers are each connected to switch ports (i.e. PCs don't piggyback off phones).
    What we do see is that the firewall has a LAN1 with no VLAN (although it's all tagged PVID = 1 in the switches for the purpose of management of those devices). In the firewall the port is called ge3.
    Then, in the firewall we have a VLAN10 on top of ge3 with its own subnet. Using tagging in the switches and untagging at the switch port, the PCs get the correct IP address and are members of VLAN10... usually.
    What we have observed - when a PC goes offline, it still has a valid IP from the correct subnet and can ping other devices in the same VLAN - except for the firewall itself. Renewing the lease on the PC (or unplugging / plugging in the network cable) means the firewall pings again and we are back online.
    What we have seen in the firewall DHCP table is very odd - the desktop PCs all get the correct IP address in the range, but the entries show the DHCP server as either VLAN10 (correct) or ge3 (wrong) - ge3 should only issue IP addresses from a different subnet for switch management.
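    A check like this could flag those odd entries automatically once the DHCP table has been copied out of the firewall. It is a minimal sketch: the subnets, hostnames, and the `Lease` fields are hypothetical and would need to match the real table.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Dict, List

# Hypothetical subnets; replace with the real ge3 (management) and VLAN10 (data) ranges.
EXPECTED_SUBNET = {
    "ge3": ip_network("192.168.1.0/24"),
    "vlan10": ip_network("192.168.10.0/24"),
}


@dataclass
class Lease:
    hostname: str
    ip: str
    mac: str
    interface: str  # interface the firewall lists as the DHCP server


def find_mismatched(leases: List[Lease], expected: Dict = EXPECTED_SUBNET) -> List[Lease]:
    """Return leases whose IP is outside the subnet of the listed interface."""
    mismatched = []
    for lease in leases:
        subnet = expected.get(lease.interface.lower())
        if subnet is None or ip_address(lease.ip) not in subnet:
            mismatched.append(lease)
    return mismatched
```

    Any lease returned is one where a VLAN10-range address is attributed to ge3 (or the other way round), which is the pattern described above.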
    I'm thinking for now that we turn off the DHCP server linked to ge3 to see if that helps fix things - I'm also looking to see how I can roll back the USG from v4.3x firmware to v4.25.
  • danyedinak
    danyedinak Posts: 51  Ally Member
    First Comment Friend Collector Sixth Anniversary
    Hi Dudley,
    I don't know if you're still having this issue or not, but I have had this issue on a multitude of sites, primarily on the wireless connected clients. In each case, the resolution has been to increase the session limit from 1000 to at least 4000, sometimes (though rarely) higher. Keep an eye on it, though, as there appear to be at least some instances of that session limit being reset back to 1000 between firmware updates. There was also a pretty consistent problem with this on an older firmware version.
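    To judge whether the session limit is a plausible cause, the active sessions can be counted per client. This is a rough sketch that assumes the limit applies per host and that the session list has been exported as (source IP, destination IP, destination port) tuples; both are assumptions, not something confirmed in this thread.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Session = Tuple[str, str, int]  # (source IP, destination IP, destination port)


def hosts_near_limit(
    sessions: Iterable[Session], limit: int = 1000, warn_ratio: float = 0.8
) -> Dict[str, int]:
    """Return source hosts whose session count is at or above warn_ratio * limit."""
    counts = Counter(source for source, _destination, _port in sessions)
    threshold = limit * warn_ratio
    return {host: count for host, count in counts.items() if count >= threshold}
```

    A host that shows up here shortly before it loses connectivity would support raising the limit; an empty result would point elsewhere.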
  • Dudley_Winchester
    Dudley_Winchester Posts: 22  Freshman Member
    First Comment Sixth Anniversary
    Hi Dan,
    Thanks for taking the trouble to reply.
    I'm convinced it is a bug, only seen on the USG3xx series (not the 60s or 100s), and I have found an answer.
    If you uncheck MAC/IP Binding, it still binds MAC addresses to their corresponding IP addresses if you have entries in the list - i.e. turning off the binding actually leaves it on if there is a list! Turning it on made my life a misery. So we turned it off, and MAC-IP binding works fine.
  • Zyxel_Stanley
    Zyxel_Stanley Posts: 1,377  Zyxel Employee
    100 Answers 1000 Comments Friend Collector Seventh Anniversary

    Hi @Dudley_Winchester

    The VLAN issue has been resolved in the latest version. You can upgrade to the 4.38 firmware.


    The IP/MAC binding function has 2 parts.

    (1) IP/MAC binding function

    It forces all clients to request their IP address by DHCP.

    If a client is configured with a static IP, or its IP was not renewed, its traffic will be blocked by the USG.


    (2) Static DHCP bindings.

    You can create IP/MAC entries in this table. The device will offer the configured IP address to your PC.
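
    The two parts can be illustrated with a toy model. This sketch only mirrors the behaviour as described in this post and is not the USG's actual implementation; the class name, MAC addresses, and IP pool are hypothetical.

```python
from typing import Dict, List


class BindingModel:
    """Toy model: static DHCP bindings plus optional IP/MAC enforcement."""

    def __init__(self, static_bindings: Dict[str, str], pool: List[str], enforce: bool):
        self.static_bindings = dict(static_bindings)  # MAC -> reserved IP
        self.enforce = enforce                        # "Enable IP/MAC Binding" ticked?
        self.leases: Dict[str, str] = {}              # MAC -> IP currently leased via DHCP
        self._pool = list(pool)                       # free dynamic addresses

    def dhcp_request(self, mac: str) -> str:
        """Offer the reserved IP if one exists, otherwise the next pool address."""
        if mac in self.leases:
            return self.leases[mac]  # a renewal keeps the same address
        ip = self.static_bindings.get(mac)
        if ip is None:
            if not self._pool:
                raise RuntimeError("DHCP pool exhausted")
            ip = self._pool.pop(0)
        self.leases[mac] = ip
        return ip

    def expire(self, mac: str) -> None:
        """Simulate a lease that ran out without being renewed."""
        self.leases.pop(mac, None)

    def allows(self, mac: str, ip: str) -> bool:
        """With enforcement on, pass traffic only for a current DHCP-learned pair."""
        if not self.enforce:
            return True
        return self.leases.get(mac) == ip
```

    In this model a manually configured address is blocked when enforcement is on, and a client whose lease has lapsed is blocked until it requests an address again, which matches the symptom of PCs recovering after a release and renew.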


  • Dudley_Winchester
    Dudley_Winchester Posts: 22  Freshman Member
    First Comment Sixth Anniversary
    Thanks Stanley,
    I think it is a question of interpretation - and the way you describe is not the same as the user guide!
    Option 1:
    Ticking the Enable IP/MAC Binding means that devices listed in the static DHCP Bindings list will get the listed IP address, those not in the list will get a random IP from the pool but will still work.
    That's how it works in the USG60 but was not working for the USG310.
    Option 2:
    Ticking the Enable IP/MAC Binding means that devices listed in the static DHCP Bindings list will get the listed IP address, those not in the list will get a random IP from the pool, will work during the lease period and will then be blocked by the USG.
    This is what the USG310 was doing for me (two of them) and makes no sense!
    Hopefully 4.38 means option 1 is now deployed?
    The only other option would be where Enable means only devices in the list are allowed, but I would call that "force Binding" not "Enable" and change the user manual.
    Looking forward to trying 4.38 and seeing what happens.
    Cheers, Dudley
  • Zyxel_Stanley
    Zyxel_Stanley Posts: 1,377  Zyxel Employee
    100 Answers 1000 Comments Friend Collector Seventh Anniversary

    Hi @Dudley_Winchester

    The Static DHCP Bindings function will offer the listed IP address to the specific client whether or not IP/MAC binding is enabled.

    Once the IP/MAC binding function is enabled, it stops anyone else from setting an IP address manually (except clients listed in the static DHCP bindings table).


    According to your symptom on USG310:

    You can upgrade to the 4.38 firmware first. After enabling the IP/MAC binding function, release and renew the IP address on all of your clients to make sure their IP addresses are offered by the USG.

    Then monitor the status until the next DHCP lease time.
