VLAN rogue-connection

novasune · December 2021

I have a network consisting of a handful of XGS1930 switches, an NXC controller and a firewall. (The explanation here is simplified from the actual setup.)

The network is divided into:

Administration: 192.168.2.0/24, with a Windows server as DHCP server.
Guest: 192.168.10.0/25, with a router as DHCP server.
Management: 192.168.3.0, where all switches, the NXC2500 and the APs are.

Wifi is available on both administration and guest.
Switches and APs are connected on trunk ports that carry all 3 VLANs.

We sometimes see a device that logs onto wifi-admin get an IP from the guest pool, and vice versa.
Firewall rules prevent the device from going anywhere, but the device becomes unusable because it has the wrong IP.

It seems like there is a "loophole" between the VLANs, but I am having difficulty tracking it down.

Loopguard doesn't catch it, but I also believe that loopguard only catches loops within the same VLAN.

We can see from ipconfig that the device that handed out the "wrong" IP is either the Windows server or the router.

I have found DHCP snooping. It allows us to block rogue DHCP servers, but it doesn't really stop or identify the problem: the 2 valid DHCP servers are known, but they sit behind the same trunk ports most of the way. It also doesn't have any option to log DHCP requests.

We suspect it is a device that is connected to an access port over Ethernet and also to the other network over wifi, and thereby acts as an unintended bridge between the two networks.
The facility has a lot of devices, such as infoscreens, bridges, printers etc., that all have both wifi and Ethernet, but it could also be an unmanaged switch on a trunk port or similar.

We are looking for ways to log DHCP requests across the network and find the "path", and thereby identify on which port the bug resides.

Do the switches offer such a thing, or do we need a PC with Wireshark?
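
If it does come down to a capture, here is roughly what I imagine running over the captured DHCP packets. This is only a sketch under assumptions: it expects the raw UDP payload of each DHCP packet (for example exported from a capture on a mirror port) plus a label saying which network the packet was seen on. The function names and the `SUBNETS` labels are made up for illustration; only the two subnets come from our setup.

```python
import ipaddress

# Subnets as described above; the keys are just labels chosen for this sketch.
SUBNETS = {
    "administration": ipaddress.ip_network("192.168.2.0/24"),
    "guest": ipaddress.ip_network("192.168.10.0/25"),
}

DHCP_MAGIC = b"\x63\x82\x53\x63"  # magic cookie that precedes the DHCP options
MSG_TYPES = {1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 5: "ACK", 6: "NAK"}


def parse_dhcp(payload: bytes) -> dict:
    """Pull client MAC, offered IP, message type and server ID out of a raw DHCP payload."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC:
        raise ValueError("not a DHCP payload")
    hlen = payload[2]  # hardware address length, 6 for Ethernet
    info = {
        "client_mac": payload[28:28 + hlen].hex(":"),      # chaddr field
        "offered_ip": ipaddress.ip_address(payload[16:20]),  # yiaddr field
        "type": None,
        "server_id": None,
    }
    i = 240  # options start right after the magic cookie
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option, single byte
            i += 1
            continue
        if i + 1 >= len(payload):  # truncated option, stop parsing
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:  # DHCP message type
            info["type"] = MSG_TYPES.get(value[0], str(value[0]))
        elif code == 54 and len(value) == 4:  # server identifier
            info["server_id"] = ipaddress.ip_address(value)
        i += 2 + length
    return info


def is_leak(seen_on: str, info: dict) -> bool:
    """True when an OFFER/ACK seen on one network hands out an address outside its pool."""
    if info["type"] not in ("OFFER", "ACK"):
        return False
    return info["offered_ip"] not in SUBNETS[seen_on]
```

Every packet for which `is_leak` returns true would give us the client MAC and the server that answered, which could then be matched against the port the capture was taken on.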

The problem is intermittent, but we can sometimes (rarely) force it to happen by disabling one of the DHCP servers and repeatedly renewing the IP.

I would suspect that when doing a MAC lookup in the switches, the MAC of the DHCP server should now show up on 2 ports: the normal trunk as expected, but also the port with the bug on it. However, I have not succeeded with this. If this would work, can we keep polling through syslog and eventually catch the culprit?
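
To make the MAC lookup idea concrete, this is the kind of check I have in mind, as a sketch only. It assumes the MAC tables can be collected from the switches somehow and turned into (switch, port, VLAN, MAC) rows; that row format, the function names and the VLAN IDs in the usage note are hypothetical. It also assumes the switches learn MACs per VLAN, so a bridge between VLANs would show the DHCP server's MAC in a VLAN where it does not belong.

```python
from typing import Iterable, NamedTuple


class MacEntry(NamedTuple):
    """One row of a switch MAC table (hypothetical export format)."""
    switch: str
    port: str
    vlan: int
    mac: str


def normalize_mac(mac: str) -> str:
    """Reduce aa-bb-.., aabb.ccdd.eeff, AA:BB:.. to lowercase aa:bb:cc:dd:ee:ff."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_suspects(
    entries: Iterable[MacEntry],
    server_mac: str,
    home_vlan: int,
    expected_ports: set,
) -> list:
    """Return rows where the DHCP server MAC is learned in the wrong VLAN
    or on a (switch, port) pair that is not a known path to the server."""
    target = normalize_mac(server_mac)
    suspects = []
    for entry in entries:
        if normalize_mac(entry.mac) != target:
            continue  # some other device, not the DHCP server
        wrong_vlan = entry.vlan != home_vlan
        wrong_port = (entry.switch, entry.port) not in expected_ports
        if wrong_vlan or wrong_port:
            suspects.append(entry)
    return suspects
```

For example, calling `find_suspects(rows, server_mac, home_vlan=2, expected_ports={("sw1", "25")})` on every poll and logging any non-empty result would point at the switch and port to look at, if the theory holds.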

The problem is that we can't just "tear the network to pieces", as the facility is in use most of the time.

We are looking for tools/tips/tricks to identify this problem.

Thanks in advance.