4.30 SIP ALG issues

grokit Posts: 18  Freshman Member
Friend Collector First Comment
edited April 2021 in Security
PBX in cloud <--> USG 310 <--> ca. 20 SIP phones in LAN

I wonder whether anyone else has observed this. I cannot rule out that I have misunderstood something, but it seems to me that there is an issue with SIP ALG.

I found out the following:

- FW 4.25, SIP ALG ON, SIP-Transformation OFF. This worked fine for a couple of years. No problems.
- Updated to FW 4.30, same configuration, checked.

Now the problems started:
- (always different) phones not reachable from outside
- the PBX in the cloud complained with "phone UNREACHABLE"

After some digging:
The PBX sends a SIP OPTIONS request to each phone every 60 seconds. As long as SIP ALG works fine (i.e. with FW 4.25), those requests are forwarded to the correct phones via the open SIP session. The phones then answer, and the PBX is satisfied and keeps the phones online.
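For readers unfamiliar with the keep-alive described above, here is a minimal sketch of what such an OPTIONS request looks like on the wire, following the general message layout of RFC 3261. The function name `build_options`, the addresses and the `pbx` user part are made up for illustration; this is not taken from the PBX in this thread, and a real PBX will add more headers.

```python
import uuid


def build_options(phone_uri: str, pbx_host: str, pbx_port: int = 5060, cseq: int = 1) -> str:
    """Build a minimal SIP OPTIONS request as text (illustrative only)."""
    # Branch values must start with the RFC 3261 magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    call_id = uuid.uuid4().hex
    tag = uuid.uuid4().hex[:8]
    lines = [
        f"OPTIONS {phone_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {pbx_host}:{pbx_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:pbx@{pbx_host}>;tag={tag}",  # "pbx" user part is hypothetical
        f"To: <{phone_uri}>",
        f"Call-ID: {call_id}@{pbx_host}",
        f"CSeq: {cseq} OPTIONS",
        "Content-Length: 0",
    ]
    # Headers are CRLF-separated; an empty line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    # Documentation-range addresses, not the real setup.
    print(build_options("sip:101@203.0.113.10:40001", "198.51.100.5"))
```

The request URI points at the firewall's public address and the port the phone registered from, which is why the firewall must still hold a matching session for the packet to reach the phone.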
 
Yet, after I upgraded to FW 4.30, most (but not all) of the SIP OPTIONS requests end up in the default rule of the firewall and are blocked. Very visible if one switches on logging for the default rule. Interestingly, the respective SIP sessions are still open, visible in the SIP Session Monitor. 

So I downgraded to 4.25 again, with the same configuration (except for some minor settings that depend on the FW version),
and everything ran smoothly again: no SIP OPTIONS requests are blocked by the firewall anymore.

Second upgrade to 4.30, again the same behavior.

I have now downgraded to 4.25 again, since I depend on telephony. With this configuration everything works well again.

It looks to me as if the SIP ALG functionality has ceased to work correctly on FW 4.30.

Can anyone confirm this? If not, could you perhaps point me to a possible error in my configuration?

Again, same configuration (same ALG settings, same interface, rules, policies, etc) works on 4.25, but not on 4.30

I also have several smaller USGs with at most 5 phones each in roughly the same setup. I have not seen issues there. Perhaps it is a question of the maximum number of sessions? Though for USG110 the limit is 100 sessions, and I have merely 20.

Does anyone have an idea?

Thanks
Dan


Comments

  • ChrisGer
    ChrisGer Posts: 205  Ally Member
    First Anniversary Friend Collector First Answer First Comment
    @grokit In my scenario, I have a local PBX in a transfer network behind a USG, BUT I disabled SIP ALG to avoid issues during SIP calls. Without SIP ALG I have working VoIP communication and can also send fax messages.

    Have you ever tested SIP without SIP ALG enabled on the USG?

    Regards
    Christian
  • grokit
    grokit Posts: 18  Freshman Member
    Friend Collector First Comment
    @ChristianG, my scenario differs a great deal from a local PBX. With a local PBX, the firewall only ever has to manage one SIP connection to the provider. In such cases, one can disregard SIP ALG completely and work purely with UDP session timers, or even NAT rules and port forwarding. The latter only works with one SIP endpoint in the local LAN (unless one starts to meddle with port numbers). I am not talking about SIP-Transformation here, which is a completely different matter. I only care about keeping the SIP "channel" open through the firewall with the help of UDP or SIP sessions.

    In my case, however, the firewall has to maintain more than 20 SIP connections at the same time. Simplified: 20 different IP addresses in the LAN and 20 different ports towards the single IP address of the PBX on the internet. SIP ALG (on FW 4.25) does a great job here.
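The "20 addresses, 20 ports" picture above can be sketched as a small port-mapping table. This is only a toy model to show why each phone needs its own live mapping for inbound traffic; the class `NatTable`, the starting port 40000 and the addresses are invented, and it says nothing about how the USG implements NAT or SIP ALG internally.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


class NatTable:
    """Toy model: one external port per LAN phone towards a single PBX."""

    def __init__(self, wan_ip: str, first_port: int = 40000) -> None:
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.out: Dict[Endpoint, int] = {}   # LAN endpoint -> external port
        self.inb: Dict[int, Endpoint] = {}   # external port -> LAN endpoint

    def outbound(self, lan: Endpoint) -> Endpoint:
        """Called when a phone sends to the PBX; creates a mapping if needed."""
        if lan not in self.out:
            port = self.next_port
            self.next_port += 1
            self.out[lan] = port
            self.inb[port] = lan
        return (self.wan_ip, self.out[lan])

    def inbound(self, wan_port: int) -> Optional[Endpoint]:
        """Called for traffic from the PBX; None means no mapping exists."""
        return self.inb.get(wan_port)


if __name__ == "__main__":
    nat = NatTable("203.0.113.10")
    phones = [(f"192.168.1.{100 + i}", 5060) for i in range(1, 21)]
    for phone in phones:
        print(phone, "->", nat.outbound(phone))
```

With a single local PBX there is one such mapping to keep alive; with a cloud PBX there is one per phone, and an inbound OPTIONS request for any phone whose mapping is missing has nowhere to go.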

    Unfortunately, on FW 4.30, it simply does not work the way it should. 

    Yes, I have tried with SIP ALG OFF and with adjusting the UDP timeout values. It works, up to a point, on 4.25. I have to admit that I did not test this in detail with 4.30, as my main concern was to get telephony working again as soon as possible. And again, SIP ALG works perfectly fine with FW 4.25, but does not with FW 4.30.

    I spent the last two Saturdays troubleshooting and trying to understand what the issue with SIP ALG on 4.30 is. The result is clear to me.

    With 4.25, SIP OPTIONS requests from the PBX on the internet towards the respective SIP phones in the local LAN are properly forwarded through the open SIP session. As it should be.

    With 4.30 (and the exact same configuration), SIP OPTIONS requests are blocked by the default firewall rule. The result is loss of inbound SIP connectivity until the SIP phone re-registers from the LAN with the PBX on the internet. It almost looks to me as if only a limited number of SIP sessions can be open. And there is indeed a limit of 100 SIP sessions on USG110. But I am far from that limit. Hmm.

    I almost suspect that the FW engineers preparing 4.30 forgot to set the limit to 100 in the USG 4.30 firmware. But this is pure speculation. Who am I to say something like that :-)

    Daniel

  • grokit
    grokit Posts: 18  Freshman Member
    Friend Collector First Comment
    Just learned that this "seems" to be a bug which has been fixed in 430AAPJ0ITS-WK10-r82493. 
    However, I was not able to get my hands on a Readme or Release Note of that WK version. 
    Does anyone know where I can find a description of the fixed issues?
    Daniel

  • Zyxel_Charlie
    Zyxel_Charlie Posts: 1,034  Zyxel Employee
    First Anniversary Friend Collector First Answer First Comment
    @grokit
    Regarding your case, I will send you the firmware by private message for further checking.
    Charlie
  • grokit
    grokit Posts: 18  Freshman Member
    Friend Collector First Comment
    Hello @Zyxel_Charlie,
    I already have access to that firmware, thanks.
    What I am missing is some kind of readme.txt or release notes.
    The country distributor over here told me that "a similar issue to what I described seems to be fixed". While this makes me almost happy, I do not like to install new firmware in a production environment without any hint of what has been fixed and what other issues there might be.

    So, if you could help me out with "readme.txt" or similar, that would be great!.

    Thanks
    Daniel

  • R_I
    R_I Posts: 2
    First Comment
    Hi Daniel
    I experienced the exact same behavior and changed back to 4.25.
    Any news if this problem was indeed solved with the new firmware? Thinking of switching from 4.25 to 4.32 now...

    Reto
  • grokit
    grokit Posts: 18  Freshman Member
    Friend Collector First Comment
    Hello Reto,

    I am stuck on 4.25 for the time being. 430AAPJ0ITS-WK10-r82493 did not fix it, and neither did 4.31.

    Testing is time-intensive. After booting with a newer firmware, I always check the changes to the configuration first (just doing a diff between the config files works for me). I carefully make sure that there is no change in the related settings (UDP, ALG, etc.).
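The config comparison described above could be scripted roughly as follows. This is a sketch using Python's standard `difflib`; the keyword list and the sample config lines are invented for illustration and do not reflect actual USG configuration syntax.

```python
import difflib
from typing import List, Sequence

# Assumed keywords for "related settings"; adjust to whatever matters in your config.
KEYWORDS = ("alg", "sip", "udp", "session")


def relevant_changes(old_cfg: str, new_cfg: str,
                     keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return added/removed config lines that mention any of the keywords."""
    hits = []
    for line in difflib.ndiff(old_cfg.splitlines(), new_cfg.splitlines()):
        # ndiff marks removed lines with "- " and added lines with "+ ".
        if line.startswith(("- ", "+ ")) and any(k in line.lower() for k in keywords):
            hits.append(line)
    return hits


if __name__ == "__main__":
    # Hypothetical config excerpts, not real USG syntax.
    before = "hostname fw1\nalg sip on\nudp-timeout 300\n"
    after = "hostname fw2\nalg sip on\nudp-timeout 60\n"
    for change in relevant_changes(before, after):
        print(change)
```

An empty result means none of the watched settings changed during the upgrade, which is the precondition for blaming the firmware rather than the configuration.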

    After that I need to check the phone connections. For speed, I reboot all the phones at once, which is rather simple by rebooting the PoE switches. The problem is that the SIP issue does not show immediately. It looks as if only a limited number of concurrent UDP/SIP sessions is allowed. Each phone re-registering or connecting to the remote PBX steals a session from another phone. The result is that phones go offline and online in a pattern that is difficult to predict.

    Productive work is not possible.
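To make the "session stealing" hypothesis above concrete, here is a small simulation of a session table with a fixed size that drops its oldest entry when a new phone registers. The limit of 3 and the oldest-first eviction policy are pure assumptions chosen for illustration; nothing in this thread confirms how the firmware actually behaves.

```python
from collections import OrderedDict
from typing import Optional


class SessionTable:
    """Toy session table: at most `limit` phones, oldest entry evicted first."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.sessions: "OrderedDict[str, bool]" = OrderedDict()

    def register(self, phone: str) -> Optional[str]:
        """Record a (re-)registration; return the phone that lost its session, if any."""
        evicted = None
        if phone in self.sessions:
            # Refreshing an existing session makes it the newest entry.
            self.sessions.move_to_end(phone)
        else:
            if len(self.sessions) >= self.limit:
                evicted, _ = self.sessions.popitem(last=False)
            self.sessions[phone] = True
        return evicted

    def inbound_allowed(self, phone: str) -> bool:
        """Would an inbound OPTIONS request for this phone find a session?"""
        return phone in self.sessions


if __name__ == "__main__":
    table = SessionTable(limit=3)  # assumed limit, far below the documented 100
    for phone in ["p1", "p2", "p3", "p4", "p1"]:
        lost = table.register(phone)
        print(f"{phone} registers, evicted: {lost}")
```

In this model every re-registration by an evicted phone pushes out another one, which would produce the kind of rotating, hard-to-predict offline/online pattern described above.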

    After the tests, I went back to the original 4.25 configuration, and here everything works absolutely perfectly!


    Actually, I urgently need the possibility to set up a backup connection over a Zyxel LTE3301, but that only works with newer USG firmware.

    Frankly, I am a bit disappointed in Zyxel. This bug is so obvious, but nobody seems to care.
    (Or perhaps they do not understand the issue because, like many other so-called VoIP specialists, they are stuck on a scenario where the PBX is located on the LAN and only connects to one provider across the firewall.)

    Daniel

  • ChrisGer
    ChrisGer Posts: 205  Ally Member
    First Anniversary Friend Collector First Answer First Comment
    Hello all,

    "(or perhaps they do not understand the issue because they are like many other so-called VOIP specialists stuck on a scenario where the PBX is located on the LAN and only connects to one provider across the firewall)".

    In this case, have you ever sent Zyxel, or posted here, a diagram of the planned and required ports/sources/destinations to give a clear view?

    In my scenario, I have a SOHO PBX with two active SIP providers (provider routing by the PBX) in a DMZ behind the ZyWALL, and at the start I also had a lot of issues configuring the device according to "best practice" to get it connected successfully and stably. :) ;)

    Regards
    Christian


  • grokit
    grokit Posts: 18  Freshman Member
    Friend Collector First Comment
    edited August 2018
    hello christian,

    I was not referring to you when I mentioned "so-called VOIP specialists". I apologize if that is what you thought. fact is that most people with a VOIP background only have what I call the traditional use case or model in mind:
    • PBX on local LAN
    • IP phones on same local LAN
    • PBX only connecting through firewall to provider(s).
    however, we have another use case here (I call it cloud model):
    • PBX not on local LAN, but in the cloud (internet, = cloud PBX)
    • many IP phones on local LAN
    • many IP phones connecting to cloud PBX through firewall
    do you see the difference?

    so, imagine you have your PBX not in the DMZ, but hosted somewhere on the internet. now you have to connect the 20+ IP phones in your local LAN to that remote PBX. all those 20 SIP sessions (and more) will be routed across the firewall at the same time.
    up to FW 4.25, this was not a problem at all. with newer FW, we have issues.

    I had plenty of private chats with zyxel people. even the swiss support (studerus), which is VERY good, still does not seem to understand the issue.

    the issue seems to be:
    if there are *many* parallel SIP sessions through the USG firewall, those SIP sessions suddenly get blocked by FW rules when (my assumption) too many sessions are open at the same time.
    (yes, I have tuned the session limits, to no avail...)

    I repeat, up to firmware 4.25, everything works perfectly!
    none of the firmware versions after 4.25 that I tried (mentioned above) work!
    that is why I run 4.25 on all my firewalls at the moment. and it's a shame.


    christian, you describe your ("traditional") setup, which also works perfectly with newer firmware. I also have that scenario in place on some networks (asterisk based pbx on LAN, IP phones on LAN, SIP connections from PBX to provider(s) through firewall) and I never had issues.

    please do not mix up this traditional model with the problem I have with the cloud model.
    I appreciate your hints, but I am much, much further along than that.

    daniel

  • Zyxel_Stanley
    Zyxel_Stanley Posts: 1,361  Zyxel Employee
    First Anniversary 10 Comments Friend Collector First Answer
    Hi @grokit

    I will send you a private message to check this issue in more detail.
