firmware upgrade (4.60 to 4.62) breaks device-ha

noc_aba
noc_aba Posts: 20  Freshman Member
edited April 2021 in Security
Hello
On an ATP800 pair with device-ha enabled, the firmware upgrade process doesn't bring device-ha back. We ran the upgrade from the running partition and waited half an hour before connecting again.
The passive unit is updated and becomes active, but the device-ha status shows only one ATP800. We can't ping the expected passive.
The SYS LED is flashing and the LEDs of all ports are on, although only the heartbeat port's LED should be. Even after a reboot the passive is not seen by the active unit; again all the LEDs are on and the SYS LED keeps flashing. Connecting through the console, we found that the running config was only a fraction of the full configuration.
We had to unplug the cables, reset the unit, configure device-ha, and plug the cables in again. After that, device-ha was established again.
We experienced this on two pairs of ATP800s in device-ha mode.
Any hint or fix for this?
Many thanks
Paolo

All Replies

  • TrondBKSuleimanCo
    TrondBKSuleimanCo Posts: 19  Freshman Member
    Unless you have to use the new firmware version because of some other critical update, the normal response when a new firmware breaks something is to downgrade to the previous stable version, until another new firmware upgrade is RTM.
  • noc_aba
    noc_aba Posts: 20  Freshman Member
    Well, when I log in and the ATP800 says "there's a new firmware version for download", in this case to fix a couple of vulnerabilities, I assume that upgrading is recommended and safe.
    The problem is: why did the configuration of the passive (which was good and running before the upgrade, when the ATP800 was active) get damaged in the process?

  • Zyxel_Tobias
    Zyxel_Tobias Posts: 200  Zyxel Employee
    Hi noc_aba,

    That does not sound like it is working as it should. We recently found an issue in some HA sync steps; the fix is planned to roll out in the next release version.

    If you want to test this version (Upgrade 4.62 to Support Version), just let me know. 

    I'll help you convert this thread into a support ticket with one of our engineers, and we can verify your config in our lab first if you want us to.

    Let me know your decision, and sorry that this issue happened at your site.

    Kind Regards,

    Tobias
  • noc_aba
    noc_aba Posts: 20  Freshman Member
    We have upgraded another pair of ATP800s from 4.60 to 4.62 and the same problem occurred: HA is gone, and we have to reset the slave and configure HA again. Given this type of problem, and given that the ATP800s are deployed in a production environment, we can't risk disrupting service by trying a debug firmware version that could have the same upgrade problem.
  • kruess
    kruess Posts: 9
    Hi, we've observed trouble while updating HA pairs, too (USGs in our case). We've been able to reduce the risk by rebooting the active device, waiting for the pair to be in sync, then rebooting the other, now active device, waiting again for sync, and only then initiating the upgrade.
    Doing so, the upgrade seems to run much more reliably.
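
The reboot-and-wait sequence described in this reply can be sketched as a small Python routine. This is only an illustration of the ordering, not a Zyxel tool: `reboot_active`, `pair_in_sync` and `start_upgrade` are hypothetical callbacks that you would have to implement yourself (for example by hand, or against whatever management access you use), and the 30-minute timeout is an assumption.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout_s: float,
               poll_s: float = 10.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll condition() until it returns True or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


def pre_upgrade_cycle(reboot_active: Callable[[], None],
                      pair_in_sync: Callable[[], bool],
                      start_upgrade: Callable[[], None],
                      timeout_s: float = 1800.0,
                      poll_s: float = 10.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Reboot the active unit twice, waiting for sync each time, then upgrade.

    The roles swap after each reboot, so two rounds reboot both devices.
    All three callbacks are hypothetical placeholders supplied by the caller.
    Returns False, without starting the upgrade, if the pair fails to sync.
    """
    for _ in range(2):
        reboot_active()
        if not wait_until(pair_in_sync, timeout_s, poll_s, sleep):
            return False  # never upgrade a pair that is out of sync
    start_upgrade()
    return True
```

The point of the design is that the upgrade step is unreachable unless both sync checks have passed, which mirrors the "wait again for sync, and only then upgrade" advice.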
  • soul
    soul Posts: 29  Freshman Member
    Try upgrading all partitions on both machines, then reboot them.
    Device-ha should require the same firmware on both machines.
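
As a rough aid for the suggestion above, the following Python sketch checks whether every partition on both units reports the same firmware version. It assumes you have already collected the version strings yourself; the unit names and the partition labels `running` and `standby` are made up for illustration and are not taken from the device.

```python
from typing import Dict, Set


def distinct_firmware(versions: Dict[str, Dict[str, str]]) -> Set[str]:
    """Return the set of distinct firmware versions across all units and partitions."""
    return {v.strip() for parts in versions.values() for v in parts.values()}


def same_firmware_everywhere(versions: Dict[str, Dict[str, str]]) -> bool:
    """True only if every partition on every unit reports one and the same version."""
    return len(distinct_firmware(versions)) == 1


# Hypothetical inventory, typed in by hand after checking each unit.
pair = {
    "unit-a": {"running": "4.62", "standby": "4.60"},
    "unit-b": {"running": "4.62", "standby": "4.62"},
}

if not same_firmware_everywhere(pair):
    print("Firmware mismatch:", sorted(distinct_firmware(pair)))
```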
