Gkclient problem ( regression ) in h323plus 1.25
Hi everybody.
I think h323plus gkclient.cxx does not correctly process reregistration after a gatekeeper reboot in 1.25.
My scenario is the following. We have several h323 endpoint programs ( yate instances and homebrew ones ) registered to a gatekeeper. As part of a scheduled maintenance we rebooted the gatekeeper. Every endpoint compiled against h323plus 1.24 reregistered correctly and kept working, but the single one compiled with h323plus 1.25 ( part of the maintenance is upgrading to 1.25 ) did not.
The problem was that the endpoint sent a keepalive RRQ with the old gatekeeper data. The new gk instance replied with an RRJ with a full registration required cause. The endpoint sent a full registration, and the gatekeeper replied with an RCF carrying a NEW ENDPOINT IDENTIFIER.
All the 1.24 programs then proceeded to send the new Ep Id, but the 1.25 one kept sending the old one. I suspect it also sends the old one in ARQs, so it was unable to call ( will verify it later ).
I've been debugging the source code and didn't notice any difference in the clients, so I looked at the diffs between 1.24 and 1.25 in gkclient.cxx:
http://h323plus.cvs.sourceforge.net/viewvc/h323plus/h323plus/src/gkclient.cx...
I noticed a difference in endpoint identifier handling, which I narrowed to this commit:
http://h323plus.cvs.sourceforge.net/viewvc/h323plus/h323plus/src/gkclient.cx...
which does not store a new endpoint identifier received in reply to an RRQ if one is already set, but performs no further checks, and seems to be what is causing my problem. I assume there is a reason for not storing the new endpoint identifier, but I think it should at least be checked against the stored one.
Also, I think that when a registration is rejected with fullRegistrationRequired the stored fields should be reset. Currently I've seen it's processed like this:
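To make the check I have in mind concrete, here is a minimal standalone sketch. It is not h323plus code: `IdUpdate` and `ReconcileEndpointId` are hypothetical names, and the policy on a mismatch ( the gatekeeper's value wins ) is only an assumption, since I do not know why the commit avoids overwriting.

```cpp
#include <cassert>
#include <string>

// Result of comparing the stored identifier with the one in a confirm.
enum class IdUpdate { Stored, Unchanged, Replaced };

// Hypothetical helper, not part of h323plus: reconcile the stored endpoint
// identifier with the one carried by a registration confirm.
IdUpdate ReconcileEndpointId(std::string& stored, const std::string& fromRcf) {
    assert(!fromRcf.empty());  // sketch assumes the confirm carries an identifier

    if (stored.empty()) {      // nothing stored yet: take the new value
        stored = fromRcf;
        return IdUpdate::Stored;
    }
    if (stored == fromRcf)     // same identifier: nothing to do
        return IdUpdate::Unchanged;

    // Mismatch. Assumed policy: trust the gatekeeper and report the change
    // so the caller can log it.
    stored = fromRcf;
    return IdUpdate::Replaced;
}
```

Returning a status instead of silently overwriting leaves the caller free to log the inconsistency or to apply a different policy.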
    case H225_RegistrationRejectReason::e_fullRegistrationRequired :
      registrationFailReason = GatekeeperLostRegistration;
      // Set timer to retry registration
      reregisterNow = TRUE;
      monitorTickle.Signal();
      break;
but I've been unable to find any action beyond signalling an immediate retry. I think that in this case, if not for registration rejects in general, the endpoint identifier should be cleared, as it shouldn't be valid after a failed registration.
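A minimal standalone sketch of the clearing I am proposing follows. It only models the state; `RegistrationState`, `RejectReason`, `OnRegistrationReject` and `OnRegistrationConfirm` are hypothetical names and not the real h323plus classes or members. The confirm handler assumes the "store only if none is set" behaviour I described above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the gatekeeper client state; illustrative only.
enum class RejectReason { FullRegistrationRequired, Other };

struct RegistrationState {
    std::string endpointIdentifier;  // empty means "none assigned"
    bool reregisterNow = false;
};

// On a "full registration required" reject, forget the identifier issued by
// the previous gatekeeper instance before scheduling the retry.
void OnRegistrationReject(RegistrationState& state, RejectReason reason) {
    if (reason == RejectReason::FullRegistrationRequired) {
        state.endpointIdentifier.clear();
        state.reregisterNow = true;
    }
}

// Assumed 1.25-like behaviour: keep the stored identifier if one is set.
void OnRegistrationConfirm(RegistrationState& state, const std::string& idFromRcf) {
    assert(!idFromRcf.empty());  // sketch assumes the confirm carries an identifier
    if (state.endpointIdentifier.empty())
        state.endpointIdentifier = idFromRcf;
}
```

With the identifier cleared on the reject, the unchanged confirm handler stores the new value, which is why I expect this change alone to fix my case.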
I've also studied the differences between tag v1.25 and HEAD
http://h323plus.cvs.sourceforge.net/viewvc/h323plus/h323plus/src/gkclient.cx...
and haven't found any differences in the relevant areas.
Summarising, I think the endpoint identifier, and maybe some more registration data, should be cleared on registration rejects. Also, the stored endpoint identifier should be checked for inconsistency against the registration confirm data, although this can be problematic ( what to do if they do not match ? ) and can be left out, as it's not going to interfere with normal operations. I can try to make a patch; in my case clearing endpointIdentifier on registration reject will probably work, but I fear it may break other parts.
Francisco Olarte.
P.S. I cannot provide detailed packet traces at the moment, as they need some rather involved setup, but I can try to do it if someone thinks it's necessary. F.O.