[robustness] Comments on TD-42/Osaka
Robustness folks,
A couple of comments on TD-42/Osaka -
1) While adding the ASN.1 from TD-42 to the draft H.225.0 v4, I noticed the following paragraph about using StatusInquiry as a keepalive mechanism:
"The element closer to the called party shall send StatusInquiry periodically (this is the direction of least traffic during established calls). The period should be configurable. Two seconds is the recommended default, in order to allow detection of failure before other messages timeout. The selected value must be added to StatusInquiry so that the recipient can also monitor failure without an additional StatusInquiry/Status exchange in the opposite direction. The recipient system needs only to maintain a timer using the indicated value as a timeout."
This clause indicates that the period at which StatusInquiry is sent is also the amount of time that the receiving system should use as a timeToLive timer. This arrangement would result in a race condition since the next StatusInquiry would be expected at the exact moment that the timeToLive timer was due to expire.
IMHO, the StatusInquiry period and the timeToLive should be separately configurable, both having default values defined in Annex R, with the obvious restriction that timeToLive shall be greater than the StatusInquiry period.
2) Has consideration been given to adding the ability for the endpoint to indicate to the GK via the ARQ that a particular call requires robustness, or to allowing the GK to indicate to the EP that the EP must use the robustness procedure for a particular call? Several other features added in v4 use this model and it seems to be a useful framework.
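The race described in point 1 can be sketched as follows. This is only an illustration under stated assumptions: `KeepaliveConfig`, `receiver_declares_failure`, and the millisecond units are hypothetical names and choices, not taken from TD-42 or H.225.0. Only the 2-second period comes from the quoted paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepaliveConfig:
    """Hypothetical holder for the two timers discussed in point 1."""
    period_ms: int        # interval between StatusInquiry messages
    time_to_live_ms: int  # how long the receiver waits before declaring failure

    def __post_init__(self):
        if self.period_ms <= 0:
            raise ValueError("StatusInquiry period must be positive")
        # The restriction proposed in point 1: timeToLive > period.
        if self.time_to_live_ms <= self.period_ms:
            raise ValueError("timeToLive must be greater than the StatusInquiry period")


def receiver_declares_failure(period_ms, time_to_live_ms, extra_delay_ms):
    """Return True if the receiver's timer can expire before the next StatusInquiry.

    The receiver restarts its timer when a StatusInquiry arrives. The next one
    is sent period_ms later and takes extra_delay_ms longer in transit.
    A tie is counted as a failure, which is the race in question.
    """
    next_arrival_ms = period_ms + extra_delay_ms
    return next_arrival_ms >= time_to_live_ms


# Same value for both timers: a failure is declared even with no extra delay.
print(receiver_declares_failure(2000, 2000, 0))
# A timeToLive larger than the period tolerates some delay variation.
print(receiver_declares_failure(2000, 3000, 500))
```

The 3000 ms timeToLive in the second call is an assumed value chosen only to show the effect of the restriction.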
- Rich
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For help on this mail list, send "HELP ITU-SG16" in a message to listserv@mailbag.intel.com
See comments below.
What we had in mind is essentially a duplication of the RAS lightweight registration method. There, timeToLive is indicated by either the GW in the RRQ or the GK in the RCF (the RCF value is definitive) and is the only parameter. It is up to the GW to re-register early enough, or the GK to let the registration live long enough, to avoid the race. I would guess that in practice most systems do both. I see no reason the same cannot work here. I would assume that the recipient would add some suitable padding to the time before concluding failure. I see no reason to carry yet one more field in the message.
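A minimal sketch of the two ways of avoiding the race mentioned above: the sender refreshing early and the receiver waiting a little longer. The function names, the 25% early-refresh fraction, and the 1-second grace are assumptions for illustration; neither value comes from H.225.0 or TD-42.

```python
def refresh_interval_s(time_to_live_s, early_fraction=0.25):
    """Sender side: refresh before the advertised timeToLive runs out.

    early_fraction is a hypothetical tuning knob (share of the timeToLive
    kept in reserve), not a value defined by any standard.
    """
    if not 0.0 <= early_fraction < 1.0:
        raise ValueError("early_fraction must be in [0, 1)")
    return time_to_live_s * (1.0 - early_fraction)


def expiry_deadline_s(time_to_live_s, grace_s=1.0):
    """Receiver side: wait somewhat longer than the advertised timeToLive.

    grace_s is an assumed padding value chosen by the receiver.
    """
    if grace_s < 0:
        raise ValueError("grace_s must not be negative")
    return time_to_live_s + grace_s


ttl = 2.0  # seconds, the single value carried in the message
print(refresh_interval_s(ttl), expiry_deadline_s(ttl))
```

With both measures in place the refresh always arrives well inside the receiver's deadline, and only one value needs to be carried in the message.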
I think the group has assumed that robustness is a property of a network element: either it has it for all calls or for none. But we could certainly consider it on a per-call basis if this seems useful. I suppose that, since there is some processing cost associated with the mechanism, one might choose NOT to use it for some calls even when it is available.
Do others think a per-call choice is useful?
--
Terry L Anderson mailto:tla@lucent.com Tel:908.582.7013 Fax:908.582.6729 Pager:800.759.8352 pin 1704572 1704572@skytel.com Lucent Technologies/ Voice Over IP Access Networks/ Applications Grp Rm 2B-121, 600 Mountain Av, Murray Hill, NJ 07974 http://its.lucent.com/~tla (Lucent internal) http://www.gti.net/tla
Actually, the way timeToLive has been handled in the lightweight (LW) RRQ has created some confusion in deployment, because different vendors use different algorithms to determine when to send the next RRQ. I think a standard approach makes network administration simpler.
A more typical approach in other protocols I'm familiar with is to define the amount of padding, and optionally allow it to be configurable (e.g., via a MIB object) to allow network tuning. Note that this doesn't add another field to the message, because there's no need to communicate the padding value between endpoints.
For example, one approach might be for the called endpoint to set the timeToLive sent in the StatusInquiry to the sending period plus a padding value. The padding value could be defined in the standard to have some range, with a default value of, say, 1000 ms. No padding would be necessary at the calling endpoint.
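The approach in the paragraph above could look roughly like this on the called endpoint. It is a sketch only: the function name and the range limits are hypothetical, and the only value taken from this message is the 1000 ms default.

```python
DEFAULT_PADDING_MS = 1000  # default suggested in the message above
MIN_PADDING_MS = 100       # hypothetical lower bound of the allowed range
MAX_PADDING_MS = 5000      # hypothetical upper bound of the allowed range


def advertised_time_to_live_ms(period_ms, padding_ms=DEFAULT_PADDING_MS):
    """Value the called endpoint would place in StatusInquiry.

    The padding is applied by the sender, so the calling endpoint can use
    the received timeToLive directly as its timeout.
    """
    if period_ms <= 0:
        raise ValueError("StatusInquiry period must be positive")
    if not MIN_PADDING_MS <= padding_ms <= MAX_PADDING_MS:
        raise ValueError("padding outside the allowed range")
    return period_ms + padding_ms


# With the 2-second period recommended in TD-42 and the default padding:
print(advertised_time_to_live_ms(2000))
```

Because the padding stays local to the sender, the message still carries a single timeToLive field.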
- Rich
Well, I have no objection to "padding" or to having it configurable; however, I see no reason for the standard to define it as "configurable" if it does not affect any message fields. Current H.323 standards don't specify such implementation details. It WOULD be appropriate (IMHO) to state max and min values, but specifying whether they are fixed or configurable in an implementation seems inappropriate. Yes, yes, I realize that I wrote (I wrote that paragraph) that the timeToLive should be "configurable", and I believe that this was also inappropriate. It should have stated that "the period is not defined by the standard". This is really similar to RAS message timeouts: the standard lists some suggested values but does NOT define how they are implemented (hard-coded or configurable).
I believe there are reasons an implementation may want to choose different values for timeToLive depending on other factors and so should not be fixed by the standard. But while there should be "padding" I am not sure what other factors would lead me to vary the padding. What about stating that the padding should be 1/2 the timeToLive value or some other fixed fraction?
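As a worked example of the fixed-fraction idea, the sketch below assumes that the recipient applies the padding on top of the received timeToLive, which is one possible reading of the suggestion. The function name and the use of milliseconds are hypothetical.

```python
from fractions import Fraction


def failure_timeout_ms(time_to_live_ms, padding_fraction=Fraction(1, 2)):
    """Timeout the recipient would use before concluding failure.

    Assumes padding = padding_fraction * timeToLive, added by the recipient.
    """
    if time_to_live_ms <= 0:
        raise ValueError("timeToLive must be positive")
    if padding_fraction < 0:
        raise ValueError("padding fraction must not be negative")
    padding_ms = int(time_to_live_ms * padding_fraction)  # truncates toward zero
    return time_to_live_ms + padding_ms


# A 2000 ms timeToLive with padding of 1/2 gives a 3000 ms timeout.
print(failure_timeout_ms(2000))
```

A fixed fraction keeps the padding proportional to the chosen period, so it needs no separate configuration.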
--
Terry L Anderson mailto:tla@lucent.com Tel:908.582.7013 Fax:908.582.6729 Pager:800.759.8352 pin 1704572 1704572@skytel.com Lucent Technologies/ Voice Over IP Access Networks/ Applications Grp Rm 2B-121, 600 Mountain Av, Murray Hill, NJ 07974 http://its.lucent.com/~tla (Lucent internal) http://www.gti.net/tla
Terry,
I agree it would be inappropriate for H.323 to say that a timer shall be configurable. But I also think it would be better for H.323 to specify mandatory "default" values rather than "recommended" values. The difference would be that, if your implementation does not make the timer configurable, then it behaves the same as any other implementation that does not make the timer configurable or which has not had its default configuration overridden. I think this consistency is important to network operators in multivendor networks.
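The difference between a mandatory default and a recommended value can be illustrated with a small sketch. The dictionary keys and the function name are hypothetical; the 2000 ms period is the TD-42 recommended default and the 1000 ms padding is the value floated earlier in this thread.

```python
# Defaults a standard might mandate (values taken from this thread,
# key names invented for illustration).
MANDATED_DEFAULTS_MS = {"status_inquiry_period": 2000, "padding": 1000}


def effective_timers_ms(overrides=None):
    """Return the timers an implementation would actually use.

    An implementation with no configuration support passes nothing and so
    behaves exactly like any other implementation left at its defaults.
    """
    timers = dict(MANDATED_DEFAULTS_MS)  # copy, so the defaults stay untouched
    for name, value in (overrides or {}).items():
        if name not in timers:
            raise KeyError(f"unknown timer: {name}")
        if value <= 0:
            raise ValueError(f"{name} must be positive")
        timers[name] = value
    return timers


print(effective_timers_ms())                 # non-configurable implementation
print(effective_timers_ms({"padding": 500})) # operator override
```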
Terry L Anderson wrote: "I believe there are reasons an implementation may want to choose different values for timeToLive depending on other factors and so should not be fixed by the standard. But while there should be 'padding' I am not sure what other factors would lead me to vary the padding."
I think one factor could be the network delay. You want to detect the loss of the channel as quickly as possible, but the padding has to be large enough to allow for the packet loss characteristics and propagation delay variations of your particular network.
Terry L Anderson wrote: "What about stating that the padding should be 1/2 the timeToLive value or some other fixed fraction?"
I would tend to see the two timers as being more independent: you would set the StatusInquiry period according to how quickly you need to detect the channel loss, and then set the padding as low as possible given the network characteristics. But the rule that you propose above also seems reasonable to me.
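One way to picture setting the padding "as low as possible given the network characteristics" is to derive it from observed StatusInquiry inter-arrival times. The sketch below is an assumption for illustration only: the function name, the mean-plus-k-deviations rule, and the factor k = 4 are not taken from any H.323 document.

```python
import statistics


def suggest_padding_ms(inter_arrival_ms, period_ms, k=4.0, floor_ms=0.0):
    """Estimate padding from how late StatusInquiry messages tend to arrive.

    inter_arrival_ms: observed gaps between consecutive StatusInquiry arrivals.
    period_ms: the configured sending period.
    k, floor_ms: hypothetical tuning parameters.
    """
    if len(inter_arrival_ms) < 2:
        raise ValueError("need at least two samples")
    # Lateness of each arrival relative to the nominal period (never negative).
    lateness = [max(0.0, gap - period_ms) for gap in inter_arrival_ms]
    estimate = statistics.mean(lateness) + k * statistics.pstdev(lateness)
    return max(floor_ms, estimate)


# A network with visible delay variation needs more padding than a steady one.
steady = suggest_padding_ms([2000, 2000, 2000], period_ms=2000)
jittery = suggest_padding_ms([2000, 2100, 2300, 2050], period_ms=2000)
print(steady, jittery)
```

The period is chosen independently, from how quickly the loss must be detected; only the padding follows the measured network behaviour.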
- Rich
--
Richard K. Bowen
Cisco Systems, Inc.
VoIP Session Protocols
Research Triangle Park, NC, USA
Participants (2): Rich Bowen, Terry L Anderson