John
I don't think you have achieved the goals you stated by proposing that there is only one context and that the MG sorts out what to do from the properties of the terminations.
For one thing, the picture you are advocating is:
+------------------------------------+
| Context C1                         |
|                                    |
| DS0/0 -----\        /---- RTP/101  |
|             \      /               |
| DS0/1 -------- o ------- RTP/102   |
|              / | \                 |
| H.221/0 ----/  |  \---- RTP/103    |
|                |                   |
|             H.223/0                |
+------------------------------------+
Not a pretty picture, even if all the terminations have enough properties to allow the gateway to figure out what is intended. You are really into the "chocolate chip cookie model" where you just throw all kinds of different termination types into one context and let them sort themselves out.
You also don't like having the MUX visible, but you do want the difference between H.221 and H.223 visible. I don't get it; the MGC doesn't care about the differences, and the MG doesn't need to keep them separate with respect to the MGC. You might as well just have more properties in the Context to deal with the H.221 and H.223 stuff, and not have H.221 and H.223 ephemerals at all. At least by making the mux visible, we deal with the issue that the DS0s are on one "side" of the process with the RTP on the other. I could just as well have drawn the picture above like this:

+------------------------------------+
| Context C1                         |
|                                    |
| DS0/0 -----\        /---- RTP/101  |
|             \      /               |
| RTP/102 ------ o ------- H.223/0   |
|              / | \                 |
| H.221/0 ----/  |  \---- DS0/1      |
|                |                   |
|             RTP/103                |
+------------------------------------+

Since there is no natural way to sort out the relationships between the external terminations, they are all in one big hairball. An even worse picture.
Making lots of properties of terminations is always possible, but it's not going to make the number of operations or the size or number of messages much different. All of the proposals end up with similarly sized messages to get the job done, primarily because the operations are based on the external terminations. The differences are in the number of Actions, the number of commands per Action, and the number of properties per command, but the product isn't much different in any of the cases. You have smaller numbers of terminations and contexts, but larger numbers of properties; message sizes should be about the same. I also don't see any great difference in what the MG has to do in order to figure out how to accomplish what it is asked.
One large problem is how you specify which RTP flow is which. Since all you have to help you is properties, we would have to add properties to the RTP termination that specify which media is being asked for on that RTP stream. When you add a new context type, you may have to extend the RTP termination class; very undesirable. This of course extends to all the "packet/cell" termination types -- each of them would have to be extended with the new media types.
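To make the objection concrete, here is a rough sketch (Python, with purely invented names and values) of what property-based media selection implies: every packet/cell termination class ends up carrying a media property whose set of legal values has to grow each time a new kind of context is introduced.

  # Illustrative sketch only -- names and property values are invented here.

  RTP_MEDIA_VALUES = {"audio", "video", "data"}   # grows with each new context type

  def make_rtp_termination(termination_id, media):
      # With only properties to go on, the RTP termination itself must say
      # which media it is carrying.
      if media not in RTP_MEDIA_VALUES:
          raise ValueError("RTP termination class must be extended for: " + media)
      return {"id": termination_id, "class": "RTP", "media": media}

  # The same extension is then needed for every other packet/cell termination
  # class, which is the undesirable coupling described above.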
Finally, you are making a complex case "simple"; H.320 gateways are not the focus of the effort, but are in scope. Support for such uses should be possible, not necessarily optimal. By making contexts multimedia, you complicate the simple cases of single media gateways; not a good thing.
So far, we have not needed any parameters to Action other than ContextID. If we really need them, we should add them, but I don't want to go there unless we have a good reason. Making H.320 gateways simpler is not a good reason, IMO.
I guess the bottom line is that the current model of one media per context, with no context parameters, works for all the primary cases (Access Gateways, Tandems, residential gateways, NASs, etc.), and it extends, perhaps less elegantly than you would like, to corner cases like decomposed H.320 gateways. We have to have a really good reason to add more stuff to the model.
While I like my model "best", there is a case to be made for ContextType - I think it's more like a "combination rule". The audio rule is "mixing bridge". That clearly does not work for any other kind of media, and it doesn't cover all the cases of audio (although it covers 95+% of them!). If there is any change to be made, I think it would be a single, optional parameter to Action which is the combination rule, with mixing bridge as the default. That still leaves us with how to handle terminations of different "types" going into the rule. Properties on terminations are one way, implied naming conventions are another, and extra termination classes are a third.
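As a rough sketch of the combination-rule idea (the parameter and rule names here are invented, not proposed syntax), the rule would be the one optional parameter of the Action, defaulting to the audio mixing bridge:

  # Illustrative sketch only -- parameter and rule names are invented here.

  DEFAULT_RULE = "mixing-bridge"   # covers the 95+% audio cases

  def action(context_id, commands, combination_rule=None):
      # An Action with a single optional parameter beyond the ContextID.
      return {
          "context": context_id,
          "rule": combination_rule or DEFAULT_RULE,   # omitted => mixing bridge
          "commands": commands,
      }

  # action("C1", cmds)                     -> plain audio mixing bridge
  # action("C1", cmds, "video-switching")  -> some other combination rule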
Similarly, there must be a way to relate the multiple streams in a conference. A single context is one way; a ConferenceID, either as a parameter to an Action or as a property of a termination, is another; and implied naming conventions are a third.
We can back our way into two parameters on the Action, or we can use one mechanism for stream ID and another for CallID, or we can use one mechanism for both. Implied semantics in TerminationIDs serves both purposes, so I favor that.
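A sketch of what implied semantics in TerminationIDs could look like (the "call/stream/instance" naming convention here is invented purely for illustration): the ID itself carries both the call grouping and the stream type, so neither extra Action parameters nor extra properties are needed.

  # Illustrative sketch only -- the "call/stream/instance" convention is invented.

  def parse_termination_id(tid):
      # Split e.g. "call42/video/3" into its implied call and stream semantics.
      call_id, stream_type, instance = tid.split("/")
      return {"call": call_id, "stream": stream_type, "instance": instance}

  # One mechanism serves both purposes:
  #   parse_termination_id("call42/audio/1")["call"]    groups the streams of a call
  #   parse_termination_id("call42/audio/1")["stream"]  tells the combination rule
  #                                                     what kind of stream it has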
Brian
-----Original Message-----
From: John Segers [mailto:jsegers@LUCENT.COM]
Sent: Wednesday, April 28, 1999 1:44 PM
To: ITU-SG16@mailbag.cps.intel.com
Subject: Multimedia contexts for H.320 and H.324 support in megaco/H.gcp
Over the past couple of days there has been some interesting discussion on the mailing list with respect to multimedia support in H.gcp/MEGACO, starting with Tom Taylor's note. Unfortunately I have been off-line and have not been able to respond until now.
Tom indicated that he would not pursue an approach in which a multimedia session is made up of one single context. The discussion that followed did not explore this option either. I would like to take the opportunity to look into this, and compare this approach to the proposals that have been posted earlier.
Let me first state the main reason why I favor an approach in which there is just one context for a multimedia, multi-party session: simplicity of the model. In my opinion, having multiple contexts for mux/demux, audio, video and data adds too much detail to the connection model. It starts to resemble the first MDCP draft, where there were separate entities in the model for echo canceling, bridging, transcoding, etc. I think we can all agree that there was too much detail there. It is much better to set properties of terminations (or edge points, as we used to call them in MDCP).
So the connection model should be as abstract and flexible as possible, keeping in mind that the 95% use case is point-to-point voice. To me this means that a model for a multimedia call does not need separate entities for the mux/demux and for the audio, video and data flows that it sources/sinks. It also means that there is no need to have the conference bridge split up over different contexts. It seems better to have one context describing the bridging functionality, and to leave it to MG implementers to decide which hardware/software units to use in the box.
Now let me get to the ingredients of the model that I propose for multimedia support. There are only ephemeral terminations. Every termination has a list of bearer channels it uses, and it describes the (de)multiplexing rule used in case media are multiplexed on the bearer(s). Examples are H.221 terminations, H.223 terminations, and terminations that use layered H.263 video encoding with the multiple layers going to/coming from different UDP ports. As Tom noted in his message, this can be seen as a shorthand notation for the separate context that indicates the (de)multiplexing. My point is that we don't need that context if the termination can describe the (de)multiplexing. And with only one context for the multimedia session, there is no need to repeat the list of bearers in multiple contexts.
  +----------+          ---          +----------+
  |          | audio   /   \  audio  | audio on |
--| H.221    |--------|     |--------| RTP      |--
  | termina- | video  | b f |        |          |
--| tion     |--------| r u |        +----------+
  |          | data   | i n |
  |          |--------| d c |        +----------+
  +----------+        | g t | video  | video on |
                      | e i |--------| RTP      |--
  +----------+        |   o |        |          |
  |          | audio  |   n |        +----------+
  | H.223    |--------|     |
--| termina- | video  |     |        +----------+
  | tion     |--------|     | data   | data on  |
  |          | data   |     |--------| T.120    |--
  |          |--------|     |        |          |
  +----------+         \   /         +----------+
                         ---
The bridging functionality is described by means of properties of the context. An identification of the streams that are linked to one user would be helpful, so that it is easy to specify that all users should receive the video stream of the last active speaker.
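As a rough data-model sketch of this (all names below are invented for illustration): one context holds only ephemeral terminations, each carrying its bearer list and its (de)multiplexing rule, while the bridging behaviour and the per-user grouping of streams live in properties of the context itself.

  # Illustrative sketch only -- all names and values are invented here.

  h221 = {                                  # ephemeral termination, as in the figure
      "mux_rule": "H.221",
      "bearers":  ["DS0/0", "DS0/1"],       # bearer channels the mux runs over
      "streams":  ["audio", "video", "data"],
  }
  audio_rtp = {"mux_rule": None, "bearers": ["RTP/101"], "streams": ["audio"]}

  context = {
      "terminations": {"T1": h221, "T2": audio_rtp},    # MG-assigned short names
      "properties": {
          "bridge": "mix audio, switch video",          # bridging behaviour
          "users": {"user1": ["T1"], "user2": ["T2"]},  # streams grouped per user,
      },                                                # e.g. for last-speaker video
  }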
The impact on protocol syntax is small. Manipulating a session amounts to adding/deleting/modifying terminations as before; no extra commands are needed. There is a need to set the bridging properties, so we need a mechanism to set context properties, but the other approaches need such a mechanism as well.
Another advantage of this approach is the use of ephemeral terminations. The MG can assign simple names to these (in the same way it assigns names to contexts). Thus there is no longer any need to have long hierarchical names included in all commands that reference terminations. (I see no good reason for termination names to contain any information about the type of transport and/or media used in the termination. This information is present in the termination properties already.)
Final remark on this approach: I think that this is actually what Tom alluded to as Paul Sijben's approach (I didn't check with Paul, I have to admit).
Now for a couple of comments on the earlier mails. It seems overly complex to me to have multiple ephemeral instances of mux contexts/terminations around, as Brian Rosen suggests. The approach I outlined above has only one ephemeral termination for the media multiplexed for transport, which looks much simpler. It is as flexible as the idea Brian presented, and it needs neither the semantics that all ephemeral instances of a mux are created when the first one is, nor multiple Actions, because you don't have to deal with multiple contexts.
Fernando Cuervo's suggestion of having the terminations imply the (de)multiplexing requires that it be possible to describe in every termination which bits from which packet/frame are to be sent out, because a media stream may be split up over multiple bearers. So I feel it would be much better to have the mux explicit. An advantage of having the (de)multiplexing in one termination/context is that it is immediately clear which bearers are used for the aggregated stream. What I like about Fernando's proposal is the fact that there is a context type. What may be even better (I haven't thought it through, though) is to have a limited number of context profiles: one for voice-only with all streams going to everyone but the sender, one for video+audio with audio going to everyone but the sender and the speaker's video to everyone, etc.
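A rough sketch of the context-profile idea (the profile names and rule wording below are invented for illustration): the MGC names a profile instead of spelling out the bridging rules.

  # Illustrative sketch only -- profile names and rule wording are invented here.

  CONTEXT_PROFILES = {
      "voice-only":  {"audio": "to everyone but the sender"},
      "video+audio": {"audio": "to everyone but the sender",
                      "video": "current speaker's video to everyone"},
  }

  def new_context(profile):
      # The MGC picks a profile; the MG applies the corresponding bridging rules.
      return {"profile": profile, "rules": CONTEXT_PROFILES[profile]}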
Kindest regards,
John Segers
John Segers                              email: jsegers@lucent.com
Lucent Technologies, Room HE 344
Dept. Forward Looking Work               phone: +31 35 687 4724
P.O. Box 18, 1270 AA Huizen              fax:   +31 35 687 5954