Over the past couple of days there has been some interesting discussion on the mailing list with respect to multimedia support in H.gcp/MEGACO, starting with Tom Taylor's note. Unfortunately I have been off-line and have not been able to respond until now.
Tom indicated that he would not pursue an approach in which a multimedia session is made up of one single context. The discussion that followed did not explore this option either. I would like to take the opportunity to look into this, and compare this approach to the proposals that have been posted earlier.
Let me first state the main reason why I favor an approach in which there is just one context for a multimedia, multi-party session: simplicity of the model. In my opinion having multiple contexts for mux/demux, audio, video, data adds too much detail to the connection model. It starts to resemble the first MDCP draft where there were separate entities in the model for echo canceling, bridging, transcoding, etc. I think we can all agree that there was too much detail there. It is much better to set properties of terminations (or edge points as we used to call them in MDCP).
So the connection model should be as abstract and flexible as possible, keeping in mind that the 95% use case is point-to-point voice. To me this means that a model for a multimedia call does not need separate entities for the mux/demux and for the audio, video and data flows that it sources and sinks. To me it also means that there is no need to split the conference bridge up over different contexts. It seems better to have one context describing the bridging functionality, and leave it to MG implementers to decide which hardware/software units to use in the box.
Now let me get to the ingredients of the model that I propose for multimedia support. There are only ephemeral terminations. Every termination has a list of bearer channels it uses. A termination describes the (de)multiplexing rule used in case media are multiplexed on the bearer(s). Examples are H.221 terminations, H.223 terminations, and terminations that use layered H.263 video encoding with the multiple layers going to/coming from different UDP ports. As Tom noted in his message, this can be seen as a shorthand notation for the separate context that indicates the (de)multiplexing. My point is that we don't need that context if the termination itself can describe the (de)multiplexing. And with only one context for the multimedia session, there is no need to repeat the list of bearers in multiple contexts. The figure below illustrates the model:
  +----------+          ---          +----------+
  |          |  audio  /   \  audio  | audio on |
--| H.221    |--------|     |--------| RTP      |--
  | termina- |  video | b f |        |          |
--| tion     |--------| r u |        +----------+
  |          |  data  | i n |
  |          |--------| d c |        +----------+
  +----------+        | g t |  video | video on |
                      | e i |--------| RTP      |--
  +----------+        |   o |        |          |
  |          |  audio |   n |        +----------+
  | H.223    |--------|     |
--| termina- |  video |     |        +----------+
  | tion     |--------|     |  data  | data on  |
  |          |  data  |     |--------| T.120    |--
  |          |--------|     |        |          |
  +----------+         \   /         +----------+
                        ---
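To make this concrete, here is a minimal sketch of the model in Python. The class and field names are mine, purely for illustration; nothing here is proposed protocol syntax.

  # Minimal sketch of the proposed connection model (illustrative names only).
  from dataclasses import dataclass, field
  from typing import Dict, List

  @dataclass
  class Termination:
      """An ephemeral termination; the MG assigns its name."""
      name: str                 # simple MG-assigned name, e.g. "t1"
      bearers: List[str]        # bearer channels this termination uses
      mux: str = "none"         # (de)multiplexing rule, e.g. "H.221", "H.223"

  @dataclass
  class Context:
      """One context describes the whole multimedia, multi-party session."""
      name: str                 # simple MG-assigned name, e.g. "c1"
      terminations: List[Termination] = field(default_factory=list)
      properties: Dict[str, str] = field(default_factory=dict)  # bridging rules etc.

  # The session in the figure: one context holding all terminations.
  session = Context(name="c1")
  session.terminations.append(Termination("t1", bearers=["b1", "b2"], mux="H.221"))
  session.terminations.append(Termination("t2", bearers=["b3"], mux="H.223"))
  session.terminations.append(Termination("t3", bearers=["rtp-audio"], mux="none"))
  session.terminations.append(Termination("t4", bearers=["rtp-video"], mux="none"))
  session.terminations.append(Termination("t5", bearers=["t120-data"], mux="none"))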
The bridging functionality is described by means of properties of the context. An identification of the streams that are linked to one user would make it easy to specify, for example, that all users should receive the video stream of the last active speaker.
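Continuing the sketch above, the bridging rules could then be expressed as context properties along these lines (the property names and values are invented for illustration):

  # Hypothetical context properties expressing the bridging rules.
  session.properties["audio.bridge"] = "mix-minus-self"        # each user hears everyone but himself
  session.properties["video.bridge"] = "last-active-speaker"   # everyone sees the current speaker

  # Linking streams to users makes "video of the last active speaker"
  # easy to express: the MG knows which streams belong to which user.
  session.properties["user.t1"] = "user-A"
  session.properties["user.t2"] = "user-B"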
The impact on protocol syntax is small. Manipulating a session amounts to adding/deleting/modifying terminations as before; no extra commands are needed. The bridging properties do have to be set, so we need a mechanism to set context properties. The other approaches need such a mechanism as well.
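A rough sketch of what this implies on the MGC side, continuing the Python sketch (the helper names are hypothetical):

  # The existing repertoire: manipulate terminations within a context.
  def add_termination(ctx: Context, t: Termination) -> None:
      ctx.terminations.append(t)

  def subtract_termination(ctx: Context, name: str) -> None:
      ctx.terminations = [t for t in ctx.terminations if t.name != name]

  # The one addition: a way to set properties on the context itself.
  def set_context_properties(ctx: Context, props: Dict[str, str]) -> None:
      ctx.properties.update(props)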
Another advantage of this approach is the use of ephemeral terminations. The MG can assign simple names to these (in the same way it assigns names to contexts). Thus there is no longer any need to include long hierarchical names in every command that references a termination. (I see no good reason for having termination names contain any information about the type of transport and/or media used in the termination. This information is present in the termination properties already.)
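To illustrate with invented names: a command could then reference simply "t1" rather than a hierarchical name along the lines of "mg1/trunk7/h221/video", since the transport and media information is already available as termination properties.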
Final remark on this approach: I think that this is actually what Tom alluded to as Paul Sijben's approach (I didn't check with Paul, I have to admit).
Now for a couple of comments on the earlier mails. It seems overly complex to me to have multiple ephemeral instances of mux contexts/terminations around, as Brian Rosen suggests. The approach I outlined above has only one ephemeral termination for the media multiplexed on a transport, which looks much simpler. It is as flexible as the idea Brian presented, but it does not need the semantics that all ephemeral instances of a mux are created when the first one is, nor does it need multiple actions, because there are no multiple contexts to deal with.
Fernando Cuervo suggested having the terminations imply the (de)multiplexing. This requires that every termination can describe which bits from which packet/frame are to be sent out, because a media stream may be split up over multiple bearers. So I feel it would be much better to make the mux explicit. An advantage of having the (de)multiplexing in one termination/context is that it is immediately clear which bearers are used for the aggregated stream. What I like about Fernando's proposal is the fact that there is a context type. What may even be better (I haven't thought it through, though) is to have a limited number of context profiles: one for voice-only with all streams going to everyone but the sender, one for video+audio with audio to everyone but the sender and the speaker's video to everyone, etc.
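Continuing the Python sketch, such profiles might boil down to no more than canned sets of context properties (the profile names and rules below are just a first guess):

  # Hypothetical context profiles: canned sets of bridging properties.
  CONTEXT_PROFILES: Dict[str, Dict[str, str]] = {
      "voice-only": {
          "audio.bridge": "mix-minus-self",        # all audio to everyone but the sender
      },
      "video-conference": {
          "audio.bridge": "mix-minus-self",        # audio to everyone but the sender
          "video.bridge": "last-active-speaker",   # speaker's video to everyone
      },
  }

  def apply_profile(ctx: Context, profile: str) -> None:
      set_context_properties(ctx, CONTEXT_PROFILES[profile])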
Kindest regards,
John Segers
--
John Segers                   email: jsegers@lucent.com
Lucent Technologies           Room HE 344
Dept. Forward Looking Work    phone: +31 35 687 4724
P.O. Box 18, 1270 AA Huizen   fax:   +31 35 687 5954