Adding my two cents on _maximum_ vs. _exact_ framing.
I think it is important that an endpoint take the number of frames in
OLC to mean the MAXIMUM number of frames that it expects to receive. In
other words, an EP should be able to receive packets that contain
anywhere from 1 to MAX voice frames. The reason is that if I am sending
multiple frames per packet and using the built-in VAD/CNG of G.723.1 or
G.729, I have no control over where the transition from voice to silence
occurs, so the number of voice frames per packet will inevitably vary.
To keep the number of frames per packet constant, I would have to wait
until the next talk spurt, which is not much of an option. This actually
brings up another constraint on the packet, i.e., all frames in a packet
should represent consecutive voice frames.
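To make the sender-side behaviour concrete, here is a minimal Python
sketch, not taken from any real stack. The function name packetize, the
(index, is_voice) frame representation, and the choice to simply drop
silent frames are all assumptions made for illustration; a real
G.723.1 or G.729 Annex B sender would also have SID frames to deal with.
It shows a packet being flushed as soon as the VAD reports silence, so
packets carry anywhere from 1 to MAX consecutive voice frames.

```python
from typing import Iterable, List, Tuple

Frame = Tuple[int, bool]  # (frame index, is_voice); hypothetical representation


def packetize(frames: Iterable[Frame], max_frames: int) -> List[List[int]]:
    """Group consecutive voice frames into packets of at most max_frames."""
    packets: List[List[int]] = []
    current: List[int] = []
    for index, is_voice in frames:
        if is_voice:
            current.append(index)
            if len(current) == max_frames:
                # Packet is full: send it.
                packets.append(current)
                current = []
        elif current:
            # Silence starts: flush the partial packet now instead of
            # waiting for the next talk spurt to fill it.
            packets.append(current)
            current = []
    if current:
        packets.append(current)
    return packets


# A talk spurt of 5 frames, 2 silent frames, then a spurt of 2 frames.
vad = [True] * 5 + [False] * 2 + [True] * 2
example_packets = packetize(enumerate(vad), max_frames=3)
print(example_packets)  # [[0, 1, 2], [3, 4], [7, 8]]
```

With MAX set to 3, the second packet goes out with only two frames
because the talk spurt ended, and no packet mixes frames from both sides
of the silence gap.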
Regards,
Shan Lu
NexTone Communications
--- Paul Long <plong@SMITHMICRO.COM> wrote:
Francois,
(Worries me, too. I don't like Fast Connect.
"Half-baked" comes to mind.
Apologies to whoever the authors are...)
Exactly, the caller has to make a series of educated guesses regarding
outgoing caps, based on the implementor's experience and common sense.
(Wait a minute. Let me get on my high horse... okay, I'm on... :-)
These fields have always indicated _maximum_ framing. For example, if I
indicate that I can receive packets containing as many as 20 frames per
packet (for G.711, a frame happens to be 1ms in duration), that means
you can send me packets containing 1, 20, 10, 2, 15, 8, etc., frames.
Mix and match. Put a different number of frames in each successive
packet if you want to. It doesn't matter. The _only_ constraint is that
no packet may contain more than 20 frames. It has never meant that you
can only send me 20 frames per packet for each and every packet, which
for G.711 would be 20ms packets all the time. However, several vendors
apparently have "extra-normative" technical constraints that they must
deal with regarding packet size, i.e., DSPs that require being fed
chunks of fixed-size audio, e.g., 20ms packets. But there is no way in
H.245 and therefore H.323 to indicate a fixed-size packet. These vendors
should either "fix" their DSPs or perform re-framing between transport
and decoder. In our software-only implementation, we have a FIFO into
which we shove all incoming audio frames. Once frames are installed in
the FIFO, it doesn't matter where the packet boundary was. The decoder
just processes frames in the order they were received. I'm sure that
there are really good technical reasons why some implementations require
fixed-size audio packets. Based on our perhaps too simplistic
implementation, however, this has always puzzled me.
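The re-framing idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the actual Smith
Micro implementation: the class FrameFifo and its methods are
hypothetical names, and frames are stood in for by integer sequence
numbers rather than real audio. Packets with varying frame counts go in;
a consumer that insists on fixed-size chunks (say, 20 frames at a time)
pulls from the other end.

```python
from collections import deque
from typing import Deque, List


class FrameFifo:
    """Hypothetical receive-side FIFO; packet boundaries vanish once queued."""

    def __init__(self, max_frames_per_packet: int) -> None:
        self.max_frames_per_packet = max_frames_per_packet
        self._frames: Deque[int] = deque()

    def push_packet(self, frames: List[int]) -> None:
        # Only constraint: a packet carries between 1 and the advertised maximum.
        if not 1 <= len(frames) <= self.max_frames_per_packet:
            raise ValueError("frame count outside 1..advertised maximum")
        self._frames.extend(frames)

    def pop_chunk(self, chunk_frames: int) -> List[int]:
        # Hand over a fixed-size chunk, or nothing if too few frames are queued.
        if len(self._frames) < chunk_frames:
            return []
        return [self._frames.popleft() for _ in range(chunk_frames)]

    def __len__(self) -> int:
        return len(self._frames)


fifo = FrameFifo(max_frames_per_packet=20)
next_frame = 0
for size in (1, 20, 10, 2, 15, 8):  # a different frame count in each packet
    fifo.push_packet(list(range(next_frame, next_frame + size)))
    next_frame += size

chunks = []
while True:
    chunk = fifo.pop_chunk(20)  # e.g. a DSP that wants 20 frames at a time
    if not chunk:
        break
    chunks.append(chunk)

print(len(chunks), len(fifo))  # 2 16
```

The 56 frames received in six unevenly sized packets come out as two
20-frame chunks in arrival order, with 16 frames left queued until more
audio arrives.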
Paul Long
Smith Micro Software, Inc.
-----Original Message-----
From: Francois Audet [mailto:audet@NORTELNETWORKS.COM]
Sent: Friday, March 10, 2000 1:55 PM
To: ITU-SG16@MAILBAG.INTEL.COM
Subject: Re: FastStart and payload size
Hum... That kind of worries me.
It means that the burden is on the originating side to "guess" what the
terminator is likely to support.
I also agree that practically speaking, implementations normally support
multiples of 10 ms as the payload size (with the notable exception of
NetMeeting).
However, technically speaking, if you advertise 30 ms as the maximum
payload size you can support, it implies that in order to comply with
the H.323 specification, you would support any value between 0 ms
(whatever it means) and 30 ms. I don't think many implementations could
do this. But that is a separate issue.
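As a rough illustration of the arithmetic, the Python sketch below lists
the payload sizes implied by a given maximum. The function name
implied_payload_sizes_ms is hypothetical, and the 1 ms frame duration
used for G.711 is the figure quoted elsewhere in this thread, taken here
as an assumption.

```python
from typing import List


def implied_payload_sizes_ms(max_frames: int, frame_ms: int) -> List[int]:
    """Every payload duration a receiver implicitly accepts for a given maximum."""
    return [n * frame_ms for n in range(1, max_frames + 1)]


# Advertising a 30 ms maximum with 1 ms frames (the G.711 figure assumed above).
g711_sizes = implied_payload_sizes_ms(max_frames=30, frame_ms=1)

# What an implementation that only handles multiples of 10 ms actually covers.
multiples_of_10 = [ms for ms in g711_sizes if ms % 10 == 0]

print(len(g711_sizes), multiples_of_10)  # 30 [10, 20, 30]
```

Under these assumptions, an implementation that handles only multiples
of 10 ms covers 3 of the 30 payload sizes its advertisement implies.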
In any case, if we all understand that the originator is responsible for
accurately describing a decent subset of the capabilities it can
support, we probably should describe this in the spec because I am
convinced that most people will simply put the maximum payload size they
can support...