Re: [h323plus] [Opalvoip-devel] Custom Video Frame Size
I have CC'd this to the h323plus list.
Robert
Getting back to the initial question: I want to move forward with H.239 support in h323plus. Can I remove the fixed frame size constraints from the video plugins so the project can move forward? Or, if that's not recommended, since I don't want to have different versions of the video plugins that break interoperability, can I put in a compiler directive to get us out of a pickle? Once these Opal architectural glitches are resolved, the directive can be removed.
I really am confused on the codec issues, the discrete video sizes with H.261/H.263, the generic capabilities, etc. The way this is done in H323plus is to detect the capabilities of the video device at application startup via the changes I made in the ptlib video device factory, which allows the device capability list to be exposed without instantiating the device. You use the device capabilities list to determine the maximum frame size available for the device, so in this way you can detect and support HD webcams etc. There is an H323Endpoint function that then goes through and removes all the capabilities unsupported by that particular webcam. Easy! On the OpenVideoChannel function callback the user can then set the frame size and fps on the wire. This sets the header height/width fields of the YUV420 frame, which then goes back into the plugin codec to resize the codec. This is how it used to work in OpenH323, and it works just fine. The problem you refer to is, I guess, an open Opal issue perhaps?
The "Extended Video Channel" is different to Hannes's work in Opal. ExtendedVideoCapability is a type of video capability which contains a subset of capabilities designed to be used for the likes of H.239. There is a flag I have added to the codec definitions in the video plugins which marks the codecs to be loaded into this subset group. Hannes's work is on having multiple primary video windows, which is not related to H.239. The secondary or "Extended" video capability is opened via a function which sends an H.245 OLC and returns a channel number that you can then use to close the channel. Since each channel has a unique channel number, multiple video windows can be opened/closed on the fly. There is a working example of this in the simple application in the applications directory of the H323plus CVS. This type of concept opens the way to develop more advanced concepts like telepresence, where you can allocate 3 or more different video inputs, one for each secondary channel. Since all this is done on a secondary video capability, existing interoperability on the primary video is ensured and no existing architectural changes in h323plus are required.
Simon
-----Original Message-----
From: opalvoip-devel-bounces@lists.sourceforge.net [mailto:opalvoip-devel-bounces@lists.sourceforge.net] On Behalf Of Robert Jongbloed
Sent: Tuesday, 6 November 2007 6:00 AM
To: Opalvoip-devel@lists.sourceforge.net
Subject: Re: [Opalvoip-devel] Custom Video Frame Size
-----Original Message-----
From: opalvoip-devel-bounces@lists.sourceforge.net [mailto:opalvoip-devel-bounces@lists.sourceforge.net] On Behalf Of Simon Horne
...
There was some private discussion a few weeks back on setting custom video frame sizes in Opal.
I should probably summarise the result of that discussion and post the general list.
Currently, looking at the video plugins (H.263 for instance), only predefined standard frame sizes are allowed (I added a few more, BTW :) ). Now with H.239 application sharing that is just not going to work. I can just comment out the frame size checking to allow custom sizes into the encoder and get things to work in H323plus, but I thought I'd check here before committing stuff that breaks Opal. :)
There is a branch where I started to implement the results of that huge long thread. Unfortunately I found a show stopper with OpalMediaFormat, where I needed it to be polymorphic, but all through OPAL and H323plus it is used in such a way that breaks polymorphism, e.g. code like:
    ... function(const OpalMediaFormat & videoFormat)
    {
      OpalMediaFormat newFormat = videoFormat;
which, if videoFormat is an instance of OpalVideoFormat, loses that type in the copy construction, including the correct virtual functions needed for video.
This I need to fix, and soon ...
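The slicing problem Robert describes can be reproduced with a minimal, self-contained sketch; the two classes here are simplified stand-ins, not the real OPAL definitions:

```cpp
#include <string>

// Simplified stand-ins for OpalMediaFormat / OpalVideoFormat,
// illustrating the object-slicing problem only.
struct OpalMediaFormat {
    virtual ~OpalMediaFormat() {}
    virtual std::string MediaKind() const { return "generic"; }
};

struct OpalVideoFormat : OpalMediaFormat {
    std::string MediaKind() const override { return "video"; }
};

// Copy-constructing the base from a derived reference slices away the
// derived part, so virtual dispatch reverts to the base behaviour.
std::string KindAfterCopy(const OpalMediaFormat & fmt) {
    OpalMediaFormat copied = fmt;   // slicing happens here
    return copied.MediaKind();      // "generic", even when fmt is video
}
```

The copy inside `KindAfterCopy` is exactly the `OpalMediaFormat newFormat = videoFormat;` pattern quoted above: the video-specific virtual behaviour is lost in the copy.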
On a side note, honestly I cannot see how to set a custom frame size in Opal. Well, that's not true: I can see how you can do it if you know what the frame size is in advance, but with H.239 the frame size is determined by the dimensions of the image capture (and of course that changes when you resize the window). Also, how do you get access to the video input device at creation to set the window handle for the input device to grab, so it can set the capture size in the first place? Maybe someone with more knowledge on this topic might be able to assist. :(
OK, this sounds like you are talking about changing the size of the transmitted video in mid flight, correct?
That is one of the reasons why every internal RTP frame for raw video (of type "YUV420P") has a small structure at the top with x, y, width and height in it. It should theoretically be possible for the video source to change the resolution of the generated video on a frame-by-frame basis. The code does this now, setting the width/height from PVideoInputDevice::GetFrameSize() before handing the frame to the codec.
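A minimal sketch of that per-frame structure, with illustrative field names rather than the exact OPAL layout:

```cpp
#include <cstdint>

// Sketch of the small header at the top of each raw YUV420P frame,
// as described above. Field names are illustrative, not the real
// OPAL struct layout.
struct YUV420PFrameHeader {
    uint32_t x;       // origin of this frame within the target picture
    uint32_t y;
    uint32_t width;   // dimensions of this particular frame
    uint32_t height;
};

// The source can change resolution on a frame-by-frame basis simply
// by stamping a new width/height before handing the frame to the codec.
void StampFrameSize(YUV420PFrameHeader & hdr, uint32_t w, uint32_t h) {
    hdr.width = w;
    hdr.height = h;
}
```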
Now, your comments do raise an interesting point that is not yet allowed for: how to deal with codecs that inherently only do discrete sizes, such as H.261 and H.263. My understanding is that all the others (H.263+, H.264, MPEG4 part 2, Theora?) can do anything, so I don't think they are an issue. Not sure how to address this one; first thought is an OpalMediaOption (read-only) which indicates if dynamic resize is possible, and if not, then something (OpalVideoConverter?) needs to scale/crop the image.
More thought needed on that one.
For background: in H323plus I created a new function, OpenExtendedVideoChannel(), to give the user access to create and set, within the application input device (vidinput_app.h, now in ptlib), the window handle to capture, and then retrieve the application frame size from the input device so you can (if necessary) scale the actual frame size on the wire.
Is an "Extended Video Channel" similar in concept to what Hannes was doing with his OpalMediaType? The ability to open more than one video channel in a call?
I really would like to get some input on how you envision this stuff working in Opal, so that when it comes time (or someone has the time) to port it, it will be a straightforward affair and not an H.239 code or architectural mashup.
My experience has been that video is never straightforward; there is always some gotcha. Especially, resolving the "old way" that H.261 and H.263 use (a million options in H.245) and the "new way" (profile/level) that codecs such as MPEG4 and H.264 seem to be using now is proving to be hideous. :-(
Robert Jongbloed OPAL/OpenH323 Architect and Co-founder.
_______________________________________________
Opalvoip-devel mailing list
Opalvoip-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opalvoip-devel
Simon,
On 06.11.2007, at 03:55, Simon Horne wrote:
I have CC'd this to the h323plus list.
Robert
Getting back to the initial question: I want to move forward with H.239 support in h323plus. Can I remove the fixed frame size constraints from the video plugins so the project can move forward? Or, if that's not recommended, since I don't want to have different versions of the video plugins that break interoperability, can I put in a compiler directive to get us out of a pickle? Once these Opal architectural glitches are resolved, the directive can be removed.
I really am confused on the codec issues, the discrete video sizes with H.261/H.263, the generic capabilities, etc. The way this is done in H323plus is to detect the capabilities of the video device at application startup via the changes I made in the ptlib video device factory, which allows the device capability list to be exposed without instantiating the device. You use the device capabilities list to determine the maximum frame size available for the device, so in this way you can detect and support HD webcams etc. There is an H323Endpoint function that then goes through and removes all the capabilities unsupported by that particular webcam. Easy! On the OpenVideoChannel function callback the user can then set the frame size and fps on the wire. This sets the header height/width fields of the YUV420 frame, which then goes back into the plugin codec to resize the codec. This is how it used to work in OpenH323, and it works just fine. The problem you refer to is, I guess, an open Opal issue perhaps?
Just don't forget that newer codecs such as H.264 no longer define explicit frame sizes, but rather profiles/levels, which actually define a range of sizes. So it is no longer that easy to just remove particular capabilities, as a particular profile/level may mean higher resolution/lower framerate or vice versa. Also, I don't know how flexible existing H.239 systems are in terms of supported frame sizes. I guess it will be safest if you stick to well-known discrete resolutions such as 4CIF / 16CIF.
The "Extended Video Channel" is different to Hannes's work in Opal. ExtendedVideoCapability is a type of video capability which contains a subset of capabilities designed to be used for the likes of H.239. There is a flag I have added to the codec definitions in the video plugins which marks the codecs to be loaded into this subset group. Hannes's work is on having multiple primary video windows, which is not related to H.239. The secondary or "Extended" video capability is opened via a function which sends an H.245 OLC and returns a channel number that you can then use to close the channel. Since each channel has a unique channel number, multiple video windows can be opened/closed on the fly. There is a working example of this in the simple application in the applications directory of the H323plus CVS. This type of concept opens the way to develop more advanced concepts like telepresence, where you can allocate 3 or more different video inputs, one for each secondary channel. Since all this is done on a secondary video capability, existing interoperability on the primary video is ensured and no existing architectural changes in h323plus are required.
I think I have to explain in more detail how the MediaType stuff actually works, as it really was intended to support H.239. An OpalEndpoint does not primarily know about H.239, as this is H.323-specific stuff. To Opal, this is just another video stream. However, this video stream has different characteristics than the primary video stream, since - as you mentioned - the capabilities used are different ones. So it needs different OpalMediaFormat definitions. The OpalMediaType class introduced is just an extension to the sessionID parameter used so far. First, statically assigning session IDs other than 1, 2, 3 is not according to H.245, as these session IDs have to be assigned by the H.245 master. The MediaType is just a description of the media type (video, audio, application, etc.) along with a label (e.g. DefaultVideo, SecondaryVideo). So far, the existing code explicitly tries to open a logical channel for DefaultAudioSessionID, DefaultVideoSessionID and DefaultDataSessionID. If you want to use other data streams (e.g. H.224/H.281), you need to add #ifdef-protected code at various places, which is rather painful. My changes simply try to open logical channels for each MediaType available, and the MediaTypeList is dynamically managed. I don't see why H.239 shouldn't fit into this concept.
Hannes
With the plugin H.264 codec I defined a macro which defines profiles/levels for each standard frame size. This makes it a bit easier to identify the codecs that cannot be supported by the primary input device, hence making it easier to remove these unsupported capabilities from the capability list. In the case of the extended video codec, these maximum frame sizes can be used as frames of reference when scaling the input from the application capture. So in the OpenExtendedVideoChannel() function, the codec maximum width/height from the plugin is compared to the application width/height, and a correct scaling ratio can be calculated to keep the aspect ratio within the confines of the codec-defined capability. This works just fine in H323plus but complains when passed to the plugin, because it requires known frame sizes. Remove these lines from the plugin and it works. Hence the issue.
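The scaling-ratio calculation Simon describes can be sketched as a small self-contained helper; the function name and the no-upscaling rule are assumptions, not the actual h323plus code:

```cpp
#include <algorithm>

// Sketch: given the codec's maximum width/height (from the plugin) and
// the application capture size, pick one scale factor that keeps the
// aspect ratio while fitting within the codec's limits.
double ScaleToFit(int srcW, int srcH, int maxW, int maxH) {
    double sx = static_cast<double>(maxW) / srcW;
    double sy = static_cast<double>(maxH) / srcH;
    double s = std::min(sx, sy);   // the tighter axis wins
    return std::min(s, 1.0);       // assumption: never upscale the capture
}
```

For example, a 1280x1024 desktop capture against a 640x480 codec limit scales by 480/1024, keeping the aspect ratio intact.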
In defining an H.239 capability, there is a flag in the plugin capability to define which codecs are to support extended video. There is no need to specify separate OpalMediaFormats for primary or secondary video; they are the same, just handled differently. When the codec is opened, there are 2 capability factories: a primary and a smaller secondary video one. The codecs with the video flag go in the primary, the ones with the extvideo flag go in the secondary, and some have both, so they are added to both. In the capability exchange, the secondary capabilities are negotiated along with, and in exactly the same way as, the primary, except they are done so inside an H323ExtendedVideoCapability which in h323plus is assigned a Session ID of 5. These capabilities do not auto-start with audio and video (there is a flag to do so, but it is disabled by default), and once the call is established you can invoke at any time an H.245 OLC to open a Session ID 5 session (extendedVideo) to create a unidirectional video channel. The channel is assigned a unique identifier (like T-1xx) which you can use to close it at any time. You can open multiple Session ID 5 sessions, each with its own unique identifier. Via H323Endpoint::OpenExtendedVideoChannel() you can assign different input devices for each channel and unicast multiple video streams simultaneously.
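The open/close bookkeeping described above can be sketched roughly as follows; the class, identifier format and device handling are illustrative stand-ins, not the actual h323plus API:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Sketch: each extended video channel opened via an H.245 OLC gets a
// unique channel identifier, which is the handle later used to close it.
class ExtendedVideoChannels {
public:
    // Returns the unique identifier assigned to the new channel.
    std::string Open(const std::string & inputDevice) {
        std::string id = "T-" + std::to_string(101 + nextIndex_++);
        channels_[id] = inputDevice;
        return id;
    }
    bool Close(const std::string & id) { return channels_.erase(id) == 1; }
    std::size_t Count() const { return channels_.size(); }
private:
    int nextIndex_ = 0;
    std::map<std::string, std::string> channels_;  // id -> input device
};
```

Because each channel is tracked only by its unique identifier, multiple unidirectional streams can be opened and closed on the fly, as the text describes.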
Hannes, your changes are important and will work quite well with H.239; however, it's not a show stopper, as each extended video stream uses the same session ID (in h323plus's case, 5); in Opal that would be a dynamically allocated number.
There still is the outstanding issue of NAT and unidirectional video streams, but if you implement the relevant sections of H460.p2pnat as I wrote it, that should not be that great an issue. :-)
The architecture of OpenH323 (now h323plus), with all its intricacies and downsides, is still quite capable of handling this stuff.
Simon
-----Original Message-----
From: Hannes Friederich [mailto:hannesf@ee.ethz.ch]
Sent: Tuesday, 6 November 2007 4:21 PM
To: Simon Horne
Cc: Opalvoip-devel@lists.sourceforge.net; Robert Jongbloed; H323plus
Subject: Re: [Opalvoip-devel] Custom Video Frame Size
-----Original Message-----
From: Simon Horne [mailto:s.horne@packetizer.com]
....
Getting back to the initial question: I want to move forward with H.239 support in h323plus. Can I remove the fixed frame size constraints from the video plugins so the project can move forward? Or, if that's not recommended, since I don't want to have different versions of the video plugins that break interoperability, can I put in a compiler directive to get us out of a pickle? Once these Opal architectural glitches are resolved, the directive can be removed.
I am confused; the underlying system in OPAL has never really had any "fixed size" constraints. There are two OpalMediaOptions for width and height, and they can be any value. Now, some CODECs can only do fixed sizes, e.g. H.261, and that was part of the complexity Matthias and I were struggling with.
I really am confused on the codec issues, the discrete video sizes with H.261/H.263, the generic capabilities, etc. The way this is done in H323plus is to detect the capabilities of the video device at application startup via the changes I made in the ptlib video device factory, which allows the device capability list to be exposed without instantiating the device. You use the device capabilities list to determine the maximum frame size available for the device, so in this way you can detect and support HD webcams etc. There is an H323Endpoint function that then goes through and removes all the capabilities unsupported by that particular webcam. Easy!
If I am reading this right, you are using the capabilities of the camera, which is used for transmit channels, to determine the H.323 capabilities which control the RECEIVE channels. Surely the capability should use the VideoOutputDevice? Or is this for transmitVideo capabilities only?
Also, what happens if you have a camera that only reports being able to do 320x240? Not sure this happens much anymore, but older cameras certainly used to. Again, if I am reading what you said right, you would not get H.261 or H.263 at all, as they can't do that resolution.
On the OpenVideoChannel function callback the user can then set the frame size and fps on the wire. This sets the header height/width fields of the YUV420 frame, which then goes back into the plugin codec to resize the codec. This is how it used to work in OpenH323, and it works just fine. The problem you refer to is, I guess, an open Opal issue perhaps?
Given recent events I would be VERY careful about gibes like this.
Here is the problem as I see it, library neutral ...
Leaving out for the moment the added complexity of requiring symmetric codecs, the receive and transmit video streams are completely independent. Let's start with the receiver; there are three entities at work:
- Video output device capabilities
- User/Application preferences
- Codec fundamentals
As a rule, most output devices can do any resolution/frame rate; however, it is possible with the YUVFile driver to indicate that it MUST be a certain size, e.g. using the filename "fred_qcif.yuv".
The User/Application may also have restrictions; the most common is for PDAs, where the screen size is such that you want to prevent 4CIF etc.
The Codec fundamentals are the most complicated as they may just be a maximum (via profile/level for H.264/MPEG4 and CustomPictureFormat for H.263+) or a set of discrete values as required by H.261 and H.263.
The presented capabilities sent to the remote must be derived from the above. I am not presenting a solution yet, just trying to state the problem in as complete a form as possible.
For the transmit side we have four entities:
- Video input device capabilities
- User/Application preferences
- Codec fundamentals
- Remote capabilities
So, when selecting our specific video parameters we want to try and get as close to the user preference as possible given the constraints indicated by the other three entities.
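A minimal sketch of that selection rule, treating each constraint as a simple maximum (a deliberate simplification: real options also carry minima and, for some codecs, discrete sets):

```cpp
#include <algorithm>

// Sketch: the transmitted width (and likewise height, frame rate, etc.)
// is the user/application preference clamped by the other three
// entities: input device, codec fundamentals, and remote capabilities.
int SelectTransmitWidth(int userPref, int deviceMax,
                        int codecMax, int remoteMax) {
    return std::min({userPref, deviceMax, codecMax, remoteMax});
}
```

So a user asking for 4CIF width (704) against a camera limited to 640 and a codec/remote limit of CIF (352) would end up transmitting 352.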
One more random point: the Codec fundamentals are only known by the plug-in.
So, as the result of a long discussion with Matthias (and others, but Matthias was the main person), we came up with a first-cut solution. To simplify matters, an assumption is made that the video devices can be told to go to any resolution. As we have the PColourConverter functionality, which (mostly) can do scale/crop as it converts, I think that is fair.
Then we introduce the concept of "normalised" OpalMediaOptions and "custom" OpalMediaOptions. The normalised options are things like min/max width/height and the custom options are things like profile/level, or "QCIF MPI".
Then two functions are to be added to the plug-in to convert the options in an instance from normalised to custom, and back from custom to normalised. In the process the plug-in can also apply any other rules it might have, such as discrete sizes.
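As a sketch of what a from-normalised hook might do for a discrete-size codec, here is a helper that snaps a requested frame size down to the largest standard H.263 size that fits; the helper name and explicit size table are illustrative, not the plug-in API:

```cpp
#include <utility>
#include <vector>

// Sketch: snap a requested (normalised) frame size down to the largest
// standard H.263 picture size that fits within it.
std::pair<int,int> SnapToDiscreteSize(int w, int h) {
    static const std::vector<std::pair<int,int>> sizes = {
        {1408, 1152},  // 16CIF
        { 704,  576},  // 4CIF
        { 352,  288},  // CIF
        { 176,  144},  // QCIF
        { 128,   96},  // SQCIF
    };
    for (const auto & s : sizes)
        if (s.first <= w && s.second <= h)
            return s;
    return sizes.back();   // smaller than SQCIF: fall back to SQCIF
}
```

This reproduces the behaviour in the worked example below, where a requested 320x240 comes back from the plug-in as 176x144 (QCIF), the largest standard size that fits.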
And here is the sequence that I am stealing from the other thread describing how all the OpalMediaOptions get tweaked as they pass through the system:
OK, nothing better than a concrete example:
Media format on start up from a H.263 plug-in:
    Max Frame Width = 1408
    Max Frame Height = 1152
    Min Frame Width = 144
    Min Frame Height = 96
    Frame Width = 352
    Frame Height = 288
    Max Bit Rate = 384000
    Target Bit Rate = 384000
    Frame Time = 3000
    SQCIF MPI = 1
    QCIF MPI = 1
    CIF MPI = 2
    4CIF MPI = 3
    16CIF MPI = 4
    Annex D = 1
Then the user alters the following:
    Max Frame Width = 640
    Max Frame Height = 480
    Frame Width = 320
    Frame Height = 240
    Max Bit Rate = 128000
    Frame Time = 6000
Then just before making a call the above, after user adjustment, is sent to the plug-in "from_normalised_options" function, and the plug-in returns:
    Max Frame Width = 352
    Max Frame Height = 288
    Min Frame Width = 144
    Min Frame Height = 96
    Frame Width = 176
    Frame Height = 144
    Max Bit Rate = 128000
    Target Bit Rate = 128000
    Frame Time = 6000
    SQCIF MPI = 2
    QCIF MPI = 2
    CIF MPI = 2
    4CIF MPI = 5
    16CIF MPI = 5
    Annex D = 1
Note that as this particular H.263 implementation cannot do custom frame sizes, all the frame sizes are adjusted appropriately. Some MPIs are set to an illegally large value (required for merging to work), and the new frame rate has made the MPIs that are left change upward.
This is encoded to:
    m=video 5002 RTP/AVP 34
    a=rtpmap:34 h263/90000
    a=fmtp:34 CIF=2;QCIF=2;SQCIF=2;D=1
The remote replies:
    m=video 5002 RTP/AVP 34
    a=rtpmap:34 h263/90000
    a=fmtp:34 CIF=3;4CIF=3
I deliberately tried to be a "rude" UA and return a frame size that was never offered to make sure it all works.
OPAL then constructs the following from the SDP:
    Max Frame Width = 640
    Max Frame Height = 480
    Min Frame Width = 144
    Min Frame Height = 96
    Frame Width = 320
    Frame Height = 240
    Max Bit Rate = 128000
    Target Bit Rate = 128000
    Frame Time = 6000
    SQCIF MPI = 5
    QCIF MPI = 5
    CIF MPI = 3
    4CIF MPI = 3
    16CIF MPI = 5
    Annex D = 0
Many of the above values are irrelevant at this stage and just inherited from the master format.
OPAL then merges the sent options with the received options to get:
    Max Frame Width = 352
    Max Frame Height = 288
    Min Frame Width = 144
    Min Frame Height = 96
    Frame Width = 176
    Frame Height = 144
    Max Bit Rate = 128000
    Target Bit Rate = 128000
    Frame Time = 6000
    SQCIF MPI = 5
    QCIF MPI = 5
    CIF MPI = 3
    4CIF MPI = 5
    16CIF MPI = 5
    Annex D = 0
Here the MaxMerge operator adjusts the MPIs, and the AndMerge operator turns off Annex D.
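Those two merge operators reduce to a one-liner each; this sketch checks them against the values in the worked example (function names follow the operator names used above, but this is not the actual OPAL code):

```cpp
#include <algorithm>

// Sketch of the two merge operators: MaxMerge takes the larger of the
// local and remote values (used for the MPIs, where a larger MPI means
// a lower permitted frame rate), and AndMerge requires both sides to
// enable a boolean option (used for Annex D).
int MaxMerge(int local, int remote)   { return std::max(local, remote); }
bool AndMerge(bool local, bool remote) { return local && remote; }
```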
This is then passed to the plug-in "to_normalised_options" getting:
    Max Frame Width = 352
    Max Frame Height = 288
    Min Frame Width = 352
    Min Frame Height = 288
    Frame Width = 352
    Frame Height = 288
    Max Bit Rate = 128000
    Target Bit Rate = 128000
    Frame Time = 9009
    SQCIF MPI = 5
    QCIF MPI = 5
    CIF MPI = 3
    4CIF MPI = 5
    16CIF MPI = 5
    Annex D = 0
Here the frame size settles on its only possible value, and Frame Time gets adjusted to roughly 10 fps due to the CIF MPI being 3. These options are then sent to the codec using set_codec_options and also merged with the YUV420P options so they can be used by the grabber.
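The frame-time arithmetic in that last step can be checked with a tiny helper: an H.263 MPI counts units of 1/29.97 s, and OPAL frame times are in 90 kHz RTP clock ticks, so one picture interval is 90000 x 1001/30000 = 3003 ticks (helper name is illustrative):

```cpp
// Sketch: convert an H.263 MPI (minimum picture interval, in units of
// 1/29.97 s) into a frame time in 90 kHz RTP clock ticks.
// 90000 * 1001 / 30000 = 3003 ticks per picture interval.
int FrameTimeFromMPI(int mpi) {
    return mpi * 3003;
}
```

With the merged CIF MPI of 3 this gives the Frame Time of 9009 shown above, i.e. just under 10 frames per second.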
Robert Jongbloed OPAL/OpenH323 Architect and Co-founder.
participants (3)
- Hannes Friederich
- Robert Jongbloed
- Simon Horne