Media subsystem workshop 2011 - Prague - Oct 23-25
Group photo of the Kernel Summit Media Subsystem Workshop 2011
Since 2007, we have been holding annual mini-summits for the media subsystem in order to plan the new features that will be introduced there.
Last year, during the Kernel Summit 2010, it was decided that the Kernel Summit 2011 format would be modified in order to strengthen the interaction between the various subsystem mini-summits and the main Kernel Summit. If this idea works well, future Kernel Summits will follow the same format.
Some mini-summits were therefore proposed to be held together with the Kernel Summit 2011, and among a few others, the media subsystem's proposal was accepted.
So, we would like to announce that the Media subsystem workshop 2011 will take place together with the Kernel Summit 2011.
The Media subsystem workshop is in its early planning stages, but the idea is that we will have an entire day for the media discussions. We are also planning a session inside the Kernel Summit 2011, with both workshop and Kernel Summit attendees present, where the workshop results will be presented.
So, I would like to invite V4L, DVB and RC developers to submit proposals for themes to be discussed. Please email me if you are interested in being invited to the event.
Hoping to see you soon there!
Day 1 discussions
DVB video/audio.h conversion to V4L2
Only used by ivtv and av7110.
Deprecate old API, design new API in V4L2.
See RFC posted on June 9th 2011:
"RFC: Add V4L2 decoder commands/controls to replace dvb/video.h"
Hans will make an RFCv2, wait for comments from ST.
Hans can implement the API in the V4L2 core and ivtv.
videobuf2 - Migration plans for legacy drivers
Some drivers (ab)used videobuf for audio support, this has been cleaned up by patches.
Overlay support only exists in bttv and saa7134/saa7146; it is not supported by newer hardware.
V4L2 overlay support requires userspace to pass a pointer to physical memory to the V4L2 driver. For security reasons this requires root permissions. A userspace setuid helper is thus required.
The original overlay API proposal involved querying the video adapter driver for a buffer ID and passing the ID to the V4L2 driver. The new buffers sharing API uses a similar approach. V4L2 overlay support should be deprecated in favour of the buffers sharing API.
One drawback is that applications would need to constantly queue/dequeue buffers. A possible solution is to add a buffer flag telling drivers to keep overwriting the same buffer if no new buffer is queued.
Can we migrate to vb2 with no OVERLAY support?
The existing overlay API needs to be supported until the buffers sharing API is in place. To convert bttv/saa7134/saa7146 to videobuf2 we thus need a working overlay API support solution. videobuf2 should not be touched if possible, the code should instead be put in the drivers. Drivers that don't support overlays will be ported to videobuf2 first, and the decision on how (and if) to support overlays will be postponed until bttv/saa7134/saa7146 get ported to videobuf2.
Multiple contiguous planes and padding
Use case: allocating an NV12 video buffer on the capture device side and passing it to a GPU that requires the Y and CbCr planes to be contiguous in memory with a GPU-specific amount of padding between them. Strictly speaking this is no longer the NV12 format; standard NV12 is defined without any padding between planes.
CI == Common Interface.
Hauppauge has a new USB stick which does ATSC-MH. This device requires further software processing of its output data for the data to be useful to the user. The current library implements the descrambling in user space.
Should the descrambling be implemented in the kernel instead? That would be more useful to the user, but such processing is not necessarily best done in the kernel.
In V4L2 there is a comparable solution: libv4l.
The hardware provides its own encapsulation inside which UDP packets may be found. A separate data stream called FIC is provided alongside the UDP packets. Should the UDP packets be handled by the networking stack, since they are network packets? For receiving a compressed stream, perhaps not, but the content could theoretically be anything.
tun (as in tun/tap) may be used to inject the packets into a virtual network interface, with the raw data provided separately. This approach has a problem: configuring the tun device requires CAP_NET_ADMIN, which typically means root access. This is seen as conflicting with the intent of using the system as a regular user.
It might be better to handle the UDP packets in the library instead after all.
No hardware other than LG's exists yet, so we do not know what hardware manufacturers intend to do in the future: will they provide such solutions in the future as well, or go back to something more traditional?
Without that knowledge, the library approach should proceed. The decision can be re-evaluated when more information on the devices that hardware manufacturers make becomes available; it might then make sense to move this into the kernel after all.
There is a need to control certain aspects of embedded hardware, such as sensor settings: blanking, digital gain, black level clamping, test patterns and per-component gains. No control classes for such purposes exist at the moment.
We have the high-level camera class, but it is not seen as suitable for this kind of control.
Should we classify controls based on what their function is, or on where they are implemented? The common agreement appears to be that function is the answer.
The controls mentioned above are fairly low level, so hiding them from the regular user should be the default. Hiding the controls should be done in user space; the kernel should still expose all controls.
Some of these controls affect the image capture process itself, while others affect the processing of the data, which may be done elsewhere in the pipeline. Whether hiding or showing a control is desired depends on the application: a regular application would likely not wish to see them, while an application written for a specific embedded system would need these controls to function. The decision must be taken in user space.
Currently these controls are seen as best placed in a separate control class, V4L2_CID_CLASS_LOW_LEVEL, which would hold all low-level controls, whether related to the image capture process or to image processing.
Day 2 discussions
V4L2/DVB on desktop vs. embedded systems
Goal: provide a uniform user-space API on both desktop and embedded (MC) systems
Reasons why providing a default pipeline is difficult:
1. after a MC-aware application has run, a standard V4L2 application might not work anymore
2. a hardware-specific library plugin is hard to maintain, is largely proprietary, is based on a vendor-local (non-mainline) kernel branch, and would only support vendor (Nokia) devices, not generic (OMAP3) systems
Sensor: [pixel array -> binner] -> ISP [CSI-2 -> format conversion -> scaler] -> memory
with possible in- and output from and to memory at multiple ISP stages
Currently, any configuration applied to a subdevice remains local and does not get propagated to other (connected) entities.
Video drivers implementing the V4L2 API have to configure the complete video pipeline upon V4L2 ioctl()s.
Currently there is no way to distinguish between video device nodes belonging to standard V4L2 devices and those belonging to MC devices.
Among "regular" applications, there are those that use libv4l and those that do not. libv4l can also be preloaded for applications that do not use it directly. Alternatively, media-ctl can be used to pre-configure the pipeline.
"Low-level" applications will not need libv4l; they use the MC API directly.
MC drivers have to be testable, i.e. driver authors also have to provide an open-source plugin that performs at least a basic pipeline setup. For advanced features, vendors can implement closed-source plugins. Device manufacturers should additionally provide device-specific plugins for maximum flexibility, but those plugins are not compulsory.
Summary: the user-space configuration consists of the following components:
* libv4l - the generic library
* SoC-specific plugin - open-source plugin for the specific SoC
* device-specific plugin - possibly closed-source plugin, specific to the device (optional)
* libvioctl - an auxiliary library using the same plugins as libv4l (SoC- and/or device-specific), but specifically designed to export all the advanced configuration functions in a generic way
* media-ctl - an application using libvioctl to set up the pipeline before running a generic V4L2 application that does not use libv4l
Mauro: This is not V4L2, because it doesn't propagate the S_FMT configuration. Possibilities: remove S_FMT, rename the ioctl(), or rename the device nodes away from videoN
Laurent: Currently omap3isp creates 7 /dev/videoN nodes, 4 of which are CAPTURE devices; they correspond to 7 DMA engines. The problem is the absence of a 1-to-1 relationship between actual data sources and device nodes; videoN node enumeration is problematic. Many configurations already do not work without libv4l; shouldn't it be made a requirement for all V4L2 applications?
* separate video nodes - one V4L2-compliant for legacy applications, supporting default configurations
* do not return the CAPTURE capability, possibly add a new STREAM_MANAGEMENT capability
Proposed resolution: define profiles, e.g. a "streaming" profile for MC. Laurent will create and submit an RFC.
Agreement: the low-level video nodes will not set the CAPTURE / OUTPUT capability bits in querycap. Regular applications will know from this that the video device node is a low-level node which only provides a subset of the V4L2 API functionality.
libv4l should gain an additional library to enumerate video devices
STMicro to V4L2, DVB / MC
Complex topology, consisting of video input, processing and output
Applications include set-top boxes, TV-sets,...
Input possibilities include tuners, analog input, uncompressed data, HDMI-RX
Input can be passed through a transport engine, possibly a security block
Followed by a Stream Engine, eventually landing on a display device
The driver infrastructure is presented in the form of an object model
MC should be used to configure the pipelines - video and audio
Typical data processing paths will pipe data in the kernel from input to output without going to the user-space. Simultaneous processing of several data streams is a typical use-case too.
Support for many input interfaces, both digital (HDMI) and analog (SCART), will require API extensions; RFCs will be needed
Example configuration proposed for an MC implementation for DVB:
Front-End -> Demux -> Decoder (* 2) -> ... (* 2) -> Output
MC will have to be extended to support a dynamic number of pads, since that is a requirement for supporting demuxes. Re-routing data should be possible without breaking the stream, e.g. switching a demux to a different audio language.
Q: should an MC-generic ioctl be defined to configure data format on a pad, similar to S_FMT, which would be passed on by the MC core to the respective subsystem?
Q: should a request to start the pipeline, sent to one entity, increment the use-count of all entities in the pipeline and start them all?