Hi,
From today's V4L presentation, there were two missing topics that may be
useful to include in our discussions:
a) V4L overlay mode;
b) dvb.
So, I'm bringing those two topics up for discussion. If needed, I can do a
presentation about them, but it seemed better to start the discussion via
the ML, in order to learn more about the interest in those two subjects.
a) V4L overlay mode
================
The V4L overlay mode was used a lot during the kernel 2.2 and 2.4 days, when
most hardware was not capable enough to do real-time processing of video
streams. It is supported by xawtv and by the Xorg v4l driver, and uses the XV
overlay extension to display video. It is simple to set up and requires no CPU
usage, as the video framebuffer address is passed directly to the video
hardware, which programs DMA to write directly into the fb memory.
The main structures used in overlay mode (from the kernel's
include/linux/videodev2.h) are:
struct v4l2_pix_format {
	__u32			width;
	__u32			height;
	__u32			pixelformat;
	enum v4l2_field		field;
	__u32			bytesperline;	/* for padding, zero if unused */
	__u32			sizeimage;
	enum v4l2_colorspace	colorspace;
	__u32			priv;		/* private data, depends on pixelformat */
};

struct v4l2_framebuffer {
	__u32			capability;
	__u32			flags;
	/* FIXME: in theory we should pass something like PCI device + memory
	 * region + offset instead of some physical address */
	void			*base;
	struct v4l2_pix_format	fmt;
};
/* Flags for the 'capability' field. Read only */
#define V4L2_FBUF_CAP_EXTERNOVERLAY 0x0001
#define V4L2_FBUF_CAP_CHROMAKEY 0x0002
#define V4L2_FBUF_CAP_LIST_CLIPPING 0x0004
#define V4L2_FBUF_CAP_BITMAP_CLIPPING 0x0008
#define V4L2_FBUF_CAP_LOCAL_ALPHA 0x0010
#define V4L2_FBUF_CAP_GLOBAL_ALPHA 0x0020
#define V4L2_FBUF_CAP_LOCAL_INV_ALPHA 0x0040
#define V4L2_FBUF_CAP_SRC_CHROMAKEY 0x0080
/* Flags for the 'flags' field. */
#define V4L2_FBUF_FLAG_PRIMARY 0x0001
#define V4L2_FBUF_FLAG_OVERLAY 0x0002
#define V4L2_FBUF_FLAG_CHROMAKEY 0x0004
#define V4L2_FBUF_FLAG_LOCAL_ALPHA 0x0008
#define V4L2_FBUF_FLAG_GLOBAL_ALPHA 0x0010
#define V4L2_FBUF_FLAG_LOCAL_INV_ALPHA 0x0020
#define V4L2_FBUF_FLAG_SRC_CHROMAKEY 0x0040
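As a small illustration (the helper below is hypothetical, not code from the
Xorg driver), testing one of those capability bits after VIDIOC_G_FBUF could
look like this; the flag values are the ones from the header excerpt above:

```c
/* Hypothetical helper: test a capability bit returned in
 * v4l2_framebuffer.capability by VIDIOC_G_FBUF. Values copied from
 * the videodev2.h excerpt above. */
#define V4L2_FBUF_CAP_EXTERNOVERLAY	0x0001
#define V4L2_FBUF_CAP_CHROMAKEY		0x0002
#define V4L2_FBUF_CAP_LIST_CLIPPING	0x0004

static int fbuf_has_cap(unsigned int capability, unsigned int cap)
{
	/* a capability is present when all of its bits are set */
	return (capability & cap) == cap;
}
```

A driver that cannot do clipping, for instance, would leave
V4L2_FBUF_CAP_LIST_CLIPPING cleared, and applications should check this
before trying to set clip lists.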
Using it is as simple as selecting a format that the video display framebuffer
supports and sending a couple of ioctls to the video adapter.
This is what the Xorg v4l driver (v4l.c) does (simplified to ease
comprehension; error paths and some declarations trimmed):

struct v4l2_framebuffer yuv_fbuf;
struct v4l2_window yuv_win;
struct v4l2_format fmt;
int on = 1;

if (-1 == ioctl(V4L_FD, VIDIOC_G_FBUF, &yuv_fbuf))
	return;

/* Set the framebuffer data: width, height, line pitch, format, base address */
yuv_fbuf.fmt.width = surface->width;
yuv_fbuf.fmt.height = surface->height;
yuv_fbuf.fmt.bytesperline = surface->pitches[0];
yuv_fbuf.fmt.pixelformat = V4L2_PIX_FMT_YUYV;
yuv_fbuf.base = (char *)(memPhysBase + surface->offsets[0]);

if (-1 == ioctl(V4L_FD, VIDIOC_S_FBUF, &yuv_fbuf))
	return;

/* Set the display position of the overlay window */
memset(&yuv_win, 0, sizeof(yuv_win));
yuv_win.w.left = 0;
yuv_win.w.top = 0;
yuv_win.w.width = surface->width;
yuv_win.w.height = surface->height;

/* Set the memory transfer type to overlay mode */
memset(&fmt, 0, sizeof(fmt));
fmt.type = V4L2_BUF_TYPE_VIDEO_OVERLAY;
memcpy(&fmt.fmt.win, &yuv_win, sizeof(yuv_win));
if (-1 == ioctl(V4L_FD, VIDIOC_S_FMT, &fmt))
	return;

/* Enable overlay mode. Data is transferred directly from the video
   capture device into the display framebuffer */
if (-1 == ioctl(V4L_FD, VIDIOC_OVERLAY, &on))
	return;
The main issue with the overlay mode, as discussed on the first day,
is that the framebuffer pointer is a physical address. The original
idea in v4l2 was to use some sort of framebuffer ID.
That said, it wouldn't be hard to add a new flag to v4l2_framebuffer.flags
meaning that the base field should be interpreted as a GEM ID. I had some
discussions with David Airlie about that when I submitted the v4l driver fixes
due to the removal of the old V4L1 API. I'm planning to submit something like
that in the future, when I have some spare time. If Linaro is interested, it
could be an interesting project, as it may solve some of the current needs.
It is probably simpler to do that than to add another mode to the V4L MMAP
code.
b) DVB
===
Several new ARM devices are now shipped with Digital TV integrated into them.
In my country, we have several mobile phones, tablets and GPS devices with DTV
receivers inside. Modern TV sets and set-top boxes already use Linux with DVB
support inside. GoogleTV will for sure need DTV support, as will similar
products.
Even though it is used everywhere, so far no big vendor has tried to send us
patches to improve their DVB support, but I suspect that this will happen
soon. This is just an educated guess. It would be nice to have some feedback
about that from the vendors.
The DVB API is completely different from the V4L one, and there are two
different types of DVB devices:
- full-featured DVB devices, with MPEG-TS demux and audio and video codecs
  inside;
- "simple" devices that just provide a read() interface to get an MPEG-TS
  stream.
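For the "simple" type, reading the stream boils down to pulling TS packets
from the dvr device node (usually /dev/dvb/adapter0/dvr0). A minimal sketch,
assuming the frontend is already tuned and a demux filter has been set up
elsewhere; since read() may return fewer bytes than requested, it loops:

```c
#include <errno.h>
#include <unistd.h>

/* A TS packet is always 188 bytes long */
#define TS_PACKET_SIZE 188

/* Read exactly len bytes (or up to end-of-stream) from an open dvr fd,
 * handling partial reads and EINTR. Returns bytes read, or -1 on error. */
static ssize_t read_ts(int fd, unsigned char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, buf + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* end of stream */
		done += n;
	}
	return (ssize_t)done;
}
```

An application would typically read a multiple of TS_PACKET_SIZE at a time and
hand the packets to a software demuxer or write them to disk.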
As modern ARM SoC devices can have a codec DSP, it makes sense for them to use
the full-featured API, providing audio and video via the DVB API as well (yes,
DVB has a different way to control and export audio/video than V4L/ALSA).
The question here is: is there any demand for it right now? If so, what are the
requirements? Are the memory management requirements identical to the current
ones?
Thanks,
Mauro
I've added it to the wiki along with the existing use case.
cheers,
jesse
On Mon, May 9, 2011 at 5:15 PM, Sakari Ailus
<sakari.ailus(a)maxwell.research.nokia.com> wrote:
> Jesse Barker wrote:
>> Hi all,
>
> Hi Jesse,
>
>> I've updated the mini-summit wiki with a couple more details:
>>
>> https://wiki.linaro.org/Events/2011-05-MM
>>
>> one of which is a sample use case description from Samsung. I would
>> encourage everyone to look at that and see if there are other use
>> cases they think would make more sense, or if there is clarification
>> or amendment of the current proposal. The discussion around this is
>
> I have a small set of slides on a use case related to camera on TI OMAP
> 3. The slides are attached.
>
> The Samsung example also looks very good to me.
>
> Kind regards,
>
> --
> Sakari Ailus
> sakari.ailus(a)maxwell.research.nokia.com
>
Hi all,
Especially for those participating remotely, it looks like we will be
starting a little late today (at 1500 CEST). The summit scheduler
wouldn't allow me to mark the mini-summit sessions as actually
starting during the plenary sessions, so there was some confusion
around the start time. I would still like to begin Tuesday and
Wednesday's sessions at 1400 CEST as originally proposed.
cheers,
Jesse
Both, unfortunately. I've sent it to Jesse since he might be able to get
it through the list size filter.
/Thomas
On 05/09/2011 02:09 PM, Clark, Rob wrote:
> hmm, somehow attachment is missing or got stripped?
>
> On Mon, May 9, 2011 at 6:45 AM, Thomas Hellstrom<thellstrom(a)vmware.com> wrote:
>
>> Hi!
>>
>> Better late than never - A pdf version of the TTM presentation
>>
>> Thanks,
>> Thomas
>>
>>
>> _______________________________________________
>> Linaro-mm-sig mailing list
>> Linaro-mm-sig(a)lists.linaro.org
>> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig
>>
>>
Hi all,
The following are Samsung S.LSI's requirements for a unified memory manager.
1. User space API
1.1. New memory management (MM) features should include the following
in user space: UMP
A. user space API for memory allocation from system memory: UMP
Any user process can allocate memory from the kernel via the new MM model.
B. user space API for cache operations: flush, clean, invalidate
Any user process can perform cache operations on the allocated memory.
C. user space API for mapping memory as cacheable
When system memory is mapped into user space,
a user process can set its property as cacheable.
D. user space API for mapping memory as non-cacheable
When system memory is mapped into user space,
a user process can set its property as non-cacheable.
1.2. Inter-process memory sharing: UMP
New MM features should provide memory sharing between user processes.
A. Memory allocated by user space can be shared between user processes.
B. Memory allocated by kernel space can be shared between user processes.
2. Kernel space API
New MM features should include the following in kernel space: CMA, VCMM
2-1. Physical memory allocator
A. kernel space API for contiguous memory allocation: CMA(*)
B. kernel space API for non-contiguous memory allocation: VCMM (*)
C. start address alignment: CMA, VCMM
D. selectable allocating region: CMA
* refer to the extensions at the bottom.
2-2. Device virtual address management: VCMM
New MM features should provide
a way of managing device virtual addresses, as follows:
A. IOMMU (System MMU) support
An IOMMU is a kind of MMU, but an IOMMU is dedicated to each device.
B. device virtual address mapping for each device
C. virtual memory allocation
D. mapping / remapping between phys and device virtual address
E. dedicated device virtual address space for each device
F. address translation between address space
      U.V
     /   \
  K.V --- P.A
     \   /
      D.V

U.V: User space virtual address
K.V: Kernel space virtual address
P.A: Physical address
D.V: Device virtual address
3. Extensions
A. extension for custom physical memory allocator
B. extension for custom MMU controller
-------------------------------------------------------------------------
You can find the implementation in the following git repository.
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=tree;hb=refs/heads/2.6.36-samsung
1. UMP (Unified Memory Provider)
- The UMP is an auxiliary component which enables memory to be shared
across different applications, drivers and hardware components.
- http://blogs.arm.com/multimedia/249-making-the-mali-gpu-device-driver-open-source/page__cid__133__show__newcomment/
- Suggested by ARM, not submitted yet.
- implementation
drivers/media/video/samsung/ump/*
2. VCMM (Virtual Contiguous Memory Manager)
- The VCMM is a framework to deal with multiple IOMMUs in a system
with intuitive and abstract objects
- Submitted by Michal Nazarewicz @Samsung-SPRC
- Also submitted by KyongHo Cho @Samsung-SYS.LSI
- http://article.gmane.org/gmane.linux.kernel.mm/56912/match=vcm
- implementation
include/linux/vcm.h
include/linux/vcm-drv.h
mm/vcm.c
arch/arm/plat-s5p/s5p-vcm.c
arch/arm/plat-s5p/include/plat/s5p-vcm.h
3. CMA (Contiguous Memory Allocator)
- The Contiguous Memory Allocator (CMA) is a framework, which allows
setting up a machine-specific configuration for physically-contiguous
memory management. Memory for devices is then allocated according
to that configuration.
- http://lwn.net/Articles/396702/
- http://www.spinics.net/lists/linux-media/msg26486.html
- Submitted by Michal Nazarewicz @Samsung-SPRC
- implementation
mm/cma.c
include/linux/cma.h
4. SYS.MMU
- The System MMU supports address translation from VA to PA.
- http://thread.gmane.org/gmane.linux.kernel.samsung-soc/3909
- Submitted by Sangbeom Kim
- Merged by Kukjin Kim, ARM/S5P ARM ARCHITECTURES maintainer
- implementation
arch/arm/plat-s5p/sysmmu.c
arch/arm/plat-s5p/include/plat/sysmmu.h
I think the recent discussions on linaro-mm-sig and the BoF last week
at ELC have been quite productive, and at least my understanding
of the missing pieces has improved quite a bit. This is a list of
things that I think need to be done in the kernel. Please complain
if any of these still seem controversial:
1. Fix the arm version of dma_alloc_coherent. It's in use today and
is broken on modern CPUs because it results in both cached and
uncached mappings. Rebecca suggested different approaches how to
get there.
2. Implement dma_alloc_noncoherent on ARM. Marek pointed out
that this is needed, and it currently is not implemented, with
an outdated comment explaining why it used to not be possible
to do it.
3. Convert ARM to use asm-generic/dma-mapping-common.h. We need
both IOMMU and direct mapped DMA on some machines.
4. Implement an architecture independent version of dma_map_ops
based on the iommu.h API. As Joerg mentioned, this has been
missing for some time, and it would be better to do it once
than for each IOMMU separately. This is probably a lot of work.
5. Find a way to define per-device IOMMUs, if that is not actually
possible already. We had conflicting statements for this.
6. Implement iommu_ops for each of the ARM platforms that has
an IOMMU. Needs some modifications for MSM and a rewrite for
OMAP. Implementation for Samsung is under work.
7. Extend the dma_map_ops to have a way for mapping a buffer
from dma_alloc_{non,}coherent into user space. We have not
discussed that yet, but after thinking this for some time, I
believe this would be the right approach to map buffers into
user space from code that doesn't care about the underlying
hardware.
After all these are in place, building anything on top of
dma_alloc_{non,}coherent should be much easier. The question
of passing buffers between V4L and DRM is still completely
unsolved as far as I can tell, but that discussion might become
more focused if we can agree on the above points and assume
that it will be done.
I expect that I will have to update the list above as people
point out mistakes in my assumptions.
Arnd
On Wednesday 27 April 2011 23:31:06 Benjamin Herrenschmidt wrote:
> On Thu, 2011-04-21 at 21:29 +0200, Arnd Bergmann wrote:
> > 7. Extend the dma_map_ops to have a way for mapping a buffer
> >
> > from dma_alloc_{non,}coherent into user space. We have not
> > discussed that yet, but after thinking this for some time, I
> > believe this would be the right approach to map buffers into
> > user space from code that doesn't care about the underlying
> > hardware.
>
> Yes. There is a dma_mmap_coherent() call that's not part of the "Real"
> API but is implemented by some archs and used by Alsa (I added support
> for it on powerpc recently).
>
> Maybe that should go into the dma ops.
>
> The question remains, if we ever want to do more complex demand-paged
> operations, should we also expose a lower level set of functions to get
> struct page out of a dma_alloc_coherent() allocation and to get the
> pgprot for the user dma mapping ?
>
> > After all these are in place, building anything on top of
> > dma_alloc_{non,}coherent should be much easier. The question
> > of passing buffers between V4L and DRM is still completely
> > unsolved as far as I can tell, but that discussion might become
> > more focused if we can agree on the above points and assume
> > that it will be done.
>
> My gut feeling is that it should be done by having V4L use DRM buffers
> in the first place...
V4L2 needs to capture data to a wide variety of memory locations (system
memory when you just want to process the data using the CPU, frame buffer
memory, contiguous buffers that will be passed to a DSP for video encoding,
GPU textures, ...). To allow those use cases (and more) the V4L2 API provides
two ways to handle data buffers: applications can request the driver to
allocate memory internally, and then mmap() the buffers to userspace, or pass
arbitrary memory pointers (and sizes) to the driver to be used as video
capture buffers.
In the first case drivers will allocate memory depending on their
requirements. This could mean vmalloc_user() for USB drivers that use memcpy()
in the kernel, vmalloc() for hardware that supports SG-lists or IOMMUs,
dma_alloc_coherent() for drivers that need contiguous memory, ...
In the second case the driver will verify that the memory it receives from the
application matches its requirements (regarding contiguousness for instance)
and will then use that memory.
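To illustrate the second case from the application side, here is a hedged
sketch (the helpers are hypothetical; actual alignment and contiguity rules
are hardware- and driver-specific) of preparing a user-allocated buffer that a
driver could accept as a user pointer:

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helpers for the "application provides the memory" case.
 * Many drivers require at least page alignment for user pointers;
 * contiguity requirements beyond that are checked by the driver itself. */
static int is_page_aligned(const void *p)
{
	long page = sysconf(_SC_PAGESIZE);

	return ((uintptr_t)p % (uintptr_t)page) == 0;
}

static void *alloc_userptr_buffer(size_t size)
{
	void *buf = NULL;

	/* posix_memalign() guarantees the requested alignment */
	if (posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), size) != 0)
		return NULL;
	return buf;
}
```

The resulting pointer and size would then be passed to the driver (today via
the buffer's userptr/length fields; with the proposed extension, via an opaque
handle instead), and the driver verifies it against its own requirements.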
I think we could target the second case only if we want to share buffers
between V4L and DRM, as V4L2 (unlike DRM) is already pretty good at using
buffers it didn't allocate itself. The current API will need to be extended to
pass an opaque buffer handle instead of a memory address, as we want to avoid
requiring a userspace mapping for the buffer when not necessary. That's the
whole point of the initial memory management discussion.
We will of course need to make sure that the DRM buffers fulfil the V4L2
needs.
> > I expect that I will have to update the list above as people
> > point out mistakes in my assumptions.
--
Regards,
Laurent Pinchart