This is to let you know that the migration of lists.linaro.org has been
successfully completed.
As per the email I sent on Wednesday, it may take some time for the new
address of the server to be seen by your computer. You can check this by
trying to connect to the web site:
http://lists.linaro.org/
If you are able to connect and you do not get an error, this means you are
connecting to the new server and you can send email to the lists.
If you experience any problems after the weekend and you find that you
still cannot connect to the server, please reply to this email to let us
know.
Regards
Philip
IT Services Manager
Linaro
Hello
You are receiving this email because you are subscribed to one or more
mailing lists provided by the lists.linaro.org server.
IT Services are announcing planned maintenance for this server scheduled
for *Friday 15th March 2013, starting at 2pm GMT*. The purpose of the work
is to move the service to another server. There will be some disruption
during this maintenance.
In order to ensure that you do not accidentally try to use the service
while it is being moved, the current server will be shut down at 2pm.
A further email will be sent on Friday afternoon to confirm that the
migration of the service is completed. However, due to the way servers are
found, it may take a while before your computer is able to connect to the
relocated service.
After the old server has been shut down, email sent to any of the lists
will be queued, but it is possible that the sending server will still
try to deliver the email to the old server rather than the new one once
the new server has been started.
It is therefore *strongly* recommended that you do not send any email to an
@lists.linaro.org email address until you can connect to the new service,
which you will be able to test by trying to use a web browser to connect to
http://lists.linaro.org after you receive the email confirming that the
migration has been completed. Since the old service will be shut down, if
you are able to connect, you can be sure you have connected to the new
service.
If by Monday you are still unable to connect to the service or you are not
able to send email to an @lists.linaro.org email address, please send an
email to its(a)linaro.org.
Thank you.
Regards
Philip
IT Services Manager
Linaro
Atomic pool should always be allocated from the DMA zone if such a zone is
available in the system, to avoid issues caused by the limited DMA mask of
any of the devices used for making an atomic allocation.
Reported-by: Krzysztof Halasa <khc(a)pm.waw.pl>
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
---
arch/arm/mm/dma-mapping.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index c7e3759..e9db6b4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -342,6 +342,7 @@ static int __init atomic_pool_init(void)
{
struct dma_pool *pool = &atomic_pool;
pgprot_t prot = pgprot_dmacoherent(pgprot_kernel);
+ gfp_t gfp = GFP_KERNEL | GFP_DMA;
unsigned long nr_pages = pool->size >> PAGE_SHIFT;
unsigned long *bitmap;
struct page *page;
@@ -361,8 +362,8 @@ static int __init atomic_pool_init(void)
ptr = __alloc_from_contiguous(NULL, pool->size, prot, &page,
atomic_pool_init);
else
- ptr = __alloc_remap_buffer(NULL, pool->size, GFP_KERNEL, prot,
- &page, atomic_pool_init);
+ ptr = __alloc_remap_buffer(NULL, pool->size, gfp, prot, &page,
+ atomic_pool_init);
if (ptr) {
int i;
--
1.7.9.5
As proposed yesterday, here's the Android sync driver patches for
staging.
I've preserved the commit history, but moved all the changes over
to be against the staging directory (instead of drivers/base).
The goal of submitting this driver to staging is to try to get
more collaboration, as there are some similar efforts going on
in the community with dmabuf-fences. My email from yesterday with
more details for how I hope this goes is here:
http://comments.gmane.org/gmane.linux.kernel/1448420
Erik also provided a nice background on the patch set in his
reply yesterday, which I'll quote here:
"In Honeycomb where we introduced the Hardware Composer HAL. This is a
userspace layer that allows composition acceleration on a per platform
basis. Different SoC vendors have implemented this using overlays, 2d
blitters, a combinations of both, or other clever/disgusting means.
Along with the HWC we consolidated a lot of our camera and media
pipeline to allow their input to be fed into the GPU or
display(overlay.) In order to exploit parallelism in the graphics
pipeline, this introduced lots of implicit synchronization
dependencies. After a couple years of working with many different SoC
vendors, we found that it was really difficult to communicate our
system's expectations of the implicit contract and it was difficult
for the SoC vendors to properly implement the implicit contract in
each of their IP blocks (display, gpu, camera, video codecs). It was
also incredibly difficult to debug when problems/deadlocks arose.
In an effort to clean up the situation we decided to create a set of
simple synchronization primitives and have our compositor
(SurfaceFlinger) manage the synchronization contract explicitly. We
designed these primitives so that they can be passed across processes
(much like ion/dma_buf handles), can be backed by hardware
synchronization primitives, and can be combined with other sync
dependencies in a heterogeneous manner. We also added enough
debugging information to make pinpointing a synchronization deadlock
bug easier. There are also OpenGL extensions added (which I believe
have been ratified by Khronos) to convert a "native" sync object to a
gl fence object and vice versa.
So far we have shipped this system on two products (the Nexus 10 and 4) with
two different SoCs (Samsung Exynos5250 and Qualcomm MSM8064.) On these
two projects it was much easier to work out the kinks in the
graphics/compositing pipelines. In addition we were able to use the
telemetry and tracing features to track down the causes of dropped
frames aka "jank."
As for the implementation, I started with having the main driver op
primitive be a wait() op. I quickly noticed that most of the tricky
race condition prone code was ending up in the driver's wait() op. It
also made handling asynchronous waits of more than one type of sync_pt
difficult to manage. In the end I opted for something roughly like
poll() where all the heavy lifting is done at the high level and the
drivers only need to implement a simple check function."
Anyway, let me know what you think of the patches, and hopefully
this is something that could be considered for staging for 3.10
thanks
-john
Cc: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Cc: Erik Gilling <konkers(a)android.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Rob Clark <robclark(a)gmail.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Greg KH <gregkh(a)linuxfoundation.org>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: Android Kernel Team <kernel-team(a)android.com>
Erik Gilling (26):
staging: sync: Add synchronization framework
staging: sw_sync: Add cpu based sync driver
staging: sync: Add timestamps to sync_pts
staging: sync: Add debugfs support
staging: sw_sync: Add debug support
staging: sync: Add ioctl to get fence data
staging: sw_sync: Add fill_driver_data support
staging: sync: Add poll support
staging: sync: Allow async waits to be canceled
staging: sync: Export sync API symbols
staging: sw_sync: Export sw_sync API
staging: sync: Reorder sync_fence_release
staging: sync: Optimize fence merges
staging: sync: Add internal refcounting to fences
staging: sync: Add reference counting to timelines
staging: sync: Change wait timeout to mirror poll semantics
staging: sync: Dump sync state to console on timeout
staging: sync: Improve timeout dump messages
staging: sync: Dump sync state on fence errors
staging: sync: Protect unlocked access to fence status
staging: sync: Update new fence status with sync_fence_signal_pt
staging: sync: Use proper barriers when waiting indefinitely
staging: sync: Refactor sync debug printing
staging: sw_sync: Convert to use new value_str debug ops
staging: sync: Add tracepoint support
staging: sync: Don't log wait timeouts when timeout = 0
Jamie Gennis (1):
staging: sync: Fix timeout = 0 wait behavior
Rebecca Schultz Zavin (2):
staging: sync: Fix error paths
staging: sw_sync: Fix error paths
Ørjan Eide (1):
staging: sync: Fix race condition between merge and signal
drivers/staging/android/Kconfig | 27 +
drivers/staging/android/Makefile | 2 +
drivers/staging/android/sw_sync.c | 263 +++++++++
drivers/staging/android/sw_sync.h | 58 ++
drivers/staging/android/sync.c | 1016 ++++++++++++++++++++++++++++++++++
drivers/staging/android/sync.h | 426 ++++++++++++++
drivers/staging/android/trace/sync.h | 82 +++
7 files changed, 1874 insertions(+)
create mode 100644 drivers/staging/android/sw_sync.c
create mode 100644 drivers/staging/android/sw_sync.h
create mode 100644 drivers/staging/android/sync.c
create mode 100644 drivers/staging/android/sync.h
create mode 100644 drivers/staging/android/trace/sync.h
--
1.7.10.4
Hi everybody,
Here's a summary of the CDF BoF that took place at the ELC 2013.
I'd like to start by thanking all the participants who provided valuable
feedback (and those who didn't, but who now know a bit more about CDF and
will, I have no doubt about that, contribute in the future :-)). Thank you
also to Linus Walleij and Jesse Barker for taking notes during the meeting
while I was presenting. And obviously, thank you to Jesse Barker for
organizing the BoF.
I've tried to be as accurate as possible in this summary, but I might have
made mistakes. If you have attended the meeting, please point out any issue,
inconsistency, or just points I might have forgotten.
----
As not all attendees were familiar with CDF, I started by briefly introducing
the problems that prompted me to start working on CDF.
CDF started as GPF, the Generic Panel Framework. While working on DT support
for a display controller driver I realized that panel control code was located
in board file. Moving the code somewhere in drivers/ was thus a prerequisite,
but it turned out that no framework existed in the kernel to support that
tasks. Several major display controller drivers (TI DSS and Samsung Exynos to
name a few) had a platform-specific panel driver framework, but the resulting
panel drivers wouldn't be reusable across different display controllers. A
need for a new framework became pretty evident to me.
After drafting an initial proposal and discussing it with several people
online and offline (in Helsinki with Tomi Valkeinen from TI, in Copenhagen at
Linaro Connect with Marcus Lorentzon from ST-Ericsson, and in Brussels during
a BoF at the FOSDEM) the need to support encoders in addition to panels
quickly arose, and GPF turned into CDF.
I then continued with an overview of the latest CDF code and its key concepts.
While I was expecting this to be a short overview followed by more in-depth
discussions, it turned out to support our discussions for the whole two-hour
meeting.
The latest available version at the time of the BoF (posted to the linaro-mm-
sig mailing list in reply to the BoF's announcement) was the "non-quite-v3"
version. It incorporated feedback received on v2 but hadn't been properly
tested yet.
The basic CDF building block is called a display entity, modeled as an
instance of struct display_entity. Each entity has sink ports through which it
receives video data and/or source ports through which it transmits video data.
Entities are chained via their ports to create a display pipeline.
From the outside world entities are interfaced through two sets of abstract
operations they must provide:
- Control operations are called from "upper layers" (usually to implement
userspace requests) to get and set entity parameters (such as the physical
size, video modes, operation states, bus parameters, ...). Those operations
are implemented at the entity level.
Google asked how partial updates were handled; I answered that they're not
handled yet (this is a key concept behind the CDF RFCs: while I try to make sure all
devices can be supported, I first concentrate on hardware features required
for the devices I work on). Linus Walleij mentioned he thought that partial
updates were becoming out of fashion, but larger display sizes might keep them
useful in the future.
- Video operations control video streams. They're implemented by entities on
their source ports, and are called in the upstream (from a video pipeline
point of view) direction. A panel will call video operations of the entity it
gets its video stream from (this could be an HDMI transmitter, the display
controller directly, ...) to control the video stream it receives.
Video operations are split in a set of common operations and sets of display
bus specific operations (for DPI, DBI, DSI, ...). Some discussion around ops
that might be needed in some cases but not others indicated that the ops
structures are not quite finished for all bus types (and/or that some ops
might be considered for "promotion" to common). In particular the current DSI
implementation is copied from a proposal posted by Tomasz Figa from Samsung.
As I have no DSI hardware to test it on I have kept it as-is.
Jesse Barker pointed out that to make this fly we will need to get CDF into a
number of implementations, in particular the Samsung Exynos SoCs (needing
DSI). Several efforts are ongoing:
- Marcus Lorentzon (ST Ericsson, Linaro) is working on porting ST Ericsson
code to CDF, and in particular on the DSI interface.
- Tomasz Figa (Samsung) has worked on porting the Exynos display controller
driver to CDF and provided a DSI implementation.
- Tomi Valkeinen (TI) is working on porting the TI DSS driver to CDF (or
rather his own version of CDF as a first step, to avoid depending on an ever-
moving target right now) independently from Linaro.
- Alison Chaiken (Mentor Embedded Software) mentioned that Pengutronix is
working on panels support for the Freescale i.MX family.
- Linaro can probably also help extending the test coverage to various
platforms from its member companies.
- Finally, I'm working on CDF support for two display controllers found in
Renesas SoCs. One of them supports DBI and DPI, the other supports DPI only.
However, I can't easily test DBI support, as I don't have access to the
necessary hardware.
I explained at that point that there is currently no clear agreement on a bus
and operations model. The initial CDF proposal created Linux busses for DBI
and DSI (similar to the I2C and SPI busses), with access to the control bus
implemented through those Linux busses, and access to the video bus
implemented through video operations on display entities. Tomi Valkeinen then
advocated for getting rid of the DBI and DSI Linux busses and implementing
access to both control and video through the display entity operations, while
Marcus Lorentzon wanted to implement all those operations at the Linux bus
level instead. The best way to arbitrate this will probably be to work on
several implementations and find out which one works better.
SONY Mobile currently supports DSI auto-probing, with plug-n-play detection of
DSI panels. The panel ID is first retrieved, and the correct panel driver is
then loaded. We will likely need to support a similar model. Another option
would be to write a single panel-dcs driver to support all DSI panels that
conform with the DSI and DCS standards (although we will very likely need
panel-specific quirks in that case). The two options could also coexist.
We then moved to how display entities should be handled by KMS drivers and
mapped to KMS objects. The KMS model hardcodes the following fixed pipeline:
CRTC -> encoder -> connector
The CRTC is controlled by the display controller driver, and panels can be
mapped to KMS connector objects. What goes in-between is more of a gray area,
as hardware pipelines can have several encoders chained together.
I've presented one possible control flow that could solve the problem by
grouping multiple objects into an abstract entity. The right-most entity would
be a standalone entity, and every encoder but the left-most one in the chain
would hide the entities connected at their output. This results in a "russian
dolls" model, where encoders forward control operations to the entities they
embed, and forward video operations to the entity at their sink side.
This can quickly become very complex, especially when locking and reference
counting are added to the model. Furthermore, this solution could only handle
linear pipelines, which will likely become a severe limitation in the future,
especially on embedded devices (for instance splitting a video stream between
two panels at the encoder level is a common use case, or driving a two-input
panel from two CRTCs).
Google asked whether this model tries to address both panels and
VGA(/HDMI/...) outputs. From what I've seen so far the only limits come from
the hardware engineers (often^H^H^H^H^Hsometimes troubled) minds, all kinds of
data streams may appear in practice. As most systems will have one CRTC, one
encoder and one panel (or connector), we should probably try to keep the model
simple to start with, with 1:1 mappings between the KMS CRTC/encoder/connector
model and the CDF model. If we try to solve every possible problem right now
the complexity will explode and we won't be able to handle it. Getting a
simple solution upstream now and refactoring it later (there is no userspace
API involved, so no backward compatibility issue) might be the right answer. I
have no strong feeling about it, but I certainly want something I can get
upstream in a reasonable time frame.
Keith Packard bluntly (and totally rightfully) asked whether CDF is not just
duplicating part of the KMS API, and whether we shouldn't instead extend the
in-kernel KMS model to handle multiple encoders.
One reason that drove the creation of CDF outside of KMS was to support
sharing a single driver between multiple subsystems. For instance an HDMI
encoder could be connected to the output of a display controller handled by a
KMS driver on one board, and to the output of a video processor handled by a
V4L2 driver on another board. A panel could also be connected to a display
controller handled by a KMS driver on one board, and to a display controller
handled by an FBDEV driver on another board. Having a single driver for those
encoders or panels is one of the goals of CDF.
After publishing the first CDF RFC I realized there was a global consensus in
the kernel display community to deprecate FBDEV at some point. Sharing panel
drivers between KMS and FBDEV then became a "nice to have, but not important"
feature. As V4L2 doesn't handle panels (and shouldn't be extended to do so)
only encoder drivers would need to be shared, between KMS and V4L2.
It's important to note here that we don't need to share a given encoder
between two subsystems at runtime. On a given board the encoder will need to
be controlled by KMS or V4L2, but never both at the same time. In the CDF
context driver sharing refers to the ability to control a given driver from
either the KMS or V4L2 subsystem.
The discussion then moved to why V4L2 drivers for devices connected to an
encoder couldn't be moved to KMS. All display devices should be handled by
KMS, but we still have use cases where V4L2 needs to handle video outputs. For
instance a system with the following pipeline
HDMI con. -> HDMI RX -> Processing Engine -> HDMI TX -> HDMI con.
doesn't involve memory buffers in the processing pipeline. This can't be
handled by KMS, as KMS cannot represent a video pipeline without memory
in-between the receiving side and the display side. Hans Verkuil also mentioned
that for certain applications one prefers to center the API around frames, and
that V4L2 is ideal for instance for video conferencing/telephony.
Keith Packard thought we should just extend KMS to handle the V4L2 use cases.
V4L2 would then (somehow) plug its infrastructure into KMS. This topic has
already been discussed in the past, and I agree that extending the KMS model
to support "live sources" for CRTCs will be needed in the near future. This
could be the basis of other KMS enhancements to support more complex
pipelines. Making KMS and V4L2 cooperate is also desirable on the display side
to write the output of the CRTC back to memory. KMS has no write-back feature
in the API, V4L2 could come to the rescue there.
With this kind of extension it might be possible to handle the display part of
memory-less pipelines in KMS, although that might be quite a challenge. There
was no clear consensus on whether this was desirable.
Furthermore, only two HDMI encoders currently need to be shared (both are only
supported out-of-tree at the moment). As we don't expect more than a handful
of such use cases in the near future, it might not be worth the hassle to
create a complete infrastructure to handle a use case that might disappear if
we later move all the display-side drivers to KMS.
Another solution mentioned by Hans Verkuil would be to create helper functions
to translate V4L2 calls to KMS calls (to be clear, this only covers in-kernel
calls to encoders).
There was no clear consensus on this topic.
We then moved on to the hot-plug (and hot-unplug) issues following a question
from Google. Hot-plug is currently not supported. We would need to add
hot-plugging notifiers and possibly a couple of other operations. However, the
video common operations structure has bind/unbind operations, which can serve
as a basis.
The hard part in hot-plugging support is actually hot-unplugging, as we need
to ensure that devices don't disappear all of a sudden while still in use.
This was a design goal of CDF from the start, and any issue there will need to
be resolved. Panels shouldn't be handled differently than HDMI connectors, CDF
will provide a common hot-plugging model.
Keith Packard then explained that DRM and KMS will likely be split in the
future. The main link between the DRM and KMS APIs is GEM objects. With the
recent addition of dmabuf to the Linux kernel the DRM and KMS APIs could be
split and use dmabuf to share buffers. DRM and KMS would then be exposed on
two separate device nodes. It would be a good idea to revisit the whole
KMS/V4L2 unification discussion when DRM and KMS will be split.
We briefly touched the subject of namespaces, and whether CDF should use the
KMS namespace (drm_*). There is some resistance on the V4L2 side on having CDF
structures be KMS objects.
It was then time to wrap up the meeting, and I asked the audience one final
question: should we shoehorn complex pipelines into the KMS three-stages
model, or should we extend the KMS model? That was unfortunately answered by
silence, showing that more thinking is needed.
A couple more minutes of offline discussions briefly touched the topics of GPU
driver reverse engineering and whether we could, after the KMS/DRM split, set
a kernel-side standard for embedded GPU drivers. As interesting as this topic
is, CDF will not solve that problem :-)
--
Regards,
Laurent Pinchart
I'd like to get a discussion going about submitting the Android sync
driver to staging.
I know there is currently some very similar work going on with the
dmabuf-fences, and rather than both approaches being worked out
individually on their own, I suspect there could be better collaboration
around this effort.
So my proposal is that we merge the Android sync driver into staging.
In my mind, this has the following benefits:
1) It allows other drivers that depend on the sync interface to also be
submitted to staging, rather than forcing those drivers to be hidden
away in various out of tree git repos, location unknown.
2) It would provide a baseline view to the upstream community of the
interface Android is using, providing a real-world, active use case of
the functionality.
Once the sync driver is in staging, if the dmabuf-fences work is fully
sufficient to replace the Android sync driver, we should be able to
whittle down the sync driver until it's just an interface shim (and at
which point efforts can be made to convert Android userland over to
dmabuf-fences).
However, if the dmabuf-fences work is not fully sufficient to replace
the android sync driver, we should be able to at least whittle down
the driver to those specific differences, which would provide a concrete
example of where the dmabuf-fences, or other work may need to be
expanded, or if maybe the sync driver is the better approach.
I've gone through the Android tree and reworked the sync driver to live
in staging, while still preserving the full patch history/authorship.
You can checkout the reworked patch queue here:
http://git.linaro.org/gitweb?p=people/jstultz/android-dev.git;a=shortlog;h=…
If folks would take a look and let me know what they think of the
changes as well as what they think about pushing it to staging, or other
ideas for how to improve collaboration so we can have common interfaces
here, I'd appreciate it.
Also note: I've done this so far without any feedback from the Android
devs (despite my reaching out to Erik a few times recently), so if they
object to pushing it to staging, in deference to it being their code
I'll back off, even though I do think it would be good to have the code
get more visibility upstream in staging. I don't mean to step on
anyone's toes. :)
thanks
-john
Hi Linus,
Here's the 3.9 pull request for dma-buf framework updates: could you
please pull?
Thanks and best regards,
~Sumit.
The following changes since commit d895cb1af15c04c522a25c79cc429076987c089b:
Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (2013-02-26
20:16:07 -0800)
are available in the git repository at:
git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git
tags/tag-for-linus-3.9
for you to fetch changes up to 495c10cc1c0c359871d5bef32dd173252fc17995:
CHROMIUM: dma-buf: restore args on failure of dma_buf_mmap
(2013-02-27 15:14:02 +0530)
----------------------------------------------------------------
3.9: dma-buf updates
Refcounting implemented for vmap in core dma-buf
----------------------------------------------------------------
Daniel Vetter (1):
dma-buf: implement vmap refcounting in the interface logic
John Sheu (1):
CHROMIUM: dma-buf: restore args on failure of dma_buf_mmap
Documentation/dma-buf-sharing.txt | 6 +++-
drivers/base/dma-buf.c | 66 ++++++++++++++++++++++++++++++-------
include/linux/dma-buf.h | 4 ++-
3 files changed, 63 insertions(+), 13 deletions(-)
Add an iterator to walk through a scatter list a page at a time starting
at a specific page offset. As opposed to the mapping iterator this is
meant to be small, performing well even in simple loops like collecting
all pages on the scatterlist into an array or setting up an iommu table
based on the pages' DMA address.
v2:
- In each iteration sg_pgoffset pointed incorrectly at the next page, not
the current one.
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
include/linux/scatterlist.h | 50 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4bd6c06..72578b5 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -231,6 +231,56 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
*/
#define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist))
+struct sg_page_iter {
+ struct scatterlist *sg;
+ int sg_pgoffset;
+ struct page *page;
+};
+
+static inline int
+sg_page_cnt(struct scatterlist *sg)
+{
+ BUG_ON(sg->offset || sg->length & ~PAGE_MASK);
+
+ return sg->length >> PAGE_SHIFT;
+}
+
+static inline struct page *
+sg_page_iter_get_page(struct sg_page_iter *iter)
+{
+ while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) {
+ iter->sg_pgoffset -= sg_page_cnt(iter->sg);
+ iter->sg = sg_next(iter->sg);
+ }
+
+ return iter->sg ? nth_page(sg_page(iter->sg), iter->sg_pgoffset) : NULL;
+}
+
+static inline void
+sg_page_iter_next(struct sg_page_iter *iter)
+{
+ iter->sg_pgoffset++;
+ iter->page = sg_page_iter_get_page(iter);
+}
+
+static inline void
+sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist,
+ unsigned long pgoffset)
+{
+ iter->sg = sglist;
+ iter->sg_pgoffset = pgoffset;
+ iter->page = sg_page_iter_get_page(iter);
+}
+
+/*
+ * Simple sg page iterator, starting off at the given page offset. Each entry
+ * on the sglist must start at offset 0 and can contain only full pages.
+ * iter->page will point to the current page, iter->sg_pgoffset to the page
+ * offset within the sg holding that page.
+ */
+#define for_each_sg_page(sglist, iter, pgoffset) \
+ for (sg_page_iter_start((iter), (sglist), (pgoffset)); \
+ (iter)->page; sg_page_iter_next(iter))
/*
* Mapping sg iterator
--
1.7.9.5
Hi All,
The final spec has had enum values assigned and been published on Khronos:
http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_im…
Thanks to all who've provided input.
Cheers,
Tom
> -----Original Message-----
> From: mesa-dev-bounces+tom.cooksey=arm.com(a)lists.freedesktop.org [mailto:mesa-dev-
> bounces+tom.cooksey=arm.com(a)lists.freedesktop.org] On Behalf Of Tom Cooksey
> Sent: 04 October 2012 13:10
> To: mesa-dev(a)lists.freedesktop.org; linaro-mm-sig(a)lists.linaro.org; dri-
> devel(a)lists.freedesktop.org; linux-media(a)vger.kernel.org
> Subject: [Mesa-dev] [RFC] New dma_buf -> EGLImage EGL extension - New draft!
>
> Hi All,
>
> After receiving a fair bit of feedback (thanks!), I've updated the
> EGL_EXT_image_dma_buf_import spec
> and expanded it to resolve a number of the issues. Please find the latest draft below and let
> me
> know any additional feedback you might have, either on the lists or by private e-mail - I
> don't mind
> which.
>
> I think the only remaining issue now is if we need a mechanism whereby an application can
> query
> which drm_fourcc.h formats EGL supports or if just failing with EGL_BAD_MATCH when the
> application
> has used one EGL doesn't support is sufficient. Any thoughts?
>
>
> Cheers,
>
> Tom
>
>
> --------------------8<--------------------
>
>
> Name
>
> EXT_image_dma_buf_import
>
> Name Strings
>
> EGL_EXT_image_dma_buf_import
>
> Contributors
>
> Jesse Barker
> Rob Clark
> Tom Cooksey
>
> Contacts
>
> Jesse Barker (jesse 'dot' barker 'at' linaro 'dot' org)
> Tom Cooksey (tom 'dot' cooksey 'at' arm 'dot' com)
>
> Status
>
> DRAFT
>
> Version
>
> Version 4, October 04, 2012
>
> Number
>
> EGL Extension ???
>
> Dependencies
>
> EGL 1.2 is required.
>
> EGL_KHR_image_base is required.
>
> The EGL implementation must be running on a Linux kernel supporting the
> dma_buf buffer sharing mechanism.
>
> This extension is written against the wording of the EGL 1.2 Specification.
>
> Overview
>
> This extension allows creating an EGLImage from a Linux dma_buf file
> descriptor or multiple file descriptors in the case of multi-plane YUV
> images.
>
> New Types
>
> None
>
> New Procedures and Functions
>
> None
>
> New Tokens
>
> Accepted by the <target> parameter of eglCreateImageKHR:
>
> EGL_LINUX_DMA_BUF_EXT
>
> Accepted as an attribute in the <attrib_list> parameter of
> eglCreateImageKHR:
>
> EGL_LINUX_DRM_FOURCC_EXT
> EGL_DMA_BUF_PLANE0_FD_EXT
> EGL_DMA_BUF_PLANE0_OFFSET_EXT
> EGL_DMA_BUF_PLANE0_PITCH_EXT
> EGL_DMA_BUF_PLANE1_FD_EXT
> EGL_DMA_BUF_PLANE1_OFFSET_EXT
> EGL_DMA_BUF_PLANE1_PITCH_EXT
> EGL_DMA_BUF_PLANE2_FD_EXT
> EGL_DMA_BUF_PLANE2_OFFSET_EXT
> EGL_DMA_BUF_PLANE2_PITCH_EXT
> EGL_YUV_COLOR_SPACE_HINT_EXT
> EGL_SAMPLE_RANGE_HINT_EXT
> EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT
> EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT
>
> Accepted as the value for the EGL_YUV_COLOR_SPACE_HINT_EXT attribute:
>
> EGL_ITU_REC601_EXT
> EGL_ITU_REC709_EXT
> EGL_ITU_REC2020_EXT
>
> Accepted as the value for the EGL_SAMPLE_RANGE_HINT_EXT attribute:
>
> EGL_YUV_FULL_RANGE_EXT
> EGL_YUV_NARROW_RANGE_EXT
>
> Accepted as the value for the EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
> EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT attributes:
>
> EGL_YUV_CHROMA_SITING_0_EXT
> EGL_YUV_CHROMA_SITING_0_5_EXT
>
>
> Additions to Chapter 2 of the EGL 1.2 Specification (EGL Operation)
>
> Add to section 2.5.1 "EGLImage Specification" (as defined by the
> EGL_KHR_image_base specification), in the description of
> eglCreateImageKHR:
>
> "Values accepted for <target> are listed in Table aaa, below.
>
> +-------------------------+--------------------------------------------+
> | <target> | Notes |
> +-------------------------+--------------------------------------------+
> | EGL_LINUX_DMA_BUF_EXT | Used for EGLImages imported from Linux |
> | | dma_buf file descriptors |
> +-------------------------+--------------------------------------------+
> Table aaa. Legal values for eglCreateImageKHR <target> parameter
>
> ...
>
> If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display, <ctx>
> must be EGL_NO_CONTEXT, and <buffer> must be NULL, cast into the type
> EGLClientBuffer. The details of the image are specified by the attributes
> passed into eglCreateImageKHR. Required attributes and their values are as
> follows:
>
> * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in pixels.
>
> * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as specified
> by drm_fourcc.h and used as the pixel_format parameter of the
> drm_mode_fb_cmd2 ioctl.
>
> * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 of
> the image.
>
> * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the
> dma_buf of the first sample in plane 0, in bytes.
>
> * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the start of
> subsequent rows of samples in plane 0. May have special meaning for
> non-linear formats.
>
> For images in an RGB color-space or those using a single-plane YUV format,
> only the first plane's file descriptor, offset & pitch should be specified.
> For semi-planar YUV formats, the chroma samples are stored in plane 1 and
> for fully planar formats, U-samples are stored in plane 1 and V-samples are
> stored in plane 2. Planes 1 & 2 are specified by the following attributes,
> which have the same meanings as defined above for plane 0:
>
> * EGL_DMA_BUF_PLANE1_FD_EXT
> * EGL_DMA_BUF_PLANE1_OFFSET_EXT
> * EGL_DMA_BUF_PLANE1_PITCH_EXT
> * EGL_DMA_BUF_PLANE2_FD_EXT
> * EGL_DMA_BUF_PLANE2_OFFSET_EXT
> * EGL_DMA_BUF_PLANE2_PITCH_EXT
>
> In addition to the above required attributes, the application may also
> provide hints as to how the data should be interpreted by the GL. If any of
> these hints are not specified, the GL will guess based on the pixel format
> passed as the EGL_LINUX_DRM_FOURCC_EXT attribute or may fall back to some
> default value. Not all GLs will be able to support all combinations of
> these hints and are free to use whatever settings they choose to achieve
> the closest possible match.
>
> * EGL_YUV_COLOR_SPACE_HINT_EXT: The color-space the data is in. Only
> relevant for images in a YUV format, ignored when specified for an
> image in an RGB format. Accepted values are:
> EGL_ITU_REC601_EXT, EGL_ITU_REC709_EXT & EGL_ITU_REC2020_EXT.
>
> * EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
> EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT: Where chroma samples are
> sited relative to luma samples when the image is in a sub-sampled
> format. When the image is not using chroma sub-sampling, the luma and
> chroma samples are assumed to be co-sited. Siting is split into the
> vertical and horizontal and is in a fixed range. A siting of zero
> means the first luma sample is taken from the same position in that
> dimension as the chroma sample. This is best illustrated in the
> diagram below:
>
> (0.5, 0.5) (0.0, 0.5) (0.0, 0.0)
> + + + + + + + + * + * +
> x x x x
> + + + + + + + + + + + +
>
> + + + + + + + + * + * +
> x x x x
> + + + + + + + + + + + +
>
> Luma samples (+), Chroma samples (x), Chroma & Luma samples (*)
>
> Note this attribute is ignored for RGB images and non sub-sampled
> YUV images. Accepted values are: EGL_YUV_CHROMA_SITING_0_EXT (0.0)
> & EGL_YUV_CHROMA_SITING_0_5_EXT (0.5)
>
> * EGL_SAMPLE_RANGE_HINT_EXT: The numerical range of samples. Only
> relevant for images in a YUV format, ignored when specified for
> images in an RGB format. Accepted values are: EGL_YUV_FULL_RANGE_EXT
> (0-255) & EGL_YUV_NARROW_RANGE_EXT (16-235).
>
>
> If eglCreateImageKHR is successful for an EGL_LINUX_DMA_BUF_EXT target,
> EGL takes ownership of the file descriptor and is responsible for closing
> it, which it may do at any time while the EGLDisplay is initialized."
>
>
> Add to the list of error conditions for eglCreateImageKHR:
>
> "* If <target> is EGL_LINUX_DMA_BUF_EXT and <buffer> is not NULL, the
> error EGL_BAD_PARAMETER is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT, and the list of attributes is
> incomplete, EGL_BAD_PARAMETER is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
> attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
> is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
> attribute indicates a single-plane format, EGL_BAD_ATTRIBUTE is
> generated if any of the EGL_DMA_BUF_PLANE1_* or EGL_DMA_BUF_PLANE2_*
> attributes are specified.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
> EGL_YUV_COLOR_SPACE_HINT_EXT is not EGL_ITU_REC601_EXT,
> EGL_ITU_REC709_EXT or EGL_ITU_REC2020_EXT, EGL_BAD_ATTRIBUTE is
> generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
> EGL_SAMPLE_RANGE_HINT_EXT is not EGL_YUV_FULL_RANGE_EXT or
> EGL_YUV_NARROW_RANGE_EXT, EGL_BAD_ATTRIBUTE is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
> EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT or
> EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT is not
> EGL_YUV_CHROMA_SITING_0_EXT or EGL_YUV_CHROMA_SITING_0_5_EXT,
> EGL_BAD_ATTRIBUTE is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT and one or more of the values
> specified for a plane's pitch or offset isn't supported by EGL,
> EGL_BAD_ACCESS is generated.
>
> * If <target> is EGL_LINUX_DMA_BUF_EXT and eglCreateImageKHR fails,
> EGL does not retain ownership of the file descriptor and it is the
> responsibility of the application to close it."
>
>
> Issues
>
> 1. Should this be a KHR or EXT extension?
>
> ANSWER: EXT. The Khronos EGL working group is not keen on this extension, as
> it is seen as contradicting the EGLStream direction the specification is
> going in. The working group recommends creating additional specs to allow an
> EGLStream producer/consumer to be connected to v4l2/DRM or any other Linux
> interface.
>
> 2. Should this be a generic any platform extension, or a Linux-only
> extension which explicitly states the handles are dma_buf fds?
>
> ANSWER: There's currently no intention to port this extension to any OS not
> based on the Linux kernel. Consequently, this spec can be explicitly written
> against Linux and the dma_buf API.
>
> 3. Does ownership of the file descriptor pass to the EGL library?
>
> ANSWER: If eglCreateImageKHR is successful, EGL assumes ownership of the
> file descriptors and is responsible for closing them.
>
> 4. How are the different YUV color spaces handled (BT.709/BT.601)?
>
> ANSWER: The pixel formats defined in drm_fourcc.h only specify how the data
> is laid out in memory. They do not define how that data should be
> interpreted. A new EGL_YUV_COLOR_SPACE_HINT_EXT attribute has been added to
> allow the application to specify which color space the data is in, so that
> the GL can choose an appropriate set of coefficients if it needs to convert
> that data to RGB, for example.
>
> 5. What chroma-siting is used for sub-sampled YUV formats?
>
> ANSWER: The chroma siting is not specified by either the v4l2 or DRM APIs.
> This is similar to the color-space issue (4) in that the chroma siting
> doesn't affect how the data is stored in memory. However, the GL will need
> to know the siting in order to filter the image correctly. While the visual
> impact of getting the siting wrong is minor, provision should be made to
> allow an application to specify the siting if desired. Added additional
> EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
> EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT attributes to allow the siting to
> be specified using a set of pre-defined values (0 or 0.5).
>
> 6. How can an application query which formats the EGL implementation
> supports?
>
> PROPOSAL: Don't provide a query mechanism but instead add an error condition
> that EGL_BAD_MATCH is raised if the EGL implementation doesn't support that
> particular format.
>
> 7. Which image formats should be supported and how is format specified?
>
> There seem to be two options: 1) specify a new enum in this specification and
> enumerate all possible formats, or 2) use an existing enum already in Linux,
> either v4l2_mbus_pixelcode and/or the formats listed in drm_fourcc.h.
>
> ANSWER: Go for option 2) and just use values defined in drm_fourcc.h.
>
> 8. How can AYUV images be handled?
>
> ANSWER: At least on fourcc.org and in drm_fourcc.h, there only seems to be
> a single AYUV format, and that is a packed format, so everything, including
> the alpha component, would be in the first plane.
>
> 9. How can you import interlaced images?
>
> ANSWER: Interlaced frames are usually stored with the top & bottom fields
> interleaved in a single buffer. As the fields would need to be displayed
> at different times, the application would create two EGLImages from the same
> buffer, one for the top field and another for the bottom. Both EGLImages
> would set the pitch to 2x the buffer width and the second EGLImage would use
> a suitable offset to indicate it started on the second line of the buffer.
> This should work regardless of whether the data is packed in a single plane,
> semi-planar or multi-planar.
>
> If each interlaced field is stored in a separate buffer then it should be
> trivial to create two EGLImages, one for each field's buffer.
>
> 10. How are semi-planar/planar formats handled that have a different
> width/height for Y' and CbCr such as YUV420?
>
> ANSWER: The spec says EGL_WIDTH & EGL_HEIGHT specify the *logical* width and
> height of the buffer in pixels. For pixel formats with sub-sampled Chroma
> values, it should be trivial for the EGL implementation to calculate the
> width/height of the Chroma sample buffers using the logical width & height
> and by inspecting the pixel format passed as the EGL_LINUX_DRM_FOURCC_EXT
> attribute. I.e. if the pixel format says it's YUV420, the Chroma buffer's
> width = EGL_WIDTH/2 & height = EGL_HEIGHT/2.
>
> 11. How are Bayer formats handled?
>
> ANSWER: As of Linux 2.6.34, drm_fourcc.h does not include any Bayer formats.
> However, future kernel versions may add such formats in which case they
> would be handled in the same way as any other format.
>
> 12. Should the spec support buffers which have samples in a "narrow range"?
>
> Content sampled from older analogue sources typically doesn't use the full
> (0-255) range of the data type storing the sample and instead uses a narrow
> (16-235) range to allow some headroom & footroom in the signals to avoid
> clipping signals which overshoot slightly during processing. This is
> sometimes known as "studio swing".
>
> ANSWER: Add a new attribute to define whether the samples use a narrow
> 16-235 range or the full 0-255 range.
>
> 13. Specifying the color space and range seems cumbersome, why not just
> allow the application to specify the full YUV->RGB color conversion matrix?
>
> ANSWER: Some hardware may not be able to use an arbitrary conversion matrix
> and needs to select an appropriate pre-defined matrix based on the color
> space and the sample range.
>
> 14. How do you handle EGL implementations which have restrictions on pitch
> and/or offset?
>
> ANSWER: Buffers being imported using dma_buf pretty much have to be
> allocated by a kernel-space driver. As such, it is expected that a system
> integrator would ensure that all devices which allocate buffers suitable for
> exporting use a pitch supported by all possible importers. However, it is
> still possible for eglCreateImageKHR to fail due to an unsupported pitch,
> so a new error has been added to the list indicating this.
>
> 15. Should this specification also describe how to export an existing
> EGLImage as a dma_buf file descriptor?
>
> ANSWER: No. Importing and exporting buffers are two separate operations, and
> importing an existing dma_buf fd into an EGLImage is useful functionality in
> itself. We agree that exporting an EGLImage as a dma_buf fd is also useful;
> e.g. it could be used by an OpenMAX IL implementation's OMX_UseEGLImage
> function to give video hardware access to the buffer backing an EGLImage.
> However, exporting can be split into a separate extension specification.
>
>
> Revision History
>
> #4 (Tom Cooksey, October 04, 2012)
> - Fixed issue numbering!
> - Added issues 8 - 15.
> - Promoted proposal for Issue 3 to be the answer.
> - Added an additional attribute to allow an application to specify the color
> space as a hint which should address issue 4.
> - Added an additional attribute to allow an application to specify the chroma
> siting as a hint which should address issue 5.
> - Added an additional attribute to allow an application to specify the sample
> range as a hint which should address the new issue 12.
> - Added language to end of error section clarifying who owns the fd passed
> to eglCreateImageKHR if an error is generated.
>
> #3 (Tom Cooksey, August 16, 2012)
> - Changed name from EGL_EXT_image_external and re-written language to
> explicitly state this for use with Linux & dma_buf.
> - Added a list of issues, including some still open ones.
>
> #2 (Jesse Barker, May 30, 2012)
> - Revision to split eglCreateImageKHR functionality from export
> functionality.
> - Update definition of EGLNativeBufferType to be a struct containing a list
> of handles to support multi-buffer/multi-planar formats.
>
> #1 (Jesse Barker, March 20, 2012)
> - Initial draft.
>
>
>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev(a)lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Hi all,
Just a reminder of the BoF on Thursday at 4pm (pacific time) at ELC; in
particular, it looks like we'll be in the Mission room of the Parc 55
Wyndham. Remember that you don't need to be registered for the
conference to attend the BoF, so Bay Area folks are encouraged to join
in. Also, if you know of colleagues that are coming to ELC and might
not be on this list, please forward this along so they can attend.
Here's a rough agenda:
- Aims, goals, and non-goals.
- CDFv3 overview.
- Considerations for Android.
- Future direction.
In particular, I think it would be good to call out non-goals,
especially for the initial version of the framework so that we have a
basis that is genuinely useful to build upon.
For reference, the thread on Laurent's report from the FOSDEM BoF
http://lists.freedesktop.org/archives/dri-devel/2013-February/034576.html
cheers,
--
Jesse Barker
Principal Software Engineer
ARM
+1 (408) 576-1423
On 13 February 2013 15:10, Imre Deak <imre.deak(a)intel.com> wrote:
> For better code reuse, use the newly added page iterator to iterate
> through the pages. The offset and length within the page are still
> calculated by the mapping iterator, as is the actual mapping.
> Idea from Tejun Heo <tj(a)kernel.org>.
>
> Signed-off-by: Imre Deak <imre.deak(a)intel.com>
> ---
> include/linux/scatterlist.h | 6 +++---
> lib/scatterlist.c | 46 ++++++++++++++++++++-----------------------
> 2 files changed, 24 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index 788a853..a6cd692 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -295,9 +295,9 @@ struct sg_mapping_iter {
> size_t consumed; /* number of consumed bytes */
>
> /* these are internal states, keep away */
> - struct scatterlist *__sg; /* current entry */
> - unsigned int __nents; /* nr of remaining entries */
> - unsigned int __offset; /* offset within sg */
> + unsigned int __offset; /* offset within page */
> + struct sg_page_iter __piter; /* page iterator */
> + unsigned int __remaining; /* remaining bytes on page */
> unsigned int __flags;
> };
Hi,
FYI, in next-20130220 this appears to break the build of the dw_mmc driver:
drivers/mmc/host/dw_mmc.c In function 'dw_mci_read_data_pio':
drivers/mmc/host/dw_mmc.c +1457 : error: 'struct sg_mapping_iter' has
no member named '__sg'
drivers/mmc/host/dw_mmc.c In function 'dw_mci_write_data_pio':
drivers/mmc/host/dw_mmc.c +1512 : error: 'struct sg_mapping_iter' has
no member named '__sg'
Cheers
James
Add APIs to the ARM dma mapping code to map a scatterlist in an iommu domain
and get a dma address. Allocators outside the dma-mapping code, like ION,
could allocate pages, and devices would need to map them for the iommu and
obtain a linear dma address. The intention is to re-use the IOVA management
code and "mapping" across allocators.
Can the above requirement be met without adding new APIs?
The API available to map an sglist (arm_dma_map_sg) does not provide a linear
dma address.
Signed-off-by: Nishanth Peethambaran <nishanth(a)broadcom.com>
---
arch/arm/include/asm/dma-mapping.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/arm/include/asm/dma-mapping.h
b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..f9e2f6c 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -268,6 +268,16 @@ extern void arm_dma_sync_sg_for_device(struct
device *, struct scatterlist *, in
extern int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
struct dma_attrs *attrs);
+/*
+ * Map scatterlist pages for the device and return a dma address
+ */
+extern dma_addr_t arm_dma_map_sgtable(struct device *dev, struct sgtable *sgt,
+ enum dma_data_direction dir, struct dma_attrs *attrs);
+/*
+ * Unmap the dma address
+ */
+extern void arm_dma_unmap(struct device *, dma_addr_t iova, int size,
+ enum dma_data_direction dir, struct dma_attrs *attrs);
#endif /* __KERNEL__ */
#endif
--
1.7.9.5
This will allow me to call functions that have multiple arguments if the
fastpath fails. This is required to support ticket mutexes, because they need
to be able to pass an extra argument to the fail function.
Originally I duplicated the functions, adding
__mutex_fastpath_lock_retval_arg. This ended up being just a duplication of
the existing function, so a way to test whether the fastpath was taken ended
up being better.
This also cleaned up the reservation mutex patch somewhat, by making it
possible to call atomic_set instead of atomic_xchg, and making it easier to
detect if the wrong unlock function was previously used.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
---
arch/ia64/include/asm/mutex.h | 10 ++++------
arch/powerpc/include/asm/mutex.h | 10 ++++------
arch/sh/include/asm/mutex-llsc.h | 4 ++--
arch/x86/include/asm/mutex_32.h | 11 ++++-------
arch/x86/include/asm/mutex_64.h | 11 ++++-------
include/asm-generic/mutex-dec.h | 10 ++++------
include/asm-generic/mutex-null.h | 2 +-
include/asm-generic/mutex-xchg.h | 10 ++++------
kernel/mutex.c | 32 ++++++++++++++------------------
9 files changed, 41 insertions(+), 59 deletions(-)
diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
*/
static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
- return fail_fn(count);
+ return -1;
return 0;
}
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
*/
static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(__mutex_dec_return_lock(count) < 0))
- return fail_fn(count);
+ return -1;
return 0;
}
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
}
static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
{
int __done, __res;
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
: "t");
if (unlikely(!__done || __res != 0))
- __res = fail_fn(count);
+ __res = -1;
return __res;
}
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..b7f6b34 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do { \
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
> + * if the fastpath succeeds, or -1 otherwise.
*/
-static inline int __mutex_fastpath_lock_retval(atomic_t *count,
- int (*fail_fn)(atomic_t *))
+static inline int __mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(atomic_dec_return(count) < 0))
- return fail_fn(count);
+ return -1;
else
return 0;
}
diff --git a/arch/x86/include/asm/mutex_64.h b/arch/x86/include/asm/mutex_64.h
index 68a87b0..2c543ff 100644
--- a/arch/x86/include/asm/mutex_64.h
+++ b/arch/x86/include/asm/mutex_64.h
@@ -37,17 +37,14 @@ do { \
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
*/
-static inline int __mutex_fastpath_lock_retval(atomic_t *count,
- int (*fail_fn)(atomic_t *))
+static inline int __mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(atomic_dec_return(count) < 0))
- return fail_fn(count);
+ return -1;
else
return 0;
}
diff --git a/include/asm-generic/mutex-dec.h b/include/asm-generic/mutex-dec.h
index f104af7..d4f9fb4 100644
--- a/include/asm-generic/mutex-dec.h
+++ b/include/asm-generic/mutex-dec.h
@@ -28,17 +28,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
*/
static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(atomic_dec_return(count) < 0))
- return fail_fn(count);
+ return -1;
return 0;
}
diff --git a/include/asm-generic/mutex-null.h b/include/asm-generic/mutex-null.h
index e1bbbc7..efd6206 100644
--- a/include/asm-generic/mutex-null.h
+++ b/include/asm-generic/mutex-null.h
@@ -11,7 +11,7 @@
#define _ASM_GENERIC_MUTEX_NULL_H
#define __mutex_fastpath_lock(count, fail_fn) fail_fn(count)
-#define __mutex_fastpath_lock_retval(count, fail_fn) fail_fn(count)
> +#define __mutex_fastpath_lock_retval(count) (-1)
#define __mutex_fastpath_unlock(count, fail_fn) fail_fn(count)
#define __mutex_fastpath_trylock(count, fail_fn) fail_fn(count)
#define __mutex_slowpath_needs_to_unlock() 1
diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index c04e0db..f169ec0 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -39,18 +39,16 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
* __mutex_fastpath_lock_retval - try to take the lock by moving the count
* from 1 to a 0 value
* @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
*
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
*/
static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
{
if (unlikely(atomic_xchg(count, 0) != 1))
if (likely(atomic_xchg(count, -1) != 1))
- return fail_fn(count);
+ return -1;
return 0;
}
diff --git a/kernel/mutex.c b/kernel/mutex.c
index a307cc9..5ac4522 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -350,10 +350,10 @@ __mutex_unlock_slowpath(atomic_t *lock_count)
* mutex_lock_interruptible() and mutex_trylock().
*/
static noinline int __sched
-__mutex_lock_killable_slowpath(atomic_t *lock_count);
+__mutex_lock_killable_slowpath(struct mutex *lock);
static noinline int __sched
-__mutex_lock_interruptible_slowpath(atomic_t *lock_count);
+__mutex_lock_interruptible_slowpath(struct mutex *lock);
/**
* mutex_lock_interruptible - acquire the mutex, interruptible
@@ -371,12 +371,12 @@ int __sched mutex_lock_interruptible(struct mutex *lock)
int ret;
might_sleep();
- ret = __mutex_fastpath_lock_retval
- (&lock->count, __mutex_lock_interruptible_slowpath);
- if (!ret)
+ ret = __mutex_fastpath_lock_retval(&lock->count);
+ if (likely(!ret)) {
mutex_set_owner(lock);
-
- return ret;
+ return 0;
+ } else
+ return __mutex_lock_interruptible_slowpath(lock);
}
EXPORT_SYMBOL(mutex_lock_interruptible);
@@ -386,12 +386,12 @@ int __sched mutex_lock_killable(struct mutex *lock)
int ret;
might_sleep();
- ret = __mutex_fastpath_lock_retval
- (&lock->count, __mutex_lock_killable_slowpath);
- if (!ret)
+ ret = __mutex_fastpath_lock_retval(&lock->count);
+ if (likely(!ret)) {
mutex_set_owner(lock);
-
- return ret;
+ return 0;
+ } else
+ return __mutex_lock_killable_slowpath(lock);
}
EXPORT_SYMBOL(mutex_lock_killable);
@@ -404,18 +404,14 @@ __mutex_lock_slowpath(atomic_t *lock_count)
}
static noinline int __sched
-__mutex_lock_killable_slowpath(atomic_t *lock_count)
+__mutex_lock_killable_slowpath(struct mutex *lock)
{
- struct mutex *lock = container_of(lock_count, struct mutex, count);
-
return __mutex_lock_common(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
}
static noinline int __sched
-__mutex_lock_interruptible_slowpath(atomic_t *lock_count)
+__mutex_lock_interruptible_slowpath(struct mutex *lock)
{
- struct mutex *lock = container_of(lock_count, struct mutex, count);
-
return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
}
#endif
On Tue, Feb 5, 2013 at 2:27 PM, Laurent Pinchart
<laurent.pinchart(a)ideasonboard.com> wrote:
> Hello,
>
> We've hosted a CDF meeting at the FOSDEM on Sunday morning. Here's a summary
> of the discussions.
>
> I would like to start with a big thank to UrLab, the ULB university hacker
> space, for providing us with a meeting room.
>
> The meeting would of course not have been successful without the wide range of
> participants, so I also want to thank all the people who woke up on Sunday
> morning to attend the meeting :-)
>
> (The CC list is pretty long, please let me know - by private e-mail in order
> not to spam the list - if you would like not to receive future CDF-related e-
> mails directly)
>
> 0. Abbreviations
> ----------------
>
> DBI - Display Bus Interface, a parallel video control and data bus that
> transmits data using parallel data, read/write, chip select and address
> signals, similarly to 8051-style microcontroller parallel busses. This is a
> mixed video control and data bus.
>
> DPI - Display Pixel Interface, a parallel video data bus that transmits data
> using parallel data, h/v sync and clock signals. This is a video data bus
> only.
>
> DSI - Display Serial Interface, a serial video control and data bus that
> transmits data using one or more differential serial lines. This is a mixed
> video control and data bus.
>
> DT - Device Tree, a representation of a hardware system as a tree of physical
> devices with associated properties.
>
> SFI - Simple Firmware Interface, a lightweight method for firmware to export
> static tables to the operating system. Those tables can contain display device
> topology information.
>
> VBT - Video BIOS Table, a block of data residing in the video BIOS that can
> contain display device topology information.
>
> 1. Goals
> --------
>
> The meeting started with a brief discussion about the CDF goals.
>
> Tomi Valkeinen and Tomasz Figa have sent RFC patches to show their views of
> what CDF could/should be. Many others have provided very valuable feedback.
> Given the early development stage, propositions were sometimes contradictory
> and focused on different areas of interest. We have thus started the meeting
> with a discussion about what CDF should try to achieve, and what it shouldn't.
>
> CDF has two main purposes. The original goal was to support display panels in
> a platform- and subsystem-independent way. While mostly useful for embedded
> systems, the emergence of platforms such as Intel Medfield and ARM-based PCs
> that blend the embedded and PC worlds makes panel support useful for the PC
> world as well.
>
> The second purpose is to provide a cross-subsystem interface to support video
> encoders. The idea originally came from a generalisation of the original RFC
> that supported panels only. While encoder support is considered as lower
> priority than display panel support by developers focused on display
> controller drivers (Intel, Renesas, ST Ericsson, TI), companies that produce
> video encoders (Analog Devices, and likely others) don't share that point of
> view and would like to provide a single encoder driver that can be used in
> both KMS and V4L2 drivers.
>
> Both display panels and encoders are thus the target of a lot of attention,
> depending on the audience. As long as none of them is forgotten in CDF, the
> overall agreement was that focusing on panels first is acceptable. Care shall
> be taken in that case to avoid any architecture that would make encoder
> support difficult or impossible.
>
> 2. Subsystems
> -------------
>
> Display panels are used in conjunction with FBDEV and KMS drivers. There was,
> to the audience's knowledge, no V4L2 driver that needs to explicitly handle
> display panels. Even though at least one V4L2 output driver (omap_vout) can
> output video to a display panel, it does so in conjunction with the KMS and/or
> FBDEV APIs that handle panel configuration. Panels are thus not exposed to
> V4L2 drivers.
>
> Encoders, on the other hand, are widely used in the V4L2 subsystem. Many V4L2
> devices output video in either analog (Composite, S-Video, VGA) or digital
> (DVI, HDMI) form.
>
> Display panel drivers don't need to be shared with the V4L2 subsystem.
> Furthermore, as the general opinion during the meeting was that the FBDEV
> subsystem should be considered as legacy and deprecated in the future,
> restricting panel support to KMS hasn't been considered by anyone as an issue.
> KMS will thus be the main target of display panel support in CDF, and FBDEV
> will be supported if that doesn't bring any drawback from an architecture
> point of view.
>
> Encoder drivers need to be shared with the V4L2 subsystem. Similarly to panel
> drivers, excluding FBDEV support from CDF isn't considered as an issue.
>
> 3. KMS Extensions
> -----------------
>
> The usefulness of V4L2 for output devices was questioned, and the possibility
> of using KMS for complex video devices usually associated with V4L2 was
> raised. The TI DaVinci 8xxx family is an example of chips that could benefit
> from KMS support.
>
> The KMS API is lacking support for deep-pipelining ("framebuffers" that are
> sourced from a data stream instead of a memory buffer) today. Extending the
> KMS API with deep-pipelining support was considered as a sensible goal that
> would mostly require the creation of a new KMS source object. Exposing the
> topology of the whole device would then be handled by the Media Controller
> API.
>
> Given that no evidence of this KMS extension being ready in a reasonable time
> frame exists, sharing encoder drivers with the V4L2 subsystem hasn't been
> seriously questioned.
>
> 4. Discovery and Initialization
> -------------------------------
>
> As CDF will split support for complete display devices across different
> drivers, the question of physical devices discovery and initialization caused
> concern among the audience.
>
> Topology and connectivity information can come from a wide variety of sources.
> Embedded platforms typically provide that information in platform data
> supplied by board code or through the device tree. PC platforms usually store
> the information in the firmware exposed through ACPI, SFI, VBT or other
> interfaces. Pluggable devices (PCI being the most common case) can also store
> the information in on-board non-volatile memory or hardcode it in drivers.
>
> When using the device tree, display entity information is bundled with the
> display entity's DT node. The associated driver shall thus extract the
> information from the DT node itself. In all other cases the display entity
> driver shall not parse data from the information source directly, but shall
> instead receive a platform data structure filled with data parsed by the
> display controller driver. In the most complex cases a machine driver,
> similar to ASoC machine drivers, might be needed, in which case platform
> data could be provided by that machine driver.
>
> Display entity drivers are encouraged to internally fill a platform data
> structure from their DT node to reuse the same code path for both platform
> data- and DT-based initialization.
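The recommendation above can be sketched in plain C. All names below are hypothetical illustrations, not actual CDF or kernel API; the point is only that the DT path fills the very same platform data structure that board code would pass, so the probe logic exists once:

```c
#include <assert.h>

/* Hypothetical platform data for a display entity. */
struct panel_pdata {
	int width_mm;
	int height_mm;
};

/* Stand-in for a DT node; a real driver would use of_property_read_u32()
 * and friends here. */
struct dt_node {
	int width_mm;
	int height_mm;
};

/* DT path: the driver extracts the information from its own node into
 * the same structure that board code would have provided... */
static void panel_fill_pdata_from_dt(const struct dt_node *node,
				     struct panel_pdata *pdata)
{
	pdata->width_mm = node->width_mm;
	pdata->height_mm = node->height_mm;
}

/* ...so the common initialization path only ever sees platform data. */
static int panel_init(const struct panel_pdata *pdata)
{
	return (pdata->width_mm > 0 && pdata->height_mm > 0) ? 0 : -1;
}
```

Board-code users would pass a struct panel_pdata directly, while DT users call panel_fill_pdata_from_dt() first; either way the same panel_init() runs.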
>
> 5. Bus Model
> ------------
>
> Display panels are connected to a video bus that transmits video data and
> optionally to a control bus. Those two busses can be separate physical
> interfaces or combined into a single physical interface.
>
> The Linux device model represents the system as a tree of devices (not to be
> confused with the device tree, abbreviated as DT). The tree is organized
> around
> control busses, with every device being a child of its control bus master. For
> instance an I2C device will be a child of its I2C controller device, which can
> itself be a child of its parent PCI device.
>
> Display panels will be represented as Linux devices. They will have a single
> parent from the Linux device model point of view, but will be potentially
> connected to multiple physical busses. CDF thus needs to define what bus to
> select as the Linux parent bus.
>
> In theory any physical bus that the device is attached to can be selected as
> the parent bus. However, selecting a video data bus would depart from the
> traditional Linux device model that uses control busses only. This caused
> concern among several people who argued that not presenting the device to the
> kernel as attached to its control bus would bring issues in embedded systems.
> Unlike on PC systems where the control bus master is usually the same physical
> device as the data bus master, embedded systems are made of a potentially
> complex assembly of completely unrelated devices. Not representing an I2C-
> controlled panel as a child of its I2C master in DT was thus frowned upon, even
> though no clear agreement was reached on the subject.
>
> Panels can be divided in three categories based on their bus model.
>
> - No control bus
>
> Many panels don't offer any control interface. They are usually referred to as
> 'dumb panels' as they directly display the data received on their video bus
> without any configurable option. Panels in this category often use DPI is
> their video bus, but other options such as DSI (using the DSI video mode only)
> are possible.
>
> Panels with no control bus can be represented in the device model as platform
> devices, or as being attached to their video bus. In the latter case we would
> need Linux busses for pure video data interfaces such as DPI or VGA. Nobody
> was particularly enthusiastic about this idea. Dumb panels will thus likely
> be represented as platform devices.
>
> - Separate video and control busses
>
> The typical case is a panel connected to an I2C or SPI bus that receives data
> through a DPI video interface or DSI video mode interface.
>
> Using a mixed control and video bus (such as DSI and DBI) for control only
> with a different bus for video data is possible in theory but very unlikely in
> practice (although the creativity of hardware developers should never be
> underestimated).
>
> Display panels that use a control bus supported by the Linux kernel should
> likely be represented as children of their control bus master. Other options
> are possible as mentioned above but were received without enthusiasm by most
> embedded kernel developers.
>
> When the control bus isn't supported by the kernel, a new bus type can be
> developed, or the panel can be represented as a platform device. The right
> option will likely vary depending on the control bus.
>
> - Combined video and control busses
>
> When the two busses are combined in a single physical bus the panel device
> will obviously be represented as a child of that single physical bus.
>
> In such cases the control bus could expose video bus control methods. This
> would remove the need for a video source as proposed by Tomi Valkeinen in his
> CDF model. However, if the bus can be used for video data transfer in
> combination with a different control bus, a video source corresponding to the
> data bus will be needed.
>
> No decision has been taken on whether to use a video source in addition to the
> control bus in the combined busses case. Experimentation will be needed, and
> the right solution might depend on the bus type.
>
> - Multiple control busses
>
> One panel was mentioned as being connected to a DSI bus and an I2C bus. The
> DSI bus is used for both control and video, and the I2C bus for control only.
> Configuring the panel requires sending commands through both DSI and I2C. The
> opinion on such panels was a large *sigh* followed by a "this should be
> handled by the device core, let's ask Greg KH".
>
> 6. Miscellaneous
> ----------------
>
> - If the OMAP3 DSS driver is used as a model for the DSI support
> implementation, Daniel Vetter requested the DSI bus lock semaphore to be
> killed as it prevents lockdep from working correctly (reference needed ;-)).
>
> - Do we need to support chaining several encoders? We can come up with
> several theoretical use cases, some of them probably exist in real hardware,
> but the details are still a bit fuzzy.
So, a part which is completely omitted in this thread is how to handle
suspend/resume ordering. If you have multiple encoders which need to
be turned on/off in a given order at suspend/resume, how do you handle
that given the current scheme where they are just separate platform
drivers in drivers/video?
This problem occurs with drm/exynos in current 3.8 kernels, for
example. On that platform, the DP driver and the FIMD driver will
suspend/resume in random order, and therefore fail resuming half the
time. Is there something which could be done in CDF to address that?
Stéphane
Hi,
While working with high-order page allocations (using `alloc_pages') I've
encountered some issues* with certain APIs and wanted to get a better
understanding of support for those APIs with high-order pages on ARM. In
short, I'm trying to give userspace access to those pages by using
`vm_insert_page' in an mmap handler. Without further ado, some
questions:
o vm_insert_page doesn't seem to work with high-order pages (it
eventually calls __flush_dcache_page which assumes pages of size
PAGE_SIZE). Is this analysis correct or am I missing something?
Things work fine if I use `remap_pfn_range' instead of
`vm_insert_page'. Things also seem to work if I use `vm_insert_page'
with an array of struct page * of size PAGE_SIZE (derived from the
high-order pages by picking out the PAGE_SIZE pages with
nth_page)...
o There's a comment in __dma_alloc (dma-alloc.c) to the effect that
__GFP_COMP is not supported on ARM. Is this true? The commit that
introduced this comment (ea2e7057) was actually ported from avr32
(3611553ef) so I'm curious about the basis for this claim...
I've tried pages of order 8 and order 4. The gfp flags I'm passing to
`alloc_pages' are (GFP_KERNEL | __GFP_HIGHMEM | __GFP_COMP).
Thanks!
* Some issues = in userspace, mmap the buffer whose underlying mmap
handler is the one mentioned above, memset it to some value, and then
immediately check that the bytes equal whatever we just wrote.
(With huge pages and vm_insert_page this test fails.)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
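The per-page workaround described in the first bullet can be illustrated with a small userspace sketch. The kernel symbols are stubbed here: struct page, nth_page and the vm_insert_page stub below are simplified stand-ins for illustration, not the real definitions:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Simplified stand-in for the kernel's struct page. */
struct page { unsigned long pfn; };

/* For physically contiguous pages, nth_page() is plain pointer math. */
static struct page *nth_page_stub(struct page *page, unsigned long n)
{
	return page + n;
}

static int inserted;	/* counts successful "insertions" */

/* Stand-in for vm_insert_page(); the real one handles one PAGE_SIZE
 * page per call, which is exactly why the loop below is needed. */
static int vm_insert_page_stub(unsigned long uaddr, struct page *page)
{
	(void)uaddr; (void)page;
	inserted++;
	return 0;
}

/* The workaround from the mail: instead of handing the whole high-order
 * allocation to vm_insert_page(), walk its 2^order PAGE_SIZE sub-pages
 * and insert them one at a time. */
static int map_high_order(unsigned long uaddr, struct page *head,
			  unsigned int order)
{
	unsigned long i, n = 1UL << order;
	int ret;

	for (i = 0; i < n; i++) {
		ret = vm_insert_page_stub(uaddr + (i << PAGE_SHIFT),
					  nth_page_stub(head, i));
		if (ret)
			return ret;
	}
	return 0;
}
```

An order-4 allocation is 16 contiguous PAGE_SIZE pages, so the loop makes 16 insertions; this mirrors the reported observation that inserting the sub-pages individually works where a single vm_insert_page of the head page does not.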
Add an iterator to walk through a scatter list a page at a time starting
at a specific page offset. As opposed to the mapping iterator this is
meant to be small, performing well even in simple loops like collecting
all pages on the scatterlist into an array or setting up an iommu table
based on the pages' DMA address.
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
include/linux/scatterlist.h | 48 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
[ Resending with proper email addresses. ]
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4bd6c06..d22851c 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -231,6 +231,54 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents,
*/
#define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist))
+struct sg_page_iter {
+ struct scatterlist *sg;
+ int sg_pgoffset;
+ struct page *page;
+};
+
+static inline int
+sg_page_cnt(struct scatterlist *sg)
+{
+ BUG_ON(sg->offset || sg->length & ~PAGE_MASK);
+
+ return sg->length >> PAGE_SHIFT;
+}
+
+static inline void
+sg_page_iter_next(struct sg_page_iter *iter)
+{
+ while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) {
+ iter->sg_pgoffset -= sg_page_cnt(iter->sg);
+ iter->sg = sg_next(iter->sg);
+ }
+
+ if (iter->sg) {
+ iter->page = nth_page(sg_page(iter->sg), iter->sg_pgoffset);
+ iter->sg_pgoffset++;
+ }
+}
+
+static inline void
+sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist,
+ unsigned long pgoffset)
+{
+ iter->sg = sglist;
+ iter->sg_pgoffset = pgoffset;
+ iter->page = NULL;
+
+ sg_page_iter_next(iter);
+}
+
+/*
+ * Simple sg page iterator, starting off at the given page offset. Each entry
+ * on the sglist must start at offset 0 and can contain only full pages.
+ * iter->page will point to the current page, iter->sg_pgoffset to the page
+ * offset within the sg holding that page.
+ */
+#define for_each_sg_page(sglist, iter, pgoffset) \
+ for (sg_page_iter_start((iter), (sglist), (pgoffset)); \
+ (iter)->sg; sg_page_iter_next(iter))
/*
* Mapping sg iterator
--
1.7.9.5
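To see the iterator's control flow outside the kernel, here is a userspace replica of the sg_page_iter_start()/sg_page_iter_next() logic from the patch, with a scatterlist entry reduced to a base page number and a full-page count. The fake_* names are illustrative only, not kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* A scatterlist entry reduced to what the iterator needs: a first page
 * "number" and how many full pages the entry covers. */
struct fake_sg {
	int base_page;
	int page_cnt;
	struct fake_sg *next;
};

struct fake_iter {
	struct fake_sg *sg;
	int sg_pgoffset;
	int page;	/* current page number, -1 when exhausted */
};

static void fake_iter_next(struct fake_iter *it)
{
	/* Skip entries the offset points past, carrying the remainder,
	 * exactly as sg_page_iter_next() does with sg_page_cnt(). */
	while (it->sg && it->sg_pgoffset >= it->sg->page_cnt) {
		it->sg_pgoffset -= it->sg->page_cnt;
		it->sg = it->sg->next;
	}
	if (it->sg) {
		it->page = it->sg->base_page + it->sg_pgoffset;
		it->sg_pgoffset++;
	} else {
		it->page = -1;
	}
}

static void fake_iter_start(struct fake_iter *it, struct fake_sg *sgl,
			    int pgoffset)
{
	it->sg = sgl;
	it->sg_pgoffset = pgoffset;
	it->page = -1;
	fake_iter_next(it);
}
```

Iterating a two-entry list (two pages, then one page) from offset 0 visits all three pages in order, and starting at offset 2 lands directly on the first page of the second entry, matching the intended for_each_sg_page semantics.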
Hi Imre!
On Sat, Feb 09, 2013 at 05:27:33PM +0200, Imre Deak wrote:
> Add a helper to walk through a scatter list a page at a time. Needed by
> upcoming patches fixing the scatter list walking logic in the i915 driver.
Nice patch, but I think this would make a rather nice addition to the
common scatterlist api in scatterlist.h, maybe called sg_page_iter.
There's already another helper which does cpu mappings, but it has a
different use-case (gives you the page mapped, which we don't need and
can cope with not page-aligned sg tables). With dma-buf using sg tables I
expect more users of such a sg page iterator to pop up. Most possible
users of this will hang around on linaro-mm-sig, so please also cc that
besides the usual suspects.
Cheers, Daniel
>
> Signed-off-by: Imre Deak <imre.deak(a)intel.com>
> ---
> include/drm/drmP.h | 44 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 44 insertions(+)
>
> diff --git a/include/drm/drmP.h b/include/drm/drmP.h
> index fad21c9..0c0c213 100644
> --- a/include/drm/drmP.h
> +++ b/include/drm/drmP.h
> @@ -1578,6 +1578,50 @@ extern int drm_sg_alloc_ioctl(struct drm_device *dev, void *data,
> extern int drm_sg_alloc(struct drm_device *dev, struct drm_scatter_gather * request);
> extern int drm_sg_free(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> +struct drm_sg_iter {
> + struct scatterlist *sg;
> + int sg_offset;
Imo using sg_pfn_offset (i.e. sg_offset >> PAGE_SHIFT) would make it
clearer that this is all about iterating page-aligned sg tables.
> + struct page *page;
> +};
> +
> +static inline int
> +__drm_sg_iter_seek(struct drm_sg_iter *iter)
> +{
> + while (iter->sg && iter->sg_offset >= iter->sg->length) {
> + iter->sg_offset -= iter->sg->length;
> + iter->sg = sg_next(iter->sg);
And adding a WARN_ON(sg->length & ~PAGE_MASK); here would enforce that.
> + }
> +
> + return iter->sg ? 0 : -1;
> +}
> +
> +static inline struct page *
> +drm_sg_iter_next(struct drm_sg_iter *iter)
> +{
> + struct page *page;
> +
> + if (__drm_sg_iter_seek(iter))
> + return NULL;
> +
> + page = nth_page(sg_page(iter->sg), iter->sg_offset >> PAGE_SHIFT);
> + iter->sg_offset = (iter->sg_offset + PAGE_SIZE) & PAGE_MASK;
> +
> + return page;
> +}
> +
> +static inline struct page *
> +drm_sg_iter_start(struct drm_sg_iter *iter, struct scatterlist *sg,
> + unsigned long offset)
> +{
> + iter->sg = sg;
> + iter->sg_offset = offset;
> +
> + return drm_sg_iter_next(iter);
> +}
> +
> +#define drm_for_each_sg_page(iter, sg, pgoffset) \
> + for ((iter)->page = drm_sg_iter_start((iter), (sg), (pgoffset));\
> + (iter)->page; (iter)->page = drm_sg_iter_next(iter))
Again, for the initialization I'd go with page numbers, not an offset in
bytes.
>
> /* ATI PCIGART support (ati_pcigart.h) */
> extern int drm_ati_pcigart_init(struct drm_device *dev,
> --
> 1.7.10.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx(a)lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
On page allocation or kmalloc failure in the system heap allocate path,
the exit path iterates over the allocated page infos and frees the
allocated pages and page infos. The same page info structure is used as
the loop iterator, so use the safe version of the list iterator.
Signed-off-by: Nishanth Peethambaran <nishanth(a)broadcom.com>
---
drivers/gpu/ion/ion_system_heap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/ion/ion_system_heap.c
b/drivers/gpu/ion/ion_system_heap.c
index c1061a8..d079e2b 100644
--- a/drivers/gpu/ion/ion_system_heap.c
+++ b/drivers/gpu/ion/ion_system_heap.c
@@ -200,7 +200,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
err1:
kfree(table);
err:
- list_for_each_entry(info, &pages, list) {
+ list_for_each_entry_safe(info, tmp_info, &pages, list) {
free_buffer_page(sys_heap, buffer, info->page, info->order);
kfree(info);
}
--
1.7.9.5
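The bug class this patch fixes is easy to reproduce in userspace: without a second cursor, a plain iteration would dereference a just-freed node to advance. A simplified stand-in for the safe pattern (a minimal singly linked list, not the kernel's list_head implementation):

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal list node standing in for ion's page_info entries. */
struct node {
	int order;
	struct node *next;
};

static int freed;	/* counts freed nodes for the check below */

/* Safe traversal: fetch 'next' BEFORE freeing the current node, which
 * is exactly what list_for_each_entry_safe() does with its extra
 * cursor variable. A non-safe loop would read pos->next after free(). */
static void free_all_safe(struct node *head)
{
	struct node *pos = head, *tmp;

	while (pos) {
		tmp = pos->next;	/* saved before the free */
		free(pos);
		freed++;
		pos = tmp;
	}
}

static struct node *push(struct node *head, int order)
{
	struct node *n = malloc(sizeof(*n));

	n->order = order;
	n->next = head;
	return n;
}
```

In the patched error path, tmp_info plays the role of tmp here: it keeps the traversal valid while free_buffer_page()/kfree() destroy the current entry.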
Hi,
I'm working on Android Linux Kernel Version 3.0.15 and seeing a
"deadlock" in the ashmem driver, while handling mmap request. I seek
your support in finding the correct fix.
The locks that involved in the dead lock are
1) mm->mmap_sem
2) ashmem_mutex
The following is the sequence of events that leads to the deadlock.
There are two threads A and B that belong to the same process
(system_server) and hence share the mm struct.
A1) In the A's context an mmap system call is made with an fd of ashmem
A2) The system call sys_mmap_pgoff acquires the mmap_sem of the "mm"
and sleeps before calling the .mmap of ashmem i.e before calling
ashmem_mmap
Now the thread B runs and proceeds to do the following
B1) In the B's context ashmem ioctl with option ASHMEM_SET_NAME is called.
B2) Now the code proceeds to acquire the ashmem_mutex and performs a
"copy_from_user"
B3) copy_from_user raises a valid exception to copy the data from user
space and proceeds to handle it gracefully, do_DataAbort -->
do_page_fault
B4) In do_page_fault it finds that mm->mmap_sem is not available
(note: A & B share the mm) since A holds it, and sleeps
Now the thread A runs again
A3) It proceeds to call ashmem_mmap and tries to acquire
ashmem_mutex, which is not available (B holds it), and sleeps.
Now A has acquired mmap_sem and waits for B to release ashmem_mutex
B has acquired ashmem_mutex and waits for the mmap_sem to be
available, which is held by A
This creates a dead lock in the system.
I'm not sure how to use these locks in such a way as to prevent this
scenario. Any suggestions would be of great help.
Workaround:
One possible workaround is to replace the mutex_lock call made in
ashmem_mmap with mutex_trylock and, if it fails, wait for a few
milliseconds and retry, finally giving up after a few iterations. This
would bring the system out of the deadlock if this scenario happens. I
myself feel that this suggestion is not clean, but I'm unable to think
of anything better.
Is there any suggestion to avoid this scenario?
Warm Regards,
Shankar
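One way to remove the inversion entirely, rather than papering over it with trylock, is to break the lock dependency itself: perform the copy_from_user (which may fault and take mmap_sem) into a local buffer with no lock held, and take ashmem_mutex only around the assignment. A userspace-shaped sketch with the kernel call stubbed out (hypothetical names, not the actual ashmem code):

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define ASHMEM_NAME_LEN 256

static pthread_mutex_t ashmem_mutex = PTHREAD_MUTEX_INITIALIZER;
static char asma_name[ASHMEM_NAME_LEN];

/* Stub for copy_from_user(): in the kernel this can fault and therefore
 * end up taking mmap_sem, which is why it must not run while
 * ashmem_mutex is held. */
static int copy_from_user_stub(char *dst, const char *src, size_t len)
{
	strncpy(dst, src, len);
	return 0;
}

/* Reordered SET_NAME: the user copy happens before the mutex is taken,
 * so the mmap_sem -> ashmem_mutex ordering used by the mmap path can
 * never be inverted by this ioctl. */
static int ashmem_set_name(const char *uname)
{
	char local[ASHMEM_NAME_LEN];

	if (copy_from_user_stub(local, uname, sizeof(local)))
		return -1;
	local[ASHMEM_NAME_LEN - 1] = '\0';

	pthread_mutex_lock(&ashmem_mutex);
	strcpy(asma_name, local);
	pthread_mutex_unlock(&ashmem_mutex);
	return 0;
}
```

With this shape, thread B only ever holds ashmem_mutex around plain memory accesses, so the B3/B4 page-fault path never runs under the mutex and the cycle described above cannot form.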