Hello,
This is the fourth version of my proposal for device tree integration of
reserved memory and the Contiguous Memory Allocator. Following the comments
from Grant Likely, I moved the memory region definitions back to the /memory
node (as in the first version of this proposal). I've also extended
the code and made it more generic, and added support for so-called reserved
DMA memory (special DMA memory regions created by dma_alloc_coherent()
function, for exclusive use by DMA allocations of the given device).
Just a few words for those who see this code for the first time:
The proposed bindings allow defining contiguous memory regions with a
specified base address and size. The defined regions can then be
assigned to the given device(s) by adding a property with a phandle to
the defined contiguous memory region. From the device tree perspective
that's all. Once the bindings are added, all memory allocations made
through the dma-mapping subsystem will be served from the defined
contiguous memory regions.
The Contiguous Memory Allocator is a framework that provides large
contiguous memory buffers for (usually multimedia) devices. The
contiguous memory is reserved during early boot and then shared with the
kernel, which is allowed to allocate it for movable pages. Then, when a
device driver requests a contiguous buffer, the framework migrates
movable pages out of the contiguous region and gives it to the driver. When
the device driver frees the buffer, it is added to the kernel memory pool again.
For more information, please refer to commit c64be2bb1c6eb43c838b2c6d57
("drivers: add Contiguous Memory Allocator") and d484864dd96e1830e76895
(CMA merge commit).
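To make the reserve / share / migrate / return cycle described above a bit
more concrete, here is a minimal userspace sketch in C. It is a toy model
only and not the CMA implementation: the names (toy_cma, toy_borrow_movable,
toy_cma_alloc, toy_cma_release) and the two size constants are assumptions
made up for this illustration, and "migration" is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REGION_PAGES  16	/* size of the reserved region (assumed) */
#define OUTSIDE_PAGES 16	/* room elsewhere for migrated pages (assumed) */

enum page_state { PAGE_FREE, PAGE_MOVABLE, PAGE_DEVICE };

/* Hypothetical model of one reserved contiguous region. */
struct toy_cma {
	enum page_state region[REGION_PAGES];
	size_t outside_used;	/* movable pages that were migrated out */
};

/* The kernel borrows one page of the region for movable data. */
static bool toy_borrow_movable(struct toy_cma *cma)
{
	for (size_t i = 0; i < REGION_PAGES; i++) {
		if (cma->region[i] == PAGE_FREE) {
			cma->region[i] = PAGE_MOVABLE;
			return true;
		}
	}
	return false;
}

/*
 * A driver requests 'count' contiguous pages. Movable pages found in the
 * chosen range are migrated out first. Returns the first page index, or
 * -1 if no suitable range exists.
 */
static int toy_cma_alloc(struct toy_cma *cma, size_t count)
{
	if (count == 0 || count > REGION_PAGES)
		return -1;

	for (size_t start = 0; start + count <= REGION_PAGES; start++) {
		size_t movable = 0;
		bool busy = false;

		for (size_t i = start; i < start + count; i++) {
			if (cma->region[i] == PAGE_DEVICE) {
				busy = true;
				break;
			}
			if (cma->region[i] == PAGE_MOVABLE)
				movable++;
		}
		if (busy || cma->outside_used + movable > OUTSIDE_PAGES)
			continue;

		cma->outside_used += movable;	/* "migrate" the movable pages */
		for (size_t i = start; i < start + count; i++)
			cma->region[i] = PAGE_DEVICE;
		return (int)start;
	}
	return -1;
}

/* The driver frees the buffer; the pages return to the kernel pool. */
static void toy_cma_release(struct toy_cma *cma, int start, size_t count)
{
	assert(start >= 0 && (size_t)start + count <= REGION_PAGES);
	for (size_t i = (size_t)start; i < (size_t)start + count; i++)
		cma->region[i] = PAGE_FREE;
}
```

In this model, borrowing four movable pages and then asking for eight
contiguous pages succeeds at index 0 after four pages have been migrated;
releasing the buffer makes those pages available for movable data again.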
Why do we need device tree bindings for CMA at all?
Older ARM kernels used so-called board-based initialization. Those board
files contained a definition of all hardware blocks available on the
target system, plus the particular kernel and driver software configuration
selected by the board maintainer.
In the new approach the board files are removed completely and the
Device Tree is used to describe all hardware blocks available
on the target system. By definition, the bindings should be software
independent, so at least in theory it should be possible to use those
bindings with operating systems other than the Linux kernel.
Reserved memory configuration belongs to a grey area. It might depend
on hardware restrictions of the board or modules and on low-level
configuration done by the bootloader. Putting reserved and contiguous memory
regions in the /memory node and having phandles to those regions in the
device nodes, however, matches well the typical device tree style of
linking devices with other resources like clocks, interrupts,
regulators, power domains, etc. This is the main reason to use this
approach instead of putting everything in the /chosen node, as was
proposed in v2 and v3.
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
Changelog:
v4:
- moved back contiguous-memory bindings from /chosen/contiguous-memory
to /memory nodes as suggested by Grant (see
http://article.gmane.org/gmane.linux.drivers.devicetree/41030
for more details)
- added support for DMA reserved memory with dma_declare_coherent()
- moved code to drivers/of/of_reserved_mem.c
- added generic code to scan specific path in flat device tree
v3: http://thread.gmane.org/gmane.linux.drivers.devicetree/40013/
- fixed issues pointed out by Laura and updated documentation
v2: http://thread.gmane.org/gmane.linux.drivers.devicetree/34075
- moved contiguous-memory bindings from /memory to /chosen/contiguous-memory/
node to avoid spreading Linux specific parameters over the whole device
tree definitions
- added support for autoconfigured regions (use zero base)
- fixed minor bugs
v1: http://thread.gmane.org/gmane.linux.drivers.devicetree/30111/
- initial proposal
Patch summary:
Marek Szyprowski (4):
drivers: dma-contiguous: clean source code and prepare for device
tree
drivers: of: add function to scan fdt nodes given by path
drivers: of: add initialization code for dma reserved memory
ARM: init: add support for reserved memory defined by device tree
 Documentation/devicetree/bindings/memory.txt |  152 ++++++++++++++++++++++
 arch/arm/mm/init.c                           |    3 +
 drivers/base/dma-contiguous.c                |  147 +++++++++++-----------
 drivers/of/Kconfig                           |    6 +
 drivers/of/Makefile                          |    1 +
 drivers/of/fdt.c                             |   76 +++++++++++
 drivers/of/of_reserved_mem.c                 |  175 ++++++++++++++++++++++++++
 include/asm-generic/dma-coherent.h           |    6 +
 include/asm-generic/dma-contiguous.h         |    2 -
 include/linux/dma-contiguous.h               |   49 +++++++-
 include/linux/of_fdt.h                       |    3 +
 11 files changed, 541 insertions(+), 79 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/memory.txt
 create mode 100644 drivers/of/of_reserved_mem.c
--
1.7.9.5
On Wed, Aug 7, 2013 at 12:23 AM, John Stultz <john.stultz(a)linaro.org> wrote:
> On Tue, Aug 6, 2013 at 5:15 AM, Rob Clark <robdclark(a)gmail.com> wrote:
>> well, let's divide things up into two categories:
>>
>> 1) the arrangement and format of pixels.. ie. what userspace would
>> need to know if it mmap's a buffer. This includes pixel format,
>> stride, etc. This should be negotiated in userspace, it would be
>> crazy to try to do this in the kernel.
>>
>> 2) the physical placement of the pages. Ie. whether it is contiguous
>> or not. Which bank the pages in the buffer are placed in, etc. This
>> is not visible to userspace. This is the purpose of the attach step,
>> so you know all the devices involved in sharing up front before
>> allocating the backing pages. (Or in the worst case, if you have a
>> "late attacher" you at least know when no device is doing dma access
>> to a buffer and can reallocate and move the buffer.) A long time
>
> One concern I know the Android folks have expressed previously (and
> correct me if it's no longer an objection), is that this attach time
> in-kernel constraint solving / moving or reallocating buffers is
> likely to hurt determinism. If I understood, their perspective was
> that userland knows the device path the buffers will travel through,
> so why not leverage that knowledge, rather than having the kernel have
> to sort it out for itself after the fact.
If you know the device path, then attach the buffer at all the devices
before you start using it. Problem solved.. kernel knows all devices
before pages need be allocated ;-)
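As a rough illustration of the "attach everything first, allocate on first
use" ordering, here is a minimal sketch in C. It is a hypothetical model and
not the dma-buf API: toy_buffer, toy_attach, toy_map and toy_unmap are names
invented for this example, and the only placement property tracked is
whether the backing pages are contiguous.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_placement { TOY_UNALLOCATED, TOY_SCATTERED, TOY_CONTIGUOUS };

/* Hypothetical shared buffer whose backing pages are chosen lazily. */
struct toy_buffer {
	enum toy_placement placement;
	bool need_contiguous;	/* set once any attached device requires it */
	int active_maps;	/* devices currently doing DMA to the buffer */
};

/*
 * Attach a device. Returns false for a "late attacher" that cannot use the
 * current placement while some device is still doing DMA, because the
 * buffer cannot be moved at that moment.
 */
static bool toy_attach(struct toy_buffer *buf, bool dev_needs_contiguous)
{
	bool misplaced = dev_needs_contiguous &&
			 buf->placement == TOY_SCATTERED;

	if (misplaced && buf->active_maps > 0)
		return false;
	if (dev_needs_contiguous)
		buf->need_contiguous = true;
	if (misplaced)
		buf->placement = TOY_CONTIGUOUS; /* idle: reallocate and move */
	return true;
}

/* First map picks the placement, using everything known from attach. */
static void toy_map(struct toy_buffer *buf)
{
	if (buf->placement == TOY_UNALLOCATED)
		buf->placement = buf->need_contiguous ? TOY_CONTIGUOUS
						      : TOY_SCATTERED;
	buf->active_maps++;
}

static void toy_unmap(struct toy_buffer *buf)
{
	assert(buf->active_maps > 0);
	buf->active_maps--;
}
```

If both devices attach before the first toy_map(), the buffer is placed
correctly once. A device that attaches late to a busy, scattered buffer has
to wait until no DMA is in flight, at which point the buffer can be moved.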
BR,
-R
On Tue, Aug 6, 2013 at 10:03 AM, Tom Cooksey <tom.cooksey(a)arm.com> wrote:
> Hi Rob,
>
>> >> > We may also then have additional constraints when sharing buffers
>> >> > between the display HW and video decode or even camera ISP HW.
>> >> > Programmatically describing buffer allocation constraints is very
>> >> > difficult and I'm not sure you can actually do it - there's some
>> >> > pretty complex constraints out there! E.g. I believe there's a
>> >> > platform where Y and UV planes of the reference frame need to be
>> >> > in separate DRAM banks for real-time 1080p decode, or something
>> >> > like that?
>> >>
>> >> yes, this was discussed. This is different from pitch/format/size
>> >> constraints.. it is really just a placement constraint (ie. where
>> >> do the physical pages go). IIRC the conclusion was to use a dummy
>> >> device with its own CMA pool for attaching the Y vs UV buffers.
>> >>
>> >> > Anyway, I guess my point is that even if we solve how to allocate
>> >> > buffers which will be shared between the GPU and display HW such
>> >> > that both sets of constraints are satisfied, that may not be the
>> >> > end of the story.
>> >> >
>> >>
>> >> that was part of the reason to punt this problem to userspace ;-)
>> >>
>> >> In practice, the kernel drivers don't usually know too much about
>> >> the dimensions/format/etc.. that is really userspace level
>> >> knowledge. There are a few exceptions when the kernel needs to know
>> >> how to setup GTT/etc for tiled buffers, but normally this sort of
>> >> information is up at the next level up (userspace, and
>> >> drm_framebuffer in case of scanout). Userspace media frameworks
>> >> like GStreamer already have a concept of format/caps negotiation.
>> >> For non-display<->gpu sharing, I think this is probably where this
>> >> sort of constraint negotiation should be handled.
>> >
>> > I agree that user-space will know which devices will access the
>> > buffer and thus can figure out at least a common pixel format.
>> > Though I'm not so sure userspace can figure out more low-level
>> > details like alignment and placement in physical memory, etc.
>> >
>>
>> well, let's divide things up into two categories:
>>
>> 1) the arrangement and format of pixels.. ie. what userspace would
>> need to know if it mmap's a buffer. This includes pixel format,
>> stride, etc. This should be negotiated in userspace, it would be
>> crazy to try to do this in the kernel.
>
> Absolutely. Pixel format has to be negotiated by user-space as in
> most cases, user-space can map the buffer and thus will need to
> know how to interpret the data.
>
>
>
>> 2) the physical placement of the pages. Ie. whether it is contiguous
>> or not. Which bank the pages in the buffer are placed in, etc. This
>> is not visible to userspace.
>
> Seems sensible to me.
>
>
>> ... This is the purpose of the attach step,
>> so you know all the devices involved in sharing up front before
>> allocating the backing pages. (Or in the worst case, if you have a
>> "late attacher" you at least know when no device is doing dma access
>> to a buffer and can reallocate and move the buffer.) A long time
>> back, I had a patch that added a field or two to 'struct
>> device_dma_parameters' so that it could be known if a device required
>> contiguous buffers.. looks like that never got merged, so I'd need to
>> dig that back up and resend it. But the idea was to have the 'struct
>> device' encapsulate all the information that would be needed to
>> do-the-right-thing when it comes to placement.
>
> As I understand it, it's up to the exporting device to allocate the
> memory backing the dma_buf buffer. I guess the latest possible point
> you can allocate the backing pages is when map_dma_buf is first
> called? At that point the exporter can iterate over the current set
> of attachments, programmatically determine all the constraints of
> all the attached drivers and attempt to allocate the backing pages
> in such a way as to satisfy all those constraints?
yes, this is the idea.. possibly some room for some helpers to help
out with this, but that is all under the hood from userspace
perspective
> Didn't you say that programmatically describing device placement
> constraints was an unbounded problem? I guess we would have to
> accept that it's not possible to describe all possible constraints
> and instead find a way to describe the common ones?
well, the point I'm trying to make, is by dividing your constraints
into two groups, one that impacts and is handled by userspace, and one
that is in the kernel (ie. where the pages go), you cut down the
number of permutations that the kernel has to care about considerably.
And kernel already cares about, for example, what range of addresses
that a device can dma to/from. I think really the only thing missing
is the max # of sglist entries (contiguous or not)
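A minimal sketch in C of what combining the kernel-side constraints of all
attached devices could look like. This is an assumption-laden illustration,
not existing kernel code: toy_dma_constraints and toy_merge are hypothetical
names, and the two fields simply mirror the two items mentioned above (the
reachable address range and the maximum number of sglist entries, where 1
means the device needs a contiguous buffer).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device placement constraints. */
struct toy_dma_constraints {
	uint64_t dma_mask;		/* highest address the device can reach */
	unsigned int max_segments;	/* max sglist entries, 1 == contiguous */
};

/* Combine the constraints of all attached devices: the tightest one wins. */
static struct toy_dma_constraints
toy_merge(const struct toy_dma_constraints *devs, size_t n)
{
	struct toy_dma_constraints out = { UINT64_MAX, UINT_MAX };

	for (size_t i = 0; i < n; i++) {
		if (devs[i].dma_mask < out.dma_mask)
			out.dma_mask = devs[i].dma_mask;
		if (devs[i].max_segments < out.max_segments)
			out.max_segments = devs[i].max_segments;
	}
	return out;
}

static bool toy_needs_contiguous(const struct toy_dma_constraints *c)
{
	return c->max_segments == 1;
}
```

With only these two fields the merge is a pair of minimums, which is the
point of the split: pixel format and stride never enter the picture here.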
> One problem with this is it duplicates a lot of logic in each
> driver which can export a dma_buf buffer. Each exporter will need to
> do pretty much the same thing: iterate over all the attachments,
> determine all the constraints (assuming that can be done) and
> allocate pages such that the lowest-common-denominator is satisfied.
>
> Perhaps rather than duplicating that logic in every driver, we could
> instead move allocation of the backing pages into dma_buf itself?
>
I tend to think it is better to add helpers as we see common patterns
emerge, which drivers can opt-in to using. I don't think that we
should move allocation into dma_buf itself, but it would perhaps be
useful to have dma_alloc_*() variants that could allocate for multiple
devices. That would help for simple stuff, although I'd suspect
eventually a GPU driver will move away from that. (Since you probably
want to play tricks w/ pools of pages that are pre-zero'd and in the
correct cache state, use spare cycles on the gpu or dma engine to
pre-zero uncached pages, and games like that.)
>
>> > Anyway, assuming user-space can figure out how a buffer should be
>> > stored in memory, how does it indicate this to a kernel driver and
>> > actually allocate it? Which ioctl on which device does user-space
>> > call, with what parameters? Are you suggesting using something like
>> > ION which exposes the low-level details of how buffers are laid out in
>> > physical memory to userspace? If not, what?
>>
>> no, userspace should not need to know this. And having a central
>> driver that knows this for all the other drivers in the system doesn't
>> really solve anything and isn't really scalable. At best you might
>> want, in some cases, a flag you can pass when allocating. For
>> example, some of the drivers have a 'SCANOUT' flag that can be passed
>> when allocating a GEM buffer, as a hint to the kernel that 'if this hw
>> requires contig memory for scanout, allocate this buffer contig'. But
>> really, when it comes to sharing buffers between devices, we want this
>> sort of information in dev->dma_params of the importing device(s).
>
> If you had a single driver which knew the constraints of all devices
> on that particular SoC and the interface allowed user-space to specify
> which devices a buffer is intended to be used with, I guess it could
> pretty trivially allocate pages which satisfy those constraints? It
keep in mind, even a number of SoCs come with pcie these days. You
already have things like
https://developer.nvidia.com/content/kayla-platform
You probably want to get out of the SoC mindset, otherwise you are
going to make bad assumptions that come back to bite you later on.
> wouldn't need a way to programmatically describe the constraints
> either: As you say, if userspace sets the "SCANOUT" flag, it would
> just "know" that on this SoC, that buffer needs to be physically
> contiguous for example.
not really.. it just knows it wants to scanout the buffer, and tells
this as a hint to the kernel.
For example, on omapdrm, the SCANOUT flag does nothing on omap4+
(where phys contig is not required for scanout), but causes CMA
(dma_alloc_*()) to be used on omap3. Userspace doesn't care. It just
knows that it wants to be able to scanout that particular buffer.
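A minimal sketch in C of the "flag as a hint" idea. The names below
(TOY_BO_SCANOUT, toy_hw, toy_pick_backing) are hypothetical and are not the
omapdrm interface; the sketch only shows that the same userspace flag can
lead to different backing memory depending on what the hardware needs.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BO_SCANOUT 0x1u	/* hypothetical allocation hint from userspace */

enum toy_backing { TOY_BACKING_PAGES, TOY_BACKING_CONTIGUOUS };

/* Hypothetical description of what the display hardware requires. */
struct toy_hw {
	bool scanout_needs_contiguous;
};

/*
 * The kernel side decides what the hint means on this hardware.
 * Userspace only says "I want to scan this buffer out".
 */
static enum toy_backing toy_pick_backing(const struct toy_hw *hw,
					 unsigned int flags)
{
	if ((flags & TOY_BO_SCANOUT) && hw->scanout_needs_contiguous)
		return TOY_BACKING_CONTIGUOUS;
	return TOY_BACKING_PAGES;
}
```

For a toy_hw with scanout_needs_contiguous set (the omap3-like case) the
hint selects contiguous memory; with it cleared (the omap4+-like case) the
hint changes nothing.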
> Though it would effectively mean you'd need an "allocation" driver per
> SoC, which as you say may not be scalable?
Right.. and not actually even possible in the general sense (see SoC +
external pcie gfx card)
BR,
-R
>
>
> Cheers,
>
> Tom
On Tuesday, 06.08.2013 at 12:31 +0100, Tom Cooksey wrote:
> Hi Rob,
>
> +lkml
>
> > >> On Fri, Jul 26, 2013 at 11:58 AM, Tom Cooksey <tom.cooksey(a)arm.com>
> > >> wrote:
> > >> >> > * It abuses flags parameter of DRM_IOCTL_MODE_CREATE_DUMB to
> > >> >> > also allocate buffers for the GPU. Still not sure how to
> > >> >> > resolve this as we don't use DRM for our GPU driver.
> > >> >>
> > >> >> any thoughts/plans about a DRM GPU driver? Ideally long term
> > >> >> (esp. once the dma-fence stuff is in place), we'd have
> > >> >> gpu-specific drm (gpu-only, no kms) driver, and SoC/display
> > >> >> specific drm/kms driver, using prime/dmabuf to share between
> > >> >> the two.
> > >> >
> > >> > The "extra" buffers we were allocating from armsoc DDX were really
> > >> > being allocated through DRM/GEM so we could get an flink name
> > >> > for them and pass a reference to them back to our GPU driver on
> > >> > the client side. If it weren't for our need to access those
> > >> > extra off-screen buffers with the GPU we wouldn't need to
> > >> > allocate them with DRM at all. So, given they are really "GPU"
> > >> > buffers, it does absolutely make sense to allocate them in a
> > >> > different driver to the display driver.
> > >> >
> > >> > However, to avoid unnecessary memcpys & related cache
> > >> > maintenance ops, we'd also like the GPU to render into buffers
> > >> > which are scanned out by the display controller. So let's say
> > >> > we continue using DRM_IOCTL_MODE_CREATE_DUMB to allocate scan
> > >> > out buffers with the display's DRM driver but a custom ioctl
> > >> > on the GPU's DRM driver to allocate non scanout, off-screen
> > >> > buffers. Sounds great, but I don't think that really works
> > >> > with DRI2. If we used two drivers to allocate buffers, which
> > >> > of those drivers do we return in DRI2ConnectReply? Even if we
> > >> > solve that somehow, GEM flink names are name-spaced to a
> > >> > single device node (AFAIK). So when we do a DRI2GetBuffers,
> > >> > how does the EGL in the client know which DRM device owns GEM
> > >> > flink name "1234"? We'd need some pretty dirty hacks.
> > >>
> > >> You would return the name of the display driver allocating the
> > >> buffers. On the client side you can use generic ioctls to go from
> > >> flink -> handle -> dmabuf. So the client side would end up opening
> > >> both the display drm device and the gpu, but without needing to know
> > >> too much about the display.
> > >
> > > I think the bit I was missing was that a GEM bo for a buffer imported
> > > using dma_buf/PRIME can still be flink'd. So the display controller's
> > > DRM driver allocates scan-out buffers via the DUMB buffer allocate
> > > ioctl. Those scan-out buffers can then be exported from the
> > > display's DRM driver and imported into the GPU's DRM driver using
> > > PRIME. Once imported into the GPU's driver, we can use flink to get a
> > > name for that buffer within the GPU DRM driver's name-space to return
> > > to the DRI2 client. That same namespace is also what DRI2 back-
> > > buffers are allocated from, so I think that could work... Except...
> >
> > (and.. the general direction is that things will move more to just use
> > dmabuf directly, ie. wayland or dri3)
>
> I agree, DRI2 is the only reason why we need a system-wide ID. I also
> prefer buffers to be passed around by dma_buf fd, but we still need to
> support DRI2 and will do for some time I expect.
>
>
>
> > >> > Anyway, that latter case also gets quite difficult. The "GPU"
> > >> > DRM driver would need to know the constraints of the display
> > >> > controller when allocating buffers intended to be scanned out.
> > >> > For example, pl111 typically isn't behind an IOMMU and so
> > >> > requires physically contiguous memory. We'd have to teach the
> > >> > GPU's DRM driver about the constraints of the display HW. Not
> > >> > exactly a clean driver model. :-(
> > >> >
> > >> > I'm still a little stuck on how to proceed, so any ideas
> > >> > would be greatly appreciated! My current train of thought is
> > >> > having a kind of SoC-specific DRM driver which allocates
> > >> > buffers for both display and GPU within a single GEM
> > >> > namespace. That SoC-specific DRM driver could then know the
> > >> > constraints of both the GPU and the display HW. We could then
> > >> > use PRIME to export buffers allocated with the SoC DRM driver
> > >> > and import them into the GPU and/or display DRM driver.
> > >>
> > >> Usually if the display drm driver is allocating the buffers that
> > >> might be scanned out, it just needs to have minimal knowledge of
> > >> the GPU (pitch alignment constraints). I don't think we need a
> > >> 3rd device just to allocate buffers.
> > >
> > > While Mali can render to pretty much any buffer, there is a mild
> > > performance improvement to be had if the buffer stride is aligned to
> > > the AXI bus's max burst length when drawing to the buffer.
> >
> > I suspect the display controllers might frequently benefit if the
> > pitch is aligned to AXI burst length too..
>
> If the display controller is going to be reading from linear memory
> I don't think it will make much difference - you'll just get an extra
> 1-2 bus transactions per scanline. With a tile-based GPU like Mali,
> you get those extra transactions per _tile_ scan-line and as such,
> the overhead is more pronounced.
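For reference, a minimal sketch in C of the pitch (stride) alignment being
discussed. The helper names are hypothetical, and the 64-byte value used in
the usage note is only an assumed burst length for illustration; the thread
does not state a number.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'value' up to the next multiple of 'align' (a power of two). */
static uint32_t toy_align_up(uint32_t value, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value + align - 1) & ~(align - 1);
}

/* Bytes per scanline once the pitch is padded to the given alignment. */
static uint32_t toy_pitch(uint32_t width, uint32_t bytes_per_pixel,
			  uint32_t align)
{
	return toy_align_up(width * bytes_per_pixel, align);
}

/*
 * "Worst-case" alignment for two devices: for power-of-two alignments
 * the larger value satisfies both.
 */
static uint32_t toy_worst_case_align(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}
```

With an assumed 64-byte alignment, a 1366 pixel wide 32bpp scanline is
padded from 5464 to 5504 bytes, while a 1920 pixel wide one is already
aligned at 7680 bytes.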
>
>
>
> > > So in some respects, there is a constraint on how buffers which will
> > > be drawn to using the GPU are allocated. I don't really like the idea
> > > of teaching the display controller DRM driver about the GPU buffer
> > > constraints, even if they are fairly trivial like this. If the same
> > > display HW IP is being used on several SoCs, it seems wrong somehow
> > > to enforce those GPU constraints if some of those SoCs don't have a
> > > GPU.
> >
> > Well, I suppose you could get min_pitch_alignment from devicetree, or
> > something like this..
> >
> > In the end, the easy solution is just to make the display allocate to
> > the worst-case pitch alignment. In the early days of dma-buf
> > discussions, we kicked around the idea of negotiating or
> > programmatically describing the constraints, but that didn't really
> > seem like a bounded problem.
>
> Yeah - I was around for some of those discussions and agree it's not
> really an easy problem to solve.
>
>
>
> > > We may also then have additional constraints when sharing buffers
> > > between the display HW and video decode or even camera ISP HW.
> > > Programmatically describing buffer allocation constraints is very
> > > difficult and I'm not sure you can actually do it - there's some
> > > pretty complex constraints out there! E.g. I believe there's a
> > > platform where Y and UV planes of the reference frame need to be in
> > > separate DRAM banks for real-time 1080p decode, or something like
> > > that?
> >
> > yes, this was discussed. This is different from pitch/format/size
> > constraints.. it is really just a placement constraint (ie. where do
> > the physical pages go). IIRC the conclusion was to use a dummy
> > device with its own CMA pool for attaching the Y vs UV buffers.
> >
> > > Anyway, I guess my point is that even if we solve how to allocate
> > > buffers which will be shared between the GPU and display HW such that
> > > both sets of constraints are satisfied, that may not be the end of
> > > the story.
> > >
> >
> > that was part of the reason to punt this problem to userspace ;-)
> >
> > In practice, the kernel drivers don't usually know too much about
> > the dimensions/format/etc.. that is really userspace level knowledge.
> > There are a few exceptions when the kernel needs to know how to setup
> > GTT/etc for tiled buffers, but normally this sort of information is up
> > at the next level up (userspace, and drm_framebuffer in case of
> > scanout). Userspace media frameworks like GStreamer already have a
> > concept of format/caps negotiation. For non-display<->gpu sharing, I
> > think this is probably where this sort of constraint negotiation
> > should be handled.
>
> I agree that user-space will know which devices will access the buffer
> and thus can figure out at least a common pixel format. Though I'm not
> so sure userspace can figure out more low-level details like alignment
> and placement in physical memory, etc.
>
> Anyway, assuming user-space can figure out how a buffer should be
> stored in memory, how does it indicate this to a kernel driver and
> actually allocate it? Which ioctl on which device does user-space
> call, with what parameters? Are you suggesting using something like
> ION which exposes the low-level details of how buffers are laid out in
> physical memory to userspace? If not, what?
>
I strongly disagree with exposing low-level hardware details like tiling
to userspace. If we have to negotiate those things in userspace, we will
end up having to pipe that information through things like the wayland
protocol. I don't see how this could ever be considered a good idea.
I would rather see kernel drivers negotiating those things at dmabuf
attach time in a way invisible to userspace. I agree that this negotiation
isn't easy to get right for the plethora of different hardware
constraints we see today, but I would rather see this in-kernel, where
we have the chance to fix things up if needed, than in a fixed userspace
interface.
Regards,
Lucas
--
Pengutronix e.K. | Lucas Stach |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
On Tue, Aug 6, 2013 at 7:31 AM, Tom Cooksey <tom.cooksey(a)arm.com> wrote:
>
>> > So in some respects, there is a constraint on how buffers which will
>> > be drawn to using the GPU are allocated. I don't really like the idea
>> > of teaching the display controller DRM driver about the GPU buffer
>> > constraints, even if they are fairly trivial like this. If the same
>> > display HW IP is being used on several SoCs, it seems wrong somehow
>> > to enforce those GPU constraints if some of those SoCs don't have a
>> > GPU.
>>
>> Well, I suppose you could get min_pitch_alignment from devicetree, or
>> something like this..
>>
>> In the end, the easy solution is just to make the display allocate to
>> the worst-case pitch alignment. In the early days of dma-buf
>> discussions, we kicked around the idea of negotiating or
>> programmatically describing the constraints, but that didn't really
>> seem like a bounded problem.
>
> Yeah - I was around for some of those discussions and agree it's not
> really an easy problem to solve.
>
>
>
>> > We may also then have additional constraints when sharing buffers
>> > between the display HW and video decode or even camera ISP HW.
>> > Programmatically describing buffer allocation constraints is very
>> > difficult and I'm not sure you can actually do it - there's some
>> > pretty complex constraints out there! E.g. I believe there's a
>> > platform where Y and UV planes of the reference frame need to be in
>> > separate DRAM banks for real-time 1080p decode, or something like
>> > that?
>>
>> yes, this was discussed. This is different from pitch/format/size
>> constraints.. it is really just a placement constraint (ie. where do
>> the physical pages go). IIRC the conclusion was to use a dummy
>> device with its own CMA pool for attaching the Y vs UV buffers.
>>
>> > Anyway, I guess my point is that even if we solve how to allocate
>> > buffers which will be shared between the GPU and display HW such that
>> > both sets of constraints are satisfied, that may not be the end of
>> > the story.
>> >
>>
>> that was part of the reason to punt this problem to userspace ;-)
>>
>> In practice, the kernel drivers don't usually know too much about
>> the dimensions/format/etc.. that is really userspace level knowledge.
>> There are a few exceptions when the kernel needs to know how to setup
>> GTT/etc for tiled buffers, but normally this sort of information is up
>> at the next level up (userspace, and drm_framebuffer in case of
>> scanout). Userspace media frameworks like GStreamer already have a
>> concept of format/caps negotiation. For non-display<->gpu sharing, I
>> think this is probably where this sort of constraint negotiation
>> should be handled.
>
> I agree that user-space will know which devices will access the buffer
> and thus can figure out at least a common pixel format. Though I'm not
> so sure userspace can figure out more low-level details like alignment
> and placement in physical memory, etc.
well, let's divide things up into two categories:
1) the arrangement and format of pixels.. ie. what userspace would
need to know if it mmap's a buffer. This includes pixel format,
stride, etc. This should be negotiated in userspace, it would be
crazy to try to do this in the kernel.
2) the physical placement of the pages. Ie. whether it is contiguous
or not. Which bank the pages in the buffer are placed in, etc. This
is not visible to userspace. This is the purpose of the attach step,
so you know all the devices involved in sharing up front before
allocating the backing pages. (Or in the worst case, if you have a
"late attacher" you at least know when no device is doing dma access
to a buffer and can reallocate and move the buffer.) A long time
back, I had a patch that added a field or two to 'struct
device_dma_parameters' so that it could be known if a device required
contiguous buffers.. looks like that never got merged, so I'd need to
dig that back up and resend it. But the idea was to have the 'struct
device' encapsulate all the information that would be needed to
do-the-right-thing when it comes to placement.
> Anyway, assuming user-space can figure out how a buffer should be
> stored in memory, how does it indicate this to a kernel driver and
> actually allocate it? Which ioctl on which device does user-space
> call, with what parameters? Are you suggesting using something like
> ION which exposes the low-level details of how buffers are laid out in
> physical memory to userspace? If not, what?
no, userspace should not need to know this. And having a central
driver that knows this for all the other drivers in the system doesn't
really solve anything and isn't really scalable. At best you might
want, in some cases, a flag you can pass when allocating. For
example, some of the drivers have a 'SCANOUT' flag that can be passed
when allocating a GEM buffer, as a hint to the kernel that 'if this hw
requires contig memory for scanout, allocate this buffer contig'. But
really, when it comes to sharing buffers between devices, we want this
sort of information in dev->dma_params of the importing device(s).
BR,
-R
On Mon, Aug 5, 2013 at 1:10 PM, Tom Cooksey <tom.cooksey(a)arm.com> wrote:
> Hi Rob,
>
> +linux-media, +linaro-mm-sig for discussion of video/camera
> buffer constraints...
>
>
>> On Fri, Jul 26, 2013 at 11:58 AM, Tom Cooksey <tom.cooksey(a)arm.com>
>> wrote:
>> >> > * It abuses flags parameter of DRM_IOCTL_MODE_CREATE_DUMB to also
>> >> > allocate buffers for the GPU. Still not sure how to resolve
>> >> > this as we don't use DRM for our GPU driver.
>> >>
>> >> any thoughts/plans about a DRM GPU driver? Ideally long term (esp.
>> >> once the dma-fence stuff is in place), we'd have gpu-specific drm
>> >> (gpu-only, no kms) driver, and SoC/display specific drm/kms driver,
>> >> using prime/dmabuf to share between the two.
>> >
>> > The "extra" buffers we were allocating from armsoc DDX were really
>> > being allocated through DRM/GEM so we could get an flink name
>> > for them and pass a reference to them back to our GPU driver on
>> > the client side. If it weren't for our need to access those
>> > extra off-screen buffers with the GPU we wouldn't need to
>> > allocate them with DRM at all. So, given they are really "GPU"
>> > buffers, it does absolutely make sense to allocate them in a
>> > different driver to the display driver.
>> >
>> > However, to avoid unnecessary memcpys & related cache
>> > maintenance ops, we'd also like the GPU to render into buffers
>> > which are scanned out by the display controller. So let's say
>> > we continue using DRM_IOCTL_MODE_CREATE_DUMB to allocate scan
>> > out buffers with the display's DRM driver but a custom ioctl
>> > on the GPU's DRM driver to allocate non scanout, off-screen
>> > buffers. Sounds great, but I don't think that really works
>> > with DRI2. If we used two drivers to allocate buffers, which
>> > of those drivers do we return in DRI2ConnectReply? Even if we
>> > solve that somehow, GEM flink names are name-spaced to a
>> > single device node (AFAIK). So when we do a DRI2GetBuffers,
>> > how does the EGL in the client know which DRM device owns GEM
>> > flink name "1234"? We'd need some pretty dirty hacks.
>>
>> You would return the name of the display driver allocating the
>> buffers. On the client side you can use generic ioctls to go from
>> flink -> handle -> dmabuf. So the client side would end up opening
>> both the display drm device and the gpu, but without needing to know
>> too much about the display.
>
> I think the bit I was missing was that a GEM bo for a buffer imported
> using dma_buf/PRIME can still be flink'd. So the display controller's
> DRM driver allocates scan-out buffers via the DUMB buffer allocate
> ioctl. Those scan-out buffers can then be exported from the
> display's DRM driver and imported into the GPU's DRM driver using
> PRIME. Once imported into the GPU's driver, we can use flink to get a
> name for that buffer within the GPU DRM driver's name-space to return
> to the DRI2 client. That same namespace is also what DRI2 back-buffers
> are allocated from, so I think that could work... Except...
>
(and.. the general direction is that things will move more to just use
dmabuf directly, ie. wayland or dri3)
>
>> > Anyway, that latter case also gets quite difficult. The "GPU"
>> > DRM driver would need to know the constraints of the display
>> > controller when allocating buffers intended to be scanned out.
>> > For example, pl111 typically isn't behind an IOMMU and so
>> > requires physically contiguous memory. We'd have to teach the
>> > GPU's DRM driver about the constraints of the display HW. Not
>> > exactly a clean driver model. :-(
>> >
>> > I'm still a little stuck on how to proceed, so any ideas
>> > would be greatly appreciated! My current train of thought is
>> > having a kind of SoC-specific DRM driver which allocates
>> > buffers for both display and GPU within a single GEM
>> > namespace. That SoC-specific DRM driver could then know the
>> > constraints of both the GPU and the display HW. We could then
>> > use PRIME to export buffers allocated with the SoC DRM driver
>> > and import them into the GPU and/or display DRM driver.
>>
>> Usually if the display drm driver is allocating the buffers that might
>> be scanned out, it just needs to have minimal knowledge of the GPU
>> (pitch alignment constraints). I don't think we need a 3rd device
>> just to allocate buffers.
>
> While Mali can render to pretty much any buffer, there is a mild
> performance improvement to be had if the buffer stride is aligned to
> the AXI bus's max burst length when drawing to the buffer.
I suspect the display controllers might frequently benefit if the
pitch is aligned to AXI burst length too..
> So in some respects, there is a constraint on how buffers which will
> be drawn to using the GPU are allocated. I don't really like the idea
> of teaching the display controller DRM driver about the GPU buffer
> constraints, even if they are fairly trivial like this. If the same
> display HW IP is being used on several SoCs, it seems wrong somehow
> to enforce those GPU constraints if some of those SoCs don't have a
> GPU.
Well, I suppose you could get min_pitch_alignment from devicetree, or
something like this..
In the end, the easy solution is just to make the display allocate to
the worst-case pitch alignment. In the early days of dma-buf
discussions, we kicked around the idea of negotiating or
programmatically describing the constraints, but that didn't really
seem like a bounded problem.
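A rough sketch of the worst-case approach, under stated assumptions: the helper names and the 8/64 byte alignment figures used with it are made up for illustration and are not taken from any driver. With power-of-two alignments, satisfying the largest one satisfies all of them:

```c
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: compute the pitch (bytes per scanline) that meets
 * the strictest of several power-of-two alignment requirements, e.g. one
 * from the display controller and one from the GPU.
 */
static uint32_t worst_case_pitch(uint32_t width, uint32_t bytes_per_pixel,
				 const uint32_t *aligns, size_t n)
{
	uint32_t worst = 1;
	size_t i;

	/* The largest power-of-two alignment is a multiple of all the others. */
	for (i = 0; i < n; i++)
		if (aligns[i] > worst)
			worst = aligns[i];

	return align_up(width * bytes_per_pixel, worst);
}
```

For a 1366 pixel wide, 4 bytes-per-pixel buffer with assumed alignments of 8 and 64 bytes, the unaligned pitch of 5464 bytes would be padded to 5504.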
> We may also then have additional constraints when sharing buffers
> between the display HW and video decode or even camera ISP HW.
> Programmatically describing buffer allocation constraints is very
> difficult and I'm not sure you can actually do it - there's some
> pretty complex constraints out there! E.g. I believe there's a
> platform where Y and UV planes of the reference frame need to be in
> separate DRAM banks for real-time 1080p decode, or something like
> that?
yes, this was discussed. This is different from pitch/format/size
constraints.. it is really just a placement constraint (ie. where do
the physical pages go). IIRC the conclusion was to use a dummy
device with its own CMA pool for attaching the Y vs UV buffers.
> Anyway, I guess my point is that even if we solve how to allocate
> buffers which will be shared between the GPU and display HW such that
> both sets of constraints are satisfied, that may not be the end of
> the story.
>
that was part of the reason to punt this problem to userspace ;-)
In practice, the kernel drivers don't usually know too much about
the dimensions/format/etc.. that is really userspace level knowledge.
There are a few exceptions when the kernel needs to know how to setup
GTT/etc for tiled buffers, but normally this sort of information is up
at the next level up (userspace, and drm_framebuffer in case of
scanout). Userspace media frameworks like GStreamer already have a
concept of format/caps negotiation. For non-display<->gpu sharing, I
think this is probably where this sort of constraint negotiation
should be handled.
BR,
-R
>
> Cheers,
>
> Tom
Hi Rob,
+linux-media, +linaro-mm-sig for discussion of video/camera
buffer constraints...
> On Fri, Jul 26, 2013 at 11:58 AM, Tom Cooksey <tom.cooksey(a)arm.com>
> wrote:
> >> > * It abuses flags parameter of DRM_IOCTL_MODE_CREATE_DUMB to also
> >> > allocate buffers for the GPU. Still not sure how to resolve
> >> > this as we don't use DRM for our GPU driver.
> >>
> >> any thoughts/plans about a DRM GPU driver? Ideally long term (esp.
> >> once the dma-fence stuff is in place), we'd have gpu-specific drm
> >> (gpu-only, no kms) driver, and SoC/display specific drm/kms driver,
> >> using prime/dmabuf to share between the two.
> >
> > The "extra" buffers we were allocating from armsoc DDX were really
> > being allocated through DRM/GEM so we could get an flink name
> > for them and pass a reference to them back to our GPU driver on
> > the client side. If it weren't for our need to access those
> > extra off-screen buffers with the GPU we wouldn't need to
> > allocate them with DRM at all. So, given they are really "GPU"
> > buffers, it does absolutely make sense to allocate them in a
> > different driver to the display driver.
> >
> > However, to avoid unnecessary memcpys & related cache
> > maintenance ops, we'd also like the GPU to render into buffers
> > which are scanned out by the display controller. So let's say
> > we continue using DRM_IOCTL_MODE_CREATE_DUMB to allocate scan
> > out buffers with the display's DRM driver but a custom ioctl
> > on the GPU's DRM driver to allocate non scanout, off-screen
> > buffers. Sounds great, but I don't think that really works
> > with DRI2. If we used two drivers to allocate buffers, which
> > of those drivers do we return in DRI2ConnectReply? Even if we
> > solve that somehow, GEM flink names are name-spaced to a
> > single device node (AFAIK). So when we do a DRI2GetBuffers,
> > how does the EGL in the client know which DRM device owns GEM
> > flink name "1234"? We'd need some pretty dirty hacks.
>
> You would return the name of the display driver allocating the
> buffers. On the client side you can use generic ioctls to go from
> flink -> handle -> dmabuf. So the client side would end up opening
> both the display drm device and the gpu, but without needing to know
> too much about the display.
I think the bit I was missing was that a GEM bo for a buffer imported
using dma_buf/PRIME can still be flink'd. So the display controller's
DRM driver allocates scan-out buffers via the DUMB buffer allocate
ioctl. Those scan-out buffers can then be exported from the
display's DRM driver and imported into the GPU's DRM driver using
PRIME. Once imported into the GPU's driver, we can use flink to get a
name for that buffer within the GPU DRM driver's name-space to return
to the DRI2 client. That same namespace is also what DRI2 back-buffers
are allocated from, so I think that could work... Except...
> > Anyway, that latter case also gets quite difficult. The "GPU"
> > DRM driver would need to know the constraints of the display
> > controller when allocating buffers intended to be scanned out.
> > For example, pl111 typically isn't behind an IOMMU and so
> > requires physically contiguous memory. We'd have to teach the
> > GPU's DRM driver about the constraints of the display HW. Not
> > exactly a clean driver model. :-(
> >
> > I'm still a little stuck on how to proceed, so any ideas
> > would be greatly appreciated! My current train of thought is
> > having a kind of SoC-specific DRM driver which allocates
> > buffers for both display and GPU within a single GEM
> > namespace. That SoC-specific DRM driver could then know the
> > constraints of both the GPU and the display HW. We could then
> > use PRIME to export buffers allocated with the SoC DRM driver
> > and import them into the GPU and/or display DRM driver.
>
> Usually if the display drm driver is allocating the buffers that might
> be scanned out, it just needs to have minimal knowledge of the GPU
> (pitch alignment constraints). I don't think we need a 3rd device
> just to allocate buffers.
While Mali can render to pretty much any buffer, there is a mild
performance improvement to be had if the buffer stride is aligned to
the AXI bus's max burst length when drawing to the buffer.
So in some respects, there is a constraint on how buffers which will
be drawn to using the GPU are allocated. I don't really like the idea
of teaching the display controller DRM driver about the GPU buffer
constraints, even if they are fairly trivial like this. If the same
display HW IP is being used on several SoCs, it seems wrong somehow
to enforce those GPU constraints if some of those SoCs don't have a
GPU.
We may also then have additional constraints when sharing buffers
between the display HW and video decode or even camera ISP HW.
Programmatically describing buffer allocation constraints is very
difficult and I'm not sure you can actually do it - there's some
pretty complex constraints out there! E.g. I believe there's a
platform where Y and UV planes of the reference frame need to be in
separate DRAM banks for real-time 1080p decode, or something like
that?
Anyway, I guess my point is that even if we solve how to allocate
buffers which will be shared between the GPU and display HW such that
both sets of constraints are satisfied, that may not be the end of
the story.
Cheers,
Tom
Hello,
This is the fourth version of my proposal for device tree integration for
reserved memory and the Contiguous Memory Allocator. After the comments from
Grant Likely I moved the memory region definitions back to the /memory node
(as it was in the first version of this proposal). I've also extended
the code and made it more generic, and added support for so-called reserved
dma memory (special dma memory regions created by the dma_alloc_coherent()
function, for the exclusive use of dma allocations for the given device).
Just a few words for those who see this code for the first time:
The proposed bindings allow contiguous memory regions of a specified
base address and size to be defined. The defined regions can then be
assigned to the given device(s) by adding a property with a phandle to
the defined contiguous memory region. From the device tree perspective
that's all. Once the bindings are added, all the memory allocations from
dma-mapping subsystem will be served from the defined contiguous memory
regions.
The Contiguous Memory Allocator is a framework that provides large
contiguous memory buffers for (usually multimedia) devices. The
contiguous memory is reserved during early boot and then shared with
the kernel, which is allowed to allocate it for movable pages. Then, when
a device driver requests a contiguous buffer, the framework migrates
movable pages out of the contiguous region and gives it to the driver. When
the device driver frees the buffer, it is added to the kernel memory pool again.
For more information, please refer to commit c64be2bb1c6eb43c838b2c6d57
("drivers: add Contiguous Memory Allocator") and d484864dd96e1830e76895
(CMA merge commit).
Why do we need device tree bindings for CMA at all?
Older ARM kernels used so-called board-based initialization. Those board
files contained a definition of all hardware blocks available on the
target system and particular kernel and driver software configuration
selected by the board maintainer.
In the new approach the board files will be removed completely and
the Device Tree is used to describe all hardware blocks available
on the target system. By definition, the bindings should be software
independent, so at least in theory it should be possible to use those
bindings with operating systems other than the Linux kernel.
Reserved memory configuration belongs to a grey area. It might depend
on hardware restrictions of the board or modules and on low-level
configuration done by the bootloader. However, putting reserved and
contiguous memory regions in the /memory node and having phandles to those
regions in the device nodes matches well with the typical device tree style
of linking devices with other resources like clocks, interrupts,
regulators, power domains, etc. This is the main reason to use this
approach instead of putting everything in the /chosen node as was
proposed in v2 and v3.
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
Changelog:
v4:
- corrected Device Tree mailing list address (resend)
- moved back contiguous-memory bindings from /chosen/contiguous-memory
to /memory nodes as suggested by Grant (see
http://article.gmane.org/gmane.linux.drivers.devicetree/41030
for more details)
- added support for DMA reserved memory with dma_declare_coherent()
- moved code to drivers/of/of_reserved_mem.c
- added generic code to scan specific path in flat device tree
v3: http://thread.gmane.org/gmane.linux.drivers.devicetree/40013/
- fixed issues pointed out by Laura and updated documentation
v2: http://thread.gmane.org/gmane.linux.drivers.devicetree/34075
- moved contiguous-memory bindings from /memory to /chosen/contiguous-memory/
node to avoid spreading Linux specific parameters over the whole device
tree definitions
- added support for autoconfigured regions (use zero base)
- fixed minor bugs
v1: http://thread.gmane.org/gmane.linux.drivers.devicetree/30111/
- initial proposal
Patch summary:
Marek Szyprowski (4):
drivers: dma-contiguous: clean source code and prepare for device
tree
drivers: of: add function to scan fdt nodes given by path
drivers: of: add initialization code for dma reserved memory
ARM: init: add support for reserved memory defined by device tree
Documentation/devicetree/bindings/memory.txt | 152 ++++++++++++++++++++++
arch/arm/mm/init.c | 3 +
drivers/base/dma-contiguous.c | 147 +++++++++++-----------
drivers/of/Kconfig | 6 +
drivers/of/Makefile | 1 +
drivers/of/fdt.c | 76 +++++++++++
drivers/of/of_reserved_mem.c | 175 ++++++++++++++++++++++++++
include/asm-generic/dma-coherent.h | 6 +
include/asm-generic/dma-contiguous.h | 2 -
include/linux/dma-contiguous.h | 49 +++++++-
include/linux/of_fdt.h | 3 +
11 files changed, 541 insertions(+), 79 deletions(-)
create mode 100644 Documentation/devicetree/bindings/memory.txt
create mode 100644 drivers/of/of_reserved_mem.c
--
1.7.9.5
From: "Luis R. Rodriguez" <mcgrof(a)do-not-panic.com>
This backports the kernel's wound/wait style locks (commit 040a0a371),
using linux-stable v3.11-rc2 as a base for development.
Given the complexity of supporting debugging mutexes, this backport
implementation is simplified by only making this feature available
if you have DEBUG_MUTEXES and DEBUG_LOCK_ALLOC disabled.
Given that ww mutex is required for DRM, this also means we must
update the kconfig for DRM so that DRM cannot be built
if you have either of these options enabled. Support for
DEBUG_MUTEXES and DEBUG_LOCK_ALLOC can be added later by anyone
daring. This uses the new dependencies file kconfig language
extension to specify the backport feature build restrictions
for DRM.
Part of the ww mutex addition to the kernel required modifying
the fast path mutex locking scheme by requiring you to deal
with the slow path alternatives on your own (refer to a41b56ef).
The reason for this change was that the mutex fastpath implementation
assumed your slowpath alternative can only be passed one argument
and the addition of ww mutexes requires dealing with the slow
path with a context passed.
It'd be painful to backport all asm for an optimized fastpath
implementation so we penalize the backport ww mutex fast path
by using the generic atomic_dec_return().
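As a rough userspace illustration of what such a decrement-based fast path looks like (a sketch only, not the code in this patch): the sketch_* names are hypothetical, C11 atomics stand in for the kernel's atomic_t, and the counter convention of 1 for unlocked, 0 for locked and negative for contended is assumed from the kernel's generic mutex code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Assumed counter convention:
 *    1  unlocked
 *    0  locked, no waiters
 *   <0  locked, possibly with waiters
 */
struct sketch_mutex {
	atomic_int count;
};

/* Stand-in for the kernel's atomic_dec_return(): returns the new value. */
static int sketch_atomic_dec_return(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1;
}

/*
 * Returns true if the lock was taken on the fast path. On false the
 * caller runs its own slow path, which is then free to take extra
 * arguments such as an acquire context.
 */
static bool sketch_fastpath_lock(struct sketch_mutex *m)
{
	return sketch_atomic_dec_return(&m->count) >= 0;
}
```

Reporting only success or failure, rather than calling a fixed one-argument slow path function, is what lets the caller pass a context to the slow path.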
To backport our own clean mutex_lock_common() with the least
amount of changes against upstream, commits 2bd2c92c and 41fcb9f2
also needed to be backported. Commit 2bd2c92c added
support for queuing mutex spinners with an MCS lock; since this
cannot be backported for older kernels we provide empty inlines.
Commit 41fcb9f2 just removed SCHED_FEAT_OWNER_SPIN as it was an
early hack; the only thing required to backport this commit was
to provide an alternative declaration for mutex_spin_on_owner()
as it was declared non-inline for older kernels.
Finally, c5491ea7 required backporting schedule_preempt_disabled()
as well, but that just consisted of carrying over the original
implementation. Since it's not exported we need to reimplement
it to make it available to our internal core ww mutex port.
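For readers new to wound/wait, the rule that decides who backs off can be sketched in plain C. This is a simplified illustration and not the backported implementation: struct sketch_ctx and sketch_contend() are hypothetical names, and stamp wraparound is ignored. A smaller stamp is assumed to mean an older context.

```c
#include <errno.h>

/* Simplified stand-in for an acquire context: only the ticket matters here. */
struct sketch_ctx {
	unsigned long stamp;	/* smaller stamp == older context */
};

/*
 * Decide what a context does when it hits a lock held by a context.
 * Returns 0 if the requester is older and should wait, -EDEADLK if it
 * is younger and must back off (it is wounded), and -EALREADY if the
 * requester already holds the lock itself.
 */
static int sketch_contend(const struct sketch_ctx *holder,
			  const struct sketch_ctx *requester)
{
	if (holder == requester)
		return -EALREADY;
	if (requester->stamp > holder->stamp)
		return -EDEADLK;
	return 0;
}
```

On -EDEADLK the caller is expected to drop every lock it holds in that context before retrying on the contended lock.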
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 040a0a371
v3.11-rc1~147^2~5
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains a41b56ef
v3.11-rc1~147^2~6
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 2bd2c92c
v3.10-rc1~200^2~3
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 41fcb9f2
v3.10-rc1~200^2~5
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains c5491ea7
v3.4-rc1~3^2~27
commit 040a0a37100563754bb1fee6ff6427420bcfa609
Author: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Date: Mon Jun 24 10:30:04 2013 +0200
mutex: Add support for wound/wait style locks
Wound/wait mutexes are used when other multiple lock
acquisitions of a similar type can be done in an arbitrary
order. The deadlock handling used here is called wait/wound in
the RDBMS literature: The older tasks waits until it can acquire
the contended lock. The younger tasks needs to back off and drop
all the locks it is currently holding, i.e. the younger task is
wounded.
For full documentation please read Documentation/ww-mutex-design.txt.
References: https://lwn.net/Articles/548909/
Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Acked-by: Rob Clark <robdclark(a)gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/51C8038C.9000106@canonical.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
commit a41b56efa70e060f650aeb54740aaf52044a1ead
Author: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Date: Thu Jun 20 13:31:05 2013 +0200
arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
This will allow me to call functions that have multiple
arguments if fastpath fails. This is required to support ticket
mutexes, because they need to be able to pass an extra argument
to the fail function.
Originally I duplicated the functions, by adding
__mutex_fastpath_lock_retval_arg. This ended up being just a
duplication of the existing function, so a way to test if
fastpath was called ended up being better.
This also cleaned up the reservation mutex patch some by being
able to call an atomic_set instead of atomic_xchg, and making it
easier to detect if the wrong unlock function was previously
used.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Acked-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: robclark(a)gmail.com
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
commit 2bd2c92cf07cc4a373bf316c75b78ac465fefd35
Author: Waiman Long <Waiman.Long(a)hp.com>
Date: Wed Apr 17 15:23:13 2013 -0400
mutex: Queue mutex spinners with MCS lock to reduce cacheline contention
<-- snip -->
commit 41fcb9f230bf773656d1768b73000ef720bf00c3
Author: Waiman Long <Waiman.Long(a)hp.com>
Date: Wed Apr 17 15:23:11 2013 -0400
mutex: Move mutex spinning code from sched/core.c back to mutex.c
<-- snip -->
commit c5491ea779793f977d282754db478157cc409d82
Author: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon Mar 21 12:09:35 2011 +0100
sched/rt: Add schedule_preempt_disabled()
<-- snip -->
Cc: maarten.lankhorst(a)canonical.com
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Luis R. Rodriguez <mcgrof(a)do-not-panic.com>
---
backport/backport-include/linux/ww_mutex.h | 333 ++++++++++++++
backport/compat/Kconfig | 11 +
backport/compat/Makefile | 1 +
backport/compat/kernel/ww_mutex.c | 667 ++++++++++++++++++++++++++++
dependencies | 6 +
5 files changed, 1018 insertions(+)
create mode 100644 backport/backport-include/linux/ww_mutex.h
create mode 100644 backport/compat/kernel/ww_mutex.c
diff --git a/backport/backport-include/linux/ww_mutex.h b/backport/backport-include/linux/ww_mutex.h
new file mode 100644
index 0000000..0953939
--- /dev/null
+++ b/backport/backport-include/linux/ww_mutex.h
@@ -0,0 +1,333 @@
+#ifndef __BACKPORT_LINUX_WW_MUTEX_H
+#define __BACKPORT_LINUX_WW_MUTEX_H
+
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,11,0)
+#include_next <linux/ww_mutex.h>
+#else
+#ifdef CPTCFG_BACKPORT_BUILD_WW_MUTEX
+/*
+ * Wound/Wait Mutexes: blocking mutual exclusion locks with deadlock avoidance
+ *
+ * Original mutex implementation started by Ingo Molnar:
+ *
+ * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo(a)redhat.com>
+ *
+ * Wound/wait implementation:
+ * Copyright (C) 2013 Canonical Ltd.
+ *
+ * This file contains the main data structure and API definitions.
+ */
+
+#include <linux/mutex.h>
+
+struct ww_class {
+ atomic_long_t stamp;
+ struct lock_class_key acquire_key;
+ struct lock_class_key mutex_key;
+ const char *acquire_name;
+ const char *mutex_name;
+};
+
+struct ww_acquire_ctx {
+ struct task_struct *task;
+ unsigned long stamp;
+ unsigned acquired;
+};
+
+struct ww_mutex {
+ struct mutex base;
+ struct ww_acquire_ctx *ctx;
+};
+
+# define __WW_CLASS_MUTEX_INITIALIZER(lockname, ww_class)
+
+#define __WW_CLASS_INITIALIZER(ww_class) \
+ { .stamp = ATOMIC_LONG_INIT(0) \
+ , .acquire_name = #ww_class "_acquire" \
+ , .mutex_name = #ww_class "_mutex" }
+
+#define __WW_MUTEX_INITIALIZER(lockname, class) \
+ { .base = __MUTEX_INITIALIZER(lockname.base) \
+ __WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
+
+#define DEFINE_WW_CLASS(classname) \
+ struct ww_class classname = __WW_CLASS_INITIALIZER(classname)
+
+#define DEFINE_WW_MUTEX(mutexname, ww_class) \
+ struct ww_mutex mutexname = __WW_MUTEX_INITIALIZER(mutexname, ww_class)
+
+/**
+ * ww_mutex_init - initialize the w/w mutex
+ * @lock: the mutex to be initialized
+ * @ww_class: the w/w class the mutex should belong to
+ *
+ * Initialize the w/w mutex to unlocked state and associate it with the given
+ * class.
+ *
+ * It is not allowed to initialize an already locked mutex.
+ */
+#define ww_mutex_init LINUX_BACKPORT(ww_mutex_init)
+static inline void ww_mutex_init(struct ww_mutex *lock,
+ struct ww_class *ww_class)
+{
+ __mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
+ lock->ctx = NULL;
+}
+
+/**
+ * ww_acquire_init - initialize a w/w acquire context
+ * @ctx: w/w acquire context to initialize
+ * @ww_class: w/w class of the context
+ *
+ * Initializes a context to acquire multiple mutexes of the given w/w class.
+ *
+ * Context-based w/w mutex acquiring can be done in any order whatsoever within
+ * a given lock class. Deadlocks will be detected and handled with the
+ * wait/wound logic.
+ *
+ * Mixing of context-based w/w mutex acquiring and single w/w mutex locking can
+ * result in undetected deadlocks and is therefore forbidden. Mixing different contexts
+ * for the same w/w class when acquiring mutexes can also result in undetected
+ * deadlocks, and is hence also forbidden. Both types of abuse will be caught by
+ * enabling CONFIG_PROVE_LOCKING.
+ *
+ * Nesting of acquire contexts for _different_ w/w classes is possible, subject
+ * to the usual locking rules between different lock classes.
+ *
+ * An acquire context must be released with ww_acquire_fini by the same task
+ * before the memory is freed. It is recommended to allocate the context itself
+ * on the stack.
+ */
+#define ww_acquire_init LINUX_BACKPORT(ww_acquire_init)
+static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
+ struct ww_class *ww_class)
+{
+ ctx->task = current;
+ ctx->stamp = atomic_long_inc_return(&ww_class->stamp);
+ ctx->acquired = 0;
+}
+
+/**
+ * ww_acquire_done - marks the end of the acquire phase
+ * @ctx: the acquire context
+ *
+ * Marks the end of the acquire phase, any further w/w mutex lock calls using
+ * this context are forbidden.
+ *
+ * Calling this function is optional, it is just useful to document w/w mutex
+ * code and clearly separate the acquire phase from actually using the locked
+ * data structures.
+ */
+#define ww_acquire_done LINUX_BACKPORT(ww_acquire_done)
+static inline void ww_acquire_done(struct ww_acquire_ctx *ctx)
+{
+}
+
+/**
+ * ww_acquire_fini - releases a w/w acquire context
+ * @ctx: the acquire context to free
+ *
+ * Releases a w/w acquire context. This must be called _after_ all acquired w/w
+ * mutexes have been released with ww_mutex_unlock.
+ */
+#define ww_acquire_fini LINUX_BACKPORT(ww_acquire_fini)
+static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
+{
+}
+
+#define __ww_mutex_lock LINUX_BACKPORT(__ww_mutex_lock)
+extern int __must_check __ww_mutex_lock(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx);
+#define __ww_mutex_lock_interruptible LINUX_BACKPORT(__ww_mutex_lock_interruptible)
+extern int __must_check __ww_mutex_lock_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx);
+
+/**
+ * ww_mutex_lock - acquire the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context, or NULL to acquire only a single lock.
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with the
+ * wait/wound algorithm. If the lock isn't immediately available this function
+ * will either sleep until it is (wait case) or select the current context
+ * for backing off by returning -EDEADLK (wound case). Trying to acquire the
+ * same lock with the same context twice is also detected and signalled by
+ * returning -EALREADY. Returns 0 if the mutex was successfully acquired.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes for
+ * the given context and then wait for this contending lock to be available by
+ * calling ww_mutex_lock_slow. Alternatively callers can opt to not acquire this
+ * lock and proceed with trying to acquire further w/w mutexes (e.g. when
+ * scanning through lru lists trying to free resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex still
+ * locked. The mutex must first be initialized (or statically defined) before it
+ * can be locked. memset()-ing the mutex to 0 is not allowed. The mutex must be
+ * of the same w/w lock class as was used to initialize the acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+#define ww_mutex_lock LINUX_BACKPORT(ww_mutex_lock)
+static inline int ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ if (ctx)
+ return __ww_mutex_lock(lock, ctx);
+
+ mutex_lock(&lock->base);
+ return 0;
+}
+
+/**
+ * ww_mutex_lock_interruptible - acquire the w/w mutex, interruptible
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with the
+ * wait/wound algorithm. If the lock isn't immediately available this function
+ * will either sleep until it is (wait case) or select the current context
+ * for backing off by returning -EDEADLK (wound case). Trying to acquire the
+ * same lock with the same context twice is also detected and signalled by
+ * returning -EALREADY. Returns 0 if the mutex was successfully acquired. If a
+ * signal arrives while waiting for the lock then this function returns -EINTR.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes for
+ * the given context and then wait for this contending lock to be available by
+ * calling ww_mutex_lock_slow_interruptible. Alternatively callers can opt to
+ * not acquire this lock and proceed with trying to acquire further w/w mutexes
+ * (e.g. when scanning through lru lists trying to free resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex still
+ * locked. The mutex must first be initialized (or statically defined) before it
+ * can be locked. memset()-ing the mutex to 0 is not allowed. The mutex must be
+ * of the same w/w lock class as was used to initialize the acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+#define ww_mutex_lock_interruptible LINUX_BACKPORT(ww_mutex_lock_interruptible)
+static inline int __must_check ww_mutex_lock_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ if (ctx)
+ return __ww_mutex_lock_interruptible(lock, ctx);
+ else
+ return mutex_lock_interruptible(&lock->base);
+}
+
+/**
+ * ww_mutex_lock_slow - slowpath acquiring of the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Acquires a w/w mutex with the given context after a wound case. This function
+ * will sleep until the lock becomes available.
+ *
+ * The caller must have released all w/w mutexes already acquired with the
+ * context and then call this function on the contended lock.
+ *
+ * Afterwards the caller may continue to (re)acquire the other w/w mutexes it
+ * needs with ww_mutex_lock. Note that the -EALREADY return code from
+ * ww_mutex_lock can be used to avoid locking this contended mutex twice.
+ *
+ * It is forbidden to call this function with any other w/w mutexes associated
+ * with the context held. It is forbidden to call this on anything else than the
+ * contending mutex.
+ *
+ * Note that the slowpath lock acquiring can also be done by calling
+ * ww_mutex_lock directly. This function here is simply to help w/w mutex
+ * locking code readability by clearly denoting the slowpath.
+ */
+#define ww_mutex_lock_slow LINUX_BACKPORT(ww_mutex_lock_slow)
+static inline void
+ww_mutex_lock_slow(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+ ret = ww_mutex_lock(lock, ctx);
+ (void)ret;
+}
+
+/**
+ * ww_mutex_lock_slow_interruptible - slowpath acquiring of the w/w mutex, interruptible
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Acquires a w/w mutex with the given context after a wound case. This function
+ * will sleep until the lock becomes available and returns 0 when the lock has
+ * been acquired. If a signal arrives while waiting for the lock then this
+ * function returns -EINTR.
+ *
+ * The caller must have released all w/w mutexes already acquired with the
+ * context and then call this function on the contended lock.
+ *
+ * Afterwards the caller may continue to (re)acquire the other w/w mutexes it
+ * needs with ww_mutex_lock. Note that the -EALREADY return code from
+ * ww_mutex_lock can be used to avoid locking this contended mutex twice.
+ *
+ * It is forbidden to call this function with any other w/w mutexes associated
+ * with the given context held. It is forbidden to call this on anything else
+ * than the contending mutex.
+ *
+ * Note that the slowpath lock acquiring can also be done by calling
+ * ww_mutex_lock_interruptible directly. This function here is simply to help
+ * w/w mutex locking code readability by clearly denoting the slowpath.
+ */
+#define ww_mutex_lock_slow_interruptible LINUX_BACKPORT(ww_mutex_lock_slow_interruptible)
+static inline int __must_check
+ww_mutex_lock_slow_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ return ww_mutex_lock_interruptible(lock, ctx);
+}
+
+#define ww_mutex_unlock LINUX_BACKPORT(ww_mutex_unlock)
+extern void ww_mutex_unlock(struct ww_mutex *lock);
+
+/**
+ * ww_mutex_trylock - tries to acquire the w/w mutex without acquire context
+ * @lock: mutex to lock
+ *
+ * Trylocks a mutex without acquire context, so no deadlock detection is
+ * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise.
+ */
+#define ww_mutex_trylock LINUX_BACKPORT(ww_mutex_trylock)
+static inline int __must_check ww_mutex_trylock(struct ww_mutex *lock)
+{
+ return mutex_trylock(&lock->base);
+}
+
+/**
+ * ww_mutex_destroy - mark a w/w mutex unusable
+ * @lock: the mutex to be destroyed
+ *
+ * This function marks the mutex uninitialized, and any subsequent
+ * use of the mutex is forbidden. The mutex must not be locked when
+ * this function is called.
+ */
+#define ww_mutex_destroy LINUX_BACKPORT(ww_mutex_destroy)
+static inline void ww_mutex_destroy(struct ww_mutex *lock)
+{
+ mutex_destroy(&lock->base);
+}
+
+/**
+ * ww_mutex_is_locked - is the w/w mutex locked
+ * @lock: the mutex to be queried
+ *
+ * Returns true if the mutex is locked, false if unlocked.
+ */
+#define ww_mutex_is_locked LINUX_BACKPORT(ww_mutex_is_locked)
+static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
+{
+ return mutex_is_locked(&lock->base);
+}
+
+#endif /* CPTCFG_BACKPORT_BUILD_WW_MUTEX */
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(3,11,0) */
+#endif /* __BACKPORT_LINUX_WW_MUTEX_H */
diff --git a/backport/compat/Kconfig b/backport/compat/Kconfig
index e2f0cdd..f3c1ab3 100644
--- a/backport/compat/Kconfig
+++ b/backport/compat/Kconfig
@@ -185,6 +185,17 @@ config BACKPORT_LEDS_CLASS
config BACKPORT_LEDS_TRIGGERS
bool
+config BACKPORT_BUILD_WW_MUTEX
+ bool
+ # Build only if on kernels < 3.11
+ # For now only DRM drivers use ww mutexes.
+ depends on DRM && BACKPORT_KERNEL_3_11
+ default y if BACKPORT_USERSEL_BUILD_ALL
+ # probably a bad idea if you have these options given we
+ # ripped those options out.
+ depends on !DEBUG_MUTEXES
+ depends on !DEBUG_LOCK_ALLOC
+
config BACKPORT_BUILD_RADIX_HELPERS
bool
# You have selected to build backported DRM drivers
diff --git a/backport/compat/Makefile b/backport/compat/Makefile
index 252290e..fec01c4 100644
--- a/backport/compat/Makefile
+++ b/backport/compat/Makefile
@@ -41,3 +41,4 @@ compat-$(CPTCFG_BACKPORT_BUILD_KFIFO) += kfifo.o
compat-$(CPTCFG_BACKPORT_BUILD_GENERIC_ATOMIC64) += compat_atomic.o
compat-$(CPTCFG_BACKPORT_BUILD_DMA_SHARED_HELPERS) += dma-shared-helpers.o
compat-$(CPTCFG_BACKPORT_BUILD_RADIX_HELPERS) += lib-radix-tree-helpers.o
+compat-$(CPTCFG_BACKPORT_BUILD_WW_MUTEX) += kernel/ww_mutex.o
diff --git a/backport/compat/kernel/ww_mutex.c b/backport/compat/kernel/ww_mutex.c
new file mode 100644
index 0000000..257c2a4
--- /dev/null
+++ b/backport/compat/kernel/ww_mutex.c
@@ -0,0 +1,667 @@
+/*
+ * Copyright (c) 2013 Luis R. Rodriguez <mcgrof(a)do-not-panic.com>
+ *
+ * Backport ww mutex for older kernels. This is not supported when
+ * DEBUG_MUTEXES or DEBUG_LOCK_ALLOC is enabled.
+ *
+ * Taken from: kernel/mutex.c - via linux-stable v3.11-rc2
+ *
+ * Mutexes: blocking mutual exclusion locks
+ *
+ * Started by Ingo Molnar:
+ *
+ * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo(a)redhat.com>
+ *
+ * Many thanks to Arjan van de Ven, Thomas Gleixner, Steven Rostedt and
+ * David Howells for suggestions and improvements.
+ *
+ * - Adaptive spinning for mutexes by Peter Zijlstra. (Ported to mainline
+ * from the -rt tree, where it was originally implemented for rtmutexes
+ * by Steven Rostedt, based on work by Gregory Haskins, Peter Morreale
+ * and Sven Dietrich.)
+ *
+ * Also see Documentation/mutex-design.txt.
+ */
+
+#include <linux/mutex.h>
+#include <linux/ww_mutex.h>
+#include <asm/mutex.h>
+#include <linux/sched.h>
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,9,0)
+#include <linux/sched/rt.h>
+#endif
+#include <linux/export.h>
+#include <linux/spinlock.h>
+#include <linux/interrupt.h>
+#include <linux/debug_locks.h>
+#include <linux/version.h>
+
+/*
+ * A negative mutex count indicates that waiters are sleeping waiting for the
+ * mutex.
+ */
+#define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
+
+#define spin_lock_mutex(lock, flags) \
+ do { spin_lock(lock); (void)(flags); } while (0)
+#define spin_unlock_mutex(lock, flags) \
+ do { spin_unlock(lock); (void)(flags); } while (0)
+#define mutex_remove_waiter(lock, waiter, ti) \
+ __list_del((waiter)->list.prev, (waiter)->list.next)
+
+#ifdef CONFIG_SMP
+static inline void mutex_set_owner(struct mutex *lock)
+{
+ lock->owner = current;
+}
+
+static inline void mutex_clear_owner(struct mutex *lock)
+{
+ lock->owner = NULL;
+}
+#else
+static inline void mutex_set_owner(struct mutex *lock)
+{
+}
+
+static inline void mutex_clear_owner(struct mutex *lock)
+{
+}
+#endif
+
+
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0) /* 2bd2c92c and 41fcb9f2 */
+/*
+ * In order to avoid a stampede of mutex spinners from acquiring the mutex
+ * more or less simultaneously, the spinners need to acquire a MCS lock
+ * first before spinning on the owner field.
+ *
+ * We don't inline mspin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+struct mspin_node {
+ struct mspin_node *next ;
+ int locked; /* 1 if lock acquired */
+};
+#define MLOCK(mutex) ((struct mspin_node **)&((mutex)->spin_mlock))
+
+static noinline
+void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
+{
+ struct mspin_node *prev;
+
+ /* Init node */
+ node->locked = 0;
+ node->next = NULL;
+
+ prev = xchg(lock, node);
+ if (likely(prev == NULL)) {
+ /* Lock acquired */
+ node->locked = 1;
+ return;
+ }
+ ACCESS_ONCE(prev->next) = node;
+ smp_wmb();
+ /* Wait until the lock holder passes the lock down */
+ while (!ACCESS_ONCE(node->locked))
+ arch_mutex_cpu_relax();
+}
+
+static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
+{
+ struct mspin_node *next = ACCESS_ONCE(node->next);
+
+ if (likely(!next)) {
+ /*
+ * Release the lock by setting it to NULL
+ */
+ if (cmpxchg(lock, node, NULL) == node)
+ return;
+ /* Wait until the next pointer is set */
+ while (!(next = ACCESS_ONCE(node->next)))
+ arch_mutex_cpu_relax();
+ }
+ ACCESS_ONCE(next->locked) = 1;
+ smp_wmb();
+}
+
+/*
+ * Mutex spinning code migrated from kernel/sched/core.c
+ */
+
+static inline bool owner_running(struct mutex *lock, struct task_struct *owner)
+{
+ if (lock->owner != owner)
+ return false;
+
+ /*
+ * Ensure we emit the owner->on_cpu, dereference _after_ checking
+ * lock->owner still matches owner, if that fails, owner might
+ * point to free()d memory, if it still matches, the rcu_read_lock()
+ * ensures the memory stays valid.
+ */
+ barrier();
+
+ return owner->on_cpu;
+}
+
+/*
+ * Look out! "owner" is an entirely speculative pointer
+ * access and not reliable.
+ */
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0)
+static noinline
+#endif
+int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
+{
+ rcu_read_lock();
+ while (owner_running(lock, owner)) {
+ if (need_resched())
+ break;
+
+ arch_mutex_cpu_relax();
+ }
+ rcu_read_unlock();
+
+ /*
+ * We break out the loop above on need_resched() and when the
+ * owner changed, which is a sign for heavy contention. Return
+ * success only when lock->owner is NULL.
+ */
+ return lock->owner == NULL;
+}
+
+/*
+ * Initial check for entering the mutex spinning loop
+ */
+static inline int mutex_can_spin_on_owner(struct mutex *lock)
+{
+ int retval = 1;
+
+ rcu_read_lock();
+ if (lock->owner)
+ retval = lock->owner->on_cpu;
+ rcu_read_unlock();
+ /*
+ * if lock->owner is not set, the mutex owner may have just acquired
+ * it and not set the owner yet or the mutex has been released.
+ */
+ return retval;
+}
+#else /* Backport 2bd2c92c: help keep backport_mutex_lock_common() clean */
+
+struct mspin_node {
+};
+#define MLOCK(mutex) NULL
+
+static noinline
+void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
+{
+}
+
+static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
+{
+}
+
+static inline bool owner_running(struct mutex *lock, struct task_struct *owner)
+{
+}
+
+int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
+{
+}
+
+static inline int mutex_can_spin_on_owner(struct mutex *lock)
+{
+ return 1;
+}
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0) */
+#endif /* CONFIG_MUTEX_SPIN_ON_OWNER */
+
+/*
+ * Release the lock, slowpath:
+ */
+static inline void
+__mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
+{
+ struct mutex *lock = container_of(lock_count, struct mutex, count);
+ unsigned long flags;
+
+ spin_lock_mutex(&lock->wait_lock, flags);
+ mutex_release(&lock->dep_map, nested, _RET_IP_);
+ /* debug_mutex_unlock(lock); */
+
+ /*
+ * some architectures leave the lock unlocked in the fastpath failure
+ * case, others need to leave it locked. In the latter case we have to
+ * unlock it here
+ */
+ if (__mutex_slowpath_needs_to_unlock())
+ atomic_set(&lock->count, 1);
+
+ if (!list_empty(&lock->wait_list)) {
+ /* get the first entry from the wait-list: */
+ struct mutex_waiter *waiter =
+ list_entry(lock->wait_list.next,
+ struct mutex_waiter, list);
+
+ /* debug_mutex_wake_waiter(lock, waiter); */
+
+ wake_up_process(waiter->task);
+ }
+
+ spin_unlock_mutex(&lock->wait_lock, flags);
+}
+
+/*
+ * Release the lock, slowpath:
+ */
+static __used noinline void
+__mutex_unlock_slowpath(atomic_t *lock_count)
+{
+ __mutex_unlock_common_slowpath(lock_count, 1);
+}
+
+/**
+ * ww_mutex_unlock - release the w/w mutex
+ * @lock: the mutex to be released
+ *
+ * Unlock a mutex that has been locked by this task previously with any of the
+ * ww_mutex_lock* functions (with or without an acquire context). It is
+ * forbidden to release the locks after releasing the acquire context.
+ *
+ * This function must not be used in interrupt context. Unlocking
+ * of an unlocked mutex is not allowed.
+ */
+void __sched ww_mutex_unlock(struct ww_mutex *lock)
+{
+ /*
+ * The unlocking fastpath is the 0->1 transition from 'locked'
+ * into 'unlocked' state:
+ */
+ if (lock->ctx) {
+ if (lock->ctx->acquired > 0)
+ lock->ctx->acquired--;
+ lock->ctx = NULL;
+ }
+
+ __mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
+}
+EXPORT_SYMBOL_GPL(ww_mutex_unlock);
+
+static inline int __sched
+__mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
+ struct ww_acquire_ctx *hold_ctx = ACCESS_ONCE(ww->ctx);
+
+ if (!hold_ctx)
+ return 0;
+
+ if (unlikely(ctx == hold_ctx))
+ return -EALREADY;
+
+ if (ctx->stamp - hold_ctx->stamp <= LONG_MAX &&
+ (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) {
+ return -EDEADLK;
+ }
+
+ return 0;
+}
+
+static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ ww_ctx->acquired++;
+}
+
+/*
+ * after acquiring lock with fastpath or when we lost out in contested
+ * slowpath, set ctx and wake up any waiters so they can recheck.
+ *
+ * This function is never called when CONFIG_DEBUG_LOCK_ALLOC is set,
+ * as the fastpath and opportunistic spinning are disabled in that case.
+ */
+static __always_inline void
+ww_mutex_set_context_fastpath(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ unsigned long flags;
+ struct mutex_waiter *cur;
+
+ ww_mutex_lock_acquired(lock, ctx);
+
+ lock->ctx = ctx;
+
+ /*
+ * The lock->ctx update should be visible on all cores before
+ * the atomic read is done, otherwise contended waiters might be
+ * missed. The contended waiters will either see ww_ctx == NULL
+ * and keep spinning, or it will acquire wait_lock, add itself
+ * to waiter list and sleep.
+ */
+ smp_mb(); /* ^^^ */
+
+ /*
+ * Check if lock is contended, if not there is nobody to wake up
+ */
+ if (likely(atomic_read(&lock->base.count) == 0))
+ return;
+
+ /*
+ * Uh oh, we raced in fastpath, wake up everyone in this case,
+ * so they can see the new lock->ctx.
+ */
+ spin_lock_mutex(&lock->base.wait_lock, flags);
+ list_for_each_entry(cur, &lock->base.wait_list, list) {
+ /* debug_mutex_wake_waiter(&lock->base, cur); */
+ wake_up_process(cur->task);
+ }
+ spin_unlock_mutex(&lock->base.wait_lock, flags);
+}
+
+/**
+ * backport_schedule_preempt_disabled - called with preemption disabled
+ *
+ * Backports c5491ea7. This is not exported so we leave it
+ * here as this is the only current core user on backports.
+ * Although available on >= 3.4 it's only for in-kernel code so
+ * we provide our own.
+ *
+ * Returns with preemption disabled. Note: preempt_count must be 1
+ */
+static void __sched backport_schedule_preempt_disabled(void)
+{
+ preempt_enable_no_resched();
+ schedule();
+ preempt_disable();
+}
+
+/*
+ * Lock a mutex (possibly interruptible), slowpath:
+ */
+static __always_inline int __sched
+__backport_mutex_lock_common(struct mutex *lock, long state,
+ unsigned int subclass,
+ struct lockdep_map *nest_lock, unsigned long ip,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ struct task_struct *task = current;
+ struct mutex_waiter waiter;
+ unsigned long flags;
+ int ret;
+
+ preempt_disable();
+ mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
+
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
+ /*
+ * Optimistic spinning.
+ *
+ * We try to spin for acquisition when we find that there are no
+ * pending waiters and the lock owner is currently running on a
+ * (different) CPU.
+ *
+ * The rationale is that if the lock owner is running, it is likely to
+ * release the lock soon.
+ *
+ * Since this needs the lock owner, and this mutex implementation
+ * doesn't track the owner atomically in the lock field, we need to
+ * track it non-atomically.
+ *
+ * We can't do this for DEBUG_MUTEXES because that relies on wait_lock
+ * to serialize everything.
+ *
+ * The mutex spinners are queued up using MCS lock so that only one
+ * spinner can compete for the mutex. However, if mutex spinning isn't
+ * going to happen, there is no point in going through the lock/unlock
+ * overhead.
+ */
+ if (!mutex_can_spin_on_owner(lock))
+ goto slowpath;
+
+ for (;;) {
+ struct task_struct *owner;
+ struct mspin_node node;
+
+ if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+ struct ww_mutex *ww;
+
+ ww = container_of(lock, struct ww_mutex, base);
+ /*
+ * If ww->ctx is set the contents are undefined, only
+ * by acquiring wait_lock there is a guarantee that
+ * they are not invalid when reading.
+ *
+ * As such, when deadlock detection needs to be
+ * performed the optimistic spinning cannot be done.
+ */
+ if (ACCESS_ONCE(ww->ctx))
+ break;
+ }
+
+ /*
+ * If there's an owner, wait for it to either
+ * release the lock or go to sleep.
+ */
+ mspin_lock(MLOCK(lock), &node);
+ owner = ACCESS_ONCE(lock->owner);
+ if (owner && !mutex_spin_on_owner(lock, owner)) {
+ mspin_unlock(MLOCK(lock), &node);
+ break;
+ }
+
+ if ((atomic_read(&lock->count) == 1) &&
+ (atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
+ lock_acquired(&lock->dep_map, ip);
+ if (!__builtin_constant_p(ww_ctx == NULL)) {
+ struct ww_mutex *ww;
+ ww = container_of(lock, struct ww_mutex, base);
+
+ ww_mutex_set_context_fastpath(ww, ww_ctx);
+ }
+
+ mutex_set_owner(lock);
+ mspin_unlock(MLOCK(lock), &node);
+ preempt_enable();
+ return 0;
+ }
+ mspin_unlock(MLOCK(lock), &node);
+
+ /*
+ * When there's no owner, we might have preempted between the
+ * owner acquiring the lock and setting the owner field. If
+ * we're an RT task that will live-lock because we won't let
+ * the owner complete.
+ */
+ if (!owner && (need_resched() || rt_task(task)))
+ break;
+
+ /*
+ * The cpu_relax() call is a compiler barrier which forces
+ * everything in this loop to be re-loaded. We don't need
+ * memory barriers as we'll eventually observe the right
+ * values at the cost of a few extra spins.
+ */
+ arch_mutex_cpu_relax();
+ }
+slowpath:
+#endif
+ spin_lock_mutex(&lock->wait_lock, flags);
+
+ /* We don't support DEBUG_MUTEXES on the backport */
+ /* debug_mutex_lock_common(lock, &waiter); */
+ /* debug_mutex_add_waiter(lock, &waiter, task_thread_info(task)); */
+
+ /* add waiting tasks to the end of the waitqueue (FIFO): */
+ list_add_tail(&waiter.list, &lock->wait_list);
+ waiter.task = task;
+
+ if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
+ goto done;
+
+ lock_contended(&lock->dep_map, ip);
+
+ for (;;) {
+ /*
+ * Let's try to take the lock again - this is needed even if
+ * we get here for the first time (shortly after failing to
+ * acquire the lock), to make sure that we get a wakeup once
+ * it's unlocked. Later on, if we sleep, this is the
+ * operation that gives us the lock. We xchg it to -1, so
+ * that when we release the lock, we properly wake up the
+ * other waiters:
+ */
+ if (MUTEX_SHOW_NO_WAITER(lock) &&
+ (atomic_xchg(&lock->count, -1) == 1))
+ break;
+
+ /*
+ * got a signal? (This code gets eliminated in the
+ * TASK_UNINTERRUPTIBLE case.)
+ */
+ if (unlikely(signal_pending_state(state, task))) {
+ ret = -EINTR;
+ goto err;
+ }
+
+ if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+ ret = __mutex_lock_check_stamp(lock, ww_ctx);
+ if (ret)
+ goto err;
+ }
+
+ __set_task_state(task, state);
+
+ /* didn't get the lock, go to sleep: */
+ spin_unlock_mutex(&lock->wait_lock, flags);
+ backport_schedule_preempt_disabled();
+ spin_lock_mutex(&lock->wait_lock, flags);
+ }
+
+done:
+ lock_acquired(&lock->dep_map, ip);
+ /* got the lock - rejoice! */
+ mutex_remove_waiter(lock, &waiter, current_thread_info());
+ mutex_set_owner(lock);
+
+ if (!__builtin_constant_p(ww_ctx == NULL)) {
+ struct ww_mutex *ww = container_of(lock,
+ struct ww_mutex,
+ base);
+ struct mutex_waiter *cur;
+
+ /*
+ * This branch gets optimized out for the common case,
+ * and is only important for ww_mutex_lock.
+ */
+
+ ww_mutex_lock_acquired(ww, ww_ctx);
+ ww->ctx = ww_ctx;
+
+ /*
+ * Give any possible sleeping processes the chance to wake up,
+ * so they can recheck if they have to back off.
+ */
+ list_for_each_entry(cur, &lock->wait_list, list) {
+ /* debug_mutex_wake_waiter(lock, cur); */
+ wake_up_process(cur->task);
+ }
+ }
+
+ /* set it to 0 if there are no waiters left: */
+ if (likely(list_empty(&lock->wait_list)))
+ atomic_set(&lock->count, 0);
+
+ spin_unlock_mutex(&lock->wait_lock, flags);
+
+ /* debug_mutex_free_waiter(&waiter); */
+ preempt_enable();
+
+ return 0;
+
+err:
+ mutex_remove_waiter(lock, &waiter, task_thread_info(task));
+ spin_unlock_mutex(&lock->wait_lock, flags);
+ /* debug_mutex_free_waiter(&waiter); */
+ mutex_release(&lock->dep_map, 1, ip);
+ preempt_enable();
+ return ret;
+}
+
+static noinline int __sched
+__ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ return __backport_mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
+ NULL, _RET_IP_, ctx);
+}
+
+static noinline int __sched
+__ww_mutex_lock_interruptible_slowpath(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ return __backport_mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE, 0,
+ NULL, _RET_IP_, ctx);
+}
+
+/**
+ * __mutex_fastpath_lock_retval - try to take the lock by moving the count
+ * from 1 to a 0 value
+ * @count: pointer of type atomic_t
+ *
+ * For backporting purposes we can't use the older kernel's
+ * __mutex_fastpath_lock_retval() since upon failure of a fastpath
+ * lock we want to call our own failure routine with more than one argument, in
+ * this case the context for ww mutexes. Refer to commit a41b56ef for the
+ * argument increase. It'd be painful to backport all asm code for the
+ * supported architectures so instead let's penalize the backport ww mutex
+ * fastpath lock with the not so efficient generic atomic_dec_return()
+ * implementation.
+ *
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
+ */
+static inline int
+__backport_mutex_fastpath_lock_retval(atomic_t *count)
+{
+ if (unlikely(atomic_dec_return(count) < 0))
+ return -1;
+ return 0;
+}
+
+int __sched
+__ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+
+ might_sleep();
+
+ ret = __backport_mutex_fastpath_lock_retval(&lock->base.count);
+
+ if (likely(!ret)) {
+ ww_mutex_set_context_fastpath(lock, ctx);
+ mutex_set_owner(&lock->base);
+ } else
+ ret = __ww_mutex_lock_slowpath(lock, ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(__ww_mutex_lock);
+
+int __sched
+__ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+
+ might_sleep();
+
+ ret = __backport_mutex_fastpath_lock_retval(&lock->base.count);
+
+ if (likely(!ret)) {
+ ww_mutex_set_context_fastpath(lock, ctx);
+ mutex_set_owner(&lock->base);
+ } else
+ ret = __ww_mutex_lock_interruptible_slowpath(lock, ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
diff --git a/dependencies b/dependencies
index 6841613..5c50571 100644
--- a/dependencies
+++ b/dependencies
@@ -48,6 +48,12 @@ MWIFIEX 2.6.27
# DRM stuff
HDMI 3.2
DRM 3.2
+# As of 3.11 DRM depends on the new ww_mutex which is
+# backported via BACKPORT_BUILD_WW_MUTEX. This backported
+# feature however does not yet have support for
+# DEBUG_MUTEXES and DEBUG_LOCK_ALLOC.
+DRM kconfig: !BACKPORT_KERNEL_3_11 || !DEBUG_MUTEXES
+DRM kconfig: !BACKPORT_KERNEL_3_11 || !DEBUG_LOCK_ALLOC
DRM_TTM 3.2
# See e2bdb933, this was added on v3.3, in order to
# support DRM_QXL on 3.2 you'd have to backport 78c1d7848
--
1.7.10.4
From: "Luis R. Rodriguez" <mcgrof(a)do-not-panic.com>
This backports the kernel's wound/wait style locks (commit 040a0a371),
using linux-stable v3.11-rc2 as a base for development.
Given the complexity of supporting mutex debugging, this backport
is simplified by only making the feature available if you have
both DEBUG_MUTEXES and DEBUG_LOCK_ALLOC disabled. Given that ww
mutex is required for DRM this also means we must update the
kconfig for DRM so that DRM cannot be built if either of these
options is enabled. Support for DEBUG_MUTEXES and
DEBUG_LOCK_ALLOC can be added later by anyone daring.
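For readers new to wound/wait, the back-off decision that the backported
__mutex_lock_check_stamp() makes can be sketched in userspace roughly as
below. This is only an illustration: struct demo_ctx and demo_check_stamp()
are hypothetical names, and the tie-break on pointer value is done through
uintptr_t here to stay within portable C.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ww_acquire_ctx; only the stamp matters. */
struct demo_ctx {
	unsigned long stamp;	/* ticket taken when the context was initialized */
};

/*
 * 'ctx' wants a lock currently held under 'hold' (NULL if the holder
 * used no acquire context).
 * Returns 0 to keep waiting, -EALREADY if ctx already holds the lock,
 * or -EDEADLK if ctx is the younger context and has to back off.
 */
int demo_check_stamp(const struct demo_ctx *ctx, const struct demo_ctx *hold)
{
	if (!hold)
		return 0;

	if (ctx == hold)
		return -EALREADY;

	/*
	 * Unsigned subtraction keeps the comparison correct across stamp
	 * wraparound: a difference of at most LONG_MAX means ctx took its
	 * ticket at the same time as or after hold. Equal stamps are
	 * ordered by address so that exactly one side backs off.
	 */
	if (ctx->stamp - hold->stamp <= LONG_MAX &&
	    (ctx->stamp != hold->stamp || (uintptr_t)ctx > (uintptr_t)hold))
		return -EDEADLK;

	return 0;
}
```

A caller that gets -EDEADLK is expected to drop every lock taken under its
context and then retry on the contended lock, which is what
ww_mutex_lock_slow() documents.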
Part of the ww mutex addition to the kernel required modifying
the fast path mutex locking scheme by requiring you to deal
with the slow path alternatives on your own (refer to a41b56ef).
The reason for this change was that the mutex fastpath implementation
assumed your slowpath alternative can only be passed one argument
and the addition of ww mutexes requires dealing with the slow
path with a context passed.
It'd be painful to backport all asm for an optimized fastpath
implementation so we penalize the backport ww mutex fast path
by using the generic atomic_dec_return().
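As a rough userspace illustration of that generic fastpath (not the kernel
code itself), the count convention can be modelled with C11 atomics. The
name demo_fastpath_lock_retval() is hypothetical, and atomic_fetch_sub()
minus one stands in for the kernel's atomic_dec_return().

```c
#include <stdatomic.h>

/*
 * Count convention modelled here: 1 = unlocked, 0 = locked,
 * negative = locked and contended.
 */
int demo_fastpath_lock_retval(atomic_int *count)
{
	/*
	 * atomic_fetch_sub() returns the old value, so old - 1 is the
	 * value the kernel's atomic_dec_return() would hand back.
	 */
	if (atomic_fetch_sub(count, 1) - 1 < 0)
		return -1;	/* contended: caller must take the slowpath */

	return 0;		/* 1 -> 0 transition: lock acquired */
}
```

Returning a plain success/failure value leaves the caller free to invoke a
slowpath that takes the lock and the acquire context as arguments, which
is the point of a41b56ef.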
To backport our own clean mutex_lock_common() with the least
amount of changes against upstream, commits 2bd2c92c and 41fcb9f2
also needed to be backported. Commit 2bd2c92c added support for
queueing mutex spinners with an MCS lock; since this cannot be
backported to older kernels we provide empty stubs instead.
Commit 41fcb9f2 just removed SCHED_FEAT_OWNER_SPIN as it was an
early hack; the only thing required to backport this commit was
to provide an alternative declaration for mutex_spin_on_owner()
as it was declared non-inline for older kernels.
Finally c5491ea7 required backporting schedule_preempt_disabled()
as well but that just consisted of carrying over the original
implementation. Since it's not exported we need to reimplement
it to make it available to our internal core ww mutex port.
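For context, the MCS queueing idea from 2bd2c92c can be sketched in
userspace with C11 atomics as below. This is a minimal sketch under
assumptions, not the kernel implementation: the demo_mcs_* names are
hypothetical, sequentially consistent atomics replace the kernel's
ACCESS_ONCE()/smp_wmb() pairs, and the spin loops have no cpu_relax()
equivalent.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_mcs_node {
	_Atomic(struct demo_mcs_node *) next;
	atomic_int locked;	/* becomes 1 once the lock is handed to us */
};

void demo_mcs_lock(_Atomic(struct demo_mcs_node *) *tail,
		   struct demo_mcs_node *node)
{
	struct demo_mcs_node *prev;

	atomic_store(&node->next, NULL);
	atomic_store(&node->locked, 0);

	/* Append ourselves to the queue; the old tail is our predecessor. */
	prev = atomic_exchange(tail, node);
	if (!prev) {
		/* Queue was empty: we own the lock right away. */
		atomic_store(&node->locked, 1);
		return;
	}

	atomic_store(&prev->next, node);
	/* Spin on our own node only, so waiters do not share a cacheline. */
	while (!atomic_load(&node->locked))
		;
}

void demo_mcs_unlock(_Atomic(struct demo_mcs_node *) *tail,
		     struct demo_mcs_node *node)
{
	struct demo_mcs_node *next = atomic_load(&node->next);

	if (!next) {
		struct demo_mcs_node *expected = node;

		/* No successor visible: try to reset the queue to empty. */
		if (atomic_compare_exchange_strong(tail, &expected, NULL))
			return;
		/* A successor is enqueueing; wait until it links itself. */
		while (!(next = atomic_load(&node->next)))
			;
	}
	/* Pass the lock down to the next spinner. */
	atomic_store(&next->locked, 1);
}
```

Since each waiter spins on its own node, only one spinner at a time
competes for the mutex itself, which is the cacheline contention
reduction 2bd2c92c is after.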
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 040a0a371
v3.11-rc1~147^2~5
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains a41b56ef
v3.11-rc1~147^2~6
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 2bd2c92c
v3.10-rc1~200^2~3
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains 41fcb9f2
v3.10-rc1~200^2~5
mcgrof@frijol ~/linux-stable (git::master)$ git describe --contains c5491ea7
v3.4-rc1~3^2~27
commit 040a0a37100563754bb1fee6ff6427420bcfa609
Author: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Date: Mon Jun 24 10:30:04 2013 +0200
mutex: Add support for wound/wait style locks
Wound/wait mutexes are used when other multiple lock
acquisitions of a similar type can be done in an arbitrary
order. The deadlock handling used here is called wait/wound in
the RDBMS literature: The older tasks waits until it can acquire
the contended lock. The younger tasks needs to back off and drop
all the locks it is currently holding, i.e. the younger task is
wounded.
For full documentation please read Documentation/ww-mutex-design.txt.
References: https://lwn.net/Articles/548909/
Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Acked-by: Rob Clark <robdclark(a)gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/51C8038C.9000106@canonical.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
commit a41b56efa70e060f650aeb54740aaf52044a1ead
Author: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Date: Thu Jun 20 13:31:05 2013 +0200
arch: Make __mutex_fastpath_lock_retval return whether fastpath succeeded or not
This will allow me to call functions that have multiple
arguments if fastpath fails. This is required to support ticket
mutexes, because they need to be able to pass an extra argument
to the fail function.
Originally I duplicated the functions, by adding
__mutex_fastpath_lock_retval_arg. This ended up being just a
duplication of the existing function, so a way to test if
fastpath was called ended up being better.
This also cleaned up the reservation mutex patch some by being
able to call an atomic_set instead of atomic_xchg, and making it
easier to detect if the wrong unlock function was previously
used.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
Acked-by: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: robclark(a)gmail.com
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/20130620113105.4001.83929.stgit@patser
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
commit 2bd2c92cf07cc4a373bf316c75b78ac465fefd35
Author: Waiman Long <Waiman.Long(a)hp.com>
Date: Wed Apr 17 15:23:13 2013 -0400
mutex: Queue mutex spinners with MCS lock to reduce cacheline contention
<-- snip -->
commit 41fcb9f230bf773656d1768b73000ef720bf00c3
Author: Waiman Long <Waiman.Long(a)hp.com>
Date: Wed Apr 17 15:23:11 2013 -0400
mutex: Move mutex spinning code from sched/core.c back to mutex.c
<-- snip -->
commit c5491ea779793f977d282754db478157cc409d82
Author: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon Mar 21 12:09:35 2011 +0100
sched/rt: Add schedule_preempt_disabled()
<-- snip -->
Cc: maarten.lankhorst(a)canonical.com
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Rob Clark <robdclark(a)gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: dri-devel(a)lists.freedesktop.org
Cc: linaro-mm-sig(a)lists.linaro.org
Cc: rostedt(a)goodmis.org
Cc: daniel(a)ffwll.ch
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Luis R. Rodriguez <mcgrof(a)do-not-panic.com>
---
backport/backport-include/linux/ww_mutex.h | 333 ++++++++++++++
backport/compat/Kconfig | 11 +
backport/compat/Makefile | 1 +
backport/compat/kernel/ww_mutex.c | 667 ++++++++++++++++++++++++++++
4 files changed, 1012 insertions(+)
create mode 100644 backport/backport-include/linux/ww_mutex.h
create mode 100644 backport/compat/kernel/ww_mutex.c
diff --git a/backport/backport-include/linux/ww_mutex.h b/backport/backport-include/linux/ww_mutex.h
new file mode 100644
index 0000000..0953939
--- /dev/null
+++ b/backport/backport-include/linux/ww_mutex.h
@@ -0,0 +1,333 @@
+#ifndef __BACKPORT_LINUX_WW_MUTEX_H
+#define __BACKPORT_LINUX_WW_MUTEX_H
+
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,11,0)
+#include_next <linux/ww_mutex.h>
+#else
+#ifdef CPTCFG_BACKPORT_BUILD_WW_MUTEX
+/*
+ * Wound/Wait Mutexes: blocking mutual exclusion locks with deadlock avoidance
+ *
+ * Original mutex implementation started by Ingo Molnar:
+ *
+ * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo(a)redhat.com>
+ *
+ * Wound/wait implementation:
+ * Copyright (C) 2013 Canonical Ltd.
+ *
+ * This file contains the main data structure and API definitions.
+ */
+
+#include <linux/mutex.h>
+
+struct ww_class {
+ atomic_long_t stamp;
+ struct lock_class_key acquire_key;
+ struct lock_class_key mutex_key;
+ const char *acquire_name;
+ const char *mutex_name;
+};
+
+struct ww_acquire_ctx {
+ struct task_struct *task;
+ unsigned long stamp;
+ unsigned acquired;
+};
+
+struct ww_mutex {
+ struct mutex base;
+ struct ww_acquire_ctx *ctx;
+};
+
+# define __WW_CLASS_MUTEX_INITIALIZER(lockname, ww_class)
+
+#define __WW_CLASS_INITIALIZER(ww_class) \
+ { .stamp = ATOMIC_LONG_INIT(0) \
+ , .acquire_name = #ww_class "_acquire" \
+ , .mutex_name = #ww_class "_mutex" }
+
+#define __WW_MUTEX_INITIALIZER(lockname, class) \
+ { .base = { __MUTEX_INITIALIZER(lockname) } \
+ __WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
+
+#define DEFINE_WW_CLASS(classname) \
+ struct ww_class classname = __WW_CLASS_INITIALIZER(classname)
+
+#define DEFINE_WW_MUTEX(mutexname, ww_class) \
+ struct ww_mutex mutexname = __WW_MUTEX_INITIALIZER(mutexname, ww_class)
+
+/**
+ * ww_mutex_init - initialize the w/w mutex
+ * @lock: the mutex to be initialized
+ * @ww_class: the w/w class the mutex should belong to
+ *
+ * Initialize the w/w mutex to unlocked state and associate it with the given
+ * class.
+ *
+ * It is not allowed to initialize an already locked mutex.
+ */
+#define ww_mutex_init LINUX_BACKPORT(ww_mutex_init)
+static inline void ww_mutex_init(struct ww_mutex *lock,
+ struct ww_class *ww_class)
+{
+ __mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
+ lock->ctx = NULL;
+}
+
+/**
+ * ww_acquire_init - initialize a w/w acquire context
+ * @ctx: w/w acquire context to initialize
+ * @ww_class: w/w class of the context
+ *
+ * Initializes a context to acquire multiple mutexes of the given w/w class.
+ *
+ * Context-based w/w mutex acquiring can be done in any order whatsoever within
+ * a given lock class. Deadlocks will be detected and handled with the
+ * wait/wound logic.
+ *
+ * Mixing context-based w/w mutex acquiring with single w/w mutex locking can
+ * result in undetected deadlocks and is therefore forbidden. Mixing different
+ * contexts for the same w/w class when acquiring mutexes can also result in
+ * undetected deadlocks and is likewise forbidden. Both types of abuse will be
+ * caught by enabling CONFIG_PROVE_LOCKING.
+ *
+ * Nesting of acquire contexts for _different_ w/w classes is possible, subject
+ * to the usual locking rules between different lock classes.
+ *
+ * An acquire context must be released with ww_acquire_fini by the same task
+ * before the memory is freed. It is recommended to allocate the context itself
+ * on the stack.
+ */
+#define ww_acquire_init LINUX_BACKPORT(ww_acquire_init)
+static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
+ struct ww_class *ww_class)
+{
+ ctx->task = current;
+ ctx->stamp = atomic_long_inc_return(&ww_class->stamp);
+ ctx->acquired = 0;
+}
+
+/**
+ * ww_acquire_done - marks the end of the acquire phase
+ * @ctx: the acquire context
+ *
+ * Marks the end of the acquire phase. Any further w/w mutex lock calls using
+ * this context are forbidden.
+ *
+ * Calling this function is optional. It is only useful to document w/w mutex
+ * code and to clearly separate the acquire phase from the code that actually
+ * uses the locked data structures.
+ */
+#define ww_acquire_done LINUX_BACKPORT(ww_acquire_done)
+static inline void ww_acquire_done(struct ww_acquire_ctx *ctx)
+{
+}
+
+/**
+ * ww_acquire_fini - releases a w/w acquire context
+ * @ctx: the acquire context to free
+ *
+ * Releases a w/w acquire context. This must be called _after_ all acquired w/w
+ * mutexes have been released with ww_mutex_unlock.
+ */
+#define ww_acquire_fini LINUX_BACKPORT(ww_acquire_fini)
+static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
+{
+}
+
+#define __ww_mutex_lock LINUX_BACKPORT(__ww_mutex_lock)
+extern int __must_check __ww_mutex_lock(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx);
+#define __ww_mutex_lock_interruptible LINUX_BACKPORT(__ww_mutex_lock_interruptible)
+extern int __must_check __ww_mutex_lock_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx);
+
+/**
+ * ww_mutex_lock - acquire the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context, or NULL to acquire only a single lock.
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with the
+ * wait/wound algorithm. If the lock isn't immediately available, this function
+ * will either sleep until it is (wait case) or select the current context
+ * for backing off by returning -EDEADLK (wound case). Trying to acquire the
+ * same lock with the same context twice is also detected and signalled by
+ * returning -EALREADY. Returns 0 if the mutex was successfully acquired.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes for
+ * the given context and then wait for this contending lock to be available by
+ * calling ww_mutex_lock_slow. Alternatively callers can opt to not acquire this
+ * lock and proceed with trying to acquire further w/w mutexes (e.g. when
+ * scanning through lru lists trying to free resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex still
+ * locked. The mutex must first be initialized (or statically defined) before it
+ * can be locked. memset()-ing the mutex to 0 is not allowed. The mutex must be
+ * of the same w/w lock class as was used to initialize the acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+#define ww_mutex_lock LINUX_BACKPORT(ww_mutex_lock)
+static inline int ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ if (ctx)
+ return __ww_mutex_lock(lock, ctx);
+
+ mutex_lock(&lock->base);
+ return 0;
+}
+
+/**
+ * ww_mutex_lock_interruptible - acquire the w/w mutex, interruptible
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Lock the w/w mutex exclusively for this task.
+ *
+ * Deadlocks within a given w/w class of locks are detected and handled with the
+ * wait/wound algorithm. If the lock isn't immediately available, this function
+ * will either sleep until it is (wait case) or select the current context
+ * for backing off by returning -EDEADLK (wound case). Trying to acquire the
+ * same lock with the same context twice is also detected and signalled by
+ * returning -EALREADY. Returns 0 if the mutex was successfully acquired. If a
+ * signal arrives while waiting for the lock then this function returns -EINTR.
+ *
+ * In the wound case the caller must release all currently held w/w mutexes for
+ * the given context and then wait for this contending lock to be available by
+ * calling ww_mutex_lock_slow_interruptible. Alternatively callers can opt to
+ * not acquire this lock and proceed with trying to acquire further w/w mutexes
+ * (e.g. when scanning through lru lists trying to free resources).
+ *
+ * The mutex must later on be released by the same task that
+ * acquired it. The task may not exit without first unlocking the mutex. Also,
+ * kernel memory where the mutex resides must not be freed with the mutex still
+ * locked. The mutex must first be initialized (or statically defined) before it
+ * can be locked. memset()-ing the mutex to 0 is not allowed. The mutex must be
+ * of the same w/w lock class as was used to initialize the acquire context.
+ *
+ * A mutex acquired with this function must be released with ww_mutex_unlock.
+ */
+#define ww_mutex_lock_interruptible LINUX_BACKPORT(ww_mutex_lock_interruptible)
+static inline int __must_check ww_mutex_lock_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ if (ctx)
+ return __ww_mutex_lock_interruptible(lock, ctx);
+ else
+ return mutex_lock_interruptible(&lock->base);
+}
+
+/**
+ * ww_mutex_lock_slow - slowpath acquiring of the w/w mutex
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Acquires a w/w mutex with the given context after a wound case. This function
+ * will sleep until the lock becomes available.
+ *
+ * The caller must have released all w/w mutexes already acquired with the
+ * context and then call this function on the contended lock.
+ *
+ * Afterwards the caller may continue to (re)acquire the other w/w mutexes it
+ * needs with ww_mutex_lock. Note that the -EALREADY return code from
+ * ww_mutex_lock can be used to avoid locking this contended mutex twice.
+ *
+ * It is forbidden to call this function with any other w/w mutexes associated
+ * with the context held. It is forbidden to call this on anything else than the
+ * contending mutex.
+ *
+ * Note that the slowpath lock acquiring can also be done by calling
+ * ww_mutex_lock directly. This function here is simply to help w/w mutex
+ * locking code readability by clearly denoting the slowpath.
+ */
+#define ww_mutex_lock_slow LINUX_BACKPORT(ww_mutex_lock_slow)
+static inline void
+ww_mutex_lock_slow(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+ ret = ww_mutex_lock(lock, ctx);
+ (void)ret;
+}
+
+/**
+ * ww_mutex_lock_slow_interruptible - slowpath acquiring of the w/w mutex, interruptible
+ * @lock: the mutex to be acquired
+ * @ctx: w/w acquire context
+ *
+ * Acquires a w/w mutex with the given context after a wound case. This function
+ * will sleep until the lock becomes available and returns 0 when the lock has
+ * been acquired. If a signal arrives while waiting for the lock then this
+ * function returns -EINTR.
+ *
+ * The caller must have released all w/w mutexes already acquired with the
+ * context and then call this function on the contended lock.
+ *
+ * Afterwards the caller may continue to (re)acquire the other w/w mutexes it
+ * needs with ww_mutex_lock. Note that the -EALREADY return code from
+ * ww_mutex_lock can be used to avoid locking this contended mutex twice.
+ *
+ * It is forbidden to call this function with any other w/w mutexes associated
+ * with the given context held. It is forbidden to call this on anything else
+ * than the contending mutex.
+ *
+ * Note that the slowpath lock acquiring can also be done by calling
+ * ww_mutex_lock_interruptible directly. This function here is simply to help
+ * w/w mutex locking code readability by clearly denoting the slowpath.
+ */
+#define ww_mutex_lock_slow_interruptible LINUX_BACKPORT(ww_mutex_lock_slow_interruptible)
+static inline int __must_check
+ww_mutex_lock_slow_interruptible(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ return ww_mutex_lock_interruptible(lock, ctx);
+}
+
+#define ww_mutex_unlock LINUX_BACKPORT(ww_mutex_unlock)
+extern void ww_mutex_unlock(struct ww_mutex *lock);
+
+/**
+ * ww_mutex_trylock - tries to acquire the w/w mutex without acquire context
+ * @lock: mutex to lock
+ *
+ * Trylocks a mutex without acquire context, so no deadlock detection is
+ * possible. Returns 1 if the mutex has been acquired successfully, 0 otherwise.
+ */
+#define ww_mutex_trylock LINUX_BACKPORT(ww_mutex_trylock)
+static inline int __must_check ww_mutex_trylock(struct ww_mutex *lock)
+{
+ return mutex_trylock(&lock->base);
+}
+
+/**
+ * ww_mutex_destroy - mark a w/w mutex unusable
+ * @lock: the mutex to be destroyed
+ *
+ * This function marks the mutex uninitialized, and any subsequent
+ * use of the mutex is forbidden. The mutex must not be locked when
+ * this function is called.
+ */
+#define ww_mutex_destroy LINUX_BACKPORT(ww_mutex_destroy)
+static inline void ww_mutex_destroy(struct ww_mutex *lock)
+{
+ mutex_destroy(&lock->base);
+}
+
+/**
+ * ww_mutex_is_locked - is the w/w mutex locked
+ * @lock: the mutex to be queried
+ *
+ * Returns true if the mutex is locked, false if unlocked.
+ */
+#define ww_mutex_is_locked LINUX_BACKPORT(ww_mutex_is_locked)
+static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
+{
+ return mutex_is_locked(&lock->base);
+}
+
+#endif /* CPTCFG_BACKPORT_BUILD_WW_MUTEX */
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(3,11,0) */
+#endif /* __BACKPORT_LINUX_WW_MUTEX_H */
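
Note on the header: the kerneldoc for ww_mutex_lock() describes a back-off
protocol in which a context that gets -EDEADLK must release every w/w mutex
it holds before retrying on the contended lock. The sketch below is a
minimal single-threaded userspace model of that rule, not kernel code. All
model_* names are hypothetical, -EBUSY is an assumption standing in for the
case where the real call would sleep, and stamp wraparound is ignored.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for ww_acquire_ctx and ww_mutex; not the kernel types. */
struct model_ctx {
	unsigned long stamp;	/* lower stamp = older context */
	unsigned int acquired;	/* number of locks currently held */
};

struct model_lock {
	struct model_ctx *holder;	/* NULL when unlocked */
};

/*
 * Models the documented return codes of ww_mutex_lock(). A single-threaded
 * model cannot sleep, so -EBUSY marks the "wait case".
 */
static int model_lock_try(struct model_lock *lock, struct model_ctx *ctx)
{
	if (!lock->holder) {
		lock->holder = ctx;
		ctx->acquired++;
		return 0;
	}
	if (lock->holder == ctx)
		return -EALREADY;
	/* Only a context that already holds locks can be part of a cycle. */
	if (ctx->acquired > 0 && ctx->stamp > lock->holder->stamp)
		return -EDEADLK;	/* wound case: the younger context backs off */
	return -EBUSY;			/* wait case */
}

static void model_unlock(struct model_lock *lock)
{
	if (lock->holder) {
		lock->holder->acquired--;
		lock->holder = NULL;
	}
}

/* Take a, then b. On -EDEADLK, drop everything held so far. */
static int model_lock_pair(struct model_lock *a, struct model_lock *b,
			   struct model_ctx *ctx)
{
	int ret = model_lock_try(a, ctx);

	if (ret)
		return ret;
	ret = model_lock_try(b, ctx);
	if (ret == -EDEADLK)
		model_unlock(a);
	return ret;
}
```

With the real API the caller would next call ww_mutex_lock_slow() on the
contended lock and then reacquire the remaining locks with ww_mutex_lock(),
as the kerneldoc describes.
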
diff --git a/backport/compat/Kconfig b/backport/compat/Kconfig
index e2f0cdd..f3c1ab3 100644
--- a/backport/compat/Kconfig
+++ b/backport/compat/Kconfig
@@ -185,6 +185,17 @@ config BACKPORT_LEDS_CLASS
config BACKPORT_LEDS_TRIGGERS
bool
+config BACKPORT_BUILD_WW_MUTEX
+ bool
+ # Build only on kernels < 3.11
+ # For now only DRM drivers use ww mutexes.
+ depends on DRM && BACKPORT_KERNEL_3_11
+ default y if BACKPORT_USERSEL_BUILD_ALL
+ # Not supported with these options enabled, since the
+ # backport strips out the mutex debugging code.
+ depends on !DEBUG_MUTEXES
+ depends on !DEBUG_LOCK_ALLOC
+
config BACKPORT_BUILD_RADIX_HELPERS
bool
# You have selected to build backported DRM drivers
diff --git a/backport/compat/Makefile b/backport/compat/Makefile
index 252290e..fec01c4 100644
--- a/backport/compat/Makefile
+++ b/backport/compat/Makefile
@@ -41,3 +41,4 @@ compat-$(CPTCFG_BACKPORT_BUILD_KFIFO) += kfifo.o
compat-$(CPTCFG_BACKPORT_BUILD_GENERIC_ATOMIC64) += compat_atomic.o
compat-$(CPTCFG_BACKPORT_BUILD_DMA_SHARED_HELPERS) += dma-shared-helpers.o
compat-$(CPTCFG_BACKPORT_BUILD_RADIX_HELPERS) += lib-radix-tree-helpers.o
+compat-$(CPTCFG_BACKPORT_BUILD_WW_MUTEX) += kernel/ww_mutex.o
diff --git a/backport/compat/kernel/ww_mutex.c b/backport/compat/kernel/ww_mutex.c
new file mode 100644
index 0000000..257c2a4
--- /dev/null
+++ b/backport/compat/kernel/ww_mutex.c
@@ -0,0 +1,667 @@
+/*
+ * Copyright (c) 2013 Luis R. Rodriguez <mcgrof(a)do-not-panic.com>
+ *
+ * Backport ww mutex for older kernels. This is not supported when
+ * DEBUG_MUTEXES or DEBUG_LOCK_ALLOC is enabled.
+ *
+ * Taken from: kernel/mutex.c - via linux-stable v3.11-rc2
+ *
+ * Mutexes: blocking mutual exclusion locks
+ *
+ * Started by Ingo Molnar:
+ *
+ * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo(a)redhat.com>
+ *
+ * Many thanks to Arjan van de Ven, Thomas Gleixner, Steven Rostedt and
+ * David Howells for suggestions and improvements.
+ *
+ * - Adaptive spinning for mutexes by Peter Zijlstra. (Ported to mainline
+ * from the -rt tree, where it was originally implemented for rtmutexes
+ * by Steven Rostedt, based on work by Gregory Haskins, Peter Morreale
+ * and Sven Dietrich.)
+ *
+ * Also see Documentation/mutex-design.txt.
+ */
+
+#include <linux/mutex.h>
+#include <linux/ww_mutex.h>
+#include <asm/mutex.h>
+#include <linux/sched.h>
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,9,0)
+#include <linux/sched/rt.h>
+#endif
+#include <linux/export.h>
+#include <linux/spinlock.h>
+#include <linux/interrupt.h>
+#include <linux/debug_locks.h>
+#include <linux/version.h>
+
+/*
+ * A negative mutex count indicates that waiters are sleeping waiting for the
+ * mutex.
+ */
+#define MUTEX_SHOW_NO_WAITER(mutex) (atomic_read(&(mutex)->count) >= 0)
+
+#define spin_lock_mutex(lock, flags) \
+ do { spin_lock(lock); (void)(flags); } while (0)
+#define spin_unlock_mutex(lock, flags) \
+ do { spin_unlock(lock); (void)(flags); } while (0)
+#define mutex_remove_waiter(lock, waiter, ti) \
+ __list_del((waiter)->list.prev, (waiter)->list.next)
+
+#ifdef CONFIG_SMP
+static inline void mutex_set_owner(struct mutex *lock)
+{
+ lock->owner = current;
+}
+
+static inline void mutex_clear_owner(struct mutex *lock)
+{
+ lock->owner = NULL;
+}
+#else
+static inline void mutex_set_owner(struct mutex *lock)
+{
+}
+
+static inline void mutex_clear_owner(struct mutex *lock)
+{
+}
+#endif
+
+
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0) /* 2bd2c92c and 41fcb9f2 */
+/*
+ * In order to avoid a stampede of mutex spinners from acquiring the mutex
+ * more or less simultaneously, the spinners need to acquire a MCS lock
+ * first before spinning on the owner field.
+ *
+ * We don't inline mspin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+struct mspin_node {
+ struct mspin_node *next;
+ int locked; /* 1 if lock acquired */
+};
+#define MLOCK(mutex) ((struct mspin_node **)&((mutex)->spin_mlock))
+
+static noinline
+void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
+{
+ struct mspin_node *prev;
+
+ /* Init node */
+ node->locked = 0;
+ node->next = NULL;
+
+ prev = xchg(lock, node);
+ if (likely(prev == NULL)) {
+ /* Lock acquired */
+ node->locked = 1;
+ return;
+ }
+ ACCESS_ONCE(prev->next) = node;
+ smp_wmb();
+ /* Wait until the lock holder passes the lock down */
+ while (!ACCESS_ONCE(node->locked))
+ arch_mutex_cpu_relax();
+}
+
+static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
+{
+ struct mspin_node *next = ACCESS_ONCE(node->next);
+
+ if (likely(!next)) {
+ /*
+ * Release the lock by setting it to NULL
+ */
+ if (cmpxchg(lock, node, NULL) == node)
+ return;
+ /* Wait until the next pointer is set */
+ while (!(next = ACCESS_ONCE(node->next)))
+ arch_mutex_cpu_relax();
+ }
+ ACCESS_ONCE(next->locked) = 1;
+ smp_wmb();
+}
+
+/*
+ * Mutex spinning code migrated from kernel/sched/core.c
+ */
+
+static inline bool owner_running(struct mutex *lock, struct task_struct *owner)
+{
+ if (lock->owner != owner)
+ return false;
+
+ /*
+ * Ensure we emit the owner->on_cpu dereference _after_ checking that
+ * lock->owner still matches owner. If that check fails, owner might
+ * point to free()d memory; if it still matches, the rcu_read_lock()
+ * ensures the memory stays valid.
+ */
+ barrier();
+
+ return owner->on_cpu;
+}
+
+/*
+ * Look out! "owner" is an entirely speculative pointer
+ * access and not reliable.
+ */
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0)
+static noinline
+#endif
+int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
+{
+ rcu_read_lock();
+ while (owner_running(lock, owner)) {
+ if (need_resched())
+ break;
+
+ arch_mutex_cpu_relax();
+ }
+ rcu_read_unlock();
+
+ /*
+ * We break out of the loop above on need_resched() and when the
+ * owner changed, which is a sign for heavy contention. Return
+ * success only when lock->owner is NULL.
+ */
+ return lock->owner == NULL;
+}
+
+/*
+ * Initial check for entering the mutex spinning loop
+ */
+static inline int mutex_can_spin_on_owner(struct mutex *lock)
+{
+ int retval = 1;
+
+ rcu_read_lock();
+ if (lock->owner)
+ retval = lock->owner->on_cpu;
+ rcu_read_unlock();
+ /*
+ * if lock->owner is not set, the mutex owner may have just acquired
+ * it and not set the owner yet or the mutex has been released.
+ */
+ return retval;
+}
+#else /* Backport 2bd2c92c: help keep backport_mutex_lock_common() clean */
+
+struct mspin_node {
+};
+#define MLOCK(mutex) NULL
+
+static noinline
+void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
+{
+}
+
+static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
+{
+}
+
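+/* Stub: report the owner as not running. */
+static inline bool
+owner_running(struct mutex *lock, struct task_struct *owner) { return false; }
+
+/* Stub: returning 0 sends a contended caller straight to the slowpath. */
+int
+mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner) { return 0; }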
+
+static inline int mutex_can_spin_on_owner(struct mutex *lock)
+{
+ return 1;
+}
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(3,10,0) */
+#endif /* CONFIG_MUTEX_SPIN_ON_OWNER */
+
+/*
+ * Release the lock, slowpath:
+ */
+static inline void
+__mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
+{
+ struct mutex *lock = container_of(lock_count, struct mutex, count);
+ unsigned long flags;
+
+ spin_lock_mutex(&lock->wait_lock, flags);
+ mutex_release(&lock->dep_map, nested, _RET_IP_);
+ /* debug_mutex_unlock(lock); */
+
+ /*
+ * some architectures leave the lock unlocked in the fastpath failure
+ * case, others need to leave it locked. In the latter case we have to
+ * unlock it here
+ */
+ if (__mutex_slowpath_needs_to_unlock())
+ atomic_set(&lock->count, 1);
+
+ if (!list_empty(&lock->wait_list)) {
+ /* get the first entry from the wait-list: */
+ struct mutex_waiter *waiter =
+ list_entry(lock->wait_list.next,
+ struct mutex_waiter, list);
+
+ /* debug_mutex_wake_waiter(lock, waiter); */
+
+ wake_up_process(waiter->task);
+ }
+
+ spin_unlock_mutex(&lock->wait_lock, flags);
+}
+
+/*
+ * Release the lock, slowpath:
+ */
+static __used noinline void
+__mutex_unlock_slowpath(atomic_t *lock_count)
+{
+ __mutex_unlock_common_slowpath(lock_count, 1);
+}
+
+/**
+ * ww_mutex_unlock - release the w/w mutex
+ * @lock: the mutex to be released
+ *
+ * Unlock a mutex that has been locked by this task previously with any of the
+ * ww_mutex_lock* functions (with or without an acquire context). It is
+ * forbidden to release the locks after releasing the acquire context.
+ *
+ * This function must not be used in interrupt context. Unlocking
+ * of an unlocked mutex is not allowed.
+ */
+void __sched ww_mutex_unlock(struct ww_mutex *lock)
+{
+ /*
+ * The unlocking fastpath is the 0->1 transition from 'locked'
+ * into 'unlocked' state:
+ */
+ if (lock->ctx) {
+ if (lock->ctx->acquired > 0)
+ lock->ctx->acquired--;
+ lock->ctx = NULL;
+ }
+
+ __mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
+}
+EXPORT_SYMBOL_GPL(ww_mutex_unlock);
+
+static inline int __sched
+__mutex_lock_check_stamp(struct mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
+ struct ww_acquire_ctx *hold_ctx = ACCESS_ONCE(ww->ctx);
+
+ if (!hold_ctx)
+ return 0;
+
+ if (unlikely(ctx == hold_ctx))
+ return -EALREADY;
+
+ if (ctx->stamp - hold_ctx->stamp <= LONG_MAX &&
+ (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) {
+ return -EDEADLK;
+ }
+
+ return 0;
+}
+
+static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ ww_ctx->acquired++;
+}
+
+/*
+ * After acquiring the lock via the fastpath, or when we lost out in the
+ * contested slowpath, set ctx and wake up any waiters so they can recheck.
+ *
+ * This function is never called when CONFIG_DEBUG_LOCK_ALLOC is set,
+ * as the fastpath and opportunistic spinning are disabled in that case.
+ */
+static __always_inline void
+ww_mutex_set_context_fastpath(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ unsigned long flags;
+ struct mutex_waiter *cur;
+
+ ww_mutex_lock_acquired(lock, ctx);
+
+ lock->ctx = ctx;
+
+ /*
+ * The lock->ctx update should be visible on all cores before
+ * the atomic read is done, otherwise contended waiters might be
+ * missed. A contended waiter will either see ww_ctx == NULL
+ * and keep spinning, or it will acquire wait_lock, add itself
+ * to the waiter list and sleep.
+ */
+ smp_mb(); /* ^^^ */
+
+ /*
+ * Check if lock is contended, if not there is nobody to wake up
+ */
+ if (likely(atomic_read(&lock->base.count) == 0))
+ return;
+
+ /*
+ * Uh oh, we raced in fastpath, wake up everyone in this case,
+ * so they can see the new lock->ctx.
+ */
+ spin_lock_mutex(&lock->base.wait_lock, flags);
+ list_for_each_entry(cur, &lock->base.wait_list, list) {
+ /* debug_mutex_wake_waiter(&lock->base, cur); */
+ wake_up_process(cur->task);
+ }
+ spin_unlock_mutex(&lock->base.wait_lock, flags);
+}
+
+/**
+ * backport_schedule_preempt_disabled - called with preemption disabled
+ *
+ * Backports c5491ea7. This is not exported, so we leave it
+ * here as this is the only current core user in backports.
+ * Although available on >= 3.4, it's only for in-kernel code, so
+ * we provide our own.
+ *
+ * Returns with preemption disabled. Note: preempt_count must be 1
+ */
+static void __sched backport_schedule_preempt_disabled(void)
+{
+ preempt_enable_no_resched();
+ schedule();
+ preempt_disable();
+}
+
+/*
+ * Lock a mutex (possibly interruptible), slowpath:
+ */
+static __always_inline int __sched
+__backport_mutex_lock_common(struct mutex *lock, long state,
+ unsigned int subclass,
+ struct lockdep_map *nest_lock, unsigned long ip,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ struct task_struct *task = current;
+ struct mutex_waiter waiter;
+ unsigned long flags;
+ int ret;
+
+ preempt_disable();
+ mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
+
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
+ /*
+ * Optimistic spinning.
+ *
+ * We try to spin for acquisition when we find that there are no
+ * pending waiters and the lock owner is currently running on a
+ * (different) CPU.
+ *
+ * The rationale is that if the lock owner is running, it is likely to
+ * release the lock soon.
+ *
+ * Since this needs the lock owner, and this mutex implementation
+ * doesn't track the owner atomically in the lock field, we need to
+ * track it non-atomically.
+ *
+ * We can't do this for DEBUG_MUTEXES because that relies on wait_lock
+ * to serialize everything.
+ *
+ * The mutex spinners are queued up using MCS lock so that only one
+ * spinner can compete for the mutex. However, if mutex spinning isn't
+ * going to happen, there is no point in going through the lock/unlock
+ * overhead.
+ */
+ if (!mutex_can_spin_on_owner(lock))
+ goto slowpath;
+
+ for (;;) {
+ struct task_struct *owner;
+ struct mspin_node node;
+
+ if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+ struct ww_mutex *ww;
+
+ ww = container_of(lock, struct ww_mutex, base);
+ /*
+ * If ww->ctx is set the contents are undefined, only
+ * by acquiring wait_lock there is a guarantee that
+ * they are not invalid when reading.
+ *
+ * As such, when deadlock detection needs to be
+ * performed the optimistic spinning cannot be done.
+ */
+ if (ACCESS_ONCE(ww->ctx))
+ break;
+ }
+
+ /*
+ * If there's an owner, wait for it to either
+ * release the lock or go to sleep.
+ */
+ mspin_lock(MLOCK(lock), &node);
+ owner = ACCESS_ONCE(lock->owner);
+ if (owner && !mutex_spin_on_owner(lock, owner)) {
+ mspin_unlock(MLOCK(lock), &node);
+ break;
+ }
+
+ if ((atomic_read(&lock->count) == 1) &&
+ (atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
+ lock_acquired(&lock->dep_map, ip);
+ if (!__builtin_constant_p(ww_ctx == NULL)) {
+ struct ww_mutex *ww;
+ ww = container_of(lock, struct ww_mutex, base);
+
+ ww_mutex_set_context_fastpath(ww, ww_ctx);
+ }
+
+ mutex_set_owner(lock);
+ mspin_unlock(MLOCK(lock), &node);
+ preempt_enable();
+ return 0;
+ }
+ mspin_unlock(MLOCK(lock), &node);
+
+ /*
+ * When there's no owner, we might have preempted between the
+ * owner acquiring the lock and setting the owner field. If
+ * we're an RT task, that will live-lock because we won't let
+ * the owner complete.
+ */
+ if (!owner && (need_resched() || rt_task(task)))
+ break;
+
+ /*
+ * The cpu_relax() call is a compiler barrier which forces
+ * everything in this loop to be re-loaded. We don't need
+ * memory barriers as we'll eventually observe the right
+ * values at the cost of a few extra spins.
+ */
+ arch_mutex_cpu_relax();
+ }
+slowpath:
+#endif
+ spin_lock_mutex(&lock->wait_lock, flags);
+
+ /* We don't support DEBUG_MUTEXES on the backport */
+ /* debug_mutex_lock_common(lock, &waiter); */
+ /* debug_mutex_add_waiter(lock, &waiter, task_thread_info(task)); */
+
+ /* add waiting tasks to the end of the waitqueue (FIFO): */
+ list_add_tail(&waiter.list, &lock->wait_list);
+ waiter.task = task;
+
+ if (MUTEX_SHOW_NO_WAITER(lock) && (atomic_xchg(&lock->count, -1) == 1))
+ goto done;
+
+ lock_contended(&lock->dep_map, ip);
+
+ for (;;) {
+ /*
+ * Let's try to take the lock again - this is needed even if
+ * we get here for the first time (shortly after failing to
+ * acquire the lock), to make sure that we get a wakeup once
+ * it's unlocked. Later on, if we sleep, this is the
+ * operation that gives us the lock. We xchg it to -1, so
+ * that when we release the lock, we properly wake up the
+ * other waiters:
+ */
+ if (MUTEX_SHOW_NO_WAITER(lock) &&
+ (atomic_xchg(&lock->count, -1) == 1))
+ break;
+
+ /*
+ * got a signal? (This code gets eliminated in the
+ * TASK_UNINTERRUPTIBLE case.)
+ */
+ if (unlikely(signal_pending_state(state, task))) {
+ ret = -EINTR;
+ goto err;
+ }
+
+ if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+ ret = __mutex_lock_check_stamp(lock, ww_ctx);
+ if (ret)
+ goto err;
+ }
+
+ __set_task_state(task, state);
+
+ /* didn't get the lock, go to sleep: */
+ spin_unlock_mutex(&lock->wait_lock, flags);
+ backport_schedule_preempt_disabled();
+ spin_lock_mutex(&lock->wait_lock, flags);
+ }
+
+done:
+ lock_acquired(&lock->dep_map, ip);
+ /* got the lock - rejoice! */
+ mutex_remove_waiter(lock, &waiter, current_thread_info());
+ mutex_set_owner(lock);
+
+ if (!__builtin_constant_p(ww_ctx == NULL)) {
+ struct ww_mutex *ww = container_of(lock,
+ struct ww_mutex,
+ base);
+ struct mutex_waiter *cur;
+
+ /*
+ * This branch gets optimized out for the common case,
+ * and is only important for ww_mutex_lock.
+ */
+
+ ww_mutex_lock_acquired(ww, ww_ctx);
+ ww->ctx = ww_ctx;
+
+ /*
+ * Give any possible sleeping processes the chance to wake up,
+ * so they can recheck if they have to back off.
+ */
+ list_for_each_entry(cur, &lock->wait_list, list) {
+ /* debug_mutex_wake_waiter(lock, cur); */
+ wake_up_process(cur->task);
+ }
+ }
+
+ /* set it to 0 if there are no waiters left: */
+ if (likely(list_empty(&lock->wait_list)))
+ atomic_set(&lock->count, 0);
+
+ spin_unlock_mutex(&lock->wait_lock, flags);
+
+ /* debug_mutex_free_waiter(&waiter); */
+ preempt_enable();
+
+ return 0;
+
+err:
+ mutex_remove_waiter(lock, &waiter, task_thread_info(task));
+ spin_unlock_mutex(&lock->wait_lock, flags);
+ /* debug_mutex_free_waiter(&waiter); */
+ mutex_release(&lock->dep_map, 1, ip);
+ preempt_enable();
+ return ret;
+}
+
+static noinline int __sched
+__ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ return __backport_mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
+ NULL, _RET_IP_, ctx);
+}
+
+static noinline int __sched
+__ww_mutex_lock_interruptible_slowpath(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ctx)
+{
+ return __backport_mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE, 0,
+ NULL, _RET_IP_, ctx);
+}
+
+/**
+ * __backport_mutex_fastpath_lock_retval - try to take the lock by moving the
+ * count from 1 to a 0 value
+ * @count: pointer of type atomic_t
+ *
+ * For backporting purposes we can't use the older kernel's
+ * __mutex_fastpath_lock_retval(), since upon failure of a fastpath
+ * lock we want to call a failure routine with more than one argument, in
+ * this case the context for ww mutexes. Refer to commit a41b56ef for the
+ * argument increase. It'd be painful to backport all asm code for the
+ * supported architectures, so instead let's penalize the backport ww mutex
+ * fastpath lock with the not so efficient generic atomic_dec_return()
+ * implementation.
+ *
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
+ */
+static inline int
+__backport_mutex_fastpath_lock_retval(atomic_t *count)
+{
+ if (unlikely(atomic_dec_return(count) < 0))
+ return -1;
+ return 0;
+}
+
+int __sched
+__ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+
+ might_sleep();
+
+ ret = __backport_mutex_fastpath_lock_retval(&lock->base.count);
+
+ if (likely(!ret)) {
+ ww_mutex_set_context_fastpath(lock, ctx);
+ mutex_set_owner(&lock->base);
+ } else
+ ret = __ww_mutex_lock_slowpath(lock, ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(__ww_mutex_lock);
+
+int __sched
+__ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ int ret;
+
+ might_sleep();
+
+ ret = __backport_mutex_fastpath_lock_retval(&lock->base.count);
+
+ if (likely(!ret)) {
+ ww_mutex_set_context_fastpath(lock, ctx);
+ mutex_set_owner(&lock->base);
+ } else
+ ret = __ww_mutex_lock_interruptible_slowpath(lock, ctx);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
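
Note on the fastpath: __backport_mutex_fastpath_lock_retval() relies on the
mutex count convention used throughout this file (1 = unlocked, 0 = locked,
negative = locked with possible waiters). A rough userspace sketch of the
same decrement-and-test step, using C11 atomics in place of the kernel's
atomic_t, could look like this. The function name is hypothetical and the
sketch is only meant to illustrate the count transitions.

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace stand-in for the backported fastpath.
 * atomic_fetch_sub() returns the old value, so old - 1 is the value the
 * kernel's atomic_dec_return() would report.
 */
static int model_fastpath_lock_retval(atomic_int *count)
{
	if (atomic_fetch_sub(count, 1) - 1 < 0)
		return -1;	/* contended: the caller falls back to the slowpath */
	return 0;		/* count went from 1 to 0: lock taken */
}
```

A second call on an already locked count drives it negative, which is the
state the slowpath and the unlock path interpret as "waiters may exist".
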
--
1.7.10.4
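
Note on __mutex_lock_check_stamp(): the age test compares two unsigned
stamps by subtraction, so it keeps working when the per-class stamp counter
wraps around. The sketch below isolates that comparison as a userspace
helper. The helper name and the addr_greater parameter (standing in for the
ctx > hold_ctx pointer tie-break) are assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the age test in __mutex_lock_check_stamp().
 * Returns true when the requesting context must back off (-EDEADLK).
 */
static bool ctx_must_back_off(unsigned long stamp, unsigned long hold_stamp,
			      bool addr_greater)
{
	/*
	 * Unsigned subtraction wraps, so the difference exceeds LONG_MAX
	 * exactly when stamp lies before hold_stamp in the circular stamp
	 * space, i.e. when the requester is the older context.
	 */
	if (stamp - hold_stamp > (unsigned long)LONG_MAX)
		return false;	/* older context: keep waiting */

	/* Equal stamps are tie-broken by context address. */
	return stamp != hold_stamp || addr_greater;
}
```

For example, a requester with stamp 2 facing a holder with stamp ULONG_MAX
is still treated as the younger context, because 2 - ULONG_MAX wraps to 3.
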