On Tue, Jul 6, 2021 at 12:03 PM Oded Gabbay <oded.gabbay(a)gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 11:40 AM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> >
> > On Mon, Jul 05, 2021 at 04:03:12PM +0300, Oded Gabbay wrote:
> > > Hi,
> > > I'm sending v4 of this patch-set following the long email thread.
> > > I want to thank Jason for reviewing v3 and pointing out the errors, saving
> > > us time later to debug it :)
> > >
> > > I consulted with Christian on how to fix patch 2 (the implementation) and
> > > at the end of the day I shamelessly copied the relevant content from
> > > amdgpu_vram_mgr_alloc_sgt() and amdgpu_dma_buf_attach(), regarding the
> > > usage of dma_map_resource() and pci_p2pdma_distance_many(), respectively.
> > >
> > > I also made a few improvements after looking at the relevant code in amdgpu.
> > > The details are in the changelog of patch 2.
> > >
> > > I took the time to write an import code into the driver, allowing me to
> > > check real P2P with two Gaudi devices, one as exporter and the other as
> > > importer. I'm not going to include the import code in the product, it was
> > > just for testing purposes (although I can share it if anyone wants).
> > >
> > > I ran it on a bare-metal environment with IOMMU enabled, on a Skylake CPU
> > > with a white-listed PCIe bridge (to keep pci_p2pdma_distance_many() happy).
> > >
> > > Greg, I hope this will be good enough for you to merge this code.
> >
> > So we're officially going to use dri-devel for technical details review
> > and then Greg for merging so we don't have to deal with other merge
> > criteria dri-devel folks have?
> I'm glad to receive any help or review, regardless of the subsystem
> the person giving that help belongs to.
>
> >
> > I don't expect anything less by now, but it does make the original claim
> > that drivers/misc will not step all over accelerators folks a complete
> > farce under the totally-not-a-gpu banner.
> >
> > This essentially means that for any other accelerator stack that doesn't
> > fit the dri-devel merge criteria, even if it's acting like a gpu and uses
> > other gpu driver stuff, you can just send it to Greg and it's good to go.
>
> What's wrong with Greg ??? ;)
>
> On a more serious note, yes, I do think the dri-devel merge criteria
> are very extreme, and effectively drive out many AI accelerator
> companies that want to contribute to the kernel but can't/won't open
> their software IP and patents.
>
> I think the expectation from AI startups (who are 90% of the deep
> learning field) to cooperate outside of company boundaries is not
> realistic, especially on the user-side, where the real IP of the
> company resides.
>
> Personally I don't think there is a real justification for that at
> this point of time, but if it will make you (and other people here)
> happy I really don't mind creating a non-gpu accelerator subsystem
> that will contain all the totally-not-a-gpu accelerators, and will
> have more relaxed criteria for upstreaming. Something along the lines
> of an "rdma-core" style library looks like the right amount of
> user-level open source and should be enough.
>
> The question is, what will happen later ? Will it be sufficient to
> "allow" us to use dmabuf and maybe other gpu stuff in the future (e.g.
> hmm) ?
>
> If the community and dri-devel maintainers (and you among them) will
> assure me it is good enough, then I'll happily contribute my work and
> personal time to organize this effort and implement it.
I think the dri-devel stance is pretty clear and well known: We want the
userspace to be open, because that's where most of the driver stack
is. Without an open driver stack there's no way to ever have anything
cross-vendor.

And that includes the compiler and anything else you need to drive the hardware.

Afaik linux cpu arch ports are also not accepted if there's no open
gcc or llvm port around, because without that the overall stack just
becomes useless.

If that means AI companies don't want to open up their hw specs
enough to allow that, so be it - all you get in that case is
offloading the kernel side of the stack for convenience, with zero
long term prospects to ever make this into a cross vendor subsystem
stack that does something useful. If the business case says you can't
open up your hw enough for that, I really don't see the point in
merging such a driver, it'll be an unmaintainable stack for anyone else
who doesn't have access to those NDA covered specs and patents and
everything.

If the stack is actually cross vendor to begin with that's just a bonus,
but generally that doesn't happen voluntarily and needs a few years to
decades to get there. So that's not really something we require.

tl;dr: just a runtime isn't enough for dri-devel.

Now Greg seems to be happy to merge kernel drivers that aren't useful
with the open bits provided, so *shrug*.

Cheers, Daniel

PS: If requiring an actually useful open driver stack is somehow
*extreme* I have no idea why we even bother with merging device
drivers to upstream. Just make a stable driver api and done, vendors
can then do whatever they feel like and protect their "valuable IP and
patents" or whatever it is.
> Thanks,
> oded
>
> >
> > There's quite a lot of these floating around actually (and many do have
> > semi-open runtimes, like habanalabs have now too, just not open enough to
> > be actually useful). It's going to be absolutely lovely having to explain
> > to these companies in background chats why habanalabs gets away with their
> > stack and they don't.
> >
> > Or maybe we should just merge them all and give up on the idea of having
> > open cross-vendor driver stacks for these accelerators.
> >
> > Thanks, Daniel
> >
> > >
> > > Thanks,
> > > Oded
> > >
> > > Oded Gabbay (1):
> > > habanalabs: define uAPI to export FD for DMA-BUF
> > >
> > > Tomer Tayar (1):
> > > habanalabs: add support for dma-buf exporter
> > >
> > > drivers/misc/habanalabs/Kconfig | 1 +
> > > drivers/misc/habanalabs/common/habanalabs.h | 26 ++
> > > drivers/misc/habanalabs/common/memory.c | 480 +++++++++++++++++++-
> > > drivers/misc/habanalabs/gaudi/gaudi.c | 1 +
> > > drivers/misc/habanalabs/goya/goya.c | 1 +
> > > include/uapi/misc/habanalabs.h | 28 +-
> > > 6 files changed, 532 insertions(+), 5 deletions(-)
> > >
> > > --
> > > 2.25.1
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Tue, Jul 06, 2021 at 12:44:49PM +0300, Oded Gabbay wrote:
> > > + /* In case we got a large memory area to export, we need to divide it
> > > + * to smaller areas because each entry in the dmabuf sgt can only
> > > + * describe unsigned int.
> > > + */
> >
> > Huh? This is forming a SGL, it should follow the SGL rules which means
> > you have to fragment based on the dma_get_max_seg_size() of the
> > importer device.
> >
> hmm
> I don't see anyone in drm checking this value (and using it) when
> creating the SGL when exporting dmabuf. (e.g.
> amdgpu_vram_mgr_alloc_sgt)
For dmabuf the only importer is RDMA and it doesn't care, but you
certainly should not introduce a hardwired constant instead of using
the correct function.
Jason
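
As a rough illustration of the fragmentation rule discussed above, here is a userspace sketch in C (not kernel code). It assumes the per-entry limit has already been obtained; in the kernel that value would come from dma_get_max_seg_size() on the importer's device. The struct seg type and split_segments() helper are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA entry of a scatter-gather list. */
struct seg {
	uint64_t dma_addr;
	unsigned int len;
};

/*
 * Split [base, base + size) into entries of at most max_seg bytes.
 * max_seg models the value the kernel would get from
 * dma_get_max_seg_size() for the importing device (assumption).
 * Returns the number of entries written. Returns 0 if size is 0,
 * if max_seg is 0, or if out[] (capacity cap) is too small.
 */
static size_t split_segments(uint64_t base, uint64_t size,
			     unsigned int max_seg,
			     struct seg *out, size_t cap)
{
	size_t n = 0;

	if (max_seg == 0)
		return 0;

	while (size > 0) {
		/* Each entry takes the limit, or whatever is left. */
		unsigned int len = size > max_seg ? max_seg
						  : (unsigned int)size;

		if (n == cap)
			return 0;

		out[n].dma_addr = base;
		out[n].len = len;
		base += len;
		size -= len;
		n++;
	}

	return n;
}
```

The point of the sketch is only that the limit is a parameter supplied by the importer rather than a constant baked into the exporter.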
On Tue, Jul 6, 2021 at 2:46 PM Oded Gabbay <oded.gabbay(a)gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 3:23 PM Daniel Vetter <daniel(a)ffwll.ch> wrote:
> >
> > On Tue, Jul 06, 2021 at 02:21:10PM +0200, Christoph Hellwig wrote:
> > > On Tue, Jul 06, 2021 at 10:40:37AM +0200, Daniel Vetter wrote:
> > > > > Greg, I hope this will be good enough for you to merge this code.
> > > >
> > > > So we're officially going to use dri-devel for technical details review
> > > > and then Greg for merging so we don't have to deal with other merge
> > > > criteria dri-devel folks have?
> > > >
> > > > I don't expect anything less by now, but it does make the original claim
> > > > that drivers/misc will not step all over accelerators folks a complete
> > > > farce under the totally-not-a-gpu banner.
> > > >
> > > > This essentially means that for any other accelerator stack that doesn't
> > > > fit the dri-devel merge criteria, even if it's acting like a gpu and uses
> > > > other gpu driver stuff, you can just send it to Greg and it's good to go.
> > > >
> > > > There's quite a lot of these floating around actually (and many do have
> > > > semi-open runtimes, like habanalabs have now too, just not open enough to
> > > > be actually useful). It's going to be absolutely lovely having to explain
> > > > to these companies in background chats why habanalabs gets away with their
> > > > stack and they don't.
> > >
> > > > FYI, I fully agree with Daniel here. Habanalabs needs to open up their
> > > runtime if they want to push any additional feature in the kernel.
> > > The current situation is not sustainable.
> Well, that's like, your opinion...
>
> >
> > Before anyone replies: The runtime is open, the compiler is still closed.
> > This has become the new default for accel driver submissions, I think
> > mostly because all the interesting bits for non-3d accelerators are in the
> > accel ISA, and no longer in the runtime. So vendors are fairly happy to
> > throw in the runtime as a freebie.
> >
> > It's still incomplete, and it's still useless if you want to actually hack
> > on the driver stack.
> > -Daniel
> > --
> I don't understand what's not sustainable here.
>
> There is zero code inside the driver that communicates or interacts
> with our TPC code (TPC is the Tensor Processing Core).
> Even submitting work to the TPC is done via a generic queue
> interface. And that queue IP is common between all our engines
> (TPC/DMA/NIC). The driver provides all the specs of that queue IP,
> because the driver's code is handling that queue. But why is the TPC
> compiler code even relevant here ?
Can I use the hw how it's intended to be used without it?

If the answer is no, then essentially what you're doing with your
upstream driver is getting all the benefits of an upstream driver,
while upstream gets nothing. We can't use your stack, not as-is. Sure
we can use the queue, but we can't actually submit anything
interesting. And I'm pretty sure the point of your hw is to do more
than submit no-op packets to a queue.

This is all an "I want to have my cake and eat it too" approach to
upstreaming, and it's a totally fine attitude to have, but if you don't
see why there's maybe a different side to it then I don't get what
you're arguing. Upstream isn't a free lunch.

Frankly I'm starting to assume you're arguing this all in bad faith
just because habanalabs doesn't want to actually have an open driver
stack, so any attack is good, no matter what. Which is also what
everyone else does who submits their accel driver to upstream, and
which gets us back to the starting point of this sub-thread of me
really appreciating how this will improve background discussions going
forward for everyone.

Like if the requirement for accel drivers truly is that you can submit
a dummy command to the queues then I have about 5-10 drivers at least
I could merge instantly. For something like the intel gpu driver it
would be about 50 lines of code (including all the structure boiler
plate the ioctls require) in userspace to submit a dummy queue command.
GPU and accel vendors would really love that, because it would allow
them to freeload on upstream and do essentially nothing in return.

And we'd end up with an unmaintainable disaster of a gpu or well
accelerator subsystem because there's nothing you can change or
improve because all the really useful bits of the stack are closed.
And ofc that's not any company's problem anymore, so ofc you with the
habanalabs hat on don't care and call this *extreme*.
> btw, you can today see our TPC code at
> https://github.com/HabanaAI/Habana_Custom_Kernel
> There is a link there to the TPC user guide and link to download the
> LLVM compiler.
I got stuck clicking links before I found the source for that llvm
compiler. Can you give me a direct link to the repo with the source
code instead please?
Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Mon, Jul 05, 2021 at 04:03:14PM +0300, Oded Gabbay wrote:
> + rc = sg_alloc_table(*sgt, nents, GFP_KERNEL | __GFP_ZERO);
> + if (rc)
> + goto error_free;
If you are not going to include a CPU list then I suggest setting

  sg_table->orig_nents = 0

and using only nents, which is the length of the DMA list.

At least it gives some hope that other parts of the system could
detect this.
> +
> + /* Merge pages and put them into the scatterlist */
> + cur_page = 0;
> + for_each_sgtable_sg((*sgt), sg, i) {
for_each_sgtable_sg should never be used when working with
sg_dma_address() type stuff, here and everywhere else. The DMA list
should be iterated using the for_each_sgtable_dma_sg() macro.
> + /* In case we got a large memory area to export, we need to divide it
> + * to smaller areas because each entry in the dmabuf sgt can only
> + * describe unsigned int.
> + */
Huh? This is forming a SGL, it should follow the SGL rules which means
you have to fragment based on the dma_get_max_seg_size() of the
importer device.
> + hl_dmabuf->pages = kcalloc(hl_dmabuf->npages, sizeof(*hl_dmabuf->pages),
> + GFP_KERNEL);
> + if (!hl_dmabuf->pages) {
> + rc = -ENOMEM;
> + goto err_free_dmabuf_wrapper;
> + }
Why not just create the SGL directly? Is there a reason it needs to
make a page list?
Jason
On Mon, Jul 05, 2021 at 10:15:45AM +0800, Desmond Cheong Zhi Xi wrote:
> On 3/7/21 3:07 am, Daniel Vetter wrote:
> > On Fri, Jul 02, 2021 at 12:53:53AM +0800, Desmond Cheong Zhi Xi wrote:
> > > This patch series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique():
> > > https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f8…
> > >
> > > The series is broken up into five patches:
> > >
> > > 1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
> > >
> > > 2. Move a call to _drm_lease_held() out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find().
> > >
> > > 3. Implement a locked version of drm_is_current_master() function that's used within drm_auth.c.
> > >
> > > 4. Serialize drm_file.master by introducing a new lock that's held whenever the value of drm_file.master changes.
> > >
> > > 5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
> > >
> > > Changes in v6 -> v7:
> > > - Patch 2:
> > > Modify code alignment as suggested by the intel-gfx CI.
> > >
> > > Update commit message based on the changes to patch 5.
> > >
> > > - Patch 4:
> > > Add patch 4 to the series. This patch adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
> > >
> > > - Patch 5:
> > > Move kerneldoc comment about protecting drm_file.master with drm_device.master_mutex into patch 4.
> > >
> > > Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
> >
> > So there's another one now because master->leases is protected by the
> > mode_config.idr_mutex, and that's a bit awkward to untangle.
> >
> > Also I'm really surprised that there was no lockdep splat through the atomic
> > code anywhere. The reason seems to be that somehow CI rebooted first before
> > it managed to run any of the kms_atomic tests, and we can only hit this
> > when we go through the atomic kms ioctl, the legacy kms ioctls don't have
> > that specific issue.
> >
> > Anyway I think this approach doesn't look too workable, and we need
> > something new.
> >
> > But first things first: Are you still on board working on this? You
> > started with a simple patch to fix a UAF bug, now we're deep into
> > reworking tricky locking ... If you feel like you want out I'm totally
> > fine with that.
> >
>
> Hi Daniel,
>
> Thanks for asking, but I'm committed to seeing this through :) In fact, I
> really appreciate all your guidance and patience as the simple patch evolved
> into the current state of things.
Cool, it's definitely been fun trying to figure out a good solution for
this tricky problem here :-)
> > Anyway, I think we need to split drm_device->master_mutex up into two
> > parts:
> >
> > - One part that protects the actual access/changes, which I think for
> > simplicity we'll just leave as the current lock. That lock is a very
> > inner lock, since for the drm_lease.c stuff it has to nest within
> > mode_config.idr_mutex even.
> >
> > - Now the issue with checking master status/leases/whatever as an
> > innermost lock is that you can race, it's a classic time of check vs
> > time of use race: By the time we actually use the thing we validated
> > we're allowed to use, we might not have access anymore. There's two
> > reasons for that:
> >
> > * DROPMASTER ioctl could remove the master rights, which removes access
> > rights also for all leases
> >
> > * REVOKE_LEASE ioctl can do the same but only for a specific lease
> >
> > This is the thing we're trying to protect against in fbcon code, but
> > that's very spotty protection because all the ioctls by other users
> > aren't actually protected against this.
> >
> > So I think for this we need some kind of big reader lock.
> >
> > Now for the implementation, there's a few things:
> >
> > - I think best option for this big reader lock would be to just use srcu.
> > We only need to flush out all current readers when we drop master or
> > revoke a lease, so synchronize_srcu is perfectly good enough for this
> > purpose.
> >
> > - The fbdev code would switch over to srcu in
> > drm_master_internal_acquire() and drm_master_internal_release(). Ofc
> > within drm_master_internal_acquire we'd still need to check master
> > status with the normal master_mutex.
> >
> > - While we revamp all this we should fix the ioctl checks in drm_ioctl.c.
> > Just noticed that drm_ioctl_permit() could and should be unexported,
> > last user was removed.
> >
> > Within drm_ioctl_kernel we'd then replace the check for
> > drm_is_current_master with the drm_master_internal_acquire/release.
> >
> > - This alone does nothing, we still need to make sure that dropmaster and
> > revoke_lease ioctl flush out all other access before they return to
> > userspace. We can't just call synchronize_srcu because due to the ioctl
> > code in drm_ioctl_kernel we're in that srcu section, we'd need to add a
> > DRM_MASTER_FLUSH ioctl flag which we'd check only when DRM_MASTER is
> > set, and use to call synchronize_srcu. Maybe wrap that in a
> > drm_master_flush or so, or perhaps a drm_master_internal_release_flush.
> >
> > - Also maybe we should drop the _internal_ from that name. Feels a bit
> > wrong when we're also going to use this in the ioctl handler.
> >
> > Thoughts? Totally silly and overkill?
> >
> > Cheers, Daniel
> >
> >
>
> Just some thoughts on the previous approach before we move on to something
> new. Regarding the lockdep warning for mode_config.idr_mutex, I think that's
> resolvable now by simply removing patch 2, which is no longer really
> necessary with the introduction of a new mutex at the bottom of the lock
> hierarchy in patch 4.
Oh I missed that, this is essentially part-way to what I'm describing
above.
> I was hesitant to create a new mutex (especially since this means that
> drm_file.master is now protected by either of two mutexes), but it's
> probably the smallest fix in terms of code churn. Is that approach no good?
That's the other approach I considered. It solves the use-after-free
issue, but while I was musing all the different issues here I realized
that we might as well use the opportunity to plug a few functional races
around drm_device ownership rules.

I do think it works. One thing I'd change is make it a spinlock - that
way it's very clear that it's a tiny inner lock that's really only meant
to protect the ->master pointer.
> Otherwise, on a high level, I think using an srcu mechanism makes a lot of
> sense to me to address the issue of data items being reclaimed while some
> readers still have references to them.
>
> The implementation details seem sound to me too, but I'll need to code it up
> a bit before I can comment further.
So maybe this is complete overkill, but what about three locks :-)

- innermost spinlock, just to protect against use-after-free until we
  successfully got a reference. Essentially this is the lookup lock -
  maybe we could call it master_lookup_lock for clarity?

- mutex like we have right now to make sure master state is consistent
  when someone races set/dropmaster in userspace. This would be the only
  write lock we have.

- new srcu to make sure that after a dropmaster/revoke-lease all previous
  users' calls are flushed out with synchronize_srcu(). Essentially this
  wouldn't be a lock, but more a barrier. So maybe we should call it
  master_barrier_srcu or so? fbdev emulation in drm_client would use this,
  and also drm_ioctl code to plug the race I've spotted.

So maybe refresh your series with just the pieces you think we need for
the master lookup spinlock, and we try to land that first?

I do agree this should work against the use-after-free.

Cheers, Daniel
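
For readers following along, the lookup-lock idea (take a reference while a tiny inner lock guarantees the pointer is still valid) can be sketched roughly as below. This is a userspace model in C11, not the kernel implementation: the spinlock is built from atomic_flag, and every name here (struct master, struct file_priv, file_get_master() and so on) is a hypothetical stand-in chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct master {
	atomic_int refcount;
};

struct file_priv {
	atomic_flag lookup_lock;	/* tiny inner spinlock */
	struct master *master;		/* protected by lookup_lock */
};

static void lookup_lock_acquire(struct file_priv *f)
{
	while (atomic_flag_test_and_set_explicit(&f->lookup_lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void lookup_lock_release(struct file_priv *f)
{
	atomic_flag_clear_explicit(&f->lookup_lock, memory_order_release);
}

/* Look up the pointer and take a reference under the same lock. */
static struct master *file_get_master(struct file_priv *f)
{
	struct master *m;

	lookup_lock_acquire(f);
	m = f->master;
	if (m)
		atomic_fetch_add(&m->refcount, 1);
	lookup_lock_release(f);

	return m;
}

/* Swap the pointer under the lock; the caller puts the old one. */
static struct master *file_set_master(struct file_priv *f,
				      struct master *new_master)
{
	struct master *old;

	lookup_lock_acquire(f);
	old = f->master;
	f->master = new_master;
	lookup_lock_release(f);

	return old;
}

/* Drop a reference; frees on the last one. Returns 1 if freed. */
static int master_put(struct master *m)
{
	if (atomic_fetch_sub(&m->refcount, 1) == 1) {
		free(m);
		return 1;
	}
	return 0;
}
```

The sketch covers only the innermost lock. The mutex that keeps master state consistent and the srcu barrier for dropmaster/revoke-lease are separate layers and are not modelled here.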
>
> Best wishes,
> Desmond
>
> > > Changes in v5 -> v6:
> > > - Patch 2:
> > > Add patch 2 to the series. This patch moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
> > >
> > > - Patch 5:
> > > Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
> > >
> > > Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
> > >
> > > Modify comparison to NULL into "!master", as suggested by the intel-gfx CI.
> > >
> > > Changes in v4 -> v5:
> > > - Patch 1:
> > > Add patch 1 to the series. The changes in patch 1 do not apply to stable because they apply to new changes in the drm-misc-next branch. This patch moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
> > >
> > > Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
> > >
> > > - Patch 3:
> > > Move changes to drm_connector.c into patch 1.
> > >
> > > Changes in v3 -> v4:
> > > - Patch 3:
> > > Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here https://patchwork.freedesktop.org/patch/440406/
> > >
> > > Additionally, inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
> > >
> > > - Patch 5:
> > > Modify kerneldoc formatting.
> > >
> > > Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
> > >
> > > Changes in v2 -> v3:
> > > - Patch 3:
> > > Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
> > >
> > > - Patch 5:
> > > Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
> > >
> > > Changes in v1 -> v2:
> > > - Patch 5:
> > > Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
> > >
> > > Desmond Cheong Zhi Xi (5):
> > > drm: avoid circular locks in drm_mode_getconnector
> > > drm: separate locks in __drm_mode_object_find
> > > drm: add a locked version of drm_is_current_master
> > > drm: serialize drm_file.master with a master lock
> > > drm: protect drm_master pointers in drm_lease.c
> > >
> > > drivers/gpu/drm/drm_auth.c | 86 +++++++++++++++++++++++--------
> > > drivers/gpu/drm/drm_connector.c | 5 +-
> > > drivers/gpu/drm/drm_file.c | 1 +
> > > drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++-------
> > > drivers/gpu/drm/drm_mode_object.c | 10 ++--
> > > include/drm/drm_auth.h | 1 +
> > > include/drm/drm_file.h | 18 +++++--
> > > 7 files changed, 153 insertions(+), 49 deletions(-)
> > >
> > > --
> > > 2.25.1
> > >
> >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Integrated into the scheduler now and all users converted over.
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: David Airlie <airlied(a)linux.ie>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
drivers/gpu/drm/drm_gem.c | 96 ---------------------------------------
include/drm/drm_gem.h | 5 --
2 files changed, 101 deletions(-)
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 68deb1de8235..24d49a2636e0 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
ww_acquire_fini(acquire_ctx);
}
EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
- struct dma_fence *fence)
-{
- struct dma_fence *entry;
- unsigned long index;
- u32 id = 0;
- int ret;
-
- if (!fence)
- return 0;
-
- /* Deduplicate if we already depend on a fence from the same context.
- * This lets the size of the array of deps scale with the number of
- * engines involved, rather than the number of BOs.
- */
- xa_for_each(fence_array, index, entry) {
- if (entry->context != fence->context)
- continue;
-
- if (dma_fence_is_later(fence, entry)) {
- dma_fence_put(entry);
- xa_store(fence_array, index, fence, GFP_KERNEL);
- } else {
- dma_fence_put(fence);
- }
- return 0;
- }
-
- ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
- if (ret != 0)
- dma_fence_put(fence);
-
- return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
- struct drm_gem_object *obj,
- bool write)
-{
- int ret;
- struct dma_fence **fences;
- unsigned int i, fence_count;
-
- if (!write) {
- struct dma_fence *fence =
- dma_resv_get_excl_unlocked(obj->resv);
-
- return drm_gem_fence_array_add(fence_array, fence);
- }
-
- ret = dma_resv_get_fences(obj->resv, NULL,
- &fence_count, &fences);
- if (ret || !fence_count)
- return ret;
-
- for (i = 0; i < fence_count; i++) {
- ret = drm_gem_fence_array_add(fence_array, fences[i]);
- if (ret)
- break;
- }
-
- for (; i < fence_count; i++)
- dma_fence_put(fences[i]);
- kfree(fences);
- return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 240049566592..6d5e33b89074 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object **objs, int count,
struct ww_acquire_ctx *acquire_ctx);
void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
- struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
- struct drm_gem_object *obj,
- bool write);
int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
--
2.32.0.rc2