Hi everybody,
since I got positive feedback from Daniel I have continued working on this approach.
A few issues are still open:
1. Daniel suggested that I make the invalidate_mappings callback a parameter of dma_buf_attach().
This approach unfortunately won't work, because the importer is not necessarily ready to handle invalidation events at the time the attachment is created.
E.g. in the amdgpu example we first need to set up the imported GEM/TTM objects and install them in the attachment.
My solution is to introduce a separate function which grabs the locks and sets the callback; this function could then also be used to pin the buffer later on, if that turns out to be necessary after all. A rough sketch of the importer side is shown below.
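To illustrate, here is a rough sketch of the importer side using the attach_info structure from the patch below. The my_importer structure and my_importer_invalidate() are made-up placeholders, not part of the patch:

#include <linux/dma-buf.h>
#include <linux/err.h>

static void my_importer_invalidate(struct dma_buf_attachment *attach)
{
        struct my_importer *imp = attach->priv;

        /* Called with the reservation object locked: just mark the
         * cached mapping as stale and re-map it on next use.
         */
        imp->mapping_valid = false;
}

int my_importer_attach(struct my_importer *imp, struct dma_buf *dmabuf)
{
        struct dma_buf_attach_info info = {
                .dmabuf = dmabuf,
                .dev = imp->dev,
                .priv = imp,
                .invalidate = my_importer_invalidate,
        };

        imp->attach = dma_buf_attach(&info);
        return PTR_ERR_OR_ZERO(imp->attach);
}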
2. With my example setup this currently results in a ping/pong situation because the exporter prefers a VRAM placement while the importer prefers a GTT placement.
This results in quite a performance drop, but it can be fixed by a simple Mesa patch which allows shared BOs to be placed in both VRAM and GTT (see the sketch below).
The question is what we should do in the meantime: accept the performance drop, or only allow unpinned sharing with a new Mesa?
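For reference, the Mesa/libdrm side could look roughly like the sketch below. I'm writing the libdrm amdgpu API from memory here, so treat the field and flag names as assumptions rather than the actual patch:

#include <amdgpu.h>
#include <amdgpu_drm.h>

static int alloc_shareable_bo(amdgpu_device_handle dev, uint64_t size,
                              amdgpu_bo_handle *bo)
{
        struct amdgpu_bo_alloc_request req = {
                .alloc_size = size,
                .phys_alignment = 4096,
                /* Allow both placements instead of VRAM only, so the
                 * exporter and the importer stop ping/ponging the buffer.
                 */
                .preferred_heap = AMDGPU_GEM_DOMAIN_VRAM |
                                  AMDGPU_GEM_DOMAIN_GTT,
        };

        return amdgpu_bo_alloc(dev, &req, bo);
}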
Please review and comment,
Christian.
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin
the backing store.
v2: don't try to invalidate mappings when the callback is NULL,
lock the reservation obj while using the attachments,
add helper to set the callback
v3: move flag for invalidation support into the DMA-buf,
use new attach_info structure to set the callback
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
drivers/dma-buf/dma-buf.c | 43 +++++++++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 28 ++++++++++++++++++++++++++++
2 files changed, 71 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d2e8ca0d9427..ffaa2f9a9c2c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -566,6 +566,7 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
attach->dev = info->dev;
attach->dmabuf = dmabuf;
attach->priv = info->priv;
+ attach->invalidate = info->invalidate;
mutex_lock(&dmabuf->lock);
@@ -574,7 +575,9 @@ struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info
if (ret)
goto err_attach;
}
+ reservation_object_lock(dmabuf->resv, NULL);
list_add(&attach->node, &dmabuf->attachments);
+ reservation_object_unlock(dmabuf->resv);
mutex_unlock(&dmabuf->lock);
return attach;
@@ -600,7 +603,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
return;
mutex_lock(&dmabuf->lock);
+ reservation_object_lock(dmabuf->resv, NULL);
list_del(&attach->node);
+ reservation_object_unlock(dmabuf->resv);
if (dmabuf->ops->detach)
dmabuf->ops->detach(dmabuf, attach);
@@ -634,10 +639,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);
+ /*
+ * Mapping a DMA-buf can trigger its invalidation. Prevent sending this
+ * event to the caller by temporarily removing this attachment from the
+ * list.
+ */
+ if (attach->invalidate) {
+ reservation_object_assert_held(attach->dmabuf->resv);
+ list_del(&attach->node);
+ }
+
sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
if (!sg_table)
sg_table = ERR_PTR(-ENOMEM);
+ if (attach->invalidate)
+ list_add(&attach->node, &attach->dmabuf->attachments);
+
return sg_table;
}
EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
@@ -658,6 +676,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
{
might_sleep();
+ if (attach->invalidate)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
return;
@@ -666,6 +687,26 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
}
EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf: [in] buffer whose mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+ struct dma_buf_attachment *attach;
+
+ reservation_object_assert_held(dmabuf->resv);
+
+ list_for_each_entry(attach, &dmabuf->attachments, node)
+ if (attach->invalidate)
+ attach->invalidate(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
/**
* DOC: cpu access
*
@@ -1123,10 +1164,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
seq_puts(s, "\tAttached Devices:\n");
attach_count = 0;
+ reservation_object_lock(buf_obj->resv, NULL);
list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
attach_count++;
}
+ reservation_object_unlock(buf_obj->resv);
seq_printf(s, "Total %d devices attached\n\n",
attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 2c27568d44af..15dd8598bff1 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -270,6 +270,8 @@ struct dma_buf_ops {
* @poll: for userspace poll support
* @cb_excl: for userspace poll support
* @cb_shared: for userspace poll support
+ * @invalidation_supported: True when the exporter supports unpinned operation
+ * using the reservation lock.
*
* This represents a shared buffer, created by calling dma_buf_export(). The
* userspace representation is a normal file descriptor, which can be created by
@@ -293,6 +295,7 @@ struct dma_buf {
struct list_head list_node;
void *priv;
struct reservation_object *resv;
+ bool invalidation_supported;
/* poll support */
wait_queue_head_t poll;
@@ -326,6 +329,28 @@ struct dma_buf_attachment {
struct device *dev;
struct list_head node;
void *priv;
+
+ /**
+ * @invalidate:
+ *
+ * Optional callback provided by the importer of the dma-buf.
+ *
+ * If provided the exporter can avoid pinning the backing store while
+ * mappings exist.
+ *
+ * The function is called with the lock of the reservation object
+ * associated with the dma_buf held and the mapping function must be
+ * called with this lock held as well. This makes sure that no mapping
+ * is created concurrently with an ongoing invalidation.
+ *
+ * After the callback all existing mappings are still valid until all
+ * fences in the dma_buf's reservation object are signaled, but should be
+ * destroyed by the importer as soon as possible.
+ *
+ * New mappings can be created immediately, but can't be used before the
+ * exclusive fence in the dma_buf's reservation object is signaled.
+ */
+ void (*invalidate)(struct dma_buf_attachment *attach);
};
/**
@@ -367,6 +392,7 @@ struct dma_buf_export_info {
* @dmabuf: the exported dma_buf
* @dev: the device which wants to import the attachment
* @priv: private data of importer to this attachment
+ * @invalidate: callback to use for invalidating mappings
*
* This structure holds the information required to attach to a buffer. Used
* with dma_buf_attach() only.
@@ -375,6 +401,7 @@ struct dma_buf_attach_info {
struct dma_buf *dmabuf;
struct device *dev;
void *priv;
+ void (*invalidate)(struct dma_buf_attachment *attach);
};
/**
@@ -406,6 +433,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
enum dma_data_direction);
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.14.1
This set of patches adds an optional invalidate_mappings callback to each DMA-buf attachment which can be filled in by the importer.
This callback allows the exporter to provide the DMA-buf content without pinning it. The reservation object's lock acts as the synchronization point for buffer moves and for creating mappings.
This set includes an implementation for amdgpu which should be rather easily portable to other DRM drivers.
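On the exporter side this boils down to something like the following sketch before moving the backing store. The my_exporter_* names are placeholders; only dma_buf_invalidate_mappings() and the reservation lock usage come from the patches:

static void my_exporter_evict(struct my_exporter_bo *bo)
{
        struct dma_buf *dmabuf = bo->dmabuf;

        reservation_object_lock(dmabuf->resv, NULL);

        /* Tell all importers that their mappings are about to become
         * stale.
         */
        dma_buf_invalidate_mappings(dmabuf);

        /* Existing mappings stay valid until the fences in the
         * reservation object signal, so the actual move must be ordered
         * through those fences.
         */
        my_exporter_move_backing_store(bo);

        reservation_object_unlock(dmabuf->resv);
}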
Please comment,
Christian.
Each importer can now provide an invalidate_mappings callback.
This allows the exporter to provide the mappings without the need to pin
the backing store.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
drivers/dma-buf/dma-buf.c | 25 +++++++++++++++++++++++++
include/linux/dma-buf.h | 36 ++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index d78d5fc173dc..ed8d5844ae74 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -629,6 +629,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
might_sleep();
+ if (attach->invalidate_mappings)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf))
return ERR_PTR(-EINVAL);
@@ -656,6 +659,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
{
might_sleep();
+ if (attach->invalidate_mappings)
+ reservation_object_assert_held(attach->dmabuf->resv);
+
if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
return;
@@ -664,6 +670,25 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
}
EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+/**
+ * dma_buf_invalidate_mappings - invalidate all mappings of this dma_buf
+ *
+ * @dmabuf: [in] buffer whose mappings should be invalidated
+ *
+ * Informs all attachments that they need to destroy and recreate all their
+ * mappings.
+ */
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
+{
+ struct dma_buf_attachment *attach;
+
+ reservation_object_assert_held(dmabuf->resv);
+
+ list_for_each_entry(attach, &dmabuf->attachments, node)
+ attach->invalidate_mappings(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_invalidate_mappings);
+
/**
* DOC: cpu access
*
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 085db2fee2d7..c1e2f7d93509 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -91,6 +91,18 @@ struct dma_buf_ops {
*/
void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
+ /**
+ * @supports_mapping_invalidation:
+ *
+ * True for exporters which support unpinned DMA-buf operation using
+ * the reservation lock.
+ *
+ * When attachment->invalidate_mappings is set the @map_dma_buf and
+ * @unmap_dma_buf callbacks can be called with the reservation lock
+ * held.
+ */
+ bool supports_mapping_invalidation;
+
/**
* @map_dma_buf:
*
@@ -326,6 +338,29 @@ struct dma_buf_attachment {
struct device *dev;
struct list_head node;
void *priv;
+
+ /**
+ * @invalidate_mappings:
+ *
+ * Optional callback provided by the importer of the attachment which
+ * must be set before mappings are created.
+ *
+ * If provided the exporter can avoid pinning the backing store while
+ * mappings exist.
+ *
+ * The function is called with the lock of the reservation object
+ * associated with the dma_buf held and the mapping function must be
+ * called with this lock held as well. This makes sure that no mapping
+ * is created concurrently with an ongoing invalidation.
+ *
+ * After the callback all existing mappings are still valid until all
+ * fences in the dma_buf's reservation object are signaled, but should be
+ * destroyed by the importer as soon as possible.
+ *
+ * New mappings can be created immediately, but can't be used before the
+ * exclusive fence in the dma_buf's reservation object is signaled.
+ */
+ void (*invalidate_mappings)(struct dma_buf_attachment *attach);
};
/**
@@ -391,6 +426,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
enum dma_data_direction);
void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
enum dma_data_direction);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
enum dma_data_direction dir);
int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.14.1
Fix the dup_sg_table function to initialize the dma_address of the new
sg list entries instead of the source dma_address entries.
Since ION duplicates the sg_list this issue does not appear to result in
an actual bug.
Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
Acked-by: Laura Abbott <labbott(a)redhat.com>
---
Changes in v2:
- Add to commit message that it doesn't cause an actual bug
- Remove 'Fixes:' since it doesn't cause a bug
- Add Acked-by from Laura Abbott
drivers/staging/android/ion/ion.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 57e0d8035b2e..517d4f40d1b7 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -187,7 +187,7 @@ static struct sg_table *dup_sg_table(struct sg_table *table)
new_sg = new_table->sgl;
for_each_sg(table->sgl, sg, table->nents, i) {
memcpy(new_sg, sg, sizeof(*sg));
- sg->dma_address = 0;
+ new_sg->dma_address = 0;
new_sg = sg_next(new_sg);
}
--
1.8.5.2
Fix the dup_sg_table function to initialize the dma_address of the new
sg list entries instead of the source dma_address entries.
Fixes: 17fd283f3870 ("staging: android: ion: Duplicate sg_table")
Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
---
drivers/staging/android/ion/ion.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index f480885e346b..3ace3a0d9210 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -197,7 +197,7 @@ static struct sg_table *dup_sg_table(struct sg_table *table)
new_sg = new_table->sgl;
for_each_sg(table->sgl, sg, table->nents, i) {
memcpy(new_sg, sg, sizeof(*sg));
- sg->dma_address = 0;
+ new_sg->dma_address = 0;
new_sg = sg_next(new_sg);
}
--
1.8.5.2
The ION begin_cpu_access and end_cpu_access functions use the
dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
maintenance.
Currently it is possible to apply cache maintenance, via the
begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
dma mapped.
The dma sync sg APIs should not be called on sg lists which have not been
dma mapped, as this can result in cache maintenance being applied to the
wrong address. If an sg list has not been dma mapped then its dma_address
field has not been populated; some DMA ops, such as swiotlb_dma_ops, use
the dma_address field to calculate the address to which to apply
cache maintenance.
Fix the ION begin_cpu_access and end_cpu_access functions to only apply
cache maintenance to buffers which have been dma mapped.
Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
---
drivers/staging/android/ion/ion.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index f480885e346b..e5df5272823d 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -214,6 +214,7 @@ struct ion_dma_buf_attachment {
struct device *dev;
struct sg_table *table;
struct list_head list;
+ bool dma_mapped;
};
static int ion_dma_buf_attach(struct dma_buf *dmabuf, struct device *dev,
@@ -235,6 +236,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf, struct device *dev,
a->table = table;
a->dev = dev;
+ a->dma_mapped = false;
INIT_LIST_HEAD(&a->list);
attachment->priv = a;
@@ -272,6 +274,7 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
direction))
return ERR_PTR(-ENOMEM);
+ a->dma_mapped = true;
return table;
}
@@ -279,7 +282,10 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
struct sg_table *table,
enum dma_data_direction direction)
{
+ struct ion_dma_buf_attachment *a = attachment->priv;
+
dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+ a->dma_mapped = false;
}
static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
@@ -345,8 +351,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
- dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
- direction);
+ if (a->dma_mapped)
+ dma_sync_sg_for_cpu(a->dev, a->table->sgl,
+ a->table->nents, direction);
}
mutex_unlock(&buffer->lock);
@@ -367,8 +374,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
- dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
- direction);
+ if (a->dma_mapped)
+ dma_sync_sg_for_device(a->dev, a->table->sgl,
+ a->table->nents, direction);
}
mutex_unlock(&buffer->lock);
--
1.8.5.2
Since commit 204f672255c2 ("staging: android: ion: Use CMA APIs directly")
the CMA API is now used directly and therefore the allocated memory is no
longer automatically zeroed.
Explicitly zero CMA allocated memory to ensure that no data is exposed to
userspace.
Fixes: 204f672255c2 ("staging: android: ion: Use CMA APIs directly")
Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
---
Changes in v2:
- Clean up the commit message.
- Add 'Fixes:'
Changes in v3:
- Add support for highmem pages
drivers/staging/android/ion/ion_cma_heap.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c
index 86196ffd2faf..fa3e4b7e0c9f 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -21,6 +21,7 @@
#include <linux/err.h>
#include <linux/cma.h>
#include <linux/scatterlist.h>
+#include <linux/highmem.h>
#include "ion.h"
@@ -51,6 +52,22 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
if (!pages)
return -ENOMEM;
+ if (PageHighMem(pages)) {
+ unsigned long nr_clear_pages = nr_pages;
+ struct page *page = pages;
+
+ while (nr_clear_pages > 0) {
+ void *vaddr = kmap_atomic(page);
+
+ memset(vaddr, 0, PAGE_SIZE);
+ kunmap_atomic(vaddr);
+ page++;
+ nr_clear_pages--;
+ }
+ } else {
+ memset(page_address(pages), 0, size);
+ }
+
table = kmalloc(sizeof(*table), GFP_KERNEL);
if (!table)
goto err;
--
1.8.5.2