Calltree:
timeline_fence_release
drm_sched_entity_wakeup
dma_fence_signal_locked
sync_timeline_signal
sw_sync_ioctl
Releasing the reference to the fence in the fence signal callback
seems reasonable to me, so this patch avoids the locking issue in
sw_sync.
d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
fixed the recursive locking issue but caused an use-after-free. Later
d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
fixed the use-after-free but reintroduced the recursive locking issue.
In this attempt we avoid the use-after-free still because the release
function still always locks, and outside of the locking region in the
signal function we have properly refcounted references.
We furthermore also avoid the recurive lock by making sure that either:
1) We have a properly refcounted reference, preventing the signal from
triggering the release function inside the locked region.
2) The refcount was already zero, and hence nobody will be able to trigger
the release function from the signal function.
Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Bas Nieuwenhuizen <bas(a)basnieuwenhuizen.nl>
---
drivers/dma-buf/sw_sync.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..30a482f75d56 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -192,9 +192,12 @@ static const struct dma_fence_ops timeline_fence_ops = {
static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
{
struct sync_pt *pt, *next;
+ struct list_head ref_list;
trace_sync_timeline(obj);
+ INIT_LIST_HEAD(&ref_list);
+
spin_lock_irq(&obj->lock);
obj->value += inc;
@@ -206,18 +209,27 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
list_del_init(&pt->link);
rb_erase(&pt->node, &obj->pt_tree);
- /*
- * A signal callback may release the last reference to this
- * fence, causing it to be freed. That operation has to be
- * last to avoid a use after free inside this loop, and must
- * be after we remove the fence from the timeline in order to
- * prevent deadlocking on timeline->lock inside
- * timeline_fence_release().
- */
+ /* We need to take a reference to avoid a release during
+ * signalling (which can cause a recursive lock of obj->lock).
+ * If refcount was already zero, another thread is already taking
+ * care of destructing the fence, so the signal cannot release
+ * it again and we hence will not have the recursive lock. */
+ if (dma_fence_get_rcu(&pt->base))
+ list_add_tail(&pt->link, &ref_list);
+
dma_fence_signal_locked(&pt->base);
}
spin_unlock_irq(&obj->lock);
+
+ list_for_each_entry_safe(pt, next, &ref_list, link) {
+ /* This needs to be cleared before release, otherwise the
+ * timeline_fence_release function gets confused about also
+ * removing the fence from the pt_tree. */
+ list_del_init(&pt->link);
+
+ dma_fence_put(&pt->base);
+ }
}
/**
--
2.27.0
When "ovl_is_inuse" true case, trap inode reference not put.
plus adding the comment explaining sequence of
ovl_is_inuse after ovl_setup_trap.
Fixes: 0be0bfd2de9d ("ovl: fix regression caused by overlapping layers..")
Cc: <stable(a)vger.kernel.org> # v4.19+
Reviewed-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: youngjun <her0gyugyu(a)gmail.com>
---
fs/overlayfs/super.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 91476bc422f9..3097142b1e23 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1493,14 +1493,22 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs,
if (err < 0)
goto out;
+ /*
+ * Check if lower root conflicts with this overlay layers before checking
+ * if it is in-use as upperdir/workdir of "another" mount, because we do
+ * not bother to check in ovl_is_inuse() if the upperdir/workdir is in fact
+ * in-use by our upperdir/workdir.
+ */
err = ovl_setup_trap(sb, stack[i].dentry, &trap, "lowerdir");
if (err)
goto out;
if (ovl_is_inuse(stack[i].dentry)) {
err = ovl_report_in_use(ofs, "lowerdir");
- if (err)
+ if (err) {
+ iput(trap);
goto out;
+ }
}
mnt = clone_private_mount(&stack[i]);
--
2.17.1
Again, Great thanks Amir. I revise my patch through your kind guidance.
I hit this FTBFS earlier today while trying to build the LT 4.19.132
kernel from source:
https://lore.kernel.org/patchwork/patch/960281/
The "quick hack" patch mentioned at the bottom of the thread gets it
to compile. Is this the correct solution, or is there a better fix? As
it stands, tools/perf doesn't seem to compile on 4.19.132.