From: Eric Biggers <ebiggers(a)google.com>
If drm_gem_handle_create() fails in vgem_gem_create(), then the
drm_vgem_gem_object is freed twice: once when the reference is dropped
by drm_gem_object_put_unlocked(), and again by __vgem_gem_destroy().
This was hit by syzkaller using fault injection.
Fix it by skipping the second free.
Reported-by: syzbot+e73f2fb5ed5a5df36d33(a)syzkaller.appspotmail.com
Fixes: af33a9190d02 ("drm/vgem: Enable dmabuf import interfaces")
Reviewed-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
drivers/gpu/drm/vgem/vgem_drv.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 5930facd6d2d8..11a8f99ba18c5 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -191,13 +191,9 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
ret = drm_gem_handle_create(file, &obj->base, handle);
drm_gem_object_put_unlocked(&obj->base);
if (ret)
- goto err;
+ return ERR_PTR(ret);
return &obj->base;
-
-err:
- __vgem_gem_destroy(obj);
- return ERR_PTR(ret);
}
static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
--
2.21.0.rc2.261.ga7da99ff1b-goog
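The ownership rule behind the fix above can be sketched in a small userspace model. This is only an illustration under stated assumptions: struct model_obj, model_put(), model_handle_create() and model_create() are hypothetical stand-ins, not the DRM API. The point it shows is that once the creation reference is dropped after a failed handle creation, the object is already released, so the error path must not destroy it again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted GEM object (not the real DRM types). */
struct model_obj {
	int refcount;
	int *free_count;	/* incremented each time the object is released */
};

/* Drop one reference; the last put releases the object. */
static void model_put(struct model_obj *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0) {
		(*obj->free_count)++;
		free(obj);
	}
}

/* Models handle creation: on success the handle takes its own reference. */
static int model_handle_create(struct model_obj *obj, bool fail)
{
	if (fail)
		return -12;	/* arbitrary negative error code for the model */
	obj->refcount++;
	return 0;
}

/* Mirrors the fixed control flow: put once, then just return on error. */
static struct model_obj *model_create(bool fail_handle, int *free_count)
{
	struct model_obj *obj = malloc(sizeof(*obj));
	int ret;

	if (!obj)
		return NULL;
	obj->refcount = 1;	/* creation reference */
	obj->free_count = free_count;

	ret = model_handle_create(obj, fail_handle);
	model_put(obj);		/* frees obj if no handle holds it */
	if (ret)
		return NULL;	/* obj is already gone: no second destroy */
	return obj;
}
```

Calling model_create(true, &frees) in this model releases the object exactly once; adding a second destroy call in the error path would touch freed memory, which is the pattern the patch removes.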
On Wed, Feb 27, 2019 at 05:38:46PM +0900, Masami Hiramatsu wrote:
SNIP
> > When we switch it to raw_spin_lock_irqsave the return probe
> > on _raw_spin_lock starts working.
>
> Yes, there can be a race between probes and probe on irq handler.
>
> kretprobe_hash_lock()/kretprobe_hash_unlock() are safe because
> those disable irqs. Only recycle_rp_inst() has this problem.
>
> Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
>
> And this is one of the oldest bugs in kprobes.
>
> commit ef53d9c5e4da ("kprobes: improve kretprobe scalability with hashed locking")
>
> introduced the spin_lock(&rp->lock) in recycle_rp_inst() but forgot to disable irqs.
> And
>
> commit c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster")
ok, so I'll add:
Fixes: c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster")
>
> introduced assembly-based trampoline which didn't disable irq.
>
> Could you add Cc:stable to this patch too?
sure, attaching patch with updated changelog
thanks,
jirka
---
We can call recycle_rp_inst from both task and irq contexts,
so we should use the irqsave/irqrestore locking functions.
I wasn't able to hit this particular lockup, but I found it
while checking why a return probe on _raw_spin_lock locks up
the system, as reported by David using bpftrace with a simple
script like:
kprobe:_raw_spin_lock
{
@time[tid] = nsecs;
@symb[tid] = arg0;
}
kretprobe:_raw_spin_lock
/ @time[tid] /
{
delete(@time[tid]);
delete(@symb[tid]);
}
or by perf tool:
# perf probe -a _raw_spin_lock:%return
# perf record -e probe:_raw_spin_lock__return -a
The thing is that the _raw_spin_lock call in recycle_rp_inst
is the only one that the return probe code paths make, and it will
trigger another kprobe instance while already processing one
and lock up on the kretprobe_table_lock lock:
#12 [ffff99c337403d28] queued_spin_lock_slowpath at ffffffff9712693b
#13 [ffff99c337403d28] _raw_spin_lock_irqsave at ffffffff9794c100
#14 [ffff99c337403d38] pre_handler_kretprobe at ffffffff9719435c
#15 [ffff99c337403d68] kprobe_ftrace_handler at ffffffff97059f12
#16 [ffff99c337403d98] ftrace_ops_assist_func at ffffffff971a0421
#17 [ffff99c337403dd8] handle_edge_irq at ffffffff97139f55
#18 [ffff99c337403df0] handle_edge_irq at ffffffff97139f55
#19 [ffff99c337403e58] _raw_spin_lock at ffffffff9794c111
#20 [ffff99c337403e88] _raw_spin_lock at ffffffff9794c115
#21 [ffff99c337403ea8] trampoline_handler at ffffffff97058a8f
#22 [ffff99c337403f00] kretprobe_trampoline at ffffffff970586d5
#23 [ffff99c337403fb0] handle_irq at ffffffff97027b1f
#24 [ffff99c337403fc0] do_IRQ at ffffffff97a01bc9
--- <IRQ stack> ---
#25 [ffffa5c3c1f9fb38] ret_from_intr at ffffffff97a0098f
[exception RIP: smp_call_function_many+460]
RIP: ffffffff9716685c RSP: ffffa5c3c1f9fbe0 RFLAGS: 00000202
RAX: 0000000000000005 RBX: ffff99c337421c80 RCX: ffff99c337566260
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff99c337421c88
RBP: ffff99c337421c88 R8: 0000000000000001 R9: ffffffff98352940
R10: ffff99c33703c910 R11: ffffffff9794c110 R12: ffffffff97055680
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000040
ORIG_RAX: ffffffffffffffde CS: 0010 SS: 0018
#26 [ffffa5c3c1f9fc20] on_each_cpu at ffffffff97166918
#27 [ffffa5c3c1f9fc40] ftrace_replace_code at ffffffff97055a34
#28 [ffffa5c3c1f9fc88] ftrace_modify_all_code at ffffffff971a3552
#29 [ffffa5c3c1f9fca8] arch_ftrace_update_code at ffffffff97055a6c
#30 [ffffa5c3c1f9fcb0] ftrace_run_update_code at ffffffff971a3683
#31 [ffffa5c3c1f9fcc0] ftrace_startup at ffffffff971a6638
#32 [ffffa5c3c1f9fce8] register_ftrace_function at ffffffff971a66a0
When we switch it to raw_spin_lock_irqsave the return probe
on _raw_spin_lock starts working.
Fixes: c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster")
Cc: stable(a)vger.kernel.org
Reported-by: David Valin <dvalin(a)redhat.com>
Acked-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
---
kernel/kprobes.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index c83e54727131..c82056b354cc 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1154,9 +1154,11 @@ void recycle_rp_inst(struct kretprobe_instance *ri,
hlist_del(&ri->hlist);
INIT_HLIST_NODE(&ri->hlist);
if (likely(rp)) {
- raw_spin_lock(&rp->lock);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&rp->lock, flags);
hlist_add_head(&ri->hlist, &rp->free_instances);
- raw_spin_unlock(&rp->lock);
+ raw_spin_unlock_irqrestore(&rp->lock, flags);
} else
/* Unregistering */
hlist_add_head(&ri->hlist, head);
--
2.17.2
On 27/02/19 22:31, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
>
> to the 4.20-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> sfc-suppress-duplicate-nvmem-partition-types-in-efx_.patch
> and it can be found in the queue-4.20 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
If you are taking this patch, you also need c65285428b6e
sfc: initialise found bitmap in efx_ef10_mtd_probe
which fixes bugs in the above patch; I don't currently see it in the
stable-queue.
(Also, it's not clear whether the original fix is really needed on stable
kernels; while the bug is present there, it is harmless until a v5.0-rc1
commit, probably c4dfa25ab307 ("mtd: add support for reading MTD devices via the nvmem API")
interacts with it.)
The above remarks apply to all six stable trees for which this patch has
been queued.
-Ed
Kernel 4.14 fails to build with GCC 8 on powerpc64, due to 'in' being
uninitialised in epapr_hypercall*.
This is fixed in commit 186b8f1587c79c2fa04bfa392fdf08 upstream, and
this commit applies cleanly to the 4.14 tree. This commit is already on
the 4.19 branch.
Best,
--arw
--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org
Hi Sasha,
Thanks for the heads-up!
On Tue, Feb 26, 2019 at 09:24:00PM +0000, Sasha Levin wrote:
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v4.20.12, v4.19.25, v4.14.103, v4.9.160, v4.4.176, v3.18.136.
>
Lu Baolu, can you please check for which stable trees this commit is
relevant and provide the backports of the patch (with dependencies if
necessary) to the relevant stable trees?
Thanks,
Joerg
On 2/11/19 8:27 PM, Andrew Morton wrote:
> On Mon, 11 Feb 2019 10:02:45 -0800 <rcampbell(a)nvidia.com> wrote:
>
>> From: Ralph Campbell <rcampbell(a)nvidia.com>
>>
>> The system call, get_mempolicy() [1], passes an unsigned long *nodemask
>> pointer and an unsigned long maxnode argument which specifies the
>> length of the user's nodemask array in bits (which is rounded up).
>> The manual page says that if the maxnode value is too small,
>> get_mempolicy will return EINVAL but there is no system call to return
>> this minimum value. To determine this value, some programs search
>> /proc/<pid>/status for a line starting with "Mems_allowed:" and use
>> the number of digits in the mask to determine the minimum value.
>> A recent change to the way this line is formatted [2] causes these
>> programs to compute a value less than MAX_NUMNODES so get_mempolicy()
>> returns EINVAL.
>>
>> Change get_mempolicy(), the older compat version of get_mempolicy(), and
>> the copy_nodes_to_user() function to use nr_node_ids instead of
>> MAX_NUMNODES, thus preserving the de facto method of computing the
>> minimum size for the nodemask array and the maxnode argument.
>>
>> [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
>> [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redha…
Next time, please include linux-api and the people involved in the previous
thread [1] in the CC list. Likely there should have been a Suggested-by: for
Alexander as well.
>>
>
> Ugh, what a mess.
I'm afraid it's an even worse mess now.
> For a start, that's a crazy interface. I wish that had been brought to
> our attention so we could have provided a sane way for userspace to
> determine MAX_NUMNODES.
>
> Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
> (not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a
There's no such commit, that sha was probably from linux-next. The patch is
still in mmotm [2]. Luckily, I would say. Maybe Linus or some automation could
run some script to check for bogus Fixes tags before accepting patches?
> regression. The proposed get_mempolicy() change appears to be a good
> one, but is a strange way of addressing the regression. I suppose it's
> acceptable, as long as this change is backported into kernels which
> have 4fb8e5b89bcbbb.
Based on the non-existing sha, hopefully it wasn't backported anywhere, but
maybe some AI did anyway. Ah, seems like it indeed made it as far as 4.9, as a
fix for a non-existing commit and without proper linux-api consideration :(
I guess it's too late to revert it for 5.0. Hopefully the change is really safe
and won't break anything, i.e. hopefully nobody was determining MAX_NUMNODES by
increasing the buffer size until get_mempolicy() stopped returning EINVAL. Or
there could be some other problem, e.g. in the CRIU context.
What about the manpage? It says "The value specified by maxnode is less than
the number of node IDs supported by the system." which could perhaps apply
to either nr_node_ids or MAX_NUMNODES. Or should we update it?
[1]
https://lore.kernel.org/linux-mm/631c44cc-df2d-40d4-a537-d24864df0679@nvidi…
[2]
https://www.ozlabs.org/~akpm/mmotm/broken-out/include-linux-nodemaskh-use-n…
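The de facto userspace method described in the quoted changelog (counting the digits on the "Mems_allowed:" line) can be sketched as follows. This is a minimal illustration, assuming the line carries comma-separated hexadecimal words; mems_allowed_bits() is a hypothetical helper name, and a real program would read the line from /proc/<pid>/status first.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Given one line of /proc/<pid>/status, return the number of mask bits
 * implied by the hex digits on a "Mems_allowed:" line (4 bits per digit),
 * or 0 if the line is a different field.
 */
static unsigned long mems_allowed_bits(const char *line)
{
	static const char key[] = "Mems_allowed:";
	unsigned long digits = 0;
	const char *p;

	assert(line != NULL);
	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return 0;

	/* Count hex digits only; separators such as ',' and tabs are skipped. */
	for (p = line + sizeof(key) - 1; *p && *p != '\n'; p++)
		if (isxdigit((unsigned char)*p))
			digits++;

	return digits * 4;
}
```

A caller would pass the result as maxnode to get_mempolicy(). If the line is formatted with fewer digits than MAX_NUMNODES requires, the computed value shrinks accordingly, which is how the EINVAL described above arises when the kernel still compares against MAX_NUMNODES.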
Quoting Kenneth Graunke (2018-01-05 06:06:34)
> On Thursday, January 4, 2018 4:41:35 PM PST Rodrigo Vivi wrote:
> > On Thu, Jan 04, 2018 at 11:39:23PM +0000, Kenneth Graunke wrote:
> > > On Thursday, January 4, 2018 1:23:06 PM PST Chris Wilson wrote:
> > > > Quoting Kenneth Graunke (2018-01-04 19:38:05)
> > > > > Geminilake requires the 3D driver to select whether barriers are
> > > > > intended for compute shaders, or tessellation control shaders, by
> > > > > whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
> > > > > switching pipelines. Failure to do this properly can result in GPU
> > > > > hangs.
> > > > >
> > > > > Unfortunately, this means it needs to switch mid-batch, so only
> > > > > userspace can properly set it. To facilitate this, the kernel needs
> > > > > to whitelist the register.
> > > > >
> > > > > Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org>
> > > > > Cc: stable(a)vger.kernel.org
> > > > > ---
> > > > > drivers/gpu/drm/i915/i915_reg.h | 2 ++
> > > > > drivers/gpu/drm/i915/intel_engine_cs.c | 5 +++++
> > > > > 2 files changed, 7 insertions(+)
> > > > >
> > > > > Hello,
> > > > >
> > > > > We unfortunately need to whitelist an extra register for GPU hang fix
> > > > > on Geminilake. Here's the corresponding Mesa patch:
> > > >
> > > > Thankfully it appears to be context saved. Has a w/a name been assigned
> > > > for this?
> > > > -Chris
> > >
> > > There doesn't appear to be one. The workaround page lists it, but there
> > > is no name. The register description has a note saying that you need to
> > > set this, but doesn't call it out as a workaround.
> >
> > It mentions only BXT:ALL, with no mention of GLK.
> >
> > Should we add to both then?
>
> Well, that's irritating. On the workarounds page, it does indeed say
> "BXT" with no mention of GLK. But the workaround text says to set
> "SLICE_COMMON_CHICKEN_ECO1 Barrier Mode [...] (bit 7 of MMIO 0x731C)."
>
> Looking at the register definition for SLICE_COMMON_ECO_CHICKEN1, bit 7
> is "Barrier Mode" on [GLK] only, with no mention of BXT. It's marked
> reserved PBC on [SKL+, not GLK, not KBL]. On KBL it's something else.
>
> I believe Mark saw tessellation control shader hangs on
> Geminilake only, and never saw this issue on Broxton. So, my guess is
> that the workaround really is new on Geminilake, and the BXT tag on the
> workarounds page is incorrect. (Mark, does that sound right to you?)
Hi, I'm back!
This fails a selftest on glk as we can't even write to the register
0x731c, or at least can't read from the register.
Did bspec ever get updated to include this register & wa?
-Chris
Daniel Verkamp reported that the backport of 0d640732dbeb ("arm64: KVM: Skip
MMIO insn after emulation") to 4.4-stable has broken KVM on arm/arm64.
It turns out that the guest cannot make forward progress as soon as it hits
a device emulated by the host kernel, like the interrupt controller. The
reason for this is a set of missing dependencies from the 4.7 era. With
these patches added to 4.4.175, I'm able to boot guests normally.
Tested with both kvmtool and crosvm.
Christoffer Dall (1):
KVM: arm/arm64: Fix MMIO emulation data handling
Marc Zyngier (1):
arm/arm64: KVM: Feed initialized memory to MMIO accesses
arch/arm/kvm/mmio.c | 10 ++++++----
virt/kvm/arm/vgic.c | 7 -------
2 files changed, 6 insertions(+), 11 deletions(-)
--
2.20.1