This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 15 May 2020 09:41:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.123-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.123-rc1
Oleg Nesterov oleg@redhat.com ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
Ivan Delalande colona@arista.com scripts/decodecode: fix trapping instruction formatting
Josh Poimboeuf jpoimboe@redhat.com objtool: Fix stack offset tracking for indirect CFAs
Arnd Bergmann arnd@arndb.de netfilter: nf_osf: avoid passing pointer to local var
Guillaume Nault gnault@redhat.com netfilter: nat: never update the UDP checksum when it's 0
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Fix error path for bad ORC entry type
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Prevent unwinding before ORC initialization
Miroslav Benes mbenes@suse.cz x86/unwind/orc: Don't skip the first frame for inactive tasks
Jann Horn jannh@google.com x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
Josh Poimboeuf jpoimboe@redhat.com x86/entry/64: Fix unwind hints in kernel exit path
Josh Poimboeuf jpoimboe@redhat.com x86/entry/64: Fix unwind hints in register clearing code
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_v_ogm_process
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_store_throughput_override
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_show_throughput_override
George Spelvin lkml@sdf.org batman-adv: fix batadv_nc_random_weight_tq
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Mark RCX, RDX and RSI as clobbered in vmx_vcpu_run()'s asm blob
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
Luis Chamberlain mcgrof@kernel.org coredump: fix crash when umh is disabled
Oscar Carter oscar.carter@gmx.com staging: gasket: Check the return value of gasket_get_bar_index()
David Hildenbrand david@redhat.com mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
Mark Rutland mark.rutland@arm.com arm64: hugetlb: avoid potential NULL dereference
Marc Zyngier maz@kernel.org KVM: arm64: Fix 32bit PC wrap-around
Marc Zyngier maz@kernel.org KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Add a vmalloc_sync_mappings() for safe measure
Oliver Neukum oneukum@suse.com USB: serial: garmin_gps: add sanity checking for data length
Oliver Neukum oneukum@suse.com USB: uas: add quirk for LaCie 2Big Quadra
Alan Stern stern@rowland.harvard.edu HID: usbhid: Fix race between usbhid_close() and usbhid_stop()
Jere Leppänen jere.leppanen@nokia.com sctp: Fix bundling of SHUTDOWN with COOKIE-ACK
Jason Gerecke jason.gerecke@wacom.com HID: wacom: Read HID_DG_CONTACTMAX directly for non-generic devices
Willem de Bruijn willemb@google.com net: stricter validation of untrusted gso packets
Michael Chan michael.chan@broadcom.com bnxt_en: Fix VF anti-spoof filter setup.
Michael Chan michael.chan@broadcom.com bnxt_en: Improve AER slot reset.
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix command entry leak in Internal Error State
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix forced completion access non initialized command entry
Michael Chan michael.chan@broadcom.com bnxt_en: Fix VLAN acceleration handling in bnxt_fix_features().
Tuong Lien tuong.t.lien@dektech.com.au tipc: fix partial topology connection closure
Eric Dumazet edumazet@google.com sch_sfq: validate silly quantum values
Eric Dumazet edumazet@google.com sch_choke: avoid potential panic in choke_reset()
Matt Jolly Kangie@footclan.ninja net: usb: qmi_wwan: add support for DW5816e
Eric Dumazet edumazet@google.com net_sched: sch_skbprio: add message validation to skbprio_change()
Tariq Toukan tariqt@mellanox.com net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
Scott Dial scott@scottdial.com net: macsec: preserve ingress frame ordering
Eric Dumazet edumazet@google.com fq_codel: fix TCA_FQ_CODEL_DROP_BATCH_SIZE sanity checks
Julia Lawall Julia.Lawall@inria.fr dp83640: reverse arguments to list_add_tail
Nicolas Pitre nico@fluxnic.net vt: fix unicode console freeing with a common interface
Masami Hiramatsu mhiramat@kernel.org tracing/kprobes: Fix a double initialization typo
Matt Jolly Kangie@footclan.ninja USB: serial: qcserial: Add DW5816e support
-------------
Diffstat:
 Makefile | 4 +-
 arch/arm64/kvm/guest.c | 7 ++
 arch/arm64/mm/hugetlbpage.c | 2 +
 arch/x86/entry/calling.h | 40 +++++------
 arch/x86/entry/entry_64.S | 9 +--
 arch/x86/include/asm/unwind.h | 2 +-
 arch/x86/kernel/unwind_orc.c | 61 ++++++++++++-----
 arch/x86/kvm/vmx.c | 91 ++++++++++++++-----------
 drivers/hid/usbhid/hid-core.c | 37 +++++++---
 drivers/hid/usbhid/usbhid.h | 1 +
 drivers/hid/wacom_sys.c | 4 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 18 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 9 +--
 drivers/net/ethernet/mellanox/mlx4/main.c | 4 +-
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 6 +-
 drivers/net/macsec.c | 3 +-
 drivers/net/phy/dp83640.c | 2 +-
 drivers/net/usb/qmi_wwan.c | 1 +
 drivers/staging/gasket/gasket_core.c | 4 ++
 drivers/tty/vt/vt.c | 9 ++-
 drivers/usb/serial/garmin_gps.c | 4 +-
 drivers/usb/serial/qcserial.c | 1 +
 drivers/usb/storage/unusual_uas.h | 7 ++
 fs/coredump.c | 8 +++
 include/linux/virtio_net.h | 26 ++++++-
 ipc/mqueue.c | 34 ++++++---
 kernel/trace/trace.c | 13 ++++
 kernel/trace/trace_kprobe.c | 2 +-
 kernel/umh.c | 5 ++
 mm/page_alloc.c | 1 +
 net/batman-adv/bat_v_ogm.c | 2 +-
 net/batman-adv/network-coding.c | 9 +--
 net/batman-adv/sysfs.c | 3 +-
 net/netfilter/nf_nat_proto_udp.c | 5 +-
 net/netfilter/nfnetlink_osf.c | 12 ++--
 net/sched/sch_choke.c | 3 +-
 net/sched/sch_fq_codel.c | 2 +-
 net/sched/sch_sfq.c | 9 +++
 net/sched/sch_skbprio.c | 3 +
 net/sctp/sm_statefuns.c | 6 +-
 net/tipc/topsrv.c | 5 +-
 scripts/decodecode | 2 +-
 tools/objtool/check.c | 2 +-
 virt/kvm/arm/hyp/aarch32.c | 8 ++-
 virt/kvm/arm/vgic/vgic-mmio.c | 4 +-
 46 files changed, 335 insertions(+), 156 deletions(-)
From: Matt Jolly Kangie@footclan.ninja
commit 78d6de3cfbd342918d31cf68d0d2eda401338aef upstream.
Add support for Dell Wireless 5816e to drivers/usb/serial/qcserial.c
Signed-off-by: Matt Jolly Kangie@footclan.ninja
Cc: stable stable@vger.kernel.org
Signed-off-by: Johan Hovold johan@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/serial/qcserial.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/usb/serial/qcserial.c
+++ b/drivers/usb/serial/qcserial.c
@@ -173,6 +173,7 @@ static const struct usb_device_id id_tab
 	{DEVICE_SWI(0x413c, 0x81b3)},	/* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */
 	{DEVICE_SWI(0x413c, 0x81b5)},	/* Dell Wireless 5811e QDL */
 	{DEVICE_SWI(0x413c, 0x81b6)},	/* Dell Wireless 5811e QDL */
+	{DEVICE_SWI(0x413c, 0x81cc)},	/* Dell Wireless 5816e */
 	{DEVICE_SWI(0x413c, 0x81cf)},	/* Dell Wireless 5819 */
 	{DEVICE_SWI(0x413c, 0x81d0)},	/* Dell Wireless 5819 */
 	{DEVICE_SWI(0x413c, 0x81d1)},	/* Dell Wireless 5818 */
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit dcbd21c9fca5e954fd4e3d91884907eb6d47187e ]
Fix a typo that resulted in an unnecessary double initialization to addr.
Link: http://lkml.kernel.org/r/158779374968.6082.2337484008464939919.stgit@devnote...
Cc: Tom Zanussi zanussi@kernel.org
Cc: Ingo Molnar mingo@kernel.org
Cc: stable@vger.kernel.org
Fixes: c7411a1a126f ("tracing/kprobe: Check whether the non-suffixed symbol is notrace")
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org
Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/trace/trace_kprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 65b4e28ff425f..c45b017bacd47 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -538,7 +538,7 @@ static bool __within_notrace_func(unsigned long addr)
 
 static bool within_notrace_func(struct trace_kprobe *tk)
 {
-	unsigned long addr = addr = trace_kprobe_address(tk);
+	unsigned long addr = trace_kprobe_address(tk);
 	char symname[KSYM_NAME_LEN], *p;
 
 	if (!__within_notrace_func(addr))
From: Nicolas Pitre nico@fluxnic.net
[ Upstream commit 57d38f26d81e4275748b69372f31df545dcd9b71 ]
By directly using kfree() in different places we risk missing one if it is switched to using vfree(), especially if the corresponding vmalloc() is hidden away within a common abstraction.
Oh wait, that's exactly what happened here.
So let's fix this by creating a common abstraction for the free case as well.
Signed-off-by: Nicolas Pitre nico@fluxnic.net
Reported-by: syzbot+0bfda3ade1ee9288a1be@syzkaller.appspotmail.com
Fixes: 9a98e7a80f95 ("vt: don't use kmalloc() for the unicode screen buffer")
Cc: stable@vger.kernel.org
Reviewed-by: Sam Ravnborg sam@ravnborg.org
Link: https://lore.kernel.org/r/nycvar.YSQ.7.76.2005021043110.2671@knanqh.ubzr
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/tty/vt/vt.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index ca8c6ddc1ca8c..5c7a968a5ea67 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -365,9 +365,14 @@ static struct uni_screen *vc_uniscr_alloc(unsigned int cols, unsigned int rows)
 	return uniscr;
 }
 
+static void vc_uniscr_free(struct uni_screen *uniscr)
+{
+	vfree(uniscr);
+}
+
 static void vc_uniscr_set(struct vc_data *vc, struct uni_screen *new_uniscr)
 {
-	vfree(vc->vc_uni_screen);
+	vc_uniscr_free(vc->vc_uni_screen);
 	vc->vc_uni_screen = new_uniscr;
 }
 
@@ -1233,7 +1238,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
 	err = resize_screen(vc, new_cols, new_rows, user);
 	if (err) {
 		kfree(newscreen);
-		kfree(new_uniscr);
+		vc_uniscr_free(new_uniscr);
 		return err;
 	}
 
From: Julia Lawall Julia.Lawall@inria.fr
[ Upstream commit 865308373ed49c9fb05720d14cbf1315349b32a9 ]
In this code, it appears that phyter_clocks is a list head, based on the previous list_for_each, and that clock->list is intended to be a list element, given that it has just been initialized in dp83640_clock_init. Accordingly, switch the arguments to list_add_tail, which takes the list head as the second argument.
Fixes: cb646e2b02b27 ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
Signed-off-by: Julia Lawall Julia.Lawall@inria.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/phy/dp83640.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1114,7 +1114,7 @@ static struct dp83640_clock *dp83640_clo
 		goto out;
 	}
 	dp83640_clock_init(clock, bus);
-	list_add_tail(&phyter_clocks, &clock->list);
+	list_add_tail(&clock->list, &phyter_clocks);
 out:
 	mutex_unlock(&phyter_clocks_lock);
 
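To see why the argument order matters, here is a minimal userspace sketch of a circular doubly linked list. It is not the kernel's <linux/list.h>; init_list_head(), list_length() and the two demo_* functions are hypothetical names for illustration, and only the argument order of list_add_tail() (new entry first, list head second) is taken from the commit message above.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list node (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same argument order as described above: new entry first, list head second. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry != NULL && head != NULL);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Count the entries reachable from head, not counting head itself. */
static size_t list_length(const struct list_head *head)
{
	const struct list_head *pos;
	size_t n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}

/* Correct order: both clocks end up on the list. */
size_t demo_correct_order(void)
{
	struct list_head clocks, a, b;

	init_list_head(&clocks);
	init_list_head(&a);
	init_list_head(&b);
	list_add_tail(&a, &clocks);
	list_add_tail(&b, &clocks);
	return list_length(&clocks);
}

/* Swapped order: the head is re-linked into each element and loses "a". */
size_t demo_swapped_order(void)
{
	struct list_head clocks, a, b;

	init_list_head(&clocks);
	init_list_head(&a);
	init_list_head(&b);
	list_add_tail(&clocks, &a);
	list_add_tail(&clocks, &b);
	return list_length(&clocks);
}
```

In this sketch the swapped calls leave only the most recently added element reachable from the head, which is the kind of silent list corruption the patch avoids.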
From: Eric Dumazet edumazet@google.com
[ Upstream commit 14695212d4cd8b0c997f6121b6df8520038ce076 ]
My intent was to not let users set a zero drop_batch_size, it seems I once again messed with min()/max().
Fixes: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
Signed-off-by: Eric Dumazet edumazet@google.com
Acked-by: Toke Høiland-Jørgensen toke@redhat.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_fq_codel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -429,7 +429,7 @@ static int fq_codel_change(struct Qdisc
 		q->quantum = max(256U, nla_get_u32(tb[TCA_FQ_CODEL_QUANTUM]));
 
 	if (tb[TCA_FQ_CODEL_DROP_BATCH_SIZE])
-		q->drop_batch_size = min(1U, nla_get_u32(tb[TCA_FQ_CODEL_DROP_BATCH_SIZE]));
+		q->drop_batch_size = max(1U, nla_get_u32(tb[TCA_FQ_CODEL_DROP_BATCH_SIZE]));
 
 	if (tb[TCA_FQ_CODEL_MEMORY_LIMIT])
 		q->memory_limit = min(1U << 31, nla_get_u32(tb[TCA_FQ_CODEL_MEMORY_LIMIT]));
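The min()/max() mix-up is easy to reproduce outside the kernel. The sketch below uses hypothetical helpers (min_u32, max_u32, drop_batch_buggy, drop_batch_fixed) in place of the kernel macros; it only illustrates that min(1U, v) caps the value at 1 and still lets 0 through, while max(1U, v) enforces the intended lower bound.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Pre-patch form: an upper bound of 1, so 0 is still accepted. */
uint32_t drop_batch_buggy(uint32_t requested)
{
	return min_u32(1U, requested);
}

/* Patched form: a lower bound of 1, larger values pass unchanged. */
uint32_t drop_batch_fixed(uint32_t requested)
{
	uint32_t v = max_u32(1U, requested);

	assert(v >= 1U);
	return v;
}
```

With a requested batch size of 64, the buggy form yields 1 and the fixed form yields 64; with 0, the buggy form yields 0 and the fixed form yields 1.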
From: Scott Dial scott@scottdial.com
[ Upstream commit ab046a5d4be4c90a3952a0eae75617b49c0cb01b ]
MACsec decryption always occurs in a softirq context. Since the FPU may not be usable in the softirq context, the call to decrypt may be scheduled on the cryptd work queue. The cryptd work queue does not provide ordering guarantees. Therefore, preserving order requires masking out ASYNC implementations of gcm(aes).
For instance, an Intel CPU with AES-NI makes available the generic-gcm-aesni driver from the aesni_intel module to implement gcm(aes). However, this implementation requires the FPU, so it is not always available to use from a softirq context, and will fall back to the cryptd work queue, which does not preserve frame ordering. With this change, such a system would select gcm_base(ctr(aes-aesni),ghash-generic). While the aes-aesni implementation prefers to use the FPU, it will fall back to the aes-asm implementation if unavailable.
By using a synchronous version of gcm(aes), the decryption will complete before returning from crypto_aead_decrypt(). Therefore, the macsec_decrypt_done() callback will be called before returning from macsec_decrypt(). Thus, the order of calls to macsec_post_decrypt() for the frames is preserved.
While it's presumable that the pure AES-NI version of gcm(aes) is more performant, the hybrid solution is capable of gigabit speeds on modest hardware. Regardless, preserving the order of frames is paramount for many network protocols (e.g., triggering TCP retries). Within the MACsec driver itself, the replay protection is tripped by the out-of-order frames, and can cause frames to be dropped.
This bug has been present in this code since it was added in v4.6, however it may not have been noticed since not all CPUs have FPU offload available. Additionally, the bug manifests as occasional out-of-order packets that are easily misattributed to other network phenomena.
When this code was added in v4.6, the crypto/gcm.c code did not restrict selection of the ghash function based on the ASYNC flag. For instance, x86 CPUs with PCLMULQDQ would select the ghash-clmulni driver instead of ghash-generic, which submits to the cryptd work queue if the FPU is busy. However, this bug was corrected in v4.8 by commit b30bdfa86431afbafe15284a3ad5ac19b49b88e3, and was backported all the way back to the v3.14 stable branch, so this patch should be applicable back to the v4.6 stable branch.
Signed-off-by: Scott Dial scott@scottdial.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/macsec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -1313,7 +1313,8 @@ static struct crypto_aead *macsec_alloc_
 	struct crypto_aead *tfm;
 	int ret;
 
-	tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+	/* Pick a sync gcm(aes) cipher to ensure order is preserved. */
+	tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
 
 	if (IS_ERR(tfm))
 		return tfm;
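The effect of passing CRYPTO_ALG_ASYNC as the mask with a type of 0 can be modelled in a few lines. This is a simplified userspace sketch, not the crypto API itself: DEMO_ALG_ASYNC, struct demo_alg, the two table entries and their priorities are all made up for illustration. The assumption is that an implementation is eligible only when its flags agree with `type` on every bit selected by `mask`, and that the highest-priority eligible implementation wins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bit standing in for CRYPTO_ALG_ASYNC (assumption). */
#define DEMO_ALG_ASYNC 0x00000080u

struct demo_alg {
	const char *name;
	uint32_t flags;
	int priority;
};

/* Hypothetical implementations of one algorithm, with made-up priorities. */
static const struct demo_alg demo_algs[] = {
	{ "async-accelerated", DEMO_ALG_ASYNC, 400 },
	{ "sync-fallback",     0,              100 },
};

/* Eligible when flags and type agree on all bits selected by mask. */
bool demo_alg_matches(uint32_t alg_flags, uint32_t type, uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}

/* Return the index of the best eligible implementation, or -1 if none. */
int demo_pick(uint32_t type, uint32_t mask)
{
	int best = -1;
	size_t i;

	for (i = 0; i < sizeof(demo_algs) / sizeof(demo_algs[0]); i++) {
		if (!demo_alg_matches(demo_algs[i].flags, type, mask))
			continue;
		if (best < 0 || demo_algs[i].priority > demo_algs[best].priority)
			best = (int)i;
	}
	return best;
}
```

In this model, demo_pick(0, 0) ignores the ASYNC bit and returns the higher-priority asynchronous entry, while demo_pick(0, DEMO_ALG_ASYNC) requires the bit to be clear and returns the synchronous one, mirroring the before and after of the hunk above.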
From: Tariq Toukan tariqt@mellanox.com
[ Upstream commit 40e473071dbad04316ddc3613c3a3d1c75458299 ]
When ENOSPC is set the idx is still valid and gets set to the global MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:
drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be used uninitialized in this function [-Wmaybe-uninitialized] 2552 | priv->def_counter[port] = idx;
Also, when ENOSPC is returned mlx4_allocate_default_counters should not fail.
Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port")
Signed-off-by: Jason Gunthorpe jgg@mellanox.com
Signed-off-by: Tariq Toukan tariqt@mellanox.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/mellanox/mlx4/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2539,6 +2539,7 @@ static int mlx4_allocate_default_counter
 
 		if (!err || err == -ENOSPC) {
 			priv->def_counter[port] = idx;
+			err = 0;
 		} else if (err == -ENOENT) {
 			err = 0;
 			continue;
@@ -2589,7 +2590,8 @@ int mlx4_counter_alloc(struct mlx4_dev *
 				   MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
 		if (!err)
 			*idx = get_param_l(&out_param);
-
+		if (WARN_ON(err == -ENOSPC))
+			err = -EINVAL;
 		return err;
 	}
 	return __mlx4_counter_alloc(dev, idx);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 2761121af87de45951989a0adada917837d8fa82 ]
Do not assume the attribute has the right size.
Fixes: aea5f654e6b7 ("net/sched: add skbprio scheduler")
Signed-off-by: Eric Dumazet edumazet@google.com
Reported-by: syzbot syzkaller@googlegroups.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_skbprio.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/net/sched/sch_skbprio.c
+++ b/net/sched/sch_skbprio.c
@@ -173,6 +173,9 @@ static int skbprio_change(struct Qdisc *
 {
 	struct tc_skbprio_qopt *ctl = nla_data(opt);
 
+	if (opt->nla_len != nla_attr_size(sizeof(*ctl)))
+		return -EINVAL;
+
 	sch->limit = ctl->limit;
 	return 0;
 }
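The length check can be illustrated with a small userspace model of an attribute: a 4-byte header followed by the payload. Everything prefixed demo_ or DEMO_ is a hypothetical stand-in, not the kernel's netlink API; the sketch assumes the attribute size is simply header length plus payload length, as the nla_attr_size() call in the hunk suggests.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for the attribute header and the qdisc option (assumptions). */
struct demo_nlattr {
	uint16_t nla_len;	/* header plus payload, as claimed by the sender */
	uint16_t nla_type;
};

struct demo_skbprio_qopt {
	uint32_t limit;
};

#define DEMO_NLA_HDRLEN sizeof(struct demo_nlattr)

/* Read the limit only if the claimed length matches the expected size exactly. */
int demo_parse_limit(const void *attr, uint32_t *limit_out)
{
	struct demo_nlattr hdr;
	struct demo_skbprio_qopt ctl;

	assert(attr != NULL && limit_out != NULL);
	memcpy(&hdr, attr, sizeof(hdr));
	if (hdr.nla_len != DEMO_NLA_HDRLEN + sizeof(ctl))
		return -EINVAL;

	memcpy(&ctl, (const unsigned char *)attr + DEMO_NLA_HDRLEN, sizeof(ctl));
	*limit_out = ctl.limit;
	return 0;
}

/* Build an attribute whose header claims claimed_len bytes, then parse it. */
int demo_build_and_parse(uint16_t claimed_len, uint32_t limit, uint32_t *limit_out)
{
	unsigned char buf[DEMO_NLA_HDRLEN + sizeof(struct demo_skbprio_qopt)];
	struct demo_nlattr hdr = { claimed_len, 0 };
	struct demo_skbprio_qopt ctl = { limit };

	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + DEMO_NLA_HDRLEN, &ctl, sizeof(ctl));
	return demo_parse_limit(buf, limit_out);
}
```

A header claiming 8 bytes (4-byte header plus 4-byte payload) is accepted; a header claiming 4 or 6 bytes is rejected with -EINVAL before the payload is read.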
From: Matt Jolly Kangie@footclan.ninja
[ Upstream commit 57c7f2bd758eed867295c81d3527fff4fab1ed74 ]
Add support for Dell Wireless 5816e to drivers/net/usb/qmi_wwan.c
Signed-off-by: Matt Jolly Kangie@footclan.ninja
Acked-by: Bjørn Mork bjorn@mork.no
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1294,6 +1294,7 @@ static const struct usb_device_id produc
 	{QMI_FIXED_INTF(0x413c, 0x81b3, 8)},	/* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */
 	{QMI_FIXED_INTF(0x413c, 0x81b6, 8)},	/* Dell Wireless 5811e */
 	{QMI_FIXED_INTF(0x413c, 0x81b6, 10)},	/* Dell Wireless 5811e */
+	{QMI_FIXED_INTF(0x413c, 0x81cc, 8)},	/* Dell Wireless 5816e */
 	{QMI_FIXED_INTF(0x413c, 0x81d7, 0)},	/* Dell Wireless 5821e */
 	{QMI_FIXED_INTF(0x413c, 0x81d7, 1)},	/* Dell Wireless 5821e preproduction config */
 	{QMI_FIXED_INTF(0x413c, 0x81e0, 0)},	/* Dell Wireless 5821e with eSIM support*/
From: Eric Dumazet edumazet@google.com
[ Upstream commit 8738c85c72b3108c9b9a369a39868ba5f8e10ae0 ]
If choke_init() could not allocate q->tab, we would crash later in choke_reset().
BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline]
BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326
Write of size 8 at addr 0000000000000000 by task syz-executor822/7022

CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 __kasan_report.cold+0x5/0x4d mm/kasan/report.c:515
 kasan_report+0x33/0x50 mm/kasan/common.c:625
 check_memory_region_inline mm/kasan/generic.c:187 [inline]
 check_memory_region+0x141/0x190 mm/kasan/generic.c:193
 memset+0x20/0x40 mm/kasan/common.c:85
 memset include/linux/string.h:366 [inline]
 choke_reset+0x208/0x340 net/sched/sch_choke.c:326
 qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910
 dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138
 netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline]
 dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195
 dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233
 qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051
 tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670
 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454
 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
 ___sys_sendmsg+0x100/0x170 net/socket.c:2416
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset")
Signed-off-by: Eric Dumazet edumazet@google.com
Reported-by: syzbot syzkaller@googlegroups.com
Cc: Cong Wang xiyou.wangcong@gmail.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_choke.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -327,7 +327,8 @@ static void choke_reset(struct Qdisc *sc
 
 	sch->q.qlen = 0;
 	sch->qstats.backlog = 0;
-	memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *));
+	if (q->tab)
+		memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *));
 	q->head = q->tail = 0;
 	red_restart(&q->vars);
 }
From: Eric Dumazet edumazet@google.com
[ Upstream commit df4953e4e997e273501339f607b77953772e3559 ]
syzbot managed to set up sfq so that q->scaled_quantum was zero, triggering an infinite loop in sfq_dequeue()
More generally, we must only accept quantum between 1 and 2^18 - 7, meaning scaled_quantum must be in [1, 0x7FFF] range.
Otherwise, we also could have a loop in sfq_dequeue() if scaled_quantum happens to be 0x8000, since slot->allot could indefinitely switch between 0 and 0x8000.
Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair")
Signed-off-by: Eric Dumazet edumazet@google.com
Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com
Cc: Jason A. Donenfeld Jason@zx2c4.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_sfq.c | 9 +++++++++
 1 file changed, 9 insertions(+)

--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -641,6 +641,15 @@ static int sfq_change(struct Qdisc *sch,
 	if (ctl->divisor &&
 	    (!is_power_of_2(ctl->divisor) || ctl->divisor > 65536))
 		return -EINVAL;
+
+	/* slot->allot is a short, make sure quantum is not too big. */
+	if (ctl->quantum) {
+		unsigned int scaled = SFQ_ALLOT_SIZE(ctl->quantum);
+
+		if (scaled <= 0 || scaled > SHRT_MAX)
+			return -EINVAL;
+	}
+
 	if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max,
 					ctl_v1->Wlog))
 		return -EINVAL;
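The range check can be tried out in userspace. The sketch below assumes that the scaling divides the quantum by 8 and rounds up (a shift of 3); the DEMO_* macros and demo_check_quantum() are hypothetical names, not the kernel's definitions. It also assumes a 16-bit short, so SHRT_MAX is 0x7FFF.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Assumed scaling: divide by 2^3, rounding up. */
#define DEMO_ALLOT_SHIFT 3
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define DEMO_ALLOT_SIZE(x) DEMO_DIV_ROUND_UP((x), 1u << DEMO_ALLOT_SHIFT)

/* Accept a quantum of 0 (left unchanged) or one whose scaled value fits in [1, SHRT_MAX]. */
int demo_check_quantum(uint32_t quantum)
{
	if (quantum) {
		uint32_t scaled = DEMO_ALLOT_SIZE(quantum);

		/* scaled can be 0 when quantum + 7 wraps around in 32 bits. */
		if (scaled == 0 || scaled > SHRT_MAX)
			return -EINVAL;
		assert(scaled >= 1 && scaled <= 0x7FFF);
	}
	return 0;
}
```

Under these assumptions a quantum of 1 scales to 1 and is accepted, 262144 scales to 0x8000 and is rejected, and 0xFFFFFFFF wraps to a scaled value of 0 and is rejected as well.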
From: Tuong Lien tuong.t.lien@dektech.com.au
[ Upstream commit 980d69276f3048af43a045be2925dacfb898a7be ]
When an application connects to the TIPC topology server and subscribes to some services, a new connection is created along with some 'tipc_subscription' objects to store the related data. However, there is an omission in the connection handling: when the connection or application is shut down in an orderly way (e.g. via SIGQUIT), the connection is not closed in the kernel and the 'tipc_subscription' objects are not freed either. This results in:
- The maximum number of subscriptions (65535) will be reached soon, and new subscriptions will be rejected;
- The TIPC module cannot be removed (unless the objects are somehow forced to release first).
The commit fixes the issue by closing the connection if 'recvmsg()' returns '0', i.e. when the peer has shut down gracefully. It also covers the other unexpected cases.
Acked-by: Jon Maloy jmaloy@redhat.com
Acked-by: Ying Xue ying.xue@windriver.com
Signed-off-by: Tuong Lien tuong.t.lien@dektech.com.au
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/tipc/topsrv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/tipc/topsrv.c
+++ b/net/tipc/topsrv.c
@@ -409,10 +409,11 @@ static int tipc_conn_rcv_from_sock(struc
 		read_lock_bh(&sk->sk_callback_lock);
 		ret = tipc_conn_rcv_sub(srv, con, &s);
 		read_unlock_bh(&sk->sk_callback_lock);
+		if (!ret)
+			return 0;
 	}
-	if (ret < 0)
-		tipc_conn_close(con);
 
+	tipc_conn_close(con);
 	return ret;
 }
 
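The "receive returns 0 on graceful shutdown" behaviour that the fix relies on can be observed from userspace. The sketch below is only an analogy: it uses an AF_UNIX stream socket pair rather than TIPC, and demo_recv_after_peer_shutdown() is a hypothetical helper. It shows why a handler that closes only when the return value is negative would miss an orderly shutdown.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a connected stream socket pair, let one side shut down its
 * sending direction, and return what recv() reports on the other side.
 * Returns -1 if the socket pair could not be created.
 */
ssize_t demo_recv_after_peer_shutdown(void)
{
	int sv[2];
	char buf[16];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return -1;

	/* The peer finishes sending without writing any data. */
	shutdown(sv[1], SHUT_WR);

	/* End of stream is reported as a return value of 0, not as an error. */
	n = recv(sv[0], buf, sizeof(buf), 0);
	assert(n <= (ssize_t)sizeof(buf));

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

On a POSIX system the helper is expected to return 0, which a check of the form "ret < 0" treats the same as success.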
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit c72cb303aa6c2ae7e4184f0081c6d11bf03fb96b ]
The current logic in bnxt_fix_features() will inadvertently turn on both CTAG and STAG VLAN offload if the user tries to disable both. Fix it by checking that the user is trying to enable CTAG or STAG before enabling both. The logic is supposed to enable or disable both CTAG and STAG together.
Fixes: 5a9f6b238e59 ("bnxt_en: Enable and disable RX CTAG and RX STAG VLAN acceleration together.")
Signed-off-by: Michael Chan michael.chan@broadcom.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7562,6 +7562,7 @@ static netdev_features_t bnxt_fix_featur
 					   netdev_features_t features)
 {
 	struct bnxt *bp = netdev_priv(dev);
+	netdev_features_t vlan_features;
 
 	if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp))
 		features &= ~NETIF_F_NTUPLE;
@@ -7578,12 +7579,14 @@ static netdev_features_t bnxt_fix_featur
 	/* Both CTAG and STAG VLAN accelaration on the RX side have to be
 	 * turned on or off together.
 	 */
-	if ((features & (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX)) !=
-	    (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX)) {
+	vlan_features = features & (NETIF_F_HW_VLAN_CTAG_RX |
+				    NETIF_F_HW_VLAN_STAG_RX);
+	if (vlan_features != (NETIF_F_HW_VLAN_CTAG_RX |
+			      NETIF_F_HW_VLAN_STAG_RX)) {
 		if (dev->features & NETIF_F_HW_VLAN_CTAG_RX)
 			features &= ~(NETIF_F_HW_VLAN_CTAG_RX |
 				      NETIF_F_HW_VLAN_STAG_RX);
-		else
+		else if (vlan_features)
 			features |= NETIF_F_HW_VLAN_CTAG_RX |
 				    NETIF_F_HW_VLAN_STAG_RX;
 	}
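The patched decision logic can be exercised in isolation. In the sketch below the DEMO_* bit values and demo_fix_vlan_features() are hypothetical stand-ins for the netdev feature flags and the driver callback; `current` models dev->features and `requested` models the incoming features argument.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative feature bits, not the real NETIF_F_* values. */
#define DEMO_CTAG_RX 0x1u
#define DEMO_STAG_RX 0x2u
#define DEMO_BOTH    (DEMO_CTAG_RX | DEMO_STAG_RX)

/* Keep CTAG and STAG RX offload in lockstep, following the patched logic. */
uint32_t demo_fix_vlan_features(uint32_t current, uint32_t requested)
{
	uint32_t vlan = requested & DEMO_BOTH;

	if (vlan != DEMO_BOTH) {
		if (current & DEMO_CTAG_RX)
			requested &= ~DEMO_BOTH;	/* one was turned off: turn both off */
		else if (vlan)
			requested |= DEMO_BOTH;		/* one was turned on: turn both on */
		/* neither requested and neither active: leave both off */
	}

	/* The two bits are always either both set or both clear. */
	assert((requested & DEMO_BOTH) == 0 ||
	       (requested & DEMO_BOTH) == DEMO_BOTH);
	return requested;
}
```

The case the patch addresses is demo_fix_vlan_features(0, 0): with the plain `else` of the old code, both offloads would have been switched on even though neither was requested.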
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit f3cb3cebe26ed4c8036adbd9448b372129d3c371 ]
mlx5_cmd_flush() will trigger forced completions on all valid command entries. When triggered by an asynchronous event such as fast teardown, it can happen at any stage of the command, including command initialization. It will trigger forced completion, and that can lead to completion on an uninitialized command entry.
Setting MLX5_CMD_ENT_STATE_PENDING_COMP only after the command entry is initialized ensures that forced completion is handled only if the command entry is initialized.
Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots")
Signed-off-by: Moshe Shemesh moshe@mellanox.com
Signed-off-by: Eran Ben Elisha eranbe@mellanox.com
Signed-off-by: Saeed Mahameed saeedm@mellanox.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -862,7 +862,6 @@ static void cmd_work_handler(struct work
 	}
 
 	cmd->ent_arr[ent->idx] = ent;
-	set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
 	lay = get_inst(cmd, ent->idx);
 	ent->lay = lay;
 	memset(lay, 0, sizeof(*lay));
@@ -884,6 +883,7 @@ static void cmd_work_handler(struct work
 
 	if (ent->callback)
 		schedule_delayed_work(&ent->cb_timeout_work, cb_timeout);
+	set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
 
 	/* Skip sending command to fw if internal error */
 	if (pci_channel_offline(dev->pdev) ||
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit cece6f432cca9f18900463ed01b97a152a03600a ]
Processing commands in cmd_work_handler() while already in Internal Error State will result in an entry leak, since the handler processes a forced completion without ringing the doorbell. Forced completion doesn't release the entry and the event completion will never arrive, so the entry should be released.
Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots")
Signed-off-by: Moshe Shemesh moshe@mellanox.com
Signed-off-by: Eran Ben Elisha eranbe@mellanox.com
Signed-off-by: Saeed Mahameed saeedm@mellanox.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -896,6 +896,10 @@ static void cmd_work_handler(struct work
 		MLX5_SET(mbox_out, ent->out, syndrome, drv_synd);
 
 		mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+		/* no doorbell, no need to keep the entry */
+		free_ent(cmd, ent->idx);
+		if (ent->callback)
+			free_cmd(ent);
 		return;
 	}
 
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit bae361c54fb6ac6eba3b4762f49ce14beb73ef13 ]
Improve the slot reset sequence by disabling the device to prevent bad DMAs if slot reset fails. Return the proper result instead of always PCI_ERS_RESULT_RECOVERED to the caller.
Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.")
Signed-off-by: Michael Chan michael.chan@broadcom.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9300,8 +9300,11 @@ static pci_ers_result_t bnxt_io_slot_res
 		}
 	}
 
-	if (result != PCI_ERS_RESULT_RECOVERED && netif_running(netdev))
-		dev_close(netdev);
+	if (result != PCI_ERS_RESULT_RECOVERED) {
+		if (netif_running(netdev))
+			dev_close(netdev);
+		pci_disable_device(pdev);
+	}
 
 	rtnl_unlock();
 
@@ -9312,7 +9315,7 @@ static pci_ers_result_t bnxt_io_slot_res
 			err); /* non-fatal, continue */
 	}
 
-	return PCI_ERS_RESULT_RECOVERED;
+	return result;
 }
 
 /**
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit c71c4e49afe173823a2a85b0cabc9b3f1176ffa2 ]
Fix the logic that sets the enable/disable flag for the source MAC filter according to firmware spec 1.7.1.
In the original firmware spec before 1.7.1, the VF spoof check flags were not latched after making the HWRM_FUNC_CFG call, so there was a need to keep the func_flags so that subsequent calls would preserve the VF spoof check setting. A change was made in the 1.7.1 spec so that the flags became latched. So we now set or clear the anti-spoof setting directly without retrieving the old settings in the stored vf->func_flags, which are no longer valid. We also remove the unneeded vf->func_flags.
Fixes: 8eb992e876a8 ("bnxt_en: Update firmware interface spec to 1.7.6.2.")
Signed-off-by: Michael Chan michael.chan@broadcom.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 9 ++-------
 2 files changed, 2 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -839,7 +839,6 @@ struct bnxt_vf_info {
 #define BNXT_VF_LINK_FORCED	0x4
 #define BNXT_VF_LINK_UP		0x8
 #define BNXT_VF_TRUST		0x10
-	u32	func_flags; /* func cfg flags */
 	u32	min_tx_rate;
 	u32	max_tx_rate;
 	void	*hwrm_cmd_req_addr;
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -99,11 +99,10 @@ int bnxt_set_vf_spoofchk(struct net_devi
 	if (old_setting == setting)
 		return 0;
 
-	func_flags = vf->func_flags;
 	if (setting)
-		func_flags |= FUNC_CFG_REQ_FLAGS_SRC_MAC_ADDR_CHECK_ENABLE;
+		func_flags = FUNC_CFG_REQ_FLAGS_SRC_MAC_ADDR_CHECK_ENABLE;
 	else
-		func_flags |= FUNC_CFG_REQ_FLAGS_SRC_MAC_ADDR_CHECK_DISABLE;
+		func_flags = FUNC_CFG_REQ_FLAGS_SRC_MAC_ADDR_CHECK_DISABLE;
 	/*TODO: if the driver supports VLAN filter on guest VLAN,
 	 * the spoof check should also include vlan anti-spoofing
 	 */
@@ -112,7 +111,6 @@ int bnxt_set_vf_spoofchk(struct net_devi
 	req.flags = cpu_to_le32(func_flags);
 	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 	if (!rc) {
-		vf->func_flags = func_flags;
 		if (setting)
 			vf->flags |= BNXT_VF_SPOOFCHK;
 		else
@@ -197,7 +195,6 @@ int bnxt_set_vf_mac(struct net_device *d
 	memcpy(vf->mac_addr, mac, ETH_ALEN);
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_CFG, -1, -1);
 	req.fid = cpu_to_le16(vf->fw_fid);
-	req.flags = cpu_to_le32(vf->func_flags);
 	req.enables = cpu_to_le32(FUNC_CFG_REQ_ENABLES_DFLT_MAC_ADDR);
 	memcpy(req.dflt_mac_addr, mac, ETH_ALEN);
 	return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
@@ -235,7 +232,6 @@ int bnxt_set_vf_vlan(struct net_device *
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_CFG, -1, -1);
 	req.fid = cpu_to_le16(vf->fw_fid);
-	req.flags = cpu_to_le32(vf->func_flags);
 	req.dflt_vlan = cpu_to_le16(vlan_tag);
 	req.enables = cpu_to_le32(FUNC_CFG_REQ_ENABLES_DFLT_VLAN);
 	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
@@ -274,7 +270,6 @@ int bnxt_set_vf_bw(struct net_device *de
 		return 0;
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_CFG, -1, -1);
 	req.fid = cpu_to_le16(vf->fw_fid);
-	req.flags = cpu_to_le32(vf->func_flags);
 	req.enables = cpu_to_le32(FUNC_CFG_REQ_ENABLES_MAX_BW);
 	req.max_bw = cpu_to_le32(max_tx_rate);
 	req.enables |= cpu_to_le32(FUNC_CFG_REQ_ENABLES_MIN_BW);
From: Willem de Bruijn willemb@google.com
[ Upstream commit 9274124f023b5c56dc4326637d4f787968b03607 ]
Syzkaller again found a path to a kernel crash through bad gso input: a packet with transport header extending beyond skb_headlen(skb).
Tighten validation at kernel entry:
- Verify that the transport header lies within the linear section.
To avoid pulling linux/tcp.h, verify just sizeof tcphdr. tcp_gso_segment will call pskb_may_pull (th->doff * 4) before use.
- Match the gso_type against the ip_proto found by the flow dissector.
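The two checks can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: struct pkt_view and gso_header_ok() are hypothetical names, and the fields merely stand in for skb_headlen(skb), the dissected transport offset and the dissected IP protocol.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the values the real code reads from the skb. */
struct pkt_view {
	size_t headlen;  /* bytes available in the linear section */
	size_t thoff;    /* offset of the transport header */
	int ip_proto;    /* protocol reported by the dissector */
};

/*
 * Accept the packet only if the minimal transport header fits in the
 * linear section and the protocol matches the one implied by gso_type.
 */
static bool gso_header_ok(const struct pkt_view *p, size_t thlen,
			  int expected_proto)
{
	if (p->thoff + thlen > p->headlen)
		return false;
	return p->ip_proto == expected_proto;
}
```

With a 20-byte minimal TCP header at offset 34, a linear length of 54 passes and a linear length of 40 is rejected, as is a packet whose dissected protocol is UDP while the gso_type claims TCP.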
Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/virtio_net.h | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)
--- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -3,6 +3,8 @@ #define _LINUX_VIRTIO_NET_H
#include <linux/if_vlan.h> +#include <uapi/linux/tcp.h> +#include <uapi/linux/udp.h> #include <uapi/linux/virtio_net.h>
static inline int virtio_net_hdr_set_proto(struct sk_buff *skb, @@ -28,17 +30,25 @@ static inline int virtio_net_hdr_to_skb( bool little_endian) { unsigned int gso_type = 0; + unsigned int thlen = 0; + unsigned int ip_proto;
if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) { switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) { case VIRTIO_NET_HDR_GSO_TCPV4: gso_type = SKB_GSO_TCPV4; + ip_proto = IPPROTO_TCP; + thlen = sizeof(struct tcphdr); break; case VIRTIO_NET_HDR_GSO_TCPV6: gso_type = SKB_GSO_TCPV6; + ip_proto = IPPROTO_TCP; + thlen = sizeof(struct tcphdr); break; case VIRTIO_NET_HDR_GSO_UDP: gso_type = SKB_GSO_UDP; + ip_proto = IPPROTO_UDP; + thlen = sizeof(struct udphdr); break; default: return -EINVAL; @@ -57,16 +67,22 @@ static inline int virtio_net_hdr_to_skb(
if (!skb_partial_csum_set(skb, start, off)) return -EINVAL; + + if (skb_transport_offset(skb) + thlen > skb_headlen(skb)) + return -EINVAL; } else { /* gso packets without NEEDS_CSUM do not set transport_offset. * probe and drop if does not match one of the above types. */ if (gso_type && skb->network_header) { + struct flow_keys_basic keys; + if (!skb->protocol) virtio_net_hdr_set_proto(skb, hdr); retry: - skb_probe_transport_header(skb, -1); - if (!skb_transport_header_was_set(skb)) { + if (!skb_flow_dissect_flow_keys_basic(skb, &keys, + NULL, 0, 0, 0, + 0)) { /* UFO does not specify ipv4 or 6: try both */ if (gso_type & SKB_GSO_UDP && skb->protocol == htons(ETH_P_IP)) { @@ -75,6 +91,12 @@ retry: } return -EINVAL; } + + if (keys.control.thoff + thlen > skb_headlen(skb) || + keys.basic.ip_proto != ip_proto) + return -EINVAL; + + skb_set_transport_header(skb, keys.control.thoff); } }
From: Jason Gerecke jason.gerecke@wacom.com
commit 778fbf4179991e7652e97d7f1ca1f657ef828422 upstream.
We've recently switched from extracting the value of HID_DG_CONTACTMAX at a fixed offset (which may not be correct for all tablets) to injecting the report into the driver for the generic codepath to handle. Unfortunately, this change was made for *all* tablets, even those which aren't generic. Because `wacom_wac_report` ignores reports from non-generic devices, the contact count never gets initialized. Ultimately this results in the touch device itself failing to probe, and thus the loss of touch input.
This commit adds back the fixed-offset extraction for non-generic devices.
Link: https://github.com/linuxwacom/input-wacom/issues/155 Fixes: 184eccd40389 ("HID: wacom: generic: read HID_DG_CONTACTMAX from any feature report") Signed-off-by: Jason Gerecke jason.gerecke@wacom.com Reviewed-by: Aaron Armstrong Skomra aaron.skomra@wacom.com CC: stable@vger.kernel.org # 5.3+ Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/wacom_sys.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/hid/wacom_sys.c +++ b/drivers/hid/wacom_sys.c @@ -290,9 +290,11 @@ static void wacom_feature_mapping(struct data[0] = field->report->id; ret = wacom_get_report(hdev, HID_FEATURE_REPORT, data, n, WAC_CMD_RETRIES); - if (ret == n) { + if (ret == n && features->type == HID_GENERIC) { ret = hid_report_raw_event(hdev, HID_FEATURE_REPORT, data, n, 0); + } else if (ret == 2 && features->type != HID_GENERIC) { + features->touch_max = data[1]; } else { features->touch_max = 16; hid_warn(hdev, "wacom_feature_mapping: "
From: Jere Leppänen jere.leppanen@nokia.com
commit 145cb2f7177d94bc54563ed26027e952ee0ae03c upstream.
When we start shutdown in sctp_sf_do_dupcook_a(), we want to bundle the SHUTDOWN with the COOKIE-ACK to ensure that the peer receives them at the same time and in the correct order. This bundling was broken by commit 4ff40b86262b ("sctp: set chunk transport correctly when it's a new asoc"), which assigns a transport for the COOKIE-ACK, but not for the SHUTDOWN.
Fix this by passing a reference to the COOKIE-ACK chunk as an argument to sctp_sf_do_9_2_start_shutdown() and onward to sctp_make_shutdown(). This way the SHUTDOWN chunk is assigned the same transport as the COOKIE-ACK chunk, which allows them to be bundled.
In sctp_sf_do_9_2_start_shutdown(), the void *arg parameter was previously unused. Now that we're taking it into use, it must be a valid pointer to a chunk, or NULL. There is only one call site where it's not, in sctp_sf_autoclose_timer_expire(). Fix that too.
Fixes: 4ff40b86262b ("sctp: set chunk transport correctly when it's a new asoc") Signed-off-by: Jere Leppänen jere.leppanen@nokia.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sctp/sm_statefuns.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -1880,7 +1880,7 @@ static enum sctp_disposition sctp_sf_do_ */ sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); return sctp_sf_do_9_2_start_shutdown(net, ep, asoc, - SCTP_ST_CHUNK(0), NULL, + SCTP_ST_CHUNK(0), repl, commands); } else { sctp_add_cmd_sf(commands, SCTP_CMD_NEW_STATE, @@ -5483,7 +5483,7 @@ enum sctp_disposition sctp_sf_do_9_2_sta * in the Cumulative TSN Ack field the last sequential TSN it * has received from the peer. */ - reply = sctp_make_shutdown(asoc, NULL); + reply = sctp_make_shutdown(asoc, arg); if (!reply) goto nomem;
@@ -6081,7 +6081,7 @@ enum sctp_disposition sctp_sf_autoclose_ disposition = SCTP_DISPOSITION_CONSUME; if (sctp_outq_is_empty(&asoc->outqueue)) { disposition = sctp_sf_do_9_2_start_shutdown(net, ep, asoc, type, - arg, commands); + NULL, commands); }
return disposition;
From: Alan Stern stern@rowland.harvard.edu
commit 0ed08faded1da03eb3def61502b27f81aef2e615 upstream.
The syzbot fuzzer discovered a bad race in the usbhid driver between usbhid_stop() and usbhid_close(). In particular, usbhid_stop() does:
usb_free_urb(usbhid->urbin); ... usbhid->urbin = NULL; /* don't mess up next start */
and usbhid_close() does:
usb_kill_urb(usbhid->urbin);
with no mutual exclusion. If the two routines happen to run concurrently so that usb_kill_urb() is called in between the usb_free_urb() and the NULL assignment, it will access the deallocated urb structure -- a use-after-free bug.
This patch adds a mutex to the usbhid private structure and uses it to enforce mutual exclusion of the usbhid_start(), usbhid_stop(), usbhid_open() and usbhid_close() callbacks.
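The locking pattern can be sketched in userspace with a pthread mutex. This is an analogy under stated assumptions, not the driver code: struct dev, dev_stop() and dev_close() are hypothetical, and a heap buffer stands in for usbhid->urbin. Unlike usb_kill_urb(), which tolerates a NULL urb, the sketch checks the pointer explicitly.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical device state; urb stands in for usbhid->urbin. */
struct dev {
	pthread_mutex_t mutex;  /* serializes stop and close */
	char *urb;
};

/* Free the buffer and clear the pointer as one atomic step. */
static void dev_stop(struct dev *d)
{
	pthread_mutex_lock(&d->mutex);
	free(d->urb);
	d->urb = NULL;
	pthread_mutex_unlock(&d->mutex);
}

/*
 * Touch the buffer only while holding the same mutex, so it can never
 * be observed in the freed-but-not-yet-NULL state.
 * Returns 1 if a live buffer was touched, 0 if it was already gone.
 */
static int dev_close(struct dev *d)
{
	int touched = 0;

	pthread_mutex_lock(&d->mutex);
	if (d->urb) {
		d->urb[0] = 0;
		touched = 1;
	}
	pthread_mutex_unlock(&d->mutex);
	return touched;
}
```

Because both routines take the mutex, dev_close() sees either the live buffer or NULL, never the window between free() and the NULL assignment.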
Reported-and-tested-by: syzbot+7bf5a7b0f0a1f9446f4c@syzkaller.appspotmail.com Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: stable@vger.kernel.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/usbhid/hid-core.c | 37 +++++++++++++++++++++++++++++-------- drivers/hid/usbhid/usbhid.h | 1 + 2 files changed, 30 insertions(+), 8 deletions(-)
--- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -685,16 +685,21 @@ static int usbhid_open(struct hid_device struct usbhid_device *usbhid = hid->driver_data; int res;
+ mutex_lock(&usbhid->mutex); + set_bit(HID_OPENED, &usbhid->iofl);
- if (hid->quirks & HID_QUIRK_ALWAYS_POLL) - return 0; + if (hid->quirks & HID_QUIRK_ALWAYS_POLL) { + res = 0; + goto Done; + }
res = usb_autopm_get_interface(usbhid->intf); /* the device must be awake to reliably request remote wakeup */ if (res < 0) { clear_bit(HID_OPENED, &usbhid->iofl); - return -EIO; + res = -EIO; + goto Done; }
usbhid->intf->needs_remote_wakeup = 1; @@ -728,6 +733,9 @@ static int usbhid_open(struct hid_device msleep(50);
clear_bit(HID_RESUME_RUNNING, &usbhid->iofl); + + Done: + mutex_unlock(&usbhid->mutex); return res; }
@@ -735,6 +743,8 @@ static void usbhid_close(struct hid_devi { struct usbhid_device *usbhid = hid->driver_data;
+ mutex_lock(&usbhid->mutex); + /* * Make sure we don't restart data acquisition due to * a resumption we no longer care about by avoiding racing @@ -746,12 +756,13 @@ static void usbhid_close(struct hid_devi clear_bit(HID_IN_POLLING, &usbhid->iofl); spin_unlock_irq(&usbhid->lock);
- if (hid->quirks & HID_QUIRK_ALWAYS_POLL) - return; + if (!(hid->quirks & HID_QUIRK_ALWAYS_POLL)) { + hid_cancel_delayed_stuff(usbhid); + usb_kill_urb(usbhid->urbin); + usbhid->intf->needs_remote_wakeup = 0; + }
- hid_cancel_delayed_stuff(usbhid); - usb_kill_urb(usbhid->urbin); - usbhid->intf->needs_remote_wakeup = 0; + mutex_unlock(&usbhid->mutex); }
/* @@ -1060,6 +1071,8 @@ static int usbhid_start(struct hid_devic unsigned int n, insize = 0; int ret;
+ mutex_lock(&usbhid->mutex); + clear_bit(HID_DISCONNECTED, &usbhid->iofl);
usbhid->bufsize = HID_MIN_BUFFER_SIZE; @@ -1180,6 +1193,8 @@ static int usbhid_start(struct hid_devic usbhid_set_leds(hid); device_set_wakeup_enable(&dev->dev, 1); } + + mutex_unlock(&usbhid->mutex); return 0;
fail: @@ -1190,6 +1205,7 @@ fail: usbhid->urbout = NULL; usbhid->urbctrl = NULL; hid_free_buffers(dev, hid); + mutex_unlock(&usbhid->mutex); return ret; }
@@ -1205,6 +1221,8 @@ static void usbhid_stop(struct hid_devic usbhid->intf->needs_remote_wakeup = 0; }
+ mutex_lock(&usbhid->mutex); + clear_bit(HID_STARTED, &usbhid->iofl); spin_lock_irq(&usbhid->lock); /* Sync with error and led handlers */ set_bit(HID_DISCONNECTED, &usbhid->iofl); @@ -1225,6 +1243,8 @@ static void usbhid_stop(struct hid_devic usbhid->urbout = NULL;
hid_free_buffers(hid_to_usb_dev(hid), hid); + + mutex_unlock(&usbhid->mutex); }
static int usbhid_power(struct hid_device *hid, int lvl) @@ -1385,6 +1405,7 @@ static int usbhid_probe(struct usb_inter INIT_WORK(&usbhid->reset_work, hid_reset); timer_setup(&usbhid->io_retry, hid_retry_timeout, 0); spin_lock_init(&usbhid->lock); + mutex_init(&usbhid->mutex);
ret = hid_add_device(hid); if (ret) { --- a/drivers/hid/usbhid/usbhid.h +++ b/drivers/hid/usbhid/usbhid.h @@ -93,6 +93,7 @@ struct usbhid_device { dma_addr_t outbuf_dma; /* Output buffer dma */ unsigned long last_out; /* record of last output for timeouts */
+ struct mutex mutex; /* start/stop/open/close */ spinlock_t lock; /* fifo spinlock */ unsigned long iofl; /* I/O flags (CTRL_RUNNING, OUT_RUNNING) */ struct timer_list io_retry; /* Retry timer */
From: Oliver Neukum oneukum@suse.com
commit 9f04db234af691007bb785342a06abab5fb34474 upstream.
This device needs US_FL_NO_REPORT_OPCODES to avoid going through prolonged error handling on enumeration.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: Julian Groß julian.g@posteo.de Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200429155218.7308-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -28,6 +28,13 @@ * and don't forget to CC: the USB development list linux-usb@vger.kernel.org */
+/* Reported-by: Julian Groß julian.g@posteo.de */ +UNUSUAL_DEV(0x059f, 0x105f, 0x0000, 0x9999, + "LaCie", + "2Big Quadra USB3", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_NO_REPORT_OPCODES), + /* * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI * commands in UAS mode. Observed with the 1.28 firmware; are there others?
From: Oliver Neukum oneukum@suse.com
commit e9b3c610a05c1cdf8e959a6d89c38807ff758ee6 upstream.
We must not process packets shorter than a packet ID
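A minimal sketch of the length guard, assuming the layer ID is a 32-bit little-endian value at the start of the packet. LAYERID_APPL and is_app_layer_packet() are hypothetical names chosen for illustration, and the constant's value is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for GARMIN_LAYERID_APPL; the value is assumed. */
#define LAYERID_APPL 20u

/* Read the layer ID only when the buffer is long enough to hold one. */
static bool is_app_layer_packet(const unsigned char *data, size_t len)
{
	uint32_t id;

	if (len < sizeof(id))
		return false;  /* too short: do not read past the buffer */

	/* Decode four bytes as little-endian. */
	id = (uint32_t)data[0] |
	     ((uint32_t)data[1] << 8) |
	     ((uint32_t)data[2] << 16) |
	     ((uint32_t)data[3] << 24);
	return id == LAYERID_APPL;
}
```

The length test comes first in the function, mirroring how the patch places the data_length comparison ahead of getLayerId() in the && expression.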
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-and-tested-by: syzbot+d29e9263e13ce0b9f4fd@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/garmin_gps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/serial/garmin_gps.c +++ b/drivers/usb/serial/garmin_gps.c @@ -1138,8 +1138,8 @@ static void garmin_read_process(struct g send it directly to the tty port */ if (garmin_data_p->flags & FLAGS_QUEUING) { pkt_add(garmin_data_p, data, data_length); - } else if (bulk_data || - getLayerId(data) == GARMIN_LAYERID_APPL) { + } else if (bulk_data || (data_length >= sizeof(u32) && + getLayerId(data) == GARMIN_LAYERID_APPL)) {
spin_lock_irqsave(&garmin_data_p->lock, flags); garmin_data_p->flags |= APP_RESP_SEEN;
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 11f5efc3ab66284f7aaacc926e9351d658e2577b upstream.
x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu areas can be complex, to say the least. Mappings may happen at boot up, and if nothing synchronizes the page tables, those page mappings may not be synced till they are used. This causes issues for anything that might touch one of those mappings in the path of the page fault handler. When one of those unmapped mappings is touched in the page fault handler, it will cause another page fault, which in turn will cause a page fault, and leave us in a loop of page faults.
Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split vmalloc_sync_all() into vmalloc_sync_unmappings() and vmalloc_sync_mappings(), as on system exit, it did not need to do a full sync on x86_64 (although it still needed to be done on x86_32). By chance, the vmalloc_sync_all() would synchronize the page mappings done at boot up and prevent the per cpu area from being a problem for tracing in the page fault handler. But when that synchronization in the exit of a task became a nop, it caused the problem to appear.
Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home
Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: "Tzvetomir Stoyanov (VMware)" tz.stoyanov@gmail.com Suggested-by: Joerg Roedel jroedel@suse.de Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7750,6 +7750,19 @@ static int allocate_trace_buffers(struct */ allocate_snapshot = false; #endif + + /* + * Because of some magic with the way alloc_percpu() works on + * x86_64, we need to synchronize the pgd of all the tables, + * otherwise the trace events that happen in x86_64 page fault + * handlers can't cope with accessing the chance that a + * alloc_percpu()'d memory might be touched in the page fault trace + * event. Oh, and we need to audit all other alloc_percpu() and vmalloc() + * calls in tracing, because something might get triggered within a + * page fault trace event! + */ + vmalloc_sync_mappings(); + return 0; }
From: Marc Zyngier maz@kernel.org
commit 1c32ca5dc6d00012f0c964e5fdd7042fcc71efb1 upstream.
When deciding whether a guest has to be stopped we check whether this is a private interrupt or not. Unfortunately, there's an off-by-one bug here, and we fail to recognize a whole range of interrupts as being global (GICv2 SPIs 32-63).
Fix the condition by changing > to >=.
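The boundary case is easy to see in isolation. In the sketch below, NR_PRIVATE_IRQS mirrors the value 32 of VGIC_NR_PRIVATE_IRQS (16 SGIs plus 16 PPIs), and the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt IDs 0..31 are private (SGIs and PPIs); 32 and up are shared. */
#define NR_PRIVATE_IRQS 32u

/* Buggy comparison: treats the first shared interrupt as private. */
static bool is_shared_irq_buggy(uint32_t intid)
{
	return intid > NR_PRIVATE_IRQS;
}

/* Fixed comparison: the first shared interrupt ID is exactly 32. */
static bool is_shared_irq(uint32_t intid)
{
	return intid >= NR_PRIVATE_IRQS;
}
```

The two helpers disagree only at intid == 32, which is exactly the ID the strict comparison misclassifies.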
Cc: stable@vger.kernel.org Fixes: abd7229626b93 ("KVM: arm/arm64: Simplify active_change_prepare and plug race") Reported-by: André Przywara andre.przywara@arm.com Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- virt/kvm/arm/vgic/vgic-mmio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/virt/kvm/arm/vgic/vgic-mmio.c +++ b/virt/kvm/arm/vgic/vgic-mmio.c @@ -381,7 +381,7 @@ static void vgic_mmio_change_active(stru static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid) { if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 || - intid > VGIC_NR_PRIVATE_IRQS) + intid >= VGIC_NR_PRIVATE_IRQS) kvm_arm_halt_guest(vcpu->kvm); }
@@ -389,7 +389,7 @@ static void vgic_change_active_prepare(s static void vgic_change_active_finish(struct kvm_vcpu *vcpu, u32 intid) { if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 || - intid > VGIC_NR_PRIVATE_IRQS) + intid >= VGIC_NR_PRIVATE_IRQS) kvm_arm_resume_guest(vcpu->kvm); }
From: Marc Zyngier maz@kernel.org
commit 0225fd5e0a6a32af7af0aefac45c8ebf19dc5183 upstream.
In the unlikely event that a 32bit vcpu traps into the hypervisor on an instruction that is located right at the end of the 32bit range, the emulation of that instruction is going to increment PC past the 32bit range. This isn't great, as userspace can then observe this value and get a bit confused.
Conversely, userspace can do things like (in the context of a 64bit guest that is capable of 32bit EL0) setting PSTATE to AArch64-EL0, set PC to a 64bit value, change PSTATE to AArch32-USR, and observe that PC hasn't been truncated. More confusion.
Fix both by: - truncating PC increments for 32bit guests - sanitizing all 32bit regs every time a core reg is changed by userspace, and that PSTATE indicates a 32bit mode.
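The PC truncation can be illustrated with ordinary unsigned arithmetic. The sketch below is a userspace analogy with a hypothetical helper name: it does the increment in a 32-bit variable so that it wraps instead of carrying into the upper half of the 64-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Advance a 32bit guest PC held in a 64-bit register. Doing the
 * addition in a uint32_t makes it wrap at 4 GiB instead of spilling
 * into bits 63:32.
 */
static uint64_t skip_instr32(uint64_t pc, bool is_thumb, bool is_wide_instr)
{
	uint32_t pc32 = (uint32_t)pc;

	if (is_thumb && !is_wide_instr)
		pc32 += 2;
	else
		pc32 += 4;

	return pc32;
}
```

For an instruction at 0xfffffffc, a 64-bit addition would give 0x100000000, while the 32-bit addition wraps to 0.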
Cc: stable@vger.kernel.org Acked-by: Will Deacon will@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kvm/guest.c | 7 +++++++ virt/kvm/arm/hyp/aarch32.c | 8 ++++++-- 2 files changed, 13 insertions(+), 2 deletions(-)
--- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -179,6 +179,13 @@ static int set_core_reg(struct kvm_vcpu }
memcpy((u32 *)regs + off, valp, KVM_REG_SIZE(reg->id)); + + if (*vcpu_cpsr(vcpu) & PSR_MODE32_BIT) { + int i; + + for (i = 0; i < 16; i++) + *vcpu_reg32(vcpu, i) = (u32)*vcpu_reg32(vcpu, i); + } out: return err; } --- a/virt/kvm/arm/hyp/aarch32.c +++ b/virt/kvm/arm/hyp/aarch32.c @@ -125,12 +125,16 @@ static void __hyp_text kvm_adjust_itstat */ void __hyp_text kvm_skip_instr32(struct kvm_vcpu *vcpu, bool is_wide_instr) { + u32 pc = *vcpu_pc(vcpu); bool is_thumb;
is_thumb = !!(*vcpu_cpsr(vcpu) & PSR_AA32_T_BIT); if (is_thumb && !is_wide_instr) - *vcpu_pc(vcpu) += 2; + pc += 2; else - *vcpu_pc(vcpu) += 4; + pc += 4; + + *vcpu_pc(vcpu) = pc; + kvm_adjust_itstate(vcpu); }
From: Mark Rutland mark.rutland@arm.com
commit 027d0c7101f50cf03aeea9eebf484afd4920c8d3 upstream.
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:
| CC arch/arm64/mm/pageattr.o | CC arch/arm64/mm/hugetlbpage.o | from arch/arm64/mm/hugetlbpage.c:10: | arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’: | ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference] | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’ | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’ | |arch/arm64/mm/hugetlbpage.c:232:10: | |./arch/arm64/include/asm/pgtable-types.h:28:24: | ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’ | arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
This can only occur when the kernel cannot allocate a page, and so is unlikely to happen in practice before other systems start failing.
We can avoid this by bailing out if pmd_alloc() fails, as we do earlier in the function if pud_alloc() fails.
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") Signed-off-by: Mark Rutland mark.rutland@arm.com Reported-by: Kyrill Tkachov kyrylo.tkachov@arm.com Cc: stable@vger.kernel.org # 4.5.x- Cc: Will Deacon will@kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/mm/hugetlbpage.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -218,6 +218,8 @@ pte_t *huge_pte_alloc(struct mm_struct * ptep = (pte_t *)pudp; } else if (sz == (PAGE_SIZE * CONT_PTES)) { pmdp = pmd_alloc(mm, pudp, addr); + if (!pmdp) + return NULL;
WARN_ON(addr & (sz - 1)); /*
From: David Hildenbrand david@redhat.com
commit e84fe99b68ce353c37ceeecc95dce9696c976556 upstream.
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected, e.g., while booting up.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: __pageblock_pfn_to_page+0x134/0x1c0 Call Trace: set_zone_contiguous+0x56/0x70 page_alloc_init_late+0x166/0x176 kernel_init_freeable+0xfa/0x255 kernel_init+0xa/0x106 ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB) assigned to a single NUMA node - a system that can easily be created using QEMU. Inside VMs on a hypervisor with quite some memory overcommit, this is fairly easy to trigger.
Signed-off-by: David Hildenbrand david@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Pavel Tatashin pasha.tatashin@soleen.com Reviewed-by: Pankaj Gupta pankaj.gupta.linux@gmail.com Reviewed-by: Baoquan He bhe@redhat.com Reviewed-by: Shile Zhang shile.zhang@linux.alibaba.com Acked-by: Michal Hocko mhocko@suse.com Cc: Kirill Tkhai ktkhai@virtuozzo.com Cc: Shile Zhang shile.zhang@linux.alibaba.com Cc: Pavel Tatashin pasha.tatashin@soleen.com Cc: Daniel Jordan daniel.m.jordan@oracle.com Cc: Michal Hocko mhocko@kernel.org Cc: Alexander Duyck alexander.duyck@gmail.com Cc: Baoquan He bhe@redhat.com Cc: Oscar Salvador osalvador@suse.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/page_alloc.c | 1 + 1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1422,6 +1422,7 @@ void set_zone_contiguous(struct zone *zo if (!__pageblock_pfn_to_page(block_start_pfn, block_end_pfn, zone)) return; + cond_resched(); }
/* We confirm that there is no hole */
From: Oscar Carter oscar.carter@gmx.com
commit 769acc3656d93aaacada814939743361d284fd87 upstream.
Check the return value of the gasket_get_bar_index() function, as it can return a negative value (-EINVAL). If this happens, a negative index is used in the "gasket_dev->bar_data" array.
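A minimal sketch of the same pattern outside the driver: a lookup that can return a negative errno, and a caller that checks it before indexing. struct bar, find_bar() and bar_phys_base() are hypothetical names, not the gasket API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical BAR descriptor: a mappable window and its physical base. */
struct bar {
	unsigned long start;
	unsigned long len;
	unsigned long phys_base;
};

/* Return the index of the BAR containing addr, or -EINVAL if none does. */
static int find_bar(const struct bar *bars, size_t n, unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= bars[i].start && addr - bars[i].start < bars[i].len)
			return (int)i;
	return -EINVAL;
}

/* Validate the index before it is ever used as an array subscript. */
static int bar_phys_base(const struct bar *bars, size_t n,
			 unsigned long addr, unsigned long *out)
{
	int idx = find_bar(bars, n, addr);

	if (idx < 0)
		return idx;  /* propagate the error instead of indexing */

	*out = bars[idx].phys_base;
	return 0;
}
```

An address outside every window makes bar_phys_base() return -EINVAL and leaves *out untouched.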
Addresses-Coverity-ID: 1438542 ("Negative array index read") Fixes: 9a69f5087ccc2 ("drivers/staging: Gasket driver framework + Apex driver") Signed-off-by: Oscar Carter oscar.carter@gmx.com Cc: stable stable@vger.kernel.org Reviewed-by: Richard Yeh rcy@google.com Link: https://lore.kernel.org/r/20200501155118.13380-1-oscar.carter@gmx.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/staging/gasket/gasket_core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/staging/gasket/gasket_core.c +++ b/drivers/staging/gasket/gasket_core.c @@ -933,6 +933,10 @@ do_map_region(const struct gasket_dev *g gasket_get_bar_index(gasket_dev, (vma->vm_pgoff << PAGE_SHIFT) + driver_desc->legacy_mmap_address_offset); + + if (bar_index < 0) + return DO_MAP_REGION_INVALID; + phys_base = gasket_dev->bar_data[bar_index].phys_base + phys_offset; while (mapped_bytes < map_length) { /*
From: Luis Chamberlain mcgrof@kernel.org
commit 3740d93e37902b31159a82da2d5c8812ed825404 upstream.
Commit 64e90a8acb859 ("Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()") added the option to disable all call_usermodehelper() calls by setting STATIC_USERMODEHELPER_PATH to an empty string. When this is done and a coredump is triggered, the kernel crashes on a NULL pointer dereference, since we make assumptions about what call_usermodehelper_exec() did.
This has been reported by Sergey when one triggers a coredump with the following configuration:
```
CONFIG_STATIC_USERMODEHELPER=y
CONFIG_STATIC_USERMODEHELPER_PATH=""
kernel.core_pattern = |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e
```
Disabling the umh was designed so that call_usermodehelper_exec() would just return early, without an error. But coredump assumes certain variables are set up for us when this happens, and calls file_start_write(cprm.file) with a NULL file.
```
[ 2.819676] BUG: kernel NULL pointer dereference, address: 0000000000000020
[ 2.819859] #PF: supervisor read access in kernel mode
[ 2.820035] #PF: error_code(0x0000) - not-present page
[ 2.820188] PGD 0 P4D 0
[ 2.820305] Oops: 0000 [#1] SMP PTI
[ 2.820436] CPU: 2 PID: 89 Comm: a Not tainted 5.7.0-rc1+ #7
[ 2.820680] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014
[ 2.821150] RIP: 0010:do_coredump+0xd80/0x1060
[ 2.821385] Code: e8 95 11 ed ff 48 c7 c6 cc a7 b4 81 48 8d bd 28 ff ff ff 89 c2 e8 70 f1 ff ff 41 89 c2 85 c0 0f 84 72 f7 ff ff e9 b4 fe ff ff <48> 8b 57 20 0f b7 02 66 25 00 f0 66 3d 00 80 0f 84 9c 01 00 00 44
[ 2.822014] RSP: 0000:ffffc9000029bcb8 EFLAGS: 00010246
[ 2.822339] RAX: 0000000000000000 RBX: ffff88803f860000 RCX: 000000000000000a
[ 2.822746] RDX: 0000000000000009 RSI: 0000000000000282 RDI: 0000000000000000
[ 2.823141] RBP: ffffc9000029bde8 R08: 0000000000000000 R09: ffffc9000029bc00
[ 2.823508] R10: 0000000000000001 R11: ffff88803dec90be R12: ffffffff81c39da0
[ 2.823902] R13: ffff88803de84400 R14: 0000000000000000 R15: 0000000000000000
[ 2.824285] FS: 00007fee08183540(0000) GS:ffff88803e480000(0000) knlGS:0000000000000000
[ 2.824767] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.825111] CR2: 0000000000000020 CR3: 000000003f856005 CR4: 0000000000060ea0
[ 2.825479] Call Trace:
[ 2.825790] get_signal+0x11e/0x720
[ 2.826087] do_signal+0x1d/0x670
[ 2.826361] ? force_sig_info_to_task+0xc1/0xf0
[ 2.826691] ? force_sig_fault+0x3c/0x40
[ 2.826996] ? do_trap+0xc9/0x100
[ 2.827179] exit_to_usermode_loop+0x49/0x90
[ 2.827359] prepare_exit_to_usermode+0x77/0xb0
[ 2.827559] ? invalid_op+0xa/0x30
[ 2.827747] ret_from_intr+0x20/0x20
[ 2.827921] RIP: 0033:0x55e2c76d2129
[ 2.828107] Code: 2d ff ff ff e8 68 ff ff ff 5d c6 05 18 2f 00 00 01 c3 0f 1f 80 00 00 00 00 c3 0f 1f 80 00 00 00 00 e9 7b ff ff ff 55 48 89 e5 <0f> 0b b8 00 00 00 00 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40
[ 2.828603] RSP: 002b:00007fffeba5e080 EFLAGS: 00010246
[ 2.828801] RAX: 000055e2c76d2125 RBX: 0000000000000000 RCX: 00007fee0817c718
[ 2.829034] RDX: 00007fffeba5e188 RSI: 00007fffeba5e178 RDI: 0000000000000001
[ 2.829257] RBP: 00007fffeba5e080 R08: 0000000000000000 R09: 00007fee08193c00
[ 2.829482] R10: 0000000000000009 R11: 0000000000000000 R12: 000055e2c76d2040
[ 2.829727] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2.829964] CR2: 0000000000000020
[ 2.830149] ---[ end trace ceed83d8c68a1bf1 ]---
```
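The failure mode and the guard can be sketched in userspace. This is an analogy under stated assumptions: run_helper() and do_dump() are hypothetical, a FILE pointer stands in for cprm.file, and an empty path stands in for STATIC_USERMODEHELPER_PATH="".

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the coredump parameters. */
struct dump_params {
	FILE *file;
};

/*
 * Mimics a helper launcher that reports success even when helpers are
 * disabled, in which case it never sets up p->file.
 */
static int run_helper(const char *path, struct dump_params *p)
{
	if (path[0] == '\0')
		return 0;  /* disabled: successful no-op */

	p->file = tmpfile();
	return p->file ? 0 : -1;
}

/* Returns 1 if a dump was written, 0 if dumping is disabled, -1 on error. */
static int do_dump(const char *helper_path)
{
	struct dump_params p = { .file = NULL };

	if (run_helper(helper_path, &p))
		return -1;

	/* Success does not imply the helper ran, so check before use. */
	if (!p.file)
		return 0;

	fputs("core", p.file);
	fclose(p.file);
	return 1;
}
```

Calling do_dump("") returns 0 without touching the NULL file, which is the behavior the added cprm.file check gives do_coredump().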
Cc: stable@vger.kernel.org # v4.11+ Fixes: 64e90a8acb85 ("Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199795 Reported-by: Tony Vroon chainsaw@gentoo.org Reported-by: Sergey Kvachonok ravenexp@gmail.com Tested-by: Sergei Trofimovich slyfox@gentoo.org Signed-off-by: Luis Chamberlain mcgrof@kernel.org Link: https://lore.kernel.org/r/20200416162859.26518-1-mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/coredump.c | 8 ++++++++ kernel/umh.c | 5 +++++ 2 files changed, 13 insertions(+)
--- a/fs/coredump.c +++ b/fs/coredump.c @@ -753,6 +753,14 @@ void do_coredump(const siginfo_t *siginf if (displaced) put_files_struct(displaced); if (!dump_interrupted()) { + /* + * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would + * have this set to NULL. + */ + if (!cprm.file) { + pr_info("Core dump to |%s disabled\n", cn.corename); + goto close_fail; + } file_start_write(cprm.file); core_dumped = binfmt->core_dump(&cprm); file_end_write(cprm.file); --- a/kernel/umh.c +++ b/kernel/umh.c @@ -522,6 +522,11 @@ EXPORT_SYMBOL_GPL(fork_usermode_blob); * Runs a user-space application. The application is started * asynchronously if wait is not set, and runs as a child of system workqueues. * (ie. it runs with full root capabilities and optimized affinity). + * + * Note: successful return value does not guarantee the helper was called at + * all. You can't rely on sub_info->{init,cleanup} being called even for + * UMH_WAIT_* wait modes as STATIC_USERMODEHELPER_PATH="" turns all helpers + * into a successful no-op. */ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait) {
From: Sean Christopherson sean.j.christopherson@intel.com
commit 051a2d3e59e51ae49fd56aef34e472832897ce46 upstream.
Use '%% " _ASM_CX"' instead of '%0' to dereference RCX, i.e. the 'struct vcpu_vmx' pointer, in the VM-Enter asm blobs of vmx_vcpu_run() and nested_vmx_check_vmentry_hw(). Using the positional operand '%0' means that adding or removing an output parameter requires "rewriting" almost all of the asm blob, which makes it nearly impossible to understand what's being changed in even the most minor patches.
Opportunistically improve the code comments.
Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Reviewed-by: Andi Kleen ak@linux.intel.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx.c | 86 ++++++++++++++++++++++++++++------------------------- 1 file changed, 47 insertions(+), 39 deletions(-)
--- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10776,9 +10776,9 @@ static void __noclone vmx_vcpu_run(struc "push %%" _ASM_DX "; push %%" _ASM_BP ";" "push %%" _ASM_CX " \n\t" /* placeholder for guest rcx */ "push %%" _ASM_CX " \n\t" - "cmp %%" _ASM_SP ", %c[host_rsp](%0) \n\t" + "cmp %%" _ASM_SP ", %c[host_rsp](%%" _ASM_CX ") \n\t" "je 1f \n\t" - "mov %%" _ASM_SP ", %c[host_rsp](%0) \n\t" + "mov %%" _ASM_SP ", %c[host_rsp](%%" _ASM_CX ") \n\t" /* Avoid VMWRITE when Enlightened VMCS is in use */ "test %%" _ASM_SI ", %%" _ASM_SI " \n\t" "jz 2f \n\t" @@ -10788,32 +10788,33 @@ static void __noclone vmx_vcpu_run(struc __ex(ASM_VMX_VMWRITE_RSP_RDX) "\n\t" "1: \n\t" /* Reload cr2 if changed */ - "mov %c[cr2](%0), %%" _ASM_AX " \n\t" + "mov %c[cr2](%%" _ASM_CX "), %%" _ASM_AX " \n\t" "mov %%cr2, %%" _ASM_DX " \n\t" "cmp %%" _ASM_AX ", %%" _ASM_DX " \n\t" "je 3f \n\t" "mov %%" _ASM_AX", %%cr2 \n\t" "3: \n\t" /* Check if vmlaunch of vmresume is needed */ - "cmpb $0, %c[launched](%0) \n\t" + "cmpb $0, %c[launched](%%" _ASM_CX ") \n\t" /* Load guest registers. Don't clobber flags. 
*/ - "mov %c[rax](%0), %%" _ASM_AX " \n\t" - "mov %c[rbx](%0), %%" _ASM_BX " \n\t" - "mov %c[rdx](%0), %%" _ASM_DX " \n\t" - "mov %c[rsi](%0), %%" _ASM_SI " \n\t" - "mov %c[rdi](%0), %%" _ASM_DI " \n\t" - "mov %c[rbp](%0), %%" _ASM_BP " \n\t" + "mov %c[rax](%%" _ASM_CX "), %%" _ASM_AX " \n\t" + "mov %c[rbx](%%" _ASM_CX "), %%" _ASM_BX " \n\t" + "mov %c[rdx](%%" _ASM_CX "), %%" _ASM_DX " \n\t" + "mov %c[rsi](%%" _ASM_CX "), %%" _ASM_SI " \n\t" + "mov %c[rdi](%%" _ASM_CX "), %%" _ASM_DI " \n\t" + "mov %c[rbp](%%" _ASM_CX "), %%" _ASM_BP " \n\t" #ifdef CONFIG_X86_64 - "mov %c[r8](%0), %%r8 \n\t" - "mov %c[r9](%0), %%r9 \n\t" - "mov %c[r10](%0), %%r10 \n\t" - "mov %c[r11](%0), %%r11 \n\t" - "mov %c[r12](%0), %%r12 \n\t" - "mov %c[r13](%0), %%r13 \n\t" - "mov %c[r14](%0), %%r14 \n\t" - "mov %c[r15](%0), %%r15 \n\t" + "mov %c[r8](%%" _ASM_CX "), %%r8 \n\t" + "mov %c[r9](%%" _ASM_CX "), %%r9 \n\t" + "mov %c[r10](%%" _ASM_CX "), %%r10 \n\t" + "mov %c[r11](%%" _ASM_CX "), %%r11 \n\t" + "mov %c[r12](%%" _ASM_CX "), %%r12 \n\t" + "mov %c[r13](%%" _ASM_CX "), %%r13 \n\t" + "mov %c[r14](%%" _ASM_CX "), %%r14 \n\t" + "mov %c[r15](%%" _ASM_CX "), %%r15 \n\t" #endif - "mov %c[rcx](%0), %%" _ASM_CX " \n\t" /* kills %0 (ecx) */ + /* Load guest RCX. This kills the vmx_vcpu pointer! */ + "mov %c[rcx](%%" _ASM_CX "), %%" _ASM_CX " \n\t"
/* Enter guest mode */ "jne 1f \n\t" @@ -10821,26 +10822,33 @@ static void __noclone vmx_vcpu_run(struc "jmp 2f \n\t" "1: " __ex(ASM_VMX_VMRESUME) "\n\t" "2: " - /* Save guest registers, load host registers, keep flags */ - "mov %0, %c[wordsize](%%" _ASM_SP ") \n\t" - "pop %0 \n\t" - "setbe %c[fail](%0)\n\t" - "mov %%" _ASM_AX ", %c[rax](%0) \n\t" - "mov %%" _ASM_BX ", %c[rbx](%0) \n\t" - __ASM_SIZE(pop) " %c[rcx](%0) \n\t" - "mov %%" _ASM_DX ", %c[rdx](%0) \n\t" - "mov %%" _ASM_SI ", %c[rsi](%0) \n\t" - "mov %%" _ASM_DI ", %c[rdi](%0) \n\t" - "mov %%" _ASM_BP ", %c[rbp](%0) \n\t" + + /* Save guest's RCX to the stack placeholder (see above) */ + "mov %%" _ASM_CX ", %c[wordsize](%%" _ASM_SP ") \n\t" + + /* Load host's RCX, i.e. the vmx_vcpu pointer */ + "pop %%" _ASM_CX " \n\t" + + /* Set vmx->fail based on EFLAGS.{CF,ZF} */ + "setbe %c[fail](%%" _ASM_CX ")\n\t" + + /* Save all guest registers, including RCX from the stack */ + "mov %%" _ASM_AX ", %c[rax](%%" _ASM_CX ") \n\t" + "mov %%" _ASM_BX ", %c[rbx](%%" _ASM_CX ") \n\t" + __ASM_SIZE(pop) " %c[rcx](%%" _ASM_CX ") \n\t" + "mov %%" _ASM_DX ", %c[rdx](%%" _ASM_CX ") \n\t" + "mov %%" _ASM_SI ", %c[rsi](%%" _ASM_CX ") \n\t" + "mov %%" _ASM_DI ", %c[rdi](%%" _ASM_CX ") \n\t" + "mov %%" _ASM_BP ", %c[rbp](%%" _ASM_CX ") \n\t" #ifdef CONFIG_X86_64 - "mov %%r8, %c[r8](%0) \n\t" - "mov %%r9, %c[r9](%0) \n\t" - "mov %%r10, %c[r10](%0) \n\t" - "mov %%r11, %c[r11](%0) \n\t" - "mov %%r12, %c[r12](%0) \n\t" - "mov %%r13, %c[r13](%0) \n\t" - "mov %%r14, %c[r14](%0) \n\t" - "mov %%r15, %c[r15](%0) \n\t" + "mov %%r8, %c[r8](%%" _ASM_CX ") \n\t" + "mov %%r9, %c[r9](%%" _ASM_CX ") \n\t" + "mov %%r10, %c[r10](%%" _ASM_CX ") \n\t" + "mov %%r11, %c[r11](%%" _ASM_CX ") \n\t" + "mov %%r12, %c[r12](%%" _ASM_CX ") \n\t" + "mov %%r13, %c[r13](%%" _ASM_CX ") \n\t" + "mov %%r14, %c[r14](%%" _ASM_CX ") \n\t" + "mov %%r15, %c[r15](%%" _ASM_CX ") \n\t"
/* * Clear all general purpose registers (except RSP, which is loaded by @@ -10860,7 +10868,7 @@ static void __noclone vmx_vcpu_run(struc "xor %%r15d, %%r15d \n\t" #endif "mov %%cr2, %%" _ASM_AX " \n\t" - "mov %%" _ASM_AX ", %c[cr2](%0) \n\t" + "mov %%" _ASM_AX ", %c[cr2](%%" _ASM_CX ") \n\t"
"xor %%eax, %%eax \n\t" "xor %%ebx, %%ebx \n\t"
From: Sean Christopherson sean.j.christopherson@intel.com
Based on upstream commit f3689e3f17f064fd4cd5f0cb01ae2395c94f39d9.
Save RCX, RDX and RSI to fake outputs to coerce the compiler into treating them as clobbered. RCX in particular is likely to be reused by the compiler to dereference the 'struct vcpu_vmx' pointer, which will result in a null pointer dereference now that RCX is zeroed by the asm blob.
Tag the asm() blob as volatile to prevent GCC from dropping the blob, which is possible now that the blob has output values, all of which are unused.
Upstream commit f3689e3f17f06 ("KVM: VMX: Save RSI to an unused output in the vCPU-run asm blob") is not a direct equivalent of this patch. As its shortlog states, it only tagged RSI as clobbered, whereas here RCX and RDX are also clobbered.
In upstream at the time of the offending commit (b4be98039a92 in 4.19, 0e0ab73c9a024 upstream), the inline asm blob had previously been moved to a dedicated helper, __vmx_vcpu_run(). For unrelated reasons, __vmx_vcpu_run() was put into its own optimization unit, which for all intents and purposes made it impossible to consume clobbered registers because RCX, RDX and RSI are volatile and __vmx_vcpu_run() couldn't itself be inlined. In other words, the bug existed but couldn't be hit.
Similarly, the lack of "volatile" was also a bug in upstream that was hidden by an unrelated change that exists in upstream but not in 4.19. In this case, the asm blob also uses ASM_CALL_CONSTRAINT (marks RSP as being an input/output constraint) in upstream to play nice with objtool due to the blob making a CALL. In 4.19, there is no CALL and thus no ASM_CALL_CONSTRAINT.
Furthermore, both of the lurking bugs were blasted away in upstream by commits 5e0781df1899 ("KVM: VMX: Move vCPU-run code to a proper assembly routine") and fc2ba5a27a1a ("KVM: VMX: Call vCPU-run asm sub-routine from C and remove clobbering"), i.e. these bugs will never be directly fixed in upstream.
Reported-by: Tobias Urdin tobias.urdin@binero.com Fixes: b4be98039a92 ("KVM: VMX: Zero out *all* general purpose registers after VM-Exit") Signed-off-by: Sean Christopherson sean.j.christopherson@intel.com Cc: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/vmx.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10771,7 +10771,7 @@ static void __noclone vmx_vcpu_run(struc else if (static_branch_unlikely(&mds_user_clear)) mds_clear_cpu_buffers();
- asm( + asm volatile ( /* Store host registers */ "push %%" _ASM_DX "; push %%" _ASM_BP ";" "push %%" _ASM_CX " \n\t" /* placeholder for guest rcx */ @@ -10882,7 +10882,8 @@ static void __noclone vmx_vcpu_run(struc ".global vmx_return \n\t" "vmx_return: " _ASM_PTR " 2b \n\t" ".popsection" - : : "c"(vmx), "d"((unsigned long)HOST_RSP), "S"(evmcs_rsp), + : "=c"((int){0}), "=d"((int){0}), "=S"((int){0}) + : "c"(vmx), "d"((unsigned long)HOST_RSP), "S"(evmcs_rsp), [launched]"i"(offsetof(struct vcpu_vmx, __launched)), [fail]"i"(offsetof(struct vcpu_vmx, fail)), [host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)),
From: George Spelvin lkml@sdf.org
commit fd0c42c4dea54335967c5a86f15fc064235a2797 upstream.
and change to pseudorandom numbers, as this is a traffic dithering operation that doesn't need crypto-grade.
The previous code operated in 4 steps:
1. Generate a random byte 0 <= rand_tq <= 255
2. Multiply it by BATADV_TQ_MAX_VALUE - tq
3. Divide by 255 (= BATADV_TQ_MAX_VALUE)
4. Return BATADV_TQ_MAX_VALUE - rand_tq
This would appear to scale (BATADV_TQ_MAX_VALUE - tq) by a random value between 0/255 and 255/255.
But! The intermediate value between steps 2 and 3 is stored in a u8 variable. So it's truncated, and most of the time, is less than 255, after which the division produces 0. Specifically, if tq is odd, the product is always even, and can never be 255. If tq is even, there's exactly one random byte value that will produce a product byte of 255.
Thus, the return value is 255 (511/512 of the time) or 254 (1/512 of the time).
If we assume that the truncation is a bug, and the code is meant to scale the input, a simpler way of looking at it is that it's returning a random value between tq and BATADV_TQ_MAX_VALUE, inclusive.
Well, we have an optimized function for doing just that.
Fixes: 3c12de9a5c75 ("batman-adv: network coding - code and transmit packets if possible") Signed-off-by: George Spelvin lkml@sdf.org Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/network-coding.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -1021,15 +1021,8 @@ static struct batadv_nc_path *batadv_nc_ */ static u8 batadv_nc_random_weight_tq(u8 tq) { - u8 rand_val, rand_tq; - - get_random_bytes(&rand_val, sizeof(rand_val)); - /* randomize the estimated packet loss (max TQ - estimated TQ) */ - rand_tq = rand_val * (BATADV_TQ_MAX_VALUE - tq); - - /* normalize the randomized packet loss */ - rand_tq /= BATADV_TQ_MAX_VALUE; + u8 rand_tq = prandom_u32_max(BATADV_TQ_MAX_VALUE + 1 - tq);
/* convert to (randomized) estimated tq again */ return BATADV_TQ_MAX_VALUE - rand_tq;
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit f872de8185acf1b48b954ba5bd8f9bc0a0d14016 upstream.
batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(), which gets a batadv_hard_iface object from net_dev with increased refcnt and its reference is assigned to a local pointer 'hard_iface'.
When batadv_show_throughput_override() returns, "hard_iface" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in the normal path of batadv_show_throughput_override(), which forgets to decrease the refcnt increased by batadv_hardif_get_by_netdev() before the function returns, causing a refcnt leak.
Fix this issue by calling batadv_hardif_put() before the batadv_show_throughput_override() returns in the normal path.
Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/sysfs.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/batman-adv/sysfs.c +++ b/net/batman-adv/sysfs.c @@ -1126,6 +1126,7 @@ static ssize_t batadv_show_throughput_ov
tp_override = atomic_read(&hard_iface->bat_v.throughput_override);
+ batadv_hardif_put(hard_iface); return sprintf(buff, "%u.%u MBit\n", tp_override / 10, tp_override % 10); }
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 6107c5da0fca8b50b4d3215e94d619d38cc4a18c upstream.
batadv_store_throughput_override() invokes batadv_hardif_get_by_netdev(), which gets a batadv_hard_iface object from net_dev with increased refcnt and its reference is assigned to a local pointer 'hard_iface'.
When batadv_store_throughput_override() returns, "hard_iface" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in one error path of batadv_store_throughput_override(). When batadv_parse_throughput() returns NULL, the refcnt increased by batadv_hardif_get_by_netdev() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_parse_throughput() returns NULL.
Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/batman-adv/sysfs.c +++ b/net/batman-adv/sysfs.c @@ -1093,7 +1093,7 @@ static ssize_t batadv_store_throughput_o ret = batadv_parse_throughput(net_dev, buff, "throughput_override", &tp_override); if (!ret) - return count; + goto out;
old_tp_override = atomic_read(&hard_iface->bat_v.throughput_override); if (old_tp_override == tp_override)
Hi!
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 6107c5da0fca8b50b4d3215e94d619d38cc4a18c upstream.
batadv_store_throughput_override() invokes batadv_hardif_get_by_netdev(), which gets a batadv_hard_iface object from net_dev with increased refcnt and its reference is assigned to a local pointer 'hard_iface'.
When batadv_store_throughput_override() returns, "hard_iface" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in one error path of batadv_store_throughput_override(). When batadv_parse_throughput() returns NULL, the refcnt increased by batadv_hardif_get_by_netdev() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_parse_throughput() returns NULL.
Ok, this fixes the issue, but it brings up a question:
--- a/net/batman-adv/sysfs.c +++ b/net/batman-adv/sysfs.c @@ -1093,7 +1093,7 @@ static ssize_t batadv_store_throughput_o ret = batadv_parse_throughput(net_dev, buff, "throughput_override", &tp_override); if (!ret)
- return count;
+ goto out;
If parsing of value from userspace failed we are currently returning success. That seems wrong. Should we return -EINVAL instead?
Best regards, Pavel
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 6f91a3f7af4186099dd10fa530dd7e0d9c29747d upstream.
batadv_v_ogm_process() invokes batadv_hardif_neigh_get(), which returns a reference of the neighbor object to "hardif_neigh" with increased refcount.
When batadv_v_ogm_process() returns, "hardif_neigh" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one of the exception handling paths of batadv_v_ogm_process(). When batadv_v_ogm_orig_get() fails to get the orig node and returns NULL, the refcnt increased by batadv_hardif_neigh_get() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_v_ogm_orig_get() fails to get the orig node.
Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/bat_v_ogm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/batman-adv/bat_v_ogm.c +++ b/net/batman-adv/bat_v_ogm.c @@ -735,7 +735,7 @@ static void batadv_v_ogm_process(const s
orig_node = batadv_v_ogm_orig_get(bat_priv, ogm_packet->orig); if (!orig_node) - return; + goto out;
neigh_node = batadv_neigh_node_get_or_create(orig_node, if_incoming, ethhdr->h_source);
From: Josh Poimboeuf jpoimboe@redhat.com
commit 06a9750edcffa808494d56da939085c35904e618 upstream.
The PUSH_AND_CLEAR_REGS macro zeroes each register immediately after pushing it. If an NMI or exception hits after a register is cleared, but before the UNWIND_HINT_REGS annotation, the ORC unwinder will wrongly think the previous value of the register was zero. This can confuse the unwinding process and cause it to exit early.
Because ORC is simpler than DWARF, there are a limited number of unwind annotation states, so it's not possible to add an individual unwind hint after each push/clear combination. Instead, the register clearing instructions need to be consolidated and moved to after the UNWIND_HINT_REGS annotation.
Fixes: 3f01daecd545 ("x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro") Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Dave Jones dsj@fb.com Cc: Jann Horn jannh@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: https://lore.kernel.org/r/68fd3d0bc92ae2d62ff7879d15d3684217d51f08.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/calling.h | 40 +++++++++++++++++++++------------------- 1 file changed, 21 insertions(+), 19 deletions(-)
--- a/arch/x86/entry/calling.h +++ b/arch/x86/entry/calling.h @@ -98,13 +98,6 @@ For 32-bit we have the following convent #define SIZEOF_PTREGS 21*8
.macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax save_ret=0 - /* - * Push registers and sanitize registers of values that a - * speculation attack might otherwise want to exploit. The - * lower registers are likely clobbered well before they - * could be put to use in a speculative execution gadget. - * Interleave XOR with PUSH for better uop scheduling: - */ .if \save_ret pushq %rsi /* pt_regs->si */ movq 8(%rsp), %rsi /* temporarily store the return address in %rsi */ @@ -114,34 +107,43 @@ For 32-bit we have the following convent pushq %rsi /* pt_regs->si */ .endif pushq \rdx /* pt_regs->dx */ - xorl %edx, %edx /* nospec dx */ pushq %rcx /* pt_regs->cx */ - xorl %ecx, %ecx /* nospec cx */ pushq \rax /* pt_regs->ax */ pushq %r8 /* pt_regs->r8 */ - xorl %r8d, %r8d /* nospec r8 */ pushq %r9 /* pt_regs->r9 */ - xorl %r9d, %r9d /* nospec r9 */ pushq %r10 /* pt_regs->r10 */ - xorl %r10d, %r10d /* nospec r10 */ pushq %r11 /* pt_regs->r11 */ - xorl %r11d, %r11d /* nospec r11*/ pushq %rbx /* pt_regs->rbx */ - xorl %ebx, %ebx /* nospec rbx*/ pushq %rbp /* pt_regs->rbp */ - xorl %ebp, %ebp /* nospec rbp*/ pushq %r12 /* pt_regs->r12 */ - xorl %r12d, %r12d /* nospec r12*/ pushq %r13 /* pt_regs->r13 */ - xorl %r13d, %r13d /* nospec r13*/ pushq %r14 /* pt_regs->r14 */ - xorl %r14d, %r14d /* nospec r14*/ pushq %r15 /* pt_regs->r15 */ - xorl %r15d, %r15d /* nospec r15*/ UNWIND_HINT_REGS + .if \save_ret pushq %rsi /* return address on top of stack */ .endif + + /* + * Sanitize registers of values that a speculation attack might + * otherwise want to exploit. The lower registers are likely clobbered + * well before they could be put to use in a speculative execution + * gadget. 
+ */ + xorl %edx, %edx /* nospec dx */ + xorl %ecx, %ecx /* nospec cx */ + xorl %r8d, %r8d /* nospec r8 */ + xorl %r9d, %r9d /* nospec r9 */ + xorl %r10d, %r10d /* nospec r10 */ + xorl %r11d, %r11d /* nospec r11 */ + xorl %ebx, %ebx /* nospec rbx */ + xorl %ebp, %ebp /* nospec rbp */ + xorl %r12d, %r12d /* nospec r12 */ + xorl %r13d, %r13d /* nospec r13 */ + xorl %r14d, %r14d /* nospec r14 */ + xorl %r15d, %r15d /* nospec r15 */ + .endm
.macro POP_REGS pop_rdi=1 skip_r11rcx=0
On Wed 2020-05-13 11:45:03, Greg Kroah-Hartman wrote:
From: Josh Poimboeuf jpoimboe@redhat.com
commit 06a9750edcffa808494d56da939085c35904e618 upstream.
The PUSH_AND_CLEAR_REGS macro zeroes each register immediately after pushing it. If an NMI or exception hits after a register is cleared, but before the UNWIND_HINT_REGS annotation, the ORC unwinder will wrongly think the previous value of the register was zero. This can confuse the unwinding process and cause it to exit early.
Because ORC is simpler than DWARF, there are a limited number of unwind annotation states, so it's not possible to add an individual unwind hint after each push/clear combination. Instead, the register clearing instructions need to be consolidated and moved to after the UNWIND_HINT_REGS annotation.
This actually makes kernel entry/exit slower, due to poor instruction scheduling. And that is a bit of a hot path... Is it strictly necessary? Not everyone needs the ORC unwinder. Should it be somehow optional?
Best regards, Pavel
- * Interleave XOR with PUSH for better uop scheduling:
- */ .if \save_ret pushq %rsi /* pt_regs->si */ movq 8(%rsp), %rsi /* temporarily store the return address in %rsi */
@@ -114,34 +107,43 @@ For 32-bit we have the following convent pushq %rsi /* pt_regs->si */ .endif pushq \rdx /* pt_regs->dx */
- xorl %edx, %edx /* nospec dx */ pushq %rcx /* pt_regs->cx */
- xorl %ecx, %ecx /* nospec cx */ pushq \rax /* pt_regs->ax */ pushq %r8 /* pt_regs->r8 */
- xorl %r8d, %r8d /* nospec r8 */ pushq %r9 /* pt_regs->r9 */
- xorl %r9d, %r9d /* nospec r9 */ pushq %r10 /* pt_regs->r10 */
- xorl %r10d, %r10d /* nospec r10 */ pushq %r11 /* pt_regs->r11 */
- xorl %r11d, %r11d /* nospec r11*/ pushq %rbx /* pt_regs->rbx */
- xorl %ebx, %ebx /* nospec rbx*/ pushq %rbp /* pt_regs->rbp */
- xorl %ebp, %ebp /* nospec rbp*/ pushq %r12 /* pt_regs->r12 */
- xorl %r12d, %r12d /* nospec r12*/ pushq %r13 /* pt_regs->r13 */
- xorl %r13d, %r13d /* nospec r13*/ pushq %r14 /* pt_regs->r14 */
- xorl %r14d, %r14d /* nospec r14*/ pushq %r15 /* pt_regs->r15 */
- xorl %r15d, %r15d /* nospec r15*/ UNWIND_HINT_REGS
+ .if \save_ret pushq %rsi /* return address on top of stack */ .endif
+ /*
+ * Sanitize registers of values that a speculation attack might
+ * otherwise want to exploit. The lower registers are likely clobbered
+ * well before they could be put to use in a speculative execution
+ * gadget.
+ */
+ xorl %edx, %edx /* nospec dx */
+ xorl %ecx, %ecx /* nospec cx */
+ xorl %r8d, %r8d /* nospec r8 */
+ xorl %r9d, %r9d /* nospec r9 */
+ xorl %r10d, %r10d /* nospec r10 */
+ xorl %r11d, %r11d /* nospec r11 */
+ xorl %ebx, %ebx /* nospec rbx */
+ xorl %ebp, %ebp /* nospec rbp */
+ xorl %r12d, %r12d /* nospec r12 */
+ xorl %r13d, %r13d /* nospec r13 */
+ xorl %r14d, %r14d /* nospec r14 */
+ xorl %r15d, %r15d /* nospec r15 */
.endm .macro POP_REGS pop_rdi=1 skip_r11rcx=0
On Wed, May 13, 2020 at 11:48:56PM +0200, Pavel Machek wrote:
On Wed 2020-05-13 11:45:03, Greg Kroah-Hartman wrote:
From: Josh Poimboeuf jpoimboe@redhat.com
commit 06a9750edcffa808494d56da939085c35904e618 upstream.
The PUSH_AND_CLEAR_REGS macro zeroes each register immediately after pushing it. If an NMI or exception hits after a register is cleared, but before the UNWIND_HINT_REGS annotation, the ORC unwinder will wrongly think the previous value of the register was zero. This can confuse the unwinding process and cause it to exit early.
Because ORC is simpler than DWARF, there are a limited number of unwind annotation states, so it's not possible to add an individual unwind hint after each push/clear combination. Instead, the register clearing instructions need to be consolidated and moved to after the UNWIND_HINT_REGS annotation.
This actually makes kernel entry/exit slower, due to poor instruction scheduling. And that is a bit of a hot path... Is it strictly necessary? Not everyone needs the ORC unwinder. Should it be somehow optional?
I didn't measure a difference beyond the noise level, did you?
From: Josh Poimboeuf jpoimboe@redhat.com
commit 1fb143634a38095b641a3a21220774799772dc4c upstream.
In swapgs_restore_regs_and_return_to_usermode, after the stack is switched to the trampoline stack, the existing UNWIND_HINT_REGS hint is no longer valid, which can result in the following ORC unwinder warning:
WARNING: can't dereference registers at 000000003aeb0cdd for ip swapgs_restore_regs_and_return_to_usermode+0x93/0xa0
For full correctness, we could try to add complicated unwind hints so the unwinder could continue to find the registers, but when it's this close to kernel exit, unwind hints aren't really needed anymore and it's fine to just use an empty hint which tells the unwinder to stop.
For consistency, also move the UNWIND_HINT_EMPTY in entry_SYSCALL_64_after_hwframe to a similar location.
Fixes: 3e3b9293d392 ("x86/entry/64: Return to userspace from the trampoline stack") Reported-by: Vince Weaver vincent.weaver@maine.edu Reported-by: Dave Jones dsj@fb.com Reported-by: Dr. David Alan Gilbert dgilbert@redhat.com Reported-by: Joe Mario jmario@redhat.com Reported-by: Jann Horn jannh@google.com Reported-by: Linus Torvalds torvalds@linux-foundation.org Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/60ea8f562987ed2d9ace2977502fe481c0d7c9a0.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_64.S | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -312,7 +312,6 @@ GLOBAL(entry_SYSCALL_64_after_hwframe) */ syscall_return_via_sysret: /* rcx and r11 are already restored (see code above) */ - UNWIND_HINT_EMPTY POP_REGS pop_rdi=0 skip_r11rcx=1
/* @@ -321,6 +320,7 @@ syscall_return_via_sysret: */ movq %rsp, %rdi movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp + UNWIND_HINT_EMPTY
pushq RSP-RDI(%rdi) /* RSP */ pushq (%rdi) /* RDI */ @@ -700,6 +700,7 @@ GLOBAL(swapgs_restore_regs_and_return_to */ movq %rsp, %rdi movq PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp + UNWIND_HINT_EMPTY
/* Copy the IRET frame to the trampoline stack. */ pushq 6*8(%rdi) /* SS */
From: Jann Horn jannh@google.com
commit f977df7b7ca45a4ac4b66d30a8931d0434c394b1 upstream.
The LEAQ instruction in rewind_stack_do_exit() moves the stack pointer directly below the pt_regs at the top of the task stack before calling do_exit(). Tell the unwinder to expect pt_regs.
Fixes: 8c1f75587a18 ("x86/entry/64: Add unwind hint annotations") Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Dave Jones dsj@fb.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: https://lore.kernel.org/r/68c33e17ae5963854916a46f522624f8e1d264f2.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/entry_64.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1745,7 +1745,7 @@ ENTRY(rewind_stack_do_exit)
movq PER_CPU_VAR(cpu_current_top_of_stack), %rax leaq -PTREGS_SIZE(%rax), %rsp - UNWIND_HINT_FUNC sp_offset=PTREGS_SIZE + UNWIND_HINT_REGS
call do_exit END(rewind_stack_do_exit)
From: Miroslav Benes mbenes@suse.cz
commit f1d9a2abff66aa8156fbc1493abed468db63ea48 upstream.
When unwinding an inactive task, the ORC unwinder skips the first frame by default. If both the 'regs' and 'first_frame' parameters of unwind_start() are NULL, 'state->sp' and 'first_frame' are later initialized to the same value for an inactive task. Given there is a "less than or equal to" comparison used at the end of __unwind_start() for skipping stack frames, the first frame is skipped.
Drop the equal part of the comparison and make the behavior equivalent to the frame pointer unwinder.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Dave Jones dsj@fb.com Cc: Jann Horn jannh@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: https://lore.kernel.org/r/7f08db872ab59e807016910acdbe82f744de7065.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/unwind_orc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -629,7 +629,7 @@ void __unwind_start(struct unwind_state /* Otherwise, skip ahead to the user-specified starting frame: */ while (!unwind_done(state) && (!on_stack(&state->stack_info, first_frame, sizeof(long)) || - state->sp <= (unsigned long)first_frame)) + state->sp < (unsigned long)first_frame)) unwind_next_frame(state);
return;
From: Josh Poimboeuf jpoimboe@redhat.com
commit 98d0c8ebf77e0ba7c54a9ae05ea588f0e9e3f46e upstream.
If the unwinder is called before the ORC data has been initialized, orc_find() returns NULL, and it tries to fall back to using frame pointers. This can cause some unexpected warnings during boot.
Move the 'orc_init' check from orc_find() to __unwind_start(), so that it doesn't even try to unwind from an uninitialized state.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Dave Jones dsj@fb.com Cc: Jann Horn jannh@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vince Weaver vincent.weaver@maine.edu Link: https://lore.kernel.org/r/069d1499ad606d85532eb32ce39b2441679667d5.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kernel/unwind_orc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -131,9 +131,6 @@ static struct orc_entry *orc_find(unsign { static struct orc_entry *orc;
- if (!orc_init) - return NULL; - if (ip == 0) return &null_orc_entry;
@@ -563,6 +560,9 @@ EXPORT_SYMBOL_GPL(unwind_next_frame); void __unwind_start(struct unwind_state *state, struct task_struct *task, struct pt_regs *regs, unsigned long *first_frame) { + if (!orc_init) + goto done; + memset(state, 0, sizeof(*state)); state->task = task;
Hi!
From: Josh Poimboeuf jpoimboe@redhat.com
commit 98d0c8ebf77e0ba7c54a9ae05ea588f0e9e3f46e upstream.
If the unwinder is called before the ORC data has been initialized, orc_find() returns NULL, and it tries to fall back to using frame pointers. This can cause some unexpected warnings during boot.
Move the 'orc_init' check from orc_find() to __unwind_start(), so that it doesn't even try to unwind from an uninitialized state.
@@ -563,6 +560,9 @@ EXPORT_SYMBOL_GPL(unwind_next_frame); void __unwind_start(struct unwind_state *state, struct task_struct *task, struct pt_regs *regs, unsigned long *first_frame) {
+ if (!orc_init)
+ goto done;
+ memset(state, 0, sizeof(*state)); state->task = task;
As this returns the *state to the caller, should the "goto done" move below the memset? Otherwise we are returning a partially initialized struct, which is ... weird.
Best regards, Pavel
On Wed, May 13, 2020 at 11:52:10PM +0200, Pavel Machek wrote:
As this returns the *state to the caller, should the "goto done" move below the memset? Otherwise we are returning a partially initialized struct, which is ... weird.
Yeah, it is a little weird. In most cases it should be fine, but there is an edge case where if there's a corrupt ORC table and this returns early, 'arch_stack_walk_reliable() -> unwind_error()' could check an uninitialized value.
Also the __unwind_start() error handling needs to set that error bit anyway, in its error cases. I'll fix it up.
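The concern raised here can be sketched in isolation. The following is a minimal model, not the kernel code: `struct toy_state`, its fields, and both `toy_start_*` functions are hypothetical stand-ins, and the `done:` label only approximates what the real exit path does. It shows why bailing out before the memset leaves stale caller data in the struct, while bailing out after it does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct unwind_state; only what the sketch needs. */
struct toy_state {
	int task_id;
	bool error;
	bool done;
};

/* Ordering as in the backported patch: the check comes before the memset. */
static void toy_start_check_first(struct toy_state *s, bool initialized,
				  int task_id)
{
	assert(s != NULL);

	if (!initialized)
		goto done;	/* s->error and s->task_id are left untouched */

	memset(s, 0, sizeof(*s));
	s->task_id = task_id;
	return;
done:
	s->done = true;
}

/* Ordering suggested in the review: clear the state first, then check. */
static void toy_start_memset_first(struct toy_state *s, bool initialized,
				   int task_id)
{
	assert(s != NULL);

	memset(s, 0, sizeof(*s));
	s->task_id = task_id;

	if (!initialized)
		goto done;	/* every field already has a defined value */
	return;
done:
	s->done = true;
}
```

With the first ordering, a caller that later inspects the error field after an early return reads whatever happened to be in its buffer; with the second, it reads false.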
I did this in the meantime. It moves the goto below the memset, and I believe that the 8 in get_reg() should have been sizeof(long) [not that it matters; x86-32 is protected by a BUILD_BUG_ON].
Signed-off-by: Pavel Machek pavel@ucw.cz
Best regards, Pavel
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 169b96492b7c..90cb3cb2b4f1 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -375,7 +375,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 static bool get_reg(struct unwind_state *state, unsigned int reg_off,
 		    unsigned long *val)
 {
-	unsigned int reg = reg_off/8;
+	unsigned int reg = reg_off/sizeof(long);
 
 	if (!state->regs)
 		return false;
@@ -589,12 +589,12 @@ EXPORT_SYMBOL_GPL(unwind_next_frame);
 void __unwind_start(struct unwind_state *state, struct task_struct *task,
 		    struct pt_regs *regs, unsigned long *first_frame)
 {
-	if (!orc_init)
-		goto done;
-
 	memset(state, 0, sizeof(*state));
 	state->task = task;
 
+	if (!orc_init)
+		goto done;
+
 	/*
 	 * Refuse to unwind the stack of a task while it's executing on another
 	 * CPU.  This check is racy, but that's ok: the unwinder has other
On Thu, May 14, 2020 at 10:13:40PM +0200, Pavel Machek wrote:
I did this in the meantime. It moves the goto below the memset, and I believe that the 8 in get_reg() should have been sizeof(long) [not that it matters; x86-32 is protected by a BUILD_BUG_ON].
Signed-off-by: Pavel Machek pavel@ucw.cz
I already have the same memset patch (along with other error-handling fixes) which I'll be posting shortly once it runs through my testing.
Since the sizeof(long) thing isn't really a bug, I'll make that change later, along with some other pending improvements I have.
From: Josh Poimboeuf jpoimboe@redhat.com
commit a0f81bf26888048100bf017fadf438a5bdffa8d8 upstream.
If the ORC entry type is unknown, nothing else can be done other than reporting an error. Exit the function instead of breaking out of the switch statement.
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reviewed-by: Miroslav Benes mbenes@suse.cz
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Andy Lutomirski luto@kernel.org
Cc: Dave Jones dsj@fb.com
Cc: Jann Horn jannh@google.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Vince Weaver vincent.weaver@maine.edu
Link: https://lore.kernel.org/r/a7fa668ca6eabbe81ab18b2424f15adbbfdc810a.158780874...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 arch/x86/kernel/unwind_orc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -509,7 +509,7 @@ bool unwind_next_frame(struct unwind_sta
 	default:
 		orc_warn("unknown .orc_unwind entry type %d for ip %pB\n",
 			 orc->type, (void *)orig_ip);
-		break;
+		goto err;
 	}
 
 	/* Find BP: */
From: Josh Poimboeuf jpoimboe@redhat.com
commit 81b67439d147677d844d492fcbd03712ea438f42 upstream.
The following execution path is possible:
  fsnotify()
    [ realign the stack and store previous SP in R10 ]

    <IRQ>
      [ only IRET regs saved ]
      common_interrupt()
        interrupt_entry()

          <NMI>
            [ full pt_regs saved ]
            ...
            [ unwind stack ]
When the unwinder goes through the NMI and the IRQ on the stack, and then sees fsnotify(), it doesn't have access to the value of R10, because it only has the five IRET registers. So the unwind stops prematurely.
However, because the interrupt_entry() code is careful not to clobber R10 before saving the full regs, the unwinder should be able to read R10 from the previously saved full pt_regs associated with the NMI.
Handle this case properly. When encountering an IRET regs frame immediately after a full pt_regs frame, use the pt_regs as a backup which can be used to get the C register values.
Also, note that a call frame resets the 'prev_regs' value, because a function is free to clobber the registers. For this fix to work, the IRET and full regs frames must be adjacent, with no FUNC frames in between. So replace the FUNC hint in interrupt_entry() with an IRET_REGS hint.
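The bookkeeping described above can be modeled as a small state machine. This is only a sketch under simplifying assumptions: `toy_regs`, `toy_state`, `toy_enter_frame()` and `toy_get_r10()` are hypothetical names, a single register stands in for the whole pt_regs, and stack handling is left out entirely. It shows that the backup is usable only when an IRET-regs frame directly follows a full-regs frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_frame_type { TOY_CALL, TOY_FULL_REGS, TOY_IRET_REGS };

/* Hypothetical, cut-down register block. */
struct toy_regs {
	unsigned long r10;
	unsigned long ip;
};

struct toy_state {
	const struct toy_regs *regs, *prev_regs;
	bool full_regs;
};

/* Update the state when the unwinder steps into a frame of the given type. */
static void toy_enter_frame(struct toy_state *s, enum toy_frame_type type,
			    const struct toy_regs *regs)
{
	assert(s != NULL);

	switch (type) {
	case TOY_CALL:
		/* A function may clobber registers: drop both pointers. */
		s->regs = NULL;
		s->prev_regs = NULL;
		break;
	case TOY_FULL_REGS:
		s->regs = regs;
		s->prev_regs = NULL;
		s->full_regs = true;
		break;
	case TOY_IRET_REGS:
		/* Keep the full regs of the adjacent frame as a backup. */
		if (s->full_regs)
			s->prev_regs = s->regs;
		s->regs = regs;
		s->full_regs = false;
		break;
	}
}

/* Read R10 from the current full regs, else from the backup, else fail. */
static bool toy_get_r10(const struct toy_state *s, unsigned long *val)
{
	if (!s->regs)
		return false;
	if (s->full_regs) {
		*val = s->regs->r10;
		return true;
	}
	if (s->prev_regs) {
		*val = s->prev_regs->r10;
		return true;
	}
	return false;
}
```

In this model, the sequence full regs then IRET regs yields the R10 value saved by the first frame, while inserting a call frame between the two makes the lookup fail, which matches the reason given for replacing the FUNC hint.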
Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder")
Reviewed-by: Miroslav Benes mbenes@suse.cz
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Andy Lutomirski luto@kernel.org
Cc: Dave Jones dsj@fb.com
Cc: Jann Horn jannh@google.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Cc: Vince Weaver vincent.weaver@maine.edu
Link: https://lore.kernel.org/r/97a408167cc09f1cfa0de31a7b70dd88868d743f.158780874...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 arch/x86/entry/entry_64.S     |    4 +--
 arch/x86/include/asm/unwind.h |    2 -
 arch/x86/kernel/unwind_orc.c  |   51 ++++++++++++++++++++++++++++++++----------
 3 files changed, 43 insertions(+), 14 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -575,7 +575,7 @@ END(spurious_entries_start)
  * +----------------------------------------------------+
  */
 ENTRY(interrupt_entry)
-	UNWIND_HINT_FUNC
+	UNWIND_HINT_IRET_REGS offset=16
 	ASM_CLAC
 	cld
 
@@ -607,9 +607,9 @@ ENTRY(interrupt_entry)
 	pushq	5*8(%rdi)		/* regs->eflags */
 	pushq	4*8(%rdi)		/* regs->cs */
 	pushq	3*8(%rdi)		/* regs->ip */
+	UNWIND_HINT_IRET_REGS
 	pushq	2*8(%rdi)		/* regs->orig_ax */
 	pushq	8(%rdi)			/* return address */
-	UNWIND_HINT_FUNC
 
 	movq	(%rdi), %rdi
 	jmp	2f
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -19,7 +19,7 @@ struct unwind_state {
 #if defined(CONFIG_UNWINDER_ORC)
 	bool signal, full_regs;
 	unsigned long sp, bp, ip;
-	struct pt_regs *regs;
+	struct pt_regs *regs, *prev_regs;
 #elif defined(CONFIG_UNWINDER_FRAME_POINTER)
 	bool got_irq;
 	unsigned long *bp, *orig_sp, ip;
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -364,9 +364,38 @@ static bool deref_stack_iret_regs(struct
 	return true;
 }
 
+/*
+ * If state->regs is non-NULL, and points to a full pt_regs, just get the reg
+ * value from state->regs.
+ *
+ * Otherwise, if state->regs just points to IRET regs, and the previous frame
+ * had full regs, it's safe to get the value from the previous regs.  This can
+ * happen when early/late IRQ entry code gets interrupted by an NMI.
+ */
+static bool get_reg(struct unwind_state *state, unsigned int reg_off,
+		    unsigned long *val)
+{
+	unsigned int reg = reg_off/8;
+
+	if (!state->regs)
+		return false;
+
+	if (state->full_regs) {
+		*val = ((unsigned long *)state->regs)[reg];
+		return true;
+	}
+
+	if (state->prev_regs) {
+		*val = ((unsigned long *)state->prev_regs)[reg];
+		return true;
+	}
+
+	return false;
+}
+
 bool unwind_next_frame(struct unwind_state *state)
 {
-	unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp;
+	unsigned long ip_p, sp, tmp, orig_ip = state->ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
@@ -420,39 +449,35 @@ bool unwind_next_frame(struct unwind_sta
 		break;
 
 	case ORC_REG_R10:
-		if (!state->regs || !state->full_regs) {
+		if (!get_reg(state, offsetof(struct pt_regs, r10), &sp)) {
 			orc_warn("missing regs for base reg R10 at ip %pB\n",
 				 (void *)state->ip);
 			goto err;
 		}
-		sp = state->regs->r10;
 		break;
 
 	case ORC_REG_R13:
-		if (!state->regs || !state->full_regs) {
+		if (!get_reg(state, offsetof(struct pt_regs, r13), &sp)) {
 			orc_warn("missing regs for base reg R13 at ip %pB\n",
 				 (void *)state->ip);
 			goto err;
 		}
-		sp = state->regs->r13;
 		break;
 
 	case ORC_REG_DI:
-		if (!state->regs || !state->full_regs) {
+		if (!get_reg(state, offsetof(struct pt_regs, di), &sp)) {
 			orc_warn("missing regs for base reg DI at ip %pB\n",
 				 (void *)state->ip);
 			goto err;
 		}
-		sp = state->regs->di;
 		break;
 
 	case ORC_REG_DX:
-		if (!state->regs || !state->full_regs) {
+		if (!get_reg(state, offsetof(struct pt_regs, dx), &sp)) {
 			orc_warn("missing regs for base reg DX at ip %pB\n",
 				 (void *)state->ip);
 			goto err;
 		}
-		sp = state->regs->dx;
 		break;
 
 	default:
@@ -479,6 +504,7 @@ bool unwind_next_frame(struct unwind_sta
 
 		state->sp = sp;
 		state->regs = NULL;
+		state->prev_regs = NULL;
 		state->signal = false;
 		break;
 
@@ -490,6 +516,7 @@ bool unwind_next_frame(struct unwind_sta
 		}
 
 		state->regs = (struct pt_regs *)sp;
+		state->prev_regs = NULL;
 		state->full_regs = true;
 		state->signal = true;
 		break;
@@ -501,6 +528,8 @@ bool unwind_next_frame(struct unwind_sta
 			goto err;
 		}
 
+		if (state->full_regs)
+			state->prev_regs = state->regs;
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;
 		state->full_regs = false;
 		state->signal = true;
@@ -515,8 +544,8 @@ bool unwind_next_frame(struct unwind_sta
 	/* Find BP: */
 	switch (orc->bp_reg) {
 	case ORC_REG_UNDEFINED:
-		if (state->regs && state->full_regs)
-			state->bp = state->regs->bp;
+		if (get_reg(state, offsetof(struct pt_regs, bp), &tmp))
+			state->bp = tmp;
 		break;
 
 	case ORC_REG_PREV_SP:
From: Guillaume Nault gnault@redhat.com
commit ea64d8d6c675c0bb712689b13810301de9d8f77a upstream.
If the UDP header of a local VXLAN endpoint is NAT-ed, and the VXLAN device has disabled UDP checksums and enabled Tx checksum offloading, then the skb passed to udp_manip_pkt() has hdr->check == 0 (outer checksum disabled) and skb->ip_summed == CHECKSUM_PARTIAL (inner packet checksum offloaded).
Because of the ->ip_summed value, udp_manip_pkt() tries to update the outer checksum with the new address and port, leading to an invalid checksum sent on the wire, as the original null checksum obviously didn't take the old address and port into account.
So, we can't take ->ip_summed into account in udp_manip_pkt(), as it might not refer to the checksum we're acting on. Instead, we can base the decision to update the UDP checksum entirely on the value of hdr->check, because it's null if and only if checksum is disabled:
* A fully computed checksum can't be 0, since a 0 checksum is represented by the CSUM_MANGLED_0 value instead.
* A partial checksum can't be 0, since the pseudo-header always adds at least one non-zero value (the UDP protocol type 0x11) and adding more values to the sum can't make it wrap to 0 as the carry is then added to the wrapped number.
* A disabled checksum uses the special value 0.
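The arithmetic behind the three cases above can be checked with a small user-space sketch. The helper names (`csum_add16`, `udp_csum_on_wire`, `udp_csum_enabled`) are made up for this illustration and are not kernel functions; the 0xffff constant is assumed to mirror what the commit message calls CSUM_MANGLED_0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * 16-bit one's-complement addition: the carry out of bit 15 is added back
 * into the low bits (end-around carry).
 */
static uint16_t csum_add16(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + b;

	return (uint16_t)((sum & 0xffff) + (sum >> 16));
}

/*
 * A computed checksum of 0 is transmitted as 0xffff, so that 0 on the wire
 * can only mean "no checksum".
 */
static uint16_t udp_csum_on_wire(uint16_t computed)
{
	return computed ? computed : 0xffff;
}

/* The decision the patch switches to: look at the header field only. */
static bool udp_csum_enabled(uint16_t check)
{
	assert(udp_csum_on_wire(check) != 0);
	return check != 0;
}
```

If either operand of `csum_add16()` is non-zero, the result is non-zero: without a carry the sum is simply the non-zero total, and with a carry the wrapped value gets 1 added back. That is the property the second bullet relies on.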
The problem seems to be there from day one, although it was probably not visible before UDP tunnels were implemented.
Fixes: 5b1158e909ec ("[NETFILTER]: Add NAT support for nf_conntrack")
Signed-off-by: Guillaume Nault gnault@redhat.com
Reviewed-by: Florian Westphal fw@strlen.de
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 net/netfilter/nf_nat_proto_udp.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/net/netfilter/nf_nat_proto_udp.c
+++ b/net/netfilter/nf_nat_proto_udp.c
@@ -66,15 +66,14 @@ static bool udp_manip_pkt(struct sk_buff
 			  enum nf_nat_manip_type maniptype)
 {
 	struct udphdr *hdr;
-	bool do_csum;
 
 	if (!skb_make_writable(skb, hdroff + sizeof(*hdr)))
 		return false;
 
 	hdr = (struct udphdr *)(skb->data + hdroff);
-	do_csum = hdr->check || skb->ip_summed == CHECKSUM_PARTIAL;
+	__udp_manip_pkt(skb, l3proto, iphdroff, hdr, tuple, maniptype,
+			!!hdr->check);
 
-	__udp_manip_pkt(skb, l3proto, iphdroff, hdr, tuple, maniptype, do_csum);
 	return true;
 }
 
From: Arnd Bergmann arnd@arndb.de
commit c165d57b552aaca607fa5daf3fb524a6efe3c5a3 upstream.
gcc-10 points out that a code path exists where a pointer to a stack variable may be passed back to the caller:
  net/netfilter/nfnetlink_osf.c: In function 'nf_osf_hdr_ctx_init':
  cc1: warning: function may return address of local variable [-Wreturn-local-addr]
  net/netfilter/nfnetlink_osf.c:171:16: note: declared here
    171 |  struct tcphdr _tcph;
        |                ^~~~~
I am not sure whether this can happen in practice, but moving the variable declaration into the callers avoids the problem.
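The pattern behind the warning, and the shape of the fix, can be sketched outside the kernel. Everything here is a simplified assumption: `toy_hdr`, `toy_pkt`, `toy_header_pointer()` and `toy_ctx_init()` are hypothetical, and the helper only loosely imitates a lookup that returns either a pointer into the packet or a pointer to a caller-supplied copy buffer.

```c
#include <assert.h>
#include <stddef.h>

struct toy_hdr {
	unsigned short sport, dport;
};

/* Hypothetical packet: the header is either directly addressable or not. */
struct toy_pkt {
	const struct toy_hdr *linear;	/* non-NULL if directly addressable */
	struct toy_hdr scattered;	/* otherwise it has to be copied */
};

/*
 * Return a pointer to the header. When a copy is needed, it is placed in
 * 'buf' and 'buf' itself is returned.
 */
static const struct toy_hdr *toy_header_pointer(const struct toy_pkt *pkt,
						struct toy_hdr *buf)
{
	assert(pkt != NULL && buf != NULL);

	if (pkt->linear)
		return pkt->linear;
	*buf = pkt->scattered;
	return buf;
}

/*
 * Shape of the fixed helper: the copy buffer is owned by the caller, so the
 * returned pointer stays valid after this function returns. Declaring the
 * buffer as a local here would hand back the address of a dead stack slot
 * in the copy case.
 */
static const struct toy_hdr *toy_ctx_init(const struct toy_pkt *pkt,
					  struct toy_hdr *buf)
{
	return toy_header_pointer(pkt, buf);
}
```

A caller declares its own `struct toy_hdr buf;` and passes `&buf`, just as the patch adds a `_tcph` local to each caller of nf_osf_hdr_ctx_init().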
Fixes: 31a9c29210e2 ("netfilter: nf_osf: add struct nf_osf_hdr_ctx")
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Florian Westphal fw@strlen.de
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 net/netfilter/nfnetlink_osf.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

--- a/net/netfilter/nfnetlink_osf.c
+++ b/net/netfilter/nfnetlink_osf.c
@@ -170,12 +170,12 @@ static bool nf_osf_match_one(const struc
 static const struct tcphdr *nf_osf_hdr_ctx_init(struct nf_osf_hdr_ctx *ctx,
 						const struct sk_buff *skb,
 						const struct iphdr *ip,
-						unsigned char *opts)
+						unsigned char *opts,
+						struct tcphdr *_tcph)
 {
 	const struct tcphdr *tcp;
-	struct tcphdr _tcph;
 
-	tcp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(struct tcphdr), &_tcph);
+	tcp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(struct tcphdr), _tcph);
 	if (!tcp)
 		return NULL;
 
@@ -210,10 +210,11 @@ nf_osf_match(const struct sk_buff *skb,
 	int fmatch = FMATCH_WRONG;
 	struct nf_osf_hdr_ctx ctx;
 	const struct tcphdr *tcp;
+	struct tcphdr _tcph;
 
 	memset(&ctx, 0, sizeof(ctx));
 
-	tcp = nf_osf_hdr_ctx_init(&ctx, skb, ip, opts);
+	tcp = nf_osf_hdr_ctx_init(&ctx, skb, ip, opts, &_tcph);
 	if (!tcp)
 		return false;
 
@@ -270,10 +271,11 @@ const char *nf_osf_find(const struct sk_
 	struct nf_osf_hdr_ctx ctx;
 	const struct tcphdr *tcp;
 	const char *genre = NULL;
+	struct tcphdr _tcph;
 
 	memset(&ctx, 0, sizeof(ctx));
 
-	tcp = nf_osf_hdr_ctx_init(&ctx, skb, ip, opts);
+	tcp = nf_osf_hdr_ctx_init(&ctx, skb, ip, opts, &_tcph);
 	if (!tcp)
 		return NULL;
 
From: Josh Poimboeuf jpoimboe@redhat.com
commit d8dd25a461e4eec7190cb9d66616aceacc5110ad upstream.
When the current frame address (CFA) is stored on the stack (i.e., cfa->base == CFI_SP_INDIRECT), objtool neglects to adjust the stack offset when there are subsequent pushes or pops. This results in bad ORC data at the end of the ENTER_IRQ_STACK macro, when it puts the previous stack pointer on the stack and does a subsequent push.
This fixes the following unwinder warning:
WARNING: can't dereference registers at 00000000f0a6bdba for ip interrupt_entry+0x9f/0xa0
Fixes: 627fce14809b ("objtool: Add ORC unwind table generation")
Reported-by: Vince Weaver vincent.weaver@maine.edu
Reported-by: Dave Jones dsj@fb.com
Reported-by: Steven Rostedt rostedt@goodmis.org
Reported-by: Vegard Nossum vegard.nossum@oracle.com
Reported-by: Joe Mario jmario@redhat.com
Reviewed-by: Miroslav Benes mbenes@suse.cz
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com
Signed-off-by: Ingo Molnar mingo@kernel.org
Cc: Andy Lutomirski luto@kernel.org
Cc: Jann Horn jannh@google.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Thomas Gleixner tglx@linutronix.de
Link: https://lore.kernel.org/r/853d5d691b29e250333332f09b8e27410b2d9924.158780874...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 tools/objtool/check.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1315,7 +1315,7 @@ static int update_insn_state_regs(struct
 	struct cfi_reg *cfa = &state->cfa;
 	struct stack_op *op = &insn->stack_op;
 
-	if (cfa->base != CFI_SP)
+	if (cfa->base != CFI_SP && cfa->base != CFI_SP_INDIRECT)
 		return 0;
 
 	/* push */
From: Ivan Delalande colona@arista.com
commit e08df079b23e2e982df15aa340bfbaf50f297504 upstream.
If the trapping instruction contains a ':', for a memory access through segment registers for example, the sed substitution will insert the '*' marker in the middle of the instruction instead of the line address:
2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction
I started to think I had forgotten some quirk of the assembly syntax before noticing that it was actually coming from the script. Fix it to add the address marker at the right place for these instructions:
    28:   49 8b 06                mov    (%r14),%rax
    2b:*  65 48 0f c7 0f          cmpxchg16b %gs:(%rdi)   <-- trapping instruction
    30:   0f 94 c0                sete   %al
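The difference between the two patterns comes down to which colon the first group ends on: a greedy ".*:" extends to the last colon in the line, while "[^:]*:" stops at the first one. The sketch below reproduces that with plain string functions instead of sed; the function names are made up for this illustration, and the sample line is modeled on the output quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy 'line' to 'out' with a '*' inserted right after the given colon. */
static int mark_after(const char *line, const char *colon,
		      char *out, size_t outsz)
{
	int n;

	assert(line != NULL && out != NULL);
	if (!colon)
		return -1;

	n = snprintf(out, outsz, "%.*s*%s",
		     (int)(colon - line + 1), line, colon + 1);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}

/* Behaves like the old greedy pattern: the marker follows the LAST colon. */
static int mark_greedy(const char *line, char *out, size_t outsz)
{
	return mark_after(line, strrchr(line, ':'), out, outsz);
}

/* Behaves like the fixed pattern: the marker follows the FIRST colon. */
static int mark_first_colon(const char *line, char *out, size_t outsz)
{
	return mark_after(line, strchr(line, ':'), out, outsz);
}
```

For a line whose instruction contains a segment override such as %gs:, `mark_greedy()` places the marker inside the operand, while `mark_first_colon()` places it after the address. For lines with a single colon the two give the same result, which is presumably why the problem went unnoticed for a while.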
Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust")
Signed-off-by: Ivan Delalande colona@arista.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Reviewed-by: Borislav Petkov bp@suse.de
Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 scripts/decodecode |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/decodecode
+++ b/scripts/decodecode
@@ -119,7 +119,7 @@ faultlinenum=$(( $(wc -l $T.oo | cut -d
 faultline=`cat $T.dis | head -1 | cut -d":" -f2-`
 faultline=`echo "$faultline" | sed -e 's/\[/\\\[/g; s/\]/\\\]/g'`
 
-cat $T.oo | sed -e "${faultlinenum}s/^\(.*:\)\(.*\)/\1\*\2\t\t<-- trapping instruction/"
+cat $T.oo | sed -e "${faultlinenum}s/^\([^:]*:\)\(.*\)/\1\*\2\t\t<-- trapping instruction/"
 echo
 cat $T.aa
 cleanup
From: Oleg Nesterov oleg@redhat.com
[ Upstream commit b5f2006144c6ae941726037120fa1001ddede784 ]
Commit cc731525f26a ("signal: Remove kernel interal si_code magic") changed the value of SI_FROMUSER(SI_MESGQ), this means that mq_notify() no longer works if the sender doesn't have rights to send a signal.
Change __do_notify() to use do_send_sig_info() instead of kill_pid_info() to avoid check_kill_permission().
This needs the additional notify.sigev_signo != 0 check, shouldn't we change do_mq_notify() to deny sigev_signo == 0 ?
Test-case:
	#include <signal.h>
	#include <mqueue.h>
	#include <unistd.h>
	#include <sys/wait.h>
	#include <assert.h>

	static int notified;

	static void sigh(int sig)
	{
		notified = 1;
	}

	int main(void)
	{
		signal(SIGIO, sigh);

		int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL);
		assert(fd >= 0);

		struct sigevent se = {
			.sigev_notify	= SIGEV_SIGNAL,
			.sigev_signo	= SIGIO,
		};
		assert(mq_notify(fd, &se) == 0);

		if (!fork()) {
			assert(setuid(1) == 0);
			mq_send(fd, "",1,0);
			return 0;
		}

		wait(NULL);
		mq_unlink("/mq");
		assert(notified);
		return 0;
	}
[manfred@colorfullife.com:
	1) Add self_exec_id evaluation so that the implementation matches
	   do_notify_parent
	2) use PIDTYPE_TGID everywhere]
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Reported-by: Yoji yoji.fujihar.min@gmail.com
Signed-off-by: Oleg Nesterov oleg@redhat.com
Signed-off-by: Manfred Spraul manfred@colorfullife.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Acked-by: "Eric W. Biederman" ebiederm@xmission.com
Cc: Davidlohr Bueso dave@stgolabs.net
Cc: Markus Elfring elfring@users.sourceforge.net
Cc: 1vier1@web.de
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/e2a782e4-eab9-4f5c-c749-c07a8f7a4e66@colorfullife.c...
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 ipc/mqueue.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index de4070d5472f2..46d0265423f5b 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -76,6 +76,7 @@ struct mqueue_inode_info {
 
 	struct sigevent notify;
 	struct pid *notify_owner;
+	u32 notify_self_exec_id;
 	struct user_namespace *notify_user_ns;
 	struct user_struct *user;	/* user who created, for accounting */
 	struct sock *notify_sock;
@@ -662,28 +663,44 @@ static void __do_notify(struct mqueue_inode_info *info)
 	 * synchronously.
 	 */
 	if (info->notify_owner && info->attr.mq_curmsgs == 1) {
-		struct siginfo sig_i;
 		switch (info->notify.sigev_notify) {
 		case SIGEV_NONE:
 			break;
-		case SIGEV_SIGNAL:
-			/* sends signal */
+		case SIGEV_SIGNAL: {
+			struct siginfo sig_i;
+			struct task_struct *task;
+
+			/* do_mq_notify() accepts sigev_signo == 0, why?? */
+			if (!info->notify.sigev_signo)
+				break;
 
 			clear_siginfo(&sig_i);
 			sig_i.si_signo = info->notify.sigev_signo;
 			sig_i.si_errno = 0;
 			sig_i.si_code = SI_MESGQ;
 			sig_i.si_value = info->notify.sigev_value;
-			/* map current pid/uid into info->owner's namespaces */
 			rcu_read_lock();
+			/* map current pid/uid into info->owner's namespaces */
 			sig_i.si_pid = task_tgid_nr_ns(current,
 						ns_of_pid(info->notify_owner));
-			sig_i.si_uid = from_kuid_munged(info->notify_user_ns, current_uid());
+			sig_i.si_uid = from_kuid_munged(info->notify_user_ns,
+						current_uid());
+			/*
+			 * We can't use kill_pid_info(), this signal should
+			 * bypass check_kill_permission(). It is from kernel
+			 * but si_fromuser() can't know this.
+			 * We do check the self_exec_id, to avoid sending
+			 * signals to programs that don't expect them.
+			 */
+			task = pid_task(info->notify_owner, PIDTYPE_TGID);
+			if (task && task->self_exec_id ==
+						info->notify_self_exec_id) {
+				do_send_sig_info(info->notify.sigev_signo,
+						&sig_i, task, PIDTYPE_TGID);
+			}
 			rcu_read_unlock();
-
-			kill_pid_info(info->notify.sigev_signo,
-				      &sig_i, info->notify_owner);
 			break;
+		}
 		case SIGEV_THREAD:
 			set_cookie(info->notify_cookie, NOTIFY_WOKENUP);
 			netlink_sendskb(info->notify_sock, info->notify_cookie);
@@ -1273,6 +1290,7 @@ retry:
 			info->notify.sigev_signo = notification->sigev_signo;
 			info->notify.sigev_value = notification->sigev_value;
 			info->notify.sigev_notify = SIGEV_SIGNAL;
+			info->notify_self_exec_id = current->self_exec_id;
 			break;
 		}
 
On 13/05/2020 10:44, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
All tests are passing for Tegra ...
Test results for stable-v4.19:
    11 builds:	11 pass, 0 fail
    22 boots:	22 pass, 0 fail
    32 tests:	32 pass, 0 fail

Linux version:	4.19.123-rc1-g6d5c161fb73d
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
		tegra194-p2972-0000, tegra20-ventana,
		tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Wed, May 13, 2020 at 11:44:26AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Build results:
	total: 155 pass: 155 fail: 0
Qemu test results:
	total: 421 pass: 421 fail: 0
Guenter
On Wed, 13 May 2020 at 15:17, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
NOTE: While running LTP sched on the stable-rc 4.19 branch kernel on an arm64 hikey device, a thermal alarm triggered, followed by kernel warnings and an Internal error: https://lore.kernel.org/stable/CA+G9fYvo2yUVicoZ7fOYf8=QxTtS8nW-Z2JGD4iLtU61...

Summary
------------------------------------------------------------------------

kernel: 4.19.123-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 6d5c161fb73d8e3d1a5a0efcf2d089b939a1e165
git describe: v4.19.122-49-g6d5c161fb73d
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.122-4...
No regressions (compared to build v4.19.122)
No fixes (compared to build v4.19.122)
Ran 32876 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- juno-r2-compat
- juno-r2-kasan
- nxp-ls2088
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64
- x86-kasan

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* install-android-platform-tools-r2800
* kselftest
* kselftest/drivers
* kselftest/filesystems
* kselftest/net
* kselftest/networking
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-hugetlb-tests
* ltp-ipc-tests
* ltp-mm-tests
* ltp-sched-tests
* perf
* v4l2-compliance
* kvm-unit-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-io-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* network-basic-tests
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-native/drivers
* kselftest-vsyscall-mode-native/filesystems
* kselftest-vsyscall-mode-native/net
* kselftest-vsyscall-mode-native/networking
* kselftest-vsyscall-mode-none
* kselftest-vsyscall-mode-none/drivers
* kselftest-vsyscall-mode-none/filesystems
* kselftest-vsyscall-mode-none/net
* kselftest-vsyscall-mode-none/networking
Hello Greg,
From: stable-owner@vger.kernel.org stable-owner@vger.kernel.org On Behalf Of Greg Kroah-Hartman Sent: 13 May 2020 10:44
This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No build/boot issues seen for CIP configs for Linux 4.19.123-rc1 (6d5c161fb73d).
Build/test pipeline/logs: https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/1456...
GitLab CI pipeline: https://gitlab.com/cip-project/cip-testing/linux-cip-pipelines/-/blob/master...
Relevant LAVA jobs: https://lava.ciplatform.org/scheduler/alljobs?length=25&search=6d5c16#ta...
Kind regards, Chris
Responses should be made by Fri, 15 May 2020 09:41:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch- 4.19.123-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.123-rc1
Oleg Nesterov oleg@redhat.com ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()
Ivan Delalande colona@arista.com scripts/decodecode: fix trapping instruction formatting
Josh Poimboeuf jpoimboe@redhat.com objtool: Fix stack offset tracking for indirect CFAs
Arnd Bergmann arnd@arndb.de netfilter: nf_osf: avoid passing pointer to local var
Guillaume Nault gnault@redhat.com netfilter: nat: never update the UDP checksum when it's 0
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Fix error path for bad ORC entry type
Josh Poimboeuf jpoimboe@redhat.com x86/unwind/orc: Prevent unwinding before ORC initialization
Miroslav Benes mbenes@suse.cz x86/unwind/orc: Don't skip the first frame for inactive tasks
Jann Horn jannh@google.com x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
Josh Poimboeuf jpoimboe@redhat.com x86/entry/64: Fix unwind hints in kernel exit path
Josh Poimboeuf jpoimboe@redhat.com x86/entry/64: Fix unwind hints in register clearing code
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_v_ogm_process
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_store_throughput_override
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_show_throughput_override
George Spelvin lkml@sdf.org batman-adv: fix batadv_nc_random_weight_tq
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Mark RCX, RDX and RSI as clobbered in vmx_vcpu_run()'s asm blob
Sean Christopherson sean.j.christopherson@intel.com KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
Luis Chamberlain mcgrof@kernel.org coredump: fix crash when umh is disabled
Oscar Carter oscar.carter@gmx.com staging: gasket: Check the return value of gasket_get_bar_index()
David Hildenbrand david@redhat.com mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
Mark Rutland mark.rutland@arm.com arm64: hugetlb: avoid potential NULL dereference
Marc Zyngier maz@kernel.org KVM: arm64: Fix 32bit PC wrap-around
Marc Zyngier maz@kernel.org KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Add a vmalloc_sync_mappings() for safe measure
Oliver Neukum oneukum@suse.com USB: serial: garmin_gps: add sanity checking for data length
Oliver Neukum oneukum@suse.com USB: uas: add quirk for LaCie 2Big Quadra
Alan Stern stern@rowland.harvard.edu HID: usbhid: Fix race between usbhid_close() and usbhid_stop()
Jere Leppänen jere.leppanen@nokia.com sctp: Fix bundling of SHUTDOWN with COOKIE-ACK
Jason Gerecke jason.gerecke@wacom.com HID: wacom: Read HID_DG_CONTACTMAX directly for non-generic devices
Willem de Bruijn willemb@google.com net: stricter validation of untrusted gso packets
Michael Chan michael.chan@broadcom.com bnxt_en: Fix VF anti-spoof filter setup.
Michael Chan michael.chan@broadcom.com bnxt_en: Improve AER slot reset.
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix command entry leak in Internal Error State
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix forced completion access non initialized command entry
Michael Chan michael.chan@broadcom.com bnxt_en: Fix VLAN acceleration handling in bnxt_fix_features().
Tuong Lien tuong.t.lien@dektech.com.au tipc: fix partial topology connection closure
Eric Dumazet edumazet@google.com sch_sfq: validate silly quantum values
Eric Dumazet edumazet@google.com sch_choke: avoid potential panic in choke_reset()
Matt Jolly Kangie@footclan.ninja net: usb: qmi_wwan: add support for DW5816e
Eric Dumazet edumazet@google.com net_sched: sch_skbprio: add message validation to skbprio_change()
Tariq Toukan tariqt@mellanox.com net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
Scott Dial scott@scottdial.com net: macsec: preserve ingress frame ordering
Eric Dumazet edumazet@google.com fq_codel: fix TCA_FQ_CODEL_DROP_BATCH_SIZE sanity checks
Julia Lawall Julia.Lawall@inria.fr dp83640: reverse arguments to list_add_tail
Nicolas Pitre nico@fluxnic.net vt: fix unicode console freeing with a common interface
Masami Hiramatsu mhiramat@kernel.org tracing/kprobes: Fix a double initialization typo
Matt Jolly Kangie@footclan.ninja USB: serial: qcserial: Add DW5816e support
Diffstat:
Makefile | 4 +-
arch/arm64/kvm/guest.c | 7 ++
arch/arm64/mm/hugetlbpage.c | 2 +
arch/x86/entry/calling.h | 40 +++++------
arch/x86/entry/entry_64.S | 9 +--
arch/x86/include/asm/unwind.h | 2 +-
arch/x86/kernel/unwind_orc.c | 61 ++++++++++++-----
arch/x86/kvm/vmx.c | 91 ++++++++++++++-----------
drivers/hid/usbhid/hid-core.c | 37 +++++++---
drivers/hid/usbhid/usbhid.h | 1 +
drivers/hid/wacom_sys.c | 4 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 18 +++--
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 9 +--
drivers/net/ethernet/mellanox/mlx4/main.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 6 +-
drivers/net/macsec.c | 3 +-
drivers/net/phy/dp83640.c | 2 +-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/staging/gasket/gasket_core.c | 4 ++
drivers/tty/vt/vt.c | 9 ++-
drivers/usb/serial/garmin_gps.c | 4 +-
drivers/usb/serial/qcserial.c | 1 +
drivers/usb/storage/unusual_uas.h | 7 ++
fs/coredump.c | 8 +++
include/linux/virtio_net.h | 26 ++++++-
ipc/mqueue.c | 34 ++++++---
kernel/trace/trace.c | 13 ++++
kernel/trace/trace_kprobe.c | 2 +-
kernel/umh.c | 5 ++
mm/page_alloc.c | 1 +
net/batman-adv/bat_v_ogm.c | 2 +-
net/batman-adv/network-coding.c | 9 +--
net/batman-adv/sysfs.c | 3 +-
net/netfilter/nf_nat_proto_udp.c | 5 +-
net/netfilter/nfnetlink_osf.c | 12 ++--
net/sched/sch_choke.c | 3 +-
net/sched/sch_fq_codel.c | 2 +-
net/sched/sch_sfq.c | 9 +++
net/sched/sch_skbprio.c | 3 +
net/sctp/sm_statefuns.c | 6 +-
net/tipc/topsrv.c | 5 +-
scripts/decodecode | 2 +-
tools/objtool/check.c | 2 +-
virt/kvm/arm/hyp/aarch32.c | 8 ++-
virt/kvm/arm/vgic/vgic-mmio.c | 4 +-
46 files changed, 335 insertions(+), 156 deletions(-)
On 5/13/20 3:44 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.123 release. There are 48 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> Responses should be made by Fri, 15 May 2020 09:41:20 +0000. Anything received after that time might be too late.
> The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.123-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y and the diffstat can be found below.
> thanks,
> greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks,
-- Shuah