This is the start of the stable review cycle for the 4.9.224 release. There are 90 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.9.224-rc1
Sergei Trofimovich slyfox@gentoo.org Makefile: disallow data races on gcc-10 as well
Jim Mattson jmattson@google.com KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
Geert Uytterhoeven geert+renesas@glider.be ARM: dts: r8a7740: Add missing extal2 to CPG node
Geert Uytterhoeven geert+renesas@glider.be ARM: dts: r8a73a4: Add missing CMT1 interrupts
Kai-Heng Feng kai.heng.feng@canonical.com Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in cdc_bind()
Wei Yongjun weiyongjun1@huawei.com usb: gadget: legacy: fix error return code in gncm_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: audio: Fix a missing error return value in audio_bind()
Christophe JAILLET christophe.jaillet@wanadoo.fr usb: gadget: net2272: Fix a memory leak in an error handling path in 'net2272_plat_probe()'
Eric W. Biederman ebiederm@xmission.com exec: Move would_dump into flush_old_exec
Borislav Petkov bp@suse.de x86: Fix early boot crash on gcc-10, third try
Fabio Estevam festevam@gmail.com ARM: dts: imx27-phytec-phycard-s-rdk: Fix the I2C1 pinctrl entries
Sriharsha Allenki sallenki@codeaurora.org usb: xhci: Fix NULL pointer dereference when enqueuing trbs from urb sg list
Kyungtae Kim kt0755@gmail.com USB: gadget: fix illegal array access in binding with UDC
Jesus Ramos jesus-ramos@live.com ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
Takashi Iwai tiwai@suse.de ALSA: rawmidi: Initialize allocated buffers
Takashi Iwai tiwai@suse.de ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
Zefan Li lizefan@huawei.com netprio_cgroup: Fix unlimited memory leak of v2 cgroups
Paolo Abeni pabeni@redhat.com net: ipv4: really enforce backoff for redirects
Maciej Żenczykowski maze@google.com Revert "ipv6: add mtu lock check in __ip6_rt_update_pmtu"
Paolo Abeni pabeni@redhat.com netlabel: cope with NULL catmap
Cong Wang xiyou.wangcong@gmail.com net: fix a potential recursive NETDEV_FEAT_CHANGE
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'restrict' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'stringop-overflow' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'array-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: disable 'zero-length-bounds' warning for now
Linus Torvalds torvalds@linux-foundation.org gcc-10: avoid shadowing standard library 'free()' in crypto
Florian Fainelli f.fainelli@gmail.com net: phy: micrel: Use strlcpy() for ethtool::get_strings
Linus Torvalds torvalds@linux-foundation.org Stop the ad-hoc games with -Wno-maybe-initialized
Masahiro Yamada yamada.masahiro@socionext.com kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
Linus Torvalds torvalds@linux-foundation.org gcc-10 warnings: fix low-hanging fruit
Jason Gunthorpe jgg@mellanox.com pnp: Use list_for_each_entry() instead of open coding
Jack Morgenstein jackm@dev.mellanox.co.il IB/mlx4: Test return value of calls to ib_get_cached_pkey
Arnd Bergmann arnd@arndb.de netfilter: conntrack: avoid gcc-10 zero-length-bounds warning
Dan Carpenter dan.carpenter@oracle.com i40iw: Fix error handling in i40iw_manage_arp_cache()
Grace Kao grace.kao@intel.com pinctrl: cherryview: Add missing spinlock usage in chv_gpio_irq_handler
Vasily Averin vvs@virtuozzo.com ipc/util.c: sysvipc_find_ipc() incorrectly updates position index
Vasily Averin vvs@virtuozzo.com drm/qxl: lost qxl_bo_kunmap_atomic_page in qxl_image_init_helper()
Kai Vehmanen kai.vehmanen@linux.intel.com ALSA: hda/hdmi: fix race in monitor detection during probe
Lubomir Rintel lkundrak@v3.sk dmaengine: mmp_tdma: Reset channel error on release
Madhuparna Bhowmik madhuparnabhowmik10@gmail.com dmaengine: pch_dma.c: Avoid data race between probe and irq handler
Ronnie Sahlberg lsahlber@redhat.com cifs: Fix a race condition with cifs_echo_request
Samuel Cabrero scabrero@suse.de cifs: Check for timeout on Negotiate stage
wuxu.wu wuxu.wu@huawei.com spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls
Wu Bo wubo40@huawei.com scsi: sg: add sg_remove_request in sg_write
Arnd Bergmann arnd@arndb.de drop_monitor: work around gcc-10 stringop-overflow warning
Christophe JAILLET christophe.jaillet@wanadoo.fr net: moxa: Fix a potential double 'free_irq()'
Christophe JAILLET christophe.jaillet@wanadoo.fr net/sonic: Fix a resource leak in an error handling path in 'jazz_sonic_probe()'
Hugh Dickins hughd@google.com shmem: fix possible deadlocks on shmlock_user_lock
Vladis Dronov vdronov@redhat.com ptp: free ptp device pin descriptors properly
Vladis Dronov vdronov@redhat.com ptp: fix the race between the release of ptp_clock and cdev
YueHaibing yuehaibing@huawei.com ptp: Fix pass zero to ERR_PTR() in ptp_clock_register
Logan Gunthorpe logang@deltatee.com chardev: add helper function to register char devs with a struct device
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: create "pins" together with the rest of attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: use is_visible method to hide unused attributes
Dmitry Torokhov dmitry.torokhov@gmail.com ptp: do not explicitly set drvdata in ptp_clock_register()
Cengiz Can cengiz@kernel.wtf blktrace: fix dereference after null check
Jan Kara jack@suse.cz blktrace: Protect q->blk_trace with RCU
Jens Axboe axboe@kernel.dk blktrace: fix trace mutex deadlock
Jens Axboe axboe@kernel.dk blktrace: fix unlocked access to init/start-stop/teardown
Waiman Long longman@redhat.com blktrace: Fix potential deadlock between delete & sysfs ops
Sabrina Dubroca sd@queasysnail.net net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup
Sabrina Dubroca sd@queasysnail.net net: ipv6: add net argument to ip6_dst_lookup_flow
Shijie Luo luoshijie1@huawei.com ext4: add cond_resched() to ext4_protect_reserved_inode
Kees Cook keescook@chromium.org binfmt_elf: Do not move brk for INTERP-less ET_EXEC
Ivan Delalande colona@arista.com scripts/decodecode: fix trapping instruction formatting
Josh Poimboeuf jpoimboe@redhat.com objtool: Fix stack offset tracking for indirect CFAs
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_v_ogm_process
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_store_throughput_override
Xiyu Yang xiyuyang19@fudan.edu.cn batman-adv: Fix refcnt leak in batadv_show_throughput_override
George Spelvin lkml@sdf.org batman-adv: fix batadv_nc_random_weight_tq
David Hildenbrand david@redhat.com mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Add a vmalloc_sync_mappings() for safe measure
Oliver Neukum oneukum@suse.com USB: serial: garmin_gps: add sanity checking for data length
Oliver Neukum oneukum@suse.com USB: uas: add quirk for LaCie 2Big Quadra
Kees Cook keescook@chromium.org binfmt_elf: move brk out of mmap when doing direct loader exec
Hans de Goede hdegoede@redhat.com Revert "ACPI / video: Add force_native quirk for HP Pavilion dv6"
Michael Chan michael.chan@broadcom.com bnxt_en: Improve AER slot reset.
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix command entry leak in Internal Error State
Moshe Shemesh moshe@mellanox.com net/mlx5: Fix forced completion access non initialized command entry
Michael Chan michael.chan@broadcom.com bnxt_en: Fix VLAN acceleration handling in bnxt_fix_features().
Eric Dumazet edumazet@google.com sch_sfq: validate silly quantum values
Eric Dumazet edumazet@google.com sch_choke: avoid potential panic in choke_reset()
Matt Jolly Kangie@footclan.ninja net: usb: qmi_wwan: add support for DW5816e
Tariq Toukan tariqt@mellanox.com net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
Scott Dial scott@scottdial.com net: macsec: preserve ingress frame ordering
Eric Dumazet edumazet@google.com fq_codel: fix TCA_FQ_CODEL_DROP_BATCH_SIZE sanity checks
Julia Lawall Julia.Lawall@inria.fr dp83640: reverse arguments to list_add_tail
Matt Jolly Kangie@footclan.ninja USB: serial: qcserial: Add DW5816e support
-------------
Diffstat:
Makefile | 25 +-- arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts | 4 +- arch/arm/boot/dts/r8a73a4.dtsi | 9 +- arch/arm/boot/dts/r8a7740.dtsi | 2 +- arch/x86/include/asm/stackprotector.h | 7 +- arch/x86/kernel/smpboot.c | 8 + arch/x86/kvm/x86.c | 2 +- arch/x86/xen/smp.c | 1 + block/blk-core.c | 3 + crypto/lrw.c | 4 +- crypto/xts.c | 4 +- drivers/acpi/video_detect.c | 11 -- drivers/dma/mmp_tdma.c | 2 + drivers/dma/pch_dma.c | 2 +- drivers/gpu/drm/qxl/qxl_image.c | 3 +- drivers/infiniband/core/addr.c | 8 +- drivers/infiniband/hw/i40iw/i40iw_hw.c | 2 +- drivers/infiniband/hw/mlx4/qp.c | 14 +- drivers/infiniband/sw/rxe/rxe_net.c | 8 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 18 ++- drivers/net/ethernet/mellanox/mlx4/main.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 6 +- drivers/net/ethernet/moxa/moxart_ether.c | 2 +- drivers/net/ethernet/natsemi/jazzsonic.c | 6 +- drivers/net/geneve.c | 4 +- drivers/net/macsec.c | 3 +- drivers/net/phy/dp83640.c | 2 +- drivers/net/phy/micrel.c | 4 +- drivers/net/usb/qmi_wwan.c | 1 + drivers/net/vxlan.c | 10 +- drivers/pinctrl/intel/pinctrl-cherryview.c | 4 + drivers/ptp/ptp_clock.c | 42 ++--- drivers/ptp/ptp_private.h | 9 +- drivers/ptp/ptp_sysfs.c | 162 ++++++++----------- drivers/scsi/sg.c | 4 +- drivers/spi/spi-dw.c | 15 +- drivers/spi/spi-dw.h | 1 + drivers/usb/gadget/configfs.c | 3 + drivers/usb/gadget/legacy/audio.c | 4 +- drivers/usb/gadget/legacy/cdc2.c | 4 +- drivers/usb/gadget/legacy/ncm.c | 4 +- drivers/usb/gadget/udc/net2272.c | 2 + drivers/usb/host/xhci-ring.c | 4 +- drivers/usb/serial/garmin_gps.c | 4 +- drivers/usb/serial/qcserial.c | 1 + drivers/usb/storage/unusual_uas.h | 7 + fs/binfmt_elf.c | 12 ++ fs/char_dev.c | 86 ++++++++++ fs/cifs/cifssmb.c | 12 ++ fs/cifs/connect.c | 11 +- fs/cifs/smb2pdu.c | 12 ++ fs/exec.c | 4 +- fs/ext4/block_validity.c | 1 + include/linux/blkdev.h | 3 +- include/linux/blktrace_api.h | 18 ++- include/linux/cdev.h | 5 + include/linux/compiler.h | 7 + include/linux/fs.h | 2 +- include/linux/pnp.h | 29 ++-- include/linux/posix-clock.h | 19 ++- include/linux/tty.h | 2 +- include/net/addrconf.h | 6 +- include/net/ipv6.h | 2 +- include/net/netfilter/nf_conntrack.h | 2 +- include/sound/rawmidi.h | 1 + init/main.c | 2 + ipc/util.c | 12 +- kernel/time/posix-clock.c | 31 ++-- kernel/trace/blktrace.c | 191 +++++++++++++++++------ kernel/trace/trace.c | 13 ++ mm/page_alloc.c | 1 + mm/shmem.c | 7 +- net/batman-adv/bat_v_ogm.c | 2 +- net/batman-adv/network-coding.c | 9 +- net/batman-adv/sysfs.c | 3 +- net/core/dev.c | 4 +- net/core/drop_monitor.c | 11 +- net/core/netprio_cgroup.c | 2 + net/dccp/ipv6.c | 6 +- net/ipv4/cipso_ipv4.c | 6 +- net/ipv4/route.c | 2 +- net/ipv6/addrconf_core.c | 11 +- net/ipv6/af_inet6.c | 4 +- net/ipv6/calipso.c | 3 +- net/ipv6/datagram.c | 2 +- net/ipv6/inet6_connection_sock.c | 4 +- net/ipv6/ip6_output.c | 8 +- net/ipv6/raw.c | 2 +- net/ipv6/route.c | 6 +- net/ipv6/syncookies.c | 2 +- net/ipv6/tcp_ipv6.c | 4 +- net/l2tp/l2tp_ip6.c | 2 +- net/mpls/af_mpls.c | 7 +- net/netfilter/nf_conntrack_core.c | 4 +- net/netlabel/netlabel_kapi.c | 6 + net/sched/sch_choke.c | 3 +- net/sched/sch_fq_codel.c | 2 +- net/sched/sch_sfq.c | 9 ++ net/sctp/ipv6.c | 4 +- net/tipc/udp_media.c | 9 +- scripts/decodecode | 2 +- sound/core/rawmidi.c | 35 ++++- sound/pci/hda/patch_hdmi.c | 2 + sound/pci/hda/patch_realtek.c | 12 +- sound/usb/quirks.c | 9 +- tools/objtool/check.c | 2 +- 106 files changed, 747 insertions(+), 392 deletions(-)
From: Matt Jolly Kangie@footclan.ninja
commit 78d6de3cfbd342918d31cf68d0d2eda401338aef upstream.
Add support for Dell Wireless 5816e to drivers/usb/serial/qcserial.c
Signed-off-by: Matt Jolly Kangie@footclan.ninja Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/qcserial.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/qcserial.c +++ b/drivers/usb/serial/qcserial.c @@ -177,6 +177,7 @@ static const struct usb_device_id id_tab {DEVICE_SWI(0x413c, 0x81b3)}, /* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */ {DEVICE_SWI(0x413c, 0x81b5)}, /* Dell Wireless 5811e QDL */ {DEVICE_SWI(0x413c, 0x81b6)}, /* Dell Wireless 5811e QDL */ + {DEVICE_SWI(0x413c, 0x81cc)}, /* Dell Wireless 5816e */ {DEVICE_SWI(0x413c, 0x81cf)}, /* Dell Wireless 5819 */ {DEVICE_SWI(0x413c, 0x81d0)}, /* Dell Wireless 5819 */ {DEVICE_SWI(0x413c, 0x81d1)}, /* Dell Wireless 5818 */
From: Julia Lawall Julia.Lawall@inria.fr
[ Upstream commit 865308373ed49c9fb05720d14cbf1315349b32a9 ]
In this code, it appears that phyter_clocks is a list head, based on the previous list_for_each, and that clock->list is intended to be a list element, given that it has just been initialized in dp83640_clock_init. Accordingly, switch the arguments to list_add_tail, which takes the list head as the second argument.
Fixes: cb646e2b02b27 ("ptp: Added a clock driver for the National Semiconductor PHYTER.") Signed-off-by: Julia Lawall Julia.Lawall@inria.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/dp83640.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/phy/dp83640.c +++ b/drivers/net/phy/dp83640.c @@ -1108,7 +1108,7 @@ static struct dp83640_clock *dp83640_clo goto out; } dp83640_clock_init(clock, bus); - list_add_tail(&phyter_clocks, &clock->list); + list_add_tail(&clock->list, &phyter_clocks); out: mutex_unlock(&phyter_clocks_lock);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 14695212d4cd8b0c997f6121b6df8520038ce076 ]
My intent was to not let users set a zero drop_batch_size, it seems I once again messed with min()/max().
Fixes: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_fq_codel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -428,7 +428,7 @@ static int fq_codel_change(struct Qdisc q->quantum = max(256U, nla_get_u32(tb[TCA_FQ_CODEL_QUANTUM]));
if (tb[TCA_FQ_CODEL_DROP_BATCH_SIZE]) - q->drop_batch_size = min(1U, nla_get_u32(tb[TCA_FQ_CODEL_DROP_BATCH_SIZE])); + q->drop_batch_size = max(1U, nla_get_u32(tb[TCA_FQ_CODEL_DROP_BATCH_SIZE]));
if (tb[TCA_FQ_CODEL_MEMORY_LIMIT]) q->memory_limit = min(1U << 31, nla_get_u32(tb[TCA_FQ_CODEL_MEMORY_LIMIT]));
From: Scott Dial scott@scottdial.com
[ Upstream commit ab046a5d4be4c90a3952a0eae75617b49c0cb01b ]
MACsec decryption always occurs in a softirq context. Since the FPU may not be usable in the softirq context, the call to decrypt may be scheduled on the cryptd work queue. The cryptd work queue does not provide ordering guarantees. Therefore, preserving order requires masking out ASYNC implementations of gcm(aes).
For instance, an Intel CPU with AES-NI makes available the generic-gcm-aesni driver from the aesni_intel module to implement gcm(aes). However, this implementation requires the FPU, so it is not always available to use from a softirq context, and will fallback to the cryptd work queue, which does not preserve frame ordering. With this change, such a system would select gcm_base(ctr(aes-aesni),ghash-generic). While the aes-aesni implementation prefers to use the FPU, it will fallback to the aes-asm implementation if unavailable.
By using a synchronous version of gcm(aes), the decryption will complete before returning from crypto_aead_decrypt(). Therefore, the macsec_decrypt_done() callback will be called before returning from macsec_decrypt(). Thus, the order of calls to macsec_post_decrypt() for the frames is preserved.
While it's presumable that the pure AES-NI version of gcm(aes) is more performant, the hybrid solution is capable of gigabit speeds on modest hardware. Regardless, preserving the order of frames is paramount for many network protocols (e.g., triggering TCP retries). Within the MACsec driver itself, the replay protection is tripped by the out-of-order frames, and can cause frames to be dropped.
This bug has been present in this code since it was added in v4.6, however it may not have been noticed since not all CPUs have FPU offload available. Additionally, the bug manifests as occasional out-of-order packets that are easily misattributed to other network phenomena.
When this code was added in v4.6, the crypto/gcm.c code did not restrict selection of the ghash function based on the ASYNC flag. For instance, x86 CPUs with PCLMULQDQ would select the ghash-clmulni driver instead of ghash-generic, which submits to the cryptd work queue if the FPU is busy. However, this bug was was corrected in v4.8 by commit b30bdfa86431afbafe15284a3ad5ac19b49b88e3, and was backported all the way back to the v3.14 stable branch, so this patch should be applicable back to the v4.6 stable branch.
Signed-off-by: Scott Dial scott@scottdial.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/macsec.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -1315,7 +1315,8 @@ static struct crypto_aead *macsec_alloc_ struct crypto_aead *tfm; int ret;
- tfm = crypto_alloc_aead("gcm(aes)", 0, 0); + /* Pick a sync gcm(aes) cipher to ensure order is preserved. */ + tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
if (IS_ERR(tfm)) return tfm;
From: Tariq Toukan tariqt@mellanox.com
[ Upstream commit 40e473071dbad04316ddc3613c3a3d1c75458299 ]
When ENOSPC is set the idx is still valid and gets set to the global MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:
drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be used uninitialized in this function [-Wmaybe-uninitialized] 2552 | priv->def_counter[port] = idx;
Also, when ENOSPC is returned mlx4_allocate_default_counters should not fail.
Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port") Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Tariq Toukan tariqt@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx4/main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx4/main.c +++ b/drivers/net/ethernet/mellanox/mlx4/main.c @@ -2478,6 +2478,7 @@ static int mlx4_allocate_default_counter
if (!err || err == -ENOSPC) { priv->def_counter[port] = idx; + err = 0; } else if (err == -ENOENT) { err = 0; continue; @@ -2527,7 +2528,8 @@ int mlx4_counter_alloc(struct mlx4_dev * MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED); if (!err) *idx = get_param_l(&out_param); - + if (WARN_ON(err == -ENOSPC)) + err = -EINVAL; return err; } return __mlx4_counter_alloc(dev, idx);
From: Matt Jolly Kangie@footclan.ninja
[ Upstream commit 57c7f2bd758eed867295c81d3527fff4fab1ed74 ]
Add support for Dell Wireless 5816e to drivers/net/usb/qmi_wwan.c
Signed-off-by: Matt Jolly Kangie@footclan.ninja Acked-by: Bjørn Mork bjorn@mork.no Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -950,6 +950,7 @@ static const struct usb_device_id produc {QMI_FIXED_INTF(0x413c, 0x81b3, 8)}, /* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */ {QMI_FIXED_INTF(0x413c, 0x81b6, 8)}, /* Dell Wireless 5811e */ {QMI_FIXED_INTF(0x413c, 0x81b6, 10)}, /* Dell Wireless 5811e */ + {QMI_FIXED_INTF(0x413c, 0x81cc, 8)}, /* Dell Wireless 5816e */ {QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */ {QMI_FIXED_INTF(0x413c, 0x81d7, 1)}, /* Dell Wireless 5821e preproduction config */ {QMI_FIXED_INTF(0x413c, 0x81e0, 0)}, /* Dell Wireless 5821e with eSIM support*/
From: Eric Dumazet edumazet@google.com
[ Upstream commit 8738c85c72b3108c9b9a369a39868ba5f8e10ae0 ]
If choke_init() could not allocate q->tab, we would crash later in choke_reset().
BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline] BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326 Write of size 8 at addr 0000000000000000 by task syz-executor822/7022
CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 __kasan_report.cold+0x5/0x4d mm/kasan/report.c:515 kasan_report+0x33/0x50 mm/kasan/common.c:625 check_memory_region_inline mm/kasan/generic.c:187 [inline] check_memory_region+0x141/0x190 mm/kasan/generic.c:193 memset+0x20/0x40 mm/kasan/common.c:85 memset include/linux/string.h:366 [inline] choke_reset+0x208/0x340 net/sched/sch_choke.c:326 qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910 dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138 netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline] dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195 dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233 qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051 tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362 ___sys_sendmsg+0x100/0x170 net/socket.c:2416 __sys_sendmsg+0xec/0x1b0 net/socket.c:2449 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_choke.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sched/sch_choke.c +++ b/net/sched/sch_choke.c @@ -382,7 +382,8 @@ static void choke_reset(struct Qdisc *sc
sch->q.qlen = 0; sch->qstats.backlog = 0; - memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *)); + if (q->tab) + memset(q->tab, 0, (q->tab_mask + 1) * sizeof(struct sk_buff *)); q->head = q->tail = 0; red_restart(&q->vars); }
From: Eric Dumazet edumazet@google.com
[ Upstream commit df4953e4e997e273501339f607b77953772e3559 ]
syzbot managed to set up sfq so that q->scaled_quantum was zero, triggering an infinite loop in sfq_dequeue()
More generally, we must only accept quantum between 1 and 2^18 - 7, meaning scaled_quantum must be in [1, 0x7FFF] range.
Otherwise, we also could have a loop in sfq_dequeue() if scaled_quantum happens to be 0x8000, since slot->allot could indefinitely switch between 0 and 0x8000.
Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com Cc: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_sfq.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -635,6 +635,15 @@ static int sfq_change(struct Qdisc *sch, if (ctl->divisor && (!is_power_of_2(ctl->divisor) || ctl->divisor > 65536)) return -EINVAL; + + /* slot->allot is a short, make sure quantum is not too big. */ + if (ctl->quantum) { + unsigned int scaled = SFQ_ALLOT_SIZE(ctl->quantum); + + if (scaled <= 0 || scaled > SHRT_MAX) + return -EINVAL; + } + if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog)) return -EINVAL;
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit c72cb303aa6c2ae7e4184f0081c6d11bf03fb96b ]
The current logic in bnxt_fix_features() will inadvertently turn on both CTAG and STAG VLAN offload if the user tries to disable both. Fix it by checking that the user is trying to enable CTAG or STAG before enabling both. The logic is supposed to enable or disable both CTAG and STAG together.
Fixes: 5a9f6b238e59 ("bnxt_en: Enable and disable RX CTAG and RX STAG VLAN acceleration together.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5997,6 +5997,7 @@ static netdev_features_t bnxt_fix_featur netdev_features_t features) { struct bnxt *bp = netdev_priv(dev); + netdev_features_t vlan_features;
if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp)) features &= ~NETIF_F_NTUPLE; @@ -6004,12 +6005,14 @@ static netdev_features_t bnxt_fix_featur /* Both CTAG and STAG VLAN accelaration on the RX side have to be * turned on or off together. */ - if ((features & (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX)) != - (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX)) { + vlan_features = features & (NETIF_F_HW_VLAN_CTAG_RX | + NETIF_F_HW_VLAN_STAG_RX); + if (vlan_features != (NETIF_F_HW_VLAN_CTAG_RX | + NETIF_F_HW_VLAN_STAG_RX)) { if (dev->features & NETIF_F_HW_VLAN_CTAG_RX) features &= ~(NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX); - else + else if (vlan_features) features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_STAG_RX; }
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit f3cb3cebe26ed4c8036adbd9448b372129d3c371 ]
mlx5_cmd_flush() will trigger forced completions to all valid command entries. Triggered by an asynch event such as fast teardown it can happen at any stage of the command, including command initialization. It will trigger forced completion and that can lead to completion on an uninitialized command entry.
Setting MLX5_CMD_ENT_STATE_PENDING_COMP only after command entry is initialized will ensure force completion is treated only if command entry is initialized.
Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh moshe@mellanox.com Signed-off-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -813,7 +813,6 @@ static void cmd_work_handler(struct work }
cmd->ent_arr[ent->idx] = ent; - set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state); lay = get_inst(cmd, ent->idx); ent->lay = lay; memset(lay, 0, sizeof(*lay)); @@ -835,6 +834,7 @@ static void cmd_work_handler(struct work
if (ent->callback) schedule_delayed_work(&ent->cb_timeout_work, cb_timeout); + set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
/* Skip sending command to fw if internal error */ if (pci_channel_offline(dev->pdev) ||
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit cece6f432cca9f18900463ed01b97a152a03600a ]
Processing commands by cmd_work_handler() while already in Internal Error State will result in entry leak, since the handler process force completion without doorbell. Forced completion doesn't release the entry and event completion will never arrive, so entry should be released.
Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh moshe@mellanox.com Signed-off-by: Eran Ben Elisha eranbe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -847,6 +847,10 @@ static void cmd_work_handler(struct work MLX5_SET(mbox_out, ent->out, syndrome, drv_synd);
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true); + /* no doorbell, no need to keep the entry */ + free_ent(cmd, ent->idx); + if (ent->callback) + free_cmd(ent); return; }
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit bae361c54fb6ac6eba3b4762f49ce14beb73ef13 ]
Improve the slot reset sequence by disabling the device to prevent bad DMAs if slot reset fails. Return the proper result instead of always PCI_ERS_RESULT_RECOVERED to the caller.
Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -7166,8 +7166,11 @@ static pci_ers_result_t bnxt_io_slot_res result = PCI_ERS_RESULT_RECOVERED; }
- if (result != PCI_ERS_RESULT_RECOVERED && netif_running(netdev)) - dev_close(netdev); + if (result != PCI_ERS_RESULT_RECOVERED) { + if (netif_running(netdev)) + dev_close(netdev); + pci_disable_device(pdev); + }
rtnl_unlock();
@@ -7178,7 +7181,7 @@ static pci_ers_result_t bnxt_io_slot_res err); /* non-fatal, continue */ }
- return PCI_ERS_RESULT_RECOVERED; + return result; }
/**
From: Hans de Goede hdegoede@redhat.com
commit fd25ea29093e275195d0ae8b2573021a1c98959f upstream.
Revert commit 6276e53fa8c0 (ACPI / video: Add force_native quirk for HP Pavilion dv6).
In the commit message for the quirk this revert removes I wrote:
"Note that there are quite a few HP Pavilion dv6 variants, some woth ATI and some with NVIDIA hybrid gfx, both seem to need this quirk to have working backlight control. There are also some versions with only Intel integrated gfx, these may not need this quirk, but it should not hurt there."
Unfortunately that seems wrong, I've already received 2 reports of this commit causing regressions on some dv6 variants (at least one of which actually has a nvidia GPU). So it seems that HP has made a mess here by using the same model-name both in marketing and in the DMI data for many different variants. Some of which need acpi_backlight=native for functional backlight control (as the quirk this commit reverts was doing), where as others are broken by it. So lets get back to the old sitation so as to avoid regressing on models which used to work without any kernel cmdline arguments before.
Fixes: 6276e53fa8c0 (ACPI / video: Add force_native quirk for HP Pavilion dv6) Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/acpi/video_detect.c | 11 ----------- 1 file changed, 11 deletions(-)
--- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -314,17 +314,6 @@ static const struct dmi_system_id video_ DMI_MATCH(DMI_PRODUCT_NAME, "Dell System XPS L702X"), }, }, - { - /* https://bugzilla.redhat.com/show_bug.cgi?id=1204476 */ - /* https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1416940 */ - .callback = video_detect_force_native, - .ident = "HP Pavilion dv6", - .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), - DMI_MATCH(DMI_PRODUCT_NAME, "HP Pavilion dv6 Notebook PC"), - }, - }, - { }, };
From: Kees Cook keescook@chromium.org
commit bbdc6076d2e5d07db44e74c11b01a3e27ab90b32 upstream.
Commmit eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"), made changes in the rare case when the ELF loader was directly invoked (e.g to set a non-inheritable LD_LIBRARY_PATH, testing new versions of the loader), by moving into the mmap region to avoid both ET_EXEC and PIE binaries. This had the effect of also moving the brk region into mmap, which could lead to the stack and brk being arbitrarily close to each other. An unlucky process wouldn't get its requested stack size and stack allocations could end up scribbling on the heap.
This is illustrated here. In the case of using the loader directly, brk (so helpfully identified as "[heap]") is allocated with the _loader_ not the binary. For example, with ASLR entirely disabled, you can see this more clearly:
$ /bin/cat /proc/self/maps 555555554000-55555555c000 r-xp 00000000 ... /bin/cat 55555575b000-55555575c000 r--p 00007000 ... /bin/cat 55555575c000-55555575d000 rw-p 00008000 ... /bin/cat 55555575d000-55555577e000 rw-p 00000000 ... [heap] ... 7ffff7ff7000-7ffff7ffa000 r--p 00000000 ... [vvar] 7ffff7ffa000-7ffff7ffc000 r-xp 00000000 ... [vdso] 7ffff7ffc000-7ffff7ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffd000-7ffff7ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffe000-7ffff7fff000 rw-p 00000000 ... 7ffffffde000-7ffffffff000 rw-p 00000000 ... [stack]
$ /lib/x86_64-linux-gnu/ld-2.27.so /bin/cat /proc/self/maps ... 7ffff7bcc000-7ffff7bd4000 r-xp 00000000 ... /bin/cat 7ffff7bd4000-7ffff7dd3000 ---p 00008000 ... /bin/cat 7ffff7dd3000-7ffff7dd4000 r--p 00007000 ... /bin/cat 7ffff7dd4000-7ffff7dd5000 rw-p 00008000 ... /bin/cat 7ffff7dd5000-7ffff7dfc000 r-xp 00000000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7fb2000-7ffff7fd6000 rw-p 00000000 ... 7ffff7ff7000-7ffff7ffa000 r--p 00000000 ... [vvar] 7ffff7ffa000-7ffff7ffc000 r-xp 00000000 ... [vdso] 7ffff7ffc000-7ffff7ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffd000-7ffff7ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so 7ffff7ffe000-7ffff8020000 rw-p 00000000 ... [heap] 7ffffffde000-7ffffffff000 rw-p 00000000 ... [stack]
The solution is to move brk out of mmap and into ELF_ET_DYN_BASE since nothing is there in the direct loader case (and ET_EXEC is still far away at 0x400000). Anything that ran before should still work (i.e. the ultimately-launched binary already had the brk very far from its text, so this should be no different from a COMPAT_BRK standpoint). The only risk I see here is that if someone started to suddenly depend on the entire memory space lower than the mmap region being available when launching binaries via a direct loader execs which seems highly unlikely, I'd hope: this would mean a binary would _not_ work when exec()ed normally.
(Note that this is only done under CONFIG_ARCH_HAS_ELF_RANDOMIZATION when randomization is turned on.)
Link: http://lkml.kernel.org/r/20190422225727.GA21011@beast Link: https://lkml.kernel.org/r/CAGXu5jJ5sj3emOT2QPxQkNQk0qbU6zEfu9=Omfhx_p0nCKPSj... Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Ali Saidi alisaidi@amazon.com Cc: Ali Saidi alisaidi@amazon.com Cc: Guenter Roeck linux@roeck-us.net Cc: Michal Hocko mhocko@suse.com Cc: Matthew Wilcox willy@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Jann Horn jannh@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/binfmt_elf.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1100,6 +1100,17 @@ static int load_elf_binary(struct linux_ current->mm->start_stack = bprm->p;
if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) { + /* + * For architectures with ELF randomization, when executing + * a loader directly (i.e. no interpreter listed in ELF + * headers), move the brk area out of the mmap region + * (since it grows up, and may collide early with the stack + * growing down), and into the unused ELF_ET_DYN_BASE region. + */ + if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && !interpreter) + current->mm->brk = current->mm->start_brk = + ELF_ET_DYN_BASE; + current->mm->brk = current->mm->start_brk = arch_randomize_brk(current->mm); #ifdef compat_brk_randomized
From: Oliver Neukum oneukum@suse.com
commit 9f04db234af691007bb785342a06abab5fb34474 upstream.
This device needs US_FL_NO_REPORT_OPCODES to avoid going through prolonged error handling on enumeration.
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-by: Julian Groß julian.g@posteo.de Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200429155218.7308-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -41,6 +41,13 @@ * and don't forget to CC: the USB development list linux-usb@vger.kernel.org */
+/* Reported-by: Julian Groß julian.g@posteo.de */ +UNUSUAL_DEV(0x059f, 0x105f, 0x0000, 0x9999, + "LaCie", + "2Big Quadra USB3", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_NO_REPORT_OPCODES), + /* * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI * commands in UAS mode. Observed with the 1.28 firmware; are there others?
From: Oliver Neukum oneukum@suse.com
commit e9b3c610a05c1cdf8e959a6d89c38807ff758ee6 upstream.
We must not process packets shorter than a packet ID
Signed-off-by: Oliver Neukum oneukum@suse.com Reported-and-tested-by: syzbot+d29e9263e13ce0b9f4fd@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/garmin_gps.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/serial/garmin_gps.c +++ b/drivers/usb/serial/garmin_gps.c @@ -1161,8 +1161,8 @@ static void garmin_read_process(struct g send it directly to the tty port */ if (garmin_data_p->flags & FLAGS_QUEUING) { pkt_add(garmin_data_p, data, data_length); - } else if (bulk_data || - getLayerId(data) == GARMIN_LAYERID_APPL) { + } else if (bulk_data || (data_length >= sizeof(u32) && + getLayerId(data) == GARMIN_LAYERID_APPL)) {
spin_lock_irqsave(&garmin_data_p->lock, flags); garmin_data_p->flags |= APP_RESP_SEEN;
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 11f5efc3ab66284f7aaacc926e9351d658e2577b upstream.
x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu areas can be complex, to say the least. Mappings may happen at boot up, and if nothing synchronizes the page tables, those page mappings may not be synced till they are used. This causes issues for anything that might touch one of those mappings in the path of the page fault handler. When one of those unmapped mappings is touched in the page fault handler, it will cause another page fault, which in turn will cause a page fault, and leave us in a loop of page faults.
Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split vmalloc_sync_all() into vmalloc_sync_unmappings() and vmalloc_sync_mappings(), as on system exit, it did not need to do a full sync on x86_64 (although it still needed to be done on x86_32). By chance, the vmalloc_sync_all() would synchronize the page mappings done at boot up and prevent the per cpu area from being a problem for tracing in the page fault handler. But when that synchronization in the exit of a task became a nop, it caused the problem to appear.
Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home
Cc: stable@vger.kernel.org Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code") Reported-by: "Tzvetomir Stoyanov (VMware)" tz.stoyanov@gmail.com Suggested-by: Joerg Roedel jroedel@suse.de Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- kernel/trace/trace.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7032,6 +7032,19 @@ static int allocate_trace_buffers(struct */ allocate_snapshot = false; #endif + + /* + * Because of some magic with the way alloc_percpu() works on + * x86_64, we need to synchronize the pgd of all the tables, + * otherwise the trace events that happen in x86_64 page fault + * handlers can't cope with accessing the chance that a + * alloc_percpu()'d memory might be touched in the page fault trace + * event. Oh, and we need to audit all other alloc_percpu() and vmalloc() + * calls in tracing, because something might get triggered within a + * page fault trace event! + */ + vmalloc_sync_mappings(); + return 0; }
From: David Hildenbrand david@redhat.com
commit e84fe99b68ce353c37ceeecc95dce9696c976556 upstream.
Without CONFIG_PREEMPT, it can happen that we get soft lockups detected, e.g., while booting up.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0-next-20200331+ #4 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: __pageblock_pfn_to_page+0x134/0x1c0 Call Trace: set_zone_contiguous+0x56/0x70 page_alloc_init_late+0x166/0x176 kernel_init_freeable+0xfa/0x255 kernel_init+0xa/0x106 ret_from_fork+0x35/0x40
The issue becomes visible when having a lot of memory (e.g., 4TB) assigned to a single NUMA node - a system that can easily be created using QEMU. Inside VMs on a hypervisor with quite some memory overcommit, this is fairly easy to trigger.
Signed-off-by: David Hildenbrand david@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Pavel Tatashin pasha.tatashin@soleen.com Reviewed-by: Pankaj Gupta pankaj.gupta.linux@gmail.com Reviewed-by: Baoquan He bhe@redhat.com Reviewed-by: Shile Zhang shile.zhang@linux.alibaba.com Acked-by: Michal Hocko mhocko@suse.com Cc: Kirill Tkhai ktkhai@virtuozzo.com Cc: Shile Zhang shile.zhang@linux.alibaba.com Cc: Pavel Tatashin pasha.tatashin@soleen.com Cc: Daniel Jordan daniel.m.jordan@oracle.com Cc: Michal Hocko mhocko@kernel.org Cc: Alexander Duyck alexander.duyck@gmail.com Cc: Baoquan He bhe@redhat.com Cc: Oscar Salvador osalvador@suse.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20200416073417.5003-1-david@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/page_alloc.c | 1 + 1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1408,6 +1408,7 @@ void set_zone_contiguous(struct zone *zo if (!__pageblock_pfn_to_page(block_start_pfn, block_end_pfn, zone)) return; + cond_resched(); }
/* We confirm that there is no hole */
From: George Spelvin lkml@sdf.org
commit fd0c42c4dea54335967c5a86f15fc064235a2797 upstream.
and change to pseudorandom numbers, as this is a traffic dithering operation that doesn't need crypto-grade.
The previous code operated in 4 steps:
1. Generate a random byte 0 <= rand_tq <= 255 2. Multiply it by BATADV_TQ_MAX_VALUE - tq 3. Divide by 255 (= BATADV_TQ_MAX_VALUE) 4. Return BATADV_TQ_MAX_VALUE - rand_tq
This would apperar to scale (BATADV_TQ_MAX_VALUE - tq) by a random value between 0/255 and 255/255.
But! The intermediate value between steps 3 and 4 is stored in a u8 variable. So it's truncated, and most of the time, is less than 255, after which the division produces 0. Specifically, if tq is odd, the product is always even, and can never be 255. If tq is even, there's exactly one random byte value that will produce a product byte of 255.
Thus, the return value is 255 (511/512 of the time) or 254 (1/512 of the time).
If we assume that the truncation is a bug, and the code is meant to scale the input, a simpler way of looking at it is that it's returning a random value between tq and BATADV_TQ_MAX_VALUE, inclusive.
Well, we have an optimized function for doing just that.
Fixes: 3c12de9a5c75 ("batman-adv: network coding - code and transmit packets if possible") Signed-off-by: George Spelvin lkml@sdf.org Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/network-coding.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
--- a/net/batman-adv/network-coding.c +++ b/net/batman-adv/network-coding.c @@ -1012,15 +1012,8 @@ static struct batadv_nc_path *batadv_nc_ */ static u8 batadv_nc_random_weight_tq(u8 tq) { - u8 rand_val, rand_tq; - - get_random_bytes(&rand_val, sizeof(rand_val)); - /* randomize the estimated packet loss (max TQ - estimated TQ) */ - rand_tq = rand_val * (BATADV_TQ_MAX_VALUE - tq); - - /* normalize the randomized packet loss */ - rand_tq /= BATADV_TQ_MAX_VALUE; + u8 rand_tq = prandom_u32_max(BATADV_TQ_MAX_VALUE + 1 - tq);
/* convert to (randomized) estimated tq again */ return BATADV_TQ_MAX_VALUE - rand_tq;
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit f872de8185acf1b48b954ba5bd8f9bc0a0d14016 upstream.
batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(), which gets a batadv_hard_iface object from net_dev with increased refcnt and its reference is assigned to a local pointer 'hard_iface'.
When batadv_show_throughput_override() returns, "hard_iface" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in the normal path of batadv_show_throughput_override(), which forgets to decrease the refcnt increased by batadv_hardif_get_by_netdev() before the function returns, causing a refcnt leak.
Fix this issue by calling batadv_hardif_put() before the batadv_show_throughput_override() returns in the normal path.
Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/sysfs.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/batman-adv/sysfs.c +++ b/net/batman-adv/sysfs.c @@ -1120,6 +1120,7 @@ static ssize_t batadv_show_throughput_ov
tp_override = atomic_read(&hard_iface->bat_v.throughput_override);
+ batadv_hardif_put(hard_iface); return sprintf(buff, "%u.%u MBit\n", tp_override / 10, tp_override % 10); }
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 6107c5da0fca8b50b4d3215e94d619d38cc4a18c upstream.
batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(), which gets a batadv_hard_iface object from net_dev with increased refcnt and its reference is assigned to a local pointer 'hard_iface'.
When batadv_store_throughput_override() returns, "hard_iface" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in one error path of batadv_store_throughput_override(). When batadv_parse_throughput() returns NULL, the refcnt increased by batadv_hardif_get_by_netdev() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_parse_throughput() returns NULL.
Fixes: 0b5ecc6811bd ("batman-adv: add throughput override attribute to hard_ifaces") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/batman-adv/sysfs.c +++ b/net/batman-adv/sysfs.c @@ -1087,7 +1087,7 @@ static ssize_t batadv_store_throughput_o ret = batadv_parse_throughput(net_dev, buff, "throughput_override", &tp_override); if (!ret) - return count; + goto out;
old_tp_override = atomic_read(&hard_iface->bat_v.throughput_override); if (old_tp_override == tp_override)
From: Xiyu Yang xiyuyang19@fudan.edu.cn
commit 6f91a3f7af4186099dd10fa530dd7e0d9c29747d upstream.
batadv_v_ogm_process() invokes batadv_hardif_neigh_get(), which returns a reference of the neighbor object to "hardif_neigh" with increased refcount.
When batadv_v_ogm_process() returns, "hardif_neigh" becomes invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling paths of batadv_v_ogm_process(). When batadv_v_ogm_orig_get() fails to get the orig node and returns NULL, the refcnt increased by batadv_hardif_neigh_get() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_v_ogm_orig_get() fails to get the orig node.
Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic") Signed-off-by: Xiyu Yang xiyuyang19@fudan.edu.cn Signed-off-by: Xin Tan tanxin.ctf@gmail.com Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/bat_v_ogm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/batman-adv/bat_v_ogm.c +++ b/net/batman-adv/bat_v_ogm.c @@ -709,7 +709,7 @@ static void batadv_v_ogm_process(const s
orig_node = batadv_v_ogm_orig_get(bat_priv, ogm_packet->orig); if (!orig_node) - return; + goto out;
neigh_node = batadv_neigh_node_get_or_create(orig_node, if_incoming, ethhdr->h_source);
From: Josh Poimboeuf jpoimboe@redhat.com
commit d8dd25a461e4eec7190cb9d66616aceacc5110ad upstream.
When the current frame address (CFA) is stored on the stack (i.e., cfa->base == CFI_SP_INDIRECT), objtool neglects to adjust the stack offset when there are subsequent pushes or pops. This results in bad ORC data at the end of the ENTER_IRQ_STACK macro, when it puts the previous stack pointer on the stack and does a subsequent push.
This fixes the following unwinder warning:
WARNING: can't dereference registers at 00000000f0a6bdba for ip interrupt_entry+0x9f/0xa0
Fixes: 627fce14809b ("objtool: Add ORC unwind table generation") Reported-by: Vince Weaver vincent.weaver@maine.edu Reported-by: Dave Jones dsj@fb.com Reported-by: Steven Rostedt rostedt@goodmis.org Reported-by: Vegard Nossum vegard.nossum@oracle.com Reported-by: Joe Mario jmario@redhat.com Reviewed-by: Miroslav Benes mbenes@suse.cz Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Ingo Molnar mingo@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Jann Horn jannh@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/853d5d691b29e250333332f09b8e27410b2d9924.158780874... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/objtool/check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -1264,7 +1264,7 @@ static int update_insn_state_regs(struct struct cfi_reg *cfa = &state->cfa; struct stack_op *op = &insn->stack_op;
- if (cfa->base != CFI_SP) + if (cfa->base != CFI_SP && cfa->base != CFI_SP_INDIRECT) return 0;
/* push */
From: Ivan Delalande colona@arista.com
commit e08df079b23e2e982df15aa340bfbaf50f297504 upstream.
If the trapping instruction contains a ':', for a memory access through segment registers for example, the sed substitution will insert the '*' marker in the middle of the instruction instead of the line address:
2b: 65 48 0f c7 0f cmpxchg16b %gs:*(%rdi) <-- trapping instruction
I started to think I had forgotten some quirk of the assembly syntax before noticing that it was actually coming from the script. Fix it to add the address marker at the right place for these instructions:
28: 49 8b 06 mov (%r14),%rax 2b:* 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) <-- trapping instruction 30: 0f 94 c0 sete %al
Fixes: 18ff44b189e2 ("scripts/decodecode: make faulting insn ptr more robust") Signed-off-by: Ivan Delalande colona@arista.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Borislav Petkov bp@suse.de Link: http://lkml.kernel.org/r/20200419223653.GA31248@visor Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- scripts/decodecode | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/decodecode +++ b/scripts/decodecode @@ -98,7 +98,7 @@ faultlinenum=$(( $(wc -l $T.oo | cut -d faultline=`cat $T.dis | head -1 | cut -d":" -f2-` faultline=`echo "$faultline" | sed -e 's/[/\[/g; s/]/\]/g'`
-cat $T.oo | sed -e "${faultlinenum}s/^(.*:)(.*)/\1*\2\t\t<-- trapping instruction/" +cat $T.oo | sed -e "${faultlinenum}s/^([^:]*:)(.*)/\1*\2\t\t<-- trapping instruction/" echo cat $T.aa cleanup
From: Kees Cook keescook@chromium.org
commit 7be3cb019db1cbd5fd5ffe6d64a23fefa4b6f229 upstream.
When brk was moved for binaries without an interpreter, it should have been limited to ET_DYN only. In other words, the special case was an ET_DYN that lacks an INTERP, not just an executable that lacks INTERP. The bug manifested for giant static executables, where the brk would end up in the middle of the text area on 32-bit architectures.
Reported-and-tested-by: Richard Kojedzinszky richard@kojedz.in Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Cc: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/binfmt_elf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1107,7 +1107,8 @@ static int load_elf_binary(struct linux_ * (since it grows up, and may collide early with the stack * growing down), and into the unused ELF_ET_DYN_BASE region. */ - if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && !interpreter) + if (IS_ENABLED(CONFIG_ARCH_HAS_ELF_RANDOMIZE) && + loc->elf_ex.e_type == ET_DYN && !interpreter) current->mm->brk = current->mm->start_brk = ELF_ET_DYN_BASE;
From: Shijie Luo luoshijie1@huawei.com
commit af133ade9a40794a37104ecbcc2827c0ea373a3c upstream.
When journal size is set too big by "mkfs.ext4 -J size=", or when we mount a crafted image to make journal inode->i_size too big, the loop, "while (i < num)", holds cpu too long. This could cause soft lockup.
[ 529.357541] Call trace: [ 529.357551] dump_backtrace+0x0/0x198 [ 529.357555] show_stack+0x24/0x30 [ 529.357562] dump_stack+0xa4/0xcc [ 529.357568] watchdog_timer_fn+0x300/0x3e8 [ 529.357574] __hrtimer_run_queues+0x114/0x358 [ 529.357576] hrtimer_interrupt+0x104/0x2d8 [ 529.357580] arch_timer_handler_virt+0x38/0x58 [ 529.357584] handle_percpu_devid_irq+0x90/0x248 [ 529.357588] generic_handle_irq+0x34/0x50 [ 529.357590] __handle_domain_irq+0x68/0xc0 [ 529.357593] gic_handle_irq+0x6c/0x150 [ 529.357595] el1_irq+0xb8/0x140 [ 529.357599] __ll_sc_atomic_add_return_acquire+0x14/0x20 [ 529.357668] ext4_map_blocks+0x64/0x5c0 [ext4] [ 529.357693] ext4_setup_system_zone+0x330/0x458 [ext4] [ 529.357717] ext4_fill_super+0x2170/0x2ba8 [ext4] [ 529.357722] mount_bdev+0x1a8/0x1e8 [ 529.357746] ext4_mount+0x44/0x58 [ext4] [ 529.357748] mount_fs+0x50/0x170 [ 529.357752] vfs_kern_mount.part.9+0x54/0x188 [ 529.357755] do_mount+0x5ac/0xd78 [ 529.357758] ksys_mount+0x9c/0x118 [ 529.357760] __arm64_sys_mount+0x28/0x38 [ 529.357764] el0_svc_common+0x78/0x130 [ 529.357766] el0_svc_handler+0x38/0x78 [ 529.357769] el0_svc+0x8/0xc [ 541.356516] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [mount:18674]
Link: https://lore.kernel.org/r/20200211011752.29242-1-luoshijie1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Shijie Luo luoshijie1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/block_validity.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c index d31d93ee5e76f..45c7b0f9a8e3f 100644 --- a/fs/ext4/block_validity.c +++ b/fs/ext4/block_validity.c @@ -152,6 +152,7 @@ static int ext4_protect_reserved_inode(struct super_block *sb, u32 ino) return PTR_ERR(inode); num = (inode->i_size + sb->s_blocksize - 1) >> sb->s_blocksize_bits; while (i < num) { + cond_resched(); map.m_lblk = i; map.m_len = num - i; n = ext4_map_blocks(NULL, inode, &map, 0);
From: Sabrina Dubroca sd@queasysnail.net
commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e upstream.
This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument").
Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ipv6.h | 2 +- net/dccp/ipv6.c | 6 +++--- net/ipv6/af_inet6.c | 2 +- net/ipv6/datagram.c | 2 +- net/ipv6/inet6_connection_sock.c | 4 ++-- net/ipv6/ip6_output.c | 8 ++++---- net/ipv6/raw.c | 2 +- net/ipv6/syncookies.c | 2 +- net/ipv6/tcp_ipv6.c | 4 ++-- net/l2tp/l2tp_ip6.c | 2 +- net/sctp/ipv6.c | 4 ++-- 11 files changed, 19 insertions(+), 19 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 168009eef5e42..1a48e10ec617d 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -856,7 +856,7 @@ static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
int ip6_dst_lookup(struct net *net, struct sock *sk, struct dst_entry **dst, struct flowi6 *fl6); -struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, +struct dst_entry *ip6_dst_lookup_flow(struct net *net, const struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst); struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst); diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 87c513b5ff2eb..9438873fc3c87 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -209,7 +209,7 @@ static int dccp_v6_send_response(const struct sock *sk, struct request_sock *req final_p = fl6_update_dst(&fl6, rcu_dereference(np->opt), &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); dst = NULL; @@ -280,7 +280,7 @@ static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb) security_skb_classify_flow(rxskb, flowi6_to_flowi(&fl6));
/* sk = NULL, but it is safe for now. RST socket required. */ - dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL); + dst = ip6_dst_lookup_flow(sock_net(ctl_sk), ctl_sk, &fl6, NULL); if (!IS_ERR(dst)) { skb_dst_set(skb, dst); ip6_xmit(ctl_sk, skb, &fl6, 0, NULL, 0); @@ -889,7 +889,7 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr, opt = rcu_dereference_protected(np->opt, lockdep_sock_is_held(sk)); final_p = fl6_update_dst(&fl6, opt, &final);
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto failure; diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 8885dbad217b1..2e91637c9d491 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -702,7 +702,7 @@ int inet6_sk_rebuild_header(struct sock *sk) &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { sk->sk_route_caps = 0; sk->sk_err_soft = -PTR_ERR(dst); diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 956af11e9ba3e..58929622de0ea 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -87,7 +87,7 @@ int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr) final_p = fl6_update_dst(&fl6, opt, &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c index 798a0950e9a62..b760ccec44d3c 100644 --- a/net/ipv6/inet6_connection_sock.c +++ b/net/ipv6/inet6_connection_sock.c @@ -90,7 +90,7 @@ struct dst_entry *inet6_csk_route_req(const struct sock *sk, fl6->fl6_sport = htons(ireq->ir_num); security_req_classify_flow(req, flowi6_to_flowi(fl6));
- dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p); if (IS_ERR(dst)) return NULL;
@@ -144,7 +144,7 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
dst = __inet6_csk_dst_check(sk, np->dst_cookie); if (!dst) { - dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
if (!IS_ERR(dst)) ip6_dst_store(sk, dst, NULL, NULL); diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 11407dd6bc7c0..d93a98dfe52df 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1075,19 +1075,19 @@ EXPORT_SYMBOL_GPL(ip6_dst_lookup); * It returns a valid dst pointer on success, or a pointer encoded * error code. */ -struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, +struct dst_entry *ip6_dst_lookup_flow(struct net *net, const struct sock *sk, struct flowi6 *fl6, const struct in6_addr *final_dst) { struct dst_entry *dst = NULL; int err;
- err = ip6_dst_lookup_tail(sock_net(sk), sk, &dst, fl6); + err = ip6_dst_lookup_tail(net, sk, &dst, fl6); if (err) return ERR_PTR(err); if (final_dst) fl6->daddr = *final_dst;
- return xfrm_lookup_route(sock_net(sk), dst, flowi6_to_flowi(fl6), sk, 0); + return xfrm_lookup_route(net, dst, flowi6_to_flowi(fl6), sk, 0); } EXPORT_SYMBOL_GPL(ip6_dst_lookup_flow);
@@ -1112,7 +1112,7 @@ struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6,
dst = ip6_sk_dst_check(sk, dst, fl6); if (!dst) - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_dst);
return dst; } diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 301978df650e6..47acd20d4e1fc 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -920,7 +920,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 7a86433d8896b..4834015b27f49 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -230,7 +230,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) fl6.fl6_sport = inet_sk(sk)->inet_sport; security_req_classify_flow(req, flowi6_to_flowi(&fl6));
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) goto out_free; } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 4953466cf98f0..7b336b7803ffa 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -244,7 +244,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
security_sk_classify_flow(sk, flowi6_to_flowi(&fl6));
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto failure; @@ -841,7 +841,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 * Underlying function will use this to retrieve the network * namespace */ - dst = ip6_dst_lookup_flow(ctl_sk, &fl6, NULL); + dst = ip6_dst_lookup_flow(sock_net(ctl_sk), ctl_sk, &fl6, NULL); if (!IS_ERR(dst)) { skb_dst_set(buff, dst); ip6_xmit(ctl_sk, buff, &fl6, fl6.flowi6_mark, NULL, tclass); diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index 423cb095ad37c..28274f397c55e 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -620,7 +620,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
- dst = ip6_dst_lookup_flow(sk, &fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p); if (IS_ERR(dst)) { err = PTR_ERR(dst); goto out; diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 34ab7f92f0643..50bc8c4ca9068 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -269,7 +269,7 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr, final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final); rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sk, fl6, final_p); + dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p); if (!asoc || saddr) { t->dst = dst; memcpy(fl, &_fl, sizeof(_fl)); @@ -327,7 +327,7 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr, fl6->saddr = laddr->a.v6.sin6_addr; fl6->fl6_sport = laddr->a.v6.sin6_port; final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final); - bdst = ip6_dst_lookup_flow(sk, fl6, final_p); + bdst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
if (IS_ERR(bdst)) continue;
From: Sabrina Dubroca sd@queasysnail.net
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2 upstream.
ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely.
All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route().
This requires some changes in all the callers, as these two functions take different arguments and have different return types.
Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu xmu@redhat.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.9: - Drop changes in lwt_bpf.c and mlx5 - Initialise "dst" in drivers/infiniband/core/addr.c:addr_resolve() to avoid introducing a spurious "may be used uninitialised" warning - Adjust filename, context, indentation] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/addr.c | 8 ++++---- drivers/infiniband/sw/rxe/rxe_net.c | 8 +++++--- drivers/net/geneve.c | 4 +++- drivers/net/vxlan.c | 10 ++++------ include/net/addrconf.h | 6 ++++-- net/ipv6/addrconf_core.c | 11 ++++++----- net/ipv6/af_inet6.c | 2 +- net/mpls/af_mpls.c | 7 +++---- net/tipc/udp_media.c | 9 ++++++--- 9 files changed, 36 insertions(+), 29 deletions(-)
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index f7d23c1081dc4..68eed45b86003 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -453,16 +453,15 @@ static int addr6_resolve(struct sockaddr_in6 *src_in, struct flowi6 fl6; struct dst_entry *dst; struct rt6_info *rt; - int ret;
memset(&fl6, 0, sizeof fl6); fl6.daddr = dst_in->sin6_addr; fl6.saddr = src_in->sin6_addr; fl6.flowi6_oif = addr->bound_dev_if;
- ret = ipv6_stub->ipv6_dst_lookup(addr->net, NULL, &dst, &fl6); - if (ret < 0) - return ret; + dst = ipv6_stub->ipv6_dst_lookup_flow(addr->net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + return PTR_ERR(dst);
rt = (struct rt6_info *)dst; if (ipv6_addr_any(&src_in->sin6_addr)) { @@ -552,6 +551,7 @@ static int addr_resolve(struct sockaddr *src_in, const struct sockaddr_in6 *dst_in6 = (const struct sockaddr_in6 *)dst_in;
+ dst = NULL; ret = addr6_resolve((struct sockaddr_in6 *)src_in, dst_in6, addr, &dst); diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c index f4f3942ebbd13..d19e003e8381e 100644 --- a/drivers/infiniband/sw/rxe/rxe_net.c +++ b/drivers/infiniband/sw/rxe/rxe_net.c @@ -182,10 +182,12 @@ static struct dst_entry *rxe_find_route6(struct net_device *ndev, memcpy(&fl6.daddr, daddr, sizeof(*daddr)); fl6.flowi6_proto = IPPROTO_UDP;
- if (unlikely(ipv6_stub->ipv6_dst_lookup(sock_net(recv_sockets.sk6->sk), - recv_sockets.sk6->sk, &ndst, &fl6))) { + ndst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(recv_sockets.sk6->sk), + recv_sockets.sk6->sk, &fl6, + NULL); + if (unlikely(IS_ERR(ndst))) { pr_err_ratelimited("no route to %pI6\n", daddr); - goto put; + return NULL; }
if (unlikely(ndst->error)) { diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index 92ad43e53c722..35d8c636de123 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -835,7 +835,9 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff *skb, return dst; }
- if (ipv6_stub->ipv6_dst_lookup(geneve->net, gs6->sock->sk, &dst, fl6)) { + dst = ipv6_stub->ipv6_dst_lookup_flow(geneve->net, gs6->sock->sk, fl6, + NULL); + if (IS_ERR(dst)) { netdev_dbg(dev, "no route to %pI6\n", &fl6->daddr); return ERR_PTR(-ENETUNREACH); } diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index bc4542d9a08d3..58ddb6c904185 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1881,7 +1881,6 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan, bool use_cache = ip_tunnel_dst_cache_usable(skb, info); struct dst_entry *ndst; struct flowi6 fl6; - int err;
if (!sock6) return ERR_PTR(-EIO); @@ -1902,11 +1901,10 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan, fl6.flowi6_mark = skb->mark; fl6.flowi6_proto = IPPROTO_UDP;
- err = ipv6_stub->ipv6_dst_lookup(vxlan->net, - sock6->sock->sk, - &ndst, &fl6); - if (err < 0) - return ERR_PTR(err); + ndst = ipv6_stub->ipv6_dst_lookup_flow(vxlan->net, sock6->sock->sk, + &fl6, NULL); + if (unlikely(IS_ERR(ndst))) + return ERR_PTR(-ENETUNREACH);
*saddr = fl6.saddr; if (use_cache) diff --git a/include/net/addrconf.h b/include/net/addrconf.h index b8ee8a113e324..019b06c035a84 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -206,8 +206,10 @@ struct ipv6_stub { const struct in6_addr *addr); int (*ipv6_sock_mc_drop)(struct sock *sk, int ifindex, const struct in6_addr *addr); - int (*ipv6_dst_lookup)(struct net *net, struct sock *sk, - struct dst_entry **dst, struct flowi6 *fl6); + struct dst_entry *(*ipv6_dst_lookup_flow)(struct net *net, + const struct sock *sk, + struct flowi6 *fl6, + const struct in6_addr *final_dst); void (*udpv6_encap_enable)(void); void (*ndisc_send_na)(struct net_device *dev, const struct in6_addr *daddr, const struct in6_addr *solicited_addr, diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c index bfa941fc11650..129324b36fb60 100644 --- a/net/ipv6/addrconf_core.c +++ b/net/ipv6/addrconf_core.c @@ -107,15 +107,16 @@ int inet6addr_notifier_call_chain(unsigned long val, void *v) } EXPORT_SYMBOL(inet6addr_notifier_call_chain);
-static int eafnosupport_ipv6_dst_lookup(struct net *net, struct sock *u1, - struct dst_entry **u2, - struct flowi6 *u3) +static struct dst_entry *eafnosupport_ipv6_dst_lookup_flow(struct net *net, + const struct sock *sk, + struct flowi6 *fl6, + const struct in6_addr *final_dst) { - return -EAFNOSUPPORT; + return ERR_PTR(-EAFNOSUPPORT); }
const struct ipv6_stub *ipv6_stub __read_mostly = &(struct ipv6_stub) { - .ipv6_dst_lookup = eafnosupport_ipv6_dst_lookup, + .ipv6_dst_lookup_flow = eafnosupport_ipv6_dst_lookup_flow, }; EXPORT_SYMBOL_GPL(ipv6_stub);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 2e91637c9d491..c6746aaf7fbfb 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -860,7 +860,7 @@ static struct pernet_operations inet6_net_ops = { static const struct ipv6_stub ipv6_stub_impl = { .ipv6_sock_mc_join = ipv6_sock_mc_join, .ipv6_sock_mc_drop = ipv6_sock_mc_drop, - .ipv6_dst_lookup = ip6_dst_lookup, + .ipv6_dst_lookup_flow = ip6_dst_lookup_flow, .udpv6_encap_enable = udpv6_encap_enable, .ndisc_send_na = ndisc_send_na, .nd_tbl = &nd_tbl, diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c index ffab94d61e1da..eab9c1d70856b 100644 --- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -497,16 +497,15 @@ static struct net_device *inet6_fib_lookup_dev(struct net *net, struct net_device *dev; struct dst_entry *dst; struct flowi6 fl6; - int err;
if (!ipv6_stub) return ERR_PTR(-EAFNOSUPPORT);
memset(&fl6, 0, sizeof(fl6)); memcpy(&fl6.daddr, addr, sizeof(struct in6_addr)); - err = ipv6_stub->ipv6_dst_lookup(net, NULL, &dst, &fl6); - if (err) - return ERR_PTR(err); + dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL); + if (IS_ERR(dst)) + return ERR_CAST(dst);
dev = dst->dev; dev_hold(dev); diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c index 05033ab05b8f3..c6ff3de1de37b 100644 --- a/net/tipc/udp_media.c +++ b/net/tipc/udp_media.c @@ -187,10 +187,13 @@ static int tipc_udp_xmit(struct net *net, struct sk_buff *skb, .saddr = src->ipv6, .flowi6_proto = IPPROTO_UDP }; - err = ipv6_stub->ipv6_dst_lookup(net, ub->ubsock->sk, &ndst, - &fl6); - if (err) + ndst = ipv6_stub->ipv6_dst_lookup_flow(net, + ub->ubsock->sk, + &fl6, NULL); + if (IS_ERR(ndst)) { + err = PTR_ERR(ndst); goto tx_error; + } ttl = ip6_dst_hoplimit(ndst); err = udp_tunnel6_xmit_skb(ndst, ub->ubsock->sk, skb, NULL, &src->ipv6, &dst->ipv6, 0, ttl, 0,
From: Waiman Long longman@redhat.com
commit 5acb3cc2c2e9d3020a4fee43763c6463767f1572 upstream.
The lockdep code had reported the following unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(s_active#228); lock(&bdev->bd_mutex/1); lock(s_active#228); lock(&bdev->bd_mutex);
*** DEADLOCK ***
The deadlock may happen when one task (CPU1) is trying to delete a partition in a block device and another task (CPU0) is accessing tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that partition.
The s_active isn't an actual lock. It is a reference count (kn->count) on the sysfs (kernfs) file. Removal of a sysfs file, however, require a wait until all the references are gone. The reference count is treated like a rwsem using lockdep instrumentation code.
The fact that a thread is in the sysfs callback method or in the ioctl call means there is a reference to the opended sysfs or device file. That should prevent the underlying block structure from being removed.
Instead of using bd_mutex in the block_device structure, a new blk_trace_mutex is now added to the request_queue structure to protect access to the blk_trace structure.
Suggested-by: Christoph Hellwig hch@infradead.org Signed-off-by: Waiman Long longman@redhat.com Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org
Fix typo in patch subject line, and prune a comment detailing how the code used to work.
Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-core.c | 3 +++ include/linux/blkdev.h | 1 + kernel/trace/blktrace.c | 18 ++++++++++++------ 3 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c index bdb906bbfe198..4987f312a95f4 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -729,6 +729,9 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
kobject_init(&q->kobj, &blk_queue_ktype);
+#ifdef CONFIG_BLK_DEV_IO_TRACE + mutex_init(&q->blk_trace_mutex); +#endif mutex_init(&q->sysfs_lock); spin_lock_init(&q->__queue_lock);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 2fc4ba6fa07f9..a8dfbad42d1b0 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -446,6 +446,7 @@ struct request_queue { int node; #ifdef CONFIG_BLK_DEV_IO_TRACE struct blk_trace *blk_trace; + struct mutex blk_trace_mutex; #endif /* * for flush operations diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index bfa8bb3a6e196..ff1384c5884c5 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -644,6 +644,12 @@ int blk_trace_startstop(struct request_queue *q, int start) } EXPORT_SYMBOL_GPL(blk_trace_startstop);
+/* + * When reading or writing the blktrace sysfs files, the references to the + * opened sysfs or device files should prevent the underlying block device + * from being removed. So no further delete protection is really needed. + */ + /** * blk_trace_ioctl: - handle the ioctls associated with tracing * @bdev: the block device @@ -661,7 +667,7 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) if (!q) return -ENXIO;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
switch (cmd) { case BLKTRACESETUP: @@ -687,7 +693,7 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) break; }
- mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); return ret; }
@@ -1656,7 +1662,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev, if (q == NULL) goto out_bdput;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
if (attr == &dev_attr_enable) { ret = sprintf(buf, "%u\n", !!q->blk_trace); @@ -1675,7 +1681,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev, ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba);
out_unlock_bdev: - mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); out_bdput: bdput(bdev); out: @@ -1717,7 +1723,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, if (q == NULL) goto out_bdput;
- mutex_lock(&bdev->bd_mutex); + mutex_lock(&q->blk_trace_mutex);
if (attr == &dev_attr_enable) { if (!!value == !!q->blk_trace) { @@ -1747,7 +1753,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, }
out_unlock_bdev: - mutex_unlock(&bdev->bd_mutex); + mutex_unlock(&q->blk_trace_mutex); out_bdput: bdput(bdev); out:
From: Jens Axboe axboe@kernel.dk
commit 1f2cac107c591c24b60b115d6050adc213d10fc0 upstream.
sg.c calls into the blktrace functions without holding the proper queue mutex for doing setup, start/stop, or teardown.
Add internal unlocked variants, and export the ones that do the proper locking.
Fixes: 6da127ad0918 ("blktrace: Add blktrace ioctls to SCSI generic devices") Tested-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/blktrace.c | 58 ++++++++++++++++++++++++++++++++++------- 1 file changed, 48 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index ff1384c5884c5..55337d797deb1 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -329,7 +329,7 @@ static void blk_trace_cleanup(struct blk_trace *bt) put_probe_ref(); }
-int blk_trace_remove(struct request_queue *q) +static int __blk_trace_remove(struct request_queue *q) { struct blk_trace *bt;
@@ -342,6 +342,17 @@ int blk_trace_remove(struct request_queue *q)
return 0; } + +int blk_trace_remove(struct request_queue *q) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_remove(q); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_remove);
static ssize_t blk_dropped_read(struct file *filp, char __user *buffer, @@ -546,9 +557,8 @@ err: return ret; }
-int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, - struct block_device *bdev, - char __user *arg) +static int __blk_trace_setup(struct request_queue *q, char *name, dev_t dev, + struct block_device *bdev, char __user *arg) { struct blk_user_trace_setup buts; int ret; @@ -567,6 +577,19 @@ int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, } return 0; } + +int blk_trace_setup(struct request_queue *q, char *name, dev_t dev, + struct block_device *bdev, + char __user *arg) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_setup(q, name, dev, bdev, arg); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_setup);
#if defined(CONFIG_COMPAT) && defined(CONFIG_X86_64) @@ -603,7 +626,7 @@ static int compat_blk_trace_setup(struct request_queue *q, char *name, } #endif
-int blk_trace_startstop(struct request_queue *q, int start) +static int __blk_trace_startstop(struct request_queue *q, int start) { int ret; struct blk_trace *bt = q->blk_trace; @@ -642,6 +665,17 @@ int blk_trace_startstop(struct request_queue *q, int start)
return ret; } + +int blk_trace_startstop(struct request_queue *q, int start) +{ + int ret; + + mutex_lock(&q->blk_trace_mutex); + ret = __blk_trace_startstop(q, start); + mutex_unlock(&q->blk_trace_mutex); + + return ret; +} EXPORT_SYMBOL_GPL(blk_trace_startstop);
/* @@ -672,7 +706,7 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) switch (cmd) { case BLKTRACESETUP: bdevname(bdev, b); - ret = blk_trace_setup(q, b, bdev->bd_dev, bdev, arg); + ret = __blk_trace_setup(q, b, bdev->bd_dev, bdev, arg); break; #if defined(CONFIG_COMPAT) && defined(CONFIG_X86_64) case BLKTRACESETUP32: @@ -683,10 +717,10 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) case BLKTRACESTART: start = 1; case BLKTRACESTOP: - ret = blk_trace_startstop(q, start); + ret = __blk_trace_startstop(q, start); break; case BLKTRACETEARDOWN: - ret = blk_trace_remove(q); + ret = __blk_trace_remove(q); break; default: ret = -ENOTTY; @@ -704,10 +738,14 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) **/ void blk_trace_shutdown(struct request_queue *q) { + mutex_lock(&q->blk_trace_mutex); + if (q->blk_trace) { - blk_trace_startstop(q, 0); - blk_trace_remove(q); + __blk_trace_startstop(q, 0); + __blk_trace_remove(q); } + + mutex_unlock(&q->blk_trace_mutex); }
/*
From: Jens Axboe axboe@kernel.dk
commit 2967acbb257a6a9bf912f4778b727e00972eac9b upstream.
A previous commit changed the locking around registration/cleanup, but direct callers of blk_trace_remove() were missed. This means that if we hit the error path in setup, we will deadlock on attempting to re-acquire the queue trace mutex.
Fixes: 1f2cac107c59 ("blktrace: fix unlocked access to init/start-stop/teardown") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/blktrace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 55337d797deb1..a88e677c74f31 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -572,7 +572,7 @@ static int __blk_trace_setup(struct request_queue *q, char *name, dev_t dev, return ret;
if (copy_to_user(arg, &buts, sizeof(buts))) { - blk_trace_remove(q); + __blk_trace_remove(q); return -EFAULT; } return 0; @@ -618,7 +618,7 @@ static int compat_blk_trace_setup(struct request_queue *q, char *name, return ret;
if (copy_to_user(arg, &buts.name, ARRAY_SIZE(buts.name))) { - blk_trace_remove(q); + __blk_trace_remove(q); return -EFAULT; }
From: Jan Kara jack@suse.cz
commit c780e86dd48ef6467a1146cf7d0fe1e05a635039 upstream.
KASAN is reporting that __blk_add_trace() has a use-after-free issue when accessing q->blk_trace. Indeed the switching of block tracing (and thus eventual freeing of q->blk_trace) is completely unsynchronized with the currently running tracing and thus it can happen that the blk_trace structure is being freed just while __blk_add_trace() works on it. Protect accesses to q->blk_trace by RCU during tracing and make sure we wait for the end of RCU grace period when shutting down tracing. Luckily that is rare enough event that we can afford that. Note that postponing the freeing of blk_trace to an RCU callback should better be avoided as it could have unexpected user visible side-effects as debugfs files would be still existing for a short while block tracing has been shut down.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711 CC: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Reviewed-by: Ming Lei ming.lei@redhat.com Tested-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reported-by: Tristan Madani tristmd@gmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/blkdev.h | 2 +- include/linux/blktrace_api.h | 18 ++++-- kernel/trace/blktrace.c | 110 +++++++++++++++++++++++++---------- 3 files changed, 94 insertions(+), 36 deletions(-)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index a8dfbad42d1b0..060881478e59e 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -445,7 +445,7 @@ struct request_queue { unsigned int sg_reserved_size; int node; #ifdef CONFIG_BLK_DEV_IO_TRACE - struct blk_trace *blk_trace; + struct blk_trace __rcu *blk_trace; struct mutex blk_trace_mutex; #endif /* diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h index cceb72f9e29f5..45fb00427306f 100644 --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -51,18 +51,26 @@ void __trace_note_message(struct blk_trace *, const char *fmt, ...); **/ #define blk_add_trace_msg(q, fmt, ...) \ do { \ - struct blk_trace *bt = (q)->blk_trace; \ + struct blk_trace *bt; \ + \ + rcu_read_lock(); \ + bt = rcu_dereference((q)->blk_trace); \ if (unlikely(bt)) \ __trace_note_message(bt, fmt, ##__VA_ARGS__); \ + rcu_read_unlock(); \ } while (0) #define BLK_TN_MAX_MSG 128
static inline bool blk_trace_note_message_enabled(struct request_queue *q) { - struct blk_trace *bt = q->blk_trace; - if (likely(!bt)) - return false; - return bt->act_mask & BLK_TC_NOTIFY; + struct blk_trace *bt; + bool ret; + + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + ret = bt && (bt->act_mask & BLK_TC_NOTIFY); + rcu_read_unlock(); + return ret; }
extern void blk_add_driver_data(struct request_queue *q, struct request *rq, diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index a88e677c74f31..78a896acd21ac 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -325,6 +325,7 @@ static void put_probe_ref(void)
static void blk_trace_cleanup(struct blk_trace *bt) { + synchronize_rcu(); blk_trace_free(bt); put_probe_ref(); } @@ -629,8 +630,10 @@ static int compat_blk_trace_setup(struct request_queue *q, char *name, static int __blk_trace_startstop(struct request_queue *q, int start) { int ret; - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (bt == NULL) return -EINVAL;
@@ -739,8 +742,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) void blk_trace_shutdown(struct request_queue *q) { mutex_lock(&q->blk_trace_mutex); - - if (q->blk_trace) { + if (rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex))) { __blk_trace_startstop(q, 0); __blk_trace_remove(q); } @@ -766,10 +769,14 @@ void blk_trace_shutdown(struct request_queue *q) static void blk_add_trace_rq(struct request_queue *q, struct request *rq, unsigned int nr_bytes, u32 what) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
if (rq->cmd_type == REQ_TYPE_BLOCK_PC) { what |= BLK_TC_ACT(BLK_TC_PC); @@ -780,6 +787,7 @@ static void blk_add_trace_rq(struct request_queue *q, struct request *rq, __blk_add_trace(bt, blk_rq_pos(rq), nr_bytes, req_op(rq), rq->cmd_flags, what, rq->errors, 0, NULL); } + rcu_read_unlock(); }
static void blk_add_trace_rq_abort(void *ignore, @@ -829,13 +837,18 @@ static void blk_add_trace_rq_complete(void *ignore, static void blk_add_trace_bio(struct request_queue *q, struct bio *bio, u32 what, int error) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf, what, error, 0, NULL); + rcu_read_unlock(); }
static void blk_add_trace_bio_bounce(void *ignore, @@ -880,11 +893,14 @@ static void blk_add_trace_getrq(void *ignore, if (bio) blk_add_trace_bio(q, bio, BLK_TA_GETRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0, NULL); + rcu_read_unlock(); } }
@@ -896,27 +912,35 @@ static void blk_add_trace_sleeprq(void *ignore, if (bio) blk_add_trace_bio(q, bio, BLK_TA_SLEEPRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ, 0, 0, NULL); + rcu_read_unlock(); } }
static void blk_add_trace_plug(void *ignore, struct request_queue *q) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL); + rcu_read_unlock(); }
static void blk_add_trace_unplug(void *ignore, struct request_queue *q, unsigned int depth, bool explicit) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(depth); u32 what; @@ -928,14 +952,17 @@ static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
__blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu); } + rcu_read_unlock(); }
static void blk_add_trace_split(void *ignore, struct request_queue *q, struct bio *bio, unsigned int pdu) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
+ rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(pdu);
@@ -944,6 +971,7 @@ static void blk_add_trace_split(void *ignore, BLK_TA_SPLIT, bio->bi_error, sizeof(rpdu), &rpdu); } + rcu_read_unlock(); }
/** @@ -963,11 +991,15 @@ static void blk_add_trace_bio_remap(void *ignore, struct request_queue *q, struct bio *bio, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(bio->bi_bdev->bd_dev); @@ -976,6 +1008,7 @@ static void blk_add_trace_bio_remap(void *ignore, __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_error, sizeof(r), &r); + rcu_read_unlock(); }
/** @@ -996,11 +1029,15 @@ static void blk_add_trace_rq_remap(void *ignore, struct request *rq, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(disk_devt(rq->rq_disk)); @@ -1009,6 +1046,7 @@ static void blk_add_trace_rq_remap(void *ignore, __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq), rq_data_dir(rq), 0, BLK_TA_REMAP, !!rq->errors, sizeof(r), &r); + rcu_read_unlock(); }
/** @@ -1026,10 +1064,14 @@ void blk_add_driver_data(struct request_queue *q, struct request *rq, void *data, size_t len) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt;
- if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + }
if (rq->cmd_type == REQ_TYPE_BLOCK_PC) __blk_add_trace(bt, 0, blk_rq_bytes(rq), 0, 0, @@ -1037,6 +1079,7 @@ void blk_add_driver_data(struct request_queue *q, else __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq), 0, 0, BLK_TA_DRV_DATA, rq->errors, len, data); + rcu_read_unlock(); } EXPORT_SYMBOL_GPL(blk_add_driver_data);
@@ -1529,6 +1572,7 @@ static int blk_trace_remove_queue(struct request_queue *q) return -EINVAL;
put_probe_ref(); + synchronize_rcu(); blk_trace_free(bt); return 0; } @@ -1690,6 +1734,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev, struct hd_struct *p = dev_to_part(dev); struct request_queue *q; struct block_device *bdev; + struct blk_trace *bt; ssize_t ret = -ENXIO;
bdev = bdget(part_devt(p)); @@ -1702,21 +1747,23 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
mutex_lock(&q->blk_trace_mutex);
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - ret = sprintf(buf, "%u\n", !!q->blk_trace); + ret = sprintf(buf, "%u\n", !!bt); goto out_unlock_bdev; }
- if (q->blk_trace == NULL) + if (bt == NULL) ret = sprintf(buf, "disabled\n"); else if (attr == &dev_attr_act_mask) - ret = blk_trace_mask2str(buf, q->blk_trace->act_mask); + ret = blk_trace_mask2str(buf, bt->act_mask); else if (attr == &dev_attr_pid) - ret = sprintf(buf, "%u\n", q->blk_trace->pid); + ret = sprintf(buf, "%u\n", bt->pid); else if (attr == &dev_attr_start_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->start_lba); + ret = sprintf(buf, "%llu\n", bt->start_lba); else if (attr == &dev_attr_end_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba); + ret = sprintf(buf, "%llu\n", bt->end_lba);
out_unlock_bdev: mutex_unlock(&q->blk_trace_mutex); @@ -1733,6 +1780,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, struct block_device *bdev; struct request_queue *q; struct hd_struct *p; + struct blk_trace *bt; u64 value; ssize_t ret = -EINVAL;
@@ -1763,8 +1811,10 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
mutex_lock(&q->blk_trace_mutex);
+ bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - if (!!value == !!q->blk_trace) { + if (!!value == !!bt) { ret = 0; goto out_unlock_bdev; } @@ -1776,18 +1826,18 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, }
ret = 0; - if (q->blk_trace == NULL) + if (bt == NULL) ret = blk_trace_setup_queue(q, bdev);
if (ret == 0) { if (attr == &dev_attr_act_mask) - q->blk_trace->act_mask = value; + bt->act_mask = value; else if (attr == &dev_attr_pid) - q->blk_trace->pid = value; + bt->pid = value; else if (attr == &dev_attr_start_lba) - q->blk_trace->start_lba = value; + bt->start_lba = value; else if (attr == &dev_attr_end_lba) - q->blk_trace->end_lba = value; + bt->end_lba = value; }
out_unlock_bdev:
From: Cengiz Can cengiz@kernel.wtf
commit 153031a301bb07194e9c37466cfce8eacb977621 upstream.
There was a recent change in blktrace.c that added a RCU protection to `q->blk_trace` in order to fix a use-after-free issue during access.
However the change missed an edge case that can lead to dereferencing of `bt` pointer even when it's NULL:
Coverity static analyzer marked this as a FORWARD_NULL issue with CID 1460458.
``` /kernel/trace/blktrace.c: 1904 in sysfs_blk_trace_attr_store() 1898 ret = 0; 1899 if (bt == NULL) 1900 ret = blk_trace_setup_queue(q, bdev); 1901 1902 if (ret == 0) { 1903 if (attr == &dev_attr_act_mask)
CID 1460458: Null pointer dereferences (FORWARD_NULL) Dereferencing null pointer "bt".
1904 bt->act_mask = value; 1905 else if (attr == &dev_attr_pid) 1906 bt->pid = value; 1907 else if (attr == &dev_attr_start_lba) 1908 bt->start_lba = value; 1909 else if (attr == &dev_attr_end_lba) ```
Added a reassignment with RCU annotation to fix the issue.
Fixes: c780e86dd48 ("blktrace: Protect q->blk_trace with RCU") Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Bob Liu bob.liu@oracle.com Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Cengiz Can cengiz@kernel.wtf Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/blktrace.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 78a896acd21ac..6d3b432a748a6 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -1826,8 +1826,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, }
ret = 0; - if (bt == NULL) + if (bt == NULL) { ret = blk_trace_setup_queue(q, bdev); + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); + }
if (ret == 0) { if (attr == &dev_attr_act_mask)
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit 882f312dc0751c973db26478f07f082c584d16aa upstream.
We do not need explicitly call dev_set_drvdata(), as it is done for us by device_create().
Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_clock.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 2aa5b37cc6d24..08f304b83ad13 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -220,8 +220,6 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, if (IS_ERR(ptp->dev)) goto no_device;
- dev_set_drvdata(ptp->dev, ptp); - err = ptp_populate_sysfs(ptp); if (err) goto no_sysfs;
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit af59e717d5ff9c8dbf9bcc581c0dfb3b2a9c9030 upstream.
Instead of creating selected attributes after the device is created (and after userspace potentially seen uevent), lets use attribute group is_visible() method to control which attributes are shown. This will allow us to create all attributes (except "pins" group, which will be taken care of later) before userspace gets notified about new ptp class device.
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_sysfs.c | 125 ++++++++++++++++++---------------------- 1 file changed, 55 insertions(+), 70 deletions(-)
diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c index 302e626fe6b01..a55a6eb4dfde9 100644 --- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -46,27 +46,6 @@ PTP_SHOW_INT(n_periodic_outputs, n_per_out); PTP_SHOW_INT(n_programmable_pins, n_pins); PTP_SHOW_INT(pps_available, pps);
-static struct attribute *ptp_attrs[] = { - &dev_attr_clock_name.attr, - &dev_attr_max_adjustment.attr, - &dev_attr_n_alarms.attr, - &dev_attr_n_external_timestamps.attr, - &dev_attr_n_periodic_outputs.attr, - &dev_attr_n_programmable_pins.attr, - &dev_attr_pps_available.attr, - NULL, -}; - -static const struct attribute_group ptp_group = { - .attrs = ptp_attrs, -}; - -const struct attribute_group *ptp_groups[] = { - &ptp_group, - NULL, -}; - - static ssize_t extts_enable_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -91,6 +70,7 @@ static ssize_t extts_enable_store(struct device *dev, out: return err; } +static DEVICE_ATTR(extts_enable, 0220, NULL, extts_enable_store);
static ssize_t extts_fifo_show(struct device *dev, struct device_attribute *attr, char *page) @@ -124,6 +104,7 @@ out: mutex_unlock(&ptp->tsevq_mux); return cnt; } +static DEVICE_ATTR(fifo, 0444, extts_fifo_show, NULL);
static ssize_t period_store(struct device *dev, struct device_attribute *attr, @@ -151,6 +132,7 @@ static ssize_t period_store(struct device *dev, out: return err; } +static DEVICE_ATTR(period, 0220, NULL, period_store);
static ssize_t pps_enable_store(struct device *dev, struct device_attribute *attr, @@ -177,6 +159,57 @@ static ssize_t pps_enable_store(struct device *dev, out: return err; } +static DEVICE_ATTR(pps_enable, 0220, NULL, pps_enable_store); + +static struct attribute *ptp_attrs[] = { + &dev_attr_clock_name.attr, + + &dev_attr_max_adjustment.attr, + &dev_attr_n_alarms.attr, + &dev_attr_n_external_timestamps.attr, + &dev_attr_n_periodic_outputs.attr, + &dev_attr_n_programmable_pins.attr, + &dev_attr_pps_available.attr, + + &dev_attr_extts_enable.attr, + &dev_attr_fifo.attr, + &dev_attr_period.attr, + &dev_attr_pps_enable.attr, + NULL +}; + +static umode_t ptp_is_attribute_visible(struct kobject *kobj, + struct attribute *attr, int n) +{ + struct device *dev = kobj_to_dev(kobj); + struct ptp_clock *ptp = dev_get_drvdata(dev); + struct ptp_clock_info *info = ptp->info; + umode_t mode = attr->mode; + + if (attr == &dev_attr_extts_enable.attr || + attr == &dev_attr_fifo.attr) { + if (!info->n_ext_ts) + mode = 0; + } else if (attr == &dev_attr_period.attr) { + if (!info->n_per_out) + mode = 0; + } else if (attr == &dev_attr_pps_enable.attr) { + if (!info->pps) + mode = 0; + } + + return mode; +} + +static const struct attribute_group ptp_group = { + .is_visible = ptp_is_attribute_visible, + .attrs = ptp_attrs, +}; + +const struct attribute_group *ptp_groups[] = { + &ptp_group, + NULL +};
static int ptp_pin_name2index(struct ptp_clock *ptp, const char *name) { @@ -235,26 +268,11 @@ static ssize_t ptp_pin_store(struct device *dev, struct device_attribute *attr, return count; }
-static DEVICE_ATTR(extts_enable, 0220, NULL, extts_enable_store); -static DEVICE_ATTR(fifo, 0444, extts_fifo_show, NULL); -static DEVICE_ATTR(period, 0220, NULL, period_store); -static DEVICE_ATTR(pps_enable, 0220, NULL, pps_enable_store); - int ptp_cleanup_sysfs(struct ptp_clock *ptp) { struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info;
- if (info->n_ext_ts) { - device_remove_file(dev, &dev_attr_extts_enable); - device_remove_file(dev, &dev_attr_fifo); - } - if (info->n_per_out) - device_remove_file(dev, &dev_attr_period); - - if (info->pps) - device_remove_file(dev, &dev_attr_pps_enable); - if (info->n_pins) { sysfs_remove_group(&dev->kobj, &ptp->pin_attr_group); kfree(ptp->pin_attr); @@ -307,46 +325,13 @@ no_dev_attr:
int ptp_populate_sysfs(struct ptp_clock *ptp) { - struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info; int err;
- if (info->n_ext_ts) { - err = device_create_file(dev, &dev_attr_extts_enable); - if (err) - goto out1; - err = device_create_file(dev, &dev_attr_fifo); - if (err) - goto out2; - } - if (info->n_per_out) { - err = device_create_file(dev, &dev_attr_period); - if (err) - goto out3; - } - if (info->pps) { - err = device_create_file(dev, &dev_attr_pps_enable); - if (err) - goto out4; - } if (info->n_pins) { err = ptp_populate_pins(ptp); if (err) - goto out5; + return err; } return 0; -out5: - if (info->pps) - device_remove_file(dev, &dev_attr_pps_enable); -out4: - if (info->n_per_out) - device_remove_file(dev, &dev_attr_period); -out3: - if (info->n_ext_ts) - device_remove_file(dev, &dev_attr_fifo); -out2: - if (info->n_ext_ts) - device_remove_file(dev, &dev_attr_extts_enable); -out1: - return err; }
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit 85a66e55019583da1e0f18706b7a8281c9f6de5b upstream.
Let's switch to using device_create_with_groups(), which will allow us to create "pins" attribute group together with the rest of ptp device attributes, and before userspace gets notified about ptp device creation.
Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_clock.c | 20 +++++++++++--------- drivers/ptp/ptp_private.h | 7 ++++--- drivers/ptp/ptp_sysfs.c | 39 +++++++++------------------------------ 3 files changed, 24 insertions(+), 42 deletions(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 08f304b83ad13..d5ac33350889e 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -214,16 +214,17 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, mutex_init(&ptp->pincfg_mux); init_waitqueue_head(&ptp->tsev_wq);
+ err = ptp_populate_pin_groups(ptp); + if (err) + goto no_pin_groups; + /* Create a new device in our class. */ - ptp->dev = device_create(ptp_class, parent, ptp->devid, ptp, - "ptp%d", ptp->index); + ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, + ptp, ptp->pin_attr_groups, + "ptp%d", ptp->index); if (IS_ERR(ptp->dev)) goto no_device;
- err = ptp_populate_sysfs(ptp); - if (err) - goto no_sysfs; - /* Register a new PPS source. */ if (info->pps) { struct pps_source_info pps; @@ -251,10 +252,10 @@ no_clock: if (ptp->pps_source) pps_unregister_source(ptp->pps_source); no_pps: - ptp_cleanup_sysfs(ptp); -no_sysfs: device_destroy(ptp_class, ptp->devid); no_device: + ptp_cleanup_pin_groups(ptp); +no_pin_groups: mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); ida_simple_remove(&ptp_clocks_map, index); @@ -273,8 +274,9 @@ int ptp_clock_unregister(struct ptp_clock *ptp) /* Release the clock's resources. */ if (ptp->pps_source) pps_unregister_source(ptp->pps_source); - ptp_cleanup_sysfs(ptp); + device_destroy(ptp_class, ptp->devid); + ptp_cleanup_pin_groups(ptp);
posix_clock_unregister(&ptp->clock); return 0; diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h index 9c5d41421b651..d95888974d0c6 100644 --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -54,6 +54,8 @@ struct ptp_clock { struct device_attribute *pin_dev_attr; struct attribute **pin_attr; struct attribute_group pin_attr_group; + /* 1st entry is a pointer to the real group, 2nd is NULL terminator */ + const struct attribute_group *pin_attr_groups[2]; };
/* @@ -94,8 +96,7 @@ uint ptp_poll(struct posix_clock *pc,
extern const struct attribute_group *ptp_groups[];
-int ptp_cleanup_sysfs(struct ptp_clock *ptp); - -int ptp_populate_sysfs(struct ptp_clock *ptp); +int ptp_populate_pin_groups(struct ptp_clock *ptp); +void ptp_cleanup_pin_groups(struct ptp_clock *ptp);
#endif diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c index a55a6eb4dfde9..731d0423c8aa7 100644 --- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -268,25 +268,14 @@ static ssize_t ptp_pin_store(struct device *dev, struct device_attribute *attr, return count; }
-int ptp_cleanup_sysfs(struct ptp_clock *ptp) +int ptp_populate_pin_groups(struct ptp_clock *ptp) { - struct device *dev = ptp->dev; - struct ptp_clock_info *info = ptp->info; - - if (info->n_pins) { - sysfs_remove_group(&dev->kobj, &ptp->pin_attr_group); - kfree(ptp->pin_attr); - kfree(ptp->pin_dev_attr); - } - return 0; -} - -static int ptp_populate_pins(struct ptp_clock *ptp) -{ - struct device *dev = ptp->dev; struct ptp_clock_info *info = ptp->info; int err = -ENOMEM, i, n_pins = info->n_pins;
+ if (!n_pins) + return 0; + ptp->pin_dev_attr = kzalloc(n_pins * sizeof(*ptp->pin_dev_attr), GFP_KERNEL); if (!ptp->pin_dev_attr) @@ -310,28 +299,18 @@ static int ptp_populate_pins(struct ptp_clock *ptp) ptp->pin_attr_group.name = "pins"; ptp->pin_attr_group.attrs = ptp->pin_attr;
- err = sysfs_create_group(&dev->kobj, &ptp->pin_attr_group); - if (err) - goto no_group; + ptp->pin_attr_groups[0] = &ptp->pin_attr_group; + return 0;
-no_group: - kfree(ptp->pin_attr); no_pin_attr: kfree(ptp->pin_dev_attr); no_dev_attr: return err; }
-int ptp_populate_sysfs(struct ptp_clock *ptp) +void ptp_cleanup_pin_groups(struct ptp_clock *ptp) { - struct ptp_clock_info *info = ptp->info; - int err; - - if (info->n_pins) { - err = ptp_populate_pins(ptp); - if (err) - return err; - } - return 0; + kfree(ptp->pin_attr); + kfree(ptp->pin_dev_attr); }
From: Logan Gunthorpe logang@deltatee.com
commit 233ed09d7fdacf592ee91e6c97ce5f4364fbe7c0 upstream.
Credit for this patch goes is shared with Dan Williams [1]. I've taken things one step further to make the helper function more useful and clean up calling code.
There's a common pattern in the kernel whereby a struct cdev is placed in a structure along side a struct device which manages the life-cycle of both. In the naive approach, the reference counting is broken and the struct device can free everything before the chardev code is entirely released.
Many developers have solved this problem by linking the internal kobjs in this fashion:
cdev.kobj.parent = &parent_dev.kobj;
The cdev code explicitly gets and puts a reference to it's kobj parent. So this seems like it was intended to be used this way. Dmitrty Torokhov first put this in place in 2012 with this commit:
2f0157f char_dev: pin parent kobject
and the first instance of the fix was then done in the input subsystem in the following commit:
4a215aa Input: fix use-after-free introduced with dynamic minor changes
Subsequently over the years, however, this issue seems to have tripped up multiple developers independently. For example, see these commits:
0d5b7da iio: Prevent race between IIO chardev opening and IIO device (by Lars-Peter Clausen in 2013)
ba0ef85 tpm: Fix initialization of the cdev (by Jason Gunthorpe in 2015)
5b28dde [media] media: fix use-after-free in cdev_put() when app exits after driver unbind (by Shauh Khan in 2016)
This technique is similarly done in at least 15 places within the kernel and probably should have been done so in another, at least, 5 places. The kobj line also looks very suspect in that one would not expect drivers to have to mess with kobject internals in this way. Even highly experienced kernel developers can be surprised by this code, as seen in [2].
To help alleviate this situation, and hopefully prevent future wasted effort on this problem, this patch introduces a helper function to register a char device along with its parent struct device. This creates a more regular API for tying a char device to its parent without the developer having to set members in the underlying kobject.
This patch introduce cdev_device_add and cdev_device_del which replaces a common pattern including setting the kobj parent, calling cdev_add and then calling device_add. It also introduces cdev_set_parent for the few cases that set the kobject parent without using device_add.
[1] https://lkml.org/lkml/2017/2/13/700 [2] https://lkml.org/lkml/2017/2/10/370
Signed-off-by: Logan Gunthorpe logang@deltatee.com Signed-off-by: Dan Williams dan.j.williams@intel.com Reviewed-by: Hans Verkuil hans.verkuil@cisco.com Reviewed-by: Alexandre Belloni alexandre.belloni@free-electrons.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/char_dev.c | 86 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/cdev.h | 5 +++ 2 files changed, 91 insertions(+)
diff --git a/fs/char_dev.c b/fs/char_dev.c index 23e0477edf7d6..1bbb966c0783f 100644 --- a/fs/char_dev.c +++ b/fs/char_dev.c @@ -477,6 +477,85 @@ int cdev_add(struct cdev *p, dev_t dev, unsigned count) return 0; }
+/** + * cdev_set_parent() - set the parent kobject for a char device + * @p: the cdev structure + * @kobj: the kobject to take a reference to + * + * cdev_set_parent() sets a parent kobject which will be referenced + * appropriately so the parent is not freed before the cdev. This + * should be called before cdev_add. + */ +void cdev_set_parent(struct cdev *p, struct kobject *kobj) +{ + WARN_ON(!kobj->state_initialized); + p->kobj.parent = kobj; +} + +/** + * cdev_device_add() - add a char device and it's corresponding + * struct device, linkink + * @dev: the device structure + * @cdev: the cdev structure + * + * cdev_device_add() adds the char device represented by @cdev to the system, + * just as cdev_add does. It then adds @dev to the system using device_add + * The dev_t for the char device will be taken from the struct device which + * needs to be initialized first. This helper function correctly takes a + * reference to the parent device so the parent will not get released until + * all references to the cdev are released. + * + * This helper uses dev->devt for the device number. If it is not set + * it will not add the cdev and it will be equivalent to device_add. + * + * This function should be used whenever the struct cdev and the + * struct device are members of the same structure whose lifetime is + * managed by the struct device. + * + * NOTE: Callers must assume that userspace was able to open the cdev and + * can call cdev fops callbacks at any time, even if this function fails. + */ +int cdev_device_add(struct cdev *cdev, struct device *dev) +{ + int rc = 0; + + if (dev->devt) { + cdev_set_parent(cdev, &dev->kobj); + + rc = cdev_add(cdev, dev->devt, 1); + if (rc) + return rc; + } + + rc = device_add(dev); + if (rc) + cdev_del(cdev); + + return rc; +} + +/** + * cdev_device_del() - inverse of cdev_device_add + * @dev: the device structure + * @cdev: the cdev structure + * + * cdev_device_del() is a helper function to call cdev_del and device_del. + * It should be used whenever cdev_device_add is used. + * + * If dev->devt is not set it will not remove the cdev and will be equivalent + * to device_del. + * + * NOTE: This guarantees that associated sysfs callbacks are not running + * or runnable, however any cdevs already open will remain and their fops + * will still be callable even after this function returns. + */ +void cdev_device_del(struct cdev *cdev, struct device *dev) +{ + device_del(dev); + if (dev->devt) + cdev_del(cdev); +} + static void cdev_unmap(dev_t dev, unsigned count) { kobj_unmap(cdev_map, dev, count); @@ -488,6 +567,10 @@ static void cdev_unmap(dev_t dev, unsigned count) * * cdev_del() removes @p from the system, possibly freeing the structure * itself. + * + * NOTE: This guarantees that cdev device will no longer be able to be + * opened, however any cdevs already open will remain and their fops will + * still be callable even after cdev_del returns. */ void cdev_del(struct cdev *p) { @@ -576,5 +659,8 @@ EXPORT_SYMBOL(cdev_init); EXPORT_SYMBOL(cdev_alloc); EXPORT_SYMBOL(cdev_del); EXPORT_SYMBOL(cdev_add); +EXPORT_SYMBOL(cdev_set_parent); +EXPORT_SYMBOL(cdev_device_add); +EXPORT_SYMBOL(cdev_device_del); EXPORT_SYMBOL(__register_chrdev); EXPORT_SYMBOL(__unregister_chrdev); diff --git a/include/linux/cdev.h b/include/linux/cdev.h index f8763615a5f2d..408bc09ce497b 100644 --- a/include/linux/cdev.h +++ b/include/linux/cdev.h @@ -4,6 +4,7 @@ #include <linux/kobject.h> #include <linux/kdev_t.h> #include <linux/list.h> +#include <linux/device.h>
struct file_operations; struct inode; @@ -26,6 +27,10 @@ void cdev_put(struct cdev *p);
int cdev_add(struct cdev *, dev_t, unsigned);
+void cdev_set_parent(struct cdev *p, struct kobject *kobj); +int cdev_device_add(struct cdev *cdev, struct device *dev); +void cdev_device_del(struct cdev *cdev, struct device *dev); + void cdev_del(struct cdev *);
void cd_forget(struct inode *);
From: YueHaibing yuehaibing@huawei.com
commit aea0a897af9e44c258e8ab9296fad417f1bc063a upstream.
Fix smatch warning:
drivers/ptp/ptp_clock.c:298 ptp_clock_register() warn: passing zero to 'ERR_PTR'
'err' should be set while device_create_with_groups and pps_register_source fails
Fixes: 85a66e550195 ("ptp: create "pins" together with the rest of attributes") Signed-off-by: YueHaibing yuehaibing@huawei.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_clock.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index d5ac33350889e..b87b7b0867a4c 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -222,8 +222,10 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, ptp, ptp->pin_attr_groups, "ptp%d", ptp->index); - if (IS_ERR(ptp->dev)) + if (IS_ERR(ptp->dev)) { + err = PTR_ERR(ptp->dev); goto no_device; + }
/* Register a new PPS source. */ if (info->pps) { @@ -234,6 +236,7 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, pps.owner = info->owner; ptp->pps_source = pps_register_source(&pps, PTP_PPS_DEFAULTS); if (!ptp->pps_source) { + err = -EINVAL; pr_err("failed to register pps source\n"); goto no_pps; }
From: Vladis Dronov vdronov@redhat.com
commit a33121e5487b424339636b25c35d3a180eaa5f5e upstream.
In a case when a ptp chardev (like /dev/ptp0) is open but an underlying device is removed, closing this file leads to a race. This reproduces easily in a kvm virtual machine:
ts# cat openptp0.c int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); } ts# uname -r 5.5.0-rc3-46cf053e ts# cat /proc/cmdline ... slub_debug=FZP ts# modprobe ptp_kvm ts# ./openptp0 & [1] 670 opened /dev/ptp0, sleeping 10s... ts# rmmod ptp_kvm ts# ls /dev/ptp* ls: cannot access '/dev/ptp*': No such file or directory ts# ...woken up [ 48.010809] general protection fault: 0000 [#1] SMP [ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25 [ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... [ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80 [ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202 [ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0 [ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b [ 48.019470] ... ^^^ a slub poison [ 48.023854] Call Trace: [ 48.024050] __fput+0x21f/0x240 [ 48.024288] task_work_run+0x79/0x90 [ 48.024555] do_exit+0x2af/0xab0 [ 48.024799] ? vfs_write+0x16a/0x190 [ 48.025082] do_group_exit+0x35/0x90 [ 48.025387] __x64_sys_exit_group+0xf/0x10 [ 48.025737] do_syscall_64+0x3d/0x130 [ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 48.026479] RIP: 0033:0x7f53b12082f6 [ 48.026792] ... [ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm] [ 48.045001] Fixing recursive fault but reboot is needed!
This happens in:
static void __fput(struct file *file) { ... if (file->f_op->release) file->f_op->release(inode, file); <<< cdev is kfree'd here if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL && !(mode & FMODE_PATH))) { cdev_put(inode->i_cdev); <<< cdev fields are accessed here
Namely:
__fput() posix_clock_release() kref_put(&clk->kref, delete_clock) <<< the last reference delete_clock() delete_ptp_clock() kfree(ptp) <<< cdev is embedded in ptp cdev_put module_put(p->owner) <<< *p is kfree'd, bang!
Here cdev is embedded in posix_clock which is embedded in ptp_clock. The race happens because ptp_clock's lifetime is controlled by two refcounts: kref and cdev.kobj in posix_clock. This is wrong.
Make ptp_clock's sysfs device a parent of cdev with cdev_device_add() created especially for such cases. This way the parent device with its ptp_clock is not released until all references to the cdev are released. This adds a requirement that an initialized but not exposed struct device should be provided to posix_clock_register() by a caller instead of a simple dev_t.
This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix the race between the release of watchdog_core_data and cdev"). See details of the implementation in the commit 233ed09d7fda ("chardev: add helper function to register char devs with a struct device").
Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.c... Analyzed-by: Stephen Johnston sjohnsto@redhat.com Analyzed-by: Vern Lovejoy vlovejoy@redhat.com Signed-off-by: Vladis Dronov vdronov@redhat.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_clock.c | 31 ++++++++++++++----------------- drivers/ptp/ptp_private.h | 2 +- include/linux/posix-clock.h | 19 +++++++++++-------- kernel/time/posix-clock.c | 31 +++++++++++++------------------ 4 files changed, 39 insertions(+), 44 deletions(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index b87b7b0867a4c..2c1ae324da182 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -171,9 +171,9 @@ static struct posix_clock_operations ptp_clock_ops = { .read = ptp_read, };
-static void delete_ptp_clock(struct posix_clock *pc) +static void ptp_clock_release(struct device *dev) { - struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock); + struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); @@ -205,7 +205,6 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, }
ptp->clock.ops = ptp_clock_ops; - ptp->clock.release = delete_ptp_clock; ptp->info = info; ptp->devid = MKDEV(major, index); ptp->index = index; @@ -218,15 +217,6 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, if (err) goto no_pin_groups;
- /* Create a new device in our class. */ - ptp->dev = device_create_with_groups(ptp_class, parent, ptp->devid, - ptp, ptp->pin_attr_groups, - "ptp%d", ptp->index); - if (IS_ERR(ptp->dev)) { - err = PTR_ERR(ptp->dev); - goto no_device; - } - /* Register a new PPS source. */ if (info->pps) { struct pps_source_info pps; @@ -242,8 +232,18 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, } }
- /* Create a posix clock. */ - err = posix_clock_register(&ptp->clock, ptp->devid); + /* Initialize a new device of our class in our clock structure. */ + device_initialize(&ptp->dev); + ptp->dev.devt = ptp->devid; + ptp->dev.class = ptp_class; + ptp->dev.parent = parent; + ptp->dev.groups = ptp->pin_attr_groups; + ptp->dev.release = ptp_clock_release; + dev_set_drvdata(&ptp->dev, ptp); + dev_set_name(&ptp->dev, "ptp%d", ptp->index); + + /* Create a posix clock and link it to the device. */ + err = posix_clock_register(&ptp->clock, &ptp->dev); if (err) { pr_err("failed to create posix clock\n"); goto no_clock; @@ -255,8 +255,6 @@ no_clock: if (ptp->pps_source) pps_unregister_source(ptp->pps_source); no_pps: - device_destroy(ptp_class, ptp->devid); -no_device: ptp_cleanup_pin_groups(ptp); no_pin_groups: mutex_destroy(&ptp->tsevq_mux); @@ -278,7 +276,6 @@ int ptp_clock_unregister(struct ptp_clock *ptp) if (ptp->pps_source) pps_unregister_source(ptp->pps_source);
- device_destroy(ptp_class, ptp->devid); ptp_cleanup_pin_groups(ptp);
posix_clock_unregister(&ptp->clock); diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h index d95888974d0c6..15346e840caa9 100644 --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -40,7 +40,7 @@ struct timestamp_event_queue {
struct ptp_clock { struct posix_clock clock; - struct device *dev; + struct device dev; struct ptp_clock_info *info; dev_t devid; int index; /* index into clocks.map */ diff --git a/include/linux/posix-clock.h b/include/linux/posix-clock.h index 83b22ae9ae12a..b39420a0321c3 100644 --- a/include/linux/posix-clock.h +++ b/include/linux/posix-clock.h @@ -104,29 +104,32 @@ struct posix_clock_operations { * * @ops: Functional interface to the clock * @cdev: Character device instance for this clock - * @kref: Reference count. + * @dev: Pointer to the clock's device. * @rwsem: Protects the 'zombie' field from concurrent access. * @zombie: If 'zombie' is true, then the hardware has disappeared. - * @release: A function to free the structure when the reference count reaches - * zero. May be NULL if structure is statically allocated. * * Drivers should embed their struct posix_clock within a private * structure, obtaining a reference to it during callbacks using * container_of(). + * + * Drivers should supply an initialized but not exposed struct device + * to posix_clock_register(). It is used to manage lifetime of the + * driver's private structure. It's 'release' field should be set to + * a release function for this private structure. */ struct posix_clock { struct posix_clock_operations ops; struct cdev cdev; - struct kref kref; + struct device *dev; struct rw_semaphore rwsem; bool zombie; - void (*release)(struct posix_clock *clk); };
/** * posix_clock_register() - register a new clock - * @clk: Pointer to the clock. Caller must provide 'ops' and 'release' - * @devid: Allocated device id + * @clk: Pointer to the clock. Caller must provide 'ops' field + * @dev: Pointer to the initialized device. Caller must provide + * 'release' field * * A clock driver calls this function to register itself with the * clock device subsystem. If 'clk' points to dynamically allocated @@ -135,7 +138,7 @@ struct posix_clock { * * Returns zero on success, non-zero otherwise. */ -int posix_clock_register(struct posix_clock *clk, dev_t devid); +int posix_clock_register(struct posix_clock *clk, struct device *dev);
/** * posix_clock_unregister() - unregister a clock diff --git a/kernel/time/posix-clock.c b/kernel/time/posix-clock.c index e24008c098c6b..45a0a26023d4b 100644 --- a/kernel/time/posix-clock.c +++ b/kernel/time/posix-clock.c @@ -25,8 +25,6 @@ #include <linux/syscalls.h> #include <linux/uaccess.h>
-static void delete_clock(struct kref *kref); - /* * Returns NULL if the posix_clock instance attached to 'fp' is old and stale. */ @@ -168,7 +166,7 @@ static int posix_clock_open(struct inode *inode, struct file *fp) err = 0;
if (!err) { - kref_get(&clk->kref); + get_device(clk->dev); fp->private_data = clk; } out: @@ -184,7 +182,7 @@ static int posix_clock_release(struct inode *inode, struct file *fp) if (clk->ops.release) err = clk->ops.release(clk);
- kref_put(&clk->kref, delete_clock); + put_device(clk->dev);
fp->private_data = NULL;
@@ -206,38 +204,35 @@ static const struct file_operations posix_clock_file_operations = { #endif };
-int posix_clock_register(struct posix_clock *clk, dev_t devid) +int posix_clock_register(struct posix_clock *clk, struct device *dev) { int err;
- kref_init(&clk->kref); init_rwsem(&clk->rwsem);
cdev_init(&clk->cdev, &posix_clock_file_operations); + err = cdev_device_add(&clk->cdev, dev); + if (err) { + pr_err("%s unable to add device %d:%d\n", + dev_name(dev), MAJOR(dev->devt), MINOR(dev->devt)); + return err; + } clk->cdev.owner = clk->ops.owner; - err = cdev_add(&clk->cdev, devid, 1); + clk->dev = dev;
- return err; + return 0; } EXPORT_SYMBOL_GPL(posix_clock_register);
-static void delete_clock(struct kref *kref) -{ - struct posix_clock *clk = container_of(kref, struct posix_clock, kref); - - if (clk->release) - clk->release(clk); -} - void posix_clock_unregister(struct posix_clock *clk) { - cdev_del(&clk->cdev); + cdev_device_del(&clk->cdev, clk->dev);
down_write(&clk->rwsem); clk->zombie = true; up_write(&clk->rwsem);
- kref_put(&clk->kref, delete_clock); + put_device(clk->dev); } EXPORT_SYMBOL_GPL(posix_clock_unregister);
From: Vladis Dronov vdronov@redhat.com
commit 75718584cb3c64e6269109d4d54f888ac5a5fd15 upstream.
There is a bug in ptp_clock_unregister(), where ptp_cleanup_pin_groups() first frees ptp->pin_{,dev_}attr, but then posix_clock_unregister() needs them to destroy a related sysfs device.
These functions can not be just swapped, as posix_clock_unregister() frees ptp which is needed in the ptp_cleanup_pin_groups(). Fix this by calling ptp_cleanup_pin_groups() in ptp_clock_release(), right before ptp is freed.
This makes this patch fix an UAF bug in a patch which fixes an UAF bug.
Reported-by: Antti Laakso antti.laakso@intel.com Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and cdev") Link: https://lore.kernel.org/netdev/3d2bd09735dbdaf003585ca376b7c1e5b69a19bd.came... Signed-off-by: Vladis Dronov vdronov@redhat.com Acked-by: Richard Cochran richardcochran@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_clock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 2c1ae324da182..bf1536f1c90bb 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -175,6 +175,7 @@ static void ptp_clock_release(struct device *dev) { struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
+ ptp_cleanup_pin_groups(ptp); mutex_destroy(&ptp->tsevq_mux); mutex_destroy(&ptp->pincfg_mux); ida_simple_remove(&ptp_clocks_map, ptp->index); @@ -276,9 +277,8 @@ int ptp_clock_unregister(struct ptp_clock *ptp) if (ptp->pps_source) pps_unregister_source(ptp->pps_source);
- ptp_cleanup_pin_groups(ptp); - posix_clock_unregister(&ptp->clock); + return 0; } EXPORT_SYMBOL(ptp_clock_unregister);
From: Hugh Dickins hughd@google.com
[ Upstream commit ea0dfeb4209b4eab954d6e00ed136bc6b48b380d ]
Recent commit 71725ed10c40 ("mm: huge tmpfs: try to split_huge_page() when punching hole") has allowed syzkaller to probe deeper, uncovering a long-standing lockdep issue between the irq-unsafe shmlock_user_lock, the irq-safe xa_lock on mapping->i_pages, and shmem inode's info->lock which nests inside xa_lock (or tree_lock) since 4.8's shmem_uncharge().
user_shm_lock(), servicing SysV shmctl(SHM_LOCK), wants shmlock_user_lock while its caller shmem_lock() holds info->lock with interrupts disabled; but hugetlbfs_file_setup() calls user_shm_lock() with interrupts enabled, and might be interrupted by a writeback endio wanting xa_lock on i_pages.
This may not risk an actual deadlock, since shmem inodes do not take part in writeback accounting, but there are several easy ways to avoid it.
Requiring interrupts disabled for shmlock_user_lock would be easy, but it's a high-level global lock for which that seems inappropriate. Instead, recall that the use of info->lock to guard info->flags in shmem_lock() dates from pre-3.1 days, when races with SHMEM_PAGEIN and SHMEM_TRUNCATE could occur: nowadays it serves no purpose, the only flag added or removed is VM_LOCKED itself, and calls to shmem_lock() an inode are already serialized by the caller.
Take info->lock out of the chain and the possibility of deadlock or lockdep warning goes away.
Fixes: 4595ef88d136 ("shmem: make shmem_inode_info::lock irq-safe") Reported-by: syzbot+c8a8197c8852f566b9d9@syzkaller.appspotmail.com Reported-by: syzbot+40b71e145e73f78f81ad@syzkaller.appspotmail.com Signed-off-by: Hugh Dickins hughd@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Yang Shi yang.shi@linux.alibaba.com Cc: Yang Shi yang.shi@linux.alibaba.com Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2004161707410.16322@eggly.anvils Link: https://lore.kernel.org/lkml/000000000000e5838c05a3152f53@google.com/ Link: https://lore.kernel.org/lkml/0000000000003712b305a331d3b1@google.com/ Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/shmem.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c index 90ccbb35458bd..31b0c09fe6c60 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2082,7 +2082,11 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user) struct shmem_inode_info *info = SHMEM_I(inode); int retval = -ENOMEM;
- spin_lock_irq(&info->lock); + /* + * What serializes the accesses to info->flags? + * ipc_lock_object() when called from shmctl_do_lock(), + * no serialization needed when called from shm_destroy(). + */ if (lock && !(info->flags & VM_LOCKED)) { if (!user_shm_lock(inode->i_size, user)) goto out_nomem; @@ -2097,7 +2101,6 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user) retval = 0;
out_nomem: - spin_unlock_irq(&info->lock); return retval; }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 10e3cc180e64385edc9890c6855acf5ed9ca1339 ]
A call to 'dma_alloc_coherent()' is hidden in 'sonic_alloc_descriptors()', called from 'sonic_probe1()'.
This is correctly freed in the remove function, but not in the error handling path of the probe function. Fix it and add the missing 'dma_free_coherent()' call.
While at it, rename a label in order to be slightly more informative.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/natsemi/jazzsonic.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/natsemi/jazzsonic.c b/drivers/net/ethernet/natsemi/jazzsonic.c index acf3f11e38cc1..68d2f31921ff8 100644 --- a/drivers/net/ethernet/natsemi/jazzsonic.c +++ b/drivers/net/ethernet/natsemi/jazzsonic.c @@ -247,13 +247,15 @@ static int jazz_sonic_probe(struct platform_device *pdev) goto out; err = register_netdev(dev); if (err) - goto out1; + goto undo_probe1;
printk("%s: MAC %pM IRQ %d\n", dev->name, dev->dev_addr, dev->irq);
return 0;
-out1: +undo_probe1: + dma_free_coherent(lp->device, SIZEOF_SONIC_DESC * SONIC_BUS_SCALE(lp->dma_bitmode), + lp->descriptors, lp->descriptors_laddr); release_mem_region(dev->base_addr, SONIC_MEM_SIZE); out: free_netdev(dev);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit ee8d2267f0e39a1bfd95532da3a6405004114b27 ]
Should an irq requested with 'devm_request_irq' be released explicitly, it should be done by 'devm_free_irq()', not 'free_irq()'.
Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/moxa/moxart_ether.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c index 0622fd03941b8..6fe61d9343cb8 100644 --- a/drivers/net/ethernet/moxa/moxart_ether.c +++ b/drivers/net/ethernet/moxa/moxart_ether.c @@ -571,7 +571,7 @@ static int moxart_remove(struct platform_device *pdev) struct net_device *ndev = platform_get_drvdata(pdev);
unregister_netdev(ndev); - free_irq(ndev->irq, ndev); + devm_free_irq(&pdev->dev, ndev->irq, ndev); moxart_mac_free_memory(ndev); free_netdev(ndev);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit dc30b4059f6e2abf3712ab537c8718562b21c45d ]
The current gcc-10 snapshot produces a false-positive warning:
net/core/drop_monitor.c: In function 'trace_drop_common.constprop': cc1: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] In file included from net/core/drop_monitor.c:23: include/uapi/linux/net_dropmon.h:36:8: note: at offset 0 to object 'entries' with size 4 declared here 36 | __u32 entries; | ^~~~~~~
I reported this in the gcc bugzilla, but in case it does not get fixed in the release, work around it by using a temporary variable.
Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881 Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/drop_monitor.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c index ca2c9c8b9a3e9..6d7ff117f3792 100644 --- a/net/core/drop_monitor.c +++ b/net/core/drop_monitor.c @@ -159,6 +159,7 @@ static void sched_send_work(unsigned long _data) static void trace_drop_common(struct sk_buff *skb, void *location) { struct net_dm_alert_msg *msg; + struct net_dm_drop_point *point; struct nlmsghdr *nlh; struct nlattr *nla; int i; @@ -177,11 +178,13 @@ static void trace_drop_common(struct sk_buff *skb, void *location) nlh = (struct nlmsghdr *)dskb->data; nla = genlmsg_data(nlmsg_data(nlh)); msg = nla_data(nla); + point = msg->points; for (i = 0; i < msg->entries; i++) { - if (!memcmp(&location, msg->points[i].pc, sizeof(void *))) { - msg->points[i].count++; + if (!memcmp(&location, &point->pc, sizeof(void *))) { + point->count++; goto out; } + point++; } if (msg->entries == dm_hit_limit) goto out; @@ -190,8 +193,8 @@ static void trace_drop_common(struct sk_buff *skb, void *location) */ __nla_reserve_nohdr(dskb, sizeof(struct net_dm_drop_point)); nla->nla_len += NLA_ALIGN(sizeof(struct net_dm_drop_point)); - memcpy(msg->points[msg->entries].pc, &location, sizeof(void *)); - msg->points[msg->entries].count = 1; + memcpy(point->pc, &location, sizeof(void *)); + point->count = 1; msg->entries++;
if (!timer_pending(&data->send_timer)) {
From: Wu Bo wubo40@huawei.com
commit 83c6f2390040f188cc25b270b4befeb5628c1aee upstream.
If the __copy_from_user function failed we need to call sg_remove_request in sg_write.
Link: https://lore.kernel.org/r/610618d9-e983-fd56-ed0f-639428343af7@huawei.com Acked-by: Douglas Gilbert dgilbert@interlog.com Signed-off-by: Wu Bo wubo40@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org [groeck: Backport to v5.4.y and older kernels] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sg.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -695,8 +695,10 @@ sg_write(struct file *filp, const char _ hp->flags = input_size; /* structure abuse ... */ hp->pack_id = old_hdr.pack_id; hp->usr_ptr = NULL; - if (__copy_from_user(cmnd, buf, cmd_size)) + if (__copy_from_user(cmnd, buf, cmd_size)) { + sg_remove_request(sfp, srp); return -EFAULT; + } /* * SG_DXFER_TO_FROM_DEV is functionally equivalent to SG_DXFER_FROM_DEV, * but is is possible that the app intended SG_DXFER_TO_DEV, because there
From: wuxu.wu wuxu.wu@huawei.com
commit 19b61392c5a852b4e8a0bf35aecb969983c5932d upstream.
dw_spi_irq() and dw_spi_transfer_one concurrent calls.
I find a panic in dw_writer(): txw = *(u8 *)(dws->tx), when dw->tx==null, dw->len==4, and dw->tx_end==1.
When tpm driver's message overtime dw_spi_irq() and dw_spi_transfer_one may concurrent visit dw_spi, so I think dw_spi structure lack of protection.
Otherwise dw_spi_transfer_one set dw rx/tx buffer and then open irq, store dw rx/tx instructions and other cores handle irq load dw rx/tx instructions may out of order.
[ 1025.321302] Call trace: ... [ 1025.321319] __crash_kexec+0x98/0x148 [ 1025.321323] panic+0x17c/0x314 [ 1025.321329] die+0x29c/0x2e8 [ 1025.321334] die_kernel_fault+0x68/0x78 [ 1025.321337] __do_kernel_fault+0x90/0xb0 [ 1025.321346] do_page_fault+0x88/0x500 [ 1025.321347] do_translation_fault+0xa8/0xb8 [ 1025.321349] do_mem_abort+0x68/0x118 [ 1025.321351] el1_da+0x20/0x8c [ 1025.321362] dw_writer+0xc8/0xd0 [ 1025.321364] interrupt_transfer+0x60/0x110 [ 1025.321365] dw_spi_irq+0x48/0x70 ...
Signed-off-by: wuxu.wu wuxu.wu@huawei.com Link: https://lore.kernel.org/r/1577849981-31489-1-git-send-email-wuxu.wu@huawei.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Nobuhiro Iwamatsu (CIP) nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/spi/spi-dw.c | 15 ++++++++++++--- drivers/spi/spi-dw.h | 1 + 2 files changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/spi/spi-dw.c +++ b/drivers/spi/spi-dw.c @@ -180,9 +180,11 @@ static inline u32 rx_max(struct dw_spi *
static void dw_writer(struct dw_spi *dws) { - u32 max = tx_max(dws); + u32 max; u16 txw = 0;
+ spin_lock(&dws->buf_lock); + max = tx_max(dws); while (max--) { /* Set the tx word if the transfer's original "tx" is not null */ if (dws->tx_end - dws->len) { @@ -194,13 +196,16 @@ static void dw_writer(struct dw_spi *dws dw_write_io_reg(dws, DW_SPI_DR, txw); dws->tx += dws->n_bytes; } + spin_unlock(&dws->buf_lock); }
static void dw_reader(struct dw_spi *dws) { - u32 max = rx_max(dws); + u32 max; u16 rxw;
+ spin_lock(&dws->buf_lock); + max = rx_max(dws); while (max--) { rxw = dw_read_io_reg(dws, DW_SPI_DR); /* Care rx only if the transfer's original "rx" is not null */ @@ -212,6 +217,7 @@ static void dw_reader(struct dw_spi *dws } dws->rx += dws->n_bytes; } + spin_unlock(&dws->buf_lock); }
static void int_error_stop(struct dw_spi *dws, const char *msg) @@ -284,18 +290,20 @@ static int dw_spi_transfer_one(struct sp { struct dw_spi *dws = spi_master_get_devdata(master); struct chip_data *chip = spi_get_ctldata(spi); + unsigned long flags; u8 imask = 0; u16 txlevel = 0; u32 cr0; int ret;
dws->dma_mapped = 0; - + spin_lock_irqsave(&dws->buf_lock, flags); dws->tx = (void *)transfer->tx_buf; dws->tx_end = dws->tx + transfer->len; dws->rx = transfer->rx_buf; dws->rx_end = dws->rx + transfer->len; dws->len = transfer->len; + spin_unlock_irqrestore(&dws->buf_lock, flags);
spi_enable_chip(dws, 0);
@@ -487,6 +495,7 @@ int dw_spi_add_host(struct device *dev, dws->dma_inited = 0; dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR); snprintf(dws->name, sizeof(dws->name), "dw_spi%d", dws->bus_num); + spin_lock_init(&dws->buf_lock);
ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dws->name, master); if (ret < 0) { --- a/drivers/spi/spi-dw.h +++ b/drivers/spi/spi-dw.h @@ -117,6 +117,7 @@ struct dw_spi { size_t len; void *tx; void *tx_end; + spinlock_t buf_lock; void *rx; void *rx_end; int dma_mapped;
From: Samuel Cabrero scabrero@suse.de
[ Upstream commit 76e752701a8af4404bbd9c45723f7cbd6e4a251e ]
Some servers seem to accept connections while booting but never send the SMBNegotiate response neither close the connection, causing all processes accessing the share hang on uninterruptible sleep state.
This happens when the cifs_demultiplex_thread detects the server is unresponsive so releases the socket and start trying to reconnect. At some point, the faulty server will accept the socket and the TCP status will be set to NeedNegotiate. The first issued command accessing the share will start the negotiation (pid 5828 below), but the response will never arrive so other commands will be blocked waiting on the mutex (pid 55352).
This patch checks for unresponsive servers also on the negotiate stage releasing the socket and reconnecting if the response is not received and checking again the tcp state when the mutex is acquired.
PID: 55352 TASK: ffff880fd6cc02c0 CPU: 0 COMMAND: "ls" #0 [ffff880fd9add9f0] schedule at ffffffff81467eb9 #1 [ffff880fd9addb38] __mutex_lock_slowpath at ffffffff81468fe0 #2 [ffff880fd9addba8] mutex_lock at ffffffff81468b1a #3 [ffff880fd9addbc0] cifs_reconnect_tcon at ffffffffa042f905 [cifs] #4 [ffff880fd9addc60] smb_init at ffffffffa042faeb [cifs] #5 [ffff880fd9addca0] CIFSSMBQPathInfo at ffffffffa04360b5 [cifs] ....
Which is waiting a mutex owned by:
PID: 5828 TASK: ffff880fcc55e400 CPU: 0 COMMAND: "xxxx" #0 [ffff880fbfdc19b8] schedule at ffffffff81467eb9 #1 [ffff880fbfdc1b00] wait_for_response at ffffffffa044f96d [cifs] #2 [ffff880fbfdc1b60] SendReceive at ffffffffa04505ce [cifs] #3 [ffff880fbfdc1bb0] CIFSSMBNegotiate at ffffffffa0438d79 [cifs] #4 [ffff880fbfdc1c50] cifs_negotiate_protocol at ffffffffa043b383 [cifs] #5 [ffff880fbfdc1c80] cifs_reconnect_tcon at ffffffffa042f911 [cifs] #6 [ffff880fbfdc1d20] smb_init at ffffffffa042faeb [cifs] #7 [ffff880fbfdc1d60] CIFSSMBQFSInfo at ffffffffa0434eb0 [cifs] ....
Signed-off-by: Samuel Cabrero scabrero@suse.de Reviewed-by: Aurélien Aptel aaptel@suse.de Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French smfrench@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifssmb.c | 12 ++++++++++++ fs/cifs/connect.c | 3 ++- fs/cifs/smb2pdu.c | 12 ++++++++++++ 3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 741b83c59a306..568abcd6d0dd3 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -184,6 +184,18 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command) * reconnect the same SMB session */ mutex_lock(&ses->session_mutex); + + /* + * Recheck after acquire mutex. If another thread is negotiating + * and the server never sends an answer the socket will be closed + * and tcpStatus set to reconnect. + */ + if (server->tcpStatus == CifsNeedReconnect) { + rc = -EHOSTDOWN; + mutex_unlock(&ses->session_mutex); + goto out; + } + rc = cifs_negotiate_protocol(0, ses); if (rc == 0 && ses->need_reconnect) rc = cifs_setup_session(0, ses, nls_codepage); diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index c018d161735c4..37c8cac86431f 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -561,7 +561,8 @@ server_unresponsive(struct TCP_Server_Info *server) * 65s kernel_recvmsg times out, and we see that we haven't gotten * a response in >60s. */ - if (server->tcpStatus == CifsGood && + if ((server->tcpStatus == CifsGood || + server->tcpStatus == CifsNeedNegotiate) && time_after(jiffies, server->lstrp + 2 * server->echo_interval)) { cifs_dbg(VFS, "Server %s has not responded in %lu seconds. Reconnecting...\n", server->hostname, (2 * server->echo_interval) / HZ); diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index e8dc28dbe563e..0a23b6002ff12 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -246,6 +246,18 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon) * the same SMB session */ mutex_lock(&tcon->ses->session_mutex); + + /* + * Recheck after acquire mutex. If another thread is negotiating + * and the server never sends an answer the socket will be closed + * and tcpStatus set to reconnect. + */ + if (server->tcpStatus == CifsNeedReconnect) { + rc = -EHOSTDOWN; + mutex_unlock(&tcon->ses->session_mutex); + goto out; + } + rc = cifs_negotiate_protocol(0, tcon->ses); if (!rc && tcon->ses->need_reconnect) { rc = cifs_setup_session(0, tcon->ses, nls_codepage);
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit f2caf901c1b7ce65f9e6aef4217e3241039db768 ]
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 37c8cac86431f..3545b237187a8 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -551,10 +551,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -563,9 +563,9 @@ server_unresponsive(struct TCP_Server_Info *server) */ if ((server->tcpStatus == CifsGood || server->tcpStatus == CifsNeedNegotiate) && - time_after(jiffies, server->lstrp + 2 * server->echo_interval)) { + time_after(jiffies, server->lstrp + 3 * server->echo_interval)) { cifs_dbg(VFS, "Server %s has not responded in %lu seconds. Reconnecting...\n", - server->hostname, (2 * server->echo_interval) / HZ); + server->hostname, (3 * server->echo_interval) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
From: Madhuparna Bhowmik madhuparnabhowmik10@gmail.com
[ Upstream commit 2e45676a4d33af47259fa186ea039122ce263ba9 ]
pd->dma.dev is read in irq handler pd_irq(). However, it is set to pdev->dev after request_irq(). Therefore, set pd->dma.dev to pdev->dev before request_irq() to avoid data race between pch_dma_probe() and pd_irq().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Madhuparna Bhowmik madhuparnabhowmik10@gmail.com Link: https://lore.kernel.org/r/20200416062335.29223-1-madhuparnabhowmik10@gmail.c... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/pch_dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/pch_dma.c b/drivers/dma/pch_dma.c index df95727dc2fba..8a0c70e4f7277 100644 --- a/drivers/dma/pch_dma.c +++ b/drivers/dma/pch_dma.c @@ -876,6 +876,7 @@ static int pch_dma_probe(struct pci_dev *pdev, }
pci_set_master(pdev); + pd->dma.dev = &pdev->dev;
err = request_irq(pdev->irq, pd_irq, IRQF_SHARED, DRV_NAME, pd); if (err) { @@ -891,7 +892,6 @@ static int pch_dma_probe(struct pci_dev *pdev, goto err_free_irq; }
- pd->dma.dev = &pdev->dev;
INIT_LIST_HEAD(&pd->dma.channels);
From: Lubomir Rintel lkundrak@v3.sk
[ Upstream commit 0c89446379218698189a47871336cb30286a7197 ]
When a channel configuration fails, the status of the channel is set to DEV_ERROR so that an attempt to submit it fails. However, this status sticks until the heat end of the universe, making it impossible to recover from the error.
Let's reset it when the channel is released so that further use of the channel with correct configuration is not impacted.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Link: https://lore.kernel.org/r/20200419164912.670973-5-lkundrak@v3.sk Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/mmp_tdma.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/dma/mmp_tdma.c b/drivers/dma/mmp_tdma.c index 13c68b6434ce2..15b4a44e60069 100644 --- a/drivers/dma/mmp_tdma.c +++ b/drivers/dma/mmp_tdma.c @@ -362,6 +362,8 @@ static void mmp_tdma_free_descriptor(struct mmp_tdma_chan *tdmac) gen_pool_free(gpool, (unsigned long)tdmac->desc_arr, size); tdmac->desc_arr = NULL; + if (tdmac->status == DMA_ERROR) + tdmac->status = DMA_COMPLETE;
return; }
From: Kai Vehmanen kai.vehmanen@linux.intel.com
[ Upstream commit ca76282b6faffc83601c25bd2a95f635c03503ef ]
A race exists between build_pcms() and build_controls() phases of codec setup. Build_pcms() sets up notifier for jack events. If a monitor event is received before build_controls() is run, the initial jack state is lost and never reported via mixer controls.
The problem can be hit at least with SOF as the controller driver. SOF calls snd_hda_codec_build_controls() in its workqueue-based probe and this can be delayed enough to hit the race condition.
Fix the issue by invalidating the per-pin ELD information when build_controls() is called. The existing call to hdmi_present_sense() will update the ELD contents. This ensures initial monitor state is correctly reflected via mixer controls.
BugLink: https://github.com/thesofproject/linux/issues/1687 Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20200428123836.24512-1-kai.vehmanen@linux.intel.co... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_hdmi.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c index e19f447e27ae1..a866a20349c32 100644 --- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -2044,7 +2044,9 @@ static int generic_hdmi_build_controls(struct hda_codec *codec)
for (pin_idx = 0; pin_idx < spec->num_pins; pin_idx++) { struct hdmi_spec_per_pin *per_pin = get_pin(spec, pin_idx); + struct hdmi_eld *pin_eld = &per_pin->sink_eld;
+ pin_eld->eld_valid = false; hdmi_present_sense(per_pin, 0); }
From: Vasily Averin vvs@virtuozzo.com
[ Upstream commit 5b5703dbafae74adfbe298a56a81694172caf5e6 ]
v2: removed TODO reminder
Signed-off-by: Vasily Averin vvs@virtuozzo.com Link: http://patchwork.freedesktop.org/patch/msgid/a4e0ae09-a73c-1c62-04ef-3f990d4... Signed-off-by: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/qxl/qxl_image.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/qxl/qxl_image.c b/drivers/gpu/drm/qxl/qxl_image.c index 7fbcc35e8ad35..c89c10055641e 100644 --- a/drivers/gpu/drm/qxl/qxl_image.c +++ b/drivers/gpu/drm/qxl/qxl_image.c @@ -210,7 +210,8 @@ qxl_image_init_helper(struct qxl_device *qdev, break; default: DRM_ERROR("unsupported image bit depth\n"); - return -EINVAL; /* TODO: cleanup */ + qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr); + return -EINVAL; } image->u.bitmap.flags = QXL_BITMAP_TOP_DOWN; image->u.bitmap.x = width;
From: Vasily Averin vvs@virtuozzo.com
[ Upstream commit 5e698222c70257d13ae0816720dde57c56f81e15 ]
Commit 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") is causing this bug (seen on 5.6.8):
# ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages
# ipcmk -Q Message queue id: 0 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0
# ipcmk -Q Message queue id: 1 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcrm -q 0 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcmk -Q Message queue id: 2 # ipcrm -q 2 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0
# ipcmk -Q Message queue id: 3 # ipcrm -q 1 # ipcs -q
------ Message Queues -------- key msqid owner perms used-bytes messages 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0
Whenever an IPC item with a low id is deleted, the items with higher ids are duplicated, as if filling a hole.
new_pos should jump through hole of unused ids, pos can be updated inside "for" cycle.
Fixes: 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") Reported-by: Andreas Schwab schwab@suse.de Reported-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Vasily Averin vvs@virtuozzo.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Waiman Long longman@redhat.com Cc: NeilBrown neilb@suse.com Cc: Steven Rostedt rostedt@goodmis.org Cc: Ingo Molnar mingo@redhat.com Cc: Peter Oberparleiter oberpar@linux.ibm.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Manfred Spraul manfred@colorfullife.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/4921fe9b-9385-a2b4-1dc4-1099be6d2e39@virtuozzo.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- ipc/util.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/ipc/util.c b/ipc/util.c index e65ecf3ccbdab..76d4afcde7bbb 100644 --- a/ipc/util.c +++ b/ipc/util.c @@ -751,21 +751,21 @@ static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos, total++; }
- *new_pos = pos + 1; + ipc = NULL; if (total >= ids->in_use) - return NULL; + goto out;
for (; pos < IPCMNI; pos++) { ipc = idr_find(&ids->ipcs_idr, pos); if (ipc != NULL) { rcu_read_lock(); ipc_lock_object(ipc); - return ipc; + break; } } - - /* Out of range - return NULL to terminate iteration */ - return NULL; +out: + *new_pos = pos + 1; + return ipc; }
static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
From: Grace Kao grace.kao@intel.com
[ Upstream commit 69388e15f5078c961b9e5319e22baea4c57deff1 ]
According to Braswell NDA Specification Update (#557593), concurrent read accesses may result in returning 0xffffffff and write instructions may be dropped. We have an established format for the commit references, i.e. cdca06e4e859 ("pinctrl: baytrail: Add missing spinlock usage in byt_gpio_irq_handler")
Fixes: 0bd50d719b00 ("pinctrl: cherryview: prevent concurrent access to GPIO controllers") Signed-off-by: Grace Kao grace.kao@intel.com Reported-by: Brian Norris briannorris@chromium.org Reviewed-by: Brian Norris briannorris@chromium.org Acked-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/intel/pinctrl-cherryview.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c index e8c08eb975301..d1a99b2e2d4c3 100644 --- a/drivers/pinctrl/intel/pinctrl-cherryview.c +++ b/drivers/pinctrl/intel/pinctrl-cherryview.c @@ -1509,11 +1509,15 @@ static void chv_gpio_irq_handler(struct irq_desc *desc) struct chv_pinctrl *pctrl = gpiochip_get_data(gc); struct irq_chip *chip = irq_desc_get_chip(desc); unsigned long pending; + unsigned long flags; u32 intr_line;
chained_irq_enter(chip, desc);
+ raw_spin_lock_irqsave(&chv_lock, flags); pending = readl(pctrl->regs + CHV_INTSTAT); + raw_spin_unlock_irqrestore(&chv_lock, flags); + for_each_set_bit(intr_line, &pending, pctrl->community->nirqs) { unsigned irq, offset;
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 37e31d2d26a4124506c24e95434e9baf3405a23a ]
The i40iw_arp_table() function can return -EOVERFLOW if i40iw_alloc_resource() fails so we can't just test for "== -1".
Fixes: 4e9042e647ff ("i40iw: add hw and utils files") Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Shiraz Saleem shiraz.saleem@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/i40iw/i40iw_hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_hw.c b/drivers/infiniband/hw/i40iw/i40iw_hw.c index 0c92a40b3e869..e4867d6de7893 100644 --- a/drivers/infiniband/hw/i40iw/i40iw_hw.c +++ b/drivers/infiniband/hw/i40iw/i40iw_hw.c @@ -479,7 +479,7 @@ void i40iw_manage_arp_cache(struct i40iw_device *iwdev, int arp_index;
arp_index = i40iw_arp_table(iwdev, ip_addr, ipv4, mac_addr, action); - if (arp_index == -1) + if (arp_index < 0) return; cqp_request = i40iw_get_cqp_request(&iwdev->cqp, false); if (!cqp_request)
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 2c407aca64977ede9b9f35158e919773cae2082f ]
gcc-10 warns around a suspicious access to an empty struct member:
net/netfilter/nf_conntrack_core.c: In function '__nf_conntrack_alloc': net/netfilter/nf_conntrack_core.c:1522:9: warning: array subscript 0 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[0]'} [-Wzero-length-bounds] 1522 | memset(&ct->__nfct_init_offset[0], 0, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from net/netfilter/nf_conntrack_core.c:37: include/net/netfilter/nf_conntrack.h:90:5: note: while referencing '__nfct_init_offset' 90 | u8 __nfct_init_offset[0]; | ^~~~~~~~~~~~~~~~~~
The code is correct but a bit unusual. Rework it slightly in a way that does not trigger the warning, using an empty struct instead of an empty array. There are probably more elegant ways to do this, but this is the smallest change.
Fixes: c41884ce0562 ("netfilter: conntrack: avoid zeroing timer") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_conntrack.h | 2 +- net/netfilter/nf_conntrack_core.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h index b57a9f37c297b..7befec513295d 100644 --- a/include/net/netfilter/nf_conntrack.h +++ b/include/net/netfilter/nf_conntrack.h @@ -103,7 +103,7 @@ struct nf_conn { struct hlist_node nat_bysource; #endif /* all members below initialized via memset */ - u8 __nfct_init_offset[0]; + struct { } __nfct_init_offset;
/* If we were expected by an expectation, this will be it */ struct nf_conn *master; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 1bdae8f188e1f..d507d0fc7858a 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -1124,9 +1124,9 @@ __nf_conntrack_alloc(struct net *net, *(unsigned long *)(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev) = hash; ct->status = 0; write_pnet(&ct->ct_net, net); - memset(&ct->__nfct_init_offset[0], 0, + memset(&ct->__nfct_init_offset, 0, offsetof(struct nf_conn, proto) - - offsetof(struct nf_conn, __nfct_init_offset[0])); + offsetof(struct nf_conn, __nfct_init_offset));
nf_ct_zone_add(ct, zone);
From: Jack Morgenstein jackm@dev.mellanox.co.il
[ Upstream commit 6693ca95bd4330a0ad7326967e1f9bcedd6b0800 ]
In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure.
Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: e622f2f4ad21 ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein jackm@dev.mellanox.co.il Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx4/qp.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 709d6491d2435..7284a9176844e 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -2307,6 +2307,7 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp, int send_size; int header_size; int spc; + int err; int i;
if (wr->wr.opcode != IB_WR_SEND) @@ -2341,7 +2342,9 @@ static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
sqp->ud_header.lrh.virtual_lane = 0; sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); - ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey); + if (err) + return err; sqp->ud_header.bth.pkey = cpu_to_be16(pkey); if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER) sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); @@ -2618,9 +2621,14 @@ static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr, } sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED); if (!sqp->qp.ibqp.qp_num) - ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, + &pkey); else - ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, &pkey); + err = ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, + &pkey); + if (err) + return err; + sqp->ud_header.bth.pkey = cpu_to_be16(pkey); sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn); sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
From: Jason Gunthorpe jgg@mellanox.com
commit 01b2bafe57b19d9119413f138765ef57990921ce upstream.
Aside from good practice, this avoids a warning from gcc 10:
./include/linux/kernel.h:997:3: warning: array subscript -31 is outside array bounds of ‘struct list_head[1]’ [-Warray-bounds] 997 | ((type *)(__mptr - offsetof(type, member))); }) | ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/list.h:493:2: note: in expansion of macro ‘container_of’ 493 | container_of(ptr, type, member) | ^~~~~~~~~~~~ ./include/linux/pnp.h:275:30: note: in expansion of macro ‘list_entry’ 275 | #define global_to_pnp_dev(n) list_entry(n, struct pnp_dev, global_list) | ^~~~~~~~~~ ./include/linux/pnp.h:281:11: note: in expansion of macro ‘global_to_pnp_dev’ 281 | (dev) != global_to_pnp_dev(&pnp_global); \ | ^~~~~~~~~~~~~~~~~ arch/x86/kernel/rtc.c:189:2: note: in expansion of macro ‘pnp_for_each_dev’ 189 | pnp_for_each_dev(dev) {
Because the common code doesn't cast the starting list_head to the containing struct.
Signed-off-by: Jason Gunthorpe jgg@mellanox.com [ rjw: Whitespace adjustments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/pnp.h | 29 +++++++++-------------------- 1 file changed, 9 insertions(+), 20 deletions(-)
--- a/include/linux/pnp.h +++ b/include/linux/pnp.h @@ -219,10 +219,8 @@ struct pnp_card { #define global_to_pnp_card(n) list_entry(n, struct pnp_card, global_list) #define protocol_to_pnp_card(n) list_entry(n, struct pnp_card, protocol_list) #define to_pnp_card(n) container_of(n, struct pnp_card, dev) -#define pnp_for_each_card(card) \ - for((card) = global_to_pnp_card(pnp_cards.next); \ - (card) != global_to_pnp_card(&pnp_cards); \ - (card) = global_to_pnp_card((card)->global_list.next)) +#define pnp_for_each_card(card) \ + list_for_each_entry(card, &pnp_cards, global_list)
struct pnp_card_link { struct pnp_card *card; @@ -275,14 +273,9 @@ struct pnp_dev { #define card_to_pnp_dev(n) list_entry(n, struct pnp_dev, card_list) #define protocol_to_pnp_dev(n) list_entry(n, struct pnp_dev, protocol_list) #define to_pnp_dev(n) container_of(n, struct pnp_dev, dev) -#define pnp_for_each_dev(dev) \ - for((dev) = global_to_pnp_dev(pnp_global.next); \ - (dev) != global_to_pnp_dev(&pnp_global); \ - (dev) = global_to_pnp_dev((dev)->global_list.next)) -#define card_for_each_dev(card,dev) \ - for((dev) = card_to_pnp_dev((card)->devices.next); \ - (dev) != card_to_pnp_dev(&(card)->devices); \ - (dev) = card_to_pnp_dev((dev)->card_list.next)) +#define pnp_for_each_dev(dev) list_for_each_entry(dev, &pnp_global, global_list) +#define card_for_each_dev(card, dev) \ + list_for_each_entry(dev, &(card)->devices, card_list) #define pnp_dev_name(dev) (dev)->name
static inline void *pnp_get_drvdata(struct pnp_dev *pdev) @@ -436,14 +429,10 @@ struct pnp_protocol { };
#define to_pnp_protocol(n) list_entry(n, struct pnp_protocol, protocol_list) -#define protocol_for_each_card(protocol,card) \ - for((card) = protocol_to_pnp_card((protocol)->cards.next); \ - (card) != protocol_to_pnp_card(&(protocol)->cards); \ - (card) = protocol_to_pnp_card((card)->protocol_list.next)) -#define protocol_for_each_dev(protocol,dev) \ - for((dev) = protocol_to_pnp_dev((protocol)->devices.next); \ - (dev) != protocol_to_pnp_dev(&(protocol)->devices); \ - (dev) = protocol_to_pnp_dev((dev)->protocol_list.next)) +#define protocol_for_each_card(protocol, card) \ + list_for_each_entry(card, &(protocol)->cards, protocol_list) +#define protocol_for_each_dev(protocol, dev) \ + list_for_each_entry(dev, &(protocol)->devices, protocol_list)
extern struct bus_type pnp_bus_type;
From: Linus Torvalds torvalds@linux-foundation.org
commit 9d82973e032e246ff5663c9805fbb5407ae932e3 upstream.
Due to a bug-report that was compiler-dependent, I updated one of my machines to gcc-10. That shows a lot of new warnings. Happily they seem to be mostly the valid kind, but it's going to cause a round of churn for getting rid of them..
This is the really low-hanging fruit of removing a couple of zero-sized arrays in some core code. We have had a round of these patches before, and we'll have many more coming, and there is nothing special about these except that they were particularly trivial, and triggered more warnings than most.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/linux/fs.h | 2 +- include/linux/tty.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -931,7 +931,7 @@ struct file_handle { __u32 handle_bytes; int handle_type; /* file identifier */ - unsigned char f_handle[0]; + unsigned char f_handle[]; };
static inline struct file *get_file(struct file *f) --- a/include/linux/tty.h +++ b/include/linux/tty.h @@ -64,7 +64,7 @@ struct tty_buffer { int read; int flags; /* Data points here */ - unsigned long data[0]; + unsigned long data[]; };
/* Values for .flags field of tty_buffer */
From: Masahiro Yamada yamada.masahiro@socionext.com
commit b303c6df80c9f8f13785aa83a0471fca7e38b24d upstream.
Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched various false positives:
- commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os") turned off this option for -Os.
- commit 815eb71e7149 ("Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for CONFIG_PROFILE_ALL_BRANCHES
- commit a76bcf557ef4 ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"") turned off this option for GCC < 4.9 Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903
I think this looks better by shifting the logic from Makefile to Kconfig.
Link: https://github.com/ClangBuiltLinux/linux/issues/350 Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Tested-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 11 ++++------- init/Kconfig | 17 +++++++++++++++++ kernel/trace/Kconfig | 1 + 3 files changed, 22 insertions(+), 7 deletions(-)
--- a/Makefile +++ b/Makefile @@ -658,17 +658,14 @@ KBUILD_CFLAGS += $(call cc-option,-fdata endif
ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE -KBUILD_CFLAGS += -Os $(call cc-disable-warning,maybe-uninitialized,) -else -ifdef CONFIG_PROFILE_ALL_BRANCHES -KBUILD_CFLAGS += -O2 $(call cc-disable-warning,maybe-uninitialized,) +KBUILD_CFLAGS += -Os else KBUILD_CFLAGS += -O2 endif -endif
-KBUILD_CFLAGS += $(call cc-ifversion, -lt, 0409, \ - $(call cc-disable-warning,maybe-uninitialized,)) +ifdef CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED +KBUILD_CFLAGS += -Wno-maybe-uninitialized +endif
# Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) --- a/init/Kconfig +++ b/init/Kconfig @@ -16,6 +16,22 @@ config DEFCONFIG_LIST default "$ARCH_DEFCONFIG" default "arch/$ARCH/defconfig"
+config CC_HAS_WARN_MAYBE_UNINITIALIZED + def_bool $(cc-option,-Wmaybe-uninitialized) + help + GCC >= 4.7 supports this option. + +config CC_DISABLE_WARN_MAYBE_UNINITIALIZED + bool + depends on CC_HAS_WARN_MAYBE_UNINITIALIZED + default CC_IS_GCC && GCC_VERSION < 40900 # unreliable for GCC < 4.9 + help + GCC's -Wmaybe-uninitialized is not reliable by definition. + Lots of false positive warnings are produced in some cases. + + If this option is enabled, -Wno-maybe-uninitialzed is passed + to the compiler to suppress maybe-uninitialized warnings. + config CONSTRUCTORS bool depends on !UML @@ -1333,6 +1349,7 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
config CC_OPTIMIZE_FOR_SIZE bool "Optimize for size" + imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help Enabling this option will pass "-Os" instead of "-O2" to your compiler resulting in a smaller kernel. --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -342,6 +342,7 @@ config PROFILE_ANNOTATED_BRANCHES config PROFILE_ALL_BRANCHES bool "Profile all if conditionals" select TRACE_BRANCH_PROFILING + imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help This tracer profiles all branch conditions. Every if () taken in the kernel is recorded whether it hit or miss.
From: Linus Torvalds torvalds@linux-foundation.org
commit 78a5255ffb6a1af189a83e493d916ba1c54d8c75 upstream.
We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't.
For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).
And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did.
At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings.
So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123".
Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler.
That's currently not the world we live in, though.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 7 +++---- init/Kconfig | 17 ----------------- kernel/trace/Kconfig | 1 - 3 files changed, 3 insertions(+), 22 deletions(-)
--- a/Makefile +++ b/Makefile @@ -663,10 +663,6 @@ else KBUILD_CFLAGS += -O2 endif
-ifdef CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED -KBUILD_CFLAGS += -Wno-maybe-uninitialized -endif - # Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
@@ -801,6 +797,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni # disable stringop warnings in gcc 8+ KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)
+# Enabled with W=2, disabled by default as noisy +KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized) + # disable invalid "can't wrap" optimizations for signed / pointers KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
--- a/init/Kconfig +++ b/init/Kconfig @@ -16,22 +16,6 @@ config DEFCONFIG_LIST default "$ARCH_DEFCONFIG" default "arch/$ARCH/defconfig"
-config CC_HAS_WARN_MAYBE_UNINITIALIZED - def_bool $(cc-option,-Wmaybe-uninitialized) - help - GCC >= 4.7 supports this option. - -config CC_DISABLE_WARN_MAYBE_UNINITIALIZED - bool - depends on CC_HAS_WARN_MAYBE_UNINITIALIZED - default CC_IS_GCC && GCC_VERSION < 40900 # unreliable for GCC < 4.9 - help - GCC's -Wmaybe-uninitialized is not reliable by definition. - Lots of false positive warnings are produced in some cases. - - If this option is enabled, -Wno-maybe-uninitialzed is passed - to the compiler to suppress maybe-uninitialized warnings. - config CONSTRUCTORS bool depends on !UML @@ -1349,7 +1333,6 @@ config CC_OPTIMIZE_FOR_PERFORMANCE
config CC_OPTIMIZE_FOR_SIZE bool "Optimize for size" - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help Enabling this option will pass "-Os" instead of "-O2" to your compiler resulting in a smaller kernel. --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -342,7 +342,6 @@ config PROFILE_ANNOTATED_BRANCHES config PROFILE_ALL_BRANCHES bool "Profile all if conditionals" select TRACE_BRANCH_PROFILING - imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives help This tracer profiles all branch conditions. Every if () taken in the kernel is recorded whether it hit or miss.
From: Florian Fainelli f.fainelli@gmail.com
commit 55f53567afe5f0cd2fd9e006b174c08c31c466f8 upstream.
Our statistics strings are allocated at initialization without being bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using memcpy() which would create out of bounds accesses, this was flagged by KASAN. Replace this with strlcpy() to make sure we are bound the source buffer size and we also always NUL-terminate strings.
Fixes: 2b2427d06426 ("phy: micrel: Add ethtool statistics counters") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/phy/micrel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -677,8 +677,8 @@ static void kszphy_get_strings(struct ph int i;
for (i = 0; i < ARRAY_SIZE(kszphy_hw_stats); i++) { - memcpy(data + i * ETH_GSTRING_LEN, - kszphy_hw_stats[i].string, ETH_GSTRING_LEN); + strlcpy(data + i * ETH_GSTRING_LEN, + kszphy_hw_stats[i].string, ETH_GSTRING_LEN); } }
From: Linus Torvalds torvalds@linux-foundation.org
commit 1a263ae60b04de959d9ce9caea4889385eefcc7b upstream.
gcc-10 has started warning about conflicting types for a few new built-in functions, particularly 'free()'.
This results in warnings like:
crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch]
because the crypto layer had its local freeing functions called 'free()'.
Gcc-10 is in the wrong here, since that function is marked 'static', and thus there is no chance of confusion with any standard library function namespace.
But the simplest thing to do is to just use a different name here, and avoid this gcc mis-feature.
[ Side note: gcc knowing about 'free()' is in itself not the mis-feature: the semantics of 'free()' are special enough that a compiler can validly do special things when seeing it.
So the mis-feature here is that gcc thinks that 'free()' is some restricted name, and you can't shadow it as a local static function.
Making the special 'free()' semantics be a function attribute rather than tied to the name would be the much better model ]
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/lrw.c | 4 ++-- crypto/xts.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/crypto/lrw.c +++ b/crypto/lrw.c @@ -377,7 +377,7 @@ out_put_alg: return inst; }
-static void free(struct crypto_instance *inst) +static void free_inst(struct crypto_instance *inst) { crypto_drop_spawn(crypto_instance_ctx(inst)); kfree(inst); @@ -386,7 +386,7 @@ static void free(struct crypto_instance static struct crypto_template crypto_tmpl = { .name = "lrw", .alloc = alloc, - .free = free, + .free = free_inst, .module = THIS_MODULE, };
--- a/crypto/xts.c +++ b/crypto/xts.c @@ -329,7 +329,7 @@ out_put_alg: return inst; }
-static void free(struct crypto_instance *inst) +static void free_inst(struct crypto_instance *inst) { crypto_drop_spawn(crypto_instance_ctx(inst)); kfree(inst); @@ -338,7 +338,7 @@ static void free(struct crypto_instance static struct crypto_template crypto_tmpl = { .name = "xts", .alloc = alloc, - .free = free, + .free = free_inst, .module = THIS_MODULE, };
From: Linus Torvalds torvalds@linux-foundation.org
commit 5c45de21a2223fe46cf9488c99a7fbcf01527670 upstream.
This is a fine warning, but we still have a number of zero-length arrays in the kernel that come from the traditional gcc extension. Yes, they are getting converted to flexible arrays, but in the meantime the gcc-10 warning about zero-length bounds is very verbose, and is hiding other issues.
I missed one actual build failure because it was hidden among hundreds of lines of warning. Thankfully I caught it on the second go before pushing things out, but it convinced me that I really need to disable the new warnings for now.
We'll hopefully be all done with our conversion to flexible arrays in the not too distant future, and we can then re-enable this warning.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 3 +++ 1 file changed, 3 insertions(+)
--- a/Makefile +++ b/Makefile @@ -797,6 +797,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni # disable stringop warnings in gcc 8+ KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)
+# We'll want to enable this eventually, but it's not going away for 5.7 at least +KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) + # Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit 44720996e2d79e47d508b0abe99b931a726a3197 upstream.
This is another fine warning, related to the 'zero-length-bounds' one, but hitting the same historical code in the kernel.
Because C didn't historically support flexible array members, we have code that instead uses a one-sized array, the same way we have cases of zero-sized arrays.
The one-sized arrays come from either not wanting to use the gcc zero-sized array extension, or from a slight convenience-feature, where particularly for strings, the size of the structure now includes the allocation for the final NUL character.
So with a "char name[1];" at the end of a structure, you can do things like
v = my_malloc(sizeof(struct vendor) + strlen(name));
and avoid the "+1" for the terminator.
Yes, the modern way to do that is with a flexible array, and using 'offsetof()' instead of 'sizeof()', and adding the "+1" by hand. That also technically gets the size "more correct" in that it avoids any alignment (and thus padding) issues, but this is another long-term cleanup thing that will not happen for 5.7.
So disable the warning for now, even though it's potentially quite useful. Having a slew of warnings that then hide more urgent new issues is not an improvement.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -799,6 +799,7 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# We'll want to enable this eventually, but it's not going away for 5.7 at least KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) +KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds)
# Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit 5a76021c2eff7fcf2f0918a08fd8a37ce7922921 upstream.
This is the final array bounds warning removal for gcc-10 for now.
Again, the warning is good, and we should re-enable all these warnings when we have converted all the legacy array declaration cases to flexible arrays. But in the meantime, it's just noise.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -800,6 +800,7 @@ KBUILD_CFLAGS += $(call cc-disable-warni # We'll want to enable this eventually, but it's not going away for 5.7 at least KBUILD_CFLAGS += $(call cc-disable-warning, zero-length-bounds) KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds) +KBUILD_CFLAGS += $(call cc-disable-warning, stringop-overflow)
# Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Linus Torvalds torvalds@linux-foundation.org
commit adc71920969870dfa54e8f40dac8616284832d02 upstream.
gcc-10 now warns about passing aliasing pointers to functions that take restricted pointers.
That's actually a great warning, and if we ever start using 'restrict' in the kernel, it might be quite useful. But right now we don't, and it turns out that the only thing this warns about is an idiom where we have declared a few functions to be "printf-like" (which seems to make gcc pick up the restricted pointer thing), and then we print to the same buffer that we also use as an input.
And people do that as an odd concatenation pattern, with code like this:
#define sysfs_show_gen_prop(buffer, fmt, ...) \ snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)
where we have 'buffer' as both the destination of the final result, and as the initial argument.
Yes, it's a bit questionable. And outside of the kernel, people do have standard declarations like
int snprintf( char *restrict buffer, size_t bufsz, const char *restrict format, ... );
where that output buffer is marked as a restrict pointer that cannot alias with any other arguments.
But in the context of the kernel, that 'use snprintf() to concatenate to the end result' does work, and the pattern shows up in multiple places. And we have not marked our own version of snprintf() as taking restrict pointers, so the warning is incorrect for now, and gcc picks it up on its own.
If we do start using 'restrict' in the kernel (and it might be a good idea if people find places where it matters), we'll need to figure out how to avoid this issue for snprintf and friends. But in the meantime, this warning is not useful.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 3 +++ 1 file changed, 3 insertions(+)
--- a/Makefile +++ b/Makefile @@ -802,6 +802,9 @@ KBUILD_CFLAGS += $(call cc-disable-warni KBUILD_CFLAGS += $(call cc-disable-warning, array-bounds) KBUILD_CFLAGS += $(call cc-disable-warning, stringop-overflow)
+# Another good warning that we'll want to enable eventually +KBUILD_CFLAGS += $(call cc-disable-warning, restrict) + # Enabled with W=2, disabled by default as noisy KBUILD_CFLAGS += $(call cc-disable-warning, maybe-uninitialized)
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit dd912306ff008891c82cd9f63e8181e47a9cb2fb ]
syzbot managed to trigger a recursive NETDEV_FEAT_CHANGE event between bonding master and slave. I managed to find a reproducer for this:
ip li set bond0 up ifenslave bond0 eth0 brctl addbr br0 ethtool -K eth0 lro off brctl addif br0 bond0 ip li set br0 up
When a NETDEV_FEAT_CHANGE event is triggered on a bonding slave, it captures this and calls bond_compute_features() to fixup its master's and other slaves' features. However, when syncing with its lower devices by netdev_sync_lower_features() this event is triggered again on slaves when the LRO feature fails to change, so it goes back and forth recursively until the kernel stack is exhausted.
Commit 17b85d29e82c intentionally lets __netdev_update_features() return -1 for such a failure case, so we have to just rely on the existing check inside netdev_sync_lower_features() and skip NETDEV_FEAT_CHANGE event only for this specific failure case.
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Reported-by: syzbot+e73ceacfd8560cc8a3ca@syzkaller.appspotmail.com Reported-by: syzbot+c2fb6f9ddcea95ba49b5@syzkaller.appspotmail.com Cc: Jarod Wilson jarod@redhat.com Cc: Nikolay Aleksandrov nikolay@cumulusnetworks.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jann Horn jannh@google.com Reviewed-by: Jay Vosburgh jay.vosburgh@canonical.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Acked-by: Nikolay Aleksandrov nikolay@cumulusnetworks.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -6939,11 +6939,13 @@ static void netdev_sync_lower_features(s netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", &feature, lower->name); lower->wanted_features &= ~feature; - netdev_update_features(lower); + __netdev_update_features(lower);
if (unlikely(lower->features & feature)) netdev_WARN(upper, "failed to disable %pNF on %s!\n", &feature, lower->name); + else + netdev_features_change(lower); } } }
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit eead1c2ea2509fd754c6da893a94f0e69e83ebe4 ]
The cipso and calipso code can set the MLS_CAT attribute on successful parsing, even if the corresponding catmap has not been allocated, as per current configuration and external input.
Later, selinux code tries to access the catmap if the MLS_CAT flag is present via netlbl_catmap_getlong(). That may cause null ptr dereference while processing incoming network traffic.
Address the issue setting the MLS_CAT flag only if the catmap is really allocated. Additionally let netlbl_catmap_getlong() cope with NULL catmap.
Reported-by: Matthew Sheets matthew.sheets@gd-ms.com Fixes: 4b8feff251da ("netlabel: fix the horribly broken catmap functions") Fixes: ceba1832b1b2 ("calipso: Set the calipso socket label to match the secattr.") Signed-off-by: Paolo Abeni pabeni@redhat.com Acked-by: Paul Moore paul@paul-moore.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/cipso_ipv4.c | 6 ++++-- net/ipv6/calipso.c | 3 ++- net/netlabel/netlabel_kapi.c | 6 ++++++ 3 files changed, 12 insertions(+), 3 deletions(-)
--- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -1272,7 +1272,8 @@ static int cipso_v4_parsetag_rbm(const s return ret_val; }
- secattr->flags |= NETLBL_SECATTR_MLS_CAT; + if (secattr->attr.mls.cat) + secattr->flags |= NETLBL_SECATTR_MLS_CAT; }
return 0; @@ -1453,7 +1454,8 @@ static int cipso_v4_parsetag_rng(const s return ret_val; }
- secattr->flags |= NETLBL_SECATTR_MLS_CAT; + if (secattr->attr.mls.cat) + secattr->flags |= NETLBL_SECATTR_MLS_CAT; }
return 0; --- a/net/ipv6/calipso.c +++ b/net/ipv6/calipso.c @@ -1061,7 +1061,8 @@ static int calipso_opt_getattr(const uns goto getattr_return; }
- secattr->flags |= NETLBL_SECATTR_MLS_CAT; + if (secattr->attr.mls.cat) + secattr->flags |= NETLBL_SECATTR_MLS_CAT; }
secattr->type = NETLBL_NLTYPE_CALIPSO; --- a/net/netlabel/netlabel_kapi.c +++ b/net/netlabel/netlabel_kapi.c @@ -748,6 +748,12 @@ int netlbl_catmap_getlong(struct netlbl_ if ((off & (BITS_PER_LONG - 1)) != 0) return -EINVAL;
+ /* a null catmap is equivalent to an empty one */ + if (!catmap) { + *offset = (u32)-1; + return 0; + } + if (off < catmap->startbit) { off = catmap->startbit; *offset = off;
From: "Maciej Żenczykowski" maze@google.com
[ Upstream commit 09454fd0a4ce23cb3d8af65066c91a1bf27120dd ]
This reverts commit 19bda36c4299ce3d7e5bce10bebe01764a655a6d:
| ipv6: add mtu lock check in __ip6_rt_update_pmtu | | Prior to this patch, ipv6 didn't do mtu lock check in ip6_update_pmtu. | It leaded to that mtu lock doesn't really work when receiving the pkt | of ICMPV6_PKT_TOOBIG. | | This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4 | did in __ip_rt_update_pmtu.
The above reasoning is incorrect. IPv6 *requires* icmp based pmtu to work. There's already a comment to this effect elsewhere in the kernel:
$ git grep -p -B1 -A3 'RTAX_MTU lock' net/ipv6/route.c=4813=
static int rt6_mtu_change_route(struct fib6_info *f6i, void *p_arg) ... /* In IPv6 pmtu discovery is not optional, so that RTAX_MTU lock cannot disable it. We still use this lock to block changes caused by addrconf/ndisc. */
This reverts to the pre-4.9 behaviour.
Cc: Eric Dumazet edumazet@google.com Cc: Willem de Bruijn willemb@google.com Cc: Xin Long lucien.xin@gmail.com Cc: Hannes Frederic Sowa hannes@stressinduktion.org Signed-off-by: Maciej Żenczykowski maze@google.com Fixes: 19bda36c4299 ("ipv6: add mtu lock check in __ip6_rt_update_pmtu") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/route.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1373,8 +1373,10 @@ static void __ip6_rt_update_pmtu(struct { struct rt6_info *rt6 = (struct rt6_info *)dst;
- if (dst_metric_locked(dst, RTAX_MTU)) - return; + /* Note: do *NOT* check dst_metric_locked(dst, RTAX_MTU) + * IPv6 pmtu discovery isn't optional, so 'mtu lock' cannot disable it. + * [see also comment in rt6_mtu_change_route()] + */
dst_confirm(dst); mtu = max_t(u32, mtu, IPV6_MIN_MTU);
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 57644431a6c2faac5d754ebd35780cf43a531b1a ]
In commit b406472b5ad7 ("net: ipv4: avoid mixed n_redirects and rate_tokens usage") I missed the fact that a 0 'rate_tokens' will bypass the backoff algorithm.
Since rate_tokens is cleared after a redirect silence, and never incremented on redirects, if the host keeps receiving packets requiring redirect it will reply ignoring the backoff.
Additionally, the 'rate_last' field will be updated with the cadence of the ingress packet requiring redirect. If that rate is high enough, that will prevent the host from generating any other kind of ICMP messages
The check for a zero 'rate_tokens' value was likely a shortcut to avoid the more complex backoff algorithm after a redirect silence period. Address the issue checking for 'n_redirects' instead, which is incremented on successful redirect, and does not interfere with other ICMP replies.
Fixes: b406472b5ad7 ("net: ipv4: avoid mixed n_redirects and rate_tokens usage") Reported-and-tested-by: Colin Walters walters@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/route.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -898,7 +898,7 @@ void ip_rt_send_redirect(struct sk_buff /* Check for load limit; set rate_last to the latest sent * redirect. */ - if (peer->rate_tokens == 0 || + if (peer->n_redirects == 0 || time_after(jiffies, (peer->rate_last + (ip_rt_redirect_load << peer->n_redirects)))) {
From: Zefan Li lizefan@huawei.com
[ Upstream commit 090e28b229af92dc5b40786ca673999d59e73056 ]
If systemd is configured to use hybrid mode which enables the use of both cgroup v1 and v2, systemd will create new cgroup on both the default root (v2) and netprio_cgroup hierarchy (v1) for a new session and attach task to the two cgroups. If the task does some network thing then the v2 cgroup can never be freed after the session exited.
One of our machines ran into OOM due to this memory leak.
In the scenario described above when sk_alloc() is called cgroup_sk_alloc() thought it's in v2 mode, so it stores the cgroup pointer in sk->sk_cgrp_data and increments the cgroup refcnt, but then sock_update_netprioidx() thought it's in v1 mode, so it stores netprioidx value in sk->sk_cgrp_data, so the cgroup refcnt will never be freed.
Currently we do the mode switch when someone writes to the ifpriomap cgroup control file. The easiest fix is to also do the switch when a task is attached to a new cgroup.
Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Reported-by: Yang Yingliang yangyingliang@huawei.com Tested-by: Yang Yingliang yangyingliang@huawei.com Signed-off-by: Zefan Li lizefan@huawei.com Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/netprio_cgroup.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/core/netprio_cgroup.c +++ b/net/core/netprio_cgroup.c @@ -237,6 +237,8 @@ static void net_prio_attach(struct cgrou struct task_struct *p; struct cgroup_subsys_state *css;
+ cgroup_sk_alloc_disable(); + cgroup_taskset_for_each(p, css, tset) { void *v = (void *)(unsigned long)css->cgroup->id;
From: Takashi Iwai tiwai@suse.de
commit b590b38ca305d6d7902ec7c4f7e273e0069f3bcc upstream.
Lenovo Thinkpad T530 seems to have a sensitive internal mic capture that needs to limit the mic boost like a few other Thinkpad models. Although we may change the quirk for ALC269_FIXUP_LENOVO_DOCK, this hits way too many other laptop models, so let's add a new fixup model that limits the internal mic boost on top of the existing quirk and apply to only T530.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1171293 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200514160533.10337-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4890,6 +4890,7 @@ enum { ALC269_FIXUP_HP_LINE1_MIC1_LED, ALC269_FIXUP_INV_DMIC, ALC269_FIXUP_LENOVO_DOCK, + ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, ALC269_FIXUP_NO_SHUTUP, ALC286_FIXUP_SONY_MIC_NO_PRESENCE, ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT, @@ -5157,6 +5158,12 @@ static const struct hda_fixup alc269_fix .chained = true, .chain_id = ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT }, + [ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc269_fixup_limit_int_mic_boost, + .chained = true, + .chain_id = ALC269_FIXUP_LENOVO_DOCK, + }, [ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT] = { .type = HDA_FIXUP_FUNC, .v.func = alc269_fixup_pincfg_no_hp_to_lineout, @@ -5820,7 +5827,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x21b8, "Thinkpad Edge 14", ALC269_FIXUP_SKU_IGNORE), SND_PCI_QUIRK(0x17aa, 0x21ca, "Thinkpad L412", ALC269_FIXUP_SKU_IGNORE), SND_PCI_QUIRK(0x17aa, 0x21e9, "Thinkpad Edge 15", ALC269_FIXUP_SKU_IGNORE), - SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK), + SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST), SND_PCI_QUIRK(0x17aa, 0x21fa, "Thinkpad X230", ALC269_FIXUP_LENOVO_DOCK), SND_PCI_QUIRK(0x17aa, 0x21f3, "Thinkpad T430", ALC269_FIXUP_LENOVO_DOCK), SND_PCI_QUIRK(0x17aa, 0x21fb, "Thinkpad T430s", ALC269_FIXUP_LENOVO_DOCK), @@ -5945,6 +5952,7 @@ static const struct hda_model_fixup alc2 {.id = ALC269_FIXUP_HEADSET_MODE, .name = "headset-mode"}, {.id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC, .name = "headset-mode-no-hp-mic"}, {.id = ALC269_FIXUP_LENOVO_DOCK, .name = "lenovo-dock"}, + {.id = ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, .name = "lenovo-dock-limit-boost"}, {.id = ALC269_FIXUP_HP_GPIO_LED, .name = "hp-gpio-led"}, {.id = ALC269_FIXUP_HP_DOCK_GPIO_MIC1_LED, .name = "hp-dock-gpio-mic1-led"}, {.id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, .name = "dell-headset-multi"},
From: Takashi Iwai tiwai@suse.de
commit 5a7b44a8df822e0667fc76ed7130252523993bda upstream.
syzbot reported the uninitialized value exposure in certain situations using virmidi loop. It's likely a very small race at writing and reading, and the influence is almost negligible. But it's safer to paper over this just by replacing the existing kvmalloc() with kvzalloc().
Reported-by: syzbot+194dffdb8b22fc5d207a@syzkaller.appspotmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/rawmidi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -125,7 +125,7 @@ static int snd_rawmidi_runtime_create(st runtime->avail = 0; else runtime->avail = runtime->buffer_size; - if ((runtime->buffer = kmalloc(runtime->buffer_size, GFP_KERNEL)) == NULL) { + if ((runtime->buffer = kzalloc(runtime->buffer_size, GFP_KERNEL)) == NULL) { kfree(runtime); return -ENOMEM; } @@ -650,7 +650,7 @@ int snd_rawmidi_output_params(struct snd return -EINVAL; } if (params->buffer_size != runtime->buffer_size) { - newbuf = kmalloc(params->buffer_size, GFP_KERNEL); + newbuf = kzalloc(params->buffer_size, GFP_KERNEL); if (!newbuf) return -ENOMEM; spin_lock_irq(&runtime->lock);
From: Takashi Iwai tiwai@suse.de
commit c1f6e3c818dd734c30f6a7eeebf232ba2cf3181d upstream.
The rawmidi core allows user to resize the runtime buffer via ioctl, and this may lead to UAF when performed during concurrent reads or writes: the read/write functions unlock the runtime lock temporarily during copying form/to user-space, and that's the race window.
This patch fixes the hole by introducing a reference counter for the runtime buffer read/write access and returns -EBUSY error when the resize is performed concurrently against read/write.
Note that the ref count field is a simple integer instead of refcount_t here, since the all contexts accessing the buffer is basically protected with a spinlock, hence we need no expensive atomic ops. Also, note that this busy check is needed only against read / write functions, and not in receive/transmit callbacks; the race can happen only at the spinlock hole mentioned in the above, while the whole function is protected for receive / transmit callbacks.
Reported-by: butt3rflyh4ck butterflyhuangxx@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CAFcO6XMWpUVK_yzzCpp8_XP7+=oUpQvuBeCbMffEDkpe8jWrf... Link: https://lore.kernel.org/r/s5heerw3r5z.wl-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/sound/rawmidi.h | 1 + sound/core/rawmidi.c | 31 +++++++++++++++++++++++++++---- 2 files changed, 28 insertions(+), 4 deletions(-)
--- a/include/sound/rawmidi.h +++ b/include/sound/rawmidi.h @@ -76,6 +76,7 @@ struct snd_rawmidi_runtime { size_t avail_min; /* min avail for wakeup */ size_t avail; /* max used buffer for wakeup */ size_t xruns; /* over/underruns counter */ + int buffer_ref; /* buffer reference count */ /* misc */ spinlock_t lock; wait_queue_head_t sleep; --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -108,6 +108,17 @@ static void snd_rawmidi_input_event_work runtime->event(runtime->substream); }
+/* buffer refcount management: call with runtime->lock held */ +static inline void snd_rawmidi_buffer_ref(struct snd_rawmidi_runtime *runtime) +{ + runtime->buffer_ref++; +} + +static inline void snd_rawmidi_buffer_unref(struct snd_rawmidi_runtime *runtime) +{ + runtime->buffer_ref--; +} + static int snd_rawmidi_runtime_create(struct snd_rawmidi_substream *substream) { struct snd_rawmidi_runtime *runtime; @@ -654,6 +665,11 @@ int snd_rawmidi_output_params(struct snd if (!newbuf) return -ENOMEM; spin_lock_irq(&runtime->lock); + if (runtime->buffer_ref) { + spin_unlock_irq(&runtime->lock); + kfree(newbuf); + return -EBUSY; + } oldbuf = runtime->buffer; runtime->buffer = newbuf; runtime->buffer_size = params->buffer_size; @@ -962,8 +978,10 @@ static long snd_rawmidi_kernel_read1(str long result = 0, count1; struct snd_rawmidi_runtime *runtime = substream->runtime; unsigned long appl_ptr; + int err = 0;
spin_lock_irqsave(&runtime->lock, flags); + snd_rawmidi_buffer_ref(runtime); while (count > 0 && runtime->avail) { count1 = runtime->buffer_size - runtime->appl_ptr; if (count1 > count) @@ -982,16 +1000,19 @@ static long snd_rawmidi_kernel_read1(str if (userbuf) { spin_unlock_irqrestore(&runtime->lock, flags); if (copy_to_user(userbuf + result, - runtime->buffer + appl_ptr, count1)) { - return result > 0 ? result : -EFAULT; - } + runtime->buffer + appl_ptr, count1)) + err = -EFAULT; spin_lock_irqsave(&runtime->lock, flags); + if (err) + goto out; } result += count1; count -= count1; } + out: + snd_rawmidi_buffer_unref(runtime); spin_unlock_irqrestore(&runtime->lock, flags); - return result; + return result > 0 ? result : err; }
long snd_rawmidi_kernel_read(struct snd_rawmidi_substream *substream, @@ -1262,6 +1283,7 @@ static long snd_rawmidi_kernel_write1(st return -EAGAIN; } } + snd_rawmidi_buffer_ref(runtime); while (count > 0 && runtime->avail > 0) { count1 = runtime->buffer_size - runtime->appl_ptr; if (count1 > count) @@ -1293,6 +1315,7 @@ static long snd_rawmidi_kernel_write1(st } __end: count1 = runtime->avail < runtime->buffer_size; + snd_rawmidi_buffer_unref(runtime); spin_unlock_irqrestore(&runtime->lock, flags); if (count1) snd_rawmidi_output_trigger(substream, 1);
From: Jesus Ramos jesus-ramos@live.com
commit 073919e09ca445d4486968e3f851372ff44cf2b5 upstream.
Kingston HyperX headset with 0951:16ad also needs the same quirk for delaying the frequency controls.
Signed-off-by: Jesus Ramos jesus-ramos@live.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/BY5PR19MB3634BA68C7CCA23D8DF428E796AF0@BY5PR19MB36... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/usb/quirks.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1316,13 +1316,14 @@ void snd_usb_ctl_msg_quirk(struct usb_de && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) mdelay(20);
- /* Zoom R16/24, Logitech H650e, Jabra 550a needs a tiny delay here, - * otherwise requests like get/set frequency return as failed despite - * actually succeeding. + /* Zoom R16/24, Logitech H650e, Jabra 550a, Kingston HyperX needs a tiny + * delay here, otherwise requests like get/set frequency return as + * failed despite actually succeeding. */ if ((chip->usb_id == USB_ID(0x1686, 0x00dd) || chip->usb_id == USB_ID(0x046d, 0x0a46) || - chip->usb_id == USB_ID(0x0b0e, 0x0349)) && + chip->usb_id == USB_ID(0x0b0e, 0x0349) || + chip->usb_id == USB_ID(0x0951, 0x16ad)) && (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS) mdelay(1); }
From: Kyungtae Kim kt0755@gmail.com
commit 15753588bcd4bbffae1cca33c8ced5722477fe1f upstream.
FuzzUSB (a variant of syzkaller) found an illegal array access using an incorrect index while binding a gadget with UDC.
Reference: https://www.spinics.net/lists/linux-usb/msg194331.html
This bug occurs when a size variable used for a buffer is misused to access its strcpy-ed buffer. Given a buffer along with its size variable (taken from user input), from which, a new buffer is created using kstrdup(). Due to the original buffer containing 0 value in the middle, the size of the kstrdup-ed buffer becomes smaller than that of the original. So accessing the kstrdup-ed buffer with the same size variable triggers memory access violation.
The fix makes sure no zero value in the buffer, by comparing the strlen() of the orignal buffer with the size variable, so that the access to the kstrdup-ed buffer is safe.
BUG: KASAN: slab-out-of-bounds in gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 Read of size 1 at addr ffff88806a55dd7e by task syz-executor.0/17208
CPU: 2 PID: 17208 Comm: syz-executor.0 Not tainted 5.6.8 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xce/0x128 lib/dump_stack.c:118 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374 __kasan_report+0x131/0x1b0 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:641 __asan_report_load1_noabort+0x14/0x20 mm/kasan/generic_report.c:132 gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 flush_write_buffer fs/configfs/file.c:251 [inline] configfs_write_file+0x2f1/0x4c0 fs/configfs/file.c:283 __vfs_write+0x85/0x110 fs/read_write.c:494 vfs_write+0x1cd/0x510 fs/read_write.c:558 ksys_write+0x18a/0x220 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:620 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Kyungtae Kim kt0755@gmail.com Reported-and-tested-by: Kyungtae Kim kt0755@gmail.com Cc: Felipe Balbi balbi@kernel.org Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20200510054326.GA19198@pizza01 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/configfs.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/gadget/configfs.c +++ b/drivers/usb/gadget/configfs.c @@ -259,6 +259,9 @@ static ssize_t gadget_dev_desc_UDC_store char *name; int ret;
+ if (strlen(page) < len) + return -EOVERFLOW; + name = kstrdup(page, GFP_KERNEL); if (!name) return -ENOMEM;
From: Sriharsha Allenki sallenki@codeaurora.org
commit 3c6f8cb92c9178fc0c66b580ea3df1fa3ac1155a upstream.
On platforms with IOMMU enabled, multiple SGs can be coalesced into one by the IOMMU driver. In that case the SG list processing as part of the completion of a urb on a bulk endpoint can result into a NULL pointer dereference with the below stack dump.
<6> Unable to handle kernel NULL pointer dereference at virtual address 0000000c <6> pgd = c0004000 <6> [0000000c] *pgd=00000000 <6> Internal error: Oops: 5 [#1] PREEMPT SMP ARM <2> PC is at xhci_queue_bulk_tx+0x454/0x80c <2> LR is at xhci_queue_bulk_tx+0x44c/0x80c <2> pc : [<c08907c4>] lr : [<c08907bc>] psr: 000000d3 <2> sp : ca337c80 ip : 00000000 fp : ffffffff <2> r10: 00000000 r9 : 50037000 r8 : 00004000 <2> r7 : 00000000 r6 : 00004000 r5 : 00000000 r4 : 00000000 <2> r3 : 00000000 r2 : 00000082 r1 : c2c1a200 r0 : 00000000 <2> Flags: nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none <2> Control: 10c0383d Table: b412c06a DAC: 00000051 <6> Process usb-storage (pid: 5961, stack limit = 0xca336210) <snip> <2> [<c08907c4>] (xhci_queue_bulk_tx) <2> [<c0881b3c>] (xhci_urb_enqueue) <2> [<c0831068>] (usb_hcd_submit_urb) <2> [<c08350b4>] (usb_sg_wait) <2> [<c089f384>] (usb_stor_bulk_transfer_sglist) <2> [<c089f2c0>] (usb_stor_bulk_srb) <2> [<c089fe38>] (usb_stor_Bulk_transport) <2> [<c089f468>] (usb_stor_invoke_transport) <2> [<c08a11b4>] (usb_stor_control_thread) <2> [<c014a534>] (kthread)
The above NULL pointer dereference is the result of block_len and the sent_len set to zero after the first SG of the list when IOMMU driver is enabled. Because of this the loop of processing the SGs has run more than num_sgs which resulted in a sg_next on the last SG of the list which has SG_END set.
Fix this by check for the sg before any attributes of the sg are accessed.
[modified reason for null pointer dereference in commit message subject -Mathias] Fixes: f9c589e142d04 ("xhci: TD-fragment, align the unsplittable case with a bounce buffer") Cc: stable@vger.kernel.org Signed-off-by: Sriharsha Allenki sallenki@codeaurora.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20200514110432.25564-2-mathias.nyman@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-ring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -3347,8 +3347,8 @@ int xhci_queue_bulk_tx(struct xhci_hcd * /* New sg entry */ --num_sgs; sent_len -= block_len; - if (num_sgs != 0) { - sg = sg_next(sg); + sg = sg_next(sg); + if (num_sgs != 0 && sg) { block_len = sg_dma_len(sg); addr = (u64) sg_dma_address(sg); addr += sent_len;
From: Fabio Estevam festevam@gmail.com
commit 0caf34350a25907515d929a9c77b9b206aac6d1e upstream.
The I2C2 pins are already used and the following errors are seen:
imx27-pinctrl 10015000.iomuxc: pin MX27_PAD_I2C2_SDA already requested by 10012000.i2c; cannot claim for 1001d000.i2c imx27-pinctrl 10015000.iomuxc: pin-69 (1001d000.i2c) status -22 imx27-pinctrl 10015000.iomuxc: could not request pin 69 (MX27_PAD_I2C2_SDA) from group i2c2grp on device 10015000.iomuxc imx-i2c 1001d000.i2c: Error applying setting, reverse things back imx-i2c: probe of 1001d000.i2c failed with error -22
Fix it by adding the correct I2C1 IOMUX entries for the pinctrl_i2c1 group.
Cc: stable@vger.kernel.org Fixes: 61664d0b432a ("ARM: dts: imx27 phyCARD-S pinctrl") Signed-off-by: Fabio Estevam festevam@gmail.com Reviewed-by: Stefan Riedmueller s.riedmueller@phytec.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts +++ b/arch/arm/boot/dts/imx27-phytec-phycard-s-rdk.dts @@ -81,8 +81,8 @@ imx27-phycard-s-rdk { pinctrl_i2c1: i2c1grp { fsl,pins = < - MX27_PAD_I2C2_SDA__I2C2_SDA 0x0 - MX27_PAD_I2C2_SCL__I2C2_SCL 0x0 + MX27_PAD_I2C_DATA__I2C_DATA 0x0 + MX27_PAD_I2C_CLK__I2C_CLK 0x0 >; };
From: Borislav Petkov bp@suse.de
commit a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e upstream.
... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value.
The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call.
To fix that, the initial attempt was to mark the one function which generates the stack canary with:
__attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options.
The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs.
The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm("").
This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc.
That should hold for a long time, but hey we said that about the other two solutions too so...
Reported-by: Sergei Trofimovich slyfox@gentoo.org Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Kalle Valo kvalo@codeaurora.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/include/asm/stackprotector.h | 7 ++++++- arch/x86/kernel/smpboot.c | 8 ++++++++ arch/x86/xen/smp.c | 1 + include/linux/compiler.h | 7 +++++++ init/main.c | 2 ++ 5 files changed, 24 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/stackprotector.h +++ b/arch/x86/include/asm/stackprotector.h @@ -54,8 +54,13 @@ /* * Initialize the stackprotector canary value. * - * NOTE: this must only be called from functions that never return, + * NOTE: this must only be called from functions that never return * and it must always be inlined. + * + * In addition, it should be called from a compilation unit for which + * stack protector is disabled. Alternatively, the caller should not end + * with a function call which gets tail-call optimized as that would + * lead to checking a modified canary value. */ static __always_inline void boot_init_stack_canary(void) { --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -249,6 +249,14 @@ static void notrace start_secondary(void
wmb(); cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); + + /* + * Prevent tail call to cpu_startup_entry() because the stack protector + * guard has been changed a couple of function calls up, in + * boot_init_stack_canary() and must not be checked before tail calling + * another function. + */ + prevent_tail_call_optimization(); }
/** --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -116,6 +116,7 @@ asmlinkage __visible void cpu_bringup_an #endif cpu_bringup(); cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); + prevent_tail_call_optimization(); }
void xen_smp_intr_free(unsigned int cpu) --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -605,4 +605,11 @@ unsigned long read_word_at_a_time(const # define __kprobes # define nokprobe_inline inline #endif + +/* + * This is needed in functions which generate the stack canary, see + * arch/x86/kernel/smpboot.c::start_secondary() for an example. + */ +#define prevent_tail_call_optimization() mb() + #endif /* __LINUX_COMPILER_H */ --- a/init/main.c +++ b/init/main.c @@ -662,6 +662,8 @@ asmlinkage __visible void __init start_k
/* Do the rest non-__init'ed, we're now alive */ rest_init(); + + prevent_tail_call_optimization(); }
/* Call all constructor functions linked into the kernel. */
From: Eric W. Biederman ebiederm@xmission.com
commit f87d1c9559164294040e58f5e3b74a162bf7c6e8 upstream.
I goofed when I added mm->user_ns support to would_dump. I missed the fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and binfmt_script bprm->file is reassigned. Which made the move of would_dump from setup_new_exec to __do_execve_file before exec_binprm incorrect as it can result in would_dump running on the script instead of the interpreter of the script.
The net result is that the code stopped making unreadable interpreters undumpable. Which allows them to be ptraced and written to disk without special permissions. Oops.
The move was necessary because the call in set_new_exec was after bprm->mm was no longer valid.
To correct this mistake move the misplaced would_dump from __do_execve_file into flos_old_exec, before exec_mmap is called.
I tested and confirmed that without this fix I can attach with gdb to a script with an unreadable interpreter, and with this fix I can not.
Cc: stable@vger.kernel.org Fixes: f84df2a6f268 ("exec: Ensure mm->user_ns contains the execed files") Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/exec.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/exec.c +++ b/fs/exec.c @@ -1270,6 +1270,8 @@ int flush_old_exec(struct linux_binprm * */ set_mm_exe_file(bprm->mm, bprm->file);
+ would_dump(bprm, bprm->file); + /* * Release all of the old mmap stuff */ @@ -1780,8 +1782,6 @@ static int do_execveat_common(int fd, st if (retval < 0) goto out;
- would_dump(bprm, bprm->file); - retval = exec_binprm(bprm); if (retval < 0) goto out;
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit ccaef7e6e354fb65758eaddd3eae8065a8b3e295 upstream.
'dev' is allocated in 'net2272_probe_init()'. It must be freed in the error handling path, as already done in the remove function (i.e. 'net2272_plat_remove()')
Fixes: 90fccb529d24 ("usb: gadget: Gadget directory cleanup - group UDC drivers") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/udc/net2272.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/gadget/udc/net2272.c +++ b/drivers/usb/gadget/udc/net2272.c @@ -2666,6 +2666,8 @@ net2272_plat_probe(struct platform_devic err_req: release_mem_region(base, len); err: + kfree(dev); + return ret; }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit 19b94c1f9c9a16d41a8de3ccbdb8536cf1aecdbf upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return an error code, not 0.
Fixes: 56023ce0fd70 ("usb: gadget: audio: allocate and init otg descriptor by otg capabilities") Reviewed-by: Peter Chen peter.chen@nxp.com Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/audio.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/audio.c +++ b/drivers/usb/gadget/legacy/audio.c @@ -249,8 +249,10 @@ static int audio_bind(struct usb_composi struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(cdev->gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail; + } usb_otg_descriptor_init(cdev->gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Wei Yongjun weiyongjun1@huawei.com
commit e27d4b30b71c66986196d8a1eb93cba9f602904a upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return a negative error code -ENOMEM, not 0.
Fixes: 1156e91dd7cc ("usb: gadget: ncm: allocate and init otg descriptor by otg capabilities") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/ncm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/ncm.c +++ b/drivers/usb/gadget/legacy/ncm.c @@ -162,8 +162,10 @@ static int gncm_bind(struct usb_composit struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Wei Yongjun weiyongjun1@huawei.com
commit e8f7f9e3499a6d96f7f63a4818dc7d0f45a7783b upstream.
If 'usb_otg_descriptor_alloc()' fails, we must return a negative error code -ENOMEM, not 0.
Fixes: ab6796ae9833 ("usb: gadget: cdc2: allocate and init otg descriptor by otg capabilities") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Wei Yongjun weiyongjun1@huawei.com Signed-off-by: Felipe Balbi balbi@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/legacy/cdc2.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/legacy/cdc2.c +++ b/drivers/usb/gadget/legacy/cdc2.c @@ -183,8 +183,10 @@ static int cdc_bind(struct usb_composite struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget); - if (!usb_desc) + if (!usb_desc) { + status = -ENOMEM; goto fail1; + } usb_otg_descriptor_init(gadget, usb_desc); otg_desc[0] = usb_desc; otg_desc[1] = NULL;
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit f41224efcf8aafe80ea47ac870c5e32f3209ffc8 upstream.
This reverts commit 3b36b13d5e69d6f51ff1c55d1b404a74646c9757.
Enable power save node breaks some systems with ACL225. Revert the patch and use a platform specific quirk for the original issue isntead.
Fixes: 3b36b13d5e69 ("ALSA: hda/realtek: Fix pop noise on ALC225") BugLink: https://bugs.launchpad.net/bugs/1875916 Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Link: https://lore.kernel.org/r/20200503152449.22761-1-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 2 -- 1 file changed, 2 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -4212,8 +4212,6 @@ static void alc_determine_headset_type(s is_ctia = (val & 0x1c02) == 0x1c02; break; case 0x10ec0225: - codec->power_save_node = 1; - /* fall through */ case 0x10ec0295: case 0x10ec0299: alc_process_coef_fw(codec, coef0225);
From: Geert Uytterhoeven geert+renesas@glider.be
commit 0f739fdfe9e5ce668bd6d3210f310df282321837 upstream.
The R-Mobile APE6 Compare Match Timer 1 generates 8 interrupts, one for each channel, but currently only 1 is described. Fix this by adding the missing interrupts.
Fixes: f7b65230019b9dac ("ARM: shmobile: r8a73a4: Add CMT1 node") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20200408090926.25201-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/r8a73a4.dtsi | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/arm/boot/dts/r8a73a4.dtsi +++ b/arch/arm/boot/dts/r8a73a4.dtsi @@ -135,7 +135,14 @@ cmt1: timer@e6130000 { compatible = "renesas,cmt-48-r8a73a4", "renesas,cmt-48-gen2"; reg = <0 0xe6130000 0 0x1004>; - interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>; + interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 124 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>; clocks = <&mstp3_clks R8A73A4_CLK_CMT1>; clock-names = "fck"; power-domains = <&pd_c5>;
From: Geert Uytterhoeven geert+renesas@glider.be
commit e47cb97f153193d4b41ca8d48127da14513d54c7 upstream.
The Clock Pulse Generator (CPG) device node lacks the extal2 clock. This may lead to a failure registering the "r" clock, or to a wrong parent for the "usb24s" clock, depending on MD_CK2 pin configuration and boot loader CPG_USBCKCR register configuration.
This went unnoticed, as this does not affect the single upstream board configuration, which relies on the first clock input only.
Fixes: d9ffd583bf345e2e ("ARM: shmobile: r8a7740: add SoC clocks to DTS") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Ulrich Hecht uli+renesas@fpond.eu Link: https://lore.kernel.org/r/20200508095918.6061-1-geert+renesas@glider.be Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm/boot/dts/r8a7740.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/r8a7740.dtsi +++ b/arch/arm/boot/dts/r8a7740.dtsi @@ -467,7 +467,7 @@ cpg_clocks: cpg_clocks@e6150000 { compatible = "renesas,r8a7740-cpg-clocks"; reg = <0xe6150000 0x10000>; - clocks = <&extal1_clk>, <&extalr_clk>; + clocks = <&extal1_clk>, <&extal2_clk>, <&extalr_clk>; #clock-cells = <1>; clock-output-names = "system", "pllc0", "pllc1", "pllc2", "r",
From: Jim Mattson jmattson@google.com
commit c4e0e4ab4cf3ec2b3f0b628ead108d677644ebd9 upstream.
Bank_num is a one-based count of banks, not a zero-based index. It overflows the allocated space only when strictly greater than KVM_MAX_MCE_BANKS.
Fixes: a9e38c3e01ad ("KVM: x86: Catch potential overrun in MCE setup") Signed-off-by: Jue Wang juew@google.com Signed-off-by: Jim Mattson jmattson@google.com Reviewed-by: Peter Shier pshier@google.com Message-Id: 20200511225616.19557-1-jmattson@google.com Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3128,7 +3128,7 @@ static int kvm_vcpu_ioctl_x86_setup_mce( unsigned bank_num = mcg_cap & 0xff, bank;
r = -EINVAL; - if (!bank_num || bank_num >= KVM_MAX_MCE_BANKS) + if (!bank_num || bank_num > KVM_MAX_MCE_BANKS) goto out; if (mcg_cap & ~(kvm_mce_cap_supported | 0xff | 0xff0000)) goto out;
From: Sergei Trofimovich slyfox@gentoo.org
commit b1112139a103b4b1101d0d2d72931f2d33d8c978 upstream.
gcc-10 will rename --param=allow-store-data-races=0 to -fno-allow-store-data-races.
The flag change happened at https://gcc.gnu.org/PR92046.
Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Acked-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Masahiro Yamada masahiroy@kernel.org Cc: Thomas Backlund tmb@mageia.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/Makefile +++ b/Makefile @@ -665,6 +665,7 @@ endif
# Tell gcc to never replace conditional load with a non-conditional one KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0) +KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
# check for 'asm goto' ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC) $(KBUILD_CFLAGS)), y)
On Mon, 18 May 2020 at 23:13, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.9.224 release. There are 90 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.9.224-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.9.y git commit: 7cb03e23d3f596ac9f89bee7cc836eb292321418 git describe: v4.9.223-91-g7cb03e23d3f5 Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.223-91-...
No regressions (compared to build v4.9.223)
No fixes (compared to build v4.9.223)
Ran 31444 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - juno-r2-compat - juno-r2-kasan - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64 - x86-kasan
Test Suites ----------- * build * install-android-platform-tools-r2600 * install-android-platform-tools-r2800 * kselftest * kselftest/drivers * kselftest/filesystems * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * v4l2-compliance * kvm-unit-tests * network-basic-tests * ltp-open-posix-tests * kselftest/net * kselftest/networking * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-native/drivers * kselftest-vsyscall-mode-native/filesystems * kselftest-vsyscall-mode-none * kselftest-vsyscall-mode-none/drivers * kselftest-vsyscall-mode-none/filesystems
On 18/05/2020 18:35, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.224 release. There are 90 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.9: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 24 tests: 24 pass, 0 fail
Linux version: 4.9.224-rc1-g6b0209f7a185 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On 5/18/20 11:35 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.224 release. There are 90 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.224-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On 5/18/20 10:35 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.224 release. There are 90 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 May 2020 17:32:42 +0000. Anything received after that time might be too late.
Build results: total: 171 pass: 171 fail: 0 Qemu test results: total: 384 pass: 384 fail: 0
Guenter
linux-stable-mirror@lists.linaro.org