This is the start of the stable review cycle for the 6.1.13 release. There are 118 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 22 Feb 2023 13:35:35 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.13-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.1.13-rc1
Dan Carpenter error27@gmail.com net: sched: sch: Fix off by one in htb_activate_prios()
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak
Keith Busch kbusch@kernel.org nvme-pci: refresh visible attrs for cmb attributes
Thomas Gleixner tglx@linutronix.de alarmtimer: Prevent starvation by small intervals and SIG_IGN
Sean Christopherson seanjc@google.com perf/x86: Refuse to export capabilities for hybrid PMUs
Greg Kroah-Hartman gregkh@linuxfoundation.org kvm: initialize all of the kvm_debugregs structure before sending it to userspace
Sean Christopherson seanjc@google.com KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)
Christoph Hellwig hch@lst.de nvme-apple: fix controller shutdown in apple_nvme_disable
Sagi Grimberg sagi@grimberg.me nvme-rdma: stop auth work after tearing down queues in error recovery
Sagi Grimberg sagi@grimberg.me nvme-tcp: stop auth work after tearing down queues in error recovery
Pedro Tammela pctammela@mojatatu.com net/sched: tcindex: search key must be 16 bits
Natalia Petrova n.petrova@fintech.ru i40e: Add checking for null for nlmsg_find_attr()
Arnd Bergmann arnd@arndb.de mm: extend max struct page size for kmsan
Kuan-Ying Lee Kuan-Ying.Lee@mediatek.com mm/gup: add folio to list when folio_isolate_lru() succeed
Guillaume Nault gnault@redhat.com ipv6: Fix tcp socket connection with DSCP.
Guillaume Nault gnault@redhat.com ipv6: Fix datagram socket connection with DSCP.
Jason Xing kernelxing@tencent.com ixgbe: add double of VLAN header when computing the max MTU
Miroslav Lichvar mlichvar@redhat.com igb: Fix PPS input and output using 3rd and 4th SDP
Corinna Vinschen vinschen@redhat.com igb: conditionalize I2C bit banging on external thermal sensor support
Jakub Kicinski kuba@kernel.org net: mpls: fix stale pointer if allocation fails during device rename
Tung Nguyen tung.q.nguyen@dektech.com.au tipc: fix kernel warning when sending SYN message
Eric Dumazet edumazet@google.com net: use a bounce buffer for copying skb->mark
Cristian Ciocaltea cristian.ciocaltea@collabora.com net: stmmac: Restrict warning on disabling DMA store and fwd mode
Steven Rostedt (Google) rostedt@goodmis.org tracing: Make trace_define_field_ext() static
Michael Chan michael.chan@broadcom.com bnxt_en: Fix mqprio and XDP ring checking logic
Johannes Zink j.zink@pengutronix.de net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
Hangyu Hua hbh25y@gmail.com net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
Pedro Tammela pctammela@mojatatu.com net/sched: act_ctinfo: use percpu stats
Miko Larsson mikoxyzzz@gmail.com net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
Kuniyuki Iwashima kuniyu@amazon.com dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.
Larysa Zaremba larysa.zaremba@intel.com ice: xsk: Fix cleaning of XDP_TX frames
Pedro Tammela pctammela@mojatatu.com net/sched: tcindex: update imperfect hash filters respecting rcu
Pietro Borrello borrello@diag.uniroma1.it sctp: sctp_sock_filter(): avoid list_entry() on possibly empty list
Siddharth Vadapalli s-vadapalli@ti.com net: ethernet: ti: am65-cpsw: Add RX DMA Channel Teardown Quirk
Rafał Miłecki rafal@milecki.pl net: bgmac: fix BCM5358 support by setting correct flags
Jason Xing kernelxing@tencent.com i40e: add double of VLAN header when computing the max MTU
Jason Xing kernelxing@tencent.com ixgbe: allow to increase MTU to 3K with XDP enabled
Jesse Brandeburg jesse.brandeburg@intel.com ice: fix lost multicast packets in promisc mode
Matt Roper matthew.d.roper@intel.com drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
Dave Stevenson dave.stevenson@raspberrypi.com drm/vc4: Fix YUV plane handling when planes are in different buffers
Dom Cobley popcornmix@gmail.com drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
Andrew Morton akpm@linux-foundation.org revert "squashfs: harden sanity check in squashfs_read_xattr_id_table"
Felix Riemann felix.riemann@sma.de net: Fix unwanted sign extension in netdev_stats_to_stats64()
Aaron Thompson dev@aaront.org Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
Geert Uytterhoeven geert@linux-m68k.org coredump: Move dump_emit_page() to kill unused warning
Peter Zijlstra peterz@infradead.org freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
Bartosz Golaszewski bartosz.golaszewski@linaro.org gpio: sim: fix a memory leak
Peter Xu peterx@redhat.com mm/migrate: fix wrongly apply write bit after mkdirty on sparc64
Qian Yingjin qian@ddn.com mm/filemap: fix page end in filemap_get_read_batch
Zach O'Keefe zokeefe@google.com mm/MADV_COLLAPSE: set EAGAIN on unexpected page refcount
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix underflow in second superblock position calculations
Mike Kravetz mike.kravetz@oracle.com hugetlb: check for undefined shift on 32 bit architectures
Munehisa Kamata kamatam@amazon.com sched/psi: Fix use-after-free in ep_remove_wait_queue()
Patrick McLean chutzpah@gentoo.org ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
Simon Gaiser simon@invisiblethingslab.com ata: ahci: Add Tiger Lake UP{3,4} AHCI controller
Andy Chi andy.chi@canonical.com ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops
Andy Chi andy.chi@canonical.com ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform.
Kailang Yang kailang@realtek.com ALSA: hda/realtek - fixed wrong gpio assigned
Bo Liu bo.liu@senarytech.com ALSA: hda/conexant: add a new hda codec SN6180
Cezary Rojewski cezary.rojewski@intel.com ALSA: hda: Fix codec device field initializan
Yang Yingliang yangyingliang@huawei.com mmc: mmc_spi: fix error handling in mmc_spi_probe()
Yang Yingliang yangyingliang@huawei.com mmc: sdio: fix possible resource leaks in some error paths
Heiner Kallweit hkallweit1@gmail.com mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
Paul Cercueil paul@crapouillou.net mmc: jz4740: Work around bug on JZ4760(B)
Zack Rusin zackr@vmware.com drm/vmwgfx: Do not drop the reference to the handle too soon
Zack Rusin zackr@vmware.com drm/vmwgfx: Stop accessing buffer objects which failed init
Leo Li sunpeng.li@amd.com drm/amd/display: Fail atomic_check early on normalize_zpos error
Jack Xiao Jack.Xiao@amd.com drm/amd/amdgpu: fix warning during suspend
Ville Syrjälä ville.syrjala@linux.intel.com drm: Disable dynamic debug as broken
Takashi Iwai tiwai@suse.de fbdev: Fix invalid page access after closing deferred I/O devices
Ronak Doshi doshir@vmware.com vmxnet3: move rss code block under eop descriptor
Seth Jenkins sethjenkins@google.com aio: fix mremap after fork null-deref
Qi Zheng zhengqi.arch@bytedance.com mm: shrinkers: fix deadlock in shrinker debugfs
Christophe Leroy christophe.leroy@csgroup.eu kasan: fix Oops due to missing calls to kasan_arch_is_ready()
Isaac J. Manjarres isaacmanjarres@google.com of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: userspace: fix v4-v6 test in v6.1
Xiubo Li xiubli@redhat.com ceph: blocklist the kclient when receiving corrupted snap trace
Xiubo Li xiubli@redhat.com ceph: move mount state enum to super.h
Hans de Goede hdegoede@redhat.com platform/x86: touchscreen_dmi: Add Chuwi Vi8 (CWI501) DMI match
Alex Deucher alexander.deucher@amd.com drm/amd/display: Properly handle additional cases where DCN is not supported
Yiqing Yao yiqing.yao@amd.com drm/amdgpu: Enable vclk dclk node for gc11.0.3
Evan Quan evan.quan@amd.com drm/amdgpu: enable HDP SD for gfx 11.0.3
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Reset DMUB mailbox SW state after HW reset
George Shen george.shen@amd.com drm/amd/display: Unassign does_plane_fit_in_mall function from dcn3.2
Daniel Miess Daniel.Miess@amd.com drm/amd/display: Adjust downscaling limits for dcn314
Daniel Miess Daniel.Miess@amd.com drm/amd/display: Add missing brackets in calculation
Maurizio Lombardi mlombard@redhat.com nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set
Maurizio Lombardi mlombard@redhat.com nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set
Amit Engel Amit.Engel@dell.com nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
Vasily Gorbik gor@linux.ibm.com s390/decompressor: specify __decompress() buf len to avoid overflow
Kees Cook keescook@chromium.org net: sched: sch: Bounds check priority
Kees Cook keescook@chromium.org net: ethernet: mtk_eth_soc: Avoid truncating allocation
Ben Skeggs bskeggs@redhat.com drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED
Hou Tao houtao1@huawei.com fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work()
Nicholas Piggin npiggin@gmail.com powerpc/64: Fix perf profiling asynchronous interrupt handlers
Andrey Konovalov andrey.konovalov@linaro.org net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoC
Andrei Gherzan andrei.gherzan@canonical.com selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibility
Hyunwoo Kim v4bel@theori.io net/rose: Fix to not accept on connected socket
Tanmay Bhushan 007047221b@gmail.com vdpa: ifcvf: Do proper cleanup if IFCVF init fails
Shunsuke Mie mie@igel.co.jp tools/virtio: fix the vringh test for virtio ring changes
Arnd Bergmann arnd@arndb.de ASoC: cs42l56: fix DT probe
Jakub Sitnicki jakub@cloudflare.com bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
fengwk fengwk94@gmail.com ASoC: amd: yc: Add Xiaomi Redmi Book Pro 15 2022 into DMI table
Cezary Rojewski cezary.rojewski@intel.com ALSA: hda: Do not unset preset when cleaning up codec
Eduard Zingerman eddyz87@gmail.com selftests/bpf: Verify copy_register_state() preserves parent/live fields
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_ssp_amp: always set dpcm_capture for amplifiers
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_nau8825: always set dpcm_capture for amplifiers
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_cs42l42: always set dpcm_capture for amplifiers
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: Intel: sof_rt5682: always set dpcm_capture for amplifiers
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Add FIXED_RATE quirk for JBL Quantum610 Wireless
Bard Liao yung-chuan.liao@linux.intel.com ASoC: SOF: sof-audio: start with the right widget type
Syed Saba Kareem Syed.SabaKareem@amd.com ASoC: amd: yc: Add DMI support for new acer/emdoor platforms
Filipe Manana fdmanana@suse.com btrfs: lock the inode in shared mode before starting fiemap
Josef Bacik josef@toxicpanda.com btrfs: move the auto defrag code to defrag.c
Paolo Abeni pabeni@redhat.com mptcp: fix locking for in-kernel listener creation
Paolo Abeni pabeni@redhat.com mptcp: deduplicate error paths on endpoint creation
Paolo Abeni pabeni@redhat.com mptcp: fix locking for setsockopt corner-case
Matthieu Baerts matthieu.baerts@tessares.net mptcp: sockopt: make 'tcp_fastopen_connect' generic
-------------
Diffstat:
 Makefile | 4 +-
 arch/powerpc/include/asm/hw_irq.h | 41 ++-
 arch/powerpc/kernel/dbell.c | 2 +-
 arch/powerpc/kernel/irq.c | 2 +-
 arch/powerpc/kernel/time.c | 2 +-
 arch/s390/boot/decompressor.c | 2 +-
 arch/x86/events/core.c | 12 +-
 arch/x86/kvm/pmu.h | 26 +-
 arch/x86/kvm/x86.c | 3 +-
 drivers/ata/ahci.c | 1 +
 drivers/ata/libata-core.c | 3 +
 drivers/gpio/gpio-sim.c | 2 +-
 drivers/gpu/drm/Kconfig | 3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/soc21.c | 3 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +-
 .../drm/amd/display/dc/dcn314/dcn314_resource.c | 5 +-
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c | 2 +-
 .../display/dc/dml/dcn314/display_mode_vba_314.c | 2 +-
 drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 12 +
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 6 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 14 +-
 .../gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c | 23 ++
 drivers/gpu/drm/vc4/vc4_crtc.c | 2 +-
 drivers/gpu/drm/vc4/vc4_plane.c | 6 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 12 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 8 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c | 1 +
 drivers/gpu/drm/vmwgfx/vmwgfx_shader.c | 1 +
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 10 +-
 drivers/mmc/core/sdio_bus.c | 17 +-
 drivers/mmc/core/sdio_cis.c | 12 -
 drivers/mmc/host/jz4740_mmc.c | 10 +
 drivers/mmc/host/meson-gx-mmc.c | 23 +-
 drivers/mmc/host/mmc_spi.c | 8 +-
 drivers/net/ethernet/broadcom/bgmac-bcma.c | 6 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c | 4 +-
 drivers/net/ethernet/intel/ice/ice_main.c | 26 ++
 drivers/net/ethernet/intel/ice/ice_xsk.c | 15 +-
 drivers/net/ethernet/intel/igb/igb_main.c | 54 +++-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 28 +-
 drivers/net/ethernet/mediatek/mtk_ppe.c | 3 +-
 drivers/net/ethernet/mediatek/mtk_ppe.h | 1 -
 .../ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 2 +
 drivers/net/ethernet/stmicro/stmmac/dwmac5.c | 3 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 +-
 .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c | 12 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.h | 1 +
 drivers/net/usb/kalmia.c | 8 +-
 drivers/net/vmxnet3/vmxnet3_drv.c | 50 +--
 drivers/nvme/host/apple.c | 3 +-
 drivers/nvme/host/core.c | 5 +-
 drivers/nvme/host/pci.c | 8 +
 drivers/nvme/host/rdma.c | 2 +-
 drivers/nvme/host/tcp.c | 2 +-
 drivers/nvme/target/fc.c | 4 +-
 drivers/of/of_reserved_mem.c | 3 +-
 drivers/platform/x86/touchscreen_dmi.c | 9 +
 drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
 drivers/video/fbdev/core/fb_defio.c | 10 +-
 drivers/video/fbdev/core/fbmem.c | 4 +
 fs/aio.c | 4 +
 fs/btrfs/extent_io.c | 2 +
 fs/btrfs/file.c | 340 ---------------------
 fs/btrfs/tree-defrag.c | 337 ++++++++++++++++++++
 fs/ceph/addr.c | 17 +-
 fs/ceph/caps.c | 16 +-
 fs/ceph/file.c | 3 +
 fs/ceph/mds_client.c | 30 +-
 fs/ceph/snap.c | 36 ++-
 fs/ceph/super.h | 11 +
 fs/coredump.c | 48 +--
 fs/fscache/volume.c | 3 +-
 fs/nilfs2/ioctl.c | 7 +
 fs/nilfs2/super.c | 9 +
 fs/nilfs2/the_nilfs.c | 8 +-
 fs/squashfs/xattr_id.c | 2 +-
 include/linux/ceph/libceph.h | 10 -
 include/linux/fb.h | 1 +
 include/linux/hugetlb.h | 5 +-
 include/linux/mm.h | 12 +-
 include/linux/shrinker.h | 5 +-
 include/linux/stmmac.h | 1 +
 include/net/sock.h | 13 +
 kernel/sched/psi.c | 7 +-
 kernel/time/alarmtimer.c | 33 +-
 kernel/trace/trace_events.c | 2 +-
 kernel/umh.c | 20 +-
 mm/filemap.c | 5 +-
 mm/gup.c | 2 +-
 mm/huge_memory.c | 6 +-
 mm/kasan/common.c | 3 +
 mm/kasan/generic.c | 7 +-
 mm/kasan/shadow.c | 12 +
 mm/khugepaged.c | 1 +
 mm/memblock.c | 8 +-
 mm/migrate.c | 2 +
 mm/shrinker_debug.c | 13 +-
 mm/vmscan.c | 6 +-
 net/core/dev.c | 2 +-
 net/core/sock_map.c | 61 ++--
 net/dccp/ipv6.c | 7 +-
 net/ipv6/datagram.c | 2 +-
 net/ipv6/tcp_ipv6.c | 11 +-
 net/mpls/af_mpls.c | 4 +
 net/mptcp/pm_netlink.c | 43 ++-
 net/mptcp/sockopt.c | 20 +-
 net/mptcp/subflow.c | 2 +-
 net/openvswitch/meter.c | 4 +-
 net/rose/af_rose.c | 8 +
 net/sched/act_ctinfo.c | 6 +-
 net/sched/cls_tcindex.c | 34 ++-
 net/sched/sch_htb.c | 5 +-
 net/sctp/diag.c | 4 +-
 net/socket.c | 9 +-
 net/tipc/socket.c | 2 +
 sound/pci/hda/hda_bind.c | 2 +
 sound/pci/hda/hda_codec.c | 3 +-
 sound/pci/hda/patch_conexant.c | 1 +
 sound/pci/hda/patch_realtek.c | 9 +-
 sound/soc/amd/yc/acp6x-mach.c | 21 ++
 sound/soc/codecs/cs42l56.c | 6 -
 sound/soc/intel/boards/sof_cs42l42.c | 3 +
 sound/soc/intel/boards/sof_nau8825.c | 5 +-
 sound/soc/intel/boards/sof_rt5682.c | 5 +-
 sound/soc/intel/boards/sof_ssp_amp.c | 5 +-
 sound/soc/sof/intel/hda-dai.c | 8 +-
 sound/soc/sof/sof-audio.c | 4 +-
 sound/usb/quirks.c | 2 +
 tools/testing/memblock/internal.h | 4 -
 .../selftests/bpf/verifier/search_pruning.c | 36 +++
 tools/testing/selftests/net/cmsg_ipv6.sh | 2 +-
 tools/testing/selftests/net/mptcp/userspace_pm.sh | 11 +
 tools/virtio/linux/bug.h | 8 +-
 tools/virtio/linux/build_bug.h | 7 +
 tools/virtio/linux/cpumask.h | 7 +
 tools/virtio/linux/gfp.h | 7 +
 tools/virtio/linux/kernel.h | 1 +
 tools/virtio/linux/kmsan.h | 12 +
 tools/virtio/linux/scatterlist.h | 1 +
 tools/virtio/linux/topology.h | 7 +
 147 files changed, 1310 insertions(+), 726 deletions(-)
From: Matthieu Baerts matthieu.baerts@tessares.net
[ Upstream commit d3d429047cc66ff49780c93e4fccd9527723d385 ]
There are other socket options that need to act only on the first subflow, e.g. all TCP_FASTOPEN* socket options.
This is similar to the getsockopt version.
In the next commit, this new mptcp_setsockopt_first_sf_only() helper is used by another option.
Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com
Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com
Signed-off-by: Paolo Abeni pabeni@redhat.com
Stable-dep-of: 21e43569685d ("mptcp: fix locking for setsockopt corner-case")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/mptcp/sockopt.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index c7cb68c725b29..8d3b09d75c3ae 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -769,17 +769,17 @@ static int mptcp_setsockopt_sol_tcp_defer(struct mptcp_sock *msk, sockptr_t optv return tcp_setsockopt(listener->sk, SOL_TCP, TCP_DEFER_ACCEPT, optval, optlen); }
-static int mptcp_setsockopt_sol_tcp_fastopen_connect(struct mptcp_sock *msk, sockptr_t optval, - unsigned int optlen) +static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int level, int optname, + sockptr_t optval, unsigned int optlen) { struct socket *sock;
- /* Limit to first subflow */ + /* Limit to first subflow, before the connection establishment */ sock = __mptcp_nmpc_socket(msk); if (!sock) return -EINVAL;
- return tcp_setsockopt(sock->sk, SOL_TCP, TCP_FASTOPEN_CONNECT, optval, optlen); + return tcp_setsockopt(sock->sk, level, optname, optval, optlen); }
static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname, @@ -811,7 +811,8 @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname, case TCP_DEFER_ACCEPT: return mptcp_setsockopt_sol_tcp_defer(msk, optval, optlen); case TCP_FASTOPEN_CONNECT: - return mptcp_setsockopt_sol_tcp_fastopen_connect(msk, optval, optlen); + return mptcp_setsockopt_first_sf_only(msk, SOL_TCP, optname, + optval, optlen); }
return -EOPNOTSUPP;
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 21e43569685de4ad773fb060c11a15f3fd5e7ac4 ]
We need to call __mptcp_nmpc_socket(), and later access the subflow socket, under the msk socket lock; otherwise e.g. a racing connect() could change the socket status under the hood, with unexpected results.
Fixes: 54635bd04701 ("mptcp: add TCP_FASTOPEN_CONNECT socket option")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/mptcp/sockopt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index 8d3b09d75c3ae..696ba398d699a 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -772,14 +772,21 @@ static int mptcp_setsockopt_sol_tcp_defer(struct mptcp_sock *msk, sockptr_t optv static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int level, int optname, sockptr_t optval, unsigned int optlen) { + struct sock *sk = (struct sock *)msk; struct socket *sock; + int ret = -EINVAL;
/* Limit to first subflow, before the connection establishment */ + lock_sock(sk); sock = __mptcp_nmpc_socket(msk); if (!sock) - return -EINVAL; + goto unlock;
- return tcp_setsockopt(sock->sk, level, optname, optval, optlen); + ret = tcp_setsockopt(sock->sk, level, optname, optval, optlen); + +unlock: + release_sock(sk); + return ret; }
static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 976d302fb6165ad620778d7ba834cde6e3fe9f9f ]
When endpoint creation fails, we need to free the newly allocated entry and eventually destroy the paired mptcp listener socket.
Consolidate such actions in a single point and let all the error paths reach it.
Reviewed-by: Mat Martineau mathew.j.martineau@linux.intel.com
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com
Signed-off-by: David S. Miller davem@davemloft.net
Stable-dep-of: ad2171009d96 ("mptcp: fix locking for in-kernel listener creation")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/mptcp/pm_netlink.c | 35 +++++++++++++----------------------
 1 file changed, 13 insertions(+), 22 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 9813ed0fde9bd..fdf2ee29f7623 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1003,16 +1003,12 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, return err;
msk = mptcp_sk(entry->lsk->sk); - if (!msk) { - err = -EINVAL; - goto out; - } + if (!msk) + return -EINVAL;
ssock = __mptcp_nmpc_socket(msk); - if (!ssock) { - err = -EINVAL; - goto out; - } + if (!ssock) + return -EINVAL;
mptcp_info2sockaddr(&entry->addr, &addr, entry->addr.family); #if IS_ENABLED(CONFIG_MPTCP_IPV6) @@ -1022,20 +1018,16 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, err = kernel_bind(ssock, (struct sockaddr *)&addr, addrlen); if (err) { pr_warn("kernel_bind error, err=%d", err); - goto out; + return err; }
err = kernel_listen(ssock, backlog); if (err) { pr_warn("kernel_listen error, err=%d", err); - goto out; + return err; }
return 0; - -out: - sock_release(entry->lsk); - return err; }
int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc) @@ -1327,7 +1319,7 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info) return -EINVAL; }
- entry = kmalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); + entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); if (!entry) { GENL_SET_ERR_MSG(info, "can't allocate addr"); return -ENOMEM; @@ -1338,22 +1330,21 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info) ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry); if (ret) { GENL_SET_ERR_MSG(info, "create listen socket error"); - kfree(entry); - return ret; + goto out_free; } } ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); if (ret < 0) { GENL_SET_ERR_MSG(info, "too many addresses or duplicate one"); - if (entry->lsk) - sock_release(entry->lsk); - kfree(entry); - return ret; + goto out_free; }
mptcp_nl_add_subflow_or_signal_addr(sock_net(skb->sk)); - return 0; + +out_free: + __mptcp_pm_release_addr_entry(entry); + return ret; }
int mptcp_pm_get_flags_and_ifindex_by_id(struct mptcp_sock *msk, unsigned int id,
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit ad2171009d968104ccda9dc517f5a3ba891515db ]
For consistency, in mptcp_pm_nl_create_listen_socket(), we need to call the __mptcp_nmpc_socket() under the msk socket lock.
Note that as a side effect, mptcp_subflow_create_socket() needs a 'nested' lockdep annotation, as it will acquire the subflow (kernel) socket lock under the in-kernel listener msk socket lock.
The current lack of locking is almost harmless, because the relevant socket is not exposed to user space, but in the future we will add more complexity to the mentioned helper, so let's play safe.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/mptcp/pm_netlink.c | 10 ++++++----
 net/mptcp/subflow.c | 2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index fdf2ee29f7623..5e38a0abbabae 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -992,8 +992,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, { int addrlen = sizeof(struct sockaddr_in); struct sockaddr_storage addr; - struct mptcp_sock *msk; struct socket *ssock; + struct sock *newsk; int backlog = 1024; int err;
@@ -1002,11 +1002,13 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, if (err) return err;
- msk = mptcp_sk(entry->lsk->sk); - if (!msk) + newsk = entry->lsk->sk; + if (!newsk) return -EINVAL;
- ssock = __mptcp_nmpc_socket(msk); + lock_sock(newsk); + ssock = __mptcp_nmpc_socket(mptcp_sk(newsk)); + release_sock(newsk); if (!ssock) return -EINVAL;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 929b0ee8b3d5f..c4971bc42f60f 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1631,7 +1631,7 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family, if (err) return err;
- lock_sock(sf->sk); + lock_sock_nested(sf->sk, SINGLE_DEPTH_NESTING);
/* the newly created socket has to be in the same cgroup as its parent */ mptcp_attach_cgroup(sk, sf->sk);
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 6e3df18ba7e8e68015dd66bcab326a4b7aaed085 ]
This currently exists in file.c, move it to the more natural location in defrag.c.
Signed-off-by: Josef Bacik josef@toxicpanda.com
[ reformat comments ]
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Stable-dep-of: 519b7e13b5ae ("btrfs: lock the inode in shared mode before starting fiemap")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/btrfs/file.c | 340 -----------------------------------
 fs/btrfs/tree-defrag.c | 337 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 337 insertions(+), 340 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 23056d9914d84..1bda59c683602 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -31,329 +31,6 @@ #include "reflink.h" #include "subpage.h"
-static struct kmem_cache *btrfs_inode_defrag_cachep; -/* - * when auto defrag is enabled we - * queue up these defrag structs to remember which - * inodes need defragging passes - */ -struct inode_defrag { - struct rb_node rb_node; - /* objectid */ - u64 ino; - /* - * transid where the defrag was added, we search for - * extents newer than this - */ - u64 transid; - - /* root objectid */ - u64 root; - - /* - * The extent size threshold for autodefrag. - * - * This value is different for compressed/non-compressed extents, - * thus needs to be passed from higher layer. - * (aka, inode_should_defrag()) - */ - u32 extent_thresh; -}; - -static int __compare_inode_defrag(struct inode_defrag *defrag1, - struct inode_defrag *defrag2) -{ - if (defrag1->root > defrag2->root) - return 1; - else if (defrag1->root < defrag2->root) - return -1; - else if (defrag1->ino > defrag2->ino) - return 1; - else if (defrag1->ino < defrag2->ino) - return -1; - else - return 0; -} - -/* pop a record for an inode into the defrag tree. The lock - * must be held already - * - * If you're inserting a record for an older transid than an - * existing record, the transid already in the tree is lowered - * - * If an existing record is found the defrag item you - * pass in is freed - */ -static int __btrfs_add_inode_defrag(struct btrfs_inode *inode, - struct inode_defrag *defrag) -{ - struct btrfs_fs_info *fs_info = inode->root->fs_info; - struct inode_defrag *entry; - struct rb_node **p; - struct rb_node *parent = NULL; - int ret; - - p = &fs_info->defrag_inodes.rb_node; - while (*p) { - parent = *p; - entry = rb_entry(parent, struct inode_defrag, rb_node); - - ret = __compare_inode_defrag(defrag, entry); - if (ret < 0) - p = &parent->rb_left; - else if (ret > 0) - p = &parent->rb_right; - else { - /* if we're reinserting an entry for - * an old defrag run, make sure to - * lower the transid of our existing record - */ - if (defrag->transid < entry->transid) - entry->transid = defrag->transid; - entry->extent_thresh = min(defrag->extent_thresh, - entry->extent_thresh); - return -EEXIST; - } - } - set_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags); - rb_link_node(&defrag->rb_node, parent, p); - rb_insert_color(&defrag->rb_node, &fs_info->defrag_inodes); - return 0; -} - -static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) -{ - if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) - return 0; - - if (btrfs_fs_closing(fs_info)) - return 0; - - return 1; -} - -/* - * insert a defrag record for this inode if auto defrag is - * enabled - */ -int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans, - struct btrfs_inode *inode, u32 extent_thresh) -{ - struct btrfs_root *root = inode->root; - struct btrfs_fs_info *fs_info = root->fs_info; - struct inode_defrag *defrag; - u64 transid; - int ret; - - if (!__need_auto_defrag(fs_info)) - return 0; - - if (test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) - return 0; - - if (trans) - transid = trans->transid; - else - transid = inode->root->last_trans; - - defrag = kmem_cache_zalloc(btrfs_inode_defrag_cachep, GFP_NOFS); - if (!defrag) - return -ENOMEM; - - defrag->ino = btrfs_ino(inode); - defrag->transid = transid; - defrag->root = root->root_key.objectid; - defrag->extent_thresh = extent_thresh; - - spin_lock(&fs_info->defrag_inodes_lock); - if (!test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) { - /* - * If we set IN_DEFRAG flag and evict the inode from memory, - * and then re-read this inode, this new inode doesn't have - * IN_DEFRAG flag. 
At the case, we may find the existed defrag. - */ - ret = __btrfs_add_inode_defrag(inode, defrag); - if (ret) - kmem_cache_free(btrfs_inode_defrag_cachep, defrag); - } else { - kmem_cache_free(btrfs_inode_defrag_cachep, defrag); - } - spin_unlock(&fs_info->defrag_inodes_lock); - return 0; -} - -/* - * pick the defragable inode that we want, if it doesn't exist, we will get - * the next one. - */ -static struct inode_defrag * -btrfs_pick_defrag_inode(struct btrfs_fs_info *fs_info, u64 root, u64 ino) -{ - struct inode_defrag *entry = NULL; - struct inode_defrag tmp; - struct rb_node *p; - struct rb_node *parent = NULL; - int ret; - - tmp.ino = ino; - tmp.root = root; - - spin_lock(&fs_info->defrag_inodes_lock); - p = fs_info->defrag_inodes.rb_node; - while (p) { - parent = p; - entry = rb_entry(parent, struct inode_defrag, rb_node); - - ret = __compare_inode_defrag(&tmp, entry); - if (ret < 0) - p = parent->rb_left; - else if (ret > 0) - p = parent->rb_right; - else - goto out; - } - - if (parent && __compare_inode_defrag(&tmp, entry) > 0) { - parent = rb_next(parent); - if (parent) - entry = rb_entry(parent, struct inode_defrag, rb_node); - else - entry = NULL; - } -out: - if (entry) - rb_erase(parent, &fs_info->defrag_inodes); - spin_unlock(&fs_info->defrag_inodes_lock); - return entry; -} - -void btrfs_cleanup_defrag_inodes(struct btrfs_fs_info *fs_info) -{ - struct inode_defrag *defrag; - struct rb_node *node; - - spin_lock(&fs_info->defrag_inodes_lock); - node = rb_first(&fs_info->defrag_inodes); - while (node) { - rb_erase(node, &fs_info->defrag_inodes); - defrag = rb_entry(node, struct inode_defrag, rb_node); - kmem_cache_free(btrfs_inode_defrag_cachep, defrag); - - cond_resched_lock(&fs_info->defrag_inodes_lock); - - node = rb_first(&fs_info->defrag_inodes); - } - spin_unlock(&fs_info->defrag_inodes_lock); -} - -#define BTRFS_DEFRAG_BATCH 1024 - -static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info, - struct inode_defrag *defrag) -{ - struct btrfs_root *inode_root; - struct inode *inode; - struct btrfs_ioctl_defrag_range_args range; - int ret = 0; - u64 cur = 0; - -again: - if (test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state)) - goto cleanup; - if (!__need_auto_defrag(fs_info)) - goto cleanup; - - /* get the inode */ - inode_root = btrfs_get_fs_root(fs_info, defrag->root, true); - if (IS_ERR(inode_root)) { - ret = PTR_ERR(inode_root); - goto cleanup; - } - - inode = btrfs_iget(fs_info->sb, defrag->ino, inode_root); - btrfs_put_root(inode_root); - if (IS_ERR(inode)) { - ret = PTR_ERR(inode); - goto cleanup; - } - - if (cur >= i_size_read(inode)) { - iput(inode); - goto cleanup; - } - - /* do a chunk of defrag */ - clear_bit(BTRFS_INODE_IN_DEFRAG, &BTRFS_I(inode)->runtime_flags); - memset(&range, 0, sizeof(range)); - range.len = (u64)-1; - range.start = cur; - range.extent_thresh = defrag->extent_thresh; - - sb_start_write(fs_info->sb); - ret = btrfs_defrag_file(inode, NULL, &range, defrag->transid, - BTRFS_DEFRAG_BATCH); - sb_end_write(fs_info->sb); - iput(inode); - - if (ret < 0) - goto cleanup; - - cur = max(cur + fs_info->sectorsize, range.start); - goto again; - -cleanup: - kmem_cache_free(btrfs_inode_defrag_cachep, defrag); - return ret; -} - -/* - * run through the list of inodes in the FS that need - * defragging - */ -int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) -{ - struct inode_defrag *defrag; - u64 first_ino = 0; - u64 root_objectid = 0; - - atomic_inc(&fs_info->defrag_running); - while (1) { - /* Pause the auto defragger. 
*/ - if (test_bit(BTRFS_FS_STATE_REMOUNTING, - &fs_info->fs_state)) - break; - - if (!__need_auto_defrag(fs_info)) - break; - - /* find an inode to defrag */ - defrag = btrfs_pick_defrag_inode(fs_info, root_objectid, - first_ino); - if (!defrag) { - if (root_objectid || first_ino) { - root_objectid = 0; - first_ino = 0; - continue; - } else { - break; - } - } - - first_ino = defrag->ino + 1; - root_objectid = defrag->root; - - __btrfs_run_defrag_inode(fs_info, defrag); - } - atomic_dec(&fs_info->defrag_running); - - /* - * during unmount, we use the transaction_wait queue to - * wait for the defragger to stop - */ - wake_up(&fs_info->transaction_wait); - return 0; -} - /* simple helper to fault in pages and copy. This should go away * and be replaced with calls into generic code. */ @@ -4130,23 +3807,6 @@ const struct file_operations btrfs_file_operations = { .remap_file_range = btrfs_remap_file_range, };
-void __cold btrfs_auto_defrag_exit(void) -{ - kmem_cache_destroy(btrfs_inode_defrag_cachep); -} - -int __init btrfs_auto_defrag_init(void) -{ - btrfs_inode_defrag_cachep = kmem_cache_create("btrfs_inode_defrag", - sizeof(struct inode_defrag), 0, - SLAB_MEM_SPREAD, - NULL); - if (!btrfs_inode_defrag_cachep) - return -ENOMEM; - - return 0; -} - int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end) { int ret; diff --git a/fs/btrfs/tree-defrag.c b/fs/btrfs/tree-defrag.c index 072ab9a1374b5..0520d6d32a2db 100644 --- a/fs/btrfs/tree-defrag.c +++ b/fs/btrfs/tree-defrag.c @@ -10,6 +10,326 @@ #include "transaction.h" #include "locking.h"
+static struct kmem_cache *btrfs_inode_defrag_cachep; + +/* + * When auto defrag is enabled we queue up these defrag structs to remember + * which inodes need defragging passes. + */ +struct inode_defrag { + struct rb_node rb_node; + /* Inode number */ + u64 ino; + /* + * Transid where the defrag was added, we search for extents newer than + * this. + */ + u64 transid; + + /* Root objectid */ + u64 root; + + /* + * The extent size threshold for autodefrag. + * + * This value is different for compressed/non-compressed extents, thus + * needs to be passed from higher layer. + * (aka, inode_should_defrag()) + */ + u32 extent_thresh; +}; + +static int __compare_inode_defrag(struct inode_defrag *defrag1, + struct inode_defrag *defrag2) +{ + if (defrag1->root > defrag2->root) + return 1; + else if (defrag1->root < defrag2->root) + return -1; + else if (defrag1->ino > defrag2->ino) + return 1; + else if (defrag1->ino < defrag2->ino) + return -1; + else + return 0; +} + +/* + * Pop a record for an inode into the defrag tree. The lock must be held + * already. + * + * If you're inserting a record for an older transid than an existing record, + * the transid already in the tree is lowered. + * + * If an existing record is found the defrag item you pass in is freed. + */ +static int __btrfs_add_inode_defrag(struct btrfs_inode *inode, + struct inode_defrag *defrag) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + struct inode_defrag *entry; + struct rb_node **p; + struct rb_node *parent = NULL; + int ret; + + p = &fs_info->defrag_inodes.rb_node; + while (*p) { + parent = *p; + entry = rb_entry(parent, struct inode_defrag, rb_node); + + ret = __compare_inode_defrag(defrag, entry); + if (ret < 0) + p = &parent->rb_left; + else if (ret > 0) + p = &parent->rb_right; + else { + /* + * If we're reinserting an entry for an old defrag run, + * make sure to lower the transid of our existing + * record. + */ + if (defrag->transid < entry->transid) + entry->transid = defrag->transid; + entry->extent_thresh = min(defrag->extent_thresh, + entry->extent_thresh); + return -EEXIST; + } + } + set_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags); + rb_link_node(&defrag->rb_node, parent, p); + rb_insert_color(&defrag->rb_node, &fs_info->defrag_inodes); + return 0; +} + +static inline int __need_auto_defrag(struct btrfs_fs_info *fs_info) +{ + if (!btrfs_test_opt(fs_info, AUTO_DEFRAG)) + return 0; + + if (btrfs_fs_closing(fs_info)) + return 0; + + return 1; +} + +/* + * Insert a defrag record for this inode if auto defrag is enabled. + */ +int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans, + struct btrfs_inode *inode, u32 extent_thresh) +{ + struct btrfs_root *root = inode->root; + struct btrfs_fs_info *fs_info = root->fs_info; + struct inode_defrag *defrag; + u64 transid; + int ret; + + if (!__need_auto_defrag(fs_info)) + return 0; + + if (test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) + return 0; + + if (trans) + transid = trans->transid; + else + transid = inode->root->last_trans; + + defrag = kmem_cache_zalloc(btrfs_inode_defrag_cachep, GFP_NOFS); + if (!defrag) + return -ENOMEM; + + defrag->ino = btrfs_ino(inode); + defrag->transid = transid; + defrag->root = root->root_key.objectid; + defrag->extent_thresh = extent_thresh; + + spin_lock(&fs_info->defrag_inodes_lock); + if (!test_bit(BTRFS_INODE_IN_DEFRAG, &inode->runtime_flags)) { + /* + * If we set IN_DEFRAG flag and evict the inode from memory, + * and then re-read this inode, this new inode doesn't have + * IN_DEFRAG flag. 
At the case, we may find the existed defrag. + */ + ret = __btrfs_add_inode_defrag(inode, defrag); + if (ret) + kmem_cache_free(btrfs_inode_defrag_cachep, defrag); + } else { + kmem_cache_free(btrfs_inode_defrag_cachep, defrag); + } + spin_unlock(&fs_info->defrag_inodes_lock); + return 0; +} + +/* + * Pick the defragable inode that we want, if it doesn't exist, we will get the + * next one. + */ +static struct inode_defrag *btrfs_pick_defrag_inode( + struct btrfs_fs_info *fs_info, u64 root, u64 ino) +{ + struct inode_defrag *entry = NULL; + struct inode_defrag tmp; + struct rb_node *p; + struct rb_node *parent = NULL; + int ret; + + tmp.ino = ino; + tmp.root = root; + + spin_lock(&fs_info->defrag_inodes_lock); + p = fs_info->defrag_inodes.rb_node; + while (p) { + parent = p; + entry = rb_entry(parent, struct inode_defrag, rb_node); + + ret = __compare_inode_defrag(&tmp, entry); + if (ret < 0) + p = parent->rb_left; + else if (ret > 0) + p = parent->rb_right; + else + goto out; + } + + if (parent && __compare_inode_defrag(&tmp, entry) > 0) { + parent = rb_next(parent); + if (parent) + entry = rb_entry(parent, struct inode_defrag, rb_node); + else + entry = NULL; + } +out: + if (entry) + rb_erase(parent, &fs_info->defrag_inodes); + spin_unlock(&fs_info->defrag_inodes_lock); + return entry; +} + +void btrfs_cleanup_defrag_inodes(struct btrfs_fs_info *fs_info) +{ + struct inode_defrag *defrag; + struct rb_node *node; + + spin_lock(&fs_info->defrag_inodes_lock); + node = rb_first(&fs_info->defrag_inodes); + while (node) { + rb_erase(node, &fs_info->defrag_inodes); + defrag = rb_entry(node, struct inode_defrag, rb_node); + kmem_cache_free(btrfs_inode_defrag_cachep, defrag); + + cond_resched_lock(&fs_info->defrag_inodes_lock); + + node = rb_first(&fs_info->defrag_inodes); + } + spin_unlock(&fs_info->defrag_inodes_lock); +} + +#define BTRFS_DEFRAG_BATCH 1024 + +static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info, + struct inode_defrag *defrag) +{ + struct btrfs_root *inode_root; + struct inode *inode; + struct btrfs_ioctl_defrag_range_args range; + int ret = 0; + u64 cur = 0; + +again: + if (test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state)) + goto cleanup; + if (!__need_auto_defrag(fs_info)) + goto cleanup; + + /* Get the inode */ + inode_root = btrfs_get_fs_root(fs_info, defrag->root, true); + if (IS_ERR(inode_root)) { + ret = PTR_ERR(inode_root); + goto cleanup; + } + + inode = btrfs_iget(fs_info->sb, defrag->ino, inode_root); + btrfs_put_root(inode_root); + if (IS_ERR(inode)) { + ret = PTR_ERR(inode); + goto cleanup; + } + + if (cur >= i_size_read(inode)) { + iput(inode); + goto cleanup; + } + + /* Do a chunk of defrag */ + clear_bit(BTRFS_INODE_IN_DEFRAG, &BTRFS_I(inode)->runtime_flags); + memset(&range, 0, sizeof(range)); + range.len = (u64)-1; + range.start = cur; + range.extent_thresh = defrag->extent_thresh; + + sb_start_write(fs_info->sb); + ret = btrfs_defrag_file(inode, NULL, &range, defrag->transid, + BTRFS_DEFRAG_BATCH); + sb_end_write(fs_info->sb); + iput(inode); + + if (ret < 0) + goto cleanup; + + cur = max(cur + fs_info->sectorsize, range.start); + goto again; + +cleanup: + kmem_cache_free(btrfs_inode_defrag_cachep, defrag); + return ret; +} + +/* + * Run through the list of inodes in the FS that need defragging. + */ +int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info) +{ + struct inode_defrag *defrag; + u64 first_ino = 0; + u64 root_objectid = 0; + + atomic_inc(&fs_info->defrag_running); + while (1) { + /* Pause the auto defragger. 
*/ + if (test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state)) + break; + + if (!__need_auto_defrag(fs_info)) + break; + + /* find an inode to defrag */ + defrag = btrfs_pick_defrag_inode(fs_info, root_objectid, first_ino); + if (!defrag) { + if (root_objectid || first_ino) { + root_objectid = 0; + first_ino = 0; + continue; + } else { + break; + } + } + + first_ino = defrag->ino + 1; + root_objectid = defrag->root; + + __btrfs_run_defrag_inode(fs_info, defrag); + } + atomic_dec(&fs_info->defrag_running); + + /* + * During unmount, we use the transaction_wait queue to wait for the + * defragger to stop. + */ + wake_up(&fs_info->transaction_wait); + return 0; +} + /* * Defrag all the leaves in a given btree. * Read all the leaves and try to get key order to @@ -132,3 +452,20 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
return ret; } + +void __cold btrfs_auto_defrag_exit(void) +{ + kmem_cache_destroy(btrfs_inode_defrag_cachep); +} + +int __init btrfs_auto_defrag_init(void) +{ + btrfs_inode_defrag_cachep = kmem_cache_create("btrfs_inode_defrag", + sizeof(struct inode_defrag), 0, + SLAB_MEM_SPREAD, + NULL); + if (!btrfs_inode_defrag_cachep) + return -ENOMEM; + + return 0; +}
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 519b7e13b5ae8dd38da1e52275705343be6bb508 ]
Currently fiemap does not take the inode's lock (VFS lock), it only locks a file range in the inode's io tree. This however can lead to a deadlock if we have a concurrent fsync on the file and fiemap code triggers a fault when accessing the user space buffer with fiemap_fill_next_extent(). The deadlock happens on the inode's i_mmap_lock semaphore, which is taken both by fsync and btrfs_page_mkwrite(). This deadlock was recently reported by syzbot and triggers a trace like the following:
task:syz-executor361 state:D stack:20264 pid:5668 ppid:5119 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x995/0xe20 kernel/sched/core.c:6606 schedule+0xcb/0x190 kernel/sched/core.c:6682 wait_on_state fs/btrfs/extent-io-tree.c:707 [inline] wait_extent_bit+0x577/0x6f0 fs/btrfs/extent-io-tree.c:751 lock_extent+0x1c2/0x280 fs/btrfs/extent-io-tree.c:1742 find_lock_delalloc_range+0x4e6/0x9c0 fs/btrfs/extent_io.c:488 writepage_delalloc+0x1ef/0x540 fs/btrfs/extent_io.c:1863 __extent_writepage+0x736/0x14e0 fs/btrfs/extent_io.c:2174 extent_write_cache_pages+0x983/0x1220 fs/btrfs/extent_io.c:3091 extent_writepages+0x219/0x540 fs/btrfs/extent_io.c:3211 do_writepages+0x3c3/0x680 mm/page-writeback.c:2581 filemap_fdatawrite_wbc+0x11e/0x170 mm/filemap.c:388 __filemap_fdatawrite_range mm/filemap.c:421 [inline] filemap_fdatawrite_range+0x175/0x200 mm/filemap.c:439 btrfs_fdatawrite_range fs/btrfs/file.c:3850 [inline] start_ordered_ops fs/btrfs/file.c:1737 [inline] btrfs_sync_file+0x4ff/0x1190 fs/btrfs/file.c:1839 generic_write_sync include/linux/fs.h:2885 [inline] btrfs_do_write_iter+0xcd3/0x1280 fs/btrfs/file.c:1684 call_write_iter include/linux/fs.h:2189 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x7dc/0xc50 fs/read_write.c:584 ksys_write+0x177/0x2a0 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f7d4054e9b9 RSP: 002b:00007f7d404fa2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f7d405d87a0 RCX: 00007f7d4054e9b9 RDX: 0000000000000090 RSI: 0000000020000000 RDI: 0000000000000006 RBP: 00007f7d405a51d0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 61635f65646f6e69 R13: 65646f7475616f6e R14: 7261637369646f6e R15: 00007f7d405d87a8 </TASK> INFO: task syz-executor361:5697 blocked for more than 145 seconds. Not tainted 6.2.0-rc3-syzkaller-00376-g7c6984405241 #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor361 state:D stack:21216 pid:5697 ppid:5119 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x995/0xe20 kernel/sched/core.c:6606 schedule+0xcb/0x190 kernel/sched/core.c:6682 rwsem_down_read_slowpath+0x5f9/0x930 kernel/locking/rwsem.c:1095 __down_read_common+0x54/0x2a0 kernel/locking/rwsem.c:1260 btrfs_page_mkwrite+0x417/0xc80 fs/btrfs/inode.c:8526 do_page_mkwrite+0x19e/0x5e0 mm/memory.c:2947 wp_page_shared+0x15e/0x380 mm/memory.c:3295 handle_pte_fault mm/memory.c:4949 [inline] __handle_mm_fault mm/memory.c:5073 [inline] handle_mm_fault+0x1b79/0x26b0 mm/memory.c:5219 do_user_addr_fault+0x69b/0xcb0 arch/x86/mm/fault.c:1428 handle_page_fault arch/x86/mm/fault.c:1519 [inline] exc_page_fault+0x7a/0x110 arch/x86/mm/fault.c:1575 asm_exc_page_fault+0x22/0x30 arch/x86/include/asm/idtentry.h:570 RIP: 0010:copy_user_short_string+0xd/0x40 arch/x86/lib/copy_user_64.S:233 Code: 74 0a 89 (...) 
RSP: 0018:ffffc9000570f330 EFLAGS: 00050202 RAX: ffffffff843e6601 RBX: 00007fffffffefc8 RCX: 0000000000000007 RDX: 0000000000000000 RSI: ffffc9000570f3e0 RDI: 0000000020000120 RBP: ffffc9000570f490 R08: 0000000000000000 R09: fffff52000ae1e83 R10: fffff52000ae1e83 R11: 1ffff92000ae1e7c R12: 0000000000000038 R13: ffffc9000570f3e0 R14: 0000000020000120 R15: ffffc9000570f3e0 copy_user_generic arch/x86/include/asm/uaccess_64.h:37 [inline] raw_copy_to_user arch/x86/include/asm/uaccess_64.h:58 [inline] _copy_to_user+0xe9/0x130 lib/usercopy.c:34 copy_to_user include/linux/uaccess.h:169 [inline] fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144 emit_fiemap_extent+0x22d/0x3c0 fs/btrfs/extent_io.c:3458 fiemap_process_hole+0xa00/0xad0 fs/btrfs/extent_io.c:3716 extent_fiemap+0xe27/0x2100 fs/btrfs/extent_io.c:3922 btrfs_fiemap+0x172/0x1e0 fs/btrfs/inode.c:8209 ioctl_fiemap fs/ioctl.c:219 [inline] do_vfs_ioctl+0x185b/0x2980 fs/ioctl.c:810 __do_sys_ioctl fs/ioctl.c:868 [inline] __se_sys_ioctl+0x83/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f7d4054e9b9 RSP: 002b:00007f7d390d92f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f7d405d87b0 RCX: 00007f7d4054e9b9 RDX: 0000000020000100 RSI: 00000000c020660b RDI: 0000000000000005 RBP: 00007f7d405a51d0 R08: 00007f7d390d9700 R09: 0000000000000000 R10: 00007f7d390d9700 R11: 0000000000000246 R12: 61635f65646f6e69 R13: 65646f7475616f6e R14: 7261637369646f6e R15: 00007f7d405d87b8 </TASK>
What happens is the following:
1) Task A is doing an fsync, enters btrfs_sync_file() and flushes delalloc before locking the inode and the i_mmap_lock semaphore, that is, before calling btrfs_inode_lock();
2) After task A flushes delalloc and before it calls btrfs_inode_lock(), another task dirties a page;
3) Task B starts a fiemap without FIEMAP_FLAG_SYNC, so the page dirtied at step 2 remains dirty and unflushed. Then when it enters extent_fiemap() and it locks a file range that includes the range of the page dirtied in step 2;
4) Task A calls btrfs_inode_lock() and locks the inode (VFS lock) and the inode's i_mmap_lock semaphore in write mode. Then it tries to flush delalloc by calling start_ordered_ops(), which will block, at find_lock_delalloc_range(), when trying to lock the range of the page dirtied at step 2, since this range was locked by the fiemap task (at step 3);
5) Task B generates a page fault when accessing the user space fiemap buffer with a call to fiemap_fill_next_extent().
The fault handler needs to call btrfs_page_mkwrite() for some other page of our inode, and there we deadlock when trying to lock the inode's i_mmap_lock semaphore in read mode, since the fsync task locked it in write mode (step 4) and the fsync task can not progress because it's waiting to lock a file range that is currently locked by us (the fiemap task, step 3).
Fix this by taking the inode's lock (VFS lock) in shared mode when entering fiemap. This effectively serializes fiemap with fsync (except the most expensive part of fsync, the log sync), preventing this deadlock.
Reported-by: syzbot+cc35f55c41e34c30dcb5@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/00000000000032dc7305f2a66f46@google.com/
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Josef Bacik josef@toxicpanda.com
Signed-off-by: Filipe Manana fdmanana@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/btrfs/extent_io.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index acb3c5c3b0251..58785dc7080ad 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3938,6 +3938,7 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo, lockend = round_up(start + len, root->fs_info->sectorsize); prev_extent_end = lockstart;
+ btrfs_inode_lock(&inode->vfs_inode, BTRFS_ILOCK_SHARED); lock_extent(&inode->io_tree, lockstart, lockend, &cached_state);
ret = fiemap_find_last_extent_offset(inode, path, &last_extent_end); @@ -4129,6 +4130,7 @@ int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
out_unlock: unlock_extent(&inode->io_tree, lockstart, lockend, &cached_state); + btrfs_inode_unlock(&inode->vfs_inode, BTRFS_ILOCK_SHARED); out: kfree(backref_cache); btrfs_free_path(path);
From: Syed Saba Kareem Syed.SabaKareem@amd.com
[ Upstream commit 7fd26a27680aa9032920f798a5a8b38a2c61075f ]
Adding DMI entries to support new acer/emdoor platforms.
Suggested-by: shanshengwang shansheng.wang@amd.com
Signed-off-by: Syed Saba Kareem Syed.SabaKareem@amd.com
Link: https://lore.kernel.org/r/20230111102130.2276391-1-Syed.SabaKareem@amd.com
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/soc/amd/yc/acp6x-mach.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c index 0d283e41f66dc..00fb976e0b81e 100644 --- a/sound/soc/amd/yc/acp6x-mach.c +++ b/sound/soc/amd/yc/acp6x-mach.c @@ -234,6 +234,20 @@ static const struct dmi_system_id yc_acp_quirk_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Blade 14 (2022) - RZ09-0427"), } }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "RB"), + DMI_MATCH(DMI_PRODUCT_NAME, "Swift SFA16-41"), + } + }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "IRBIS"), + DMI_MATCH(DMI_PRODUCT_NAME, "15NBC1011"), + } + }, {} };
From: Bard Liao yung-chuan.liao@linux.intel.com
[ Upstream commit fcc4348adafe53928fda46d104c1798e5a4de4ff ]
If there is a connection between a playback stream and a capture stream, all widgets that are connected to the playback stream and the capture stream will be in the list. So we have to start with exactly the right widget type: snd_soc_dapm_aif_out is for a capture stream, and a playback stream should start with a snd_soc_dapm_aif_in widget. Conversely, snd_soc_dapm_dai_in is for a playback stream, and a capture stream should start with a snd_soc_dapm_dai_out widget.
Signed-off-by: Bard Liao yung-chuan.liao@linux.intel.com
Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com
Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com
Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com
Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com
Link: https://lore.kernel.org/r/20230117123534.2075-1-peter.ujfalusi@linux.intel.c...
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/soc/sof/sof-audio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/sof-audio.c b/sound/soc/sof/sof-audio.c index 2df433c6ef55f..cf2c0db57d899 100644 --- a/sound/soc/sof/sof-audio.c +++ b/sound/soc/sof/sof-audio.c @@ -431,11 +431,11 @@ sof_walk_widgets_in_order(struct snd_sof_dev *sdev, struct snd_soc_dapm_widget_l
for_each_dapm_widgets(list, i, widget) { /* starting widget for playback is AIF type */ - if (dir == SNDRV_PCM_STREAM_PLAYBACK && !WIDGET_IS_AIF(widget->id)) + if (dir == SNDRV_PCM_STREAM_PLAYBACK && widget->id != snd_soc_dapm_aif_in) continue;
/* starting widget for capture is DAI type */ - if (dir == SNDRV_PCM_STREAM_CAPTURE && !WIDGET_IS_DAI(widget->id)) + if (dir == SNDRV_PCM_STREAM_CAPTURE && widget->id != snd_soc_dapm_dai_out) continue;
switch (op) {
From: Takashi Iwai tiwai@suse.de
[ Upstream commit dfd5fe19db7dc7006642f8109ee8965e5d031897 ]
JBL Quantum610 Wireless (0ecb:205c) requires the same workaround that was used for JBL Quantum810 for limiting the sample rate.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216798
Link: https://lore.kernel.org/r/20230118165947.22317-1-tiwai@suse.de
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/usb/quirks.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 3d13fdf7590cd..3ecd1ba7fd4b1 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -2152,6 +2152,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = { QUIRK_FLAG_GENERIC_IMPLICIT_FB), DEVICE_FLG(0x0525, 0xa4ad, /* Hamedal C20 usb camero */ QUIRK_FLAG_IFACE_SKIP_CLOSE), + DEVICE_FLG(0x0ecb, 0x205c, /* JBL Quantum610 Wireless */ + QUIRK_FLAG_FIXED_RATE), DEVICE_FLG(0x0ecb, 0x2069, /* JBL Quantum810 Wireless */ QUIRK_FLAG_FIXED_RATE),
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 324f065cdbaba1b879a63bf07e61ca156b789537 ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, always unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com
Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com
Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com
Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com
Link: https://lore.kernel.org/r/20230119163459.2235843-2-kai.vehmanen@linux.intel....
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/soc/intel/boards/sof_rt5682.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/soc/intel/boards/sof_rt5682.c b/sound/soc/intel/boards/sof_rt5682.c index 2358be208c1fd..59c58ef932e4d 100644 --- a/sound/soc/intel/boards/sof_rt5682.c +++ b/sound/soc/intel/boards/sof_rt5682.c @@ -761,8 +761,6 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].num_codecs = ARRAY_SIZE(max_98373_components); links[id].init = max_98373_spk_codec_init; links[id].ops = &max_98373_ops; - /* feedback stream */ - links[id].dpcm_capture = 1; } else if (sof_rt5682_quirk & SOF_MAX98360A_SPEAKER_AMP_PRESENT) { max_98360a_dai_link(&links[id]); @@ -789,6 +787,9 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].platforms = platform_component; links[id].num_platforms = ARRAY_SIZE(platform_component); links[id].dpcm_playback = 1; + /* feedback stream or firmware-generated echo reference */ + links[id].dpcm_capture = 1; + links[id].no_pcm = 1; links[id].cpus = &cpus[id]; links[id].num_cpus = 1;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit e0a52220344ab7defe25b9cdd58fe1dc1122e67c ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20230119163459.2235843-3-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_cs42l42.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/intel/boards/sof_cs42l42.c b/sound/soc/intel/boards/sof_cs42l42.c index e38bd2831e6ac..e9d190cb13b0a 100644 --- a/sound/soc/intel/boards/sof_cs42l42.c +++ b/sound/soc/intel/boards/sof_cs42l42.c @@ -336,6 +336,9 @@ static int create_spk_amp_dai_links(struct device *dev, links[*id].platforms = platform_component; links[*id].num_platforms = ARRAY_SIZE(platform_component); links[*id].dpcm_playback = 1; + /* firmware-generated echo reference */ + links[*id].dpcm_capture = 1; + links[*id].no_pcm = 1; links[*id].cpus = &cpus[*id]; links[*id].num_cpus = 1;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 36a71a0eb7cdb5ccf4b0214dbd41ab00dff18c7f ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20230119163459.2235843-4-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_nau8825.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/soc/intel/boards/sof_nau8825.c b/sound/soc/intel/boards/sof_nau8825.c index 009a41fbefa10..0c723d4d2d63b 100644 --- a/sound/soc/intel/boards/sof_nau8825.c +++ b/sound/soc/intel/boards/sof_nau8825.c @@ -479,8 +479,6 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].num_codecs = ARRAY_SIZE(max_98373_components); links[id].init = max_98373_spk_codec_init; links[id].ops = &max_98373_ops; - /* feedback stream */ - links[id].dpcm_capture = 1; } else if (sof_nau8825_quirk & SOF_MAX98360A_SPEAKER_AMP_PRESENT) { max_98360a_dai_link(&links[id]); @@ -493,6 +491,9 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, links[id].platforms = platform_component; links[id].num_platforms = ARRAY_SIZE(platform_component); links[id].dpcm_playback = 1; + /* feedback stream or firmware-generated echo reference */ + links[id].dpcm_capture = 1; + links[id].no_pcm = 1; links[id].cpus = &cpus[id]; links[id].num_cpus = 1;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit b3c00316a2f847791bae395ea6dd91aa7a221471 ]
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier.
To avoid any issues with invalid/NULL substreams in the latter case, unconditionally set dpcm_capture.
Link: https://github.com/thesofproject/linux/issues/4083 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Link: https://lore.kernel.org/r/20230119163459.2235843-5-kai.vehmanen@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/boards/sof_ssp_amp.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/sound/soc/intel/boards/sof_ssp_amp.c b/sound/soc/intel/boards/sof_ssp_amp.c index 94d25aeb6e7ce..7b74f122e3400 100644 --- a/sound/soc/intel/boards/sof_ssp_amp.c +++ b/sound/soc/intel/boards/sof_ssp_amp.c @@ -258,13 +258,12 @@ static struct snd_soc_dai_link *sof_card_dai_links_create(struct device *dev, sof_rt1308_dai_link(&links[id]); } else if (sof_ssp_amp_quirk & SOF_CS35L41_SPEAKER_AMP_PRESENT) { cs35l41_set_dai_link(&links[id]); - - /* feedback from amplifier */ - links[id].dpcm_capture = 1; } links[id].platforms = platform_component; links[id].num_platforms = ARRAY_SIZE(platform_component); links[id].dpcm_playback = 1; + /* feedback from amplifier or firmware-generated echo reference */ + links[id].dpcm_capture = 1; links[id].no_pcm = 1; links[id].cpus = &cpus[id]; links[id].num_cpus = 1;
From: Eduard Zingerman eddyz87@gmail.com
[ Upstream commit b9fa9bc839291020b362ab5392e5f18ba79657ac ]
A testcase to check that verifier.c:copy_register_state() preserves the register parentage chain and liveness information.
Signed-off-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/r/20230106142214.1040390-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/bpf/verifier/search_pruning.c | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+)
diff --git a/tools/testing/selftests/bpf/verifier/search_pruning.c b/tools/testing/selftests/bpf/verifier/search_pruning.c index 68b14fdfebdb1..d63fd8991b03a 100644 --- a/tools/testing/selftests/bpf/verifier/search_pruning.c +++ b/tools/testing/selftests/bpf/verifier/search_pruning.c @@ -225,3 +225,39 @@ .result_unpriv = ACCEPT, .insn_processed = 15, }, +/* The test performs a conditional 64-bit write to a stack location + * fp[-8], this is followed by an unconditional 8-bit write to fp[-8], + * then data is read from fp[-8]. This sequence is unsafe. + * + * The test would be mistakenly marked as safe w/o dst register parent + * preservation in verifier.c:copy_register_state() function. + * + * Note the usage of BPF_F_TEST_STATE_FREQ to force creation of the + * checkpoint state after conditional 64-bit assignment. + */ +{ + "write tracking and register parent chain bug", + .insns = { + /* r6 = ktime_get_ns() */ + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + BPF_MOV64_REG(BPF_REG_6, BPF_REG_0), + /* r0 = ktime_get_ns() */ + BPF_EMIT_CALL(BPF_FUNC_ktime_get_ns), + /* if r0 > r6 goto +1 */ + BPF_JMP_REG(BPF_JGT, BPF_REG_0, BPF_REG_6, 1), + /* *(u64 *)(r10 - 8) = 0xdeadbeef */ + BPF_ST_MEM(BPF_DW, BPF_REG_FP, -8, 0xdeadbeef), + /* r1 = 42 */ + BPF_MOV64_IMM(BPF_REG_1, 42), + /* *(u8 *)(r10 - 8) = r1 */ + BPF_STX_MEM(BPF_B, BPF_REG_FP, BPF_REG_1, -8), + /* r2 = *(u64 *)(r10 - 8) */ + BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_FP, -8), + /* exit(0) */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .flags = BPF_F_TEST_STATE_FREQ, + .errstr = "invalid read from stack off -8+1 size 8", + .result = REJECT, +},
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit 87978e6ad45a16835cc58234451111091be3c59a ]
Several functions that take part in a codec's initialization and removal are re-used by ASoC codec driver implementations. Drivers mimic the behavior of hda_codec_driver_probe/remove() found in sound/pci/hda/hda_bind.c with their component->probe/remove() instead.
One of the reasons for that is the expectation of snd_hda_codec_device_new() to receive a valid pointer to an instance of struct snd_card. This expectation can be met only once sound card components probing commences.
As an ASoC sound card may be unbound without the codec device actually being removed from the system, unsetting ->preset in snd_hda_codec_cleanup_for_unbind() interferes with the module unload -> load scenario and causes a null-ptr-deref. The preset is assigned only once, during device/driver matching, whereas the ASoC codec driver's module may be reloaded several times throughout the lifetime of the audio stack.
Suggested-by: Takashi Iwai tiwai@suse.com Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://lore.kernel.org/r/20230119143235.1159814-1-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/hda_bind.c | 2 ++ sound/pci/hda/hda_codec.c | 1 - 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_bind.c b/sound/pci/hda/hda_bind.c index 1a868dd9dc4b6..890c2f7c33fc2 100644 --- a/sound/pci/hda/hda_bind.c +++ b/sound/pci/hda/hda_bind.c @@ -144,6 +144,7 @@ static int hda_codec_driver_probe(struct device *dev)
error: snd_hda_codec_cleanup_for_unbind(codec); + codec->preset = NULL; return err; }
@@ -166,6 +167,7 @@ static int hda_codec_driver_remove(struct device *dev) if (codec->patch_ops.free) codec->patch_ops.free(codec); snd_hda_codec_cleanup_for_unbind(codec); + codec->preset = NULL; module_put(dev->driver->owner); return 0; } diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index edd653ece70d7..ac1cc7c5290e3 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -795,7 +795,6 @@ void snd_hda_codec_cleanup_for_unbind(struct hda_codec *codec) snd_array_free(&codec->cvt_setups); snd_array_free(&codec->spdif_out); snd_array_free(&codec->verbs); - codec->preset = NULL; codec->follower_dig_outs = NULL; codec->spdif_status_reset = 0; snd_array_free(&codec->mixers);
From: fengwk fengwk94@gmail.com
[ Upstream commit dcff8b7ca92d724bdaf474a3fa37a7748377813a ]
This model requires an additional detection quirk to enable the internal microphone - BIOS doesn't seem to support AcpDmicConnected (nothing in acpidump output).
Signed-off-by: fengwk fengwk94@gmail.com Link: https://lore.kernel.org/r/Y8wmCutc74j/tyHP@arch Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/amd/yc/acp6x-mach.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c index 00fb976e0b81e..36314753923b8 100644 --- a/sound/soc/amd/yc/acp6x-mach.c +++ b/sound/soc/amd/yc/acp6x-mach.c @@ -227,6 +227,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Redmi Book Pro 14 2022"), } }, + { + .driver_data = &acp6x_card, + .matches = { + DMI_MATCH(DMI_BOARD_VENDOR, "TIMI"), + DMI_MATCH(DMI_PRODUCT_NAME, "Redmi Book Pro 15 2022"), + } + }, { .driver_data = &acp6x_card, .matches = {
From: Jakub Sitnicki jakub@cloudflare.com
[ Upstream commit 5b4a79ba65a1ab479903fff2e604865d229b70a9 ]
sock_map proto callbacks should never call themselves by design. Protect against bugs like [1] and break out of the recursive loop to avoid a stack overflow in favor of a resource leak.
[1] https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/
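For illustration only, here is a minimal userspace sketch of the guard idea with hypothetical names (not the kernel code): if the callback we saved turns out to be the wrapper itself, bail out instead of recursing until the stack overflows.

  #include <stdio.h>

  static void (*saved_close)(void);

  static void wrapped_close(void)
  {
          void (*cb)(void) = saved_close;

          /* calling cb() here would recurse forever: warn and leak instead of crashing */
          if (cb == wrapped_close) {
                  fprintf(stderr, "refusing to recurse\n");
                  return;
          }
          if (cb)
                  cb();
  }

  int main(void)
  {
          saved_close = wrapped_close;    /* simulate the buggy setup from the report */
          wrapped_close();
          return 0;
  }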
Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Sitnicki jakub@cloudflare.com Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-1-1e0ee7ac2f90@cloudflare.... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/sock_map.c | 61 +++++++++++++++++++++++++-------------------- 1 file changed, 34 insertions(+), 27 deletions(-)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 22fa2c5bc6ec9..a68a7290a3b2b 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -1569,15 +1569,16 @@ void sock_map_unhash(struct sock *sk) psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); - if (sk->sk_prot->unhash) - sk->sk_prot->unhash(sk); - return; + saved_unhash = READ_ONCE(sk->sk_prot)->unhash; + } else { + saved_unhash = psock->saved_unhash; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); } - - saved_unhash = psock->saved_unhash; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - saved_unhash(sk); + if (WARN_ON_ONCE(saved_unhash == sock_map_unhash)) + return; + if (saved_unhash) + saved_unhash(sk); } EXPORT_SYMBOL_GPL(sock_map_unhash);
@@ -1590,17 +1591,18 @@ void sock_map_destroy(struct sock *sk) psock = sk_psock_get(sk); if (unlikely(!psock)) { rcu_read_unlock(); - if (sk->sk_prot->destroy) - sk->sk_prot->destroy(sk); - return; + saved_destroy = READ_ONCE(sk->sk_prot)->destroy; + } else { + saved_destroy = psock->saved_destroy; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); + sk_psock_stop(psock); + sk_psock_put(sk, psock); } - - saved_destroy = psock->saved_destroy; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - sk_psock_stop(psock); - sk_psock_put(sk, psock); - saved_destroy(sk); + if (WARN_ON_ONCE(saved_destroy == sock_map_destroy)) + return; + if (saved_destroy) + saved_destroy(sk); } EXPORT_SYMBOL_GPL(sock_map_destroy);
@@ -1615,16 +1617,21 @@ void sock_map_close(struct sock *sk, long timeout) if (unlikely(!psock)) { rcu_read_unlock(); release_sock(sk); - return sk->sk_prot->close(sk, timeout); + saved_close = READ_ONCE(sk->sk_prot)->close; + } else { + saved_close = psock->saved_close; + sock_map_remove_links(sk, psock); + rcu_read_unlock(); + sk_psock_stop(psock); + release_sock(sk); + cancel_work_sync(&psock->work); + sk_psock_put(sk, psock); } - - saved_close = psock->saved_close; - sock_map_remove_links(sk, psock); - rcu_read_unlock(); - sk_psock_stop(psock); - release_sock(sk); - cancel_work_sync(&psock->work); - sk_psock_put(sk, psock); + /* Make sure we do not recurse. This is a bug. + * Leak the socket instead of crashing on a stack overflow. + */ + if (WARN_ON_ONCE(saved_close == sock_map_close)) + return; saved_close(sk, timeout); } EXPORT_SYMBOL_GPL(sock_map_close);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit e18c6da62edc780e4f4f3c9ce07bdacd69505182 ]
While looking through legacy platform data users, I noticed that the DT probing never uses data from the DT properties, as the platform_data structure gets overwritten directly after it is initialized.
There have never been any boards defining the platform_data in the mainline kernel either, so this driver so far only worked with patched kernels or with the default values.
For the benefit of possible downstream users, fix the DT probe by no longer overwriting the data.
Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20230126162203.2986339-1-arnd@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/cs42l56.c | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/sound/soc/codecs/cs42l56.c b/sound/soc/codecs/cs42l56.c index 26066682c983e..3b0e715549c9c 100644 --- a/sound/soc/codecs/cs42l56.c +++ b/sound/soc/codecs/cs42l56.c @@ -1191,18 +1191,12 @@ static int cs42l56_i2c_probe(struct i2c_client *i2c_client) if (pdata) { cs42l56->pdata = *pdata; } else { - pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata), - GFP_KERNEL); - if (!pdata) - return -ENOMEM; - if (i2c_client->dev.of_node) { ret = cs42l56_handle_of_data(i2c_client, &cs42l56->pdata); if (ret != 0) return ret; } - cs42l56->pdata = *pdata; }
if (cs42l56->pdata.gpio_nreset) {
From: Shunsuke Mie mie@igel.co.jp
[ Upstream commit 3f7b75abf41cc4143aa295f62acbb060a012868d ]
Fix the build failure caused by the missing kmsan_handle_dma() and is_power_of_2(), which are used in drivers/virtio/virtio_ring.c.
Signed-off-by: Shunsuke Mie mie@igel.co.jp Message-Id: 20230110034310.779744-1-mie@igel.co.jp Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/virtio/linux/bug.h | 8 +++----- tools/virtio/linux/build_bug.h | 7 +++++++ tools/virtio/linux/cpumask.h | 7 +++++++ tools/virtio/linux/gfp.h | 7 +++++++ tools/virtio/linux/kernel.h | 1 + tools/virtio/linux/kmsan.h | 12 ++++++++++++ tools/virtio/linux/scatterlist.h | 1 + tools/virtio/linux/topology.h | 7 +++++++ 8 files changed, 45 insertions(+), 5 deletions(-) create mode 100644 tools/virtio/linux/build_bug.h create mode 100644 tools/virtio/linux/cpumask.h create mode 100644 tools/virtio/linux/gfp.h create mode 100644 tools/virtio/linux/kmsan.h create mode 100644 tools/virtio/linux/topology.h
diff --git a/tools/virtio/linux/bug.h b/tools/virtio/linux/bug.h index 813baf13f62a2..51a919083d9b8 100644 --- a/tools/virtio/linux/bug.h +++ b/tools/virtio/linux/bug.h @@ -1,13 +1,11 @@ /* SPDX-License-Identifier: GPL-2.0 */ -#ifndef BUG_H -#define BUG_H +#ifndef _LINUX_BUG_H +#define _LINUX_BUG_H
#include <asm/bug.h>
#define BUG_ON(__BUG_ON_cond) assert(!(__BUG_ON_cond))
-#define BUILD_BUG_ON(x) - #define BUG() abort()
-#endif /* BUG_H */ +#endif /* _LINUX_BUG_H */ diff --git a/tools/virtio/linux/build_bug.h b/tools/virtio/linux/build_bug.h new file mode 100644 index 0000000000000..cdbb75e28a604 --- /dev/null +++ b/tools/virtio/linux/build_bug.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_BUILD_BUG_H +#define _LINUX_BUILD_BUG_H + +#define BUILD_BUG_ON(x) + +#endif /* _LINUX_BUILD_BUG_H */ diff --git a/tools/virtio/linux/cpumask.h b/tools/virtio/linux/cpumask.h new file mode 100644 index 0000000000000..307da69d6b26c --- /dev/null +++ b/tools/virtio/linux/cpumask.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_CPUMASK_H +#define _LINUX_CPUMASK_H + +#include <linux/kernel.h> + +#endif /* _LINUX_CPUMASK_H */ diff --git a/tools/virtio/linux/gfp.h b/tools/virtio/linux/gfp.h new file mode 100644 index 0000000000000..43d146f236f14 --- /dev/null +++ b/tools/virtio/linux/gfp.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_GFP_H +#define __LINUX_GFP_H + +#include <linux/topology.h> + +#endif diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h index 21593bf977552..8b877167933d1 100644 --- a/tools/virtio/linux/kernel.h +++ b/tools/virtio/linux/kernel.h @@ -10,6 +10,7 @@ #include <stdarg.h>
#include <linux/compiler.h> +#include <linux/log2.h> #include <linux/types.h> #include <linux/overflow.h> #include <linux/list.h> diff --git a/tools/virtio/linux/kmsan.h b/tools/virtio/linux/kmsan.h new file mode 100644 index 0000000000000..272b5aa285d5a --- /dev/null +++ b/tools/virtio/linux/kmsan.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_KMSAN_H +#define _LINUX_KMSAN_H + +#include <linux/gfp.h> + +inline void kmsan_handle_dma(struct page *page, size_t offset, size_t size, + enum dma_data_direction dir) +{ +} + +#endif /* _LINUX_KMSAN_H */ diff --git a/tools/virtio/linux/scatterlist.h b/tools/virtio/linux/scatterlist.h index 369ee308b6686..74d9e1825748e 100644 --- a/tools/virtio/linux/scatterlist.h +++ b/tools/virtio/linux/scatterlist.h @@ -2,6 +2,7 @@ #ifndef SCATTERLIST_H #define SCATTERLIST_H #include <linux/kernel.h> +#include <linux/bug.h>
struct scatterlist { unsigned long page_link; diff --git a/tools/virtio/linux/topology.h b/tools/virtio/linux/topology.h new file mode 100644 index 0000000000000..910794afb993a --- /dev/null +++ b/tools/virtio/linux/topology.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_TOPOLOGY_H +#define _LINUX_TOPOLOGY_H + +#include <linux/cpumask.h> + +#endif /* _LINUX_TOPOLOGY_H */
From: Tanmay Bhushan 007047221b@gmail.com
[ Upstream commit 6b04456e248761cf68f562f2fd7c04e591fcac94 ]
ifcvf_mgmt_dev leaks memory if it is not freed before returning. Jump to the correct error label instead of returning directly so the memory is released. ifcvf_init_hw() does not take care of this, so it needs to be done here.
Signed-off-by: Tanmay Bhushan 007047221b@gmail.com Message-Id: 772e9fe133f21fa78fb98a2ebe8969efbbd58e3c.camel@gmail.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Acked-by: Zhu Lingshan lingshan.zhu@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/ifcvf/ifcvf_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index f9c0044c6442e..44b29289aa193 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -849,7 +849,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) ret = ifcvf_init_hw(vf, pdev); if (ret) { IFCVF_ERR(pdev, "Failed to init IFCVF hw\n"); - return ret; + goto err; }
for (i = 0; i < vf->nr_vring; i++)
From: Hyunwoo Kim v4bel@theori.io
[ Upstream commit 14caefcf9837a2be765a566005ad82cd0d2a429f ]
If listen() and accept() are called on an already connect()ed rose socket, accept() can succeed. This happens because when the peer socket sends data via sendmsg(), an skb carrying its own sk is stored in the connected socket's sk->sk_receive_queue, and rose_accept() dequeues that waiting skb.
This creates a child socket with the sk of the parent rose socket, which can cause confusion.
Fix rose_listen() to return -EINVAL if the socket has already been successfully connected, and add lock_sock to prevent this issue.
Signed-off-by: Hyunwoo Kim v4bel@theori.io Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230125105944.GA133314@ubuntu Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/rose/af_rose.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c index 36fefc3957d77..ca2b17f32670d 100644 --- a/net/rose/af_rose.c +++ b/net/rose/af_rose.c @@ -488,6 +488,12 @@ static int rose_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk;
+ lock_sock(sk); + if (sock->state != SS_UNCONNECTED) { + release_sock(sk); + return -EINVAL; + } + if (sk->sk_state != TCP_LISTEN) { struct rose_sock *rose = rose_sk(sk);
@@ -497,8 +503,10 @@ static int rose_listen(struct socket *sock, int backlog) memset(rose->dest_digis, 0, AX25_ADDR_LEN * ROSE_MAX_DIGIS); sk->sk_max_ack_backlog = backlog; sk->sk_state = TCP_LISTEN; + release_sock(sk); return 0; } + release_sock(sk);
return -EOPNOTSUPP; }
From: Andrei Gherzan andrei.gherzan@canonical.com
[ Upstream commit a6efc42a86c0c87cfe2f1c3d1f09a4c9b13ba890 ]
"tcpdump" is used to capture traffic in these tests while using a random, temporary and not suffixed file for it. This can interfere with apparmor configuration where the tool is only allowed to read from files with 'known' extensions.
The MIME type application/vnd.tcpdump.pcap was registered with IANA for pcap files, and .pcap is the extension that is both the most common and aligned with standard apparmor configurations. See TCPDUMP(8) for more details.
This improves compatibility with standard apparmor configurations by using ".pcap" as the file extension for the tests' temporary files.
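As an aside, the same "fixed suffix on a temporary file" idea is available to C test helpers too; a small sketch assuming glibc's mkstemps() (nothing here is taken from the patch):

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
          char template[] = "/tmp/captureXXXXXX.pcap";
          int fd = mkstemps(template, 5);         /* 5 == strlen(".pcap"), kept as the suffix */

          if (fd < 0) {
                  perror("mkstemps");
                  return 1;
          }
          printf("created %s\n", template);
          close(fd);
          unlink(template);
          return 0;
  }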
Signed-off-by: Andrei Gherzan andrei.gherzan@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/cmsg_ipv6.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/cmsg_ipv6.sh b/tools/testing/selftests/net/cmsg_ipv6.sh index 2d89cb0ad2889..330d0b1ceced3 100755 --- a/tools/testing/selftests/net/cmsg_ipv6.sh +++ b/tools/testing/selftests/net/cmsg_ipv6.sh @@ -6,7 +6,7 @@ ksft_skip=4 NS=ns IP6=2001:db8:1::1/64 TGT6=2001:db8:1::2 -TMPF=`mktemp` +TMPF=$(mktemp --suffix ".pcap")
cleanup() {
From: Andrey Konovalov andrey.konovalov@linaro.org
[ Upstream commit 54aa39a513dbf2164ca462a19f04519b2407a224 ]
Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board.
Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state.
Signed-off-by: Andrey Konovalov andrey.konovalov@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++- include/linux/stmmac.h | 1 + 3 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c index 835caa15d55ff..732774645c1a6 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c @@ -560,6 +560,8 @@ static int qcom_ethqos_probe(struct platform_device *pdev) plat_dat->has_gmac4 = 1; plat_dat->pmt = 1; plat_dat->tso_en = of_property_read_bool(np, "snps,tso"); + if (of_device_is_compatible(np, "qcom,qcs404-ethqos")) + plat_dat->rx_clk_runs_in_lpi = 1;
ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res); if (ret) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 4bba0444c764a..84e1740b12f1b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1077,7 +1077,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
stmmac_mac_set(priv, priv->ioaddr, true); if (phy && priv->dma_cap.eee) { - priv->eee_active = phy_init_eee(phy, 1) >= 0; + priv->eee_active = + phy_init_eee(phy, !priv->plat->rx_clk_runs_in_lpi) >= 0; priv->eee_enabled = stmmac_eee_init(priv); priv->tx_lpi_enabled = priv->eee_enabled; stmmac_set_eee_pls(priv, priv->hw, true); diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h index fb2e88614f5d1..313edd19bf545 100644 --- a/include/linux/stmmac.h +++ b/include/linux/stmmac.h @@ -252,6 +252,7 @@ struct plat_stmmacenet_data { int rss_en; int mac_port_sel_speed; bool en_tx_lpi_clockgating; + bool rx_clk_runs_in_lpi; int has_xgmac; bool vlan_fail_q_en; u8 vlan_fail_q;
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit c28548012ee2bac55772ef7685138bd1124b80c3 ]
Interrupt entry sets the soft mask to IRQS_ALL_DISABLED to match the hard irq disabled state. So when should_hard_irq_enable() returns true because we want PMI interrupts in irq handlers, MSR[EE] is enabled but PMIs just get soft-masked. Fix this by clearing IRQS_PMI_DISABLED before enabling MSR[EE].
This also tidies some of the warnings; there is no need to duplicate them in both should_hard_irq_enable() and do_hard_irq_enable().
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20230121100156.2824054-1-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/hw_irq.h | 41 ++++++++++++++++++++++--------- arch/powerpc/kernel/dbell.c | 2 +- arch/powerpc/kernel/irq.c | 2 +- arch/powerpc/kernel/time.c | 2 +- 4 files changed, 32 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 0b7d01d408ac8..eb6d094083fd6 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -173,6 +173,15 @@ static inline notrace unsigned long irq_soft_mask_or_return(unsigned long mask) return flags; }
+static inline notrace unsigned long irq_soft_mask_andc_return(unsigned long mask) +{ + unsigned long flags = irq_soft_mask_return(); + + irq_soft_mask_set(flags & ~mask); + + return flags; +} + static inline unsigned long arch_local_save_flags(void) { return irq_soft_mask_return(); @@ -331,10 +340,11 @@ bool power_pmu_wants_prompt_pmi(void); * is a different soft-masked interrupt pending that requires hard * masking. */ -static inline bool should_hard_irq_enable(void) +static inline bool should_hard_irq_enable(struct pt_regs *regs) { if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) { - WARN_ON(irq_soft_mask_return() == IRQS_ENABLED); + WARN_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED); + WARN_ON(!(get_paca()->irq_happened & PACA_IRQ_HARD_DIS)); WARN_ON(mfmsr() & MSR_EE); }
@@ -347,8 +357,17 @@ static inline bool should_hard_irq_enable(void) * * TODO: Add test for 64e */ - if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !power_pmu_wants_prompt_pmi()) - return false; + if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) { + if (!power_pmu_wants_prompt_pmi()) + return false; + /* + * If PMIs are disabled then IRQs should be disabled as well, + * so we shouldn't see this condition, check for it just in + * case because we are about to enable PMIs. + */ + if (WARN_ON_ONCE(regs->softe & IRQS_PMI_DISABLED)) + return false; + }
if (get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK) return false; @@ -358,18 +377,16 @@ static inline bool should_hard_irq_enable(void)
/* * Do the hard enabling, only call this if should_hard_irq_enable is true. + * This allows PMI interrupts to profile irq handlers. */ static inline void do_hard_irq_enable(void) { - if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) { - WARN_ON(irq_soft_mask_return() == IRQS_ENABLED); - WARN_ON(get_paca()->irq_happened & PACA_IRQ_MUST_HARD_MASK); - WARN_ON(mfmsr() & MSR_EE); - } /* - * This allows PMI interrupts (and watchdog soft-NMIs) through. - * There is no other reason to enable this way. + * Asynch interrupts come in with IRQS_ALL_DISABLED, + * PACA_IRQ_HARD_DIS, and MSR[EE]=0. */ + if (IS_ENABLED(CONFIG_PPC_BOOK3S_64)) + irq_soft_mask_andc_return(IRQS_PMI_DISABLED); get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS; __hard_irq_enable(); } @@ -452,7 +469,7 @@ static inline bool arch_irq_disabled_regs(struct pt_regs *regs) return !(regs->msr & MSR_EE); }
-static __always_inline bool should_hard_irq_enable(void) +static __always_inline bool should_hard_irq_enable(struct pt_regs *regs) { return false; } diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c index f55c6fb34a3a0..5712dd846263c 100644 --- a/arch/powerpc/kernel/dbell.c +++ b/arch/powerpc/kernel/dbell.c @@ -27,7 +27,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(doorbell_exception)
ppc_msgsync();
- if (should_hard_irq_enable()) + if (should_hard_irq_enable(regs)) do_hard_irq_enable();
kvmppc_clear_host_ipi(smp_processor_id()); diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 9ede61a5a469e..55142ff649f3f 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -238,7 +238,7 @@ static void __do_irq(struct pt_regs *regs, unsigned long oldsp) irq = static_call(ppc_get_irq)();
/* We can hard enable interrupts now to allow perf interrupts */ - if (should_hard_irq_enable()) + if (should_hard_irq_enable(regs)) do_hard_irq_enable();
/* And finally process it */ diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index a2ab397065c66..f157552d79b38 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -533,7 +533,7 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt) }
/* Conditionally hard-enable interrupts. */ - if (should_hard_irq_enable()) { + if (should_hard_irq_enable(regs)) { /* * Ensure a positive value is written to the decrementer, or * else some CPUs will continue to take decrementer exceptions.
From: Hou Tao houtao1@huawei.com
[ Upstream commit 3288666c72568fe1cc7f5c5ae33dfd3ab18004c8 ]
fscache_create_volume_work() uses wake_up_bit() to wake up the processes which are waiting for the completion of volume creation. According to comments in wake_up_bit() and waitqueue_active(), an extra smp_mb() is needed to guarantee the memory order between FSCACHE_VOLUME_CREATING flag and waitqueue_active() before invoking wake_up_bit().
Fix it by using clear_and_wake_up_bit() to add the missing memory barrier.
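For context, the helper pairs the bit clear with the barrier that the open-coded sequence was missing; roughly (a sketch of the pattern in include/linux/wait_bit.h, shown for illustration rather than as a verbatim copy):

  static inline void clear_and_wake_up_bit(int bit, void *word)
  {
          clear_bit_unlock(bit, word);
          /* order the clear against the waitqueue_active() check inside wake_up_bit() */
          smp_mb__after_atomic();
          wake_up_bit(word, bit);
  }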
Reviewed-by: Jingbo Xu jefflexu@linux.alibaba.com Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Link: https://lore.kernel.org/r/20230113115211.2895845-3-houtao@huaweicloud.com/ # v3 Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fscache/volume.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/fscache/volume.c b/fs/fscache/volume.c index 903af9d85f8b9..cdf991bdd9def 100644 --- a/fs/fscache/volume.c +++ b/fs/fscache/volume.c @@ -280,8 +280,7 @@ static void fscache_create_volume_work(struct work_struct *work) fscache_end_cache_access(volume->cache, fscache_access_acquire_volume_end);
- clear_bit_unlock(FSCACHE_VOLUME_CREATING, &volume->flags); - wake_up_bit(&volume->flags, FSCACHE_VOLUME_CREATING); + clear_and_wake_up_bit(FSCACHE_VOLUME_CREATING, &volume->flags); fscache_put_volume(volume, fscache_volume_put_create_work); }
From: Ben Skeggs bskeggs@redhat.com
[ Upstream commit d22915d22ded21fd5b24b60d174775789f173997 ]
Starting from Turing, the driver is no longer responsible for initiating DEVINIT when required, as the GPU now loads a FW image from ROM and executes DEVINIT itself after power-on.
However, we apparently still need to wait for it to complete.
This should correct some issues with runpm on some systems, where we get control of the HW before it's been fully reinitialised after resume from suspend.
Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Lyude Paul lyude@redhat.com Signed-off-by: Lyude Paul lyude@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230130223715.1831509-1-bskeg... Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/nouveau/nvkm/subdev/devinit/tu102.c | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c index 634f64f88fc8b..81a1ad2c88a7e 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/tu102.c @@ -65,10 +65,33 @@ tu102_devinit_pll_set(struct nvkm_devinit *init, u32 type, u32 freq) return ret; }
+static int +tu102_devinit_wait(struct nvkm_device *device) +{ + unsigned timeout = 50 + 2000; + + do { + if (nvkm_rd32(device, 0x118128) & 0x00000001) { + if ((nvkm_rd32(device, 0x118234) & 0x000000ff) == 0xff) + return 0; + } + + usleep_range(1000, 2000); + } while (timeout--); + + return -ETIMEDOUT; +} + int tu102_devinit_post(struct nvkm_devinit *base, bool post) { struct nv50_devinit *init = nv50_devinit(base); + int ret; + + ret = tu102_devinit_wait(init->base.subdev.device); + if (ret) + return ret; + gm200_devinit_preos(init, post); return 0; }
From: Kees Cook keescook@chromium.org
[ Upstream commit f3eceaed9edd7c0e0d9fb057613131f92973626f ]
There doesn't appear to be a reason to truncate the allocation used for flow_info, so do a full allocation and remove the unused empty struct. GCC does not like having a reference to an object that has been partially allocated, as bounds checking may become impossible when such an object is passed to other code. Seen with GCC 13:
../drivers/net/ethernet/mediatek/mtk_ppe.c: In function 'mtk_foe_entry_commit_subflow': ../drivers/net/ethernet/mediatek/mtk_ppe.c:623:18: warning: array subscript 'struct mtk_flow_entry[0]' is partly outside array bounds of 'unsigned char[48]' [-Warray-bounds=] 623 | flow_info->l2_data.base_flow = entry; | ^~
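A standalone userspace sketch of the underlying issue, using a hypothetical struct and assuming GCC with -Warray-bounds: allocating only a prefix of a struct can leave accesses through a full-struct-typed pointer looking "partly outside" the object to the compiler, whereas sizing the allocation with sizeof(*ptr) keeps allocation and type in sync.

  #include <stdio.h>
  #include <stdlib.h>
  #include <stddef.h>

  struct entry {                                  /* hypothetical stand-in for mtk_flow_entry */
          int type;
          struct {
                  void *base_flow;
                  struct {} end;                  /* zero-size marker (GNU C extension), as in the old code */
          } l2_data;
  };

  int main(void)
  {
          /* truncated allocation: only the bytes up to l2_data.end exist */
          struct entry *partial = calloc(1, offsetof(struct entry, l2_data.end));
          /* full allocation: safe to treat as a complete struct entry */
          struct entry *full = calloc(1, sizeof(*full));

          if (!partial || !full)
                  return 1;

          full->l2_data.base_flow = NULL;
          printf("truncated: %zu bytes, full: %zu bytes\n",
                 offsetof(struct entry, l2_data.end), sizeof(struct entry));

          free(partial);
          free(full);
          return 0;
  }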
Cc: Felix Fietkau nbd@nbd.name Cc: John Crispin john@phrozen.org Cc: Sean Wang sean.wang@mediatek.com Cc: Mark Lee Mark-MC.Lee@mediatek.com Cc: Lorenzo Bianconi lorenzo@kernel.org Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Cc: Matthias Brugger matthias.bgg@gmail.com Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mediatek@lists.infradead.org Signed-off-by: Kees Cook keescook@chromium.org Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230127223853.never.014-kees@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_ppe.c | 3 +-- drivers/net/ethernet/mediatek/mtk_ppe.h | 1 - 2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c index 784ecb2dc9fbd..34ea8af48c3d0 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe.c @@ -595,8 +595,7 @@ mtk_foe_entry_commit_subflow(struct mtk_ppe *ppe, struct mtk_flow_entry *entry, u32 ib1_mask = mtk_get_ib1_pkt_type_mask(ppe->eth) | MTK_FOE_IB1_UDP; int type;
- flow_info = kzalloc(offsetof(struct mtk_flow_entry, l2_data.end), - GFP_ATOMIC); + flow_info = kzalloc(sizeof(*flow_info), GFP_ATOMIC); if (!flow_info) return;
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.h b/drivers/net/ethernet/mediatek/mtk_ppe.h index a09c32539bcc9..e66283b1bc79e 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.h +++ b/drivers/net/ethernet/mediatek/mtk_ppe.h @@ -277,7 +277,6 @@ struct mtk_flow_entry { struct { struct mtk_flow_entry *base_flow; struct hlist_node list; - struct {} end; } l2_data; }; struct rhash_head node;
From: Kees Cook keescook@chromium.org
[ Upstream commit de5ca4c3852f896cacac2bf259597aab5e17d9e3 ]
Nothing was explicitly bounds checking the priority index used to access clprio[]. WARN and bail out early if it's pathological. Seen with GCC 13:
../net/sched/sch_htb.c: In function 'htb_activate_prios': ../net/sched/sch_htb.c:437:44: warning: array subscript [0, 31] is outside array bounds of 'struct htb_prio[8]' [-Warray-bounds=] 437 | if (p->inner.clprio[prio].feed.rb_node) | ~~~~~~~~~~~~~~~^~~~~~ ../net/sched/sch_htb.c:131:41: note: while referencing 'clprio' 131 | struct htb_prio clprio[TC_HTB_NUMPRIO]; | ^~~~~~
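A minimal userspace sketch of the guard pattern, with a hypothetical array and __builtin_ctz() standing in for the kernel's bit helpers: derive an index from a bitmask and refuse to use it when it would overrun the array. Note that for a zero-based array an index equal to the array size is already out of range, so the sketch checks with >=.

  #include <stdio.h>

  #define NUMPRIO 8

  static int feed[NUMPRIO];

  int main(void)
  {
          unsigned int mask = (1u << 3) | (1u << 9);      /* bit 9 would index past the array */

          while (mask) {
                  unsigned int prio = __builtin_ctz(mask); /* lowest set bit, like ffz(~m) */

                  mask &= ~(1u << prio);
                  if (prio >= NUMPRIO) {                   /* bail out instead of indexing past the array */
                          fprintf(stderr, "prio %u out of range, ignoring\n", prio);
                          continue;
                  }
                  feed[prio]++;
          }
          printf("feed[3] = %d\n", feed[3]);
          return 0;
  }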
Cc: Jamal Hadi Salim jhs@mojatatu.com Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Jiri Pirko jiri@resnulli.us Cc: "David S. Miller" davem@davemloft.net Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Cong Wang cong.wang@bytedance.com Link: https://lore.kernel.org/r/20230127224036.never.561-kees@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_htb.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 3afac9c21a763..14a202b5a3187 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -427,7 +427,10 @@ static void htb_activate_prios(struct htb_sched *q, struct htb_class *cl) while (cl->cmode == HTB_MAY_BORROW && p && mask) { m = mask; while (m) { - int prio = ffz(~m); + unsigned int prio = ffz(~m); + + if (WARN_ON_ONCE(prio > ARRAY_SIZE(p->inner.clprio))) + break; m &= ~(1 << prio);
if (p->inner.clprio[prio].feed.rb_node)
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 7ab41c2c08a32132ba8c14624910e2fe8ce4ba4b ]
Historically calls to __decompress() didn't specify "out_len" parameter on many architectures including s390, expecting that no writes beyond uncompressed kernel image are performed. This has changed since commit 2aa14b1ab2c4 ("zstd: import usptream v1.5.2") which includes zstd library commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer (#2751)"). Now zstd decompression code might store literal buffer in the unwritten portion of the destination buffer. Since "out_len" is not set, it is considered to be unlimited and hence free to use for optimization needs. On s390 this might corrupt initrd or ipl report which are often placed right after the decompressor buffer. Luckily the size of uncompressed kernel image is already known to the decompressor, so to avoid the problem simply specify it in the "out_len" parameter.
Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb Signed-off-by: Vasily Gorbik gor@linux.ibm.com Tested-by: Alexander Egorenkov egorenar@linux.ibm.com Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-her... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/boot/decompressor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/boot/decompressor.c b/arch/s390/boot/decompressor.c index e27c2140d6206..623f6775d01d7 100644 --- a/arch/s390/boot/decompressor.c +++ b/arch/s390/boot/decompressor.c @@ -80,6 +80,6 @@ void *decompress_kernel(void) void *output = (void *)decompress_offset;
__decompress(_compressed_start, _compressed_end - _compressed_start, - NULL, NULL, output, 0, NULL, error); + NULL, NULL, output, vmlinux.image_size, NULL, error); return output; }
From: Amit Engel Amit.Engel@dell.com
[ Upstream commit 0cab4404874f2de52617de8400c844891c6ea1ce ]
As part of nvmet_fc_ls_create_association there is a case where nvmet_fc_alloc_target_queue fails right after a new association with an admin queue is created. In this case, no one releases the reference taken in nvmet_fc_alloc_target_assoc(). This fix adds the missing put.
Signed-off-by: Amit Engel Amit.Engel@dell.com Reviewed-by: James Smart jsmart2021@gmail.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/fc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c index ab2627e17bb97..1ab6601fdd5cf 100644 --- a/drivers/nvme/target/fc.c +++ b/drivers/nvme/target/fc.c @@ -1685,8 +1685,10 @@ nvmet_fc_ls_create_association(struct nvmet_fc_tgtport *tgtport, else { queue = nvmet_fc_alloc_target_queue(iod->assoc, 0, be16_to_cpu(rqst->assoc_cmd.sqsize)); - if (!queue) + if (!queue) { ret = VERR_QUEUE_ALLOC_FAIL; + nvmet_fc_tgt_a_put(iod->assoc); + } } }
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit fd62678ab55cb01e11a404d302cdade222bf4022 ]
If nvme_alloc_admin_tag_set() fails, the admin_q and fabrics_q pointers are left with an invalid, non-NULL value. Other functions may then check the pointers and dereference them, e.g. in
nvme_probe() -> out_disable: -> nvme_dev_remove_admin().
Fix the bug by setting admin_q and fabrics_q to NULL in case of error.
Also use the set variable to free the tag_set as ctrl->admin_tagset isn't initialized yet.
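A generic userspace sketch of the pattern with hypothetical names (not the nvme code): reset pointers on the error path so that later cleanup does not act on stale values.

  #include <stdio.h>
  #include <stdlib.h>

  struct ctrl {
          void *admin_q;
          void *fabrics_q;
  };

  static int setup(struct ctrl *c, int fail)
  {
          c->admin_q = malloc(16);
          c->fabrics_q = malloc(16);
          if (!c->admin_q || !c->fabrics_q || fail)
                  goto out_free;
          return 0;

  out_free:
          free(c->admin_q);
          free(c->fabrics_q);
          c->admin_q = NULL;      /* without this, teardown() below would double-free */
          c->fabrics_q = NULL;
          return -1;
  }

  static void teardown(struct ctrl *c)
  {
          free(c->admin_q);       /* safe: a failed setup() left these as NULL */
          free(c->fabrics_q);
  }

  int main(void)
  {
          struct ctrl c;

          if (setup(&c, 1))
                  fprintf(stderr, "setup failed, pointers reset\n");
          teardown(&c);
          return 0;
  }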
Signed-off-by: Maurizio Lombardi mlombard@redhat.com Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 25ade4ce8e0a7..e189ce17deb3e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4881,7 +4881,9 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set, out_cleanup_admin_q: blk_mq_destroy_queue(ctrl->admin_q); out_free_tagset: - blk_mq_free_tag_set(ctrl->admin_tagset); + blk_mq_free_tag_set(set); + ctrl->admin_q = NULL; + ctrl->fabrics_q = NULL; return ret; } EXPORT_SYMBOL_GPL(nvme_alloc_admin_tag_set);
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit 6fbf13c0e24fd86ab2e4477cd8484a485b687421 ]
In nvme_alloc_io_tag_set(), the connect_q pointer should be set to NULL in case of error to avoid potential invalid pointer dereferences.
Signed-off-by: Maurizio Lombardi mlombard@redhat.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index e189ce17deb3e..5acc9ae225df3 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4933,6 +4933,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
out_free_tag_set: blk_mq_free_tag_set(set); + ctrl->connect_q = NULL; return ret; } EXPORT_SYMBOL_GPL(nvme_alloc_io_tag_set);
From: Daniel Miess Daniel.Miess@amd.com
[ Upstream commit ea062fd28f922cb118bfb33229f405b81aff7781 ]
[Why] Brackets are missing in the calculation of MIN_DST_Y_NEXT_START
[How] Add the missing brackets to this calculation
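To make the precedence difference concrete, a tiny standalone sketch with made-up numbers (not values taken from the driver): a / b / c divides by both terms, while a / (b / c) divides by the ratio, which is what the line-time calculation here wants.

  #include <stdio.h>

  int main(void)
  {
          double tsetup = 2.0, htotal = 2200.0, pixclock = 148.5;  /* hypothetical values */

          double without_brackets = 4.0 * tsetup / htotal / pixclock;
          double with_brackets    = 4.0 * tsetup / (htotal / pixclock);

          printf("without brackets: %g\nwith brackets:    %g\n",
                 without_brackets, with_brackets);
          return 0;
  }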
Reviewed-by: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Daniel Miess Daniel.Miess@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c index 0d12fd079cd61..3afd3c80e6da8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c @@ -3184,7 +3184,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman } else { v->MIN_DST_Y_NEXT_START[k] = v->VTotal[k] - v->VFrontPorch[k] + v->VTotal[k] - v->VActive[k] - v->VStartup[k]; } - v->MIN_DST_Y_NEXT_START[k] += dml_floor(4.0 * v->TSetup[k] / (double)v->HTotal[k] / v->PixelClock[k], 1.0) / 4.0; + v->MIN_DST_Y_NEXT_START[k] += dml_floor(4.0 * v->TSetup[k] / ((double)v->HTotal[k] / v->PixelClock[k]), 1.0) / 4.0; if (((v->VUpdateOffsetPix[k] + v->VUpdateWidthPix[k] + v->VReadyOffsetPix[k]) / v->HTotal[k]) <= (isInterlaceTiming ? dml_floor((v->VTotal[k] - v->VActive[k] - v->VFrontPorch[k] - v->VStartup[k]) / 2.0, 1.0) :
From: Daniel Miess Daniel.Miess@amd.com
[ Upstream commit dd2db2dc4bd298f33dea50c80c3c11bee4e3b0a4 ]
[Why] Lower max_downscale_ratio and ARGB888 downscale factor to prevent cases where underflow may occur on dcn314
[How] Set max_downscale_ratio to 400 and ARGB downscale factor to 250 for dcn314
Reviewed-by: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Daniel Miess Daniel.Miess@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c index 9066c511a0529..c80c8c8f51e97 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c @@ -871,8 +871,9 @@ static const struct dc_plane_cap plane_cap = { },
// 6:1 downscaling ratio: 1000/6 = 166.666 + // 4:1 downscaling ratio for ARGB888 to prevent underflow during P010 playback: 1000/4 = 250 .max_downscale_factor = { - .argb8888 = 167, + .argb8888 = 250, .nv12 = 167, .fp16 = 167 }, @@ -1755,7 +1756,7 @@ static bool dcn314_resource_construct( pool->base.underlay_pipe_index = NO_UNDERLAY_PIPE; pool->base.pipe_count = pool->base.res_cap->num_timing_generator; pool->base.mpcc_count = pool->base.res_cap->num_timing_generator; - dc->caps.max_downscale_ratio = 600; + dc->caps.max_downscale_ratio = 400; dc->caps.i2c_speed_in_khz = 100; dc->caps.i2c_speed_in_khz_hdcp = 100; dc->caps.max_cursor_size = 256;
From: George Shen george.shen@amd.com
[ Upstream commit 275d8a1db261a1272a818d40ebc61b3b865b60e5 ]
[Why] The hwss function does_plane_fit_in_mall is not applicable to dcn3.2 ASICs. Using it with dcn3.2 can result in undefined behaviour.
[How] Assign the function pointer to NULL.
Reviewed-by: Alvin Lee Alvin.Lee2@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: George Shen george.shen@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c index 45a949ba6f3f3..7b7f0e6b2a2ff 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c @@ -94,7 +94,7 @@ static const struct hw_sequencer_funcs dcn32_funcs = { .get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync, .calc_vupdate_position = dcn10_calc_vupdate_position, .apply_idle_power_optimizations = dcn32_apply_idle_power_optimizations, - .does_plane_fit_in_mall = dcn30_does_plane_fit_in_mall, + .does_plane_fit_in_mall = NULL, .set_backlight_level = dcn21_set_backlight_level, .set_abm_immediate_disable = dcn21_set_abm_immediate_disable, .hardware_release = dcn30_hardware_release,
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
[ Upstream commit 154711aa5759ef9b45903124fa813c4c29ee681c ]
[Why] Otherwise we can be out of sync with what's in the hardware, leading us to rerun every command that's presently in the ringbuffer.
[How] Reset software state for the mailboxes in hw_reset callback. This is already done as part of the mailbox init in hw_init, but we do need to remember to reset the last cached wptr value as well here.
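A standalone sketch of the idea with a hypothetical structure (not the dmub code): when the hardware side of a ring buffer is reset, the cached software read/write pointers must be reset too, or the driver will treat stale entries as pending work.

  #include <stdio.h>
  #include <string.h>

  #define RB_SIZE 8

  struct ringbuf {
          unsigned int cmds[RB_SIZE];
          unsigned int rptr, wptr;        /* software-cached pointers */
  };

  static void rb_push(struct ringbuf *rb, unsigned int cmd)
  {
          rb->cmds[rb->wptr % RB_SIZE] = cmd;
          rb->wptr++;
  }

  static void rb_hw_reset(struct ringbuf *rb)
  {
          /* hardware pointers go back to zero; mirror that in the software state */
          memset(rb->cmds, 0, sizeof(rb->cmds));
          rb->rptr = 0;
          rb->wptr = 0;
  }

  int main(void)
  {
          struct ringbuf rb = { { 0 }, 0, 0 };

          rb_push(&rb, 0x10);
          rb_push(&rb, 0x11);
          rb_hw_reset(&rb);
          printf("after reset: rptr=%u wptr=%u\n", rb.rptr, rb.wptr);
          return 0;
  }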
Reviewed-by: Hansen Dsouza hansen.dsouza@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c index 4a122925c3ae9..92c18bfb98b3b 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -532,6 +532,9 @@ enum dmub_status dmub_srv_hw_init(struct dmub_srv *dmub, if (dmub->hw_funcs.reset) dmub->hw_funcs.reset(dmub);
+ /* reset the cache of the last wptr as well now that hw is reset */ + dmub->inbox1_last_wptr = 0; + cw0.offset.quad_part = inst_fb->gpu_addr; cw0.region.base = DMUB_CW0_BASE; cw0.region.top = cw0.region.base + inst_fb->size - 1; @@ -649,6 +652,15 @@ enum dmub_status dmub_srv_hw_reset(struct dmub_srv *dmub) if (dmub->hw_funcs.reset) dmub->hw_funcs.reset(dmub);
+ /* mailboxes have been reset in hw, so reset the sw state as well */ + dmub->inbox1_last_wptr = 0; + dmub->inbox1_rb.wrpt = 0; + dmub->inbox1_rb.rptr = 0; + dmub->outbox0_rb.wrpt = 0; + dmub->outbox0_rb.rptr = 0; + dmub->outbox1_rb.wrpt = 0; + dmub->outbox1_rb.rptr = 0; + dmub->hw_init = false;
return DMUB_STATUS_OK;
From: Evan Quan evan.quan@amd.com
[ Upstream commit bb25849c0fa550b26cecc9c476c519a927c66898 ]
Enable HDP clock gating control for gfx 11.0.3.
Signed-off-by: Evan Quan evan.quan@amd.com Reviewed-by: Feifei Xu Feifei.Xu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/soc21.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index 9bc9852b9cda9..230e15fed755c 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc21.c +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c @@ -643,7 +643,8 @@ static int soc21_common_early_init(void *handle) AMD_CG_SUPPORT_GFX_CGCG | AMD_CG_SUPPORT_GFX_CGLS | AMD_CG_SUPPORT_REPEATER_FGCG | - AMD_CG_SUPPORT_GFX_MGCG; + AMD_CG_SUPPORT_GFX_MGCG | + AMD_CG_SUPPORT_HDP_SD; adev->pg_flags = AMD_PG_SUPPORT_VCN | AMD_PG_SUPPORT_VCN_DPG | AMD_PG_SUPPORT_JPEG;
From: Yiqing Yao yiqing.yao@amd.com
[ Upstream commit ac7170082c0e140663f0853d3de733a5341ce7b0 ]
These sysfs nodes have been tested and are supported, so enable them.
Signed-off-by: Yiqing Yao yiqing.yao@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 41635694e5216..2f3e239e623dc 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -2009,14 +2009,16 @@ static int default_attr_update(struct amdgpu_device *adev, struct amdgpu_device_ gc_ver == IP_VERSION(10, 3, 0) || gc_ver == IP_VERSION(10, 1, 2) || gc_ver == IP_VERSION(11, 0, 0) || - gc_ver == IP_VERSION(11, 0, 2))) + gc_ver == IP_VERSION(11, 0, 2) || + gc_ver == IP_VERSION(11, 0, 3))) *states = ATTR_STATE_UNSUPPORTED; } else if (DEVICE_ATTR_IS(pp_dpm_dclk)) { if (!(gc_ver == IP_VERSION(10, 3, 1) || gc_ver == IP_VERSION(10, 3, 0) || gc_ver == IP_VERSION(10, 1, 2) || gc_ver == IP_VERSION(11, 0, 0) || - gc_ver == IP_VERSION(11, 0, 2))) + gc_ver == IP_VERSION(11, 0, 2) || + gc_ver == IP_VERSION(11, 0, 3))) *states = ATTR_STATE_UNSUPPORTED; } else if (DEVICE_ATTR_IS(pp_power_profile_mode)) { if (amdgpu_dpm_get_power_profile_mode(adev, NULL) == -EOPNOTSUPP)
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 6fc547a5a2ef5ce05b16924106663ab92f8f87a7 ]
There could be boards with DCN listed in IP discovery, but no display hardware actually wired up. In this case the vbios display table will not be populated. Detect this case and skip loading DM when we detect it.
v2: Mark DCN as harvested as well so other display checks elsewhere in the driver are handled properly.
Cc: Aurabindo Pillai aurabindo.pillai@amd.com Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 988b1c947aefc..2d63248d09bbb 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4526,6 +4526,17 @@ DEVICE_ATTR_WO(s3_debug); static int dm_early_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; + struct amdgpu_mode_info *mode_info = &adev->mode_info; + struct atom_context *ctx = mode_info->atom_context; + int index = GetIndexIntoMasterTable(DATA, Object_Header); + u16 data_offset; + + /* if there is no object header, skip DM */ + if (!amdgpu_atom_parse_data_header(ctx, index, NULL, NULL, NULL, &data_offset)) { + adev->harvest_ip_mask |= AMD_HARVEST_IP_DMU_MASK; + dev_info(adev->dev, "No object header, skipping DM\n"); + return -ENOENT; + }
switch (adev->asic_type) { #if defined(CONFIG_DRM_AMD_DC_SI)
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit eecf2acd4a580e9364e5087daf0effca60a240b7 ]
Add a DMI match for the CWI501 version of the Chuwi Vi8 tablet, pointing to the same chuwi_vi8_data as the existing CWI506 version DMI match.
Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230202103413.331459-1-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/touchscreen_dmi.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/platform/x86/touchscreen_dmi.c b/drivers/platform/x86/touchscreen_dmi.c index f00995390fdfe..13802a3c3591d 100644 --- a/drivers/platform/x86/touchscreen_dmi.c +++ b/drivers/platform/x86/touchscreen_dmi.c @@ -1097,6 +1097,15 @@ const struct dmi_system_id touchscreen_dmi_table[] = { DMI_MATCH(DMI_BIOS_DATE, "05/07/2016"), }, }, + { + /* Chuwi Vi8 (CWI501) */ + .driver_data = (void *)&chuwi_vi8_data, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "i86"), + DMI_MATCH(DMI_BIOS_VERSION, "CHUWI.W86JLBNR01"), + }, + }, { /* Chuwi Vi8 (CWI506) */ .driver_data = (void *)&chuwi_vi8_data,
From: Xiubo Li xiubli@redhat.com
[ Upstream commit b38b17b6a01ca4e738af097a1529910646ef4270 ]
These flags are only used by the ceph filesystem in fs/ceph, so move them to where they belong.
Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Venky Shankar vshankar@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/super.h | 10 ++++++++++ include/linux/ceph/libceph.h | 10 ---------- 2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index ae4126f634101..735279b2ceb55 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -100,6 +100,16 @@ struct ceph_mount_options { char *mon_addr; };
+/* mount state */ +enum { + CEPH_MOUNT_MOUNTING, + CEPH_MOUNT_MOUNTED, + CEPH_MOUNT_UNMOUNTING, + CEPH_MOUNT_UNMOUNTED, + CEPH_MOUNT_SHUTDOWN, + CEPH_MOUNT_RECOVER, +}; + #define CEPH_ASYNC_CREATE_CONFLICT_BITS 8
struct ceph_fs_client { diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h index 00af2c98da75a..4497d0a6772cd 100644 --- a/include/linux/ceph/libceph.h +++ b/include/linux/ceph/libceph.h @@ -99,16 +99,6 @@ struct ceph_options {
#define CEPH_AUTH_NAME_DEFAULT "guest"
-/* mount state */ -enum { - CEPH_MOUNT_MOUNTING, - CEPH_MOUNT_MOUNTED, - CEPH_MOUNT_UNMOUNTING, - CEPH_MOUNT_UNMOUNTED, - CEPH_MOUNT_SHUTDOWN, - CEPH_MOUNT_RECOVER, -}; - static inline unsigned long ceph_timeout_jiffies(unsigned long timeout) { return timeout ?: MAX_SCHEDULE_TIMEOUT;
From: Xiubo Li xiubli@redhat.com
[ Upstream commit a68e564adcaa69b0930809fb64d9d5f7d9c32ba9 ]
When a corrupted snap trace is received we don't know what exactly has happened on the MDS side, and we shouldn't continue issuing IOs and metadata requests to the MDS, which may corrupt data or return incorrect contents.
This patch blocks all further IO/MDS requests immediately and then evicts the kclient itself.
The reason we still need to evict the kclient right after blocking all further IOs is so that the MDS can revoke the caps faster.
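For illustration only (not part of the patch): a minimal userspace sketch of the fencing pattern described above, with hypothetical names; the real code uses READ_ONCE/WRITE_ONCE on mdsc->fsc->mount_state as shown in the diff below.

  /*
   * Userspace sketch, hypothetical names; C11 atomics stand in for the
   * READ_ONCE/WRITE_ONCE used in the real patch. Once corruption is
   * detected a single flag is set, and every later request path checks
   * it first and fails fast with -EIO instead of talking to the MDS.
   */
  #include <errno.h>
  #include <stdatomic.h>
  #include <stdio.h>

  enum { MOUNT_MOUNTED, MOUNT_FENCE_IO };

  static _Atomic int mount_state = MOUNT_MOUNTED;

  static int submit_request(const char *what)
  {
      if (atomic_load(&mount_state) == MOUNT_FENCE_IO)
          return -EIO;                /* refuse all further requests */
      printf("submitting %s\n", what);
      return 0;
  }

  static void handle_corrupted_snap_trace(void)
  {
      atomic_store(&mount_state, MOUNT_FENCE_IO);     /* fence first... */
      puts("blocklisting client, remount required");  /* ...then evict */
  }

  int main(void)
  {
      submit_request("getattr");
      handle_corrupted_snap_trace();
      if (submit_request("setattr") == -EIO)
          puts("request rejected: client is fenced");
      return 0;
  }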
Link: https://tracker.ceph.com/issues/57686 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Venky Shankar vshankar@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/addr.c | 17 +++++++++++++++-- fs/ceph/caps.c | 16 +++++++++++++--- fs/ceph/file.c | 3 +++ fs/ceph/mds_client.c | 30 +++++++++++++++++++++++++++--- fs/ceph/snap.c | 36 ++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 1 + 6 files changed, 93 insertions(+), 10 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 61f47debec5ac..478c03bfba663 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -305,7 +305,7 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) struct inode *inode = rreq->inode; struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); - struct ceph_osd_request *req; + struct ceph_osd_request *req = NULL; struct ceph_vino vino = ceph_vino(inode); struct iov_iter iter; struct page **pages; @@ -313,6 +313,11 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) int err = 0; u64 len = subreq->len;
+ if (ceph_inode_is_shutdown(inode)) { + err = -EIO; + goto out; + } + if (ceph_has_inline_data(ci) && ceph_netfs_issue_op_inline(subreq)) return;
@@ -563,6 +568,9 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
dout("writepage %p idx %lu\n", page, page->index);
+ if (ceph_inode_is_shutdown(inode)) + return -EIO; + /* verify this is a writeable snap context */ snapc = page_snap_context(page); if (!snapc) { @@ -1643,7 +1651,7 @@ int ceph_uninline_data(struct file *file) struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_request *req = NULL; - struct ceph_cap_flush *prealloc_cf; + struct ceph_cap_flush *prealloc_cf = NULL; struct folio *folio = NULL; u64 inline_version = CEPH_INLINE_NONE; struct page *pages[1]; @@ -1657,6 +1665,11 @@ int ceph_uninline_data(struct file *file) dout("uninline_data %p %llx.%llx inline_version %llu\n", inode, ceph_vinop(inode), inline_version);
+ if (ceph_inode_is_shutdown(inode)) { + err = -EIO; + goto out; + } + if (inline_version == CEPH_INLINE_NONE) return 0;
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cd69bf267d1b1..795fd6d84bde0 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4081,6 +4081,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, void *p, *end; struct cap_extra_info extra_info = {}; bool queue_trunc; + bool close_sessions = false;
dout("handle_caps from mds%d\n", session->s_mds);
@@ -4218,9 +4219,13 @@ void ceph_handle_caps(struct ceph_mds_session *session, realm = NULL; if (snaptrace_len) { down_write(&mdsc->snap_rwsem); - ceph_update_snap_trace(mdsc, snaptrace, - snaptrace + snaptrace_len, - false, &realm); + if (ceph_update_snap_trace(mdsc, snaptrace, + snaptrace + snaptrace_len, + false, &realm)) { + up_write(&mdsc->snap_rwsem); + close_sessions = true; + goto done; + } downgrade_write(&mdsc->snap_rwsem); } else { down_read(&mdsc->snap_rwsem); @@ -4280,6 +4285,11 @@ void ceph_handle_caps(struct ceph_mds_session *session, iput(inode); out: ceph_put_string(extra_info.pool_ns); + + /* Defer closing the sessions after s_mutex lock being released */ + if (close_sessions) + ceph_mdsc_close_sessions(mdsc); + return;
flush_cap_releases: diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 6f9580defb2b3..5895797f3104a 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -2004,6 +2004,9 @@ static int ceph_zero_partial_object(struct inode *inode, loff_t zero = 0; int op;
+ if (ceph_inode_is_shutdown(inode)) + return -EIO; + if (!length) { op = offset ? CEPH_OSD_OP_DELETE : CEPH_OSD_OP_TRUNCATE; length = &zero; diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 756560df3bdbd..27a245d959c0a 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -806,6 +806,9 @@ static struct ceph_mds_session *register_session(struct ceph_mds_client *mdsc, { struct ceph_mds_session *s;
+ if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_FENCE_IO) + return ERR_PTR(-EIO); + if (mds >= mdsc->mdsmap->possible_max_rank) return ERR_PTR(-EINVAL);
@@ -1478,6 +1481,9 @@ static int __open_session(struct ceph_mds_client *mdsc, int mstate; int mds = session->s_mds;
+ if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_FENCE_IO) + return -EIO; + /* wait for mds to go active? */ mstate = ceph_mdsmap_get_state(mdsc->mdsmap, mds); dout("open_session to mds%d (%s)\n", mds, @@ -2860,6 +2866,11 @@ static void __do_request(struct ceph_mds_client *mdsc, return; }
+ if (READ_ONCE(mdsc->fsc->mount_state) == CEPH_MOUNT_FENCE_IO) { + dout("do_request metadata corrupted\n"); + err = -EIO; + goto finish; + } if (req->r_timeout && time_after_eq(jiffies, req->r_started + req->r_timeout)) { dout("do_request timed out\n"); @@ -3245,6 +3256,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) u64 tid; int err, result; int mds = session->s_mds; + bool close_sessions = false;
if (msg->front.iov_len < sizeof(*head)) { pr_err("mdsc_handle_reply got corrupt (short) reply\n"); @@ -3351,10 +3363,17 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) realm = NULL; if (rinfo->snapblob_len) { down_write(&mdsc->snap_rwsem); - ceph_update_snap_trace(mdsc, rinfo->snapblob, + err = ceph_update_snap_trace(mdsc, rinfo->snapblob, rinfo->snapblob + rinfo->snapblob_len, le32_to_cpu(head->op) == CEPH_MDS_OP_RMSNAP, &realm); + if (err) { + up_write(&mdsc->snap_rwsem); + close_sessions = true; + if (err == -EIO) + ceph_msg_dump(msg); + goto out_err; + } downgrade_write(&mdsc->snap_rwsem); } else { down_read(&mdsc->snap_rwsem); @@ -3412,6 +3431,10 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) req->r_end_latency, err); out: ceph_mdsc_put_request(req); + + /* Defer closing the sessions after s_mutex lock being released */ + if (close_sessions) + ceph_mdsc_close_sessions(mdsc); return; }
@@ -5017,7 +5040,7 @@ static bool done_closing_sessions(struct ceph_mds_client *mdsc, int skipped) }
/* - * called after sb is ro. + * called after sb is ro or when metadata corrupted. */ void ceph_mdsc_close_sessions(struct ceph_mds_client *mdsc) { @@ -5307,7 +5330,8 @@ static void mds_peer_reset(struct ceph_connection *con) struct ceph_mds_client *mdsc = s->s_mdsc;
pr_warn("mds%d closed our session\n", s->s_mds); - send_mds_reconnect(mdsc, s); + if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO) + send_mds_reconnect(mdsc, s); }
static void mds_dispatch(struct ceph_connection *con, struct ceph_msg *msg) diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index e4151852184e0..87007203f130e 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/ceph/ceph_debug.h>
+#include <linux/fs.h> #include <linux/sort.h> #include <linux/slab.h> #include <linux/iversion.h> @@ -766,8 +767,10 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, struct ceph_snap_realm *realm; struct ceph_snap_realm *first_realm = NULL; struct ceph_snap_realm *realm_to_rebuild = NULL; + struct ceph_client *client = mdsc->fsc->client; int rebuild_snapcs; int err = -ENOMEM; + int ret; LIST_HEAD(dirty_realms);
lockdep_assert_held_write(&mdsc->snap_rwsem); @@ -884,6 +887,27 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc, if (first_realm) ceph_put_snap_realm(mdsc, first_realm); pr_err("%s error %d\n", __func__, err); + + /* + * When receiving a corrupted snap trace we don't know what + * exactly has happened in MDS side. And we shouldn't continue + * writing to OSD, which may corrupt the snapshot contents. + * + * Just try to blocklist this kclient and then this kclient + * must be remounted to continue after the corrupted metadata + * fixed in the MDS side. + */ + WRITE_ONCE(mdsc->fsc->mount_state, CEPH_MOUNT_FENCE_IO); + ret = ceph_monc_blocklist_add(&client->monc, &client->msgr.inst.addr); + if (ret) + pr_err("%s failed to blocklist %s: %d\n", __func__, + ceph_pr_addr(&client->msgr.inst.addr), ret); + + WARN(1, "%s: %s%sdo remount to continue%s", + __func__, ret ? "" : ceph_pr_addr(&client->msgr.inst.addr), + ret ? "" : " was blocklisted, ", + err == -EIO ? " after corrupted snaptrace is fixed" : ""); + return err; }
@@ -984,6 +1008,7 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, __le64 *split_inos = NULL, *split_realms = NULL; int i; int locked_rwsem = 0; + bool close_sessions = false;
/* decode */ if (msg->front.iov_len < sizeof(*h)) @@ -1092,8 +1117,12 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, * update using the provided snap trace. if we are deleting a * snap, we can avoid queueing cap_snaps. */ - ceph_update_snap_trace(mdsc, p, e, - op == CEPH_SNAP_OP_DESTROY, NULL); + if (ceph_update_snap_trace(mdsc, p, e, + op == CEPH_SNAP_OP_DESTROY, + NULL)) { + close_sessions = true; + goto bad; + }
if (op == CEPH_SNAP_OP_SPLIT) /* we took a reference when we created the realm, above */ @@ -1112,6 +1141,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc, out: if (locked_rwsem) up_write(&mdsc->snap_rwsem); + + if (close_sessions) + ceph_mdsc_close_sessions(mdsc); return; }
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 735279b2ceb55..3599fefa91f99 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -108,6 +108,7 @@ enum { CEPH_MOUNT_UNMOUNTED, CEPH_MOUNT_SHUTDOWN, CEPH_MOUNT_RECOVER, + CEPH_MOUNT_FENCE_IO, };
#define CEPH_ASYNC_CREATE_CONFLICT_BITS 8
From: Matthieu Baerts matthieu.baerts@tessares.net
Commit 4656d72c1efa ("selftests: mptcp: userspace: validate v4-v6 subflows mix") has been backported to v6.1.8 without any conflicts, but it turns out it depends on a previous one:
commit 1cc94ac1af4b ("selftests: mptcp: make evts global in userspace_pm")
Without it, the test fails with:
./userspace_pm.sh: line 788: : No such file or directory # ADD_ADDR4 id:14 10.0.2.1 (ns1) => ns2, reuse port [FAIL] sed: can't read : No such file or directory
That dependency refactors the way the monitoring files are created: only once for all the different sub-tests instead of per sub-test.
It is probably better to avoid backporting the refactoring. That is why the new sub-test has been adapted to work using the previous way that is still in place here in v6.1: the monitoring is started at the beginning of each sub-test and the created file is removed at the end.
Fixes: f59549814a64 ("selftests: mptcp: userspace: validate v4-v6 subflows mix") Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/mptcp/userspace_pm.sh | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/userspace_pm.sh b/tools/testing/selftests/net/mptcp/userspace_pm.sh index 0040e3bc7b16e..ad6547c79b831 100755 --- a/tools/testing/selftests/net/mptcp/userspace_pm.sh +++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh @@ -778,6 +778,14 @@ test_subflows()
test_subflows_v4_v6_mix() { + local client_evts + client_evts=$(mktemp) + # Capture events on the network namespace running the client + :>"$client_evts" + ip netns exec "$ns2" ./pm_nl_ctl events >> "$client_evts" 2>&1 & + evts_pid=$! + sleep 0.5 + # Attempt to add a listener at 10.0.2.1:<subflow-port> ip netns exec "$ns1" ./pm_nl_ctl listen 10.0.2.1\ $app6_port > /dev/null 2>&1 & @@ -820,6 +828,9 @@ test_subflows_v4_v6_mix() ip netns exec "$ns1" ./pm_nl_ctl rem id $server_addr_id token\ "$server6_token" > /dev/null 2>&1 sleep 0.5 + + kill_wait $evts_pid + rm -f "$client_evts" }
test_prio()
From: Isaac J. Manjarres isaacmanjarres@google.com
commit ce4d9a1ea35ac5429e822c4106cb2859d5c71f3e upstream.
Patch series "Fix kmemleak crashes when scanning CMA regions", v2.
When trying to boot a device with an ARM64 kernel with the following config options enabled:
CONFIG_DEBUG_PAGEALLOC=y CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y CONFIG_DEBUG_KMEMLEAK=y
a crash is encountered when kmemleak starts to scan the list of gray or allocated objects that it maintains. Upon closer inspection, it was observed that these page-faults always occurred when kmemleak attempted to scan a CMA region.
At the moment, kmemleak is made aware of CMA regions that are specified through the devicetree to be dynamically allocated within a range of addresses. However, kmemleak should not need to scan CMA regions or any reserved memory region, as those regions can be used for DMA transfers between drivers and peripherals, and thus wouldn't contain anything useful for kmemleak.
Additionally, since CMA regions are unmapped from the kernel's address space when they are freed to the buddy allocator at boot when CONFIG_DEBUG_PAGEALLOC is enabled, kmemleak shouldn't attempt to access those memory regions, as that will trigger a crash. Thus, kmemleak should ignore all dynamically allocated reserved memory regions.
This patch (of 1):
Currently, kmemleak ignores dynamically allocated reserved memory regions that don't have a kernel mapping. However, regions that do retain a kernel mapping (e.g. CMA regions) do get scanned by kmemleak.
This is not ideal for two reasons:
1) kmemleak works by scanning memory regions for pointers to allocated objects to determine if those objects have been leaked or not. However, reserved memory regions can be used between drivers and peripherals for DMA transfers, and thus, would not contain pointers to allocated objects, making it unnecessary for kmemleak to scan these reserved memory regions.
2) When CONFIG_DEBUG_PAGEALLOC is enabled, along with kmemleak, the CMA reserved memory regions are unmapped from the kernel's address space when they are freed to buddy at boot. These CMA reserved regions are still tracked by kmemleak, however, and when kmemleak attempts to scan them, a crash will happen, as accessing the CMA region will result in a page-fault, since the regions are unmapped.
Thus, use kmemleak_ignore_phys() for all dynamically allocated reserved memory regions, instead of those that do not have a kernel mapping associated with them.
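For illustration only (not part of the patch): a simplified userspace sketch of the control-flow change, with hypothetical helper names; the real change is the one-line move in the reserved-memory allocation helper shown in the diff below.

  /* Sketch: the "ignore this physical range" call moves out of the
   * nomap-only branch so every dynamically allocated reserved region,
   * including mapped CMA regions, is excluded from leak scanning. */
  #include <stdbool.h>
  #include <stdio.h>

  static void ignore_phys(unsigned long base) { printf("ignore %#lx\n", base); }
  static int mark_nomap(unsigned long base)   { (void)base; return 0; }

  static int alloc_reserved_mem(unsigned long base, bool nomap)
  {
      int err = 0;

      if (nomap)
          err = mark_nomap(base);

      ignore_phys(base);    /* now done for mapped regions (e.g. CMA) too */
      return err;
  }

  int main(void)
  {
      alloc_reserved_mem(0x80000000UL, false);    /* CMA-like region */
      alloc_reserved_mem(0x90000000UL, true);     /* nomap region */
      return 0;
  }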
Link: https://lkml.kernel.org/r/20230208232001.2052777-1-isaacmanjarres@google.com Link: https://lkml.kernel.org/r/20230208232001.2052777-2-isaacmanjarres@google.com Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Isaac J. Manjarres isaacmanjarres@google.com Acked-by: Mike Rapoport (IBM) rppt@kernel.org Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: Frank Rowand frowand.list@gmail.com Cc: Kirill A. Shutemov kirill.shtuemov@linux.intel.com Cc: Nick Kossifidis mick@ics.forth.gr Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Rob Herring robh@kernel.org Cc: Russell King (Oracle) rmk+kernel@armlinux.org.uk Cc: Saravana Kannan saravanak@google.com Cc: stable@vger.kernel.org [5.15+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/of_reserved_mem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/of/of_reserved_mem.c +++ b/drivers/of/of_reserved_mem.c @@ -48,9 +48,10 @@ static int __init early_init_dt_alloc_re err = memblock_mark_nomap(base, size); if (err) memblock_phys_free(base, size); - kmemleak_ignore_phys(base); }
+ kmemleak_ignore_phys(base); + return err; }
From: Christophe Leroy christophe.leroy@csgroup.eu
commit 55d77bae73426237b3c74c1757a894b056550dff upstream.
On powerpc64, you can build a kernel with KASAN as soon as you build it with RADIX MMU support. However if the CPU doesn't have RADIX MMU, KASAN isn't enabled at init and the following Oops is encountered.
[ 0.000000][ T0] KASAN not enabled as it requires radix!
[ 4.484295][ T26] BUG: Unable to handle kernel data access at 0xc00e000000804a04 [ 4.485270][ T26] Faulting instruction address: 0xc00000000062ec6c [ 4.485748][ T26] Oops: Kernel access of bad area, sig: 11 [#1] [ 4.485920][ T26] BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries [ 4.486259][ T26] Modules linked in: [ 4.486637][ T26] CPU: 0 PID: 26 Comm: kworker/u2:2 Not tainted 6.2.0-rc3-02590-gf8a023b0a805 #249 [ 4.486907][ T26] Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,HEAD pSeries [ 4.487445][ T26] Workqueue: eval_map_wq .tracer_init_tracefs_work_func [ 4.488744][ T26] NIP: c00000000062ec6c LR: c00000000062bb84 CTR: c0000000002ebcd0 [ 4.488867][ T26] REGS: c0000000049175c0 TRAP: 0380 Not tainted (6.2.0-rc3-02590-gf8a023b0a805) [ 4.489028][ T26] MSR: 8000000002009032 <SF,VEC,EE,ME,IR,DR,RI> CR: 44002808 XER: 00000000 [ 4.489584][ T26] CFAR: c00000000062bb80 IRQMASK: 0 [ 4.489584][ T26] GPR00: c0000000005624d4 c000000004917860 c000000001cfc000 1800000000804a04 [ 4.489584][ T26] GPR04: c0000000003a2650 0000000000000cc0 c00000000000d3d8 c00000000000d3d8 [ 4.489584][ T26] GPR08: c0000000049175b0 a80e000000000000 0000000000000000 0000000017d78400 [ 4.489584][ T26] GPR12: 0000000044002204 c000000003790000 c00000000435003c c0000000043f1c40 [ 4.489584][ T26] GPR16: c0000000043f1c68 c0000000043501a0 c000000002106138 c0000000043f1c08 [ 4.489584][ T26] GPR20: c0000000043f1c10 c0000000043f1c20 c000000004146c40 c000000002fdb7f8 [ 4.489584][ T26] GPR24: c000000002fdb834 c000000003685e00 c000000004025030 c000000003522e90 [ 4.489584][ T26] GPR28: 0000000000000cc0 c0000000003a2650 c000000004025020 c000000004025020 [ 4.491201][ T26] NIP [c00000000062ec6c] .kasan_byte_accessible+0xc/0x20 [ 4.491430][ T26] LR [c00000000062bb84] .__kasan_check_byte+0x24/0x90 [ 4.491767][ T26] Call Trace: [ 4.491941][ T26] [c000000004917860] [c00000000062ae70] .__kasan_kmalloc+0xc0/0x110 (unreliable) [ 4.492270][ T26] [c0000000049178f0] [c0000000005624d4] .krealloc+0x54/0x1c0 [ 4.492453][ T26] [c000000004917990] [c0000000003a2650] .create_trace_option_files+0x280/0x530 [ 4.492613][ T26] [c000000004917a90] [c000000002050d90] .tracer_init_tracefs_work_func+0x274/0x2c0 [ 4.492771][ T26] [c000000004917b40] [c0000000001f9948] .process_one_work+0x578/0x9f0 [ 4.492927][ T26] [c000000004917c30] [c0000000001f9ebc] .worker_thread+0xfc/0x950 [ 4.493084][ T26] [c000000004917d60] [c00000000020be84] .kthread+0x1a4/0x1b0 [ 4.493232][ T26] [c000000004917e10] [c00000000000d3d8] .ret_from_kernel_thread+0x58/0x60 [ 4.495642][ T26] Code: 60000000 7cc802a6 38a00000 4bfffc78 60000000 7cc802a6 38a00001 4bfffc68 60000000 3d20a80e 7863e8c2 792907c6 <7c6348ae> 20630007 78630fe0 68630001 [ 4.496704][ T26] ---[ end trace 0000000000000000 ]---
The Oops is due to kasan_byte_accessible() not checking the readiness of KASAN. Add the missing call to kasan_arch_is_ready() and bail out when KASAN is not ready. The same problem is observed with ____kasan_kfree_large(), so fix it the same way.
Also, as KASAN is not available and no shadow area is allocated for linear memory mapping, there is no point in allocating shadow mem for vmalloc memory as shown below in /sys/kernel/debug/kernel_page_tables
---[ kasan shadow mem start ]--- 0xc00f000000000000-0xc00f00000006ffff 0x00000000040f0000 448K r w pte valid present dirty accessed 0xc00f000000860000-0xc00f00000086ffff 0x000000000ac10000 64K r w pte valid present dirty accessed 0xc00f3ffffffe0000-0xc00f3fffffffffff 0x0000000004d10000 128K r w pte valid present dirty accessed ---[ kasan shadow mem end ]---
So, also verify KASAN readiness before allocating and poisoning shadow mem for VMAs.
Link: https://lkml.kernel.org/r/150768c55722311699fdcf8f5379e8256749f47d.167471661... Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support") Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Reported-by: Nathan Lynch nathanl@linux.ibm.com Suggested-by: Michael Ellerman mpe@ellerman.id.au Cc: Alexander Potapenko glider@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: stable@vger.kernel.org [5.19+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kasan/common.c | 3 +++ mm/kasan/generic.c | 7 ++++++- mm/kasan/shadow.c | 12 ++++++++++++ 3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/mm/kasan/common.c b/mm/kasan/common.c index 833bf2cfd2a3..21e66d7f261d 100644 --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -246,6 +246,9 @@ bool __kasan_slab_free(struct kmem_cache *cache, void *object,
static inline bool ____kasan_kfree_large(void *ptr, unsigned long ip) { + if (!kasan_arch_is_ready()) + return false; + if (ptr != page_address(virt_to_head_page(ptr))) { kasan_report_invalid_free(ptr, ip, KASAN_REPORT_INVALID_FREE); return true; diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c index b076f597a378..cb762982c8ba 100644 --- a/mm/kasan/generic.c +++ b/mm/kasan/generic.c @@ -191,7 +191,12 @@ bool kasan_check_range(unsigned long addr, size_t size, bool write,
bool kasan_byte_accessible(const void *addr) { - s8 shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(addr)); + s8 shadow_byte; + + if (!kasan_arch_is_ready()) + return true; + + shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(addr));
return shadow_byte >= 0 && shadow_byte < KASAN_GRANULE_SIZE; } diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c index 2fba1f51f042..15cfb34d16a1 100644 --- a/mm/kasan/shadow.c +++ b/mm/kasan/shadow.c @@ -291,6 +291,9 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size) unsigned long shadow_start, shadow_end; int ret;
+ if (!kasan_arch_is_ready()) + return 0; + if (!is_vmalloc_or_module_addr((void *)addr)) return 0;
@@ -459,6 +462,9 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end, unsigned long region_start, region_end; unsigned long size;
+ if (!kasan_arch_is_ready()) + return; + region_start = ALIGN(start, KASAN_MEMORY_PER_SHADOW_PAGE); region_end = ALIGN_DOWN(end, KASAN_MEMORY_PER_SHADOW_PAGE);
@@ -502,6 +508,9 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, * with setting memory tags, so the KASAN_VMALLOC_INIT flag is ignored. */
+ if (!kasan_arch_is_ready()) + return (void *)start; + if (!is_vmalloc_or_module_addr(start)) return (void *)start;
@@ -524,6 +533,9 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size, */ void __kasan_poison_vmalloc(const void *start, unsigned long size) { + if (!kasan_arch_is_ready()) + return; + if (!is_vmalloc_or_module_addr(start)) return;
From: Qi Zheng zhengqi.arch@bytedance.com
commit badc28d4924bfed73efc93f716a0c3aa3afbdf6f upstream.
debugfs_remove_recursive() is invoked by unregister_shrinker(), which holds the write lock of shrinker_rwsem. It waits for the handler of the debugfs file to complete, but the handler also needs to take the read lock of shrinker_rwsem to do its work. So it may cause the following deadlock:
CPU0                                    CPU1

debugfs_file_get()
shrinker_debugfs_count_show()/shrinker_debugfs_scan_write()

                                        unregister_shrinker()
                                        --> down_write(&shrinker_rwsem);

                                            debugfs_remove_recursive()
                                                // wait for (A)
                                            --> wait_for_completion();

    // wait for (B)
--> down_read_killable(&shrinker_rwsem)
debugfs_file_put() -- (A)

                                            up_write() -- (B)
The down_read_killable() can be killed, so the above deadlock is recoverable, but that requires an extra kill action; otherwise all subsequent shrinker-related operations will block, so it's better to fix it.
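For illustration only (not part of the patch): a minimal pthreads sketch of the shape of the fix, with hypothetical names; the actual change, shown below, returns the dentry from shrinker_debugfs_remove() and calls debugfs_remove_recursive() only after up_write(&shrinker_rwsem).

  /*
   * Userspace sketch (pthreads, not kernel code): detach the debugfs
   * entry while holding the write lock, but do the blocking removal
   * only after the lock is dropped, so a reader that still needs the
   * lock can finish and signal completion.
   */
  #include <pthread.h>
  #include <stdio.h>

  static pthread_rwlock_t shrinker_lock = PTHREAD_RWLOCK_INITIALIZER;

  struct shrinker_sketch {
      const char *debugfs_entry;    /* stands in for the dentry */
  };

  /* like shrinker_debugfs_remove() after the fix: detach and hand back */
  static const char *detach_debugfs_entry(struct shrinker_sketch *s)
  {
      const char *entry = s->debugfs_entry;

      s->debugfs_entry = NULL;
      return entry;
  }

  /* like unregister_shrinker() after the fix */
  static void unregister_sketch(struct shrinker_sketch *s)
  {
      const char *entry;

      pthread_rwlock_wrlock(&shrinker_lock);
      entry = detach_debugfs_entry(s);    /* cheap, never blocks */
      pthread_rwlock_unlock(&shrinker_lock);

      /* the potentially blocking teardown runs with the lock released */
      if (entry)
          printf("removing %s outside the lock\n", entry);
  }

  int main(void)
  {
      struct shrinker_sketch s = { .debugfs_entry = "shrinker-42" };

      unregister_sketch(&s);
      return 0;
  }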
[akpm@linux-foundation.org: fix CONFIG_SHRINKER_DEBUG=n stub] Link: https://lkml.kernel.org/r/20230202105612.64641-1-zhengqi.arch@bytedance.com Fixes: 5035ebc644ae ("mm: shrinkers: introduce debugfs interface for memory shrinkers") Signed-off-by: Qi Zheng zhengqi.arch@bytedance.com Reviewed-by: Roman Gushchin roman.gushchin@linux.dev Cc: Kent Overstreet kent.overstreet@gmail.com Cc: Muchun Song songmuchun@bytedance.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/shrinker.h | 5 +++-- mm/shrinker_debug.c | 13 ++++++++----- mm/vmscan.c | 6 +++++- 3 files changed, 16 insertions(+), 8 deletions(-)
--- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -104,7 +104,7 @@ extern void synchronize_shrinkers(void);
#ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); -extern void shrinker_debugfs_remove(struct shrinker *shrinker); +extern struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker); extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...); #else /* CONFIG_SHRINKER_DEBUG */ @@ -112,8 +112,9 @@ static inline int shrinker_debugfs_add(s { return 0; } -static inline void shrinker_debugfs_remove(struct shrinker *shrinker) +static inline struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker) { + return NULL; } static inline __printf(2, 3) int shrinker_debugfs_rename(struct shrinker *shrinker, const char *fmt, ...) --- a/mm/shrinker_debug.c +++ b/mm/shrinker_debug.c @@ -246,18 +246,21 @@ int shrinker_debugfs_rename(struct shrin } EXPORT_SYMBOL(shrinker_debugfs_rename);
-void shrinker_debugfs_remove(struct shrinker *shrinker) +struct dentry *shrinker_debugfs_remove(struct shrinker *shrinker) { + struct dentry *entry = shrinker->debugfs_entry; + lockdep_assert_held(&shrinker_rwsem);
kfree_const(shrinker->name); shrinker->name = NULL;
- if (!shrinker->debugfs_entry) - return; + if (entry) { + ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id); + shrinker->debugfs_entry = NULL; + }
- debugfs_remove_recursive(shrinker->debugfs_entry); - ida_free(&shrinker_debugfs_ida, shrinker->debugfs_id); + return entry; }
static int __init shrinker_debugfs_init(void) --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -740,6 +740,8 @@ EXPORT_SYMBOL(register_shrinker); */ void unregister_shrinker(struct shrinker *shrinker) { + struct dentry *debugfs_entry; + if (!(shrinker->flags & SHRINKER_REGISTERED)) return;
@@ -748,9 +750,11 @@ void unregister_shrinker(struct shrinker shrinker->flags &= ~SHRINKER_REGISTERED; if (shrinker->flags & SHRINKER_MEMCG_AWARE) unregister_memcg_shrinker(shrinker); - shrinker_debugfs_remove(shrinker); + debugfs_entry = shrinker_debugfs_remove(shrinker); up_write(&shrinker_rwsem);
+ debugfs_remove_recursive(debugfs_entry); + kfree(shrinker->nr_deferred); shrinker->nr_deferred = NULL; }
From: Seth Jenkins sethjenkins@google.com
commit 81e9d6f8647650a7bead74c5f926e29970e834d1 upstream.
Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced a null-deref if mremap is called on an old aio mapping after fork as mm->ioctx_table will be set to NULL.
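For illustration only (not part of the patch): a trivial userspace sketch of the added guard, with hypothetical names; after fork the table pointer may legitimately be NULL, so the mremap handler has to check it before walking the entries.

  /* Sketch: guard a possibly-NULL table before iterating over it. */
  #include <stdio.h>

  struct ioctx_table { int nr; };

  static int ring_mremap(struct ioctx_table *table)
  {
      if (!table)             /* the new check */
          return 0;
      for (int i = 0; i < table->nr; i++)
          printf("fixing up ctx %d\n", i);
      return 0;
  }

  int main(void)
  {
      ring_mremap(NULL);      /* previously a NULL dereference */
      return 0;
  }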
[jmoyer@redhat.com: fix 80 column issue] Link: https://lkml.kernel.org/r/x49sffq4nvg.fsf@segfault.boston.devel.redhat.com Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring") Signed-off-by: Seth Jenkins sethjenkins@google.com Signed-off-by: Jeff Moyer jmoyer@redhat.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Benjamin LaHaise bcrl@kvack.org Cc: Jann Horn jannh@google.com Cc: Pavel Emelyanov xemul@parallels.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/aio.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/aio.c +++ b/fs/aio.c @@ -361,6 +361,9 @@ static int aio_ring_mremap(struct vm_are spin_lock(&mm->ioctx_lock); rcu_read_lock(); table = rcu_dereference(mm->ioctx_table); + if (!table) + goto out_unlock; + for (i = 0; i < table->nr; i++) { struct kioctx *ctx;
@@ -374,6 +377,7 @@ static int aio_ring_mremap(struct vm_are } }
+out_unlock: rcu_read_unlock(); spin_unlock(&mm->ioctx_lock); return res;
From: Ronak Doshi doshir@vmware.com
commit ec76d0c2da5c6dfb6a33f1545cc15997013923da upstream.
Commit b3973bb40041 ("vmxnet3: set correct hash type based on rss information") added hashType information into the skb. However, the rssType field is only populated for the eop descriptor. This can lead to incorrect reporting of the hashType for packets which use multiple rx descriptors. Multiple rx descriptors are used for jumbo frames or LRO packets, which can hit this issue.
This patch moves the RSS code block under the eop descriptor handling.
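For illustration only (not part of the patch): a small userspace sketch with a hypothetical descriptor layout, showing why per-packet fields must be read from the descriptor that carries the end-of-packet flag.

  /* Sketch: for multi-descriptor (jumbo/LRO) packets only the last
   * descriptor carries valid per-packet metadata such as the RSS hash. */
  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  struct rx_desc {
      bool sop, eop;
      uint32_t rss_hash;    /* only meaningful when eop is set */
      int len;
  };

  int main(void)
  {
      struct rx_desc ring[] = {
          { .sop = true,  .eop = false, .rss_hash = 0,      .len = 1500 },
          { .sop = false, .eop = true,  .rss_hash = 0xabcd, .len = 700  },
      };
      int pkt_len = 0;

      for (unsigned i = 0; i < sizeof(ring) / sizeof(ring[0]); i++) {
          pkt_len += ring[i].len;
          if (!ring[i].eop)
              continue;       /* hash not valid yet */
          printf("packet len %d, hash 0x%x\n", pkt_len, ring[i].rss_hash);
          pkt_len = 0;
      }
      return 0;
  }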
Cc: stable@vger.kernel.org Fixes: b3973bb40041 ("vmxnet3: set correct hash type based on rss information") Signed-off-by: Ronak Doshi doshir@vmware.com Acked-by: Peng Li lpeng@vmware.com Acked-by: Guolin Yang gyang@vmware.com Link: https://lore.kernel.org/r/20230208223900.5794-1-doshir@vmware.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/vmxnet3/vmxnet3_drv.c | 50 +++++++++++++++++++------------------- 1 file changed, 25 insertions(+), 25 deletions(-)
--- a/drivers/net/vmxnet3/vmxnet3_drv.c +++ b/drivers/net/vmxnet3/vmxnet3_drv.c @@ -1546,31 +1546,6 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx rxd->len = rbi->len; }
-#ifdef VMXNET3_RSS - if (rcd->rssType != VMXNET3_RCD_RSS_TYPE_NONE && - (adapter->netdev->features & NETIF_F_RXHASH)) { - enum pkt_hash_types hash_type; - - switch (rcd->rssType) { - case VMXNET3_RCD_RSS_TYPE_IPV4: - case VMXNET3_RCD_RSS_TYPE_IPV6: - hash_type = PKT_HASH_TYPE_L3; - break; - case VMXNET3_RCD_RSS_TYPE_TCPIPV4: - case VMXNET3_RCD_RSS_TYPE_TCPIPV6: - case VMXNET3_RCD_RSS_TYPE_UDPIPV4: - case VMXNET3_RCD_RSS_TYPE_UDPIPV6: - hash_type = PKT_HASH_TYPE_L4; - break; - default: - hash_type = PKT_HASH_TYPE_L3; - break; - } - skb_set_hash(ctx->skb, - le32_to_cpu(rcd->rssHash), - hash_type); - } -#endif skb_record_rx_queue(ctx->skb, rq->qid); skb_put(ctx->skb, rcd->len);
@@ -1653,6 +1628,31 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx u32 mtu = adapter->netdev->mtu; skb->len += skb->data_len;
+#ifdef VMXNET3_RSS + if (rcd->rssType != VMXNET3_RCD_RSS_TYPE_NONE && + (adapter->netdev->features & NETIF_F_RXHASH)) { + enum pkt_hash_types hash_type; + + switch (rcd->rssType) { + case VMXNET3_RCD_RSS_TYPE_IPV4: + case VMXNET3_RCD_RSS_TYPE_IPV6: + hash_type = PKT_HASH_TYPE_L3; + break; + case VMXNET3_RCD_RSS_TYPE_TCPIPV4: + case VMXNET3_RCD_RSS_TYPE_TCPIPV6: + case VMXNET3_RCD_RSS_TYPE_UDPIPV4: + case VMXNET3_RCD_RSS_TYPE_UDPIPV6: + hash_type = PKT_HASH_TYPE_L4; + break; + default: + hash_type = PKT_HASH_TYPE_L3; + break; + } + skb_set_hash(skb, + le32_to_cpu(rcd->rssHash), + hash_type); + } +#endif vmxnet3_rx_csum(adapter, skb, (union Vmxnet3_GenericDesc *)rcd); skb->protocol = eth_type_trans(skb, adapter->netdev);
From: Takashi Iwai tiwai@suse.de
commit 3efc61d95259956db25347e2a9562c3e54546e20 upstream.
When an fbdev with deferred I/O is opened and then closed, the dirty pages remain queued in the pageref list and may later be processed by the delayed work. This can lead to page corruption, hitting an Oops.
This patch makes sure to cancel the delayed work and clean up the pageref list when closing the device, addressing the bug. Part of the cleanup code is factored out into a new helper function that is called from the common fb_release().
Reviewed-by: Patrik Jakobsson patrik.r.jakobsson@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Tested-by: Miko Larsson mikoxyzzz@gmail.com Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Reviewed-by: Thomas Zimmermann tzimmermann@suse.de Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20230129082856.22113-1-tiwai@s... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/video/fbdev/core/fb_defio.c | 10 +++++++++- drivers/video/fbdev/core/fbmem.c | 4 ++++ include/linux/fb.h | 1 + 3 files changed, 14 insertions(+), 1 deletion(-)
--- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -313,7 +313,7 @@ void fb_deferred_io_open(struct fb_info } EXPORT_SYMBOL_GPL(fb_deferred_io_open);
-void fb_deferred_io_cleanup(struct fb_info *info) +void fb_deferred_io_release(struct fb_info *info) { struct fb_deferred_io *fbdefio = info->fbdefio; struct page *page; @@ -327,6 +327,14 @@ void fb_deferred_io_cleanup(struct fb_in page = fb_deferred_io_page(info, i); page->mapping = NULL; } +} +EXPORT_SYMBOL_GPL(fb_deferred_io_release); + +void fb_deferred_io_cleanup(struct fb_info *info) +{ + struct fb_deferred_io *fbdefio = info->fbdefio; + + fb_deferred_io_release(info);
kvfree(info->pagerefs); mutex_destroy(&fbdefio->lock); --- a/drivers/video/fbdev/core/fbmem.c +++ b/drivers/video/fbdev/core/fbmem.c @@ -1453,6 +1453,10 @@ __releases(&info->lock) struct fb_info * const info = file->private_data;
lock_fb_info(info); +#if IS_ENABLED(CONFIG_FB_DEFERRED_IO) + if (info->fbdefio) + fb_deferred_io_release(info); +#endif if (info->fbops->fb_release) info->fbops->fb_release(info,1); module_put(info->fbops->owner); --- a/include/linux/fb.h +++ b/include/linux/fb.h @@ -662,6 +662,7 @@ extern int fb_deferred_io_init(struct f extern void fb_deferred_io_open(struct fb_info *info, struct inode *inode, struct file *file); +extern void fb_deferred_io_release(struct fb_info *info); extern void fb_deferred_io_cleanup(struct fb_info *info); extern int fb_deferred_io_fsync(struct file *file, loff_t start, loff_t end, int datasync);
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit bb2ff6c27bc9e1da4d3ec5e7b1d6b9df1092cb5a upstream.
CONFIG_DRM_USE_DYNAMIC_DEBUG breaks debug prints for (at least modular) drm drivers. The debug prints can be reinstated by manually frobbing /sys/module/drm/parameters/debug after the fact, but at that point the damage is done and all debugs from driver probe are lost. This makes drivers totally undebuggable.
There's a more complete fix in progress [1], with further details, but we need this fixed in stable kernels. Mark the feature as broken and disable it by default, with hopes distros follow suit and disable it as well.
[1] https://lore.kernel.org/r/20230125203743.564009-1-jim.cromie@gmail.com
Fixes: 84ec67288c10 ("drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro") Cc: Jim Cromie jim.cromie@gmail.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Maxime Ripard mripard@kernel.org Cc: Thomas Zimmermann tzimmermann@suse.de Cc: David Airlie airlied@gmail.com Cc: Daniel Vetter daniel@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Acked-by: Jim Cromie jim.cromie@gmail.com Acked-by: Maxime Ripard maxime@cerno.tech Signed-off-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230207143337.2126678-1-jani.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 315cbdf61979..9abfb482b615 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -53,7 +53,8 @@ config DRM_DEBUG_MM
config DRM_USE_DYNAMIC_DEBUG bool "use dynamic debug to implement drm.debug" - default y + default n + depends on BROKEN depends on DRM depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE depends on JUMP_LABEL
From: Jack Xiao Jack.Xiao@amd.com
commit 8f32378986218812083b127da5ba42d48297d7c4 upstream.
A warning about freeing memory was triggered during suspend. Move the self test out of the suspend path.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825 Cc: jfalempe@redhat.com Signed-off-by: Jack Xiao Jack.Xiao@amd.com Reviewed-by: Christian König christian.koenig@amd.com Reviewed-by: Feifei Xu Feifei.Xu@amd.com Reviewed-and-tested-by: Evan Quan evan.quan@amd.com Tested-by: Jocelyn Falempe jfalempe@redhat.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4248,6 +4248,9 @@ int amdgpu_device_resume(struct drm_devi #endif adev->in_suspend = false;
+ if (adev->enable_mes) + amdgpu_mes_self_test(adev); + if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D0)) DRM_WARN("smart shift update failed\n");
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c @@ -1339,7 +1339,7 @@ static int mes_v11_0_late_init(void *han struct amdgpu_device *adev = (struct amdgpu_device *)handle;
/* it's only intended for use in mes_self_test case, not for s0ix and reset */ - if (!amdgpu_in_reset(adev) && !adev->in_s0ix && + if (!amdgpu_in_reset(adev) && !adev->in_s0ix && !adev->in_suspend && (adev->ip_versions[GC_HWIP][0] != IP_VERSION(11, 0, 3))) amdgpu_mes_self_test(adev);
From: Leo Li sunpeng.li@amd.com
commit 2a00299e7447395d0898e7c6214817c06a61a8e8 upstream.
[Why]
drm_atomic_normalize_zpos() can return an error code when there's modeset lock contention. This was being ignored.
[How]
Bail out of atomic check if normalize_zpos() returns an error.
Fixes: b261509952bc ("drm/amd/display: Fix double cursor on non-video RGB MPO") Signed-off-by: Leo Li sunpeng.li@amd.com Tested-by: Mikhail Gavrilov mikhail.v.gavrilov@gmail.com Reviewed-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -9556,7 +9556,11 @@ static int amdgpu_dm_atomic_check(struct * `dcn10_can_pipe_disable_cursor`). By now, all modified planes are in * atomic state, so call drm helper to normalize zpos. */ - drm_atomic_normalize_zpos(dev, state); + ret = drm_atomic_normalize_zpos(dev, state); + if (ret) { + drm_dbg(dev, "drm_atomic_normalize_zpos() failed\n"); + goto fail; + }
/* Remove exiting planes if they are modified */ for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
From: Zack Rusin zackr@vmware.com
commit 1a6897921f52ceb2c8665ef826e405bd96385159 upstream.
On failure, ttm_bo_init_reserved puts the buffer object back, which causes it to be deleted, but kfree was still being called on the same buffer in vmw_bo_create, leading to a double free.
After the double free, vmw_gem_object_create_with_handle was setting the gem function objects before checking the return status of vmw_bo_create, leading to a null pointer access.
Fix the entire path by relying on ttm_bo_init_reserved to delete the buffer objects on failure and by making sure the return status is checked before setting the gem function objects on the buffer object.
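For illustration only (not part of the patch): a userspace sketch of the two bugs described above, with hypothetical names; an init helper that frees the object itself on failure means the caller must neither free it again nor dereference it before checking the return value.

  /* Sketch: ownership passes to the init helper, which frees the object
   * on failure, so the caller must check status before any use. */
  #include <stdio.h>
  #include <stdlib.h>

  struct buf { int size; };

  /* on failure this helper frees *p and clears it */
  static int buf_init(struct buf **p, int size)
  {
      if (size <= 0) {
          free(*p);
          *p = NULL;
          return -1;
      }
      (*p)->size = size;
      return 0;
  }

  static int buf_create(struct buf **p, int size)
  {
      *p = malloc(sizeof(**p));
      if (!*p)
          return -1;
      return buf_init(p, size);    /* no extra free() on failure here */
  }

  int main(void)
  {
      struct buf *b;

      if (buf_create(&b, 0) != 0) {    /* check status before using b */
          printf("create failed, b is %p\n", (void *)b);
          return 1;
      }
      printf("created buf of size %d\n", b->size);
      free(b);
      return 0;
  }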
Signed-off-by: Zack Rusin zackr@vmware.com Fixes: 8afa13a0583f ("drm/vmwgfx: Implement DRIVER_GEM") Reviewed-by: Maaz Mombasawala mombasawalam@vmware.com Reviewed-by: Martin Krastev krastevm@vmware.com Link: https://patchwork.freedesktop.org/patch/msgid/20230208180050.2093426-1-zack@... (cherry picked from commit 36d421e632e9a0e8375eaed0143551a34d81a7e3) Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 4 ++-- 2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c index aa1cd5126a32..53da183e2bfe 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -462,6 +462,9 @@ int vmw_bo_create(struct vmw_private *vmw, return -ENOMEM; }
+ /* + * vmw_bo_init will delete the *p_bo object if it fails + */ ret = vmw_bo_init(vmw, *p_bo, size, placement, interruptible, pin, bo_free); @@ -470,7 +473,6 @@ int vmw_bo_create(struct vmw_private *vmw,
return ret; out_error: - kfree(*p_bo); *p_bo = NULL; return ret; } diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c index ce609e7d758f..83d8f18cc16f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c @@ -146,11 +146,11 @@ int vmw_gem_object_create_with_handle(struct vmw_private *dev_priv, &vmw_sys_placement : &vmw_vram_sys_placement, true, false, &vmw_gem_destroy, p_vbo); - - (*p_vbo)->base.base.funcs = &vmw_gem_object_funcs; if (ret != 0) goto out_no_bo;
+ (*p_vbo)->base.base.funcs = &vmw_gem_object_funcs; + ret = drm_gem_handle_create(filp, &(*p_vbo)->base.base, handle); /* drop reference from allocate - handle holds it now */ drm_gem_object_put(&(*p_vbo)->base.base);
From: Zack Rusin zackr@vmware.com
commit a950b989ea29ab3b38ea7f6e3d2540700a3c54e8 upstream.
v3: Fix vmw_user_bo_lookup, which was also dropping the gem reference before the kernel was done with the buffer, depending on userspace doing the right thing. Same bug, different spot.
It is possible for userspace to predict the next buffer handle and to destroy the buffer while it's still used by the kernel. Delay dropping the internal reference on the buffers until the kernel is done with them.
Instead of immediately dropping the gem reference in vmw_user_bo_lookup and vmw_gem_object_create_with_handle, let the callers decide when they're ready to give control back to userspace.
Also fix the second usage of vmw_gem_object_create_with_handle in vmwgfx_surface.c, which wasn't grabbing an explicit reference to the gem object, so userspace could have destroyed it out from under the owning surface at any point.
Signed-off-by: Zack Rusin zackr@vmware.com Fixes: 8afa13a0583f ("drm/vmwgfx: Implement DRIVER_GEM") Reviewed-by: Martin Krastev krastevm@vmware.com Reviewed-by: Maaz Mombasawala mombasawalam@vmware.com Link: https://patchwork.freedesktop.org/patch/msgid/20230211050514.2431155-1-zack@... (cherry picked from commit 9ef8d83e8e25d5f1811b3a38eb1484f85f64296c) Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 8 +++++--- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c | 1 + drivers/gpu/drm/vmwgfx/vmwgfx_shader.c | 1 + drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 10 ++++++---- 7 files changed, 20 insertions(+), 10 deletions(-)
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -598,6 +598,7 @@ static int vmw_user_bo_synccpu_release(s ttm_bo_put(&vmw_bo->base); }
+ drm_gem_object_put(&vmw_bo->base.base); return ret; }
@@ -638,6 +639,7 @@ int vmw_user_bo_synccpu_ioctl(struct drm
ret = vmw_user_bo_synccpu_grab(vbo, arg->flags); vmw_bo_unreference(&vbo); + drm_gem_object_put(&vbo->base.base); if (unlikely(ret != 0)) { if (ret == -ERESTARTSYS || ret == -EBUSY) return -EBUSY; @@ -695,7 +697,7 @@ int vmw_bo_unref_ioctl(struct drm_device * struct vmw_buffer_object should be placed. * Return: Zero on success, Negative error code on error. * - * The vmw buffer object pointer will be refcounted. + * The vmw buffer object pointer will be refcounted (both ttm and gem) */ int vmw_user_bo_lookup(struct drm_file *filp, uint32_t handle, @@ -712,7 +714,6 @@ int vmw_user_bo_lookup(struct drm_file *
*out = gem_to_vmw_bo(gobj); ttm_bo_get(&(*out)->base); - drm_gem_object_put(gobj);
return 0; } @@ -779,7 +780,8 @@ int vmw_dumb_create(struct drm_file *fil ret = vmw_gem_object_create_with_handle(dev_priv, file_priv, args->size, &args->handle, &vbo); - + /* drop reference from allocate - handle holds it now */ + drm_gem_object_put(&vbo->base.base); return ret; }
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c @@ -1160,6 +1160,7 @@ static int vmw_translate_mob_ptr(struct } ret = vmw_validation_add_bo(sw_context->ctx, vmw_bo, true, false); ttm_bo_put(&vmw_bo->base); + drm_gem_object_put(&vmw_bo->base.base); if (unlikely(ret != 0)) return ret;
@@ -1214,6 +1215,7 @@ static int vmw_translate_guest_ptr(struc } ret = vmw_validation_add_bo(sw_context->ctx, vmw_bo, false, false); ttm_bo_put(&vmw_bo->base); + drm_gem_object_put(&vmw_bo->base.base); if (unlikely(ret != 0)) return ret;
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gem.c @@ -152,8 +152,6 @@ int vmw_gem_object_create_with_handle(st (*p_vbo)->base.base.funcs = &vmw_gem_object_funcs;
ret = drm_gem_handle_create(filp, &(*p_vbo)->base.base, handle); - /* drop reference from allocate - handle holds it now */ - drm_gem_object_put(&(*p_vbo)->base.base); out_no_bo: return ret; } @@ -180,6 +178,8 @@ int vmw_gem_object_create_ioctl(struct d rep->map_handle = drm_vma_node_offset_addr(&vbo->base.base.vma_node); rep->cur_gmr_id = handle; rep->cur_gmr_offset = 0; + /* drop reference from allocate - handle holds it now */ + drm_gem_object_put(&vbo->base.base); out_no_bo: return ret; } --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -1669,8 +1669,10 @@ static struct drm_framebuffer *vmw_kms_f
err_out: /* vmw_user_lookup_handle takes one ref so does new_fb */ - if (bo) + if (bo) { vmw_bo_unreference(&bo); + drm_gem_object_put(&bo->base.base); + } if (surface) vmw_surface_unreference(&surface);
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c @@ -458,6 +458,7 @@ int vmw_overlay_ioctl(struct drm_device ret = vmw_overlay_update_stream(dev_priv, buf, arg, true);
vmw_bo_unreference(&buf); + drm_gem_object_put(&buf->base.base);
out_unlock: mutex_unlock(&overlay->mutex); --- a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c @@ -807,6 +807,7 @@ static int vmw_shader_define(struct drm_ num_output_sig, tfile, shader_handle); out_bad_arg: vmw_bo_unreference(&buffer); + drm_gem_object_put(&buffer->base.base); return ret; }
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -683,7 +683,7 @@ static void vmw_user_surface_base_releas container_of(base, struct vmw_user_surface, prime.base); struct vmw_resource *res = &user_srf->srf.res;
- if (base->shareable && res && res->backup) + if (res && res->backup) drm_gem_object_put(&res->backup->base.base);
*p_base = NULL; @@ -860,7 +860,11 @@ int vmw_surface_define_ioctl(struct drm_ goto out_unlock; } vmw_bo_reference(res->backup); - drm_gem_object_get(&res->backup->base.base); + /* + * We don't expose the handle to the userspace and surface + * already holds a gem reference + */ + drm_gem_handle_delete(file_priv, backup_handle); }
tmp = vmw_resource_reference(&srf->res); @@ -1564,8 +1568,6 @@ vmw_gb_surface_define_internal(struct dr drm_vma_node_offset_addr(&res->backup->base.base.vma_node); rep->buffer_size = res->backup->base.base.size; rep->buffer_handle = backup_handle; - if (user_srf->prime.base.shareable) - drm_gem_object_get(&res->backup->base.base); } else { rep->buffer_map_handle = 0; rep->buffer_size = 0;
From: Paul Cercueil paul@crapouillou.net
commit 3f18c5046e633cc4bbad396b74c05d46d353033d upstream.
On JZ4760 and JZ4760B, SD cards fail to run if the maximum clock rate is set to 50 MHz, even though the controller officially does support it.
Until the actual bug is found and fixed, limit the maximum clock rate to 24 MHz.
Signed-off-by: Paul Cercueil paul@crapouillou.net Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131210229.68129-1-paul@crapouillou.net Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/jz4740_mmc.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/mmc/host/jz4740_mmc.c +++ b/drivers/mmc/host/jz4740_mmc.c @@ -1053,6 +1053,16 @@ static int jz4740_mmc_probe(struct platf mmc->ops = &jz4740_mmc_ops; if (!mmc->f_max) mmc->f_max = JZ_MMC_CLK_RATE; + + /* + * There seems to be a problem with this driver on the JZ4760 and + * JZ4760B SoCs. There, when using the maximum rate supported (50 MHz), + * the communication fails with many SD cards. + * Until this bug is sorted out, limit the maximum rate to 24 MHz. + */ + if (host->version == JZ_MMC_JZ4760 && mmc->f_max > JZ_MMC_CLK_RATE) + mmc->f_max = JZ_MMC_CLK_RATE; + mmc->f_min = mmc->f_max / 128; mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
From: Heiner Kallweit hkallweit1@gmail.com
commit 6ea6b95a7e3ec2015954cb514ee9dbc6dc80ec8f upstream.
Some SDIO WiFi modules stopped working after SDIO interrupt mode was added if cap_sdio_irq isn't set in the device tree. This patch was confirmed to fix the issue.
Fixes: 066ecde6d826 ("mmc: meson-gx: add SDIO interrupt support") Reported-by: Geraldo Nascimento geraldogabriel@gmail.com Tested-by: Geraldo Nascimento geraldogabriel@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Link: https://lore.kernel.org/r/816cba9f-ff92-31a2-60f0-aca542d1d13e@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/meson-gx-mmc.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)
--- a/drivers/mmc/host/meson-gx-mmc.c +++ b/drivers/mmc/host/meson-gx-mmc.c @@ -435,7 +435,8 @@ static int meson_mmc_clk_init(struct mes clk_reg |= FIELD_PREP(CLK_CORE_PHASE_MASK, CLK_PHASE_180); clk_reg |= FIELD_PREP(CLK_TX_PHASE_MASK, CLK_PHASE_0); clk_reg |= FIELD_PREP(CLK_RX_PHASE_MASK, CLK_PHASE_0); - clk_reg |= CLK_IRQ_SDIO_SLEEP(host); + if (host->mmc->caps & MMC_CAP_SDIO_IRQ) + clk_reg |= CLK_IRQ_SDIO_SLEEP(host); writel(clk_reg, host->regs + SD_EMMC_CLOCK);
/* get the mux parents */ @@ -948,16 +949,18 @@ static irqreturn_t meson_mmc_irq(int irq { struct meson_host *host = dev_id; struct mmc_command *cmd; - u32 status, raw_status; + u32 status, raw_status, irq_mask = IRQ_EN_MASK; irqreturn_t ret = IRQ_NONE;
+ if (host->mmc->caps & MMC_CAP_SDIO_IRQ) + irq_mask |= IRQ_SDIO; raw_status = readl(host->regs + SD_EMMC_STATUS); - status = raw_status & (IRQ_EN_MASK | IRQ_SDIO); + status = raw_status & irq_mask;
if (!status) { dev_dbg(host->dev, - "Unexpected IRQ! irq_en 0x%08lx - status 0x%08x\n", - IRQ_EN_MASK | IRQ_SDIO, raw_status); + "Unexpected IRQ! irq_en 0x%08x - status 0x%08x\n", + irq_mask, raw_status); return IRQ_NONE; }
@@ -1204,6 +1207,11 @@ static int meson_mmc_probe(struct platfo goto free_host; }
+ mmc->caps |= MMC_CAP_CMD23; + + if (mmc->caps & MMC_CAP_SDIO_IRQ) + mmc->caps2 |= MMC_CAP2_SDIO_IRQ_NOTHREAD; + host->data = (struct meson_mmc_data *) of_device_get_match_data(&pdev->dev); if (!host->data) { @@ -1277,11 +1285,6 @@ static int meson_mmc_probe(struct platfo
spin_lock_init(&host->lock);
- mmc->caps |= MMC_CAP_CMD23; - - if (mmc->caps & MMC_CAP_SDIO_IRQ) - mmc->caps2 |= MMC_CAP2_SDIO_IRQ_NOTHREAD; - if (host->dram_access_quirk) { /* Limit segments to 1 due to low available sram memory */ mmc->max_segs = 1;
From: Yang Yingliang yangyingliang@huawei.com
commit 605d9fb9556f8f5fb4566f4df1480f280f308ded upstream.
If sdio_add_func() or sdio_init_func() fails, sdio_remove_func() cannot release the resources, because the sdio function is not present in these two cases, so it won't call of_node_put() or put_device().
To fix these leaks, make sdio_func_present() only control whether device_del() needs to be called or not, then always call of_node_put() and put_device().
In the error case in sdio_init_func(), the reference on 'card->dev' is not taken; to avoid a redundant put in sdio_free_func_cis(), move the get_device() to sdio_alloc_func() and the put_device() to sdio_release_func(), which keeps the get/put calls balanced.
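For illustration only (not part of the patch): a trivial userspace sketch of the balanced get/put the patch establishes, with hypothetical names; the reference is taken where the link to the card is created and dropped where it goes away, so the error paths stay balanced.

  /* Sketch: take the parent reference in alloc, drop it in release. */
  #include <stdio.h>

  static int card_refs;

  static void get_card(void) { card_refs++; }
  static void put_card(void) { card_refs--; }

  struct func { int dummy; };

  static struct func *func_alloc(void)
  {
      static struct func f;

      get_card();        /* reference taken where the link is created */
      return &f;
  }

  static void func_release(struct func *f)
  {
      (void)f;
      put_card();        /* dropped where the link goes away */
  }

  int main(void)
  {
      struct func *f = func_alloc();

      /* even if the "add" step fails, release keeps the count balanced */
      func_release(f);
      printf("card_refs = %d\n", card_refs);
      return 0;
  }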
Without this patch, fault injection testing produces the following leak reports; after this fix, the leaks are gone.
unreferenced object 0xffff888112514000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741614 (age 124.774s) hex dump (first 32 bytes): 00 e0 6f 12 81 88 ff ff 60 58 8d 06 81 88 ff ff ..o.....`X...... 10 40 51 12 81 88 ff ff 10 40 51 12 81 88 ff ff .@Q......@Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<000000002f839ccb>] mmc_alloc_card+0x38/0xb0 [mmc_core] [<0000000004adcbf6>] mmc_sdio_init_card+0xde/0x170 [mmc_core] [<000000007538fea0>] mmc_attach_sdio+0xcb/0x1b0 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core]
unreferenced object 0xffff888112511000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741623 (age 124.766s) hex dump (first 32 bytes): 00 40 51 12 81 88 ff ff e0 58 8d 06 81 88 ff ff .@Q......X...... 10 10 51 12 81 88 ff ff 10 10 51 12 81 88 ff ff ..Q.......Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<00000000fcbe706c>] sdio_alloc_func+0x35/0x100 [mmc_core] [<00000000c68f4b50>] mmc_attach_sdio.cold.18+0xb1/0x395 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core]
Fixes: 3d10a1ba0d37 ("sdio: fix reference counting in sdio_remove_func()") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230130125808.3471254-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/sdio_bus.c | 17 ++++++++++++++--- drivers/mmc/core/sdio_cis.c | 12 ------------ 2 files changed, 14 insertions(+), 15 deletions(-)
--- a/drivers/mmc/core/sdio_bus.c +++ b/drivers/mmc/core/sdio_bus.c @@ -294,6 +294,12 @@ static void sdio_release_func(struct dev if (!(func->card->quirks & MMC_QUIRK_NONSTD_SDIO)) sdio_free_func_cis(func);
+ /* + * We have now removed the link to the tuples in the + * card structure, so remove the reference. + */ + put_device(&func->card->dev); + kfree(func->info); kfree(func->tmpbuf); kfree(func); @@ -324,6 +330,12 @@ struct sdio_func *sdio_alloc_func(struct
device_initialize(&func->dev);
+ /* + * We may link to tuples in the card structure, + * we need make sure we have a reference to it. + */ + get_device(&func->card->dev); + func->dev.parent = &card->dev; func->dev.bus = &sdio_bus_type; func->dev.release = sdio_release_func; @@ -377,10 +389,9 @@ int sdio_add_func(struct sdio_func *func */ void sdio_remove_func(struct sdio_func *func) { - if (!sdio_func_present(func)) - return; + if (sdio_func_present(func)) + device_del(&func->dev);
- device_del(&func->dev); of_node_put(func->dev.of_node); put_device(&func->dev); } --- a/drivers/mmc/core/sdio_cis.c +++ b/drivers/mmc/core/sdio_cis.c @@ -404,12 +404,6 @@ int sdio_read_func_cis(struct sdio_func return ret;
/* - * Since we've linked to tuples in the card structure, - * we must make sure we have a reference to it. - */ - get_device(&func->card->dev); - - /* * Vendor/device id is optional for function CIS, so * copy it from the card structure as needed. */ @@ -434,11 +428,5 @@ void sdio_free_func_cis(struct sdio_func }
func->tuples = NULL; - - /* - * We have now removed the link to the tuples in the - * card structure, so remove the reference. - */ - put_device(&func->card->dev); }
From: Yang Yingliang yangyingliang@huawei.com
commit cf4c9d2ac1e42c7d18b921bec39486896645b714 upstream.
If mmc_add_host() fails, mmc_remove_host() must not be called, otherwise it causes a null-ptr-deref because mmc_remove_host() would delete a device that was never added.
To fix this, jump to the label 'fail_glue_init' if mmc_add_host() fails, and rename the label 'fail_add_host' to 'fail_gpiod_request'.
Fixes: 15a0580ced08 ("mmc_spi host driver") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Cc:stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131013835.3564011-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mmc_spi.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/mmc/host/mmc_spi.c +++ b/drivers/mmc/host/mmc_spi.c @@ -1437,7 +1437,7 @@ static int mmc_spi_probe(struct spi_devi
status = mmc_add_host(mmc); if (status != 0) - goto fail_add_host; + goto fail_glue_init;
/* * Index 0 is card detect @@ -1445,7 +1445,7 @@ static int mmc_spi_probe(struct spi_devi */ status = mmc_gpiod_request_cd(mmc, NULL, 0, false, 1000); if (status == -EPROBE_DEFER) - goto fail_add_host; + goto fail_gpiod_request; if (!status) { /* * The platform has a CD GPIO signal that may support @@ -1460,7 +1460,7 @@ static int mmc_spi_probe(struct spi_devi /* Index 1 is write protect/read only */ status = mmc_gpiod_request_ro(mmc, NULL, 1, 0); if (status == -EPROBE_DEFER) - goto fail_add_host; + goto fail_gpiod_request; if (!status) has_ro = true;
@@ -1474,7 +1474,7 @@ static int mmc_spi_probe(struct spi_devi ? ", cd polling" : ""); return 0;
-fail_add_host: +fail_gpiod_request: mmc_remove_host(mmc); fail_glue_init: mmc_spi_dma_free(host);
From: Cezary Rojewski cezary.rojewski@intel.com
commit 3af4a4f7a20c94009adba65764fa5a0269d70a82 upstream.
Commit f2bd1c5ae2cb ("ALSA: hda: Fix page fault in snd_hda_codec_shutdown()") relocated initialization of several codec device fields. Due to differences between codec_exec_verb() and snd_hdac_bus_exec_bus() in how they handle VERB execution - the latter does not touch PM - assigning ->exec_verb to codec_exec_verb() causes PM to be engaged before it is configured for the device. Configuration of PM for the ASoC HDAudio sound card is done with snd_hda_set_power_save() during skl_hda_audio_probe() whereas the assignment happens early, in snd_hda_codec_device_init().
Revert to previous behavior to avoid problems caused by too early PM manipulation.
Suggested-by: Jason Montleon jmontleo@redhat.com Link: https://lore.kernel.org/regressions/CALFERdzKUodLsm6=Ub3g2+PxpNpPtPq3bGBLbff... Fixes: f2bd1c5ae2cb ("ALSA: hda: Fix page fault in snd_hda_codec_shutdown()") Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://lore.kernel.org/r/20230210165541.3543604-1-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/hda_codec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index ac1cc7c5290e..2e728aad6771 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -927,7 +927,6 @@ snd_hda_codec_device_init(struct hda_bus *bus, unsigned int codec_addr, codec->depop_delay = -1; codec->fixup_id = HDA_FIXUP_ID_NOT_SET; codec->core.dev.release = snd_hda_codec_dev_release; - codec->core.exec_verb = codec_exec_verb; codec->core.type = HDA_DEV_LEGACY;
mutex_init(&codec->spdif_mutex); @@ -998,6 +997,7 @@ int snd_hda_codec_device_new(struct hda_bus *bus, struct snd_card *card, if (snd_BUG_ON(codec_addr > HDA_MAX_CODEC_ADDRESS)) return -EINVAL;
+ codec->core.exec_verb = codec_exec_verb; codec->card = card; codec->addr = codec_addr;
From: Bo Liu bo.liu@senarytech.com
commit 18d7e16c917a08f08778ecf2b780d63648d5d923 upstream.
The current kernel does not support the SN6180 codec chip. Add the SN6180 codec configuration entry to the kernel.
Signed-off-by: Bo Liu bo.liu@senarytech.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1675908828-1012-1-git-send-email-bo.liu@senarytech... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_conexant.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -1125,6 +1125,7 @@ static const struct hda_device_id snd_hd HDA_CODEC_ENTRY(0x14f11f87, "SN6140", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f12008, "CX8200", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f120d0, "CX11970", patch_conexant_auto), + HDA_CODEC_ENTRY(0x14f120d1, "SN6180", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15045, "CX20549 (Venice)", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15047, "CX20551 (Waikiki)", patch_conexant_auto), HDA_CODEC_ENTRY(0x14f15051, "CX20561 (Hermosa)", patch_conexant_auto),
From: Kailang Yang kailang@realtek.com
commit 2bdccfd290d421b50df4ec6a68d832dad1310748 upstream.
The GPIO2 pin is used as an output, so the Dir and Data masks need to be set to 0x4, not 0x3. This fix is for the Lenovo Desktop (0x17aa1056), where GPIO2 is used for the AMP enable.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8d02bb9ac8134f878cd08607fdf088fd@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -832,7 +832,7 @@ do_sku: alc_setup_gpio(codec, 0x02); break; case 7: - alc_setup_gpio(codec, 0x03); + alc_setup_gpio(codec, 0x04); break; case 5: default:
From: Andy Chi andy.chi@canonical.com
commit 5007b848ff2234ff7ea55755cb315766888988da upstream.
There is an HP platform that needs the ALC236_FIXUP_HP_GPIO_LED quirk to make mic-mute/audio-mute work.
Signed-off-by: Andy Chi andy.chi@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230214035853.31217-1-andy.chi@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9436,6 +9436,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x8b5e, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), SND_PCI_QUIRK(0x103c, 0x8b7a, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b7d, "HP", ALC236_FIXUP_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b87, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b8a, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b8b, "HP", ALC236_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b8d, "HP", ALC236_FIXUP_HP_GPIO_LED),
From: Andy Chi andy.chi@canonical.com
commit 9251584af09285133bec0595e5c7218fe2e595c9 upstream.
These HP laptops require the ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED quirk to make their audio LEDs and speakers work.
Signed-off-by: Andy Chi andy.chi@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230214140432.39654-1-andy.chi@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9432,6 +9432,12 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x8abb, "HP ZBook Firefly 14 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8ad1, "HP EliteBook 840 14 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8ad2, "HP EliteBook 860 16 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b42, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b43, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b44, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b45, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b46, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), + SND_PCI_QUIRK(0x103c, 0x8b47, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8b5d, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), SND_PCI_QUIRK(0x103c, 0x8b5e, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), SND_PCI_QUIRK(0x103c, 0x8b7a, "HP", ALC236_FIXUP_HP_GPIO_LED),
From: Simon Gaiser simon@invisiblethingslab.com
commit 104ff59af73aba524e57ae0fef70121643ff270e upstream.
Mark the Tiger Lake UP{3,4} AHCI controller as "low_power". This enables S0ix to work out of the box. Otherwise this isn't working unless the user manually sets /sys/class/scsi_host/*/link_power_management_policy.
Intel lists a total of 4 SATA controller IDs in [1] for those mobile PCHs. This commit just adds the "AHCI" variant since I only tested those.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/631119
Signed-off-by: Simon Gaiser simon@invisiblethingslab.com CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ata/ahci.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -422,6 +422,7 @@ static const struct pci_device_id ahci_p { PCI_VDEVICE(INTEL, 0x34d3), board_ahci_low_power }, /* Ice Lake LP AHCI */ { PCI_VDEVICE(INTEL, 0x02d3), board_ahci_low_power }, /* Comet Lake PCH-U AHCI */ { PCI_VDEVICE(INTEL, 0x02d7), board_ahci_low_power }, /* Comet Lake PCH RAID */ + { PCI_VDEVICE(INTEL, 0xa0d3), board_ahci_low_power }, /* Tiger Lake UP{3,4} AHCI */
/* JMicron 360/1/3/5/6, match class to avoid IDE function */ { PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
From: Patrick McLean chutzpah@gentoo.org
commit ead089577e0f55b238f980d9f62eaa90b7b64672 upstream.
Samsung MZ7LH drives are spewing messages like this in to dmesg with AMD SATA controllers:
ata1.00: exception Emask 0x0 SAct 0x7e0000 SErr 0x0 action 0x6 frozen ata1.00: failed command: SEND FPDMA QUEUED ata1.00: cmd 64/01:88:00:00:00/00:00:00:00:00/a0 tag 17 ncq dma 512 out res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Since this was seen previously with SSD 840 EVO drives in https://bugzilla.kernel.org/show_bug.cgi?id=203475 let's add the same fix for these drives as the EVOs have, since they likely have very similar firmwares.
Signed-off-by: Patrick McLean chutzpah@gentoo.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ata/libata-core.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4044,6 +4044,9 @@ static const struct ata_blacklist_entry { "Samsung SSD 870*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM | ATA_HORKAGE_NO_NCQ_ON_ATI }, + { "SAMSUNG*MZ7LH*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | + ATA_HORKAGE_ZERO_AFTER_TRIM | + ATA_HORKAGE_NO_NCQ_ON_ATI, }, { "FCCT*M500*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM },
From: Munehisa Kamata kamatam@amazon.com
commit c2dbe32d5db5c4ead121cf86dabd5ab691fb47fe upstream.
If a non-root cgroup gets removed when there is a thread that registered trigger and is polling on a pressure file within the cgroup, the polling waitqueue gets freed in the following path:
do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy
However, the polling thread still has a reference to the pressure file and will access the freed waitqueue when the file is closed or upon exit:
fput ep_eventpoll_release ep_free ep_remove_wait_queue remove_wait_queue
This results in use-after-free as pasted below.
The fundamental problem here is that cgroup_file_release() (and consequently waitqueue's lifetime) is not tied to the file's real lifetime. Using wake_up_pollfree() here might be less than ideal, but it is in line with the comment at commit 42288cb44c4b ("wait: add wake_up_pollfree()") since the waitqueue's lifetime is not tied to file's one and can be considered as another special case. While this would be fixable by somehow making cgroup_file_release() be tied to the fput(), it would require sizable refactoring at cgroups or higher layer which might be more justifiable if we identify more cases like this.
BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0 Write of size 4 at addr ffff88810e625328 by task a.out/4404
CPU: 19 PID: 4404 Comm: a.out Not tainted 6.2.0-rc6 #38 Hardware name: Amazon EC2 c5a.8xlarge/, BIOS 1.0 10/16/2017 Call Trace: <TASK> dump_stack_lvl+0x73/0xa0 print_report+0x16c/0x4e0 kasan_report+0xc3/0xf0 kasan_check_range+0x2d2/0x310 _raw_spin_lock_irqsave+0x60/0xc0 remove_wait_queue+0x1a/0xa0 ep_free+0x12c/0x170 ep_eventpoll_release+0x26/0x30 __fput+0x202/0x400 task_work_run+0x11d/0x170 do_exit+0x495/0x1130 do_group_exit+0x100/0x100 get_signal+0xd67/0xde0 arch_do_signal_or_restart+0x2a/0x2b0 exit_to_user_mode_prepare+0x94/0x100 syscall_exit_to_user_mode+0x20/0x40 do_syscall_64+0x52/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK>
Allocated by task 4404:
kasan_set_track+0x3d/0x60 __kasan_kmalloc+0x85/0x90 psi_trigger_create+0x113/0x3e0 pressure_write+0x146/0x2e0 cgroup_file_write+0x11c/0x250 kernfs_fop_write_iter+0x186/0x220 vfs_write+0x3d8/0x5c0 ksys_write+0x90/0x110 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 4407:
kasan_set_track+0x3d/0x60 kasan_save_free_info+0x27/0x40 ____kasan_slab_free+0x11d/0x170 slab_free_freelist_hook+0x87/0x150 __kmem_cache_free+0xcb/0x180 psi_trigger_destroy+0x2e8/0x310 cgroup_file_release+0x4f/0xb0 kernfs_drain_open_files+0x165/0x1f0 kernfs_drain+0x162/0x1a0 __kernfs_remove+0x1fb/0x310 kernfs_remove_by_name_ns+0x95/0xe0 cgroup_addrm_files+0x67f/0x700 cgroup_destroy_locked+0x283/0x3c0 cgroup_rmdir+0x29/0x100 kernfs_iop_rmdir+0xd1/0x140 vfs_rmdir+0xfe/0x240 do_rmdir+0x13d/0x280 __x64_sys_rmdir+0x2c/0x30 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 0e94682b73bf ("psi: introduce psi monitor") Signed-off-by: Munehisa Kamata kamatam@amazon.com Signed-off-by: Mengchi Cheng mengcc@amazon.com Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: Suren Baghdasaryan surenb@google.com Acked-by: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/20230106224859.4123476-1-kamatam@amazon.com/ Link: https://lore.kernel.org/r/20230214212705.4058045-1-kamatam@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/psi.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -1278,10 +1278,11 @@ void psi_trigger_destroy(struct psi_trig
group = t->group; /* - * Wakeup waiters to stop polling. Can happen if cgroup is deleted - * from under a polling process. + * Wakeup waiters to stop polling and clear the queue to prevent it from + * being accessed later. Can happen if cgroup is deleted from under a + * polling process. */ - wake_up_interruptible(&t->event_wait); + wake_up_pollfree(&t->event_wait);
mutex_lock(&group->trigger_lock);
From: Mike Kravetz mike.kravetz@oracle.com
commit ec4288fe63966b26d53907212ecd05dfa81dd2cc upstream.
Users can specify the hugetlb page size in the mmap, shmget and memfd_create system calls. This is done by using 6 bits within the flags argument to encode the base-2 logarithm of the desired page size. The routine hstate_sizelog() uses the log2 value to find the corresponding hugetlb hstate structure. Converting the log2 value (page_size_log) to potential hugetlb page size is the simple statement:
1UL << page_size_log
Because only 6 bits are used for page_size_log, the left shift can not be greater than 63. This is fine on 64 bit architectures where a long is 64 bits. However, if a value greater than 31 is passed on a 32 bit architecture (where long is 32 bits) the shift will result in undefined behavior. This was generally not an issue as the result of the undefined shift had to exactly match hugetlb page size to proceed.
Recent improvements in runtime checking have resulted in this undefined behavior throwing errors such as reported below.
Fix by comparing page_size_log to BITS_PER_LONG before doing shift.
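As a rough user-space sketch of that guard (assuming only a 6-bit page_size_log and 32-bit longs on ILP32; this is not the kernel's hstate_sizelog() itself):

#include <stdio.h>

#define BITS_PER_LONG	(8 * (int)sizeof(unsigned long))

/* Mirrors the added bounds check: reject any log2 value that would shift
 * 1UL by BITS_PER_LONG or more instead of invoking undefined behavior. */
static unsigned long log_to_size(unsigned int page_size_log)
{
	if (page_size_log >= BITS_PER_LONG)
		return 0;
	return 1UL << page_size_log;
}

int main(void)
{
	printf("%lu\n", log_to_size(21));	/* 2 MiB huge pages */
	printf("%lu\n", log_to_size(34));	/* rejected when long is 32 bits */
	return 0;
}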
Link: https://lkml.kernel.org/r/20230216013542.138708-1-mike.kravetz@oracle.com Link: https://lore.kernel.org/lkml/CA+G9fYuei_Tr-vN9GS7SfFyU1y9hNysnf=PB7kT0=yv4Mi... Fixes: 42d7395feb56 ("mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB") Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Reviewed-by: Jesper Juhl jesperjuhl76@gmail.com Acked-by: Muchun Song songmuchun@bytedance.com Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Cc: Anders Roxell anders.roxell@linaro.org Cc: Andi Kleen ak@linux.intel.com Cc: Sasha Levin sashal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/hugetlb.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -753,7 +753,10 @@ static inline struct hstate *hstate_size if (!page_size_log) return &default_hstate;
- return size_to_hstate(1UL << page_size_log); + if (page_size_log < BITS_PER_LONG) + return size_to_hstate(1UL << page_size_log); + + return NULL; }
static inline struct hstate *hstate_vma(struct vm_area_struct *vma)
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 99b9402a36f0799f25feee4465bfa4b8dfa74b4d upstream.
Macro NILFS_SB2_OFFSET_BYTES, which computes the position of the second superblock, underflows when the argument device size is less than 4096 bytes. Therefore, when using this macro, it is necessary to check in advance that the device size is not less than a lower limit, or at least that underflow does not occur.
The current nilfs2 implementation lacks this check, causing out-of-bound block access when mounting devices smaller than 4096 bytes:
I/O error, dev loop0, sector 36028797018963960 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2 NILFS (loop0): unable to read secondary superblock (blocksize = 1024)
In addition, when trying to resize the filesystem to a size below 4096 bytes, this underflow occurs in nilfs_resize_fs(), passing a huge number of segments to nilfs_sufile_resize(), corrupting parameters such as the number of segments in superblocks. This causes excessive loop iterations in nilfs_sufile_resize() during a subsequent resize ioctl, causing semaphore ns_segctor_sem to block for a long time and hang the writer thread:
INFO: task segctord:5067 blocked for more than 143 seconds. Not tainted 6.2.0-rc8-syzkaller-00015-gf6feea56f66d #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:segctord state:D stack:23456 pid:5067 ppid:2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:5293 [inline] __schedule+0x1409/0x43f0 kernel/sched/core.c:6606 schedule+0xc3/0x190 kernel/sched/core.c:6682 rwsem_down_write_slowpath+0xfcf/0x14a0 kernel/locking/rwsem.c:1190 nilfs_transaction_lock+0x25c/0x4f0 fs/nilfs2/segment.c:357 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2486 [inline] nilfs_segctor_thread+0x52f/0x1140 fs/nilfs2/segment.c:2570 kthread+0x270/0x300 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> ... Call Trace: <TASK> folio_mark_accessed+0x51c/0xf00 mm/swap.c:515 __nilfs_get_page_block fs/nilfs2/page.c:42 [inline] nilfs_grab_buffer+0x3d3/0x540 fs/nilfs2/page.c:61 nilfs_mdt_submit_block+0xd7/0x8f0 fs/nilfs2/mdt.c:121 nilfs_mdt_read_block+0xeb/0x430 fs/nilfs2/mdt.c:176 nilfs_mdt_get_block+0x12d/0xbb0 fs/nilfs2/mdt.c:251 nilfs_sufile_get_segment_usage_block fs/nilfs2/sufile.c:92 [inline] nilfs_sufile_truncate_range fs/nilfs2/sufile.c:679 [inline] nilfs_sufile_resize+0x7a3/0x12b0 fs/nilfs2/sufile.c:777 nilfs_resize_fs+0x20c/0xed0 fs/nilfs2/super.c:422 nilfs_ioctl_resize fs/nilfs2/ioctl.c:1033 [inline] nilfs_ioctl+0x137c/0x2440 fs/nilfs2/ioctl.c:1301 ...
This fixes these issues by inserting appropriate minimum device size checks or anti-underflow checks, depending on where the macro is used.
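For illustration, a stand-alone sketch of the underflow; the macro body below is an assumption written to mirror the mainline definition, and the point is only that (devsize >> 12) is 0 for devices smaller than 4096 bytes, so the subtraction wraps around to a huge offset:

#include <stdint.h>
#include <stdio.h>

#define NILFS_SB2_OFFSET_BYTES(devsize)	((((devsize) >> 12) - 1) << 12)

int main(void)
{
	uint64_t devsize = 1024;	/* device smaller than 4096 bytes */

	/* Prints 18446744073709547520 (0xfffffffffffff000), i.e. sector
	 * 36028797018963960 with 512-byte sectors, matching the log above. */
	printf("%llu\n", (unsigned long long)NILFS_SB2_OFFSET_BYTES(devsize));
	return 0;
}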
Link: https://lkml.kernel.org/r/0000000000004e1dfa05f4a48e6b@google.com Link: https://lkml.kernel.org/r/20230214224043.24141-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+f0c4082ce5ebebdac63b@syzkaller.appspotmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/ioctl.c | 7 +++++++ fs/nilfs2/super.c | 9 +++++++++ fs/nilfs2/the_nilfs.c | 8 +++++++- 3 files changed, 23 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/ioctl.c +++ b/fs/nilfs2/ioctl.c @@ -1114,7 +1114,14 @@ static int nilfs_ioctl_set_alloc_range(s
minseg = range[0] + segbytes - 1; do_div(minseg, segbytes); + + if (range[1] < 4096) + goto out; + maxseg = NILFS_SB2_OFFSET_BYTES(range[1]); + if (maxseg < segbytes) + goto out; + do_div(maxseg, segbytes); maxseg--;
--- a/fs/nilfs2/super.c +++ b/fs/nilfs2/super.c @@ -409,6 +409,15 @@ int nilfs_resize_fs(struct super_block * goto out;
/* + * Prevent underflow in second superblock position calculation. + * The exact minimum size check is done in nilfs_sufile_resize(). + */ + if (newsize < 4096) { + ret = -ENOSPC; + goto out; + } + + /* * Write lock is required to protect some functions depending * on the number of segments, the number of reserved segments, * and so forth. --- a/fs/nilfs2/the_nilfs.c +++ b/fs/nilfs2/the_nilfs.c @@ -544,9 +544,15 @@ static int nilfs_load_super_block(struct { struct nilfs_super_block **sbp = nilfs->ns_sbp; struct buffer_head **sbh = nilfs->ns_sbh; - u64 sb2off = NILFS_SB2_OFFSET_BYTES(bdev_nr_bytes(nilfs->ns_bdev)); + u64 sb2off, devsize = bdev_nr_bytes(nilfs->ns_bdev); int valid[2], swp = 0;
+ if (devsize < NILFS_SEG_MIN_BLOCKS * NILFS_MIN_BLOCK_SIZE + 4096) { + nilfs_err(sb, "device size too small"); + return -EINVAL; + } + sb2off = NILFS_SB2_OFFSET_BYTES(devsize); + sbp[0] = nilfs_read_super_block(sb, NILFS_SB_OFFSET_BYTES, blocksize, &sbh[0]); sbp[1] = nilfs_read_super_block(sb, sb2off, blocksize, &sbh[1]);
From: Zach O'Keefe zokeefe@google.com
commit ae63c898f4004bbc7d212f4adcb3bb14852c30d6 upstream.
During collapse, in a few places we check whether a given small page has any unaccounted references. If the refcount on the page doesn't match our expectations, there must be an unknown user concurrently interested in the page, so it's not safe to move its contents elsewhere. However, the unaccounted pins are likely an ephemeral state.
In this situation, MADV_COLLAPSE returns -EINVAL when it should return -EAGAIN. This could cause userspace to conclude that the syscall failed, when it in fact could succeed by retrying.
Link: https://lkml.kernel.org/r/20230125015738.912924-1-zokeefe@google.com Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") Signed-off-by: Zach O'Keefe zokeefe@google.com Reported-by: Hugh Dickins hughd@google.com Acked-by: Hugh Dickins hughd@google.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/khugepaged.c | 1 + 1 file changed, 1 insertion(+)
--- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2608,6 +2608,7 @@ static int madvise_collapse_errno(enum s case SCAN_CGROUP_CHARGE_FAIL: return -EBUSY; /* Resource temporary unavailable - trying again might succeed */ + case SCAN_PAGE_COUNT: case SCAN_PAGE_LOCK: case SCAN_PAGE_LRU: case SCAN_DEL_PAGE_LRU:
From: Qian Yingjin qian@ddn.com
commit 5956592ce337330cdff0399a6f8b6a5aea397a8e upstream.
I was running traces of the read code against a RAID storage system to understand why read requests were being misaligned against the underlying RAID strips. I found that the page end offset calculation in filemap_get_read_batch() was off by one.
When a read is submitted with end offset 1048575, the end page for the read is calculated as 256 when it should be 255. "last_index" is the index of the page beyond the end of the read, and it should be skipped when getting a batch of pages for the read in filemap_get_read_batch().
The below simple patch fixes the problem. This code was introduced in kernel 5.12.
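A worked example of the arithmetic, assuming 4 KiB pages (plain user-space code, not the kernel implementation):

#include <stdio.h>

#define PAGE_SIZE		4096UL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	unsigned long ki_pos = 0, count = 1048576;	/* read covers bytes 0..1048575 */
	unsigned long last_index = DIV_ROUND_UP(ki_pos + count, PAGE_SIZE);

	/* last_index is the first page beyond the read, so the batch lookup
	 * must stop at last_index - 1. */
	printf("last_index = %lu, last page to fetch = %lu\n",
	       last_index, last_index - 1);	/* prints 256 and 255 */
	return 0;
}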
Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Signed-off-by: Qian Yingjin qian@ddn.com Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/filemap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/filemap.c +++ b/mm/filemap.c @@ -2569,18 +2569,19 @@ static int filemap_get_pages(struct kioc struct folio *folio; int err = 0;
+ /* "last_index" is the index of the page beyond the end of the read */ last_index = DIV_ROUND_UP(iocb->ki_pos + iter->count, PAGE_SIZE); retry: if (fatal_signal_pending(current)) return -EINTR;
- filemap_get_read_batch(mapping, index, last_index, fbatch); + filemap_get_read_batch(mapping, index, last_index - 1, fbatch); if (!folio_batch_count(fbatch)) { if (iocb->ki_flags & IOCB_NOIO) return -EAGAIN; page_cache_sync_readahead(mapping, ra, filp, index, last_index - index); - filemap_get_read_batch(mapping, index, last_index, fbatch); + filemap_get_read_batch(mapping, index, last_index - 1, fbatch); } if (!folio_batch_count(fbatch)) { if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
From: Peter Xu peterx@redhat.com
commit 96a9c287e25d690fd9623b5133703b8e310fbed1 upstream.
Nick Bowler reported another sparc64 breakage after the young/dirty persistent work for page migration (per "Link:" below). That's after a similar report [2].
It turns out page migration was overlooked, and it wasn't failing before because page migration was not enabled in the initial report test environment.
David proposed another way [2] to fix this from sparc64 side, but that patch didn't land somehow. Neither did I check whether there's any other arch that has similar issues.
Let's fix it for now simply by moving the write bit handling to after the dirty bit handling, like what we did before.
Note: this is based on mm-unstable, because the breakage has been present since 6.1 and we're at a very late stage of 6.2 (-rc8), so I assume that for this specific case we should target 6.3.
[1] https://lore.kernel.org/all/20221021160603.GA23307@u164.east.ru/ [2] https://lore.kernel.org/all/20221212130213.136267-1-david@redhat.com/
Link: https://lkml.kernel.org/r/20230216153059.256739-1-peterx@redhat.com Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations") Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u3... Signed-off-by: Peter Xu peterx@redhat.com Reported-by: Nick Bowler nbowler@draconx.ca Acked-by: David Hildenbrand david@redhat.com Tested-by: Nick Bowler nbowler@draconx.ca Cc: regressions@lists.linux.dev Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 6 ++++-- mm/migrate.c | 2 ++ 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3253,8 +3253,6 @@ void remove_migration_pmd(struct page_vm pmde = mk_huge_pmd(new, READ_ONCE(vma->vm_page_prot)); if (pmd_swp_soft_dirty(*pvmw->pmd)) pmde = pmd_mksoft_dirty(pmde); - if (is_writable_migration_entry(entry)) - pmde = maybe_pmd_mkwrite(pmde, vma); if (pmd_swp_uffd_wp(*pvmw->pmd)) pmde = pmd_wrprotect(pmd_mkuffd_wp(pmde)); if (!is_migration_entry_young(entry)) @@ -3262,6 +3260,10 @@ void remove_migration_pmd(struct page_vm /* NOTE: this may contain setting soft-dirty on some archs */ if (PageDirty(new) && is_migration_entry_dirty(entry)) pmde = pmd_mkdirty(pmde); + if (is_writable_migration_entry(entry)) + pmde = maybe_pmd_mkwrite(pmde, vma); + else + pmde = pmd_wrprotect(pmde);
if (PageAnon(new)) { rmap_t rmap_flags = RMAP_COMPOUND; --- a/mm/migrate.c +++ b/mm/migrate.c @@ -215,6 +215,8 @@ static bool remove_migration_pte(struct pte = maybe_mkwrite(pte, vma); else if (pte_swp_uffd_wp(*pvmw.pte)) pte = pte_mkuffd_wp(pte); + else + pte = pte_wrprotect(pte);
if (folio_test_anon(folio) && !is_readable_migration_entry(entry)) rmap_flags |= RMAP_EXCLUSIVE;
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
commit 79eeab1d85e0fee4c0bc36f3b6ddf3920f39f74b upstream.
Fix an inverted logic bug in gpio_sim_remove_hogs() that leads to GPIO hog structures never being freed.
Fixes: cb8c474e79be ("gpio: sim: new testing module") Reported-by: Mirsad Goran Todorovac mirsad.todorovac@alu.unizg.hr Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-sim.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpio/gpio-sim.c +++ b/drivers/gpio/gpio-sim.c @@ -732,7 +732,7 @@ static void gpio_sim_remove_hogs(struct
gpiod_remove_hogs(dev->hogs);
- for (hog = dev->hogs; !hog->chip_label; hog++) { + for (hog = dev->hogs; hog->chip_label; hog++) { kfree(hog->chip_label); kfree(hog->line_name); }
From: Peter Zijlstra peterz@infradead.org
commit eedeb787ebb53de5c5dcf7b7b39d01bf1b0f037d upstream.
Tetsuo-San noted that commit f5d39b020809 ("freezer,sched: Rewrite core freezer logic") broke call_usermodehelper_exec() for the KILLABLE case.
Specifically it was missed that the second, unconditional, wait_for_completion() was not optional and ensures the on-stack completion is unused before going out-of-scope.
Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic") Reported-by: syzbot+6cd18e123583550cf469@syzkaller.appspotmail.com Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Debugged-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@hirez.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/umh.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/kernel/umh.c b/kernel/umh.c index 850631518665..fbf872c624cb 100644 --- a/kernel/umh.c +++ b/kernel/umh.c @@ -438,21 +438,27 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait) if (wait == UMH_NO_WAIT) /* task has freed sub_info */ goto unlock;
- if (wait & UMH_KILLABLE) - state |= TASK_KILLABLE; - if (wait & UMH_FREEZABLE) state |= TASK_FREEZABLE;
- retval = wait_for_completion_state(&done, state); - if (!retval) - goto wait_done; - if (wait & UMH_KILLABLE) { + retval = wait_for_completion_state(&done, state | TASK_KILLABLE); + if (!retval) + goto wait_done; + /* umh_complete() will see NULL and free sub_info */ if (xchg(&sub_info->complete, NULL)) goto unlock; + + /* + * fallthrough; in case of -ERESTARTSYS now do uninterruptible + * wait_for_completion_state(). Since umh_complete() shall call + * complete() in a moment if xchg() above returned NULL, this + * uninterruptible wait_for_completion_state() will not block + * SIGKILL'ed processes for long. + */ } + wait_for_completion_state(&done, state);
wait_done: retval = sub_info->retval;
From: Geert Uytterhoeven geert@linux-m68k.org
commit 9c7417b5ec440242bb5b64521acd53d4e19130c1 upstream.
If CONFIG_ELF_CORE is not set:
fs/coredump.c:835:12: error: ‘dump_emit_page’ defined but not used [-Werror=unused-function] 835 | static int dump_emit_page(struct coredump_params *cprm, struct page *page) | ^~~~~~~~~~~~~~
Fix this by moving dump_emit_page() inside the existing section protected by #ifdef CONFIG_ELF_CORE.
Fixes: 06bbaa6dc53cb720 ("[coredump] don't use __kernel_write() on kmap_local_page()") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/coredump.c | 48 ++++++++++++++++++++++++------------------------ 1 file changed, 24 insertions(+), 24 deletions(-)
--- a/fs/coredump.c +++ b/fs/coredump.c @@ -831,6 +831,30 @@ static int __dump_skip(struct coredump_p } }
+int dump_emit(struct coredump_params *cprm, const void *addr, int nr) +{ + if (cprm->to_skip) { + if (!__dump_skip(cprm, cprm->to_skip)) + return 0; + cprm->to_skip = 0; + } + return __dump_emit(cprm, addr, nr); +} +EXPORT_SYMBOL(dump_emit); + +void dump_skip_to(struct coredump_params *cprm, unsigned long pos) +{ + cprm->to_skip = pos - cprm->pos; +} +EXPORT_SYMBOL(dump_skip_to); + +void dump_skip(struct coredump_params *cprm, size_t nr) +{ + cprm->to_skip += nr; +} +EXPORT_SYMBOL(dump_skip); + +#ifdef CONFIG_ELF_CORE static int dump_emit_page(struct coredump_params *cprm, struct page *page) { struct bio_vec bvec = { @@ -864,30 +888,6 @@ static int dump_emit_page(struct coredum return 1; }
-int dump_emit(struct coredump_params *cprm, const void *addr, int nr) -{ - if (cprm->to_skip) { - if (!__dump_skip(cprm, cprm->to_skip)) - return 0; - cprm->to_skip = 0; - } - return __dump_emit(cprm, addr, nr); -} -EXPORT_SYMBOL(dump_emit); - -void dump_skip_to(struct coredump_params *cprm, unsigned long pos) -{ - cprm->to_skip = pos - cprm->pos; -} -EXPORT_SYMBOL(dump_skip_to); - -void dump_skip(struct coredump_params *cprm, size_t nr) -{ - cprm->to_skip += nr; -} -EXPORT_SYMBOL(dump_skip); - -#ifdef CONFIG_ELF_CORE int dump_user_range(struct coredump_params *cprm, unsigned long start, unsigned long len) {
From: Aaron Thompson dev@aaront.org
commit 647037adcad00f2bab8828d3d41cd0553d41f3bd upstream.
This reverts commit 115d9d77bb0f9152c60b6e8646369fa7f6167593.
The pages being freed by memblock_free_late() have already been initialized, but if they are in the deferred init range, __free_one_page() might access nearby uninitialized pages when trying to coalesce buddies. This can, for example, trigger this BUG:
BUG: unable to handle page fault for address: ffffe964c02580c8 RIP: 0010:__list_del_entry_valid+0x3f/0x70 <TASK> __free_one_page+0x139/0x410 __free_pages_ok+0x21d/0x450 memblock_free_late+0x8c/0xb9 efi_free_boot_services+0x16b/0x25c efi_enter_virtual_mode+0x403/0x446 start_kernel+0x678/0x714 secondary_startup_64_no_verify+0xd2/0xdb </TASK>
A proper fix will be more involved so revert this change for the time being.
Fixes: 115d9d77bb0f ("mm: Always release pages to the buddy allocator in memblock_free_late().") Signed-off-by: Aaron Thompson dev@aaront.org Link: https://lore.kernel.org/r/20230207082151.1303-1-dev@aaront.org Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memblock.c | 8 +------- tools/testing/memblock/internal.h | 4 ---- 2 files changed, 1 insertion(+), 11 deletions(-)
--- a/mm/memblock.c +++ b/mm/memblock.c @@ -1640,13 +1640,7 @@ void __init memblock_free_late(phys_addr end = PFN_DOWN(base + size);
for (; cursor < end; cursor++) { - /* - * Reserved pages are always initialized by the end of - * memblock_free_all() (by memmap_init() and, if deferred - * initialization is enabled, memmap_init_reserved_pages()), so - * these pages can be released directly to the buddy allocator. - */ - __free_pages_core(pfn_to_page(cursor), 0); + memblock_free_pages(pfn_to_page(cursor), cursor, 0); totalram_pages_inc(); } } --- a/tools/testing/memblock/internal.h +++ b/tools/testing/memblock/internal.h @@ -15,10 +15,6 @@ bool mirrored_kernelcore = false;
struct page {};
-void __free_pages_core(struct page *page, unsigned int order) -{ -} - void memblock_free_pages(struct page *page, unsigned long pfn, unsigned int order) {
From: Felix Riemann felix.riemann@sma.de
commit 9b55d3f0a69af649c62cbc2633e6d695bb3cc583 upstream.
When converting net_device_stats to rtnl_link_stats64, sign extension is triggered on ILP32 machines, as commit 6c1c5097781f changed the previous "ulong -> u64" conversion to "long -> u64" by accessing the net_device_stats fields through a (signed) atomic_long_t.
This causes, for example, the received bytes counter to jump to 16EiB after having received 2^31 bytes. Casting the atomic value to "unsigned long" before converting it into u64 avoids this.
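A self-contained illustration of the sign extension, using int32_t to stand in for a 32-bit (ILP32) signed long; this is not the net/core/dev.c code itself:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int32_t counter = (int32_t)0x80000000u;		/* 2^31 bytes received */

	uint64_t wrong = (uint64_t)counter;		/* sign-extends to ~16 EiB */
	uint64_t right = (uint64_t)(uint32_t)counter;	/* cast to unsigned first */

	printf("wrong = %" PRIu64 "\n", wrong);		/* 18446744071562067968 */
	printf("right = %" PRIu64 "\n", right);		/* 2147483648 */
	return 0;
}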
Fixes: 6c1c5097781f ("net: add atomic_long_t to net_device_stats fields") Signed-off-by: Felix Riemann felix.riemann@sma.de Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -10385,7 +10385,7 @@ void netdev_stats_to_stats64(struct rtnl
BUILD_BUG_ON(n > sizeof(*stats64) / sizeof(u64)); for (i = 0; i < n; i++) - dst[i] = atomic_long_read(&src[i]); + dst[i] = (unsigned long)atomic_long_read(&src[i]); /* zero out counters that only exist in rtnl_link_stats64 */ memset((char *)stats64 + n * sizeof(u64), 0, sizeof(*stats64) - n * sizeof(u64));
From: Andrew Morton akpm@linux-foundation.org
commit a5b21d8d791cd4db609d0bbcaa9e0c7e019888d1 upstream.
This fix was nacked by Philip, for reasons identified in the email linked below.
Link: https://lkml.kernel.org/r/68f15d67-8945-2728-1f17-5b53a80ec52d@squashfs.org.... Fixes: 72e544b1b28325 ("squashfs: harden sanity check in squashfs_read_xattr_id_table") Cc: Alexey Khoroshilov khoroshilov@ispras.ru Cc: Fedor Pchelkin pchelkin@ispras.ru Cc: Phillip Lougher phillip@squashfs.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/xattr_id.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/squashfs/xattr_id.c +++ b/fs/squashfs/xattr_id.c @@ -76,7 +76,7 @@ __le64 *squashfs_read_xattr_id_table(str /* Sanity check values */
/* there is always at least one xattr id */ - if (*xattr_ids <= 0) + if (*xattr_ids == 0) return ERR_PTR(-EINVAL);
len = SQUASHFS_XATTR_BLOCK_BYTES(*xattr_ids);
From: Dom Cobley popcornmix@gmail.com
commit 247a631f9c0ffb37ed0786a94cb4c5f2b6fc7ab1 upstream.
The formula that determines the core clock requirement based on pixel clock and blanking has been determined experimentally to minimise the clock while supporting all modes we've seen.
A new reduced blanking mode (4kp60 at 533MHz rather than the standard 594MHz) has been seen that doesn't produce a high enough clock and results in "flip_done timed out" error.
Increase the setup cost in the formula to make this work. The result is that the core clock for a reduced blanking mode increases by up to 7MHz, while standard timing modes are left untouched.
Link: https://github.com/raspberrypi/linux/issues/4446 Fixes: 16e101051f32 ("drm/vc4: Increase the core clock based on HVS load") Signed-off-by: Dom Cobley popcornmix@gmail.com Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20230127145558.446123-1-maxime... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/vc4/vc4_crtc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/vc4/vc4_crtc.c +++ b/drivers/gpu/drm/vc4/vc4_crtc.c @@ -711,7 +711,7 @@ static int vc4_crtc_atomic_check(struct struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
if (vc4_encoder->type == VC4_ENCODER_TYPE_HDMI0) { - vc4_state->hvs_load = max(mode->clock * mode->hdisplay / mode->htotal + 1000, + vc4_state->hvs_load = max(mode->clock * mode->hdisplay / mode->htotal + 8000, mode->clock * 9 / 10) * 1000; } else { vc4_state->hvs_load = mode->clock * 1000;
From: Dave Stevenson dave.stevenson@raspberrypi.com
commit 6b77b16de75a6efc0870b1fa467209387cbee8f3 upstream.
YUV images can either be presented as one allocation with offsets for the different planes, or multiple allocations with 0 offsets.
The driver only ever calls drm_fb_[dma|cma]_get_gem_obj() with plane index 0, so any framebuffer using the second approach was rendered incorrectly.
Correctly determine the address for each plane, removing the assumption that the base address is the same for each.
Fixes: fc04023fafec ("drm/vc4: Add support for YUV planes.") Signed-off-by: Dave Stevenson dave.stevenson@raspberrypi.com Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20230127155708.454704-1-maxime... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/vc4/vc4_plane.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/vc4/vc4_plane.c +++ b/drivers/gpu/drm/vc4/vc4_plane.c @@ -340,7 +340,7 @@ static int vc4_plane_setup_clipping_and_ { struct vc4_plane_state *vc4_state = to_vc4_plane_state(state); struct drm_framebuffer *fb = state->fb; - struct drm_gem_dma_object *bo = drm_fb_dma_get_gem_obj(fb, 0); + struct drm_gem_dma_object *bo; int num_planes = fb->format->num_planes; struct drm_crtc_state *crtc_state; u32 h_subsample = fb->format->hsub; @@ -359,8 +359,10 @@ static int vc4_plane_setup_clipping_and_ if (ret) return ret;
- for (i = 0; i < num_planes; i++) + for (i = 0; i < num_planes; i++) { + bo = drm_fb_dma_get_gem_obj(fb, i); vc4_state->offsets[i] = bo->dma_addr + fb->offsets[i]; + }
/* * We don't support subpixel source positioning for scaling,
From: Matt Roper matthew.d.roper@intel.com
commit d5a1224aa68c8b124a4c5c390186e571815ed390 upstream.
The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround has 'BUS' style reset, indicating that it does not lose its value on engine resets. Furthermore, this register is part of the GT forcewake domain rather than the RENDER domain, so it should not be impacted by RCS engine resets. As such, we should implement this on the GT workaround list rather than an engine list.
Bspec: 19219 Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()") Signed-off-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Gustavo Sousa gustavo.sousa@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230201222831.608281-2-matthe... (cherry picked from commit 5f21dc07b52eb54a908e66f5d6e05a87bcb5b049) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1249,6 +1249,13 @@ icl_gt_workarounds_init(struct intel_gt GAMT_CHKN_BIT_REG, GAMT_CHKN_DISABLE_L3_COH_PIPE);
+ /* + * Wa_1408615072:icl,ehl (vsunit) + * Wa_1407596294:icl,ehl (hsunit) + */ + wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE, + VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS); + /* Wa_1407352427:icl,ehl */ wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2, PSDUNIT_CLKGATE_DIS); @@ -2369,13 +2376,6 @@ rcs_engine_wa_init(struct intel_engine_c GEN11_ENABLE_32_PLANE_MODE);
/* - * Wa_1408615072:icl,ehl (vsunit) - * Wa_1407596294:icl,ehl (hsunit) - */ - wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE, - VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS); - - /* * Wa_1408767742:icl[a2..forever],ehl[all] * Wa_1605460711:icl[a0..c0] */
From: Jesse Brandeburg jesse.brandeburg@intel.com
commit 43fbca02c2ddc39ff5879b6f3a4a097b1ba02098 upstream.
There was a problem reported to us where the addition of a VF with an IPv6 address ending with a particular sequence would cause the parent device on the PF to no longer be able to respond to neighbor discovery packets.
In this case, we had an ovs-bridge device living on top of a VLAN, which was on top of a PF, and it would not be able to talk anymore (the neighbor entry would expire and couldn't be restored).
The root cause of the issue is that if the PF is asked to be in IFF_PROMISC mode (promiscuous mode) and it had an ipv6 address that needed the 33:33:ff:00:00:04 multicast address to work, then when the VF was added with the need for the same multicast address, the VF would steal all the traffic destined for that address.
The ice driver didn't automatically subscribe a request for IFF_PROMISC to "multicast replication from other ports' traffic", meaning the PF won't receive, for instance, packets whose exact destination is in the VF, as above.
The VF's IPv6 address, which adds a "perfect filter" for 33:33:ff:00:00:04, results in no packets for that multicast address making it to the PF (which is in promisc but NOT "multicast replication").
The fix is to enable "multicast promiscuous" whenever the driver is asked to enable IFF_PROMISC, and make sure to disable it when appropriate.
Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_main.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
--- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -270,6 +270,8 @@ static int ice_set_promisc(struct ice_vs if (status && status != -EEXIST) return status;
+ netdev_dbg(vsi->netdev, "set promisc filter bits for VSI %i: 0x%x\n", + vsi->vsi_num, promisc_m); return 0; }
@@ -295,6 +297,8 @@ static int ice_clear_promisc(struct ice_ promisc_m, 0); }
+ netdev_dbg(vsi->netdev, "clear promisc filter bits for VSI %i: 0x%x\n", + vsi->vsi_num, promisc_m); return status; }
@@ -423,6 +427,16 @@ static int ice_vsi_sync_fltr(struct ice_ } err = 0; vlan_ops->dis_rx_filtering(vsi); + + /* promiscuous mode implies allmulticast so + * that VSIs that are in promiscuous mode are + * subscribed to multicast packets coming to + * the port + */ + err = ice_set_promisc(vsi, + ICE_MCAST_PROMISC_BITS); + if (err) + goto out_promisc; } } else { /* Clear Rx filter to remove traffic from wire */ @@ -439,6 +453,18 @@ static int ice_vsi_sync_fltr(struct ice_ NETIF_F_HW_VLAN_CTAG_FILTER) vlan_ops->ena_rx_filtering(vsi); } + + /* disable allmulti here, but only if allmulti is not + * still enabled for the netdev + */ + if (!(vsi->current_netdev_flags & IFF_ALLMULTI)) { + err = ice_clear_promisc(vsi, + ICE_MCAST_PROMISC_BITS); + if (err) { + netdev_err(netdev, "Error %d clearing multicast promiscuous on VSI %i\n", + err, vsi->vsi_num); + } + } } } goto exit;
From: Jason Xing kernelxing@tencent.com
commit f9cd6a4418bac6a046ee78382423b1ae7565fb24 upstream.
Recently I encountered a case where the MTU size could not be increased directly from 1500 to a much bigger value with XDP enabled if the server was equipped with an IXGBE card; this happened on thousands of servers in a production environment. After applying the current patch, we can set the maximum MTU size to 3K.
This patch follows the behavior of changing MTU as i40e/ice does.
[1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP") [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
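A back-of-the-envelope check of the new limit, assuming the usual header sizes (ETH_HLEN = 14, ETH_FCS_LEN = 4, VLAN_HLEN = 4) and a 3K Rx buffer of 3072 bytes; the exact driver constants may differ:

#include <stdio.h>

#define ETH_HLEN	14
#define ETH_FCS_LEN	4
#define VLAN_HLEN	4
#define RXBUFFER_3K	3072

int main(void)
{
	int new_mtu = 3050;	/* candidate MTU with XDP enabled */
	int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;

	printf("frame %d fits in the 3K buffer: %s\n", new_frame_size,
	       new_frame_size > RXBUFFER_3K ? "no" : "yes");	/* 3072: yes */
	return 0;
}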
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6778,6 +6778,18 @@ static void ixgbe_free_all_rx_resources( }
/** + * ixgbe_max_xdp_frame_size - returns the maximum allowed frame size for XDP + * @adapter: device handle, pointer to adapter + */ +static int ixgbe_max_xdp_frame_size(struct ixgbe_adapter *adapter) +{ + if (PAGE_SIZE >= 8192 || adapter->flags2 & IXGBE_FLAG2_RX_LEGACY) + return IXGBE_RXBUFFER_2K; + else + return IXGBE_RXBUFFER_3K; +} + +/** * ixgbe_change_mtu - Change the Maximum Transfer Unit * @netdev: network interface device structure * @new_mtu: new value for maximum frame size @@ -6788,18 +6800,13 @@ static int ixgbe_change_mtu(struct net_d { struct ixgbe_adapter *adapter = netdev_priv(netdev);
- if (adapter->xdp_prog) { + if (ixgbe_enabled_xdp_adapter(adapter)) { int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN; - int i; - - for (i = 0; i < adapter->num_rx_queues; i++) { - struct ixgbe_ring *ring = adapter->rx_ring[i];
- if (new_frame_size > ixgbe_rx_bufsz(ring)) { - e_warn(probe, "Requested MTU size is not supported with XDP\n"); - return -EINVAL; - } + if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) { + e_warn(probe, "Requested MTU size is not supported with XDP\n"); + return -EINVAL; } }
From: Jason Xing kernelxing@tencent.com
commit ce45ffb815e8e238f05de1630be3969b6bb15e4e upstream.
Take the second VLAN HLEN into account when computing the maximum MTU size, as other drivers do.
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -2921,7 +2921,7 @@ static int i40e_change_mtu(struct net_de struct i40e_pf *pf = vsi->back;
if (i40e_enabled_xdp_vsi(vsi)) { - int frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN; + int frame_size = new_mtu + I40E_PACKET_HDR_PAD;
if (frame_size > i40e_max_xdp_frame_size(vsi)) return -EINVAL;
From: Rafał Miłecki rafal@milecki.pl
commit d61615c366a489646a1bfe5b33455f916762d5f4 upstream.
Code blocks handling BCMA_CHIP_ID_BCM5357 and BCMA_CHIP_ID_BCM53572 were incorrectly unified. Chip package values are not unique and cannot be checked independently of the chip ID; they are meaningful only in the context of a given chip.
The BCM5358 and BCM47188 packages share the same value but belong to different chips. The code unification resulted in treating BCM5358 as BCM47188 and broke its initialization.
Link: https://github.com/openwrt/openwrt/issues/8278 Fixes: cb1b0f90acfe ("net: ethernet: bgmac: unify code of the same family") Cc: Jon Mason jdmason@kudzu.us Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20230208091637.16291-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bgmac-bcma.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/broadcom/bgmac-bcma.c +++ b/drivers/net/ethernet/broadcom/bgmac-bcma.c @@ -240,12 +240,12 @@ static int bgmac_probe(struct bcma_devic bgmac->feature_flags |= BGMAC_FEAT_CLKCTLST; bgmac->feature_flags |= BGMAC_FEAT_FLW_CTRL1; bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_PHY; - if (ci->pkg == BCMA_PKG_ID_BCM47188 || - ci->pkg == BCMA_PKG_ID_BCM47186) { + if ((ci->id == BCMA_CHIP_ID_BCM5357 && ci->pkg == BCMA_PKG_ID_BCM47186) || + (ci->id == BCMA_CHIP_ID_BCM53572 && ci->pkg == BCMA_PKG_ID_BCM47188)) { bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_RGMII; bgmac->feature_flags |= BGMAC_FEAT_IOST_ATTACHED; } - if (ci->pkg == BCMA_PKG_ID_BCM5358) + if (ci->id == BCMA_CHIP_ID_BCM5357 && ci->pkg == BCMA_PKG_ID_BCM5358) bgmac->feature_flags |= BGMAC_FEAT_SW_TYPE_EPHYRMII; break; case BCMA_CHIP_ID_BCM53573:
From: Siddharth Vadapalli s-vadapalli@ti.com
commit 0ed577e7e8e508c24e22ba07713ecc4903e147c3 upstream.
In TI's AM62x/AM64x SoCs, successful teardown of RX DMA Channel raises an interrupt. The process of servicing this interrupt involves flushing all pending RX DMA descriptors and clearing the teardown completion marker (TDCM). The am65_cpsw_nuss_rx_packets() function invoked from the RX NAPI callback services the interrupt. Thus, it is necessary to wait for this handler to run, drain all packets and clear TDCM, before calling napi_disable() in am65_cpsw_nuss_common_stop() function post channel teardown. If napi_disable() executes before ensuring that TDCM is cleared, the TDCM remains set when the interfaces are down, resulting in an interrupt storm when the interfaces are brought up again.
Since the interrupt raised to indicate the RX DMA Channel teardown is specific to the AM62x and AM64x SoCs, add a quirk for it.
Fixes: 4f7cce272403 ("net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g") Co-developed-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Reviewed-by: Roger Quadros rogerq@kernel.org Link: https://lore.kernel.org/r/20230209084432.189222-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 12 +++++++++++- drivers/net/ethernet/ti/am65-cpsw-nuss.h | 1 + 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -500,7 +500,15 @@ static int am65_cpsw_nuss_common_stop(st k3_udma_glue_disable_tx_chn(common->tx_chns[i].tx_chn); }
+ reinit_completion(&common->tdown_complete); k3_udma_glue_tdown_rx_chn(common->rx_chns.rx_chn, true); + + if (common->pdata.quirks & AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ) { + i = wait_for_completion_timeout(&common->tdown_complete, msecs_to_jiffies(1000)); + if (!i) + dev_err(common->dev, "rx teardown timeout\n"); + } + napi_disable(&common->napi_rx);
for (i = 0; i < AM65_CPSW_MAX_RX_FLOWS; i++) @@ -704,6 +712,8 @@ static int am65_cpsw_nuss_rx_packets(str
if (cppi5_desc_is_tdcm(desc_dma)) { dev_dbg(dev, "%s RX tdown flow: %u\n", __func__, flow_idx); + if (common->pdata.quirks & AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ) + complete(&common->tdown_complete); return 0; }
@@ -2634,7 +2644,7 @@ static const struct am65_cpsw_pdata j721 };
static const struct am65_cpsw_pdata am64x_cpswxg_pdata = { - .quirks = 0, + .quirks = AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ, .ale_dev_id = "am64-cpswxg", .fdqring_mode = K3_RINGACC_RING_MODE_RING, }; --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.h +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.h @@ -86,6 +86,7 @@ struct am65_cpsw_rx_chn { };
#define AM65_CPSW_QUIRK_I2027_NO_TX_CSUM BIT(0) +#define AM64_CPSW_QUIRK_DMA_RX_TDOWN_IRQ BIT(1)
struct am65_cpsw_pdata { u32 quirks;
From: Pietro Borrello borrello@diag.uniroma1.it
commit a1221703a0f75a9d81748c516457e0fc76951496 upstream.
Use list_is_first() to check whether tsp->asoc matches the first element of ep->asocs, as the list is not guaranteed to have an entry.
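A stand-alone sketch of the difference, assuming the usual <linux/list.h> semantics of list_entry() and list_is_first() (node->prev == head); this is illustrative user-space code, not the sctp_diag code:

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static bool list_is_first(const struct list_head *node,
			  const struct list_head *head)
{
	return node->prev == head;
}

struct assoc { int id; struct list_head asocs; };

int main(void)
{
	struct list_head head = { &head, &head };	/* empty endpoint list */

	/* Old check: decode head.next as an element even though the list is
	 * empty -- the resulting pointer is not a real association. */
	struct assoc *bogus = list_entry(head.next, struct assoc, asocs);
	printf("bogus entry %p vs head %p\n", (void *)bogus, (void *)&head);

	/* New check: a plain pointer comparison, nothing is dereferenced. */
	struct assoc a = { .id = 1 };
	a.asocs.prev = &head;	/* pretend 'a' is linked first in the list */
	printf("is_first: %d\n", list_is_first(&a.asocs, &head));
	return 0;
}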
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Pietro Borrello borrello@diag.uniroma1.it Acked-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/20230208-sctp-filter-v2-1-6e1f4017f326@diag.unirom... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/diag.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/net/sctp/diag.c +++ b/net/sctp/diag.c @@ -343,11 +343,9 @@ static int sctp_sock_filter(struct sctp_ struct sctp_comm_param *commp = p; struct sock *sk = ep->base.sk; const struct inet_diag_req_v2 *r = commp->r; - struct sctp_association *assoc = - list_entry(ep->asocs.next, struct sctp_association, asocs);
/* find the ep only once through the transports by this condition */ - if (tsp->asoc != assoc) + if (!list_is_first(&tsp->asoc->asocs, &ep->asocs)) return 0;
if (r->sdiag_family != AF_UNSPEC && sk->sk_family != r->sdiag_family)
From: Pedro Tammela pctammela@mojatatu.com
commit ee059170b1f7e94e55fa6cadee544e176a6e59c2 upstream.
The imperfect hash area can be updated while packets are traversing, which will cause a use-after-free when 'tcf_exts_exec()' is called with the destroyed tcf_ext.
CPU 0: CPU 1: tcindex_set_parms tcindex_classify tcindex_lookup tcindex_lookup tcf_exts_change tcf_exts_exec [UAF]
Stop operating on the shared area directly by using a local copy, and update the filter with 'rcu_replace_pointer()'. Delete the old filter version only after an RCU grace period has elapsed.
Fixes: 9b0d4446b569 ("net: sched: avoid atomic swap in tcf_exts_change") Reported-by: valis sec@valis.email Suggested-by: valis sec@valis.email Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Link: https://lore.kernel.org/r/20230209143739.279867-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/cls_tcindex.c | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-)
--- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -12,6 +12,7 @@ #include <linux/errno.h> #include <linux/slab.h> #include <linux/refcount.h> +#include <linux/rcupdate.h> #include <net/act_api.h> #include <net/netlink.h> #include <net/pkt_cls.h> @@ -338,6 +339,7 @@ tcindex_set_parms(struct net *net, struc struct tcf_result cr = {}; int err, balloc = 0; struct tcf_exts e; + bool update_h = false;
err = tcf_exts_init(&e, net, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE); if (err < 0) @@ -455,10 +457,13 @@ tcindex_set_parms(struct net *net, struc } }
- if (cp->perfect) + if (cp->perfect) { r = cp->perfect + handle; - else - r = tcindex_lookup(cp, handle) ? : &new_filter_result; + } else { + /* imperfect area is updated in-place using rcu */ + update_h = !!tcindex_lookup(cp, handle); + r = &new_filter_result; + }
if (r == &new_filter_result) { f = kzalloc(sizeof(*f), GFP_KERNEL); @@ -484,7 +489,28 @@ tcindex_set_parms(struct net *net, struc
rcu_assign_pointer(tp->root, cp);
- if (r == &new_filter_result) { + if (update_h) { + struct tcindex_filter __rcu **fp; + struct tcindex_filter *cf; + + f->result.res = r->res; + tcf_exts_change(&f->result.exts, &r->exts); + + /* imperfect area bucket */ + fp = cp->h + (handle % cp->hash); + + /* lookup the filter, guaranteed to exist */ + for (cf = rcu_dereference_bh_rtnl(*fp); cf; + fp = &cf->next, cf = rcu_dereference_bh_rtnl(*fp)) + if (cf->key == handle) + break; + + f->next = cf->next; + + cf = rcu_replace_pointer(*fp, f, 1); + tcf_exts_get_net(&cf->result.exts); + tcf_queue_work(&cf->rwork, tcindex_destroy_fexts_work); + } else if (r == &new_filter_result) { struct tcindex_filter *nfp; struct tcindex_filter __rcu **fp;
From: Larysa Zaremba larysa.zaremba@intel.com
commit 1f090494170ea298530cf1285fb8d078e355b4c0 upstream.
Incrementing xsk_frames inside the for-loop produces an infinite loop if we have both normal AF_XDP-TX and XDP_TXed buffers to complete.
Split xsk_frames into 2 variables (xsk_frames and completed_frames) to eliminate this bug.
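For illustration, a minimal sketch of the fixed counting pattern (hypothetical names, not the actual ice driver code): the loop bound stays fixed while the AF_XDP counter grows in a separate variable.

#include <stdbool.h>
#include <stdint.h>

/* In the buggy version a single variable served as both the loop bound and
 * the AF_XDP frame counter, so every counted buffer pushed the bound further
 * out and the loop never ended. */
static uint16_t count_xsk_frames(const bool *buf_is_xsk, uint16_t completed_frames)
{
	uint16_t xsk_frames = 0;
	uint16_t i;

	for (i = 0; i < completed_frames; i++) {
		if (buf_is_xsk[i])
			xsk_frames++;	/* does not grow the loop bound */
	}
	return xsk_frames;
}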
Fixes: 29322791bc8b ("ice: xsk: change batched Tx descriptor cleaning") Acked-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Larysa Zaremba larysa.zaremba@intel.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Acked-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230209160130.1779890-1-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ice/ice_xsk.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -789,6 +789,7 @@ static void ice_clean_xdp_irq_zc(struct struct ice_tx_desc *tx_desc; u16 cnt = xdp_ring->count; struct ice_tx_buf *tx_buf; + u16 completed_frames = 0; u16 xsk_frames = 0; u16 last_rs; int i; @@ -798,19 +799,21 @@ static void ice_clean_xdp_irq_zc(struct if ((tx_desc->cmd_type_offset_bsz & cpu_to_le64(ICE_TX_DESC_DTYPE_DESC_DONE))) { if (last_rs >= ntc) - xsk_frames = last_rs - ntc + 1; + completed_frames = last_rs - ntc + 1; else - xsk_frames = last_rs + cnt - ntc + 1; + completed_frames = last_rs + cnt - ntc + 1; }
- if (!xsk_frames) + if (!completed_frames) return;
- if (likely(!xdp_ring->xdp_tx_active)) + if (likely(!xdp_ring->xdp_tx_active)) { + xsk_frames = completed_frames; goto skip; + }
ntc = xdp_ring->next_to_clean; - for (i = 0; i < xsk_frames; i++) { + for (i = 0; i < completed_frames; i++) { tx_buf = &xdp_ring->tx_buf[ntc];
if (tx_buf->raw_buf) { @@ -826,7 +829,7 @@ static void ice_clean_xdp_irq_zc(struct } skip: tx_desc->cmd_type_offset_bsz = 0; - xdp_ring->next_to_clean += xsk_frames; + xdp_ring->next_to_clean += completed_frames; if (xdp_ring->next_to_clean >= cnt) xdp_ring->next_to_clean -= cnt; if (xsk_frames)
From: Kuniyuki Iwashima kuniyu@amazon.com
commit ca43ccf41224b023fc290073d5603a755fd12eed upstream.
Eric Dumazet pointed out [0] that when we call skb_set_owner_r() for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called, resulting in a negative sk_forward_alloc.
We add a new helper which clones a skb and sets its owner only when sk_rmem_schedule() succeeds.
Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv() because tcp_send_synack() can make sk_forward_alloc negative before ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases.
[0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOf...
Fixes: 323fbd0edf3f ("net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()") Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/sock.h | 13 +++++++++++++ net/dccp/ipv6.c | 7 ++----- net/ipv6/tcp_ipv6.c | 10 +++------- 3 files changed, 18 insertions(+), 12 deletions(-)
--- a/include/net/sock.h +++ b/include/net/sock.h @@ -2430,6 +2430,19 @@ static inline __must_check bool skb_set_ return false; }
+static inline struct sk_buff *skb_clone_and_charge_r(struct sk_buff *skb, struct sock *sk) +{ + skb = skb_clone(skb, sk_gfp_mask(sk, GFP_ATOMIC)); + if (skb) { + if (sk_rmem_schedule(sk, skb, skb->truesize)) { + skb_set_owner_r(skb, sk); + return skb; + } + __kfree_skb(skb); + } + return NULL; +} + static inline void skb_prepare_for_gro(struct sk_buff *skb) { if (skb->destructor != sock_wfree) { --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -551,11 +551,9 @@ static struct sock *dccp_v6_request_recv *own_req = inet_ehash_nolisten(newsk, req_to_sk(req_unhash), NULL); /* Clone pktoptions received with SYN, if we own the req */ if (*own_req && ireq->pktopts) { - newnp->pktoptions = skb_clone(ireq->pktopts, GFP_ATOMIC); + newnp->pktoptions = skb_clone_and_charge_r(ireq->pktopts, newsk); consume_skb(ireq->pktopts); ireq->pktopts = NULL; - if (newnp->pktoptions) - skb_set_owner_r(newnp->pktoptions, newsk); }
return newsk; @@ -615,7 +613,7 @@ static int dccp_v6_do_rcv(struct sock *s --ANK (980728) */ if (np->rxopt.all) - opt_skb = skb_clone(skb, GFP_ATOMIC); + opt_skb = skb_clone_and_charge_r(skb, sk);
if (sk->sk_state == DCCP_OPEN) { /* Fast path */ if (dccp_rcv_established(sk, skb, dccp_hdr(skb), skb->len)) @@ -679,7 +677,6 @@ ipv6_pktoptions: np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb)); if (ipv6_opt_accepted(sk, opt_skb, &DCCP_SKB_CB(opt_skb)->header.h6)) { - skb_set_owner_r(opt_skb, sk); memmove(IP6CB(opt_skb), &DCCP_SKB_CB(opt_skb)->header.h6, sizeof(struct inet6_skb_parm)); --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1388,14 +1388,11 @@ static struct sock *tcp_v6_syn_recv_sock
/* Clone pktoptions received with SYN, if we own the req */ if (ireq->pktopts) { - newnp->pktoptions = skb_clone(ireq->pktopts, - sk_gfp_mask(sk, GFP_ATOMIC)); + newnp->pktoptions = skb_clone_and_charge_r(ireq->pktopts, newsk); consume_skb(ireq->pktopts); ireq->pktopts = NULL; - if (newnp->pktoptions) { + if (newnp->pktoptions) tcp_v6_restore_cb(newnp->pktoptions); - skb_set_owner_r(newnp->pktoptions, newsk); - } } } else { if (!req_unhash && found_dup_sk) { @@ -1467,7 +1464,7 @@ int tcp_v6_do_rcv(struct sock *sk, struc --ANK (980728) */ if (np->rxopt.all) - opt_skb = skb_clone(skb, sk_gfp_mask(sk, GFP_ATOMIC)); + opt_skb = skb_clone_and_charge_r(skb, sk);
reason = SKB_DROP_REASON_NOT_SPECIFIED; if (sk->sk_state == TCP_ESTABLISHED) { /* Fast path */ @@ -1553,7 +1550,6 @@ ipv6_pktoptions: if (np->repflow) np->flow_label = ip6_flowlabel(ipv6_hdr(opt_skb)); if (ipv6_opt_accepted(sk, opt_skb, &TCP_SKB_CB(opt_skb)->header.h6)) { - skb_set_owner_r(opt_skb, sk); tcp_v6_restore_cb(opt_skb); opt_skb = xchg(&np->pktoptions, opt_skb); } else {
From: Miko Larsson mikoxyzzz@gmail.com
commit c68f345b7c425b38656e1791a0486769a8797016 upstream.
syzbot reported that act_len in kalmia_send_init_packet() is uninitialized when passing it to the first usb_bulk_msg error path. Jiri Pirko noted that it's pointless to pass it in the error path, and that the value that would be printed in the second error path would be the value of act_len from the first call to usb_bulk_msg.[1]
With this in mind, let's just not pass act_len to the usb_bulk_msg error paths.
1: https://lore.kernel.org/lkml/Y9pY61y1nwTuzMOa@nanopsycho/
Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Reported-and-tested-by: syzbot+cd80c5ef5121bfe85b55@syzkaller.appspotmail.com Signed-off-by: Miko Larsson mikoxyzzz@gmail.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/kalmia.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/net/usb/kalmia.c +++ b/drivers/net/usb/kalmia.c @@ -65,8 +65,8 @@ kalmia_send_init_packet(struct usbnet *d init_msg, init_msg_len, &act_len, KALMIA_USB_TIMEOUT); if (status != 0) { netdev_err(dev->net, - "Error sending init packet. Status %i, length %i\n", - status, act_len); + "Error sending init packet. Status %i\n", + status); return status; } else if (act_len != init_msg_len) { @@ -83,8 +83,8 @@ kalmia_send_init_packet(struct usbnet *d
if (status != 0) netdev_err(dev->net, - "Error receiving init result. Status %i, length %i\n", - status, act_len); + "Error receiving init result. Status %i\n", + status); else if (act_len != expected_len) netdev_err(dev->net, "Unexpected init result length: %i\n", act_len);
From: Pedro Tammela pctammela@mojatatu.com
commit 21c167aa0ba943a7cac2f6969814f83bb701666b upstream.
The tc action act_ctinfo was using shared stats; fix it to use per-CPU stats, since bstats_update() must be called either under locks or with a per-CPU pointer argument.
tdc results:
1..12
ok 1 c826 - Add ctinfo action with default setting
ok 2 0286 - Add ctinfo action with dscp
ok 3 4938 - Add ctinfo action with valid cpmark and zone
ok 4 7593 - Add ctinfo action with drop control
ok 5 2961 - Replace ctinfo action zone and action control
ok 6 e567 - Delete ctinfo action with valid index
ok 7 6a91 - Delete ctinfo action with invalid index
ok 8 5232 - List ctinfo actions
ok 9 7702 - Flush ctinfo actions
ok 10 3201 - Add ctinfo action with duplicate index
ok 11 8295 - Add ctinfo action with invalid index
ok 12 3964 - Replace ctinfo action with invalid goto_chain control
Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Reviewed-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Larysa Zaremba larysa.zaremba@intel.com Link: https://lore.kernel.org/r/20230210200824.444856-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_ctinfo.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/sched/act_ctinfo.c +++ b/net/sched/act_ctinfo.c @@ -91,7 +91,7 @@ static int tcf_ctinfo_act(struct sk_buff cp = rcu_dereference_bh(ca->params);
tcf_lastuse_update(&ca->tcf_tm); - bstats_update(&ca->tcf_bstats, skb); + tcf_action_update_bstats(&ca->common, skb); action = READ_ONCE(ca->tcf_action);
wlen = skb_network_offset(skb); @@ -210,8 +210,8 @@ static int tcf_ctinfo_init(struct net *n index = actparm->index; err = tcf_idr_check_alloc(tn, &index, a, bind); if (!err) { - ret = tcf_idr_create(tn, index, est, a, - &act_ctinfo_ops, bind, false, flags); + ret = tcf_idr_create_from_flags(tn, index, est, a, + &act_ctinfo_ops, bind, flags); if (ret) { tcf_idr_cleanup(tn, index); return ret;
From: Hangyu Hua hbh25y@gmail.com
commit 2fa28f5c6fcbfc794340684f36d2581b4f2d20b5 upstream.
old_meter needs to be freed after it is detached, regardless of whether the new meter is successfully attached.
Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number") Signed-off-by: Hangyu Hua hbh25y@gmail.com Acked-by: Eelco Chaudron echaudro@redhat.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/openvswitch/meter.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/openvswitch/meter.c +++ b/net/openvswitch/meter.c @@ -449,7 +449,7 @@ static int ovs_meter_cmd_set(struct sk_b
err = attach_meter(meter_tbl, meter); if (err) - goto exit_unlock; + goto exit_free_old_meter;
ovs_unlock();
@@ -472,6 +472,8 @@ static int ovs_meter_cmd_set(struct sk_b genlmsg_end(reply, ovs_reply_header); return genlmsg_reply(reply, info);
+exit_free_old_meter: + ovs_meter_free(old_meter); exit_unlock: ovs_unlock(); nlmsg_free(reply);
From: Johannes Zink j.zink@pengutronix.de
commit 4562c65ec852067c6196abdcf2d925f08841dcbc upstream.
So far, changing the period by just setting new period values while the output was running did not work.
The programming sequence indicated by the publicly available reference manual of the i.MX8MP [1] is:
* initiate the programming sequence
* set the values for PPS period and start time
* start the pulse train generation.
This sequence is currently not followed in dwmac5_flex_pps_config(), which instead does:
* initiate the programming sequence and immediately start the pulse train generation
* set the values for PPS period and start time
This caused the period values written not to take effect until the FlexPPS output was disabled and re-enabled again.
This patch fixes the order and allows the period to be set immediately.
[1] https://www.nxp.com/webapp/Download?colCode=IMX8MPRM
Fixes: 9a8a02c9d46d ("net: stmmac: Add Flexible PPS support") Signed-off-by: Johannes Zink j.zink@pengutronix.de Link: https://lore.kernel.org/r/20230210143937.3427483-1-j.zink@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac5.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac5.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac5.c @@ -541,9 +541,9 @@ int dwmac5_flex_pps_config(void __iomem return 0; }
- val |= PPSCMDx(index, 0x2); val |= TRGTMODSELx(index, 0x2); val |= PPSEN0; + writel(val, ioaddr + MAC_PPS_CONTROL);
writel(cfg->start.tv_sec, ioaddr + MAC_PPSx_TARGET_TIME_SEC(index));
@@ -568,6 +568,7 @@ int dwmac5_flex_pps_config(void __iomem writel(period - 1, ioaddr + MAC_PPSx_WIDTH(index));
/* Finally, activate it */ + val |= PPSCMDx(index, 0x2); writel(val, ioaddr + MAC_PPS_CONTROL); return 0; }
From: Michael Chan michael.chan@broadcom.com
commit 2038cc592811209de20c4e094ca08bfb1e6fbc6c upstream.
In bnxt_reserve_rings(), there is logic to check that the number of TX rings reserved is enough to cover all the mqprio TCs, but it fails to account for the TX XDP rings. So the check will always fail if there are mqprio TCs and TX XDP rings. As a result, the driver always fails to initialize after the XDP program is attached and the device will be brought down. A subsequent ifconfig up will also fail because the number of TX rings is set to an inconsistent number. Fix the check to properly account for TX XDP rings. If the check fails, set the number of TX rings back to a consistent number after calling netdev_reset_tc().
Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.") Reviewed-by: Hongguang Gao hongguang.gao@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -9239,10 +9239,14 @@ int bnxt_reserve_rings(struct bnxt *bp, netdev_err(bp->dev, "ring reservation/IRQ init failure rc: %d\n", rc); return rc; } - if (tcs && (bp->tx_nr_rings_per_tc * tcs != bp->tx_nr_rings)) { + if (tcs && (bp->tx_nr_rings_per_tc * tcs != + bp->tx_nr_rings - bp->tx_nr_rings_xdp)) { netdev_err(bp->dev, "tx ring reservation failure\n"); netdev_reset_tc(bp->dev); - bp->tx_nr_rings_per_tc = bp->tx_nr_rings; + if (bp->tx_nr_rings_xdp) + bp->tx_nr_rings_per_tc = bp->tx_nr_rings_xdp; + else + bp->tx_nr_rings_per_tc = bp->tx_nr_rings; return -ENOMEM; } return 0;
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 70b5339caf847b8b6097b6dfab0c5a99b40713c8 upstream.
trace_define_field_ext() is not used outside of trace_events.c, so it should be static.
Link: https://lore.kernel.org/oe-kbuild-all/202302130750.679RaRog-lkp@intel.com/
Fixes: b6c7abd1c28a ("tracing: Fix TASK_COMM_LEN in trace event format file") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 6a4696719297..6a942fa275c7 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -155,7 +155,7 @@ int trace_define_field(struct trace_event_call *call, const char *type, } EXPORT_SYMBOL_GPL(trace_define_field);
-int trace_define_field_ext(struct trace_event_call *call, const char *type, +static int trace_define_field_ext(struct trace_event_call *call, const char *type, const char *name, int offset, int size, int is_signed, int filter_type, int len) {
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
commit 05d7623a892a9da62da0e714428e38f09e4a64d8 upstream.
When setting the 'snps,force_thresh_dma_mode' DT property, the following warning is always emitted, regardless of the status of force_sf_dma_mode:
dwmac-starfive 10020000.ethernet: force_sf_dma_mode is ignored if force_thresh_dma_mode is set.
Do not print the rather misleading message when DMA store and forward mode is already disabled.
Fixes: e2a240c7d3bc ("driver:net:stmmac: Disable DMA store and forward mode if platform data force_thresh_dma_mode is set.") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://lore.kernel.org/r/20230210202126.877548-1-cristian.ciocaltea@collabo... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -559,7 +559,7 @@ stmmac_probe_config_dt(struct platform_d dma_cfg->mixed_burst = of_property_read_bool(np, "snps,mixed-burst");
plat->force_thresh_dma_mode = of_property_read_bool(np, "snps,force_thresh_dma_mode"); - if (plat->force_thresh_dma_mode) { + if (plat->force_thresh_dma_mode && plat->force_sf_dma_mode) { plat->force_sf_dma_mode = 0; dev_warn(&pdev->dev, "force_sf_dma_mode is ignored if force_thresh_dma_mode is set.\n");
From: Eric Dumazet edumazet@google.com
commit 2558b8039d059342197610498c8749ad294adee5 upstream.
syzbot found that arm64 builds would crash in sock_recv_mark() when CONFIG_HARDENED_USERCOPY=y is set.
x86 and powerpc are not detecting the issue because they define user_access_begin. This will be handled in a different patch, because a check_object_size() is missing.
Only data from skb->cb[] can be copied directly to/from user space, as explained in commit 79a8a642bf05 ("net: Whitelist the skbuff_head_cache "cb" field")
syzbot report was:
usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_head_cache' (offset 168, size 4)!
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:102 !
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 4410 Comm: syz-executor533 Not tainted 6.2.0-rc7-syzkaller-17907-g2d3827b3f393 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : usercopy_abort+0x90/0x94 mm/usercopy.c:90
lr : usercopy_abort+0x90/0x94 mm/usercopy.c:90
sp : ffff80000fb9b9a0
x29: ffff80000fb9b9b0 x28: ffff0000c6073400 x27: 0000000020001a00
x26: 0000000000000014 x25: ffff80000cf52000 x24: fffffc0000000000
x23: 05ffc00000000200 x22: fffffc000324bf80 x21: ffff0000c92fe1a8
x20: 0000000000000001 x19: 0000000000000004 x18: 0000000000000000
x17: 656a626f2042554c x16: ffff0000c6073dd0 x15: ffff80000dbd2118
x14: ffff0000c6073400 x13: 00000000ffffffff x12: ffff0000c6073400
x11: ff808000081bbb4c x10: 0000000000000000 x9 : 7b0572d7cc0ccf00
x8 : 7b0572d7cc0ccf00 x7 : ffff80000bf650d4 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000
x2 : ffff0001fefbff08 x1 : 0000000100000000 x0 : 000000000000006c
Call trace:
 usercopy_abort+0x90/0x94 mm/usercopy.c:90
 __check_heap_object+0xa8/0x100 mm/slub.c:4761
 check_heap_object mm/usercopy.c:196 [inline]
 __check_object_size+0x208/0x6b8 mm/usercopy.c:251
 check_object_size include/linux/thread_info.h:199 [inline]
 __copy_to_user include/linux/uaccess.h:115 [inline]
 put_cmsg+0x408/0x464 net/core/scm.c:238
 sock_recv_mark net/socket.c:975 [inline]
 __sock_recv_cmsgs+0x1fc/0x248 net/socket.c:984
 sock_recv_cmsgs include/net/sock.h:2728 [inline]
 packet_recvmsg+0x2d8/0x678 net/packet/af_packet.c:3482
 ____sys_recvmsg+0x110/0x3a0
 ___sys_recvmsg net/socket.c:2737 [inline]
 __sys_recvmsg+0x194/0x210 net/socket.c:2767
 __do_sys_recvmsg net/socket.c:2777 [inline]
 __se_sys_recvmsg net/socket.c:2774 [inline]
 __arm64_sys_recvmsg+0x2c/0x3c net/socket.c:2774
 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
 invoke_syscall+0x64/0x178 arch/arm64/kernel/syscall.c:52
 el0_svc_common+0xbc/0x180 arch/arm64/kernel/syscall.c:142
 do_el0_svc+0x48/0x110 arch/arm64/kernel/syscall.c:193
 el0_svc+0x58/0x14c arch/arm64/kernel/entry-common.c:637
 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
Code: 91388800 aa0903e1 f90003e8 94e6d752 (d4210000)
Fixes: 6fd1d51cfa25 ("net: SO_RCVMARK socket option for SO_MARK with recvmsg()") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Erin MacNeil lnx.erin@gmail.com Reviewed-by: Alexander Lobakin alexandr.lobakin@intel.com Link: https://lore.kernel.org/r/20230213160059.3829741-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/socket.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/net/socket.c +++ b/net/socket.c @@ -971,9 +971,12 @@ static inline void sock_recv_drops(struc static void sock_recv_mark(struct msghdr *msg, struct sock *sk, struct sk_buff *skb) { - if (sock_flag(sk, SOCK_RCVMARK) && skb) - put_cmsg(msg, SOL_SOCKET, SO_MARK, sizeof(__u32), - &skb->mark); + if (sock_flag(sk, SOCK_RCVMARK) && skb) { + /* We must use a bounce buffer for CONFIG_HARDENED_USERCOPY=y */ + __u32 mark = skb->mark; + + put_cmsg(msg, SOL_SOCKET, SO_MARK, sizeof(__u32), &mark); + } }
void __sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
From: Tung Nguyen tung.q.nguyen@dektech.com.au
commit 11a4d6f67cf55883dc78e31c247d1903ed7feccc upstream.
When sending a SYN message, this kernel stack trace is observed:
...
[ 13.396352] RIP: 0010:_copy_from_iter+0xb4/0x550
...
[ 13.398494] Call Trace:
[ 13.398630] <TASK>
[ 13.398630] ? __alloc_skb+0xed/0x1a0
[ 13.398630] tipc_msg_build+0x12c/0x670 [tipc]
[ 13.398630] ? shmem_add_to_page_cache.isra.71+0x151/0x290
[ 13.398630] __tipc_sendmsg+0x2d1/0x710 [tipc]
[ 13.398630] ? tipc_connect+0x1d9/0x230 [tipc]
[ 13.398630] ? __local_bh_enable_ip+0x37/0x80
[ 13.398630] tipc_connect+0x1d9/0x230 [tipc]
[ 13.398630] ? __sys_connect+0x9f/0xd0
[ 13.398630] __sys_connect+0x9f/0xd0
[ 13.398630] ? preempt_count_add+0x4d/0xa0
[ 13.398630] ? fpregs_assert_state_consistent+0x22/0x50
[ 13.398630] __x64_sys_connect+0x16/0x20
[ 13.398630] do_syscall_64+0x42/0x90
[ 13.398630] entry_SYSCALL_64_after_hwframe+0x63/0xcd
This is because commit a41dad905e5a ("iov_iter: saner checks for attempt to copy to/from iterator") introduced a sanity check for copying from/to an iov iterator. A missing copy direction, from the iterator's viewpoint, leads to a kernel stack trace like the one above.
This commit fixes this issue by initializing the iov iterator with the correct copy direction when sending SYN or ACK without data.
Fixes: f25dcc7687d4 ("tipc: tipc ->sendmsg() conversion") Reported-by: syzbot+d43608d061e8847ec9f3@syzkaller.appspotmail.com Acked-by: Jon Maloy jmaloy@redhat.com Signed-off-by: Tung Nguyen tung.q.nguyen@dektech.com.au Link: https://lore.kernel.org/r/20230214012606.5804-1-tung.q.nguyen@dektech.com.au Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/tipc/socket.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2614,6 +2614,7 @@ static int tipc_connect(struct socket *s /* Send a 'SYN-' to destination */ m.msg_name = dest; m.msg_namelen = destlen; + iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
/* If connect is in non-blocking case, set MSG_DONTWAIT to * indicate send_msg() is never blocked. @@ -2776,6 +2777,7 @@ static int tipc_accept(struct socket *so __skb_queue_head(&new_sk->sk_receive_queue, buf); skb_set_owner_r(buf, new_sk); } + iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0); __tipc_sendstream(new_sock, &m, 0); release_sock(new_sk); exit:
From: Jakub Kicinski kuba@kernel.org
commit fda6c89fe3d9aca073495a664e1d5aea28cd4377 upstream.
lianhui reports that when MPLS fails to register the sysctl table under the new location (during a device rename), the old pointers won't get overwritten and may be freed again (double free).
Handle this gracefully. The best option would be unregistering the MPLS from the device completely on failure, but unfortunately mpls_ifdown() can fail. So failing fully is also unreliable.
Another option is to register the new table first and only remove the old one if the new one succeeds. That requires more code, changes the order of notifications, and two tables may be visible at the same time.
The sysctl pointer is not used in the rest of the code, so set it to NULL on failures and skip the unregister if it is already NULL.
Reported-by: lianhui tang bluetlh@gmail.com Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls") Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mpls/af_mpls.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -1428,6 +1428,7 @@ static int mpls_dev_sysctl_register(stru free: kfree(table); out: + mdev->sysctl = NULL; return -ENOBUFS; }
@@ -1437,6 +1438,9 @@ static void mpls_dev_sysctl_unregister(s struct net *net = dev_net(dev); struct ctl_table *table;
+ if (!mdev->sysctl) + return; + table = mdev->sysctl->ctl_table_arg; unregister_net_sysctl_table(mdev->sysctl); kfree(table);
From: Corinna Vinschen vinschen@redhat.com
commit 5d54cb1767e06025819daa6769e0f18dcbc60936 upstream.
Commit a97f8783a937 ("igb: unbreak I2C bit-banging on i350") introduced code to change I2C settings to bit banging unconditionally.
However, this patch introduced a regression: On an Intel S2600CWR Server Board with three NICs:
- 1x dual-port copper Intel I350 Gigabit Network Connection [8086:1521] (rev 01) fw 1.63, 0x80000dda
- 2x quad-port SFP+ with copper SFP Avago ABCU-5700RZ Intel I350 Gigabit Fiber Network Connection [8086:1522] (rev 01) fw 1.52.0
the SFP NICs no longer get link at all. Reverting commit a97f8783a937 or switching to the Intel out-of-tree driver both fix the problem.
Per the igb out-of-tree driver, I2C bit banging on i350 depends on support for an external thermal sensor (ETS). However, commit a97f8783a937 added bit banging unconditionally. Additionally, the out-of-tree driver always calls init_thermal_sensor_thresh on probe, while our driver calls init_thermal_sensor_thresh only in igb_reset(), and only if an ETS is present, ignoring the internal thermal sensor. The affected SFPs don't provide an ETS. Per Intel, the behaviour is a result of i350 firmware requirements.
This patch fixes the problem by aligning the behaviour to the out-of-tree driver:
- split igb_init_i2c() into two functions:
  - igb_init_i2c() only performs the basic I2C initialization.
  - igb_set_i2c_bb() makes sure that E1000_CTRL_I2C_ENA is set and enables bit-banging.
- igb_probe() only calls igb_set_i2c_bb() if an ETS is present.
- igb_probe() calls init_thermal_sensor_thresh() unconditionally.
- igb_reset() aligns its behaviour to igb_probe(), i.e., it calls igb_set_i2c_bb() if an ETS is present and calls init_thermal_sensor_thresh() unconditionally.
Fixes: a97f8783a937 ("igb: unbreak I2C bit-banging on i350") Tested-by: Mateusz Palczewski mateusz.palczewski@intel.com Co-developed-by: Jamie Bainbridge jbainbri@redhat.com Signed-off-by: Jamie Bainbridge jbainbri@redhat.com Signed-off-by: Corinna Vinschen vinschen@redhat.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230214185549.1306522-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/igb/igb_main.c | 42 +++++++++++++++++------ 1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index d8e3048b93dd..b5b443883da9 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -2256,6 +2256,30 @@ static void igb_enable_mas(struct igb_adapter *adapter) } }
+#ifdef CONFIG_IGB_HWMON +/** + * igb_set_i2c_bb - Init I2C interface + * @hw: pointer to hardware structure + **/ +static void igb_set_i2c_bb(struct e1000_hw *hw) +{ + u32 ctrl_ext; + s32 i2cctl; + + ctrl_ext = rd32(E1000_CTRL_EXT); + ctrl_ext |= E1000_CTRL_I2C_ENA; + wr32(E1000_CTRL_EXT, ctrl_ext); + wrfl(); + + i2cctl = rd32(E1000_I2CPARAMS); + i2cctl |= E1000_I2CBB_EN + | E1000_I2C_CLK_OE_N + | E1000_I2C_DATA_OE_N; + wr32(E1000_I2CPARAMS, i2cctl); + wrfl(); +} +#endif + void igb_reset(struct igb_adapter *adapter) { struct pci_dev *pdev = adapter->pdev; @@ -2400,7 +2424,8 @@ void igb_reset(struct igb_adapter *adapter) * interface. */ if (adapter->ets) - mac->ops.init_thermal_sensor_thresh(hw); + igb_set_i2c_bb(hw); + mac->ops.init_thermal_sensor_thresh(hw); } } #endif @@ -3117,21 +3142,12 @@ static void igb_init_mas(struct igb_adapter *adapter) **/ static s32 igb_init_i2c(struct igb_adapter *adapter) { - struct e1000_hw *hw = &adapter->hw; s32 status = 0; - s32 i2cctl;
/* I2C interface supported on i350 devices */ if (adapter->hw.mac.type != e1000_i350) return 0;
- i2cctl = rd32(E1000_I2CPARAMS); - i2cctl |= E1000_I2CBB_EN - | E1000_I2C_CLK_OUT | E1000_I2C_CLK_OE_N - | E1000_I2C_DATA_OUT | E1000_I2C_DATA_OE_N; - wr32(E1000_I2CPARAMS, i2cctl); - wrfl(); - /* Initialize the i2c bus which is controlled by the registers. * This bus will use the i2c_algo_bit structure that implements * the protocol through toggling of the 4 bits in the register. @@ -3521,6 +3537,12 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent) adapter->ets = true; else adapter->ets = false; + /* Only enable I2C bit banging if an external thermal + * sensor is supported. + */ + if (adapter->ets) + igb_set_i2c_bb(hw); + hw->mac.ops.init_thermal_sensor_thresh(hw); if (igb_sysfs_init(adapter)) dev_err(&pdev->dev, "failed to allocate sysfs resources\n");
From: Miroslav Lichvar mlichvar@redhat.com
commit 207ce626add80ddd941f62fc2fe5d77586e0801b upstream.
Fix handling of the tsync interrupt to compare the pin number with IGB_N_SDP instead of IGB_N_EXTTS/IGB_N_PEROUT, and fix the indexing into the perout array.
Fixes: cf99c1dd7b77 ("igb: move PEROUT and EXTTS isr logic to separate functions") Reported-by: Matt Corallo ntp-lists@mattcorallo.com Signed-off-by: Miroslav Lichvar mlichvar@redhat.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Gurucharan G gurucharanx.g@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230213185822.3960072-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/igb/igb_main.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6816,7 +6816,7 @@ static void igb_perout(struct igb_adapte struct timespec64 ts; u32 tsauxc;
- if (pin < 0 || pin >= IGB_N_PEROUT) + if (pin < 0 || pin >= IGB_N_SDP) return;
spin_lock(&adapter->tmreg_lock); @@ -6824,7 +6824,7 @@ static void igb_perout(struct igb_adapte if (hw->mac.type == e1000_82580 || hw->mac.type == e1000_i354 || hw->mac.type == e1000_i350) { - s64 ns = timespec64_to_ns(&adapter->perout[pin].period); + s64 ns = timespec64_to_ns(&adapter->perout[tsintr_tt].period); u32 systiml, systimh, level_mask, level, rem; u64 systim, now;
@@ -6872,8 +6872,8 @@ static void igb_perout(struct igb_adapte ts.tv_nsec = (u32)systim; ts.tv_sec = ((u32)(systim >> 32)) & 0xFF; } else { - ts = timespec64_add(adapter->perout[pin].start, - adapter->perout[pin].period); + ts = timespec64_add(adapter->perout[tsintr_tt].start, + adapter->perout[tsintr_tt].period); }
/* u32 conversion of tv_sec is safe until y2106 */ @@ -6882,7 +6882,7 @@ static void igb_perout(struct igb_adapte tsauxc = rd32(E1000_TSAUXC); tsauxc |= TSAUXC_EN_TT0; wr32(E1000_TSAUXC, tsauxc); - adapter->perout[pin].start = ts; + adapter->perout[tsintr_tt].start = ts;
spin_unlock(&adapter->tmreg_lock); } @@ -6896,7 +6896,7 @@ static void igb_extts(struct igb_adapter struct ptp_clock_event event; struct timespec64 ts;
- if (pin < 0 || pin >= IGB_N_EXTTS) + if (pin < 0 || pin >= IGB_N_SDP) return;
if (hw->mac.type == e1000_82580 ||
From: Jason Xing kernelxing@tencent.com
commit 0967bf837784a11c65d66060623a74e65211af0b upstream.
Take the second VLAN HLEN into account when computing the maximum MTU size, as other drivers do.
Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP") Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe.h | 2 ++ drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 +-- 2 files changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -67,6 +67,8 @@ #define IXGBE_RXBUFFER_4K 4096 #define IXGBE_MAX_RXBUFFER 16384 /* largest size for a single descriptor */
+#define IXGBE_PKT_HDR_PAD (ETH_HLEN + ETH_FCS_LEN + (VLAN_HLEN * 2)) + /* Attempt to maximize the headroom available for incoming frames. We * use a 2K buffer for receives and need 1536/1534 to store the data for * the frame. This leaves us with 512 bytes of room. From that we need --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6801,8 +6801,7 @@ static int ixgbe_change_mtu(struct net_d struct ixgbe_adapter *adapter = netdev_priv(netdev);
if (ixgbe_enabled_xdp_adapter(adapter)) { - int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN + - VLAN_HLEN; + int new_frame_size = new_mtu + IXGBE_PKT_HDR_PAD;
if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) { e_warn(probe, "Requested MTU size is not supported with XDP\n");
From: Guillaume Nault gnault@redhat.com
commit e010ae08c71fda8be3d6bda256837795a0b3ea41 upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in ip6_datagram_flow_key_init(). Otherwise, fib6_rule_match() can't properly match the DSCP value, resulting in an invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0
ip -6 rule add dsfield 0x04 table 100
echo test | socat - UDP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host") because the fib-rule doesn't jump to table 100 and the lookup ends up being done in the main table.
Fixes: 2cc67cc731d9 ("[IPV6] ROUTE: Routing by Traffic Class.") Signed-off-by: Guillaume Nault gnault@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -51,7 +51,7 @@ static void ip6_datagram_flow_key_init(s fl6->flowi6_mark = sk->sk_mark; fl6->fl6_dport = inet->inet_dport; fl6->fl6_sport = inet->inet_sport; - fl6->flowlabel = np->flow_label; + fl6->flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label); fl6->flowi6_uid = sk->sk_uid;
if (!oif)
From: Guillaume Nault gnault@redhat.com
commit 8230680f36fd1525303d1117768c8852314c488c upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in tcp_v6_connect(). Otherwise, fib6_rule_match() can't properly match the DSCP value, resulting in an invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0
ip -6 rule add dsfield 0x04 table 100
echo test | socat - TCP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host") because the fib-rule doesn't jump to table 100 and the lookup ends up being done in the main table.
Fixes: 2cc67cc731d9 ("[IPV6] ROUTE: Routing by Traffic Class.") Signed-off-by: Guillaume Nault gnault@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/tcp_ipv6.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -272,6 +272,7 @@ static int tcp_v6_connect(struct sock *s fl6.flowi6_proto = IPPROTO_TCP; fl6.daddr = sk->sk_v6_daddr; fl6.saddr = saddr ? *saddr : np->saddr; + fl6.flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label); fl6.flowi6_oif = sk->sk_bound_dev_if; fl6.flowi6_mark = sk->sk_mark; fl6.fl6_dport = usin->sin6_port;
From: Kuan-Ying Lee Kuan-Ying.Lee@mediatek.com
commit aa1e6a932ca652a50a5df458399724a80459f521 upstream.
If the call to folio_isolate_lru() succeeds, it returns 0, and we need to add the folio to the movable_page_list.
Link: https://lkml.kernel.org/r/20230131063206.28820-1-Kuan-Ying.Lee@mediatek.com Fixes: 67e139b02d99 ("mm/gup.c: refactor check_and_migrate_movable_pages()") Signed-off-by: Kuan-Ying Lee Kuan-Ying.Lee@mediatek.com Reviewed-by: Alistair Popple apopple@nvidia.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: Andrew Yang andrew.yang@mediatek.com Cc: Chinwen Chang chinwen.chang@mediatek.com Cc: John Hubbard jhubbard@nvidia.com Cc: Matthias Brugger matthias.bgg@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -1978,7 +1978,7 @@ static unsigned long collect_longterm_un drain_allow = false; }
- if (!folio_isolate_lru(folio)) + if (folio_isolate_lru(folio)) continue;
list_add_tail(&folio->lru, movable_page_list);
From: Arnd Bergmann arnd@arndb.de
commit 3770e52fd4ec40ebee16ba19ad6c09dc0b52739b upstream.
After x86 enabled support for KMSAN, it has become possible to have a larger 'struct page' than was expected when commit 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b architectures") was merged:
include/linux/mm.h:156:10: warning: no case matching constant switch condition '96'
        switch (sizeof(struct page)) {
Extend the maximum accordingly.
Link: https://lkml.kernel.org/r/20230130130739.563628-1-arnd@kernel.org Fixes: 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b architectures") Fixes: 4ca8cc8d1bbe ("x86: kmsan: enable KMSAN builds for x86") Fixes: f80be4571b19 ("kmsan: add KMSAN runtime core") Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: Pasha Tatashin pasha.tatashin@soleen.com Cc: Alexander Duyck alexander.h.duyck@linux.intel.com Cc: Alexander Potapenko glider@google.com Cc: Alex Sierra alex.sierra@amd.com Cc: David Hildenbrand david@redhat.com Cc: Hugh Dickins hughd@google.com Cc: John Hubbard jhubbard@nvidia.com Cc: Liam R. Howlett Liam.Howlett@Oracle.com Cc: Matthew Wilcox willy@infradead.org Cc: Naoya Horiguchi naoya.horiguchi@nec.com Cc: Suren Baghdasaryan surenb@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mm.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
--- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -136,7 +136,7 @@ extern int mmap_rnd_compat_bits __read_m * define their own version of this macro in <asm/pgtable.h> */ #if BITS_PER_LONG == 64 -/* This function must be updated when the size of struct page grows above 80 +/* This function must be updated when the size of struct page grows above 96 * or reduces below 56. The idea that compiler optimizes out switch() * statement, and only leaves move/store instructions. Also the compiler can * combine write statements if they are both assignments and can be reordered, @@ -147,12 +147,18 @@ static inline void __mm_zero_struct_page { unsigned long *_pp = (void *)page;
- /* Check that struct page is either 56, 64, 72, or 80 bytes */ + /* Check that struct page is either 56, 64, 72, 80, 88 or 96 bytes */ BUILD_BUG_ON(sizeof(struct page) & 7); BUILD_BUG_ON(sizeof(struct page) < 56); - BUILD_BUG_ON(sizeof(struct page) > 80); + BUILD_BUG_ON(sizeof(struct page) > 96);
switch (sizeof(struct page)) { + case 96: + _pp[11] = 0; + fallthrough; + case 88: + _pp[10] = 0; + fallthrough; case 80: _pp[9] = 0; fallthrough;
From: Natalia Petrova n.petrova@fintech.ru
[ Upstream commit 7fa0b526f865cb42aa33917fd02a92cb03746f4d ]
The result of nlmsg_find_attr(), 'br_spec', is dereferenced in nla_for_each_nested(), but it can be NULL, as returned by nla_find(), which will result in an error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 51616018dd1b ("i40e: Add support for getlink, setlink ndo ops") Signed-off-by: Natalia Petrova n.petrova@fintech.ru Reviewed-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Gurucharan G gurucharanx.g@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230209172833.3596034-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 18044c2a36faa..d30bc38725e97 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -13140,6 +13140,8 @@ static int i40e_ndo_bridge_setlink(struct net_device *dev, }
br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC); + if (!br_spec) + return -EINVAL;
nla_for_each_nested(attr, br_spec, rem) { __u16 mode;
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 42018a322bd453e38b3ffee294982243e50a484f ]
Syzkaller found an issue where a handle greater than 16 bits would trigger a null-ptr-deref in the imperfect hash area update.
general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af]
CPU: 0 PID: 5070 Comm: syz-executor456 Not tainted 6.2.0-rc7-syzkaller-00112-gc68f345b7c42 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
RIP: 0010:tcindex_set_parms+0x1a6a/0x2990 net/sched/cls_tcindex.c:509
Code: 01 e9 e9 fe ff ff 4c 8b bd 28 fe ff ff e8 0e 57 7d f9 48 8d bb a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 94 0c 00 00 48 8b 85 f8 fd ff ff 48 8b 9b a8 00
RSP: 0018:ffffc90003d3ef88 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000015 RSI: ffffffff8803a102 RDI: 00000000000000a8
RBP: ffffc90003d3f1d8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88801e2b10a8
R13: dffffc0000000000 R14: 0000000000030000 R15: ffff888017b3be00
FS: 00005555569af300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056041c6d2000 CR3: 000000002bfca000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tcindex_change+0x1ea/0x320 net/sched/cls_tcindex.c:572
 tc_new_tfilter+0x96e/0x2220 net/sched/cls_api.c:2155
 rtnetlink_rcv_msg+0x959/0xca0 net/core/rtnetlink.c:6132
 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574
 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
 netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365
 netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1942
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xd3/0x120 net/socket.c:734
 ____sys_sendmsg+0x334/0x8c0 net/socket.c:2476
 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2530
 __sys_sendmmsg+0x18f/0x460 net/socket.c:2616
 __do_sys_sendmmsg net/socket.c:2645 [inline]
 __se_sys_sendmmsg net/socket.c:2642 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2642
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
Fixes: ee059170b1f7 ("net/sched: tcindex: update imperfect hash filters respecting rcu") Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reported-by: syzbot syzkaller@googlegroups.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_tcindex.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c index 4422b711af081..eea8e185fcdb2 100644 --- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -502,7 +502,7 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base, /* lookup the filter, guaranteed to exist */ for (cf = rcu_dereference_bh_rtnl(*fp); cf; fp = &cf->next, cf = rcu_dereference_bh_rtnl(*fp)) - if (cf->key == handle) + if (cf->key == (u16)handle) break;
f->next = cf->next;
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 1f1a4f89562d3b33b6ca4fc8a4f3bd4cd35ab4ea ]
When starting error recovery there might be authentication work running, and it involves I/O commands. Given that the controller is tearing down, there is no chance for the I/O to complete other than timing out, which may unnecessarily take a full I/O timeout.
So first tear down the queues, fail/cancel all inflight I/O (including potentially authentication) and only then stop authentication. This ensures that failover is not stalled due to blocked authentication I/O.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 4c052c261517e..1dc7c733c7e39 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -2128,7 +2128,6 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work) struct nvme_tcp_ctrl, err_work); struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
- nvme_auth_stop(ctrl); nvme_stop_keep_alive(ctrl); flush_work(&ctrl->async_event_work); nvme_tcp_teardown_io_queues(ctrl, false); @@ -2136,6 +2135,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work) nvme_start_queues(ctrl); nvme_tcp_teardown_admin_queue(ctrl, false); nvme_start_admin_queue(ctrl); + nvme_auth_stop(ctrl);
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) { /* state change failure is ok if we started ctrl delete */
From: Sagi Grimberg sagi@grimberg.me
[ Upstream commit 91c11d5f32547a08d462934246488fe72f3d44c3 ]
When starting error recovery there might be authentication work running, and it involves I/O commands. Given that the controller is tearing down, there is no chance for the I/O to complete other than timing out, which may unnecessarily take a full I/O timeout.
So first tear down the queues, fail/cancel all inflight I/O (including potentially authentication) and only then stop authentication. This ensures that failover is not stalled due to blocked authentication I/O.
Signed-off-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/rdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 6f918e61b6aef..80383213b8828 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1154,13 +1154,13 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work) struct nvme_rdma_ctrl *ctrl = container_of(work, struct nvme_rdma_ctrl, err_work);
- nvme_auth_stop(&ctrl->ctrl); nvme_stop_keep_alive(&ctrl->ctrl); flush_work(&ctrl->ctrl.async_event_work); nvme_rdma_teardown_io_queues(ctrl, false); nvme_start_queues(&ctrl->ctrl); nvme_rdma_teardown_admin_queue(ctrl, false); nvme_start_admin_queue(&ctrl->ctrl); + nvme_auth_stop(&ctrl->ctrl);
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) { /* state change failure is ok if we started ctrl delete */
From: Christoph Hellwig hch@lst.de
[ Upstream commit c76b8308e4c9148e44e0c7e086ab6d8b4bb10162 ]
nvme_shutdown_ctrl already shuts the controller down; there is no need to also call nvme_disable_ctrl for the shutdown case.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Eric Curtin ecurtin@redhat.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Reviewed-by: Hector Martin marcan@marcan.st Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/apple.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c index 262d2b60ac6dd..92c70c4b2f6ec 100644 --- a/drivers/nvme/host/apple.c +++ b/drivers/nvme/host/apple.c @@ -831,7 +831,8 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
if (shutdown) nvme_shutdown_ctrl(&anv->ctrl); - nvme_disable_ctrl(&anv->ctrl); + else + nvme_disable_ctrl(&anv->ctrl); }
WRITE_ONCE(anv->ioq.enabled, false);
From: Sean Christopherson seanjc@google.com
commit 4d7404e5ee0066e9a9e8268675de8a273b568b08 upstream.
Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until KVM gains a sane way to enumerate the hybrid vPMU to userspace and/or gains a mechanism to let userspace opt in to the dangers of exposing a hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of a hybrid PMU, is possible, but it requires careful, deliberate configuration from userspace.
E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to prevent migrating a vCPU between a big core and a little core, userspace must enumerate a reasonable topology to the guest, and guest CPUID must be curated per vCPU to enumerate accurate vPMU capabilities.
The last point is especially problematic, as KVM doesn't control which pCPU it runs on when enumerating KVM's vPMU capabilities to userspace, i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in its current form.
Alternatively, userspace could enable vPMU support by enumerating the set of features that are common and coherent across all cores, e.g. by filtering PMU events and restricting guest capabilities. But again, that requires userspace to take action far beyond reflecting KVM's supported feature set into the guest.
For now, simply disable vPMU support on hybrid CPUs to avoid inducing seemingly random #GPs in guests, and punt support for hybrid CPUs to a future enabling effort.
Reported-by: Jianfeng Gao jianfeng.gao@intel.com Cc: stable@vger.kernel.org Cc: Andrew Cooper Andrew.Cooper3@citrix.com Cc: Peter Zijlstra peterz@infradead.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.c... Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230208204230.1360502-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/pmu.h | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -164,15 +164,27 @@ static inline void kvm_init_pmu_capabili { bool is_intel = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
- perf_get_x86_pmu_capability(&kvm_pmu_cap); - - /* - * For Intel, only support guest architectural pmu - * on a host with architectural pmu. - */ - if ((is_intel && !kvm_pmu_cap.version) || !kvm_pmu_cap.num_counters_gp) + /* + * Hybrid PMUs don't play nice with virtualization without careful + * configuration by userspace, and KVM's APIs for reporting supported + * vPMU features do not account for hybrid PMUs. Disable vPMU support + * for hybrid PMUs until KVM gains a way to let userspace opt-in. + */ + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) enable_pmu = false;
+ if (enable_pmu) { + perf_get_x86_pmu_capability(&kvm_pmu_cap); + + /* + * For Intel, only support guest architectural pmu + * on a host with architectural pmu. + */ + if ((is_intel && !kvm_pmu_cap.version) || + !kvm_pmu_cap.num_counters_gp) + enable_pmu = false; + } + if (!enable_pmu) { memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap)); return;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 2c10b61421a28e95a46ab489fd56c0f442ff6952 upstream.
When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some uninitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this, as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it.
A bonus is that this reduces the lines of code, as the explicit flag setting and reserved-space zeroing can be removed.
Cc: Sean Christopherson seanjc@google.com Cc: Paolo Bonzini pbonzini@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Dave Hansen dave.hansen@linux.intel.com Cc: x86@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: stable stable@kernel.org Reported-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Message-Id: 20230214103304.3689213-1-gregkh@linuxfoundation.org Tested-by: Xingyuan Mo hdthky0@gmail.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/x86.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5250,12 +5250,11 @@ static void kvm_vcpu_ioctl_x86_get_debug { unsigned long val;
+ memset(dbgregs, 0, sizeof(*dbgregs)); memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db)); kvm_get_dr(vcpu, 6, &val); dbgregs->dr6 = val; dbgregs->dr7 = vcpu->arch.dr7; - dbgregs->flags = 0; - memset(&dbgregs->reserved, 0, sizeof(dbgregs->reserved)); }
static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
From: Sean Christopherson seanjc@google.com
commit 4b4191b8ae1278bde3642acaaef8f92810ed111a upstream.
Now that KVM disables vPMU support on hybrid CPUs, WARN and return zeros if perf_get_x86_pmu_capability() is invoked on a hybrid CPU. The helper doesn't provide an accurate accounting of the PMU capabilities for hybrid CPUs and needs to be enhanced if KVM, or anything else outside of perf, wants to act on the PMU capabilities.
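For a hypothetical consumer outside of perf, the practical consequence is that an all-zero capability struct has to be treated as "no usable PMU". A minimal sketch, with a made-up caller name:

#include <linux/errno.h>
#include <asm/perf_event.h>

static int foo_check_host_pmu(void)
{
        struct x86_pmu_capability cap;

        perf_get_x86_pmu_capability(&cap);

        /* Zeroed on hybrid CPUs and before perf has been initialized. */
        if (!cap.version && !cap.num_counters_gp)
                return -EOPNOTSUPP;

        return 0;
}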
Cc: stable@vger.kernel.org Cc: Andrew Cooper Andrew.Cooper3@citrix.com Cc: Peter Zijlstra peterz@infradead.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.c... Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230208204230.1360502-3-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/events/core.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2994,17 +2994,19 @@ unsigned long perf_misc_flags(struct pt_
void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap) { - if (!x86_pmu_initialized()) { + /* This API doesn't currently support enumerating hybrid PMUs. */ + if (WARN_ON_ONCE(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)) || + !x86_pmu_initialized()) { memset(cap, 0, sizeof(*cap)); return; }
- cap->version = x86_pmu.version; /* - * KVM doesn't support the hybrid PMU yet. - * Return the common value in global x86_pmu, - * which available for all cores. + * Note, hybrid CPU models get tracked as having hybrid PMUs even when + * all E-cores are disabled via BIOS. When E-cores are disabled, the + * base PMU holds the correct number of counters for P-cores. */ + cap->version = x86_pmu.version; cap->num_counters_gp = x86_pmu.num_counters; cap->num_counters_fixed = x86_pmu.num_counters_fixed; cap->bit_width_gp = x86_pmu.cntval_bits;
From: Thomas Gleixner tglx@linutronix.de
commit d125d1349abeb46945dc5e98f7824bf688266f13 upstream.
syzbot reported an RCU stall which is caused by setting up an alarmtimer with a very small interval and ignoring the signal. The reproducer arms the alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a problem per se, but it becomes an issue when the signal is ignored, because the timer is then immediately rearmed as there is no way to delay that rearming to the signal delivery path. See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent softirq starvation by small intervals and SIG_IGN") for details.
The reproducer does not set SIG_IGN explicitly, but it sets up the timer's signal with SIGCONT. That has the same effect as explicitly setting SIG_IGN for a signal, as SIGCONT is ignored if there is no handler set and the task is not ptraced.
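This is not the actual syzkaller program, but the triggering setup boils down to roughly the following userspace sketch (CLOCK_REALTIME_ALARM requires CAP_WAKE_ALARM, and older glibc needs -lrt for the timer_* calls):

#include <signal.h>
#include <time.h>

static void arm_tiny_alarmtimer(void)
{
        /* SIGCONT with no handler and no ptracer is effectively SIG_IGN. */
        struct sigevent sev = {
                .sigev_notify = SIGEV_SIGNAL,
                .sigev_signo  = SIGCONT,
        };
        /* 8ns relative expiry, 9ns period, as in the reproducer. */
        struct itimerspec its = {
                .it_value    = { .tv_nsec = 8 },
                .it_interval = { .tv_nsec = 9 },
        };
        timer_t id;

        if (timer_create(CLOCK_REALTIME_ALARM, &sev, &id) == 0)
                timer_settime(id, 0, &its, NULL);
}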
The log clearly shows that:
[pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} ---
This works because the tasks are ptraced and the signal is therefore queued so the tracer can see it, which delays the restart of the timer to the signal delivery path. But then the tracer is killed:
[pid 5087] kill(-5102, SIGKILL <unfinished ...> ... ./strace-static-x86_64: Process 5107 detached
and after it's gone the stall can be observed:
syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns [ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: ... [ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran: [ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0: [ 184.669821][ C0] NMI backtrace for cpu 0 [ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0 ... [ 184.670036][ C0] Call Trace: [ 184.670041][ C0] <IRQ> [ 184.670045][ C0] alarmtimer_fired+0x327/0x670
posix_timer_fn() prevents that by checking whether the interval for timers which have their signal ignored is smaller than a jiffie and, if so, artificially delaying the timer by shifting the next expiry out by a jiffie. That's accurate vs. the overrun accounting, but slightly inaccurate vs. timer_gettime(2).
The comment in that function says what needs to be done, and there was a fix available for the regular userspace-induced SIG_IGN mechanism, but that did not work due to the implicit ignore for SIGCONT and similar signals. This needs to be worked on, but for now the only available workaround is to do exactly what posix_timer_fn() does:
Increase the interval of self-rearming timers, which have their signal ignored, to at least a jiffie.
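For scale: with CONFIG_HZ=1000 a jiffie is NSEC_PER_SEC / HZ = 1,000,000 ns, so a 9 ns interval like the one in the reproducer is limited to roughly 1000 expiries per second instead of being rearmed essentially back-to-back.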
Interestingly, this had already been fixed via commit ff86bf0c65f1 ("alarmtimer: Rate limit periodic intervals"), but that fix got lost in a later rework.
Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine") Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: John Stultz jstultz@google.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/alarmtimer.c | 33 +++++++++++++++++++++++++++++---- 1 file changed, 29 insertions(+), 4 deletions(-)
--- a/kernel/time/alarmtimer.c +++ b/kernel/time/alarmtimer.c @@ -470,11 +470,35 @@ u64 alarm_forward(struct alarm *alarm, k } EXPORT_SYMBOL_GPL(alarm_forward);
-u64 alarm_forward_now(struct alarm *alarm, ktime_t interval) +static u64 __alarm_forward_now(struct alarm *alarm, ktime_t interval, bool throttle) { struct alarm_base *base = &alarm_bases[alarm->type]; + ktime_t now = base->get_ktime(); + + if (IS_ENABLED(CONFIG_HIGH_RES_TIMERS) && throttle) { + /* + * Same issue as with posix_timer_fn(). Timers which are + * periodic but the signal is ignored can starve the system + * with a very small interval. The real fix which was + * promised in the context of posix_timer_fn() never + * materialized, but someone should really work on it. + * + * To prevent DOS fake @now to be 1 jiffie out which keeps + * the overrun accounting correct but creates an + * inconsistency vs. timer_gettime(2). + */ + ktime_t kj = NSEC_PER_SEC / HZ; + + if (interval < kj) + now = ktime_add(now, kj); + } + + return alarm_forward(alarm, now, interval); +}
- return alarm_forward(alarm, base->get_ktime(), interval); +u64 alarm_forward_now(struct alarm *alarm, ktime_t interval) +{ + return __alarm_forward_now(alarm, interval, false); } EXPORT_SYMBOL_GPL(alarm_forward_now);
@@ -551,9 +575,10 @@ static enum alarmtimer_restart alarm_han if (posix_timer_event(ptr, si_private) && ptr->it_interval) { /* * Handle ignored signals and rearm the timer. This will go - * away once we handle ignored signals proper. + * away once we handle ignored signals proper. Ensure that + * small intervals cannot starve the system. */ - ptr->it_overrun += alarm_forward_now(alarm, ptr->it_interval); + ptr->it_overrun += __alarm_forward_now(alarm, ptr->it_interval, true); ++ptr->it_requeue_pending; ptr->it_active = 1; result = ALARMTIMER_RESTART;
From: Keith Busch kbusch@kernel.org
commit e917a849c3fc317c4a5f82bb18726000173d39e6 upstream.
The sysfs group containing the cmb attributes is registered before the driver knows if they need to be visible or not. Update the group when cmb attributes are known to exist so the visibility setting is correct.
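For background (not part of the patch, and all names below are made up): visibility of the attributes in a group is decided by the group's .is_visible() callback, and sysfs_update_group() re-evaluates that callback after the group has been registered, which is the mechanism the fix relies on. Schematically:

#include <linux/device.h>
#include <linux/sysfs.h>

/* Hypothetical driver state, for illustration only. */
struct foo_dev {
        struct device *dev;
        u64 cmb_size;           /* 0 until the CMB has been mapped */
};

static ssize_t cmb_show(struct device *dev, struct device_attribute *attr,
                        char *buf)
{
        struct foo_dev *fdev = dev_get_drvdata(dev);

        return sysfs_emit(buf, "%llu\n", (unsigned long long)fdev->cmb_size);
}
static DEVICE_ATTR_RO(cmb);

static struct attribute *foo_attrs[] = {
        &dev_attr_cmb.attr,
        NULL,
};

static umode_t foo_attrs_visible(struct kobject *kobj,
                                 struct attribute *attr, int index)
{
        struct foo_dev *fdev = dev_get_drvdata(kobj_to_dev(kobj));

        /* Hide the attribute until we know the CMB actually exists. */
        if (attr == &dev_attr_cmb.attr && !fdev->cmb_size)
                return 0;
        return attr->mode;
}

static const struct attribute_group foo_attr_group = {
        .attrs      = foo_attrs,
        .is_visible = foo_attrs_visible,
};

/* Called once the CMB has been discovered, mirroring nvme_update_attrs(). */
static void foo_refresh_attrs(struct foo_dev *fdev)
{
        sysfs_update_group(&fdev->dev->kobj, &foo_attr_group);
}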
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch kbusch@kernel.org Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -109,6 +109,7 @@ struct nvme_queue;
static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown); static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode); +static void nvme_update_attrs(struct nvme_dev *dev);
/* * Represents an NVM Express device. Each nvme_dev is a PCI function. @@ -1967,6 +1968,8 @@ static void nvme_map_cmb(struct nvme_dev if ((dev->cmbsz & (NVME_CMBSZ_WDS | NVME_CMBSZ_RDS)) == (NVME_CMBSZ_WDS | NVME_CMBSZ_RDS)) pci_p2pmem_publish(pdev, true); + + nvme_update_attrs(dev); }
static int nvme_set_host_mem(struct nvme_dev *dev, u32 bits) @@ -2250,6 +2253,11 @@ static const struct attribute_group *nvm NULL, };
+static void nvme_update_attrs(struct nvme_dev *dev) +{ + sysfs_update_group(&dev->ctrl.device->kobj, &nvme_pci_dev_attrs_group); +} + /* * nirqs is the number of interrupts available for write and read * queues. The core already reserved an interrupt for the admin queue.
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
commit 1f810d2b6b2fbdc5279644d8b2c140b1f7c9d43d upstream.
The HDaudio stream allocation is done first, and in a second step the LOSIDV parameter is programmed for the multi-link used by a codec.
This leads to a possible stream_tag leak, e.g. if a DisplayAudio link is not used. This would happen when a non-Intel graphics card is used and userspace unconditionally uses the Intel Display Audio PCMs without checking if they are connected to a receiver with jack controls.
We should first check that there is a valid multi-link entry to configure before allocating a stream_tag. This change aligns the dma_assign and dma_cleanup phases.
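Reduced to the ordering issue alone (the helper names below are made up, not the real SOF/HDA functions), the before/after pattern is:

        /* Before: the missing link is only detected after the allocation,
         * so the early return leaks the stream_tag. */
        hext_stream = assign_stream(bus, substream);    /* takes a stream_tag */
        link = get_link(bus, codec_name);
        if (!link)
                return -EINVAL;                         /* tag never released */

        /* After: fail before any resource is taken. */
        link = get_link(bus, codec_name);
        if (!link)
                return -EINVAL;
        hext_stream = assign_stream(bus, substream);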
Complements: b0cd60f3e9f5 ("ALSA/ASoC: hda: clarify bus_get_link() and bus_link_get() helpers") Link: https://github.com/thesofproject/linux/issues/4151 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Rander Wang rander.wang@intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230216162340.19480-1-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/intel/hda-dai.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/sound/soc/sof/intel/hda-dai.c +++ b/sound/soc/sof/intel/hda-dai.c @@ -216,6 +216,10 @@ static int hda_link_dma_hw_params(struct struct hdac_bus *bus = hstream->bus; struct hdac_ext_link *link;
+ link = snd_hdac_ext_bus_get_link(bus, codec_dai->component->name); + if (!link) + return -EINVAL; + hext_stream = snd_soc_dai_get_dma_data(cpu_dai, substream); if (!hext_stream) { hext_stream = hda_link_stream_assign(bus, substream); @@ -225,10 +229,6 @@ static int hda_link_dma_hw_params(struct snd_soc_dai_set_dma_data(cpu_dai, substream, (void *)hext_stream); }
- link = snd_hdac_ext_bus_get_link(bus, codec_dai->component->name); - if (!link) - return -EINVAL; - /* set the hdac_stream in the codec dai */ snd_soc_dai_set_stream(codec_dai, hdac_stream(hext_stream), substream->stream);
From: Dan Carpenter error27@gmail.com
commit 9cec2aaffe969f2a3e18b5ec105fc20bb908e475 upstream.
The > needs to be >= to prevent an out-of-bounds access.
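To spell the off-by-one out: an N-element array has valid indices 0..N-1, so the last legal index is ARRAY_SIZE()-1 and the bounds check must also reject prio == ARRAY_SIZE(). A small stand-alone illustration (the array here is just a stand-in for p->inner.clprio):

#include <stdio.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

int main(void)
{
        int clprio[8];
        unsigned int prio = 8;

        /* Old check: 8 > 8 is false, so clprio[8] would still be accessed. */
        if (prio > ARRAY_SIZE(clprio))
                printf("rejected by '>'\n");

        /* Fixed check: 8 >= 8 is true, so the out-of-bounds index is caught. */
        if (prio >= ARRAY_SIZE(clprio))
                printf("rejected by '>='\n");

        return 0;
}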
Fixes: de5ca4c3852f ("net: sched: sch: Bounds check priority") Signed-off-by: Dan Carpenter error27@gmail.com Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/Y+D+KN18FQI2DKLq@kili Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_htb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -429,7 +429,7 @@ static void htb_activate_prios(struct ht while (m) { unsigned int prio = ffz(~m);
- if (WARN_ON_ONCE(prio > ARRAY_SIZE(p->inner.clprio))) + if (WARN_ON_ONCE(prio >= ARRAY_SIZE(p->inner.clprio))) break; m &= ~(1 << prio);
On Mon, Feb 20, 2023 at 02:35:16PM +0100, Greg Kroah-Hartman wrote:
Nothing untoward in my CI.. Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
On 2/20/23 5:35 AM, Greg Kroah-Hartman wrote:
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 20 Feb 2023 at 19:28, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.1.13-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.1.y * git commit: fc84fcf24fda6858e5ca04afe4516846e5b1cd25 * git describe: v6.1.12-119-gfc84fcf24fda * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.12...
## Test Regressions (compared to v6.1.11-115-g9012d1ebd323)
## Metric Regressions (compared to v6.1.11-115-g9012d1ebd323)
## Test Fixes (compared to v6.1.11-115-g9012d1ebd323)
## Metric Fixes (compared to v6.1.11-115-g9012d1ebd323)
## Test result summary total: 129915, pass: 115205, fail: 3736, skip: 10965, xfail: 9
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 145 total, 144 passed, 1 failed * arm64: 47 total, 47 passed, 0 failed * i386: 35 total, 34 passed, 1 failed * mips: 26 total, 26 passed, 0 failed * parisc: 6 total, 6 passed, 0 failed * powerpc: 34 total, 30 passed, 4 failed * riscv: 12 total, 12 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 12 total, 12 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 40 total, 40 passed, 0 failed
## Test suites summary * boot * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-open-posix-tests * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * packetdrill * perf * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
On Mon, Feb 20, 2023 at 02:35:16PM +0100, Greg Kroah-Hartman wrote:
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
Hi Greg,
On Mon, Feb 20, 2023 at 02:35:16PM +0100, Greg Kroah-Hartman wrote:
Build test (gcc version 12.2.1 20230210): mips: 52 configs -> no failure arm: 100 configs -> no failure arm64: 3 configs -> no failure x86_64: 4 configs -> no failure alpha allmodconfig -> no failure csky allmodconfig -> no failure powerpc allmodconfig -> no failure riscv allmodconfig -> no failure s390 allmodconfig -> no failure xtensa allmodconfig -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1] arm64: Booted on rpi4b (4GB model). No regression. [2] mips: Booted on ci20 board. No regression. [3]
[1]. https://openqa.qa.codethink.co.uk/tests/2908 [2]. https://openqa.qa.codethink.co.uk/tests/2911 [3]. https://openqa.qa.codethink.co.uk/tests/2913
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
On Mon, Feb 20, 2023 at 02:35:16PM +0100, Greg Kroah-Hartman wrote:
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 503 pass: 503 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 2/20/23 05:35, Greg Kroah-Hartman wrote:
On ARCH_BRCMSTB using 32-bit and 64-bit kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 2/20/23 06:35, Greg Kroah-Hartman wrote:
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, Feb 20, 2023 at 02:35:16PM +0100, Greg Kroah-Hartman wrote:
Tested rc1 against the Fedora build system (aarch64, armv7, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
Compiled and booted on my x86_64 and ARM64 test systems. No errors or regressions.
Tested-by: Allen Pais apais@linux.microsoft.com
Thanks.