April 2024 - Linux-stable-mirror

[PATCH] USB: serial: option: add support for Fibocom FM650/FG650

by Chuanhong Guo

Fibocom FM650/FG650 are 5G modems with ECM/NCM/RNDIS/MBIM modes.
This patch adds support to all 4 modes.

usb-devices output for all modes:

ECM:
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 5 Spd=5000 MxCh= 0
D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=2cb7 ProdID=0a04 Rev=04.04
S: Manufacturer=Fibocom Wireless Inc.
S: Product=FG650 Module
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=504mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

NCM:
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=5000 MxCh= 0
D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=2cb7 ProdID=0a05 Rev=04.04
S: Manufacturer=Fibocom Wireless Inc.
S: Product=FG650 Module
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm
E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

RNDIS:
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0
D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=2cb7 ProdID=0a06 Rev=04.04
S: Manufacturer=Fibocom Wireless Inc.
S: Product=FG650 Module
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host
E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

MBIM:
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 7 Spd=5000 MxCh= 0
D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1
P: Vendor=2cb7 ProdID=0a07 Rev=04.04
S: Manufacturer=Fibocom Wireless Inc.
S: Product=FG650 Module
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

Signed-off-by: Chuanhong Guo <gch981213(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
 drivers/usb/serial/option.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 55a65d941ccb..964d3fffa545 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -2277,6 +2277,10 @@ static const struct usb_device_id option_ids[] = {
 	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a3, 0xff) },	/* Fibocom FM101-GL (laptop MBIM) */
 	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a4, 0xff),	/* Fibocom FM101-GL (laptop MBIM) */
 	  .driver_info = RSVD(4) },
+	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0a04, 0xff) },	/* Fibocom FM650-CN (ECM mode) */
+	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0a05, 0xff) },	/* Fibocom FM650-CN (NCM mode) */
+	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0a06, 0xff) },	/* Fibocom FM650-CN (RNDIS mode) */
+	{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0a07, 0xff) },	/* Fibocom FM650-CN (MBIM mode) */
 	{ USB_DEVICE_INTERFACE_CLASS(0x2df3, 0x9d03, 0xff) },	/* LongSung M5710 */
 	{ USB_DEVICE_INTERFACE_CLASS(0x305a, 0x1404, 0xff) },	/* GosunCn GM500 RNDIS */
 	{ USB_DEVICE_INTERFACE_CLASS(0x305a, 0x1405, 0xff) },	/* GosunCn GM500 MBIM */
-- 
2.44.0
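
The four new table entries bind the option driver only to interfaces whose class is 0xff (vendor specific), which is why the network interfaces in the usb-devices output stay with cdc_ether, cdc_ncm, rndis_host and cdc_mbim. The following is a minimal user-space sketch of that matching rule, not the kernel implementation: `struct demo_usb_id`, `demo_fm650_ids` and `demo_option_matches()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct usb_device_id. */
struct demo_usb_id {
	uint16_t vendor;
	uint16_t product;
	uint8_t  intf_class;
};

/* The four IDs from the patch; every entry asks for interface class 0xff. */
static const struct demo_usb_id demo_fm650_ids[] = {
	{ 0x2cb7, 0x0a04, 0xff },	/* ECM mode */
	{ 0x2cb7, 0x0a05, 0xff },	/* NCM mode */
	{ 0x2cb7, 0x0a06, 0xff },	/* RNDIS mode */
	{ 0x2cb7, 0x0a07, 0xff },	/* MBIM mode */
};

/* The patch adds one entry per modem mode, four in total. */
static_assert(sizeof(demo_fm650_ids) / sizeof(demo_fm650_ids[0]) == 4,
	      "one table entry per modem mode");

/* Return true when vendor, product and interface class all match an entry. */
static bool demo_option_matches(uint16_t vendor, uint16_t product,
				uint8_t intf_class)
{
	size_t i;

	for (i = 0; i < sizeof(demo_fm650_ids) / sizeof(demo_fm650_ids[0]); i++) {
		const struct demo_usb_id *id = &demo_fm650_ids[i];

		if (id->vendor == vendor && id->product == product &&
		    id->intf_class == intf_class)
			return true;
	}
	return false;
}
```

With this model, interface 2 of the ECM configuration (class 0xff) matches, while interface 0 (class 0x02) does not and is left for the network driver.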

[PATCH 5.15 00/45] 5.15.156-rc1 review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 5.15.156 release.
There are 45 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.156-r…
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
    Linux 5.15.156-rc1

Ville Syrjälä <ville.syrjala(a)linux.intel.com>
    drm/i915/cdclk: Fix CDCLK programming order when pipes are active

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Replace CONFIG_SPECTRE_BHI_{ON,OFF} with CONFIG_MITIGATION_SPECTRE_BHI

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Clarify that syscall hardening isn't a BHI mitigation

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Fix BHI handling of RRSBA

Ingo Molnar <mingo(a)kernel.org>
    x86/bugs: Rename various 'ia32_cap' variables to 'x86_arch_cap_msr'

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Cache the value of MSR_IA32_ARCH_CAPABILITIES

Josh Poimboeuf <jpoimboe(a)kernel.org>
    x86/bugs: Fix BHI documentation

Daniel Sneddon <daniel.sneddon(a)linux.intel.com>
    x86/bugs: Fix return type of spectre_bhi_state()

Arnd Bergmann <arnd(a)arndb.de>
    irqflags: Explicitly ignore lockdep_hrtimer_exit() argument

Adam Dunlap <acdunlap(a)google.com>
    x86/apic: Force native_apic_mem_read() to use the MOV instruction

John Stultz <jstultz(a)google.com>
    selftests: timers: Fix abs() warning in posix_timers test

Sean Christopherson <seanjc(a)google.com>
    x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n

Namhyung Kim <namhyung(a)kernel.org>
    perf/x86: Fix out of range data

Gavin Shan <gshan(a)redhat.com>
    vhost: Add smp_rmb() in vhost_vq_avail_empty()

Ville Syrjälä <ville.syrjala(a)linux.intel.com>
    drm/client: Fully protect modes[] with dev->mode_config.mutex

Boris Burkov <boris(a)bur.io>
    btrfs: qgroup: correctly model root qgroup rsv in convert

Jacob Pan <jacob.jun.pan(a)linux.intel.com>
    iommu/vt-d: Allocate local memory for page request queue

Arnd Bergmann <arnd(a)arndb.de>
    tracing: hide unused ftrace_event_id_fops

David Arinzon <darinzon(a)amazon.com>
    net: ena: Fix incorrect descriptor free behavior

David Arinzon <darinzon(a)amazon.com>
    net: ena: Wrong missing IO completions check order

David Arinzon <darinzon(a)amazon.com>
    net: ena: Fix potential sign extension issue

Michal Luczaj <mhal(a)rbox.co>
    af_unix: Fix garbage collector racing against connect()

Kuniyuki Iwashima <kuniyu(a)amazon.com>
    af_unix: Do not use atomic ops for unix_sk(sk)->inflight.

Arınç ÜNAL <arinc.unal(a)arinc9.com>
    net: dsa: mt7530: trap link-local frames regardless of ST Port State

Daniel Machon <daniel.machon(a)microchip.com>
    net: sparx5: fix wrong config being used when reconfiguring PCS

Cosmin Ratiu <cratiu(a)nvidia.com>
    net/mlx5: Properly link new fs rules into the tree

Eric Dumazet <edumazet(a)google.com>
    netfilter: complete validation of user input

Jiri Benc <jbenc(a)redhat.com>
    ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr

Arnd Bergmann <arnd(a)arndb.de>
    ipv4/route: avoid unused-but-set-variable warning

Arnd Bergmann <arnd(a)arndb.de>
    ipv6: fib: hide unused 'pn' variable

Geetha sowjanya <gakula(a)marvell.com>
    octeontx2-af: Fix NIX SQ mode and BP config

Kuniyuki Iwashima <kuniyu(a)amazon.com>
    af_unix: Clear stale u->oob_skb.

Eric Dumazet <edumazet(a)google.com>
    geneve: fix header validation in geneve[6]_xmit_skb

Eric Dumazet <edumazet(a)google.com>
    xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING

Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
    u64_stats: Disable preemption on 32bit UP+SMP PREEMPT_RT during updates.

Ilya Maximets <i.maximets(a)ovn.org>
    net: openvswitch: fix unwanted error log on timeout policy probing

Dan Carpenter <dan.carpenter(a)linaro.org>
    scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()

Arnd Bergmann <arnd(a)arndb.de>
    nouveau: fix function cast warning

Alex Constantino <dreaming.about.electric.sheep(a)gmail.com>
    Revert "drm/qxl: simplify qxl_fence_wait"

Frank Li <Frank.Li(a)nxp.com>
    arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order

Nini Song <nini.song(a)mediatek.com>
    media: cec: core: remove length check of Timer Status

Dmitry Antipov <dmantipov(a)yandex.ru>
    Bluetooth: Fix memory leak in hci_req_sync_complete()

Steven Rostedt (Google) <rostedt(a)goodmis.org>
    ring-buffer: Only update pages_touched when a new page is touched

Sven Eckelmann <sven(a)narfation.org>
    batman-adv: Avoid infinite loop trying to resize local TT

-------------

Diffstat:

 Documentation/admin-guide/hw-vuln/spectre.rst | 22 +-
 Documentation/admin-guide/kernel-parameters.txt | 12 +-
 Makefile | 4 +-
 arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 12 +-
 arch/x86/Kconfig | 21 +-
 arch/x86/events/core.c | 1 +
 arch/x86/include/asm/apic.h | 3 +-
 arch/x86/kernel/cpu/bugs.c | 82 ++++----
 arch/x86/kernel/cpu/common.c | 48 ++---
 drivers/gpu/drm/drm_client_modeset.c | 3 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c | 7 +-
 drivers/gpu/drm/i915/display/intel_cdclk.h | 3 +
 .../gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 +-
 drivers/gpu/drm/qxl/qxl_release.c | 50 ++++-
 drivers/iommu/intel/svm.c | 2 +-
 drivers/media/cec/core/cec-adap.c | 14 --
 drivers/net/dsa/mt7530.c | 229 ++++++++++++++++++---
 drivers/net/dsa/mt7530.h | 5 +
 drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 35 ++--
 .../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 22 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 3 +-
 .../net/ethernet/microchip/sparx5/sparx5_port.c | 4 +-
 drivers/net/geneve.c | 4 +-
 drivers/scsi/qla2xxx/qla_edif.c | 2 +-
 drivers/vhost/vhost.c | 14 +-
 fs/btrfs/qgroup.c | 2 +
 include/linux/dma-fence.h | 7 +
 include/linux/irqflags.h | 2 +-
 include/linux/u64_stats_sync.h | 42 ++--
 include/net/addrconf.h | 4 +
 include/net/af_unix.h | 2 +-
 include/net/ip_tunnels.h | 33 +++
 kernel/cpu.c | 3 +-
 kernel/trace/ring_buffer.c | 6 +-
 kernel/trace/trace_events.c | 4 +
 net/batman-adv/translation-table.c | 2 +-
 net/bluetooth/hci_request.c | 4 +-
 net/ipv4/netfilter/arp_tables.c | 4 +
 net/ipv4/netfilter/ip_tables.c | 4 +
 net/ipv4/route.c | 4 +-
 net/ipv6/addrconf.c | 7 +-
 net/ipv6/ip6_fib.c | 7 +-
 net/ipv6/netfilter/ip6_tables.c | 4 +
 net/openvswitch/conntrack.c | 5 +-
 net/unix/af_unix.c | 8 +-
 net/unix/garbage.c | 35 +++-
 net/unix/scm.c | 8 +-
 net/xdp/xsk.c | 2 +
 tools/testing/selftests/timers/posix_timers.c | 2 +-
 50 files changed, 556 insertions(+), 256 deletions(-)

[Request] backport a mainline patch to Linux kernel-5.10 stable tree

by Bo Ye (叶波)

Dear Reviewers,

We suggest backporting a commit to the Linux kernel-5.10 stable tree to fix a thermal bug. Thanks a lot.

Source patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/d…

thermal: core: Fix thermal zone suspend-resume synchronization

There are 3 synchronization issues with thermal zone suspend-resume during system-wide transitions:

1. The resume code runs in a PM notifier which is invoked after user space has been thawed, so it can run concurrently with user space which can trigger a thermal zone device removal. If that happens, the thermal zone resume code may use a stale pointer to the next list element and crash, because it does not hold thermal_list_lock while walking thermal_tz_list.

2. The thermal zone resume code calls thermal_zone_device_init() outside the zone lock, so user space or an update triggered by the platform firmware may see an inconsistent state of a thermal zone leading to unexpected behavior.

3. Clearing the in_suspend global variable in thermal_pm_notify() allows __thermal_zone_device_update() to continue for all thermal zones and it may as well run before the thermal_tz_list walk (or at any point during the list walk for that matter) and attempt to operate on a thermal zone that has not been resumed yet. It may also race destructively with thermal_zone_device_init().

To address these issues, add thermal_list_lock locking to thermal_pm_notify(), especially around the thermal_tz_list walk, make it call thermal_zone_device_init() back-to-back with __thermal_zone_device_update() under the zone lock and replace in_suspend with per-zone bool "suspend" indicators set and unset under the given zone's lock.

Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/
Reported-by: Bo Ye <bo.ye(a)mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>

BRs
Bo Ye
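
The third issue and its fix (a per-zone "suspend" indicator that is set, cleared and checked under the zone's own lock) can be illustrated with a small user-space sketch. This is only a model under stated assumptions: `struct demo_zone` and the `demo_zone_*()` helpers are hypothetical names, a pthread mutex stands in for the zone lock, and the thermal_list_lock part of the fix is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thermal zone; not the kernel structure. */
struct demo_zone {
	pthread_mutex_t lock;	/* plays the role of the zone lock */
	bool suspended;		/* per-zone flag instead of a global in_suspend */
	int updates;		/* counts updates that actually ran */
};

/* Caller must hold zone->lock. */
static void demo_zone_update_locked(struct demo_zone *zone)
{
	if (zone->suspended)
		return;		/* zone not resumed yet: skip the update */
	zone->updates++;
}

/* Update path that may be triggered at any time, e.g. by user space. */
static void demo_zone_update(struct demo_zone *zone)
{
	pthread_mutex_lock(&zone->lock);
	demo_zone_update_locked(zone);
	pthread_mutex_unlock(&zone->lock);
}

/* Mark the zone as suspended under its lock. */
static void demo_zone_suspend(struct demo_zone *zone)
{
	pthread_mutex_lock(&zone->lock);
	zone->suspended = true;
	pthread_mutex_unlock(&zone->lock);
}

/* Clear the flag and run the first update back-to-back under the same lock. */
static void demo_zone_resume(struct demo_zone *zone)
{
	pthread_mutex_lock(&zone->lock);
	zone->suspended = false;
	demo_zone_update_locked(zone);
	pthread_mutex_unlock(&zone->lock);
}
```

Because the flag is only touched with the lock held, an update that arrives while the zone is still suspended is skipped rather than racing with the resume path.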

Recall: [Request] backport a mainline patch to Linux kernel-5.10 stable tree

by Bo Ye (叶波)

Bo Ye (叶波) would like to recall the message "[Request] backport a mainline patch to Linux kernel-5.10 stable tree".

[merged mm-hotfixes-stable] mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly
has been removed from the -mm tree. Its filename was
     mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly
Date: Thu, 14 Mar 2024 17:12:59 +0100

Darrick reports that in some cases where pread() would fail with -EIO and mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ / MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.

While the madvise() call can be interrupted by a signal, this is not the desired behavior. MADV_POPULATE_READ / MADV_POPULATE_WRITE should behave like page faults in that case: fail and not retry forever.

A reproducer can be found at [1].

The reason is that __get_user_pages(), as called by faultin_vma_page_range(), will not handle VM_FAULT_RETRY in a proper way: it will simply return 0 when VM_FAULT_RETRY happened, making madvise_populate()->faultin_vma_page_range() retry again and again, never setting FOLL_TRIED->FAULT_FLAG_TRIED for __get_user_pages().

__get_user_pages_locked() does what we want, but duplicating that logic in faultin_vma_page_range() feels wrong.

So let's use __get_user_pages_locked() instead, that will detect VM_FAULT_RETRY and set FOLL_TRIED when retrying, making the fault handler return VM_FAULT_SIGBUS (VM_FAULT_ERROR) at some point, propagating -EFAULT from faultin_page() to __get_user_pages(), all the way to madvise_populate().

But, there is an issue: __get_user_pages_locked() will end up re-taking the MM lock and then __get_user_pages() will do another VMA lookup. In the meantime, the VMA layout could have changed and we'd fail with different error codes than we'd want to.

As __get_user_pages() will currently do a new VMA lookup either way, let it do the VMA handling in a different way, controlled by a new FOLL_MADV_POPULATE flag, effectively moving these checks from madvise_populate() + faultin_page_range() in there.

With this change, Darrick's reproducer properly fails with -EFAULT, as documented for MADV_POPULATE_READ / MADV_POPULATE_WRITE.

[1] https://lore.kernel.org/all/20240313171936.GN1927156@frogsfrogsfrogs/

Link: https://lkml.kernel.org/r/20240314161300.382526-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240314161300.382526-2-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Darrick J. Wong <djwong(a)kernel.org>
Closes: https://lore.kernel.org/all/20240311223815.GW1927156@frogsfrogsfrogs/
Cc: Darrick J. Wong <djwong(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/gup.c      | 54 ++++++++++++++++++++++++++++--------------------
 mm/internal.h | 10 +++++---
 mm/madvise.c  | 17 +--------------
 3 files changed, 40 insertions(+), 41 deletions(-)

--- a/mm/gup.c~mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly
+++ a/mm/gup.c
@@ -1206,6 +1206,22 @@ static long __get_user_pages(struct mm_s
 
 		/* first iteration or cross vma bound */
 		if (!vma || start >= vma->vm_end) {
+			/*
+			 * MADV_POPULATE_(READ|WRITE) wants to handle VMA
+			 * lookups+error reporting differently.
+			 */
+			if (gup_flags & FOLL_MADV_POPULATE) {
+				vma = vma_lookup(mm, start);
+				if (!vma) {
+					ret = -ENOMEM;
+					goto out;
+				}
+				if (check_vma_flags(vma, gup_flags)) {
+					ret = -EINVAL;
+					goto out;
+				}
+				goto retry;
+			}
 			vma = gup_vma_lookup(mm, start);
 			if (!vma && in_gate_area(mm, start)) {
 				ret = get_gate_page(mm, start & PAGE_MASK,
@@ -1685,35 +1701,35 @@ long populate_vma_page_range(struct vm_a
 }
 
 /*
- * faultin_vma_page_range() - populate (prefault) page tables inside the
- *			       given VMA range readable/writable
+ * faultin_page_range() - populate (prefault) page tables inside the
+ *			  given range readable/writable
  *
  * This takes care of mlocking the pages, too, if VM_LOCKED is set.
  *
- * @vma: target vma
+ * @mm: the mm to populate page tables in
  * @start: start address
  * @end: end address
  * @write: whether to prefault readable or writable
  * @locked: whether the mmap_lock is still held
  *
- * Returns either number of processed pages in the vma, or a negative error
- * code on error (see __get_user_pages()).
+ * Returns either number of processed pages in the MM, or a negative error
+ * code on error (see __get_user_pages()). Note that this function reports
+ * errors related to VMAs, such as incompatible mappings, as expected by
+ * MADV_POPULATE_(READ|WRITE).
  *
- * vma->vm_mm->mmap_lock must be held. The range must be page-aligned and
- * covered by the VMA. If it's released, *@locked will be set to 0.
+ * The range must be page-aligned.
+ *
+ * mm->mmap_lock must be held. If it's released, *@locked will be set to 0.
  */
-long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
-			    unsigned long end, bool write, int *locked)
+long faultin_page_range(struct mm_struct *mm, unsigned long start,
+			unsigned long end, bool write, int *locked)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long nr_pages = (end - start) / PAGE_SIZE;
 	int gup_flags;
 	long ret;
 
 	VM_BUG_ON(!PAGE_ALIGNED(start));
 	VM_BUG_ON(!PAGE_ALIGNED(end));
-	VM_BUG_ON_VMA(start < vma->vm_start, vma);
-	VM_BUG_ON_VMA(end > vma->vm_end, vma);
 	mmap_assert_locked(mm);
 
 	/*
@@ -1725,19 +1741,13 @@ long faultin_vma_page_range(struct vm_ar
 	 *		  a poisoned page.
 	 * !FOLL_FORCE: Require proper access permissions.
 	 */
-	gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE;
+	gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE |
+		    FOLL_MADV_POPULATE;
 	if (write)
 		gup_flags |= FOLL_WRITE;
 
-	/*
-	 * We want to report -EINVAL instead of -EFAULT for any permission
-	 * problems or incompatible mappings.
-	 */
-	if (check_vma_flags(vma, gup_flags))
-		return -EINVAL;
-
-	ret = __get_user_pages(mm, start, nr_pages, gup_flags,
-			       NULL, locked);
+	ret = __get_user_pages_locked(mm, start, nr_pages, NULL, locked,
+				      gup_flags);
 	lru_add_drain();
 	return ret;
 }
--- a/mm/internal.h~mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly
+++ a/mm/internal.h
@@ -686,9 +686,8 @@ struct anon_vma *folio_anon_vma(struct f
 void unmap_mapping_folio(struct folio *folio);
 extern long populate_vma_page_range(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end, int *locked);
-extern long faultin_vma_page_range(struct vm_area_struct *vma,
-				   unsigned long start, unsigned long end,
-				   bool write, int *locked);
+extern long faultin_page_range(struct mm_struct *mm, unsigned long start,
+		unsigned long end, bool write, int *locked);
 extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
 			       unsigned long bytes);
 
@@ -1127,10 +1126,13 @@ enum {
 	FOLL_FAST_ONLY = 1 << 20,
 	/* allow unlocking the mmap lock */
 	FOLL_UNLOCKABLE = 1 << 21,
+	/* VMA lookup+checks compatible with MADV_POPULATE_(READ|WRITE) */
+	FOLL_MADV_POPULATE = 1 << 22,
 };
 
 #define INTERNAL_GUP_FLAGS (FOLL_TOUCH | FOLL_TRIED | FOLL_REMOTE | FOLL_PIN | \
-			    FOLL_FAST_ONLY | FOLL_UNLOCKABLE)
+			    FOLL_FAST_ONLY | FOLL_UNLOCKABLE | \
+			    FOLL_MADV_POPULATE)
 
 /*
  * Indicates for which pages that are write-protected in the page table,
--- a/mm/madvise.c~mm-madvise-make-madv_populate_readwrite-handle-vm_fault_retry-properly
+++ a/mm/madvise.c
@@ -908,27 +908,14 @@ static long madvise_populate(struct vm_a
 {
 	const bool write = behavior == MADV_POPULATE_WRITE;
 	struct mm_struct *mm = vma->vm_mm;
-	unsigned long tmp_end;
 	int locked = 1;
 	long pages;
 
 	*prev = vma;
 
 	while (start < end) {
-		/*
-		 * We might have temporarily dropped the lock. For example,
-		 * our VMA might have been split.
-		 */
-		if (!vma || start >= vma->vm_end) {
-			vma = vma_lookup(mm, start);
-			if (!vma)
-				return -ENOMEM;
-		}
-
-		tmp_end = min_t(unsigned long, end, vma->vm_end);
 		/* Populate (prefault) page tables readable/writable. */
-		pages = faultin_vma_page_range(vma, start, tmp_end, write,
-					       &locked);
+		pages = faultin_page_range(mm, start, end, write, &locked);
 		if (!locked) {
 			mmap_read_lock(mm);
 			locked = 1;
@@ -949,7 +936,7 @@ static long madvise_populate(struct vm_a
 				pr_warn_once("%s: unhandled return value: %ld\n",
 					     __func__, pages);
 				fallthrough;
-			case -ENOMEM:
+			case -ENOMEM: /* No VMA or out of memory. */
 				return -ENOMEM;
 			}
 		}
_

Patches currently in -mm which might be from david(a)redhat.com are

mm-madvise-dont-perform-madvise-vma-walk-for-madv_populate_readwrite.patch
mm-convert-folio_estimated_sharers-to-folio_likely_mapped_shared.patch
mm-convert-folio_estimated_sharers-to-folio_likely_mapped_shared-fix.patch
selftests-memfd_secret-add-vmsplice-test.patch
mm-merge-folio_is_secretmem-and-folio_fast_pin_allowed-into-gup_fast_folio_allowed.patch
mm-optimize-config_per_vma_lock-member-placement-in-vm_area_struct.patch
mm-remove-prot-parameter-from-move_pte.patch
mm-gup-consistently-name-gup-fast-functions.patch
mm-treewide-rename-config_have_fast_gup-to-config_have_gup_fast.patch
mm-use-gup-fast-instead-fast-gup-in-remaining-comments.patch
drivers-virt-acrn-fix-pfnmap-pte-checks-in-acrn_vm_ram_map.patch
mm-pass-vma-instead-of-mm-to-follow_pte.patch
mm-follow_pte-improvements.patch
mm-allow-for-detecting-underflows-with-page_mapcount-again.patch
mm-rmap-always-inline-anon-file-rmap-duplication-of-a-single-pte.patch
mm-rmap-add-fast-path-for-small-folios-when-adding-removing-duplicating.patch
mm-track-mapcount-of-large-folios-in-single-value.patch
mm-improve-folio_likely_mapped_shared-using-the-mapcount-of-large-folios.patch
mm-make-folio_mapcount-return-0-for-small-typed-folios.patch
mm-memory-use-folio_mapcount-in-zap_present_folio_ptes.patch
mm-huge_memory-use-folio_mapcount-in-zap_huge_pmd-sanity-check.patch
mm-memory-failure-use-folio_mapcount-in-hwpoison_user_mappings.patch
mm-page_alloc-use-folio_mapped-in-__alloc_contig_migrate_range.patch
mm-migrate-use-folio_likely_mapped_shared-in-add_page_for_migration.patch
sh-mm-cache-use-folio_mapped-in-copy_from_user_page.patch
mm-filemap-use-folio_mapcount-in-filemap_unaccount_folio.patch
mm-migrate_device-use-folio_mapcount-in-migrate_vma_check_page.patch
trace-events-page_ref-trace-the-raw-page-mapcount-value.patch
xtensa-mm-convert-check_tlb_entry-to-sanity-check-folios.patch
mm-debug-print-only-page-mapcount-excluding-folio-entire-mapcount-in-__dump_folio.patch
documentation-admin-guide-cgroup-v1-memoryrst-dont-reference-page_mapcount.patch
mm-ksm-rename-get_ksm_page_flags-to-ksm_get_folio_flags.patch
mm-ksm-remove-page_mapcount-usage-in-stable_tree_search.patch
loongarch-tlb-fix-error-parameter-ptep-set-but-not-used-due-to-__tlb_remove_tlb_entry.patch
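
The retry problem described in the commit message can be reduced to a toy model: a fault handler that asks for a retry on the first attempt and reports an error only once the attempt is marked as "tried". The sketch below is not kernel code. `demo_handle_fault()` and `demo_populate()` are hypothetical names, and the `max_attempts` bound exists only so that the broken variant terminates in a demo.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum demo_fault { DEMO_FAULT_OK, DEMO_FAULT_RETRY, DEMO_FAULT_SIGBUS };

/*
 * Toy fault handler for a page that can never be read: the first attempt
 * asks for a retry, an attempt marked as "tried" reports the error.
 */
static enum demo_fault demo_handle_fault(bool tried)
{
	return tried ? DEMO_FAULT_SIGBUS : DEMO_FAULT_RETRY;
}

/*
 * Populate loop. With track_tried == false the retry is never recorded,
 * which models the behaviour the patch describes as broken.
 * Returns 0 on success, -EFAULT on a fault error and -EAGAIN when the
 * demo-only attempt bound is reached.
 */
static int demo_populate(bool track_tried, int max_attempts)
{
	bool tried = false;
	int i;

	for (i = 0; i < max_attempts; i++) {
		switch (demo_handle_fault(tried)) {
		case DEMO_FAULT_OK:
			return 0;
		case DEMO_FAULT_SIGBUS:
			return -EFAULT;
		case DEMO_FAULT_RETRY:
			if (track_tried)
				tried = true;	/* next attempt may fail hard */
			break;
		}
	}
	return -EAGAIN;
}
```

In this model, `demo_populate(true, ...)` ends with -EFAULT on the second attempt, while `demo_populate(false, ...)` keeps retrying until the artificial bound stops it.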

[merged mm-hotfixes-stable] nilfs2-fix-oob-in-nilfs_set_de_type.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: nilfs2: fix OOB in nilfs_set_de_type
has been removed from the -mm tree. Its filename was
     nilfs2-fix-oob-in-nilfs_set_de_type.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Jeongjun Park <aha310510(a)gmail.com>
Subject: nilfs2: fix OOB in nilfs_set_de_type
Date: Tue, 16 Apr 2024 03:20:48 +0900

The size of the nilfs_type_by_mode array in the fs/nilfs2/dir.c file is defined as "S_IFMT >> S_SHIFT", but the nilfs_set_de_type() function, which uses this array, specifies the index to read from the array in the same way as "(mode & S_IFMT) >> S_SHIFT".

static void nilfs_set_de_type(struct nilfs_dir_entry *de, struct inode *inode)
{
	umode_t mode = inode->i_mode;

	de->file_type = nilfs_type_by_mode[(mode & S_IFMT)>>S_SHIFT]; // oob
}

However, when the index is determined this way, an out-of-bounds (OOB) error occurs by referring to an index that is one past the last valid index of the array when the condition "(mode & S_IFMT) == S_IFMT" is satisfied. Therefore, a patch to resize the nilfs_type_by_mode array should be applied to prevent OOB errors.

Link: https://lkml.kernel.org/r/20240415182048.7144-1-konishi.ryusuke@gmail.com
Reported-by: syzbot+2e22057de05b9f3b30d8(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2e22057de05b9f3b30d8
Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
Signed-off-by: Jeongjun Park <aha310510(a)gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/nilfs2/dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/nilfs2/dir.c~nilfs2-fix-oob-in-nilfs_set_de_type
+++ a/fs/nilfs2/dir.c
@@ -240,7 +240,7 @@ nilfs_filetype_table[NILFS_FT_MAX] = {
 
 #define S_SHIFT 12
 static unsigned char
-nilfs_type_by_mode[S_IFMT >> S_SHIFT] = {
+nilfs_type_by_mode[(S_IFMT >> S_SHIFT) + 1] = {
 	[S_IFREG >> S_SHIFT]	= NILFS_FT_REG_FILE,
 	[S_IFDIR >> S_SHIFT]	= NILFS_FT_DIR,
 	[S_IFCHR >> S_SHIFT]	= NILFS_FT_CHRDEV,
_

Patches currently in -mm which might be from aha310510(a)gmail.com are
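
The arithmetic behind the off-by-one can be checked in user space. The sketch below assumes the usual Linux value S_IFMT = 0170000 and the S_SHIFT of 12 from the patch context; the `DEMO_*` macros and `demo_*()` helpers are hypothetical names used only here, not nilfs2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the usual Linux definitions; DEMO_* marks them as local. */
#define DEMO_S_IFMT	0170000u	/* file-type bit mask of a mode */
#define DEMO_S_SHIFT	12

/* Array sizes before and after the fix. */
#define DEMO_OLD_SIZE	(DEMO_S_IFMT >> DEMO_S_SHIFT)
#define DEMO_NEW_SIZE	((DEMO_S_IFMT >> DEMO_S_SHIFT) + 1)

/* The old array has 15 elements, so its valid indexes are 0..14. */
static_assert(DEMO_OLD_SIZE == 15, "old array holds 15 elements");

/* Index computed the same way nilfs_set_de_type() does. */
static size_t demo_type_index(unsigned int mode)
{
	return (mode & DEMO_S_IFMT) >> DEMO_S_SHIFT;
}

/* True when the index is a valid subscript for an array of 'size' elements. */
static int demo_index_in_bounds(unsigned int mode, size_t size)
{
	return demo_type_index(mode) < size;
}
```

A mode with all S_IFMT bits set yields index 15, which is out of bounds for the 15-element array and in bounds once the array has 16 elements.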

[merged mm-hotfixes-stable] fork-defer-linking-file-vma-until-vma-is-fully-initialized.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: fork: defer linking file vma until vma is fully initialized
has been removed from the -mm tree.  Its filename was
     fork-defer-linking-file-vma-until-vma-is-fully-initialized.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: fork: defer linking file vma until vma is fully initialized
Date: Wed, 10 Apr 2024 17:14:41 +0800

Thorvald reported a WARNING [1]. And the root cause is below race:

 CPU 1					CPU 2
 fork					hugetlbfs_fallocate
  dup_mmap				 hugetlbfs_punch_hole
   i_mmap_lock_write(mapping);
   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
   i_mmap_unlock_write(mapping);
   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
					 i_mmap_lock_write(mapping);
					 hugetlb_vmdelete_list
					  vma_interval_tree_foreach
					   hugetlb_vma_trylock_write -- Vma_lock is cleared.
   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
					   hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
					 i_mmap_unlock_write(mapping);

hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside i_mmap_rwsem lock while vma lock can be used in the same time. Fix this by deferring linking file vma until vma is fully initialized. Those vmas should be initialized first before they can be used.

Link: https://lkml.kernel.org/r/20240410091441.3539905-1-linmiaohe@huawei.com
Fixes: 8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reported-by: Thorvald Natvig <thorvald(a)google.com>
Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1]
Reviewed-by: Jane Chu <jane.chu(a)oracle.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Kent Overstreet <kent.overstreet(a)linux.dev>
Cc: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Mateusz Guzik <mjguzik(a)gmail.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Peng Zhang <zhangpeng.00(a)bytedance.com>
Cc: Tycho Andersen <tandersen(a)netflix.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 kernel/fork.c |   33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

--- a/kernel/fork.c~fork-defer-linking-file-vma-until-vma-is-fully-initialized
+++ a/kernel/fork.c
@@ -714,6 +714,23 @@ static __latent_entropy int dup_mmap(str
 		} else if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
 		vm_flags_clear(tmp, VM_LOCKED_MASK);
+		/*
+		 * Copy/update hugetlb private vma information.
+		 */
+		if (is_vm_hugetlb_page(tmp))
+			hugetlb_dup_vma_private(tmp);
+
+		/*
+		 * Link the vma into the MT. After using __mt_dup(), memory
+		 * allocation is not necessary here, so it cannot fail.
+		 */
+		vma_iter_bulk_store(&vmi, tmp);
+
+		mm->map_count++;
+
+		if (tmp->vm_ops && tmp->vm_ops->open)
+			tmp->vm_ops->open(tmp);
+
 		file = tmp->vm_file;
 		if (file) {
 			struct address_space *mapping = file->f_mapping;
@@ -730,25 +747,9 @@ static __latent_entropy int dup_mmap(str
 			i_mmap_unlock_write(mapping);
 		}
 
-		/*
-		 * Copy/update hugetlb private vma information.
-		 */
-		if (is_vm_hugetlb_page(tmp))
-			hugetlb_dup_vma_private(tmp);
-
-		/*
-		 * Link the vma into the MT. After using __mt_dup(), memory
-		 * allocation is not necessary here, so it cannot fail.
-		 */
-		vma_iter_bulk_store(&vmi, tmp);
-
-		mm->map_count++;
 		if (!(tmp->vm_flags & VM_WIPEONFORK))
 			retval = copy_page_range(tmp, mpnt);
 
-		if (tmp->vm_ops && tmp->vm_ops->open)
-			tmp->vm_ops->open(tmp);
-
 		if (retval) {
 			mpnt = vma_next(&vmi);
 			goto loop_out;
_

Patches currently in -mm which might be from linmiaohe(a)huawei.com are
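
The ordering this patch enforces can be sketched in user space. This is only an illustration, not kernel code: every toy_* name is hypothetical, a pthread mutex stands in for i_mmap_rwsem, and a singly linked list stands in for the i_mmap tree. The point it shows is that all private setup happens before the object is made visible under the lock, so a walker holding the lock never sees a half-initialized vma.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not kernel types. */
struct toy_vma {
	bool lock_ready;		/* models the hugetlb vma_lock being set up */
	struct toy_vma *next;		/* link in the shared list */
};

struct toy_mapping {
	pthread_mutex_t lock;		/* models i_mmap_rwsem */
	struct toy_vma *head;		/* models the i_mmap tree */
};

static void toy_mapping_init(struct toy_mapping *m)
{
	pthread_mutex_init(&m->lock, NULL);
	m->head = NULL;
}

/* Finish all private setup first, then publish the vma under the lock. */
static void toy_link_vma(struct toy_mapping *m, struct toy_vma *v)
{
	/* Models hugetlb_dup_vma_private() followed by vm_ops->open(). */
	v->lock_ready = true;

	pthread_mutex_lock(&m->lock);
	v->next = m->head;
	m->head = v;			/* only now can other threads find it */
	pthread_mutex_unlock(&m->lock);
}

/* A walker holding the lock should only ever see initialized vmas. */
static bool toy_all_ready(struct toy_mapping *m)
{
	bool ok = true;

	pthread_mutex_lock(&m->lock);
	for (struct toy_vma *v = m->head; v; v = v->next)
		ok = ok && v->lock_ready;
	pthread_mutex_unlock(&m->lock);
	return ok;
}
```

In the pre-patch ordering the equivalent of the lock_ready assignment ran after the list insertion, which is the window the race diagram above describes.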


[merged mm-hotfixes-stable] mm-shmem-inline-shmem_is_huge-for-disabled-transparent-hugepages.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/shmem: inline shmem_is_huge() for disabled transparent hugepages
has been removed from the -mm tree.  Its filename was
     mm-shmem-inline-shmem_is_huge-for-disabled-transparent-hugepages.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Sumanth Korikkar <sumanthk(a)linux.ibm.com>
Subject: mm/shmem: inline shmem_is_huge() for disabled transparent hugepages
Date: Tue, 9 Apr 2024 17:54:07 +0200

In order to minimize code size (CONFIG_CC_OPTIMIZE_FOR_SIZE=y), compiler might choose to make a regular function call (out-of-line) for shmem_is_huge() instead of inlining it. When transparent hugepages are disabled (CONFIG_TRANSPARENT_HUGEPAGE=n), it can cause compilation error.

mm/shmem.c: In function `shmem_getattr':
./include/linux/huge_mm.h:383:27: note: in expansion of macro `BUILD_BUG'
  383 | #define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; })
      |                           ^~~~~~~~~
mm/shmem.c:1148:33: note: in expansion of macro `HPAGE_PMD_SIZE'
 1148 |                 stat->blksize = HPAGE_PMD_SIZE;

To prevent the possible error, always inline shmem_is_huge() when transparent hugepages are disabled.

Link: https://lkml.kernel.org/r/20240409155407.2322714-1-sumanthk@linux.ibm.com
Signed-off-by: Sumanth Korikkar <sumanthk(a)linux.ibm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Ilya Leoshkevich <iii(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 include/linux/shmem_fs.h |    9 +++++++++
 mm/shmem.c               |    6 ------
 2 files changed, 9 insertions(+), 6 deletions(-)

--- a/include/linux/shmem_fs.h~mm-shmem-inline-shmem_is_huge-for-disabled-transparent-hugepages
+++ a/include/linux/shmem_fs.h
@@ -110,8 +110,17 @@ extern struct page *shmem_read_mapping_p
 extern void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
 int shmem_unuse(unsigned int type);
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
 			  struct mm_struct *mm, unsigned long vm_flags);
+#else
+static __always_inline bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
+					  struct mm_struct *mm, unsigned long vm_flags)
+{
+	return false;
+}
+#endif
+
 #ifdef CONFIG_SHMEM
 extern unsigned long shmem_swap_usage(struct vm_area_struct *vma);
 #else
--- a/mm/shmem.c~mm-shmem-inline-shmem_is_huge-for-disabled-transparent-hugepages
+++ a/mm/shmem.c
@@ -748,12 +748,6 @@ static long shmem_unused_huge_count(stru
 
 #define shmem_huge SHMEM_HUGE_DENY
 
-bool shmem_is_huge(struct inode *inode, pgoff_t index, bool shmem_huge_force,
-		   struct mm_struct *mm, unsigned long vm_flags)
-{
-	return false;
-}
-
 static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		struct shrink_control *sc, unsigned long nr_to_split)
 {
_

Patches currently in -mm which might be from sumanthk(a)linux.ibm.com are
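
The header-stub pattern used by this patch can be sketched outside the kernel. This is a simplified illustration under stated assumptions: TOY_CONFIG_THP is a hypothetical macro standing in for CONFIG_TRANSPARENT_HUGEPAGE, the toy_* names are invented, and the kernel's BUILD_BUG() trick in HPAGE_PMD_SIZE is replaced by a plain 0 so the sketch builds at any optimization level. With the option off, the predicate is a forced-inline stub returning false, so the compiler can see in every caller that the huge-page branch is dead.

```c
#include <stdbool.h>

/* Hypothetical switch standing in for CONFIG_TRANSPARENT_HUGEPAGE. */
/* #define TOY_CONFIG_THP 1 */

#ifdef TOY_CONFIG_THP
/* Option on: the real, out-of-line implementation lives elsewhere. */
bool toy_is_huge(unsigned long index);
#define TOY_HUGE_BLKSIZE (2UL << 20)
#else
/* Option off: a stub the compiler must inline, visible in every caller. */
static inline __attribute__((always_inline)) bool toy_is_huge(unsigned long index)
{
	(void)index;
	return false;
}
#define TOY_HUGE_BLKSIZE 0UL	/* the kernel uses a BUILD_BUG() expression here */
#endif

/* Models the shmem_getattr() caller that picks a block size. */
static unsigned long toy_blksize(unsigned long index)
{
	unsigned long blksize = 4096;

	if (toy_is_huge(index))		/* folds to "if (false)" when the option is off */
		blksize = TOY_HUGE_BLKSIZE;
	return blksize;
}
```

The design choice is the same as in the patch: moving the stub into the header and forcing inlining removes the dependence on the optimizer choosing to inline an out-of-line function.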


[merged mm-hotfixes-stable] squashfs-check-the-inode-number-is-not-the-invalid-value-of-zero.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: Squashfs: check the inode number is not the invalid value of zero
has been removed from the -mm tree.  Its filename was
     squashfs-check-the-inode-number-is-not-the-invalid-value-of-zero.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: Squashfs: check the inode number is not the invalid value of zero
Date: Mon, 8 Apr 2024 23:02:06 +0100

Syskiller has produced an out of bounds access in fill_meta_index().

That out of bounds access is ultimately caused because the inode has an inode number with the invalid value of zero, which was not checked.

The reason this causes the out of bounds access is due to following sequence of events:

1. Fill_meta_index() is called to allocate (via empty_meta_index()) and fill a metadata index. It however suffers a data read error and aborts, invalidating the newly returned empty metadata index. It does this by setting the inode number of the index to zero, which means unused (zero is not a valid inode number).

2. When fill_meta_index() is subsequently called again on another read operation, locate_meta_index() returns the previous index because it matches the inode number of 0. Because this index has been returned it is expected to have been filled, and because it hasn't been, an out of bounds access is performed.

This patch adds a sanity check which checks that the inode number is not zero when the inode is created and returns -EINVAL if it is.

[phillip(a)squashfs.org.uk: whitespace fix]
  Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: "Ubisectech Sirius" <bugreport(a)ubisectech.com>
Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport…
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/squashfs/inode.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/fs/squashfs/inode.c~squashfs-check-the-inode-number-is-not-the-invalid-value-of-zero
+++ a/fs/squashfs/inode.c
@@ -48,6 +48,10 @@ static int squashfs_new_inode(struct sup
 	gid_t i_gid;
 	int err;
 
+	inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
+	if (inode->i_ino == 0)
+		return -EINVAL;
+
 	err = squashfs_get_id(sb, le16_to_cpu(sqsh_ino->uid), &i_uid);
 	if (err)
 		return err;
@@ -58,7 +62,6 @@ static int squashfs_new_inode(struct sup
 
 	i_uid_write(inode, i_uid);
 	i_gid_write(inode, i_gid);
-	inode->i_ino = le32_to_cpu(sqsh_ino->inode_number);
 	inode_set_mtime(inode, le32_to_cpu(sqsh_ino->mtime), 0);
 	inode_set_atime(inode, inode_get_mtime_sec(inode), 0);
 	inode_set_ctime(inode, inode_get_mtime_sec(inode), 0);
_

Patches currently in -mm which might be from phillip(a)squashfs.org.uk are

squashfs-remove-deprecated-strncpy-by-not-copying-the-string.patch
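
The two-step failure described above can be modelled in a few lines. This is a toy sketch, not the Squashfs implementation: the toy_* names and the four-slot cache are assumptions made for illustration. It shows why a lookup keyed on inode number 0 wrongly matches an unused slot, and how rejecting 0 at inode creation closes that path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 4

/* Hypothetical cache slot; inode_number == 0 marks the slot as unused. */
struct toy_meta {
	uint32_t inode_number;
	int filled;
};

struct toy_inode {
	uint32_t i_ino;
};

/* Models locate_meta_index(): return the first slot matching the inode. */
static struct toy_meta *toy_locate(struct toy_meta *cache, uint32_t ino)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (cache[i].inode_number == ino)
			return &cache[i];
	return NULL;
}

/* Models the added check: refuse the reserved value before using it. */
static int toy_new_inode(struct toy_inode *inode, uint32_t on_disk_ino)
{
	inode->i_ino = on_disk_ino;
	if (inode->i_ino == 0)
		return -EINVAL;
	return 0;
}
```

With an all-zero cache, toy_locate(cache, 0) returns an unused, unfilled slot, which is the mismatch the commit message describes; toy_new_inode() returning -EINVAL for 0 means such a lookup is never attempted.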


[merged mm-hotfixes-stable] mmswapops-update-check-in-is_pfn_swap_entry-for-hwpoison-entries.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm,swapops: update check in is_pfn_swap_entry for hwpoison entries
has been removed from the -mm tree.  Its filename was
     mmswapops-update-check-in-is_pfn_swap_entry-for-hwpoison-entries.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Oscar Salvador <osalvador(a)suse.de>
Subject: mm,swapops: update check in is_pfn_swap_entry for hwpoison entries
Date: Sun, 7 Apr 2024 15:05:37 +0200

Tony reported that the Machine check recovery was broken in v6.9-rc1, as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors to DRAM.

After some more digging and debugging on his side, he realized that this went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")'. That commit, among other things, introduced swp_offset_pfn(), replacing hwpoison_entry_to_pfn() in its favour.

The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), but is_pfn_swap_entry() never got updated to cover hwpoison entries, which means that we would hit the VM_BUG_ON whenever we would call swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. Fix this by updating the check to cover hwpoison entries as well, and update the comment while we are it.

Link: https://lkml.kernel.org/r/20240407130537.16977-1-osalvador@suse.de
Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")
Signed-off-by: Oscar Salvador <osalvador(a)suse.de>
Reported-by: Tony Luck <tony.luck(a)intel.com>
Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/
Tested-by: Tony Luck <tony.luck(a)intel.com>
Reviewed-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: <stable(a)vger.kernel.org>	[6.1.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 include/linux/swapops.h |   65 +++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 32 deletions(-)

--- a/include/linux/swapops.h~mmswapops-update-check-in-is_pfn_swap_entry-for-hwpoison-entries
+++ a/include/linux/swapops.h
@@ -390,6 +390,35 @@ static inline bool is_migration_entry_di
 }
 #endif	/* CONFIG_MIGRATION */
 
+#ifdef CONFIG_MEMORY_FAILURE
+
+/*
+ * Support for hardware poisoned pages
+ */
+static inline swp_entry_t make_hwpoison_entry(struct page *page)
+{
+	BUG_ON(!PageLocked(page));
+	return swp_entry(SWP_HWPOISON, page_to_pfn(page));
+}
+
+static inline int is_hwpoison_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_HWPOISON;
+}
+
+#else
+
+static inline swp_entry_t make_hwpoison_entry(struct page *page)
+{
+	return swp_entry(0, 0);
+}
+
+static inline int is_hwpoison_entry(swp_entry_t swp)
+{
+	return 0;
+}
+#endif
+
 typedef unsigned long pte_marker;
 
 #define PTE_MARKER_UFFD_WP			BIT(0)
@@ -483,8 +512,9 @@ static inline struct folio *pfn_swap_ent
 
 /*
  * A pfn swap entry is a special type of swap entry that always has a pfn stored
- * in the swap offset. They are used to represent unaddressable device memory
- * and to restrict access to a page undergoing migration.
+ * in the swap offset. They can either be used to represent unaddressable device
+ * memory, to restrict access to a page undergoing migration or to represent a
+ * pfn which has been hwpoisoned and unmapped.
  */
 static inline bool is_pfn_swap_entry(swp_entry_t entry)
 {
@@ -492,7 +522,7 @@ static inline bool is_pfn_swap_entry(swp
 	BUILD_BUG_ON(SWP_TYPE_SHIFT < SWP_PFN_BITS);
 
 	return is_migration_entry(entry) || is_device_private_entry(entry) ||
-	       is_device_exclusive_entry(entry);
+	       is_device_exclusive_entry(entry) || is_hwpoison_entry(entry);
 }
 
 struct page_vma_mapped_walk;
@@ -561,35 +591,6 @@ static inline int is_pmd_migration_entry
 }
 #endif  /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
 
-#ifdef CONFIG_MEMORY_FAILURE
-
-/*
- * Support for hardware poisoned pages
- */
-static inline swp_entry_t make_hwpoison_entry(struct page *page)
-{
-	BUG_ON(!PageLocked(page));
-	return swp_entry(SWP_HWPOISON, page_to_pfn(page));
-}
-
-static inline int is_hwpoison_entry(swp_entry_t entry)
-{
-	return swp_type(entry) == SWP_HWPOISON;
-}
-
-#else
-
-static inline swp_entry_t make_hwpoison_entry(struct page *page)
-{
-	return swp_entry(0, 0);
-}
-
-static inline int is_hwpoison_entry(swp_entry_t swp)
-{
-	return 0;
-}
-#endif
-
 static inline int non_swap_entry(swp_entry_t entry)
 {
 	return swp_type(entry) >= MAX_SWAPFILES;
_

Patches currently in -mm which might be from osalvador(a)suse.de are
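
The effect of the one-line predicate change can be shown with a toy model. The enum, the struct and every toy_* name below are hypothetical and do not mirror the kernel's swp_entry_t encoding; the sketch only contrasts the old and new membership tests for "this entry carries a pfn in its offset".

```c
#include <stdbool.h>

/* Hypothetical entry kinds; the kernel encodes these in swp_type(). */
enum toy_swp_type {
	TOY_SWAP,
	TOY_MIGRATION,
	TOY_DEVICE_PRIVATE,
	TOY_DEVICE_EXCLUSIVE,
	TOY_HWPOISON,
};

struct toy_entry {
	enum toy_swp_type type;
	unsigned long offset;		/* a pfn for the "pfn entry" kinds */
};

/* Pre-patch predicate: hwpoison entries are not recognised. */
static bool toy_is_pfn_entry_old(struct toy_entry e)
{
	return e.type == TOY_MIGRATION || e.type == TOY_DEVICE_PRIVATE ||
	       e.type == TOY_DEVICE_EXCLUSIVE;
}

/* Post-patch predicate: hwpoison entries also store a pfn. */
static bool toy_is_pfn_entry(struct toy_entry e)
{
	return toy_is_pfn_entry_old(e) || e.type == TOY_HWPOISON;
}
```

A caller that asserts the predicate before reading the pfn, as swp_offset_pfn() does with VM_BUG_ON() under CONFIG_DEBUG_VM, would trip on a hwpoison entry with the old predicate and pass with the new one.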


[merged mm-hotfixes-stable] mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
has been removed from the -mm tree.  Its filename was
     mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
Date: Sun, 7 Apr 2024 16:54:56 +0800

When I did hard offline test with hugetlb pages, below deadlock occurs:

======================================================
WARNING: possible circular locking dependency detected
6.8.0-11409-gf6cef5f8c37f #1 Not tainted
------------------------------------------------------
bash/46904 is trying to acquire lock:
ffffffffabe68910 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_slow_dec+0x16/0x60

but task is already holding lock:
ffffffffabf92ea8 (pcp_batch_high_lock){+.+.}-{3:3}, at: zone_pcp_disable+0x16/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (pcp_batch_high_lock){+.+.}-{3:3}:
       __mutex_lock+0x6c/0x770
       page_alloc_cpu_online+0x3c/0x70
       cpuhp_invoke_callback+0x397/0x5f0
       __cpuhp_invoke_callback_range+0x71/0xe0
       _cpu_up+0xeb/0x210
       cpu_up+0x91/0xe0
       cpuhp_bringup_mask+0x49/0xb0
       bringup_nonboot_cpus+0xb7/0xe0
       smp_init+0x25/0xa0
       kernel_init_freeable+0x15f/0x3e0
       kernel_init+0x15/0x1b0
       ret_from_fork+0x2f/0x50
       ret_from_fork_asm+0x1a/0x30

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
       __lock_acquire+0x1298/0x1cd0
       lock_acquire+0xc0/0x2b0
       cpus_read_lock+0x2a/0xc0
       static_key_slow_dec+0x16/0x60
       __hugetlb_vmemmap_restore_folio+0x1b9/0x200
       dissolve_free_huge_page+0x211/0x260
       __page_handle_poison+0x45/0xc0
       memory_failure+0x65e/0xc70
       hard_offline_page_store+0x55/0xa0
       kernfs_fop_write_iter+0x12c/0x1d0
       vfs_write+0x387/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xca/0x1e0
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pcp_batch_high_lock);
                               lock(cpu_hotplug_lock);
                               lock(pcp_batch_high_lock);
  rlock(cpu_hotplug_lock);

 *** DEADLOCK ***

5 locks held by bash/46904:
 #0: ffff98f6c3bb23f0 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x64/0xe0
 #1: ffff98f6c328e488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf8/0x1d0
 #2: ffff98ef83b31890 (kn->active#113){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x100/0x1d0
 #3: ffffffffabf9db48 (mf_mutex){+.+.}-{3:3}, at: memory_failure+0x44/0xc70
 #4: ffffffffabf92ea8 (pcp_batch_high_lock){+.+.}-{3:3}, at: zone_pcp_disable+0x16/0x40

stack backtrace:
CPU: 10 PID: 46904 Comm: bash Kdump: loaded Not tainted 6.8.0-11409-gf6cef5f8c37f #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0xa0
 check_noncircular+0x129/0x140
 __lock_acquire+0x1298/0x1cd0
 lock_acquire+0xc0/0x2b0
 cpus_read_lock+0x2a/0xc0
 static_key_slow_dec+0x16/0x60
 __hugetlb_vmemmap_restore_folio+0x1b9/0x200
 dissolve_free_huge_page+0x211/0x260
 __page_handle_poison+0x45/0xc0
 memory_failure+0x65e/0xc70
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x387/0x550
 ksys_write+0x64/0xe0
 do_syscall_64+0xca/0x1e0
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fc862314887
Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007fff19311268 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fc862314887
RDX: 000000000000000c RSI: 000056405645fe10 RDI: 0000000000000001
RBP: 000056405645fe10 R08: 00007fc8623d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007fc86241b780 R14: 00007fc862417600 R15: 00007fc862416a00

In short, below scene breaks the lock dependency chain:

 memory_failure
  __page_handle_poison
   zone_pcp_disable -- lock(pcp_batch_high_lock)
   dissolve_free_huge_page
    __hugetlb_vmemmap_restore_folio
     static_key_slow_dec
      cpus_read_lock -- rlock(cpu_hotplug_lock)

Fix this by calling drain_all_pages() instead.

This issue won't occur until commit a6b40850c442 ("mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key"). As it introduced rlock(cpu_hotplug_lock) in dissolve_free_huge_page() code path while lock(pcp_batch_high_lock) is already in the __page_handle_poison().

[linmiaohe(a)huawei.com: extend comment per Oscar]
[akpm(a)linux-foundation.org: reflow block comment]
Link: https://lkml.kernel.org/r/20240407085456.2798193-1-linmiaohe@huawei.com
Fixes: a6b40850c442 ("mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Reviewed-by: Jane Chu <jane.chu(a)oracle.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memory-failure.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

--- a/mm/memory-failure.c~mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled
+++ a/mm/memory-failure.c
@@ -154,11 +154,23 @@ static int __page_handle_poison(struct p
 {
 	int ret;
 
-	zone_pcp_disable(page_zone(page));
+	/*
+	 * zone_pcp_disable() can't be used here. It will
+	 * hold pcp_batch_high_lock and dissolve_free_huge_page() might hold
+	 * cpu_hotplug_lock via static_key_slow_dec() when hugetlb vmemmap
+	 * optimization is enabled. This will break current lock dependency
+	 * chain and leads to deadlock.
+	 * Disabling pcp before dissolving the page was a deterministic
+	 * approach because we made sure that those pages cannot end up in any
+	 * PCP list. Draining PCP lists expels those pages to the buddy system,
+	 * but nothing guarantees that those pages do not get back to a PCP
+	 * queue if we need to refill those.
+	 */
 	ret = dissolve_free_huge_page(page);
-	if (!ret)
+	if (!ret) {
+		drain_all_pages(page_zone(page));
 		ret = take_page_off_buddy(page);
-	zone_pcp_enable(page_zone(page));
+	}
 
 	return ret;
 }
_

Patches currently in -mm which might be from linmiaohe(a)huawei.com are
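
The inversion that lockdep reports above can be reproduced with a very small order tracker. This is a sketch under loud assumptions: it is not how lockdep is implemented, the toy_* names are invented, and it only catches a direct two-lock inversion rather than longer cycles. It records "B was taken while A was held" and refuses the reverse order.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers for the two locks named in the report. */
enum {
	TOY_CPU_HOTPLUG_LOCK,
	TOY_PCP_BATCH_HIGH_LOCK,
	TOY_NLOCKS,
};

/* toy_after[a][b] is true once b has been acquired while a was held. */
static bool toy_after[TOY_NLOCKS][TOY_NLOCKS];

/*
 * Record that "next" is being taken while "held" is held.
 * Returns false if the opposite order was seen before (AB-BA inversion).
 */
static bool toy_record_order(int held, int next)
{
	if (toy_after[next][held])
		return false;
	toy_after[held][next] = true;
	return true;
}
```

In terms of the trace: CPU bring-up first records cpu_hotplug_lock then pcp_batch_high_lock, and the memory-failure path later attempts pcp_batch_high_lock then cpu_hotplug_lock, which the tracker rejects. The patch avoids the second order by not holding pcp_batch_high_lock across dissolve_free_huge_page().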


[merged mm-hotfixes-stable] mm-userfaultfd-allow-hugetlb-change-protection-upon-poison-entry.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/userfaultfd: allow hugetlb change protection upon poison entry
has been removed from the -mm tree.  Its filename was
     mm-userfaultfd-allow-hugetlb-change-protection-upon-poison-entry.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/userfaultfd: allow hugetlb change protection upon poison entry
Date: Fri, 5 Apr 2024 19:19:20 -0400

After UFFDIO_POISON, there can be two kinds of hugetlb pte markers, either the POISON one or UFFD_WP one.

Allow change protection to run on a poisoned marker just like !hugetlb cases, ignoring the marker irrelevant of the permission.

Here the two bits are mutual exclusive. For example, when install a poisoned entry it must not be UFFD_WP already (by checking pte_none() before such install). And it also means if UFFD_WP is set there must have no POISON bit set. It makes sense because UFFD_WP is a bit to reflect permission, and permissions do not apply if the pte is poisoned and destined to sigbus.

So here we simply check uffd_wp bit set first, do nothing otherwise.

Attach the Fixes to UFFDIO_POISON work, as before that it should not be possible to have poison entry for hugetlb (e.g., hugetlb doesn't do swap, so no chance of swapin errors).

Link: https://lkml.kernel.org/r/20240405231920.1772199-1-peterx@redhat.com
Link: https://lore.kernel.org/r/000000000000920d5e0615602dd1@google.com
Fixes: fc71884a5f59 ("mm: userfaultfd: add new UFFDIO_POISON ioctl")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: syzbot+b07c8ac8eee3d4d8440f(a)syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: <stable(a)vger.kernel.org>	[6.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/hugetlb.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/mm/hugetlb.c~mm-userfaultfd-allow-hugetlb-change-protection-upon-poison-entry
+++ a/mm/hugetlb.c
@@ -7044,9 +7044,13 @@ long hugetlb_change_protection(struct vm
 			if (!pte_same(pte, newpte))
 				set_huge_pte_at(mm, address, ptep, newpte, psize);
 		} else if (unlikely(is_pte_marker(pte))) {
-			/* No other markers apply for now. */
-			WARN_ON_ONCE(!pte_marker_uffd_wp(pte));
-			if (uffd_wp_resolve)
+			/*
+			 * Do nothing on a poison marker; page is
+			 * corrupted, permissons do not apply.  Here
+			 * pte_marker_uffd_wp()==true implies !poison
+			 * because they're mutual exclusive.
+			 */
+			if (pte_marker_uffd_wp(pte) && uffd_wp_resolve)
 				/* Safe to modify directly (non-present->none). */
 				huge_pte_clear(mm, address, ptep, psize);
 		} else if (!huge_pte_none(pte)) {
_

Patches currently in -mm which might be from peterx(a)redhat.com are

mm-hmm-process-pud-swap-entry-without-pud_huge.patch
mm-gup-cache-p4d-in-follow_p4d_mask.patch
mm-gup-check-p4d-presence-before-going-on.patch
mm-x86-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-sparc-change-pxd_huge-behavior-to-exclude-swap-entries.patch
mm-arm-use-macros-to-define-pmd-pud-helpers.patch
mm-arm-redefine-pmd_huge-with-pmd_leaf.patch
mm-arm64-merge-pxd_huge-and-pxd_leaf-definitions.patch
mm-powerpc-redefine-pxd_huge-with-pxd_leaf.patch
mm-gup-merge-pxd-huge-mapping-checks.patch
mm-treewide-replace-pxd_huge-with-pxd_leaf.patch
mm-treewide-remove-pxd_huge.patch
mm-arm-remove-pmd_thp_or_huge.patch
mm-document-pxd_leaf-api.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation.patch
selftests-mm-run_vmtestssh-fix-hugetlb-mem-size-calculation-fix.patch
mm-kconfig-config_pgtable_has_huge_leaves.patch
mm-hugetlb-declare-hugetlbfs_pagecache_present-non-static.patch
mm-make-hpage_pxd_-macros-even-if-thp.patch
mm-introduce-vma_pgtable_walk_beginend.patch
mm-arch-provide-pud_pfn-fallback.patch
mm-arch-provide-pud_pfn-fallback-fix.patch
mm-gup-drop-folio_fast_pin_allowed-in-hugepd-processing.patch
mm-gup-refactor-record_subpages-to-find-1st-small-page.patch
mm-gup-handle-hugetlb-for-no_page_table.patch
mm-gup-cache-pudp-in-follow_pud_mask.patch
mm-gup-handle-huge-pud-for-follow_pud_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask.patch
mm-gup-handle-huge-pmd-for-follow_pmd_mask-fix.patch
mm-gup-handle-hugepd-for-follow_page.patch
mm-gup-handle-hugetlb-in-the-generic-follow_page_mask-code.patch
mm-allow-anon-exclusive-check-over-hugetlb-tail-pages.patch
mm-page_table_check-support-userfault-wr-protect-entries.patch
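
The marker decision after this patch reduces to a single predicate, sketched below. The toy_* names are hypothetical, and the bit value chosen for the poison marker is an assumption made for the sketch (only the UFFD_WP bit position appears in the patches quoted on this page). It shows that a poison marker is left alone regardless of uffd_wp_resolve, and that a UFFD_WP marker is cleared only when resolving.

```c
#include <stdbool.h>

/* Hypothetical marker bits; the poison bit value is an assumption. */
#define TOY_MARKER_UFFD_WP	(1u << 0)
#define TOY_MARKER_POISONED	(1u << 1)

/*
 * Models the post-patch branch in hugetlb_change_protection():
 * returns true when the marker should be cleared.
 */
static bool toy_should_clear_marker(unsigned int marker, bool uffd_wp_resolve)
{
	/* A poison marker never has UFFD_WP set, so it falls through to false. */
	return (marker & TOY_MARKER_UFFD_WP) && uffd_wp_resolve;
}
```

The pre-patch code warned whenever the marker was not UFFD_WP, which is the WARN_ON_ONCE() that syzbot reached once UFFDIO_POISON made a second marker kind possible on hugetlb.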


[merged mm-hotfixes-stable] userfaultfd-change-src_folio-after-ensuring-its-unpinned-in-uffdio_move.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE
has been removed from the -mm tree.  Its filename was
     userfaultfd-change-src_folio-after-ensuring-its-unpinned-in-uffdio_move.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Lokesh Gidra <lokeshgidra(a)google.com>
Subject: userfaultfd: change src_folio after ensuring it's unpinned in UFFDIO_MOVE
Date: Thu, 4 Apr 2024 10:17:26 -0700

Commit d7a08838ab74 ("mm: userfaultfd: fix unexpected change to src_folio when UFFDIO_MOVE fails") moved the src_folio->{mapping, index} changing to after clearing the page-table and ensuring that it's not pinned. This avoids failure of swapout+migration and possibly memory corruption.

However, the commit missed fixing it in the huge-page case.

Link: https://lkml.kernel.org/r/20240404171726.2302435-1-lokeshgidra@google.com
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Lokesh Gidra <lokeshgidra(a)google.com>
Cc: Nicolas Geoffray <ngeoffray(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Qi Zheng <zhengqi.arch(a)bytedance.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/huge_memory.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/huge_memory.c~userfaultfd-change-src_folio-after-ensuring-its-unpinned-in-uffdio_move
+++ a/mm/huge_memory.c
@@ -2259,9 +2259,6 @@ int move_pages_huge_pmd(struct mm_struct
 		goto unlock_ptls;
 	}
 
-	folio_move_anon_rmap(src_folio, dst_vma);
-	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
-
 	src_pmdval = pmdp_huge_clear_flush(src_vma, src_addr, src_pmd);
 	/* Folio got pinned from under us. Put it back and fail the move. */
 	if (folio_maybe_dma_pinned(src_folio)) {
@@ -2270,6 +2267,9 @@ int move_pages_huge_pmd(struct mm_struct
 		goto unlock_ptls;
 	}
 
+	folio_move_anon_rmap(src_folio, dst_vma);
+	WRITE_ONCE(src_folio->index, linear_page_index(dst_vma, dst_addr));
+
 	_dst_pmd = mk_huge_pmd(&src_folio->page, dst_vma->vm_page_prot);
 	/* Follow mremap() behavior and treat the entry dirty after the move */
 	_dst_pmd = pmd_mkwrite(pmd_mkdirty(_dst_pmd), dst_vma);
_

Patches currently in -mm which might be from lokeshgidra(a)google.com are
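
The reordering in this patch follows a "validate, then mutate" shape, sketched here with a toy structure. The toy_* names are hypothetical, and the use of -EBUSY as the failure code is an assumption of the sketch. It shows that when the pin check fails, the object's index is left exactly as it was, so a failed move has no side effect.

```c
#include <errno.h>

/* Hypothetical stand-in for a folio: a pin count and its index. */
struct toy_folio {
	int pin_count;
	unsigned long index;
};

/* Check for pins first; only touch the folio once the move cannot fail. */
static int toy_move(struct toy_folio *f, unsigned long new_index)
{
	if (f->pin_count > 0)
		return -EBUSY;		/* fail before any state was changed */

	f->index = new_index;		/* models folio_move_anon_rmap() + index update */
	return 0;
}
```

Before the patch, the huge-page path performed the equivalent of the index assignment ahead of the pin check, so a move that failed still left the folio pointing at the destination.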


[folded-merged] mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled-v2.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
has been removed from the -mm tree.  Its filename was
     mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled-v2.patch

This patch was dropped because it was folded into mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled.patch

------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled
Date: Fri, 12 Apr 2024 10:57:54 +0800

extend comment per Oscar

Link: https://lkml.kernel.org/r/20240412025754.1897615-1-linmiaohe@huawei.com
Fixes: a6b40850c442 ("mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memory-failure.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/mm/memory-failure.c~mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled-v2
+++ a/mm/memory-failure.c
@@ -159,6 +159,10 @@ static int __page_handle_poison(struct p
 	 * dissolve_free_huge_page() might hold cpu_hotplug_lock via static_key_slow_dec()
 	 * when hugetlb vmemmap optimization is enabled. This will break current lock
 	 * dependency chain and leads to deadlock.
+	 * Disabling pcp before dissolving the page was a deterministic approach because
+	 * we made sure that those pages cannot end up in any PCP list. Draining PCP lists
+	 * expels those pages to the buddy system, but nothing guarantees that those pages
+	 * do not get back to a PCP queue if we need to refill those.
 	 */
 	ret = dissolve_free_huge_page(page);
 	if (!ret) {
_

Patches currently in -mm which might be from linmiaohe(a)huawei.com are

mm-memory-failure-fix-deadlock-when-hugetlb_optimize_vmemmap-is-enabled.patch
fork-defer-linking-file-vma-until-vma-is-fully-initialized.patch


[PATCH] Revert 2267b2e84593bd3d61a1188e68fba06307fa9dab

by cel＠kernel.org

From: Chuck Lever <chuck.lever(a)oracle.com>

ltp test fcntl17 fails on v5.15.154. This was bisected to commit
2267b2e84593 ("lockd: introduce safe async lock op").

Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
 Documentation/filesystems/nfs/exporting.rst |  7 -------
 fs/lockd/svclock.c                          |  4 +++-
 fs/nfsd/nfs4state.c                         | 10 +++-------
 include/linux/exportfs.h                    | 14 --------------
 4 files changed, 6 insertions(+), 29 deletions(-)

diff --git a/Documentation/filesystems/nfs/exporting.rst b/Documentation/filesystems/nfs/exporting.rst
index 6a1cbd7de38d..6f59a364f84c 100644
--- a/Documentation/filesystems/nfs/exporting.rst
+++ b/Documentation/filesystems/nfs/exporting.rst
@@ -241,10 +241,3 @@ following flags are defined:
     all of an inode's dirty data on last close. Exports that behave this
     way should set EXPORT_OP_FLUSH_ON_CLOSE so that NFSD knows to skip
     waiting for writeback when closing such files.
-
-  EXPORT_OP_ASYNC_LOCK - Indicates a capable filesystem to do async lock
-    requests from lockd. Only set EXPORT_OP_ASYNC_LOCK if the filesystem has
-    it's own ->lock() functionality as core posix_lock_file() implementation
-    has no async lock request handling yet. For more information about how to
-    indicate an async lock request from a ->lock() file_operations struct, see
-    fs/locks.c and comment for the function vfs_lock_file().
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 55c0a0331188..4e30f3c50970 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -470,7 +470,9 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 	    struct nlm_host *host, struct nlm_lock *lock, int wait,
 	    struct nlm_cookie *cookie, int reclaim)
 {
+#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
 	struct inode		*inode = nlmsvc_file_inode(file);
+#endif
 	struct nlm_block	*block = NULL;
 	int			error;
 	int			mode;
@@ -484,7 +486,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 				(long long)lock->fl.fl_end,
 				wait);

-	if (!exportfs_lock_op_is_async(inode->i_sb->s_export_op)) {
+	if (nlmsvc_file_file(file)->f_op->lock) {
 		async_block = wait;
 		wait = 0;
 	}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 40b5b226e504..d07176eee935 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7420,7 +7420,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfsd4_blocked_lock *nbl = NULL;
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
-	struct super_block *sb;
 	__be32 status = 0;
 	int lkflg;
 	int err;
@@ -7442,7 +7441,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		dprintk("NFSD: nfsd4_lock: permission denied!\n");
 		return status;
 	}
-	sb = cstate->current_fh.fh_dentry->d_sb;

 	if (lock->lk_is_new) {
 		if (nfsd4_has_session(cstate))
@@ -7494,8 +7492,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	fp = lock_stp->st_stid.sc_file;
 	switch (lock->lk_type) {
 		case NFS4_READW_LT:
-			if (nfsd4_has_session(cstate) ||
-			    exportfs_lock_op_is_async(sb->s_export_op))
+			if (nfsd4_has_session(cstate))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_READ_LT:
@@ -7507,8 +7504,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			fl_type = F_RDLCK;
 			break;
 		case NFS4_WRITEW_LT:
-			if (nfsd4_has_session(cstate) ||
-			    exportfs_lock_op_is_async(sb->s_export_op))
+			if (nfsd4_has_session(cstate))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_WRITE_LT:
@@ -7536,7 +7532,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	 * for file locks), so don't attempt blocking lock notifications
 	 * on those filesystems:
 	 */
-	if (!exportfs_lock_op_is_async(sb->s_export_op))
+	if (nf->nf_file->f_op->lock)
 		fl_flags &= ~FL_SLEEP;

 	nbl = find_or_allocate_block(lock_sop, &fp->fi_fhandle, nn);
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 6525f4b7eb97..218fc5c54e90 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -222,23 +222,9 @@ struct export_operations {
 						  atomic attribute updates
 						*/
 #define EXPORT_OP_FLUSH_ON_CLOSE	(0x20) /* fs flushes file data on close */
-#define EXPORT_OP_ASYNC_LOCK		(0x40) /* fs can do async lock request */
 	unsigned long	flags;
 };

-/**
- * exportfs_lock_op_is_async() - export op supports async lock operation
- * @export_ops:	the nfs export operations to check
- *
- * Returns true if the nfs export_operations structure has
- * EXPORT_OP_ASYNC_LOCK in their flags set
- */
-static inline bool
-exportfs_lock_op_is_async(const struct export_operations *export_ops)
-{
-	return export_ops->flags & EXPORT_OP_ASYNC_LOCK;
-}
-
 extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
 				    int *max_len, struct inode *parent);
 extern int exportfs_encode_fh(struct dentry *dentry, struct fid *fid,
-- 
2.44.0
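
The functional change in the lockd hunk above is the test that decides how a blocking request is handled: rather than consulting an export-operations flag, the code checks whether the file has its own `->lock()` method. The following is a minimal user-space sketch of that decision only. The types and names (`demo_file_ops`, `demo_lock_mode`, `demo_pick_mode`, `demo_fs_lock`) are hypothetical stand-ins rather than kernel symbols, and the initial value of `async_block` is assumed to be 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_operations; only ->lock matters here. */
struct demo_file_ops {
	int (*lock)(void);
};

/* The two locals that the patched check in nlmsvc_lock() modifies. */
struct demo_lock_mode {
	int wait;
	int async_block;
};

/* Dummy ->lock() for a filesystem that implements its own locking. */
static int demo_fs_lock(void)
{
	return 0;
}

static const struct demo_file_ops demo_ops_own_lock = { .lock = demo_fs_lock };
static const struct demo_file_ops demo_ops_no_lock = { .lock = NULL };

/*
 * Mirrors the patched condition: if the file has its own ->lock(),
 * remember the caller's wait request in async_block and clear wait.
 */
static struct demo_lock_mode demo_pick_mode(const struct demo_file_ops *f_op,
					    int wait)
{
	struct demo_lock_mode m = { .wait = wait, .async_block = 0 };

	if (f_op->lock) {
		m.async_block = wait;
		m.wait = 0;
	}
	return m;
}

/* Self-check; call it from main() to exercise both branches. */
void demo_lock_check(void)
{
	/* Filesystem with its own ->lock(): the wait request is deferred. */
	assert(demo_pick_mode(&demo_ops_own_lock, 1).wait == 0);
	assert(demo_pick_mode(&demo_ops_own_lock, 1).async_block == 1);

	/* No ->lock(): the wait request is passed through unchanged. */
	assert(demo_pick_mode(&demo_ops_no_lock, 1).wait == 1);
	assert(demo_pick_mode(&demo_ops_no_lock, 1).async_block == 0);
}
```

The sketch leaves out everything else `nlmsvc_lock()` does; it only shows that the outcome now depends on the file's operations table, not on a per-export flag.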


[PATCH v2] rust: macros: fix soundness issue in `module!` macro

by Benno Lossin

The `module!` macro creates glue code that are called by C to initialize
the Rust modules using the `Module::init` function. Part of this glue
code are the local functions `__init` and `__exit` that are used to
initialize/destroy the Rust module.
These functions are safe and also visible to the Rust mod in which the
`module!` macro is invoked. This means that they can be called by other
safe Rust code. But since they contain `unsafe` blocks that rely on only
being called at the right time, this is a soundness issue.

Wrap these generated functions inside of two private modules, this
guarantees that the public functions cannot be called from the outside.
Make the safe functions `unsafe` and add SAFETY comments.

Cc: stable(a)vger.kernel.org
Closes: https://github.com/Rust-for-Linux/linux/issues/629
Fixes: 1fbde52bde73 ("rust: add `macros` crate")
Signed-off-by: Benno Lossin <benno.lossin(a)proton.me>
---
v1: https://lore.kernel.org/rust-for-linux/20240327160346.22442-1-benno.lossin@…
v1 -> v2:
- wrapped `__init` and `__exit` calls in `unsafe` blocks and added SAFETY
  comments,
- fixed safety requirement on `__exit` and `__init`,
- rebased onto rust-next.

 rust/macros/module.rs | 213 +++++++++++++++++++++++++-----------------
 1 file changed, 127 insertions(+), 86 deletions(-)

diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 27979e582e4b..293beca0a583 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -199,103 +199,144 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
             /// Used by the printing macros, e.g. [`info!`].
             const __LOG_PREFIX: &[u8] = b\"{name}\\0\";

-            /// The \"Rust loadable module\" mark.
-            //
-            // This may be best done another way later on, e.g. as a new modinfo
-            // key or a new section. For the moment, keep it simple.
-            #[cfg(MODULE)]
-            #[doc(hidden)]
-            #[used]
-            static __IS_RUST_MODULE: () = ();
-
-            static mut __MOD: Option<{type_}> = None;
-
-            // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
-            // freed until the module is unloaded.
-            #[cfg(MODULE)]
-            static THIS_MODULE: kernel::ThisModule = unsafe {{
-                kernel::ThisModule::from_ptr(&kernel::bindings::__this_module as *const _ as *mut _)
-            }};
-            #[cfg(not(MODULE))]
-            static THIS_MODULE: kernel::ThisModule = unsafe {{
-                kernel::ThisModule::from_ptr(core::ptr::null_mut())
-            }};
-
-            // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
-            /// # Safety
-            ///
-            /// This function must not be called after module initialization, because it may be
-            /// freed after that completes.
-            #[cfg(MODULE)]
-            #[doc(hidden)]
-            #[no_mangle]
-            #[link_section = \".init.text\"]
-            pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
-                __init()
-            }}
+            // Double nested modules, since then nobody can access the public items inside.
+            mod __module_init {{
+                mod __module_init {{
+                    use super::super::{type_};
+
+                    /// The \"Rust loadable module\" mark.
+                    //
+                    // This may be best done another way later on, e.g. as a new modinfo
+                    // key or a new section. For the moment, keep it simple.
+                    #[cfg(MODULE)]
+                    #[doc(hidden)]
+                    #[used]
+                    static __IS_RUST_MODULE: () = ();
+
+                    static mut __MOD: Option<{type_}> = None;
+
+                    // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
+                    // freed until the module is unloaded.
+                    #[cfg(MODULE)]
+                    static THIS_MODULE: kernel::ThisModule = unsafe {{
+                        kernel::ThisModule::from_ptr(&kernel::bindings::__this_module as *const _ as *mut _)
+                    }};
+                    #[cfg(not(MODULE))]
+                    static THIS_MODULE: kernel::ThisModule = unsafe {{
+                        kernel::ThisModule::from_ptr(core::ptr::null_mut())
+                    }};
+
+                    // Loadable modules need to export the `{{init,cleanup}}_module` identifiers.
+                    /// # Safety
+                    ///
+                    /// This function must not be called after module initialization, because it may be
+                    /// freed after that completes.
+                    #[cfg(MODULE)]
+                    #[doc(hidden)]
+                    #[no_mangle]
+                    #[link_section = \".init.text\"]
+                    pub unsafe extern \"C\" fn init_module() -> core::ffi::c_int {{
+                        // SAFETY: this function is inaccessible to the outside due to the double
+                        // module wrapping it. It is called exactly once by the C side via its
+                        // unique name.
+                        unsafe {{ __init() }}
+                    }}

-            #[cfg(MODULE)]
-            #[doc(hidden)]
-            #[no_mangle]
-            pub extern \"C\" fn cleanup_module() {{
-                __exit()
-            }}
+                    #[cfg(MODULE)]
+                    #[doc(hidden)]
+                    #[no_mangle]
+                    pub extern \"C\" fn cleanup_module() {{
+                        // SAFETY:
+                        // - this function is inaccessible to the outside due to the double
+                        //   module wrapping it. It is called exactly once by the C side via its
+                        //   unique name,
+                        // - furthermore it is only called after `init_module` has returned `0`
+                        //   (which delegates to `__init`).
+                        unsafe {{ __exit() }}
+                    }}

-            // Built-in modules are initialized through an initcall pointer
-            // and the identifiers need to be unique.
-            #[cfg(not(MODULE))]
-            #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
-            #[doc(hidden)]
-            #[link_section = \"{initcall_section}\"]
-            #[used]
-            pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
-
-            #[cfg(not(MODULE))]
-            #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
-            core::arch::global_asm!(
-                r#\".section \"{initcall_section}\", \"a\"
-                __{name}_initcall:
-                    .long   __{name}_init - .
-                    .previous
-                \"#
-            );
+                    // Built-in modules are initialized through an initcall pointer
+                    // and the identifiers need to be unique.
+                    #[cfg(not(MODULE))]
+                    #[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
+                    #[doc(hidden)]
+                    #[link_section = \"{initcall_section}\"]
+                    #[used]
+                    pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
+
+                    #[cfg(not(MODULE))]
+                    #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
+                    core::arch::global_asm!(
+                        r#\".section \"{initcall_section}\", \"a\"
+                        __{name}_initcall:
+                            .long   __{name}_init - .
+                            .previous
+                        \"#
+                    );
+
+                    #[cfg(not(MODULE))]
+                    #[doc(hidden)]
+                    #[no_mangle]
+                    pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
+                        // SAFETY: this function is inaccessible to the outside due to the double
+                        // module wrapping it. It is called exactly once by the C side via its
+                        // placement above in the initcall section.
+                        unsafe {{ __init() }}
+                    }}

-            #[cfg(not(MODULE))]
-            #[doc(hidden)]
-            #[no_mangle]
-            pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
-                __init()
-            }}
+                    #[cfg(not(MODULE))]
+                    #[doc(hidden)]
+                    #[no_mangle]
+                    pub extern \"C\" fn __{name}_exit() {{
+                        // SAFETY:
+                        // - this function is inaccessible to the outside due to the double
+                        //   module wrapping it. It is called exactly once by the C side via its
+                        //   unique name,
+                        // - furthermore it is only called after `__{name}_init` has returned `0`
+                        //   (which delegates to `__init`).
+                        unsafe {{ __exit() }}
+                    }}

-            #[cfg(not(MODULE))]
-            #[doc(hidden)]
-            #[no_mangle]
-            pub extern \"C\" fn __{name}_exit() {{
-                __exit()
-            }}
+                    /// # Safety
+                    ///
+                    /// This function must only be called once.
+                    unsafe fn __init() -> core::ffi::c_int {{
+                        match <{type_} as kernel::Module>::init(&THIS_MODULE) {{
+                            Ok(m) => {{
+                                // SAFETY:
+                                // no data race, since `__MOD` can only be accessed by this module and
+                                // there only `__init` and `__exit` access it. These functions are only
+                                // called once and `__exit` cannot be called before or during `__init`.
+                                unsafe {{
+                                    __MOD = Some(m);
+                                }}
+                                return 0;
+                            }}
+                            Err(e) => {{
+                                return e.to_errno();
+                            }}
+                        }}
+                    }}

-            fn __init() -> core::ffi::c_int {{
-                match <{type_} as kernel::Module>::init(&THIS_MODULE) {{
-                    Ok(m) => {{
+                    /// # Safety
+                    ///
+                    /// This function must
+                    /// - only be called once,
+                    /// - be called after `__init` has been called and returned `0`.
+                    unsafe fn __exit() {{
+                        // SAFETY:
+                        // no data race, since `__MOD` can only be accessed by this module and there
+                        // only `__init` and `__exit` access it. These functions are only called once
+                        // and `__init` was already called.
                         unsafe {{
-                            __MOD = Some(m);
+                            // Invokes `drop()` on `__MOD`, which should be used for cleanup.
+                            __MOD = None;
                         }}
-                        return 0;
                     }}
-                    Err(e) => {{
-                        return e.to_errno();
-                    }}
-                }}
-            }}

-            fn __exit() {{
-                unsafe {{
-                    // Invokes `drop()` on `__MOD`, which should be used for cleanup.
-                    __MOD = None;
+                    {modinfo}
                 }}
             }}
-
-            {modinfo}
             ",
             type_ = info.type_,
             name = info.name,

base-commit: 9ffe2a730313f27cebd0859ea856247ac59c576c
-- 
2.44.0


[PATCH V2 1/8] clk: qcom: clk-alpha-pll: Fix CAL_L_VAL override for LUCID EVO PLL

by Ajit Pandey

In LUCID EVO PLL CAL_L_VAL and L_VAL bitfields are part of single
PLL_L_VAL register. Update for L_VAL bitfield values in PLL_L_VAL
register using regmap_write() API in __alpha_pll_trion_set_rate
callback will override LUCID EVO PLL initial configuration related
to PLL_CAL_L_VAL bit fields in PLL_L_VAL register.

Observed random PLL lock failures during PLL enable due to such
override in PLL calibration value. Use regmap_update_bits() with
L_VAL bitfield mask instead of regmap_write() API to update only
PLL_L_VAL bitfields in __alpha_pll_trion_set_rate callback.

Fixes: 260e36606a03 ("clk: qcom: clk-alpha-pll: add Lucid EVO PLL configuration interfaces")
Signed-off-by: Ajit Pandey <quic_ajipan(a)quicinc.com>
Cc: stable(a)vger.kernel.org
---
 drivers/clk/qcom/clk-alpha-pll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c
index 8a412ef47e16..81cabd28eabe 100644
--- a/drivers/clk/qcom/clk-alpha-pll.c
+++ b/drivers/clk/qcom/clk-alpha-pll.c
@@ -1656,7 +1656,7 @@ static int __alpha_pll_trion_set_rate(struct clk_hw *hw, unsigned long rate,
 	if (ret < 0)
 		return ret;

-	regmap_write(pll->clkr.regmap, PLL_L_VAL(pll), l);
+	regmap_update_bits(pll->clkr.regmap, PLL_L_VAL(pll), LUCID_EVO_PLL_L_VAL_MASK, l);
 	regmap_write(pll->clkr.regmap, PLL_ALPHA_VAL(pll), a);

 	/* Latch the PLL input */
-- 
2.25.1
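
The difference between a full register write and a masked update, as described in the commit message above, can be illustrated in plain C. This is a rough user-space sketch and not the driver code: the `demo_*` helpers are hypothetical stand-ins that only model the semantics of `regmap_write()` and `regmap_update_bits()`, and the field layout (L_VAL in bits 15:0, CAL_L_VAL in bits 31:16) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: L_VAL in bits 15:0, CAL_L_VAL in 31:16. */
#define DEMO_L_VAL_MASK		0x0000ffffu
#define DEMO_CAL_L_VAL_SHIFT	16

/* Models regmap_write(): the whole register value is replaced. */
static void demo_reg_write(uint32_t *reg, uint32_t val)
{
	*reg = val;
}

/* Models regmap_update_bits(): only the bits selected by mask change. */
static void demo_reg_update_bits(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | (val & mask);
}

/* Builds a register value from a calibration value and an L value. */
static uint32_t demo_pll_l_reg(uint32_t cal_l, uint32_t l)
{
	return (cal_l << DEMO_CAL_L_VAL_SHIFT) | (l & DEMO_L_VAL_MASK);
}

/* Extracts the calibration field from a register value. */
static uint32_t demo_cal_l(uint32_t reg)
{
	return reg >> DEMO_CAL_L_VAL_SHIFT;
}

/* Self-check; call it from main() to compare both behaviours. */
void demo_pll_check(void)
{
	uint32_t reg = demo_pll_l_reg(0x44, 0x30);

	/* Old behaviour: writing the new L value wipes the calibration field. */
	demo_reg_write(&reg, 0x50);
	assert(demo_cal_l(reg) == 0);

	/* Patched behaviour: a masked update leaves the calibration field alone. */
	reg = demo_pll_l_reg(0x44, 0x30);
	demo_reg_update_bits(&reg, DEMO_L_VAL_MASK, 0x50);
	assert(demo_cal_l(reg) == 0x44);
	assert((reg & DEMO_L_VAL_MASK) == 0x50);
}
```

The values 0x44, 0x30 and 0x50 are arbitrary; any calibration value is lost by the full write and preserved by the masked update.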


[PATCH 6.8 000/172] 6.8.7-rc1 review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 6.8.7 release.
There are 172 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 17 Apr 2024 14:19:30 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.8.7-rc1.…
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.8.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 6.8.7-rc1 Fudongwang <fudong.wang(a)amd.com> drm/amd/display: fix disable otg wa logic in DCN316 Wenjing Liu <wenjing.liu(a)amd.com> drm/amd/display: always reset ODM mode in context when adding first plane Alex Hung <alex.hung(a)amd.com> drm/amd/display: Return max resolution supported by DWB Dillon Varone <dillon.varone(a)amd.com> drm/amd/display: Do not recursively call manual trigger programming Harry Wentland <harry.wentland(a)amd.com> drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST Harry Wentland <harry.wentland(a)amd.com> drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4 Yifan Zhang <yifan1.zhang(a)amd.com> drm/amdgpu: differentiate external rev id for gfx 11.5.0 Tim Huang <Tim.Huang(a)amd.com> drm/amdgpu: fix incorrect number of active RBs for gfx11 Alex Deucher <alexander.deucher(a)amd.com> drm/amdgpu: always force full reset for SOC21 Lijo Lazar <lijo.lazar(a)amd.com> drm/amdgpu: Reset dGPU if suspend got aborted Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/i915: Disable live M/N updates when using bigjoiner Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/i915: Disable port sync when bigjoiner is used Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/i915/psr: Disable PSR when
bigjoiner is used Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/i915/cdclk: Fix CDCLK programming order when pipes are active Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Replace CONFIG_SPECTRE_BHI_{ON,OFF} with CONFIG_MITIGATION_SPECTRE_BHI Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Clarify that syscall hardening isn't a BHI mitigation Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Fix BHI handling of RRSBA Ingo Molnar <mingo(a)kernel.org> x86/bugs: Rename various 'ia32_cap' variables to 'x86_arch_cap_msr' Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Cache the value of MSR_IA32_ARCH_CAPABILITIES Josh Poimboeuf <jpoimboe(a)kernel.org> x86/bugs: Fix BHI documentation Daniel Sneddon <daniel.sneddon(a)linux.intel.com> x86/bugs: Fix return type of spectre_bhi_state() Amir Goldstein <amir73il(a)gmail.com> kernfs: annotate different lockdep class for of->mutex of writable files Oleg Nesterov <oleg(a)redhat.com> selftests: kselftest: Fix build failure with NOLIBC Arnd Bergmann <arnd(a)arndb.de> irqflags: Explicitly ignore lockdep_hrtimer_exit() argument Adam Dunlap <acdunlap(a)google.com> x86/apic: Force native_apic_mem_read() to use the MOV instruction Nathan Chancellor <nathan(a)kernel.org> selftests: kselftest: Mark functions that unconditionally call exit() as __noreturn John Stultz <jstultz(a)google.com> selftests: timers: Fix abs() warning in posix_timers test John Stultz <jstultz(a)google.com> selftests: timers: Fix posix_timers ksft_print_msg() warning Oleg Nesterov <oleg(a)redhat.com> selftests/timers/posix_timers: Reimplement check_timer_distribution() Sean Christopherson <seanjc(a)google.com> x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n Namhyung Kim <namhyung(a)kernel.org> perf/x86: Fix out of range data Gavin Shan <gshan(a)redhat.com> vhost: Add smp_rmb() in vhost_enable_notify() Gavin Shan 
<gshan(a)redhat.com> vhost: Add smp_rmb() in vhost_vq_avail_empty() Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-dma: fix spi lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-lsio: fix pwm lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-dma: fix pwm lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-conn: fix usb lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-dma: fix adc lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-dma: fix can lpcg indices Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8qm-ss-dma: fix can lpcg indices Lang Yu <Lang.Yu(a)amd.com> drm/amdgpu/umsch: reinitialize write pointer in hw init Johan Hovold <johan+linaro(a)kernel.org> drm/msm/dp: fix runtime PM leak on connect failure Johan Hovold <johan+linaro(a)kernel.org> drm/msm/dp: fix runtime PM leak on disconnect Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/client: Fully protect modes[] with dev->mode_config.mutex Boris Brezillon <boris.brezillon(a)collabora.com> drm/panfrost: Fix the error path in panfrost_mmu_map_fault_addr() Jammy Huang <jammy_huang(a)aspeedtech.com> drm/ast: Fix soft lockup Harish Kasiviswanathan <Harish.Kasiviswanathan(a)amd.com> drm/amdkfd: Reset GPU on queue preemption failure Ville Syrjälä <ville.syrjala(a)linux.intel.com> drm/i915/vrr: Disable VRR when using bigjoiner Zack Rusin <zack.rusin(a)broadcom.com> drm/vmwgfx: Enable DMA mappings with SEV Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com> accel/ivpu: Fix deadlock in context_xa Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com> accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE Jacek Lawrynowicz <jacek.lawrynowicz(a)linux.intel.com> accel/ivpu: Put NPU back to D3hot after failed resume Wachowski, Karol <karol.wachowski(a)intel.com> accel/ivpu: Fix PCI D0 state entry in resume Wachowski, Karol <karol.wachowski(a)intel.com> accel/ivpu: Check return code of ipc->lock init Alexander Wetzel 
<Alexander(a)wetzel-home.de> scsi: sg: Avoid race in error handling & drop bogus warn Alexander Wetzel <Alexander(a)wetzel-home.de> scsi: sg: Avoid sg device teardown race Masami Hiramatsu <mhiramat(a)kernel.org> fs/proc: Skip bootloader comment if no embedded kernel parameters Zhenhua Huang <quic_zhenhuah(a)quicinc.com> fs/proc: remove redundant comments from /proc/bootconfig Zheng Yejian <zhengyejian1(a)huawei.com> kprobes: Fix possible use-after-free issue on kprobe registration Pavel Begunkov <asml.silence(a)gmail.com> io_uring/net: restore msg_control on sendzc retry Boris Burkov <boris(a)bur.io> btrfs: qgroup: convert PREALLOC to PERTRANS after record_root_in_trans Boris Burkov <boris(a)bur.io> btrfs: record delayed inode root in transaction Boris Burkov <boris(a)bur.io> btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations Boris Burkov <boris(a)bur.io> btrfs: qgroup: correctly model root qgroup rsv in convert Jens Axboe <axboe(a)kernel.dk> io_uring: disable io-wq execution of multishot NOWAIT requests Pavel Begunkov <asml.silence(a)gmail.com> io_uring: refactor DEFER_TASKRUN multishot checks Lu Baolu <baolu.lu(a)linux.intel.com> iommu/vt-d: Fix WARN_ON in iommu probe path Jacob Pan <jacob.jun.pan(a)linux.intel.com> iommu/vt-d: Allocate local memory for page request queue Xuchun Shang <xuchun.shang(a)linux.alibaba.com> iommu/vt-d: Fix wrong use of pasid config Arnd Bergmann <arnd(a)arndb.de> tracing: hide unused ftrace_event_id_fops Karthik Poosa <karthik.poosa(a)intel.com> drm/xe/hwmon: Cast result to output precision on left shift of operand Lucas De Marchi <lucas.demarchi(a)intel.com> drm/xe/display: Fix double mutex initialization David Arinzon <darinzon(a)amazon.com> net: ena: Set tx_info->xdpf value to NULL David Arinzon <darinzon(a)amazon.com> net: ena: Fix incorrect descriptor free behavior David Arinzon <darinzon(a)amazon.com> net: ena: Wrong missing IO completions check order David Arinzon <darinzon(a)amazon.com> net: ena: Fix 
potential sign extension issue Michal Luczaj <mhal(a)rbox.co> af_unix: Fix garbage collector racing against connect() Kuniyuki Iwashima <kuniyu(a)amazon.com> af_unix: Do not use atomic ops for unix_sk(sk)->inflight. Arınç ÜNAL <arinc.unal(a)arinc9.com> net: dsa: mt7530: trap link-local frames regardless of ST Port State Gerd Bayer <gbayer(a)linux.ibm.com> Revert "s390/ism: fix receive message buffer allocation" Daniel Machon <daniel.machon(a)microchip.com> net: sparx5: fix wrong config being used when reconfiguring PCS Rahul Rameshbabu <rrameshbabu(a)nvidia.com> net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit Carolina Jubran <cjubran(a)nvidia.com> net/mlx5e: HTB, Fix inconsistencies with QoS SQs number Carolina Jubran <cjubran(a)nvidia.com> net/mlx5e: Fix mlx5e_priv_init() cleanup flow Carolina Jubran <cjubran(a)nvidia.com> net/mlx5e: RSS, Block changing channels number when RXFH is configured Cosmin Ratiu <cratiu(a)nvidia.com> net/mlx5: Correctly compare pkt reformat ids Cosmin Ratiu <cratiu(a)nvidia.com> net/mlx5: Properly link new fs rules into the tree Michael Liang <mliang(a)purestorage.com> net/mlx5: offset comp irq index in name by one Shay Drory <shayd(a)nvidia.com> net/mlx5: Register devlink first under devlink lock Moshe Shemesh <moshe(a)nvidia.com> net/mlx5: SF, Stop waiting for FW as teardown was called Eric Dumazet <edumazet(a)google.com> netfilter: complete validation of user input Archie Pusaka <apusaka(a)chromium.org> Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: hci_sock: Fix not validating setsockopt user input Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: ISO: Fix not validating setsockopt user input Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: L2CAP: Fix not validating setsockopt user input Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: RFCOMM: Fix not validating setsockopt user input 
Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: SCO: Fix not validating setsockopt user input Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: hci_sync: Fix using the same interval and window for Coded PHY Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: hci_sync: Use QoS to determine which PHY to scan Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unset Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com> Bluetooth: ISO: Align broadcast sync_timeout with connection timeout Brett Creeley <brett.creeley(a)amd.com> pds_core: Fix pdsc_check_pci_health function to use work thread Shannon Nelson <shannon.nelson(a)amd.com> pds_core: use pci_reset_function for health reset Jiri Benc <jbenc(a)redhat.com> ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr Arnd Bergmann <arnd(a)arndb.de> ipv4/route: avoid unused-but-set-variable warning Arnd Bergmann <arnd(a)arndb.de> ipv6: fib: hide unused 'pn' variable Geetha sowjanya <gakula(a)marvell.com> octeontx2-af: Fix NIX SQ mode and BP config Kuniyuki Iwashima <kuniyu(a)amazon.com> af_unix: Clear stale u->oob_skb. 
Marek Vasut <marex(a)denx.de> net: ks8851: Handle softirqs at the end of IRQ thread to fix hang Marek Vasut <marex(a)denx.de> net: ks8851: Inline ks8851_rx_skb() Dave Jiang <dave.jiang(a)intel.com> cxl: Fix retrieving of access_coordinates in PCIe path Dave Jiang <dave.jiang(a)intel.com> cxl: Remove checking of iter in cxl_endpoint_get_perf_coordinates() Dave Jiang <dave.jiang(a)intel.com> cxl: Split out host bridge access coordinates Dave Jiang <dave.jiang(a)intel.com> cxl: Split out combine_coordinates() for common shared usage Dave Jiang <dave.jiang(a)intel.com> ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access classes Dave Jiang <dave.jiang(a)intel.com> ACPI: HMAT: Introduce 2 levels of generic port access class Dave Jiang <dave.jiang(a)intel.com> base/node / ACPI: Enumerate node access class for 'struct access_coordinate' Raag Jadav <raag.jadav(a)intel.com> ACPI: bus: allow _UID matching for integer zero Pavan Chebbi <pavan.chebbi(a)broadcom.com> bnxt_en: Reset PTP tx_avail after possible firmware reset Vikas Gupta <vikas.gupta(a)broadcom.com> bnxt_en: Fix error recovery for RoCE ulp client Vikas Gupta <vikas.gupta(a)broadcom.com> bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init() Gerd Bayer <gbayer(a)linux.ibm.com> s390/ism: fix receive message buffer allocation Eric Dumazet <edumazet(a)google.com> geneve: fix header validation in geneve[6]_xmit_skb Arnd Bergmann <arnd(a)arndb.de> lib: checksum: hide unused expected_csum_ipv6_magic[] Ming Lei <ming.lei(a)redhat.com> block: fix q->blkg_list corruption during disk rebind Hariprasad Kelam <hkelam(a)marvell.com> octeontx2-pf: Fix transmit scheduler resource leak Eric Dumazet <edumazet(a)google.com> xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING Petr Tesarik <petr(a)tesarici.cz> u64_stats: fix u64_stats_init() for lockdep when used repeatedly in one file Ilya Maximets <i.maximets(a)ovn.org> net: openvswitch: fix unwanted error log on timeout policy probing Dan 
Carpenter <dan.carpenter(a)linaro.org> scsi: qla2xxx: Fix off by one in qla_edif_app_getstats() Xiang Chen <chenxiang66(a)hisilicon.com> scsi: hisi_sas: Modify the deadline for ata_wait_after_reset() Luca Weiss <luca.weiss(a)fairphone.com> drm/msm/adreno: Set highest_bank_bit for A619 Arnd Bergmann <arnd(a)arndb.de> nouveau: fix function cast warning Alex Constantino <dreaming.about.electric.sheep(a)gmail.com> Revert "drm/qxl: simplify qxl_fence_wait" Kwangjin Ko <kwangjin.ko(a)sk.com> cxl/core: Fix initialization of mbox_cmd.size_out in get event Frank Li <Frank.Li(a)nxp.com> arm64: dts: imx8-ss-conn: fix usdhc wrong lpcg clock order Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org> dt-bindings: display/msm: sm8150-mdss: add DP node Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org> drm/msm/dpu: make error messages at dpu_core_irq_register_callback() more sensible Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org> drm/msm/dpu: don't allow overriding data from catalog Stephen Boyd <swboyd(a)chromium.org> drm/msm: Add newlines to some debug prints Tim Harvey <tharvey(a)gateworks.com> arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix USB vbus regulator Tim Harvey <tharvey(a)gateworks.com> arm64: dts: freescale: imx8mp-venice-gw72xx-2x: fix USB vbus regulator Dave Jiang <dave.jiang(a)intel.com> cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before assigned Yuquan Wang <wangyuquan1236(a)phytium.com.cn> cxl/mem: Fix for the index of Clear Event Record Handle Cristian Marussi <cristian.marussi(a)arm.com> firmware: arm_scmi: Make raw debugfs entries non-seekable Jens Wiklander <jens.wiklander(a)linaro.org> firmware: arm_ffa: Fix the partition ID check in ffa_notification_info_get() Aaro Koskinen <aaro.koskinen(a)iki.fi> ARM: OMAP2+: fix USB regression on Nokia N8x0 Aaro Koskinen <aaro.koskinen(a)iki.fi> mmc: omap: restore original power up/down steps Aaro Koskinen <aaro.koskinen(a)iki.fi> mmc: omap: fix deferred probe Aaro Koskinen 
<aaro.koskinen(a)iki.fi> mmc: omap: fix broken slot switch lookup Aaro Koskinen <aaro.koskinen(a)iki.fi> ARM: OMAP2+: fix N810 MMC gpiod table Aaro Koskinen <aaro.koskinen(a)iki.fi> ARM: OMAP2+: fix bogus MMC GPIO labels on Nokia N8x0 David Sterba <dsterba(a)suse.com> btrfs: tests: allocate dummy fs_info and root in test_find_delalloc() Nini Song <nini.song(a)mediatek.com> media: cec: core: remove length check of Timer Status Anna-Maria Behnsen <anna-maria(a)linutronix.de> PM: s2idle: Make sure CPUs will wakeup directly on resume Hans de Goede <hdegoede(a)redhat.com> ACPI: scan: Do not increase dep_unmet for already met dependencies Noah Loomans <noah(a)noahloomans.com> platform/chrome: cros_ec_uart: properly fix race condition Tim Huang <Tim.Huang(a)amd.com> drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11 Dmitry Antipov <dmantipov(a)yandex.ru> Bluetooth: Fix memory leak in hci_req_sync_complete() Steven Rostedt (Google) <rostedt(a)goodmis.org> ring-buffer: Only update pages_touched when a new page is touched Yu Kuai <yukuai3(a)huawei.com> raid1: fix use-after-free for original bio in raid1_write_request() Fabio Estevam <festevam(a)denx.de> ARM: dts: imx7s-warp: Pass OV2680 link-frequencies Gavin Shan <gshan(a)redhat.com> arm64: tlb: Fix TLBI RANGE operand Breno Leitao <leitao(a)debian.org> virtio_net: Do not send RSS key if it is not supported Xiubo Li <xiubli(a)redhat.com> ceph: switch to use cap_delay_lock for the unlink delay list NeilBrown <neilb(a)suse.de> ceph: redirty page before returning AOP_WRITEPAGE_ACTIVATE Sven Eckelmann <sven(a)narfation.org> batman-adv: Avoid infinite loop trying to resize local TT Peyton Lee <peytolee(a)amd.com> drm/amdgpu/vpe: power on vpe when hw_init Damien Le Moal <dlemoal(a)kernel.org> ata: libata-scsi: Fix ata_scsi_dev_rescan() error path Igor Pylypiv <ipylypiv(a)google.com> ata: libata-core: Allow command duration limits detection for ACS-4 drives Steve French <stfrench(a)microsoft.com> smb3: fix Open files on 
server counter going negative ------------- Diffstat: Documentation/admin-guide/hw-vuln/spectre.rst | 22 +- Documentation/admin-guide/kernel-parameters.txt | 12 +- .../bindings/display/msm/qcom,sm8150-mdss.yaml | 9 + Makefile | 4 +- arch/arm/boot/dts/nxp/imx/imx7s-warp.dts | 1 + arch/arm/mach-omap2/board-n8x0.c | 23 +-- arch/arm64/boot/dts/freescale/imx8-ss-conn.dtsi | 16 +- arch/arm64/boot/dts/freescale/imx8-ss-dma.dtsi | 40 ++-- arch/arm64/boot/dts/freescale/imx8-ss-lsio.dtsi | 16 +- .../boot/dts/freescale/imx8mp-venice-gw72xx.dtsi | 2 +- .../boot/dts/freescale/imx8mp-venice-gw73xx.dtsi | 2 +- arch/arm64/boot/dts/freescale/imx8qm-ss-dma.dtsi | 8 +- arch/arm64/include/asm/tlbflush.h | 20 +- arch/x86/Kconfig | 21 +- arch/x86/events/core.c | 1 + arch/x86/include/asm/apic.h | 3 +- arch/x86/kernel/apic/apic.c | 6 +- arch/x86/kernel/cpu/bugs.c | 82 ++++---- arch/x86/kernel/cpu/common.c | 48 ++--- block/blk-cgroup.c | 9 +- block/blk-cgroup.h | 2 + block/blk-core.c | 2 + drivers/accel/ivpu/ivpu_drv.c | 20 +- drivers/accel/ivpu/ivpu_hw.h | 6 + drivers/accel/ivpu/ivpu_hw_37xx.c | 7 +- drivers/accel/ivpu/ivpu_hw_40xx.c | 6 + drivers/accel/ivpu/ivpu_ipc.c | 8 +- drivers/accel/ivpu/ivpu_pm.c | 7 +- drivers/acpi/numa/hmat.c | 43 ++-- drivers/acpi/scan.c | 3 +- drivers/ata/libata-core.c | 2 +- drivers/ata/libata-scsi.c | 9 +- drivers/base/node.c | 6 +- drivers/cxl/acpi.c | 8 +- drivers/cxl/core/cdat.c | 58 ++++-- drivers/cxl/core/mbox.c | 5 +- drivers/cxl/core/port.c | 76 ++++--- drivers/cxl/core/regs.c | 5 +- drivers/cxl/cxl.h | 8 +- drivers/firmware/arm_ffa/driver.c | 2 +- drivers/firmware/arm_scmi/raw_mode.c | 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 6 + drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/soc21.c | 32 ++- drivers/gpu/drm/amd/amdgpu/umsch_mm_v4_0.c | 2 + .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 + drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_wb.c | 6 +- 
.../amd/display/dc/clk_mgr/dcn316/dcn316_clk_mgr.c | 19 +- drivers/gpu/drm/amd/display/dc/core/dc_state.c | 9 + .../gpu/drm/amd/display/dc/optc/dcn32/dcn32_optc.c | 3 - .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c | 12 +- drivers/gpu/drm/ast/ast_dp.c | 3 + drivers/gpu/drm/drm_client_modeset.c | 3 +- drivers/gpu/drm/i915/display/intel_cdclk.c | 7 +- drivers/gpu/drm/i915/display/intel_cdclk.h | 3 + drivers/gpu/drm/i915/display/intel_ddi.c | 5 + drivers/gpu/drm/i915/display/intel_dp.c | 6 +- drivers/gpu/drm/i915/display/intel_psr.c | 11 + drivers/gpu/drm/i915/display/intel_vrr.c | 7 + drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 + drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 10 +- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 8 +- drivers/gpu/drm/msm/dp/dp_display.c | 2 + drivers/gpu/drm/msm/msm_fb.c | 6 +- drivers/gpu/drm/msm/msm_kms.c | 4 +- .../gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c | 7 +- drivers/gpu/drm/panfrost/panfrost_mmu.c | 13 +- drivers/gpu/drm/qxl/qxl_release.c | 50 ++++- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 11 +- drivers/gpu/drm/xe/xe_display.c | 5 - drivers/gpu/drm/xe/xe_hwmon.c | 4 +- drivers/iommu/intel/iommu.c | 11 +- drivers/iommu/intel/perfmon.c | 2 +- drivers/iommu/intel/svm.c | 2 +- drivers/md/raid1.c | 2 +- drivers/media/cec/core/cec-adap.c | 14 -- drivers/mmc/host/omap.c | 48 +++-- drivers/net/dsa/mt7530.c | 229 ++++++++++++++++++--- drivers/net/dsa/mt7530.h | 5 + drivers/net/ethernet/amazon/ena/ena_com.c | 2 +- drivers/net/ethernet/amazon/ena/ena_netdev.c | 35 ++-- drivers/net/ethernet/amazon/ena/ena_xdp.c | 4 +- drivers/net/ethernet/amd/pds_core/core.c | 14 +- drivers/net/ethernet/amd/pds_core/core.h | 5 +- drivers/net/ethernet/amd/pds_core/dev.c | 3 + drivers/net/ethernet/amd/pds_core/main.c | 8 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 + drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 6 +- .../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 22 +- drivers/net/ethernet/marvell/octeontx2/nic/qos.c | 1 + 
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h | 8 +- drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 33 +-- drivers/net/ethernet/mellanox/mlx5/core/en/selq.c | 2 + .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 17 ++ drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 - drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 7 +- drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 17 +- drivers/net/ethernet/mellanox/mlx5/core/main.c | 37 ++-- drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 4 +- .../ethernet/mellanox/mlx5/core/sf/dev/driver.c | 22 +- drivers/net/ethernet/micrel/ks8851.h | 3 - drivers/net/ethernet/micrel/ks8851_common.c | 16 +- drivers/net/ethernet/micrel/ks8851_par.c | 11 - drivers/net/ethernet/micrel/ks8851_spi.c | 11 - .../net/ethernet/microchip/sparx5/sparx5_port.c | 4 +- drivers/net/geneve.c | 4 +- drivers/net/virtio_net.c | 28 ++- drivers/platform/chrome/cros_ec_uart.c | 28 +-- drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +- drivers/scsi/qla2xxx/qla_edif.c | 2 +- drivers/scsi/sg.c | 20 +- drivers/vhost/vhost.c | 28 ++- fs/btrfs/delayed-inode.c | 3 + fs/btrfs/inode.c | 13 +- fs/btrfs/ioctl.c | 37 +++- fs/btrfs/qgroup.c | 2 + fs/btrfs/root-tree.c | 10 - fs/btrfs/root-tree.h | 2 - fs/btrfs/tests/extent-io-tests.c | 28 ++- fs/btrfs/transaction.c | 17 +- fs/ceph/addr.c | 4 +- fs/ceph/caps.c | 4 +- fs/ceph/mds_client.c | 9 +- fs/ceph/mds_client.h | 3 +- fs/kernfs/file.c | 9 +- fs/proc/bootconfig.c | 12 +- fs/smb/client/cached_dir.c | 4 +- include/acpi/acpi_bus.h | 8 +- include/linux/bootconfig.h | 1 + include/linux/dma-fence.h | 7 + include/linux/irqflags.h | 2 +- include/linux/node.h | 18 +- include/linux/u64_stats_sync.h | 9 +- include/net/addrconf.h | 4 + include/net/af_unix.h | 2 +- include/net/bluetooth/bluetooth.h | 11 + include/net/ip_tunnels.h | 33 +++ init/main.c | 5 + io_uring/io_uring.c | 25 +++ io_uring/net.c | 22 +- io_uring/rw.c | 2 - kernel/cpu.c | 3 +- kernel/kprobes.c | 18 +- kernel/power/suspend.c | 6 + 
kernel/trace/ring_buffer.c | 6 +- kernel/trace/trace_events.c | 4 + lib/checksum_kunit.c | 5 +- net/batman-adv/translation-table.c | 2 +- net/bluetooth/hci_request.c | 4 +- net/bluetooth/hci_sock.c | 21 +- net/bluetooth/hci_sync.c | 66 +++++- net/bluetooth/iso.c | 50 ++--- net/bluetooth/l2cap_core.c | 3 +- net/bluetooth/l2cap_sock.c | 52 ++--- net/bluetooth/rfcomm/sock.c | 14 +- net/bluetooth/sco.c | 23 +-- net/ipv4/netfilter/arp_tables.c | 4 + net/ipv4/netfilter/ip_tables.c | 4 + net/ipv4/route.c | 4 +- net/ipv6/addrconf.c | 7 +- net/ipv6/ip6_fib.c | 7 +- net/ipv6/netfilter/ip6_tables.c | 4 + net/openvswitch/conntrack.c | 5 +- net/unix/af_unix.c | 8 +- net/unix/garbage.c | 35 +++- net/unix/scm.c | 8 +- net/xdp/xsk.c | 2 + tools/testing/selftests/kselftest.h | 33 ++- tools/testing/selftests/timers/posix_timers.c | 105 +++++----- 170 files changed, 1559 insertions(+), 882 deletions(-)


[PATCH] block: Fix BLKRRPART regression

by Saranya Muruganandam

The BLKRRPART ioctl used to report errors such as EIO before we changed
the blkdev_reread_part() logic. Let's add a flag and capture the errors
returned by bdev_disk_changed() when the flag is set. Set this flag on
the BLKRRPART path, where we want errors to be reported when rereading
partitions on the disk.

Link: https://lore.kernel.org/all/20240320015134.GA14267@lst.de/
Suggested-by: Christoph Hellwig <hch(a)lst.de>
Tested: Tested by simulating failure to the block device and will
propose a new test to blktests.
Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part")
Reported-by: Saranya Muruganandam <saranyamohan(a)google.com>
Signed-off-by: Saranya Muruganandam <saranyamohan(a)google.com>
Change-Id: Idf3d97390ed78061556f8468d10d6cab24ae20b1
---
 block/bdev.c           | 31 +++++++++++++++++++++----------
 block/ioctl.c          |  3 ++-
 include/linux/blkdev.h |  3 +++
 3 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 77fa77cd29bee..71478f8865546 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -632,6 +632,14 @@ static void blkdev_flush_mapping(struct block_device *bdev)
 	bdev_write_inode(bdev);
 }
 
+static void blkdev_put_whole(struct block_device *bdev)
+{
+	if (atomic_dec_and_test(&bdev->bd_openers))
+		blkdev_flush_mapping(bdev);
+	if (bdev->bd_disk->fops->release)
+		bdev->bd_disk->fops->release(bdev->bd_disk);
+}
+
 static int blkdev_get_whole(struct block_device *bdev, blk_mode_t mode)
 {
 	struct gendisk *disk = bdev->bd_disk;
@@ -650,18 +658,21 @@ static int blkdev_get_whole(struct block_device *bdev, blk_mode_t mode)
 
 	if (!atomic_read(&bdev->bd_openers))
 		set_init_blocksize(bdev);
-	if (test_bit(GD_NEED_PART_SCAN, &disk->state))
-		bdev_disk_changed(disk, false);
 	atomic_inc(&bdev->bd_openers);
-	return 0;
-}
 
-static void blkdev_put_whole(struct block_device *bdev)
-{
-	if (atomic_dec_and_test(&bdev->bd_openers))
-		blkdev_flush_mapping(bdev);
-	if (bdev->bd_disk->fops->release)
-		bdev->bd_disk->fops->release(bdev->bd_disk);
+	if (test_bit(GD_NEED_PART_SCAN, &disk->state)) {
+		/*
+		 * Only return scanning errors if we are called from contexts
+		 * that explicitly want them, e.g. the BLKRRPART ioctl.
+		 */
+		ret = bdev_disk_changed(disk, false);
+		if (ret && (mode & BLK_OPEN_STRICT_SCAN)) {
+			blkdev_put_whole(bdev);
+			return ret;
+		}
+	}
+
+	return 0;
 }
 
 static int blkdev_get_part(struct block_device *part, blk_mode_t mode)
diff --git a/block/ioctl.c b/block/ioctl.c
index aa46f3761c3ed..e8d72d9f327fd 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -557,7 +557,8 @@ static int blkdev_common_ioctl(struct block_device *bdev, blk_mode_t mode,
 			return -EACCES;
 		if (bdev_is_partition(bdev))
 			return -EINVAL;
-		return disk_scan_partitions(bdev->bd_disk, mode);
+		return disk_scan_partitions(bdev->bd_disk,
+				mode | BLK_OPEN_STRICT_SCAN);
 	case BLKTRACESTART:
 	case BLKTRACESTOP:
 	case BLKTRACETEARDOWN:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 01983eece8f2a..d0104dc839b0d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -151,6 +151,9 @@ struct access_rules_head {
 	int max_rules;
 };
 
+/* return partition scanning errors */
+#define BLK_OPEN_STRICT_SCAN	((__force blk_mode_t)(1 << 5))
+
 struct gendisk {
 	/*
 	 * major/first_minor/minors should not be set by any new driver, the
-- 
2.44.0.478.gd926399ef9-goog
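The decision this patch adds to blkdev_get_whole() can be sketched in user space. This is only a rough model, not kernel code: `model_get_whole()`, `MODEL_OPEN_STRICT_SCAN`, and the `openers` counter are hypothetical stand-ins, and `scan_ret` plays the role of the value bdev_disk_changed() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the flag added by the patch. */
#define MODEL_OPEN_STRICT_SCAN (1u << 5)

/*
 * Model of a whole-disk open. scan_ret stands in for the result of the
 * partition rescan; *openers models the bd_openers reference count.
 */
static int model_get_whole(unsigned int mode, bool need_scan,
			   int scan_ret, int *openers)
{
	assert(openers != NULL);

	(*openers)++;	/* the open is counted before the scan runs */

	if (need_scan && scan_ret && (mode & MODEL_OPEN_STRICT_SCAN)) {
		(*openers)--;	/* undo the open before reporting failure */
		return scan_ret;
	}

	/* Without the strict flag a failed scan is silently ignored. */
	return 0;
}
```

With a failing scan, a plain open still succeeds and keeps its reference, while an open carrying the strict flag returns the scan error and leaves the count unchanged.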


[PATCH] mmc: sdhci-msm: prevent access to suspended controller

by Mantas Pucka

Generic sdhci code registers LED device and uses host->runtime_suspended
flag to protect access to it. The sdhci-msm driver doesn't set this flag,
which causes a crash when LED is accessed while controller is runtime
suspended. Fix this by setting the flag correctly.

Cc: stable(a)vger.kernel.org
Fixes: 67e6db113c90 ("mmc: sdhci-msm: Add pm_runtime and system PM support")
Signed-off-by: Mantas Pucka <mantas(a)8devices.com>
---
 drivers/mmc/host/sdhci-msm.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 668e0aceeeba..e113b99a3eab 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -2694,6 +2694,11 @@ static __maybe_unused int sdhci_msm_runtime_suspend(struct device *dev)
 	struct sdhci_host *host = dev_get_drvdata(dev);
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host);
+	unsigned long flags;
+
+	spin_lock_irqsave(&host->lock, flags);
+	host->runtime_suspended = true;
+	spin_unlock_irqrestore(&host->lock, flags);
 
 	/* Drop the performance vote */
 	dev_pm_opp_set_rate(dev, 0);
@@ -2708,6 +2713,7 @@ static __maybe_unused int sdhci_msm_runtime_resume(struct device *dev)
 	struct sdhci_host *host = dev_get_drvdata(dev);
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host);
+	unsigned long flags;
 	int ret;
 
 	ret = clk_bulk_prepare_enable(ARRAY_SIZE(msm_host->bulk_clks),
@@ -2726,7 +2732,15 @@ static __maybe_unused int sdhci_msm_runtime_resume(struct device *dev)
 
 	dev_pm_opp_set_rate(dev, msm_host->clk_rate);
 
-	return sdhci_msm_ice_resume(msm_host);
+	ret = sdhci_msm_ice_resume(msm_host);
+	if (ret)
+		return ret;
+
+	spin_lock_irqsave(&host->lock, flags);
+	host->runtime_suspended = false;
+	spin_unlock_irqrestore(&host->lock, flags);
+
+	return ret;
 }
 
 static const struct dev_pm_ops sdhci_msm_pm_ops = {
---
base-commit: e8f897f4afef0031fe618a8e94127a0934896aba
change-id: 20240321-sdhci-mmc-suspend-34f4af1d0286

Best regards,
-- 
Mantas Pucka <mantas(a)8devices.com>
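The guard the commit message describes can be pictured with a small user-space model. It is a sketch under assumptions, not the driver: `struct model_host`, the `clocks_on` field, and the `model_*` functions are hypothetical, and the host spinlock that the real code holds around the flag is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_host {
	bool runtime_suspended;	/* mirrors host->runtime_suspended */
	bool clocks_on;		/* hypothetical: registers reachable only when true */
	int led_writes;		/* counts simulated LED register writes */
};

/* Returns true if the (simulated) LED register was written. */
static bool model_led_control(struct model_host *h)
{
	assert(h != NULL);

	if (h->runtime_suspended)
		return false;	/* controller is powered down: skip the write */

	assert(h->clocks_on);	/* a write with clocks off models the crash */
	h->led_writes++;
	return true;
}

static void model_runtime_suspend(struct model_host *h)
{
	h->runtime_suspended = true;	/* set the flag before clocks go away */
	h->clocks_on = false;
}

static void model_runtime_resume(struct model_host *h)
{
	h->clocks_on = true;
	h->runtime_suspended = false;	/* clear only once hardware is back */
}
```

If `model_runtime_suspend()` forgot to set the flag, the second assertion in `model_led_control()` would fire, which is the model's analogue of the crash reported above.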


[PATCH] dm: Change the default value of rq_affinity from 0 into 1

by Bart Van Assche

The following behavior is inconsistent:
* For request-based dm queues the default value of rq_affinity is 1.
* For bio-based dm queues the default value of rq_affinity is 0.

The default value for request-based dm queues is 1 because of the following
code in blk_mq_init_allocated_queue():

    q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;

From <linux/blkdev.h>:

    #define QUEUE_FLAG_MQ_DEFAULT ((1UL << QUEUE_FLAG_IO_STAT) |   \
                                   (1UL << QUEUE_FLAG_SAME_COMP) | \
                                   (1UL << QUEUE_FLAG_NOWAIT))

The default value of rq_affinity for bio-based dm queues is 0 because the
dm alloc_dev() function does not set any of the QUEUE_FLAG_SAME_* flags. I
think the different default values are the result of an oversight when
blk-mq support was added in the device mapper code. Hence this patch that
changes the default value of rq_affinity from 0 to 1 for bio-based dm
queues.

This patch reduces the boot time from 12.23 to 12.20 seconds on my test
setup, a Pixel 2023 development board. The storage controller on that test
setup supports a single completion interrupt and hence benefits from
redirecting I/O completions to a CPU core that is closer to the submitter.

Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Eric Biggers <ebiggers(a)kernel.org>
Cc: Jaegeuk Kim <jaegeuk(a)kernel.org>
Cc: Daniel Lee <chullee(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: bfebd1cdb497 ("dm: add full blk-mq support to request-based DM")
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
 drivers/md/dm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 56aa2a8b9d71..9af216c11cf7 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2106,6 +2106,7 @@ static struct mapped_device *alloc_dev(int minor)
 	if (IS_ERR(md->disk))
 		goto bad;
 	md->queue = md->disk->queue;
+	blk_queue_flag_set(QUEUE_FLAG_SAME_COMP, md->queue);
 
 	init_waitqueue_head(&md->wait);
 	INIT_WORK(&md->work, dm_wq_work);
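The flag arithmetic behind the two defaults can be illustrated with a small user-space model. The bit positions and every `MODEL_*` / `model_*` name below are assumptions made for illustration and do not match the kernel's definitions; the real rq_affinity attribute can also report 2, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit positions only; the kernel's values differ. */
enum {
	MODEL_FLAG_IO_STAT,
	MODEL_FLAG_SAME_COMP,
	MODEL_FLAG_NOWAIT,
};

/* Same shape as QUEUE_FLAG_MQ_DEFAULT quoted in the commit message. */
#define MODEL_MQ_DEFAULT ((1UL << MODEL_FLAG_IO_STAT) |   \
			  (1UL << MODEL_FLAG_SAME_COMP) | \
			  (1UL << MODEL_FLAG_NOWAIT))

/* Stand-in for blk_queue_flag_set(): set one bit in the flag word. */
static void model_flag_set(int bit, unsigned long *queue_flags)
{
	assert(queue_flags != NULL && bit >= 0 && bit < 32);
	*queue_flags |= 1UL << bit;
}

/* Simplified rq_affinity: 1 if the SAME_COMP bit is set, else 0. */
static unsigned int model_rq_affinity(unsigned long queue_flags)
{
	return (queue_flags >> MODEL_FLAG_SAME_COMP) & 1;
}
```

A queue initialised with `MODEL_MQ_DEFAULT` reports 1, a zeroed flag word reports 0, and one `model_flag_set(MODEL_FLAG_SAME_COMP, ...)` call closes the gap, which mirrors the one-line change in the diff.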


[PATCH 1/2] sched: Add missing memory barrier in switch_mm_cid

by Mathieu Desnoyers

Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb() which
the core scheduler code has depended upon since commit:

    commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid")

If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can
unset the actively used cid when it fails to observe active task after it
sets lazy_put.

There *is* a memory barrier between storing to rq->curr and _return to
userspace_ (as required by membarrier), but the rseq mm_cid has stricter
requirements: the barrier needs to be issued between store to rq->curr
and switch_mm_cid(), which happens earlier than:

- spin_unlock(),
- switch_to().

So it's fine when the architecture switch_mm() happens to have that
barrier already, but less so when the architecture only provides the full
barrier in switch_to() or spin_unlock().

It is a bug in the rseq switch_mm_cid() implementation. All architectures
that don't have memory barriers in switch_mm(), but rather have the full
barrier either in finish_lock_switch() or switch_to() have them too late
for the needs of switch_mm_cid().

Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the
generic barrier.h header, and use it in switch_mm_cid() for scheduler
transitions where switch_mm() is expected to provide a memory barrier.

Architectures can override smp_mb__after_switch_mm() if their
switch_mm() implementation provides an implicit memory barrier.
Override it with a no-op on x86 which implicitly provides this memory
barrier by writing to CR3.

Link: https://lore.kernel.org/lkml/20240305145335.2696125-1-yeoreum.yun@arm.com/
Reported-by: levi.yun <yeoreum.yun(a)arm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com> # for arm64
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com> # for x86
Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
Cc: <stable(a)vger.kernel.org> # 6.4.x
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Ben Segall <bsegall(a)google.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Daniel Bristot de Oliveira <bristot(a)redhat.com>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Cc: levi.yun <yeoreum.yun(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Aaron Lu <aaron.lu(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: linux-arch(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: x86(a)kernel.org
---
 arch/x86/include/asm/barrier.h |  3 +++
 include/asm-generic/barrier.h  |  8 ++++++++
 kernel/sched/sched.h           | 20 ++++++++++++++------
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index fe1e7e3cc844..63bdc6b85219 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -79,6 +79,9 @@ do { \
 #define __smp_mb__before_atomic()	do { } while (0)
 #define __smp_mb__after_atomic()	do { } while (0)
 
+/* Writing to CR3 provides a full memory barrier in switch_mm(). */
+#define smp_mb__after_switch_mm()	do { } while (0)
+
 #include <asm-generic/barrier.h>
 
 #endif /* _ASM_X86_BARRIER_H */
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 0c0695763bea..dc32b96140c1 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -294,5 +294,13 @@ do { \
 #define io_stop_wc() do { } while (0)
 #endif
 
+/*
+ * Architectures that guarantee an implicit smp_mb() in switch_mm()
+ * can override smp_mb__after_switch_mm.
+ */
+#ifndef smp_mb__after_switch_mm
+#define smp_mb__after_switch_mm()	smp_mb()
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d2242679239e..d2895d264196 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -79,6 +79,8 @@
 # include <asm/paravirt_api_clock.h>
 #endif
 
+#include <asm/barrier.h>
+
 #include "cpupri.h"
 #include "cpudeadline.h"
 
@@ -3445,13 +3447,19 @@ static inline void switch_mm_cid(struct rq *rq,
 		 * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
 		 * Provide it here.
 		 */
-		if (!prev->mm)				// from kernel
+		if (!prev->mm) {			// from kernel
 			smp_mb();
-		/*
-		 * user -> user transition guarantees a memory barrier through
-		 * switch_mm() when current->mm changes. If current->mm is
-		 * unchanged, no barrier is needed.
-		 */
+		} else {				// from user
+			/*
+			 * user -> user transition relies on an implicit
+			 * memory barrier in switch_mm() when
+			 * current->mm changes. If the architecture
+			 * switch_mm() does not have an implicit memory
+			 * barrier, it is emitted here. If current->mm
+			 * is unchanged, no barrier is needed.
+			 */
+			smp_mb__after_switch_mm();
+		}
 	}
 	if (prev->mm_cid_active) {
 		mm_cid_snapshot_time(rq, prev->mm);
-- 
2.39.2
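The overridable-barrier pattern used here (a generic fallback guarded by #ifndef, which an architecture header may pre-empt with a no-op) can be tried out in user space. The sketch below is an assumption-laden model: all `model_*` names are hypothetical, a C11 sequentially consistent fence stands in for smp_mb(), and only the "to user" branch visible in the hunk is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int model_full_barriers;	/* counts explicit fences, for illustration */

/* Stand-in for smp_mb(): a sequentially consistent fence. */
static void model_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	model_full_barriers++;
}

/*
 * An "architecture header" included earlier could define this macro as
 * a no-op if its address-space switch already implies a full barrier.
 * Nothing did so here, so the generic fallback applies.
 */
#ifndef model_smp_mb__after_switch_mm
#define model_smp_mb__after_switch_mm()	model_smp_mb()
#endif

/* Models only the "to user" branch of switch_mm_cid(). */
static void model_switch_to_user(bool prev_has_mm)
{
	if (!prev_has_mm)
		model_smp_mb();				/* from kernel */
	else
		model_smp_mb__after_switch_mm();	/* from user */
}

static void model_usage(void)
{
	model_switch_to_user(false);	/* kernel -> user */
	model_switch_to_user(true);	/* user -> user, generic fallback */
	assert(model_full_barriers == 2);
}
```

Defining `model_smp_mb__after_switch_mm()` as `do { } while (0)` before the #ifndef would drop the count in `model_usage()` to 1, which is the effect the x86 override has in the patch.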


[PATCH 0/7] kdb: Refactor and fix bugs in kdb_read()

by Daniel Thompson

Inspired by a patch from [Justin][1] I took a closer look at kdb_read().

Despite Justin's patch being a (correct) one-line manipulation it was a
tough patch to review because the surrounding code was hard to read and
it looked like there were unfixed problems.

This series isn't enough to make kdb_read() beautiful but it does make
it shorter, easier to reason about and fixes a buffer overflow and a
screen redraw problem!

[1]: https://lore.kernel.org/all/20240403-strncpy-kernel-debug-kdb-kdb_io-c-v1-1…

Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
Daniel Thompson (7):
      kdb: Fix buffer overflow during tab-complete
      kdb: Use format-strings rather than '\0' injection in kdb_read()
      kdb: Fix console handling when editing and tab-completing commands
      kdb: Replace double memcpy() with memmove() in kdb_read()
      kdb: Merge identical case statements in kdb_read()
      kdb: Use format-specifiers rather than memset() for padding in kdb_read()
      kdb: Simplify management of tmpbuffer in kdb_read()

 kernel/debug/kdb/kdb_io.c | 133 ++++++++++++++++++++--------------------------
 1 file changed, 58 insertions(+), 75 deletions(-)
---
base-commit: dccce9b8780618986962ba37c373668bcf426866
change-id: 20240415-kgdb_read_refactor-2ea2dfc15dbb

Best regards,
-- 
Daniel Thompson <daniel.thompson(a)linaro.org>
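As a rough illustration of the kind of cleanup named in the series (one memmove() in place of copying through a temporary buffer, plus an explicit capacity check), here is a user-space sketch of inserting a character into a line buffer. It is not taken from kdb_io.c; `model_insert_char()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Insert ch at index pos in a NUL-terminated buffer of capacity cap.
 * Returns 0 on success, -1 if the buffer has no room for one more byte.
 */
static int model_insert_char(char *buf, size_t cap, size_t pos, char ch)
{
	size_t len = strlen(buf);

	assert(pos <= len);

	if (len + 2 > cap)	/* need room for ch and the trailing NUL */
		return -1;

	/*
	 * Shift the tail (including the NUL) right by one byte. memmove()
	 * is specified to handle overlapping source and destination, so no
	 * temporary buffer is needed.
	 */
	memmove(buf + pos + 1, buf + pos, len - pos + 1);
	buf[pos] = ch;
	return 0;
}
```

Checking the capacity before shifting is what keeps a full buffer from being overrun; a caller that gets -1 back can refuse the keystroke rather than write past the end.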

