This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.103-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.103-rc1
Alexandru Matei alexandru.matei@uipath.com KVM: VMX: Fix crash due to uninitialized current_vmcs
Vitaly Kuznetsov vkuznets@redhat.com KVM: VMX: Introduce vmx_msr_bitmap_l01_changed() helper
Vitaly Kuznetsov vkuznets@redhat.com KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
Christian Brauner christian.brauner@ubuntu.com fs: hold writers when changing mount's idmapping
Masahiro Yamada masahiroy@kernel.org UML: define RUNTIME_DISCARD_EXIT
Gaosheng Cui cuigaosheng1@huawei.com xfs: remove xfs_setattr_time() declaration
Miaohe Lin linmiaohe@huawei.com KVM: fix memoryleak in kvm_init()
Andres Freund andres@anarazel.de tools bpftool: Fix compilation error with new binutils
Andres Freund andres@anarazel.de tools bpf_jit_disasm: Fix compilation error with new binutils
Andres Freund andres@anarazel.de tools perf: Fix compilation error with new binutils
Andres Freund andres@anarazel.de tools include: add dis-asm-compat.h to handle version differences
Andres Freund andres@anarazel.de tools build: Add feature test for init_disassemble_info API changes
Tom Saeger tom.saeger@oracle.com sh: define RUNTIME_DISCARD_EXIT
Masahiro Yamada masahiroy@kernel.org s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36
Michael Ellerman mpe@ellerman.id.au powerpc/vmlinux.lds: Don't discard .rela* for relocatable builds
Michael Ellerman mpe@ellerman.id.au powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT
Masahiro Yamada masahiroy@kernel.org arch: fix broken BuildID for arm64 and riscv
Lukas Czerner lczerner@redhat.com ext4: block range must be validated before use in ext4_mb_clear_bb()
Ritesh Harjani riteshh@linux.ibm.com ext4: add strict range checks while freeing blocks
Ritesh Harjani riteshh@linux.ibm.com ext4: add ext4_sb_block_valid() refactored out of ext4_inode_block_valid()
Ritesh Harjani riteshh@linux.ibm.com ext4: refactor ext4_free_blocks() to pull out ext4_mb_clear_bb()
Qais Yousef qyousef@layalina.io sched/fair: Fixes for capacity inversion detection
Qais Yousef qyousef@layalina.io sched/uclamp: Fix a uninitialized variable warnings
Qais Yousef qais.yousef@arm.com sched/fair: Consider capacity inversion in util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/fair: Detect capacity inversion
Qais Yousef qais.yousef@arm.com sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition
Qais Yousef qais.yousef@arm.com sched/uclamp: Make cpu_overutilized() use util_fits_cpu()
Qais Yousef qais.yousef@arm.com sched/uclamp: Fix fits_capacity() check in feec()
Seth Forshee sforshee@kernel.org filelocks: use mount idmapping for setlease permission check
Li Jun jun.li@nxp.com media: rc: gpio-ir-recv: add remove function
Paul Elder paul.elder@ideasonboard.com media: ov5640: Fix analogue gain control
Masahiro Yamada masahiroy@kernel.org scripts: handle BrokenPipeError for python scripts
Alvaro Karsz alvaro.karsz@solid-run.com PCI: Avoid FLR for SolidRun SNET DPU rev 1
Alvaro Karsz alvaro.karsz@solid-run.com PCI: Add SolidRun vendor ID
Nathan Chancellor nathan@kernel.org macintosh: windfarm: Use unsigned type for 1-bit bitfields
Edward Humes aurxenon@lunos.org alpha: fix R_ALPHA_LITERAL reloc for large modules
Rohan McLure rmclure@linux.ibm.com powerpc/kcsan: Exclude udelay to prevent recursive instrumentation
Greg Kroah-Hartman gregkh@linuxfoundation.org powerpc/iommu: fix memory leak with using debugfs_lookup()
Christophe Leroy christophe.leroy@csgroup.eu powerpc: Check !irq instead of irq == NO_IRQ and remove NO_IRQ
xurui xurui@kylinos.cn MIPS: Fix a compilation issue
Dmitry Baryshkov dmitry.baryshkov@linaro.org clk: qcom: mmcc-apq8084: remove spdm clocks
Christian Brauner brauner@kernel.org fs: use consistent setgid checks in is_sxid()
Christian Brauner brauner@kernel.org attr: use consistent sgid stripping checks
Christian Brauner brauner@kernel.org attr: add setattr_should_drop_sgid()
Christian Brauner brauner@kernel.org fs: move should_remove_suid()
Christian Brauner brauner@kernel.org attr: add in_group_or_capable()
Yang Xu xuyang2018.jy@fujitsu.com fs: move S_ISGID stripping into the vfs_*() helpers
Yang Xu xuyang2018.jy@fujitsu.com fs: add mode_strip_sgid() helper
Dave Chinner dchinner@redhat.com xfs: set prealloc flag in xfs_alloc_file_space()
Dave Chinner dchinner@redhat.com xfs: fallocate() should call file_modified()
Dave Chinner dchinner@redhat.com xfs: remove XFS_PREALLOC_SYNC
Darrick J. Wong djwong@kernel.org xfs: use setattr_copy to set vfs inode attributes
Morten Linderud morten@linderud.pw tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
David Disseldorp ddiss@suse.de watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
Hans de Goede hdegoede@redhat.com staging: rtl8723bs: Fix key-store index handling
Hannes Braun hannesbraun@mail.de staging: rtl8723bs: fix placement of braces
Jagath Jog J jagathjog1996@gmail.com Staging: rtl8723bs: Placing opening { braces in previous line
Michael Straube straube.linux@gmail.com staging: rtl8723bs: clean up comparsions to NULL
Gavrilov Ilia Ilia.Gavrilov@infotecs.ru iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter
Kim Phillips kim.phillips@amd.com iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid options
Suravee Suthikulpanit suravee.suthikulpanit@amd.com iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands
Christoph Hellwig hch@lst.de nbd: use the correct block_device in nbd_bdev_reset
Johan Hovold johan+linaro@kernel.org irqdomain: Fix mapping-creation race
Jan Kara jack@suse.cz ext4: Fix deadlock during directory rename
Conor Dooley conor.dooley@microchip.com RISC-V: Don't check text_mutex during stop_machine
Heiko Carstens hca@linux.ibm.com s390/ftrace: remove dead code
Alexandre Ghiti alexghiti@rivosinc.com riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
Eric Dumazet edumazet@google.com af_unix: fix struct pid leaks in OOB support
Kuniyuki Iwashima kuniyu@amazon.co.jp af_unix: Remove unnecessary brackets around CONFIG_AF_UNIX_OOB.
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC
Benjamin Coddington bcodding@redhat.com SUNRPC: Fix a server shutdown leak
Suman Ghosh sumang@marvell.com octeontx2-af: Unlock contexts in the queue context cache in case of fault detection
D. Wythe alibuda@linux.alibaba.com net/smc: fix fallback failed while sendmsg with fastopen
Randy Dunlap rdunlap@infradead.org platform: x86: MLX_PLATFORM: select REGMAP instead of depending on it
Eric Dumazet edumazet@google.com netfilter: conntrack: adopt safer max chain length
Chandrakanth Patil chandrakanth.patil@broadcom.com scsi: megaraid_sas: Update max supported LD IDs to 240
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: fix RX data corruption issue
Heiner Kallweit hkallweit1@gmail.com net: phy: smsc: fix link up detection in forced irq mode
Lukas Wunner lukas@wunner.de net: phy: smsc: Cache interrupt mask
Lorenz Bauer lorenz.bauer@isovalent.com btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
Florian Westphal fw@strlen.de netfilter: tproxy: fix deadlock due to missing BH disable
Ivan Delalande colona@arista.com netfilter: ctnetlink: revert to dumping mark regardless of event type
Michael Chan michael.chan@broadcom.com bnxt_en: Avoid order-5 memory allocation for TPA data
Russell King (Oracle) rmk+kernel@armlinux.org.uk net: phylib: get rid of unnecessary locking
Rongguang Wei weirongguang@kylinos.cn net: stmmac: add to set device wake up flag when stmmac init phy
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/dpu: fix len of sc7180 ctl blocks
Liu Jian liujian56@huawei.com bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
Petr Oros poros@redhat.com ice: copy last block omitted in ice_get_module_eeprom()
Shigeru Yoshida syoshida@redhat.com net: caif: Fix use-after-free in cfusbl_device_notify()
Yuiko Oshino yuiko.oshino@microchip.com net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver
Changbin Du changbin.du@huawei.com perf stat: Fix counting when initial delay configured
Hangbin Liu liuhangbin@gmail.com selftests: nft_nat: ensuring the listening side is up before starting the client
Eric Dumazet edumazet@google.com ila: do not generate empty messages in ila_xlat_nl_cmd_get_mapping()
Vladimir Oltean vladimir.oltean@nxp.com powerpc: dts: t1040rdb: fix compatible string for Rev A boards
Kang Chen void0red@gmail.com nfc: fdp: add null check of devm_kmalloc_array in fdp_nci_i2c_read_device_properties
Rafał Miłecki rafal@milecki.pl bgmac: fix *initial* chip reset to support BCM5358
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/a5xx: fix context faults during ring switch
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/a5xx: fix the emptyness check in the preempt code
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/a5xx: fix highest bank bit for a530
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register
Rob Clark robdclark@chromium.org drm/msm: Fix potential invalid ptr free
Jiri Slaby (SUSE) jirislaby@kernel.org drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototype
Ben Skeggs bskeggs@redhat.com drm/nouveau/kms/nv50-: remove unused functions
Jan Kara jack@suse.cz ext4: Fix possible corruption when moving a directory
Matthias Kaehlcke mka@chromium.org regulator: core: Use ktime_get_boottime() to determine how long a regulator was off
Christian Kohlschütter christian@kohlschutter.com regulator: core: Fix off-on-delay-us for always-on/boot-on regulators
Mark Brown broonie@kernel.org regulator: Flag uncontrollable regulators as always_on
Bart Van Assche bvanassche@acm.org scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
Liao Chang liaochang1@huawei.com riscv: Add header include guards to insn.h
Mattias Nissler mnissler@rivosinc.com riscv: Avoid enabling interrupts in die()
Palmer Dabbelt palmer@rivosinc.com RISC-V: Avoid dereferening NULL regs in die()
Pierre Gondois pierre.gondois@arm.com arm64: efi: Make efi_rt_lock a raw_spinlock
Jens Axboe axboe@kernel.dk brd: mark as nowait compatible
Luis Chamberlain mcgrof@kernel.org block/brd: add error handling support for add_disk()
Jacob Pan jacob.jun.pan@linux.intel.com iommu/vt-d: Fix PASID directory pointer coherency
Johan Hovold johan+linaro@kernel.org irqdomain: Refactor __irq_domain_alloc_irqs()
Corey Minyard cminyard@mvista.com ipmi:ssif: Add a timer between request retries
Corey Minyard cminyard@mvista.com ipmi:ssif: Increase the message retry time
Jaegeuk Kim jaegeuk@kernel.org f2fs: retry to update the inode page given data corruption
Jaegeuk Kim jaegeuk@kernel.org f2fs: do not bother checkpoint by f2fs_get_node_info
Jaegeuk Kim jaegeuk@kernel.org f2fs: avoid down_write on nat_tree_lock during checkpoint
Jan Kara jack@suse.cz udf: Fix off-by-one error when discarding preallocation
Alexander Aring aahringo@redhat.com fs: dlm: start midcomms before scand
Alexander Aring aahringo@redhat.com fs: dlm: add midcomms init/start functions
Alexander Aring aahringo@redhat.com fs: dlm: fix log of lowcomms vs midcomms
Sean Christopherson seanjc@google.com KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target
Sean Christopherson seanjc@google.com KVM: SVM: Don't rewrite guest ICR on AVIC IPI virtualization failure
Sean Christopherson seanjc@google.com KVM: Register /dev/kvm as the _very_ last thing during initialization
Vitaly Kuznetsov vkuznets@redhat.com KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
Vitaly Kuznetsov vkuznets@redhat.com KVM: Optimize kvm_make_vcpus_request_mask() a bit
Fedor Pchelkin pchelkin@ispras.ru nfc: change order inside nfc_se_io error path
Zhihao Cheng chengzhihao1@huawei.com ext4: zero i_disksize when initializing the bootloader inode
Ye Bin yebin10@huawei.com ext4: fix WARNING in ext4_update_inline_data
Ye Bin yebin10@huawei.com ext4: move where set the MAY_INLINE_DATA flag is set
Darrick J. Wong djwong@kernel.org ext4: fix another off-by-one fsmap error on 1k block filesystems
Eric Whitney enwlinux@gmail.com ext4: fix RENAME_WHITEOUT handling for inline directories
Eric Biggers ebiggers@google.com ext4: fix cgroup writeback accounting with fs-layer encryption
Hans de Goede hdegoede@redhat.com staging: rtl8723bs: Pass correct parameters to cfg80211_get_bss()
Harry Wentland harry.wentland@amd.com drm/connector: print max_requested_bpc in state debugfs
Alex Deucher alexander.deucher@amd.com drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
Andrew Cooper andrew.cooper3@citrix.com x86/CPU/AMD: Disable XSAVES on AMD family 0x17
Tobias Klauser tklauser@distanz.ch fork: allow CLONE_NEWTIME in clone3 flags
Namhyung Kim namhyung@kernel.org perf inject: Fix --buildid-all not to eat up MMAP2
Johannes Thumshirn johannes.thumshirn@wdc.com btrfs: fix percent calculation for bg reclaim message
Theodore Ts'o tytso@mit.edu fs: prevent out-of-bounds array speculation when closing a file descriptor
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 51 ++- Documentation/trace/ftrace.rst | 2 +- Makefile | 4 +- arch/alpha/kernel/module.c | 4 +- arch/arm64/include/asm/efi.h | 6 +- arch/arm64/kernel/efi.c | 2 +- arch/mips/include/asm/mach-rc32434/pci.h | 2 +- arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts | 1 - arch/powerpc/include/asm/irq.h | 3 - arch/powerpc/kernel/iommu.c | 4 +- arch/powerpc/kernel/time.c | 4 +- arch/powerpc/kernel/vmlinux.lds.S | 6 +- arch/powerpc/platforms/44x/fsp2.c | 2 +- arch/riscv/include/asm/ftrace.h | 2 +- arch/riscv/include/asm/parse_asm.h | 5 + arch/riscv/include/asm/patch.h | 2 + arch/riscv/kernel/ftrace.c | 14 +- arch/riscv/kernel/patch.c | 28 +- arch/riscv/kernel/stacktrace.c | 2 +- arch/riscv/kernel/traps.c | 14 +- arch/s390/kernel/ftrace.c | 86 +---- arch/s390/kernel/vmlinux.lds.S | 2 + arch/sh/kernel/vmlinux.lds.S | 1 + arch/um/kernel/vmlinux.lds.S | 2 +- arch/x86/kernel/cpu/amd.c | 9 + arch/x86/kvm/lapic.c | 1 + arch/x86/kvm/svm/avic.c | 28 +- arch/x86/kvm/vmx/evmcs.h | 11 - arch/x86/kvm/vmx/vmx.c | 44 ++- drivers/block/brd.c | 10 +- drivers/block/nbd.c | 14 +- drivers/char/ipmi/ipmi_ssif.c | 34 +- drivers/char/tpm/eventlog/acpi.c | 6 +- drivers/clk/qcom/mmcc-apq8084.c | 271 ---------------- drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +- drivers/gpu/drm/drm_atomic.c | 1 + drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 6 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 4 +- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +- drivers/gpu/drm/msm/msm_gem_submit.c | 5 +- drivers/gpu/drm/nouveau/dispnv50/disp.c | 16 - drivers/gpu/drm/nouveau/dispnv50/wndw.c | 12 - drivers/gpu/drm/nouveau/dispnv50/wndw.h | 7 +- drivers/iommu/amd/init.c | 105 ++++-- drivers/iommu/intel/pasid.c | 7 + drivers/macintosh/windfarm_lm75_sensor.c | 4 +- drivers/macintosh/windfarm_smu_sensors.c | 4 +- drivers/media/i2c/ov5640.c | 2 +- drivers/media/rc/gpio-ir-recv.c | 18 ++ drivers/net/dsa/mt7530.c | 35 +- drivers/net/ethernet/broadcom/bgmac.c | 8 +- drivers/net/ethernet/broadcom/bgmac.h | 2 + drivers/net/ethernet/broadcom/bnxt/bnxt.c | 23 +- drivers/net/ethernet/intel/ice/ice_ethtool.c | 6 +- drivers/net/ethernet/marvell/octeontx2/af/rvu.h | 5 + .../ethernet/marvell/octeontx2/af/rvu_debugfs.c | 7 +- .../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 16 +- .../net/ethernet/marvell/octeontx2/af/rvu_npa.c | 58 +++- .../net/ethernet/marvell/octeontx2/af/rvu_reg.h | 3 + drivers/net/ethernet/mediatek/mtk_eth_soc.c | 3 +- drivers/net/ethernet/mediatek/mtk_eth_soc.h | 1 + drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + drivers/net/phy/microchip.c | 32 ++ drivers/net/phy/phy_device.c | 8 +- drivers/net/phy/smsc.c | 20 +- drivers/net/usb/lan78xx.c | 27 +- drivers/nfc/fdp/i2c.c | 4 + drivers/pci/quirks.c | 8 + drivers/platform/x86/Kconfig | 3 +- drivers/regulator/core.c | 27 +- drivers/scsi/hosts.c | 2 + drivers/scsi/megaraid/megaraid_sas.h | 2 + drivers/scsi/megaraid/megaraid_sas_fp.c | 2 +- drivers/staging/rtl8723bs/core/rtw_ap.c | 20 +- drivers/staging/rtl8723bs/core/rtw_cmd.c | 96 +++--- drivers/staging/rtl8723bs/core/rtw_ioctl_set.c | 4 +- drivers/staging/rtl8723bs/core/rtw_mlme.c | 6 +- drivers/staging/rtl8723bs/core/rtw_mlme_ext.c | 56 ++-- drivers/staging/rtl8723bs/core/rtw_security.c | 6 +- drivers/staging/rtl8723bs/include/rtw_security.h | 8 +- drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 355 +++++++-------------- drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 51 +-- drivers/staging/rtl8723bs/os_dep/os_intfs.c | 4 +- fs/attr.c | 72 ++++- fs/btrfs/block-group.c | 3 +- fs/dlm/lockspace.c | 21 +- fs/dlm/lowcomms.c | 16 +- fs/dlm/lowcomms.h | 1 + fs/dlm/main.c | 7 +- fs/dlm/midcomms.c | 17 +- fs/dlm/midcomms.h | 3 + fs/ext4/block_validity.c | 26 +- fs/ext4/ext4.h | 3 + fs/ext4/fsmap.c | 2 + fs/ext4/inline.c | 1 - fs/ext4/inode.c | 7 +- fs/ext4/ioctl.c | 1 + fs/ext4/mballoc.c | 205 +++++++----- fs/ext4/namei.c | 36 ++- fs/ext4/page-io.c | 11 +- fs/ext4/xattr.c | 3 + fs/f2fs/checkpoint.c | 2 +- fs/f2fs/compress.c | 2 +- fs/f2fs/data.c | 8 +- fs/f2fs/f2fs.h | 2 +- fs/f2fs/file.c | 2 +- fs/f2fs/gc.c | 6 +- fs/f2fs/inline.c | 4 +- fs/f2fs/inode.c | 14 +- fs/f2fs/node.c | 23 +- fs/f2fs/recovery.c | 2 +- fs/f2fs/segment.c | 2 +- fs/file.c | 1 + fs/fuse/file.c | 2 +- fs/inode.c | 90 +++--- fs/internal.h | 10 +- fs/locks.c | 3 +- fs/namei.c | 82 ++++- fs/namespace.c | 29 +- fs/ocfs2/file.c | 4 +- fs/ocfs2/namei.c | 1 + fs/open.c | 8 +- fs/udf/inode.c | 2 +- fs/xfs/xfs_bmap_util.c | 9 +- fs/xfs/xfs_file.c | 24 +- fs/xfs/xfs_iops.c | 56 +--- fs/xfs/xfs_iops.h | 1 - fs/xfs/xfs_pnfs.c | 9 +- include/asm-generic/vmlinux.lds.h | 5 + include/linux/fs.h | 6 +- include/linux/pci_ids.h | 2 + include/net/netfilter/nf_tproxy.h | 7 + kernel/bpf/btf.c | 1 + kernel/fork.c | 2 +- kernel/irq/irqdomain.c | 152 +++++---- kernel/sched/core.c | 10 +- kernel/sched/fair.c | 128 +++++++- kernel/sched/sched.h | 61 +++- kernel/watch_queue.c | 1 + net/caif/caif_usb.c | 3 + net/ipv4/netfilter/nf_tproxy_ipv4.c | 2 +- net/ipv4/tcp_bpf.c | 6 + net/ipv4/udp_bpf.c | 3 + net/ipv6/ila/ila_xlat.c | 1 + net/ipv6/netfilter/nf_tproxy_ipv6.c | 2 +- net/netfilter/nf_conntrack_core.c | 4 +- net/netfilter/nf_conntrack_netlink.c | 14 +- net/nfc/netlink.c | 2 +- net/smc/af_smc.c | 13 +- net/sunrpc/svc.c | 6 +- net/unix/af_unix.c | 16 +- net/unix/unix_bpf.c | 3 + scripts/checkkconfigsymbols.py | 13 +- scripts/clang-tools/run-clang-tools.py | 21 +- scripts/diffconfig | 16 +- tools/bpf/Makefile | 5 +- tools/bpf/bpf_jit_disasm.c | 5 +- tools/bpf/bpftool/Makefile | 5 +- tools/bpf/bpftool/jit_disasm.c | 42 ++- tools/build/Makefile.feature | 1 + tools/build/feature/Makefile | 4 + tools/build/feature/test-all.c | 4 + .../build/feature/test-disassembler-init-styled.c | 13 + tools/include/tools/dis-asm-compat.h | 55 ++++ tools/perf/Makefile.config | 8 + tools/perf/builtin-inject.c | 1 + tools/perf/builtin-stat.c | 15 +- tools/perf/util/annotate.c | 7 +- tools/perf/util/stat.c | 6 +- tools/perf/util/stat.h | 1 - tools/perf/util/target.h | 12 + tools/testing/selftests/netfilter/nft_nat.sh | 2 + virt/kvm/kvm_main.c | 145 ++++++--- 173 files changed, 1962 insertions(+), 1490 deletions(-)
From: Theodore Ts'o tytso@mit.edu
commit 609d54441493c99f21c1823dfd66fa7f4c512ff4 upstream.
Google-Bug-Id: 114199369 Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/file.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/file.c +++ b/fs/file.c @@ -646,6 +646,7 @@ static struct file *pick_file(struct fil file = ERR_PTR(-EINVAL); goto out_unlock; } + fd = array_index_nospec(fd, fdt->max_fds); file = fdt->fd[fd]; if (!file) { file = ERR_PTR(-EBADF);
From: Johannes Thumshirn johannes.thumshirn@wdc.com
commit 95cd356ca23c3807b5f3503687161e216b1c520d upstream.
We have a report, that the info message for block-group reclaim is crossing the 100% used mark.
This is happening as we were truncating the divisor for the division (the block_group->length) to a 32bit value.
Fix this by using div64_u64() to not truncate the divisor.
In the worst case, it can lead to a div by zero error and should be possible to trigger on 4 disks RAID0, and each device is large enough:
$ mkfs.btrfs -f /dev/test/scratch[1234] -m raid1 -d raid0 btrfs-progs v6.1 [...] Filesystem size: 40.00GiB Block group profiles: Data: RAID0 4.00GiB <<< Metadata: RAID1 256.00MiB System: RAID1 8.00MiB
Reported-by: Forza forza@tnonline.net Link: https://lore.kernel.org/linux-btrfs/e99483.c11a58d.1863591ca52@tnonline.net/ Fixes: 5f93e776c673 ("btrfs: zoned: print unusable percentage when reclaiming block groups") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Anand Jain anand.jain@oracle.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: David Sterba dsterba@suse.com [ add Qu's note ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/block-group.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -1554,7 +1554,8 @@ void btrfs_reclaim_bgs_work(struct work_
btrfs_info(fs_info, "reclaiming chunk %llu with %llu%% used %llu%% unusable", - bg->start, div_u64(bg->used * 100, bg->length), + bg->start, + div64_u64(bg->used * 100, bg->length), div64_u64(zone_unusable * 100, bg->length)); trace_btrfs_reclaim_block_group(bg); ret = btrfs_relocate_chunk(fs_info, bg->start);
From: Namhyung Kim namhyung@kernel.org
commit ce9f1c05d2edfa6cdf2c1a510495d333e11810a8 upstream.
When MMAP2 has the PERF_RECORD_MISC_MMAP_BUILD_ID flag, it means the record already has the build-id info. So it marks the DSO as hit, to skip if the same DSO is not processed if it happens to miss the build-id later.
But it missed to copy the MMAP2 record itself so it'd fail to symbolize samples for those regions.
For example, the following generates 249 MMAP2 events.
$ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2 MMAP2 events: 249 (86.8%)
Adding perf inject should not change the number of events like this
$ perf record --buildid-mmap -o- true | perf inject -b | \
perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
But when --buildid-all is used, it eats most of the MMAP2 events.
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
perf report --stat -i- | grep MMAP2
MMAP2 events: 1 ( 2.5%)
With this patch, it shows the original number now.
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
Committer testing:
Before:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (36.2%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (36.2%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2 MMAP2 events: 2 ( 1.9%) $
After:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (29.3%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (34.3%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (38.4%) $
Fixes: f7fc0d1c915a74ff ("perf inject: Do not inject BUILD_ID record if MMAP2 has it") Signed-off-by: Namhyung Kim namhyung@kernel.org Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/builtin-inject.c | 1 + 1 file changed, 1 insertion(+)
--- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -463,6 +463,7 @@ static int perf_event__repipe_buildid_mm dso->hit = 1; } dso__put(dso); + perf_event__repipe(tool, event, sample, machine); return 0; }
From: Tobias Klauser tklauser@distanz.ch
commit a402f1e35313fc7ce2ca60f543c4402c2c7c3544 upstream.
Currently, calling clone3() with CLONE_NEWTIME in clone_args->flags fails with -EINVAL. This is because CLONE_NEWTIME intersects with CSIGNAL. However, CSIGNAL was deprecated when clone3 was introduced in commit 7f192e3cd316 ("fork: add clone3"), allowing re-use of that part of clone flags.
Fix this by explicitly allowing CLONE_NEWTIME in clone3_args_valid. This is also in line with the respective check in check_unshare_flags which allow CLONE_NEWTIME for unshare().
Fixes: 769071ac9f20 ("ns: Introduce Time Namespace") Cc: Andrey Vagin avagin@openvz.org Cc: Christian Brauner brauner@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Tobias Klauser tklauser@distanz.ch Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -2829,7 +2829,7 @@ static bool clone3_args_valid(struct ker * - make the CLONE_DETACHED bit reusable for clone3 * - make the CSIGNAL bits reusable for clone3 */ - if (kargs->flags & (CLONE_DETACHED | CSIGNAL)) + if (kargs->flags & (CLONE_DETACHED | (CSIGNAL & (~CLONE_NEWTIME)))) return false;
if ((kargs->flags & (CLONE_SIGHAND | CLONE_CLEAR_SIGHAND)) ==
From: Andrew Cooper andrew.cooper3@citrix.com
commit b0563468eeac88ebc70559d52a0b66efc37e4e9d upstream.
AMD Erratum 1386 is summarised as:
XSAVES Instruction May Fail to Save XMM Registers to the Provided State Save Area
This piece of accidental chronomancy causes the %xmm registers to occasionally reset back to an older value.
Ignore the XSAVES feature on all AMD Zen1/2 hardware. The XSAVEC instruction (which works fine) is equivalent on affected parts.
[ bp: Typos, move it into the F17h-specific function. ]
Reported-by: Tavis Ormandy taviso@gmail.com Signed-off-by: Andrew Cooper andrew.cooper3@citrix.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230307174643.1240184-1-andrew.cooper3@citrix.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/amd.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -904,6 +904,15 @@ void init_spectral_chicken(struct cpuinf } } #endif + /* + * Work around Erratum 1386. The XSAVES instruction malfunctions in + * certain circumstances on Zen1/2 uarch, and not all parts have had + * updated microcode at the time of writing (March 2023). + * + * Affected parts all have no supervisor XSAVE states, meaning that + * the XSAVEC instruction (which works fine) is equivalent. + */ + clear_cpu_cap(c, X86_FEATURE_XSAVES); }
static void init_amd_zn(struct cpuinfo_x86 *c)
From: Alex Deucher alexander.deucher@amd.com
commit 0dcdf8498eae2727bb33cef3576991dc841d4343 upstream.
Properly skip non-existent registers as well.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442 Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Reviewed-by: Evan Quan evan.quan@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -461,8 +461,9 @@ static int soc15_read_register(struct am *value = 0; for (i = 0; i < ARRAY_SIZE(soc15_allowed_read_registers); i++) { en = &soc15_allowed_read_registers[i]; - if (adev->reg_offset[en->hwip][en->inst] && - reg_offset != (adev->reg_offset[en->hwip][en->inst][en->seg] + if (!adev->reg_offset[en->hwip][en->inst]) + continue; + else if (reg_offset != (adev->reg_offset[en->hwip][en->inst][en->seg] + en->reg_offset)) continue;
From: Harry Wentland harry.wentland@amd.com
commit 7d386975f6a495902e679a3a250a7456d7e54765 upstream.
This is useful to understand the bpc defaults and support of a driver.
Signed-off-by: Harry Wentland harry.wentland@amd.com Cc: Pekka Paalanen ppaalanen@gmail.com Cc: Sebastian Wick sebastian.wick@redhat.com Cc: Vitaly.Prosyak@amd.com Cc: Uma Shankar uma.shankar@intel.com Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Joshua Ashton joshua@froggi.es Cc: Jani Nikula jani.nikula@linux.intel.com Cc: dri-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Reviewed-By: Joshua Ashton joshua@froggi.es Link: https://patchwork.freedesktop.org/patch/msgid/20230113162428.33874-3-harry.w... Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_atomic.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1052,6 +1052,7 @@ static void drm_atomic_connector_print_s drm_printf(p, "connector[%u]: %s\n", connector->base.id, connector->name); drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : "(null)"); drm_printf(p, "\tself_refresh_aware=%d\n", state->self_refresh_aware); + drm_printf(p, "\tmax_requested_bpc=%d\n", state->max_requested_bpc);
if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK) if (state->writeback_job && state->writeback_job->fb)
From: Hans de Goede hdegoede@redhat.com
commit d17789edd6a8270c38459e592ee536a84c6202db upstream.
To last 2 parameters to cfg80211_get_bss() should be of the enum ieee80211_bss_type resp. enum ieee80211_privacy types, which WLAN_CAPABILITY_ESS very much is not.
Fix both cfg80211_get_bss() calls in ioctl_cfg80211.c to pass the right parameters.
Note that the second call was already somewhat fixed by commenting out WLAN_CAPABILITY_ESS and passing in 0 instead. This was still not entirely correct though since that would limit returned BSS-es to ESS type BSS-es with privacy on.
Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230306153512.162104-2-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -358,7 +358,7 @@ int rtw_cfg80211_check_bss(struct adapte bss = cfg80211_get_bss(padapter->rtw_wdev->wiphy, notify_channel, pnetwork->mac_address, pnetwork->ssid.ssid, pnetwork->ssid.ssid_length, - WLAN_CAPABILITY_ESS, WLAN_CAPABILITY_ESS); + IEEE80211_BSS_TYPE_ANY, IEEE80211_PRIVACY_ANY);
cfg80211_put_bss(padapter->rtw_wdev->wiphy, bss);
@@ -1239,8 +1239,8 @@ void rtw_cfg80211_unlink_bss(struct adap
bss = cfg80211_get_bss(wiphy, NULL/*notify_channel*/, select_network->mac_address, select_network->ssid.ssid, - select_network->ssid.ssid_length, 0/*WLAN_CAPABILITY_ESS*/, - 0/*WLAN_CAPABILITY_ESS*/); + select_network->ssid.ssid_length, IEEE80211_BSS_TYPE_ANY, + IEEE80211_PRIVACY_ANY);
if (bss) { cfg80211_unlink_bss(wiphy, bss);
From: Eric Biggers ebiggers@google.com
commit ffec85d53d0f39ee4680a2cf0795255e000e1feb upstream.
When writing a page from an encrypted file that is using filesystem-layer encryption (not inline encryption), ext4 encrypts the pagecache page into a bounce page, then writes the bounce page.
It also passes the bounce page to wbc_account_cgroup_owner(). That's incorrect, because the bounce page is a newly allocated temporary page that doesn't have the memory cgroup of the original pagecache page. This makes wbc_account_cgroup_owner() not account the I/O to the owner of the pagecache page as it should.
Fix this by always passing the pagecache page to wbc_account_cgroup_owner().
Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support") Cc: stable@vger.kernel.org Reported-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Eric Biggers ebiggers@google.com Acked-by: Tejun Heo tj@kernel.org Link: https://lore.kernel.org/r/20230203005503.141557-1-ebiggers@kernel.org Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/page-io.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -413,7 +413,8 @@ static void io_submit_init_bio(struct ex
static void io_submit_add_bh(struct ext4_io_submit *io, struct inode *inode, - struct page *page, + struct page *pagecache_page, + struct page *bounce_page, struct buffer_head *bh) { int ret; @@ -427,10 +428,11 @@ submit_and_retry: io_submit_init_bio(io, bh); io->io_bio->bi_write_hint = inode->i_write_hint; } - ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh)); + ret = bio_add_page(io->io_bio, bounce_page ?: pagecache_page, + bh->b_size, bh_offset(bh)); if (ret != bh->b_size) goto submit_and_retry; - wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size); + wbc_account_cgroup_owner(io->io_wbc, pagecache_page, bh->b_size); io->io_next_block++; }
@@ -548,8 +550,7 @@ int ext4_bio_write_page(struct ext4_io_s do { if (!buffer_async_write(bh)) continue; - io_submit_add_bh(io, inode, - bounce_page ? bounce_page : page, bh); + io_submit_add_bh(io, inode, page, bounce_page, bh); nr_submitted++; clear_buffer_dirty(bh); } while ((bh = bh->b_this_page) != head);
From: Eric Whitney enwlinux@gmail.com
commit c9f62c8b2dbf7240536c0cc9a4529397bb8bf38e upstream.
A significant number of xfstests can cause ext4 to log one or more warning messages when they are run on a test file system where the inline_data feature has been enabled. An example:
"EXT4-fs warning (device vdc): ext4_dirblock_csum_set:425: inode #16385: comm fsstress: No space for directory leaf checksum. Please run e2fsck -D."
The xfstests include: ext4/057, 058, and 307; generic/013, 051, 068, 070, 076, 078, 083, 232, 269, 270, 390, 461, 475, 476, 482, 579, 585, 589, 626, 631, and 650.
In this situation, the warning message indicates a bug in the code that performs the RENAME_WHITEOUT operation on a directory entry that has been stored inline. It doesn't detect that the directory is stored inline, and incorrectly attempts to compute a dirent block checksum on the whiteout inode when creating it. This attempt fails as a result of the integrity checking in get_dirent_tail (usually due to a failure to match the EXT4_FT_DIR_CSUM magic cookie), and the warning message is then emitted.
Fix this by simply collecting the inlined data state at the time the search for the source directory entry is performed. Existing code handles the rest, and this is sufficient to eliminate all spurious warning messages produced by the tests above. Go one step further and do the same in the code that resets the source directory entry in the event of failure. The inlined state should be present in the "old" struct, but given the possibility of a race there's no harm in taking a conservative approach and getting that information again since the directory entry is being reread anyway.
Fixes: b7ff91fd030d ("ext4: find old entry again if failed to rename whiteout") Cc: stable@kernel.org Signed-off-by: Eric Whitney enwlinux@gmail.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230210173244.679890-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/namei.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1595,11 +1595,10 @@ static struct buffer_head *__ext4_find_e int has_inline_data = 1; ret = ext4_find_inline_entry(dir, fname, res_dir, &has_inline_data); - if (has_inline_data) { - if (inlined) - *inlined = 1; + if (inlined) + *inlined = has_inline_data; + if (has_inline_data) goto cleanup_and_exit; - } }
if ((namelen <= 2) && (name[0] == '.') && @@ -3660,7 +3659,8 @@ static void ext4_resetent(handle_t *hand * so the old->de may no longer valid and need to find it again * before reset old inode info. */ - old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL); + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, + &old.inlined); if (IS_ERR(old.bh)) retval = PTR_ERR(old.bh); if (!old.bh) @@ -3827,7 +3827,8 @@ static int ext4_rename(struct user_names return retval; }
- old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL); + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, + &old.inlined); if (IS_ERR(old.bh)) return PTR_ERR(old.bh); /*
From: Darrick J. Wong djwong@kernel.org
commit c993799baf9c5861f8df91beb80e1611b12efcbd upstream.
Apparently syzbot figured out that issuing this FSMAP call:
struct fsmap_head cmd = { .fmh_count = ...; .fmh_keys = { { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, { .fmr_device = /* ext4 dev */, .fmr_physical = 0, }, }, ... }; ret = ioctl(fd, FS_IOC_GETFSMAP, &cmd);
Produces this crash if the underlying filesystem is a 1k-block ext4 filesystem:
kernel BUG at fs/ext4/ext4.h:3331! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 3 PID: 3227965 Comm: xfs_io Tainted: G W O 6.2.0-rc8-achx Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:ext4_mb_load_buddy_gfp+0x47c/0x570 [ext4] RSP: 0018:ffffc90007c03998 EFLAGS: 00010246 RAX: ffff888004978000 RBX: ffffc90007c03a20 RCX: ffff888041618000 RDX: 0000000000000000 RSI: 00000000000005a4 RDI: ffffffffa0c99b11 RBP: ffff888012330000 R08: ffffffffa0c2b7d0 R09: 0000000000000400 R10: ffffc90007c03950 R11: 0000000000000000 R12: 0000000000000001 R13: 00000000ffffffff R14: 0000000000000c40 R15: ffff88802678c398 FS: 00007fdf2020c880(0000) GS:ffff88807e100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd318a5fe8 CR3: 000000007f80f001 CR4: 00000000001706e0 Call Trace: <TASK> ext4_mballoc_query_range+0x4b/0x210 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap_datadev+0x713/0x890 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_getfsmap+0x2b7/0x330 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] ext4_ioc_getfsmap+0x153/0x2b0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __ext4_ioctl+0x2a7/0x17e0 [ext4 dfa189daddffe8fecd3cdfd00564e0f265a8ab80] __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fdf20558aff RSP: 002b:00007ffd318a9e30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00000000000200c0 RCX: 00007fdf20558aff RDX: 00007fdf1feb2010 RSI: 00000000c0c0583b RDI: 0000000000000003 RBP: 00005625c0634be0 R08: 00005625c0634c40 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fdf1feb2010 R13: 00005625be70d994 R14: 0000000000000800 R15: 0000000000000000
For GETFSMAP calls, the caller selects a physical block device by writing its block number into fsmap_head.fmh_keys[01].fmr_device. To query mappings for a subrange of the device, the starting byte of the range is written to fsmap_head.fmh_keys[0].fmr_physical and the last byte of the range goes in fsmap_head.fmh_keys[1].fmr_physical.
IOWs, to query what mappings overlap with bytes 3-14 of /dev/sda, you'd set the inputs as follows:
fmh_keys[0] = { .fmr_device = major(8, 0), .fmr_physical = 3}, fmh_keys[1] = { .fmr_device = major(8, 0), .fmr_physical = 14},
Which would return you whatever is mapped in the 12 bytes starting at physical offset 3.
The crash is due to insufficient range validation of keys[1] in ext4_getfsmap_datadev. On 1k-block filesystems, block 0 is not part of the filesystem, which means that s_first_data_block is nonzero. ext4_get_group_no_and_offset subtracts this quantity from the blocknr argument before cracking it into a group number and a block number within a group. IOWs, block group 0 spans blocks 1-8192 (1-based) instead of 0-8191 (0-based) like what happens with larger blocksizes.
The net result of this encoding is that blocknr < s_first_data_block is not a valid input to this function. The end_fsb variable is set from the keys that are copied from userspace, which means that in the above example, its value is zero. That leads to an underflow here:
blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
The division then operates on -1:
offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> EXT4_SB(sb)->s_cluster_bits;
Leaving an impossibly large group number (2^32-1) in blocknr. ext4_getfsmap_check_keys checked that keys[0].fmr_physical and keys[1].fmr_physical are in increasing order, but ext4_getfsmap_datadev adjusts keys[0].fmr_physical to be at least s_first_data_block. This implies that we have to check it again after the adjustment, which is the piece that I forgot.
Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com Fixes: 4a4956249dac ("ext4: fix off-by-one fsmap error on 1k block filesystems") Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c3000... Cc: stable@vger.kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Link: https://lore.kernel.org/r/Y+58NPTH7VNGgzdd@magnolia Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/fsmap.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/ext4/fsmap.c +++ b/fs/ext4/fsmap.c @@ -486,6 +486,8 @@ static int ext4_getfsmap_datadev(struct keys[0].fmr_physical = bofs; if (keys[1].fmr_physical >= eofs) keys[1].fmr_physical = eofs - 1; + if (keys[1].fmr_physical < keys[0].fmr_physical) + return 0; start_fsb = keys[0].fmr_physical; end_fsb = keys[1].fmr_physical;
From: Ye Bin yebin10@huawei.com
commit 1dcdce5919115a471bf4921a57f20050c545a236 upstream.
The only caller of ext4_find_inline_data_nolock() that needs setting of EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In ext4_write_inline_data_end() we just need to update inode->i_inline_off. Since we are going to add one more caller that does not need to set EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA out to ext4_iget_extra_inode().
Signed-off-by: Ye Bin yebin10@huawei.com Cc: stable@kernel.org Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inline.c | 1 - fs/ext4/inode.c | 7 ++++++- 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -158,7 +158,6 @@ int ext4_find_inline_data_nolock(struct (void *)ext4_raw_inode(&is.iloc)); EXT4_I(inode)->i_inline_size = EXT4_MIN_INLINE_DATA_SIZE + le32_to_cpu(is.s.here->e_value_size); - ext4_set_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); } out: brelse(is.iloc.bh); --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4549,8 +4549,13 @@ static inline int ext4_iget_extra_inode(
if (EXT4_INODE_HAS_XATTR_SPACE(inode) && *magic == cpu_to_le32(EXT4_XATTR_MAGIC)) { + int err; + ext4_set_inode_state(inode, EXT4_STATE_XATTR); - return ext4_find_inline_data_nolock(inode); + err = ext4_find_inline_data_nolock(inode); + if (!err && ext4_has_inline_data(inode)) + ext4_set_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA); + return err; } else EXT4_I(inode)->i_inline_off = 0; return 0;
From: Ye Bin yebin10@huawei.com
commit 2b96b4a5d9443ca4cad58b0040be455803c05a42 upstream.
Syzbot found the following issue: EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni" fscrypt: AES-256-XTS using implementation "xts-aes-aesni" ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 Modules linked in: CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246 RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000 RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248 RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220 R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40 R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c FS: 0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __alloc_pages_node include/linux/gfp.h:237 [inline] alloc_pages_node include/linux/gfp.h:260 [inline] __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113 __do_kmalloc_node mm/slab_common.c:956 [inline] __kmalloc+0xfe/0x190 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] kzalloc include/linux/slab.h:720 [inline] ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346 ext4_update_inline_dir fs/ext4/inline.c:1115 [inline] ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307 ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385 ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772 ext4_create+0x36c/0x560 fs/ext4/namei.c:2817 lookup_open fs/namei.c:3413 [inline] open_last_lookups fs/namei.c:3481 [inline] path_openat+0x12ac/0x2dd0 fs/namei.c:3711 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_openat fs/open.c:1342 [inline] __se_sys_openat fs/open.c:1337 [inline] __x64_sys_openat+0x243/0x290 fs/open.c:1337 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Above issue happens as follows: ext4_iget ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60 ext4_try_add_inline_entry __ext4_mark_inode_dirty ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44 ext4_xattr_shift_entries ->after shift i_inline_off is incorrect, actually is change to 176 ext4_try_add_inline_entry ext4_update_inline_dir get_max_inline_xattr_value_size if (EXT4_I(inode)->i_inline_off) entry = (struct ext4_xattr_entry *)((void *)raw_inode + EXT4_I(inode)->i_inline_off); free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)); ->As entry is incorrect, then 'free' may be negative ext4_update_inline_data value = kzalloc(len, GFP_NOFS); -> len is unsigned int, maybe very large, then trigger warning when 'kzalloc()'
To resolve the above issue we need to update 'i_inline_off' after 'ext4_xattr_shift_entries()'. We do not need to set EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty() already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA when it is needed may trigger a BUG_ON in ext4_writepages().
Reported-by: syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: Ye Bin yebin10@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2789,6 +2789,9 @@ shift: (void *)header, total_ino); EXT4_I(inode)->i_extra_isize = new_extra_isize;
+ if (ext4_has_inline_data(inode)) + error = ext4_find_inline_data_nolock(inode); + cleanup: if (error && (mnt_count != le16_to_cpu(sbi->s_es->s_mnt_count))) { ext4_warning(inode->i_sb, "Unable to expand inode %lu. Delete some EAs or run e2fsck.",
From: Zhihao Cheng chengzhihao1@huawei.com
commit f5361da1e60d54ec81346aee8e3d8baf1be0b762 upstream.
If the boot loader inode has never been used before, the EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the i_size to 0. However, if the "never before used" boot loader has a non-zero i_size, then i_disksize will be non-zero, and the inconsistency between i_size and i_disksize can trigger a kernel warning:
WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319 CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa RIP: 0010:ext4_file_write_iter+0xbc7/0xd10 Call Trace: vfs_write+0x3b1/0x5c0 ksys_write+0x77/0x160 __x64_sys_write+0x22/0x30 do_syscall_64+0x39/0x80
Reproducer: 1. create corrupted image and mount it: mke2fs -t ext4 /tmp/foo.img 200 debugfs -wR "sif <5> size 25700" /tmp/foo.img mount -t ext4 /tmp/foo.img /mnt cd /mnt echo 123 > file 2. Run the reproducer program: posix_memalign(&buf, 1024, 1024) fd = open("file", O_RDWR | O_DIRECT); ioctl(fd, EXT4_IOC_SWAP_BOOT); write(fd, buf, 1024);
Fix this by setting i_disksize as well as i_size to zero when initiaizing the boot loader inode.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159 Cc: stable@kernel.org Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ioctl.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -184,6 +184,7 @@ static long swap_inode_boot_loader(struc ei_bl->i_flags = 0; inode_set_iversion(inode_bl, 1); i_size_write(inode_bl, 0); + EXT4_I(inode_bl)->i_disksize = inode_bl->i_size; inode_bl->i_mode = S_IFREG; if (ext4_has_feature_extents(sb)) { ext4_set_inode_flag(inode_bl, EXT4_INODE_EXTENTS);
From: Fedor Pchelkin pchelkin@ispras.ru
commit 7d834b4d1ab66c48e8c0810fdeadaabb80fa2c81 upstream.
cb_context should be freed on the error path in nfc_se_io as stated by commit 25ff6f8a5a3b ("nfc: fix memory leak of se_io context in nfc_genl_se_io").
Make the error path in nfc_se_io unwind everything in reverse order, i.e. free the cb_context after unlocking the device.
Suggested-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20230306212650.230322-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/nfc/netlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -1446,8 +1446,8 @@ static int nfc_se_io(struct nfc_dev *dev return rc;
error: - kfree(cb_context); device_unlock(&dev->dev); + kfree(cb_context); return rc; }
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit ae0946cd3601752dc58f86d84258e5361e9c8cd4 ]
Iterating over set bits in 'vcpu_bitmap' should be faster than going through all vCPUs, especially when just a few bits are set.
Drop kvm_make_vcpus_request_mask() call from kvm_make_all_cpus_request_except() to avoid handling the special case when 'vcpu_bitmap' is NULL, move the code to kvm_make_all_cpus_request_except() itself.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Reviewed-by: Sean Christopherson seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Message-Id: 20210903075141.403071-5-vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Stable-dep-of: 2b0128127373 ("KVM: Register /dev/kvm as the _very_ last thing during initialization") Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/kvm_main.c | 88 +++++++++++++++++++++++++++------------------ 1 file changed, 53 insertions(+), 35 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3ffed093d3ea2..d08dbf0be2fce 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -257,50 +257,57 @@ static inline bool kvm_kick_many_cpus(cpumask_var_t tmp, bool wait) return true; }
+static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned int req, cpumask_var_t tmp, + int current_cpu) +{ + int cpu; + + kvm_make_request(req, vcpu); + + if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) + return; + + /* + * tmp can be "unavailable" if cpumasks are allocated off stack as + * allocation of the mask is deliberately not fatal and is handled by + * falling back to kicking all online CPUs. + */ + if (!cpumask_available(tmp)) + return; + + /* + * Note, the vCPU could get migrated to a different pCPU at any point + * after kvm_request_needs_ipi(), which could result in sending an IPI + * to the previous pCPU. But, that's OK because the purpose of the IPI + * is to ensure the vCPU returns to OUTSIDE_GUEST_MODE, which is + * satisfied if the vCPU migrates. Entering READING_SHADOW_PAGE_TABLES + * after this point is also OK, as the requirement is only that KVM wait + * for vCPUs that were reading SPTEs _before_ any changes were + * finalized. See kvm_vcpu_kick() for more details on handling requests. + */ + if (kvm_request_needs_ipi(vcpu, req)) { + cpu = READ_ONCE(vcpu->cpu); + if (cpu != -1 && cpu != current_cpu) + __cpumask_set_cpu(cpu, tmp); + } +} + bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, struct kvm_vcpu *except, unsigned long *vcpu_bitmap, cpumask_var_t tmp) { - int i, cpu, me; struct kvm_vcpu *vcpu; + int i, me; bool called;
me = get_cpu();
- kvm_for_each_vcpu(i, vcpu, kvm) { - if ((vcpu_bitmap && !test_bit(i, vcpu_bitmap)) || - vcpu == except) + for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) { + vcpu = kvm_get_vcpu(kvm, i); + if (!vcpu || vcpu == except) continue; - - kvm_make_request(req, vcpu); - - if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) - continue; - - /* - * tmp can be "unavailable" if cpumasks are allocated off stack - * as allocation of the mask is deliberately not fatal and is - * handled by falling back to kicking all online CPUs. - */ - if (!cpumask_available(tmp)) - continue; - - /* - * Note, the vCPU could get migrated to a different pCPU at any - * point after kvm_request_needs_ipi(), which could result in - * sending an IPI to the previous pCPU. But, that's ok because - * the purpose of the IPI is to ensure the vCPU returns to - * OUTSIDE_GUEST_MODE, which is satisfied if the vCPU migrates. - * Entering READING_SHADOW_PAGE_TABLES after this point is also - * ok, as the requirement is only that KVM wait for vCPUs that - * were reading SPTEs _before_ any changes were finalized. See - * kvm_vcpu_kick() for more details on handling requests. - */ - if (kvm_request_needs_ipi(vcpu, req)) { - cpu = READ_ONCE(vcpu->cpu); - if (cpu != -1 && cpu != me) - __cpumask_set_cpu(cpu, tmp); - } + kvm_make_vcpu_request(kvm, vcpu, req, tmp, me); }
called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT)); @@ -312,12 +319,23 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req, struct kvm_vcpu *except) { + struct kvm_vcpu *vcpu; cpumask_var_t cpus; bool called; + int i, me;
zalloc_cpumask_var(&cpus, GFP_ATOMIC);
- called = kvm_make_vcpus_request_mask(kvm, req, except, NULL, cpus); + me = get_cpu(); + + kvm_for_each_vcpu(i, vcpu, kvm) { + if (vcpu == except) + continue; + kvm_make_vcpu_request(kvm, vcpu, req, cpus, me); + } + + called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT)); + put_cpu();
free_cpumask_var(cpus); return called;
From: Vitaly Kuznetsov vkuznets@redhat.com
[ Upstream commit baff59ccdc657d290be51b95b38ebe5de40036b4 ]
Allocating cpumask dynamically in zalloc_cpumask_var() is not ideal. Allocation is somewhat slow and can (in theory and when CPUMASK_OFFSTACK) fail. kvm_make_all_cpus_request_except() already disables preemption so we can use pre-allocated per-cpu cpumasks instead.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Reviewed-by: Sean Christopherson seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Message-Id: 20210903075141.403071-8-vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Stable-dep-of: 2b0128127373 ("KVM: Register /dev/kvm as the _very_ last thing during initialization") Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/kvm_main.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d08dbf0be2fce..c60e88b0b24ec 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -157,6 +157,8 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm); static unsigned long long kvm_createvm_count; static unsigned long long kvm_active_vms;
+static DEFINE_PER_CPU(cpumask_var_t, cpu_kick_mask); + __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm, unsigned long start, unsigned long end) { @@ -320,14 +322,15 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req, struct kvm_vcpu *except) { struct kvm_vcpu *vcpu; - cpumask_var_t cpus; + struct cpumask *cpus; bool called; int i, me;
- zalloc_cpumask_var(&cpus, GFP_ATOMIC); - me = get_cpu();
+ cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask); + cpumask_clear(cpus); + kvm_for_each_vcpu(i, vcpu, kvm) { if (vcpu == except) continue; @@ -337,7 +340,6 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req, called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT)); put_cpu();
- free_cpumask_var(cpus); return called; }
@@ -5637,9 +5639,17 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, goto out_free_3; }
+ for_each_possible_cpu(cpu) { + if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu), + GFP_KERNEL, cpu_to_node(cpu))) { + r = -ENOMEM; + goto out_free_4; + } + } + r = kvm_async_pf_init(); if (r) - goto out_free; + goto out_free_5;
kvm_chardev_ops.owner = module; kvm_vm_fops.owner = module; @@ -5665,7 +5675,10 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
out_unreg: kvm_async_pf_deinit(); -out_free: +out_free_5: + for_each_possible_cpu(cpu) + free_cpumask_var(per_cpu(cpu_kick_mask, cpu)); +out_free_4: kmem_cache_destroy(kvm_vcpu_cache); out_free_3: unregister_reboot_notifier(&kvm_reboot_notifier); @@ -5685,8 +5698,12 @@ EXPORT_SYMBOL_GPL(kvm_init);
void kvm_exit(void) { + int cpu; + debugfs_remove_recursive(kvm_debugfs_dir); misc_deregister(&kvm_dev); + for_each_possible_cpu(cpu) + free_cpumask_var(per_cpu(cpu_kick_mask, cpu)); kmem_cache_destroy(kvm_vcpu_cache); kvm_async_pf_deinit(); unregister_syscore_ops(&kvm_syscore_ops);
From: Sean Christopherson seanjc@google.com
[ Upstream commit 2b01281273738bf2d6551da48d65db2df3f28998 ]
Register /dev/kvm, i.e. expose KVM to userspace, only after all other setup has completed. Once /dev/kvm is exposed, userspace can start invoking KVM ioctls, creating VMs, etc... If userspace creates a VM before KVM is done with its configuration, bad things may happen, e.g. KVM will fail to properly migrate vCPU state if a VM is created before KVM has registered preemption notifiers.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20221130230934.1014142-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/kvm_main.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index c60e88b0b24ec..17e22c654c4e4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5655,12 +5655,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, kvm_vm_fops.owner = module; kvm_vcpu_fops.owner = module;
- r = misc_register(&kvm_dev); - if (r) { - pr_err("kvm: misc device register failed\n"); - goto out_unreg; - } - register_syscore_ops(&kvm_syscore_ops);
kvm_preempt_ops.sched_in = kvm_sched_in; @@ -5669,11 +5663,24 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align, kvm_init_debug();
r = kvm_vfio_ops_init(); - WARN_ON(r); + if (WARN_ON_ONCE(r)) + goto err_vfio; + + /* + * Registration _must_ be the very last thing done, as this exposes + * /dev/kvm to userspace, i.e. all infrastructure must be setup! + */ + r = misc_register(&kvm_dev); + if (r) { + pr_err("kvm: misc device register failed\n"); + goto err_register; + }
return 0;
-out_unreg: +err_register: + kvm_vfio_ops_exit(); +err_vfio: kvm_async_pf_deinit(); out_free_5: for_each_possible_cpu(cpu) @@ -5700,8 +5707,14 @@ void kvm_exit(void) { int cpu;
- debugfs_remove_recursive(kvm_debugfs_dir); + /* + * Note, unregistering /dev/kvm doesn't strictly need to come first, + * fops_get(), a.k.a. try_module_get(), prevents acquiring references + * to KVM while the module is being stopped. + */ misc_deregister(&kvm_dev); + + debugfs_remove_recursive(kvm_debugfs_dir); for_each_possible_cpu(cpu) free_cpumask_var(per_cpu(cpu_kick_mask, cpu)); kmem_cache_destroy(kvm_vcpu_cache);
From: Sean Christopherson seanjc@google.com
[ Upstream commit b51818afdc1d3c7cc269e295953685558d3af71c ]
Don't bother rewriting the ICR value into the vAPIC page on an AVIC IPI virtualization failure, the access is a trap, i.e. the value has already been written to the vAPIC page. The one caveat is if hardware left the BUSY flag set (which appears to happen somewhat arbitrarily), in which case go through the "nodecode" APIC-write path in order to clear the BUSY flag.
Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20220204214205.3306634-6-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Stable-dep-of: 5aede752a839 ("KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/lapic.c | 1 + arch/x86/kvm/svm/avic.c | 22 +++++++++++----------- 2 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 25530a908b4cd..8c9e41ff2a24e 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1295,6 +1295,7 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL); } +EXPORT_SYMBOL_GPL(kvm_apic_send_ipi);
static u32 apic_get_tmcct(struct kvm_lapic *apic) { diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 3d3f8dfb80457..52778be77713f 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -320,18 +320,18 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu) switch (id) { case AVIC_IPI_FAILURE_INVALID_INT_TYPE: /* - * AVIC hardware handles the generation of - * IPIs when the specified Message Type is Fixed - * (also known as fixed delivery mode) and - * the Trigger Mode is edge-triggered. The hardware - * also supports self and broadcast delivery modes - * specified via the Destination Shorthand(DSH) - * field of the ICRL. Logical and physical APIC ID - * formats are supported. All other IPI types cause - * a #VMEXIT, which needs to emulated. + * Emulate IPIs that are not handled by AVIC hardware, which + * only virtualizes Fixed, Edge-Triggered INTRs. The exit is + * a trap, e.g. ICR holds the correct value and RIP has been + * advanced, KVM is responsible only for emulating the IPI. + * Sadly, hardware may sometimes leave the BUSY flag set, in + * which case KVM needs to emulate the ICR write as well in + * order to clear the BUSY flag. */ - kvm_lapic_reg_write(apic, APIC_ICR2, icrh); - kvm_lapic_reg_write(apic, APIC_ICR, icrl); + if (icrl & APIC_ICR_BUSY) + kvm_apic_write_nodecode(vcpu, APIC_ICR); + else + kvm_apic_send_ipi(apic, icrl, icrh); break; case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: /*
From: Sean Christopherson seanjc@google.com
[ Upstream commit 5aede752a839904059c2b5d68be0dc4501c6c15f ]
Emulate ICR writes on AVIC IPI failures due to invalid targets using the same logic as failures due to invalid types. AVIC acceleration fails if _any_ of the targets are invalid, and crucially VM-Exits before sending IPIs to targets that _are_ valid. In logical mode, the destination is a bitmap, i.e. a single IPI can target multiple logical IDs. Doing nothing causes KVM to drop IPIs if at least one target is valid and at least one target is invalid.
Fixes: 18f40c53e10f ("svm: Add VMEXIT handlers for AVIC") Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini pbonzini@redhat.com Reviewed-by: Maxim Levitsky mlevitsk@redhat.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230106011306.85230-5-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/svm/avic.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 52778be77713f..b595a33860d70 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -318,14 +318,18 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu) trace_kvm_avic_incomplete_ipi(vcpu->vcpu_id, icrh, icrl, id, index);
switch (id) { + case AVIC_IPI_FAILURE_INVALID_TARGET: case AVIC_IPI_FAILURE_INVALID_INT_TYPE: /* * Emulate IPIs that are not handled by AVIC hardware, which - * only virtualizes Fixed, Edge-Triggered INTRs. The exit is - * a trap, e.g. ICR holds the correct value and RIP has been - * advanced, KVM is responsible only for emulating the IPI. - * Sadly, hardware may sometimes leave the BUSY flag set, in - * which case KVM needs to emulate the ICR write as well in + * only virtualizes Fixed, Edge-Triggered INTRs, and falls over + * if _any_ targets are invalid, e.g. if the logical mode mask + * is a superset of running vCPUs. + * + * The exit is a trap, e.g. ICR holds the correct value and RIP + * has been advanced, KVM is responsible only for emulating the + * IPI. Sadly, hardware may sometimes leave the BUSY flag set, + * in which case KVM needs to emulate the ICR write as well in * order to clear the BUSY flag. */ if (icrl & APIC_ICR_BUSY) @@ -341,8 +345,6 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu) */ avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh); break; - case AVIC_IPI_FAILURE_INVALID_TARGET: - break; case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE: WARN_ONCE(1, "Invalid backing page\n"); break;
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 3e54c9e80e68b765d8877023d93f1eea1b9d1c54 ]
This patch will fix a small issue when printing out that dlm_midcomms_start() failed to start and it was printing out that the dlm subcomponent lowcomms was failed but lowcomms is behind the midcomms layer.
Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Stable-dep-of: aad633dc0cf9 ("fs: dlm: start midcomms before scand") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/lockspace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 10eddfa6c3d7b..a1b34605742fc 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -393,7 +393,7 @@ static int threads_start(void) /* Thread for sending/receiving messages for all lockspace's */ error = dlm_midcomms_start(); if (error) { - log_print("cannot start dlm lowcomms %d", error); + log_print("cannot start dlm midcomms %d", error); goto scand_fail; }
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 8b0188b0d60b6f6183b48380bac49fe080c5ded9 ]
This patch introduces leftovers of init, start, stop and exit functionality. The dlm application layer should always call the midcomms layer which getting aware of such event and redirect it to the lowcomms layer. Some functionality which is currently handled inside the start functionality of midcomms and lowcomms should be handled in the init functionality as it only need to be initialized once when dlm is loaded.
Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Stable-dep-of: aad633dc0cf9 ("fs: dlm: start midcomms before scand") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/lockspace.c | 5 ++--- fs/dlm/lowcomms.c | 16 ++++++++++------ fs/dlm/lowcomms.h | 1 + fs/dlm/main.c | 7 +++++-- fs/dlm/midcomms.c | 17 ++++++++++++++++- fs/dlm/midcomms.h | 3 +++ 6 files changed, 37 insertions(+), 12 deletions(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index a1b34605742fc..1b49f375829d0 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -17,7 +17,6 @@ #include "recoverd.h" #include "dir.h" #include "midcomms.h" -#include "lowcomms.h" #include "config.h" #include "memory.h" #include "lock.h" @@ -705,7 +704,7 @@ int dlm_new_lockspace(const char *name, const char *cluster, if (!ls_count) { dlm_scand_stop(); dlm_midcomms_shutdown(); - dlm_lowcomms_stop(); + dlm_midcomms_stop(); } out: mutex_unlock(&ls_lock); @@ -889,7 +888,7 @@ int dlm_release_lockspace(void *lockspace, int force) if (!error) ls_count--; if (!ls_count) - dlm_lowcomms_stop(); + dlm_midcomms_stop(); mutex_unlock(&ls_lock);
return error; diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index d56a8f88a3852..1eb95ba7e7772 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1959,10 +1959,6 @@ static const struct dlm_proto_ops dlm_sctp_ops = { int dlm_lowcomms_start(void) { int error = -EINVAL; - int i; - - for (i = 0; i < CONN_HASH_SIZE; i++) - INIT_HLIST_HEAD(&connection_hash[i]);
init_local(); if (!dlm_local_count) { @@ -1971,8 +1967,6 @@ int dlm_lowcomms_start(void) goto fail; }
- INIT_WORK(&listen_con.rwork, process_listen_recv_socket); - error = work_start(); if (error) goto fail_local; @@ -2011,6 +2005,16 @@ int dlm_lowcomms_start(void) return error; }
+void dlm_lowcomms_init(void) +{ + int i; + + for (i = 0; i < CONN_HASH_SIZE; i++) + INIT_HLIST_HEAD(&connection_hash[i]); + + INIT_WORK(&listen_con.rwork, process_listen_recv_socket); +} + void dlm_lowcomms_exit(void) { struct dlm_node_addr *na, *safe; diff --git a/fs/dlm/lowcomms.h b/fs/dlm/lowcomms.h index 4ccae07cf0058..26433632d1717 100644 --- a/fs/dlm/lowcomms.h +++ b/fs/dlm/lowcomms.h @@ -35,6 +35,7 @@ extern int dlm_allow_conn; int dlm_lowcomms_start(void); void dlm_lowcomms_shutdown(void); void dlm_lowcomms_stop(void); +void dlm_lowcomms_init(void); void dlm_lowcomms_exit(void); int dlm_lowcomms_close(int nodeid); struct dlm_msg *dlm_lowcomms_new_msg(int nodeid, int len, gfp_t allocation, diff --git a/fs/dlm/main.c b/fs/dlm/main.c index afc66a1346d3d..974f7ebb3fe63 100644 --- a/fs/dlm/main.c +++ b/fs/dlm/main.c @@ -17,7 +17,7 @@ #include "user.h" #include "memory.h" #include "config.h" -#include "lowcomms.h" +#include "midcomms.h"
static int __init init_dlm(void) { @@ -27,6 +27,8 @@ static int __init init_dlm(void) if (error) goto out;
+ dlm_midcomms_init(); + error = dlm_lockspace_init(); if (error) goto out_mem; @@ -63,6 +65,7 @@ static int __init init_dlm(void) out_lockspace: dlm_lockspace_exit(); out_mem: + dlm_midcomms_exit(); dlm_memory_exit(); out: return error; @@ -76,7 +79,7 @@ static void __exit exit_dlm(void) dlm_config_exit(); dlm_memory_exit(); dlm_lockspace_exit(); - dlm_lowcomms_exit(); + dlm_midcomms_exit(); dlm_unregister_debugfs(); }
diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c index 702c14de7a4bd..84a7a39fc12e6 100644 --- a/fs/dlm/midcomms.c +++ b/fs/dlm/midcomms.c @@ -1142,13 +1142,28 @@ void dlm_midcomms_commit_mhandle(struct dlm_mhandle *mh) }
int dlm_midcomms_start(void) +{ + return dlm_lowcomms_start(); +} + +void dlm_midcomms_stop(void) +{ + dlm_lowcomms_stop(); +} + +void dlm_midcomms_init(void) { int i;
for (i = 0; i < CONN_HASH_SIZE; i++) INIT_HLIST_HEAD(&node_hash[i]);
- return dlm_lowcomms_start(); + dlm_lowcomms_init(); +} + +void dlm_midcomms_exit(void) +{ + dlm_lowcomms_exit(); }
static void dlm_act_fin_ack_rcv(struct midcomms_node *node) diff --git a/fs/dlm/midcomms.h b/fs/dlm/midcomms.h index 579abc6929be2..1a36b7834dfc5 100644 --- a/fs/dlm/midcomms.h +++ b/fs/dlm/midcomms.h @@ -20,6 +20,9 @@ struct dlm_mhandle *dlm_midcomms_get_mhandle(int nodeid, int len, void dlm_midcomms_commit_mhandle(struct dlm_mhandle *mh); int dlm_midcomms_close(int nodeid); int dlm_midcomms_start(void); +void dlm_midcomms_stop(void); +void dlm_midcomms_init(void); +void dlm_midcomms_exit(void); void dlm_midcomms_shutdown(void); void dlm_midcomms_add_member(int nodeid); void dlm_midcomms_remove_member(int nodeid);
From: Alexander Aring aahringo@redhat.com
[ Upstream commit aad633dc0cf90093998b1ae0ba9f19b5f1dab644 ]
The scand kthread can send dlm messages out, especially dlm remove messages to free memory for unused rsb on other nodes. To send out dlm messages, midcomms must be initialized. This patch moves the midcomms start before scand is started.
Cc: stable@vger.kernel.org Fixes: e7fd41792fc0 ("[DLM] The core of the DLM for GFS2/CLVM") Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/lockspace.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 1b49f375829d0..fa086a81a8476 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -383,23 +383,23 @@ static int threads_start(void) { int error;
- error = dlm_scand_start(); + /* Thread for sending/receiving messages for all lockspace's */ + error = dlm_midcomms_start(); if (error) { - log_print("cannot start dlm_scand thread %d", error); + log_print("cannot start dlm midcomms %d", error); goto fail; }
- /* Thread for sending/receiving messages for all lockspace's */ - error = dlm_midcomms_start(); + error = dlm_scand_start(); if (error) { - log_print("cannot start dlm midcomms %d", error); - goto scand_fail; + log_print("cannot start dlm_scand thread %d", error); + goto midcomms_fail; }
return 0;
- scand_fail: - dlm_scand_stop(); + midcomms_fail: + dlm_midcomms_stop(); fail: return error; }
From: Jan Kara jack@suse.cz
[ Upstream commit f54aa97fb7e5329a373f9df4e5e213ced4fc8759 ]
The condition determining whether the preallocation can be used had an off-by-one error so we didn't discard preallocation when new allocation was just following it. This can then confuse code in inode_getblk().
CC: stable@vger.kernel.org Fixes: 16d055656814 ("udf: Discard preallocation before extending file with a hole") Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/udf/inode.c b/fs/udf/inode.c index a151e04856afe..594d224588819 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -442,7 +442,7 @@ static int udf_get_block(struct inode *inode, sector_t block, * Block beyond EOF and prealloc extents? Just discard preallocation * as it is not useful and complicates things. */ - if (((loff_t)block) << inode->i_blkbits > iinfo->i_lenExtents) + if (((loff_t)block) << inode->i_blkbits >= iinfo->i_lenExtents) udf_discard_prealloc(inode); udf_clear_extent_cache(inode); phys = inode_getblk(inode, block, &err, &new);
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit 0df035c7208c5e3e2ae7685548353ae536a19015 ]
Let's cache nat entry if there's no lock contention only.
Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Stable-dep-of: 3aa51c61cb4a ("f2fs: retry to update the inode page given data corruption") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/node.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index f810c6bbeff02..7f00f3004a665 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -430,6 +430,10 @@ static void cache_nat_entry(struct f2fs_sb_info *sbi, nid_t nid, struct f2fs_nm_info *nm_i = NM_I(sbi); struct nat_entry *new, *e;
+ /* Let's mitigate lock contention of nat_tree_lock during checkpoint */ + if (rwsem_is_locked(&sbi->cp_global_sem)) + return; + new = __alloc_nat_entry(sbi, nid, false); if (!new) return;
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit a9419b63bf414775e8aeee95d8c4a5e0df690748 ]
This patch tries to mitigate lock contention between f2fs_write_checkpoint and f2fs_get_node_info along with nat_tree_lock.
The idea is, if checkpoint is currently running, other threads that try to grab nat_tree_lock would be better to wait for checkpoint.
Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Stable-dep-of: 3aa51c61cb4a ("f2fs: retry to update the inode page given data corruption") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/checkpoint.c | 2 +- fs/f2fs/compress.c | 2 +- fs/f2fs/data.c | 8 ++++---- fs/f2fs/f2fs.h | 2 +- fs/f2fs/file.c | 2 +- fs/f2fs/gc.c | 6 +++--- fs/f2fs/inline.c | 4 ++-- fs/f2fs/inode.c | 2 +- fs/f2fs/node.c | 19 ++++++++++--------- fs/f2fs/recovery.c | 2 +- fs/f2fs/segment.c | 2 +- 11 files changed, 26 insertions(+), 25 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 02840dadde5d4..c68f1f8000f17 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -672,7 +672,7 @@ static int recover_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino) /* truncate all the data during iput */ iput(inode);
- err = f2fs_get_node_info(sbi, ino, &ni); + err = f2fs_get_node_info(sbi, ino, &ni, false); if (err) goto err_out;
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 6adf047259546..4fa62f98cb515 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1275,7 +1275,7 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
psize = (loff_t)(cc->rpages[last_index]->index + 1) << PAGE_SHIFT;
- err = f2fs_get_node_info(fio.sbi, dn.nid, &ni); + err = f2fs_get_node_info(fio.sbi, dn.nid, &ni, false); if (err) goto out_put_dnode;
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index ee2909267a33b..524d4b49a5209 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1354,7 +1354,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type) if (unlikely(is_inode_flag_set(dn->inode, FI_NO_ALLOC))) return -EPERM;
- err = f2fs_get_node_info(sbi, dn->nid, &ni); + err = f2fs_get_node_info(sbi, dn->nid, &ni, false); if (err) return err;
@@ -1796,7 +1796,7 @@ static int f2fs_xattr_fiemap(struct inode *inode, if (!page) return -ENOMEM;
- err = f2fs_get_node_info(sbi, inode->i_ino, &ni); + err = f2fs_get_node_info(sbi, inode->i_ino, &ni, false); if (err) { f2fs_put_page(page, 1); return err; @@ -1828,7 +1828,7 @@ static int f2fs_xattr_fiemap(struct inode *inode, if (!page) return -ENOMEM;
- err = f2fs_get_node_info(sbi, xnid, &ni); + err = f2fs_get_node_info(sbi, xnid, &ni, false); if (err) { f2fs_put_page(page, 1); return err; @@ -2688,7 +2688,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio) fio->need_lock = LOCK_REQ; }
- err = f2fs_get_node_info(fio->sbi, dn.nid, &ni); + err = f2fs_get_node_info(fio->sbi, dn.nid, &ni, false); if (err) goto out_writepage;
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index a144471c53166..80e4f9afe86f7 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3416,7 +3416,7 @@ int f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid); bool f2fs_is_checkpointed_node(struct f2fs_sb_info *sbi, nid_t nid); bool f2fs_need_inode_block_update(struct f2fs_sb_info *sbi, nid_t ino); int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, - struct node_info *ni); + struct node_info *ni, bool checkpoint_context); pgoff_t f2fs_get_next_page_offset(struct dnode_of_data *dn, pgoff_t pgofs); int f2fs_get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode); int f2fs_truncate_inode_blocks(struct inode *inode, pgoff_t from); diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 326c1a4c2a6ac..3be34ea4e2998 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1232,7 +1232,7 @@ static int __clone_blkaddrs(struct inode *src_inode, struct inode *dst_inode, if (ret) return ret;
- ret = f2fs_get_node_info(sbi, dn.nid, &ni); + ret = f2fs_get_node_info(sbi, dn.nid, &ni, false); if (ret) { f2fs_put_dnode(&dn); return ret; diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index fa1f5fb750b39..615b109570b05 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -944,7 +944,7 @@ static int gc_node_segment(struct f2fs_sb_info *sbi, continue; }
- if (f2fs_get_node_info(sbi, nid, &ni)) { + if (f2fs_get_node_info(sbi, nid, &ni, false)) { f2fs_put_page(node_page, 1); continue; } @@ -1012,7 +1012,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum, if (IS_ERR(node_page)) return false;
- if (f2fs_get_node_info(sbi, nid, dni)) { + if (f2fs_get_node_info(sbi, nid, dni, false)) { f2fs_put_page(node_page, 1); return false; } @@ -1223,7 +1223,7 @@ static int move_data_block(struct inode *inode, block_t bidx,
f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
- err = f2fs_get_node_info(fio.sbi, dn.nid, &ni); + err = f2fs_get_node_info(fio.sbi, dn.nid, &ni, false); if (err) goto put_out;
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c index bce1c2ae6d153..e4fc169a07f55 100644 --- a/fs/f2fs/inline.c +++ b/fs/f2fs/inline.c @@ -146,7 +146,7 @@ int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page) if (err) return err;
- err = f2fs_get_node_info(fio.sbi, dn->nid, &ni); + err = f2fs_get_node_info(fio.sbi, dn->nid, &ni, false); if (err) { f2fs_truncate_data_blocks_range(dn, 1); f2fs_put_dnode(dn); @@ -797,7 +797,7 @@ int f2fs_inline_data_fiemap(struct inode *inode, ilen = start + len; ilen -= start;
- err = f2fs_get_node_info(F2FS_I_SB(inode), inode->i_ino, &ni); + err = f2fs_get_node_info(F2FS_I_SB(inode), inode->i_ino, &ni, false); if (err) goto out;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index bd8960f4966bc..2fa0fcffe0c6d 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -888,7 +888,7 @@ void f2fs_handle_failed_inode(struct inode *inode) * so we can prevent losing this orphan when encoutering checkpoint * and following suddenly power-off. */ - err = f2fs_get_node_info(sbi, inode->i_ino, &ni); + err = f2fs_get_node_info(sbi, inode->i_ino, &ni, false); if (err) { set_sbi_flag(sbi, SBI_NEED_FSCK); set_inode_flag(inode, FI_FREE_NID); diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 7f00f3004a665..89a7f6021c369 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -543,7 +543,7 @@ int f2fs_try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink) }
int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, - struct node_info *ni) + struct node_info *ni, bool checkpoint_context) { struct f2fs_nm_info *nm_i = NM_I(sbi); struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA); @@ -576,9 +576,10 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid, * nat_tree_lock. Therefore, we should retry, if we failed to grab here * while not bothering checkpoint. */ - if (!rwsem_is_locked(&sbi->cp_global_sem)) { + if (!rwsem_is_locked(&sbi->cp_global_sem) || checkpoint_context) { down_read(&curseg->journal_rwsem); - } else if (!down_read_trylock(&curseg->journal_rwsem)) { + } else if (rwsem_is_contended(&nm_i->nat_tree_lock) || + !down_read_trylock(&curseg->journal_rwsem)) { up_read(&nm_i->nat_tree_lock); goto retry; } @@ -891,7 +892,7 @@ static int truncate_node(struct dnode_of_data *dn) int err; pgoff_t index;
- err = f2fs_get_node_info(sbi, dn->nid, &ni); + err = f2fs_get_node_info(sbi, dn->nid, &ni, false); if (err) return err;
@@ -1290,7 +1291,7 @@ struct page *f2fs_new_node_page(struct dnode_of_data *dn, unsigned int ofs) goto fail;
#ifdef CONFIG_F2FS_CHECK_FS - err = f2fs_get_node_info(sbi, dn->nid, &new_ni); + err = f2fs_get_node_info(sbi, dn->nid, &new_ni, false); if (err) { dec_valid_node_count(sbi, dn->inode, !ofs); goto fail; @@ -1356,7 +1357,7 @@ static int read_node_page(struct page *page, int op_flags) return LOCKED_PAGE; }
- err = f2fs_get_node_info(sbi, page->index, &ni); + err = f2fs_get_node_info(sbi, page->index, &ni, false); if (err) return err;
@@ -1607,7 +1608,7 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted, nid = nid_of_node(page); f2fs_bug_on(sbi, page->index != nid);
- if (f2fs_get_node_info(sbi, nid, &ni)) + if (f2fs_get_node_info(sbi, nid, &ni, !do_balance)) goto redirty_out;
if (wbc->for_reclaim) { @@ -2712,7 +2713,7 @@ int f2fs_recover_xattr_data(struct inode *inode, struct page *page) goto recover_xnid;
/* 1: invalidate the previous xattr nid */ - err = f2fs_get_node_info(sbi, prev_xnid, &ni); + err = f2fs_get_node_info(sbi, prev_xnid, &ni, false); if (err) return err;
@@ -2752,7 +2753,7 @@ int f2fs_recover_inode_page(struct f2fs_sb_info *sbi, struct page *page) struct page *ipage; int err;
- err = f2fs_get_node_info(sbi, ino, &old_ni); + err = f2fs_get_node_info(sbi, ino, &old_ni, false); if (err) return err;
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c index ed21e34b59c7f..ba7eeb3c27384 100644 --- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -604,7 +604,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
f2fs_wait_on_page_writeback(dn.node_page, NODE, true, true);
- err = f2fs_get_node_info(sbi, dn.nid, &ni); + err = f2fs_get_node_info(sbi, dn.nid, &ni, false); if (err) goto err;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 194c0811fbdfe..58dd4de41986e 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -253,7 +253,7 @@ static int __revoke_inmem_pages(struct inode *inode, goto next; }
- err = f2fs_get_node_info(sbi, dn.nid, &ni); + err = f2fs_get_node_info(sbi, dn.nid, &ni, false); if (err) { f2fs_put_dnode(&dn); return err;
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit 3aa51c61cb4a4dcb40df51ac61171e9ac5a35321 ]
If the storage gives a corrupted node block due to short power failure and reset, f2fs stops the entire operations by setting the checkpoint failure flag.
Let's give more chances to live by re-issuing IOs for a while in such critical path.
Cc: stable@vger.kernel.org Suggested-by: Randall Huang huangrandall@google.com Suggested-by: Chao Yu chao@kernel.org Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/f2fs/inode.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 2fa0fcffe0c6d..94e21136d5790 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -681,17 +681,19 @@ void f2fs_update_inode_page(struct inode *inode) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); struct page *node_page; + int count = 0; retry: node_page = f2fs_get_node_page(sbi, inode->i_ino); if (IS_ERR(node_page)) { int err = PTR_ERR(node_page);
- if (err == -ENOMEM) { - cond_resched(); + /* The node block was truncated. */ + if (err == -ENOENT) + return; + + if (err == -ENOMEM || ++count <= DEFAULT_RETRY_IO_COUNT) goto retry; - } else if (err != -ENOENT) { - f2fs_stop_checkpoint(sbi, false); - } + f2fs_stop_checkpoint(sbi, false); return; } f2fs_update_inode(inode, node_page);
From: Corey Minyard cminyard@mvista.com
[ Upstream commit 39721d62bbc16ebc9bb2bdc2c163658f33da3b0b ]
The spec states that the minimum message retry time is 60ms, but it was set to 20ms. Correct it.
Reported by: Tony Camuso tcamuso@redhat.com Signed-off-by: Corey Minyard cminyard@mvista.com Stable-dep-of: 00bb7e763ec9 ("ipmi:ssif: Add a timer between request retries") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/ipmi/ipmi_ssif.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c index 427bf618c4470..cae21632cf079 100644 --- a/drivers/char/ipmi/ipmi_ssif.c +++ b/drivers/char/ipmi/ipmi_ssif.c @@ -74,7 +74,7 @@ /* * Timer values */ -#define SSIF_MSG_USEC 20000 /* 20ms between message tries. */ +#define SSIF_MSG_USEC 60000 /* 60ms between message tries. */ #define SSIF_MSG_PART_USEC 5000 /* 5ms for a message part */
/* How many times to we retry sending/receiving the message. */
From: Corey Minyard cminyard@mvista.com
[ Upstream commit 00bb7e763ec9f384cb382455cb6ba5588b5375cf ]
The IPMI spec has a time (T6) specified between request retries. Add the handling for that.
Reported by: Tony Camuso tcamuso@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Corey Minyard cminyard@mvista.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/ipmi/ipmi_ssif.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-)
diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c index cae21632cf079..20dc2452815c7 100644 --- a/drivers/char/ipmi/ipmi_ssif.c +++ b/drivers/char/ipmi/ipmi_ssif.c @@ -74,7 +74,8 @@ /* * Timer values */ -#define SSIF_MSG_USEC 60000 /* 60ms between message tries. */ +#define SSIF_MSG_USEC 60000 /* 60ms between message tries (T3). */ +#define SSIF_REQ_RETRY_USEC 60000 /* 60ms between send retries (T6). */ #define SSIF_MSG_PART_USEC 5000 /* 5ms for a message part */
/* How many times to we retry sending/receiving the message. */ @@ -82,7 +83,9 @@ #define SSIF_RECV_RETRIES 250
#define SSIF_MSG_MSEC (SSIF_MSG_USEC / 1000) +#define SSIF_REQ_RETRY_MSEC (SSIF_REQ_RETRY_USEC / 1000) #define SSIF_MSG_JIFFIES ((SSIF_MSG_USEC * 1000) / TICK_NSEC) +#define SSIF_REQ_RETRY_JIFFIES ((SSIF_REQ_RETRY_USEC * 1000) / TICK_NSEC) #define SSIF_MSG_PART_JIFFIES ((SSIF_MSG_PART_USEC * 1000) / TICK_NSEC)
/* @@ -229,6 +232,9 @@ struct ssif_info { bool got_alert; bool waiting_alert;
+ /* Used to inform the timeout that it should do a resend. */ + bool do_resend; + /* * If set to true, this will request events the next time the * state machine is idle. @@ -538,22 +544,28 @@ static void start_get(struct ssif_info *ssif_info) ssif_info->recv, I2C_SMBUS_BLOCK_DATA); }
+static void start_resend(struct ssif_info *ssif_info); + static void retry_timeout(struct timer_list *t) { struct ssif_info *ssif_info = from_timer(ssif_info, t, retry_timer); unsigned long oflags, *flags; - bool waiting; + bool waiting, resend;
if (ssif_info->stopping) return;
flags = ipmi_ssif_lock_cond(ssif_info, &oflags); + resend = ssif_info->do_resend; + ssif_info->do_resend = false; waiting = ssif_info->waiting_alert; ssif_info->waiting_alert = false; ipmi_ssif_unlock_cond(ssif_info, flags);
if (waiting) start_get(ssif_info); + if (resend) + start_resend(ssif_info); }
static void watch_timeout(struct timer_list *t) @@ -602,8 +614,6 @@ static void ssif_alert(struct i2c_client *client, enum i2c_alert_protocol type, start_get(ssif_info); }
-static void start_resend(struct ssif_info *ssif_info); - static void msg_done_handler(struct ssif_info *ssif_info, int result, unsigned char *data, unsigned int len) { @@ -909,7 +919,13 @@ static void msg_written_handler(struct ssif_info *ssif_info, int result, if (result < 0) { ssif_info->retries_left--; if (ssif_info->retries_left > 0) { - start_resend(ssif_info); + /* + * Wait the retry timeout time per the spec, + * then redo the send. + */ + ssif_info->do_resend = true; + mod_timer(&ssif_info->retry_timer, + jiffies + SSIF_REQ_RETRY_JIFFIES); return; }
@@ -1322,8 +1338,10 @@ static int do_cmd(struct i2c_client *client, int len, unsigned char *msg, ret = i2c_smbus_write_block_data(client, SSIF_IPMI_REQUEST, len, msg); if (ret) { retry_cnt--; - if (retry_cnt > 0) + if (retry_cnt > 0) { + msleep(SSIF_REQ_RETRY_MSEC); goto retry1; + } return -ENODEV; }
@@ -1464,8 +1482,10 @@ static int start_multipart_test(struct i2c_client *client, 32, msg); if (ret) { retry_cnt--; - if (retry_cnt > 0) + if (retry_cnt > 0) { + msleep(SSIF_REQ_RETRY_MSEC); goto retry_write; + } dev_err(&client->dev, "Could not write multi-part start, though the BMC said it could handle it. Just limit sends to one part.\n"); return ret; }
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit d55f7f4c58c07beb5050a834bf57ae2ede599c7e ]
Refactor __irq_domain_alloc_irqs() so that it can be called internally while holding the irq_domain_mutex.
This will be used to fix a shared-interrupt mapping race, hence the Fixes tag.
Fixes: b62b2cf5759b ("irqdomain: Fix handling of type settings for existing mappings") Cc: stable@vger.kernel.org # 4.8 Tested-by: Hsin-Yi Wang hsinyi@chromium.org Tested-by: Mark-PK Tsai mark-pk.tsai@mediatek.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230213104302.17307-6-johan+linaro@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/irq/irqdomain.c | 88 ++++++++++++++++++++++++++----------------------- 1 file changed, 48 insertions(+), 40 deletions(-)
--- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -1464,40 +1464,12 @@ int irq_domain_alloc_irqs_hierarchy(stru return domain->ops->alloc(domain, irq_base, nr_irqs, arg); }
-/** - * __irq_domain_alloc_irqs - Allocate IRQs from domain - * @domain: domain to allocate from - * @irq_base: allocate specified IRQ number if irq_base >= 0 - * @nr_irqs: number of IRQs to allocate - * @node: NUMA node id for memory allocation - * @arg: domain specific argument - * @realloc: IRQ descriptors have already been allocated if true - * @affinity: Optional irq affinity mask for multiqueue devices - * - * Allocate IRQ numbers and initialized all data structures to support - * hierarchy IRQ domains. - * Parameter @realloc is mainly to support legacy IRQs. - * Returns error code or allocated IRQ number - * - * The whole process to setup an IRQ has been split into two steps. - * The first step, __irq_domain_alloc_irqs(), is to allocate IRQ - * descriptor and required hardware resources. The second step, - * irq_domain_activate_irq(), is to program the hardware with preallocated - * resources. In this way, it's easier to rollback when failing to - * allocate resources. - */ -int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base, - unsigned int nr_irqs, int node, void *arg, - bool realloc, const struct irq_affinity_desc *affinity) +static int irq_domain_alloc_irqs_locked(struct irq_domain *domain, int irq_base, + unsigned int nr_irqs, int node, void *arg, + bool realloc, const struct irq_affinity_desc *affinity) { int i, ret, virq;
- if (domain == NULL) { - domain = irq_default_domain; - if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n")) - return -EINVAL; - } - if (realloc && irq_base >= 0) { virq = irq_base; } else { @@ -1516,24 +1488,18 @@ int __irq_domain_alloc_irqs(struct irq_d goto out_free_desc; }
- mutex_lock(&irq_domain_mutex); ret = irq_domain_alloc_irqs_hierarchy(domain, virq, nr_irqs, arg); - if (ret < 0) { - mutex_unlock(&irq_domain_mutex); + if (ret < 0) goto out_free_irq_data; - }
for (i = 0; i < nr_irqs; i++) { ret = irq_domain_trim_hierarchy(virq + i); - if (ret) { - mutex_unlock(&irq_domain_mutex); + if (ret) goto out_free_irq_data; - } } - + for (i = 0; i < nr_irqs; i++) irq_domain_insert_irq(virq + i); - mutex_unlock(&irq_domain_mutex);
return virq;
@@ -1544,6 +1510,48 @@ out_free_desc: return ret; }
+/** + * __irq_domain_alloc_irqs - Allocate IRQs from domain + * @domain: domain to allocate from + * @irq_base: allocate specified IRQ number if irq_base >= 0 + * @nr_irqs: number of IRQs to allocate + * @node: NUMA node id for memory allocation + * @arg: domain specific argument + * @realloc: IRQ descriptors have already been allocated if true + * @affinity: Optional irq affinity mask for multiqueue devices + * + * Allocate IRQ numbers and initialized all data structures to support + * hierarchy IRQ domains. + * Parameter @realloc is mainly to support legacy IRQs. + * Returns error code or allocated IRQ number + * + * The whole process to setup an IRQ has been split into two steps. + * The first step, __irq_domain_alloc_irqs(), is to allocate IRQ + * descriptor and required hardware resources. The second step, + * irq_domain_activate_irq(), is to program the hardware with preallocated + * resources. In this way, it's easier to rollback when failing to + * allocate resources. + */ +int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base, + unsigned int nr_irqs, int node, void *arg, + bool realloc, const struct irq_affinity_desc *affinity) +{ + int ret; + + if (domain == NULL) { + domain = irq_default_domain; + if (WARN(!domain, "domain is NULL; cannot allocate IRQ\n")) + return -EINVAL; + } + + mutex_lock(&irq_domain_mutex); + ret = irq_domain_alloc_irqs_locked(domain, irq_base, nr_irqs, node, arg, + realloc, affinity); + mutex_unlock(&irq_domain_mutex); + + return ret; +} + /* The irq_data was moved, fix the revmap to refer to the new location */ static void irq_domain_fix_revmap(struct irq_data *d) {
From: Jacob Pan jacob.jun.pan@linux.intel.com
[ Upstream commit 194b3348bdbb7db65375c72f3f774aee4cc6614e ]
On platforms that do not support IOMMU Extended capability bit 0 Page-walk Coherency, CPU caches are not snooped when IOMMU is accessing any translation structures. IOMMU access goes only directly to memory. Intel IOMMU code was missing a flush for the PASID table directory that resulted in the unrecoverable fault as shown below.
This patch adds clflush calls whenever allocating and updating a PASID table directory to ensure cache coherency.
On the reverse direction, there's no need to clflush the PASID directory pointer when we deactivate a context entry in that IOMMU hardware will not see the old PASID directory pointer after we clear the context entry. PASID directory entries are also never freed once allocated.
DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Read NO_PASID] Request device [00:0d.2] fault addr 0x1026a4000 [fault reason 0x51] SM: Present bit in Directory Entry is clear DMAR: Dump dmar1 table entries for IOVA 0x1026a4000 DMAR: scalable mode root entry: hi 0x0000000102448001, low 0x0000000101b3e001 DMAR: context entry: hi 0x0000000000000000, low 0x0000000101b4d401 DMAR: pasid dir entry: 0x0000000101b4e001 DMAR: pasid table entry[0]: 0x0000000000000109 DMAR: pasid table entry[1]: 0x0000000000000001 DMAR: pasid table entry[2]: 0x0000000000000000 DMAR: pasid table entry[3]: 0x0000000000000000 DMAR: pasid table entry[4]: 0x0000000000000000 DMAR: pasid table entry[5]: 0x0000000000000000 DMAR: pasid table entry[6]: 0x0000000000000000 DMAR: pasid table entry[7]: 0x0000000000000000 DMAR: PTE not present at level 4
Cc: stable@vger.kernel.org Fixes: 0bbeb01a4faf ("iommu/vt-d: Manage scalalble mode PASID tables") Reviewed-by: Kevin Tian kevin.tian@intel.com Reported-by: Sukumar Ghorai sukumar.ghorai@intel.com Signed-off-by: Ashok Raj ashok.raj@intel.com Signed-off-by: Jacob Pan jacob.jun.pan@linux.intel.com Link: https://lore.kernel.org/r/20230209212843.1788125-1-jacob.jun.pan@linux.intel... Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/pasid.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index 9a3dd55aaa1c2..6dbc43b414ca3 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -186,6 +186,9 @@ int intel_pasid_alloc_table(struct device *dev) attach_out: device_attach_pasid_table(info, pasid_table);
+ if (!ecap_coherent(info->iommu->ecap)) + clflush_cache_range(pasid_table->table, size); + return 0; }
@@ -276,6 +279,10 @@ static struct pasid_entry *intel_pasid_get_entry(struct device *dev, u32 pasid) free_pgtable_page(entries); goto retry; } + if (!ecap_coherent(info->iommu->ecap)) { + clflush_cache_range(entries, VTD_PAGE_SIZE); + clflush_cache_range(&dir[dir_index].val, sizeof(*dir)); + } }
return &entries[index];
From: Luis Chamberlain mcgrof@kernel.org
[ Upstream commit e1528830bd4ebf435d91c154e309e6e028336210 ]
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling.
Signed-off-by: Luis Chamberlain mcgrof@kernel.org Link: https://lore.kernel.org/r/20211015235219.2191207-2-mcgrof@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: 67205f80be99 ("brd: mark as nowait compatible") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/brd.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 2427b2261e516..63ac5cd523408 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -370,6 +370,7 @@ static int brd_alloc(int i) struct brd_device *brd; struct gendisk *disk; char buf[DISK_NAME_LEN]; + int err = -ENOMEM;
mutex_lock(&brd_devices_mutex); list_for_each_entry(brd, &brd_devices, brd_list) { @@ -420,16 +421,20 @@ static int brd_alloc(int i) /* Tell the block layer that this is not a rotational device */ blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue); - add_disk(disk); + err = add_disk(disk); + if (err) + goto out_cleanup_disk;
return 0;
+out_cleanup_disk: + blk_cleanup_disk(disk); out_free_dev: mutex_lock(&brd_devices_mutex); list_del(&brd->brd_list); mutex_unlock(&brd_devices_mutex); kfree(brd); - return -ENOMEM; + return err; }
static void brd_probe(dev_t dev)
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 67205f80be9910207481406c47f7d85e703fb2e9 ]
By default, non-mq drivers do not support nowait. This causes io_uring to use a slower path as the driver cannot be trust not to block. brd can safely set the nowait flag, as worst case all it does is a NOIO allocation.
For io_uring, this makes a substantial difference. Before:
submitter=0, tid=453, file=/dev/ram0, node=-1 polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 IOPS=440.03K, BW=1718MiB/s, IOS/call=32/31 IOPS=428.96K, BW=1675MiB/s, IOS/call=32/32 IOPS=442.59K, BW=1728MiB/s, IOS/call=32/31 IOPS=419.65K, BW=1639MiB/s, IOS/call=32/32 IOPS=426.82K, BW=1667MiB/s, IOS/call=32/31
and after:
submitter=0, tid=354, file=/dev/ram0, node=-1 polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128 Engine=io_uring, sq_ring=128, cq_ring=128 IOPS=3.37M, BW=13.15GiB/s, IOS/call=32/31 IOPS=3.45M, BW=13.46GiB/s, IOS/call=32/31 IOPS=3.43M, BW=13.42GiB/s, IOS/call=32/32 IOPS=3.43M, BW=13.39GiB/s, IOS/call=32/31 IOPS=3.43M, BW=13.38GiB/s, IOS/call=32/31
or about an 8x in difference. Now that brd is prepared to deal with REQ_NOWAIT reads/writes, mark it as supporting that.
Cc: stable@vger.kernel.org # 5.10+ Link: https://lore.kernel.org/linux-block/20230203103005.31290-1-p.raghav@samsung.... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/brd.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 63ac5cd523408..76ce6f766d55e 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -421,6 +421,7 @@ static int brd_alloc(int i) /* Tell the block layer that this is not a rotational device */ blk_queue_flag_set(QUEUE_FLAG_NONROT, disk->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, disk->queue); + blk_queue_flag_set(QUEUE_FLAG_NOWAIT, disk->queue); err = add_disk(disk); if (err) goto out_cleanup_disk;
From: Pierre Gondois pierre.gondois@arm.com
[ Upstream commit 0e68b5517d3767562889f1d83fdb828c26adb24f ]
Running a rt-kernel base on 6.2.0-rc3-rt1 on an Ampere Altra outputs the following: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9, name: kworker/u320:0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 3 locks held by kworker/u320:0/9: #0: ffff3fff8c27d128 ((wq_completion)efi_rts_wq){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41) #1: ffff80000861bdd0 ((work_completion)(&efi_rts_work.work)){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41) #2: ffffdf7e1ed3e460 (efi_rt_lock){+.+.}-{3:3}, at: efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101) Preemption disabled at: efi_virtmap_load (./arch/arm64/include/asm/mmu_context.h:248) CPU: 0 PID: 9 Comm: kworker/u320:0 Tainted: G W 6.2.0-rc3-rt1 Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18 Workqueue: efi_rts_wq efi_call_rts Call trace: dump_backtrace (arch/arm64/kernel/stacktrace.c:158) show_stack (arch/arm64/kernel/stacktrace.c:165) dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) dump_stack (lib/dump_stack.c:114) __might_resched (kernel/sched/core.c:10134) rt_spin_lock (kernel/locking/rtmutex.c:1769 (discriminator 4)) efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101) [...]
This seems to come from commit ff7a167961d1 ("arm64: efi: Execute runtime services from a dedicated stack") which adds a spinlock. This spinlock is taken through: efi_call_rts() -efi_call_virt() -efi_call_virt_pointer() -arch_efi_call_virt_setup()
Make 'efi_rt_lock' a raw_spinlock to avoid being preempted.
[ardb: The EFI runtime services are called with a different set of translation tables, and are permitted to use the SIMD registers. The context switch code preserves/restores neither, and so EFI calls must be made with preemption disabled, rather than only disabling migration.]
Fixes: ff7a167961d1 ("arm64: efi: Execute runtime services from a dedicated stack") Signed-off-by: Pierre Gondois pierre.gondois@arm.com Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/efi.h | 6 +++--- arch/arm64/kernel/efi.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h index c5d4551a1be71..53cbbb96f7ebf 100644 --- a/arch/arm64/include/asm/efi.h +++ b/arch/arm64/include/asm/efi.h @@ -25,7 +25,7 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); ({ \ efi_virtmap_load(); \ __efi_fpsimd_begin(); \ - spin_lock(&efi_rt_lock); \ + raw_spin_lock(&efi_rt_lock); \ })
#define arch_efi_call_virt(p, f, args...) \ @@ -37,12 +37,12 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
#define arch_efi_call_virt_teardown() \ ({ \ - spin_unlock(&efi_rt_lock); \ + raw_spin_unlock(&efi_rt_lock); \ __efi_fpsimd_end(); \ efi_virtmap_unload(); \ })
-extern spinlock_t efi_rt_lock; +extern raw_spinlock_t efi_rt_lock; efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);
#define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT) diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 386bd81ca12bb..9669f3fa2aefe 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -145,7 +145,7 @@ asmlinkage efi_status_t efi_handle_corrupted_x18(efi_status_t s, const char *f) return s; }
-DEFINE_SPINLOCK(efi_rt_lock); +DEFINE_RAW_SPINLOCK(efi_rt_lock);
asmlinkage u64 *efi_rt_stack_top __ro_after_init;
From: Palmer Dabbelt palmer@rivosinc.com
[ Upstream commit f2913d006fcdb61719635e093d1b5dd0dafecac7 ]
I don't think we can actually die() without a regs pointer, but the compiler was warning about a NULL check after a dereference. It seems prudent to just avoid the possibly-NULL dereference, given that when die()ing the system is already toast so who knows how we got there.
Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20220920200037.6727-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Stable-dep-of: 130aee3fd998 ("riscv: Avoid enabling interrupts in die()") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 6084bd93d2f58..502cba5029ca4 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -33,6 +33,7 @@ void die(struct pt_regs *regs, const char *str) { static int die_counter; int ret; + long cause;
oops_enter();
@@ -42,11 +43,13 @@ void die(struct pt_regs *regs, const char *str)
pr_emerg("%s [#%d]\n", str, ++die_counter); print_modules(); - show_regs(regs); + if (regs) + show_regs(regs);
- ret = notify_die(DIE_OOPS, str, regs, 0, regs->cause, SIGSEGV); + cause = regs ? regs->cause : -1; + ret = notify_die(DIE_OOPS, str, regs, 0, cause, SIGSEGV);
- if (regs && kexec_should_crash(current)) + if (kexec_should_crash(current)) crash_kexec(regs);
bust_spinlocks(0);
From: Mattias Nissler mnissler@rivosinc.com
[ Upstream commit 130aee3fd9981297ff9354e5d5609cd59aafbbea ]
While working on something else, I noticed that the kernel would start accepting interrupts again after crashing in an interrupt handler. Since the kernel is already in inconsistent state, enabling interrupts is dangerous and opens up risk of kernel state deteriorating further. Interrupts do get enabled via what looks like an unintended side effect of spin_unlock_irq, so switch to the more cautious spin_lock_irqsave/spin_unlock_irqrestore instead.
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Mattias Nissler mnissler@rivosinc.com Reviewed-by: Björn Töpel bjorn@kernel.org Link: https://lore.kernel.org/r/20230215144828.3370316-1-mnissler@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 502cba5029ca4..4f38b3c47e6d5 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -34,10 +34,11 @@ void die(struct pt_regs *regs, const char *str) static int die_counter; int ret; long cause; + unsigned long flags;
oops_enter();
- spin_lock_irq(&die_lock); + spin_lock_irqsave(&die_lock, flags); console_verbose(); bust_spinlocks(1);
@@ -54,7 +55,7 @@ void die(struct pt_regs *regs, const char *str)
bust_spinlocks(0); add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE); - spin_unlock_irq(&die_lock); + spin_unlock_irqrestore(&die_lock, flags); oops_exit();
if (in_interrupt())
From: Liao Chang liaochang1@huawei.com
[ Upstream commit 8ac6e619d9d51b3eb5bae817db8aa94e780a0db4 ]
Add header include guards to insn.h to prevent repeating declaration of any identifiers in insn.h.
Fixes: edde5584c7ab ("riscv: Add SW single-step support for KDB") Signed-off-by: Liao Chang liaochang1@huawei.com Reviewed-by: Andrew Jones ajones@ventanamicro.com Fixes: c9c1af3f186a ("RISC-V: rename parse_asm.h to insn.h") Reviewed-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20230129094242.282620-1-liaochang1@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/parse_asm.h | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/riscv/include/asm/parse_asm.h b/arch/riscv/include/asm/parse_asm.h index f36368de839f5..3cd00332d70f5 100644 --- a/arch/riscv/include/asm/parse_asm.h +++ b/arch/riscv/include/asm/parse_asm.h @@ -3,6 +3,9 @@ * Copyright (C) 2020 SiFive */
+#ifndef _ASM_RISCV_INSN_H +#define _ASM_RISCV_INSN_H + #include <linux/bits.h>
/* The bit field of immediate value in I-type instruction */ @@ -217,3 +220,5 @@ static inline bool is_ ## INSN_NAME ## _insn(long insn) \ (RVC_X(x_, RVC_B_IMM_5_OPOFF, RVC_B_IMM_5_MASK) << RVC_B_IMM_5_OFF) | \ (RVC_X(x_, RVC_B_IMM_7_6_OPOFF, RVC_B_IMM_7_6_MASK) << RVC_B_IMM_7_6_OFF) | \ (RVC_IMM_SIGN(x_) << RVC_B_IMM_SIGN_OFF); }) + +#endif /* _ASM_RISCV_INSN_H */
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit fc663711b94468f4e1427ebe289c9f05669699c9 ]
Remove the /proc/scsi/${proc_name} directory earlier to fix a race condition between unloading and reloading kernel modules. This fixes a bug introduced in 2009 by commit 77c019768f06 ("[SCSI] fix /proc memory leak in the SCSI core").
Fix the following kernel warning:
proc_dir_entry 'scsi/scsi_debug' already registered WARNING: CPU: 19 PID: 27986 at fs/proc/generic.c:376 proc_register+0x27d/0x2e0 Call Trace: proc_mkdir+0xb5/0xe0 scsi_proc_hostdir_add+0xb5/0x170 scsi_host_alloc+0x683/0x6c0 sdebug_driver_probe+0x6b/0x2d0 [scsi_debug] really_probe+0x159/0x540 __driver_probe_device+0xdc/0x230 driver_probe_device+0x4f/0x120 __device_attach_driver+0xef/0x180 bus_for_each_drv+0xe5/0x130 __device_attach+0x127/0x290 device_initial_probe+0x17/0x20 bus_probe_device+0x110/0x130 device_add+0x673/0xc80 device_register+0x1e/0x30 sdebug_add_host_helper+0x1a7/0x3b0 [scsi_debug] scsi_debug_init+0x64f/0x1000 [scsi_debug] do_one_initcall+0xd7/0x470 do_init_module+0xe7/0x330 load_module+0x122a/0x12c0 __do_sys_finit_module+0x124/0x1a0 __x64_sys_finit_module+0x46/0x50 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/20230210205200.36973-3-bvanassche@acm.org Cc: Alan Stern stern@rowland.harvard.edu Cc: Yi Zhang yi.zhang@redhat.com Cc: stable@vger.kernel.org Fixes: 77c019768f06 ("[SCSI] fix /proc memory leak in the SCSI core") Reported-by: Yi Zhang yi.zhang@redhat.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hosts.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index 0165dad803001..28b201c443267 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -180,6 +180,7 @@ void scsi_remove_host(struct Scsi_Host *shost) scsi_forget_host(shost); mutex_unlock(&shost->scan_mutex); scsi_proc_host_rm(shost); + scsi_proc_hostdir_rm(shost->hostt);
spin_lock_irqsave(shost->host_lock, flags); if (scsi_host_set_state(shost, SHOST_DEL)) @@ -321,6 +322,7 @@ static void scsi_host_dev_release(struct device *dev) struct Scsi_Host *shost = dev_to_shost(dev); struct device *parent = dev->parent;
+ /* In case scsi_remove_host() has not been called. */ scsi_proc_hostdir_rm(shost->hostt);
/* Wait for functions invoked through call_rcu(&scmd->rcu, ...) */
From: Mark Brown broonie@kernel.org
[ Upstream commit 261f06315cf7c3744731e36bfd8d4434949e3389 ]
While we currently assume that regulators with no control available are just uncontionally enabled this isn't always as clearly displayed to users as is desirable, for example the code for disabling unused regulators will log that it is about to disable them. Clean this up a bit by setting always_on during constraint evaluation if we have no available mechanism for controlling the regualtor so things that check the constraint will do the right thing.
Signed-off-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20220325144637.1543496-1-broonie@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: 80d2c29e09e6 ("regulator: core: Use ktime_get_boottime() to determine how long a regulator was off") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/core.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 3eae3aa5ad1d2..450aa0756dd8c 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1521,6 +1521,24 @@ static int set_machine_constraints(struct regulator_dev *rdev) } }
+ /* + * If there is no mechanism for controlling the regulator then + * flag it as always_on so we don't end up duplicating checks + * for this so much. Note that we could control the state of + * a supply to control the output on a regulator that has no + * direct control. + */ + if (!rdev->ena_pin && !ops->enable) { + if (rdev->supply_name && !rdev->supply) + return -EPROBE_DEFER; + + if (rdev->supply) + rdev->constraints->always_on = + rdev->supply->rdev->constraints->always_on; + else + rdev->constraints->always_on = true; + } + /* If the constraints say the regulator should be on at this point * and we have control then make sure it is enabled. */
From: Christian Kohlschütter christian@kohlschutter.com
[ Upstream commit 218320fec29430438016f88dd4fbebfa1b95ad8d ]
Regulators marked with "regulator-always-on" or "regulator-boot-on" as well as an "off-on-delay-us", may run into cycling issues that are hard to detect.
This is caused by the "last_off" state not being initialized in this case.
Fix the "last_off" initialization by setting it to the current kernel time upon initialization, regardless of always_on/boot_on state.
Signed-off-by: Christian Kohlschütter christian@kohlschutter.com Link: https://lore.kernel.org/r/FAFD5B39-E9C4-47C7-ACF1-2A04CD59758D@kohlschutter.... Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: 80d2c29e09e6 ("regulator: core: Use ktime_get_boottime() to determine how long a regulator was off") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/core.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 450aa0756dd8c..c7b1e15bf7bb5 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1539,6 +1539,9 @@ static int set_machine_constraints(struct regulator_dev *rdev) rdev->constraints->always_on = true; }
+ if (rdev->desc->off_on_delay) + rdev->last_off = ktime_get(); + /* If the constraints say the regulator should be on at this point * and we have control then make sure it is enabled. */ @@ -1572,8 +1575,6 @@ static int set_machine_constraints(struct regulator_dev *rdev)
if (rdev->constraints->always_on) rdev->use_count++; - } else if (rdev->desc->off_on_delay) { - rdev->last_off = ktime_get(); }
print_constraints(rdev);
From: Matthias Kaehlcke mka@chromium.org
[ Upstream commit 80d2c29e09e663761c2778167a625b25ffe01b6f ]
For regulators with 'off-on-delay-us' the regulator framework currently uses ktime_get() to determine how long the regulator has been off before re-enabling it (after a delay if needed). A problem with using ktime_get() is that it doesn't account for the time the system is suspended. As a result a regulator with a longer 'off-on-delay' (e.g. 500ms) that was switched off during suspend might still incurr in a delay on resume before it is re-enabled, even though the regulator might have been off for hours. ktime_get_boottime() accounts for suspend time, use it instead of ktime_get().
Fixes: a8ce7bd89689 ("regulator: core: Fix off_on_delay handling") Cc: stable@vger.kernel.org # 5.13+ Signed-off-by: Matthias Kaehlcke mka@chromium.org Reviewed-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20230223003301.v2.1.I9719661b8eb0a73b8c416f9c26cf5... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index c7b1e15bf7bb5..cd10880378a6d 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1540,7 +1540,7 @@ static int set_machine_constraints(struct regulator_dev *rdev) }
if (rdev->desc->off_on_delay) - rdev->last_off = ktime_get(); + rdev->last_off = ktime_get_boottime();
/* If the constraints say the regulator should be on at this point * and we have control then make sure it is enabled. @@ -2629,7 +2629,7 @@ static int _regulator_do_enable(struct regulator_dev *rdev) * this regulator was disabled. */ ktime_t end = ktime_add_us(rdev->last_off, rdev->desc->off_on_delay); - s64 remaining = ktime_us_delta(end, ktime_get()); + s64 remaining = ktime_us_delta(end, ktime_get_boottime());
if (remaining > 0) _regulator_enable_delay(remaining); @@ -2868,7 +2868,7 @@ static int _regulator_do_disable(struct regulator_dev *rdev) }
if (rdev->desc->off_on_delay) - rdev->last_off = ktime_get(); + rdev->last_off = ktime_get_boottime();
trace_regulator_disable_complete(rdev_get_name(rdev));
From: Jan Kara jack@suse.cz
[ Upstream commit 0813299c586b175d7edb25f56412c54b812d0379 ]
When we are renaming a directory to a different directory, we need to update '..' entry in the moved directory. However nothing prevents moved directory from being modified and even converted from the inline format to the normal format. When such race happens the rename code gets confused and we crash. Fix the problem by locking the moved directory.
CC: stable@vger.kernel.org Fixes: 32f7f22c0b52 ("ext4: let ext4_rename handle inline dir") Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230126112221.11866-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/namei.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index c7791e1957f50..aa689adeeafdf 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3887,9 +3887,16 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, if (new.dir != old.dir && EXT4_DIR_LINK_MAX(new.dir)) goto end_rename; } + /* + * We need to protect against old.inode directory getting + * converted from inline directory format into a normal one. + */ + inode_lock_nested(old.inode, I_MUTEX_NONDIR2); retval = ext4_rename_dir_prepare(handle, &old); - if (retval) + if (retval) { + inode_unlock(old.inode); goto end_rename; + } } /* * If we're renaming a file within an inline_data dir and adding or @@ -4014,6 +4021,8 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, } else { ext4_journal_stop(handle); } + if (old.dir_bh) + inode_unlock(old.inode); release_bh: brelse(old.dir_bh); brelse(old.bh);
From: Ben Skeggs bskeggs@redhat.com
[ Upstream commit 89ed996b888faaf11c69bb4cbc19f21475c9050e ]
Signed-off-by: Ben Skeggs bskeggs@redhat.com Reviewed-by: Dave Airlie airlied@redhat.com Signed-off-by: Dave Airlie airlied@redhat.com Stable-dep-of: 3638a820c5c3 ("drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototype") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 16 ---------------- drivers/gpu/drm/nouveau/dispnv50/wndw.c | 12 ------------ drivers/gpu/drm/nouveau/dispnv50/wndw.h | 2 -- 3 files changed, 30 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index d7b9f7f8c9e31..0722b907bfcf4 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -2622,14 +2622,6 @@ nv50_display_fini(struct drm_device *dev, bool runtime, bool suspend) { struct nouveau_drm *drm = nouveau_drm(dev); struct drm_encoder *encoder; - struct drm_plane *plane; - - drm_for_each_plane(plane, dev) { - struct nv50_wndw *wndw = nv50_wndw(plane); - if (plane->funcs != &nv50_wndw) - continue; - nv50_wndw_fini(wndw); - }
list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) { if (encoder->encoder_type != DRM_MODE_ENCODER_DPMST) @@ -2645,7 +2637,6 @@ nv50_display_init(struct drm_device *dev, bool resume, bool runtime) { struct nv50_core *core = nv50_disp(dev)->core; struct drm_encoder *encoder; - struct drm_plane *plane;
if (resume || runtime) core->func->init(core); @@ -2658,13 +2649,6 @@ nv50_display_init(struct drm_device *dev, bool resume, bool runtime) } }
- drm_for_each_plane(plane, dev) { - struct nv50_wndw *wndw = nv50_wndw(plane); - if (plane->funcs != &nv50_wndw) - continue; - nv50_wndw_init(wndw); - } - return 0; }
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index 8d048bacd6f02..e1e62674e82d3 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -694,18 +694,6 @@ nv50_wndw_notify(struct nvif_notify *notify) return NVIF_NOTIFY_KEEP; }
-void -nv50_wndw_fini(struct nv50_wndw *wndw) -{ - nvif_notify_put(&wndw->notify); -} - -void -nv50_wndw_init(struct nv50_wndw *wndw) -{ - nvif_notify_get(&wndw->notify); -} - static const u64 nv50_cursor_format_modifiers[] = { DRM_FORMAT_MOD_LINEAR, DRM_FORMAT_MOD_INVALID, diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.h b/drivers/gpu/drm/nouveau/dispnv50/wndw.h index f4e0c50800344..980f8ea96d54a 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.h +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.h @@ -40,8 +40,6 @@ int nv50_wndw_new_(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *name, int index, const u32 *format, enum nv50_disp_interlock_type, u32 interlock_data, u32 heads, struct nv50_wndw **); -void nv50_wndw_init(struct nv50_wndw *); -void nv50_wndw_fini(struct nv50_wndw *); void nv50_wndw_flush_set(struct nv50_wndw *, u32 *interlock, struct nv50_wndw_atom *); void nv50_wndw_flush_clr(struct nv50_wndw *, u32 *interlock, bool flush,
From: Jiri Slaby (SUSE) jirislaby@kernel.org
[ Upstream commit 3638a820c5c3b52f327cebb174fd4274bee08aa7 ]
gcc-13 warns about mismatching types for enums. That revealed switched arguments of nv50_wndw_new_(): drivers/gpu/drm/nouveau/dispnv50/wndw.c:696:1: error: conflicting types for 'nv50_wndw_new_' due to enum/integer mismatch; have 'int(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *, int, const u32 *, u32, enum nv50_disp_interlock_type, u32, struct nv50_wndw **)' drivers/gpu/drm/nouveau/dispnv50/wndw.h:36:5: note: previous declaration of 'nv50_wndw_new_' with type 'int(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *, int, const u32 *, enum nv50_disp_interlock_type, u32, u32, struct nv50_wndw **)'
It can be barely visible, but the declaration says about the parameters in the middle: enum nv50_disp_interlock_type, u32 interlock_data, u32 heads,
While the definition states differently: u32 heads, enum nv50_disp_interlock_type interlock_type, u32 interlock_data,
Unify/fix the declaration to match the definition.
Fixes: 53e0a3e70de6 ("drm/nouveau/kms/nv50-: simplify tracking of channel interlocks") Cc: Martin Liska mliska@suse.cz Cc: Ben Skeggs bskeggs@redhat.com Cc: Karol Herbst kherbst@redhat.com Cc: Lyude Paul lyude@redhat.com Cc: David Airlie airlied@gmail.com Cc: Daniel Vetter daniel@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Jiri Slaby (SUSE) jirislaby@kernel.org Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20221031114229.10289-1-jirisla... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/dispnv50/wndw.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.h b/drivers/gpu/drm/nouveau/dispnv50/wndw.h index 980f8ea96d54a..6c64864da4550 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.h +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.h @@ -38,8 +38,9 @@ struct nv50_wndw {
int nv50_wndw_new_(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *name, int index, - const u32 *format, enum nv50_disp_interlock_type, - u32 interlock_data, u32 heads, struct nv50_wndw **); + const u32 *format, u32 heads, + enum nv50_disp_interlock_type, u32 interlock_data, + struct nv50_wndw **); void nv50_wndw_flush_set(struct nv50_wndw *, u32 *interlock, struct nv50_wndw_atom *); void nv50_wndw_flush_clr(struct nv50_wndw *, u32 *interlock, bool flush,
From: Rob Clark robdclark@chromium.org
[ Upstream commit 8a86f213f4426f19511a16d886871805b35c3acf ]
The error path cleanup expects that chain and syncobj are either NULL or valid pointers. But post_deps was not allocated with __GFP_ZERO.
Fixes: ab723b7a992a ("drm/msm: Add syncobj support.") Signed-off-by: Rob Clark robdclark@chromium.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Dmitry Osipenko dmitry.osipenko@collabora.com Patchwork: https://patchwork.freedesktop.org/patch/523051/ Link: https://lore.kernel.org/r/20230215235048.1166484-1-robdclark@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_gem_submit.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 83e6ccad77286..fc2fb1019ea1c 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -640,8 +640,8 @@ static struct msm_submit_post_dep *msm_parse_post_deps(struct drm_device *dev, int ret = 0; uint32_t i, j;
- post_deps = kmalloc_array(nr_syncobjs, sizeof(*post_deps), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); + post_deps = kcalloc(nr_syncobjs, sizeof(*post_deps), + GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY); if (!post_deps) return ERR_PTR(-ENOMEM);
@@ -656,7 +656,6 @@ static struct msm_submit_post_dep *msm_parse_post_deps(struct drm_device *dev, }
post_deps[i].point = syncobj_desc.point; - post_deps[i].chain = NULL;
if (syncobj_desc.flags) { ret = -EINVAL;
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit a7a4c19c36de1e4b99b06e4060ccc8ab837725bc ]
Rather than writing CP_PREEMPT_ENABLE_GLOBAL twice, follow the vendor kernel and set CP_PREEMPT_ENABLE_LOCAL register instead. a5xx_submit() will override it during submission, but let's get the sequence correct.
Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/522638/ Link: https://lore.kernel.org/r/20230214020956.164473-2-dmitry.baryshkov@linaro.or... Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 5e2750eb3810c..c1bc5e2aaefe5 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -153,8 +153,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) OUT_RING(ring, 1);
/* Enable local preemption for finegrain preemption */ - OUT_PKT7(ring, CP_PREEMPT_ENABLE_GLOBAL, 1); - OUT_RING(ring, 0x02); + OUT_PKT7(ring, CP_PREEMPT_ENABLE_LOCAL, 1); + OUT_RING(ring, 0x1);
/* Allow CP_CONTEXT_SWITCH_YIELD packets in the IB2 */ OUT_PKT7(ring, CP_YIELD_ENABLE, 1);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 141f66ebbfa17cc7e2075f06c50107da978c965b ]
A530 has highest bank bit equal to 15 (like A540). Fix values written to REG_A5XX_RB_MODE_CNTL and REG_A5XX_TPL1_MODE_CNTL registers.
Fixes: 1d832ab30ce6 ("drm/msm/a5xx: Add support for Adreno 508, 509, 512 GPUs") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/522639/ Link: https://lore.kernel.org/r/20230214020956.164473-3-dmitry.baryshkov@linaro.or... Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index c1bc5e2aaefe5..b8c49ba65254c 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -801,7 +801,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu) gpu_write(gpu, REG_A5XX_RBBM_AHB_CNTL2, 0x0000003F);
/* Set the highest bank bit */ - if (adreno_is_a540(adreno_gpu)) + if (adreno_is_a540(adreno_gpu) || adreno_is_a530(adreno_gpu)) regbit = 2; else regbit = 1;
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit b4fb748f0b734ce1d2e7834998cc599fcbd25d67 ]
Quoting Yassine: ring->memptrs->rptr is never updated and stays 0, so the comparison always evaluates to false and get_next_ring always returns ring 0 thinking it isn't empty.
Fix this by calling get_rptr() instead of reading rptr directly.
Reported-by: Yassine Oudjana y.oudjana@protonmail.com Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/522642/ Link: https://lore.kernel.org/r/20230214020956.164473-4-dmitry.baryshkov@linaro.or... Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c index 8abc9a2b114a2..6e326d851ba53 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c @@ -63,7 +63,7 @@ static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu) struct msm_ringbuffer *ring = gpu->rb[i];
spin_lock_irqsave(&ring->preempt_lock, flags); - empty = (get_wptr(ring) == ring->memptrs->rptr); + empty = (get_wptr(ring) == gpu->funcs->get_rptr(gpu, ring)); spin_unlock_irqrestore(&ring->preempt_lock, flags);
if (!empty)
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 32e7083429d46f29080626fe387ff90c086b1fbe ]
The rptr_addr is set in the preempt_init_ring(), which is called from a5xx_gpu_init(). It uses shadowptr() to set the address, however the shadow_iova is not yet initialized at that time. Move the rptr_addr setting to the a5xx_preempt_hw_init() which is called after setting the shadow_iova, getting the correct value for the address.
Fixes: 8907afb476ac ("drm/msm: Allow a5xx to mark the RPTR shadow as privileged") Suggested-by: Rob Clark robdclark@gmail.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/522640/ Link: https://lore.kernel.org/r/20230214020956.164473-5-dmitry.baryshkov@linaro.or... Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c index 6e326d851ba53..e0eef47dae632 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c @@ -208,6 +208,7 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu) a5xx_gpu->preempt[i]->wptr = 0; a5xx_gpu->preempt[i]->rptr = 0; a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova; + a5xx_gpu->preempt[i]->rptr_addr = shadowptr(a5xx_gpu, gpu->rb[i]); }
/* Write a 0 to signal that we aren't switching pagetables */ @@ -259,7 +260,6 @@ static int preempt_init_ring(struct a5xx_gpu *a5xx_gpu, ptr->data = 0; ptr->cntl = MSM_GPU_RB_CNTL_DEFAULT | AXXX_CP_RB_CNTL_NO_UPDATE;
- ptr->rptr_addr = shadowptr(a5xx_gpu, ring); ptr->counter = counters_iova;
return 0;
From: Rafał Miłecki rafal@milecki.pl
[ Upstream commit f99e6d7c4ed3be2531bd576425a5bd07fb133bd7 ]
While bringing hardware up we should perform a full reset including the switch bit (BGMAC_BCMA_IOCTL_SW_RESET aka SICF_SWRST). It's what specification says and what reference driver does.
This seems to be critical for the BCM5358. Without this hardware doesn't get initialized properly and doesn't seem to transmit or receive any packets.
Originally bgmac was calling bgmac_chip_reset() before setting "has_robosw" property which resulted in expected behaviour. That has changed as a side effect of adding platform device support which regressed BCM5358 support.
Fixes: f6a95a24957a ("net: ethernet: bgmac: Add platform device support") Cc: Jon Mason jdmason@kudzu.us Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Leon Romanovsky leonro@nvidia.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20230227091156.19509-1-zajec5@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bgmac.c | 8 ++++++-- drivers/net/ethernet/broadcom/bgmac.h | 2 ++ 2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index fa2a43d465db7..f8fd65ab663ee 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -890,13 +890,13 @@ static void bgmac_chip_reset_idm_config(struct bgmac *bgmac)
if (iost & BGMAC_BCMA_IOST_ATTACHED) { flags = BGMAC_BCMA_IOCTL_SW_CLKEN; - if (!bgmac->has_robosw) + if (bgmac->in_init || !bgmac->has_robosw) flags |= BGMAC_BCMA_IOCTL_SW_RESET; } bgmac_clk_enable(bgmac, flags); }
- if (iost & BGMAC_BCMA_IOST_ATTACHED && !bgmac->has_robosw) + if (iost & BGMAC_BCMA_IOST_ATTACHED && (bgmac->in_init || !bgmac->has_robosw)) bgmac_idm_write(bgmac, BCMA_IOCTL, bgmac_idm_read(bgmac, BCMA_IOCTL) & ~BGMAC_BCMA_IOCTL_SW_RESET); @@ -1490,6 +1490,8 @@ int bgmac_enet_probe(struct bgmac *bgmac) struct net_device *net_dev = bgmac->net_dev; int err;
+ bgmac->in_init = true; + bgmac_chip_intrs_off(bgmac);
net_dev->irq = bgmac->irq; @@ -1542,6 +1544,8 @@ int bgmac_enet_probe(struct bgmac *bgmac) /* Omit FCS from max MTU size */ net_dev->max_mtu = BGMAC_RX_MAX_FRAME_SIZE - ETH_FCS_LEN;
+ bgmac->in_init = false; + err = register_netdev(bgmac->net_dev); if (err) { dev_err(bgmac->dev, "Cannot register net device\n"); diff --git a/drivers/net/ethernet/broadcom/bgmac.h b/drivers/net/ethernet/broadcom/bgmac.h index 110088e662eab..99a344175a751 100644 --- a/drivers/net/ethernet/broadcom/bgmac.h +++ b/drivers/net/ethernet/broadcom/bgmac.h @@ -474,6 +474,8 @@ struct bgmac { int irq; u32 int_mask;
+ bool in_init; + /* Current MAC state */ int mac_speed; int mac_duplex;
From: Kang Chen void0red@gmail.com
[ Upstream commit 11f180a5d62a51b484e9648f9b310e1bd50b1a57 ]
devm_kmalloc_array may fails, *fw_vsc_cfg might be null and cause out-of-bounds write in device_property_read_u8_array later.
Fixes: a06347c04c13 ("NFC: Add Intel Fields Peak NFC solution driver") Signed-off-by: Kang Chen void0red@gmail.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230227093037.907654-1-void0red@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nfc/fdp/i2c.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/nfc/fdp/i2c.c b/drivers/nfc/fdp/i2c.c index 051c43a2a52f8..5f97dcf08dd07 100644 --- a/drivers/nfc/fdp/i2c.c +++ b/drivers/nfc/fdp/i2c.c @@ -249,6 +249,9 @@ static void fdp_nci_i2c_read_device_properties(struct device *dev, len, sizeof(**fw_vsc_cfg), GFP_KERNEL);
+ if (!*fw_vsc_cfg) + goto alloc_err; + r = device_property_read_u8_array(dev, FDP_DP_FW_VSC_CFG_NAME, *fw_vsc_cfg, len);
@@ -262,6 +265,7 @@ static void fdp_nci_i2c_read_device_properties(struct device *dev, *fw_vsc_cfg = NULL; }
+alloc_err: dev_dbg(dev, "Clock type: %d, clock frequency: %d, VSC: %s", *clock_type, *clock_freq, *fw_vsc_cfg != NULL ? "yes" : "no"); }
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit ae44f1c9d1fc54aeceb335fedb1e73b2c3ee4561 ]
It looks like U-Boot fails to start the kernel properly when the compatible string of the board isn't fsl,T1040RDB, so stop overriding it from the rev-a.dts.
Fixes: 5ebb74749202 ("powerpc: dts: t1040rdb: fix ports names for Seville Ethernet switch") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts b/arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts index 73f8c998c64df..d4f5f159d6f23 100644 --- a/arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts +++ b/arch/powerpc/boot/dts/fsl/t1040rdb-rev-a.dts @@ -10,7 +10,6 @@
/ { model = "fsl,T1040RDB-REV-A"; - compatible = "fsl,T1040RDB-REV-A"; };
&seville_port0 {
From: Eric Dumazet edumazet@google.com
[ Upstream commit 693aa2c0d9b6d5b1f2745d31b6e70d09dbbaf06e ]
ila_xlat_nl_cmd_get_mapping() generates an empty skb, triggerring a recent sanity check [1].
Instead, return an error code, so that user space can get it.
[1] skb_assert_len WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 skb_assert_len include/linux/skbuff.h:2527 [inline] WARNING: CPU: 0 PID: 5923 at include/linux/skbuff.h:2527 __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 Modules linked in: CPU: 0 PID: 5923 Comm: syz-executor269 Not tainted 6.2.0-syzkaller-18300-g2ebd1fbb946d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_assert_len include/linux/skbuff.h:2527 [inline] pc : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 lr : skb_assert_len include/linux/skbuff.h:2527 [inline] lr : __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 sp : ffff80001e0d6c40 x29: ffff80001e0d6e60 x28: dfff800000000000 x27: ffff0000c86328c0 x26: dfff800000000000 x25: ffff0000c8632990 x24: ffff0000c8632a00 x23: 0000000000000000 x22: 1fffe000190c6542 x21: ffff0000c8632a10 x20: ffff0000c8632a00 x19: ffff80001856e000 x18: ffff80001e0d5fc0 x17: 0000000000000000 x16: ffff80001235d16c x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001 x11: ff80800008353a30 x10: 0000000000000000 x9 : 21567eaf25bfb600 x8 : 21567eaf25bfb600 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff80001e0d6558 x4 : ffff800015c74760 x3 : ffff800008596744 x2 : 0000000000000001 x1 : 0000000100000000 x0 : 000000000000000e Call trace: skb_assert_len include/linux/skbuff.h:2527 [inline] __dev_queue_xmit+0x1bc0/0x3488 net/core/dev.c:4156 dev_queue_xmit include/linux/netdevice.h:3033 [inline] __netlink_deliver_tap_skb net/netlink/af_netlink.c:307 [inline] __netlink_deliver_tap+0x45c/0x6f8 net/netlink/af_netlink.c:325 netlink_deliver_tap+0xf4/0x174 net/netlink/af_netlink.c:338 __netlink_sendskb net/netlink/af_netlink.c:1283 [inline] netlink_sendskb+0x6c/0x154 net/netlink/af_netlink.c:1292 netlink_unicast+0x334/0x8d4 net/netlink/af_netlink.c:1380 nlmsg_unicast include/net/netlink.h:1099 [inline] genlmsg_unicast include/net/genetlink.h:433 [inline] genlmsg_reply include/net/genetlink.h:443 [inline] ila_xlat_nl_cmd_get_mapping+0x620/0x7d0 net/ipv6/ila/ila_xlat.c:493 genl_family_rcv_msg_doit net/netlink/genetlink.c:968 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1048 [inline] genl_rcv_msg+0x938/0xc1c net/netlink/genetlink.c:1065 netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2574 genl_rcv+0x38/0x50 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x800/0xae0 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x558/0x844 net/socket.c:2479 ___sys_sendmsg net/socket.c:2533 [inline] __sys_sendmsg+0x26c/0x33c net/socket.c:2562 __do_sys_sendmsg net/socket.c:2571 [inline] __se_sys_sendmsg net/socket.c:2569 [inline] __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2569 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52 el0_svc_common+0x138/0x258 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193 el0_svc+0x58/0x168 arch/arm64/kernel/entry-common.c:637 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 irq event stamp: 136484 hardirqs last enabled at (136483): [<ffff800008350244>] __up_console_sem+0x60/0xb4 kernel/printk/printk.c:345 hardirqs last disabled at (136484): [<ffff800012358d60>] el1_dbg+0x24/0x80 arch/arm64/kernel/entry-common.c:405 softirqs last enabled at (136418): [<ffff800008020ea8>] softirq_handle_end kernel/softirq.c:414 [inline] softirqs last enabled at (136418): [<ffff800008020ea8>] __do_softirq+0xd4c/0xfa4 kernel/softirq.c:600 softirqs last disabled at (136371): [<ffff80000802b4a4>] ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:80 ---[ end trace 0000000000000000 ]--- skb len=0 headroom=0 headlen=0 tailroom=192 mac=(0,0) net=(0,-1) trans=-1 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0) hash(0x0 sw=0 l4=0) proto=0x0010 pkttype=6 iif=0 dev name=nlmon0 feat=0x0000000000005861
Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ila/ila_xlat.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ipv6/ila/ila_xlat.c b/net/ipv6/ila/ila_xlat.c index a1ac0e3d8c60c..163668531a57f 100644 --- a/net/ipv6/ila/ila_xlat.c +++ b/net/ipv6/ila/ila_xlat.c @@ -477,6 +477,7 @@ int ila_xlat_nl_cmd_get_mapping(struct sk_buff *skb, struct genl_info *info)
rcu_read_lock();
+ ret = -ESRCH; ila = ila_lookup_by_params(&xp, ilan); if (ila) { ret = ila_dump_info(ila,
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 2067e7a00aa604b94de31d64f29b8893b1696f26 ]
The test_local_dnat_portonly() function initiates the client-side as soon as it sets the listening side to the background. This could lead to a race condition where the server may not be ready to listen. To ensure that the server-side is up and running before initiating the client-side, a delay is introduced to the test_local_dnat_portonly() function.
Before the fix: # ./nft_nat.sh PASS: netns routing/connectivity: ns0-rthlYrBU can reach ns1-rthlYrBU and ns2-rthlYrBU PASS: ping to ns1-rthlYrBU was ip NATted to ns2-rthlYrBU PASS: ping to ns1-rthlYrBU OK after ip nat output chain flush PASS: ipv6 ping to ns1-rthlYrBU was ip6 NATted to ns2-rthlYrBU 2023/02/27 04:11:03 socat[6055] E connect(5, AF=2 10.0.1.99:2000, 16): Connection refused ERROR: inet port rewrite
After the fix: # ./nft_nat.sh PASS: netns routing/connectivity: ns0-9sPJV6JJ can reach ns1-9sPJV6JJ and ns2-9sPJV6JJ PASS: ping to ns1-9sPJV6JJ was ip NATted to ns2-9sPJV6JJ PASS: ping to ns1-9sPJV6JJ OK after ip nat output chain flush PASS: ipv6 ping to ns1-9sPJV6JJ was ip6 NATted to ns2-9sPJV6JJ PASS: inet port rewrite without l3 address
Fixes: 282e5f8fe907 ("netfilter: nat: really support inet nat without l3 address") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/netfilter/nft_nat.sh | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/netfilter/nft_nat.sh b/tools/testing/selftests/netfilter/nft_nat.sh index 032f2de6e14e0..462dc47420b65 100755 --- a/tools/testing/selftests/netfilter/nft_nat.sh +++ b/tools/testing/selftests/netfilter/nft_nat.sh @@ -404,6 +404,8 @@ EOF echo SERVER-$family | ip netns exec "$ns1" timeout 5 socat -u STDIN TCP-LISTEN:2000 & sc_s=$!
+ sleep 1 + result=$(ip netns exec "$ns0" timeout 1 socat TCP:$daddr:2000 STDOUT)
if [ "$result" = "SERVER-inet" ];then
From: Yuiko Oshino yuiko.oshino@microchip.com
[ Upstream commit e57cf3639c323eeed05d3725fd82f91b349adca8 ]
Move the LAN7800 internal phy (phy ID 0x0007c132) specific register accesses to the phy driver (microchip.c).
Fix the error reported by Enguerrand de Ribaucourt in December 2022, "Some operations during the cable switch workaround modify the register LAN88XX_INT_MASK of the PHY. However, this register is specific to the LAN8835 PHY. For instance, if a DP8322I PHY is connected to the LAN7801, that register (0x19), corresponds to the LED and MAC address configuration, resulting in unapropriate behavior."
I did not test with the DP8322I PHY, but I tested with an EVB-LAN7800 with the internal PHY.
Fixes: 14437e3fa284 ("lan78xx: workaround of forced 100 Full/Half duplex mode error") Signed-off-by: Yuiko Oshino yuiko.oshino@microchip.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230301154307.30438-1-yuiko.oshino@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/microchip.c | 32 ++++++++++++++++++++++++++++++++ drivers/net/usb/lan78xx.c | 27 +-------------------------- 2 files changed, 33 insertions(+), 26 deletions(-)
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c index 9f1f2b6c97d4f..230f2fcf9c46a 100644 --- a/drivers/net/phy/microchip.c +++ b/drivers/net/phy/microchip.c @@ -342,6 +342,37 @@ static int lan88xx_config_aneg(struct phy_device *phydev) return genphy_config_aneg(phydev); }
+static void lan88xx_link_change_notify(struct phy_device *phydev) +{ + int temp; + + /* At forced 100 F/H mode, chip may fail to set mode correctly + * when cable is switched between long(~50+m) and short one. + * As workaround, set to 10 before setting to 100 + * at forced 100 F/H mode. + */ + if (!phydev->autoneg && phydev->speed == 100) { + /* disable phy interrupt */ + temp = phy_read(phydev, LAN88XX_INT_MASK); + temp &= ~LAN88XX_INT_MASK_MDINTPIN_EN_; + phy_write(phydev, LAN88XX_INT_MASK, temp); + + temp = phy_read(phydev, MII_BMCR); + temp &= ~(BMCR_SPEED100 | BMCR_SPEED1000); + phy_write(phydev, MII_BMCR, temp); /* set to 10 first */ + temp |= BMCR_SPEED100; + phy_write(phydev, MII_BMCR, temp); /* set to 100 later */ + + /* clear pending interrupt generated while workaround */ + temp = phy_read(phydev, LAN88XX_INT_STS); + + /* enable phy interrupt back */ + temp = phy_read(phydev, LAN88XX_INT_MASK); + temp |= LAN88XX_INT_MASK_MDINTPIN_EN_; + phy_write(phydev, LAN88XX_INT_MASK, temp); + } +} + static struct phy_driver microchip_phy_driver[] = { { .phy_id = 0x0007c130, @@ -355,6 +386,7 @@ static struct phy_driver microchip_phy_driver[] = {
.config_init = lan88xx_config_init, .config_aneg = lan88xx_config_aneg, + .link_change_notify = lan88xx_link_change_notify,
.config_intr = lan88xx_phy_config_intr, .handle_interrupt = lan88xx_handle_interrupt, diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index 3e1a83a22fdd6..5700c9d20a3e2 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -1950,33 +1950,8 @@ static void lan78xx_remove_mdio(struct lan78xx_net *dev) static void lan78xx_link_status_change(struct net_device *net) { struct phy_device *phydev = net->phydev; - int temp; - - /* At forced 100 F/H mode, chip may fail to set mode correctly - * when cable is switched between long(~50+m) and short one. - * As workaround, set to 10 before setting to 100 - * at forced 100 F/H mode. - */ - if (!phydev->autoneg && (phydev->speed == 100)) { - /* disable phy interrupt */ - temp = phy_read(phydev, LAN88XX_INT_MASK); - temp &= ~LAN88XX_INT_MASK_MDINTPIN_EN_; - phy_write(phydev, LAN88XX_INT_MASK, temp);
- temp = phy_read(phydev, MII_BMCR); - temp &= ~(BMCR_SPEED100 | BMCR_SPEED1000); - phy_write(phydev, MII_BMCR, temp); /* set to 10 first */ - temp |= BMCR_SPEED100; - phy_write(phydev, MII_BMCR, temp); /* set to 100 later */ - - /* clear pending interrupt generated while workaround */ - temp = phy_read(phydev, LAN88XX_INT_STS); - - /* enable phy interrupt back */ - temp = phy_read(phydev, LAN88XX_INT_MASK); - temp |= LAN88XX_INT_MASK_MDINTPIN_EN_; - phy_write(phydev, LAN88XX_INT_MASK, temp); - } + phy_print_status(phydev); }
static int irq_map(struct irq_domain *d, unsigned int irq,
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit 9781e98a97110f5e76999058368b4be76a788484 ]
syzbot reported use-after-free in cfusbl_device_notify() [1]. This causes a stack trace like below:
BUG: KASAN: use-after-free in cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138 Read of size 8 at addr ffff88807ac4e6f0 by task kworker/u4:6/1214
CPU: 0 PID: 1214 Comm: kworker/u4:6 Not tainted 5.19.0-rc3-syzkaller-00146-g92f20ff72066 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138 notifier_call_chain+0xb5/0x200 kernel/notifier.c:87 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945 call_netdevice_notifiers_extack net/core/dev.c:1983 [inline] call_netdevice_notifiers net/core/dev.c:1997 [inline] netdev_wait_allrefs_any net/core/dev.c:10227 [inline] netdev_run_todo+0xbc0/0x10f0 net/core/dev.c:10341 default_device_exit_batch+0x44e/0x590 net/core/dev.c:11334 ops_exit_list+0x125/0x170 net/core/net_namespace.c:167 cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK>
When unregistering a net device, unregister_netdevice_many_notify() sets the device's reg_state to NETREG_UNREGISTERING, calls notifiers with NETDEV_UNREGISTER, and adds the device to the todo list.
Later on, devices in the todo list are processed by netdev_run_todo(). netdev_run_todo() waits devices' reference count become 1 while rebdoadcasting NETDEV_UNREGISTER notification.
When cfusbl_device_notify() is called with NETDEV_UNREGISTER multiple times, the parent device might be freed. This could cause UAF. Processing NETDEV_UNREGISTER multiple times also causes inbalance of reference count for the module.
This patch fixes the issue by accepting only first NETDEV_UNREGISTER notification.
Fixes: 7ad65bf68d70 ("caif: Add support for CAIF over CDC NCM USB interface") CC: sjur.brandeland@stericsson.com sjur.brandeland@stericsson.com Reported-by: syzbot+b563d33852b893653a9e@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=c3bfd8e2450adab3bffe4d80821fbbced600407... [1] Signed-off-by: Shigeru Yoshida syoshida@redhat.com Link: https://lore.kernel.org/r/20230301163913.391304-1-syoshida@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/caif/caif_usb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index b02e1292f7f19..24488a4e2d26e 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -134,6 +134,9 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what, struct usb_device *usbdev; int res;
+ if (what == NETDEV_UNREGISTER && dev->reg_state >= NETREG_UNREGISTERED) + return 0; + /* Check whether we have a NCM device, and find its VID/PID. */ if (!(dev->dev.parent && dev->dev.parent->driver && strcmp(dev->dev.parent->driver->name, "cdc_ncm") == 0))
From: Petr Oros poros@redhat.com
[ Upstream commit 84cba1840e68430325ac133a11be06bfb2f7acd8 ]
ice_get_module_eeprom() is broken since commit e9c9692c8a81 ("ice: Reimplement module reads used by ethtool") In this refactor, ice_get_module_eeprom() reads the eeprom in blocks of size 8. But the condition that should protect the buffer overflow ignores the last block. The last block always contains zeros.
Bug uncovered by ethtool upstream commit 9538f384b535 ("netlink: eeprom: Defer page requests to individual parsers") After this commit, ethtool reads a block with length = 1; to read the SFF-8024 identifier value.
unpatched driver: $ ethtool -m enp65s0f0np0 offset 0x90 length 8 Offset Values ------ ------ 0x0090: 00 00 00 00 00 00 00 00 $ ethtool -m enp65s0f0np0 offset 0x90 length 12 Offset Values ------ ------ 0x0090: 00 00 01 a0 4d 65 6c 6c 00 00 00 00 $
$ ethtool -m enp65s0f0np0 Offset Values ------ ------ 0x0000: 11 06 06 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 08 00 0x0070: 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00
patched driver: $ ethtool -m enp65s0f0np0 offset 0x90 length 8 Offset Values ------ ------ 0x0090: 00 00 01 a0 4d 65 6c 6c $ ethtool -m enp65s0f0np0 offset 0x90 length 12 Offset Values ------ ------ 0x0090: 00 00 01 a0 4d 65 6c 6c 61 6e 6f 78 $ ethtool -m enp65s0f0np0 Identifier : 0x11 (QSFP28) Extended identifier : 0x00 Extended identifier description : 1.5W max. Power consumption Extended identifier description : No CDR in TX, No CDR in RX Extended identifier description : High Power Class (> 3.5 W) not enabled Connector : 0x23 (No separable connector) Transceiver codes : 0x88 0x00 0x00 0x00 0x00 0x00 0x00 0x00 Transceiver type : 40G Ethernet: 40G Base-CR4 Transceiver type : 25G Ethernet: 25G Base-CR CA-N Encoding : 0x05 (64B/66B) BR, Nominal : 25500Mbps Rate identifier : 0x00 Length (SMF,km) : 0km Length (OM3 50um) : 0m Length (OM2 50um) : 0m Length (OM1 62.5um) : 0m Length (Copper or Active cable) : 1m Transmitter technology : 0xa0 (Copper cable unequalized) Attenuation at 2.5GHz : 4db Attenuation at 5.0GHz : 5db Attenuation at 7.0GHz : 7db Attenuation at 12.9GHz : 10db ........ ....
Fixes: e9c9692c8a81 ("ice: Reimplement module reads used by ethtool") Signed-off-by: Petr Oros poros@redhat.com Reviewed-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Jesse Brandeburg jesse.brandeburg@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_ethtool.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c index 24001035910e0..60f73e775beeb 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c @@ -3998,6 +3998,8 @@ ice_get_module_eeprom(struct net_device *netdev, * SFP modules only ever use page 0. */ if (page == 0 || !(data[0x2] & 0x4)) { + u32 copy_len; + /* If i2c bus is busy due to slow page change or * link management access, call can fail. This is normal. * So we retry this a few times. @@ -4021,8 +4023,8 @@ ice_get_module_eeprom(struct net_device *netdev, }
/* Make sure we have enough room for the new block */ - if ((i + SFF_READ_BLOCK_SIZE) < ee->len) - memcpy(data + i, value, SFF_READ_BLOCK_SIZE); + copy_len = min_t(u32, SFF_READ_BLOCK_SIZE, ee->len - i); + memcpy(data + i, value, copy_len); } } return 0;
From: Liu Jian liujian56@huawei.com
[ Upstream commit d900f3d20cc3169ce42ec72acc850e662a4d4db2 ]
When the buffer length of the recvmsg system call is 0, we got the flollowing soft lockup problem:
watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149] CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:remove_wait_queue+0xb/0xc0 Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20 RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768 RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040 RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7 R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800 R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0 FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_msg_wait_data+0x279/0x2f0 tcp_bpf_recvmsg_parser+0x3c6/0x490 inet_recvmsg+0x280/0x290 sock_recvmsg+0xfc/0x120 ____sys_recvmsg+0x160/0x3d0 ___sys_recvmsg+0xf0/0x180 __sys_recvmsg+0xea/0x1a0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc
The logic in tcp_bpf_recvmsg_parser is as follows:
msg_bytes_ready: copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { wait data; goto msg_bytes_ready; }
In this case, "copied" always is 0, the infinite loop occurs.
According to the Linux system call man page, 0 should be returned in this case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly return. Also modify several other functions with the same problem.
Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap") Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()") Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self") Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Liu Jian liujian56@huawei.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: John Fastabend john.fastabend@gmail.com Cc: Jakub Sitnicki jakub@cloudflare.com Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bpf.c | 6 ++++++ net/ipv4/udp_bpf.c | 3 +++ net/unix/unix_bpf.c | 3 +++ 3 files changed, 12 insertions(+)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 7f34c455651db..20ad554af3693 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -187,6 +187,9 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk, if (unlikely(flags & MSG_ERRQUEUE)) return inet_recv_error(sk, msg, len, addr_len);
+ if (!len) + return 0; + psock = sk_psock_get(sk); if (unlikely(!psock)) return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); @@ -245,6 +248,9 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, if (unlikely(flags & MSG_ERRQUEUE)) return inet_recv_error(sk, msg, len, addr_len);
+ if (!len) + return 0; + psock = sk_psock_get(sk); if (unlikely(!psock)) return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len); diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index bbe6569c9ad34..56e1047632f6b 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -69,6 +69,9 @@ static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, if (unlikely(flags & MSG_ERRQUEUE)) return inet_recv_error(sk, msg, len, addr_len);
+ if (!len) + return 0; + psock = sk_psock_get(sk); if (unlikely(!psock)) return sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c index 452376c6f4194..5919d61d9874a 100644 --- a/net/unix/unix_bpf.c +++ b/net/unix/unix_bpf.c @@ -55,6 +55,9 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg, struct sk_psock *psock; int copied;
+ if (!len) + return 0; + psock = sk_psock_get(sk); if (unlikely(!psock)) return __unix_recvmsg(sk, msg, len, flags);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit ce6bd00abc220e9edf10986234fadba6462b4abf ]
Change sc7180's ctl block len to 0x1dc.
Fixes: 7bdc0c4b8126 ("msm:disp:dpu1: add support for display for SC7180 target") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/522210/ Link: https://lore.kernel.org/r/20230211231259.1308718-5-dmitry.baryshkov@linaro.o... Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c index 700d65e39feb0..4c65259eecb9d 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c @@ -375,19 +375,19 @@ static const struct dpu_ctl_cfg sdm845_ctl[] = { static const struct dpu_ctl_cfg sc7180_ctl[] = { { .name = "ctl_0", .id = CTL_0, - .base = 0x1000, .len = 0xE4, + .base = 0x1000, .len = 0x1dc, .features = BIT(DPU_CTL_ACTIVE_CFG), .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9), }, { .name = "ctl_1", .id = CTL_1, - .base = 0x1200, .len = 0xE4, + .base = 0x1200, .len = 0x1dc, .features = BIT(DPU_CTL_ACTIVE_CFG), .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10), }, { .name = "ctl_2", .id = CTL_2, - .base = 0x1400, .len = 0xE4, + .base = 0x1400, .len = 0x1dc, .features = BIT(DPU_CTL_ACTIVE_CFG), .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 11), },
From: Rongguang Wei weirongguang@kylinos.cn
[ Upstream commit a9334b702a03b693f54ebd3b98f67bf722b74870 ]
When MAC is not support PMT, driver will check PHY's WoL capability and set device wakeup capability in stmmac_init_phy(). We can enable the WoL through ethtool, the driver would enable the device wake up flag. Now the device_may_wakeup() return true.
But if there is a way which enable the PHY's WoL capability derectly, like in BIOS. The driver would not know the enable thing and would not set the device wake up flag. The phy_suspend may failed like this:
[ 32.409063] PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x50 returns -16 [ 32.409065] PM: Device stmmac-1:00 failed to suspend: error -16 [ 32.409067] PM: Some devices failed to suspend, or early wake event detected
Add to set the device wakeup enable flag according to the get_wol function result in PHY can fix the error in this scene.
v2: add a Fixes tag.
Fixes: 1d8e5b0f3f2c ("net: stmmac: Support WOL with phy") Signed-off-by: Rongguang Wei weirongguang@kylinos.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index d56f65338ea66..728e68971c397 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1262,6 +1262,7 @@ static int stmmac_init_phy(struct net_device *dev)
phylink_ethtool_get_wol(priv->phylink, &wol); device_set_wakeup_capable(priv->device, !!wol.supported); + device_set_wakeup_enable(priv->device, !!wol.wolopts); }
return ret;
From: Russell King (Oracle) rmk+kernel@armlinux.org.uk
[ Upstream commit f4b47a2e9463950df3e7c8b70e017877c1d4eb11 ]
The locking in phy_probe() and phy_remove() does very little to prevent any races with e.g. phy_attach_direct(), but instead causes lockdep ABBA warnings. Remove it.
====================================================== WARNING: possible circular locking dependency detected 6.2.0-dirty #1108 Tainted: G W E ------------------------------------------------------ ip/415 is trying to acquire lock: ffff5c268f81ef50 (&dev->lock){+.+.}-{3:3}, at: phy_attach_direct+0x17c/0x3a0 [libphy]
but task is already holding lock: ffffaef6496cb518 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x154/0x560
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (rtnl_mutex){+.+.}-{3:3}: __lock_acquire+0x35c/0x6c0 lock_acquire.part.0+0xcc/0x220 lock_acquire+0x68/0x84 __mutex_lock+0x8c/0x414 mutex_lock_nested+0x34/0x40 rtnl_lock+0x24/0x30 sfp_bus_add_upstream+0x34/0x150 phy_sfp_probe+0x4c/0x94 [libphy] mv3310_probe+0x148/0x184 [marvell10g] phy_probe+0x8c/0x200 [libphy] call_driver_probe+0xbc/0x15c really_probe+0xc0/0x320 __driver_probe_device+0x84/0x120 driver_probe_device+0x44/0x120 __device_attach_driver+0xc4/0x160 bus_for_each_drv+0x80/0xe0 __device_attach+0xb0/0x1f0 device_initial_probe+0x1c/0x2c bus_probe_device+0xa4/0xb0 device_add+0x360/0x53c phy_device_register+0x60/0xa4 [libphy] fwnode_mdiobus_phy_device_register+0xc0/0x190 [fwnode_mdio] fwnode_mdiobus_register_phy+0x160/0xd80 [fwnode_mdio] of_mdiobus_register+0x140/0x340 [of_mdio] orion_mdio_probe+0x298/0x3c0 [mvmdio] platform_probe+0x70/0xe0 call_driver_probe+0x34/0x15c really_probe+0xc0/0x320 __driver_probe_device+0x84/0x120 driver_probe_device+0x44/0x120 __driver_attach+0x104/0x210 bus_for_each_dev+0x78/0xdc driver_attach+0x2c/0x3c bus_add_driver+0x184/0x240 driver_register+0x80/0x13c __platform_driver_register+0x30/0x3c xt_compat_calc_jump+0x28/0xa4 [x_tables] do_one_initcall+0x50/0x1b0 do_init_module+0x50/0x1fc load_module+0x684/0x744 __do_sys_finit_module+0xc4/0x140 __arm64_sys_finit_module+0x28/0x34 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0x6c/0x1b0 do_el0_svc+0x34/0x44 el0_svc+0x48/0xf0 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x1a0/0x1a4
-> #0 (&dev->lock){+.+.}-{3:3}: check_prev_add+0xb4/0xc80 validate_chain+0x414/0x47c __lock_acquire+0x35c/0x6c0 lock_acquire.part.0+0xcc/0x220 lock_acquire+0x68/0x84 __mutex_lock+0x8c/0x414 mutex_lock_nested+0x34/0x40 phy_attach_direct+0x17c/0x3a0 [libphy] phylink_fwnode_phy_connect.part.0+0x70/0xe4 [phylink] phylink_fwnode_phy_connect+0x48/0x60 [phylink] mvpp2_open+0xec/0x2e0 [mvpp2] __dev_open+0x104/0x214 __dev_change_flags+0x1d4/0x254 dev_change_flags+0x2c/0x7c do_setlink+0x254/0xa50 __rtnl_newlink+0x430/0x514 rtnl_newlink+0x58/0x8c rtnetlink_rcv_msg+0x17c/0x560 netlink_rcv_skb+0x64/0x150 rtnetlink_rcv+0x20/0x30 netlink_unicast+0x1d4/0x2b4 netlink_sendmsg+0x1a4/0x400 ____sys_sendmsg+0x228/0x290 ___sys_sendmsg+0x88/0xec __sys_sendmsg+0x70/0xd0 __arm64_sys_sendmsg+0x2c/0x40 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0x6c/0x1b0 do_el0_svc+0x34/0x44 el0_svc+0x48/0xf0 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x1a0/0x1a4
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(&dev->lock); lock(rtnl_mutex); lock(&dev->lock);
*** DEADLOCK ***
Fixes: 298e54fa810e ("net: phy: add core phylib sfp support") Reported-by: Marc Zyngier maz@kernel.org Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 996842a1a9a35..73485383db4ef 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -3052,8 +3052,6 @@ static int phy_probe(struct device *dev) if (phydrv->flags & PHY_IS_INTERNAL) phydev->is_internal = true;
- mutex_lock(&phydev->lock); - /* Deassert the reset signal */ phy_device_reset(phydev, 0);
@@ -3121,12 +3119,10 @@ static int phy_probe(struct device *dev) phydev->state = PHY_READY;
out: - /* Assert the reset signal */ + /* Re-assert the reset signal on error */ if (err) phy_device_reset(phydev, 1);
- mutex_unlock(&phydev->lock); - return err; }
@@ -3136,9 +3132,7 @@ static int phy_remove(struct device *dev)
cancel_delayed_work_sync(&phydev->state_queue);
- mutex_lock(&phydev->lock); phydev->state = PHY_DOWN; - mutex_unlock(&phydev->lock);
sfp_bus_del_upstream(phydev->sfp_bus); phydev->sfp_bus = NULL;
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit accd7e23693aaaa9aa0d3e9eca0ae77d1be80ab3 ]
The driver needs to keep track of all the possible concurrent TPA (GRO/LRO) completions on the aggregation ring. On P5 chips, the maximum number of concurrent TPA is 256 and the amount of memory we allocate is order-5 on systems using 4K pages. Memory allocation failure has been reported:
NetworkManager: page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0-1 CPU: 15 PID: 2995 Comm: NetworkManager Kdump: loaded Not tainted 5.10.156 #1 Hardware name: Dell Inc. PowerEdge R660/0M1CC5, BIOS 0.2.25 08/12/2022 Call Trace: dump_stack+0x57/0x6e warn_alloc.cold.120+0x7b/0xdd ? _cond_resched+0x15/0x30 ? __alloc_pages_direct_compact+0x15f/0x170 __alloc_pages_slowpath.constprop.108+0xc58/0xc70 __alloc_pages_nodemask+0x2d0/0x300 kmalloc_order+0x24/0xe0 kmalloc_order_trace+0x19/0x80 bnxt_alloc_mem+0x1150/0x15c0 [bnxt_en] ? bnxt_get_func_stat_ctxs+0x13/0x60 [bnxt_en] __bnxt_open_nic+0x12e/0x780 [bnxt_en] bnxt_open+0x10b/0x240 [bnxt_en] __dev_open+0xe9/0x180 __dev_change_flags+0x1af/0x220 dev_change_flags+0x21/0x60 do_setlink+0x35c/0x1100
Instead of allocating this big chunk of memory and dividing it up for the concurrent TPA instances, allocate each small chunk separately for each TPA instance. This will reduce it to order-0 allocations.
Fixes: 79632e9ba386 ("bnxt_en: Expand bnxt_tpa_info struct to support 57500 chips.") Reviewed-by: Somnath Kotur somnath.kotur@broadcom.com Reviewed-by: Damodharam Ammepalli damodharam.ammepalli@broadcom.com Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index f64df4d532896..4e98e34fc46b5 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2999,7 +2999,7 @@ static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
static void bnxt_free_tpa_info(struct bnxt *bp) { - int i; + int i, j;
for (i = 0; i < bp->rx_nr_rings; i++) { struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i]; @@ -3007,8 +3007,10 @@ static void bnxt_free_tpa_info(struct bnxt *bp) kfree(rxr->rx_tpa_idx_map); rxr->rx_tpa_idx_map = NULL; if (rxr->rx_tpa) { - kfree(rxr->rx_tpa[0].agg_arr); - rxr->rx_tpa[0].agg_arr = NULL; + for (j = 0; j < bp->max_tpa; j++) { + kfree(rxr->rx_tpa[j].agg_arr); + rxr->rx_tpa[j].agg_arr = NULL; + } } kfree(rxr->rx_tpa); rxr->rx_tpa = NULL; @@ -3017,14 +3019,13 @@ static void bnxt_free_tpa_info(struct bnxt *bp)
static int bnxt_alloc_tpa_info(struct bnxt *bp) { - int i, j, total_aggs = 0; + int i, j;
bp->max_tpa = MAX_TPA; if (bp->flags & BNXT_FLAG_CHIP_P5) { if (!bp->max_tpa_v2) return 0; bp->max_tpa = max_t(u16, bp->max_tpa_v2, MAX_TPA_P5); - total_aggs = bp->max_tpa * MAX_SKB_FRAGS; }
for (i = 0; i < bp->rx_nr_rings; i++) { @@ -3038,12 +3039,12 @@ static int bnxt_alloc_tpa_info(struct bnxt *bp)
if (!(bp->flags & BNXT_FLAG_CHIP_P5)) continue; - agg = kcalloc(total_aggs, sizeof(*agg), GFP_KERNEL); - rxr->rx_tpa[0].agg_arr = agg; - if (!agg) - return -ENOMEM; - for (j = 1; j < bp->max_tpa; j++) - rxr->rx_tpa[j].agg_arr = agg + j * MAX_SKB_FRAGS; + for (j = 0; j < bp->max_tpa; j++) { + agg = kcalloc(MAX_SKB_FRAGS, sizeof(*agg), GFP_KERNEL); + if (!agg) + return -ENOMEM; + rxr->rx_tpa[j].agg_arr = agg; + } rxr->rx_tpa_idx_map = kzalloc(sizeof(*rxr->rx_tpa_idx_map), GFP_KERNEL); if (!rxr->rx_tpa_idx_map)
From: Ivan Delalande colona@arista.com
[ Upstream commit 9f7dd42f0db1dc6915a52d4a8a96ca18dd8cc34e ]
It seems that change was unintentional, we have userspace code that needs the mark while listening for events like REPLY, DESTROY, etc. Also include 0-marks in requested dumps, as they were before that fix.
Fixes: 1feeae071507 ("netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark") Signed-off-by: Ivan Delalande colona@arista.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_netlink.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 18a508783c282..8776e75b200ff 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -322,11 +322,12 @@ ctnetlink_dump_timestamp(struct sk_buff *skb, const struct nf_conn *ct) }
#ifdef CONFIG_NF_CONNTRACK_MARK -static int ctnetlink_dump_mark(struct sk_buff *skb, const struct nf_conn *ct) +static int ctnetlink_dump_mark(struct sk_buff *skb, const struct nf_conn *ct, + bool dump) { u32 mark = READ_ONCE(ct->mark);
- if (!mark) + if (!mark && !dump) return 0;
if (nla_put_be32(skb, CTA_MARK, htonl(mark))) @@ -337,7 +338,7 @@ static int ctnetlink_dump_mark(struct sk_buff *skb, const struct nf_conn *ct) return -1; } #else -#define ctnetlink_dump_mark(a, b) (0) +#define ctnetlink_dump_mark(a, b, c) (0) #endif
#ifdef CONFIG_NF_CONNTRACK_SECMARK @@ -542,7 +543,7 @@ static int ctnetlink_dump_extinfo(struct sk_buff *skb, static int ctnetlink_dump_info(struct sk_buff *skb, struct nf_conn *ct) { if (ctnetlink_dump_status(skb, ct) < 0 || - ctnetlink_dump_mark(skb, ct) < 0 || + ctnetlink_dump_mark(skb, ct, true) < 0 || ctnetlink_dump_secctx(skb, ct) < 0 || ctnetlink_dump_id(skb, ct) < 0 || ctnetlink_dump_use(skb, ct) < 0 || @@ -825,8 +826,7 @@ ctnetlink_conntrack_event(unsigned int events, const struct nf_ct_event *item) }
#ifdef CONFIG_NF_CONNTRACK_MARK - if (events & (1 << IPCT_MARK) && - ctnetlink_dump_mark(skb, ct) < 0) + if (ctnetlink_dump_mark(skb, ct, events & (1 << IPCT_MARK))) goto nla_put_failure; #endif nlmsg_end(skb, nlh); @@ -2759,7 +2759,7 @@ static int __ctnetlink_glue_build(struct sk_buff *skb, struct nf_conn *ct) goto nla_put_failure;
#ifdef CONFIG_NF_CONNTRACK_MARK - if (ctnetlink_dump_mark(skb, ct) < 0) + if (ctnetlink_dump_mark(skb, ct, true) < 0) goto nla_put_failure; #endif if (ctnetlink_dump_labels(skb, ct) < 0)
From: Florian Westphal fw@strlen.de
[ Upstream commit 4a02426787bf024dafdb79b362285ee325de3f5e ]
The xtables packet traverser performs an unconditional local_bh_disable(), but the nf_tables evaluation loop does not.
Functions that are called from either xtables or nftables must assume that they can be called in process context.
inet_twsk_deschedule_put() assumes that no softirq interrupt can occur. If tproxy is used from nf_tables its possible that we'll deadlock trying to aquire a lock already held in process context.
Add a small helper that takes care of this and use it.
Link: https://lore.kernel.org/netfilter-devel/401bd6ed-314a-a196-1cdc-e13c720cc8f2... Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Reported-and-tested-by: Major Dávid major.david@balasys.hu Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tproxy.h | 7 +++++++ net/ipv4/netfilter/nf_tproxy_ipv4.c | 2 +- net/ipv6/netfilter/nf_tproxy_ipv6.c | 2 +- 3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/include/net/netfilter/nf_tproxy.h b/include/net/netfilter/nf_tproxy.h index 82d0e41b76f22..faa108b1ba675 100644 --- a/include/net/netfilter/nf_tproxy.h +++ b/include/net/netfilter/nf_tproxy.h @@ -17,6 +17,13 @@ static inline bool nf_tproxy_sk_is_transparent(struct sock *sk) return false; }
+static inline void nf_tproxy_twsk_deschedule_put(struct inet_timewait_sock *tw) +{ + local_bh_disable(); + inet_twsk_deschedule_put(tw); + local_bh_enable(); +} + /* assign a socket to the skb -- consumes sk */ static inline void nf_tproxy_assign_sock(struct sk_buff *skb, struct sock *sk) { diff --git a/net/ipv4/netfilter/nf_tproxy_ipv4.c b/net/ipv4/netfilter/nf_tproxy_ipv4.c index b2bae0b0e42a1..61cb2341f50fe 100644 --- a/net/ipv4/netfilter/nf_tproxy_ipv4.c +++ b/net/ipv4/netfilter/nf_tproxy_ipv4.c @@ -38,7 +38,7 @@ nf_tproxy_handle_time_wait4(struct net *net, struct sk_buff *skb, hp->source, lport ? lport : hp->dest, skb->dev, NF_TPROXY_LOOKUP_LISTENER); if (sk2) { - inet_twsk_deschedule_put(inet_twsk(sk)); + nf_tproxy_twsk_deschedule_put(inet_twsk(sk)); sk = sk2; } } diff --git a/net/ipv6/netfilter/nf_tproxy_ipv6.c b/net/ipv6/netfilter/nf_tproxy_ipv6.c index 6bac68fb27a39..3fe4f15e01dc8 100644 --- a/net/ipv6/netfilter/nf_tproxy_ipv6.c +++ b/net/ipv6/netfilter/nf_tproxy_ipv6.c @@ -63,7 +63,7 @@ nf_tproxy_handle_time_wait6(struct sk_buff *skb, int tproto, int thoff, lport ? lport : hp->dest, skb->dev, NF_TPROXY_LOOKUP_LISTENER); if (sk2) { - inet_twsk_deschedule_put(inet_twsk(sk)); + nf_tproxy_twsk_deschedule_put(inet_twsk(sk)); sk = sk2; } }
From: Lorenz Bauer lorenz.bauer@isovalent.com
[ Upstream commit 9b459804ff9973e173fabafba2a1319f771e85fa ]
btf_datasec_resolve contains a bug that causes the following BTF to fail loading:
[1] DATASEC a size=2 vlen=2 type_id=4 offset=0 size=1 type_id=7 offset=1 size=1 [2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [3] PTR (anon) type_id=2 [4] VAR a type_id=3 linkage=0 [5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [6] TYPEDEF td type_id=5 [7] VAR b type_id=6 linkage=0
This error message is printed during btf_check_all_types:
[1] DATASEC a size=2 vlen=2 type_id=7 offset=1 size=1 Invalid type
By tracing btf_*_resolve we can pinpoint the problem:
btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0 btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0 btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22
The last invocation of btf_datasec_resolve should invoke btf_var_resolve by means of env_stack_push, instead it returns EINVAL. The reason is that env_stack_push is never executed for the second VAR.
if (!env_type_is_resolve_sink(env, var_type) && !env_type_is_resolved(env, var_type_id)) { env_stack_set_next_member(env, i + 1); return env_stack_push(env, var_type, var_type_id); }
env_type_is_resolve_sink() changes its behaviour based on resolve_mode. For RESOLVE_PTR, we can simplify the if condition to the following:
(btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved()
Since we're dealing with a VAR the clause evaluates to false. This is not sufficient to trigger the bug however. The log output and EINVAL are only generated if btf_type_id_size() fails.
if (!btf_type_id_size(btf, &type_id, &type_size)) { btf_verifier_log_vsi(env, v->t, vsi, "Invalid type"); return -EINVAL; }
Most types are sized, so for example a VAR referring to an INT is not a problem. The bug is only triggered if a VAR points at a modifier. Since we skipped btf_var_resolve that modifier was also never resolved, which means that btf_resolved_type_id returns 0 aka VOID for the modifier. This in turn causes btf_type_id_size to return NULL, triggering EINVAL.
To summarise, the following conditions are necessary:
- VAR pointing at PTR, STRUCT, UNION or ARRAY - Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or TYPE_TAG
The fix is to reset resolve_mode to RESOLVE_TBD before attempting to resolve a VAR from a DATASEC.
Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") Signed-off-by: Lorenz Bauer lmb@isovalent.com Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/btf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 1f9369b677fe2..6c7126de5c17f 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -3655,6 +3655,7 @@ static int btf_datasec_resolve(struct btf_verifier_env *env, struct btf *btf = env->btf; u16 i;
+ env->resolve_mode = RESOLVE_TBD; for_each_vsi_from(i, v->next_member, v->t, vsi) { u32 var_type_id = vsi->type, type_id, type_size = 0; const struct btf_type *var_type = btf_type_by_id(env->btf,
From: Lukas Wunner lukas@wunner.de
[ Upstream commit 7e8b617eb93f9fcaedac02cd19edcad31c767386 ]
Cache the interrupt mask to avoid re-reading it from the PHY upon every interrupt.
This will simplify a subsequent commit which detects hot-removal in the interrupt handler and bails out.
Analyzing and debugging PHY transactions also becomes simpler if such redundant reads are avoided.
Last not least, interrupt overhead and latency is slightly improved.
Tested-by: Oleksij Rempel o.rempel@pengutronix.de # LAN9514/9512/9500 Tested-by: Ferry Toth fntoth@gmail.com # LAN9514 Signed-off-by: Lukas Wunner lukas@wunner.de Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 58aac3a2ef41 ("net: phy: smsc: fix link up detection in forced irq mode") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/smsc.c | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c index 636b0907a5987..63dca3bb56110 100644 --- a/drivers/net/phy/smsc.c +++ b/drivers/net/phy/smsc.c @@ -44,6 +44,7 @@ static struct smsc_hw_stat smsc_hw_stats[] = { };
struct smsc_phy_priv { + u16 intmask; bool energy_enable; struct clk *refclk; }; @@ -58,7 +59,6 @@ static int smsc_phy_ack_interrupt(struct phy_device *phydev) static int smsc_phy_config_intr(struct phy_device *phydev) { struct smsc_phy_priv *priv = phydev->priv; - u16 intmask = 0; int rc;
if (phydev->interrupts == PHY_INTERRUPT_ENABLED) { @@ -66,12 +66,15 @@ static int smsc_phy_config_intr(struct phy_device *phydev) if (rc) return rc;
- intmask = MII_LAN83C185_ISF_INT4 | MII_LAN83C185_ISF_INT6; + priv->intmask = MII_LAN83C185_ISF_INT4 | MII_LAN83C185_ISF_INT6; if (priv->energy_enable) - intmask |= MII_LAN83C185_ISF_INT7; - rc = phy_write(phydev, MII_LAN83C185_IM, intmask); + priv->intmask |= MII_LAN83C185_ISF_INT7; + + rc = phy_write(phydev, MII_LAN83C185_IM, priv->intmask); } else { - rc = phy_write(phydev, MII_LAN83C185_IM, intmask); + priv->intmask = 0; + + rc = phy_write(phydev, MII_LAN83C185_IM, 0); if (rc) return rc;
@@ -83,13 +86,8 @@ static int smsc_phy_config_intr(struct phy_device *phydev)
static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev) { - int irq_status, irq_enabled; - - irq_enabled = phy_read(phydev, MII_LAN83C185_IM); - if (irq_enabled < 0) { - phy_error(phydev); - return IRQ_NONE; - } + struct smsc_phy_priv *priv = phydev->priv; + int irq_status;
irq_status = phy_read(phydev, MII_LAN83C185_ISF); if (irq_status < 0) { @@ -97,7 +95,7 @@ static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; }
- if (!(irq_status & irq_enabled)) + if (!(irq_status & priv->intmask)) return IRQ_NONE;
phy_trigger_machine(phydev);
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 58aac3a2ef414fea6d7fdf823ea177744a087d13 ]
Currently link up can't be detected in forced mode if polling isn't used. Only link up interrupt source we have is aneg complete which isn't applicable in forced mode. Therefore we have to use energy-on as link up indicator.
Fixes: 7365494550f6 ("net: phy: smsc: skip ENERGYON interrupt if disabled") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/smsc.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c index 63dca3bb56110..04e628788f1b5 100644 --- a/drivers/net/phy/smsc.c +++ b/drivers/net/phy/smsc.c @@ -44,7 +44,6 @@ static struct smsc_hw_stat smsc_hw_stats[] = { };
struct smsc_phy_priv { - u16 intmask; bool energy_enable; struct clk *refclk; }; @@ -58,7 +57,6 @@ static int smsc_phy_ack_interrupt(struct phy_device *phydev)
static int smsc_phy_config_intr(struct phy_device *phydev) { - struct smsc_phy_priv *priv = phydev->priv; int rc;
if (phydev->interrupts == PHY_INTERRUPT_ENABLED) { @@ -66,14 +64,9 @@ static int smsc_phy_config_intr(struct phy_device *phydev) if (rc) return rc;
- priv->intmask = MII_LAN83C185_ISF_INT4 | MII_LAN83C185_ISF_INT6; - if (priv->energy_enable) - priv->intmask |= MII_LAN83C185_ISF_INT7; - - rc = phy_write(phydev, MII_LAN83C185_IM, priv->intmask); + rc = phy_write(phydev, MII_LAN83C185_IM, + MII_LAN83C185_ISF_INT_PHYLIB_EVENTS); } else { - priv->intmask = 0; - rc = phy_write(phydev, MII_LAN83C185_IM, 0); if (rc) return rc; @@ -86,7 +79,6 @@ static int smsc_phy_config_intr(struct phy_device *phydev)
static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev) { - struct smsc_phy_priv *priv = phydev->priv; int irq_status;
irq_status = phy_read(phydev, MII_LAN83C185_ISF); @@ -95,7 +87,7 @@ static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev) return IRQ_NONE; }
- if (!(irq_status & priv->intmask)) + if (!(irq_status & MII_LAN83C185_ISF_INT_PHYLIB_EVENTS)) return IRQ_NONE;
phy_trigger_machine(phydev);
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit 193250ace270fecd586dd2d0dfbd9cbd2ade977f ]
Fix data corruption issue with SerDes connected PHYs operating at 1.25 Gbps speed where we could previously observe about 30% packet loss while the bad packet counter was increasing.
As almost all boards with MediaTek MT7622 or MT7986 use either the MT7531 switch IC operating at 3.125Gbps SerDes rate or single-port PHYs using rate-adaptation to 2500Base-X mode, this issue only got exposed now when we started trying to use SFP modules operating with 1.25 Gbps with the BananaPi R3 board.
The fix is to set bit 12 which disables the RX FIFO clear function when setting up MAC MCR, MediaTek SDK did the same change stating: "If without this patch, kernel might receive invalid packets that are corrupted by GMAC."[1]
[1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+...
Fixes: 42c03844e93d ("net-next: mediatek: add support for MediaTek MT7622 SoC") Tested-by: Bjørn Mork bjorn@mork.no Signed-off-by: Daniel Golle daniel@makrotopia.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/138da2735f92c8b6f8578ec2e5a794ee515b665f.167793731... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 3 ++- drivers/net/ethernet/mediatek/mtk_eth_soc.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index cc6a5b2f24e3e..bb1acdb0c62b3 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -363,7 +363,8 @@ static void mtk_mac_config(struct phylink_config *config, unsigned int mode, mcr_cur = mtk_r32(mac->hw, MTK_MAC_MCR(mac->id)); mcr_new = mcr_cur; mcr_new |= MAC_MCR_IPG_CFG | MAC_MCR_FORCE_MODE | - MAC_MCR_BACKOFF_EN | MAC_MCR_BACKPR_EN | MAC_MCR_FORCE_LINK; + MAC_MCR_BACKOFF_EN | MAC_MCR_BACKPR_EN | MAC_MCR_FORCE_LINK | + MAC_MCR_RX_FIFO_CLR_DIS;
/* Only update control register when needed! */ if (mcr_new != mcr_cur) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h index f2d90639d7ed1..d60260e00a3fc 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h @@ -369,6 +369,7 @@ #define MAC_MCR_FORCE_MODE BIT(15) #define MAC_MCR_TX_EN BIT(14) #define MAC_MCR_RX_EN BIT(13) +#define MAC_MCR_RX_FIFO_CLR_DIS BIT(12) #define MAC_MCR_BACKOFF_EN BIT(9) #define MAC_MCR_BACKPR_EN BIT(8) #define MAC_MCR_FORCE_RX_FC BIT(5)
From: Chandrakanth Patil chandrakanth.patil@broadcom.com
[ Upstream commit bfa659177dcba48cf13f2bd88c1972f12a60bf1c ]
The firmware only supports Logical Disk IDs up to 240 and LD ID 255 (0xFF) is reserved for deleted LDs. However, in some cases, firmware was assigning LD ID 254 (0xFE) to deleted LDs and this was causing the driver to mark the wrong disk as deleted. This in turn caused the wrong disk device to be taken offline by the SCSI midlayer.
To address this issue, limit the LD ID range from 255 to 240. This ensures the deleted LD ID is properly identified and removed by the driver without accidently deleting any valid LDs.
Fixes: ae6874ba4b43 ("scsi: megaraid_sas: Early detection of VD deletion through RaidMap update") Reported-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Chandrakanth Patil chandrakanth.patil@broadcom.com Signed-off-by: Sumit Saxena sumit.saxena@broadcom.com Link: https://lore.kernel.org/r/20230302105342.34933-2-chandrakanth.patil@broadcom... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_sas.h | 2 ++ drivers/scsi/megaraid/megaraid_sas_fp.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 650210d2abb4d..02d7ab119f806 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -1517,6 +1517,8 @@ struct megasas_ctrl_info { #define MEGASAS_MAX_LD_IDS (MEGASAS_MAX_LD_CHANNELS * \ MEGASAS_MAX_DEV_PER_CHANNEL)
+#define MEGASAS_MAX_SUPPORTED_LD_IDS 240 + #define MEGASAS_MAX_SECTORS (2*1024) #define MEGASAS_MAX_SECTORS_IEEE (2*128) #define MEGASAS_DBG_LVL 1 diff --git a/drivers/scsi/megaraid/megaraid_sas_fp.c b/drivers/scsi/megaraid/megaraid_sas_fp.c index 83f69c33b01a9..ec10d35b4685a 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fp.c +++ b/drivers/scsi/megaraid/megaraid_sas_fp.c @@ -358,7 +358,7 @@ u8 MR_ValidateMapInfo(struct megasas_instance *instance, u64 map_id) ld = MR_TargetIdToLdGet(i, drv_map);
/* For non existing VDs, iterate to next VD*/ - if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1)) + if (ld >= MEGASAS_MAX_SUPPORTED_LD_IDS) continue;
raid = MR_LdRaidGet(ld, drv_map);
From: Eric Dumazet edumazet@google.com
[ Upstream commit c77737b736ceb50fdf150434347dbd81ec76dbb1 ]
Customers using GKE 1.25 and 1.26 are facing conntrack issues root caused to commit c9c3b6811f74 ("netfilter: conntrack: make max chain length random").
Even if we assume Uniform Hashing, a bucket often reachs 8 chained items while the load factor of the hash table is smaller than 0.5
With a limit of 16, we reach load factors of 3. With a limit of 32, we reach load factors of 11. With a limit of 40, we reach load factors of 15. With a limit of 50, we reach load factors of 24.
This patch changes MIN_CHAINLEN to 50, to minimize risks.
Ideally, we could in the future add a cushion based on expected load factor (2 * nf_conntrack_max / nf_conntrack_buckets), because some setups might expect unusual values.
Fixes: c9c3b6811f74 ("netfilter: conntrack: make max chain length random") Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 43ea8cfd374bb..7ff0da5f998a0 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -96,8 +96,8 @@ static DEFINE_MUTEX(nf_conntrack_mutex); #define GC_SCAN_MAX_DURATION msecs_to_jiffies(10) #define GC_SCAN_EXPIRED_MAX (64000u / HZ)
-#define MIN_CHAINLEN 8u -#define MAX_CHAINLEN (32u - MIN_CHAINLEN) +#define MIN_CHAINLEN 50u +#define MAX_CHAINLEN (80u - MIN_CHAINLEN)
static struct conntrack_gc_work conntrack_gc_work;
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 7e7e1541c91615e9950d0b96bcd1806d297e970e ]
REGMAP is a hidden (not user visible) symbol. Users cannot set it directly thru "make *config", so drivers should select it instead of depending on it if they need it.
Consistently using "select" or "depends on" can also help reduce Kconfig circular dependency issues.
Therefore, change the use of "depends on REGMAP" to "select REGMAP".
Fixes: ef0f62264b2a ("platform/x86: mlx-platform: Add physical bus number auto detection") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Vadim Pasternak vadimp@mellanox.com Cc: Darren Hart dvhart@infradead.org Cc: Hans de Goede hdegoede@redhat.com Cc: Mark Gross markgross@kernel.org Cc: platform-driver-x86@vger.kernel.org Link: https://lore.kernel.org/r/20230226053953.4681-7-rdunlap@infradead.org Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index cd8146dbdd453..61186829d1f6b 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -943,7 +943,8 @@ config I2C_MULTI_INSTANTIATE
config MLX_PLATFORM tristate "Mellanox Technologies platform support" - depends on I2C && REGMAP + depends on I2C + select REGMAP help This option enables system support for the Mellanox Technologies platform. The Mellanox systems provide data center networking
From: D. Wythe alibuda@linux.alibaba.com
[ Upstream commit ce7ca794712f186da99719e8b4e97bd5ddbb04c3 ]
Before determining whether the msg has unsupported options, it has been prematurely terminated by the wrong status check.
For the application, the general usages of MSG_FASTOPEN likes
fd = socket(...) /* rather than connect */ sendto(fd, data, len, MSG_FASTOPEN)
Hence, We need to check the flag before state check, because the sock state here is always SMC_INIT when applications tries MSG_FASTOPEN. Once we found unsupported options, fallback it to TCP.
Fixes: ee9dfbef02d1 ("net/smc: handle sockopts forcing fallback") Signed-off-by: D. Wythe alibuda@linux.alibaba.com Signed-off-by: Simon Horman simon.horman@corigine.com
v2 -> v1: Optimize code style Reviewed-by: Tony Lu tonylu@linux.alibaba.com
Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index d5ddf283ed8e2..9cdb7df0801f3 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -2172,16 +2172,14 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) { struct sock *sk = sock->sk; struct smc_sock *smc; - int rc = -EPIPE; + int rc;
smc = smc_sk(sk); lock_sock(sk); - if ((sk->sk_state != SMC_ACTIVE) && - (sk->sk_state != SMC_APPCLOSEWAIT1) && - (sk->sk_state != SMC_INIT)) - goto out;
+ /* SMC does not support connect with fastopen */ if (msg->msg_flags & MSG_FASTOPEN) { + /* not connected yet, fallback */ if (sk->sk_state == SMC_INIT && !smc->connect_nonblock) { rc = smc_switch_to_fallback(smc, SMC_CLC_DECL_OPTUNSUPP); if (rc) @@ -2190,6 +2188,11 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) rc = -EINVAL; goto out; } + } else if ((sk->sk_state != SMC_ACTIVE) && + (sk->sk_state != SMC_APPCLOSEWAIT1) && + (sk->sk_state != SMC_INIT)) { + rc = -EPIPE; + goto out; }
if (smc->use_fallback) {
From: Suman Ghosh sumang@marvell.com
[ Upstream commit ea9dd2e5c6d12c8b65ce7514c8359a70eeaa0e70 ]
NDC caches contexts of frequently used queue's (Rx and Tx queues) contexts. Due to a HW errata when NDC detects fault/poision while accessing contexts it could go into an illegal state where a cache line could get locked forever. To makesure all cache lines in NDC are available for optimum performance upon fault/lockerror/posion errors scan through all cache lines in NDC and clear the lock bit.
Fixes: 4a3581cd5995 ("octeontx2-af: NPA AQ instruction enqueue support") Signed-off-by: Suman Ghosh sumang@marvell.com Signed-off-by: Sunil Kovvuri Goutham sgoutham@marvell.com Signed-off-by: Sai Krishna saikrishnag@marvell.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/marvell/octeontx2/af/rvu.h | 5 ++ .../marvell/octeontx2/af/rvu_debugfs.c | 7 +-- .../ethernet/marvell/octeontx2/af/rvu_nix.c | 16 ++++- .../ethernet/marvell/octeontx2/af/rvu_npa.c | 58 ++++++++++++++++++- .../ethernet/marvell/octeontx2/af/rvu_reg.h | 3 + 5 files changed, 82 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h index a7213db38804b..fed49d6a178d0 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h @@ -811,6 +811,9 @@ bool is_mcam_entry_enabled(struct rvu *rvu, struct npc_mcam *mcam, int blkaddr, /* CPT APIs */ int rvu_cpt_lf_teardown(struct rvu *rvu, u16 pcifunc, int lf, int slot);
+#define NDC_AF_BANK_MASK GENMASK_ULL(7, 0) +#define NDC_AF_BANK_LINE_MASK GENMASK_ULL(31, 16) + /* CN10K RVU */ int rvu_set_channels_base(struct rvu *rvu); void rvu_program_channels(struct rvu *rvu); @@ -826,6 +829,8 @@ static inline void rvu_dbg_init(struct rvu *rvu) {} static inline void rvu_dbg_exit(struct rvu *rvu) {} #endif
+int rvu_ndc_fix_locked_cacheline(struct rvu *rvu, int blkaddr); + /* RVU Switch */ void rvu_switch_enable(struct rvu *rvu); void rvu_switch_disable(struct rvu *rvu); diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c index 66d34699f160c..4dddf6ec3be87 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c @@ -196,9 +196,6 @@ enum cpt_eng_type { CPT_IE_TYPE = 3, };
-#define NDC_MAX_BANK(rvu, blk_addr) (rvu_read64(rvu, \ - blk_addr, NDC_AF_CONST) & 0xFF) - #define rvu_dbg_NULL NULL #define rvu_dbg_open_NULL NULL
@@ -1009,6 +1006,7 @@ static int ndc_blk_hits_miss_stats(struct seq_file *s, int idx, int blk_addr) struct nix_hw *nix_hw; struct rvu *rvu; int bank, max_bank; + u64 ndc_af_const;
if (blk_addr == BLKADDR_NDC_NPA0) { rvu = s->private; @@ -1017,7 +1015,8 @@ static int ndc_blk_hits_miss_stats(struct seq_file *s, int idx, int blk_addr) rvu = nix_hw->rvu; }
- max_bank = NDC_MAX_BANK(rvu, blk_addr); + ndc_af_const = rvu_read64(rvu, blk_addr, NDC_AF_CONST); + max_bank = FIELD_GET(NDC_AF_BANK_MASK, ndc_af_const); for (bank = 0; bank < max_bank; bank++) { seq_printf(s, "BANK:%d\n", bank); seq_printf(s, "\tHits:\t%lld\n", diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c index 09892703cfd46..d274d552924a3 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c @@ -797,6 +797,7 @@ static int nix_aq_enqueue_wait(struct rvu *rvu, struct rvu_block *block, struct nix_aq_res_s *result; int timeout = 1000; u64 reg, head; + int ret;
result = (struct nix_aq_res_s *)aq->res->base;
@@ -820,9 +821,22 @@ static int nix_aq_enqueue_wait(struct rvu *rvu, struct rvu_block *block, return -EBUSY; }
- if (result->compcode != NIX_AQ_COMP_GOOD) + if (result->compcode != NIX_AQ_COMP_GOOD) { /* TODO: Replace this with some error code */ + if (result->compcode == NIX_AQ_COMP_CTX_FAULT || + result->compcode == NIX_AQ_COMP_LOCKERR || + result->compcode == NIX_AQ_COMP_CTX_POISON) { + ret = rvu_ndc_fix_locked_cacheline(rvu, BLKADDR_NDC_NIX0_RX); + ret |= rvu_ndc_fix_locked_cacheline(rvu, BLKADDR_NDC_NIX0_TX); + ret |= rvu_ndc_fix_locked_cacheline(rvu, BLKADDR_NDC_NIX1_RX); + ret |= rvu_ndc_fix_locked_cacheline(rvu, BLKADDR_NDC_NIX1_TX); + if (ret) + dev_err(rvu->dev, + "%s: Not able to unlock cachelines\n", __func__); + } + return -EBUSY; + }
return 0; } diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c index 70bd036ed76e4..4f5ca5ab13a40 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c @@ -4,7 +4,7 @@ * Copyright (C) 2018 Marvell. * */ - +#include <linux/bitfield.h> #include <linux/module.h> #include <linux/pci.h>
@@ -42,9 +42,18 @@ static int npa_aq_enqueue_wait(struct rvu *rvu, struct rvu_block *block, return -EBUSY; }
- if (result->compcode != NPA_AQ_COMP_GOOD) + if (result->compcode != NPA_AQ_COMP_GOOD) { /* TODO: Replace this with some error code */ + if (result->compcode == NPA_AQ_COMP_CTX_FAULT || + result->compcode == NPA_AQ_COMP_LOCKERR || + result->compcode == NPA_AQ_COMP_CTX_POISON) { + if (rvu_ndc_fix_locked_cacheline(rvu, BLKADDR_NDC_NPA0)) + dev_err(rvu->dev, + "%s: Not able to unlock cachelines\n", __func__); + } + return -EBUSY; + }
return 0; } @@ -545,3 +554,48 @@ void rvu_npa_lf_teardown(struct rvu *rvu, u16 pcifunc, int npalf)
npa_ctx_free(rvu, pfvf); } + +/* Due to an Hardware errata, in some corner cases, AQ context lock + * operations can result in a NDC way getting into an illegal state + * of not valid but locked. + * + * This API solves the problem by clearing the lock bit of the NDC block. + * The operation needs to be done for each line of all the NDC banks. + */ +int rvu_ndc_fix_locked_cacheline(struct rvu *rvu, int blkaddr) +{ + int bank, max_bank, line, max_line, err; + u64 reg, ndc_af_const; + + /* Set the ENABLE bit(63) to '0' */ + reg = rvu_read64(rvu, blkaddr, NDC_AF_CAMS_RD_INTERVAL); + rvu_write64(rvu, blkaddr, NDC_AF_CAMS_RD_INTERVAL, reg & GENMASK_ULL(62, 0)); + + /* Poll until the BUSY bits(47:32) are set to '0' */ + err = rvu_poll_reg(rvu, blkaddr, NDC_AF_CAMS_RD_INTERVAL, GENMASK_ULL(47, 32), true); + if (err) { + dev_err(rvu->dev, "Timed out while polling for NDC CAM busy bits.\n"); + return err; + } + + ndc_af_const = rvu_read64(rvu, blkaddr, NDC_AF_CONST); + max_bank = FIELD_GET(NDC_AF_BANK_MASK, ndc_af_const); + max_line = FIELD_GET(NDC_AF_BANK_LINE_MASK, ndc_af_const); + for (bank = 0; bank < max_bank; bank++) { + for (line = 0; line < max_line; line++) { + /* Check if 'cache line valid bit(63)' is not set + * but 'cache line lock bit(60)' is set and on + * success, reset the lock bit(60). + */ + reg = rvu_read64(rvu, blkaddr, + NDC_AF_BANKX_LINEX_METADATA(bank, line)); + if (!(reg & BIT_ULL(63)) && (reg & BIT_ULL(60))) { + rvu_write64(rvu, blkaddr, + NDC_AF_BANKX_LINEX_METADATA(bank, line), + reg & ~BIT_ULL(60)); + } + } + } + + return 0; +} diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h index 21f1ed4e222f7..d81b63a0d430f 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_reg.h @@ -670,6 +670,7 @@ #define NDC_AF_INTR_ENA_W1S (0x00068) #define NDC_AF_INTR_ENA_W1C (0x00070) #define NDC_AF_ACTIVE_PC (0x00078) +#define NDC_AF_CAMS_RD_INTERVAL (0x00080) #define NDC_AF_BP_TEST_ENABLE (0x001F8) #define NDC_AF_BP_TEST(a) (0x00200 | (a) << 3) #define NDC_AF_BLK_RST (0x002F0) @@ -685,6 +686,8 @@ (0x00F00 | (a) << 5 | (b) << 4) #define NDC_AF_BANKX_HIT_PC(a) (0x01000 | (a) << 3) #define NDC_AF_BANKX_MISS_PC(a) (0x01100 | (a) << 3) +#define NDC_AF_BANKX_LINEX_METADATA(a, b) \ + (0x10000 | (a) << 12 | (b) << 3)
/* LBK */ #define LBK_CONST (0x10ull)
From: Benjamin Coddington bcodding@redhat.com
[ Upstream commit 9ca6705d9d609441d34f8b853e1e4a6369b3b171 ]
Fix a race where kthread_stop() may prevent the threadfn from ever getting called. If that happens the svc_rqst will not be cleaned up.
Fixes: ed6473ddc704 ("NFSv4: Fix callback server shutdown") Signed-off-by: Benjamin Coddington bcodding@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/svc.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c index 08ca797bb8a46..74a1c9116a785 100644 --- a/net/sunrpc/svc.c +++ b/net/sunrpc/svc.c @@ -806,6 +806,7 @@ EXPORT_SYMBOL_GPL(svc_set_num_threads); static int svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs) { + struct svc_rqst *rqstp; struct task_struct *task; unsigned int state = serv->sv_nrthreads-1;
@@ -814,7 +815,10 @@ svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs) task = choose_victim(serv, pool, &state); if (task == NULL) break; - kthread_stop(task); + rqstp = kthread_data(task); + /* Did we lose a race to svo_function threadfn? */ + if (kthread_stop(task) == -EINTR) + svc_exit_thread(rqstp); nrservs++; } while (nrservs < 0); return 0;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit c8b8a3c601f2cfad25ab5ce5b04df700048aef6e ]
The MT7530 switch from the MT7621 SoC has 2 ports which can be set up as internal: port 5 and 6. Arınç reports that the GMAC1 attached to port 5 receives corrupted frames, unless port 6 (attached to GMAC0) has been brought up by the driver. This is true regardless of whether port 5 is used as a user port or as a CPU port (carrying DSA tags).
Offline debugging (blind for me) which began in the linked thread showed experimentally that the configuration done by the driver for port 6 contains a step which is needed by port 5 as well - the write to CORE_GSWPLL_GRP2 (note that I've no idea as to what it does, apart from the comment "Set core clock into 500Mhz"). Prints put by Arınç show that the reset value of CORE_GSWPLL_GRP2 is RG_GSWPLL_POSDIV_500M(1) | RG_GSWPLL_FBKDIV_500M(40) (0x128), both on the MCM MT7530 from the MT7621 SoC, as well as on the standalone MT7530 from MT7623NI Bananapi BPI-R2. Apparently, port 5 on the standalone MT7530 can work under both values of the register, while on the MT7621 SoC it cannot.
The call path that triggers the register write is:
mt753x_phylink_mac_config() for port 6 -> mt753x_pad_setup() -> mt7530_pad_clk_setup()
so this fully explains the behavior noticed by Arınç, that bringing port 6 up is necessary.
The simplest fix for the problem is to extract the register writes which are needed for both port 5 and 6 into a common mt7530_pll_setup() function, which is called at mt7530_setup() time, immediately after switch reset. We can argue that this mirrors the code layout introduced in mt7531_setup() by commit 42bc4fafe359 ("net: mt7531: only do PLL once after the reset"), in that the PLL setup has the exact same positioning, and further work to consolidate the separate setup() functions is not hindered.
Testing confirms that:
- the slight reordering of writes to MT7530_P6ECR and to CORE_GSWPLL_GRP1 / CORE_GSWPLL_GRP2 introduced by this change does not appear to cause problems for the operation of port 6 on MT7621 and on MT7623 (where port 5 also always worked)
- packets sent through port 5 are not corrupted anymore, regardless of whether port 6 is enabled by phylink or not (or even present in the device tree)
My algorithm for determining the Fixes: tag is as follows. Testing shows that some logic from mt7530_pad_clk_setup() is needed even for port 5. Prior to commit ca366d6c889b ("net: dsa: mt7530: Convert to PHYLINK API"), a call did exist for all phy_is_pseudo_fixed_link() ports - so port 5 included. That commit replaced it with a temporary "Port 5 is not supported!" comment, and the following commit 38f790a80560 ("net: dsa: mt7530: Add support for port 5") replaced that comment with a configuration procedure in mt7530_setup_port5() which was insufficient for port 5 to work. I'm laying the blame on the patch that claimed support for port 5, although one would have also needed the change from commit c3b8e07909db ("net: dsa: mt7530: setup core clock even in TRGMII mode") for the write to be performed completely independently from port 6's configuration.
Thanks go to Arınç for describing the problem, for debugging and for testing.
Reported-by: Arınç ÜNAL arinc.unal@arinc9.com Link: https://lore.kernel.org/netdev/f297c2c4-6e7c-57ac-2394-f6025d309b9d@arinc9.c... Fixes: 38f790a80560 ("net: dsa: mt7530: Add support for port 5") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Tested-by: Arınç ÜNAL arinc.unal@arinc9.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230307155411.868573-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mt7530.c | 35 ++++++++++++++++++++--------------- 1 file changed, 20 insertions(+), 15 deletions(-)
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index c1505de23957f..7bcfa3be95e29 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -388,6 +388,24 @@ mt7530_fdb_write(struct mt7530_priv *priv, u16 vid, mt7530_write(priv, MT7530_ATA1 + (i * 4), reg[i]); }
+/* Set up switch core clock for MT7530 */ +static void mt7530_pll_setup(struct mt7530_priv *priv) +{ + /* Disable PLL */ + core_write(priv, CORE_GSWPLL_GRP1, 0); + + /* Set core clock into 500Mhz */ + core_write(priv, CORE_GSWPLL_GRP2, + RG_GSWPLL_POSDIV_500M(1) | + RG_GSWPLL_FBKDIV_500M(25)); + + /* Enable PLL */ + core_write(priv, CORE_GSWPLL_GRP1, + RG_GSWPLL_EN_PRE | + RG_GSWPLL_POSDIV_200M(2) | + RG_GSWPLL_FBKDIV_200M(32)); +} + /* Setup TX circuit including relevant PAD and driving */ static int mt7530_pad_clk_setup(struct dsa_switch *ds, phy_interface_t interface) @@ -448,21 +466,6 @@ mt7530_pad_clk_setup(struct dsa_switch *ds, phy_interface_t interface) core_clear(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN | REG_TRGMIICK_EN);
- /* Setup core clock for MT7530 */ - /* Disable PLL */ - core_write(priv, CORE_GSWPLL_GRP1, 0); - - /* Set core clock into 500Mhz */ - core_write(priv, CORE_GSWPLL_GRP2, - RG_GSWPLL_POSDIV_500M(1) | - RG_GSWPLL_FBKDIV_500M(25)); - - /* Enable PLL */ - core_write(priv, CORE_GSWPLL_GRP1, - RG_GSWPLL_EN_PRE | - RG_GSWPLL_POSDIV_200M(2) | - RG_GSWPLL_FBKDIV_200M(32)); - /* Setup the MT7530 TRGMII Tx Clock */ core_write(priv, CORE_PLL_GROUP5, RG_LCDDS_PCW_NCPO1(ncpo1)); core_write(priv, CORE_PLL_GROUP6, RG_LCDDS_PCW_NCPO0(0)); @@ -2163,6 +2166,8 @@ mt7530_setup(struct dsa_switch *ds) SYS_CTRL_PHY_RST | SYS_CTRL_SW_RST | SYS_CTRL_REG_RST);
+ mt7530_pll_setup(priv); + /* Enable Port 6 only; P5 as GMAC5 which currently is not supported */ val = mt7530_read(priv, MT7530_MHWTRAP); val &= ~MHWTRAP_P6_DIS & ~MHWTRAP_PHY_ACCESS;
From: Kuniyuki Iwashima kuniyu@amazon.co.jp
[ Upstream commit 4edf21aa94ee33c75f819f2b6eb6dd52ef8a1628 ]
Let's remove unnecessary brackets around CONFIG_AF_UNIX_OOB.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.co.jp Link: https://lore.kernel.org/r/20220317032308.65372-1-kuniyu@amazon.co.jp Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 2aab4b969002 ("af_unix: fix struct pid leaks in OOB support") Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 0a59a00cb5815..32ddf8fe32c69 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1969,7 +1969,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, */ #define UNIX_SKB_FRAGS_SZ (PAGE_SIZE << get_order(32768))
-#if (IS_ENABLED(CONFIG_AF_UNIX_OOB)) +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other) { struct unix_sock *ousk = unix_sk(other); @@ -2035,7 +2035,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
err = -EOPNOTSUPP; if (msg->msg_flags & MSG_OOB) { -#if (IS_ENABLED(CONFIG_AF_UNIX_OOB)) +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) if (len) len--; else @@ -2106,7 +2106,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg, sent += size; }
-#if (IS_ENABLED(CONFIG_AF_UNIX_OOB)) +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) if (msg->msg_flags & MSG_OOB) { err = queue_oob(sock, msg, other); if (err)
From: Eric Dumazet edumazet@google.com
[ Upstream commit 2aab4b96900272885bc157f8b236abf1cdc02e08 ]
syzbot reported struct pid leak [1].
Issue is that queue_oob() calls maybe_add_creds() which potentially holds a reference on a pid.
But skb->destructor is not set (either directly or by calling unix_scm_to_skb())
This means that subsequent kfree_skb() or consume_skb() would leak this reference.
In this fix, I chose to fully support scm even for the OOB message.
[1] BUG: memory leak unreferenced object 0xffff8881053e7f80 (size 128): comm "syz-executor242", pid 5066, jiffies 4294946079 (age 13.220s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff812ae26a>] alloc_pid+0x6a/0x560 kernel/pid.c:180 [<ffffffff812718df>] copy_process+0x169f/0x26c0 kernel/fork.c:2285 [<ffffffff81272b37>] kernel_clone+0xf7/0x610 kernel/fork.c:2684 [<ffffffff812730cc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:2825 [<ffffffff849ad699>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff849ad699>] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 [<ffffffff84a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 314001f0bf92 ("af_unix: Add OOB support") Reported-by: syzbot+7699d9e5635c10253a27@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Rao Shoaib rao.shoaib@oracle.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://lore.kernel.org/r/20230307164530.771896-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 32ddf8fe32c69..a96026dbdf94e 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1970,7 +1970,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, #define UNIX_SKB_FRAGS_SZ (PAGE_SIZE << get_order(32768))
#if IS_ENABLED(CONFIG_AF_UNIX_OOB) -static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other) +static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other, + struct scm_cookie *scm, bool fds_sent) { struct unix_sock *ousk = unix_sk(other); struct sk_buff *skb; @@ -1981,6 +1982,11 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other if (!skb) return err;
+ err = unix_scm_to_skb(scm, skb, !fds_sent); + if (err < 0) { + kfree_skb(skb); + return err; + } skb_put(skb, 1); err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, 1);
@@ -2108,7 +2114,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
#if IS_ENABLED(CONFIG_AF_UNIX_OOB) if (msg->msg_flags & MSG_OOB) { - err = queue_oob(sock, msg, other); + err = queue_oob(sock, msg, other, &scm, fds_sent); if (err) goto out_err; sent++;
From: Alexandre Ghiti alexghiti@rivosinc.com
[ Upstream commit 76950340cf03b149412fe0d5f0810e52ac1df8cb ]
When CONFIG_FRAME_POINTER is unset, the stack unwinding function walk_stackframe randomly reads the stack and then, when KASAN is enabled, it can lead to the following backtrace:
[ 0.000000] ================================================================== [ 0.000000] BUG: KASAN: stack-out-of-bounds in walk_stackframe+0xa6/0x11a [ 0.000000] Read of size 8 at addr ffffffff81807c40 by task swapper/0 [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-12919-g24203e6db61f #43 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] Call Trace: [ 0.000000] [<ffffffff80007ba8>] walk_stackframe+0x0/0x11a [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80c49c80>] dump_stack_lvl+0x22/0x36 [ 0.000000] [<ffffffff80c3783e>] print_report+0x198/0x4a8 [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8015f68a>] kasan_report+0x9a/0xc8 [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8006e99c>] desc_make_final+0x80/0x84 [ 0.000000] [<ffffffff8009a04e>] stack_trace_save+0x88/0xa6 [ 0.000000] [<ffffffff80099fc2>] filter_irq_stacks+0x72/0x76 [ 0.000000] [<ffffffff8006b95e>] devkmsg_read+0x32a/0x32e [ 0.000000] [<ffffffff8015ec16>] kasan_save_stack+0x28/0x52 [ 0.000000] [<ffffffff8006e998>] desc_make_final+0x7c/0x84 [ 0.000000] [<ffffffff8009a04a>] stack_trace_save+0x84/0xa6 [ 0.000000] [<ffffffff8015ec52>] kasan_set_track+0x12/0x20 [ 0.000000] [<ffffffff8015f22e>] __kasan_slab_alloc+0x58/0x5e [ 0.000000] [<ffffffff8015e7ea>] __kmem_cache_create+0x21e/0x39a [ 0.000000] [<ffffffff80e133ac>] create_boot_cache+0x70/0x9c [ 0.000000] [<ffffffff80e17ab2>] kmem_cache_init+0x6c/0x11e [ 0.000000] [<ffffffff80e00fd6>] mm_init+0xd8/0xfe [ 0.000000] [<ffffffff80e011d8>] start_kernel+0x190/0x3ca [ 0.000000] [ 0.000000] The buggy address belongs to stack of task swapper/0 [ 0.000000] and is located at offset 0 in frame: [ 0.000000] stack_trace_save+0x0/0xa6 [ 0.000000] [ 0.000000] This frame has 1 object: [ 0.000000] [32, 56) 'c' [ 0.000000] [ 0.000000] The buggy address belongs to the physical page: [ 0.000000] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x81a07 [ 0.000000] flags: 0x1000(reserved|zone=0) [ 0.000000] raw: 0000000000001000 ff600003f1e3d150 ff600003f1e3d150 0000000000000000 [ 0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff [ 0.000000] page dumped because: kasan: bad access detected [ 0.000000] [ 0.000000] Memory state around the buggy address: [ 0.000000] ffffffff81807b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] >ffffffff81807c00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f3 [ 0.000000] ^ [ 0.000000] ffffffff81807c80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ==================================================================
Fix that by using READ_ONCE_NOCHECK when reading the stack in imprecise mode.
Fixes: 5d8544e2d007 ("RISC-V: Generic library routines and assembly") Reported-by: Chathura Rajapaksha chathura.abeyrathne.lk@gmail.com Link: https://lore.kernel.org/all/CAD7mqryDQCYyJ1gAmtMm8SASMWAQ4i103ptTb0f6Oda=tPY... Suggested-by: Dmitry Vyukov dvyukov@google.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20230308091639.602024-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/stacktrace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index ee8ef91c8aaf4..894ae66421a76 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -94,7 +94,7 @@ void notrace walk_stackframe(struct task_struct *task, while (!kstack_end(ksp)) { if (__kernel_text_address(pc) && unlikely(!fn(arg, pc))) break; - pc = (*ksp++) - 0x4; + pc = READ_ONCE_NOCHECK(*ksp++) - 0x4; } }
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit b860b9346e2d5667fbae2cefc571bdb6ce665b53 ]
ftrace_shared_hotpatch_trampoline() never returns NULL, therefore quite a bit of code can be removed.
Acked-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Stable-dep-of: 2a8db5ec4a28 ("RISC-V: Don't check text_mutex during stop_machine") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/ftrace.c | 86 +++------------------------------------ 1 file changed, 6 insertions(+), 80 deletions(-)
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c index 1d94ffdf347bb..5d0c45c13b5fa 100644 --- a/arch/s390/kernel/ftrace.c +++ b/arch/s390/kernel/ftrace.c @@ -80,17 +80,6 @@ asm(
#ifdef CONFIG_MODULES static char *ftrace_plt; - -asm( - " .data\n" - "ftrace_plt_template:\n" - " basr %r1,%r0\n" - " lg %r1,0f-.(%r1)\n" - " br %r1\n" - "0: .quad ftrace_caller\n" - "ftrace_plt_template_end:\n" - " .previous\n" -); #endif /* CONFIG_MODULES */
static const char *ftrace_shared_hotpatch_trampoline(const char **end) @@ -116,7 +105,7 @@ static const char *ftrace_shared_hotpatch_trampoline(const char **end)
bool ftrace_need_init_nop(void) { - return ftrace_shared_hotpatch_trampoline(NULL); + return true; }
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) @@ -175,28 +164,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, return 0; }
-static void ftrace_generate_nop_insn(struct ftrace_insn *insn) -{ - /* brcl 0,0 */ - insn->opc = 0xc004; - insn->disp = 0; -} - -static void ftrace_generate_call_insn(struct ftrace_insn *insn, - unsigned long ip) -{ - unsigned long target; - - /* brasl r0,ftrace_caller */ - target = FTRACE_ADDR; -#ifdef CONFIG_MODULES - if (is_module_addr((void *)ip)) - target = (unsigned long)ftrace_plt; -#endif /* CONFIG_MODULES */ - insn->opc = 0xc005; - insn->disp = (target - ip) / 2; -} - static void brcl_disable(void *brcl) { u8 op = 0x04; /* set mask field to zero */ @@ -207,23 +174,7 @@ static void brcl_disable(void *brcl) int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) { - struct ftrace_insn orig, new, old; - - if (ftrace_shared_hotpatch_trampoline(NULL)) { - brcl_disable((void *)rec->ip); - return 0; - } - - if (copy_from_kernel_nofault(&old, (void *) rec->ip, sizeof(old))) - return -EFAULT; - /* Replace ftrace call with a nop. */ - ftrace_generate_call_insn(&orig, rec->ip); - ftrace_generate_nop_insn(&new); - - /* Verify that the to be replaced code matches what we expect. */ - if (memcmp(&orig, &old, sizeof(old))) - return -EINVAL; - s390_kernel_write((void *) rec->ip, &new, sizeof(new)); + brcl_disable((void *)rec->ip); return 0; }
@@ -236,23 +187,7 @@ static void brcl_enable(void *brcl)
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) { - struct ftrace_insn orig, new, old; - - if (ftrace_shared_hotpatch_trampoline(NULL)) { - brcl_enable((void *)rec->ip); - return 0; - } - - if (copy_from_kernel_nofault(&old, (void *) rec->ip, sizeof(old))) - return -EFAULT; - /* Replace nop with an ftrace call. */ - ftrace_generate_nop_insn(&orig); - ftrace_generate_call_insn(&new, rec->ip); - - /* Verify that the to be replaced code matches what we expect. */ - if (memcmp(&orig, &old, sizeof(old))) - return -EINVAL; - s390_kernel_write((void *) rec->ip, &new, sizeof(new)); + brcl_enable((void *)rec->ip); return 0; }
@@ -269,10 +204,7 @@ int __init ftrace_dyn_arch_init(void)
void arch_ftrace_update_code(int command) { - if (ftrace_shared_hotpatch_trampoline(NULL)) - ftrace_modify_all_code(command); - else - ftrace_run_stop_machine(command); + ftrace_modify_all_code(command); }
static void __ftrace_sync(void *dummy) @@ -281,10 +213,8 @@ static void __ftrace_sync(void *dummy)
int ftrace_arch_code_modify_post_process(void) { - if (ftrace_shared_hotpatch_trampoline(NULL)) { - /* Send SIGP to the other CPUs, so they see the new code. */ - smp_call_function(__ftrace_sync, NULL, 1); - } + /* Send SIGP to the other CPUs, so they see the new code. */ + smp_call_function(__ftrace_sync, NULL, 1); return 0; }
@@ -299,10 +229,6 @@ static int __init ftrace_plt_init(void) panic("cannot allocate ftrace plt\n");
start = ftrace_shared_hotpatch_trampoline(&end); - if (!start) { - start = ftrace_plt_template; - end = ftrace_plt_template_end; - } memcpy(ftrace_plt, start, end - start); set_memory_ro((unsigned long)ftrace_plt, 1); return 0;
From: Conor Dooley conor.dooley@microchip.com
[ Upstream commit 2a8db5ec4a28a0fce822d10224db9471a44b6925 ]
We're currently using stop_machine() to update ftrace & kprobes, which means that the thread that takes text_mutex during may not be the same as the thread that eventually patches the code. This isn't actually a race because the lock is still held (preventing any other concurrent accesses) and there is only one thread running during stop_machine(), but it does trigger a lockdep failure.
This patch just elides the lockdep check during stop_machine.
Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support") Suggested-by: Steven Rostedt rostedt@goodmis.org Reported-by: Changbin Du changbin.du@gmail.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.co... Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/ftrace.h | 2 +- arch/riscv/include/asm/patch.h | 2 ++ arch/riscv/kernel/ftrace.c | 14 ++++++++++++-- arch/riscv/kernel/patch.c | 28 +++++++++++++++++++++++++--- 4 files changed, 40 insertions(+), 6 deletions(-)
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index 9e73922e1e2e5..d47d87c2d7e3d 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -109,6 +109,6 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); #define ftrace_init_nop ftrace_init_nop #endif
-#endif +#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* _ASM_RISCV_FTRACE_H */ diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h index 9a7d7346001ee..98d9de07cba17 100644 --- a/arch/riscv/include/asm/patch.h +++ b/arch/riscv/include/asm/patch.h @@ -9,4 +9,6 @@ int patch_text_nosync(void *addr, const void *insns, size_t len); int patch_text(void *addr, u32 insn);
+extern int riscv_patch_in_stop_machine; + #endif /* _ASM_RISCV_PATCH_H */ diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 47b43d8ee9a6c..1bf92cfa6764e 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -15,11 +15,21 @@ int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex) { mutex_lock(&text_mutex); + + /* + * The code sequences we use for ftrace can't be patched while the + * kernel is running, so we need to use stop_machine() to modify them + * for now. This doesn't play nice with text_mutex, we use this flag + * to elide the check. + */ + riscv_patch_in_stop_machine = true; + return 0; }
int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex) { + riscv_patch_in_stop_machine = false; mutex_unlock(&text_mutex); return 0; } @@ -109,9 +119,9 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) { int out;
- ftrace_arch_code_modify_prepare(); + mutex_lock(&text_mutex); out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); - ftrace_arch_code_modify_post_process(); + mutex_unlock(&text_mutex);
return out; } diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c index 765004b605132..e099961453cca 100644 --- a/arch/riscv/kernel/patch.c +++ b/arch/riscv/kernel/patch.c @@ -11,6 +11,7 @@ #include <asm/kprobes.h> #include <asm/cacheflush.h> #include <asm/fixmap.h> +#include <asm/ftrace.h> #include <asm/patch.h>
struct patch_insn { @@ -19,6 +20,8 @@ struct patch_insn { atomic_t cpu_count; };
+int riscv_patch_in_stop_machine = false; + #ifdef CONFIG_MMU /* * The fix_to_virt(, idx) needs a const value (not a dynamic variable of @@ -59,8 +62,15 @@ static int patch_insn_write(void *addr, const void *insn, size_t len) * Before reaching here, it was expected to lock the text_mutex * already, so we don't need to give another lock here and could * ensure that it was safe between each cores. + * + * We're currently using stop_machine() for ftrace & kprobes, and while + * that ensures text_mutex is held before installing the mappings it + * does not ensure text_mutex is held by the calling thread. That's + * safe but triggers a lockdep failure, so just elide it for that + * specific case. */ - lockdep_assert_held(&text_mutex); + if (!riscv_patch_in_stop_machine) + lockdep_assert_held(&text_mutex);
if (across_pages) patch_map(addr + len, FIX_TEXT_POKE1); @@ -121,13 +131,25 @@ NOKPROBE_SYMBOL(patch_text_cb);
int patch_text(void *addr, u32 insn) { + int ret; struct patch_insn patch = { .addr = addr, .insn = insn, .cpu_count = ATOMIC_INIT(0), };
- return stop_machine_cpuslocked(patch_text_cb, - &patch, cpu_online_mask); + /* + * kprobes takes text_mutex, before calling patch_text(), but as we call + * calls stop_machine(), the lockdep assertion in patch_insn_write() + * gets confused by the context in which the lock is taken. + * Instead, ensure the lock is held before calling stop_machine(), and + * set riscv_patch_in_stop_machine to skip the check in + * patch_insn_write(). + */ + lockdep_assert_held(&text_mutex); + riscv_patch_in_stop_machine = true; + ret = stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask); + riscv_patch_in_stop_machine = false; + return ret; } NOKPROBE_SYMBOL(patch_text);
From: Jan Kara jack@suse.cz
[ Upstream commit 3c92792da8506a295afb6d032b4476e46f979725 ]
As lockdep properly warns, we should not be locking i_rwsem while having transactions started as the proper lock ordering used by all directory handling operations is i_rwsem -> transaction start. Fix the lock ordering by moving the locking of the directory earlier in ext4_rename().
Reported-by: syzbot+9d16c39efb5fade84574@syzkaller.appspotmail.com Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") Link: https://syzkaller.appspot.com/bug?extid=9d16c39efb5fade84574 Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230301141004.15087-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/namei.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index aa689adeeafdf..c79c61002a620 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -3827,10 +3827,20 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, return retval; }
+ /* + * We need to protect against old.inode directory getting converted + * from inline directory format into a normal one. + */ + if (S_ISDIR(old.inode->i_mode)) + inode_lock_nested(old.inode, I_MUTEX_NONDIR2); + old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, &old.inlined); - if (IS_ERR(old.bh)) - return PTR_ERR(old.bh); + if (IS_ERR(old.bh)) { + retval = PTR_ERR(old.bh); + goto unlock_moved_dir; + } + /* * Check for inode number is _not_ due to possible IO errors. * We might rmdir the source, keep it as pwd of some process @@ -3887,11 +3897,6 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, if (new.dir != old.dir && EXT4_DIR_LINK_MAX(new.dir)) goto end_rename; } - /* - * We need to protect against old.inode directory getting - * converted from inline directory format into a normal one. - */ - inode_lock_nested(old.inode, I_MUTEX_NONDIR2); retval = ext4_rename_dir_prepare(handle, &old); if (retval) { inode_unlock(old.inode); @@ -4021,12 +4026,15 @@ static int ext4_rename(struct user_namespace *mnt_userns, struct inode *old_dir, } else { ext4_journal_stop(handle); } - if (old.dir_bh) - inode_unlock(old.inode); release_bh: brelse(old.dir_bh); brelse(old.bh); brelse(new.bh); + +unlock_moved_dir: + if (S_ISDIR(old.inode->i_mode)) + inode_unlock(old.inode); + return retval; }
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit 601363cc08da25747feb87c55573dd54de91d66a ]
Parallel probing of devices that share interrupts (e.g. when a driver uses asynchronous probing) can currently result in two mappings for the same hardware interrupt to be created due to missing serialisation.
Make sure to hold the irq_domain_mutex when creating mappings so that looking for an existing mapping before creating a new one is done atomically.
Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers") Fixes: b62b2cf5759b ("irqdomain: Fix handling of type settings for existing mappings") Link: https://lore.kernel.org/r/YuJXMHoT4ijUxnRb@hovoldconsulting.com Cc: stable@vger.kernel.org # 4.8 Cc: Dmitry Torokhov dtor@chromium.org Cc: Jon Hunter jonathanh@nvidia.com Tested-by: Hsin-Yi Wang hsinyi@chromium.org Tested-by: Mark-PK Tsai mark-pk.tsai@mediatek.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230213104302.17307-7-johan+linaro@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/irq/irqdomain.c | 64 ++++++++++++++++++++++++++++++------------ 1 file changed, 46 insertions(+), 18 deletions(-)
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index f95196063884a..e0b67784ac1e0 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -25,6 +25,9 @@ static DEFINE_MUTEX(irq_domain_mutex);
static struct irq_domain *irq_default_domain;
+static int irq_domain_alloc_irqs_locked(struct irq_domain *domain, int irq_base, + unsigned int nr_irqs, int node, void *arg, + bool realloc, const struct irq_affinity_desc *affinity); static void irq_domain_check_hierarchy(struct irq_domain *domain);
struct irqchip_fwid { @@ -703,9 +706,9 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain) EXPORT_SYMBOL_GPL(irq_create_direct_mapping); #endif
-static unsigned int __irq_create_mapping_affinity(struct irq_domain *domain, - irq_hw_number_t hwirq, - const struct irq_affinity_desc *affinity) +static unsigned int irq_create_mapping_affinity_locked(struct irq_domain *domain, + irq_hw_number_t hwirq, + const struct irq_affinity_desc *affinity) { struct device_node *of_node = irq_domain_get_of_node(domain); int virq; @@ -720,7 +723,7 @@ static unsigned int __irq_create_mapping_affinity(struct irq_domain *domain, return 0; }
- if (irq_domain_associate(domain, virq, hwirq)) { + if (irq_domain_associate_locked(domain, virq, hwirq)) { irq_free_desc(virq); return 0; } @@ -756,14 +759,20 @@ unsigned int irq_create_mapping_affinity(struct irq_domain *domain, return 0; }
+ mutex_lock(&irq_domain_mutex); + /* Check if mapping already exists */ virq = irq_find_mapping(domain, hwirq); if (virq) { pr_debug("existing mapping on virq %d\n", virq); - return virq; + goto out; }
- return __irq_create_mapping_affinity(domain, hwirq, affinity); + virq = irq_create_mapping_affinity_locked(domain, hwirq, affinity); +out: + mutex_unlock(&irq_domain_mutex); + + return virq; } EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
@@ -830,6 +839,8 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec) if (WARN_ON(type & ~IRQ_TYPE_SENSE_MASK)) type &= IRQ_TYPE_SENSE_MASK;
+ mutex_lock(&irq_domain_mutex); + /* * If we've already configured this interrupt, * don't do it again, or hell will break loose. @@ -842,7 +853,7 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec) * interrupt number. */ if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq)) - return virq; + goto out;
/* * If the trigger type has not been set yet, then set @@ -850,35 +861,45 @@ unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec) */ if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) { irq_data = irq_get_irq_data(virq); - if (!irq_data) - return 0; + if (!irq_data) { + virq = 0; + goto out; + }
irqd_set_trigger_type(irq_data, type); - return virq; + goto out; }
pr_warn("type mismatch, failed to map hwirq-%lu for %s!\n", hwirq, of_node_full_name(to_of_node(fwspec->fwnode))); - return 0; + virq = 0; + goto out; }
if (irq_domain_is_hierarchy(domain)) { - virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, fwspec); - if (virq <= 0) - return 0; + virq = irq_domain_alloc_irqs_locked(domain, -1, 1, NUMA_NO_NODE, + fwspec, false, NULL); + if (virq <= 0) { + virq = 0; + goto out; + } } else { /* Create mapping */ - virq = __irq_create_mapping_affinity(domain, hwirq, NULL); + virq = irq_create_mapping_affinity_locked(domain, hwirq, NULL); if (!virq) - return virq; + goto out; }
irq_data = irq_get_irq_data(virq); - if (WARN_ON(!irq_data)) - return 0; + if (WARN_ON(!irq_data)) { + virq = 0; + goto out; + }
/* Store trigger type */ irqd_set_trigger_type(irq_data, type); +out: + mutex_unlock(&irq_domain_mutex);
return virq; } @@ -1910,6 +1931,13 @@ void irq_domain_set_info(struct irq_domain *domain, unsigned int virq, irq_set_handler_data(virq, handler_data); }
+static int irq_domain_alloc_irqs_locked(struct irq_domain *domain, int irq_base, + unsigned int nr_irqs, int node, void *arg, + bool realloc, const struct irq_affinity_desc *affinity) +{ + return -EINVAL; +} + static void irq_domain_check_hierarchy(struct irq_domain *domain) { }
From: Christoph Hellwig hch@lst.de
[ Upstream commit 2a852a693f8839bb877fc731ffbc9ece3a9c16d7 ]
The bdev parameter to ->ioctl contains the block device that the ioctl is called on, which can be the partition. But the openers check in nbd_bdev_reset really needs to check use the whole device, so switch to using that.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20220330052917.2566582-2-hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Stable-dep-of: e5cfefa97bcc ("block: fix scan partition for exclusively open device again") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index c1ef1df42eb66..ade8b839e4458 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1167,11 +1167,11 @@ static int nbd_reconnect_socket(struct nbd_device *nbd, unsigned long arg) return -ENOSPC; }
-static void nbd_bdev_reset(struct block_device *bdev) +static void nbd_bdev_reset(struct nbd_device *nbd) { - if (bdev->bd_openers > 1) + if (nbd->disk->part0->bd_openers > 1) return; - set_capacity(bdev->bd_disk, 0); + set_capacity(nbd->disk, 0); }
static void nbd_parse_flags(struct nbd_device *nbd) @@ -1337,7 +1337,7 @@ static int nbd_start_device(struct nbd_device *nbd) return nbd_set_size(nbd, config->bytesize, nbd_blksize(config)); }
-static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev) +static int nbd_start_device_ioctl(struct nbd_device *nbd) { struct nbd_config *config = nbd->config; int ret; @@ -1358,7 +1358,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
flush_workqueue(nbd->recv_workq); mutex_lock(&nbd->config_lock); - nbd_bdev_reset(bdev); + nbd_bdev_reset(nbd); /* user requested, ignore socket errors */ if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags)) ret = 0; @@ -1372,7 +1372,7 @@ static void nbd_clear_sock_ioctl(struct nbd_device *nbd, { nbd_clear_sock(nbd); __invalidate_device(bdev, true); - nbd_bdev_reset(bdev); + nbd_bdev_reset(nbd); if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF, &nbd->config->runtime_flags)) nbd_config_put(nbd); @@ -1418,7 +1418,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd, config->flags = arg; return 0; case NBD_DO_IT: - return nbd_start_device_ioctl(nbd, bdev); + return nbd_start_device_ioctl(nbd); case NBD_CLEAR_QUE: /* * This is for compatibility only. The queue is always cleared
From: Suravee Suthikulpanit suravee.suthikulpanit@amd.com
[ Upstream commit bbe3a106580c21bc883fb0c9fa3da01534392fe8 ]
By default, PCI segment is zero and can be omitted. To support system with non-zero PCI segment ID, modify the parsing functions to allow PCI segment ID.
Co-developed-by: Vasant Hegde vasant.hegde@amd.com Signed-off-by: Vasant Hegde vasant.hegde@amd.com Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/20220706113825.25582-33-vasant.hegde@amd.com Signed-off-by: Joerg Roedel jroedel@suse.de Stable-dep-of: b6b26d86c61c ("iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter") Signed-off-by: Sasha Levin sashal@kernel.org --- .../admin-guide/kernel-parameters.txt | 34 ++++++++++---- drivers/iommu/amd/init.c | 44 ++++++++++++------- 2 files changed, 52 insertions(+), 26 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index bcb102c91b190..d49437b9a6980 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2200,23 +2200,39 @@
ivrs_ioapic [HW,X86-64] Provide an override to the IOAPIC-ID<->DEVICE-ID - mapping provided in the IVRS ACPI table. For - example, to map IOAPIC-ID decimal 10 to - PCI device 00:14.0 write the parameter as: + mapping provided in the IVRS ACPI table. + By default, PCI segment is 0, and can be omitted. + For example: + * To map IOAPIC-ID decimal 10 to PCI device 00:14.0 + write the parameter as: ivrs_ioapic[10]=00:14.0 + * To map IOAPIC-ID decimal 10 to PCI segment 0x1 and + PCI device 00:14.0 write the parameter as: + ivrs_ioapic[10]=0001:00:14.0
ivrs_hpet [HW,X86-64] Provide an override to the HPET-ID<->DEVICE-ID - mapping provided in the IVRS ACPI table. For - example, to map HPET-ID decimal 0 to - PCI device 00:14.0 write the parameter as: + mapping provided in the IVRS ACPI table. + By default, PCI segment is 0, and can be omitted. + For example: + * To map HPET-ID decimal 0 to PCI device 00:14.0 + write the parameter as: ivrs_hpet[0]=00:14.0 + * To map HPET-ID decimal 10 to PCI segment 0x1 and + PCI device 00:14.0 write the parameter as: + ivrs_ioapic[10]=0001:00:14.0
ivrs_acpihid [HW,X86-64] Provide an override to the ACPI-HID:UID<->DEVICE-ID - mapping provided in the IVRS ACPI table. For - example, to map UART-HID:UID AMD0020:0 to - PCI device 00:14.5 write the parameter as: + mapping provided in the IVRS ACPI table. + + For example, to map UART-HID:UID AMD0020:0 to + PCI segment 0x1 and PCI device ID 00:14.5, + write the parameter as: + ivrs_acpihid[0001:00:14.5]=AMD0020:0 + + By default, PCI segment is 0, and can be omitted. + For example, PCI device 00:14.5 write the parameter as: ivrs_acpihid[00:14.5]=AMD0020:0
js= [HW,JOY] Analog joystick diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index ce91a6d8532f3..6961fe80e8eea 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -85,6 +85,10 @@ #define ACPI_DEVFLAG_ATSDIS 0x10000000
#define LOOP_TIMEOUT 2000000 + +#define IVRS_GET_SBDF_ID(seg, bus, dev, fd) (((seg & 0xffff) << 16) | ((bus & 0xff) << 8) \ + | ((dev & 0x1f) << 3) | (fn & 0x7)) + /* * ACPI table definitions * @@ -3146,15 +3150,17 @@ static int __init parse_amd_iommu_options(char *str)
static int __init parse_ivrs_ioapic(char *str) { - unsigned int bus, dev, fn; + u32 seg = 0, bus, dev, fn; int ret, id, i; - u16 devid; + u32 devid;
ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn); - if (ret != 4) { - pr_err("Invalid command line: ivrs_ioapic%s\n", str); - return 1; + ret = sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn); + if (ret != 5) { + pr_err("Invalid command line: ivrs_ioapic%s\n", str); + return 1; + } }
if (early_ioapic_map_size == EARLY_MAP_SIZE) { @@ -3163,7 +3169,7 @@ static int __init parse_ivrs_ioapic(char *str) return 1; }
- devid = ((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7); + devid = IVRS_GET_SBDF_ID(seg, bus, dev, fn);
cmdline_maps = true; i = early_ioapic_map_size++; @@ -3176,15 +3182,17 @@ static int __init parse_ivrs_ioapic(char *str)
static int __init parse_ivrs_hpet(char *str) { - unsigned int bus, dev, fn; + u32 seg = 0, bus, dev, fn; int ret, id, i; - u16 devid; + u32 devid;
ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn); - if (ret != 4) { - pr_err("Invalid command line: ivrs_hpet%s\n", str); - return 1; + ret = sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn); + if (ret != 5) { + pr_err("Invalid command line: ivrs_hpet%s\n", str); + return 1; + } }
if (early_hpet_map_size == EARLY_MAP_SIZE) { @@ -3193,7 +3201,7 @@ static int __init parse_ivrs_hpet(char *str) return 1; }
- devid = ((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7); + devid = IVRS_GET_SBDF_ID(seg, bus, dev, fn);
cmdline_maps = true; i = early_hpet_map_size++; @@ -3206,15 +3214,18 @@ static int __init parse_ivrs_hpet(char *str)
static int __init parse_ivrs_acpihid(char *str) { - u32 bus, dev, fn; + u32 seg = 0, bus, dev, fn; char *hid, *uid, *p; char acpiid[ACPIHID_UID_LEN + ACPIHID_HID_LEN] = {0}; int ret, i;
ret = sscanf(str, "[%x:%x.%x]=%s", &bus, &dev, &fn, acpiid); if (ret != 4) { - pr_err("Invalid command line: ivrs_acpihid(%s)\n", str); - return 1; + ret = sscanf(str, "[%x:%x:%x.%x]=%s", &seg, &bus, &dev, &fn, acpiid); + if (ret != 5) { + pr_err("Invalid command line: ivrs_acpihid(%s)\n", str); + return 1; + } }
p = acpiid; @@ -3236,8 +3247,7 @@ static int __init parse_ivrs_acpihid(char *str) i = early_acpihid_map_size++; memcpy(early_acpihid_map[i].hid, hid, strlen(hid)); memcpy(early_acpihid_map[i].uid, uid, strlen(uid)); - early_acpihid_map[i].devid = - ((bus & 0xff) << 8) | ((dev & 0x1f) << 3) | (fn & 0x7); + early_acpihid_map[i].devid = IVRS_GET_SBDF_ID(seg, bus, dev, fn); early_acpihid_map[i].cmd_line = true;
return 1;
From: Kim Phillips kim.phillips@amd.com
[ Upstream commit 1198d2316dc4265a97d0e8445a22c7a6d17580a4 ]
Currently, these options cause the following libkmod error:
libkmod: ERROR ../libkmod/libkmod-config.c:489 kcmdline_parse_result: \ Ignoring bad option on kernel command line while parsing module \ name: 'ivrs_xxxx[XX:XX'
Fix by introducing a new parameter format for these options and throw a warning for the deprecated format.
Users are still allowed to omit the PCI Segment if zero.
Adding a Link: to the reason why we're modding the syntax parsing in the driver and not in libkmod.
Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-modules/20200310082308.14318-2-lucas.demarchi@... Reported-by: Kim Phillips kim.phillips@amd.com Co-developed-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Kim Phillips kim.phillips@amd.com Link: https://lore.kernel.org/r/20220919155638.391481-2-kim.phillips@amd.com Signed-off-by: Joerg Roedel jroedel@suse.de Stable-dep-of: b6b26d86c61c ("iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter") Signed-off-by: Sasha Levin sashal@kernel.org --- .../admin-guide/kernel-parameters.txt | 27 +++++-- drivers/iommu/amd/init.c | 79 +++++++++++++------ 2 files changed, 76 insertions(+), 30 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index d49437b9a6980..4dbbc95316691 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2202,7 +2202,13 @@ Provide an override to the IOAPIC-ID<->DEVICE-ID mapping provided in the IVRS ACPI table. By default, PCI segment is 0, and can be omitted. - For example: + + For example, to map IOAPIC-ID decimal 10 to + PCI segment 0x1 and PCI device 00:14.0, + write the parameter as: + ivrs_ioapic=10@0001:00:14.0 + + Deprecated formats: * To map IOAPIC-ID decimal 10 to PCI device 00:14.0 write the parameter as: ivrs_ioapic[10]=00:14.0 @@ -2214,7 +2220,13 @@ Provide an override to the HPET-ID<->DEVICE-ID mapping provided in the IVRS ACPI table. By default, PCI segment is 0, and can be omitted. - For example: + + For example, to map HPET-ID decimal 10 to + PCI segment 0x1 and PCI device 00:14.0, + write the parameter as: + ivrs_hpet=10@0001:00:14.0 + + Deprecated formats: * To map HPET-ID decimal 0 to PCI device 00:14.0 write the parameter as: ivrs_hpet[0]=00:14.0 @@ -2225,15 +2237,20 @@ ivrs_acpihid [HW,X86-64] Provide an override to the ACPI-HID:UID<->DEVICE-ID mapping provided in the IVRS ACPI table. + By default, PCI segment is 0, and can be omitted.
For example, to map UART-HID:UID AMD0020:0 to PCI segment 0x1 and PCI device ID 00:14.5, write the parameter as: - ivrs_acpihid[0001:00:14.5]=AMD0020:0 + ivrs_acpihid=AMD0020:0@0001:00:14.5
- By default, PCI segment is 0, and can be omitted. - For example, PCI device 00:14.5 write the parameter as: + Deprecated formats: + * To map UART-HID:UID AMD0020:0 to PCI segment is 0, + PCI device ID 00:14.5, write the parameter as: ivrs_acpihid[00:14.5]=AMD0020:0 + * To map UART-HID:UID AMD0020:0 to PCI segment 0x1 and + PCI device ID 00:14.5, write the parameter as: + ivrs_acpihid[0001:00:14.5]=AMD0020:0
js= [HW,JOY] Analog joystick See Documentation/input/joydev/joystick.rst. diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 6961fe80e8eea..6c11db3356f78 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3151,18 +3151,24 @@ static int __init parse_amd_iommu_options(char *str) static int __init parse_ivrs_ioapic(char *str) { u32 seg = 0, bus, dev, fn; - int ret, id, i; + int id, i; u32 devid;
- ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn); - if (ret != 4) { - ret = sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn); - if (ret != 5) { - pr_err("Invalid command line: ivrs_ioapic%s\n", str); - return 1; - } + if (sscanf(str, "=%d@%x:%x.%x", &id, &bus, &dev, &fn) == 4 || + sscanf(str, "=%d@%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn) == 5) + goto found; + + if (sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn) == 4 || + sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn) == 5) { + pr_warn("ivrs_ioapic%s option format deprecated; use ivrs_ioapic=%d@%04x:%02x:%02x.%d instead\n", + str, id, seg, bus, dev, fn); + goto found; }
+ pr_err("Invalid command line: ivrs_ioapic%s\n", str); + return 1; + +found: if (early_ioapic_map_size == EARLY_MAP_SIZE) { pr_err("Early IOAPIC map overflow - ignoring ivrs_ioapic%s\n", str); @@ -3183,18 +3189,24 @@ static int __init parse_ivrs_ioapic(char *str) static int __init parse_ivrs_hpet(char *str) { u32 seg = 0, bus, dev, fn; - int ret, id, i; + int id, i; u32 devid;
- ret = sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn); - if (ret != 4) { - ret = sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn); - if (ret != 5) { - pr_err("Invalid command line: ivrs_hpet%s\n", str); - return 1; - } + if (sscanf(str, "=%d@%x:%x.%x", &id, &bus, &dev, &fn) == 4 || + sscanf(str, "=%d@%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn) == 5) + goto found; + + if (sscanf(str, "[%d]=%x:%x.%x", &id, &bus, &dev, &fn) == 4 || + sscanf(str, "[%d]=%x:%x:%x.%x", &id, &seg, &bus, &dev, &fn) == 5) { + pr_warn("ivrs_hpet%s option format deprecated; use ivrs_hpet=%d@%04x:%02x:%02x.%d instead\n", + str, id, seg, bus, dev, fn); + goto found; }
+ pr_err("Invalid command line: ivrs_hpet%s\n", str); + return 1; + +found: if (early_hpet_map_size == EARLY_MAP_SIZE) { pr_err("Early HPET map overflow - ignoring ivrs_hpet%s\n", str); @@ -3215,19 +3227,36 @@ static int __init parse_ivrs_hpet(char *str) static int __init parse_ivrs_acpihid(char *str) { u32 seg = 0, bus, dev, fn; - char *hid, *uid, *p; + char *hid, *uid, *p, *addr; char acpiid[ACPIHID_UID_LEN + ACPIHID_HID_LEN] = {0}; - int ret, i; - - ret = sscanf(str, "[%x:%x.%x]=%s", &bus, &dev, &fn, acpiid); - if (ret != 4) { - ret = sscanf(str, "[%x:%x:%x.%x]=%s", &seg, &bus, &dev, &fn, acpiid); - if (ret != 5) { - pr_err("Invalid command line: ivrs_acpihid(%s)\n", str); - return 1; + int i; + + addr = strchr(str, '@'); + if (!addr) { + if (sscanf(str, "[%x:%x.%x]=%s", &bus, &dev, &fn, acpiid) == 4 || + sscanf(str, "[%x:%x:%x.%x]=%s", &seg, &bus, &dev, &fn, acpiid) == 5) { + pr_warn("ivrs_acpihid%s option format deprecated; use ivrs_acpihid=%s@%04x:%02x:%02x.%d instead\n", + str, acpiid, seg, bus, dev, fn); + goto found; } + goto not_found; }
+ /* We have the '@', make it the terminator to get just the acpiid */ + *addr++ = 0; + + if (sscanf(str, "=%s", acpiid) != 1) + goto not_found; + + if (sscanf(addr, "%x:%x.%x", &bus, &dev, &fn) == 3 || + sscanf(addr, "%x:%x:%x.%x", &seg, &bus, &dev, &fn) == 4) + goto found; + +not_found: + pr_err("Invalid command line: ivrs_acpihid%s\n", str); + return 1; + +found: p = acpiid; hid = strsep(&p, ":"); uid = p;
From: Gavrilov Ilia Ilia.Gavrilov@infotecs.ru
[ Upstream commit b6b26d86c61c441144c72f842f7469bb686e1211 ]
The 'acpiid' buffer in the parse_ivrs_acpihid function may overflow, because the string specifier in the format string sscanf() has no width limitation.
Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Ilia.Gavrilov Ilia.Gavrilov@infotecs.ru Reviewed-by: Kim Phillips kim.phillips@amd.com Link: https://lore.kernel.org/r/20230202082719.1513849-1-Ilia.Gavrilov@infotecs.ru Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/init.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 6c11db3356f78..50ea582be5910 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3224,15 +3224,26 @@ static int __init parse_ivrs_hpet(char *str) return 1; }
+#define ACPIID_LEN (ACPIHID_UID_LEN + ACPIHID_HID_LEN) + static int __init parse_ivrs_acpihid(char *str) { u32 seg = 0, bus, dev, fn; char *hid, *uid, *p, *addr; - char acpiid[ACPIHID_UID_LEN + ACPIHID_HID_LEN] = {0}; + char acpiid[ACPIID_LEN] = {0}; int i;
addr = strchr(str, '@'); if (!addr) { + addr = strchr(str, '='); + if (!addr) + goto not_found; + + ++addr; + + if (strlen(addr) > ACPIID_LEN) + goto not_found; + if (sscanf(str, "[%x:%x.%x]=%s", &bus, &dev, &fn, acpiid) == 4 || sscanf(str, "[%x:%x:%x.%x]=%s", &seg, &bus, &dev, &fn, acpiid) == 5) { pr_warn("ivrs_acpihid%s option format deprecated; use ivrs_acpihid=%s@%04x:%02x:%02x.%d instead\n", @@ -3245,6 +3256,9 @@ static int __init parse_ivrs_acpihid(char *str) /* We have the '@', make it the terminator to get just the acpiid */ *addr++ = 0;
+ if (strlen(str) > ACPIID_LEN + 1) + goto not_found; + if (sscanf(str, "=%s", acpiid) != 1) goto not_found;
From: Michael Straube straube.linux@gmail.com
[ Upstream commit cd1f1450092216b3e39516f8db58869b6fc20575 ]
Clean up comparsions to NULL reported by checkpatch.
x == NULL -> !x x != NULL -> x
Signed-off-by: Michael Straube straube.linux@gmail.com Link: https://lore.kernel.org/r/20210829154533.11054-1-straube.linux@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 05cbcc415c9b ("staging: rtl8723bs: Fix key-store index handling") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/rtl8723bs/core/rtw_ap.c | 20 ++-- drivers/staging/rtl8723bs/core/rtw_cmd.c | 96 +++++++++---------- .../staging/rtl8723bs/core/rtw_ioctl_set.c | 4 +- drivers/staging/rtl8723bs/core/rtw_mlme.c | 6 +- drivers/staging/rtl8723bs/core/rtw_mlme_ext.c | 56 +++++------ drivers/staging/rtl8723bs/core/rtw_security.c | 6 +- .../staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 24 ++--- .../staging/rtl8723bs/os_dep/ioctl_linux.c | 18 ++-- drivers/staging/rtl8723bs/os_dep/os_intfs.c | 4 +- 9 files changed, 117 insertions(+), 117 deletions(-)
diff --git a/drivers/staging/rtl8723bs/core/rtw_ap.c b/drivers/staging/rtl8723bs/core/rtw_ap.c index 6064dd6a76b42..674592e914e26 100644 --- a/drivers/staging/rtl8723bs/core/rtw_ap.c +++ b/drivers/staging/rtl8723bs/core/rtw_ap.c @@ -891,7 +891,7 @@ int rtw_check_beacon_data(struct adapter *padapter, u8 *pbuf, int len) &ie_len, (pbss_network->ie_length - _BEACON_IE_OFFSET_) ); - if (p != NULL) { + if (p) { memcpy(supportRate, p + 2, ie_len); supportRateNum = ie_len; } @@ -903,7 +903,7 @@ int rtw_check_beacon_data(struct adapter *padapter, u8 *pbuf, int len) &ie_len, pbss_network->ie_length - _BEACON_IE_OFFSET_ ); - if (p != NULL) { + if (p) { memcpy(supportRate + supportRateNum, p + 2, ie_len); supportRateNum += ie_len; } @@ -991,7 +991,7 @@ int rtw_check_beacon_data(struct adapter *padapter, u8 *pbuf, int len) break; }
- if ((p == NULL) || (ie_len == 0)) + if (!p || ie_len == 0) break; }
@@ -1021,7 +1021,7 @@ int rtw_check_beacon_data(struct adapter *padapter, u8 *pbuf, int len) break; }
- if ((p == NULL) || (ie_len == 0)) + if (!p || ie_len == 0) break; } } @@ -1145,7 +1145,7 @@ int rtw_check_beacon_data(struct adapter *padapter, u8 *pbuf, int len) psta = rtw_get_stainfo(&padapter->stapriv, pbss_network->mac_address); if (!psta) { psta = rtw_alloc_stainfo(&padapter->stapriv, pbss_network->mac_address); - if (psta == NULL) + if (!psta) return _FAIL; }
@@ -1275,7 +1275,7 @@ u8 rtw_ap_set_pairwise_key(struct adapter *padapter, struct sta_info *psta) }
psetstakey_para = rtw_zmalloc(sizeof(struct set_stakey_parm)); - if (psetstakey_para == NULL) { + if (!psetstakey_para) { kfree(ph2c); res = _FAIL; goto exit; @@ -1311,12 +1311,12 @@ static int rtw_ap_set_key( int res = _SUCCESS;
pcmd = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd == NULL) { + if (!pcmd) { res = _FAIL; goto exit; } psetkeyparm = rtw_zmalloc(sizeof(struct setkey_parm)); - if (psetkeyparm == NULL) { + if (!psetkeyparm) { kfree(pcmd); res = _FAIL; goto exit; @@ -1474,11 +1474,11 @@ static void update_bcn_wps_ie(struct adapter *padapter) &wps_ielen );
- if (pwps_ie == NULL || wps_ielen == 0) + if (!pwps_ie || wps_ielen == 0) return;
pwps_ie_src = pmlmepriv->wps_beacon_ie; - if (pwps_ie_src == NULL) + if (!pwps_ie_src) return;
wps_offset = (uint)(pwps_ie - ie); diff --git a/drivers/staging/rtl8723bs/core/rtw_cmd.c b/drivers/staging/rtl8723bs/core/rtw_cmd.c index 93e3a4c9e1159..5f4f603b3b366 100644 --- a/drivers/staging/rtl8723bs/core/rtw_cmd.c +++ b/drivers/staging/rtl8723bs/core/rtw_cmd.c @@ -251,7 +251,7 @@ int _rtw_enqueue_cmd(struct __queue *queue, struct cmd_obj *obj) { unsigned long irqL;
- if (obj == NULL) + if (!obj) goto exit;
/* spin_lock_bh(&queue->lock); */ @@ -319,7 +319,7 @@ int rtw_enqueue_cmd(struct cmd_priv *pcmdpriv, struct cmd_obj *cmd_obj) int res = _FAIL; struct adapter *padapter = pcmdpriv->padapter;
- if (cmd_obj == NULL) + if (!cmd_obj) goto exit;
cmd_obj->padapter = padapter; @@ -484,7 +484,7 @@ int rtw_cmd_thread(void *context) /* call callback function for post-processed */ if (pcmd->cmdcode < ARRAY_SIZE(rtw_cmd_callback)) { pcmd_callback = rtw_cmd_callback[pcmd->cmdcode].callback; - if (pcmd_callback == NULL) { + if (!pcmd_callback) { rtw_free_cmd_obj(pcmd); } else { /* todo: !!! fill rsp_buf to pcmd->rsp if (pcmd->rsp!= NULL) */ @@ -503,7 +503,7 @@ int rtw_cmd_thread(void *context) /* free all cmd_obj resources */ do { pcmd = rtw_dequeue_cmd(pcmdpriv); - if (pcmd == NULL) { + if (!pcmd) { rtw_unregister_cmd_alive(padapter); break; } @@ -542,11 +542,11 @@ u8 rtw_sitesurvey_cmd(struct adapter *padapter, struct ndis_802_11_ssid *ssid, rtw_lps_ctrl_wk_cmd(padapter, LPS_CTRL_SCAN, 1);
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) + if (!ph2c) return _FAIL;
psurveyPara = rtw_zmalloc(sizeof(struct sitesurvey_parm)); - if (psurveyPara == NULL) { + if (!psurveyPara) { kfree(ph2c); return _FAIL; } @@ -604,13 +604,13 @@ u8 rtw_setdatarate_cmd(struct adapter *padapter, u8 *rateset) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pbsetdataratepara = rtw_zmalloc(sizeof(struct setdatarate_parm)); - if (pbsetdataratepara == NULL) { + if (!pbsetdataratepara) { kfree(ph2c); res = _FAIL; goto exit; @@ -640,7 +640,7 @@ u8 rtw_createbss_cmd(struct adapter *padapter) u8 res = _SUCCESS;
pcmd = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd == NULL) { + if (!pcmd) { res = _FAIL; goto exit; } @@ -673,7 +673,7 @@ int rtw_startbss_cmd(struct adapter *padapter, int flags) } else { /* need enqueue, prepare cmd_obj and enqueue */ pcmd = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd == NULL) { + if (!pcmd) { res = _FAIL; goto exit; } @@ -725,7 +725,7 @@ u8 rtw_joinbss_cmd(struct adapter *padapter, struct wlan_network *pnetwork) u8 *ptmp = NULL;
pcmd = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd == NULL) { + if (!pcmd) { res = _FAIL; goto exit; } @@ -837,7 +837,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 deauth_timeout_ms, bool enqueu
/* prepare cmd parameter */ param = rtw_zmalloc(sizeof(*param)); - if (param == NULL) { + if (!param) { res = _FAIL; goto exit; } @@ -846,7 +846,7 @@ u8 rtw_disassoc_cmd(struct adapter *padapter, u32 deauth_timeout_ms, bool enqueu if (enqueue) { /* need enqueue, prepare cmd_obj and enqueue */ cmdobj = rtw_zmalloc(sizeof(*cmdobj)); - if (cmdobj == NULL) { + if (!cmdobj) { res = _FAIL; kfree(param); goto exit; @@ -874,7 +874,7 @@ u8 rtw_setopmode_cmd(struct adapter *padapter, enum ndis_802_11_network_infrast
psetop = rtw_zmalloc(sizeof(struct setopmode_parm));
- if (psetop == NULL) { + if (!psetop) { res = _FAIL; goto exit; } @@ -882,7 +882,7 @@ u8 rtw_setopmode_cmd(struct adapter *padapter, enum ndis_802_11_network_infrast
if (enqueue) { ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { kfree(psetop); res = _FAIL; goto exit; @@ -910,7 +910,7 @@ u8 rtw_setstakey_cmd(struct adapter *padapter, struct sta_info *sta, u8 unicast_ u8 res = _SUCCESS;
psetstakey_para = rtw_zmalloc(sizeof(struct set_stakey_parm)); - if (psetstakey_para == NULL) { + if (!psetstakey_para) { res = _FAIL; goto exit; } @@ -932,14 +932,14 @@ u8 rtw_setstakey_cmd(struct adapter *padapter, struct sta_info *sta, u8 unicast_
if (enqueue) { ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { kfree(psetstakey_para); res = _FAIL; goto exit; }
psetstakey_rsp = rtw_zmalloc(sizeof(struct set_stakey_rsp)); - if (psetstakey_rsp == NULL) { + if (!psetstakey_rsp) { kfree(ph2c); kfree(psetstakey_para); res = _FAIL; @@ -977,20 +977,20 @@ u8 rtw_clearstakey_cmd(struct adapter *padapter, struct sta_info *sta, u8 enqueu } } else { ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
psetstakey_para = rtw_zmalloc(sizeof(struct set_stakey_parm)); - if (psetstakey_para == NULL) { + if (!psetstakey_para) { kfree(ph2c); res = _FAIL; goto exit; }
psetstakey_rsp = rtw_zmalloc(sizeof(struct set_stakey_rsp)); - if (psetstakey_rsp == NULL) { + if (!psetstakey_rsp) { kfree(ph2c); kfree(psetstakey_para); res = _FAIL; @@ -1022,13 +1022,13 @@ u8 rtw_addbareq_cmd(struct adapter *padapter, u8 tid, u8 *addr) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
paddbareq_parm = rtw_zmalloc(sizeof(struct addBaReq_parm)); - if (paddbareq_parm == NULL) { + if (!paddbareq_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1054,13 +1054,13 @@ u8 rtw_reset_securitypriv_cmd(struct adapter *padapter) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1089,13 +1089,13 @@ u8 rtw_free_assoc_resources_cmd(struct adapter *padapter) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1125,13 +1125,13 @@ u8 rtw_dynamic_chk_wk_cmd(struct adapter *padapter)
/* only primary padapter does this cmd */ ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1173,7 +1173,7 @@ u8 rtw_set_chplan_cmd(struct adapter *padapter, u8 chplan, u8 enqueue, u8 swconf
/* prepare cmd parameter */ setChannelPlan_param = rtw_zmalloc(sizeof(struct SetChannelPlan_param)); - if (setChannelPlan_param == NULL) { + if (!setChannelPlan_param) { res = _FAIL; goto exit; } @@ -1182,7 +1182,7 @@ u8 rtw_set_chplan_cmd(struct adapter *padapter, u8 chplan, u8 enqueue, u8 swconf if (enqueue) { /* need enqueue, prepare cmd_obj and enqueue */ pcmdobj = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmdobj == NULL) { + if (!pcmdobj) { kfree(setChannelPlan_param); res = _FAIL; goto exit; @@ -1432,13 +1432,13 @@ u8 rtw_lps_ctrl_wk_cmd(struct adapter *padapter, u8 lps_ctrl_type, u8 enqueue)
if (enqueue) { ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1474,13 +1474,13 @@ u8 rtw_dm_in_lps_wk_cmd(struct adapter *padapter)
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1540,13 +1540,13 @@ u8 rtw_dm_ra_mask_wk_cmd(struct adapter *padapter, u8 *psta)
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1575,13 +1575,13 @@ u8 rtw_ps_cmd(struct adapter *padapter) u8 res = _SUCCESS;
ppscmd = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ppscmd == NULL) { + if (!ppscmd) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ppscmd); res = _FAIL; goto exit; @@ -1647,13 +1647,13 @@ u8 rtw_chk_hi_queue_cmd(struct adapter *padapter) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1741,13 +1741,13 @@ u8 rtw_c2h_packet_wk_cmd(struct adapter *padapter, u8 *pbuf, u16 length) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1776,13 +1776,13 @@ u8 rtw_c2h_wk_cmd(struct adapter *padapter, u8 *c2h_evt) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
pdrvextra_cmd_parm = rtw_zmalloc(sizeof(struct drvextra_cmd_parm)); - if (pdrvextra_cmd_parm == NULL) { + if (!pdrvextra_cmd_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -1957,7 +1957,7 @@ void rtw_createbss_cmd_callback(struct adapter *padapter, struct cmd_obj *pcmd) struct wlan_bssid_ex *pnetwork = (struct wlan_bssid_ex *)pcmd->parmbuf; struct wlan_network *tgt_network = &(pmlmepriv->cur_network);
- if (pcmd->parmbuf == NULL) + if (!pcmd->parmbuf) goto exit;
if (pcmd->res != H2C_SUCCESS) @@ -1980,9 +1980,9 @@ void rtw_createbss_cmd_callback(struct adapter *padapter, struct cmd_obj *pcmd) } else { pwlan = rtw_alloc_network(pmlmepriv); spin_lock_bh(&(pmlmepriv->scanned_queue.lock)); - if (pwlan == NULL) { + if (!pwlan) { pwlan = rtw_get_oldest_wlan_network(&pmlmepriv->scanned_queue); - if (pwlan == NULL) { + if (!pwlan) { spin_unlock_bh(&(pmlmepriv->scanned_queue.lock)); goto createbss_cmd_fail; } diff --git a/drivers/staging/rtl8723bs/core/rtw_ioctl_set.c b/drivers/staging/rtl8723bs/core/rtw_ioctl_set.c index 5cfde71766173..8c11daff2d590 100644 --- a/drivers/staging/rtl8723bs/core/rtw_ioctl_set.c +++ b/drivers/staging/rtl8723bs/core/rtw_ioctl_set.c @@ -370,7 +370,7 @@ u8 rtw_set_802_11_bssid_list_scan(struct adapter *padapter, struct ndis_802_11_s struct mlme_priv *pmlmepriv = &padapter->mlmepriv; u8 res = true;
- if (padapter == NULL) { + if (!padapter) { res = false; goto exit; } @@ -481,7 +481,7 @@ u16 rtw_get_cur_max_rate(struct adapter *adapter) return 0;
psta = rtw_get_stainfo(&adapter->stapriv, get_bssid(pmlmepriv)); - if (psta == NULL) + if (!psta) return 0;
short_GI = query_ra_short_GI(psta); diff --git a/drivers/staging/rtl8723bs/core/rtw_mlme.c b/drivers/staging/rtl8723bs/core/rtw_mlme.c index 952c3e14d1b33..26c40042d2bed 100644 --- a/drivers/staging/rtl8723bs/core/rtw_mlme.c +++ b/drivers/staging/rtl8723bs/core/rtw_mlme.c @@ -439,7 +439,7 @@ struct wlan_network *rtw_get_oldest_wlan_network(struct __queue *scanned_queue) pwlan = list_entry(plist, struct wlan_network, list);
if (!pwlan->fixed) { - if (oldest == NULL || time_after(oldest->last_scanned, pwlan->last_scanned)) + if (!oldest || time_after(oldest->last_scanned, pwlan->last_scanned)) oldest = pwlan; } } @@ -542,7 +542,7 @@ void rtw_update_scanned_network(struct adapter *adapter, struct wlan_bssid_ex *t /* TODO: don't select network in the same ess as oldest if it's new enough*/ }
- if (oldest == NULL || time_after(oldest->last_scanned, pnetwork->last_scanned)) + if (!oldest || time_after(oldest->last_scanned, pnetwork->last_scanned)) oldest = pnetwork;
} @@ -1816,7 +1816,7 @@ static int rtw_check_join_candidate(struct mlme_priv *mlme goto exit; }
- if (*candidate == NULL || (*candidate)->network.rssi < competitor->network.rssi) { + if (!*candidate || (*candidate)->network.rssi < competitor->network.rssi) { *candidate = competitor; updated = true; } diff --git a/drivers/staging/rtl8723bs/core/rtw_mlme_ext.c b/drivers/staging/rtl8723bs/core/rtw_mlme_ext.c index 1a4b4c75c4bf5..e923f306cf0c3 100644 --- a/drivers/staging/rtl8723bs/core/rtw_mlme_ext.c +++ b/drivers/staging/rtl8723bs/core/rtw_mlme_ext.c @@ -742,11 +742,11 @@ unsigned int OnAuth(struct adapter *padapter, union recv_frame *precv_frame) }
pstat = rtw_get_stainfo(pstapriv, sa); - if (pstat == NULL) { + if (!pstat) {
/* allocate a new one */ pstat = rtw_alloc_stainfo(pstapriv, sa); - if (pstat == NULL) { + if (!pstat) { status = WLAN_STATUS_AP_UNABLE_TO_HANDLE_NEW_STA; goto auth_fail; } @@ -814,7 +814,7 @@ unsigned int OnAuth(struct adapter *padapter, union recv_frame *precv_frame) p = rtw_get_ie(pframe + WLAN_HDR_A3_LEN + 4 + _AUTH_IE_OFFSET_, WLAN_EID_CHALLENGE, (int *)&ie_len, len - WLAN_HDR_A3_LEN - _AUTH_IE_OFFSET_ - 4);
- if ((p == NULL) || (ie_len <= 0)) { + if (!p || ie_len <= 0) { status = WLAN_STATUS_CHALLENGE_FAIL; goto auth_fail; } @@ -1034,7 +1034,7 @@ unsigned int OnAssocReq(struct adapter *padapter, union recv_frame *precv_frame)
/* check if the supported rate is ok */ p = rtw_get_ie(pframe + WLAN_HDR_A3_LEN + ie_offset, WLAN_EID_SUPP_RATES, &ie_len, pkt_len - WLAN_HDR_A3_LEN - ie_offset); - if (p == NULL) { + if (!p) { /* use our own rate set as statoin used */ /* memcpy(supportRate, AP_BSSRATE, AP_BSSRATE_LEN); */ /* supportRateNum = AP_BSSRATE_LEN; */ @@ -1047,7 +1047,7 @@ unsigned int OnAssocReq(struct adapter *padapter, union recv_frame *precv_frame)
p = rtw_get_ie(pframe + WLAN_HDR_A3_LEN + ie_offset, WLAN_EID_EXT_SUPP_RATES, &ie_len, pkt_len - WLAN_HDR_A3_LEN - ie_offset); - if (p != NULL) { + if (p) {
if (supportRateNum <= sizeof(supportRate)) { memcpy(supportRate+supportRateNum, p+2, ie_len); @@ -1294,7 +1294,7 @@ unsigned int OnAssocReq(struct adapter *padapter, union recv_frame *precv_frame) /* get a unique AID */ if (pstat->aid == 0) { for (pstat->aid = 1; pstat->aid <= NUM_STA; pstat->aid++) - if (pstapriv->sta_aid[pstat->aid - 1] == NULL) + if (!pstapriv->sta_aid[pstat->aid - 1]) break;
/* if (pstat->aid > NUM_STA) { */ @@ -1940,7 +1940,7 @@ static struct xmit_frame *_alloc_mgtxmitframe(struct xmit_priv *pxmitpriv, bool goto exit;
pxmitbuf = rtw_alloc_xmitbuf_ext(pxmitpriv); - if (pxmitbuf == NULL) { + if (!pxmitbuf) { rtw_free_xmitframe(pxmitpriv, pmgntframe); pmgntframe = NULL; goto exit; @@ -2293,7 +2293,7 @@ void issue_probersp(struct adapter *padapter, unsigned char *da, u8 is_valid_p2p struct wlan_bssid_ex *cur_network = &(pmlmeinfo->network); unsigned int rate_len;
- if (da == NULL) + if (!da) return;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); @@ -2617,7 +2617,7 @@ void issue_auth(struct adapter *padapter, struct sta_info *psta, unsigned short __le16 le_tmp;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) + if (!pmgntframe) return;
/* update attribute */ @@ -2748,7 +2748,7 @@ void issue_asocrsp(struct adapter *padapter, unsigned short status, struct sta_i __le16 lestatus, le_tmp;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) + if (!pmgntframe) return;
/* update attribute */ @@ -2836,7 +2836,7 @@ void issue_asocrsp(struct adapter *padapter, unsigned short status, struct sta_i break; }
- if ((pbuf == NULL) || (ie_len == 0)) { + if (!pbuf || ie_len == 0) { break; } } @@ -2880,7 +2880,7 @@ void issue_assocreq(struct adapter *padapter) u8 vs_ie_length = 0;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) + if (!pmgntframe) goto exit;
/* update attribute */ @@ -3057,7 +3057,7 @@ static int _issue_nulldata(struct adapter *padapter, unsigned char *da, pmlmeinfo = &(pmlmeext->mlmext_info);
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) + if (!pmgntframe) goto exit;
/* update attribute */ @@ -3196,7 +3196,7 @@ static int _issue_qos_nulldata(struct adapter *padapter, unsigned char *da, struct mlme_ext_info *pmlmeinfo = &(pmlmeext->mlmext_info);
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) + if (!pmgntframe) goto exit;
/* update attribute */ @@ -3309,7 +3309,7 @@ static int _issue_deauth(struct adapter *padapter, unsigned char *da, __le16 le_tmp;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) { + if (!pmgntframe) { goto exit; }
@@ -3635,7 +3635,7 @@ static void issue_action_BSSCoexistPacket(struct adapter *padapter) action = ACT_PUBLIC_BSSCOEXIST;
pmgntframe = alloc_mgtxmitframe(pxmitpriv); - if (pmgntframe == NULL) { + if (!pmgntframe) { return; }
@@ -3702,7 +3702,7 @@ static void issue_action_BSSCoexistPacket(struct adapter *padapter) pbss_network = (struct wlan_bssid_ex *)&pnetwork->network;
p = rtw_get_ie(pbss_network->ies + _FIXED_IE_LENGTH_, WLAN_EID_HT_CAPABILITY, &len, pbss_network->ie_length - _FIXED_IE_LENGTH_); - if ((p == NULL) || (len == 0)) {/* non-HT */ + if (!p || len == 0) {/* non-HT */
if (pbss_network->configuration.ds_config <= 0) continue; @@ -3765,7 +3765,7 @@ unsigned int send_delba(struct adapter *padapter, u8 initiator, u8 *addr) return _SUCCESS;
psta = rtw_get_stainfo(pstapriv, addr); - if (psta == NULL) + if (!psta) return _SUCCESS;
if (initiator == 0) {/* recipient */ @@ -4637,13 +4637,13 @@ void report_del_sta_event(struct adapter *padapter, unsigned char *MacAddr, unsi struct cmd_priv *pcmdpriv = &padapter->cmdpriv;
pcmd_obj = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd_obj == NULL) { + if (!pcmd_obj) { return; }
cmdsz = (sizeof(struct stadel_event) + sizeof(struct C2HEvent_Header)); pevtcmd = rtw_zmalloc(cmdsz); - if (pevtcmd == NULL) { + if (!pevtcmd) { kfree(pcmd_obj); return; } @@ -4689,12 +4689,12 @@ void report_add_sta_event(struct adapter *padapter, unsigned char *MacAddr, int struct cmd_priv *pcmdpriv = &padapter->cmdpriv;
pcmd_obj = rtw_zmalloc(sizeof(struct cmd_obj)); - if (pcmd_obj == NULL) + if (!pcmd_obj) return;
cmdsz = (sizeof(struct stassoc_event) + sizeof(struct C2HEvent_Header)); pevtcmd = rtw_zmalloc(cmdsz); - if (pevtcmd == NULL) { + if (!pevtcmd) { kfree(pcmd_obj); return; } @@ -5143,12 +5143,12 @@ void survey_timer_hdl(struct timer_list *t) }
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { goto exit_survey_timer_hdl; }
psurveyPara = rtw_zmalloc(sizeof(struct sitesurvey_parm)); - if (psurveyPara == NULL) { + if (!psurveyPara) { kfree(ph2c); goto exit_survey_timer_hdl; } @@ -5777,7 +5777,7 @@ u8 chk_bmc_sleepq_cmd(struct adapter *padapter) u8 res = _SUCCESS;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; } @@ -5801,13 +5801,13 @@ u8 set_tx_beacon_cmd(struct adapter *padapter) int len_diff = 0;
ph2c = rtw_zmalloc(sizeof(struct cmd_obj)); - if (ph2c == NULL) { + if (!ph2c) { res = _FAIL; goto exit; }
ptxBeacon_parm = rtw_zmalloc(sizeof(struct Tx_Beacon_param)); - if (ptxBeacon_parm == NULL) { + if (!ptxBeacon_parm) { kfree(ph2c); res = _FAIL; goto exit; @@ -5867,7 +5867,7 @@ u8 mlme_evt_hdl(struct adapter *padapter, unsigned char *pbuf) void (*event_callback)(struct adapter *dev, u8 *pbuf); struct evt_priv *pevt_priv = &(padapter->evtpriv);
- if (pbuf == NULL) + if (!pbuf) goto _abort_event_;
peventbuf = (uint *)pbuf; diff --git a/drivers/staging/rtl8723bs/core/rtw_security.c b/drivers/staging/rtl8723bs/core/rtw_security.c index b050bf62e3b94..ac731415f7332 100644 --- a/drivers/staging/rtl8723bs/core/rtw_security.c +++ b/drivers/staging/rtl8723bs/core/rtw_security.c @@ -51,7 +51,7 @@ void rtw_wep_encrypt(struct adapter *padapter, u8 *pxmitframe) struct xmit_priv *pxmitpriv = &padapter->xmitpriv; struct arc4_ctx *ctx = &psecuritypriv->xmit_arc4_ctx;
- if (((struct xmit_frame *)pxmitframe)->buf_addr == NULL) + if (!((struct xmit_frame *)pxmitframe)->buf_addr) return;
hw_hdr_offset = TXDESC_OFFSET; @@ -476,7 +476,7 @@ u32 rtw_tkip_encrypt(struct adapter *padapter, u8 *pxmitframe) struct arc4_ctx *ctx = &psecuritypriv->xmit_arc4_ctx; u32 res = _SUCCESS;
- if (((struct xmit_frame *)pxmitframe)->buf_addr == NULL) + if (!((struct xmit_frame *)pxmitframe)->buf_addr) return _FAIL;
hw_hdr_offset = TXDESC_OFFSET; @@ -1043,7 +1043,7 @@ u32 rtw_aes_encrypt(struct adapter *padapter, u8 *pxmitframe)
u32 res = _SUCCESS;
- if (((struct xmit_frame *)pxmitframe)->buf_addr == NULL) + if (!((struct xmit_frame *)pxmitframe)->buf_addr) return _FAIL;
hw_hdr_offset = TXDESC_OFFSET; diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c index 448c4248ca4cf..20d05b4ee63e2 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -391,7 +391,7 @@ void rtw_cfg80211_ibss_indicate_connect(struct adapter *padapter) } else { - if (scanned == NULL) { + if (!scanned) { rtw_warn_on(1); return; } @@ -432,7 +432,7 @@ void rtw_cfg80211_indicate_connect(struct adapter *padapter) struct wlan_bssid_ex *pnetwork = &(padapter->mlmeextpriv.mlmext_info.network); struct wlan_network *scanned = pmlmepriv->cur_network_scanned;
- if (scanned == NULL) { + if (!scanned) { rtw_warn_on(1); goto check_bss; } @@ -551,10 +551,10 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa goto exit; }
- if (strcmp(param->u.crypt.alg, "none") == 0 && (psta == NULL)) + if (strcmp(param->u.crypt.alg, "none") == 0 && !psta) goto exit;
- if (strcmp(param->u.crypt.alg, "WEP") == 0 && (psta == NULL)) + if (strcmp(param->u.crypt.alg, "WEP") == 0 && !psta) { wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len; @@ -907,7 +907,7 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param }
pbcmc_sta = rtw_get_bcmc_stainfo(padapter); - if (pbcmc_sta == NULL) + if (!pbcmc_sta) { /* DEBUG_ERR(("Set OID_802_11_ADD_KEY: bcmc stainfo is null\n")); */ } @@ -947,7 +947,7 @@ static int cfg80211_rtw_add_key(struct wiphy *wiphy, struct net_device *ndev,
param_len = sizeof(struct ieee_param) + params->key_len; param = rtw_malloc(param_len); - if (param == NULL) + if (!param) return -1;
memset(param, 0, param_len); @@ -1098,7 +1098,7 @@ static int cfg80211_rtw_get_station(struct wiphy *wiphy, }
psta = rtw_get_stainfo(pstapriv, (u8 *)mac); - if (psta == NULL) { + if (!psta) { ret = -ENOENT; goto exit; } @@ -1327,7 +1327,7 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy struct rtw_wdev_priv *pwdev_priv; struct mlme_priv *pmlmepriv;
- if (ndev == NULL) { + if (!ndev) { ret = -EINVAL; goto exit; } @@ -1571,7 +1571,7 @@ static int rtw_cfg80211_set_wpa_ie(struct adapter *padapter, u8 *pie, size_t iel u8 *pwpa, *pwpa2; u8 null_addr[] = {0, 0, 0, 0, 0, 0};
- if (pie == NULL || !ielen) { + if (!pie || !ielen) { /* Treat this as normal case, but need to clear WIFI_UNDER_WPS */ _clr_fwstate_(&padapter->mlmepriv, WIFI_UNDER_WPS); goto exit; @@ -1583,7 +1583,7 @@ static int rtw_cfg80211_set_wpa_ie(struct adapter *padapter, u8 *pie, size_t iel }
buf = rtw_zmalloc(ielen); - if (buf == NULL) { + if (!buf) { ret = -ENOMEM; goto exit; } @@ -1873,7 +1873,7 @@ static int cfg80211_rtw_connect(struct wiphy *wiphy, struct net_device *ndev, wep_key_len = wep_key_len <= 5 ? 5 : 13; wep_total_len = wep_key_len + FIELD_OFFSET(struct ndis_802_11_wep, key_material); pwep = rtw_malloc(wep_total_len); - if (pwep == NULL) { + if (!pwep) { ret = -ENOMEM; goto exit; } @@ -2708,7 +2708,7 @@ static int cfg80211_rtw_mgmt_tx(struct wiphy *wiphy, struct adapter *padapter; struct rtw_wdev_priv *pwdev_priv;
- if (ndev == NULL) { + if (!ndev) { ret = -EINVAL; goto exit; } diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c index 295121c268bd4..f23ab3e103455 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c @@ -153,7 +153,7 @@ static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param,
if (check_fwstate(pmlmepriv, WIFI_STATION_STATE | WIFI_MP_STATE) == true) { /* sta mode */ psta = rtw_get_stainfo(pstapriv, get_bssid(pmlmepriv)); - if (psta == NULL) { + if (!psta) { /* DEBUG_ERR(("Set wpa_set_encryption: Obtain Sta_info fail\n")); */ } else { /* Jeff: don't disable ieee8021x_blocked while clearing key */ @@ -206,7 +206,7 @@ static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param, }
pbcmc_sta = rtw_get_bcmc_stainfo(padapter); - if (pbcmc_sta == NULL) { + if (!pbcmc_sta) { /* DEBUG_ERR(("Set OID_802_11_ADD_KEY: bcmc stainfo is null\n")); */ } else { /* Jeff: don't disable ieee8021x_blocked while clearing key */ @@ -236,9 +236,9 @@ static int rtw_set_wpa_ie(struct adapter *padapter, char *pie, unsigned short ie int ret = 0; u8 null_addr[] = {0, 0, 0, 0, 0, 0};
- if ((ielen > MAX_WPA_IE_LEN) || (pie == NULL)) { + if (ielen > MAX_WPA_IE_LEN || !pie) { _clr_fwstate_(&padapter->mlmepriv, WIFI_UNDER_WPS); - if (pie == NULL) + if (!pie) return ret; else return -EINVAL; @@ -246,7 +246,7 @@ static int rtw_set_wpa_ie(struct adapter *padapter, char *pie, unsigned short ie
if (ielen) { buf = rtw_zmalloc(ielen); - if (buf == NULL) { + if (!buf) { ret = -ENOMEM; goto exit; } @@ -491,7 +491,7 @@ static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p) return -EINVAL;
param = rtw_malloc(p->length); - if (param == NULL) + if (!param) return -ENOMEM;
if (copy_from_user(param, p->pointer, p->length)) { @@ -571,7 +571,7 @@ static int rtw_set_encryption(struct net_device *dev, struct ieee_param *param, goto exit; }
- if (strcmp(param->u.crypt.alg, "none") == 0 && (psta == NULL)) { + if (strcmp(param->u.crypt.alg, "none") == 0 && !psta) { /* todo:clear default encryption keys */
psecuritypriv->dot11AuthAlgrthm = dot11AuthAlgrthm_Open; @@ -583,7 +583,7 @@ static int rtw_set_encryption(struct net_device *dev, struct ieee_param *param, }
- if (strcmp(param->u.crypt.alg, "WEP") == 0 && (psta == NULL)) { + if (strcmp(param->u.crypt.alg, "WEP") == 0 && !psta) { wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len;
@@ -1227,7 +1227,7 @@ static int rtw_hostapd_ioctl(struct net_device *dev, struct iw_point *p) return -EINVAL;
param = rtw_malloc(p->length); - if (param == NULL) + if (!param) return -ENOMEM;
if (copy_from_user(param, p->pointer, p->length)) { diff --git a/drivers/staging/rtl8723bs/os_dep/os_intfs.c b/drivers/staging/rtl8723bs/os_dep/os_intfs.c index 23f4f706f935c..279347be77c40 100644 --- a/drivers/staging/rtl8723bs/os_dep/os_intfs.c +++ b/drivers/staging/rtl8723bs/os_dep/os_intfs.c @@ -488,7 +488,7 @@ void rtw_unregister_netdevs(struct dvobj_priv *dvobj)
padapter = dvobj->padapters;
- if (padapter == NULL) + if (!padapter) return;
pnetdev = padapter->pnetdev; @@ -594,7 +594,7 @@ struct dvobj_priv *devobj_init(void) struct dvobj_priv *pdvobj = NULL;
pdvobj = rtw_zmalloc(sizeof(*pdvobj)); - if (pdvobj == NULL) + if (!pdvobj) return NULL;
mutex_init(&pdvobj->hw_init_mutex);
From: Jagath Jog J jagathjog1996@gmail.com
[ Upstream commit 1d7280898f683ca824fc5eab5c486a583a81473b ]
Fix following checkpatch.pl error by placing opening { braces in previous line ERROR: that open brace { should be on the previous line
Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Jagath Jog J jagathjog1996@gmail.com Link: https://lore.kernel.org/r/20220124034456.8665-2-jagathjog1996@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 05cbcc415c9b ("staging: rtl8723bs: Fix key-store index handling") Signed-off-by: Sasha Levin sashal@kernel.org --- .../staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 98 ++++++------------- 1 file changed, 32 insertions(+), 66 deletions(-)
diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c index 20d05b4ee63e2..2404f7f84d82a 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -113,13 +113,10 @@ static struct ieee80211_supported_band *rtw_spt_band_alloc( struct ieee80211_supported_band *spt_band = NULL; int n_channels, n_bitrates;
- if (band == NL80211_BAND_2GHZ) - { + if (band == NL80211_BAND_2GHZ) { n_channels = RTW_2G_CHANNELS_NUM; n_bitrates = RTW_G_RATES_NUM; - } - else - { + } else { goto exit; }
@@ -135,8 +132,7 @@ static struct ieee80211_supported_band *rtw_spt_band_alloc( spt_band->n_channels = n_channels; spt_band->n_bitrates = n_bitrates;
- if (band == NL80211_BAND_2GHZ) - { + if (band == NL80211_BAND_2GHZ) { rtw_2g_channels_init(spt_band->channels); rtw_2g_rates_init(spt_band->bitrates); } @@ -235,8 +231,7 @@ struct cfg80211_bss *rtw_cfg80211_inform_bss(struct adapter *padapter, struct wl { u16 wapi_len = 0;
- if (rtw_get_wapi_ie(pnetwork->network.ies, pnetwork->network.ie_length, NULL, &wapi_len) > 0) - { + if (rtw_get_wapi_ie(pnetwork->network.ies, pnetwork->network.ie_length, NULL, &wapi_len) > 0) { if (wapi_len > 0) goto exit; } @@ -244,8 +239,7 @@ struct cfg80211_bss *rtw_cfg80211_inform_bss(struct adapter *padapter, struct wl
/* To reduce PBC Overlap rate */ /* spin_lock_bh(&pwdev_priv->scan_req_lock); */ - if (adapter_wdev_data(padapter)->scan_request) - { + if (adapter_wdev_data(padapter)->scan_request) { u8 *psr = NULL, sr = 0; struct ndis_802_11_ssid *pssid = &pnetwork->network.ssid; struct cfg80211_scan_request *request = adapter_wdev_data(padapter)->scan_request; @@ -258,14 +252,12 @@ struct cfg80211_bss *rtw_cfg80211_inform_bss(struct adapter *padapter, struct wl if (wpsie && wpsielen > 0) psr = rtw_get_wps_attr_content(wpsie, wpsielen, WPS_ATTR_SELECTED_REGISTRAR, (u8 *)(&sr), NULL);
- if (sr != 0) - { - if (request->n_ssids == 1 && request->n_channels == 1) /* it means under processing WPS */ - { + if (sr != 0) { + /* it means under processing WPS */ + if (request->n_ssids == 1 && request->n_channels == 1) { if (ssids[0].ssid_len != 0 && (pssid->ssid_length != ssids[0].ssid_len || - memcmp(pssid->ssid, ssids[0].ssid, ssids[0].ssid_len))) - { + memcmp(pssid->ssid, ssids[0].ssid, ssids[0].ssid_len))) { if (psr) *psr = 0; /* clear sr */ } @@ -374,8 +366,7 @@ void rtw_cfg80211_ibss_indicate_connect(struct adapter *padapter) int freq = (int)cur_network->network.configuration.ds_config; struct ieee80211_channel *chan;
- if (pwdev->iftype != NL80211_IFTYPE_ADHOC) - { + if (pwdev->iftype != NL80211_IFTYPE_ADHOC) { return; }
@@ -383,14 +374,11 @@ void rtw_cfg80211_ibss_indicate_connect(struct adapter *padapter) struct wlan_bssid_ex *pnetwork = &(padapter->mlmeextpriv.mlmext_info.network); struct wlan_network *scanned = pmlmepriv->cur_network_scanned;
- if (check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE) == true) - { + if (check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE) == true) {
memcpy(&cur_network->network, pnetwork, sizeof(struct wlan_bssid_ex)); rtw_cfg80211_inform_bss(padapter, cur_network); - } - else - { + } else { if (!scanned) { rtw_warn_on(1); return; @@ -473,9 +461,7 @@ void rtw_cfg80211_indicate_connect(struct adapter *padapter) roam_info.resp_ie_len = pmlmepriv->assoc_rsp_len-sizeof(struct ieee80211_hdr_3addr)-6; cfg80211_roamed(padapter->pnetdev, &roam_info, GFP_ATOMIC); - } - else - { + } else { cfg80211_connect_result(padapter->pnetdev, cur_network->network.mac_address , pmlmepriv->assoc_req+sizeof(struct ieee80211_hdr_3addr)+2 , pmlmepriv->assoc_req_len-sizeof(struct ieee80211_hdr_3addr)-2 @@ -527,24 +513,19 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa param->u.crypt.err = 0; param->u.crypt.alg[IEEE_CRYPT_ALG_NAME_LEN - 1] = '\0';
- if (param_len != sizeof(struct ieee_param) + param->u.crypt.key_len) - { + if (param_len != sizeof(struct ieee_param) + param->u.crypt.key_len) { ret = -EINVAL; goto exit; }
if (param->sta_addr[0] == 0xff && param->sta_addr[1] == 0xff && param->sta_addr[2] == 0xff && param->sta_addr[3] == 0xff && - param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) - { - if (param->u.crypt.idx >= WEP_KEYS) - { + param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) { + if (param->u.crypt.idx >= WEP_KEYS) { ret = -EINVAL; goto exit; } - } - else - { + } else { psta = rtw_get_stainfo(pstapriv, param->sta_addr); if (!psta) /* ret = -EINVAL; */ @@ -554,24 +535,20 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa if (strcmp(param->u.crypt.alg, "none") == 0 && !psta) goto exit;
- if (strcmp(param->u.crypt.alg, "WEP") == 0 && !psta) - { + if (strcmp(param->u.crypt.alg, "WEP") == 0 && !psta) { wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len;
- if ((wep_key_idx >= WEP_KEYS) || (wep_key_len <= 0)) - { + if ((wep_key_idx >= WEP_KEYS) || (wep_key_len <= 0)) { ret = -EINVAL; goto exit; }
- if (wep_key_len > 0) - { + if (wep_key_len > 0) { wep_key_len = wep_key_len <= 5 ? 5 : 13; }
- if (psecuritypriv->bWepDefaultKeyIdxSet == 0) - { + if (psecuritypriv->bWepDefaultKeyIdxSet == 0) { /* wep default key has not been set, so use this key index as default key. */
psecuritypriv->dot11AuthAlgrthm = dot11AuthAlgrthm_Auto; @@ -579,8 +556,7 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa psecuritypriv->dot11PrivacyAlgrthm = _WEP40_; psecuritypriv->dot118021XGrpPrivacy = _WEP40_;
- if (wep_key_len == 13) - { + if (wep_key_len == 13) { psecuritypriv->dot11PrivacyAlgrthm = _WEP104_; psecuritypriv->dot118021XGrpPrivacy = _WEP104_; } @@ -598,24 +574,19 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
}
- - if (!psta && check_fwstate(pmlmepriv, WIFI_AP_STATE)) /* group key */ - { - if (param->u.crypt.set_tx == 0) /* group key */ - { - if (strcmp(param->u.crypt.alg, "WEP") == 0) - { + /* group key */ + if (!psta && check_fwstate(pmlmepriv, WIFI_AP_STATE)) { + /* group key */ + if (param->u.crypt.set_tx == 0) { + if (strcmp(param->u.crypt.alg, "WEP") == 0) { memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len));
psecuritypriv->dot118021XGrpPrivacy = _WEP40_; - if (param->u.crypt.key_len == 13) - { + if (param->u.crypt.key_len == 13) { psecuritypriv->dot118021XGrpPrivacy = _WEP104_; }
- } - else if (strcmp(param->u.crypt.alg, "TKIP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "TKIP") == 0) { psecuritypriv->dot118021XGrpPrivacy = _TKIP_;
memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); @@ -627,15 +598,11 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
psecuritypriv->busetkipkey = true;
- } - else if (strcmp(param->u.crypt.alg, "CCMP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "CCMP") == 0) { psecuritypriv->dot118021XGrpPrivacy = _AES_;
memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); - } - else - { + } else { psecuritypriv->dot118021XGrpPrivacy = _NO_PRIVACY_; }
@@ -648,8 +615,7 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa rtw_ap_set_group_key(padapter, param->u.crypt.key, psecuritypriv->dot118021XGrpPrivacy, param->u.crypt.idx);
pbcmc_sta = rtw_get_bcmc_stainfo(padapter); - if (pbcmc_sta) - { + if (pbcmc_sta) { pbcmc_sta->ieee8021x_blocked = false; pbcmc_sta->dot118021XPrivacy = psecuritypriv->dot118021XGrpPrivacy;/* rx will use bmc_sta's dot118021XPrivacy */ }
From: Hannes Braun hannesbraun@mail.de
[ Upstream commit a8b088d6d98dafddda9874f98ac2a7cefc51639b ]
This patch should eliminate the following errors/warnings emitted by checkpatch.pl: - that open brace { should be on the previous line - else should follow close brace '}' - braces {} are not necessary for single statement blocks
Signed-off-by: Hannes Braun hannesbraun@mail.de Link: https://lore.kernel.org/r/20220528123115.13024-1-hannesbraun@mail.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 05cbcc415c9b ("staging: rtl8723bs: Fix key-store index handling") Signed-off-by: Sasha Levin sashal@kernel.org --- .../staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 223 +++++------------- 1 file changed, 65 insertions(+), 158 deletions(-)
diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c index 2404f7f84d82a..b7a3f3df72723 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -366,9 +366,8 @@ void rtw_cfg80211_ibss_indicate_connect(struct adapter *padapter) int freq = (int)cur_network->network.configuration.ds_config; struct ieee80211_channel *chan;
- if (pwdev->iftype != NL80211_IFTYPE_ADHOC) { + if (pwdev->iftype != NL80211_IFTYPE_ADHOC) return; - }
if (!rtw_cfg80211_check_bss(padapter)) { struct wlan_bssid_ex *pnetwork = &(padapter->mlmeextpriv.mlmext_info.network); @@ -544,9 +543,8 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa goto exit; }
- if (wep_key_len > 0) { + if (wep_key_len > 0) wep_key_len = wep_key_len <= 5 ? 5 : 13; - }
if (psecuritypriv->bWepDefaultKeyIdxSet == 0) { /* wep default key has not been set, so use this key index as default key. */ @@ -582,9 +580,8 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len));
psecuritypriv->dot118021XGrpPrivacy = _WEP40_; - if (param->u.crypt.key_len == 13) { + if (param->u.crypt.key_len == 13) psecuritypriv->dot118021XGrpPrivacy = _WEP104_; - }
} else if (strcmp(param->u.crypt.alg, "TKIP") == 0) { psecuritypriv->dot118021XGrpPrivacy = _TKIP_; @@ -626,24 +623,16 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
}
- if (psecuritypriv->dot11AuthAlgrthm == dot11AuthAlgrthm_8021X && psta) /* psk/802_1x */ - { - if (check_fwstate(pmlmepriv, WIFI_AP_STATE)) - { - if (param->u.crypt.set_tx == 1) /* pairwise key */ - { + if (psecuritypriv->dot11AuthAlgrthm == dot11AuthAlgrthm_8021X && psta) { /* psk/802_1x */ + if (check_fwstate(pmlmepriv, WIFI_AP_STATE)) { + if (param->u.crypt.set_tx == 1) { /* pairwise key */ memcpy(psta->dot118021x_UncstKey.skey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len));
- if (strcmp(param->u.crypt.alg, "WEP") == 0) - { + if (strcmp(param->u.crypt.alg, "WEP") == 0) { psta->dot118021XPrivacy = _WEP40_; if (param->u.crypt.key_len == 13) - { psta->dot118021XPrivacy = _WEP104_; - } - } - else if (strcmp(param->u.crypt.alg, "TKIP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "TKIP") == 0) { psta->dot118021XPrivacy = _TKIP_;
/* DEBUG_ERR("set key length :param->u.crypt.key_len =%d\n", param->u.crypt.key_len); */ @@ -653,14 +642,10 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
psecuritypriv->busetkipkey = true;
- } - else if (strcmp(param->u.crypt.alg, "CCMP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "CCMP") == 0) {
psta->dot118021XPrivacy = _AES_; - } - else - { + } else { psta->dot118021XPrivacy = _NO_PRIVACY_; }
@@ -670,21 +655,14 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
psta->bpairwise_key_installed = true;
- } - else/* group key??? */ - { - if (strcmp(param->u.crypt.alg, "WEP") == 0) - { + } else { /* group key??? */ + if (strcmp(param->u.crypt.alg, "WEP") == 0) { memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len));
psecuritypriv->dot118021XGrpPrivacy = _WEP40_; if (param->u.crypt.key_len == 13) - { psecuritypriv->dot118021XGrpPrivacy = _WEP104_; - } - } - else if (strcmp(param->u.crypt.alg, "TKIP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "TKIP") == 0) { psecuritypriv->dot118021XGrpPrivacy = _TKIP_;
memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); @@ -696,15 +674,11 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa
psecuritypriv->busetkipkey = true;
- } - else if (strcmp(param->u.crypt.alg, "CCMP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "CCMP") == 0) { psecuritypriv->dot118021XGrpPrivacy = _AES_;
memcpy(grpkey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); - } - else - { + } else { psecuritypriv->dot118021XGrpPrivacy = _NO_PRIVACY_; }
@@ -717,8 +691,7 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa rtw_ap_set_group_key(padapter, param->u.crypt.key, psecuritypriv->dot118021XGrpPrivacy, param->u.crypt.idx);
pbcmc_sta = rtw_get_bcmc_stainfo(padapter); - if (pbcmc_sta) - { + if (pbcmc_sta) { pbcmc_sta->ieee8021x_blocked = false; pbcmc_sta->dot118021XPrivacy = psecuritypriv->dot118021XGrpPrivacy;/* rx will use bmc_sta's dot118021XPrivacy */ } @@ -746,20 +719,16 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param param->u.crypt.err = 0; param->u.crypt.alg[IEEE_CRYPT_ALG_NAME_LEN - 1] = '\0';
- if (param_len < (u32) ((u8 *) param->u.crypt.key - (u8 *) param) + param->u.crypt.key_len) - { + if (param_len < (u32) ((u8 *) param->u.crypt.key - (u8 *) param) + param->u.crypt.key_len) { ret = -EINVAL; goto exit; }
if (param->sta_addr[0] == 0xff && param->sta_addr[1] == 0xff && param->sta_addr[2] == 0xff && param->sta_addr[3] == 0xff && - param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) - { + param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) { if (param->u.crypt.idx >= WEP_KEYS - || param->u.crypt.idx >= BIP_MAX_KEYID - ) - { + || param->u.crypt.idx >= BIP_MAX_KEYID) { ret = -EINVAL; goto exit; } @@ -770,19 +739,16 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param } }
- if (strcmp(param->u.crypt.alg, "WEP") == 0) - { + if (strcmp(param->u.crypt.alg, "WEP") == 0) { wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len;
- if ((wep_key_idx >= WEP_KEYS) || (wep_key_len <= 0)) - { + if ((wep_key_idx >= WEP_KEYS) || (wep_key_len <= 0)) { ret = -EINVAL; goto exit; }
- if (psecuritypriv->bWepDefaultKeyIdxSet == 0) - { + if (psecuritypriv->bWepDefaultKeyIdxSet == 0) { /* wep default key has not been set, so use this key index as default key. */
wep_key_len = wep_key_len <= 5 ? 5 : 13; @@ -791,8 +757,7 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param psecuritypriv->dot11PrivacyAlgrthm = _WEP40_; psecuritypriv->dot118021XGrpPrivacy = _WEP40_;
- if (wep_key_len == 13) - { + if (wep_key_len == 13) { psecuritypriv->dot11PrivacyAlgrthm = _WEP104_; psecuritypriv->dot118021XGrpPrivacy = _WEP104_; } @@ -809,13 +774,11 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param goto exit; }
- if (padapter->securitypriv.dot11AuthAlgrthm == dot11AuthAlgrthm_8021X) /* 802_1x */ - { + if (padapter->securitypriv.dot11AuthAlgrthm == dot11AuthAlgrthm_8021X) { /* 802_1x */ struct sta_info *psta, *pbcmc_sta; struct sta_priv *pstapriv = &padapter->stapriv;
- if (check_fwstate(pmlmepriv, WIFI_STATION_STATE | WIFI_MP_STATE) == true) /* sta mode */ - { + if (check_fwstate(pmlmepriv, WIFI_STATION_STATE | WIFI_MP_STATE) == true) { /* sta mode */ psta = rtw_get_stainfo(pstapriv, get_bssid(pmlmepriv)); if (psta) { /* Jeff: don't disable ieee8021x_blocked while clearing key */ @@ -824,18 +787,15 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param
if ((padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption2Enabled) || - (padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption3Enabled)) - { + (padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption3Enabled)) { psta->dot118021XPrivacy = padapter->securitypriv.dot11PrivacyAlgrthm; }
- if (param->u.crypt.set_tx == 1)/* pairwise key */ - { + if (param->u.crypt.set_tx == 1) { /* pairwise key */
memcpy(psta->dot118021x_UncstKey.skey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len));
- if (strcmp(param->u.crypt.alg, "TKIP") == 0)/* set mic key */ - { + if (strcmp(param->u.crypt.alg, "TKIP") == 0) { /* set mic key */ /* DEBUG_ERR(("\nset key length :param->u.crypt.key_len =%d\n", param->u.crypt.key_len)); */ memcpy(psta->dot11tkiptxmickey.skey, &(param->u.crypt.key[16]), 8); memcpy(psta->dot11tkiprxmickey.skey, &(param->u.crypt.key[24]), 8); @@ -845,11 +805,8 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param }
rtw_setstakey_cmd(padapter, psta, true, true); - } - else/* group key */ - { - if (strcmp(param->u.crypt.alg, "TKIP") == 0 || strcmp(param->u.crypt.alg, "CCMP") == 0) - { + } else { /* group key */ + if (strcmp(param->u.crypt.alg, "TKIP") == 0 || strcmp(param->u.crypt.alg, "CCMP") == 0) { memcpy(padapter->securitypriv.dot118021XGrpKey[param->u.crypt.idx].skey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); memcpy(padapter->securitypriv.dot118021XGrptxmickey[param->u.crypt.idx].skey, &(param->u.crypt.key[16]), 8); memcpy(padapter->securitypriv.dot118021XGrprxmickey[param->u.crypt.idx].skey, &(param->u.crypt.key[24]), 8); @@ -857,9 +814,7 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param
padapter->securitypriv.dot118021XGrpKeyid = param->u.crypt.idx; rtw_set_key(padapter, &padapter->securitypriv, param->u.crypt.idx, 1, true); - } - else if (strcmp(param->u.crypt.alg, "BIP") == 0) - { + } else if (strcmp(param->u.crypt.alg, "BIP") == 0) { /* save the IGTK key, length 16 bytes */ memcpy(padapter->securitypriv.dot11wBIPKey[param->u.crypt.idx].skey, param->u.crypt.key, (param->u.crypt.key_len > 16 ? 16 : param->u.crypt.key_len)); /* @@ -873,25 +828,19 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param }
pbcmc_sta = rtw_get_bcmc_stainfo(padapter); - if (!pbcmc_sta) - { + if (!pbcmc_sta) { /* DEBUG_ERR(("Set OID_802_11_ADD_KEY: bcmc stainfo is null\n")); */ - } - else - { + } else { /* Jeff: don't disable ieee8021x_blocked while clearing key */ if (strcmp(param->u.crypt.alg, "none") != 0) pbcmc_sta->ieee8021x_blocked = false;
if ((padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption2Enabled) || - (padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption3Enabled)) - { + (padapter->securitypriv.ndisencryptstatus == Ndis802_11Encryption3Enabled)) { pbcmc_sta->dot118021XPrivacy = padapter->securitypriv.dot11PrivacyAlgrthm; } } - } - else if (check_fwstate(pmlmepriv, WIFI_ADHOC_STATE)) /* adhoc mode */ - { + } else if (check_fwstate(pmlmepriv, WIFI_ADHOC_STATE)) { /* adhoc mode */ } }
@@ -949,39 +898,29 @@ static int cfg80211_rtw_add_key(struct wiphy *wiphy, struct net_device *ndev,
if (!mac_addr || is_broadcast_ether_addr(mac_addr)) - { param->u.crypt.set_tx = 0; /* for wpa/wpa2 group key */ - } else { + else param->u.crypt.set_tx = 1; /* for wpa/wpa2 pairwise key */ - }
param->u.crypt.idx = key_index;
if (params->seq_len && params->seq) - { memcpy(param->u.crypt.seq, (u8 *)params->seq, params->seq_len); - }
- if (params->key_len && params->key) - { + if (params->key_len && params->key) { param->u.crypt.key_len = params->key_len; memcpy(param->u.crypt.key, (u8 *)params->key, params->key_len); }
- if (check_fwstate(pmlmepriv, WIFI_STATION_STATE) == true) - { + if (check_fwstate(pmlmepriv, WIFI_STATION_STATE) == true) { ret = rtw_cfg80211_set_encryption(ndev, param, param_len); - } - else if (check_fwstate(pmlmepriv, WIFI_AP_STATE) == true) - { + } else if (check_fwstate(pmlmepriv, WIFI_AP_STATE) == true) { if (mac_addr) memcpy(param->sta_addr, (void *)mac_addr, ETH_ALEN);
ret = rtw_cfg80211_ap_set_encryption(ndev, param, param_len); - } - else if (check_fwstate(pmlmepriv, WIFI_ADHOC_STATE) == true - || check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE) == true) - { + } else if (check_fwstate(pmlmepriv, WIFI_ADHOC_STATE) == true + || check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE) == true) { ret = rtw_cfg80211_set_encryption(ndev, param, param_len); }
@@ -1007,8 +946,7 @@ static int cfg80211_rtw_del_key(struct wiphy *wiphy, struct net_device *ndev, struct adapter *padapter = rtw_netdev_priv(ndev); struct security_priv *psecuritypriv = &padapter->securitypriv;
- if (key_index == psecuritypriv->dot11PrivacyKeyIndex) - { + if (key_index == psecuritypriv->dot11PrivacyKeyIndex) { /* clear the flag of wep default key set. */ psecuritypriv->bWepDefaultKeyIdxSet = 0; } @@ -1024,16 +962,14 @@ static int cfg80211_rtw_set_default_key(struct wiphy *wiphy, struct adapter *padapter = rtw_netdev_priv(ndev); struct security_priv *psecuritypriv = &padapter->securitypriv;
- if ((key_index < WEP_KEYS) && ((psecuritypriv->dot11PrivacyAlgrthm == _WEP40_) || (psecuritypriv->dot11PrivacyAlgrthm == _WEP104_))) /* set wep default key */ - { + if ((key_index < WEP_KEYS) && ((psecuritypriv->dot11PrivacyAlgrthm == _WEP40_) || (psecuritypriv->dot11PrivacyAlgrthm == _WEP104_))) { /* set wep default key */ psecuritypriv->ndisencryptstatus = Ndis802_11Encryption1Enabled;
psecuritypriv->dot11PrivacyKeyIndex = key_index;
psecuritypriv->dot11PrivacyAlgrthm = _WEP40_; psecuritypriv->dot118021XGrpPrivacy = _WEP40_; - if (psecuritypriv->dot11DefKeylen[key_index] == 13) - { + if (psecuritypriv->dot11DefKeylen[key_index] == 13) { psecuritypriv->dot11PrivacyAlgrthm = _WEP104_; psecuritypriv->dot118021XGrpPrivacy = _WEP104_; } @@ -1071,9 +1007,7 @@ static int cfg80211_rtw_get_station(struct wiphy *wiphy,
/* for infra./P2PClient mode */ if (check_fwstate(pmlmepriv, WIFI_STATION_STATE) - && check_fwstate(pmlmepriv, _FW_LINKED) - ) - { + && check_fwstate(pmlmepriv, _FW_LINKED)) { struct wlan_network *cur_network = &(pmlmepriv->cur_network);
if (memcmp((u8 *)mac, cur_network->network.mac_address, ETH_ALEN)) { @@ -1099,9 +1033,7 @@ static int cfg80211_rtw_get_station(struct wiphy *wiphy, if ((check_fwstate(pmlmepriv, WIFI_ADHOC_STATE) || check_fwstate(pmlmepriv, WIFI_ADHOC_MASTER_STATE) || check_fwstate(pmlmepriv, WIFI_AP_STATE)) - && check_fwstate(pmlmepriv, _FW_LINKED) - ) - { + && check_fwstate(pmlmepriv, _FW_LINKED)) { /* TODO: should acquire station info... */ }
@@ -1121,8 +1053,7 @@ static int cfg80211_rtw_change_iface(struct wiphy *wiphy, struct mlme_ext_priv *pmlmeext = &(padapter->mlmeextpriv); int ret = 0;
- if (adapter_to_dvobj(padapter)->processing_dev_remove == true) - { + if (adapter_to_dvobj(padapter)->processing_dev_remove == true) { ret = -EPERM; goto exit; } @@ -1141,8 +1072,7 @@ static int cfg80211_rtw_change_iface(struct wiphy *wiphy,
old_type = rtw_wdev->iftype;
- if (old_type != type) - { + if (old_type != type) { pmlmeext->action_public_rxseq = 0xffff; pmlmeext->action_public_dialog_token = 0xff; } @@ -1164,8 +1094,7 @@ static int cfg80211_rtw_change_iface(struct wiphy *wiphy,
rtw_wdev->iftype = type;
- if (rtw_set_802_11_infrastructure_mode(padapter, networkType) == false) - { + if (rtw_set_802_11_infrastructure_mode(padapter, networkType) == false) { rtw_wdev->iftype = old_type; ret = -EPERM; goto exit; @@ -1230,9 +1159,7 @@ void rtw_cfg80211_surveydone_event_callback(struct adapter *padapter)
/* report network only if the current channel set contains the channel to which this network belongs */ if (rtw_ch_set_search_ch(padapter->mlmeextpriv.channel_set, pnetwork->network.configuration.ds_config) >= 0 - && true == rtw_validate_ssid(&(pnetwork->network.ssid)) - ) - { + && true == rtw_validate_ssid(&(pnetwork->network.ssid))) { /* ev =translate_scan(padapter, a, pnetwork, ev, stop); */ rtw_cfg80211_inform_bss(padapter, pnetwork); } @@ -1249,13 +1176,10 @@ static int rtw_cfg80211_set_probe_req_wpsp2pie(struct adapter *padapter, char *b u8 *wps_ie; struct mlme_priv *pmlmepriv = &(padapter->mlmepriv);
- if (len > 0) - { + if (len > 0) { wps_ie = rtw_get_wps_ie(buf, len, NULL, &wps_ielen); - if (wps_ie) - { - if (pmlmepriv->wps_probe_req_ie) - { + if (wps_ie) { + if (pmlmepriv->wps_probe_req_ie) { pmlmepriv->wps_probe_req_ie_len = 0; kfree(pmlmepriv->wps_probe_req_ie); pmlmepriv->wps_probe_req_ie = NULL; @@ -1307,10 +1231,8 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy pwdev_priv->scan_request = request; spin_unlock_bh(&pwdev_priv->scan_req_lock);
- if (check_fwstate(pmlmepriv, WIFI_AP_STATE) == true) - { - if (check_fwstate(pmlmepriv, WIFI_UNDER_WPS|_FW_UNDER_SURVEY|_FW_UNDER_LINKING) == true) - { + if (check_fwstate(pmlmepriv, WIFI_AP_STATE) == true) { + if (check_fwstate(pmlmepriv, WIFI_UNDER_WPS|_FW_UNDER_SURVEY|_FW_UNDER_LINKING) == true) { need_indicate_scan_done = true; goto check_need_indicate_scan_done; } @@ -1333,15 +1255,13 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy goto check_need_indicate_scan_done; }
- if (pmlmepriv->LinkDetectInfo.bBusyTraffic == true) - { + if (pmlmepriv->LinkDetectInfo.bBusyTraffic == true) { static unsigned long lastscantime = 0; unsigned long passtime;
passtime = jiffies_to_msecs(jiffies - lastscantime); lastscantime = jiffies; - if (passtime > 12000) - { + if (passtime > 12000) { need_indicate_scan_done = true; goto check_need_indicate_scan_done; } @@ -1380,9 +1300,7 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy } else if (request->n_channels <= 4) { for (j = request->n_channels - 1; j >= 0; j--) for (i = 0; i < survey_times; i++) - { memcpy(&ch[j*survey_times+i], &ch[j], sizeof(struct rtw_ieee80211_channel)); - } _status = rtw_sitesurvey_cmd(padapter, ssid, RTW_SSID_SCAN_AMOUNT, ch, survey_times * request->n_channels); } else { _status = rtw_sitesurvey_cmd(padapter, ssid, RTW_SSID_SCAN_AMOUNT, NULL, 0); @@ -1391,14 +1309,11 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy
if (_status == false) - { ret = -1; - }
check_need_indicate_scan_done: kfree(ssid); - if (need_indicate_scan_done) - { + if (need_indicate_scan_done) { rtw_cfg80211_surveydone_event_callback(padapter); rtw_cfg80211_indicate_scan_done(padapter, false); } @@ -1424,9 +1339,7 @@ static int rtw_cfg80211_set_wpa_version(struct security_priv *psecuritypriv, u32
if (wpa_version & (NL80211_WPA_VERSION_1 | NL80211_WPA_VERSION_2)) - { psecuritypriv->ndisauthtype = Ndis802_11AuthModeWPAPSK; - }
return 0;
@@ -1585,8 +1498,7 @@ static int rtw_cfg80211_set_wpa_ie(struct adapter *padapter, u8 *pie, size_t iel if (pairwise_cipher == 0) pairwise_cipher = WPA_CIPHER_NONE;
- switch (group_cipher) - { + switch (group_cipher) { case WPA_CIPHER_NONE: padapter->securitypriv.dot118021XGrpPrivacy = _NO_PRIVACY_; padapter->securitypriv.ndisencryptstatus = Ndis802_11EncryptionDisabled; @@ -1609,8 +1521,7 @@ static int rtw_cfg80211_set_wpa_ie(struct adapter *padapter, u8 *pie, size_t iel break; }
- switch (pairwise_cipher) - { + switch (pairwise_cipher) { case WPA_CIPHER_NONE: padapter->securitypriv.dot11PrivacyAlgrthm = _NO_PRIVACY_; padapter->securitypriv.ndisencryptstatus = Ndis802_11EncryptionDisabled; @@ -1731,8 +1642,7 @@ static int cfg80211_rtw_leave_ibss(struct wiphy *wiphy, struct net_device *ndev)
rtw_wdev->iftype = NL80211_IFTYPE_STATION;
- if (rtw_set_802_11_infrastructure_mode(padapter, Ndis802_11Infrastructure) == false) - { + if (rtw_set_802_11_infrastructure_mode(padapter, Ndis802_11Infrastructure) == false) { rtw_wdev->iftype = old_type; ret = -EPERM; goto leave_ibss; @@ -1792,9 +1702,8 @@ static int cfg80211_rtw_connect(struct wiphy *wiphy, struct net_device *ndev, ret = -EBUSY; goto exit; } - if (check_fwstate(pmlmepriv, _FW_UNDER_SURVEY) == true) { + if (check_fwstate(pmlmepriv, _FW_UNDER_SURVEY) == true) rtw_scan_abort(padapter); - }
psecuritypriv->ndisencryptstatus = Ndis802_11EncryptionDisabled; psecuritypriv->dot11PrivacyAlgrthm = _NO_PRIVACY_; @@ -2287,9 +2196,8 @@ static int rtw_cfg80211_add_monitor_if(struct adapter *padapter, char *name, str mon_ndev->ieee80211_ptr = mon_wdev;
ret = cfg80211_register_netdevice(mon_ndev); - if (ret) { + if (ret) goto out; - }
*ndev = pwdev_priv->pmon_ndev = mon_ndev; memcpy(pwdev_priv->ifname_mon, name, IFNAMSIZ+1); @@ -2402,11 +2310,10 @@ static int rtw_add_beacon(struct adapter *adapter, const u8 *head, size_t head_l rtw_ies_remove_ie(pbuf, &len, _BEACON_IE_OFFSET_, WLAN_EID_VENDOR_SPECIFIC, P2P_OUI, 4); rtw_ies_remove_ie(pbuf, &len, _BEACON_IE_OFFSET_, WLAN_EID_VENDOR_SPECIFIC, WFD_OUI, 4);
- if (rtw_check_beacon_data(adapter, pbuf, len) == _SUCCESS) { + if (rtw_check_beacon_data(adapter, pbuf, len) == _SUCCESS) ret = 0; - } else { + else ret = -EINVAL; - }
kfree(pbuf);
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 05cbcc415c9b8c8bc4f9a09f8e03610a89042f03 ]
There are 2 issues with the key-store index handling
1. The non WEP key stores can store keys with indexes 0 - BIP_MAX_KEYID, this means that they should be an array with BIP_MAX_KEYID + 1 entries. But some of the arrays where just BIP_MAX_KEYID entries big. While one other array was hardcoded to a size of 6 entries, instead of using the BIP_MAX_KEYID define.
2. The rtw_cfg80211_set_encryption() and wpa_set_encryption() functions index check where checking that the passed in key-index would fit inside both the WEP key store (which only has 4 entries) as well as in the non WEP key stores. This breaks any attempts to set non WEP keys with index 4 or 5.
Issue 2. specifically breaks wifi connection with some access points which advertise PMF support. Without this fix connecting to these access points fails with the following wpa_supplicant messages:
nl80211: kernel reports: key addition failed wlan0: WPA: Failed to configure IGTK to the driver wlan0: RSN: Failed to configure IGTK wlan0: CTRL-EVENT-DISCONNECTED bssid=... reason=1 locally_generated=1
Fix 1. by using the right size for the key-stores. After this 2. can safely be fixed by checking the right max-index value depending on the used algorithm, fixing wifi not working with some PMF capable APs.
Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230306153512.162104-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../staging/rtl8723bs/include/rtw_security.h | 8 ++--- .../staging/rtl8723bs/os_dep/ioctl_cfg80211.c | 26 ++++++++------- .../staging/rtl8723bs/os_dep/ioctl_linux.c | 33 ++++++++++--------- 3 files changed, 36 insertions(+), 31 deletions(-)
diff --git a/drivers/staging/rtl8723bs/include/rtw_security.h b/drivers/staging/rtl8723bs/include/rtw_security.h index a68b738584623..7587fa8885274 100644 --- a/drivers/staging/rtl8723bs/include/rtw_security.h +++ b/drivers/staging/rtl8723bs/include/rtw_security.h @@ -107,13 +107,13 @@ struct security_priv {
u32 dot118021XGrpPrivacy; /* This specify the privacy algthm. used for Grp key */ u32 dot118021XGrpKeyid; /* key id used for Grp Key (tx key index) */ - union Keytype dot118021XGrpKey[BIP_MAX_KEYID]; /* 802.1x Group Key, for inx0 and inx1 */ - union Keytype dot118021XGrptxmickey[BIP_MAX_KEYID]; - union Keytype dot118021XGrprxmickey[BIP_MAX_KEYID]; + union Keytype dot118021XGrpKey[BIP_MAX_KEYID + 1]; /* 802.1x Group Key, for inx0 and inx1 */ + union Keytype dot118021XGrptxmickey[BIP_MAX_KEYID + 1]; + union Keytype dot118021XGrprxmickey[BIP_MAX_KEYID + 1]; union pn48 dot11Grptxpn; /* PN48 used for Grp Key xmit. */ union pn48 dot11Grprxpn; /* PN48 used for Grp Key recv. */ u32 dot11wBIPKeyid; /* key id used for BIP Key (tx key index) */ - union Keytype dot11wBIPKey[6]; /* BIP Key, for index4 and index5 */ + union Keytype dot11wBIPKey[BIP_MAX_KEYID + 1]; /* BIP Key, for index4 and index5 */ union pn48 dot11wBIPtxpn; /* PN48 used for Grp Key xmit. */ union pn48 dot11wBIPrxpn; /* PN48 used for Grp Key recv. */
diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c index b7a3f3df72723..b33424a9e83b7 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c @@ -711,6 +711,7 @@ static int rtw_cfg80211_ap_set_encryption(struct net_device *dev, struct ieee_pa static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param *param, u32 param_len) { int ret = 0; + u8 max_idx; u32 wep_key_idx, wep_key_len; struct adapter *padapter = rtw_netdev_priv(dev); struct mlme_priv *pmlmepriv = &padapter->mlmepriv; @@ -724,26 +725,29 @@ static int rtw_cfg80211_set_encryption(struct net_device *dev, struct ieee_param goto exit; }
- if (param->sta_addr[0] == 0xff && param->sta_addr[1] == 0xff && - param->sta_addr[2] == 0xff && param->sta_addr[3] == 0xff && - param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) { - if (param->u.crypt.idx >= WEP_KEYS - || param->u.crypt.idx >= BIP_MAX_KEYID) { - ret = -EINVAL; - goto exit; - } - } else { - { + if (param->sta_addr[0] != 0xff || param->sta_addr[1] != 0xff || + param->sta_addr[2] != 0xff || param->sta_addr[3] != 0xff || + param->sta_addr[4] != 0xff || param->sta_addr[5] != 0xff) { ret = -EINVAL; goto exit; } + + if (strcmp(param->u.crypt.alg, "WEP") == 0) + max_idx = WEP_KEYS - 1; + else + max_idx = BIP_MAX_KEYID; + + if (param->u.crypt.idx > max_idx) { + netdev_err(dev, "Error crypt.idx %d > %d\n", param->u.crypt.idx, max_idx); + ret = -EINVAL; + goto exit; }
if (strcmp(param->u.crypt.alg, "WEP") == 0) { wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len;
- if ((wep_key_idx >= WEP_KEYS) || (wep_key_len <= 0)) { + if (wep_key_len <= 0) { ret = -EINVAL; goto exit; } diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c index f23ab3e103455..0d2cb3e7ea4df 100644 --- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c @@ -60,6 +60,7 @@ static int wpa_set_auth_algs(struct net_device *dev, u32 value) static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param, u32 param_len) { int ret = 0; + u8 max_idx; u32 wep_key_idx, wep_key_len, wep_total_len; struct ndis_802_11_wep *pwep = NULL; struct adapter *padapter = rtw_netdev_priv(dev); @@ -74,19 +75,22 @@ static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param, goto exit; }
- if (param->sta_addr[0] == 0xff && param->sta_addr[1] == 0xff && - param->sta_addr[2] == 0xff && param->sta_addr[3] == 0xff && - param->sta_addr[4] == 0xff && param->sta_addr[5] == 0xff) { - if (param->u.crypt.idx >= WEP_KEYS || - param->u.crypt.idx >= BIP_MAX_KEYID) { - ret = -EINVAL; - goto exit; - } - } else { - { - ret = -EINVAL; - goto exit; - } + if (param->sta_addr[0] != 0xff || param->sta_addr[1] != 0xff || + param->sta_addr[2] != 0xff || param->sta_addr[3] != 0xff || + param->sta_addr[4] != 0xff || param->sta_addr[5] != 0xff) { + ret = -EINVAL; + goto exit; + } + + if (strcmp(param->u.crypt.alg, "WEP") == 0) + max_idx = WEP_KEYS - 1; + else + max_idx = BIP_MAX_KEYID; + + if (param->u.crypt.idx > max_idx) { + netdev_err(dev, "Error crypt.idx %d > %d\n", param->u.crypt.idx, max_idx); + ret = -EINVAL; + goto exit; }
if (strcmp(param->u.crypt.alg, "WEP") == 0) { @@ -98,9 +102,6 @@ static int wpa_set_encryption(struct net_device *dev, struct ieee_param *param, wep_key_idx = param->u.crypt.idx; wep_key_len = param->u.crypt.key_len;
- if (wep_key_idx > WEP_KEYS) - return -EINVAL; - if (wep_key_len > 0) { wep_key_len = wep_key_len <= 5 ? 5 : 13; wep_total_len = wep_key_len + FIELD_OFFSET(struct ndis_802_11_wep, key_material);
From: David Disseldorp ddiss@suse.de
[ Upstream commit 03e1d60e177eedbd302b77af4ea5e21b5a7ade31 ]
The watch_queue_set_size() allocation error paths return the ret value set via the prior pipe_resize_ring() call, which will always be zero.
As a result, IOC_WATCH_QUEUE_SET_SIZE callers such as "keyctl watch" fail to detect kernel wqueue->notes allocation failures and proceed to KEYCTL_WATCH_KEY, with any notifications subsequently lost.
Fixes: c73be61cede58 ("pipe: Add general notification queue support") Signed-off-by: David Disseldorp ddiss@suse.de Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/watch_queue.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c index 1059ef6c3711a..54cbaa9711398 100644 --- a/kernel/watch_queue.c +++ b/kernel/watch_queue.c @@ -274,6 +274,7 @@ long watch_queue_set_size(struct pipe_inode_info *pipe, unsigned int nr_notes) if (ret < 0) goto error;
+ ret = -ENOMEM; pages = kcalloc(sizeof(struct page *), nr_pages, GFP_KERNEL); if (!pages) goto error;
From: Morten Linderud morten@linderud.pw
[ Upstream commit 80a6c216b16d7f5c584d2148c2e4345ea4eb06ce ]
tpm_read_log_acpi() should return -ENODEV when no eventlog from the ACPI table is found. If the firmware vendor includes an invalid log address we are unable to map from the ACPI memory and tpm_read_log() returns -EIO which would abort discovery of the eventlog.
Change the return value from -EIO to -ENODEV when acpi_os_map_iomem() fails to map the event log.
The following hardware was used to test this issue: Framework Laptop (Pre-production) BIOS: INSYDE Corp, Revision: 3.2 TPM Device: NTC, Firmware Revision: 7.2
Dump of the faulty ACPI TPM2 table: [000h 0000 4] Signature : "TPM2" [Trusted Platform Module hardware interface Table] [004h 0004 4] Table Length : 0000004C [008h 0008 1] Revision : 04 [009h 0009 1] Checksum : 2B [00Ah 0010 6] Oem ID : "INSYDE" [010h 0016 8] Oem Table ID : "TGL-ULT" [018h 0024 4] Oem Revision : 00000002 [01Ch 0028 4] Asl Compiler ID : "ACPI" [020h 0032 4] Asl Compiler Revision : 00040000
[024h 0036 2] Platform Class : 0000 [026h 0038 2] Reserved : 0000 [028h 0040 8] Control Address : 0000000000000000 [030h 0048 4] Start Method : 06 [Memory Mapped I/O]
[034h 0052 12] Method Parameters : 00 00 00 00 00 00 00 00 00 00 00 00 [040h 0064 4] Minimum Log Length : 00010000 [044h 0068 8] Log Address : 000000004053D000
Fixes: 0cf577a03f21 ("tpm: Fix handling of missing event log") Tested-by: Erkki Eilonen erkki@bearmetal.eu Signed-off-by: Morten Linderud morten@linderud.pw Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/eventlog/acpi.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/char/tpm/eventlog/acpi.c b/drivers/char/tpm/eventlog/acpi.c index 0913d3eb8d518..cd266021d0103 100644 --- a/drivers/char/tpm/eventlog/acpi.c +++ b/drivers/char/tpm/eventlog/acpi.c @@ -143,8 +143,12 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
ret = -EIO; virt = acpi_os_map_iomem(start, len); - if (!virt) + if (!virt) { + dev_warn(&chip->dev, "%s: Failed to map ACPI memory\n", __func__); + /* try EFI log next */ + ret = -ENODEV; goto err; + }
memcpy_fromio(log->bios_event_log, virt, len);
From: Darrick J. Wong djwong@kernel.org
commit e014f37db1a2d109afa750042ac4d69cf3e3d88e upsream.
Filipe Manana pointed out that XFS' behavior w.r.t. setuid/setgid revocation isn't consistent with btrfs[1] or ext4. Those two filesystems use the VFS function setattr_copy to convey certain attributes from struct iattr into the VFS inode structure.
Andrey Zhadchenko reported[2] that XFS uses the wrong user namespace to decide if it should clear setgid and setuid on a file attribute update. This is a second symptom of the problem that Filipe noticed.
XFS, on the other hand, open-codes setattr_copy in xfs_setattr_mode, xfs_setattr_nonsize, and xfs_setattr_time. Regrettably, setattr_copy is /not/ a simple copy function; it contains additional logic to clear the setgid bit when setting the mode, and XFS' version no longer matches.
The VFS implements its own setuid/setgid stripping logic, which establishes consistent behavior. It's a tad unfortunate that it's scattered across notify_change, should_remove_suid, and setattr_copy but XFS should really follow the Linux VFS. Adapt XFS to use the VFS functions and get rid of the old functions.
[1] https://lore.kernel.org/fstests/CAL3q7H47iNQ=Wmk83WcGB-KBJVOEtR9+qGczzCeXJ9Y... [2] https://lore.kernel.org/linux-xfs/20220221182218.748084-1-andrey.zhadchenko@...
Fixes: 7fa294c8991c ("userns: Allow chown and setgid preservation") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_iops.c | 56 +++-------------------------------------------- fs/xfs/xfs_pnfs.c | 3 ++- 2 files changed, 5 insertions(+), 54 deletions(-)
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index a607d6aca5c4d..1eb71275e5b09 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -634,37 +634,6 @@ xfs_vn_getattr( return 0; }
-static void -xfs_setattr_mode( - struct xfs_inode *ip, - struct iattr *iattr) -{ - struct inode *inode = VFS_I(ip); - umode_t mode = iattr->ia_mode; - - ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - - inode->i_mode &= S_IFMT; - inode->i_mode |= mode & ~S_IFMT; -} - -void -xfs_setattr_time( - struct xfs_inode *ip, - struct iattr *iattr) -{ - struct inode *inode = VFS_I(ip); - - ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - - if (iattr->ia_valid & ATTR_ATIME) - inode->i_atime = iattr->ia_atime; - if (iattr->ia_valid & ATTR_CTIME) - inode->i_ctime = iattr->ia_ctime; - if (iattr->ia_valid & ATTR_MTIME) - inode->i_mtime = iattr->ia_mtime; -} - static int xfs_vn_change_ok( struct user_namespace *mnt_userns, @@ -763,16 +732,6 @@ xfs_setattr_nonsize( gid = (mask & ATTR_GID) ? iattr->ia_gid : igid; uid = (mask & ATTR_UID) ? iattr->ia_uid : iuid;
- /* - * CAP_FSETID overrides the following restrictions: - * - * The set-user-ID and set-group-ID bits of a file will be - * cleared upon successful return from chown() - */ - if ((inode->i_mode & (S_ISUID|S_ISGID)) && - !capable(CAP_FSETID)) - inode->i_mode &= ~(S_ISUID|S_ISGID); - /* * Change the ownerships and register quota modifications * in the transaction. @@ -784,7 +743,6 @@ xfs_setattr_nonsize( olddquot1 = xfs_qm_vop_chown(tp, ip, &ip->i_udquot, udqp); } - inode->i_uid = uid; } if (!gid_eq(igid, gid)) { if (XFS_IS_GQUOTA_ON(mp)) { @@ -795,15 +753,10 @@ xfs_setattr_nonsize( olddquot2 = xfs_qm_vop_chown(tp, ip, &ip->i_gdquot, gdqp); } - inode->i_gid = gid; } }
- if (mask & ATTR_MODE) - xfs_setattr_mode(ip, iattr); - if (mask & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME)) - xfs_setattr_time(ip, iattr); - + setattr_copy(mnt_userns, inode, iattr); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
XFS_STATS_INC(mp, xs_ig_attrchg); @@ -1028,11 +981,8 @@ xfs_setattr_size( xfs_inode_clear_eofblocks_tag(ip); }
- if (iattr->ia_valid & ATTR_MODE) - xfs_setattr_mode(ip, iattr); - if (iattr->ia_valid & (ATTR_ATIME|ATTR_CTIME|ATTR_MTIME)) - xfs_setattr_time(ip, iattr); - + ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID))); + setattr_copy(mnt_userns, inode, iattr); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
XFS_STATS_INC(mp, xs_ig_attrchg); diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c index 5e1d29d8b2e73..8865f7d4404ae 100644 --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -283,7 +283,8 @@ xfs_fs_commit_blocks( xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL); xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
- xfs_setattr_time(ip, iattr); + ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID))); + setattr_copy(&init_user_ns, inode, iattr); if (update_isize) { i_size_write(inode, iattr->ia_size); ip->i_disk_size = iattr->ia_size;
From: Dave Chinner dchinner@redhat.com
commit 472c6e46f589c26057596dcba160712a5b3e02c5 upstream.
[partial backport for dependency - xfs_ioc_space() still uses XFS_PREALLOC_SYNC]
Callers can acheive the same thing by calling xfs_log_force_inode() after making their modifications. There is no need for xfs_update_prealloc_flags() to do this.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_file.c | 13 +++++++------ fs/xfs/xfs_pnfs.c | 6 ++++-- 2 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 240eb932c014b..752b676c92e3f 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -95,8 +95,6 @@ xfs_update_prealloc_flags( ip->i_diflags &= ~XFS_DIFLAG_PREALLOC;
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); - if (flags & XFS_PREALLOC_SYNC) - xfs_trans_set_sync(tp); return xfs_trans_commit(tp); }
@@ -1059,9 +1057,6 @@ xfs_file_fallocate( } }
- if (file->f_flags & O_DSYNC) - flags |= XFS_PREALLOC_SYNC; - error = xfs_update_prealloc_flags(ip, flags); if (error) goto out_unlock; @@ -1084,8 +1079,14 @@ xfs_file_fallocate( * leave shifted extents past EOF and hence losing access to * the data that is contained within them. */ - if (do_file_insert) + if (do_file_insert) { error = xfs_insert_file_space(ip, offset, len); + if (error) + goto out_unlock; + } + + if (file->f_flags & O_DSYNC) + error = xfs_log_force_inode(ip);
out_unlock: xfs_iunlock(ip, iolock); diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c index 8865f7d4404ae..3a82a13d880c2 100644 --- a/fs/xfs/xfs_pnfs.c +++ b/fs/xfs/xfs_pnfs.c @@ -164,10 +164,12 @@ xfs_fs_map_blocks( * that the blocks allocated and handed out to the client are * guaranteed to be present even after a server crash. */ - error = xfs_update_prealloc_flags(ip, - XFS_PREALLOC_SET | XFS_PREALLOC_SYNC); + error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); + if (!error) + error = xfs_log_force_inode(ip); if (error) goto out_unlock; + } else { xfs_iunlock(ip, lock_flags); }
From: Dave Chinner dchinner@redhat.com
commit fbe7e520036583a783b13ff9744e35c2a329d9a4 upsream.
In XFS, we always update the inode change and modification time when any fallocate() operation succeeds. Furthermore, as various fallocate modes can change the file contents (extending EOF, punching holes, zeroing things, shifting extents), we should drop file privileges like suid just like we do for a regular write(). There's already a VFS helper that figures all this out for us, so use that.
The net effect of this is that we no longer drop suid/sgid if the caller is root, but we also now drop file capabilities.
We also move the xfs_update_prealloc_flags() function so that it now is only called by the scope that needs to set the the prealloc flag.
Based on a patch from Darrick Wong.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_file.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 752b676c92e3f..020e0a412287a 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -954,6 +954,10 @@ xfs_file_fallocate( goto out_unlock; }
+ error = file_modified(file); + if (error) + goto out_unlock; + if (mode & FALLOC_FL_PUNCH_HOLE) { error = xfs_free_file_space(ip, offset, len); if (error) @@ -1055,11 +1059,12 @@ xfs_file_fallocate( if (error) goto out_unlock; } - }
- error = xfs_update_prealloc_flags(ip, flags); - if (error) - goto out_unlock; + error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); + if (error) + goto out_unlock; + + }
/* Change file size if needed */ if (new_size) {
From: Dave Chinner dchinner@redhat.com
commit 0b02c8c0d75a738c98c35f02efb36217c170d78c upsream.
Now that we only call xfs_update_prealloc_flags() from xfs_file_fallocate() in the case where we need to set the preallocation flag, do this in xfs_alloc_file_space() where we already have the inode joined into a transaction and get rid of the call to xfs_update_prealloc_flags() from the fallocate code.
This also means that we now correctly avoid setting the XFS_DIFLAG_PREALLOC flag when xfs_is_always_cow_inode() is true, as these inodes will never have preallocated extents.
Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_bmap_util.c | 9 +++------ fs/xfs/xfs_file.c | 8 -------- 2 files changed, 3 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 73a36b7be3bd1..fd2ad6a3019ca 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -851,9 +851,6 @@ xfs_alloc_file_space( rblocks = 0; }
- /* - * Allocate and setup the transaction. - */ error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, dblocks, rblocks, false, &tp); if (error) @@ -870,9 +867,9 @@ xfs_alloc_file_space( if (error) goto error;
- /* - * Complete the transaction - */ + ip->i_diflags |= XFS_DIFLAG_PREALLOC; + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); + error = xfs_trans_commit(tp); xfs_iunlock(ip, XFS_ILOCK_EXCL); if (error) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 020e0a412287a..8cd0c3df253f9 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -909,7 +909,6 @@ xfs_file_fallocate( struct inode *inode = file_inode(file); struct xfs_inode *ip = XFS_I(inode); long error; - enum xfs_prealloc_flags flags = 0; uint iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL; loff_t new_size = 0; bool do_file_insert = false; @@ -1007,8 +1006,6 @@ xfs_file_fallocate( } do_file_insert = true; } else { - flags |= XFS_PREALLOC_SET; - if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > i_size_read(inode)) { new_size = offset + len; @@ -1059,11 +1056,6 @@ xfs_file_fallocate( if (error) goto out_unlock; } - - error = xfs_update_prealloc_flags(ip, XFS_PREALLOC_SET); - if (error) - goto out_unlock; - }
/* Change file size if needed */
From: Yang Xu xuyang2018.jy@fujitsu.com
commit 2b3416ceff5e6bd4922f6d1c61fb68113dd82302 upsream.
Add a dedicated helper to handle the setgid bit when creating a new file in a setgid directory. This is a preparatory patch for moving setgid stripping into the vfs. The patch contains no functional changes.
Currently the setgid stripping logic is open-coded directly in inode_init_owner() and the individual filesystems are responsible for handling setgid inheritance. Since this has proven to be brittle as evidenced by old issues we uncovered over the last months (see [1] to [3] below) we will try to move this logic into the vfs.
Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [1] Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [2] Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [3] Link: https://lore.kernel.org/r/1657779088-2242-1-git-send-email-xuyang2018.jy@fuj... Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christian Brauner (Microsoft) brauner@kernel.org Reviewed-and-Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/inode.c | 36 ++++++++++++++++++++++++++++++++---- include/linux/fs.h | 2 ++ 2 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c index 8279c700a2b7f..3740102c9bd5e 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2165,10 +2165,8 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode, /* Directories are special, and always inherit S_ISGID */ if (S_ISDIR(mode)) mode |= S_ISGID; - else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) && - !in_group_p(i_gid_into_mnt(mnt_userns, dir)) && - !capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID)) - mode &= ~S_ISGID; + else + mode = mode_strip_sgid(mnt_userns, dir, mode); } else inode_fsgid_set(inode, mnt_userns); inode->i_mode = mode; @@ -2324,3 +2322,33 @@ struct timespec64 current_time(struct inode *inode) return timestamp_truncate(now, inode); } EXPORT_SYMBOL(current_time); + +/** + * mode_strip_sgid - handle the sgid bit for non-directories + * @mnt_userns: User namespace of the mount the inode was created from + * @dir: parent directory inode + * @mode: mode of the file to be created in @dir + * + * If the @mode of the new file has both the S_ISGID and S_IXGRP bit + * raised and @dir has the S_ISGID bit raised ensure that the caller is + * either in the group of the parent directory or they have CAP_FSETID + * in their user namespace and are privileged over the parent directory. + * In all other cases, strip the S_ISGID bit from @mode. + * + * Return: the new mode to use for the file + */ +umode_t mode_strip_sgid(struct user_namespace *mnt_userns, + const struct inode *dir, umode_t mode) +{ + if ((mode & (S_ISGID | S_IXGRP)) != (S_ISGID | S_IXGRP)) + return mode; + if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID)) + return mode; + if (in_group_p(i_gid_into_mnt(mnt_userns, dir))) + return mode; + if (capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID)) + return mode; + + return mode & ~S_ISGID; +} +EXPORT_SYMBOL(mode_strip_sgid); diff --git a/include/linux/fs.h b/include/linux/fs.h index 1e1ac116dd136..be9be4a7216c7 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1941,6 +1941,8 @@ extern long compat_ptr_ioctl(struct file *file, unsigned int cmd, void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode, const struct inode *dir, umode_t mode); extern bool may_open_dev(const struct path *path); +umode_t mode_strip_sgid(struct user_namespace *mnt_userns, + const struct inode *dir, umode_t mode);
/* * This is the "filldir" function type, used by readdir() to let
From: Yang Xu xuyang2018.jy@fujitsu.com
commit 1639a49ccdce58ea248841ed9b23babcce6dbb0b upsream.
Move setgid handling out of individual filesystems and into the VFS itself to stop the proliferation of setgid inheritance bugs.
Creating files that have both the S_IXGRP and S_ISGID bit raised in directories that themselves have the S_ISGID bit set requires additional privileges to avoid security issues.
When a filesystem creates a new inode it needs to take care that the caller is either in the group of the newly created inode or they have CAP_FSETID in their current user namespace and are privileged over the parent directory of the new inode. If any of these two conditions is true then the S_ISGID bit can be raised for an S_IXGRP file and if not it needs to be stripped.
However, there are several key issues with the current implementation:
* S_ISGID stripping logic is entangled with umask stripping.
If a filesystem doesn't support or enable POSIX ACLs then umask stripping is done directly in the vfs before calling into the filesystem. If the filesystem does support POSIX ACLs then unmask stripping may be done in the filesystem itself when calling posix_acl_create().
Since umask stripping has an effect on S_ISGID inheritance, e.g., by stripping the S_IXGRP bit from the file to be created and all relevant filesystems have to call posix_acl_create() before inode_init_owner() where we currently take care of S_ISGID handling S_ISGID handling is order dependent. IOW, whether or not you get a setgid bit depends on POSIX ACLs and umask and in what order they are called.
Note that technically filesystems are free to impose their own ordering between posix_acl_create() and inode_init_owner() meaning that there's additional ordering issues that influence S_SIGID inheritance.
* Filesystems that don't rely on inode_init_owner() don't get S_ISGID stripping logic.
While that may be intentional (e.g. network filesystems might just defer setgid stripping to a server) it is often just a security issue.
This is not just ugly it's unsustainably messy especially since we do still have bugs in this area years after the initial round of setgid bugfixes.
So the current state is quite messy and while we won't be able to make it completely clean as posix_acl_create() is still a filesystem specific call we can improve the S_SIGD stripping situation quite a bit by hoisting it out of inode_init_owner() and into the vfs creation operations. This means we alleviate the burden for filesystems to handle S_ISGID stripping correctly and can standardize the ordering between S_ISGID and umask stripping in the vfs.
We add a new helper vfs_prepare_mode() so S_ISGID handling is now done in the VFS before umask handling. This has S_ISGID handling is unaffected unaffected by whether umask stripping is done by the VFS itself (if no POSIX ACLs are supported or enabled) or in the filesystem in posix_acl_create() (if POSIX ACLs are supported).
The vfs_prepare_mode() helper is called directly in vfs_*() helpers that create new filesystem objects. We need to move them into there to make sure that filesystems like overlayfs hat have callchains like:
sys_mknod() -> do_mknodat(mode) -> .mknod = ovl_mknod(mode) -> ovl_create(mode) -> vfs_mknod(mode)
get S_ISGID stripping done when calling into lower filesystems via vfs_*() creation helpers. Moving vfs_prepare_mode() into e.g. vfs_mknod() takes care of that. This is in any case semantically cleaner because S_ISGID stripping is VFS security requirement.
Security hooks so far have seen the mode with the umask applied but without S_ISGID handling done. The relevant hooks are called outside of vfs_*() creation helpers so by calling vfs_prepare_mode() from vfs_*() helpers the security hooks would now see the mode without umask stripping applied. For now we fix this by passing the mode with umask settings applied to not risk any regressions for LSM hooks. IOW, nothing changes for LSM hooks. It is worth pointing out that security hooks never saw the mode that is seen by the filesystem when actually creating the file. They have always been completely misplaced for that to work.
The following filesystems use inode_init_owner() and thus relied on S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs, hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs, overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs, bpf, tmpfs.
All of the above filesystems end up calling inode_init_owner() when new filesystem objects are created through the ->mkdir(), ->mknod(), ->create(), ->tmpfile(), ->rename() inode operations.
Since directories always inherit the S_ISGID bit with the exception of xfs when irix_sgid_inherit mode is turned on S_ISGID stripping doesn't apply. The ->symlink() and ->link() inode operations trivially inherit the mode from the target and the ->rename() inode operation inherits the mode from the source inode. All other creation inode operations will get S_ISGID handling via vfs_prepare_mode() when called from their relevant vfs_*() helpers.
In addition to this there are filesystems which allow the creation of filesystem objects through ioctl()s or - in the case of spufs - circumventing the vfs in other ways. If filesystem objects are created through ioctl()s the vfs doesn't know about it and can't apply regular permission checking including S_ISGID logic. Therfore, a filesystem relying on S_ISGID stripping in inode_init_owner() in their ioctl() callpath will be affected by moving this logic into the vfs. We audited those filesystems:
* btrfs allows the creation of filesystem objects through various ioctls(). Snapshot creation literally takes a snapshot and so the mode is fully preserved and S_ISGID stripping doesn't apply.
Creating a new subvolum relies on inode_init_owner() in btrfs_new_subvol_inode() but only creates directories and doesn't raise S_ISGID.
* ocfs2 has a peculiar implementation of reflinks. In contrast to e.g. xfs and btrfs FICLONE/FICLONERANGE ioctl() that is only concerned with the actual extents ocfs2 uses a separate ioctl() that also creates the target file.
Iow, ocfs2 circumvents the vfs entirely here and did indeed rely on inode_init_owner() to strip the S_ISGID bit. This is the only place where a filesystem needs to call mode_strip_sgid() directly but this is self-inflicted pain.
* spufs doesn't go through the vfs at all and doesn't use ioctl()s either. Instead it has a dedicated system call spufs_create() which allows the creation of filesystem objects. But spufs only creates directories and doesn't allo S_SIGID bits, i.e. it specifically only allows 0777 bits.
* bpf uses vfs_mkobj() but also doesn't allow S_ISGID bits to be created.
The patch will have an effect on ext2 when the EXT2_MOUNT_GRPID mount option is used, on ext4 when the EXT4_MOUNT_GRPID mount option is used, and on xfs when the XFS_FEAT_GRPID mount option is used. When any of these filesystems are mounted with their respective GRPID option then newly created files inherit the parent directories group unconditionally. In these cases non of the filesystems call inode_init_owner() and thus did never strip the S_ISGID bit for newly created files. Moving this logic into the VFS means that they now get the S_ISGID bit stripped. This is a user visible change. If this leads to regressions we will either need to figure out a better way or we need to revert. However, given the various setgid bugs that we found just in the last two years this is a regression risk we should take.
Associated with this change is a new set of fstests to enforce the semantics for all new filesystems.
Link: https://lore.kernel.org/ceph-devel/20220427092201.wvsdjbnc7b4dttaw@wittgenst... [1] Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [2] Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [3] Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [4] Link: https://lore.kernel.org/r/1657779088-2242-3-git-send-email-xuyang2018.jy@fuj... Suggested-by: Dave Chinner david@fromorbit.com Suggested-by: Christian Brauner (Microsoft) brauner@kernel.org Reviewed-by: Darrick J. Wong djwong@kernel.org Reviewed-and-Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Yang Xu xuyang2018.jy@fujitsu.com [brauner@kernel.org: rewrote commit message] Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/inode.c | 2 -- fs/namei.c | 82 ++++++++++++++++++++++++++++++++++++++++-------- fs/ocfs2/namei.c | 1 + 3 files changed, 70 insertions(+), 15 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c index 3740102c9bd5e..957b2d18ec29f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2165,8 +2165,6 @@ void inode_init_owner(struct user_namespace *mnt_userns, struct inode *inode, /* Directories are special, and always inherit S_ISGID */ if (S_ISDIR(mode)) mode |= S_ISGID; - else - mode = mode_strip_sgid(mnt_userns, dir, mode); } else inode_fsgid_set(inode, mnt_userns); inode->i_mode = mode; diff --git a/fs/namei.c b/fs/namei.c index 81b31d9a063f2..02e99606c65be 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3000,6 +3000,65 @@ void unlock_rename(struct dentry *p1, struct dentry *p2) } EXPORT_SYMBOL(unlock_rename);
+/** + * mode_strip_umask - handle vfs umask stripping + * @dir: parent directory of the new inode + * @mode: mode of the new inode to be created in @dir + * + * Umask stripping depends on whether or not the filesystem supports POSIX + * ACLs. If the filesystem doesn't support it umask stripping is done directly + * in here. If the filesystem does support POSIX ACLs umask stripping is + * deferred until the filesystem calls posix_acl_create(). + * + * Returns: mode + */ +static inline umode_t mode_strip_umask(const struct inode *dir, umode_t mode) +{ + if (!IS_POSIXACL(dir)) + mode &= ~current_umask(); + return mode; +} + +/** + * vfs_prepare_mode - prepare the mode to be used for a new inode + * @mnt_userns: user namespace of the mount the inode was found from + * @dir: parent directory of the new inode + * @mode: mode of the new inode + * @mask_perms: allowed permission by the vfs + * @type: type of file to be created + * + * This helper consolidates and enforces vfs restrictions on the @mode of a new + * object to be created. + * + * Umask stripping depends on whether the filesystem supports POSIX ACLs (see + * the kernel documentation for mode_strip_umask()). Moving umask stripping + * after setgid stripping allows the same ordering for both non-POSIX ACL and + * POSIX ACL supporting filesystems. + * + * Note that it's currently valid for @type to be 0 if a directory is created. + * Filesystems raise that flag individually and we need to check whether each + * filesystem can deal with receiving S_IFDIR from the vfs before we enforce a + * non-zero type. + * + * Returns: mode to be passed to the filesystem + */ +static inline umode_t vfs_prepare_mode(struct user_namespace *mnt_userns, + const struct inode *dir, umode_t mode, + umode_t mask_perms, umode_t type) +{ + mode = mode_strip_sgid(mnt_userns, dir, mode); + mode = mode_strip_umask(dir, mode); + + /* + * Apply the vfs mandated allowed permission mask and set the type of + * file to be created before we call into the filesystem. + */ + mode &= (mask_perms & ~S_IFMT); + mode |= (type & S_IFMT); + + return mode; +} + /** * vfs_create - create new file * @mnt_userns: user namespace of the mount the inode was found from @@ -3025,8 +3084,8 @@ int vfs_create(struct user_namespace *mnt_userns, struct inode *dir,
if (!dir->i_op->create) return -EACCES; /* shouldn't it be ENOSYS? */ - mode &= S_IALLUGO; - mode |= S_IFREG; + + mode = vfs_prepare_mode(mnt_userns, dir, mode, S_IALLUGO, S_IFREG); error = security_inode_create(dir, dentry, mode); if (error) return error; @@ -3291,8 +3350,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, if (open_flag & O_CREAT) { if (open_flag & O_EXCL) open_flag &= ~O_TRUNC; - if (!IS_POSIXACL(dir->d_inode)) - mode &= ~current_umask(); + mode = vfs_prepare_mode(mnt_userns, dir->d_inode, mode, mode, mode); if (likely(got_write)) create_error = may_o_create(mnt_userns, &nd->path, dentry, mode); @@ -3525,8 +3583,7 @@ struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns, child = d_alloc(dentry, &slash_name); if (unlikely(!child)) goto out_err; - if (!IS_POSIXACL(dir)) - mode &= ~current_umask(); + mode = vfs_prepare_mode(mnt_userns, dir, mode, mode, mode); error = dir->i_op->tmpfile(mnt_userns, dir, child, mode); if (error) goto out_err; @@ -3804,6 +3861,7 @@ int vfs_mknod(struct user_namespace *mnt_userns, struct inode *dir, if (!dir->i_op->mknod) return -EPERM;
+ mode = vfs_prepare_mode(mnt_userns, dir, mode, mode, mode); error = devcgroup_inode_mknod(mode, dev); if (error) return error; @@ -3854,9 +3912,8 @@ static int do_mknodat(int dfd, struct filename *name, umode_t mode, if (IS_ERR(dentry)) goto out1;
- if (!IS_POSIXACL(path.dentry->d_inode)) - mode &= ~current_umask(); - error = security_path_mknod(&path, dentry, mode, dev); + error = security_path_mknod(&path, dentry, + mode_strip_umask(path.dentry->d_inode, mode), dev); if (error) goto out2;
@@ -3926,7 +3983,7 @@ int vfs_mkdir(struct user_namespace *mnt_userns, struct inode *dir, if (!dir->i_op->mkdir) return -EPERM;
- mode &= (S_IRWXUGO|S_ISVTX); + mode = vfs_prepare_mode(mnt_userns, dir, mode, S_IRWXUGO | S_ISVTX, 0); error = security_inode_mkdir(dir, dentry, mode); if (error) return error; @@ -3954,9 +4011,8 @@ int do_mkdirat(int dfd, struct filename *name, umode_t mode) if (IS_ERR(dentry)) goto out_putname;
- if (!IS_POSIXACL(path.dentry->d_inode)) - mode &= ~current_umask(); - error = security_path_mkdir(&path, dentry, mode); + error = security_path_mkdir(&path, dentry, + mode_strip_umask(path.dentry->d_inode, mode)); if (!error) { struct user_namespace *mnt_userns; mnt_userns = mnt_user_ns(path.mnt); diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index 11807034dd483..5b8237ceb8cce 100644 --- a/fs/ocfs2/namei.c +++ b/fs/ocfs2/namei.c @@ -197,6 +197,7 @@ static struct inode *ocfs2_get_init_inode(struct inode *dir, umode_t mode) * callers. */ if (S_ISDIR(mode)) set_nlink(inode, 2); + mode = mode_strip_sgid(&init_user_ns, dir, mode); inode_init_owner(&init_user_ns, inode, dir, mode); status = dquot_initialize(inode); if (status)
From: Christian Brauner brauner@kernel.org
commit 11c2a8700cdcabf9b639b7204a1e38e2a0b6798e upstream.
[backport to 5.15.y, prior to vfsgid_t]
In setattr_{copy,prepare}() we need to perform the same permission checks to determine whether we need to drop the setgid bit or not. Instead of open-coding it twice add a simple helper the encapsulates the logic. We will reuse this helpers to make dropping the setgid bit during write operations more consistent in a follow up patch.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/attr.c | 8 ++++---- fs/inode.c | 28 ++++++++++++++++++++++++---- fs/internal.h | 2 ++ 3 files changed, 30 insertions(+), 8 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c index f581c4d008971..686840aa91c8b 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -18,6 +18,8 @@ #include <linux/evm.h> #include <linux/ima.h>
+#include "internal.h" + /** * chown_ok - verify permissions to chown inode * @mnt_userns: user namespace of the mount @inode was found from @@ -141,8 +143,7 @@ int setattr_prepare(struct user_namespace *mnt_userns, struct dentry *dentry, mapped_gid = i_gid_into_mnt(mnt_userns, inode);
/* Also check the setgid bit! */ - if (!in_group_p(mapped_gid) && - !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) + if (!in_group_or_capable(mnt_userns, inode, mapped_gid)) attr->ia_mode &= ~S_ISGID; }
@@ -257,8 +258,7 @@ void setattr_copy(struct user_namespace *mnt_userns, struct inode *inode, if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode; kgid_t kgid = i_gid_into_mnt(mnt_userns, inode); - if (!in_group_p(kgid) && - !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) + if (!in_group_or_capable(mnt_userns, inode, kgid)) mode &= ~S_ISGID; inode->i_mode = mode; } diff --git a/fs/inode.c b/fs/inode.c index 957b2d18ec29f..a71fb82279bb1 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2321,6 +2321,28 @@ struct timespec64 current_time(struct inode *inode) } EXPORT_SYMBOL(current_time);
+/** + * in_group_or_capable - check whether caller is CAP_FSETID privileged + * @mnt_userns: user namespace of the mount @inode was found from + * @inode: inode to check + * @gid: the new/current gid of @inode + * + * Check wether @gid is in the caller's group list or if the caller is + * privileged with CAP_FSETID over @inode. This can be used to determine + * whether the setgid bit can be kept or must be dropped. + * + * Return: true if the caller is sufficiently privileged, false if not. + */ +bool in_group_or_capable(struct user_namespace *mnt_userns, + const struct inode *inode, kgid_t gid) +{ + if (in_group_p(gid)) + return true; + if (capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) + return true; + return false; +} + /** * mode_strip_sgid - handle the sgid bit for non-directories * @mnt_userns: User namespace of the mount the inode was created from @@ -2342,11 +2364,9 @@ umode_t mode_strip_sgid(struct user_namespace *mnt_userns, return mode; if (S_ISDIR(mode) || !dir || !(dir->i_mode & S_ISGID)) return mode; - if (in_group_p(i_gid_into_mnt(mnt_userns, dir))) - return mode; - if (capable_wrt_inode_uidgid(mnt_userns, dir, CAP_FSETID)) + if (in_group_or_capable(mnt_userns, dir, + i_gid_into_mnt(mnt_userns, dir))) return mode; - return mode & ~S_ISGID; } EXPORT_SYMBOL(mode_strip_sgid); diff --git a/fs/internal.h b/fs/internal.h index 9075490f21a62..c898147272817 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -150,6 +150,8 @@ extern int vfs_open(const struct path *, struct file *); extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc); extern void inode_add_lru(struct inode *inode); extern int dentry_needs_remove_privs(struct dentry *dentry); +bool in_group_or_capable(struct user_namespace *mnt_userns, + const struct inode *inode, kgid_t gid);
/* * fs-writeback.c
From: Christian Brauner brauner@kernel.org
commit e243e3f94c804ecca9a8241b5babe28f35258ef4 upstream.
Move the helper from inode.c to attr.c. This keeps the the core of the set{g,u}id stripping logic in one place when we add follow-up changes. It is the better place anyway, since should_remove_suid() returns ATTR_KILL_S{G,U}ID flags.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/attr.c | 29 +++++++++++++++++++++++++++++ fs/inode.c | 29 ----------------------------- 2 files changed, 29 insertions(+), 29 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c index 686840aa91c8b..f045431bab1ad 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -20,6 +20,35 @@
#include "internal.h"
+/* + * The logic we want is + * + * if suid or (sgid and xgrp) + * remove privs + */ +int should_remove_suid(struct dentry *dentry) +{ + umode_t mode = d_inode(dentry)->i_mode; + int kill = 0; + + /* suid always must be killed */ + if (unlikely(mode & S_ISUID)) + kill = ATTR_KILL_SUID; + + /* + * sgid without any exec bits is just a mandatory locking mark; leave + * it alone. If some exec bits are set, it's a real sgid; kill it. + */ + if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) + kill |= ATTR_KILL_SGID; + + if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) + return kill; + + return 0; +} +EXPORT_SYMBOL(should_remove_suid); + /** * chown_ok - verify permissions to chown inode * @mnt_userns: user namespace of the mount @inode was found from diff --git a/fs/inode.c b/fs/inode.c index a71fb82279bb1..3811269259e11 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1864,35 +1864,6 @@ void touch_atime(const struct path *path) } EXPORT_SYMBOL(touch_atime);
-/* - * The logic we want is - * - * if suid or (sgid and xgrp) - * remove privs - */ -int should_remove_suid(struct dentry *dentry) -{ - umode_t mode = d_inode(dentry)->i_mode; - int kill = 0; - - /* suid always must be killed */ - if (unlikely(mode & S_ISUID)) - kill = ATTR_KILL_SUID; - - /* - * sgid without any exec bits is just a mandatory locking mark; leave - * it alone. If some exec bits are set, it's a real sgid; kill it. - */ - if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) - kill |= ATTR_KILL_SGID; - - if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) - return kill; - - return 0; -} -EXPORT_SYMBOL(should_remove_suid); - /* * Return mask of changes for notify_change() that need to be done as a * response to write or truncate. Return 0 if nothing has to be changed.
From: Christian Brauner brauner@kernel.org
commit 72ae017c5451860443a16fb2a8c243bff3e396b8 upstream.
[backport to 5.15.y, prior to vfsgid_t]
The current setgid stripping logic during write and ownership change operations is inconsistent and strewn over multiple places. In order to consolidate it and make more consistent we'll add a new helper setattr_should_drop_sgid(). The function retains the old behavior where we remove the S_ISGID bit unconditionally when S_IXGRP is set but also when it isn't set and the caller is neither in the group of the inode nor privileged over the inode.
We will use this helper both in write operation permission removal such as file_remove_privs() as well as in ownership change operations.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/attr.c | 28 ++++++++++++++++++++++++++++ fs/internal.h | 6 ++++++ 2 files changed, 34 insertions(+)
diff --git a/fs/attr.c b/fs/attr.c index f045431bab1ad..965be68ed8fa0 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -20,6 +20,34 @@
#include "internal.h"
+/** + * setattr_should_drop_sgid - determine whether the setgid bit needs to be + * removed + * @mnt_userns: user namespace of the mount @inode was found from + * @inode: inode to check + * + * This function determines whether the setgid bit needs to be removed. + * We retain backwards compatibility and require setgid bit to be removed + * unconditionally if S_IXGRP is set. Otherwise we have the exact same + * requirements as setattr_prepare() and setattr_copy(). + * + * Return: ATTR_KILL_SGID if setgid bit needs to be removed, 0 otherwise. + */ +int setattr_should_drop_sgid(struct user_namespace *mnt_userns, + const struct inode *inode) +{ + umode_t mode = inode->i_mode; + + if (!(mode & S_ISGID)) + return 0; + if (mode & S_IXGRP) + return ATTR_KILL_SGID; + if (!in_group_or_capable(mnt_userns, inode, + i_gid_into_mnt(mnt_userns, inode))) + return ATTR_KILL_SGID; + return 0; +} + /* * The logic we want is * diff --git a/fs/internal.h b/fs/internal.h index c898147272817..45cf31d7380b8 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -231,3 +231,9 @@ struct xattr_ctx { int setxattr_copy(const char __user *name, struct xattr_ctx *ctx); int do_setxattr(struct user_namespace *mnt_userns, struct dentry *dentry, struct xattr_ctx *ctx); + +/* + * fs/attr.c + */ +int setattr_should_drop_sgid(struct user_namespace *mnt_userns, + const struct inode *inode);
From: Christian Brauner brauner@kernel.org
commit ed5a7047d2011cb6b2bf84ceb6680124cc6a7d95 upstream.
[backport to 5.15.y, prior to vfsgid_t]
Currently setgid stripping in file_remove_privs()'s should_remove_suid() helper is inconsistent with other parts of the vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups and the caller isn't privileged over the inode although we require this already in setattr_prepare() and setattr_copy() and so all filesystem implement this requirement implicitly because they have to use setattr_{prepare,copy}() anyway.
But the inconsistency shows up in setgid stripping bugs for overlayfs in xfstests (e.g., generic/673, generic/683, generic/685, generic/686, generic/687). For example, we test whether suid and setgid stripping works correctly when performing various write-like operations as an unprivileged user (fallocate, reflink, write, etc.):
echo "Test 1 - qa_user, non-exec file $verb" setup_testfile chmod a+rws $junk_file commit_and_check "$qa_user" "$verb" 64k 64k
The test basically creates a file with 6666 permissions. While the file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP set. On a regular filesystem like xfs what will happen is:
sys_fallocate() -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> setattr_copy()
In should_remove_suid() we can see that ATTR_KILL_SUID is raised unconditionally because the file in the test has S_ISUID set.
But we also see that ATTR_KILL_SGID won't be set because while the file is S_ISGID it is not S_IXGRP (see above) which is a condition for ATTR_KILL_SGID being raised.
So by the time we call notify_change() we have attr->ia_valid set to ATTR_KILL_SUID | ATTR_FORCE. Now notify_change() sees that ATTR_KILL_SUID is set and does:
ia_valid = attr->ia_valid |= ATTR_MODE attr->ia_mode = (inode->i_mode & ~S_ISUID);
which means that when we call setattr_copy() later we will definitely update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.
Now we call into the filesystem's ->setattr() inode operation which will end up calling setattr_copy(). Since ATTR_MODE is set we will hit:
if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode; vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode); if (!vfsgid_in_group_p(vfsgid) && !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID)) mode &= ~S_ISGID; inode->i_mode = mode; }
and since the caller in the test is neither capable nor in the group of the inode the S_ISGID bit is stripped.
But assume the file isn't suid then ATTR_KILL_SUID won't be raised which has the consequence that neither the setgid nor the suid bits are stripped even though it should be stripped because the inode isn't in the caller's groups and the caller isn't privileged over the inode.
If overlayfs is in the mix things become a bit more complicated and the bug shows up more clearly. When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be raised but because the check in notify_change() is questioning the ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped the S_ISGID bit isn't removed even though it should be stripped:
sys_fallocate() -> vfs_fallocate() -> ovl_fallocate() -> file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = ATTR_FORCE | kill; -> notify_change() -> ovl_setattr() // TAKE ON MOUNTER'S CREDS -> ovl_do_notify_change() -> notify_change() // GIVE UP MOUNTER'S CREDS // TAKE ON MOUNTER'S CREDS -> vfs_fallocate() -> xfs_file_fallocate() -> file_modified() -> __file_remove_privs() -> dentry_needs_remove_privs() -> should_remove_suid() -> __remove_privs() newattrs.ia_valid = attr_force | kill; -> notify_change()
The fix for all of this is to make file_remove_privs()'s should_remove_suid() helper to perform the same checks as we already require in setattr_prepare() and setattr_copy() and have notify_change() not pointlessly requiring S_IXGRP again. It doesn't make any sense in the first place because the caller must calculate the flags via should_remove_suid() anyway which would raise ATTR_KILL_SGID.
While we're at it we move should_remove_suid() from inode.c to attr.c where it belongs with the rest of the iattr helpers. Especially since it returns ATTR_KILL_S{G,U}ID flags. We also rename it to setattr_should_drop_suidgid() to better reflect that it indicates both setuid and setgid bit removal and also that it returns attr flags.
Running xfstests with this doesn't report any regressions. We should really try and use consistent checks.
Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/trace/ftrace.rst | 2 +- fs/attr.c | 33 +++++++++++++++++++-------------- fs/fuse/file.c | 2 +- fs/inode.c | 7 ++++--- fs/internal.h | 2 +- fs/ocfs2/file.c | 4 ++-- fs/open.c | 8 ++++---- include/linux/fs.h | 2 +- 8 files changed, 33 insertions(+), 27 deletions(-)
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 4e5b26f03d5b1..d036946bce7ab 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -2929,7 +2929,7 @@ Produces:: bash-1994 [000] .... 4342.324898: ima_get_action <-process_measurement bash-1994 [000] .... 4342.324898: ima_match_policy <-ima_get_action bash-1994 [000] .... 4342.324899: do_truncate <-do_last - bash-1994 [000] .... 4342.324899: should_remove_suid <-do_truncate + bash-1994 [000] .... 4342.324899: setattr_should_drop_suidgid <-do_truncate bash-1994 [000] .... 4342.324899: notify_change <-do_truncate bash-1994 [000] .... 4342.324900: current_fs_time <-notify_change bash-1994 [000] .... 4342.324900: current_kernel_time <-current_fs_time diff --git a/fs/attr.c b/fs/attr.c index 965be68ed8fa0..0ca14cbd4b8bb 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -48,34 +48,39 @@ int setattr_should_drop_sgid(struct user_namespace *mnt_userns, return 0; }
-/* - * The logic we want is +/** + * setattr_should_drop_suidgid - determine whether the set{g,u}id bit needs to + * be dropped + * @mnt_userns: user namespace of the mount @inode was found from + * @inode: inode to check * - * if suid or (sgid and xgrp) - * remove privs + * This function determines whether the set{g,u}id bits need to be removed. + * If the setuid bit needs to be removed ATTR_KILL_SUID is returned. If the + * setgid bit needs to be removed ATTR_KILL_SGID is returned. If both + * set{g,u}id bits need to be removed the corresponding mask of both flags is + * returned. + * + * Return: A mask of ATTR_KILL_S{G,U}ID indicating which - if any - setid bits + * to remove, 0 otherwise. */ -int should_remove_suid(struct dentry *dentry) +int setattr_should_drop_suidgid(struct user_namespace *mnt_userns, + struct inode *inode) { - umode_t mode = d_inode(dentry)->i_mode; + umode_t mode = inode->i_mode; int kill = 0;
/* suid always must be killed */ if (unlikely(mode & S_ISUID)) kill = ATTR_KILL_SUID;
- /* - * sgid without any exec bits is just a mandatory locking mark; leave - * it alone. If some exec bits are set, it's a real sgid; kill it. - */ - if (unlikely((mode & S_ISGID) && (mode & S_IXGRP))) - kill |= ATTR_KILL_SGID; + kill |= setattr_should_drop_sgid(mnt_userns, inode);
if (unlikely(kill && !capable(CAP_FSETID) && S_ISREG(mode))) return kill;
return 0; } -EXPORT_SYMBOL(should_remove_suid); +EXPORT_SYMBOL(setattr_should_drop_suidgid);
/** * chown_ok - verify permissions to chown inode @@ -440,7 +445,7 @@ int notify_change(struct user_namespace *mnt_userns, struct dentry *dentry, } } if (ia_valid & ATTR_KILL_SGID) { - if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) { + if (mode & S_ISGID) { if (!(ia_valid & ATTR_MODE)) { ia_valid = attr->ia_valid |= ATTR_MODE; attr->ia_mode = inode->i_mode; diff --git a/fs/fuse/file.c b/fs/fuse/file.c index cc95a1c376449..2b19d281351e5 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1295,7 +1295,7 @@ static ssize_t fuse_cache_write_iter(struct kiocb *iocb, struct iov_iter *from) return err;
if (fc->handle_killpriv_v2 && - should_remove_suid(file_dentry(file))) { + setattr_should_drop_suidgid(&init_user_ns, file_inode(file))) { goto writethrough; }
diff --git a/fs/inode.c b/fs/inode.c index 3811269259e11..079b64f9b7561 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1869,7 +1869,8 @@ EXPORT_SYMBOL(touch_atime); * response to write or truncate. Return 0 if nothing has to be changed. * Negative value on error (change should be denied). */ -int dentry_needs_remove_privs(struct dentry *dentry) +int dentry_needs_remove_privs(struct user_namespace *mnt_userns, + struct dentry *dentry) { struct inode *inode = d_inode(dentry); int mask = 0; @@ -1878,7 +1879,7 @@ int dentry_needs_remove_privs(struct dentry *dentry) if (IS_NOSEC(inode)) return 0;
- mask = should_remove_suid(dentry); + mask = setattr_should_drop_suidgid(mnt_userns, inode); ret = security_inode_need_killpriv(dentry); if (ret < 0) return ret; @@ -1920,7 +1921,7 @@ int file_remove_privs(struct file *file) if (IS_NOSEC(inode) || !S_ISREG(inode->i_mode)) return 0;
- kill = dentry_needs_remove_privs(dentry); + kill = dentry_needs_remove_privs(file_mnt_user_ns(file), dentry); if (kill < 0) return kill; if (kill) diff --git a/fs/internal.h b/fs/internal.h index 45cf31d7380b8..46df4ce58e87e 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -149,7 +149,7 @@ extern int vfs_open(const struct path *, struct file *); */ extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc); extern void inode_add_lru(struct inode *inode); -extern int dentry_needs_remove_privs(struct dentry *dentry); +int dentry_needs_remove_privs(struct user_namespace *, struct dentry *dentry); bool in_group_or_capable(struct user_namespace *mnt_userns, const struct inode *inode, kgid_t gid);
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index fc5f780fa2355..92182d4be247e 100644 --- a/fs/ocfs2/file.c +++ b/fs/ocfs2/file.c @@ -1994,7 +1994,7 @@ static int __ocfs2_change_file_space(struct file *file, struct inode *inode, } }
- if (file && should_remove_suid(file->f_path.dentry)) { + if (file && setattr_should_drop_suidgid(&init_user_ns, file_inode(file))) { ret = __ocfs2_write_remove_suid(inode, di_bh); if (ret) { mlog_errno(ret); @@ -2282,7 +2282,7 @@ static int ocfs2_prepare_inode_for_write(struct file *file, * inode. There's also the dinode i_size state which * can be lost via setattr during extending writes (we * set inode->i_size at the end of a write. */ - if (should_remove_suid(dentry)) { + if (setattr_should_drop_suidgid(&init_user_ns, inode)) { if (meta_level == 0) { ocfs2_inode_unlock_for_extent_tree(inode, &di_bh, diff --git a/fs/open.c b/fs/open.c index 5e322f188e839..e93c33069055b 100644 --- a/fs/open.c +++ b/fs/open.c @@ -54,7 +54,7 @@ int do_truncate(struct user_namespace *mnt_userns, struct dentry *dentry, }
/* Remove suid, sgid, and file capabilities on truncate too */ - ret = dentry_needs_remove_privs(dentry); + ret = dentry_needs_remove_privs(mnt_userns, dentry); if (ret < 0) return ret; if (ret) @@ -671,10 +671,10 @@ int chown_common(const struct path *path, uid_t user, gid_t group) newattrs.ia_valid |= ATTR_GID; newattrs.ia_gid = gid; } - if (!S_ISDIR(inode->i_mode)) - newattrs.ia_valid |= - ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV; inode_lock(inode); + if (!S_ISDIR(inode->i_mode)) + newattrs.ia_valid |= ATTR_KILL_SUID | ATTR_KILL_PRIV | + setattr_should_drop_sgid(mnt_userns, inode); error = security_path_chown(path, uid, gid); if (!error) error = notify_change(mnt_userns, path->dentry, &newattrs, diff --git a/include/linux/fs.h b/include/linux/fs.h index be9be4a7216c7..9601c2d774c88 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3133,7 +3133,7 @@ extern void __destroy_inode(struct inode *); extern struct inode *new_inode_pseudo(struct super_block *sb); extern struct inode *new_inode(struct super_block *sb); extern void free_inode_nonrcu(struct inode *inode); -extern int should_remove_suid(struct dentry *); +extern int setattr_should_drop_suidgid(struct user_namespace *, struct inode *); extern int file_remove_privs(struct file *);
extern void __insert_inode_hash(struct inode *, unsigned long hashval);
From: Christian Brauner brauner@kernel.org
commit 8d84e39d76bd83474b26cb44f4b338635676e7e8 upstream.
Now that we made the VFS setgid checking consistent an inode can't be marked security irrelevant even if the setgid bit is still set. Make this function consistent with all other helpers.
Note that enforcing consistent setgid stripping checks for file modification and mode- and ownership changes will cause the setgid bit to be lost in more cases than useed to be the case. If an unprivileged user wrote to a non-executable setgid file that they don't have privilege over the setgid bit will be dropped. This will lead to temporary failures in some xfstests until they have been updated.
Reported-by: Miklos Szeredi miklos@szeredi.hu Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Amir Goldstein amir73il@gmail.com Tested-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h index 9601c2d774c88..23ecfecdc4504 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3571,7 +3571,7 @@ int __init list_bdev_fs_names(char *buf, size_t size);
static inline bool is_sxid(umode_t mode) { - return (mode & S_ISUID) || ((mode & S_ISGID) && (mode & S_IXGRP)); + return mode & (S_ISUID | S_ISGID); }
static inline int check_sticky(struct user_namespace *mnt_userns,
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit 7b347f4b677b6d84687e67d82b6b17c6f55ea2b4 ]
SPDM is used for debug/profiling and does not have any other functionality. These clocks can safely be removed.
Suggested-by: Stephen Boyd sboyd@kernel.org Suggested-by: Georgi Djakov djakov@kernel.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20230111060402.1168726-11-dmitry.baryshkov@linaro.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/mmcc-apq8084.c | 271 -------------------------------- 1 file changed, 271 deletions(-)
diff --git a/drivers/clk/qcom/mmcc-apq8084.c b/drivers/clk/qcom/mmcc-apq8084.c index fbfcf00067394..893e5536f64c7 100644 --- a/drivers/clk/qcom/mmcc-apq8084.c +++ b/drivers/clk/qcom/mmcc-apq8084.c @@ -2363,262 +2363,6 @@ static struct clk_branch mmss_rbcpr_clk = { }, };
-static struct clk_branch mmss_spdm_ahb_clk = { - .halt_reg = 0x0230, - .clkr = { - .enable_reg = 0x0230, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_ahb_clk", - .parent_names = (const char *[]){ - "mmss_spdm_ahb_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_axi_clk = { - .halt_reg = 0x0210, - .clkr = { - .enable_reg = 0x0210, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_axi_clk", - .parent_names = (const char *[]){ - "mmss_spdm_axi_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_csi0_clk = { - .halt_reg = 0x023c, - .clkr = { - .enable_reg = 0x023c, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_csi0_clk", - .parent_names = (const char *[]){ - "mmss_spdm_csi0_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_gfx3d_clk = { - .halt_reg = 0x022c, - .clkr = { - .enable_reg = 0x022c, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_gfx3d_clk", - .parent_names = (const char *[]){ - "mmss_spdm_gfx3d_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_jpeg0_clk = { - .halt_reg = 0x0204, - .clkr = { - .enable_reg = 0x0204, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_jpeg0_clk", - .parent_names = (const char *[]){ - "mmss_spdm_jpeg0_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_jpeg1_clk = { - .halt_reg = 0x0208, - .clkr = { - .enable_reg = 0x0208, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_jpeg1_clk", - .parent_names = (const char *[]){ - "mmss_spdm_jpeg1_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_jpeg2_clk = { - .halt_reg = 0x0224, - .clkr = { - .enable_reg = 0x0224, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_jpeg2_clk", - .parent_names = (const char *[]){ - "mmss_spdm_jpeg2_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_mdp_clk = { - .halt_reg = 0x020c, - .clkr = { - .enable_reg = 0x020c, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_mdp_clk", - .parent_names = (const char *[]){ - "mmss_spdm_mdp_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_pclk0_clk = { - .halt_reg = 0x0234, - .clkr = { - .enable_reg = 0x0234, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_pclk0_clk", - .parent_names = (const char *[]){ - "mmss_spdm_pclk0_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_pclk1_clk = { - .halt_reg = 0x0228, - .clkr = { - .enable_reg = 0x0228, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_pclk1_clk", - .parent_names = (const char *[]){ - "mmss_spdm_pclk1_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_vcodec0_clk = { - .halt_reg = 0x0214, - .clkr = { - .enable_reg = 0x0214, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_vcodec0_clk", - .parent_names = (const char *[]){ - "mmss_spdm_vcodec0_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_vfe0_clk = { - .halt_reg = 0x0218, - .clkr = { - .enable_reg = 0x0218, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_vfe0_clk", - .parent_names = (const char *[]){ - "mmss_spdm_vfe0_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_vfe1_clk = { - .halt_reg = 0x021c, - .clkr = { - .enable_reg = 0x021c, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_vfe1_clk", - .parent_names = (const char *[]){ - "mmss_spdm_vfe1_div_clk", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_rm_axi_clk = { - .halt_reg = 0x0304, - .clkr = { - .enable_reg = 0x0304, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_rm_axi_clk", - .parent_names = (const char *[]){ - "mmss_axi_clk_src", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - -static struct clk_branch mmss_spdm_rm_ocmemnoc_clk = { - .halt_reg = 0x0308, - .clkr = { - .enable_reg = 0x0308, - .enable_mask = BIT(0), - .hw.init = &(struct clk_init_data){ - .name = "mmss_spdm_rm_ocmemnoc_clk", - .parent_names = (const char *[]){ - "ocmemnoc_clk_src", - }, - .num_parents = 1, - .flags = CLK_SET_RATE_PARENT, - .ops = &clk_branch2_ops, - }, - }, -}; - - static struct clk_branch mmss_misc_ahb_clk = { .halt_reg = 0x502c, .clkr = { @@ -3251,21 +2995,6 @@ static struct clk_regmap *mmcc_apq8084_clocks[] = { [MDSS_VSYNC_CLK] = &mdss_vsync_clk.clkr, [MMSS_RBCPR_AHB_CLK] = &mmss_rbcpr_ahb_clk.clkr, [MMSS_RBCPR_CLK] = &mmss_rbcpr_clk.clkr, - [MMSS_SPDM_AHB_CLK] = &mmss_spdm_ahb_clk.clkr, - [MMSS_SPDM_AXI_CLK] = &mmss_spdm_axi_clk.clkr, - [MMSS_SPDM_CSI0_CLK] = &mmss_spdm_csi0_clk.clkr, - [MMSS_SPDM_GFX3D_CLK] = &mmss_spdm_gfx3d_clk.clkr, - [MMSS_SPDM_JPEG0_CLK] = &mmss_spdm_jpeg0_clk.clkr, - [MMSS_SPDM_JPEG1_CLK] = &mmss_spdm_jpeg1_clk.clkr, - [MMSS_SPDM_JPEG2_CLK] = &mmss_spdm_jpeg2_clk.clkr, - [MMSS_SPDM_MDP_CLK] = &mmss_spdm_mdp_clk.clkr, - [MMSS_SPDM_PCLK0_CLK] = &mmss_spdm_pclk0_clk.clkr, - [MMSS_SPDM_PCLK1_CLK] = &mmss_spdm_pclk1_clk.clkr, - [MMSS_SPDM_VCODEC0_CLK] = &mmss_spdm_vcodec0_clk.clkr, - [MMSS_SPDM_VFE0_CLK] = &mmss_spdm_vfe0_clk.clkr, - [MMSS_SPDM_VFE1_CLK] = &mmss_spdm_vfe1_clk.clkr, - [MMSS_SPDM_RM_AXI_CLK] = &mmss_spdm_rm_axi_clk.clkr, - [MMSS_SPDM_RM_OCMEMNOC_CLK] = &mmss_spdm_rm_ocmemnoc_clk.clkr, [MMSS_MISC_AHB_CLK] = &mmss_misc_ahb_clk.clkr, [MMSS_MMSSNOC_AHB_CLK] = &mmss_mmssnoc_ahb_clk.clkr, [MMSS_MMSSNOC_BTO_AHB_CLK] = &mmss_mmssnoc_bto_ahb_clk.clkr,
From: xurui xurui@kylinos.cn
[ Upstream commit 109d587a4b4d7ccca2200ab1f808f43ae23e2585 ]
arch/mips/include/asm/mach-rc32434/pci.h:377: cc1: error: result of ‘-117440512 << 16’ requires 44 bits to represent, but ‘int’ only has 32 bits [-Werror=shift-overflow=]
All bits in KORINA_STAT are already at the correct position, so there is no addtional shift needed.
Signed-off-by: xurui xurui@kylinos.cn Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/mach-rc32434/pci.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/mach-rc32434/pci.h b/arch/mips/include/asm/mach-rc32434/pci.h index 9a6eefd127571..3eb767c8a4eec 100644 --- a/arch/mips/include/asm/mach-rc32434/pci.h +++ b/arch/mips/include/asm/mach-rc32434/pci.h @@ -374,7 +374,7 @@ struct pci_msu { PCI_CFG04_STAT_SSE | \ PCI_CFG04_STAT_PE)
-#define KORINA_CNFG1 ((KORINA_STAT<<16)|KORINA_CMD) +#define KORINA_CNFG1 (KORINA_STAT | KORINA_CMD)
#define KORINA_REVID 0 #define KORINA_CLASS_CODE 0
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit bab537805a10bdbf55b31324ba4a9599e0651e5e ]
NO_IRQ is a relic from the old days. It is not used anymore in core functions. By the way, function irq_of_parse_and_map() returns value 0 on error.
In some drivers, NO_IRQ is erroneously used to check the return of irq_of_parse_and_map().
It is not a real bug today because the only architectures using the drivers being fixed by this patch define NO_IRQ as 0, but there are architectures which define NO_IRQ as -1. If one day those architectures start using the non fixed drivers, there will be a problem.
Long time ago Linus advocated for not using NO_IRQ, see https://lore.kernel.org/all/Pine.LNX.4.64.0511211150040.13959@g5.osdl.org
He re-iterated the same view recently in https://lore.kernel.org/all/CAHk-=wg2Pkb9kbfbstbB91AJA2SF6cySbsgHG-iQMq56j3V...
So test !irq instead of tesing irq == NO_IRQ.
All other usage of NO_IRQ for powerpc were removed in previous cycles so the time has come to remove NO_IRQ completely for powerpc.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/4b8d4f96140af01dec3a3330924dda8b2451c316.167447679... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/irq.h | 3 --- arch/powerpc/platforms/44x/fsp2.c | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h index 2b3278534bc14..858393f7fd7d7 100644 --- a/arch/powerpc/include/asm/irq.h +++ b/arch/powerpc/include/asm/irq.h @@ -16,9 +16,6 @@
extern atomic_t ppc_n_lost_interrupts;
-/* This number is used when no interrupt has been assigned */ -#define NO_IRQ (0) - /* Total number of virq in the platform */ #define NR_IRQS CONFIG_NR_IRQS
diff --git a/arch/powerpc/platforms/44x/fsp2.c b/arch/powerpc/platforms/44x/fsp2.c index 823397c802def..f8bbe05d9ef29 100644 --- a/arch/powerpc/platforms/44x/fsp2.c +++ b/arch/powerpc/platforms/44x/fsp2.c @@ -205,7 +205,7 @@ static void node_irq_request(const char *compat, irq_handler_t errirq_handler)
for_each_compatible_node(np, NULL, compat) { irq = irq_of_parse_and_map(np, 0); - if (irq == NO_IRQ) { + if (!irq) { pr_err("device tree node %pOFn is missing a interrupt", np); of_node_put(np);
Le 15/03/2023 à 13:12, Greg Kroah-Hartman a écrit :
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit bab537805a10bdbf55b31324ba4a9599e0651e5e ]
NO_IRQ is a relic from the old days. It is not used anymore in core functions. By the way, function irq_of_parse_and_map() returns value 0 on error.
In some drivers, NO_IRQ is erroneously used to check the return of irq_of_parse_and_map().
It is not a real bug today because the only architectures using the drivers being fixed by this patch define NO_IRQ as 0, but there are architectures which define NO_IRQ as -1. If one day those architectures start using the non fixed drivers, there will be a problem.
Long time ago Linus advocated for not using NO_IRQ, see https://lore.kernel.org/all/Pine.LNX.4.64.0511211150040.13959@g5.osdl.org
He re-iterated the same view recently in https://lore.kernel.org/all/CAHk-=wg2Pkb9kbfbstbB91AJA2SF6cySbsgHG-iQMq56j3V...
So test !irq instead of tesing irq == NO_IRQ.
All other usage of NO_IRQ for powerpc were removed in previous cycles so the time has come to remove NO_IRQ completely for powerpc.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/4b8d4f96140af01dec3a3330924dda8b2451c316.167447679... Signed-off-by: Sasha Levin sashal@kernel.org
Same, you can't remove NO_IRQ macro without first all preparation patches merged during the 6.2 cycle.
Christophe
arch/powerpc/include/asm/irq.h | 3 --- arch/powerpc/platforms/44x/fsp2.c | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h index 2b3278534bc14..858393f7fd7d7 100644 --- a/arch/powerpc/include/asm/irq.h +++ b/arch/powerpc/include/asm/irq.h @@ -16,9 +16,6 @@ extern atomic_t ppc_n_lost_interrupts; -/* This number is used when no interrupt has been assigned */ -#define NO_IRQ (0)
- /* Total number of virq in the platform */ #define NR_IRQS CONFIG_NR_IRQS
diff --git a/arch/powerpc/platforms/44x/fsp2.c b/arch/powerpc/platforms/44x/fsp2.c index 823397c802def..f8bbe05d9ef29 100644 --- a/arch/powerpc/platforms/44x/fsp2.c +++ b/arch/powerpc/platforms/44x/fsp2.c @@ -205,7 +205,7 @@ static void node_irq_request(const char *compat, irq_handler_t errirq_handler) for_each_compatible_node(np, NULL, compat) { irq = irq_of_parse_and_map(np, 0);
if (irq == NO_IRQ) {
if (!irq) { pr_err("device tree node %pOFn is missing a interrupt", np); of_node_put(np);
On Wed, Mar 15, 2023 at 12:32:27PM +0000, Christophe Leroy wrote:
Le 15/03/2023 à 13:12, Greg Kroah-Hartman a écrit :
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit bab537805a10bdbf55b31324ba4a9599e0651e5e ]
NO_IRQ is a relic from the old days. It is not used anymore in core functions. By the way, function irq_of_parse_and_map() returns value 0 on error.
In some drivers, NO_IRQ is erroneously used to check the return of irq_of_parse_and_map().
It is not a real bug today because the only architectures using the drivers being fixed by this patch define NO_IRQ as 0, but there are architectures which define NO_IRQ as -1. If one day those architectures start using the non fixed drivers, there will be a problem.
Long time ago Linus advocated for not using NO_IRQ, see https://lore.kernel.org/all/Pine.LNX.4.64.0511211150040.13959@g5.osdl.org
He re-iterated the same view recently in https://lore.kernel.org/all/CAHk-=wg2Pkb9kbfbstbB91AJA2SF6cySbsgHG-iQMq56j3V...
So test !irq instead of tesing irq == NO_IRQ.
All other usage of NO_IRQ for powerpc were removed in previous cycles so the time has come to remove NO_IRQ completely for powerpc.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/4b8d4f96140af01dec3a3330924dda8b2451c316.167447679... Signed-off-by: Sasha Levin sashal@kernel.org
Same, you can't remove NO_IRQ macro without first all preparation patches merged during the 6.2 cycle.
Ack, I'll drop this one.
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
[ Upstream commit b505063910c134778202dfad9332dfcecb76bab3 ]
When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once.
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20230202141919.2298821-1-gregkh@linuxfoundation.or... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/iommu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index a67fd54ccc573..8bea336fa5b70 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -68,11 +68,9 @@ static void iommu_debugfs_add(struct iommu_table *tbl) static void iommu_debugfs_del(struct iommu_table *tbl) { char name[10]; - struct dentry *liobn_entry;
sprintf(name, "%08lx", tbl->it_index); - liobn_entry = debugfs_lookup(name, iommu_debugfs_dir); - debugfs_remove(liobn_entry); + debugfs_lookup_and_remove(name, iommu_debugfs_dir); } #else static void iommu_debugfs_add(struct iommu_table *tbl){}
From: Rohan McLure rmclure@linux.ibm.com
[ Upstream commit 2a7ce82dc46c591c9244057d89a6591c9639b9b9 ]
In order for KCSAN to increase its likelihood of observing a data race, it sets a watchpoint on memory accesses and stalls, allowing for detection of conflicting accesses by other kernel threads or interrupts.
Stalls are implemented by injecting a call to udelay in instrumented code. To prevent recursive instrumentation, exclude udelay from being instrumented.
Signed-off-by: Rohan McLure rmclure@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20230206021801.105268-3-rmclure@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/time.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 934d8ae66cc63..4406d7a89558b 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -450,7 +450,7 @@ void vtime_flush(struct task_struct *tsk) #define calc_cputime_factors() #endif
-void __delay(unsigned long loops) +void __no_kcsan __delay(unsigned long loops) { unsigned long start;
@@ -471,7 +471,7 @@ void __delay(unsigned long loops) } EXPORT_SYMBOL(__delay);
-void udelay(unsigned long usecs) +void __no_kcsan udelay(unsigned long usecs) { __delay(tb_ticks_per_usec * usecs); }
From: Edward Humes aurxenon@lunos.org
[ Upstream commit b6b17a8b3ecd878d98d5472a9023ede9e669ca72 ]
Previously, R_ALPHA_LITERAL relocations would overflow for large kernel modules.
This was because the Alpha's apply_relocate_add was relying on the kernel's module loader to have sorted the GOT towards the very end of the module as it was mapped into memory in order to correctly assign the global pointer. While this behavior would mostly work fine for small kernel modules, this approach would overflow on kernel modules with large GOT's since the global pointer would be very far away from the GOT, and thus, certain entries would be out of range.
This patch fixes this by instead using the Tru64 behavior of assigning the global pointer to be 32KB away from the start of the GOT. The change made in this patch won't work for multi-GOT kernel modules as it makes the assumption the module only has one GOT located at the beginning of .got, although for the vast majority kernel modules, this should be fine. Of the kernel modules that would previously result in a relocation error, none of them, even modules like nouveau, have even come close to filling up a single GOT, and they've all worked fine under this patch.
Signed-off-by: Edward Humes aurxenon@lunos.org Signed-off-by: Matt Turner mattst88@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/alpha/kernel/module.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/alpha/kernel/module.c b/arch/alpha/kernel/module.c index 5b60c248de9ea..cbefa5a773846 100644 --- a/arch/alpha/kernel/module.c +++ b/arch/alpha/kernel/module.c @@ -146,10 +146,8 @@ apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab, base = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr; symtab = (Elf64_Sym *)sechdrs[symindex].sh_addr;
- /* The small sections were sorted to the end of the segment. - The following should definitely cover them. */ - gp = (u64)me->core_layout.base + me->core_layout.size - 0x8000; got = sechdrs[me->arch.gotsecindex].sh_addr; + gp = got + 0x8000;
for (i = 0; i < n; i++) { unsigned long r_sym = ELF64_R_SYM (rela[i].r_info);
From: Nathan Chancellor nathan@kernel.org
[ Upstream commit 748ea32d2dbd813d3bd958117bde5191182f909a ]
Clang warns:
drivers/macintosh/windfarm_lm75_sensor.c:63:14: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] lm->inited = 1; ^ ~
drivers/macintosh/windfarm_smu_sensors.c:356:19: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] pow->fake_volts = 1; ^ ~ drivers/macintosh/windfarm_smu_sensors.c:368:18: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] pow->quadratic = 1; ^ ~
There is no bug here since no code checks the actual value of these fields, just whether or not they are zero (boolean context), but this can be easily fixed by switching to an unsigned type.
Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20230215-windfarm-wsingle-bit-bitfield-constant-co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/macintosh/windfarm_lm75_sensor.c | 4 ++-- drivers/macintosh/windfarm_smu_sensors.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/macintosh/windfarm_lm75_sensor.c b/drivers/macintosh/windfarm_lm75_sensor.c index 29f48c2028b6d..e90ad1b78e936 100644 --- a/drivers/macintosh/windfarm_lm75_sensor.c +++ b/drivers/macintosh/windfarm_lm75_sensor.c @@ -34,8 +34,8 @@ #endif
struct wf_lm75_sensor { - int ds1775 : 1; - int inited : 1; + unsigned int ds1775 : 1; + unsigned int inited : 1; struct i2c_client *i2c; struct wf_sensor sens; }; diff --git a/drivers/macintosh/windfarm_smu_sensors.c b/drivers/macintosh/windfarm_smu_sensors.c index c8706cfb83fd8..714c1e14074ed 100644 --- a/drivers/macintosh/windfarm_smu_sensors.c +++ b/drivers/macintosh/windfarm_smu_sensors.c @@ -273,8 +273,8 @@ struct smu_cpu_power_sensor { struct list_head link; struct wf_sensor *volts; struct wf_sensor *amps; - int fake_volts : 1; - int quadratic : 1; + unsigned int fake_volts : 1; + unsigned int quadratic : 1; struct wf_sensor sens; }; #define to_smu_cpu_power(c) container_of(c, struct smu_cpu_power_sensor, sens)
From: Alvaro Karsz alvaro.karsz@solid-run.com
[ Upstream commit db6c4dee4c104f50ed163af71c53bfdb878a8318 ]
Add SolidRun vendor ID to pci_ids.h
The vendor ID is used in 2 different source files, the SNET vDPA driver and PCI quirks.
Signed-off-by: Alvaro Karsz alvaro.karsz@solid-run.com Acked-by: Bjorn Helgaas bhelgaas@google.com Message-Id: 20230110165638.123745-2-alvaro.karsz@solid-run.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/pci_ids.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 4853538bf1561..8a1e264735668 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -3094,6 +3094,8 @@
#define PCI_VENDOR_ID_3COM_2 0xa727
+#define PCI_VENDOR_ID_SOLIDRUN 0xd063 + #define PCI_VENDOR_ID_DIGIUM 0xd161 #define PCI_DEVICE_ID_DIGIUM_HFC4S 0xb410
From: Alvaro Karsz alvaro.karsz@solid-run.com
[ Upstream commit d089d69cc1f824936eeaa4fa172f8fa1a0949eaa ]
This patch fixes a FLR bug on the SNET DPU rev 1 by setting the PCI_DEV_FLAGS_NO_FLR_RESET flag.
As there is a quirk to avoid FLR (quirk_no_flr), I added a new quirk to check the rev ID before calling to quirk_no_flr.
Without this patch, a SNET DPU rev 1 may hang when FLR is applied.
Signed-off-by: Alvaro Karsz alvaro.karsz@solid-run.com Acked-by: Bjorn Helgaas bhelgaas@google.com Message-Id: 20230110165638.123745-3-alvaro.karsz@solid-run.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 643a3b292f0b6..7213d910685c7 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5354,6 +5354,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
+/* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */ +static void quirk_no_flr_snet(struct pci_dev *dev) +{ + if (dev->revision == 0x1) + quirk_no_flr(dev); +} +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SOLIDRUN, 0x1000, quirk_no_flr_snet); + static void quirk_no_ext_tags(struct pci_dev *pdev) { struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
From: Masahiro Yamada masahiroy@kernel.org
[ Upstream commit 87c7ee67deb7fce9951a5f9d80641138694aad17 ]
In the follow-up of commit fb3041d61f68 ("kbuild: fix SIGPIPE error message for AR=gcc-ar and AR=llvm-ar"), Kees Cook pointed out that tools should _not_ catch their own SIGPIPEs [1] [2].
Based on his feedback, LLVM was fixed [3].
However, Python's default behavior is to show noisy bracktrace when SIGPIPE is sent. So, scripts written in Python are basically in the same situation as the buggy llvm tools.
Example:
$ make -s allnoconfig $ make -s allmodconfig $ scripts/diffconfig .config.old .config | head -n1 -ALIX n Traceback (most recent call last): File "/home/masahiro/linux/scripts/diffconfig", line 132, in <module> main() File "/home/masahiro/linux/scripts/diffconfig", line 130, in main print_config("+", config, None, b[config]) File "/home/masahiro/linux/scripts/diffconfig", line 64, in print_config print("+%s %s" % (config, new_value)) BrokenPipeError: [Errno 32] Broken pipe
Python documentation [4] notes how to make scripts die immediately and silently:
""" Piping output of your program to tools like head(1) will cause a SIGPIPE signal to be sent to your process when the receiver of its standard output closes early. This results in an exception like BrokenPipeError: [Errno 32] Broken pipe. To handle this case, wrap your entry point to catch this exception as follows:
import os import sys
def main(): try: # simulate large output (your code replaces this loop) for x in range(10000): print("y") # flush output here to force SIGPIPE to be triggered # while inside this try block. sys.stdout.flush() except BrokenPipeError: # Python flushes standard streams on exit; redirect remaining output # to devnull to avoid another BrokenPipeError at shutdown devnull = os.open(os.devnull, os.O_WRONLY) os.dup2(devnull, sys.stdout.fileno()) sys.exit(1) # Python exits with error code 1 on EPIPE
if __name__ == '__main__': main()
Do not set SIGPIPE’s disposition to SIG_DFL in order to avoid BrokenPipeError. Doing that would cause your program to exit unexpectedly whenever any socket connection is interrupted while your program is still writing to it. """
Currently, tools/perf/scripts/python/intel-pt-events.py seems to be the only script that fixes the issue that way.
tools/perf/scripts/python/compaction-times.py uses another approach signal.signal(signal.SIGPIPE, signal.SIG_DFL) but the Python documentation clearly says "Don't do it".
I cannot fix all Python scripts since there are so many. I fixed some in the scripts/ directory.
[1]: https://lore.kernel.org/all/202211161056.1B9611A@keescook/ [2]: https://github.com/llvm/llvm-project/issues/59037 [3]: https://github.com/llvm/llvm-project/commit/4787efa38066adb51e2c049499d25b36... [4]: https://docs.python.org/3/library/signal.html#note-on-sigpipe
Signed-off-by: Masahiro Yamada masahiroy@kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Nicolas Schier nicolas@fjasle.eu Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/checkkconfigsymbols.py | 13 ++++++++++++- scripts/clang-tools/run-clang-tools.py | 21 ++++++++++++++------- scripts/diffconfig | 16 ++++++++++++++-- 3 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/scripts/checkkconfigsymbols.py b/scripts/checkkconfigsymbols.py index 217d21abc86e8..36c920e713137 100755 --- a/scripts/checkkconfigsymbols.py +++ b/scripts/checkkconfigsymbols.py @@ -115,7 +115,7 @@ def parse_options(): return args
-def main(): +def print_undefined_symbols(): """Main function of this module.""" args = parse_options()
@@ -467,5 +467,16 @@ def parse_kconfig_file(kfile): return defined, references
+def main(): + try: + print_undefined_symbols() + except BrokenPipeError: + # Python flushes standard streams on exit; redirect remaining output + # to devnull to avoid another BrokenPipeError at shutdown + devnull = os.open(os.devnull, os.O_WRONLY) + os.dup2(devnull, sys.stdout.fileno()) + sys.exit(1) # Python exits with error code 1 on EPIPE + + if __name__ == "__main__": main() diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py index f754415af398b..f42699134f1c0 100755 --- a/scripts/clang-tools/run-clang-tools.py +++ b/scripts/clang-tools/run-clang-tools.py @@ -60,14 +60,21 @@ def run_analysis(entry):
def main(): - args = parse_arguments() + try: + args = parse_arguments()
- lock = multiprocessing.Lock() - pool = multiprocessing.Pool(initializer=init, initargs=(lock, args)) - # Read JSON data into the datastore variable - with open(args.path, "r") as f: - datastore = json.load(f) - pool.map(run_analysis, datastore) + lock = multiprocessing.Lock() + pool = multiprocessing.Pool(initializer=init, initargs=(lock, args)) + # Read JSON data into the datastore variable + with open(args.path, "r") as f: + datastore = json.load(f) + pool.map(run_analysis, datastore) + except BrokenPipeError: + # Python flushes standard streams on exit; redirect remaining output + # to devnull to avoid another BrokenPipeError at shutdown + devnull = os.open(os.devnull, os.O_WRONLY) + os.dup2(devnull, sys.stdout.fileno()) + sys.exit(1) # Python exits with error code 1 on EPIPE
if __name__ == "__main__": diff --git a/scripts/diffconfig b/scripts/diffconfig index d5da5fa05d1d3..43f0f3d273ae7 100755 --- a/scripts/diffconfig +++ b/scripts/diffconfig @@ -65,7 +65,7 @@ def print_config(op, config, value, new_value): else: print(" %s %s -> %s" % (config, value, new_value))
-def main(): +def show_diff(): global merge_style
# parse command line args @@ -129,4 +129,16 @@ def main(): for config in new: print_config("+", config, None, b[config])
-main() +def main(): + try: + show_diff() + except BrokenPipeError: + # Python flushes standard streams on exit; redirect remaining output + # to devnull to avoid another BrokenPipeError at shutdown + devnull = os.open(os.devnull, os.O_WRONLY) + os.dup2(devnull, sys.stdout.fileno()) + sys.exit(1) # Python exits with error code 1 on EPIPE + + +if __name__ == '__main__': + main()
From: Paul Elder paul.elder@ideasonboard.com
[ Upstream commit afa4805799c1d332980ad23339fdb07b5e0cf7e0 ]
Gain control is badly documented in publicly available (including leaked) documentation.
There is an AGC pre-gain in register 0x3a13, expressed as a 6-bit value (plus an enable bit in bit 6). The driver hardcodes it to 0x43, which one application note states is equal to x1.047. The documentation also states that 0x40 is equel to x1.000. The pre-gain thus seems to be expressed as in 1/64 increments, and thus ranges from x1.00 to x1.984. What the pre-gain does is however unspecified.
There is then an AGC gain limit, in registers 0x3a18 and 0x3a19, expressed as a 10-bit "real gain format" value. One application note sets it to 0x00f8 and states it is equal to x15.5, so it appears to be expressed in 1/16 increments, up to x63.9375.
The manual gain is stored in registers 0x350a and 0x350b, also as a 10-bit "real gain format" value. It is documented in the application note as a Q6.4 values, up to x63.9375.
One version of the datasheet indicates that the sensor supports a digital gain:
The OV5640 supports 1/2/4 digital gain. Normally, the gain is controlled automatically by the automatic gain control (AGC) block.
It isn't clear how that would be controlled manually.
There appears to be no indication regarding whether the gain controlled through registers 0x350a and 0x350b is an analogue gain only or also includes digital gain. The words "real gain" don't necessarily mean "combined analogue and digital gains". Some OmniVision sensors (such as the OV8858) are documented as supoprting different formats for the gain values, selectable through a register bit, and they are called "real gain format" and "sensor gain format". For that sensor, we have (one of) the gain registers documented as
0x3503[2]=0, gain[7:0] is real gain format, where low 4 bits are fraction bits, for example, 0x10 is 1x gain, 0x28 is 2.5x gain
If 0x3503[2]=1, gain[7:0] is sensor gain format, gain[7:4] is coarse gain, 00000: 1x, 00001: 2x, 00011: 4x, 00111: 8x, gain[7] is 1, gain[3:0] is fine gain. For example, 0x10 is 1x gain, 0x30 is 2x gain, 0x70 is 4x gain
(The second part of the text makes little sense)
"Real gain" may thus refer to the combination of the coarse and fine analogue gains as a single value.
The OV5640 0x350a and 0x350b registers thus appear to control analogue gain. The driver incorrectly uses V4L2_CID_GAIN as V4L2 has a specific control for analogue gain, V4L2_CID_ANALOGUE_GAIN. Use it.
If registers 0x350a and 0x350b are later found to control digital gain as well, the driver could then restrict the range of the analogue gain control value to lower than x64 and add a separate digital gain control.
Signed-off-by: Paul Elder paul.elder@ideasonboard.com Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Reviewed-by: Jacopo Mondi jacopo.mondi@ideasonboard.com Reviewed-by: Jai Luthra j-luthra@ti.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/ov5640.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/i2c/ov5640.c b/drivers/media/i2c/ov5640.c index db5a19babe67d..a141552531f7e 100644 --- a/drivers/media/i2c/ov5640.c +++ b/drivers/media/i2c/ov5640.c @@ -2776,7 +2776,7 @@ static int ov5640_init_controls(struct ov5640_dev *sensor) /* Auto/manual gain */ ctrls->auto_gain = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_AUTOGAIN, 0, 1, 1, 1); - ctrls->gain = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_GAIN, + ctrls->gain = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_ANALOGUE_GAIN, 0, 1023, 1, 0);
ctrls->saturation = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_SATURATION,
From: Li Jun jun.li@nxp.com
[ Upstream commit 30040818b338b8ebc956ce0ebd198f8d593586a6 ]
In case runtime PM is enabled, do runtime PM clean up to remove cpu latency qos request, otherwise driver removal may have below kernel dump:
[ 19.463299] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048 [ 19.472161] Mem abort info: [ 19.474985] ESR = 0x0000000096000004 [ 19.478754] EC = 0x25: DABT (current EL), IL = 32 bits [ 19.484081] SET = 0, FnV = 0 [ 19.487149] EA = 0, S1PTW = 0 [ 19.490361] FSC = 0x04: level 0 translation fault [ 19.495256] Data abort info: [ 19.498149] ISV = 0, ISS = 0x00000004 [ 19.501997] CM = 0, WnR = 0 [ 19.504977] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000049f81000 [ 19.511432] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000 [ 19.518245] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 19.524520] Modules linked in: gpio_ir_recv(+) rc_core [last unloaded: rc_core] [ 19.531845] CPU: 0 PID: 445 Comm: insmod Not tainted 6.2.0-rc1-00028-g2c397a46d47c #72 [ 19.531854] Hardware name: FSL i.MX8MM EVK board (DT) [ 19.531859] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 19.551777] pc : cpu_latency_qos_remove_request+0x20/0x110 [ 19.557277] lr : gpio_ir_recv_runtime_suspend+0x18/0x30 [gpio_ir_recv] [ 19.557294] sp : ffff800008ce3740 [ 19.557297] x29: ffff800008ce3740 x28: 0000000000000000 x27: ffff800008ce3d50 [ 19.574270] x26: ffffc7e3e9cea100 x25: 00000000000f4240 x24: ffffc7e3f9ef0e30 [ 19.574284] x23: 0000000000000000 x22: ffff0061803820f4 x21: 0000000000000008 [ 19.574296] x20: ffffc7e3fa75df30 x19: 0000000000000020 x18: ffffffffffffffff [ 19.588570] x17: 0000000000000000 x16: ffffc7e3f9efab70 x15: ffffffffffffffff [ 19.595712] x14: ffff800008ce37b8 x13: ffff800008ce37aa x12: 0000000000000001 [ 19.602853] x11: 0000000000000001 x10: ffffcbe3ec0dff87 x9 : 0000000000000008 [ 19.609991] x8 : 0101010101010101 x7 : 0000000000000000 x6 : 000000000f0bfe9f [ 19.624261] x5 : 00ffffffffffffff x4 : 0025ab8e00000000 x3 : ffff006180382010 [ 19.631405] x2 : ffffc7e3e9ce8030 x1 : ffffc7e3fc3eb810 x0 : 0000000000000020 [ 19.638548] Call trace: [ 19.640995] cpu_latency_qos_remove_request+0x20/0x110 [ 19.646142] gpio_ir_recv_runtime_suspend+0x18/0x30 [gpio_ir_recv] [ 19.652339] pm_generic_runtime_suspend+0x2c/0x44 [ 19.657055] __rpm_callback+0x48/0x1dc [ 19.660807] rpm_callback+0x6c/0x80 [ 19.664301] rpm_suspend+0x10c/0x640 [ 19.667880] rpm_idle+0x250/0x2d0 [ 19.671198] update_autosuspend+0x38/0xe0 [ 19.675213] pm_runtime_set_autosuspend_delay+0x40/0x60 [ 19.680442] gpio_ir_recv_probe+0x1b4/0x21c [gpio_ir_recv] [ 19.685941] platform_probe+0x68/0xc0 [ 19.689610] really_probe+0xc0/0x3dc [ 19.693189] __driver_probe_device+0x7c/0x190 [ 19.697550] driver_probe_device+0x3c/0x110 [ 19.701739] __driver_attach+0xf4/0x200 [ 19.705578] bus_for_each_dev+0x70/0xd0 [ 19.709417] driver_attach+0x24/0x30 [ 19.712998] bus_add_driver+0x17c/0x240 [ 19.716834] driver_register+0x78/0x130 [ 19.720676] __platform_driver_register+0x28/0x34 [ 19.725386] gpio_ir_recv_driver_init+0x20/0x1000 [gpio_ir_recv] [ 19.731404] do_one_initcall+0x44/0x2ac [ 19.735243] do_init_module+0x48/0x1d0 [ 19.739003] load_module+0x19fc/0x2034 [ 19.742759] __do_sys_finit_module+0xac/0x12c [ 19.747124] __arm64_sys_finit_module+0x20/0x30 [ 19.751664] invoke_syscall+0x48/0x114 [ 19.755420] el0_svc_common.constprop.0+0xcc/0xec [ 19.760132] do_el0_svc+0x38/0xb0 [ 19.763456] el0_svc+0x2c/0x84 [ 19.766516] el0t_64_sync_handler+0xf4/0x120 [ 19.770789] el0t_64_sync+0x190/0x194 [ 19.774460] Code: 910003fd a90153f3 aa0003f3 91204021 (f9401400) [ 19.780556] ---[ end trace 0000000000000000 ]---
Signed-off-by: Li Jun jun.li@nxp.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/rc/gpio-ir-recv.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/drivers/media/rc/gpio-ir-recv.c b/drivers/media/rc/gpio-ir-recv.c index 22e524b69806a..a56c844d7f816 100644 --- a/drivers/media/rc/gpio-ir-recv.c +++ b/drivers/media/rc/gpio-ir-recv.c @@ -130,6 +130,23 @@ static int gpio_ir_recv_probe(struct platform_device *pdev) "gpio-ir-recv-irq", gpio_dev); }
+static int gpio_ir_recv_remove(struct platform_device *pdev) +{ + struct gpio_rc_dev *gpio_dev = platform_get_drvdata(pdev); + struct device *pmdev = gpio_dev->pmdev; + + if (pmdev) { + pm_runtime_get_sync(pmdev); + cpu_latency_qos_remove_request(&gpio_dev->qos); + + pm_runtime_disable(pmdev); + pm_runtime_put_noidle(pmdev); + pm_runtime_set_suspended(pmdev); + } + + return 0; +} + #ifdef CONFIG_PM static int gpio_ir_recv_suspend(struct device *dev) { @@ -189,6 +206,7 @@ MODULE_DEVICE_TABLE(of, gpio_ir_recv_of_match);
static struct platform_driver gpio_ir_recv_driver = { .probe = gpio_ir_recv_probe, + .remove = gpio_ir_recv_remove, .driver = { .name = KBUILD_MODNAME, .of_match_table = of_match_ptr(gpio_ir_recv_of_match),
From: Seth Forshee sforshee@kernel.org
commit 42d0c4bdf753063b6eec55415003184d3ca24f6e upstream.
A user should be allowed to take out a lease via an idmapped mount if the fsuid matches the mapped uid of the inode. generic_setlease() is checking the unmapped inode uid, causing these operations to be denied.
Fix this by comparing against the mapped inode uid instead of the unmapped uid.
Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP") Cc: stable@vger.kernel.org Signed-off-by: Seth Forshee (DigitalOcean) sforshee@kernel.org Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/locks.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/locks.c +++ b/fs/locks.c @@ -1901,9 +1901,10 @@ int generic_setlease(struct file *filp, void **priv) { struct inode *inode = locks_inode(filp); + kuid_t uid = i_uid_into_mnt(file_mnt_user_ns(filp), inode); int error;
- if ((!uid_eq(current_fsuid(), inode->i_uid)) && !capable(CAP_LEASE)) + if ((!uid_eq(current_fsuid(), uid)) && !capable(CAP_LEASE)) return -EACCES; if (!S_ISREG(inode->i_mode)) return -EINVAL;
From: Qais Yousef qais.yousef@arm.com
commit 244226035a1f9b2b6c326e55ae5188fab4f428cb upstream.
As reported by Yun Hsiang [1], if a task has its uclamp_min >= 0.8 * 1024, it'll always pick the previous CPU because fits_capacity() will always return false in this case.
The new util_fits_cpu() logic should handle this correctly for us beside more corner cases where similar failures could occur, like when using UCLAMP_MAX.
We open code uclamp_rq_util_with() except for the clamp() part, util_fits_cpu() needs the 'raw' values to be passed to it.
Also introduce uclamp_rq_{set, get}() shorthand accessors to get uclamp value for the rq. Makes the code more readable and ensures the right rules (use READ_ONCE/WRITE_ONCE) are respected transparently.
[1] https://lists.linaro.org/pipermail/eas-dev/2020-July/001488.html
Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions") Reported-by: Yun Hsiang hsiang023167@gmail.com Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-4-qais.yousef@arm.com (cherry picked from commit 244226035a1f9b2b6c326e55ae5188fab4f428cb) [Conflict in kernel/sched/fair.c mainly due to new automatic variables being added on master vs 5.15] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/core.c | 10 +++++----- kernel/sched/fair.c | 26 ++++++++++++++++++++++++-- kernel/sched/sched.h | 42 +++++++++++++++++++++++++++++++++++++++--- 3 files changed, 68 insertions(+), 10 deletions(-)
--- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1335,7 +1335,7 @@ static inline void uclamp_idle_reset(str if (!(rq->uclamp_flags & UCLAMP_FLAG_IDLE)) return;
- WRITE_ONCE(rq->uclamp[clamp_id].value, clamp_value); + uclamp_rq_set(rq, clamp_id, clamp_value); }
static inline @@ -1513,8 +1513,8 @@ static inline void uclamp_rq_inc_id(stru if (bucket->tasks == 1 || uc_se->value > bucket->value) bucket->value = uc_se->value;
- if (uc_se->value > READ_ONCE(uc_rq->value)) - WRITE_ONCE(uc_rq->value, uc_se->value); + if (uc_se->value > uclamp_rq_get(rq, clamp_id)) + uclamp_rq_set(rq, clamp_id, uc_se->value); }
/* @@ -1580,7 +1580,7 @@ static inline void uclamp_rq_dec_id(stru if (likely(bucket->tasks)) return;
- rq_clamp = READ_ONCE(uc_rq->value); + rq_clamp = uclamp_rq_get(rq, clamp_id); /* * Defensive programming: this should never happen. If it happens, * e.g. due to future modification, warn and fixup the expected value. @@ -1588,7 +1588,7 @@ static inline void uclamp_rq_dec_id(stru SCHED_WARN_ON(bucket->value > rq_clamp); if (bucket->value >= rq_clamp) { bkt_clamp = uclamp_rq_max_value(rq, clamp_id, uc_se->value); - WRITE_ONCE(uc_rq->value, bkt_clamp); + uclamp_rq_set(rq, clamp_id, bkt_clamp); } }
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6938,6 +6938,8 @@ compute_energy(struct task_struct *p, in static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu) { unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX; + unsigned long p_util_min = uclamp_is_used() ? uclamp_eff_value(p, UCLAMP_MIN) : 0; + unsigned long p_util_max = uclamp_is_used() ? uclamp_eff_value(p, UCLAMP_MAX) : 1024; struct root_domain *rd = cpu_rq(smp_processor_id())->rd; int cpu, best_energy_cpu = prev_cpu, target = -1; unsigned long cpu_cap, util, base_energy = 0; @@ -6967,6 +6969,8 @@ static int find_energy_efficient_cpu(str
for (; pd; pd = pd->next) { unsigned long cur_delta, spare_cap, max_spare_cap = 0; + unsigned long rq_util_min, rq_util_max; + unsigned long util_min, util_max; bool compute_prev_delta = false; unsigned long base_energy_pd; int max_spare_cap_cpu = -1; @@ -6987,8 +6991,26 @@ static int find_energy_efficient_cpu(str * much capacity we can get out of the CPU; this is * aligned with sched_cpu_util(). */ - util = uclamp_rq_util_with(cpu_rq(cpu), util, p); - if (!fits_capacity(util, cpu_cap)) + if (uclamp_is_used()) { + if (uclamp_rq_is_idle(cpu_rq(cpu))) { + util_min = p_util_min; + util_max = p_util_max; + } else { + /* + * Open code uclamp_rq_util_with() except for + * the clamp() part. Ie: apply max aggregation + * only. util_fits_cpu() logic requires to + * operate on non clamped util but must use the + * max-aggregated uclamp_{min, max}. + */ + rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); + rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); + + util_min = max(rq_util_min, p_util_min); + util_max = max(rq_util_max, p_util_max); + } + } + if (!util_fits_cpu(util, util_min, util_max, cpu)) continue;
if (cpu == prev_cpu) { --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2855,6 +2855,23 @@ static inline void cpufreq_update_util(s #ifdef CONFIG_UCLAMP_TASK unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
+static inline unsigned long uclamp_rq_get(struct rq *rq, + enum uclamp_id clamp_id) +{ + return READ_ONCE(rq->uclamp[clamp_id].value); +} + +static inline void uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, + unsigned int value) +{ + WRITE_ONCE(rq->uclamp[clamp_id].value, value); +} + +static inline bool uclamp_rq_is_idle(struct rq *rq) +{ + return rq->uclamp_flags & UCLAMP_FLAG_IDLE; +} + /** * uclamp_rq_util_with - clamp @util with @rq and @p effective uclamp values. * @rq: The rq to clamp against. Must not be NULL. @@ -2890,12 +2907,12 @@ unsigned long uclamp_rq_util_with(struct * Ignore last runnable task's max clamp, as this task will * reset it. Similarly, no need to read the rq's min clamp. */ - if (rq->uclamp_flags & UCLAMP_FLAG_IDLE) + if (uclamp_rq_is_idle(rq)) goto out; }
- min_util = max_t(unsigned long, min_util, READ_ONCE(rq->uclamp[UCLAMP_MIN].value)); - max_util = max_t(unsigned long, max_util, READ_ONCE(rq->uclamp[UCLAMP_MAX].value)); + min_util = max_t(unsigned long, min_util, uclamp_rq_get(rq, UCLAMP_MIN)); + max_util = max_t(unsigned long, max_util, uclamp_rq_get(rq, UCLAMP_MAX)); out: /* * Since CPU's {min,max}_util clamps are MAX aggregated considering @@ -2941,6 +2958,25 @@ static inline bool uclamp_is_used(void) { return false; } + +static inline unsigned long uclamp_rq_get(struct rq *rq, + enum uclamp_id clamp_id) +{ + if (clamp_id == UCLAMP_MIN) + return 0; + + return SCHED_CAPACITY_SCALE; +} + +static inline void uclamp_rq_set(struct rq *rq, enum uclamp_id clamp_id, + unsigned int value) +{ +} + +static inline bool uclamp_rq_is_idle(struct rq *rq) +{ + return false; +} #endif /* CONFIG_UCLAMP_TASK */
#ifdef arch_scale_freq_capacity
From: Qais Yousef qais.yousef@arm.com
commit c56ab1b3506ba0e7a872509964b100912bde165d upstream.
So that it is now uclamp aware.
This fixes a major problem of busy tasks capped with UCLAMP_MAX keeping the system in overutilized state which disables EAS and leads to wasting energy in the long run.
Without this patch running a busy background activity like JIT compilation on Pixel 6 causes the system to be in overutilized state 74.5% of the time.
With this patch this goes down to 9.79%.
It also fixes another problem when long running tasks that have their UCLAMP_MIN changed while running such that they need to upmigrate to honour the new UCLAMP_MIN value. The upmigration doesn't get triggered because overutilized state never gets set in this state, hence misfit migration never happens at tick in this case until the task wakes up again.
Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-7-qais.yousef@arm.com (cherry picked from commit c56ab1b3506ba0e7a872509964b100912bde165d) [Fixed trivial conflict in cpu_overutilized() - use cpu_util() instead of cpu_util_cfs()] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5692,7 +5692,10 @@ static inline unsigned long cpu_util(int
static inline bool cpu_overutilized(int cpu) { - return !fits_capacity(cpu_util(cpu), capacity_of(cpu)); + unsigned long rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); + unsigned long rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); + + return !util_fits_cpu(cpu_util(cpu), rq_util_min, rq_util_max, cpu); }
static inline void update_overutilized_status(struct rq *rq)
From: Qais Yousef qais.yousef@arm.com
commit d81304bc6193554014d4372a01debdf65e1e9a4d upstream.
If the utilization of the woken up task is 0, we skip the energy calculation because it has no impact.
But if the task is boosted (uclamp_min != 0) will have an impact on task placement and frequency selection. Only skip if the util is truly 0 after applying uclamp values.
Change uclamp_task_cpu() signature to avoid unnecessary additional calls to uclamp_eff_get(). feec() is the only user now.
Fixes: 732cd75b8c920 ("sched/fair: Select an energy-efficient CPU on task wake-up") Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-8-qais.yousef@arm.com (cherry picked from commit d81304bc6193554014d4372a01debdf65e1e9a4d) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3974,14 +3974,16 @@ static inline unsigned long task_util_es }
#ifdef CONFIG_UCLAMP_TASK -static inline unsigned long uclamp_task_util(struct task_struct *p) +static inline unsigned long uclamp_task_util(struct task_struct *p, + unsigned long uclamp_min, + unsigned long uclamp_max) { - return clamp(task_util_est(p), - uclamp_eff_value(p, UCLAMP_MIN), - uclamp_eff_value(p, UCLAMP_MAX)); + return clamp(task_util_est(p), uclamp_min, uclamp_max); } #else -static inline unsigned long uclamp_task_util(struct task_struct *p) +static inline unsigned long uclamp_task_util(struct task_struct *p, + unsigned long uclamp_min, + unsigned long uclamp_max) { return task_util_est(p); } @@ -6967,7 +6969,7 @@ static int find_energy_efficient_cpu(str target = prev_cpu;
sync_entity_load_avg(&p->se); - if (!task_util_est(p)) + if (!uclamp_task_util(p, p_util_min, p_util_max)) goto unlock;
for (; pd; pd = pd->next) {
From: Qais Yousef qais.yousef@arm.com
commit 44c7b80bffc3a657a36857098d5d9c49d94e652b upstream.
Check each performance domain to see if thermal pressure is causing its capacity to be lower than another performance domain.
We assume that each performance domain has CPUs with the same capacities, which is similar to an assumption made in energy_model.c
We also assume that thermal pressure impacts all CPUs in a performance domain equally.
If there're multiple performance domains with the same capacity_orig, we will trigger a capacity inversion if the domain is under thermal pressure.
The new cpu_in_capacity_inversion() should help users to know when information about capacity_orig are not reliable and can opt in to use the inverted capacity as the 'actual' capacity_orig.
Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-9-qais.yousef@arm.com (cherry picked from commit 44c7b80bffc3a657a36857098d5d9c49d94e652b) [fix trivial conflict in kernel/sched/sched.h due to code shuffling] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 19 +++++++++++++++ 2 files changed, 79 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8574,16 +8574,73 @@ static unsigned long scale_rt_capacity(i
static void update_cpu_capacity(struct sched_domain *sd, int cpu) { + unsigned long capacity_orig = arch_scale_cpu_capacity(cpu); unsigned long capacity = scale_rt_capacity(cpu); struct sched_group *sdg = sd->groups; + struct rq *rq = cpu_rq(cpu);
- cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu); + rq->cpu_capacity_orig = capacity_orig;
if (!capacity) capacity = 1;
- cpu_rq(cpu)->cpu_capacity = capacity; - trace_sched_cpu_capacity_tp(cpu_rq(cpu)); + rq->cpu_capacity = capacity; + + /* + * Detect if the performance domain is in capacity inversion state. + * + * Capacity inversion happens when another perf domain with equal or + * lower capacity_orig_of() ends up having higher capacity than this + * domain after subtracting thermal pressure. + * + * We only take into account thermal pressure in this detection as it's + * the only metric that actually results in *real* reduction of + * capacity due to performance points (OPPs) being dropped/become + * unreachable due to thermal throttling. + * + * We assume: + * * That all cpus in a perf domain have the same capacity_orig + * (same uArch). + * * Thermal pressure will impact all cpus in this perf domain + * equally. + */ + if (static_branch_unlikely(&sched_asym_cpucapacity)) { + unsigned long inv_cap = capacity_orig - thermal_load_avg(rq); + struct perf_domain *pd = rcu_dereference(rq->rd->pd); + + rq->cpu_capacity_inverted = 0; + + for (; pd; pd = pd->next) { + struct cpumask *pd_span = perf_domain_span(pd); + unsigned long pd_cap_orig, pd_cap; + + cpu = cpumask_any(pd_span); + pd_cap_orig = arch_scale_cpu_capacity(cpu); + + if (capacity_orig < pd_cap_orig) + continue; + + /* + * handle the case of multiple perf domains have the + * same capacity_orig but one of them is under higher + * thermal pressure. We record it as capacity + * inversion. + */ + if (capacity_orig == pd_cap_orig) { + pd_cap = pd_cap_orig - thermal_load_avg(cpu_rq(cpu)); + + if (pd_cap > inv_cap) { + rq->cpu_capacity_inverted = inv_cap; + break; + } + } else if (pd_cap_orig > inv_cap) { + rq->cpu_capacity_inverted = inv_cap; + break; + } + } + } + + trace_sched_cpu_capacity_tp(rq);
sdg->sgc->capacity = capacity; sdg->sgc->min_capacity = capacity; --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1003,6 +1003,7 @@ struct rq {
unsigned long cpu_capacity; unsigned long cpu_capacity_orig; + unsigned long cpu_capacity_inverted;
struct callback_head *balance_callback;
@@ -2993,6 +2994,24 @@ static inline unsigned long capacity_ori return cpu_rq(cpu)->cpu_capacity_orig; }
+/* + * Returns inverted capacity if the CPU is in capacity inversion state. + * 0 otherwise. + * + * Capacity inversion detection only considers thermal impact where actual + * performance points (OPPs) gets dropped. + * + * Capacity inversion state happens when another performance domain that has + * equal or lower capacity_orig_of() becomes effectively larger than the perf + * domain this CPU belongs to due to thermal pressure throttling it hard. + * + * See comment in update_cpu_capacity(). + */ +static inline unsigned long cpu_in_capacity_inversion(int cpu) +{ + return cpu_rq(cpu)->cpu_capacity_inverted; +} + /** * enum cpu_util_type - CPU utilization type * @FREQUENCY_UTIL: Utilization used to select frequency
From: Qais Yousef qais.yousef@arm.com
commit aa69c36f31aadc1669bfa8a3de6a47b5e6c98ee8 upstream.
We do consider thermal pressure in util_fits_cpu() for uclamp_min only. With the exception of the biggest cores which by definition are the max performance point of the system and all tasks by definition should fit.
Even under thermal pressure, the capacity of the biggest CPU is the highest in the system and should still fit every task. Except when it reaches capacity inversion point, then this is no longer true.
We can handle this by using the inverted capacity as capacity_orig in util_fits_cpu(). Which not only addresses the problem above, but also ensure uclamp_max now considers the inverted capacity. Force fitting a task when a CPU is in this adverse state will contribute to making the thermal throttling last longer.
Signed-off-by: Qais Yousef qais.yousef@arm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220804143609.515789-10-qais.yousef@arm.com (cherry picked from commit aa69c36f31aadc1669bfa8a3de6a47b5e6c98ee8) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4159,12 +4159,16 @@ static inline int util_fits_cpu(unsigned * For uclamp_max, we can tolerate a drop in performance level as the * goal is to cap the task. So it's okay if it's getting less. * - * In case of capacity inversion, which is not handled yet, we should - * honour the inverted capacity for both uclamp_min and uclamp_max all - * the time. + * In case of capacity inversion we should honour the inverted capacity + * for both uclamp_min and uclamp_max all the time. */ - capacity_orig = capacity_orig_of(cpu); - capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu); + capacity_orig = cpu_in_capacity_inversion(cpu); + if (capacity_orig) { + capacity_orig_thermal = capacity_orig; + } else { + capacity_orig = capacity_orig_of(cpu); + capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu); + }
/* * We want to force a task to fit a cpu as implied by uclamp_max.
From: Qais Yousef qyousef@layalina.io
commit e26fd28db82899be71b4b949527373d0a6be1e65 upstream.
Addresses the following warnings:
config: riscv-randconfig-m031-20221111 compiler: riscv64-linux-gcc (GCC) 12.1.0
smatch warnings: kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_min'. kernel/sched/fair.c:7263 find_energy_efficient_cpu() error: uninitialized symbol 'util_max'.
Fixes: 244226035a1f ("sched/uclamp: Fix fits_capacity() check in feec()") Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20230112122708.330667-2-qyousef@layalina.io (cherry picked from commit e26fd28db82899be71b4b949527373d0a6be1e65) [Conflict in kernel/sched/fair.c due to new automatic variables being added on master vs 5.15] Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6977,14 +6977,16 @@ static int find_energy_efficient_cpu(str goto unlock;
for (; pd; pd = pd->next) { + unsigned long util_min = p_util_min, util_max = p_util_max; unsigned long cur_delta, spare_cap, max_spare_cap = 0; unsigned long rq_util_min, rq_util_max; - unsigned long util_min, util_max; bool compute_prev_delta = false; unsigned long base_energy_pd; int max_spare_cap_cpu = -1;
for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) { + struct rq *rq = cpu_rq(cpu); + if (!cpumask_test_cpu(cpu, p->cpus_ptr)) continue;
@@ -7000,24 +7002,19 @@ static int find_energy_efficient_cpu(str * much capacity we can get out of the CPU; this is * aligned with sched_cpu_util(). */ - if (uclamp_is_used()) { - if (uclamp_rq_is_idle(cpu_rq(cpu))) { - util_min = p_util_min; - util_max = p_util_max; - } else { - /* - * Open code uclamp_rq_util_with() except for - * the clamp() part. Ie: apply max aggregation - * only. util_fits_cpu() logic requires to - * operate on non clamped util but must use the - * max-aggregated uclamp_{min, max}. - */ - rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN); - rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX); - - util_min = max(rq_util_min, p_util_min); - util_max = max(rq_util_max, p_util_max); - } + if (uclamp_is_used() && !uclamp_rq_is_idle(rq)) { + /* + * Open code uclamp_rq_util_with() except for + * the clamp() part. Ie: apply max aggregation + * only. util_fits_cpu() logic requires to + * operate on non clamped util but must use the + * max-aggregated uclamp_{min, max}. + */ + rq_util_min = uclamp_rq_get(rq, UCLAMP_MIN); + rq_util_max = uclamp_rq_get(rq, UCLAMP_MAX); + + util_min = max(rq_util_min, p_util_min); + util_max = max(rq_util_max, p_util_max); } if (!util_fits_cpu(util, util_min, util_max, cpu)) continue;
From: Qais Yousef qyousef@layalina.io
commit da07d2f9c153e457e845d4dcfdd13568d71d18a4 upstream.
Traversing the Perf Domains requires rcu_read_lock() to be held and is conditional on sched_energy_enabled(). Ensure right protections applied.
Also skip capacity inversion detection for our own pd; which was an error.
Fixes: 44c7b80bffc3 ("sched/fair: Detect capacity inversion") Reported-by: Dietmar Eggemann dietmar.eggemann@arm.com Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Vincent Guittot vincent.guittot@linaro.org Link: https://lore.kernel.org/r/20230112122708.330667-3-qyousef@layalina.io (cherry picked from commit da07d2f9c153e457e845d4dcfdd13568d71d18a4) Signed-off-by: Qais Yousef (Google) qyousef@layalina.io Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8605,16 +8605,23 @@ static void update_cpu_capacity(struct s * * Thermal pressure will impact all cpus in this perf domain * equally. */ - if (static_branch_unlikely(&sched_asym_cpucapacity)) { + if (sched_energy_enabled()) { unsigned long inv_cap = capacity_orig - thermal_load_avg(rq); - struct perf_domain *pd = rcu_dereference(rq->rd->pd); + struct perf_domain *pd;
+ rcu_read_lock(); + + pd = rcu_dereference(rq->rd->pd); rq->cpu_capacity_inverted = 0;
for (; pd; pd = pd->next) { struct cpumask *pd_span = perf_domain_span(pd); unsigned long pd_cap_orig, pd_cap;
+ /* We can't be inverted against our own pd */ + if (cpumask_test_cpu(cpu_of(rq), pd_span)) + continue; + cpu = cpumask_any(pd_span); pd_cap_orig = arch_scale_cpu_capacity(cpu);
@@ -8639,6 +8646,8 @@ static void update_cpu_capacity(struct s break; } } + + rcu_read_unlock(); }
trace_sched_cpu_capacity_tp(rq);
From: Ritesh Harjani riteshh@linux.ibm.com
commit 8ac3939db99f99667b8eb670cf4baf292896e72d upstream.
ext4_free_blocks() function became too long and confusing, this patch just pulls out the ext4_mb_clear_bb() function logic from it which clears the block bitmap and frees it.
No functionality change in this patch
Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/22c30fbb26ba409cf8aa5f0c7912970272c459e8.164499261... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 180 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 102 insertions(+), 78 deletions(-)
--- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -5888,7 +5888,8 @@ static void ext4_free_blocks_simple(stru }
/** - * ext4_free_blocks() -- Free given blocks and update quota + * ext4_mb_clear_bb() -- helper function for freeing blocks. + * Used by ext4_free_blocks() * @handle: handle for this transaction * @inode: inode * @bh: optional buffer of the block to be freed @@ -5896,9 +5897,9 @@ static void ext4_free_blocks_simple(stru * @count: number of blocks to be freed * @flags: flags used by ext4_free_blocks */ -void ext4_free_blocks(handle_t *handle, struct inode *inode, - struct buffer_head *bh, ext4_fsblk_t block, - unsigned long count, int flags) +static void ext4_mb_clear_bb(handle_t *handle, struct inode *inode, + ext4_fsblk_t block, unsigned long count, + int flags) { struct buffer_head *bitmap_bh = NULL; struct super_block *sb = inode->i_sb; @@ -5915,80 +5916,6 @@ void ext4_free_blocks(handle_t *handle,
sbi = EXT4_SB(sb);
- if (sbi->s_mount_state & EXT4_FC_REPLAY) { - ext4_free_blocks_simple(inode, block, count); - return; - } - - might_sleep(); - if (bh) { - if (block) - BUG_ON(block != bh->b_blocknr); - else - block = bh->b_blocknr; - } - - if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) && - !ext4_inode_block_valid(inode, block, count)) { - ext4_error(sb, "Freeing blocks not in datazone - " - "block = %llu, count = %lu", block, count); - goto error_return; - } - - ext4_debug("freeing block %llu\n", block); - trace_ext4_free_blocks(inode, block, count, flags); - - if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) { - BUG_ON(count > 1); - - ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA, - inode, bh, block); - } - - /* - * If the extent to be freed does not begin on a cluster - * boundary, we need to deal with partial clusters at the - * beginning and end of the extent. Normally we will free - * blocks at the beginning or the end unless we are explicitly - * requested to avoid doing so. - */ - overflow = EXT4_PBLK_COFF(sbi, block); - if (overflow) { - if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) { - overflow = sbi->s_cluster_ratio - overflow; - block += overflow; - if (count > overflow) - count -= overflow; - else - return; - } else { - block -= overflow; - count += overflow; - } - } - overflow = EXT4_LBLK_COFF(sbi, count); - if (overflow) { - if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) { - if (count > overflow) - count -= overflow; - else - return; - } else - count += sbi->s_cluster_ratio - overflow; - } - - if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) { - int i; - int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA; - - for (i = 0; i < count; i++) { - cond_resched(); - if (is_metadata) - bh = sb_find_get_block(inode->i_sb, block + i); - ext4_forget(handle, is_metadata, inode, bh, block + i); - } - } - do_more: overflow = 0; ext4_get_group_no_and_offset(sb, block, &block_group, &bit); @@ -6156,6 +6083,103 @@ error_return: return; }
+/** + * ext4_free_blocks() -- Free given blocks and update quota + * @handle: handle for this transaction + * @inode: inode + * @bh: optional buffer of the block to be freed + * @block: starting physical block to be freed + * @count: number of blocks to be freed + * @flags: flags used by ext4_free_blocks + */ +void ext4_free_blocks(handle_t *handle, struct inode *inode, + struct buffer_head *bh, ext4_fsblk_t block, + unsigned long count, int flags) +{ + struct super_block *sb = inode->i_sb; + unsigned int overflow; + struct ext4_sb_info *sbi; + + sbi = EXT4_SB(sb); + + if (sbi->s_mount_state & EXT4_FC_REPLAY) { + ext4_free_blocks_simple(inode, block, count); + return; + } + + might_sleep(); + if (bh) { + if (block) + BUG_ON(block != bh->b_blocknr); + else + block = bh->b_blocknr; + } + + if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) && + !ext4_inode_block_valid(inode, block, count)) { + ext4_error(sb, "Freeing blocks not in datazone - " + "block = %llu, count = %lu", block, count); + return; + } + + ext4_debug("freeing block %llu\n", block); + trace_ext4_free_blocks(inode, block, count, flags); + + if (bh && (flags & EXT4_FREE_BLOCKS_FORGET)) { + BUG_ON(count > 1); + + ext4_forget(handle, flags & EXT4_FREE_BLOCKS_METADATA, + inode, bh, block); + } + + /* + * If the extent to be freed does not begin on a cluster + * boundary, we need to deal with partial clusters at the + * beginning and end of the extent. Normally we will free + * blocks at the beginning or the end unless we are explicitly + * requested to avoid doing so. + */ + overflow = EXT4_PBLK_COFF(sbi, block); + if (overflow) { + if (flags & EXT4_FREE_BLOCKS_NOFREE_FIRST_CLUSTER) { + overflow = sbi->s_cluster_ratio - overflow; + block += overflow; + if (count > overflow) + count -= overflow; + else + return; + } else { + block -= overflow; + count += overflow; + } + } + overflow = EXT4_LBLK_COFF(sbi, count); + if (overflow) { + if (flags & EXT4_FREE_BLOCKS_NOFREE_LAST_CLUSTER) { + if (count > overflow) + count -= overflow; + else + return; + } else + count += sbi->s_cluster_ratio - overflow; + } + + if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) { + int i; + int is_metadata = flags & EXT4_FREE_BLOCKS_METADATA; + + for (i = 0; i < count; i++) { + cond_resched(); + if (is_metadata) + bh = sb_find_get_block(inode->i_sb, block + i); + ext4_forget(handle, is_metadata, inode, bh, block + i); + } + } + + ext4_mb_clear_bb(handle, inode, block, count, flags); + return; +} + /** * ext4_group_add_blocks() -- Add given blocks to an existing group * @handle: handle to this transaction
From: Ritesh Harjani riteshh@linux.ibm.com
commit 6bc6c2bdf1baca6522b8d9ba976257d722423085 upstream.
This API will be needed at places where we don't have an inode for e.g. while freeing blocks in ext4_group_add_blocks()
Suggested-by: Jan Kara jack@suse.cz Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Link: https://lore.kernel.org/r/dd34a236543ad5ae7123eeebe0cb69e6bdd44f34.164499261... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/block_validity.c | 26 +++++++++++++++++--------- fs/ext4/ext4.h | 3 +++ 2 files changed, 20 insertions(+), 9 deletions(-)
--- a/fs/ext4/block_validity.c +++ b/fs/ext4/block_validity.c @@ -292,15 +292,10 @@ void ext4_release_system_zone(struct sup call_rcu(&system_blks->rcu, ext4_destroy_system_zone); }
-/* - * Returns 1 if the passed-in block region (start_blk, - * start_blk+count) is valid; 0 if some part of the block region - * overlaps with some other filesystem metadata blocks. - */ -int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk, - unsigned int count) +int ext4_sb_block_valid(struct super_block *sb, struct inode *inode, + ext4_fsblk_t start_blk, unsigned int count) { - struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); + struct ext4_sb_info *sbi = EXT4_SB(sb); struct ext4_system_blocks *system_blks; struct ext4_system_zone *entry; struct rb_node *n; @@ -329,7 +324,9 @@ int ext4_inode_block_valid(struct inode else if (start_blk >= (entry->start_blk + entry->count)) n = n->rb_right; else { - ret = (entry->ino == inode->i_ino); + ret = 0; + if (inode) + ret = (entry->ino == inode->i_ino); break; } } @@ -338,6 +335,17 @@ out_rcu: return ret; }
+/* + * Returns 1 if the passed-in block region (start_blk, + * start_blk+count) is valid; 0 if some part of the block region + * overlaps with some other filesystem metadata blocks. + */ +int ext4_inode_block_valid(struct inode *inode, ext4_fsblk_t start_blk, + unsigned int count) +{ + return ext4_sb_block_valid(inode->i_sb, inode, start_blk, count); +} + int ext4_check_blockref(const char *function, unsigned int line, struct inode *inode, __le32 *p, unsigned int max) { --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -3698,6 +3698,9 @@ extern int ext4_inode_block_valid(struct unsigned int count); extern int ext4_check_blockref(const char *, unsigned int, struct inode *, __le32 *, unsigned int); +extern int ext4_sb_block_valid(struct super_block *sb, struct inode *inode, + ext4_fsblk_t start_blk, unsigned int count); +
/* extents.c */ struct ext4_ext_path;
From: Ritesh Harjani riteshh@linux.ibm.com
commit a00b482b82fb098956a5bed22bd7873e56f152f1 upstream.
Currently ext4_mb_clear_bb() & ext4_group_add_blocks() only checks whether the given block ranges (which is to be freed) belongs to any FS metadata blocks or not, of the block's respective block group. But to detect any FS error early, it is better to add more strict checkings in those functions which checks whether the given blocks belongs to any critical FS metadata or not within system-zone.
Suggested-by: Jan Kara jack@suse.cz Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/ddd9143d064774e32d6364a99667817c6e8bfdc0.164499261... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 16 +++------------- 1 file changed, 3 insertions(+), 13 deletions(-)
--- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -5946,13 +5946,7 @@ do_more: goto error_return; }
- if (in_range(ext4_block_bitmap(sb, gdp), block, count) || - in_range(ext4_inode_bitmap(sb, gdp), block, count) || - in_range(block, ext4_inode_table(sb, gdp), - sbi->s_itb_per_group) || - in_range(block + count - 1, ext4_inode_table(sb, gdp), - sbi->s_itb_per_group)) { - + if (!ext4_inode_block_valid(inode, block, count)) { ext4_error(sb, "Freeing blocks in system zone - " "Block = %llu, count = %lu", block, count); /* err = 0. ext4_std_error should be a no op */ @@ -6023,7 +6017,7 @@ do_more: NULL); if (err && err != -EOPNOTSUPP) ext4_msg(sb, KERN_WARNING, "discard request in" - " group:%d block:%d count:%lu failed" + " group:%u block:%d count:%lu failed" " with %d", block_group, bit, count, err); } else @@ -6236,11 +6230,7 @@ int ext4_group_add_blocks(handle_t *hand goto error_return; }
- if (in_range(ext4_block_bitmap(sb, desc), block, count) || - in_range(ext4_inode_bitmap(sb, desc), block, count) || - in_range(block, ext4_inode_table(sb, desc), sbi->s_itb_per_group) || - in_range(block + count - 1, ext4_inode_table(sb, desc), - sbi->s_itb_per_group)) { + if (!ext4_sb_block_valid(sb, NULL, block, count)) { ext4_error(sb, "Adding blocks in system zones - " "Block = %llu, count = %lu", block, count);
From: Lukas Czerner lczerner@redhat.com
commit 1e1c2b86ef86a8477fd9b9a4f48a6bfe235606f6 upstream.
Block range to free is validated in ext4_free_blocks() using ext4_inode_block_valid() and then it's passed to ext4_mb_clear_bb(). However in some situations on bigalloc file system the range might be adjusted after the validation in ext4_free_blocks() which can lead to troubles on corrupted file systems such as one found by syzkaller that resulted in the following BUG
kernel BUG at fs/ext4/ext4.h:3319! PREEMPT SMP NOPTI CPU: 28 PID: 4243 Comm: repro Kdump: loaded Not tainted 5.19.0-rc6+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014 RIP: 0010:ext4_free_blocks+0x95e/0xa90 Call Trace: <TASK> ? lock_timer_base+0x61/0x80 ? __es_remove_extent+0x5a/0x760 ? __mod_timer+0x256/0x380 ? ext4_ind_truncate_ensure_credits+0x90/0x220 ext4_clear_blocks+0x107/0x1b0 ext4_free_data+0x15b/0x170 ext4_ind_truncate+0x214/0x2c0 ? _raw_spin_unlock+0x15/0x30 ? ext4_discard_preallocations+0x15a/0x410 ? ext4_journal_check_start+0xe/0x90 ? __ext4_journal_start_sb+0x2f/0x110 ext4_truncate+0x1b5/0x460 ? __ext4_journal_start_sb+0x2f/0x110 ext4_evict_inode+0x2b4/0x6f0 evict+0xd0/0x1d0 ext4_enable_quotas+0x11f/0x1f0 ext4_orphan_cleanup+0x3de/0x430 ? proc_create_seq_private+0x43/0x50 ext4_fill_super+0x295f/0x3ae0 ? snprintf+0x39/0x40 ? sget_fc+0x19c/0x330 ? ext4_reconfigure+0x850/0x850 get_tree_bdev+0x16d/0x260 vfs_get_tree+0x25/0xb0 path_mount+0x431/0xa70 __x64_sys_mount+0xe2/0x120 do_syscall_64+0x5b/0x80 ? do_user_addr_fault+0x1e2/0x670 ? exc_page_fault+0x70/0x170 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fdf4e512ace
Fix it by making sure that the block range is properly validated before used every time it changes in ext4_free_blocks() or ext4_mb_clear_bb().
Link: https://syzkaller.appspot.com/bug?id=5266d464285a03cee9dbfda7d2452a72c3c2ae7... Reported-by: syzbot+15cd994e273307bf5cfa@syzkaller.appspotmail.com Signed-off-by: Lukas Czerner lczerner@redhat.com Cc: Tadeusz Struk tadeusz.struk@linaro.org Tested-by: Tadeusz Struk tadeusz.struk@linaro.org Link: https://lore.kernel.org/r/20220714165903.58260-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)
--- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -5916,6 +5916,15 @@ static void ext4_mb_clear_bb(handle_t *h
sbi = EXT4_SB(sb);
+ if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) && + !ext4_inode_block_valid(inode, block, count)) { + ext4_error(sb, "Freeing blocks in system zone - " + "Block = %llu, count = %lu", block, count); + /* err = 0. ext4_std_error should be a no op */ + goto error_return; + } + flags |= EXT4_FREE_BLOCKS_VALIDATED; + do_more: overflow = 0; ext4_get_group_no_and_offset(sb, block, &block_group, &bit); @@ -5932,6 +5941,8 @@ do_more: overflow = EXT4_C2B(sbi, bit) + count - EXT4_BLOCKS_PER_GROUP(sb); count -= overflow; + /* The range changed so it's no longer validated */ + flags &= ~EXT4_FREE_BLOCKS_VALIDATED; } count_clusters = EXT4_NUM_B2C(sbi, count); bitmap_bh = ext4_read_block_bitmap(sb, block_group); @@ -5946,7 +5957,8 @@ do_more: goto error_return; }
- if (!ext4_inode_block_valid(inode, block, count)) { + if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) && + !ext4_inode_block_valid(inode, block, count)) { ext4_error(sb, "Freeing blocks in system zone - " "Block = %llu, count = %lu", block, count); /* err = 0. ext4_std_error should be a no op */ @@ -6069,6 +6081,8 @@ do_more: block += count; count = overflow; put_bh(bitmap_bh); + /* The range changed so it's no longer validated */ + flags &= ~EXT4_FREE_BLOCKS_VALIDATED; goto do_more; } error_return: @@ -6115,6 +6129,7 @@ void ext4_free_blocks(handle_t *handle, "block = %llu, count = %lu", block, count); return; } + flags |= EXT4_FREE_BLOCKS_VALIDATED;
ext4_debug("freeing block %llu\n", block); trace_ext4_free_blocks(inode, block, count, flags); @@ -6146,6 +6161,8 @@ void ext4_free_blocks(handle_t *handle, block -= overflow; count += overflow; } + /* The range changed so it's no longer validated */ + flags &= ~EXT4_FREE_BLOCKS_VALIDATED; } overflow = EXT4_LBLK_COFF(sbi, count); if (overflow) { @@ -6156,6 +6173,8 @@ void ext4_free_blocks(handle_t *handle, return; } else count += sbi->s_cluster_ratio - overflow; + /* The range changed so it's no longer validated */ + flags &= ~EXT4_FREE_BLOCKS_VALIDATED; }
if (!bh && (flags & EXT4_FREE_BLOCKS_FORGET)) {
From: Masahiro Yamada masahiroy@kernel.org
commit 99cb0d917ffa1ab628bb67364ca9b162c07699b1 upstream.
Dennis Gilmore reports that the BuildID is missing in the arm64 vmlinux since commit 994b7ac1697b ("arm64: remove special treatment for the link order of head.o").
The issue is that the type of .notes section, which contains the BuildID, changed from NOTES to PROGBITS.
Ard Biesheuvel figured out that whichever object gets linked first gets to decide the type of a section. The PROGBITS type is the result of the compiler emitting .note.GNU-stack as PROGBITS rather than NOTE.
While Ard provided a fix for arm64, I want to fix this globally because the same issue is happening on riscv since commit 2348e6bf4421 ("riscv: remove special treatment for the link order of head.o"). This problem will happen in general for other architectures if they start to drop unneeded entries from scripts/head-object-list.txt.
Discard .note.GNU-stack in include/asm-generic/vmlinux.lds.h.
Link: https://lore.kernel.org/lkml/CAABkxwuQoz1CTbyb57n0ZX65eSYiTonFCU8-LCQc=74D=x... Fixes: 994b7ac1697b ("arm64: remove special treatment for the link order of head.o") Fixes: 2348e6bf4421 ("riscv: remove special treatment for the link order of head.o") Reported-by: Dennis Gilmore dennis@ausil.us Suggested-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Acked-by: Palmer Dabbelt palmer@rivosinc.com [Tom: stable backport 5.15.y, 5.10.y, 5.4.y]
Though the above "Fixes:" commits are not in this kernel, the conditions which lead to a missing Build ID in arm64 vmlinux are similar.
Evidence points to these conditions: 1. ld version > 2.36 (exact binutils commit documented in a494398bde27) 2. first object which gets linked (head.o) has a PROGBITS .note.GNU-stack segment
These conditions can be observed when: - 5.15.60+ OR 5.10.136+ OR 5.4.210+ - AND ld version > 2.36 - AND arch=arm64 - AND CONFIG_MODVERSIONS=y
There are notable differences in the vmlinux elf files produced before(bad) and after(good) applying this series.
Good: p_type:PT_NOTE segment exists. Bad: p_type:PT_NOTE segment is missing.
Good: sh_name_str:.notes section has sh_type:SHT_NOTE Bad: sh_name_str:.notes section has sh_type:SHT_PROGBITS
`readelf -n` (as of v2.40) searches for Build Id by processing only the very first note in sh_type:SHT_NOTE sections.
This was previously bisected to the stable backport of 0d362be5b142. Follow-up experiments were discussed here: https://lore.kernel.org/all/20221221235413.xaisboqmr7dkqwn6@oracle.com/ which strongly hints at condition 2. Signed-off-by: Tom Saeger tom.saeger@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/asm-generic/vmlinux.lds.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -903,7 +903,12 @@ #define PRINTK_INDEX #endif
+/* + * Discard .note.GNU-stack, which is emitted as PROGBITS by the compiler. + * Otherwise, the type of .notes section would become PROGBITS instead of NOTES. + */ #define NOTES \ + /DISCARD/ : { *(.note.GNU-stack) } \ .notes : AT(ADDR(.notes) - LOAD_OFFSET) { \ __start_notes = .; \ KEEP(*(.note.*)) \
From: Michael Ellerman mpe@ellerman.id.au
commit 4b9880dbf3bdba3a7c56445137c3d0e30aaa0a40 upstream.
The powerpc linker script explicitly includes .exit.text, because otherwise the link fails due to references from __bug_table and __ex_table. The code is freed (discarded) at runtime along with .init.text and data.
That has worked in the past despite powerpc not defining RUNTIME_DISCARD_EXIT because DISCARDS appears late in the powerpc linker script (line 410), and the explicit inclusion of .exit.text earlier (line 280) supersedes the discard.
However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 136). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier [1], causing .exit.text to actually be discarded at link time, leading to build errors:
'.exit.text' referenced in section '__bug_table' of crypto/algboss.o: defined in discarded section '.exit.text' of crypto/algboss.o '.exit.text' referenced in section '__ex_table' of drivers/nvdimm/core.o: defined in discarded section '.exit.text' of drivers/nvdimm/core.o
Fix it by defining RUNTIME_DISCARD_EXIT, which causes the generic DISCARDS macro to not include .exit.text at all.
1: https://lore.kernel.org/lkml/87fscp2v7k.fsf@igel.home/
Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20230105132349.384666-1-mpe@ellerman.id.au Signed-off-by: Tom Saeger tom.saeger@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/vmlinux.lds.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/powerpc/kernel/vmlinux.lds.S +++ b/arch/powerpc/kernel/vmlinux.lds.S @@ -8,6 +8,7 @@ #define BSS_FIRST_SECTIONS *(.bss.prominit) #define EMITS_PT_NOTE #define RO_EXCEPTION_TABLE_ALIGN 0 +#define RUNTIME_DISCARD_EXIT
#define SOFT_MASK_TABLE(align) \ . = ALIGN(align); \
From: Michael Ellerman mpe@ellerman.id.au
commit 07b050f9290ee012a407a0f64151db902a1520f5 upstream.
Relocatable kernels must not discard relocations, they need to be processed at runtime. As such they are included for CONFIG_RELOCATABLE builds in the powerpc linker script (line 340).
However they are also unconditionally discarded later in the script (line 414). Previously that worked because the earlier inclusion superseded the discard.
However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") introduced an earlier use of DISCARD as part of the RO_DATA macro (line 137). With binutils < 2.36 that causes the DISCARD directives later in the script to be applied earlier, causing .rela* to actually be discarded at link time, leading to build warnings and a kernel that doesn't boot:
ld: warning: discarding dynamic section .rela.init.rodata
Fix it by conditionally discarding .rela* only when CONFIG_RELOCATABLE is disabled.
Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://lore.kernel.org/r/20230105132349.384666-2-mpe@ellerman.id.au Signed-off-by: Tom Saeger tom.saeger@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/vmlinux.lds.S | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/arch/powerpc/kernel/vmlinux.lds.S +++ b/arch/powerpc/kernel/vmlinux.lds.S @@ -408,9 +408,12 @@ SECTIONS DISCARDS /DISCARD/ : { *(*.EMB.apuinfo) - *(.glink .iplt .plt .rela* .comment) + *(.glink .iplt .plt .comment) *(.gnu.version*) *(.gnu.attributes) *(.eh_frame) +#ifndef CONFIG_RELOCATABLE + *(.rela*) +#endif } }
From: Masahiro Yamada masahiroy@kernel.org
commit a494398bde273143c2352dd373cad8211f7d94b2 upstream.
Nathan Chancellor reports that the s390 vmlinux fails to link with GNU ld < 2.36 since commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv").
It happens for defconfig, or more specifically for CONFIG_EXPOLINE=y.
$ s390x-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2 $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- allnoconfig $ ./scripts/config -e CONFIG_EXPOLINE $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- olddefconfig $ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- `.exit.text' referenced in section `.s390_return_reg' of drivers/base/dd.o: defined in discarded section `.exit.text' of drivers/base/dd.o make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make: *** [Makefile:1252: vmlinux] Error 2
arch/s390/kernel/vmlinux.lds.S wants to keep EXIT_TEXT:
.exit.text : { EXIT_TEXT }
But, at the same time, EXIT_TEXT is thrown away by DISCARD because s390 does not define RUNTIME_DISCARD_EXIT.
I still do not understand why the latter wins after 99cb0d917ffa, but defining RUNTIME_DISCARD_EXIT seems correct because the comment line in arch/s390/kernel/vmlinux.lds.S says:
/* * .exit.text is discarded at runtime, not link time, * to deal with references from __bug_table */
Nathan also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it.
Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Reported-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Link: https://lore.kernel.org/r/20230105031306.1455409-1-masahiroy@kernel.org Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Tom Saeger tom.saeger@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/kernel/vmlinux.lds.S | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -17,6 +17,8 @@ /* Handle ro_after_init data on our own. */ #define RO_AFTER_INIT_DATA
+#define RUNTIME_DISCARD_EXIT + #define EMITS_PT_NOTE
#include <asm-generic/vmlinux.lds.h>
From: Tom Saeger tom.saeger@oracle.com
commit c1c551bebf928889e7a8fef7415b44f9a64975f4 upstream.
sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv").
This is similar to fixes for powerpc and s390: commit 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT"). commit a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36").
$ sh4-linux-gnu-ld --version | head -n1 GNU ld (GNU Binutils for Debian) 2.35.2
$ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig $ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu-
`.exit.text' referenced in section `__bug_table' of crypto/algboss.o: defined in discarded section `.exit.text' of crypto/algboss.o `.exit.text' referenced in section `__bug_table' of drivers/char/hw_random/core.o: defined in discarded section `.exit.text' of drivers/char/hw_random/core.o make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 make[1]: *** [Makefile:1252: vmlinux] Error 2
arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT:
/* * .exit.text is discarded at runtime, not link time, to deal with * references from __bug_table */ .exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT }
However, EXIT_TEXT is thrown away by DISCARD(include/asm-generic/vmlinux.lds.h) because sh does not define RUNTIME_DISCARD_EXIT.
GNU ld 2.40 does not have this issue and builds fine. This corresponds with Masahiro's comments in a494398bde27: "Nathan [Chancellor] also found that binutils commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this issue, so we cannot reproduce it with binutils 2.36+, but it is better to not rely on it."
Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.167451846... Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/ Signed-off-by: Tom Saeger tom.saeger@oracle.com Tested-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Cc: Ard Biesheuvel ardb@kernel.org Cc: Arnd Bergmann arnd@arndb.de Cc: Christoph Hellwig hch@lst.de Cc: Dennis Gilmore dennis@ausil.us Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Masahiro Yamada masahiroy@kernel.org Cc: Naresh Kamboju naresh.kamboju@linaro.org Cc: Nathan Chancellor nathan@kernel.org Cc: Palmer Dabbelt palmer@rivosinc.com Cc: Rich Felker dalias@libc.org Cc: Yoshinori Sato ysato@users.sourceforge.jp Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Tom Saeger tom.saeger@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sh/kernel/vmlinux.lds.S | 1 + 1 file changed, 1 insertion(+)
--- a/arch/sh/kernel/vmlinux.lds.S +++ b/arch/sh/kernel/vmlinux.lds.S @@ -4,6 +4,7 @@ * Written by Niibe Yutaka and Paul Mundt */ OUTPUT_ARCH(sh) +#define RUNTIME_DISCARD_EXIT #include <asm/thread_info.h> #include <asm/cache.h> #include <asm/vmlinux.lds.h>
From: Andres Freund andres@anarazel.de
commit cfd59ca91467056bb2c36907b2fa67b8e1af9952 upstream.
binutils changed the signature of init_disassemble_info(), which now causes compilation failures for tools/{perf,bpf}, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=60a3da00bd5407...
This commit adds a feature test to detect the new signature. Subsequent commits will use it to fix the build failures.
Signed-off-by: Andres Freund andres@anarazel.de Acked-by: Quentin Monnet quentin@isovalent.com Cc: Alexei Starovoitov ast@kernel.org Cc: Ben Hutchings benh@debian.org Cc: Jiri Olsa jolsa@kernel.org Cc: Quentin Monnet quentin@isovalent.com Cc: Sedat Dilek sedat.dilek@gmail.com Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.d... Link: https://lore.kernel.org/r/20220801013834.156015-2-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/build/Makefile.feature | 1 + tools/build/feature/Makefile | 4 ++++ tools/build/feature/test-all.c | 4 ++++ tools/build/feature/test-disassembler-init-styled.c | 13 +++++++++++++ 4 files changed, 22 insertions(+) create mode 100644 tools/build/feature/test-disassembler-init-styled.c
--- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -69,6 +69,7 @@ FEATURE_TESTS_BASIC := libaio \ libzstd \ disassembler-four-args \ + disassembler-init-styled \ file-handle
# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -18,6 +18,7 @@ FILES= test-libbfd.bin \ test-libbfd-buildid.bin \ test-disassembler-four-args.bin \ + test-disassembler-init-styled.bin \ test-reallocarray.bin \ test-libbfd-liberty.bin \ test-libbfd-liberty-z.bin \ @@ -239,6 +240,9 @@ $(OUTPUT)test-libbfd-buildid.bin: $(OUTPUT)test-disassembler-four-args.bin: $(BUILD) -DPACKAGE='"perf"' -lbfd -lopcodes
+$(OUTPUT)test-disassembler-init-styled.bin: + $(BUILD) -DPACKAGE='"perf"' -lbfd -lopcodes + $(OUTPUT)test-reallocarray.bin: $(BUILD)
--- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -166,6 +166,10 @@ # include "test-disassembler-four-args.c" #undef main
+#define main main_test_disassembler_init_styled +# include "test-disassembler-init-styled.c" +#undef main + #define main main_test_libzstd # include "test-libzstd.c" #undef main --- /dev/null +++ b/tools/build/feature/test-disassembler-init-styled.c @@ -0,0 +1,13 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <dis-asm.h> + +int main(void) +{ + struct disassemble_info info; + + init_disassemble_info(&info, stdout, + NULL, NULL); + + return 0; +}
From: Andres Freund andres@anarazel.de
commit a45b3d6926231c3d024ea0de4f7bd967f83709ee upstream.
binutils changed the signature of init_disassemble_info(), which now causes compilation failures for tools/{perf,bpf}, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=60a3da00bd5407...
This commit introduces a wrapper for init_disassemble_info(), to avoid spreading #ifdef DISASM_INIT_STYLED to a bunch of places. Subsequent commits will use it to fix the build failures.
It likely is worth adding a wrapper for disassember(), to avoid the already existing DISASM_FOUR_ARGS_SIGNATURE ifdefery.
Signed-off-by: Andres Freund andres@anarazel.de Signed-off-by: Ben Hutchings benh@debian.org Acked-by: Quentin Monnet quentin@isovalent.com Cc: Alexei Starovoitov ast@kernel.org Cc: Ben Hutchings benh@debian.org Cc: Jiri Olsa jolsa@kernel.org Cc: Quentin Monnet quentin@isovalent.com Cc: Sedat Dilek sedat.dilek@gmail.com Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.d... Link: https://lore.kernel.org/r/20220801013834.156015-4-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/include/tools/dis-asm-compat.h | 55 +++++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 tools/include/tools/dis-asm-compat.h
--- /dev/null +++ b/tools/include/tools/dis-asm-compat.h @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */ +#ifndef _TOOLS_DIS_ASM_COMPAT_H +#define _TOOLS_DIS_ASM_COMPAT_H + +#include <stdio.h> +#include <dis-asm.h> + +/* define types for older binutils version, to centralize ifdef'ery a bit */ +#ifndef DISASM_INIT_STYLED +enum disassembler_style {DISASSEMBLER_STYLE_NOT_EMPTY}; +typedef int (*fprintf_styled_ftype) (void *, enum disassembler_style, const char*, ...); +#endif + +/* + * Trivial fprintf wrapper to be used as the fprintf_styled_func argument to + * init_disassemble_info_compat() when normal fprintf suffices. + */ +static inline int fprintf_styled(void *out, + enum disassembler_style style, + const char *fmt, ...) +{ + va_list args; + int r; + + (void)style; + + va_start(args, fmt); + r = vfprintf(out, fmt, args); + va_end(args); + + return r; +} + +/* + * Wrapper for init_disassemble_info() that hides version + * differences. Depending on binutils version and architecture either + * fprintf_func or fprintf_styled_func will be called. + */ +static inline void init_disassemble_info_compat(struct disassemble_info *info, + void *stream, + fprintf_ftype unstyled_func, + fprintf_styled_ftype styled_func) +{ +#ifdef DISASM_INIT_STYLED + init_disassemble_info(info, stream, + unstyled_func, + styled_func); +#else + (void)styled_func; + init_disassemble_info(info, stream, + unstyled_func); +#endif +} + +#endif /* _TOOLS_DIS_ASM_COMPAT_H */
From: Andres Freund andres@anarazel.de
commit 83aa0120487e8bc3f231e72c460add783f71f17c upstream.
binutils changed the signature of init_disassemble_info(), which now causes compilation failures for tools/perf/util/annotate.c, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=60a3da00bd5407...
Wire up the feature test and switch to init_disassemble_info_compat(), which were introduced in prior commits, fixing the compilation failure.
I verified that perf can still disassemble bpf programs by using bpftrace under load, recording a perf trace, and then annotating the bpf "function" with and without the changes. With old binutils there's no change in output before/after this patch. When comparing the output from old binutils (2.35) to new bintuils with the patch (upstream snapshot) there are a few output differences, but they are unrelated to this patch. An example hunk is:
1.15 : 55:mov %rbp,%rdx 0.00 : 58:add $0xfffffffffffffff8,%rdx 0.00 : 5c:xor %ecx,%ecx - 1.03 : 5e:callq 0xffffffffe12aca3c + 1.03 : 5e:call 0xffffffffe12aca3c 0.00 : 63:xor %eax,%eax - 2.18 : 65:leaveq - 2.82 : 66:retq + 2.18 : 65:leave + 2.82 : 66:ret
Signed-off-by: Andres Freund andres@anarazel.de Acked-by: Quentin Monnet quentin@isovalent.com Cc: Alexei Starovoitov ast@kernel.org Cc: Ben Hutchings benh@debian.org Cc: Jiri Olsa jolsa@kernel.org Cc: Sedat Dilek sedat.dilek@gmail.com Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.d... Link: https://lore.kernel.org/r/20220801013834.156015-5-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/Makefile.config | 8 ++++++++ tools/perf/util/annotate.c | 7 ++++--- 2 files changed, 12 insertions(+), 3 deletions(-)
--- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -296,6 +296,7 @@ FEATURE_CHECK_LDFLAGS-libpython := $(PYT FEATURE_CHECK_LDFLAGS-libaio = -lrt
FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes -ldl +FEATURE_CHECK_LDFLAGS-disassembler-init-styled = -lbfd -lopcodes -ldl
CORE_CFLAGS += -fno-omit-frame-pointer CORE_CFLAGS += -ggdb3 @@ -872,13 +873,16 @@ ifndef NO_LIBBFD ifeq ($(feature-libbfd-liberty), 1) EXTLIBS += -lbfd -lopcodes -liberty FEATURE_CHECK_LDFLAGS-disassembler-four-args += -liberty -ldl + FEATURE_CHECK_LDFLAGS-disassembler-init-styled += -liberty -ldl else ifeq ($(feature-libbfd-liberty-z), 1) EXTLIBS += -lbfd -lopcodes -liberty -lz FEATURE_CHECK_LDFLAGS-disassembler-four-args += -liberty -lz -ldl + FEATURE_CHECK_LDFLAGS-disassembler-init-styled += -liberty -lz -ldl endif endif $(call feature_check,disassembler-four-args) + $(call feature_check,disassembler-init-styled) endif
ifeq ($(feature-libbfd-buildid), 1) @@ -992,6 +996,10 @@ ifeq ($(feature-disassembler-four-args), CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE endif
+ifeq ($(feature-disassembler-init-styled), 1) + CFLAGS += -DDISASM_INIT_STYLED +endif + ifeq (${IS_64_BIT}, 1) ifndef NO_PERF_READ_VDSO32 $(call feature_check,compile-32) --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1694,6 +1694,7 @@ fallback: #include <bpf/btf.h> #include <bpf/libbpf.h> #include <linux/btf.h> +#include <tools/dis-asm-compat.h>
static int symbol__disassemble_bpf(struct symbol *sym, struct annotate_args *args) @@ -1736,9 +1737,9 @@ static int symbol__disassemble_bpf(struc ret = errno; goto out; } - init_disassemble_info(&info, s, - (fprintf_ftype) fprintf); - + init_disassemble_info_compat(&info, s, + (fprintf_ftype) fprintf, + fprintf_styled); info.arch = bfd_get_arch(bfdf); info.mach = bfd_get_mach(bfdf);
From: Andres Freund andres@anarazel.de
commit 96ed066054abf11c7d3e106e3011a51f3f1227a3 upstream.
binutils changed the signature of init_disassemble_info(), which now causes compilation to fail for tools/bpf/bpf_jit_disasm.c, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=60a3da00bd5407...
Wire up the feature test and switch to init_disassemble_info_compat(), which were introduced in prior commits, fixing the compilation failure.
I verified that bpf_jit_disasm can still disassemble bpf programs, both with the old and new dis-asm.h API. With old binutils there's no change in output before/after this patch. When comparing the output from old binutils (2.35) to new bintuils with the patch (upstream snapshot) there are a few output differences, but they are unrelated to this patch. An example hunk is:
f4: mov %r14,%rsi f7: mov %r15,%rdx fa: mov $0x2a,%ecx - ff: callq 0xffffffffea8c4988 + ff: call 0xffffffffea8c4988 104: test %rax,%rax 107: jge 0x0000000000000110 109: xor %eax,%eax - 10b: jmpq 0x0000000000000073 + 10b: jmp 0x0000000000000073 110: cmp $0x16,%rax
However, I had to use an older kernel to generate the bpf_jit_enabled = 2 output, as that has been broken since 5.18 / 1022a5498f6f745c ("bpf, x86_64: Use bpf_jit_binary_pack_alloc").
https://lore.kernel.org/20220703030210.pmjft7qc2eajzi6c@alap3.anarazel.de
Signed-off-by: Andres Freund andres@anarazel.de Acked-by: Quentin Monnet quentin@isovalent.com Cc: Alexei Starovoitov ast@kernel.org Cc: Ben Hutchings benh@debian.org Cc: Daniel Borkmann daniel@iogearbox.net Cc: Jiri Olsa jolsa@kernel.org Cc: Quentin Monnet quentin@isovalent.com Cc: Sedat Dilek sedat.dilek@gmail.com Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.d... Link: https://lore.kernel.org/r/20220801013834.156015-6-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/bpf/Makefile | 5 ++++- tools/bpf/bpf_jit_disasm.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)
--- a/tools/bpf/Makefile +++ b/tools/bpf/Makefile @@ -34,7 +34,7 @@ else endif
FEATURE_USER = .bpf -FEATURE_TESTS = libbfd disassembler-four-args +FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled FEATURE_DISPLAY = libbfd disassembler-four-args
check_feat := 1 @@ -56,6 +56,9 @@ endif ifeq ($(feature-disassembler-four-args), 1) CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE endif +ifeq ($(feature-disassembler-init-styled), 1) +CFLAGS += -DDISASM_INIT_STYLED +endif
$(OUTPUT)%.yacc.c: $(srctree)/tools/bpf/%.y $(QUIET_BISON)$(YACC) -o $@ -d $< --- a/tools/bpf/bpf_jit_disasm.c +++ b/tools/bpf/bpf_jit_disasm.c @@ -28,6 +28,7 @@ #include <sys/types.h> #include <sys/stat.h> #include <limits.h> +#include <tools/dis-asm-compat.h>
#define CMD_ACTION_SIZE_BUFFER 10 #define CMD_ACTION_READ_ALL 3 @@ -64,7 +65,9 @@ static void get_asm_insns(uint8_t *image assert(bfdf); assert(bfd_check_format(bfdf, bfd_object));
- init_disassemble_info(&info, stdout, (fprintf_ftype) fprintf); + init_disassemble_info_compat(&info, stdout, + (fprintf_ftype) fprintf, + fprintf_styled); info.arch = bfd_get_arch(bfdf); info.mach = bfd_get_mach(bfdf); info.buffer = image;
From: Andres Freund andres@anarazel.de
commit 600b7b26c07a070d0153daa76b3806c1e52c9e00 upstream.
binutils changed the signature of init_disassemble_info(), which now causes compilation to fail for tools/bpf/bpftool/jit_disasm.c, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=60a3da00bd5407...
Wire up the feature test and switch to init_disassemble_info_compat(), which were introduced in prior commits, fixing the compilation failure.
I verified that bpftool can still disassemble bpf programs, both with an old and new dis-asm.h API. There are no output changes for plain and json formats. When comparing the output from old binutils (2.35) to new bintuils with the patch (upstream snapshot) there are a few output differences, but they are unrelated to this patch. An example hunk is:
2f: pop %r14 31: pop %r13 33: pop %rbx - 34: leaveq - 35: retq + 34: leave + 35: ret
Signed-off-by: Andres Freund andres@anarazel.de Acked-by: Quentin Monnet quentin@isovalent.com Cc: Alexei Starovoitov ast@kernel.org Cc: Ben Hutchings benh@debian.org Cc: Jiri Olsa jolsa@kernel.org Cc: Quentin Monnet quentin@isovalent.com Cc: Sedat Dilek sedat.dilek@gmail.com Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.d... Link: https://lore.kernel.org/r/20220801013834.156015-8-andres@anarazel.de Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/bpf/bpftool/Makefile | 5 +++- tools/bpf/bpftool/jit_disasm.c | 42 +++++++++++++++++++++++++++++++++-------- 2 files changed, 38 insertions(+), 9 deletions(-)
--- a/tools/bpf/bpftool/Makefile +++ b/tools/bpf/bpftool/Makefile @@ -76,7 +76,7 @@ INSTALL ?= install RM ?= rm -f
FEATURE_USER = .bpftool -FEATURE_TESTS = libbfd disassembler-four-args reallocarray zlib libcap \ +FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled reallocarray zlib libcap \ clang-bpf-co-re FEATURE_DISPLAY = libbfd disassembler-four-args zlib libcap \ clang-bpf-co-re @@ -111,6 +111,9 @@ ifeq ($(feature-libcap), 1) CFLAGS += -DUSE_LIBCAP LIBS += -lcap endif +ifeq ($(feature-disassembler-init-styled), 1) + CFLAGS += -DDISASM_INIT_STYLED +endif
include $(wildcard $(OUTPUT)*.d)
--- a/tools/bpf/bpftool/jit_disasm.c +++ b/tools/bpf/bpftool/jit_disasm.c @@ -24,6 +24,7 @@ #include <sys/stat.h> #include <limits.h> #include <bpf/libbpf.h> +#include <tools/dis-asm-compat.h>
#include "json_writer.h" #include "main.h" @@ -39,15 +40,12 @@ static void get_exec_path(char *tpath, s }
static int oper_count; -static int fprintf_json(void *out, const char *fmt, ...) +static int printf_json(void *out, const char *fmt, va_list ap) { - va_list ap; char *s; int err;
- va_start(ap, fmt); err = vasprintf(&s, fmt, ap); - va_end(ap); if (err < 0) return -1;
@@ -73,6 +71,32 @@ static int fprintf_json(void *out, const return 0; }
+static int fprintf_json(void *out, const char *fmt, ...) +{ + va_list ap; + int r; + + va_start(ap, fmt); + r = printf_json(out, fmt, ap); + va_end(ap); + + return r; +} + +static int fprintf_json_styled(void *out, + enum disassembler_style style __maybe_unused, + const char *fmt, ...) +{ + va_list ap; + int r; + + va_start(ap, fmt); + r = printf_json(out, fmt, ap); + va_end(ap); + + return r; +} + void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, const char *arch, const char *disassembler_options, const struct btf *btf, @@ -99,11 +123,13 @@ void disasm_print_insn(unsigned char *im assert(bfd_check_format(bfdf, bfd_object));
if (json_output) - init_disassemble_info(&info, stdout, - (fprintf_ftype) fprintf_json); + init_disassemble_info_compat(&info, stdout, + (fprintf_ftype) fprintf_json, + fprintf_json_styled); else - init_disassemble_info(&info, stdout, - (fprintf_ftype) fprintf); + init_disassemble_info_compat(&info, stdout, + (fprintf_ftype) fprintf, + fprintf_styled);
/* Update architecture info for offload. */ if (arch) {
From: Miaohe Lin linmiaohe@huawei.com
commit 5a2a961be2ad6a16eb388a80442443b353c11d16 upstream.
When alloc_cpumask_var_node() fails for a certain cpu, there might be some allocated cpumasks for percpu cpu_kick_mask. We should free these cpumasks or memoryleak will occur.
Fixes: baff59ccdc65 ("KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Link: https://lore.kernel.org/r/20220823063414.59778-1-linmiaohe@huawei.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5649,7 +5649,7 @@ int kvm_init(void *opaque, unsigned vcpu
r = kvm_async_pf_init(); if (r) - goto out_free_5; + goto out_free_4;
kvm_chardev_ops.owner = module; kvm_vm_fops.owner = module; @@ -5682,10 +5682,9 @@ err_register: kvm_vfio_ops_exit(); err_vfio: kvm_async_pf_deinit(); -out_free_5: +out_free_4: for_each_possible_cpu(cpu) free_cpumask_var(per_cpu(cpu_kick_mask, cpu)); -out_free_4: kmem_cache_destroy(kvm_vcpu_cache); out_free_3: unregister_reboot_notifier(&kvm_reboot_notifier);
From: Gaosheng Cui cuigaosheng1@huawei.com
commit b0463b9dd7030a766133ad2f1571f97f204d7bdf upstream.
xfs_setattr_time() has been removed since commit e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes"), so remove it.
Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Reviewed-by: Carlos Maiolino cmaiolino@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_iops.h | 1 - 1 file changed, 1 deletion(-)
--- a/fs/xfs/xfs_iops.h +++ b/fs/xfs/xfs_iops.h @@ -13,7 +13,6 @@ extern const struct file_operations xfs_
extern ssize_t xfs_vn_listxattr(struct dentry *, char *data, size_t size);
-extern void xfs_setattr_time(struct xfs_inode *ip, struct iattr *iattr); int xfs_vn_setattr_size(struct user_namespace *mnt_userns, struct dentry *dentry, struct iattr *vap);
From: Masahiro Yamada masahiroy@kernel.org
commit b99ddbe8336ee680257c8ab479f75051eaa49dcf upstream.
With CONFIG_VIRTIO_UML=y, GNU ld < 2.36 fails to link UML vmlinux (w/wo CONFIG_LD_SCRIPT_STATIC).
`.exit.text' referenced in section `.uml.exitcall.exit' of arch/um/drivers/virtio_uml.o: defined in discarded section `.exit.text' of arch/um/drivers/virtio_uml.o collect2: error: ld returned 1 exit status
This fix is similar to the following commits:
- 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT") - a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36") - c1c551bebf92 ("sh: define RUNTIME_DISCARD_EXIT")
Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Reported-by: SeongJae Park sj@kernel.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Tested-by: SeongJae Park sj@kernel.org Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/um/kernel/vmlinux.lds.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/um/kernel/vmlinux.lds.S +++ b/arch/um/kernel/vmlinux.lds.S @@ -1,4 +1,4 @@ - +#define RUNTIME_DISCARD_EXIT KERNEL_STACK_SIZE = 4096 * (1 << CONFIG_KERNEL_STACK_ORDER);
#ifdef CONFIG_LD_SCRIPT_STATIC
From: Christian Brauner christian.brauner@ubuntu.com
commit e1bbcd277a53e08d619ffeec56c5c9287f2bf42f upstream.
Hold writers when changing a mount's idmapping to make it more robust.
The vfs layer takes care to retrieve the idmapping of a mount once ensuring that the idmapping used for vfs permission checking is identical to the idmapping passed down to the filesystem.
For ioctl codepaths the filesystem itself is responsible for taking the idmapping into account if they need to. While all filesystems with FS_ALLOW_IDMAP raised take the same precautions as the vfs we should enforce it explicitly by making sure there are no active writers on the relevant mount while changing the idmapping.
This is similar to turning a mount ro with the difference that in contrast to turning a mount ro changing the idmapping can only ever be done once while a mount can transition between ro and rw as much as it wants.
This is a minor user-visible change. But it is extremely unlikely to matter. The caller must've created a detached mount via OPEN_TREE_CLONE and then handed that O_PATH fd to another process or thread which then must've gotten a writable fd for that mount and started creating files in there while the caller is still changing mount properties. While not impossible it will be an extremely rare corner-case and should in general be considered a bug in the application. Consider making a mount MOUNT_ATTR_NOEXEC or MOUNT_ATTR_NODEV while allowing someone else to perform lookups or exec'ing in parallel by handing them a copy of the OPEN_TREE_CLONE fd or another fd beneath that mount.
Link: https://lore.kernel.org/r/20220510095840.152264-1-brauner@kernel.org Cc: Seth Forshee seth.forshee@digitalocean.com Cc: Christoph Hellwig hch@lst.de Cc: Al Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner (Microsoft) brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/namespace.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-)
--- a/fs/namespace.c +++ b/fs/namespace.c @@ -3968,6 +3968,23 @@ static int can_idmap_mount(const struct return 0; }
+/** + * mnt_allow_writers() - check whether the attribute change allows writers + * @kattr: the new mount attributes + * @mnt: the mount to which @kattr will be applied + * + * Check whether thew new mount attributes in @kattr allow concurrent writers. + * + * Return: true if writers need to be held, false if not + */ +static inline bool mnt_allow_writers(const struct mount_kattr *kattr, + const struct mount *mnt) +{ + return (!(kattr->attr_set & MNT_READONLY) || + (mnt->mnt.mnt_flags & MNT_READONLY)) && + !kattr->mnt_userns; +} + static struct mount *mount_setattr_prepare(struct mount_kattr *kattr, struct mount *mnt, int *err) { @@ -3998,8 +4015,7 @@ static struct mount *mount_setattr_prepa
last = m;
- if ((kattr->attr_set & MNT_READONLY) && - !(m->mnt.mnt_flags & MNT_READONLY)) { + if (!mnt_allow_writers(kattr, m)) { *err = mnt_hold_writers(m); if (*err) goto out; @@ -4050,13 +4066,8 @@ static void mount_setattr_commit(struct WRITE_ONCE(m->mnt.mnt_flags, flags); }
- /* - * We either set MNT_READONLY above so make it visible - * before ~MNT_WRITE_HOLD or we failed to recursively - * apply mount options. - */ - if ((kattr->attr_set & MNT_READONLY) && - (m->mnt.mnt_flags & MNT_WRITE_HOLD)) + /* If we had to hold writers unblock them. */ + if (m->mnt.mnt_flags & MNT_WRITE_HOLD) mnt_unhold_writers(m);
if (!err && kattr->propagation)
From: Vitaly Kuznetsov vkuznets@redhat.com
commit 250552b925ce400c17d166422fde9bb215958481 upstream.
When KVM runs as a nested hypervisor on top of Hyper-V it uses Enlightened VMCS and enables Enlightened MSR Bitmap feature for its L1s and L2s (which are actually L2s and L3s from Hyper-V's perspective). When MSR bitmap is updated, KVM has to reset HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP from clean fields to make Hyper-V aware of the change. For KVM's L1s, this is done in vmx_disable_intercept_for_msr()/vmx_enable_intercept_for_msr(). MSR bitmap for L2 is build in nested_vmx_prepare_msr_bitmap() by blending MSR bitmap for L1 and L1's idea of MSR bitmap for L2. KVM, however, doesn't check if the resulting bitmap is different and never cleans HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP in eVMCS02. This is incorrect and may result in Hyper-V missing the update.
The issue could've been solved by calling evmcs_touch_msr_bitmap() for eVMCS02 from nested_vmx_prepare_msr_bitmap() unconditionally but doing so would not give any performance benefits (compared to not using Enlightened MSR Bitmap at all). 3-level nesting is also not a very common setup nowadays.
Don't enable 'Enlightened MSR Bitmap' feature for KVM's L2s (real L3s) for now.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Message-Id: 20211129094704.326635-2-vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Alexandru Matei alexandru.matei@uipath.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmx.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2739,15 +2739,6 @@ int alloc_loaded_vmcs(struct loaded_vmcs if (!loaded_vmcs->msr_bitmap) goto out_vmcs; memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE); - - if (IS_ENABLED(CONFIG_HYPERV) && - static_branch_unlikely(&enable_evmcs) && - (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)) { - struct hv_enlightened_vmcs *evmcs = - (struct hv_enlightened_vmcs *)loaded_vmcs->vmcs; - - evmcs->hv_enlightenments_control.msr_bitmap = 1; - } }
memset(&loaded_vmcs->host_state, 0, sizeof(struct vmcs_host_state)); @@ -6969,6 +6960,19 @@ static int vmx_create_vcpu(struct kvm_vc if (err < 0) goto free_pml;
+ /* + * Use Hyper-V 'Enlightened MSR Bitmap' feature when KVM runs as a + * nested (L1) hypervisor and Hyper-V in L0 supports it. Enable the + * feature only for vmcs01, KVM currently isn't equipped to realize any + * performance benefits from enabling it for vmcs02. + */ + if (IS_ENABLED(CONFIG_HYPERV) && static_branch_unlikely(&enable_evmcs) && + (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)) { + struct hv_enlightened_vmcs *evmcs = (void *)vmx->vmcs01.vmcs; + + evmcs->hv_enlightenments_control.msr_bitmap = 1; + } + /* The MSR bitmap starts with all ones */ bitmap_fill(vmx->shadow_msr_intercept.read, MAX_POSSIBLE_PASSTHROUGH_MSRS); bitmap_fill(vmx->shadow_msr_intercept.write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
From: Vitaly Kuznetsov vkuznets@redhat.com
commit b84155c38076b36d625043a06a2f1c90bde62903 upstream.
In preparation to enabling 'Enlightened MSR Bitmap' feature for Hyper-V guests move MSR bitmap update tracking to a dedicated helper.
Note: vmx_msr_bitmap_l01_changed() is called when MSR bitmap might be updated. KVM doesn't check if the bit we're trying to set is already set (or the bit it's trying to clear is already cleared). Such situations should not be common and a few false positives should not be a problem.
No functional change intended.
Signed-off-by: Vitaly Kuznetsov vkuznets@redhat.com Reviewed-by: Maxim Levitsky mlevitsk@redhat.com Reviewed-by: Sean Christopherson seanjc@google.com Message-Id: 20211129094704.326635-3-vkuznets@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Alexandru Matei alexandru.matei@uipath.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmx.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3772,6 +3772,17 @@ void free_vpid(int vpid) spin_unlock(&vmx_vpid_lock); }
+static void vmx_msr_bitmap_l01_changed(struct vcpu_vmx *vmx) +{ + /* + * When KVM is a nested hypervisor on top of Hyper-V and uses + * 'Enlightened MSR Bitmap' feature L0 needs to know that MSR + * bitmap has changed. + */ + if (static_branch_unlikely(&enable_evmcs)) + evmcs_touch_msr_bitmap(); +} + void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3780,8 +3791,7 @@ void vmx_disable_intercept_for_msr(struc if (!cpu_has_vmx_msr_bitmap()) return;
- if (static_branch_unlikely(&enable_evmcs)) - evmcs_touch_msr_bitmap(); + vmx_msr_bitmap_l01_changed(vmx);
/* * Mark the desired intercept state in shadow bitmap, this is needed @@ -3825,8 +3835,7 @@ void vmx_enable_intercept_for_msr(struct if (!cpu_has_vmx_msr_bitmap()) return;
- if (static_branch_unlikely(&enable_evmcs)) - evmcs_touch_msr_bitmap(); + vmx_msr_bitmap_l01_changed(vmx);
/* * Mark the desired intercept state in shadow bitmap, this is needed
From: Alexandru Matei alexandru.matei@uipath.com
commit 93827a0a36396f2fd6368a54a020f420c8916e9b upstream.
KVM enables 'Enlightened VMCS' and 'Enlightened MSR Bitmap' when running as a nested hypervisor on top of Hyper-V. When MSR bitmap is updated, evmcs_touch_msr_bitmap function uses current_vmcs per-cpu variable to mark that the msr bitmap was changed.
vmx_vcpu_create() modifies the msr bitmap via vmx_disable_intercept_for_msr -> vmx_msr_bitmap_l01_changed which in the end calls this function. The function checks for current_vmcs if it is null but the check is insufficient because current_vmcs is not initialized. Because of this, the code might incorrectly write to the structure pointed by current_vmcs value left by another task. Preemption is not disabled, the current task can be preempted and moved to another CPU while current_vmcs is accessed multiple times from evmcs_touch_msr_bitmap() which leads to crash.
The manipulation of MSR bitmaps by callers happens only for vmcs01 so the solution is to use vmx->vmcs01.vmcs instead of current_vmcs.
BUG: kernel NULL pointer dereference, address: 0000000000000338 PGD 4e1775067 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI ... RIP: 0010:vmx_msr_bitmap_l01_changed+0x39/0x50 [kvm_intel] ... Call Trace: vmx_disable_intercept_for_msr+0x36/0x260 [kvm_intel] vmx_vcpu_create+0xe6/0x540 [kvm_intel] kvm_arch_vcpu_create+0x1d1/0x2e0 [kvm] kvm_vm_ioctl_create_vcpu+0x178/0x430 [kvm] kvm_vm_ioctl+0x53f/0x790 [kvm] __x64_sys_ioctl+0x8a/0xc0 do_syscall_64+0x5c/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support") Cc: stable@vger.kernel.org Suggested-by: Sean Christopherson seanjc@google.com Signed-off-by: Alexandru Matei alexandru.matei@uipath.com Link: https://lore.kernel.org/r/20230123221208.4964-1-alexandru.matei@uipath.com Signed-off-by: Sean Christopherson seanjc@google.com [manual backport: evmcs.h got renamed to hyperv.h in a later version, modified in evmcs.h instead] Signed-off-by: Alexandru Matei alexandru.matei@uipath.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/evmcs.h | 11 ----------- arch/x86/kvm/vmx/vmx.c | 9 +++++++-- 2 files changed, 7 insertions(+), 13 deletions(-)
--- a/arch/x86/kvm/vmx/evmcs.h +++ b/arch/x86/kvm/vmx/evmcs.h @@ -162,16 +162,6 @@ static inline u16 evmcs_read16(unsigned return *(u16 *)((char *)current_evmcs + offset); }
-static inline void evmcs_touch_msr_bitmap(void) -{ - if (unlikely(!current_evmcs)) - return; - - if (current_evmcs->hv_enlightenments_control.msr_bitmap) - current_evmcs->hv_clean_fields &= - ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP; -} - static inline void evmcs_load(u64 phys_addr) { struct hv_vp_assist_page *vp_ap = @@ -192,7 +182,6 @@ static inline u64 evmcs_read64(unsigned static inline u32 evmcs_read32(unsigned long field) { return 0; } static inline u16 evmcs_read16(unsigned long field) { return 0; } static inline void evmcs_load(u64 phys_addr) {} -static inline void evmcs_touch_msr_bitmap(void) {} #endif /* IS_ENABLED(CONFIG_HYPERV) */
#define EVMPTR_INVALID (-1ULL) --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3779,8 +3779,13 @@ static void vmx_msr_bitmap_l01_changed(s * 'Enlightened MSR Bitmap' feature L0 needs to know that MSR * bitmap has changed. */ - if (static_branch_unlikely(&enable_evmcs)) - evmcs_touch_msr_bitmap(); + if (IS_ENABLED(CONFIG_HYPERV) && static_branch_unlikely(&enable_evmcs)) { + struct hv_enlightened_vmcs *evmcs = (void *)vmx->vmcs01.vmcs; + + if (evmcs->hv_enlightenments_control.msr_bitmap) + evmcs->hv_clean_fields &= + ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP; + } }
void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type)
Hello Greg,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: 15 March 2023 12:11
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
CIP configurations built and booted with Linux 5.15.103-rc1 (158686d9d0fd): https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/80... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/commits/linu...
Tested-by: Chris Paterson (CIP) chris.paterson2@renesas.com
Kind regards, Chris
Hello!
On 15/03/23 06:11, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.103-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We're seeing PowerPC build failures like the following (GCC-8, GCC-12, Clang-16):
-----8<----- In file included from /builds/linux/drivers/usb/host/ehci-hcd.c:1298: /builds/linux/drivers/usb/host/ehci-ppc-of.c:122:13: error: use of undeclared identifier 'NO_IRQ' if (irq == NO_IRQ) { ^ 1 error generated. ----->8-----
Then, Perf fails on Arm and i386 with GCC-12:
-----8<----- In function 'parse_events_term__num', inlined from 'parse_events_multi_pmu_add' at util/parse-events.c:1687:9: util/parse-events.c:3100:64: error: array subscript 'YYLTYPE[0]' is partly outside array bounds of 'char[4]' [-Werror=array-bounds] 3100 | .err_term = loc_term ? loc_term->first_column : 0, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ util/parse-events.c: In function 'parse_events_multi_pmu_add': util/parse-events.c:1678:39: note: object 'config' of size 4 1678 | char *config; | ^~~~~~ cc1: all warnings being treated as errors ----->8-----
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Wed, Mar 15, 2023 at 08:29:45AM -0600, Daniel Díaz wrote:
Hello!
On 15/03/23 06:11, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.103-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We're seeing PowerPC build failures like the following (GCC-8, GCC-12, Clang-16):
-----8<----- In file included from /builds/linux/drivers/usb/host/ehci-hcd.c:1298: /builds/linux/drivers/usb/host/ehci-ppc-of.c:122:13: error: use of undeclared identifier 'NO_IRQ' if (irq == NO_IRQ) { ^ 1 error generated. ----->8-----
Will be fixed in -rc2
Then, Perf fails on Arm and i386 with GCC-12:
-----8<----- In function 'parse_events_term__num', inlined from 'parse_events_multi_pmu_add' at util/parse-events.c:1687:9: util/parse-events.c:3100:64: error: array subscript 'YYLTYPE[0]' is partly outside array bounds of 'char[4]' [-Werror=array-bounds] 3100 | .err_term = loc_term ? loc_term->first_column : 0, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~ util/parse-events.c: In function 'parse_events_multi_pmu_add': util/parse-events.c:1678:39: note: object 'config' of size 4 1678 | char *config; | ^~~~~~ cc1: all warnings being treated as errors ----->8-----
This is odd as this file isn't modified, but other build fixes for perf for 5.15 were made. Let's see if this still happens in -rc2 and if so, if you could bisect to track down the offending commit that would be great.
thanks,
greg k-h
On 3/15/23 05:11, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.103-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 3/15/23 06:11, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Mar 2023 11:57:10 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.103-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Wed, Mar 15, 2023 at 01:11:06PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.103 release. There are 145 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
linux-stable-mirror@lists.linaro.org