This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series; all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.4.5-rc1
Mario Limonciello mario.limonciello@amd.com Revert "drm/amd: Disable PSR-SU on Parade 0803 TCON"
Thomas Bogendoerfer tsbogend@alpha.franken.de MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled
Dan Carpenter dan.carpenter@linaro.org net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set()
Dan Carpenter dan.carpenter@linaro.org scsi: qla2xxx: Fix end of loop test
Manish Rangankar mrangankar@marvell.com scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue
Shreyas Deodhar sdeodhar@marvell.com scsi: qla2xxx: Pointer may be dereferenced
Bikash Hazarika bhazarika@marvell.com scsi: qla2xxx: Correct the index of array
Nilesh Javali njavali@marvell.com scsi: qla2xxx: Check valid rport returned by fc_bsg_to_rport()
Bikash Hazarika bhazarika@marvell.com scsi: qla2xxx: Fix potential NULL pointer dereference
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix buffer overrun
Nilesh Javali njavali@marvell.com scsi: qla2xxx: Avoid fcport pointer dereference
Nilesh Javali njavali@marvell.com scsi: qla2xxx: Array index may go out of bound
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix mem access after free
Quinn Tran qutran@marvell.com scsi: qla2xxx: Wait for io return on terminate rport
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix hang in task management
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix task management cmd fail due to unavailable resource
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix task management cmd failure
Quinn Tran qutran@marvell.com scsi: qla2xxx: Multi-que support for TMF
Beau Belgrave beaub@linux.microsoft.com tracing/user_events: Fix struct arg size match check
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails
Masami Hiramatsu (Google) mhiramat@kernel.org Revert "tracing: Add "(fault)" name injection to kernel probes"
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix to update dynamic data counter if fetcharg uses it
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix not to count error code to total length
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix to avoid double count of the string length on the array
Gustavo A. R. Silva gustavoars@kernel.org smb: client: Fix -Wstringop-overflow issues
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: pm_nl_ctl: fix 32-bit support
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: depend on SYN_COOKIES
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: userspace_pm: report errors with 'remove' tests
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: userspace_pm: use correct server port
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: sockopt: return error if wrong mark
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: connect: fail if nft supposed to work
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: sockopt: use 'iptables-legacy' if available
Paolo Abeni pabeni@redhat.com mptcp: ensure subflow is unhashed before cleaning the backlog
Paolo Abeni pabeni@redhat.com mptcp: do not rely on implicit state check in mptcp_listen()
Mateusz Stachyra m.stachyra@samsung.com tracing: Fix null pointer dereference in tracing_err_log_open()
Masami Hiramatsu (Google) mhiramat@kernel.org fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free()
Jiri Olsa jolsa@kernel.org fprobe: Release rethook after the ftrace_ops is unregistered
Karol Wachowski karol.wachowski@linux.intel.com accel/ivpu: Clear specific interrupt status bits on C0
Karol Wachowski karol.wachowski@linux.intel.com accel/ivpu: Fix VPU register access in irq disable
Heiner Kallweit hkallweit1@gmail.com pwm: meson: fix handling of period/duty if greater than UINT_MAX
Heiner Kallweit hkallweit1@gmail.com pwm: meson: modify and simplify calculation in meson_pwm_get_state
Chungkai Yang Chung-kai.Yang@mediatek.com PM: QoS: Restore support for default value on frequency QoS
Namhyung Kim namhyung@kernel.org perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR
Max Filippov jcmvbkbc@gmail.com xtensa: ISS: fix call to split_if_spec
Bharath SM bharathsm@microsoft.com cifs: if deferred close is disabled then close files immediately
Mario Limonciello mario.limonciello@amd.com drm/amd/pm: conditionally disable pcie lane/speed switching for SMU13
Evan Quan evan.quan@amd.com drm/amd/pm: share the code around SMU13 pcie parameters update
Zheng Yejian zhengyejian1@huawei.com ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
Zheng Yejian zhengyejian1@huawei.com ring-buffer: Fix deadloop issue on reading trace_pipe
Krister Johansen kjlx@templeofstupid.com net: ena: fix shift-out-of-bounds in exponential backoff
Isaac J. Manjarres isaacmanjarres@google.com regmap-irq: Fix out-of-bounds access when allocating config buffers
Eric Lin eric.lin@sifive.com perf: RISC-V: Remove PERF_HES_STOPPED flag checking in riscv_pmu_start()
Florent Revest revest@chromium.org samples: ftrace: Save required argument registers in sample trampolines
Christoph Hellwig hch@lst.de nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
Zheng Yejian zhengyejian1@huawei.com tracing: Fix memory leak of iter->temp when reading trace_pipe
Mohamed Khalfella mkhalfella@purestorage.com tracing/histograms: Add histograms to hist_vars if they have referenced variables
Matthias Kaehlcke mka@chromium.org dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
Heiko Carstens hca@linux.ibm.com s390/decompressor: fix misaligned symbol build error
Jonas Gorski jonas.gorski@gmail.com bus: ixp4xx: fix IXP4XX_EXP_T1_MASK
Jiaqing Zhao jiaqing.zhao@linux.intel.com Revert "8250: add support for ASIX devices with a FIFO bug"
Sakari Ailus sakari.ailus@linux.intel.com media: uapi: Fix [GS]_ROUTING ACTIVE flag value
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org soundwire: qcom: fix storing port config out-of-bounds
Stephan Gerhold stephan.gerhold@kernkonzept.com opp: Fix use-after-free in lazy_opp_tables after probe deferral
George Stark gnstark@sberdevices.ru meson saradc: fix clock divider mask length
Weitao Wang WeitaoWang-oc@zhaoxin.com xhci: Show ZHAOXIN xHCI root hub speed correctly
Weitao Wang WeitaoWang-oc@zhaoxin.com xhci: Fix TRB prefetch issue of ZHAOXIN hosts
Weitao Wang WeitaoWang-oc@zhaoxin.com xhci: Fix resume issue of some ZHAOXIN hosts
Oliver Upton oliver.upton@linux.dev arm64: errata: Mitigate Ampere1 erratum AC03_CPU_38 at stage-2
Yinjun Zhang yinjun.zhang@corigine.com nfp: clean mc addresses in application firmware when closing port
Xiubo Li xiubli@redhat.com ceph: don't let check_caps skip sending responses for revoke msgs
Xiubo Li xiubli@redhat.com ceph: fix blindly expanding the readahead windows
Xiubo Li xiubli@redhat.com ceph: add a dedicated private data for netfs rreq
Ilya Dryomov idryomov@gmail.com libceph: harden msgr2.1 frame segment length checks
Christophe JAILLET christophe.jaillet@wanadoo.fr firmware: stratix10-svc: Fix a potential resource leak in svc_create_memory_pool()
Hui Li caelli@tencent.com tty: fix hang on tty device with no_room set
Martin Fuzzey martin.fuzzey@flowbird.group tty: serial: imx: fix rs485 rx after tx
Christophe JAILLET christophe.jaillet@wanadoo.fr tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk
Christophe JAILLET christophe.jaillet@wanadoo.fr tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() in case of error
Dan Carpenter dan.carpenter@linaro.org serial: atmel: don't enable IRQs prematurely
Christian König christian.koenig@amd.com drm/ttm: never consider pinned BOs for eviction&swap
Thomas Hellström thomas.hellstrom@linux.intel.com drm/ttm: Don't leak a resource on swapout move error
Thomas Hellström thomas.hellstrom@linux.intel.com drm/ttm: Don't leak a resource on eviction error
Yang Wang kevinyang.wang@amd.com drm/amd/pm: fix smu i2c data read risk
gaba gaba@amd.com drm/amdgpu: avoid restore process run into dead loop.
Aurabindo Pillai aurabindo.pillai@amd.com drm/amd/display: Add monitor specific edid quirk
Mario Limonciello mario.limonciello@amd.com drm/amd/display: Correct `DMUB_FW_VERSION` macro
Ilya Bakoulin ilya.bakoulin@amd.com drm/amd/display: Fix 128b132b link loss handling
Sung-huai Wang danny.wang@amd.com drm/amd/display: add a NULL pointer check
Mario Limonciello mario.limonciello@amd.com drm/amd: Disable PSR-SU on Parade 0803 TCON
Samuel Pitoiset samuel.pitoiset@gmail.com drm/amdgpu: fix clearing mappings for BOs that are always valid in VM
Leo Chen sancchen@amd.com drm/amd/display: disable seamless boot if force_odm_combine is enabled
Austin Zheng austin.zheng@amd.com drm/amd/display: Remove Phantom Pipe Check When Calculating K1 and K2
Hersen Wu hersenxs.wu@amd.com drm/amd/display: edp do not add non-edid timings
Dmytro Laktyushkin dmytro.laktyushkin@amd.com drm/amd/display: fix seamless odm transitions
Alan Liu HaoPing.Liu@amd.com drm/amd/display: Fix in secure display context creation
Alvin Lee Alvin.Lee2@amd.com drm/amd/display: Limit DCN32 8 channel or less parts to DPM1 for FPO
Wayne Lin Wayne.Lin@amd.com drm/dp_mst: Clear MSG_RDY flag before sending new message
Brian Norris briannorris@chromium.org drm/rockchip: vop: Leave vblank enabled in self-refresh
Brian Norris briannorris@chromium.org drm/atomic: Allow vblank-enabled + self-refresh "disable"
Justin Tee justin.tee@broadcom.com scsi: lpfc: Fix double free in lpfc_cmpl_els_logo_acc() caused by lpfc_nlp_not_used()
Alexander Aring aahringo@redhat.com fs: dlm: fix missing pending to false
Alexander Aring aahringo@redhat.com fs: dlm: clear pending bit when queue was empty
Alexander Aring aahringo@redhat.com fs: dlm: fix mismatch of plock results from userspace
Alexander Aring aahringo@redhat.com fs: dlm: make F_SETLK use unkillable wait_event
Alexander Aring aahringo@redhat.com fs: dlm: interrupt posix locks only when process is killed
Alexander Aring aahringo@redhat.com fs: dlm: fix cleanup pending ops when interrupted
Alexander Aring aahringo@redhat.com fs: dlm: return positive pid value for F_GETLK
Jason Baron jbaron@akamai.com md/raid0: add discard support for the 'original' layout
Johan Hovold johan+linaro@kernel.org mfd: pm8008: Fix module autoloading
Damien Le Moal dlemoal@kernel.org misc: pci_endpoint_test: Re-init completion for every test
Damien Le Moal dlemoal@kernel.org misc: pci_endpoint_test: Free IRQs before removing the device
Damien Le Moal dlemoal@kernel.org PCI: rockchip: Set address alignment for endpoint mode
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Use u32 variable to access 32-bit registers
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Write PCI Device ID to correct register
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Assert PCI Configuration Enable bit after probe
Damien Le Moal dlemoal@kernel.org PCI: epf-test: Fix DMA transfer completion detection
Damien Le Moal dlemoal@kernel.org PCI: epf-test: Fix DMA transfer completion initialization
Manivannan Sadhasivam mani@kernel.org PCI: qcom: Disable write access to read only registers for IP v2.3.3
Igor Mammedov imammedo@redhat.com PCI: acpiphp: Reassign resources on bridge if necessary
Robin Murphy robin.murphy@arm.com PCI: Add function 1 DMA alias quirk for Marvell 88SE9235
Ross Lagerwall ross.lagerwall@citrix.com PCI: Release resource invalidated by coalescing
Ondrej Zary linux@zary.sk PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
Harald Freudenberger freude@linux.ibm.com s390/zcrypt: do not retry administrative requests
Sathya Prakash sathya.prakash@broadcom.com scsi: mpi3mr: Propagate sense data for admin queue SCSI I/O
Mikulas Patocka mpatocka@redhat.com dm integrity: reduce vmalloc space footprint on 32-bit architectures
Martin Kaiser martin@kaiser.cx hwrng: imx-rngc - fix the timeout for init and self check
Sinthu Raja sinthu.raja@ti.com arm64: dts: ti: k3-j721s2: Fix wkup pinmux range
Frank Wunderlich frank-w@public-files.de arm64: dts: mt7986: use size of reserved partition for bl2
Siddh Raman Pant code@siddh.me jfs: jfs_dmap: Validate db_l2nbperpage while mounting
Ritesh Harjani (IBM) ritesh.list@gmail.com ext2/dax: Fix ext2_setsize when len is page aligned
Christian Marangi ansuelsmth@gmail.com soc: qcom: mdt_loader: Fix unconditional call to scm_pas_mem_setup
David Woodhouse dwmw@amazon.co.uk mm/mmap: Fix error return in do_vmi_align_munmap()
Alexander Aring aahringo@redhat.com fs: dlm: revert check required context while close
Baokun Li libaokun1@huawei.com ext4: only update i_reserved_data_blocks on successful block allocation
Baokun Li libaokun1@huawei.com ext4: turn quotas off if mount failed after enabling quotas
Chao Yu chao@kernel.org ext4: fix to check return value of freeze_bdev() in ext4_shutdown()
Theodore Ts'o tytso@mit.edu ext4: avoid updating the superblock on a r/o mount if not needed
Kemeng Shi shikemeng@huaweicloud.com ext4: fix wrong unit use in ext4_mb_new_blocks
Kemeng Shi shikemeng@huaweicloud.com ext4: get block from bh in ext4_free_blocks for fast commit replay
Kemeng Shi shikemeng@huaweicloud.com ext4: fix wrong unit use in ext4_mb_clear_bb
Zhihao Cheng chengzhihao1@huawei.com ext4: Fix reusing stale buffer heads from last failed mounting
Huacai Chen chenhuacai@kernel.org MIPS: KVM: Fix NULL pointer dereference
Huacai Chen chenhuacai@kernel.org MIPS: Loongson: Fix build error when make modules_install
Huacai Chen chenhuacai@kernel.org MIPS: Loongson: Fix cpu_probe_loongson() again
Jiaxun Yang jiaxun.yang@flygoat.com MIPS: cpu-features: Use boot_cpu_type for CPU type based features
Hamza Mahfooz hamza.mahfooz@amd.com drm/amd/display: perform a bounds check before filling dirty rectangles
Michael Ellerman mpe@ellerman.id.au powerpc/64s: Fix native_hpte_remove() to be irq-safe
Michael Ellerman mpe@ellerman.id.au powerpc/security: Fix Speculation_Store_Bypass reporting on Power10
Ekansh Gupta quic_ekangupt@quicinc.com misc: fastrpc: Create fastrpc scalar with correct buffer count
Naveen N Rao naveen@kernel.org powerpc: Fail build if using recordmcount with binutils v2.37
sunliming sunliming@kylinos.cn tracing/user_events: Fix incorrect return value for writing operation when events are disabled
Andrey Konovalov andreyknvl@gmail.com kasan: fix type cast in memory_is_poisoned_n
Andrey Konovalov andreyknvl@gmail.com kasan, slub: fix HW_TAGS zeroing with slub_debug
Arnd Bergmann arnd@arndb.de kasan: use internal prototypes matching gcc-13 builtins
Arnd Bergmann arnd@arndb.de kasan: add kasan_tag_mismatch prototype
Oleksij Rempel linux@rempel-privat.de net: phy: dp83td510: fix kernel stall during netboot in DP83TD510E PHY driver
Florian Fainelli florian.fainelli@broadcom.com net: bcmgenet: Ensure MDIO unregistration has clocks enabled
Arseniy Krasnov AVKrasnov@sberdevices.ru mtd: rawnand: meson: fix unaligned DMA buffers handling
Florian Bezdeka florian@bezdeka.de tpm/tpm_tis: Disable interrupts for Lenovo L590 devices
Lino Sanfilippo l.sanfilippo@kunbus.com tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs
Christian Hesse mail@eworm.de tpm/tpm_tis: Disable interrupts for Framework Laptop Intel 13th gen
Jerry Snitselaar jsnitsel@redhat.com tpm: return false from tpm_amd_is_rng_defective on non-x86 platforms
Alexander Sverdlin alexander.sverdlin@siemens.com tpm: tis_i2c: Limit write bursts to I2C_SMBUS_BLOCK_MAX (32) bytes
Christian Hesse mail@eworm.de tpm/tpm_tis: Disable interrupts for Framework Laptop Intel 12th gen
Alexander Sverdlin alexander.sverdlin@siemens.com tpm: tis_i2c: Limit read bursts to I2C_SMBUS_BLOCK_MAX (32) bytes
Peter Ujfalusi peter.ujfalusi@linux.intel.com tpm: tpm_tis: Disable interrupts *only* for AEON UPX-i11
Jarkko Sakkinen jarkko@kernel.org tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation
Valentin David valentin.david@gmail.com tpm: Do not remap from ACPI resources again for Pluton TPM
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Unify debounce handling into amd_pinconf_set()
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Drop pull up select configuration
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Use amd_pinconf_set() for all config options
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Only use special debounce behavior for GPIO 0
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Revert "pinctrl: amd: disable and mask interrupts on probe"
Kornel Dulęba korneld@chromium.org pinctrl: amd: Detect and mask spurious interrupts
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Fix mistake in handling clearing pins at startup
Mario Limonciello mario.limonciello@amd.com pinctrl: amd: Detect internal GPIO0 debounce handling
Masahiro Yamada masahiroy@kernel.org kbuild: make modules_install copy modules.builtin(.modinfo)
Jaegeuk Kim jaegeuk@kernel.org f2fs: fix deadlock in i_xattr_sem and inode page lock
Chao Yu chao@kernel.org f2fs: don't reset unchangable mount option in f2fs_remount()
Thomas Zimmermann tzimmermann@suse.de drm/client: Send hotplug event after registering a client
Paulo Alcantara pc@manguebit.com smb: client: fix parsing of source mount option
Winston Wen wentao@uniontech.com cifs: fix session state check in smb2_find_smb_ses
Paulo Alcantara pc@manguebit.com smb: client: improve DFS mount check
Ming Lei ming.lei@redhat.com nvme-pci: fix DMA direction of unmapping integrity data
Pedro Tammela pctammela@mojatatu.com net/sched: sch_qfq: account for stab overhead in qfq_enqueue
Pedro Tammela pctammela@mojatatu.com net/sched: sch_qfq: reintroduce lmax bound check for MTU
Zhang Shurong zhang_shurong@foxmail.com wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set()
Jiawen Wu jiawenwu@trustnetic.com net: txgbe: fix eeprom calculation error
Pedro Tammela pctammela@mojatatu.com net/sched: make psched_mtu() RTNL-less safe
Karol Herbst kherbst@redhat.com drm/nouveau: bring back blit subchannel for pre nv50 GPUs
Karol Herbst kherbst@redhat.com drm/nouveau/acr: Abort loading ACR if no firmware was found
Dan Carpenter dan.carpenter@linaro.org netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write()
Karol Herbst kherbst@redhat.com drm/nouveau/disp/g94: enable HDMI
Karol Herbst kherbst@redhat.com drm/nouveau/disp: fix HDMI on gt215+
Jisheng Zhang jszhang@kernel.org riscv: mm: fix truncation warning on RV32
Ido Schimmel idosch@nvidia.com net/sched: flower: Ensure both minimum and maximum ports are specified
Larysa Zaremba larysa.zaremba@intel.com xdp: use trusted arguments in XDP hints kfuncs
Pu Lehui pulehui@huawei.com bpf: cpumap: Fix memory leak in cpu_map_update_elem
Randy Dunlap rdunlap@infradead.org wifi: airo: avoid uninitialized warning in airo_get_rate()
Xin Yin yinxin.x@bytedance.com erofs: fix fsdax unavailability for chunk-based regular files
Chunhai Guo guochunhai@vivo.com erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF
Chunhai Guo guochunhai@vivo.com erofs: avoid useless loops in z_erofs_pcluster_readmore() when reading beyond EOF
Suman Ghosh sumang@marvell.com octeontx2-pf: Add additional check for MCAM rules
Lu Hongfei luhongfei@vivo.com net: dsa: Removed unneeded of_node_put in felix_parse_ports_node
Tvrtko Ursulin tvrtko.ursulin@intel.com drm/i915: Fix one wrong caching mode enum usage
Stanislav Lisovskiy stanislav.lisovskiy@intel.com drm/i915: Don't preserve dpll_hw_state for slave crtc in Bigjoiner
Wei Fang wei.fang@nxp.com net: fec: increase the size of tx ring and update tx_wake_threshold
Wei Fang wei.fang@nxp.com net: fec: recycle pages for transmitted XDP frames
Wei Fang wei.fang@nxp.com net: fec: remove last_bdp from fec_enet_txq_xmit_frame()
Wei Fang wei.fang@nxp.com net: fec: remove useless fec_enet_reset_skb()
Björn Töpel bjorn@rivosinc.com riscv, bpf: Fix inconsistent JIT image generation
Stafford Horne shorne@gmail.com openrisc: Union fpcsr and oldmask in sigcontext to unbreak userspace ABI
Ankit Kumar ankit.kumar@samsung.com nvme: fix the NVME_ID_NS_NVM_STS_MASK definition
Florian Kauer florian.kauer@linutronix.de igc: Fix inserting of empty frame for launchtime
Florian Kauer florian.kauer@linutronix.de igc: Fix launchtime before start of cycle
Florian Kauer florian.kauer@linutronix.de igc: No strict mode in pure launchtime/CBS offload
Ze Gao zegao2021@gmail.com fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock
Tzvetomir Stoyanov (VMware) tz.stoyanov@gmail.com kernel/trace: Fix cleanup logic of enable_trace_eprobe
Florian Kauer florian.kauer@linutronix.de igc: Handle already enabled taprio offload for basetime 0
Florian Kauer florian.kauer@linutronix.de igc: Do not enable taprio offload for invalid arguments
Florian Kauer florian.kauer@linutronix.de igc: Rename qbv_enable to taprio_offload_enable
Vladimir Oltean vladimir.oltean@nxp.com net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum
Andy Shevchenko andriy.shevchenko@linux.intel.com platform/x86: wmi: Break possible infinite loop when parsing GUID
Peter Zijlstra peterz@infradead.org x86/fineibt: Poison ENDBR at +0
Jiasheng Jiang jiasheng@iscas.ac.cn net: dsa: qca8k: Add check for skb_copy
Arnd Bergmann arnd@arndb.de HID: hyperv: avoid struct memcpy overrun warning
Ziyang Xuan william.xuanziyang@huawei.com ipv6/addrconf: fix a potential refcount underflow for idev
Jiasheng Jiang jiasheng@iscas.ac.cn NTB: ntb_tool: Add check for devm_kcalloc
Yang Yingliang yangyingliang@huawei.com NTB: ntb_transport: fix possible memory leak while device_register() fails
Yuan Can yuancan@huawei.com ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
Yuan Can yuancan@huawei.com NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
Yuan Can yuancan@huawei.com ntb: idt: Fix error handling in idt_pci_driver_init()
Eric Dumazet edumazet@google.com udp6: fix udp6_ehashfn() typo
Kuniyuki Iwashima kuniyu@amazon.com icmp6: Fix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev().
Niklas Schnelle schnelle@linux.ibm.com s390/ism: Do not unregister clients with registered DMBs
Niklas Schnelle schnelle@linux.ibm.com s390/ism: Fix and simplify add()/remove() callback handling
Niklas Schnelle schnelle@linux.ibm.com s390/ism: Fix locking for forwarding of IRQs and events to clients
Paolo Abeni pabeni@redhat.com net: prevent skb corruption on frag list segmentation
Rafał Miłecki rafal@milecki.pl net: bgmac: postpone turning IRQs off to avoid SoC hangs
Ivan Babrou ivan@cloudflare.com udp6: add a missing call into udp_fail_queue_rcv_skb tracepoint
Nitya Sunkad nitya.sunkad@amd.com ionic: remove WARN_ON to prevent panic_on_warn
Sai Krishna saikrishnag@marvell.com octeontx2-af: Move validation of ptp pointer before its usage
Ratheesh Kannoth rkannoth@marvell.com octeontx2-af: Promisc enable/disable through mbox
Geert Uytterhoeven geert+renesas@glider.be drm/fbdev-dma: Fix documented default preferred_bpp value
Junfeng Guo junfeng.guo@intel.com gve: Set default duplex configuration to full
M A Ramdhan ramdhan@starlabs.sg net/sched: cls_fw: Fix improper refcount update leads to use-after-free
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: fix oversize frame dropping for preemptible TCs
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: felix: make vsc9959_tas_guard_bands_update() visible to ocelot->ops
Klaus Kudielka klaus.kudielka@gmail.com net: mvneta: fix txq_map in case of txq_number==1
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Fix max stack depth check for async callbacks
Randy Dunlap rdunlap@infradead.org scsi: ufs: ufs-mediatek: Add dependency for RESET_CONTROLLER
Dan Carpenter dan.carpenter@linaro.org scsi: qla2xxx: Fix error code in qla2x00_start_sp()
Eric Biggers ebiggers@google.com blk-crypto: use dynamic lock class for blk_crypto_profile::lock
Aravindhan Gunasekaran aravindhan.gunasekaran@intel.com igc: Handle PPS start time programming for past time values
Tan Tee Min tee.min.tan@linux.intel.com igc: Include the length/type field and VLAN tag in queueMaxSDU
Prasad Koya prasad@arista.com igc: set TP bit in 'supported' and 'advertising' fields of ethtool_link_ksettings
Dragos Tatulea dtatulea@nvidia.com net/mlx5e: RX, Fix page_pool page fragment tracking for XDP
Maher Sanalla msanalla@nvidia.com net/mlx5: Query hca_cap_2 only when supported
Yevgeny Kliteynik kliteyn@nvidia.com net/mlx5e: TC, CT: Offload ct clear only once
Vlad Buslov vladbu@nvidia.com net/mlx5e: Check for NOT_READY flag state after locking
Saeed Mahameed saeedm@nvidia.com net/mlx5: Register a unique thermal zone per device
Dragos Tatulea dtatulea@nvidia.com net/mlx5e: RX, Fix flush and close release flow of regular rq for legacy rq
Zhengchao Shao shaozhengchao@huawei.com net/mlx5e: fix memory leak in mlx5e_ptp_open
Zhengchao Shao shaozhengchao@huawei.com net/mlx5e: fix memory leak in mlx5e_fs_tt_redirect_any_create
Zhengchao Shao shaozhengchao@huawei.com net/mlx5e: fix double free in mlx5e_destroy_flow_table
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Fix TX Hang issue when QBV Gate is closed
Jesper Dangaard Brouer brouer@redhat.com igc: Add XDP hints kfuncs for RX hash
Jesper Dangaard Brouer brouer@redhat.com igc: Add igc_xdp_buff wrapper for xdp_buff in driver
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Remove delay during TX ring configuration
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Add condition for qbv_config_change_errors counter
Sridhar Samudrala sridhar.samudrala@intel.com ice: Fix tx queue rate limit when TCs are configured
Sridhar Samudrala sridhar.samudrala@intel.com ice: Fix max_rate check while configuring TX rate limits
Florian Westphal fw@strlen.de netfilter: conntrack: don't fold port numbers into addresses before hashing
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: report use refcount overflow
Petr Pavlu petr.pavlu@suse.com xen/virtio: Fix NULL deref when a bridge of PCI root bus has no parent
Marek Vasut marex@denx.de drm/panel: simple: Add Powertip PH800480T013 drm_display_mode flags
Petr Tesarik petr.tesarik.ext@huawei.com swiotlb: reduce the number of areas to match actual memory pool size
Petr Tesarik petr.tesarik.ext@huawei.com swiotlb: always set the number of areas before allocating the pool
Douglas Anderson dianders@chromium.org drm/bridge: ti-sn65dsi86: Fix auxiliary bus lifetime
Adrián Larumbe adrian.larumbe@collabora.com drm: bridge: dw_hdmi: fix connector access for scdc
Fabio Estevam festevam@denx.de drm/panel: simple: Add connector_type for innolux_at043tn24
Namjae Jeon linkinjeon@kernel.org ksmbd: fix out of bounds read in smb2_sess_setup
Namjae Jeon linkinjeon@kernel.org ksmbd: add missing compound request handing in some commands
Simon Horman horms@kernel.org net: lan743x: select FIXED_PHY
Moritz Fischer moritzf@google.com net: lan743x: Don't sleep in atomic context
Basavaraj Natikar Basavaraj.Natikar@amd.com HID: amd_sfh: Fix for shift-out-of-bounds
Basavaraj Natikar Basavaraj.Natikar@amd.com HID: amd_sfh: Rename the float32 variable
Dmitry Torokhov dmitry.torokhov@gmail.com HID: input: fix mapping for camera access keys
Nayna Jain nayna@linux.ibm.com security/integrity: fix pointer to ESL data and its size on pseries
Ivan Mikhaylov fr0st61te@gmail.com net/ncsi: change from ndo_set_mac_address to dev_set_mac_address
-------------
Diffstat:
 Documentation/arm64/silicon-errata.rst | 3 +
 .../media/v4l/vidioc-subdev-g-routing.rst | 2 +-
 Makefile | 30 ++-
 arch/arm64/Kconfig | 19 ++
 .../dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso | 7 +-
 arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts | 42 ++--
 .../boot/dts/ti/k3-j721s2-common-proc-board.dts | 76 +++---
 arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi | 29 ++-
 arch/arm64/kernel/cpu_errata.c | 7 +
 arch/arm64/kernel/traps.c | 2 +-
 arch/arm64/kvm/hyp/pgtable.c | 14 +-
 arch/arm64/mm/fault.c | 2 +-
 arch/arm64/tools/cpucaps | 1 +
 arch/mips/Makefile | 10 +-
 arch/mips/include/asm/cpu-features.h | 4 +-
 arch/mips/include/asm/kvm_host.h | 6 +-
 arch/mips/kernel/cpu-probe.c | 9 +-
 arch/mips/kvm/emulate.c | 22 +-
 arch/mips/kvm/mips.c | 16 +-
 arch/mips/kvm/stats.c | 4 +-
 arch/mips/kvm/trace.h | 8 +-
 arch/mips/kvm/vz.c | 20 +-
 arch/openrisc/include/uapi/asm/sigcontext.h | 6 +-
 arch/openrisc/kernel/signal.c | 4 +-
 arch/powerpc/Makefile | 8 +
 arch/powerpc/kernel/security.c | 37 +--
 arch/powerpc/mm/book3s64/hash_native.c | 13 +-
 arch/riscv/mm/init.c | 2 +-
 arch/riscv/net/bpf_jit.h | 6 +-
 arch/riscv/net/bpf_jit_core.c | 19 +-
 arch/s390/Makefile | 1 +
 arch/x86/events/intel/core.c | 7 +
 arch/x86/kernel/alternative.c | 16 ++
 arch/xtensa/platforms/iss/network.c | 2 +-
 block/blk-crypto-profile.c | 12 +-
 drivers/accel/ivpu/ivpu_drv.h | 1 +
 drivers/accel/ivpu/ivpu_hw_mtl.c | 20 +-
 drivers/base/regmap/regmap-irq.c | 2 +-
 drivers/bus/intel-ixp4xx-eb.c | 2 +-
 drivers/char/hw_random/imx-rngc.c | 6 +-
 drivers/char/tpm/tpm-chip.c | 7 +
 drivers/char/tpm/tpm_crb.c | 19 +-
 drivers/char/tpm/tpm_tis.c | 25 ++
 drivers/char/tpm/tpm_tis_core.c | 103 ++++++--
 drivers/char/tpm/tpm_tis_core.h | 4 +
 drivers/char/tpm/tpm_tis_i2c.c | 59 +++--
 drivers/char/tpm/tpm_vtpm_proxy.c | 30 +--
 drivers/firmware/stratix10-svc.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 64 ++---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h | 2 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 26 +++
 drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +
 .../drm/amd/display/dc/dce112/dce112_resource.c | 10 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 11 +
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 4 -
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c | 2 +-
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h | 1 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c | 2 +
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 15 ++
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h | 2 +
 .../dc/link/protocols/link_dp_irq_handler.c | 11 +-
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 4 +
 drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 2 +-
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 67 ++++++
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 35 +--
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +-
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 33 +--
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 9 +-
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 35 +--
 drivers/gpu/drm/display/drm_dp_mst_topology.c | 54 ++++-
 drivers/gpu/drm/drm_atomic_helper.c | 11 +-
 drivers/gpu/drm/drm_client.c | 21 ++
 drivers/gpu/drm/drm_fbdev_dma.c | 6 +-
 drivers/gpu/drm/drm_fbdev_generic.c | 4 -
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 4 -
 drivers/gpu/drm/gma500/fbdev.c | 4 -
 drivers/gpu/drm/i915/display/intel_display.c | 1 -
 drivers/gpu/drm/i915/display/intel_dp.c | 7 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +-
 drivers/gpu/drm/msm/msm_fbdev.c | 4 -
 drivers/gpu/drm/nouveau/dispnv50/disp.c | 12 +-
 drivers/gpu/drm/nouveau/nouveau_chan.c | 1 +
 drivers/gpu/drm/nouveau/nouveau_chan.h | 1 +
 drivers/gpu/drm/nouveau/nouveau_drm.c | 20 +-
 drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c | 1 +
 drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c | 2 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c | 2 +-
 drivers/gpu/drm/omapdrm/omap_fbdev.c | 4 -
 drivers/gpu/drm/panel/panel-simple.c | 2 +
 drivers/gpu/drm/radeon/radeon_fbdev.c | 4 -
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 8 +-
 drivers/gpu/drm/tegra/fbdev.c | 4 -
 drivers/gpu/drm/ttm/ttm_bo.c | 29 ++-
 drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c | 30 ++-
 drivers/hid/hid-hyperv.c | 10 +-
 drivers/hid/hid-input.c | 7 +-
 drivers/iio/adc/meson_saradc.c | 2 +-
 drivers/md/dm-integrity.c | 4 +-
 drivers/md/dm-verity-loadpin.c | 3 +
 drivers/md/raid0.c | 62 ++++-
 drivers/md/raid0.h | 1 +
 drivers/mfd/qcom-pm8008.c | 1 +
 drivers/misc/fastrpc.c | 2 +-
 drivers/misc/pci_endpoint_test.c | 10 +-
 drivers/mtd/nand/raw/meson_nand.c | 4 +
 drivers/net/dsa/hirschmann/hellcreek.c | 14 +-
 drivers/net/dsa/ocelot/felix.c | 6 +-
 drivers/net/dsa/ocelot/felix.h | 1 -
 drivers/net/dsa/ocelot/felix_vsc9959.c | 28 ++-
 drivers/net/dsa/qca/qca8k-8xxx.c | 3 +
 drivers/net/dsa/sja1105/sja1105_tas.c | 7 +-
 drivers/net/ethernet/amazon/ena/ena_com.c | 3 +
 drivers/net/ethernet/broadcom/bgmac.c | 4 +-
 drivers/net/ethernet/broadcom/genet/bcmmii.c | 2 +
 drivers/net/ethernet/engleder/tsnep_selftests.c | 12 +-
 drivers/net/ethernet/engleder/tsnep_tc.c | 4 +-
 drivers/net/ethernet/freescale/enetc/enetc_qos.c | 6 +-
 drivers/net/ethernet/freescale/fec.h | 17 +-
 drivers/net/ethernet/freescale/fec_main.c | 178 ++++++++------
 drivers/net/ethernet/google/gve/gve_ethtool.c | 3 +
 drivers/net/ethernet/intel/ice/ice_main.c | 23 +-
 drivers/net/ethernet/intel/ice/ice_tc_lib.c | 22 +-
 drivers/net/ethernet/intel/ice/ice_tc_lib.h | 1 +
 drivers/net/ethernet/intel/igc/igc.h | 15 +-
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 2 +
 drivers/net/ethernet/intel/igc/igc_main.c | 158 ++++++++++---
 drivers/net/ethernet/intel/igc/igc_ptp.c | 25 +-
 drivers/net/ethernet/intel/igc/igc_tsn.c | 68 ++++--
 drivers/net/ethernet/marvell/mvneta.c | 4 +-
 drivers/net/ethernet/marvell/octeontx2/af/ptp.c | 19 +-
 drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 11 +-
 .../ethernet/marvell/octeontx2/af/rvu_npc_hash.c | 23 +-
 .../ethernet/marvell/octeontx2/nic/otx2_flows.c | 8 +
 .../net/ethernet/marvell/octeontx2/nic/otx2_tc.c | 15 ++
 .../mellanox/mlx5/core/en/fs_tt_redirect.c | 6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c | 6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 14 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +-
 .../ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 44 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 3 +
 drivers/net/ethernet/mellanox/mlx5/core/thermal.c | 19 +-
 drivers/net/ethernet/microchip/Kconfig | 2 +-
 drivers/net/ethernet/microchip/lan743x_main.c | 21 +-
 .../net/ethernet/microchip/lan966x/lan966x_tc.c | 10 +-
 drivers/net/ethernet/mscc/ocelot_mm.c | 7 +-
 .../net/ethernet/netronome/nfp/nfp_net_common.c | 5 +
 drivers/net/ethernet/pensando/ionic/ionic_lif.c | 5 -
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c | 7 +-
 drivers/net/ethernet/ti/am65-cpsw-qos.c | 11 +-
 drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c | 3 -
 drivers/net/netdevsim/dev.c | 9 +-
 drivers/net/phy/dp83td510.c | 23 +-
 drivers/net/wireless/cisco/airo.c | 5 +-
 drivers/net/wireless/realtek/rtw89/debug.c | 5 +-
 drivers/ntb/hw/amd/ntb_hw_amd.c | 7 +-
 drivers/ntb/hw/idt/ntb_hw_idt.c | 7 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.c | 7 +-
 drivers/ntb/ntb_transport.c | 2 +-
 drivers/ntb/test/ntb_tool.c | 2 +
 drivers/nvme/host/core.c | 36 ++-
 drivers/nvme/host/pci.c | 2 +-
 drivers/opp/core.c | 3 +
 drivers/pci/controller/dwc/pcie-qcom.c | 2 +
 drivers/pci/controller/pcie-rockchip-ep.c | 65 ++----
 drivers/pci/controller/pcie-rockchip.c | 17 ++
drivers/pci/controller/pcie-rockchip.h | 11 +- drivers/pci/endpoint/functions/pci-epf-test.c | 40 +++- drivers/pci/hotplug/acpiphp_glue.c | 5 +- drivers/pci/pci.c | 10 +- drivers/pci/probe.c | 4 +- drivers/pci/quirks.c | 2 + drivers/perf/riscv_pmu.c | 3 - drivers/pinctrl/pinctrl-amd.c | 103 +++----- drivers/pinctrl/pinctrl-amd.h | 2 +- drivers/platform/x86/wmi.c | 22 +- drivers/pwm/pwm-meson.c | 28 +-- drivers/s390/crypto/zcrypt_msgtype6.c | 6 + drivers/s390/net/ism_drv.c | 139 ++++++----- drivers/scsi/lpfc/lpfc_crtn.h | 1 - drivers/scsi/lpfc/lpfc_els.c | 30 +-- drivers/scsi/lpfc/lpfc_hbadisc.c | 24 +- drivers/scsi/mpi3mr/mpi3mr_fw.c | 5 + drivers/scsi/qla2xxx/qla_attr.c | 13 ++ drivers/scsi/qla2xxx/qla_bsg.c | 6 + drivers/scsi/qla2xxx/qla_def.h | 22 +- drivers/scsi/qla2xxx/qla_edif.c | 4 +- drivers/scsi/qla2xxx/qla_gbl.h | 2 +- drivers/scsi/qla2xxx/qla_init.c | 258 +++++++++++++++++++-- drivers/scsi/qla2xxx/qla_inline.h | 5 +- drivers/scsi/qla2xxx/qla_iocb.c | 38 ++- drivers/scsi/qla2xxx/qla_isr.c | 64 ++++- drivers/scsi/qla2xxx/qla_nvme.c | 3 - drivers/scsi/qla2xxx/qla_os.c | 133 ++++++----- drivers/soc/qcom/mdt_loader.c | 16 +- drivers/soundwire/qcom.c | 3 +- drivers/tty/n_tty.c | 25 +- drivers/tty/serial/8250/8250.h | 1 - drivers/tty/serial/8250/8250_pci.c | 19 -- drivers/tty/serial/8250/8250_port.c | 11 +- drivers/tty/serial/atmel_serial.c | 4 +- drivers/tty/serial/imx.c | 18 +- drivers/tty/serial/samsung_tty.c | 14 +- drivers/ufs/host/Kconfig | 1 + drivers/usb/host/xhci-mem.c | 39 +++- drivers/usb/host/xhci-pci.c | 12 + drivers/usb/host/xhci.h | 2 + drivers/xen/grant-dma-ops.c | 2 + fs/ceph/addr.c | 85 +++++-- fs/ceph/caps.c | 9 + fs/ceph/super.h | 13 ++ fs/dlm/ast.c | 8 +- fs/dlm/lockspace.c | 12 - fs/dlm/lockspace.h | 1 - fs/dlm/lowcomms.c | 1 + fs/dlm/midcomms.c | 3 - fs/dlm/plock.c | 115 +++++---- fs/erofs/inode.c | 3 +- fs/erofs/zdata.c | 4 +- fs/ext2/inode.c | 5 +- fs/ext4/indirect.c | 8 + fs/ext4/inode.c | 10 - fs/ext4/ioctl.c | 5 +- fs/ext4/mballoc.c 
| 17 +- fs/ext4/super.c | 31 ++- fs/f2fs/dir.c | 9 +- fs/f2fs/super.c | 30 ++- fs/f2fs/xattr.c | 6 +- fs/jfs/jfs_dmap.c | 6 + fs/jfs/jfs_filsys.h | 2 + fs/smb/client/cifs_dfs_ref.c | 20 +- fs/smb/client/cifsproto.h | 2 + fs/smb/client/cifssmb.c | 2 +- fs/smb/client/dfs.c | 43 +--- fs/smb/client/file.c | 4 +- fs/smb/client/fs_context.c | 59 ++++- fs/smb/client/misc.c | 17 +- fs/smb/client/smb2transport.c | 7 + fs/smb/server/smb2pdu.c | 109 +++++---- include/drm/display/drm_dp_mst_helper.h | 7 +- include/linux/blk-crypto-profile.h | 1 + include/linux/ism.h | 7 +- include/linux/kasan.h | 2 +- include/linux/nvme.h | 2 +- include/linux/rethook.h | 1 + include/linux/serial_8250.h | 1 - include/net/netfilter/nf_conntrack_tuple.h | 3 + include/net/netfilter/nf_tables.h | 31 ++- include/net/pkt_sched.h | 9 +- include/soc/mscc/ocelot.h | 1 + kernel/bpf/cpumap.c | 40 ++-- kernel/bpf/verifier.c | 5 +- kernel/dma/swiotlb.c | 46 +++- kernel/power/qos.c | 9 +- kernel/trace/fprobe.c | 15 +- kernel/trace/ftrace.c | 45 ++-- kernel/trace/rethook.c | 13 ++ kernel/trace/ring_buffer.c | 24 +- kernel/trace/trace.c | 3 +- kernel/trace/trace.h | 2 + kernel/trace/trace_eprobe.c | 18 +- kernel/trace/trace_events_hist.c | 8 +- kernel/trace/trace_events_user.c | 6 +- kernel/trace/trace_probe.c | 2 +- kernel/trace/trace_probe_kernel.h | 30 +-- kernel/trace/trace_probe_tmpl.h | 10 +- kernel/trace/trace_uprobe.c | 3 +- mm/kasan/common.c | 2 +- mm/kasan/generic.c | 73 +++--- mm/kasan/kasan.h | 171 +++++++------- mm/kasan/report.c | 17 +- mm/kasan/report_generic.c | 12 +- mm/kasan/report_hw_tags.c | 2 +- mm/kasan/report_sw_tags.c | 2 +- mm/kasan/shadow.c | 36 +-- mm/kasan/sw_tags.c | 20 +- mm/mmap.c | 9 +- mm/slab.h | 16 +- net/ceph/messenger_v2.c | 41 ++-- net/core/net-traces.c | 2 + net/core/skbuff.c | 5 + net/core/xdp.c | 2 +- net/ipv6/addrconf.c | 3 +- net/ipv6/icmp.c | 5 +- net/ipv6/udp.c | 4 +- net/mptcp/protocol.c | 7 +- net/ncsi/ncsi-rsp.c | 5 +- net/netfilter/nf_conntrack_core.c | 20 +- 
net/netfilter/nf_tables_api.c | 163 ++++++++----- net/netfilter/nft_flow_offload.c | 6 +- net/netfilter/nft_immediate.c | 8 +- net/netfilter/nft_objref.c | 8 +- net/sched/cls_flower.c | 10 + net/sched/cls_fw.c | 10 +- net/sched/sch_qfq.c | 18 +- net/sched/sch_taprio.c | 4 +- samples/ftrace/ftrace-direct-too.c | 14 +- security/integrity/platform_certs/load_powerpc.c | 40 ++-- tools/testing/selftests/net/mptcp/config | 1 + tools/testing/selftests/net/mptcp/mptcp_connect.sh | 3 + tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 29 +-- tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 10 +- tools/testing/selftests/net/mptcp/userspace_pm.sh | 4 +- 312 files changed, 3504 insertions(+), 1921 deletions(-)
From: Ivan Mikhaylov fr0st61te@gmail.com
commit 790071347a0a1a89e618eedcd51c687ea783aeb3 upstream.
Change ndo_set_mac_address to dev_set_mac_address because dev_set_mac_address provides a way to notify the network layer about a MAC change. Otherwise, services may not be aware of the MAC change and keep using the old address that was set by the network adapter driver.
For example, the DHCP client from systemd does not update the MAC address without a notification from the net subsystem, which leads to problems acquiring the right address from the DHCP server.
Fixes: cb10c7c0dfd9e ("net/ncsi: Add NCSI Broadcom OEM command") Cc: stable@vger.kernel.org # v6.0+ 2f38e84 net/ncsi: make one oem_gma function for all mfr id Signed-off-by: Paul Fertser fercerpav@gmail.com Signed-off-by: Ivan Mikhaylov fr0st61te@gmail.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ncsi/ncsi-rsp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/ncsi/ncsi-rsp.c +++ b/net/ncsi/ncsi-rsp.c @@ -616,7 +616,6 @@ static int ncsi_rsp_handler_oem_mlx_gma( { struct ncsi_dev_priv *ndp = nr->ndp; struct net_device *ndev = ndp->ndev.dev; - const struct net_device_ops *ops = ndev->netdev_ops; struct ncsi_rsp_oem_pkt *rsp; struct sockaddr saddr; int ret = 0; @@ -630,7 +629,9 @@ static int ncsi_rsp_handler_oem_mlx_gma( /* Set the flag for GMA command which should only be called once */ ndp->gma_flag = 1;
- ret = ops->ndo_set_mac_address(ndev, &saddr); + rtnl_lock(); + ret = dev_set_mac_address(ndev, &saddr, NULL); + rtnl_unlock(); if (ret < 0) netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n");
From: Nayna Jain nayna@linux.ibm.com
commit e66effaf61ffb1dc6088492ca3a0e98dcbf1c10d upstream.
On PowerVM guest, variable data is prefixed with 8 bytes of timestamp. Extract ESL by stripping off the timestamp before passing to ESL parser.
Fixes: 4b3e71e9a34c ("integrity/powerpc: Support loading keys from PLPKS") Cc: stable@vger.kernel.org # v6.3 Signed-off-by: Nayna Jain nayna@linux.ibm.com Tested-by: Nageswara R Sastry rnsastry@linux.ibm.com Acked-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20230608120444.382527-1-nayna@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../integrity/platform_certs/load_powerpc.c | 40 ++++++++++++------- 1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/security/integrity/platform_certs/load_powerpc.c b/security/integrity/platform_certs/load_powerpc.c index b9de70b90826..170789dc63d2 100644 --- a/security/integrity/platform_certs/load_powerpc.c +++ b/security/integrity/platform_certs/load_powerpc.c @@ -15,6 +15,9 @@ #include "keyring_handler.h" #include "../integrity.h"
+#define extract_esl(db, data, size, offset) \ + do { db = data + offset; size = size - offset; } while (0) + /* * Get a certificate list blob from the named secure variable. * @@ -55,8 +58,9 @@ static __init void *get_cert_list(u8 *key, unsigned long keylen, u64 *size) */ static int __init load_powerpc_certs(void) { - void *db = NULL, *dbx = NULL; - u64 dbsize = 0, dbxsize = 0; + void *db = NULL, *dbx = NULL, *data = NULL; + u64 dsize = 0; + u64 offset = 0; int rc = 0; ssize_t len; char buf[32]; @@ -74,38 +78,46 @@ static int __init load_powerpc_certs(void) return -ENODEV; }
+ if (strcmp("ibm,plpks-sb-v1", buf) == 0) + /* PLPKS authenticated variables ESL data is prefixed with 8 bytes of timestamp */ + offset = 8; + /* * Get db, and dbx. They might not exist, so it isn't an error if we * can't get them. */ - db = get_cert_list("db", 3, &dbsize); - if (!db) { + data = get_cert_list("db", 3, &dsize); + if (!data) { pr_info("Couldn't get db list from firmware\n"); - } else if (IS_ERR(db)) { - rc = PTR_ERR(db); + } else if (IS_ERR(data)) { + rc = PTR_ERR(data); pr_err("Error reading db from firmware: %d\n", rc); return rc; } else { - rc = parse_efi_signature_list("powerpc:db", db, dbsize, + extract_esl(db, data, dsize, offset); + + rc = parse_efi_signature_list("powerpc:db", db, dsize, get_handler_for_db); if (rc) pr_err("Couldn't parse db signatures: %d\n", rc); - kfree(db); + kfree(data); }
- dbx = get_cert_list("dbx", 4, &dbxsize); - if (!dbx) { + data = get_cert_list("dbx", 4, &dsize); + if (!data) { pr_info("Couldn't get dbx list from firmware\n"); - } else if (IS_ERR(dbx)) { - rc = PTR_ERR(dbx); + } else if (IS_ERR(data)) { + rc = PTR_ERR(data); pr_err("Error reading dbx from firmware: %d\n", rc); return rc; } else { - rc = parse_efi_signature_list("powerpc:dbx", dbx, dbxsize, + extract_esl(dbx, data, dsize, offset); + + rc = parse_efi_signature_list("powerpc:dbx", dbx, dsize, get_handler_for_dbx); if (rc) pr_err("Couldn't parse dbx signatures: %d\n", rc); - kfree(dbx); + kfree(data); }
return rc;
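As an illustration of the timestamp stripping described in the commit message above, here is a minimal userspace sketch. The names `esl_view`, `esl_from_variable()` and `PLPKS_TIMESTAMP_LEN` are hypothetical and do not appear in the patch, which uses the `extract_esl()` macro. The length check is an assumption added for this sketch; the macro in the patch does not perform one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: the 8-byte timestamp prefix named in the commit. */
#define PLPKS_TIMESTAMP_LEN 8u

/* Hypothetical view of the ESL payload inside a variable's data blob. */
struct esl_view {
	const uint8_t *esl;	/* start of the ESL payload, NULL if invalid */
	size_t len;		/* payload length in bytes */
};

/*
 * Skip 'offset' leading bytes of 'data' and describe what remains.
 * Unlike the macro in the patch, this sketch rejects blobs that are
 * shorter than the prefix (an assumption, not the author's code).
 */
static struct esl_view esl_from_variable(const uint8_t *data, size_t size,
					 size_t offset)
{
	struct esl_view view = { NULL, 0 };

	if (data == NULL || size < offset)
		return view;

	view.esl = data + offset;
	view.len = size - offset;
	return view;
}
```

Note that the original allocation is still what has to be freed, which is why the patch keeps `data` around and calls `kfree(data)` rather than freeing the adjusted pointer.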
From: Dmitry Torokhov dmitry.torokhov@gmail.com
commit e3ea6467f623b80906ff0c93b58755ab903ce12f upstream.
Commit 9f4211bf7f81 ("HID: add mapping for camera access keys") added mapping for the camera access keys, but unfortunately used the wrong usage codes for them. HUTRR72[1] specifies that camera access controls use the 0x076, 0x077 and 0x078 usages in the consumer control page. The previously mapped 0x0d5, 0x0d6 and 0x0d7 usages are actually defined in HUTRR64[2] as game recording controls.
[1] https://www.usb.org/sites/default/files/hutrr72_-_usages_to_control_camera_a... [2] https://www.usb.org/sites/default/files/hutrr64b_-_game_recording_controller...
Fixes: 9f4211bf7f81 ("HID: add mapping for camera access keys") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Link: https://lore.kernel.org/r/ZJtd/fMXRUgq20TW@google.com Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-input.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index a1d2690a1a0d..851ee86eff32 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -1093,6 +1093,10 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel case 0x074: map_key_clear(KEY_BRIGHTNESS_MAX); break; case 0x075: map_key_clear(KEY_BRIGHTNESS_AUTO); break;
+ case 0x076: map_key_clear(KEY_CAMERA_ACCESS_ENABLE); break; + case 0x077: map_key_clear(KEY_CAMERA_ACCESS_DISABLE); break; + case 0x078: map_key_clear(KEY_CAMERA_ACCESS_TOGGLE); break; + case 0x079: map_key_clear(KEY_KBDILLUMUP); break; case 0x07a: map_key_clear(KEY_KBDILLUMDOWN); break; case 0x07c: map_key_clear(KEY_KBDILLUMTOGGLE); break; @@ -1139,9 +1143,6 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel case 0x0cd: map_key_clear(KEY_PLAYPAUSE); break; case 0x0cf: map_key_clear(KEY_VOICECOMMAND); break;
- case 0x0d5: map_key_clear(KEY_CAMERA_ACCESS_ENABLE); break; - case 0x0d6: map_key_clear(KEY_CAMERA_ACCESS_DISABLE); break; - case 0x0d7: map_key_clear(KEY_CAMERA_ACCESS_TOGGLE); break; case 0x0d8: map_key_clear(KEY_DICTATE); break; case 0x0d9: map_key_clear(KEY_EMOJI_PICKER); break;
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit c1685a862a4bea863537f06abaa37a123aef493c upstream.
As float32 is also used in other places as a data type, it is necessary to rename the float32 variable in order to avoid confusion.
Cc: stable@vger.kernel.org Tested-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Akshata MukundShetty akshata.mukundshetty@amd.com Link: https://lore.kernel.org/r/20230707065722.9036-2-Basavaraj.Natikar@amd.com Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c +++ b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c @@ -132,13 +132,13 @@ static void get_common_inputs(struct com common->event_type = HID_USAGE_SENSOR_EVENT_DATA_UPDATED_ENUM; }
-static int float_to_int(u32 float32) +static int float_to_int(u32 flt32_val) { int fraction, shift, mantissa, sign, exp, zeropre;
- mantissa = float32 & GENMASK(22, 0); - sign = (float32 & BIT(31)) ? -1 : 1; - exp = (float32 & ~BIT(31)) >> 23; + mantissa = flt32_val & GENMASK(22, 0); + sign = (flt32_val & BIT(31)) ? -1 : 1; + exp = (flt32_val & ~BIT(31)) >> 23;
if (!exp && !mantissa) return 0; @@ -151,10 +151,10 @@ static int float_to_int(u32 float32) }
shift = 23 - exp; - float32 = BIT(exp) + (mantissa >> shift); + flt32_val = BIT(exp) + (mantissa >> shift); fraction = mantissa & GENMASK(shift - 1, 0);
- return (((fraction * 100) >> shift) >= 50) ? sign * (float32 + 1) : sign * float32; + return (((fraction * 100) >> shift) >= 50) ? sign * (flt32_val + 1) : sign * flt32_val; }
static u8 get_input_rep(u8 current_index, int sensor_idx, int report_id,
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit 87854366176403438d01f368b09de3ec2234e0f5 upstream.
Shifting by the 'exp' and 'shift' variables can exceed the maximum valid shift count for a u32, leading to a UBSAN shift-out-of-bounds report.
... [ 6.120512] UBSAN: shift-out-of-bounds in drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c:149:50 [ 6.120598] shift exponent 104 is too large for 64-bit type 'long unsigned int' [ 6.120659] CPU: 4 PID: 96 Comm: kworker/4:1 Not tainted 6.4.0amd_1-next-20230519-dirty #10 [ 6.120665] Hardware name: AMD Birman-PHX/Birman-PHX, BIOS SFH_with_HPD_SEN.FD 04/05/2023 [ 6.120667] Workqueue: events amd_sfh_work_buffer [amd_sfh] [ 6.120687] Call Trace: [ 6.120690] <TASK> [ 6.120694] dump_stack_lvl+0x48/0x70 [ 6.120704] dump_stack+0x10/0x20 [ 6.120707] ubsan_epilogue+0x9/0x40 [ 6.120716] __ubsan_handle_shift_out_of_bounds+0x10f/0x170 [ 6.120720] ? psi_group_change+0x25f/0x4b0 [ 6.120729] float_to_int.cold+0x18/0xba [amd_sfh] [ 6.120739] get_input_rep+0x57/0x340 [amd_sfh] [ 6.120748] ? __schedule+0xba7/0x1b60 [ 6.120756] ? __pfx_get_input_rep+0x10/0x10 [amd_sfh] [ 6.120764] amd_sfh_work_buffer+0x91/0x180 [amd_sfh] [ 6.120772] process_one_work+0x229/0x430 [ 6.120780] worker_thread+0x4a/0x3c0 [ 6.120784] ? __pfx_worker_thread+0x10/0x10 [ 6.120788] kthread+0xf7/0x130 [ 6.120792] ? __pfx_kthread+0x10/0x10 [ 6.120795] ret_from_fork+0x29/0x50 [ 6.120804] </TASK> ...
Fix this by adding conditions that validate the shift ranges.
Fixes: 93ce5e0231d7 ("HID: amd_sfh: Implement SFH1.1 functionality") Cc: stable@vger.kernel.org Tested-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Akshata MukundShetty akshata.mukundshetty@amd.com Link: https://lore.kernel.org/r/20230707065722.9036-3-Basavaraj.Natikar@amd.com Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c +++ b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c @@ -143,16 +143,32 @@ static int float_to_int(u32 flt32_val) if (!exp && !mantissa) return 0;
+ /* + * Calculate the exponent and fraction part of floating + * point representation. + */ exp -= 127; if (exp < 0) { exp = -exp; + if (exp >= BITS_PER_TYPE(u32)) + return 0; zeropre = (((BIT(23) + mantissa) * 100) >> 23) >> exp; return zeropre >= 50 ? sign : 0; }
shift = 23 - exp; - flt32_val = BIT(exp) + (mantissa >> shift); - fraction = mantissa & GENMASK(shift - 1, 0); + if (abs(shift) >= BITS_PER_TYPE(u32)) + return 0; + + if (shift < 0) { + shift = -shift; + flt32_val = BIT(exp) + (mantissa << shift); + shift = 0; + } else { + flt32_val = BIT(exp) + (mantissa >> shift); + } + + fraction = (shift == 0) ? 0 : mantissa & GENMASK(shift - 1, 0);
return (((fraction * 100) >> shift) >= 50) ? sign * (flt32_val + 1) : sign * flt32_val; }
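To make the shift-range problem concrete, the following is a standalone userspace sketch of converting an IEEE-754 single-precision bit pattern to a rounded integer with every shift count bounded beforehand. It is not the driver's `float_to_int()`: the name `f32_bits_to_int()` is hypothetical, the arithmetic is done in 64 bits, and returning 0 for unrepresentable inputs is a policy assumed for this sketch.

```c
#include <stdint.h>

/*
 * Round an IEEE-754 single-precision bit pattern to the nearest int
 * (halves round away from zero). Every shift count is range-checked
 * before use. Values that do not fit, including inf and NaN, yield 0;
 * that policy is an assumption of this sketch.
 */
static int f32_bits_to_int(uint32_t bits)
{
	uint32_t mantissa = bits & 0x7FFFFFu;
	int sign = (bits & 0x80000000u) ? -1 : 1;
	int exp = (int)((bits >> 23) & 0xFFu) - 127;
	uint64_t significand, whole, frac;
	int shift;

	if (exp < -1)		/* zero, denormals and |value| < 0.5 */
		return 0;
	if (exp > 30)		/* too large for int, also inf and NaN */
		return 0;

	significand = (1ull << 23) | mantissa;	/* implicit leading one */
	shift = 23 - exp;			/* now within [-7, 24] */

	if (shift <= 0)		/* no fractional bits left to round */
		return sign * (int)(significand << -shift);

	whole = significand >> shift;
	frac = significand & ((1ull << shift) - 1);
	if (frac >= (1ull << (shift - 1)))	/* fraction >= 0.5 */
		whole++;

	return sign * (int)whole;
}
```

For example, the bit pattern 0x40200000 (2.5) yields 3, and 0x7F800000 (infinity) is rejected by the exponent check before any shift is attempted.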
From: Moritz Fischer moritzf@google.com
commit 7a8227b2e76be506b2ac64d2beac950ca04892a5 upstream.
dev_set_rx_mode() grabs a spin_lock, and the lan743x implementation then goes to sleep using readx_poll_timeout(), which is not allowed while holding a spinlock.
Introduce a helper wrapping the readx_poll_timeout_atomic() function and use it to replace the calls to readx_poll_timeout().
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Cc: stable@vger.kernel.org Cc: Bryan Whitehead bryan.whitehead@microchip.com Cc: UNGLinuxDriver@microchip.com Signed-off-by: Moritz Fischer moritzf@google.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230627035000.1295254-1-moritzf@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/microchip/lan743x_main.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -144,6 +144,18 @@ static int lan743x_csr_light_reset(struc !(data & HW_CFG_LRST_), 100000, 10000000); }
+static int lan743x_csr_wait_for_bit_atomic(struct lan743x_adapter *adapter, + int offset, u32 bit_mask, + int target_value, int udelay_min, + int udelay_max, int count) +{ + u32 data; + + return readx_poll_timeout_atomic(LAN743X_CSR_READ_OP, offset, data, + target_value == !!(data & bit_mask), + udelay_max, udelay_min * count); +} + static int lan743x_csr_wait_for_bit(struct lan743x_adapter *adapter, int offset, u32 bit_mask, int target_value, int usleep_min, @@ -746,8 +758,8 @@ static int lan743x_dp_write(struct lan74 u32 dp_sel; int i;
- if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_, - 1, 40, 100, 100)) + if (lan743x_csr_wait_for_bit_atomic(adapter, DP_SEL, DP_SEL_DPRDY_, + 1, 40, 100, 100)) return -EIO; dp_sel = lan743x_csr_read(adapter, DP_SEL); dp_sel &= ~DP_SEL_MASK_; @@ -758,8 +770,9 @@ static int lan743x_dp_write(struct lan74 lan743x_csr_write(adapter, DP_ADDR, addr + i); lan743x_csr_write(adapter, DP_DATA_0, buf[i]); lan743x_csr_write(adapter, DP_CMD, DP_CMD_WRITE_); - if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_, - 1, 40, 100, 100)) + if (lan743x_csr_wait_for_bit_atomic(adapter, DP_SEL, + DP_SEL_DPRDY_, + 1, 40, 100, 100)) return -EIO; }
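The difference between the sleeping and the atomic poll can be sketched in userspace as a bounded busy-wait loop. Everything below is hypothetical and only mirrors the idea: `poll_bit_atomic()`, `reg_read_fn` and the `fake_reg` test double are not part of the driver, and the real helper relies on `readx_poll_timeout_atomic()` with delays between reads.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register accessor type for this sketch. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until (value & mask) matches want_set, giving up after max_polls
 * reads. The loop never sleeps, so the pattern stays legal under a
 * spinlock; kernel code would place a udelay() between reads.
 */
static int poll_bit_atomic(reg_read_fn read_reg, void *ctx, uint32_t mask,
			   bool want_set, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		bool is_set = (read_reg(ctx) & mask) != 0;

		if (is_set == want_set)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake register: reports the ready bit only after reads_left reads. */
struct fake_reg {
	unsigned int reads_left;
	uint32_t ready_bit;
};

static uint32_t fake_reg_read(void *ctx)
{
	struct fake_reg *reg = ctx;

	if (reg->reads_left > 0) {
		reg->reads_left--;
		return 0;
	}
	return reg->ready_bit;
}
```

With a fake register that becomes ready after three reads, a budget of ten polls succeeds; one that needs fifty reads times out with -ETIMEDOUT.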
From: Simon Horman horms@kernel.org
commit 73c4d1b307aeb713e80ab03f90c7df9d417dc0f0 upstream.
The blamed commit introduces usage of fixed_phy_register() but not a corresponding dependency on FIXED_PHY.
This can result in a build failure.
s390-linux-ld: drivers/net/ethernet/microchip/lan743x_main.o: in function `lan743x_phy_open': drivers/net/ethernet/microchip/lan743x_main.c:1514: undefined reference to `fixed_phy_register'
Fixes: 624864fbff92 ("net: lan743x: add fixed phy support for LAN7431 device") Cc: stable@vger.kernel.org Reported-by: Randy Dunlap rdunlap@infradead.org Closes: https://lore.kernel.org/netdev/725bf1c5-b252-7d19-7582-a6809716c7d6@infradea... Reviewed-by: Randy Dunlap rdunlap@infradead.org Tested-by: Randy Dunlap rdunlap@infradead.org # build-tested Signed-off-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/microchip/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/microchip/Kconfig +++ b/drivers/net/ethernet/microchip/Kconfig @@ -46,7 +46,7 @@ config LAN743X tristate "LAN743x support" depends on PCI depends on PTP_1588_CLOCK_OPTIONAL - select PHYLIB + select FIXED_PHY select CRC16 select CRC32 help
From: Namjae Jeon linkinjeon@kernel.org
commit 7b7d709ef7cf285309157fb94c33f625dd22c5e1 upstream.
This patch adds compound request handling to some commands. Existing clients do not send these commands as compound requests, but ksmbd should handle the case where they do.
Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 78 ++++++++++++++++++++++++++++++++---------------- 1 file changed, 53 insertions(+), 25 deletions(-)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -1911,14 +1911,16 @@ out_err: int smb2_tree_connect(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_tree_connect_req *req = smb2_get_msg(work->request_buf); - struct smb2_tree_connect_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_tree_connect_req *req; + struct smb2_tree_connect_rsp *rsp; struct ksmbd_session *sess = work->sess; char *treename = NULL, *name = NULL; struct ksmbd_tree_conn_status status; struct ksmbd_share_config *share; int rc = -EINVAL;
+ WORK_BUFFERS(work, req, rsp); + treename = smb_strndup_from_utf16(req->Buffer, le16_to_cpu(req->PathLength), true, conn->local_nls); @@ -2087,19 +2089,19 @@ static int smb2_create_open_flags(bool f */ int smb2_tree_disconnect(struct ksmbd_work *work) { - struct smb2_tree_disconnect_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_tree_disconnect_rsp *rsp; + struct smb2_tree_disconnect_req *req; struct ksmbd_session *sess = work->sess; struct ksmbd_tree_connect *tcon = work->tcon;
+ WORK_BUFFERS(work, req, rsp); + rsp->StructureSize = cpu_to_le16(4); inc_rfc1001_len(work->response_buf, 4);
ksmbd_debug(SMB, "request\n");
if (!tcon || test_and_set_bit(TREE_CONN_EXPIRE, &tcon->status)) { - struct smb2_tree_disconnect_req *req = - smb2_get_msg(work->request_buf); - ksmbd_debug(SMB, "Invalid tid %d\n", req->hdr.Id.SyncId.TreeId);
rsp->hdr.Status = STATUS_NETWORK_NAME_DELETED; @@ -2122,10 +2124,14 @@ int smb2_tree_disconnect(struct ksmbd_wo int smb2_session_logoff(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_logoff_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_logoff_req *req; + struct smb2_logoff_rsp *rsp; struct ksmbd_session *sess; - struct smb2_logoff_req *req = smb2_get_msg(work->request_buf); - u64 sess_id = le64_to_cpu(req->hdr.SessionId); + u64 sess_id; + + WORK_BUFFERS(work, req, rsp); + + sess_id = le64_to_cpu(req->hdr.SessionId);
rsp->StructureSize = cpu_to_le16(4); inc_rfc1001_len(work->response_buf, 4); @@ -2165,12 +2171,14 @@ int smb2_session_logoff(struct ksmbd_wor */ static noinline int create_smb2_pipe(struct ksmbd_work *work) { - struct smb2_create_rsp *rsp = smb2_get_msg(work->response_buf); - struct smb2_create_req *req = smb2_get_msg(work->request_buf); + struct smb2_create_rsp *rsp; + struct smb2_create_req *req; int id; int err; char *name;
+ WORK_BUFFERS(work, req, rsp); + name = smb_strndup_from_utf16(req->Buffer, le16_to_cpu(req->NameLength), 1, work->conn->local_nls); if (IS_ERR(name)) { @@ -5305,8 +5313,10 @@ int smb2_query_info(struct ksmbd_work *w static noinline int smb2_close_pipe(struct ksmbd_work *work) { u64 id; - struct smb2_close_req *req = smb2_get_msg(work->request_buf); - struct smb2_close_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_close_req *req; + struct smb2_close_rsp *rsp; + + WORK_BUFFERS(work, req, rsp);
id = req->VolatileFileId; ksmbd_session_rpc_close(work->sess, id); @@ -5448,6 +5458,9 @@ int smb2_echo(struct ksmbd_work *work) { struct smb2_echo_rsp *rsp = smb2_get_msg(work->response_buf);
+ if (work->next_smb2_rcv_hdr_off) + rsp = ksmbd_resp_buf_next(work); + rsp->StructureSize = cpu_to_le16(4); rsp->Reserved = 0; inc_rfc1001_len(work->response_buf, 4); @@ -6082,8 +6095,10 @@ static noinline int smb2_read_pipe(struc int nbytes = 0, err; u64 id; struct ksmbd_rpc_command *rpc_resp; - struct smb2_read_req *req = smb2_get_msg(work->request_buf); - struct smb2_read_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_read_req *req; + struct smb2_read_rsp *rsp; + + WORK_BUFFERS(work, req, rsp);
id = req->VolatileFileId;
@@ -6331,14 +6346,16 @@ out: */ static noinline int smb2_write_pipe(struct ksmbd_work *work) { - struct smb2_write_req *req = smb2_get_msg(work->request_buf); - struct smb2_write_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_write_req *req; + struct smb2_write_rsp *rsp; struct ksmbd_rpc_command *rpc_resp; u64 id = 0; int err = 0, ret = 0; char *data_buf; size_t length;
+ WORK_BUFFERS(work, req, rsp); + length = le32_to_cpu(req->Length); id = req->VolatileFileId;
@@ -6607,6 +6624,9 @@ int smb2_cancel(struct ksmbd_work *work) struct ksmbd_work *iter; struct list_head *command_list;
+ if (work->next_smb2_rcv_hdr_off) + hdr = ksmbd_resp_buf_next(work); + ksmbd_debug(SMB, "smb2 cancel called on mid %llu, async flags 0x%x\n", hdr->MessageId, hdr->Flags);
@@ -6766,8 +6786,8 @@ static inline bool lock_defer_pending(st */ int smb2_lock(struct ksmbd_work *work) { - struct smb2_lock_req *req = smb2_get_msg(work->request_buf); - struct smb2_lock_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_lock_req *req; + struct smb2_lock_rsp *rsp; struct smb2_lock_element *lock_ele; struct ksmbd_file *fp = NULL; struct file_lock *flock = NULL; @@ -6784,6 +6804,8 @@ int smb2_lock(struct ksmbd_work *work) LIST_HEAD(rollback_list); int prior_lock = 0;
+ WORK_BUFFERS(work, req, rsp); + ksmbd_debug(SMB, "Received lock request\n"); fp = ksmbd_lookup_fd_slow(work, req->VolatileFileId, req->PersistentFileId); if (!fp) { @@ -7897,8 +7919,8 @@ out: */ static void smb20_oplock_break_ack(struct ksmbd_work *work) { - struct smb2_oplock_break *req = smb2_get_msg(work->request_buf); - struct smb2_oplock_break *rsp = smb2_get_msg(work->response_buf); + struct smb2_oplock_break *req; + struct smb2_oplock_break *rsp; struct ksmbd_file *fp; struct oplock_info *opinfo = NULL; __le32 err = 0; @@ -7907,6 +7929,8 @@ static void smb20_oplock_break_ack(struc char req_oplevel = 0, rsp_oplevel = 0; unsigned int oplock_change_type;
+ WORK_BUFFERS(work, req, rsp); + volatile_id = req->VolatileFid; persistent_id = req->PersistentFid; req_oplevel = req->OplockLevel; @@ -8041,8 +8065,8 @@ static int check_lease_state(struct leas static void smb21_lease_break_ack(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_lease_ack *req = smb2_get_msg(work->request_buf); - struct smb2_lease_ack *rsp = smb2_get_msg(work->response_buf); + struct smb2_lease_ack *req; + struct smb2_lease_ack *rsp; struct oplock_info *opinfo; __le32 err = 0; int ret = 0; @@ -8050,6 +8074,8 @@ static void smb21_lease_break_ack(struct __le32 lease_state; struct lease *lease;
+ WORK_BUFFERS(work, req, rsp); + ksmbd_debug(OPLOCK, "smb21 lease break, lease state(0x%x)\n", le32_to_cpu(req->LeaseState)); opinfo = lookup_lease_in_table(conn, req->LeaseKey); @@ -8175,8 +8201,10 @@ err_out: */ int smb2_oplock_break(struct ksmbd_work *work) { - struct smb2_oplock_break *req = smb2_get_msg(work->request_buf); - struct smb2_oplock_break *rsp = smb2_get_msg(work->response_buf); + struct smb2_oplock_break *req; + struct smb2_oplock_break *rsp; + + WORK_BUFFERS(work, req, rsp);
switch (le16_to_cpu(req->StructureSize)) { case OP_BREAK_STRUCT_SIZE_20:
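The idea behind replacing the direct buffer lookups with WORK_BUFFERS() can be sketched as follows. This is a simplified userspace model, not ksmbd code: `work_sketch`, `next_req_off` and `current_request()` are hypothetical stand-ins, and the assumption is that a non-zero next-header offset (compare `next_smb2_rcv_hdr_off` in the diff) means the handler must look at a later message in the compound rather than the first one.

```c
#include <stddef.h>
#include <stdint.h>

/* SMB over TCP prefixes each transport packet with a 4-byte length field. */
#define TRANSPORT_HDR_LEN 4u

/* Hypothetical, trimmed-down stand-in for the per-request work item. */
struct work_sketch {
	uint8_t *request_buf;	/* whole packet, including length prefix */
	size_t next_req_off;	/* 0 for the first message in a compound */
};

/*
 * Return the message the handler should parse: the first one when the
 * offset is 0, otherwise the chained message at that offset. Always
 * reading from the start of the buffer is the bug class this avoids.
 */
static uint8_t *current_request(const struct work_sketch *work)
{
	return work->request_buf + TRANSPORT_HDR_LEN + work->next_req_off;
}
```

A handler that resolves its request pointer this way behaves the same for single requests and for any position within a compound.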
From: Namjae Jeon linkinjeon@kernel.org
commit 98422bdd4cb3ca4d08844046f6507d7ec2c2b8d8 upstream.
ksmbd does not consider the case where an smb2 session setup is part of a compound request. If it is the second payload of the compound, an OOB read occurs while smb2_sess_setup() processes the first payload.
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21355
Signed-off-by: Namjae Jeon linkinjeon@kernel.org
Signed-off-by: Steve French stfrench@microsoft.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/smb/server/smb2pdu.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -1322,9 +1322,8 @@ static int decode_negotiation_token(stru
static int ntlm_negotiate(struct ksmbd_work *work, struct negotiate_message *negblob, - size_t negblob_len) + size_t negblob_len, struct smb2_sess_setup_rsp *rsp) { - struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf); struct challenge_message *chgblob; unsigned char *spnego_blob = NULL; u16 spnego_blob_len; @@ -1429,10 +1428,10 @@ static struct ksmbd_user *session_user(s return user; }
-static int ntlm_authenticate(struct ksmbd_work *work) +static int ntlm_authenticate(struct ksmbd_work *work, + struct smb2_sess_setup_req *req, + struct smb2_sess_setup_rsp *rsp) { - struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf); - struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf); struct ksmbd_conn *conn = work->conn; struct ksmbd_session *sess = work->sess; struct channel *chann = NULL; @@ -1566,10 +1565,10 @@ binding_session: }
#ifdef CONFIG_SMB_SERVER_KERBEROS5 -static int krb5_authenticate(struct ksmbd_work *work) +static int krb5_authenticate(struct ksmbd_work *work, + struct smb2_sess_setup_req *req, + struct smb2_sess_setup_rsp *rsp) { - struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf); - struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf); struct ksmbd_conn *conn = work->conn; struct ksmbd_session *sess = work->sess; char *in_blob, *out_blob; @@ -1647,7 +1646,9 @@ static int krb5_authenticate(struct ksmb return 0; } #else -static int krb5_authenticate(struct ksmbd_work *work) +static int krb5_authenticate(struct ksmbd_work *work, + struct smb2_sess_setup_req *req, + struct smb2_sess_setup_rsp *rsp) { return -EOPNOTSUPP; } @@ -1656,8 +1657,8 @@ static int krb5_authenticate(struct ksmb int smb2_sess_setup(struct ksmbd_work *work) { struct ksmbd_conn *conn = work->conn; - struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf); - struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf); + struct smb2_sess_setup_req *req; + struct smb2_sess_setup_rsp *rsp; struct ksmbd_session *sess; struct negotiate_message *negblob; unsigned int negblob_len, negblob_off; @@ -1665,6 +1666,8 @@ int smb2_sess_setup(struct ksmbd_work *w
ksmbd_debug(SMB, "Received request for session setup\n");
+ WORK_BUFFERS(work, req, rsp); + rsp->StructureSize = cpu_to_le16(9); rsp->SessionFlags = 0; rsp->SecurityBufferOffset = cpu_to_le16(72); @@ -1786,7 +1789,7 @@ int smb2_sess_setup(struct ksmbd_work *w
if (conn->preferred_auth_mech & (KSMBD_AUTH_KRB5 | KSMBD_AUTH_MSKRB5)) { - rc = krb5_authenticate(work); + rc = krb5_authenticate(work, req, rsp); if (rc) { rc = -EINVAL; goto out_err; @@ -1800,7 +1803,7 @@ int smb2_sess_setup(struct ksmbd_work *w sess->Preauth_HashValue = NULL; } else if (conn->preferred_auth_mech == KSMBD_AUTH_NTLMSSP) { if (negblob->MessageType == NtLmNegotiate) { - rc = ntlm_negotiate(work, negblob, negblob_len); + rc = ntlm_negotiate(work, negblob, negblob_len, rsp); if (rc) goto out_err; rsp->hdr.Status = @@ -1813,7 +1816,7 @@ int smb2_sess_setup(struct ksmbd_work *w le16_to_cpu(rsp->SecurityBufferLength) - 1);
} else if (negblob->MessageType == NtLmAuthenticate) { - rc = ntlm_authenticate(work); + rc = ntlm_authenticate(work, req, rsp); if (rc) goto out_err;
From: Fabio Estevam festevam@denx.de
[ Upstream commit 2c56a751845ddfd3078ebe79981aaaa182629163 ]
The innolux at043tn24 display is a parallel LCD. Pass the 'connector_type' information to avoid the following warning:
panel-simple panel: Specify missing connector_type
Signed-off-by: Fabio Estevam festevam@denx.de
Fixes: 41bcceb4de9c ("drm/panel: simple: Add support for Innolux AT043TN24")
Reviewed-by: Sam Ravnborg sam@ravnborg.org
Signed-off-by: Neil Armstrong neil.armstrong@linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230620112202.654981-1-festev...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/panel/panel-simple.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index d8efbcee9bc12..1927fef9aed67 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -2117,6 +2117,7 @@ static const struct panel_desc innolux_at043tn24 = {
 		.height = 54,
 	},
 	.bus_format = MEDIA_BUS_FMT_RGB888_1X24,
+	.connector_type = DRM_MODE_CONNECTOR_DPI,
 	.bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_DRIVE_POSEDGE,
 };
From: Adrián Larumbe adrian.larumbe@collabora.com
[ Upstream commit 98703e4e061fb8715c7613cd227e32cdfd136b23 ]
Commit 5d844091f237 ("drm/scdc-helper: Pimp SCDC debugs") changed the scdc interface to pick up an i2c adapter from a connector instead. However, in the case of dw-hdmi, the wrong connector was being used to pass i2c adapter information, since dw-hdmi's embedded connector structure is only populated when the bridge attachment callback explicitly asks for it.
drm-meson is handling connector creation, so this won't happen, leading to a NULL pointer dereference.
Fix it by having scdc functions access dw-hdmi's current connector pointer instead, which is assigned during the bridge enablement stage.
Fixes: 5d844091f237 ("drm/scdc-helper: Pimp SCDC debugs")
Signed-off-by: Adrián Larumbe adrian.larumbe@collabora.com
Reported-by: Lukas F. Hartmann lukas@mntre.com
Acked-by: Neil Armstrong neil.armstrong@linaro.org
[narmstrong: moved Fixes tag before first S-o-b and added Reported-by tag]
Signed-off-by: Neil Armstrong neil.armstrong@linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230601123153.196867-1-adrian...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c index 603bb3c51027b..3b40e0fdca5cb 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c @@ -1426,9 +1426,9 @@ void dw_hdmi_set_high_tmds_clock_ratio(struct dw_hdmi *hdmi, /* Control for TMDS Bit Period/TMDS Clock-Period Ratio */ if (dw_hdmi_support_scdc(hdmi, display)) { if (mtmdsclock > HDMI14_MAX_TMDSCLK) - drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 1); + drm_scdc_set_high_tmds_clock_ratio(hdmi->curr_conn, 1); else - drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 0); + drm_scdc_set_high_tmds_clock_ratio(hdmi->curr_conn, 0); } } EXPORT_SYMBOL_GPL(dw_hdmi_set_high_tmds_clock_ratio); @@ -2116,7 +2116,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi, min_t(u8, bytes, SCDC_MIN_SOURCE_VERSION));
/* Enabled Scrambling in the Sink */ - drm_scdc_set_scrambling(&hdmi->connector, 1); + drm_scdc_set_scrambling(hdmi->curr_conn, 1);
/* * To activate the scrambler feature, you must ensure @@ -2132,7 +2132,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi, hdmi_writeb(hdmi, 0, HDMI_FC_SCRAMBLER_CTRL); hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ, HDMI_MC_SWRSTZ); - drm_scdc_set_scrambling(&hdmi->connector, 0); + drm_scdc_set_scrambling(hdmi->curr_conn, 0); } }
@@ -3553,6 +3553,7 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device *pdev, hdmi->bridge.ops = DRM_BRIDGE_OP_DETECT | DRM_BRIDGE_OP_EDID | DRM_BRIDGE_OP_HPD; hdmi->bridge.interlace_allowed = true; + hdmi->bridge.ddc = hdmi->ddc; #ifdef CONFIG_OF hdmi->bridge.of_node = pdev->dev.of_node; #endif
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 7aa83fbd712a6f08ffa67890061f26d140c2a84f ]
Memory for the "struct device" for any given device isn't supposed to be released until the device's release() is called. This is important because someone might be holding a kobject reference to the "struct device" and might try to access one of its members even after any other cleanup/uninitialization has happened.
Code analysis of ti-sn65dsi86 shows that this isn't quite right. When the code was written, it was believed that we could rely on the fact that the child devices would all be freed before the parent devices and thus we didn't need to worry about a release() function. While I still believe that the parent's "struct device" is guaranteed to outlive the child's "struct device" (because the child holds a kobject reference to the parent), the parent's "devm" allocated memory is a different story. That appears to be freed much earlier.
Let's make this better for ti-sn65dsi86 by allocating each auxiliary device with kzalloc() and then freeing that memory in the device's release().
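The lifetime rule described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct device`, `struct aux_device`, `device_get()`, `device_put()` and `aux_device_create()` are simplified, hypothetical stand-ins written for this sketch, not the kernel's driver-core or auxiliary-bus API. The point is that the memory holding the embedded device is freed only from the release callback, once the last reference is dropped.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel types; not the real API. */
struct device {
	int refcount;
	void (*release)(struct device *dev);
};

struct aux_device {
	const char *name;
	struct device dev;	/* embedded, so it shares the allocation */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int released_count;	/* exists only so the sketch can be observed */

/* Called when the last reference goes away: free the whole allocation. */
static void aux_device_release(struct device *dev)
{
	struct aux_device *aux = container_of(dev, struct aux_device, dev);

	free(aux);
	released_count++;
}

/* Allocate the device on its own instead of inside a parent structure. */
static struct aux_device *aux_device_create(const char *name)
{
	struct aux_device *aux = calloc(1, sizeof(*aux));

	if (!aux)
		return NULL;
	aux->name = name;
	aux->dev.refcount = 1;
	aux->dev.release = aux_device_release;
	return aux;
}

static void device_get(struct device *dev)
{
	dev->refcount++;
}

static void device_put(struct device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

With this shape, a second holder that took a reference with `device_get()` can still safely read the structure after the creator has dropped its own reference; the memory goes away only when the final `device_put()` runs the release callback.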
Fixes: bf73537f411b ("drm/bridge: ti-sn65dsi86: Break GPIO and MIPI-to-eDP bridge into sub-drivers")
Suggested-by: Stephen Boyd swboyd@chromium.org
Reviewed-by: Stephen Boyd swboyd@chromium.org
Signed-off-by: Douglas Anderson dianders@chromium.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230613065812.v2.1.I24b838a5b...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 35 +++++++++++++++++----------
 1 file changed, 22 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index 4676cf2900dfd..3c8fd6ea6d6a4 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -170,10 +170,10 @@ * @pwm_refclk_freq: Cache for the reference clock input to the PWM. */ struct ti_sn65dsi86 { - struct auxiliary_device bridge_aux; - struct auxiliary_device gpio_aux; - struct auxiliary_device aux_aux; - struct auxiliary_device pwm_aux; + struct auxiliary_device *bridge_aux; + struct auxiliary_device *gpio_aux; + struct auxiliary_device *aux_aux; + struct auxiliary_device *pwm_aux;
struct device *dev; struct regmap *regmap; @@ -468,27 +468,34 @@ static void ti_sn65dsi86_delete_aux(void *data) auxiliary_device_delete(data); }
-/* - * AUX bus docs say that a non-NULL release is mandatory, but it makes no - * sense for the model used here where all of the aux devices are allocated - * in the single shared structure. We'll use this noop as a workaround. - */ -static void ti_sn65dsi86_noop(struct device *dev) {} +static void ti_sn65dsi86_aux_device_release(struct device *dev) +{ + struct auxiliary_device *aux = container_of(dev, struct auxiliary_device, dev); + + kfree(aux); +}
static int ti_sn65dsi86_add_aux_device(struct ti_sn65dsi86 *pdata, - struct auxiliary_device *aux, + struct auxiliary_device **aux_out, const char *name) { struct device *dev = pdata->dev; + struct auxiliary_device *aux; int ret;
+ aux = kzalloc(sizeof(*aux), GFP_KERNEL); + if (!aux) + return -ENOMEM; + aux->name = name; aux->dev.parent = dev; - aux->dev.release = ti_sn65dsi86_noop; + aux->dev.release = ti_sn65dsi86_aux_device_release; device_set_of_node_from_dev(&aux->dev, dev); ret = auxiliary_device_init(aux); - if (ret) + if (ret) { + kfree(aux); return ret; + } ret = devm_add_action_or_reset(dev, ti_sn65dsi86_uninit_aux, aux); if (ret) return ret; @@ -497,6 +504,8 @@ static int ti_sn65dsi86_add_aux_device(struct ti_sn65dsi86 *pdata, if (ret) return ret; ret = devm_add_action_or_reset(dev, ti_sn65dsi86_delete_aux, aux); + if (!ret) + *aux_out = aux;
return ret; }
From: Petr Tesarik petr.tesarik.ext@huawei.com
[ Upstream commit aabd12609f91155f26584508b01f548215cc3c0c ]
The number of areas defaults to the number of possible CPUs. However, the total number of slots may have to be increased after adjusting the number of areas. Consequently, the number of areas must be determined before allocating the memory pool. This is even explained with a comment in swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas after slots are already allocated. The areas may end up being smaller than IO_TLB_SEGSIZE, which breaks per-area locking.
While fixing swiotlb_init_late(), move all relevant comments before the definition of swiotlb_adjust_nareas() and convert them to kernel-doc.
Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Petr Tesarik petr.tesarik.ext@huawei.com
Reviewed-by: Roberto Sassu roberto.sassu@huawei.com
Signed-off-by: Christoph Hellwig hch@lst.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/dma/swiotlb.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index af2e304c672c4..16f53d8c51bcf 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -115,9 +115,16 @@ static bool round_up_default_nslabs(void) return true; }
+/** + * swiotlb_adjust_nareas() - adjust the number of areas and slots + * @nareas: Desired number of areas. Zero is treated as 1. + * + * Adjust the default number of areas in a memory pool. + * The default size of the memory pool may also change to meet minimum area + * size requirements. + */ static void swiotlb_adjust_nareas(unsigned int nareas) { - /* use a single area when non is specified */ if (!nareas) nareas = 1; else if (!is_power_of_2(nareas)) @@ -298,10 +305,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, if (swiotlb_force_disable) return;
- /* - * default_nslabs maybe changed when adjust area number. - * So allocate bounce buffer after adjusting area number. - */ if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus());
@@ -363,6 +366,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, if (swiotlb_force_disable) return 0;
+ if (!default_nareas) + swiotlb_adjust_nareas(num_possible_cpus()); + retry: order = get_order(nslabs << IO_TLB_SHIFT); nslabs = SLABS_PER_PAGE << order; @@ -397,9 +403,6 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, (PAGE_SIZE << order) >> 20); }
- if (!default_nareas) - swiotlb_adjust_nareas(num_possible_cpus()); - area_order = get_order(array_size(sizeof(*mem->areas), default_nareas)); mem->areas = (struct io_tlb_area *)
From: Petr Tesarik petr.tesarik.ext@huawei.com
[ Upstream commit 8ac04063354a01a484d2e55d20ed1958aa0d3392 ]
Although the desired size of the SWIOTLB memory pool is increased in swiotlb_adjust_nareas() to match the number of areas, the actual allocation may be smaller, which may require reducing the number of areas.
For example, Xen uses swiotlb_init_late(), which in turn uses the page allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages), resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048 slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE), allowing at most 2048 / 128 = 16 areas.
If num_possible_cpus() is greater than the maximum number of areas, areas are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will span multiple areas. When allocating and freeing slots, only one area will be properly locked, causing race conditions on the unlocked slots and ultimately data corruption, kernel hangs and crashes.
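The arithmetic in the x86 example above can be checked with a small userspace sketch. `limit_nareas()` mirrors the helper added by this patch; `pool_slots()` is a hypothetical helper introduced only for this illustration, and the constants are taken from the figures quoted in the commit message (2 KiB slots, minimum area size of 128 slots).

```c
#define IO_TLB_SHIFT	11	/* 2 KiB slots, as stated in the text */
#define IO_TLB_SEGSIZE	128	/* minimum number of slots per area */

/* Same logic as the helper in this patch: cap areas by the pool size. */
static unsigned int limit_nareas(unsigned int nareas, unsigned long nslots)
{
	if (nslots < nareas * IO_TLB_SEGSIZE)
		return nslots / IO_TLB_SEGSIZE;
	return nareas;
}

/*
 * Hypothetical helper: number of slots in the largest pool the page
 * allocator can return, given a page size and a maximum order.
 */
static unsigned long pool_slots(unsigned long page_size, unsigned int max_order)
{
	return (page_size << max_order) >> IO_TLB_SHIFT;
}
```

For a 4 KiB page size and order 10, `pool_slots()` gives 2048 slots, so a request for 64 areas (for example, a 64-CPU system) is capped at 16, while a request for 8 areas is left unchanged.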
Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Petr Tesarik petr.tesarik.ext@huawei.com
Reviewed-by: Roberto Sassu roberto.sassu@huawei.com
Signed-off-by: Christoph Hellwig hch@lst.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/dma/swiotlb.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 16f53d8c51bcf..b1bbd6270ba79 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -138,6 +138,23 @@ static void swiotlb_adjust_nareas(unsigned int nareas) (default_nslabs << IO_TLB_SHIFT) >> 20); }
+/** + * limit_nareas() - get the maximum number of areas for a given memory pool size + * @nareas: Desired number of areas. + * @nslots: Total number of slots in the memory pool. + * + * Limit the number of areas to the maximum possible number of areas in + * a memory pool of the given size. + * + * Return: Maximum possible number of areas. + */ +static unsigned int limit_nareas(unsigned int nareas, unsigned long nslots) +{ + if (nslots < nareas * IO_TLB_SEGSIZE) + return nslots / IO_TLB_SEGSIZE; + return nareas; +} + static int __init setup_io_tlb_npages(char *str) { @@ -297,6 +314,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, { struct io_tlb_mem *mem = &io_tlb_default_mem; unsigned long nslabs; + unsigned int nareas; size_t alloc_size; void *tlb;
@@ -309,10 +327,12 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags, swiotlb_adjust_nareas(num_possible_cpus());
nslabs = default_nslabs; + nareas = limit_nareas(default_nareas, nslabs); while ((tlb = swiotlb_memblock_alloc(nslabs, flags, remap)) == NULL) { if (nslabs <= IO_TLB_MIN_SLABS) return; nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE); + nareas = limit_nareas(nareas, nslabs); }
if (default_nslabs != nslabs) { @@ -358,6 +378,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, { struct io_tlb_mem *mem = &io_tlb_default_mem; unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE); + unsigned int nareas; unsigned char *vstart = NULL; unsigned int order, area_order; bool retried = false; @@ -403,8 +424,8 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, (PAGE_SIZE << order) >> 20); }
- area_order = get_order(array_size(sizeof(*mem->areas), - default_nareas)); + nareas = limit_nareas(default_nareas, nslabs); + area_order = get_order(array_size(sizeof(*mem->areas), nareas)); mem->areas = (struct io_tlb_area *) __get_free_pages(GFP_KERNEL | __GFP_ZERO, area_order); if (!mem->areas) @@ -418,7 +439,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, set_memory_decrypted((unsigned long)vstart, (nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT); swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true, - default_nareas); + nareas);
swiotlb_print_info(); return 0;
From: Marek Vasut marex@denx.de
[ Upstream commit 1c519980aced3da1fae37c1339cf43b24eccdee7 ]
Add the missing drm_display_mode DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC flags. These are used by various bridges in the pipeline to correctly configure the polarity of their sync signals.
Fixes: d69de69f2be1 ("drm/panel: simple: Add Powertip PH800480T013 panel")
Signed-off-by: Marek Vasut marex@denx.de
Reviewed-by: Sam Ravnborg sam@ravnborg.org
Signed-off-by: Neil Armstrong neil.armstrong@linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230615201602.565948-1-marex@...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/panel/panel-simple.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index 1927fef9aed67..e02249b212c2a 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -3110,6 +3110,7 @@ static const struct drm_display_mode powertip_ph800480t013_idf02_mode = {
 	.vsync_start = 480 + 49,
 	.vsync_end = 480 + 49 + 2,
 	.vtotal = 480 + 49 + 2 + 22,
+	.flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
 };
 
 static const struct panel_desc powertip_ph800480t013_idf02 = {
From: Petr Pavlu petr.pavlu@suse.com
[ Upstream commit 21a235bce12361e64adfc2ef97e4ae2e51ad63d4 ]
When attempting to run Xen on a QEMU/KVM virtual machine with virtio devices (all x86_64), function xen_dt_get_node() crashes on accessing bus->bridge->parent->of_node because a bridge of the PCI root bus has no parent set:
[ 1.694192][ T1] BUG: kernel NULL pointer dereference, address: 0000000000000288
[ 1.695688][ T1] #PF: supervisor read access in kernel mode
[ 1.696297][ T1] #PF: error_code(0x0000) - not-present page
[ 1.696297][ T1] PGD 0 P4D 0
[ 1.696297][ T1] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1.696297][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.3.7-1-default #1 openSUSE Tumbleweed a577eae57964bb7e83477b5a5645a1781df990f0
[ 1.696297][ T1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[ 1.696297][ T1] RIP: e030:xen_virtio_restricted_mem_acc+0xd9/0x1c0
[ 1.696297][ T1] Code: 45 0c 83 e8 c9 a3 ea ff 31 c0 eb d7 48 8b 87 40 ff ff ff 48 89 c2 48 8b 40 10 48 85 c0 75 f4 48 8b 82 10 01 00 00 48 8b 40 40 <48> 83 b8 88 02 00 00 00 0f 84 45 ff ff ff 66 90 31 c0 eb a5 48 89
[ 1.696297][ T1] RSP: e02b:ffffc90040013cc8 EFLAGS: 00010246
[ 1.696297][ T1] RAX: 0000000000000000 RBX: ffff888006c75000 RCX: 0000000000000029
[ 1.696297][ T1] RDX: ffff888005ed1000 RSI: ffffc900400f100c RDI: ffff888005ee30d0
[ 1.696297][ T1] RBP: ffff888006c75010 R08: 0000000000000001 R09: 0000000330000006
[ 1.696297][ T1] R10: ffff888005850028 R11: 0000000000000002 R12: ffffffff830439a0
[ 1.696297][ T1] R13: 0000000000000000 R14: ffff888005657900 R15: ffff888006e3e1e8
[ 1.696297][ T1] FS: 0000000000000000(0000) GS:ffff88804a000000(0000) knlGS:0000000000000000
[ 1.696297][ T1] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.696297][ T1] CR2: 0000000000000288 CR3: 0000000002e36000 CR4: 0000000000050660
[ 1.696297][ T1] Call Trace:
[ 1.696297][ T1] <TASK>
[ 1.696297][ T1] virtio_features_ok+0x1b/0xd0
[ 1.696297][ T1] virtio_dev_probe+0x19c/0x270
[ 1.696297][ T1] really_probe+0x19b/0x3e0
[ 1.696297][ T1] __driver_probe_device+0x78/0x160
[ 1.696297][ T1] driver_probe_device+0x1f/0x90
[ 1.696297][ T1] __driver_attach+0xd2/0x1c0
[ 1.696297][ T1] bus_for_each_dev+0x74/0xc0
[ 1.696297][ T1] bus_add_driver+0x116/0x220
[ 1.696297][ T1] driver_register+0x59/0x100
[ 1.696297][ T1] virtio_console_init+0x7f/0x110
[ 1.696297][ T1] do_one_initcall+0x47/0x220
[ 1.696297][ T1] kernel_init_freeable+0x328/0x480
[ 1.696297][ T1] kernel_init+0x1a/0x1c0
[ 1.696297][ T1] ret_from_fork+0x29/0x50
[ 1.696297][ T1] </TASK>
[ 1.696297][ T1] Modules linked in:
[ 1.696297][ T1] CR2: 0000000000000288
[ 1.696297][ T1] ---[ end trace 0000000000000000 ]---
In this case the PCI root bus is created from the ACPI description via acpi_pci_root_add() -> pci_acpi_scan_root() -> acpi_pci_root_create() -> pci_create_root_bus(), where the last function is called with parent=NULL. This indicates that no parent is present, so bus->bridge->parent is NULL as well.
Fix the problem by checking bus->bridge->parent in xen_dt_get_node() for NULL first.
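The shape of the fix can be sketched in userspace C. The types below are simplified, hypothetical stand-ins for `struct device` and `struct pci_bus` (only the fields needed here), and `root_bridge_of_node()` is an illustrative name rather than the kernel function; the sketch only shows the walk to the root bus followed by the added NULL check on the bridge's parent.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct device {
	struct device *parent;
	const void *of_node;
};

struct pci_bus {
	struct pci_bus *parent;		/* NULL on the root bus */
	struct device *bridge;
};

/*
 * Walk up to the root bus, then return the of_node of the bridge's
 * parent, or NULL when the root bridge has no parent device (as with
 * an ACPI-described root bus in the report above).
 */
static const void *root_bridge_of_node(struct pci_bus *bus)
{
	while (bus->parent)
		bus = bus->parent;

	if (!bus->bridge->parent)
		return NULL;
	return bus->bridge->parent->of_node;
}
```

Returning NULL here lets the caller treat the device as having no device-tree node instead of dereferencing a missing parent.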
Fixes: ef8ae384b4c9 ("xen/virtio: Handle PCI devices which Host controller is described in DT")
Signed-off-by: Petr Pavlu petr.pavlu@suse.com
Reviewed-by: Oleksandr Tyshchenko oleksandr_tyshchenko@epam.com
Reviewed-by: Stefano Stabellini sstabellini@kernel.org
Link: https://lore.kernel.org/r/20230621131214.9398-2-petr.pavlu@suse.com
Signed-off-by: Juergen Gross jgross@suse.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/xen/grant-dma-ops.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
index 9784a77fa3c99..76f6f26265a3b 100644
--- a/drivers/xen/grant-dma-ops.c
+++ b/drivers/xen/grant-dma-ops.c
@@ -303,6 +303,8 @@ static struct device_node *xen_dt_get_node(struct device *dev)
 		while (!pci_is_root_bus(bus))
 			bus = bus->parent;
 
+		if (!bus->bridge->parent)
+			return NULL;
 		return of_node_get(bus->bridge->parent->of_node);
 	}
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 1689f25924ada8fe14a4a82c38925d04994c7142 ]
The overflow checks on the use reference counter are incomplete.
Add helper function to deal with object reference counter tracking. Report -EMFILE in case UINT_MAX is reached.
nft_use_dec() splats if the reference counter underflows, which should never happen.
Add nft_use_inc_restore() and nft_use_dec_restore() which are used to restore reference counter from error and abort paths.
Use u32 in nft_flowtable and nft_object since helper functions cannot work on bitfields.
Remove the few early incomplete checks now that the helper functions are in place and used to check for refcount overflow.
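The helper pattern described above can be sketched in userspace C as follows. This is a minimal illustration, not the kernel code: `use_inc()` follows the shape of nft_use_inc(), but `use_dec()` reports underflow through its return value instead of warning, and `add_object()` with its `alloc_ok` parameter is a hypothetical caller invented here to show the counter being taken first and restored on the error path. `UINT32_MAX` is used in place of UINT_MAX because the sketch uses a fixed-width counter.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Take a reference; refuse instead of wrapping around at the maximum. */
static bool use_inc(uint32_t *use)
{
	if (*use == UINT32_MAX)
		return false;

	(*use)++;
	return true;
}

/*
 * Drop a reference. Unlike the kernel helper, which warns and still
 * decrements, this sketch leaves the counter alone and returns false
 * on underflow so the condition can be tested.
 */
static bool use_dec(uint32_t *use)
{
	if (*use == 0)
		return false;

	(*use)--;
	return true;
}

/*
 * Hypothetical caller: take the reference before doing any work, and
 * give it back if a later step fails.
 */
static int add_object(uint32_t *table_use, bool alloc_ok)
{
	if (!use_inc(table_use))
		return -EMFILE;

	if (!alloc_ok) {
		use_dec(table_use);	/* restore on the error path */
		return -ENOMEM;
	}

	return 0;
}
```

Taking the reference up front means the overflow case is reported before any allocation or transaction is set up, which matches how the checks are moved ahead of the existing `use++` sites in the diff below.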
Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 include/net/netfilter/nf_tables.h | 31 +++++-
 net/netfilter/nf_tables_api.c | 163 ++++++++++++++++++------------
 net/netfilter/nft_flow_offload.c | 6 +-
 net/netfilter/nft_immediate.c | 8 +-
 net/netfilter/nft_objref.c | 8 +-
 5 files changed, 141 insertions(+), 75 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index ee47d7143d99f..1b0beb8f08aee 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1211,6 +1211,29 @@ int __nft_release_basechain(struct nft_ctx *ctx);
unsigned int nft_do_chain(struct nft_pktinfo *pkt, void *priv);
+static inline bool nft_use_inc(u32 *use) +{ + if (*use == UINT_MAX) + return false; + + (*use)++; + + return true; +} + +static inline void nft_use_dec(u32 *use) +{ + WARN_ON_ONCE((*use)-- == 0); +} + +/* For error and abort path: restore use counter to previous state. */ +static inline void nft_use_inc_restore(u32 *use) +{ + WARN_ON_ONCE(!nft_use_inc(use)); +} + +#define nft_use_dec_restore nft_use_dec + /** * struct nft_table - nf_tables table * @@ -1296,8 +1319,8 @@ struct nft_object { struct list_head list; struct rhlist_head rhlhead; struct nft_object_hash_key key; - u32 genmask:2, - use:30; + u32 genmask:2; + u32 use; u64 handle; u16 udlen; u8 *udata; @@ -1399,8 +1422,8 @@ struct nft_flowtable { char *name; int hooknum; int ops_len; - u32 genmask:2, - use:30; + u32 genmask:2; + u32 use; u64 handle; /* runtime data below here */ struct list_head hook_list ____cacheline_aligned; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 79719e8cda799..18546f9b2a63a 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -253,8 +253,10 @@ int nf_tables_bind_chain(const struct nft_ctx *ctx, struct nft_chain *chain) if (chain->bound) return -EBUSY;
+ if (!nft_use_inc(&chain->use)) + return -EMFILE; + chain->bound = true; - chain->use++; nft_chain_trans_bind(ctx, chain);
return 0; @@ -437,7 +439,7 @@ static int nft_delchain(struct nft_ctx *ctx) if (IS_ERR(trans)) return PTR_ERR(trans);
- ctx->table->use--; + nft_use_dec(&ctx->table->use); nft_deactivate_next(ctx->net, ctx->chain);
return 0; @@ -476,7 +478,7 @@ nf_tables_delrule_deactivate(struct nft_ctx *ctx, struct nft_rule *rule) /* You cannot delete the same rule twice */ if (nft_is_active_next(ctx->net, rule)) { nft_deactivate_next(ctx->net, rule); - ctx->chain->use--; + nft_use_dec(&ctx->chain->use); return 0; } return -ENOENT; @@ -643,7 +645,7 @@ static int nft_delset(const struct nft_ctx *ctx, struct nft_set *set) nft_map_deactivate(ctx, set);
nft_deactivate_next(ctx->net, set); - ctx->table->use--; + nft_use_dec(&ctx->table->use);
return err; } @@ -675,7 +677,7 @@ static int nft_delobj(struct nft_ctx *ctx, struct nft_object *obj) return err;
nft_deactivate_next(ctx->net, obj); - ctx->table->use--; + nft_use_dec(&ctx->table->use);
return err; } @@ -710,7 +712,7 @@ static int nft_delflowtable(struct nft_ctx *ctx, return err;
nft_deactivate_next(ctx->net, flowtable); - ctx->table->use--; + nft_use_dec(&ctx->table->use);
return err; } @@ -2395,9 +2397,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask, struct nft_chain *chain; int err;
- if (table->use == UINT_MAX) - return -EOVERFLOW; - if (nla[NFTA_CHAIN_HOOK]) { struct nft_stats __percpu *stats = NULL; struct nft_chain_hook hook = {}; @@ -2493,6 +2492,11 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask, if (err < 0) goto err_destroy_chain;
+ if (!nft_use_inc(&table->use)) { + err = -EMFILE; + goto err_use; + } + trans = nft_trans_chain_add(ctx, NFT_MSG_NEWCHAIN); if (IS_ERR(trans)) { err = PTR_ERR(trans); @@ -2509,10 +2513,11 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask, goto err_unregister_hook; }
- table->use++; - return 0; + err_unregister_hook: + nft_use_dec_restore(&table->use); +err_use: nf_tables_unregister_hook(net, table, chain); err_destroy_chain: nf_tables_chain_destroy(ctx); @@ -3841,9 +3846,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return -EINVAL; handle = nf_tables_alloc_handle(table);
- if (chain->use == UINT_MAX) - return -EOVERFLOW; - if (nla[NFTA_RULE_POSITION]) { pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION])); old_rule = __nft_rule_lookup(chain, pos_handle); @@ -3937,6 +3939,11 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, } }
+ if (!nft_use_inc(&chain->use)) { + err = -EMFILE; + goto err_release_rule; + } + if (info->nlh->nlmsg_flags & NLM_F_REPLACE) { err = nft_delrule(&ctx, old_rule); if (err < 0) @@ -3968,7 +3975,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, } } kvfree(expr_info); - chain->use++;
if (flow) nft_trans_flow_rule(trans) = flow; @@ -3979,6 +3985,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return 0;
err_destroy_flow_rule: + nft_use_dec_restore(&chain->use); if (flow) nft_flow_rule_destroy(flow); err_release_rule: @@ -5015,9 +5022,15 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info, alloc_size = sizeof(*set) + size + udlen; if (alloc_size < size || alloc_size > INT_MAX) return -ENOMEM; + + if (!nft_use_inc(&table->use)) + return -EMFILE; + set = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT); - if (!set) - return -ENOMEM; + if (!set) { + err = -ENOMEM; + goto err_alloc; + }
name = nla_strdup(nla[NFTA_SET_NAME], GFP_KERNEL_ACCOUNT); if (!name) { @@ -5075,7 +5088,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info, goto err_set_expr_alloc;
list_add_tail_rcu(&set->list, &table->sets); - table->use++; + return 0;
err_set_expr_alloc: @@ -5087,6 +5100,9 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info, kfree(set->name); err_set_name: kvfree(set); +err_alloc: + nft_use_dec_restore(&table->use); + return err; }
@@ -5225,9 +5241,6 @@ int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set, struct nft_set_binding *i; struct nft_set_iter iter;
- if (set->use == UINT_MAX) - return -EOVERFLOW; - if (!list_empty(&set->bindings) && nft_set_is_anonymous(set)) return -EBUSY;
@@ -5255,10 +5268,12 @@ int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set, return iter.err; } bind: + if (!nft_use_inc(&set->use)) + return -EMFILE; + binding->chain = ctx->chain; list_add_tail_rcu(&binding->list, &set->bindings); nft_set_trans_bind(ctx, set); - set->use++;
return 0; } @@ -5332,7 +5347,7 @@ void nf_tables_activate_set(const struct nft_ctx *ctx, struct nft_set *set) nft_clear(ctx->net, set); }
- set->use++; + nft_use_inc_restore(&set->use); } EXPORT_SYMBOL_GPL(nf_tables_activate_set);
@@ -5348,7 +5363,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set, else list_del_rcu(&binding->list);
- set->use--; + nft_use_dec(&set->use); break; case NFT_TRANS_PREPARE: if (nft_set_is_anonymous(set)) { @@ -5357,7 +5372,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
nft_deactivate_next(ctx->net, set); } - set->use--; + nft_use_dec(&set->use); return; case NFT_TRANS_ABORT: case NFT_TRANS_RELEASE: @@ -5365,7 +5380,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set, set->flags & (NFT_SET_MAP | NFT_SET_OBJECT)) nft_map_deactivate(ctx, set);
- set->use--; + nft_use_dec(&set->use); fallthrough; default: nf_tables_unbind_set(ctx, set, binding, @@ -6134,7 +6149,7 @@ void nft_set_elem_destroy(const struct nft_set *set, void *elem, nft_set_elem_expr_destroy(&ctx, nft_set_ext_expr(ext));
if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF)) - (*nft_set_ext_obj(ext))->use--; + nft_use_dec(&(*nft_set_ext_obj(ext))->use); kfree(elem); } EXPORT_SYMBOL_GPL(nft_set_elem_destroy); @@ -6636,8 +6651,16 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, set->objtype, genmask); if (IS_ERR(obj)) { err = PTR_ERR(obj); + obj = NULL; goto err_parse_key_end; } + + if (!nft_use_inc(&obj->use)) { + err = -EMFILE; + obj = NULL; + goto err_parse_key_end; + } + err = nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF); if (err < 0) goto err_parse_key_end; @@ -6706,10 +6729,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (flags) *nft_set_ext_flags(ext) = flags;
- if (obj) { + if (obj) *nft_set_ext_obj(ext) = obj; - obj->use++; - } + if (ulen > 0) { if (nft_set_ext_check(&tmpl, NFT_SET_EXT_USERDATA, ulen) < 0) { err = -EINVAL; @@ -6774,12 +6796,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, kfree(trans); err_elem_free: nf_tables_set_elem_destroy(ctx, set, elem.priv); - if (obj) - obj->use--; err_parse_data: if (nla[NFTA_SET_ELEM_DATA] != NULL) nft_data_release(&elem.data.val, desc.type); err_parse_key_end: + if (obj) + nft_use_dec_restore(&obj->use); + nft_data_release(&elem.key_end.val, NFT_DATA_VALUE); err_parse_key: nft_data_release(&elem.key.val, NFT_DATA_VALUE); @@ -6859,7 +6882,7 @@ void nft_data_hold(const struct nft_data *data, enum nft_data_types type) case NFT_JUMP: case NFT_GOTO: chain = data->verdict.chain; - chain->use++; + nft_use_inc_restore(&chain->use); break; } } @@ -6874,7 +6897,7 @@ static void nft_setelem_data_activate(const struct net *net, if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) nft_data_hold(nft_set_ext_data(ext), set->dtype); if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF)) - (*nft_set_ext_obj(ext))->use++; + nft_use_inc_restore(&(*nft_set_ext_obj(ext))->use); }
static void nft_setelem_data_deactivate(const struct net *net, @@ -6886,7 +6909,7 @@ static void nft_setelem_data_deactivate(const struct net *net, if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA)) nft_data_release(nft_set_ext_data(ext), set->dtype); if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF)) - (*nft_set_ext_obj(ext))->use--; + nft_use_dec(&(*nft_set_ext_obj(ext))->use); }
static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set, @@ -7429,9 +7452,14 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info,
nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
+ if (!nft_use_inc(&table->use)) + return -EMFILE; + type = nft_obj_type_get(net, objtype); - if (IS_ERR(type)) - return PTR_ERR(type); + if (IS_ERR(type)) { + err = PTR_ERR(type); + goto err_type; + }
obj = nft_obj_init(&ctx, type, nla[NFTA_OBJ_DATA]); if (IS_ERR(obj)) { @@ -7465,7 +7493,7 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info, goto err_obj_ht;
list_add_tail_rcu(&obj->list, &table->objects); - table->use++; + return 0; err_obj_ht: /* queued in transaction log */ @@ -7481,6 +7509,9 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info, kfree(obj); err_init: module_put(type->owner); +err_type: + nft_use_dec_restore(&table->use); + return err; }
@@ -7882,7 +7913,7 @@ void nf_tables_deactivate_flowtable(const struct nft_ctx *ctx, case NFT_TRANS_PREPARE: case NFT_TRANS_ABORT: case NFT_TRANS_RELEASE: - flowtable->use--; + nft_use_dec(&flowtable->use); fallthrough; default: return; @@ -8236,9 +8267,14 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
+ if (!nft_use_inc(&table->use)) + return -EMFILE; + flowtable = kzalloc(sizeof(*flowtable), GFP_KERNEL_ACCOUNT); - if (!flowtable) - return -ENOMEM; + if (!flowtable) { + err = -ENOMEM; + goto flowtable_alloc; + }
flowtable->table = table; flowtable->handle = nf_tables_alloc_handle(table); @@ -8293,7 +8329,6 @@ static int nf_tables_newflowtable(struct sk_buff *skb, goto err5;
list_add_tail_rcu(&flowtable->list, &table->flowtables); - table->use++;
return 0; err5: @@ -8310,6 +8345,9 @@ static int nf_tables_newflowtable(struct sk_buff *skb, kfree(flowtable->name); err1: kfree(flowtable); +flowtable_alloc: + nft_use_dec_restore(&table->use); + return err; }
@@ -9680,7 +9718,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) */ if (nft_set_is_anonymous(nft_trans_set(trans)) && !list_empty(&nft_trans_set(trans)->bindings)) - trans->ctx.table->use--; + nft_use_dec(&trans->ctx.table->use); } nf_tables_set_notify(&trans->ctx, nft_trans_set(trans), NFT_MSG_NEWSET, GFP_KERNEL); @@ -9910,7 +9948,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nft_trans_destroy(trans); break; } - trans->ctx.table->use--; + nft_use_dec_restore(&trans->ctx.table->use); nft_chain_del(trans->ctx.chain); nf_tables_unregister_hook(trans->ctx.net, trans->ctx.table, @@ -9923,7 +9961,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) list_splice(&nft_trans_chain_hooks(trans), &nft_trans_basechain(trans)->hook_list); } else { - trans->ctx.table->use++; + nft_use_inc_restore(&trans->ctx.table->use); nft_clear(trans->ctx.net, trans->ctx.chain); } nft_trans_destroy(trans); @@ -9933,7 +9971,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nft_trans_destroy(trans); break; } - trans->ctx.chain->use--; + nft_use_dec_restore(&trans->ctx.chain->use); list_del_rcu(&nft_trans_rule(trans)->list); nft_rule_expr_deactivate(&trans->ctx, nft_trans_rule(trans), @@ -9943,7 +9981,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) break; case NFT_MSG_DELRULE: case NFT_MSG_DESTROYRULE: - trans->ctx.chain->use++; + nft_use_inc_restore(&trans->ctx.chain->use); nft_clear(trans->ctx.net, nft_trans_rule(trans)); nft_rule_expr_activate(&trans->ctx, nft_trans_rule(trans)); if (trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD) @@ -9956,7 +9994,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nft_trans_destroy(trans); break; } - trans->ctx.table->use--; + nft_use_dec_restore(&trans->ctx.table->use); if (nft_trans_set_bound(trans)) { nft_trans_destroy(trans); break; @@ -9965,7 +10003,7 @@ static int 
__nf_tables_abort(struct net *net, enum nfnl_abort_action action) break; case NFT_MSG_DELSET: case NFT_MSG_DESTROYSET: - trans->ctx.table->use++; + nft_use_inc_restore(&trans->ctx.table->use); nft_clear(trans->ctx.net, nft_trans_set(trans)); if (nft_trans_set(trans)->flags & (NFT_SET_MAP | NFT_SET_OBJECT)) nft_map_activate(&trans->ctx, nft_trans_set(trans)); @@ -10009,13 +10047,13 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nft_obj_destroy(&trans->ctx, nft_trans_obj_newobj(trans)); nft_trans_destroy(trans); } else { - trans->ctx.table->use--; + nft_use_dec_restore(&trans->ctx.table->use); nft_obj_del(nft_trans_obj(trans)); } break; case NFT_MSG_DELOBJ: case NFT_MSG_DESTROYOBJ: - trans->ctx.table->use++; + nft_use_inc_restore(&trans->ctx.table->use); nft_clear(trans->ctx.net, nft_trans_obj(trans)); nft_trans_destroy(trans); break; @@ -10024,7 +10062,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) nft_unregister_flowtable_net_hooks(net, &nft_trans_flowtable_hooks(trans)); } else { - trans->ctx.table->use--; + nft_use_dec_restore(&trans->ctx.table->use); list_del_rcu(&nft_trans_flowtable(trans)->list); nft_unregister_flowtable_net_hooks(net, &nft_trans_flowtable(trans)->hook_list); @@ -10036,7 +10074,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) list_splice(&nft_trans_flowtable_hooks(trans), &nft_trans_flowtable(trans)->hook_list); } else { - trans->ctx.table->use++; + nft_use_inc_restore(&trans->ctx.table->use); nft_clear(trans->ctx.net, nft_trans_flowtable(trans)); } nft_trans_destroy(trans); @@ -10486,8 +10524,9 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data, if (desc->flags & NFT_DATA_DESC_SETELEM && chain->flags & NFT_CHAIN_BINDING) return -EINVAL; + if (!nft_use_inc(&chain->use)) + return -EMFILE;
- chain->use++; data->verdict.chain = chain; break; } @@ -10505,7 +10544,7 @@ static void nft_verdict_uninit(const struct nft_data *data) case NFT_JUMP: case NFT_GOTO: chain = data->verdict.chain; - chain->use--; + nft_use_dec(&chain->use); break; } } @@ -10674,11 +10713,11 @@ int __nft_release_basechain(struct nft_ctx *ctx) nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain); list_for_each_entry_safe(rule, nr, &ctx->chain->rules, list) { list_del(&rule->list); - ctx->chain->use--; + nft_use_dec(&ctx->chain->use); nf_tables_rule_release(ctx, rule); } nft_chain_del(ctx->chain); - ctx->table->use--; + nft_use_dec(&ctx->table->use); nf_tables_chain_destroy(ctx);
return 0; @@ -10728,18 +10767,18 @@ static void __nft_release_table(struct net *net, struct nft_table *table) ctx.chain = chain; list_for_each_entry_safe(rule, nr, &chain->rules, list) { list_del(&rule->list); - chain->use--; + nft_use_dec(&chain->use); nf_tables_rule_release(&ctx, rule); } } list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) { list_del(&flowtable->list); - table->use--; + nft_use_dec(&table->use); nf_tables_flowtable_destroy(flowtable); } list_for_each_entry_safe(set, ns, &table->sets, list) { list_del(&set->list); - table->use--; + nft_use_dec(&table->use); if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT)) nft_map_deactivate(&ctx, set);
@@ -10747,13 +10786,13 @@ static void __nft_release_table(struct net *net, struct nft_table *table) } list_for_each_entry_safe(obj, ne, &table->objects, list) { nft_obj_del(obj); - table->use--; + nft_use_dec(&table->use); nft_obj_destroy(&ctx, obj); } list_for_each_entry_safe(chain, nc, &table->chains, list) { ctx.chain = chain; nft_chain_del(chain); - table->use--; + nft_use_dec(&table->use); nf_tables_chain_destroy(&ctx); } nf_tables_table_destroy(&ctx); diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index e860d8fe0e5e2..03159c6c6c4b6 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -404,8 +404,10 @@ static int nft_flow_offload_init(const struct nft_ctx *ctx, if (IS_ERR(flowtable)) return PTR_ERR(flowtable);
+ if (!nft_use_inc(&flowtable->use)) + return -EMFILE; + priv->flowtable = flowtable; - flowtable->use++;
return nf_ct_netns_get(ctx->net, ctx->family); } @@ -424,7 +426,7 @@ static void nft_flow_offload_activate(const struct nft_ctx *ctx, { struct nft_flow_offload *priv = nft_expr_priv(expr);
- priv->flowtable->use++; + nft_use_inc_restore(&priv->flowtable->use); }
static void nft_flow_offload_destroy(const struct nft_ctx *ctx, diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c index 3d76ebfe8939b..407d7197f75bb 100644 --- a/net/netfilter/nft_immediate.c +++ b/net/netfilter/nft_immediate.c @@ -159,7 +159,7 @@ static void nft_immediate_deactivate(const struct nft_ctx *ctx, default: nft_chain_del(chain); chain->bound = false; - chain->table->use--; + nft_use_dec(&chain->table->use); break; } break; @@ -198,7 +198,7 @@ static void nft_immediate_destroy(const struct nft_ctx *ctx, * let the transaction records release this chain and its rules. */ if (chain->bound) { - chain->use--; + nft_use_dec(&chain->use); break; }
@@ -206,9 +206,9 @@ static void nft_immediate_destroy(const struct nft_ctx *ctx, chain_ctx = *ctx; chain_ctx.chain = chain;
- chain->use--; + nft_use_dec(&chain->use); list_for_each_entry_safe(rule, n, &chain->rules, list) { - chain->use--; + nft_use_dec(&chain->use); list_del(&rule->list); nf_tables_rule_destroy(&chain_ctx, rule); } diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c index a48dd5b5d45b1..509011b1ef597 100644 --- a/net/netfilter/nft_objref.c +++ b/net/netfilter/nft_objref.c @@ -41,8 +41,10 @@ static int nft_objref_init(const struct nft_ctx *ctx, if (IS_ERR(obj)) return -ENOENT;
+ if (!nft_use_inc(&obj->use)) + return -EMFILE; + nft_objref_priv(expr) = obj; - obj->use++;
return 0; } @@ -72,7 +74,7 @@ static void nft_objref_deactivate(const struct nft_ctx *ctx, if (phase == NFT_TRANS_COMMIT) return;
- obj->use--; + nft_use_dec(&obj->use); }
static void nft_objref_activate(const struct nft_ctx *ctx, @@ -80,7 +82,7 @@ static void nft_objref_activate(const struct nft_ctx *ctx, { struct nft_object *obj = nft_objref_priv(expr);
- obj->use++; + nft_use_inc_restore(&obj->use); }
static const struct nft_expr_ops nft_objref_ops = {
From: Florian Westphal fw@strlen.de
[ Upstream commit eaf9e7192ec9af2fbf1b6eb2299dd0feca6c5f7e ]
Originally this used jhash2() over tuple and folded the zone id, the pernet hash value, destination port and l4 protocol number into the 32bit seed value.
When the switch to siphash was done, I used an on-stack temporary buffer to build a suitable key to be hashed via siphash().
But this showed up as a performance regression, so I got rid of the temporary copy and collected the to-be-hashed data in 4 u64 variables.
This makes it easy to build tuples that produce the same hash, which isn't desirable even though chain lengths are limited.
Switch back to plain siphash, but just like with jhash2(), take advantage of the fact that most of the to-be-hashed data is already in a suitable order.
Use an empty struct as annotation in 'struct nf_conntrack_tuple' to mark the last member that can be used as hash input.
The only remaining data that isn't present in the tuple structure are the zone identifier and the pernet hash: fold those into the key.
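A minimal userspace sketch of the idea described above, under stated assumptions: struct demo_tuple, demo_keyed_hash() and demo_hash_tuple() are hypothetical names, and an FNV-1a style loop stands in for siphash(), which is not available outside the kernel. Portable C has no zero-size members, so the sketch takes the hash length from offsetof() of the first excluded field rather than from an empty marker struct; the resulting length is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tuple; not the kernel's struct nf_conntrack_tuple. */
struct demo_tuple {
	uint32_t src_addr;
	uint32_t dst_addr;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t  protonum;
	/* Everything from here on is excluded from the hash input. */
	uint8_t  dir;
};

/* Number of leading bytes used as hash input: everything before 'dir'. */
#define DEMO_HASH_LEN offsetof(struct demo_tuple, dir)

struct demo_key {
	uint64_t key[2];
};

/* Stand-in keyed hash (FNV-1a style); the kernel patch uses siphash(). */
static uint32_t demo_keyed_hash(const void *data, size_t len,
				const struct demo_key *key)
{
	const unsigned char *p = data;
	uint64_t h = 0xcbf29ce484222325ULL ^ key->key[0];
	size_t i;

	assert(data != NULL && key != NULL);

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	h ^= key->key[1];
	return (uint32_t)(h ^ (h >> 32));
}

/* Fold zone id and per-netns mix into a copy of the seed, then hash the prefix. */
static uint32_t demo_hash_tuple(const struct demo_tuple *t, uint32_t zoneid,
				uint32_t net_mix, const struct demo_key *seed)
{
	struct demo_key key = *seed;

	key.key[0] ^= zoneid;
	key.key[1] ^= net_mix;
	return demo_keyed_hash(t, DEMO_HASH_LEN, &key);
}
```

Two tuples that differ only in dir hash to the same value here, because dir lies beyond DEMO_HASH_LEN. The fields before dir are laid out without padding in this sketch; a real structure with padding inside the hashed prefix would need to be zeroed first.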
Fixes: d2c806abcf0b ("netfilter: conntrack: use siphash_4u64") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_conntrack_tuple.h | 3 +++ net/netfilter/nf_conntrack_core.c | 20 +++++++------------- 2 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_tuple.h b/include/net/netfilter/nf_conntrack_tuple.h index 9334371c94e2b..f7dd950ff2509 100644 --- a/include/net/netfilter/nf_conntrack_tuple.h +++ b/include/net/netfilter/nf_conntrack_tuple.h @@ -67,6 +67,9 @@ struct nf_conntrack_tuple { /* The protocol. */ u_int8_t protonum;
+ /* The direction must be ignored for the tuplehash */ + struct { } __nfct_hash_offsetend; + /* The direction (for tuplehash) */ u_int8_t dir; } dst; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index d119f1d4c2fc8..992393102d5f5 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -211,24 +211,18 @@ static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple, unsigned int zoneid, const struct net *net) { - u64 a, b, c, d; + siphash_key_t key;
get_random_once(&nf_conntrack_hash_rnd, sizeof(nf_conntrack_hash_rnd));
- /* The direction must be ignored, handle usable tuplehash members manually */ - a = (u64)tuple->src.u3.all[0] << 32 | tuple->src.u3.all[3]; - b = (u64)tuple->dst.u3.all[0] << 32 | tuple->dst.u3.all[3]; + key = nf_conntrack_hash_rnd;
- c = (__force u64)tuple->src.u.all << 32 | (__force u64)tuple->dst.u.all << 16; - c |= tuple->dst.protonum; + key.key[0] ^= zoneid; + key.key[1] ^= net_hash_mix(net);
- d = (u64)zoneid << 32 | net_hash_mix(net); - - /* IPv4: u3.all[1,2,3] == 0 */ - c ^= (u64)tuple->src.u3.all[1] << 32 | tuple->src.u3.all[2]; - d += (u64)tuple->dst.u3.all[1] << 32 | tuple->dst.u3.all[2]; - - return (u32)siphash_4u64(a, b, c, d, &nf_conntrack_hash_rnd); + return siphash((void *)tuple, + offsetofend(struct nf_conntrack_tuple, dst.__nfct_hash_offsetend), + &key); }
static u32 scale_hash(u32 hash)
From: Sridhar Samudrala sridhar.samudrala@intel.com
[ Upstream commit 5f16da6ee6ac32e6c8098bc4cfcc4f170694f9da ]
Remove the incorrect check in ice_validate_mqprio_qopt() that limits filter configuration when the sum of the max_rates of all TCs exceeds the link speed. The max rate of each TC is unrelated to the values used by other TCs and is valid as long as it does not exceed the link speed.
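The difference between the old and new check can be illustrated with a small standalone sketch. demo_validate_max_rates() is a hypothetical helper, not driver code; it only mirrors the per-TC comparison that the patch adds, and it skips a max_rate of 0 the same way the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate each TC's max rate against the link speed.
 * Rates and speed are assumed to be in the same unit (e.g. Kbps).
 */
static int demo_validate_max_rates(const uint64_t *max_rate, int num_tc,
				   uint64_t speed)
{
	int i;

	assert(max_rate != NULL && num_tc >= 0);

	for (i = 0; i < num_tc; i++) {
		/* Each TC is checked on its own; no sum is formed. */
		if (max_rate[i] && max_rate[i] > speed)
			return -EINVAL;
	}
	return 0;
}
```

With a link speed of 1000, the rates {600, 600} pass even though their sum is 1200, which the removed sum-based check would have rejected; {1200, 100} still fails because the first TC alone exceeds the link speed.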
Fixes: fbc7b27af0f9 ("ice: enable ndo_setup_tc support for mqprio_qdisc") Signed-off-by: Sridhar Samudrala sridhar.samudrala@intel.com Signed-off-by: Sudheer Mogilappagari sudheer.mogilappagari@intel.com Tested-by: Bharathi Sreenivas bharathi.sreenivas@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index fcc027c938fda..eef7c1224887a 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -8114,10 +8114,10 @@ static int ice_validate_mqprio_qopt(struct ice_vsi *vsi, struct tc_mqprio_qopt_offload *mqprio_qopt) { - u64 sum_max_rate = 0, sum_min_rate = 0; int non_power_of_2_qcount = 0; struct ice_pf *pf = vsi->back; int max_rss_q_cnt = 0; + u64 sum_min_rate = 0; struct device *dev; int i, speed; u8 num_tc; @@ -8133,6 +8133,7 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi, dev = ice_pf_to_dev(pf); vsi->ch_rss_size = 0; num_tc = mqprio_qopt->qopt.num_tc; + speed = ice_get_link_speed_kbps(vsi);
for (i = 0; num_tc; i++) { int qcount = mqprio_qopt->qopt.count[i]; @@ -8173,7 +8174,6 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi, */ max_rate = mqprio_qopt->max_rate[i]; max_rate = div_u64(max_rate, ICE_BW_KBPS_DIVISOR); - sum_max_rate += max_rate;
/* min_rate is minimum guaranteed rate and it can't be zero */ min_rate = mqprio_qopt->min_rate[i]; @@ -8186,6 +8186,12 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi, return -EINVAL; }
+ if (max_rate && max_rate > speed) { + dev_err(dev, "TC%d: max_rate(%llu Kbps) > link speed of %u Kbps\n", + i, max_rate, speed); + return -EINVAL; + } + iter_div_u64_rem(min_rate, ICE_MIN_BW_LIMIT, &rem); if (rem) { dev_err(dev, "TC%d: Min Rate not multiple of %u Kbps", @@ -8223,12 +8229,6 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi, (mqprio_qopt->qopt.offset[i] + mqprio_qopt->qopt.count[i])) return -EINVAL;
- speed = ice_get_link_speed_kbps(vsi); - if (sum_max_rate && sum_max_rate > (u64)speed) { - dev_err(dev, "Invalid max Tx rate(%llu) Kbps > speed(%u) Kbps specified\n", - sum_max_rate, speed); - return -EINVAL; - } if (sum_min_rate && sum_min_rate > (u64)speed) { dev_err(dev, "Invalid min Tx rate(%llu) Kbps > speed (%u) Kbps specified\n", sum_min_rate, speed);
From: Sridhar Samudrala sridhar.samudrala@intel.com
[ Upstream commit 479cdfe388a04a16fdd127f3e9e9e019e45e5573 ]
Configuring tx_maxrate via the sysfs interface /sys/class/net/eth0/queues/tx-1/tx_maxrate did not work when TCs were configured, because the main VSI was always being used. Fix this by using the correct VSI in ice_set_tx_maxrate() when TCs are configured.
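As a rough sketch of the lookup involved, assuming each TC owns a contiguous range of queues described by an offset and a count: struct demo_tc_queues and demo_locate_tc() are hypothetical names for illustration and return a TC index rather than a VSI pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-TC queue layout: queues [offset, offset + count). */
struct demo_tc_queues {
	int offset;
	int count;
};

/* Return the index of the TC that owns 'queue', or -1 if no TC does. */
static int demo_locate_tc(const struct demo_tc_queues *tcs, int num_tc,
			  int queue)
{
	int tc;

	assert(tcs != NULL);

	for (tc = 0; tc < num_tc; tc++) {
		if (queue >= tcs[tc].offset &&
		    queue < tcs[tc].offset + tcs[tc].count)
			return tc;
	}
	return -1;
}
```

A caller that always used TC 0 (the analogue of the main VSI) would apply the rate limit to the wrong owner for any queue outside the first range, which is the kind of mismatch the fix addresses.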
Fixes: 1ddef455f4a8 ("ice: Add NDO callback to set the maximum per-queue bitrate") Signed-off-by: Sridhar Samudrala sridhar.samudrala@intel.com Signed-off-by: Sudheer Mogilappagari sudheer.mogilappagari@intel.com Tested-by: Bharathi Sreenivas bharathi.sreenivas@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 7 +++++++ drivers/net/ethernet/intel/ice/ice_tc_lib.c | 22 ++++++++++----------- drivers/net/ethernet/intel/ice/ice_tc_lib.h | 1 + 3 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index eef7c1224887a..1277e0a044ee4 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -5969,6 +5969,13 @@ ice_set_tx_maxrate(struct net_device *netdev, int queue_index, u32 maxrate) q_handle = vsi->tx_rings[queue_index]->q_handle; tc = ice_dcb_get_tc(vsi, queue_index);
+ vsi = ice_locate_vsi_using_queue(vsi, queue_index); + if (!vsi) { + netdev_err(netdev, "Invalid VSI for given queue %d\n", + queue_index); + return -EINVAL; + } + /* Set BW back to default, when user set maxrate to 0 */ if (!maxrate) status = ice_cfg_q_bw_dflt_lmt(vsi->port_info, vsi->idx, tc, diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.c b/drivers/net/ethernet/intel/ice/ice_tc_lib.c index d1a31f236d26a..8578dc1cb967d 100644 --- a/drivers/net/ethernet/intel/ice/ice_tc_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.c @@ -735,17 +735,16 @@ ice_eswitch_add_tc_fltr(struct ice_vsi *vsi, struct ice_tc_flower_fltr *fltr) /** * ice_locate_vsi_using_queue - locate VSI using queue (forward to queue action) * @vsi: Pointer to VSI - * @tc_fltr: Pointer to tc_flower_filter + * @queue: Queue index * - * Locate the VSI using specified queue. When ADQ is not enabled, always - * return input VSI, otherwise locate corresponding VSI based on per channel - * offset and qcount + * Locate the VSI using specified "queue". When ADQ is not enabled, + * always return input VSI, otherwise locate corresponding + * VSI based on per channel "offset" and "qcount" */ -static struct ice_vsi * -ice_locate_vsi_using_queue(struct ice_vsi *vsi, - struct ice_tc_flower_fltr *tc_fltr) +struct ice_vsi * +ice_locate_vsi_using_queue(struct ice_vsi *vsi, int queue) { - int num_tc, tc, queue; + int num_tc, tc;
/* if ADQ is not active, passed VSI is the candidate VSI */ if (!ice_is_adq_active(vsi->back)) @@ -755,7 +754,6 @@ ice_locate_vsi_using_queue(struct ice_vsi *vsi, * upon queue number) */ num_tc = vsi->mqprio_qopt.qopt.num_tc; - queue = tc_fltr->action.fwd.q.queue;
for (tc = 0; tc < num_tc; tc++) { int qcount = vsi->mqprio_qopt.qopt.count[tc]; @@ -797,6 +795,7 @@ ice_tc_forward_action(struct ice_vsi *vsi, struct ice_tc_flower_fltr *tc_fltr) struct ice_pf *pf = vsi->back; struct device *dev; u32 tc_class; + int q;
dev = ice_pf_to_dev(pf);
@@ -825,7 +824,8 @@ ice_tc_forward_action(struct ice_vsi *vsi, struct ice_tc_flower_fltr *tc_fltr) /* Determine destination VSI even though the action is * FWD_TO_QUEUE, because QUEUE is associated with VSI */ - dest_vsi = tc_fltr->dest_vsi; + q = tc_fltr->action.fwd.q.queue; + dest_vsi = ice_locate_vsi_using_queue(vsi, q); break; default: dev_err(dev, @@ -1702,7 +1702,7 @@ ice_tc_forward_to_queue(struct ice_vsi *vsi, struct ice_tc_flower_fltr *fltr, /* If ADQ is configured, and the queue belongs to ADQ VSI, then prepare * ADQ switch filter */ - ch_vsi = ice_locate_vsi_using_queue(vsi, fltr); + ch_vsi = ice_locate_vsi_using_queue(vsi, fltr->action.fwd.q.queue); if (!ch_vsi) return -EINVAL; fltr->dest_vsi = ch_vsi; diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.h b/drivers/net/ethernet/intel/ice/ice_tc_lib.h index 8d5e22ac7023c..189c73d885356 100644 --- a/drivers/net/ethernet/intel/ice/ice_tc_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.h @@ -203,6 +203,7 @@ static inline int ice_chnl_dmac_fltr_cnt(struct ice_pf *pf) return pf->num_dmac_chnl_fltrs; }
+struct ice_vsi *ice_locate_vsi_using_queue(struct ice_vsi *vsi, int queue); int ice_add_cls_flower(struct net_device *netdev, struct ice_vsi *vsi, struct flow_cls_offload *cls_flower);
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit ed89b74d2dc920cb61d3094e0e97ec8775b13086 ]
Add a condition so that the qbv counter is increased during taprio qbv configuration only.
A TC may already have been set up when the user configures the ETF/CBS qdisc; without the condition above, the counter would be increased in that case as well.
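A simplified sketch of the guarded counter, with hypothetical names throughout (enum demo_setup_type, struct demo_adapter, demo_account_reconfig()); the real driver also reads hardware registers, which are reduced to a boolean here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical setup types; the kernel's enum tc_setup_type has more values. */
enum demo_setup_type {
	DEMO_SETUP_TAPRIO,
	DEMO_SETUP_ETF,
	DEMO_SETUP_CBS,
};

struct demo_adapter {
	enum demo_setup_type tc_setup_type;  /* last requested setup type */
	unsigned int qbv_config_change_errors;
};

/* Count a config-change error only when the request came from taprio. */
static void demo_account_reconfig(struct demo_adapter *a, bool gcl_running,
				  bool reconfig)
{
	assert(a != NULL);

	if (gcl_running && a->tc_setup_type == DEMO_SETUP_TAPRIO && reconfig)
		a->qbv_config_change_errors++;
}
```

With the setup type recorded on every request, an ETF or CBS reconfiguration while a schedule is running leaves the counter unchanged in this sketch.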
Fixes: ae4fe4698300 ("igc: Add qbv_config_change_errors counter") Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 1 + drivers/net/ethernet/intel/igc/igc_main.c | 2 ++ drivers/net/ethernet/intel/igc/igc_tsn.c | 1 + 3 files changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 9dc9b982a7ea6..9902f726f06a9 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -184,6 +184,7 @@ struct igc_adapter { u32 max_frame_size; u32 min_frame_size;
+ int tc_setup_type; ktime_t base_time; ktime_t cycle_time; bool qbv_enable; diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 5f2e8bcd75973..a8815ccf7887d 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -6295,6 +6295,8 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type, { struct igc_adapter *adapter = netdev_priv(dev);
+ adapter->tc_setup_type = type; + switch (type) { case TC_QUERY_CAPS: return igc_tc_query_caps(adapter, type_data); diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c index 94a2b0dfb54d4..6b299b83e7ef2 100644 --- a/drivers/net/ethernet/intel/igc/igc_tsn.c +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c @@ -249,6 +249,7 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter) * Gate Control List (GCL) is running. */ if ((rd32(IGC_BASET_H) || rd32(IGC_BASET_L)) && + (adapter->tc_setup_type == TC_SETUP_QDISC_TAPRIO) && tsn_mode_reconfig) adapter->qbv_config_change_errors++; } else {
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit cca28ceac7c7857bc2d313777017585aef00bcc4 ]
Remove the unnecessary delay during TX ring configuration. The delay slows things down, especially during link down and link up activity.
Furthermore, old SKUs such as the I225 call reset_adapter to reset the controller when setting the TSN mode Gate Control List (GCL). This adds more time to the configuration of the real-time use case.
The Software User Manual does not mention this delay. It might have been ported from the legacy I210 code in the past.
Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index a8815ccf7887d..b131c8f2b03df 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -711,7 +711,6 @@ static void igc_configure_tx_ring(struct igc_adapter *adapter, /* disable the queue */ wr32(IGC_TXDCTL(reg_idx), 0); wrfl(); - mdelay(10);
wr32(IGC_TDLEN(reg_idx), ring->count * sizeof(union igc_adv_tx_desc));
From: Jesper Dangaard Brouer brouer@redhat.com
[ Upstream commit 73b7123de0cfa4f6609677e927ab02cb05b593c2 ]
Driver-specific metadata for XDP-hints kfuncs is propagated by tail-extending the struct xdp_buff with a locally scoped driver struct.
Zero-Copy AF_XDP/XSK does a similar trick via struct xdp_buff_xsk. This xdp_buff_xsk struct contains a CB area (24 bytes) into which the locally scoped driver struct can be extended. The XSK_CHECK_PRIV_TYPE define catches size violations at build time.
The changes needed for AF_XDP zero-copy in igc_clean_rx_irq_zc() are done in the next patch, because the member rx_desc isn't available at this point.
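The tail-extension pattern and the build-time size check can be sketched in plain C11 as follows. All demo_* names are hypothetical stand-ins, and the static_assert is only an approximation of what a check such as XSK_CHECK_PRIV_TYPE is described as doing above. The cast relies on the base struct being the first member of both wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the generic buffer descriptor. */
struct demo_buff {
	void *data;
	void *data_end;
};

/* Stand-in for the pool-side wrapper with a 24-byte control block (CB). */
struct demo_pool_buff {
	struct demo_buff xdp;  /* must stay the first member */
	uint8_t cb[24];        /* scratch area drivers may overlay */
	void *pool;
};

/* Driver-private wrapper: base struct first, private fields after it. */
struct demo_drv_buff {
	struct demo_buff xdp;  /* must stay the first member */
	void *rx_desc;         /* falls into demo_pool_buff.cb */
};

/* Build-time check: driver wrapper must not extend past the end of cb. */
static_assert(sizeof(struct demo_drv_buff) <=
	      offsetof(struct demo_pool_buff, cb) +
	      sizeof(((struct demo_pool_buff *)0)->cb),
	      "driver-private fields do not fit in the CB area");

/* Reinterpret a base pointer as the driver wrapper that starts with it. */
static struct demo_drv_buff *demo_to_drv_buff(struct demo_buff *xdp)
{
	return (struct demo_drv_buff *)xdp;
}
```

If a later change grew the driver wrapper beyond the CB area, the static_assert would stop the build instead of letting the private fields overwrite whatever follows cb. Overlaying one struct type on another like this assumes a build without strict-aliasing optimizations, as is the case for the kernel.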
Signed-off-by: Jesper Dangaard Brouer brouer@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Yoong Siang yoong.siang.song@intel.com Link: https://lore.kernel.org/bpf/168182464779.616355.3761989884165609387.stgit@fi... Stable-dep-of: 175c241288c0 ("igc: Fix TX Hang issue when QBV Gate is closed") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 5 +++++ drivers/net/ethernet/intel/igc/igc_main.c | 16 +++++++++------- 2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 9902f726f06a9..3bb48840a249e 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -502,6 +502,11 @@ struct igc_rx_buffer { }; };
+/* context wrapper around xdp_buff to provide access to descriptor metadata */ +struct igc_xdp_buff { + struct xdp_buff xdp; +}; + struct igc_q_vector { struct igc_adapter *adapter; /* backlink */ void __iomem *itr_register; diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index b131c8f2b03df..c6169357f72fc 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2246,6 +2246,8 @@ static bool igc_alloc_rx_buffers_zc(struct igc_ring *ring, u16 count) if (!count) return ok;
+ XSK_CHECK_PRIV_TYPE(struct igc_xdp_buff); + desc = IGC_RX_DESC(ring, i); bi = &ring->rx_buffer_info[i]; i -= ring->count; @@ -2530,8 +2532,8 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget) union igc_adv_rx_desc *rx_desc; struct igc_rx_buffer *rx_buffer; unsigned int size, truesize; + struct igc_xdp_buff ctx; ktime_t timestamp = 0; - struct xdp_buff xdp; int pkt_offset = 0; void *pktbuf;
@@ -2565,13 +2567,13 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget) }
if (!skb) { - xdp_init_buff(&xdp, truesize, &rx_ring->xdp_rxq); - xdp_prepare_buff(&xdp, pktbuf - igc_rx_offset(rx_ring), + xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq); + xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring), igc_rx_offset(rx_ring) + pkt_offset, size, true); - xdp_buff_clear_frags_flag(&xdp); + xdp_buff_clear_frags_flag(&ctx.xdp);
- skb = igc_xdp_run_prog(adapter, &xdp); + skb = igc_xdp_run_prog(adapter, &ctx.xdp); }
if (IS_ERR(skb)) { @@ -2593,9 +2595,9 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget) } else if (skb) igc_add_rx_frag(rx_ring, rx_buffer, skb, size); else if (ring_uses_build_skb(rx_ring)) - skb = igc_build_skb(rx_ring, rx_buffer, &xdp); + skb = igc_build_skb(rx_ring, rx_buffer, &ctx.xdp); else - skb = igc_construct_skb(rx_ring, rx_buffer, &xdp, + skb = igc_construct_skb(rx_ring, rx_buffer, &ctx.xdp, timestamp);
/* exit if we failed to retrieve a buffer */
From: Jesper Dangaard Brouer brouer@redhat.com
[ Upstream commit 8416814fffa9cfa74c18da149f522dd9e1850987 ]
This implements the XDP hints kfunc for RX-hash (xmo_rx_hash). The HW RSS hash type is handled via a mapping table.
In its default config, this igc driver does L3 hashing for UDP packets (UDP src/dest ports are excluded from the hash calculation), meaning the RSS hash type is L3 based. Tested that the igc_rss_type_num for UDP is either IGC_RSS_TYPE_HASH_IPV4 or IGC_RSS_TYPE_HASH_IPV6.
This patch also updates AF_XDP zero-copy function igc_clean_rx_irq_zc() to use the xdp_buff wrapper struct igc_xdp_buff.
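The mapping-table approach can be sketched in isolation like this. The enum values, the numeric hardware type codes and the 4-bit mask are assumptions made for illustration; only the general shape (a table sized for the full bit-mask so that reserved codes cannot index out of bounds) follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generic hash types; NONE must be 0 so unset slots map to it. */
enum demo_hash_type {
	DEMO_HASH_NONE = 0,
	DEMO_HASH_L2,
	DEMO_HASH_L3_IPV4,
	DEMO_HASH_L4_IPV4_TCP,
};

/* Assumed: the hardware reports the type in the low 4 bits of a word. */
#define DEMO_RSS_TYPE_MASK  0xFu
#define DEMO_RSS_TABLE_SIZE (DEMO_RSS_TYPE_MASK + 1)

static_assert(DEMO_HASH_NONE == 0, "unset table slots must mean NONE");

/* Table covers every value the mask can produce, including reserved ones. */
static const enum demo_hash_type demo_rss_map[DEMO_RSS_TABLE_SIZE] = {
	[0] = DEMO_HASH_L2,           /* assumed code for "no hash" */
	[1] = DEMO_HASH_L4_IPV4_TCP,  /* assumed code for TCP/IPv4 */
	[2] = DEMO_HASH_L3_IPV4,      /* assumed code for IPv4 */
	/* remaining entries default to DEMO_HASH_NONE */
};

/* Translate a raw hardware word into the generic hash type. */
static enum demo_hash_type demo_rss_lookup(uint32_t hw_word)
{
	return demo_rss_map[hw_word & DEMO_RSS_TYPE_MASK];
}
```

Because the table has one slot for every value the mask can yield, the lookup needs no bounds check, and codes that are reserved today simply report no hash type.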
Signed-off-by: Jesper Dangaard Brouer brouer@redhat.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Song Yoong Siang yoong.siang.song@intel.com Link: https://lore.kernel.org/bpf/168182465285.616355.2701740913376314790.stgit@fi... Stable-dep-of: 175c241288c0 ("igc: Fix TX Hang issue when QBV Gate is closed") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 1 + drivers/net/ethernet/intel/igc/igc_main.c | 53 +++++++++++++++++++++++ 2 files changed, 54 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 3bb48840a249e..f09c6a65e3ab8 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -505,6 +505,7 @@ struct igc_rx_buffer { /* context wrapper around xdp_buff to provide access to descriptor metadata */ struct igc_xdp_buff { struct xdp_buff xdp; + union igc_adv_rx_desc *rx_desc; };
struct igc_q_vector { diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index c6169357f72fc..c0e21701e7817 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2572,6 +2572,7 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget) igc_rx_offset(rx_ring) + pkt_offset, size, true); xdp_buff_clear_frags_flag(&ctx.xdp); + ctx.rx_desc = rx_desc;
skb = igc_xdp_run_prog(adapter, &ctx.xdp); } @@ -2698,6 +2699,15 @@ static void igc_dispatch_skb_zc(struct igc_q_vector *q_vector, napi_gro_receive(&q_vector->napi, skb); }
+static struct igc_xdp_buff *xsk_buff_to_igc_ctx(struct xdp_buff *xdp) +{ + /* xdp_buff pointer used by ZC code path is alloc as xdp_buff_xsk. The + * igc_xdp_buff shares its layout with xdp_buff_xsk and private + * igc_xdp_buff fields fall into xdp_buff_xsk->cb + */ + return (struct igc_xdp_buff *)xdp; +} + static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget) { struct igc_adapter *adapter = q_vector->adapter; @@ -2716,6 +2726,7 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget) while (likely(total_packets < budget)) { union igc_adv_rx_desc *desc; struct igc_rx_buffer *bi; + struct igc_xdp_buff *ctx; ktime_t timestamp = 0; unsigned int size; int res; @@ -2733,6 +2744,9 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
bi = &ring->rx_buffer_info[ntc];
+ ctx = xsk_buff_to_igc_ctx(bi->xdp); + ctx->rx_desc = desc; + if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) { timestamp = igc_ptp_rx_pktstamp(q_vector->adapter, bi->xdp->data); @@ -6490,6 +6504,44 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg) return value; }
+/* Mapping HW RSS Type to enum xdp_rss_hash_type */ +static enum xdp_rss_hash_type igc_xdp_rss_type[IGC_RSS_TYPE_MAX_TABLE] = { + [IGC_RSS_TYPE_NO_HASH] = XDP_RSS_TYPE_L2, + [IGC_RSS_TYPE_HASH_TCP_IPV4] = XDP_RSS_TYPE_L4_IPV4_TCP, + [IGC_RSS_TYPE_HASH_IPV4] = XDP_RSS_TYPE_L3_IPV4, + [IGC_RSS_TYPE_HASH_TCP_IPV6] = XDP_RSS_TYPE_L4_IPV6_TCP, + [IGC_RSS_TYPE_HASH_IPV6_EX] = XDP_RSS_TYPE_L3_IPV6_EX, + [IGC_RSS_TYPE_HASH_IPV6] = XDP_RSS_TYPE_L3_IPV6, + [IGC_RSS_TYPE_HASH_TCP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_TCP_EX, + [IGC_RSS_TYPE_HASH_UDP_IPV4] = XDP_RSS_TYPE_L4_IPV4_UDP, + [IGC_RSS_TYPE_HASH_UDP_IPV6] = XDP_RSS_TYPE_L4_IPV6_UDP, + [IGC_RSS_TYPE_HASH_UDP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_UDP_EX, + [10] = XDP_RSS_TYPE_NONE, /* RSS Type above 9 "Reserved" by HW */ + [11] = XDP_RSS_TYPE_NONE, /* keep array sized for SW bit-mask */ + [12] = XDP_RSS_TYPE_NONE, /* to handle future HW revisons */ + [13] = XDP_RSS_TYPE_NONE, + [14] = XDP_RSS_TYPE_NONE, + [15] = XDP_RSS_TYPE_NONE, +}; + +static int igc_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash, + enum xdp_rss_hash_type *rss_type) +{ + const struct igc_xdp_buff *ctx = (void *)_ctx; + + if (!(ctx->xdp.rxq->dev->features & NETIF_F_RXHASH)) + return -ENODATA; + + *hash = le32_to_cpu(ctx->rx_desc->wb.lower.hi_dword.rss); + *rss_type = igc_xdp_rss_type[igc_rss_type(ctx->rx_desc)]; + + return 0; +} + +static const struct xdp_metadata_ops igc_xdp_metadata_ops = { + .xmo_rx_hash = igc_xdp_rx_hash, +}; + /** * igc_probe - Device Initialization Routine * @pdev: PCI device information struct @@ -6563,6 +6615,7 @@ static int igc_probe(struct pci_dev *pdev, hw->hw_addr = adapter->io_addr;
netdev->netdev_ops = &igc_netdev_ops; + netdev->xdp_metadata_ops = &igc_xdp_metadata_ops; igc_ethtool_set_ops(netdev); netdev->watchdog_timeo = 5 * HZ;
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit 175c241288c09f81eb7b44d65c1ef6045efa4d1a ]
If a user schedules a Gate Control List (GCL) to close one of the QBV gates while also transmitting a packet to that closed gate, a TX hang will happen. The HW does not drop any packet when the gate is closed; it keeps queuing packets in the HW TX FIFO until the gate is re-opened. This patch implements a solution that drops packets destined for a closed gate.
This patch also resets the adapter to perform SW initialization when the first Gate Control List (GCL) is configured, to avoid the hang. This is due to the HW design, where changing to TSN transmit mode requires SW initialization. According to Software User Manual Section 7.5.2.1, the transmit mode of the Intel Discrete I225/6 cannot be changed dynamically. Subsequent GCL operations proceed without a reset, as the adapter is already in TSN mode.
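The gate bookkeeping described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the struct and the function names (gate_state, gate_save_schedule, gate_base_time_reached, gate_should_drop) are hypothetical and are not driver symbols; only the two flag names follow the fields this patch adds to struct igc_ring.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the gate state of one TX queue. */
struct gate_state {
	bool oper_gate_closed;   /* gate is closed in the running schedule */
	bool admin_gate_closed;  /* gate will be closed once the pending schedule starts */
};

/* Record a new schedule for a queue. If the base time is still in the
 * future, a closure is only pending (admin); otherwise it applies at
 * once (oper).
 */
void gate_save_schedule(struct gate_state *g, bool queue_configured,
			bool base_time_past)
{
	g->admin_gate_closed = false;
	if (base_time_past)
		g->oper_gate_closed = false;

	if (!queue_configured) {
		if (base_time_past)
			g->oper_gate_closed = true;
		else
			g->admin_gate_closed = true;
	}
}

/* Timer callback model: the base time of the pending schedule is reached. */
void gate_base_time_reached(struct gate_state *g)
{
	if (g->admin_gate_closed) {
		g->admin_gate_closed = false;
		g->oper_gate_closed = true;
	} else {
		g->oper_gate_closed = false;
	}
}

/* Transmit decision: true means the packet should be dropped. */
bool gate_should_drop(const struct gate_state *g, bool in_transition)
{
	return in_transition || g->oper_gate_closed;
}
```

In this model a queue that is left out of a future schedule is first marked as pending-closed, and packets start being dropped only after the base time is reached.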
Steps to reproduce:
DUT:
1) Configure a GCL with certain gates closed.
BASE=$(date +%s%N)
tc qdisc replace dev $IFACE parent root handle 100 taprio \
    num_tc 4 \
    map 0 1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 \
    queues 1@0 1@1 1@2 1@3 \
    base-time $BASE \
    sched-entry S 0x8 500000 \
    sched-entry S 0x4 500000 \
    flags 0x2
2) Transmit a packet to a closed gate. You may use the udp_tai application to transmit a UDP packet to any of the closed gates.
./udp_tai -i <interface> -P 100000 -p 90 -c 1 -t <0/1> -u 30004
Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Co-developed-by: Tan Tee Min tee.min.tan@linux.intel.com Signed-off-by: Tan Tee Min tee.min.tan@linux.intel.com Tested-by: Chwee Lin Choong chwee.lin.choong@intel.com Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 6 +++ drivers/net/ethernet/intel/igc/igc_main.c | 58 +++++++++++++++++++++-- drivers/net/ethernet/intel/igc/igc_tsn.c | 41 ++++++++++------ 3 files changed, 87 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index f09c6a65e3ab8..c0a07af36cb23 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -14,6 +14,7 @@ #include <linux/timecounter.h> #include <linux/net_tstamp.h> #include <linux/bitfield.h> +#include <linux/hrtimer.h>
#include "igc_hw.h"
@@ -101,6 +102,8 @@ struct igc_ring { u32 start_time; u32 end_time; u32 max_sdu; + bool oper_gate_closed; /* Operating gate. True if the TX Queue is closed */ + bool admin_gate_closed; /* Future gate. True if the TX Queue will be closed */
/* CBS parameters */ bool cbs_enable; /* indicates if CBS is enabled */ @@ -160,6 +163,7 @@ struct igc_adapter { struct timer_list watchdog_timer; struct timer_list dma_err_timer; struct timer_list phy_info_timer; + struct hrtimer hrtimer;
u32 wol; u32 en_mng_pt; @@ -189,6 +193,8 @@ struct igc_adapter { ktime_t cycle_time; bool qbv_enable; u32 qbv_config_change_errors; + bool qbv_transition; + unsigned int qbv_count;
/* OS defined structs */ struct pci_dev *pdev; diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index c0e21701e7817..826556e609800 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1572,6 +1572,9 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, first->bytecount = skb->len; first->gso_segs = 1;
+ if (adapter->qbv_transition || tx_ring->oper_gate_closed) + goto out_drop; + if (tx_ring->max_sdu > 0) { u32 max_sdu = 0;
@@ -3004,8 +3007,8 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget) time_after(jiffies, tx_buffer->time_stamp + (adapter->tx_timeout_factor * HZ)) && !(rd32(IGC_STATUS) & IGC_STATUS_TXOFF) && - (rd32(IGC_TDH(tx_ring->reg_idx)) != - readl(tx_ring->tail))) { + (rd32(IGC_TDH(tx_ring->reg_idx)) != readl(tx_ring->tail)) && + !tx_ring->oper_gate_closed) { /* detected Tx unit hang */ netdev_err(tx_ring->netdev, "Detected Tx Unit Hang\n" @@ -6095,6 +6098,8 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter) adapter->base_time = 0; adapter->cycle_time = NSEC_PER_SEC; adapter->qbv_config_change_errors = 0; + adapter->qbv_transition = false; + adapter->qbv_count = 0;
for (i = 0; i < adapter->num_tx_queues; i++) { struct igc_ring *ring = adapter->tx_ring[i]; @@ -6102,6 +6107,8 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter) ring->start_time = 0; ring->end_time = NSEC_PER_SEC; ring->max_sdu = 0; + ring->oper_gate_closed = false; + ring->admin_gate_closed = false; }
return 0; @@ -6113,6 +6120,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, bool queue_configured[IGC_MAX_TX_QUEUES] = { }; struct igc_hw *hw = &adapter->hw; u32 start_time = 0, end_time = 0; + struct timespec64 now; size_t n; int i;
@@ -6133,6 +6141,8 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, adapter->cycle_time = qopt->cycle_time; adapter->base_time = qopt->base_time;
+ igc_ptp_read(adapter, &now); + for (n = 0; n < qopt->num_entries; n++) { struct tc_taprio_sched_entry *e = &qopt->entries[n];
@@ -6167,7 +6177,10 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, ring->start_time = start_time; ring->end_time = end_time;
- queue_configured[i] = true; + if (ring->start_time >= adapter->cycle_time) + queue_configured[i] = false; + else + queue_configured[i] = true; }
start_time += e->interval; @@ -6177,8 +6190,20 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, * If not, set the start and end time to be end time. */ for (i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *ring = adapter->tx_ring[i]; + + if (!is_base_time_past(qopt->base_time, &now)) { + ring->admin_gate_closed = false; + } else { + ring->oper_gate_closed = false; + ring->admin_gate_closed = false; + } + if (!queue_configured[i]) { - struct igc_ring *ring = adapter->tx_ring[i]; + if (!is_base_time_past(qopt->base_time, &now)) + ring->admin_gate_closed = true; + else + ring->oper_gate_closed = true;
ring->start_time = end_time; ring->end_time = end_time; @@ -6542,6 +6567,27 @@ static const struct xdp_metadata_ops igc_xdp_metadata_ops = { .xmo_rx_hash = igc_xdp_rx_hash, };
+static enum hrtimer_restart igc_qbv_scheduling_timer(struct hrtimer *timer) +{ + struct igc_adapter *adapter = container_of(timer, struct igc_adapter, + hrtimer); + unsigned int i; + + adapter->qbv_transition = true; + for (i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *tx_ring = adapter->tx_ring[i]; + + if (tx_ring->admin_gate_closed) { + tx_ring->admin_gate_closed = false; + tx_ring->oper_gate_closed = true; + } else { + tx_ring->oper_gate_closed = false; + } + } + adapter->qbv_transition = false; + return HRTIMER_NORESTART; +} + /** * igc_probe - Device Initialization Routine * @pdev: PCI device information struct @@ -6720,6 +6766,9 @@ static int igc_probe(struct pci_dev *pdev, INIT_WORK(&adapter->reset_task, igc_reset_task); INIT_WORK(&adapter->watchdog_task, igc_watchdog_task);
+ hrtimer_init(&adapter->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + adapter->hrtimer.function = &igc_qbv_scheduling_timer; + /* Initialize link properties that are user-changeable */ adapter->fc_autoneg = true; hw->mac.autoneg = true; @@ -6823,6 +6872,7 @@ static void igc_remove(struct pci_dev *pdev)
cancel_work_sync(&adapter->reset_task); cancel_work_sync(&adapter->watchdog_task); + hrtimer_cancel(&adapter->hrtimer);
/* Release control of h/w to f/w. If f/w is AMT enabled, this * would have already happened in close and is redundant. diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c index 6b299b83e7ef2..3cdb0c9887283 100644 --- a/drivers/net/ethernet/intel/igc/igc_tsn.c +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c @@ -114,7 +114,6 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter) static int igc_tsn_enable_offload(struct igc_adapter *adapter) { struct igc_hw *hw = &adapter->hw; - bool tsn_mode_reconfig = false; u32 tqavctrl, baset_l, baset_h; u32 sec, nsec, cycle; ktime_t base_time, systim; @@ -228,11 +227,10 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
tqavctrl = rd32(IGC_TQAVCTRL) & ~IGC_TQAVCTRL_FUTSCDDIS;
- if (tqavctrl & IGC_TQAVCTRL_TRANSMIT_MODE_TSN) - tsn_mode_reconfig = true; - tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+ adapter->qbv_count++; + cycle = adapter->cycle_time; base_time = adapter->base_time;
@@ -250,17 +248,28 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter) */ if ((rd32(IGC_BASET_H) || rd32(IGC_BASET_L)) && (adapter->tc_setup_type == TC_SETUP_QDISC_TAPRIO) && - tsn_mode_reconfig) + (adapter->qbv_count > 1)) adapter->qbv_config_change_errors++; } else { - /* According to datasheet section 7.5.2.9.3.3, FutScdDis bit - * has to be configured before the cycle time and base time. - * Tx won't hang if there is a GCL is already running, - * so in this case we don't need to set FutScdDis. - */ - if (igc_is_device_id_i226(hw) && - !(rd32(IGC_BASET_H) || rd32(IGC_BASET_L))) - tqavctrl |= IGC_TQAVCTRL_FUTSCDDIS; + if (igc_is_device_id_i226(hw)) { + ktime_t adjust_time, expires_time; + + /* According to datasheet section 7.5.2.9.3.3, FutScdDis bit + * has to be configured before the cycle time and base time. + * Tx won't hang if a GCL is already running, + * so in this case we don't need to set FutScdDis. + */ + if (!(rd32(IGC_BASET_H) || rd32(IGC_BASET_L))) + tqavctrl |= IGC_TQAVCTRL_FUTSCDDIS; + + nsec = rd32(IGC_SYSTIML); + sec = rd32(IGC_SYSTIMH); + systim = ktime_set(sec, nsec); + + adjust_time = adapter->base_time; + expires_time = ktime_sub_ns(adjust_time, systim); + hrtimer_start(&adapter->hrtimer, expires_time, HRTIMER_MODE_REL); + } }
wr32(IGC_TQAVCTRL, tqavctrl); @@ -306,7 +315,11 @@ int igc_tsn_offload_apply(struct igc_adapter *adapter) { struct igc_hw *hw = &adapter->hw;
- if (netif_running(adapter->netdev) && igc_is_device_id_i225(hw)) { + /* Per I225/6 HW Design Section 7.5.2.1, transmit mode + * cannot be changed dynamically. Require reset the adapter. + */ + if (netif_running(adapter->netdev) && + (igc_is_device_id_i225(hw) || !adapter->qbv_count)) { schedule_work(&adapter->reset_task); return 0; }
From: Zhengchao Shao shaozhengchao@huawei.com
[ Upstream commit 884abe45a9014d0de2e6edb0630dfd64f23f1d1b ]
In accel_fs_tcp_create_groups(), when ft->g is successfully allocated but the allocation of 'in' fails, the memory pointed to by ft->g is freed once. Then, in accel_fs_tcp_create_table(), mlx5e_destroy_flow_table() is called and frees the memory pointed to by ft->g again, causing a double free. Fix this by setting ft->g to NULL after it is freed.
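The pattern behind this fix can be illustrated with a minimal user-space sketch. The names below (struct table, table_create_groups, table_destroy, fail_second_alloc) are hypothetical and only mimic the shape of the driver code; malloc/free stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a flow table that owns a group array. */
struct table {
	int *groups;
};

/* Returns 0 on success, -1 on allocation failure. On failure the
 * pointer is cleared so that a later table_destroy() cannot free it
 * a second time. fail_second_alloc simulates the second allocation
 * failing.
 */
int table_create_groups(struct table *t, size_t n, int fail_second_alloc)
{
	void *scratch;

	t->groups = calloc(n, sizeof(*t->groups));
	scratch = fail_second_alloc ? NULL : malloc(64);
	if (!scratch || !t->groups) {
		free(t->groups);
		t->groups = NULL;	/* the fix: leave no dangling pointer */
		free(scratch);
		return -1;
	}
	free(scratch);
	return 0;
}

/* Generic teardown, also run by the caller after a failed create. */
void table_destroy(struct table *t)
{
	free(t->groups);	/* free(NULL) is a no-op */
	t->groups = NULL;
}
```

Because freeing a NULL pointer is harmless, clearing the pointer in the error path makes the later generic teardown safe without changing the caller.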
Fixes: c062d52ac24c ("net/mlx5e: Receive flow steering framework for accelerated TCP flows")
Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
index 88a5aed9d6781..c7d191f66ad1b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
@@ -190,6 +190,7 @@ static int accel_fs_tcp_create_groups(struct mlx5e_flow_table *ft,
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if (!in || !ft->g) {
 		kfree(ft->g);
+		ft->g = NULL;
 		kvfree(in);
 		return -ENOMEM;
 	}
From: Zhengchao Shao shaozhengchao@huawei.com
[ Upstream commit 3250affdc658557a41df9c5fb567723e421f8bf2 ]
The memory pointed to by the fs->any pointer is not freed in the error path of mlx5e_fs_tt_redirect_any_create, which can lead to a memory leak. Fix by freeing the memory in the error path, thereby making the error path identical to mlx5e_fs_tt_redirect_any_destroy().
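The error-path unwinding described above follows the usual goto-label idiom. A minimal user-space sketch, assuming hypothetical names (struct ctx, ctx_create, live_ctx, fail_step) that do not appear in the driver:

```c
#include <stdlib.h>

/* Hypothetical context object, standing in for the driver's private state. */
struct ctx {
	int table_created;
};

/* Count of live allocations, so that a caller can check for leaks. */
int live_ctx;

/* fail_step: 0 = succeed, 1 = table creation fails, 2 = enable fails. */
int ctx_create(struct ctx **out, int fail_step)
{
	struct ctx *c = calloc(1, sizeof(*c));
	int err;

	if (!c)
		return -1;
	live_ctx++;

	err = (fail_step == 1) ? -1 : 0;	/* "create table" step */
	if (err)
		goto err_free;			/* no table to undo yet */
	c->table_created = 1;

	err = (fail_step == 2) ? -1 : 0;	/* "enable" step */
	if (err)
		goto err_destroy_table;

	*out = c;
	return 0;

err_destroy_table:
	c->table_created = 0;
err_free:
	*out = NULL;
	free(c);
	live_ctx--;
	return err;
}
```

Each label undoes exactly the steps that completed before the failure, so an early failure no longer returns without freeing the context.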
Fixes: 0f575c20bf06 ("net/mlx5e: Introduce Flow Steering ANY API")
Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com
Reviewed-by: Simon Horman simon.horman@corigine.com
Reviewed-by: Rahul Rameshbabu rrameshbabu@nvidia.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c b/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
index 03cb79adf912f..be83ad9db82a4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
@@ -594,7 +594,7 @@ int mlx5e_fs_tt_redirect_any_create(struct mlx5e_flow_steering *fs)

 	err = fs_any_create_table(fs);
 	if (err)
-		return err;
+		goto err_free_any;

 	err = fs_any_enable(fs);
 	if (err)
@@ -606,8 +606,8 @@ int mlx5e_fs_tt_redirect_any_create(struct mlx5e_flow_steering *fs)

 err_destroy_table:
 	fs_any_destroy_table(fs_any);
-
-	kfree(fs_any);
+err_free_any:
 	mlx5e_fs_set_any(fs, NULL);
+	kfree(fs_any);
 	return err;
 }
From: Zhengchao Shao shaozhengchao@huawei.com
[ Upstream commit d543b649ffe58a0cb4b6948b3305069c5980a1fa ]
When kvzalloc_node() or kvzalloc() fails in mlx5e_ptp_open(), the memory pointed to by "c" or "cparams" is not freed, which can lead to a memory leak. Fix by freeing the memory in the error path.
Fixes: 145e5637d941 ("net/mlx5e: Add TX PTP port object support")
Signed-off-by: Zhengchao Shao shaozhengchao@huawei.com
Reviewed-by: Rahul Rameshbabu rrameshbabu@nvidia.com
Reviewed-by: Gal Pressman gal@nvidia.com
Reviewed-by: Simon Horman simon.horman@corigine.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index 3cbebfba582bd..b0b429a0321ed 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -729,8 +729,10 @@ int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params,

 	c = kvzalloc_node(sizeof(*c), GFP_KERNEL,
 			  dev_to_node(mlx5_core_dma_dev(mdev)));
 	cparams = kvzalloc(sizeof(*cparams), GFP_KERNEL);
-	if (!c || !cparams)
-		return -ENOMEM;
+	if (!c || !cparams) {
+		err = -ENOMEM;
+		goto err_free;
+	}

 	c->priv = priv;
 	c->mdev = priv->mdev;
From: Dragos Tatulea dtatulea@nvidia.com
[ Upstream commit 2e2d1965794d22fbe86df45bf4f933216743577d ]
Regular (non-XSK) RQs get flushed on XSK setup and re-activated on XSK close. If the same regular RQ is closed (a config change for example) soon after the XSK close, a double release occurs because the missing wqes get released a second time.
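The idea of marking fragments so that a second dealloc pass becomes a no-op can be modelled roughly as follows. This is a simplified sketch, not the driver's code: struct frag, wqe_free and wqe_dealloc are hypothetical names, and the assumption that the free path honours the skip flag is part of the model.

```c
#include <stdbool.h>

/* Hypothetical model of one RX fragment and the page reference it holds. */
struct frag {
	bool skip_release;	/* set once the page has been given back */
	int page_refs;		/* references still held on the page */
};

/* Release the page of every fragment that is not marked as released. */
void wqe_free(struct frag *frags, int n)
{
	for (int i = 0; i < n; i++) {
		if (frags[i].skip_release)
			continue;
		frags[i].page_refs--;
	}
}

/* Dealloc as called on both RQ flush and RQ close: free, then mark the
 * fragments so that a repeated call does not release the pages again.
 */
void wqe_dealloc(struct frag *frags, int n)
{
	wqe_free(frags, n);
	for (int i = 0; i < n; i++)
		frags[i].skip_release = true;
}
```

Calling wqe_dealloc() twice on the same fragments drops each page reference only once in this model.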
Fixes: 3f93f82988bc ("net/mlx5e: RX, Defer page release in legacy rq for better recycling")
Signed-off-by: Dragos Tatulea dtatulea@nvidia.com
Reviewed-by: Tariq Toukan tariqt@nvidia.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 69634829558e2..111f6a4a64b64 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -390,10 +390,18 @@ static void mlx5e_dealloc_rx_wqe(struct mlx5e_rq *rq, u16 ix)
 {
 	struct mlx5e_wqe_frag_info *wi = get_frag(rq, ix);

-	if (rq->xsk_pool)
+	if (rq->xsk_pool) {
 		mlx5e_xsk_free_rx_wqe(wi);
-	else
+	} else {
 		mlx5e_free_rx_wqe(rq, wi);
+
+		/* Avoid a second release of the wqe pages: dealloc is called
+		 * for the same missing wqes on regular RQ flush and on regular
+		 * RQ close. This happens when XSK RQs come into play.
+		 */
+		for (int i = 0; i < rq->wqe.info.num_frags; i++, wi++)
+			wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
+	}
 }

 static void mlx5e_xsk_free_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk)
From: Saeed Mahameed saeedm@nvidia.com
[ Upstream commit 631079e08aa4a20b73e70de4cf457886194f029f ]
Prior to this patch, only one "mlx5" thermal zone could be registered, regardless of the number of individual mlx5 devices in the system.
To fix this, set up a unique name per device so that each device registers its own thermal zone.
In order to not register a thermal zone for a virtual device (VF/SF), add a check for the PF device type.
The new name is a concatenation of "mlx5_" and "<PCI_DEV_BDF>", which also helps associate a thermal zone with its PCI device.
$ lspci | grep ConnectX
00:04.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
00:05.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
$ cat /sys/devices/virtual/thermal/thermal_zone0/type
mlx5_0000:00:04.0
$ cat /sys/devices/virtual/thermal/thermal_zone1/type
mlx5_0000:00:05.0
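The per-device naming with a truncation check can be sketched in user space as below. The helper names (zone_name_build, zone_names_differ) and the 20-byte buffer used in the usage note are assumptions made for illustration; the driver itself uses THERMAL_NAME_LENGTH and dev_name().

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build "mlx5_<device name>" into buf. snprintf() returns the length
 * the full string would have had, so a return value >= len means the
 * name was truncated and must be rejected.
 */
int zone_name_build(char *buf, size_t len, const char *dev_name)
{
	int n = snprintf(buf, len, "mlx5_%s", dev_name);

	if (n < 0 || (size_t)n >= len)
		return -EINVAL;
	return 0;
}

/* Two devices get distinct zones only if their names differ. */
int zone_names_differ(const char *a, const char *b)
{
	return strcmp(a, b) != 0;
}
```

With a 20-byte buffer, a PCI name such as "0000:00:04.0" fits, while an overly long device name is rejected instead of being silently truncated into a possibly colliding zone name.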
Fixes: c1fef618d611 ("net/mlx5: Implement thermal zone") CC: Sandipan Patra spatra@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/thermal.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c index e47fa6fb836f1..89a22ff04cb60 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c @@ -68,14 +68,19 @@ static struct thermal_zone_device_ops mlx5_thermal_ops = {
int mlx5_thermal_init(struct mlx5_core_dev *mdev) { + char data[THERMAL_NAME_LENGTH]; struct mlx5_thermal *thermal; - struct thermal_zone_device *tzd; - const char *data = "mlx5"; + int err;
- tzd = thermal_zone_get_zone_by_name(data); - if (!IS_ERR(tzd)) + if (!mlx5_core_is_pf(mdev) && !mlx5_core_is_ecpf(mdev)) return 0;
+ err = snprintf(data, sizeof(data), "mlx5_%s", dev_name(mdev->device)); + if (err < 0 || err >= sizeof(data)) { + mlx5_core_err(mdev, "Failed to setup thermal zone name, %d\n", err); + return -EINVAL; + } + thermal = kzalloc(sizeof(*thermal), GFP_KERNEL); if (!thermal) return -ENOMEM; @@ -88,10 +93,10 @@ int mlx5_thermal_init(struct mlx5_core_dev *mdev) &mlx5_thermal_ops, NULL, 0, MLX5_THERMAL_POLL_INT_MSEC); if (IS_ERR(thermal->tzdev)) { - dev_err(mdev->device, "Failed to register thermal zone device (%s) %ld\n", - data, PTR_ERR(thermal->tzdev)); + err = PTR_ERR(thermal->tzdev); + mlx5_core_err(mdev, "Failed to register thermal zone device (%s) %d\n", data, err); kfree(thermal); - return -EINVAL; + return err; }
mdev->thermal = thermal;
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 65e64640e97c0f223e77f9ea69b5a46186b93470 ]
Currently the check for the NOT_READY flag is performed before obtaining the necessary lock. This opens the possibility of a race condition when the flow is concurrently removed from the unready_flows list by the workqueue task, which causes a double removal from the list and a crash[0]. Fix the issue by moving the flag check inside the section protected by the uplink_priv->unready_flows_lock mutex.
[0]:
[44376.389654] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
[44376.391665] CPU: 7 PID: 59123 Comm: tc Not tainted 6.4.0-rc4+ #1
[44376.392984] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[44376.395342] RIP: 0010:mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
[44376.396857] Code: 00 48 8b b8 68 ce 02 00 e8 8a 4d 02 00 4c 8d a8 a8 01 00 00 4c 89 ef e8 8b 79 88 e1 48 8b 83 98 06 00 00 48 8b 93 90 06 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 90 06
[44376.399167] RSP: 0018:ffff88812cc97570 EFLAGS: 00010246
[44376.399680] RAX: dead000000000122 RBX: ffff8881088e3800 RCX: ffff8881881bac00
[44376.400337] RDX: dead000000000100 RSI: ffff88812cc97500 RDI: ffff8881242f71b0
[44376.401001] RBP: ffff88811cbb0940 R08: 0000000000000400 R09: 0000000000000001
[44376.401663] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88812c944000
[44376.402342] R13: ffff8881242f71a8 R14: ffff8881222b4000 R15: 0000000000000000
[44376.402999] FS: 00007f0451104800(0000) GS:ffff88852cb80000(0000) knlGS:0000000000000000
[44376.403787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44376.404343] CR2: 0000000000489108 CR3: 0000000123a79003 CR4: 0000000000370ea0
[44376.405004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44376.405665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[44376.406339] Call Trace:
[44376.406651] <TASK>
[44376.406939] ? die_addr+0x33/0x90
[44376.407311] ? exc_general_protection+0x192/0x390
[44376.407795] ? asm_exc_general_protection+0x22/0x30
[44376.408292] ? mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
[44376.408876] __mlx5e_tc_del_fdb_peer_flow+0xbc/0xe0 [mlx5_core]
[44376.409482] mlx5e_tc_del_flow+0x42/0x210 [mlx5_core]
[44376.410055] mlx5e_flow_put+0x25/0x50 [mlx5_core]
[44376.410529] mlx5e_delete_flower+0x24b/0x350 [mlx5_core]
[44376.411043] tc_setup_cb_reoffload+0x22/0x80
[44376.411462] fl_reoffload+0x261/0x2f0 [cls_flower]
[44376.411907] ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
[44376.412481] ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
[44376.413044] tcf_block_playback_offloads+0x76/0x170
[44376.413497] tcf_block_unbind+0x7b/0xd0
[44376.413881] tcf_block_setup+0x17d/0x1c0
[44376.414269] tcf_block_offload_cmd.isra.0+0xf1/0x130
[44376.414725] tcf_block_offload_unbind+0x43/0x70
[44376.415153] __tcf_block_put+0x82/0x150
[44376.415532] ingress_destroy+0x22/0x30 [sch_ingress]
[44376.415986] qdisc_destroy+0x3b/0xd0
[44376.416343] qdisc_graft+0x4d0/0x620
[44376.416706] tc_get_qdisc+0x1c9/0x3b0
[44376.417074] rtnetlink_rcv_msg+0x29c/0x390
[44376.419978] ? rep_movs_alternative+0x3a/0xa0
[44376.420399] ? rtnl_calcit.isra.0+0x120/0x120
[44376.420813] netlink_rcv_skb+0x54/0x100
[44376.421192] netlink_unicast+0x1f6/0x2c0
[44376.421573] netlink_sendmsg+0x232/0x4a0
[44376.421980] sock_sendmsg+0x38/0x60
[44376.422328] ____sys_sendmsg+0x1d0/0x1e0
[44376.422709] ? copy_msghdr_from_user+0x6d/0xa0
[44376.423127] ___sys_sendmsg+0x80/0xc0
[44376.423495] ? ___sys_recvmsg+0x8b/0xc0
[44376.423869] __sys_sendmsg+0x51/0x90
[44376.424226] do_syscall_64+0x3d/0x90
[44376.424587] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[44376.425046] RIP: 0033:0x7f045134f887
[44376.425403] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[44376.426914] RSP: 002b:00007ffd63a82b98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[44376.427592] RAX: ffffffffffffffda RBX: 000000006481955f RCX: 00007f045134f887
[44376.428195] RDX: 0000000000000000 RSI: 00007ffd63a82c00 RDI: 0000000000000003
[44376.428796] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[44376.429404] R10: 00007f0451208708 R11: 0000000000000246 R12: 0000000000000001
[44376.430039] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400
[44376.430644] </TASK>
[44376.430907] Modules linked in: mlx5_ib mlx5_core act_mirred act_tunnel_key cls_flower vxlan dummy sch_ingress openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core]
[44376.433936] ---[ end trace 0000000000000000 ]---
[44376.434373] RIP: 0010:mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
[44376.434951] Code: 00 48 8b b8 68 ce 02 00 e8 8a 4d 02 00 4c 8d a8 a8 01 00 00 4c 89 ef e8 8b 79 88 e1 48 8b 83 98 06 00 00 48 8b 93 90 06 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 90 06
[44376.436452] RSP: 0018:ffff88812cc97570 EFLAGS: 00010246
[44376.436924] RAX: dead000000000122 RBX: ffff8881088e3800 RCX: ffff8881881bac00
[44376.437530] RDX: dead000000000100 RSI: ffff88812cc97500 RDI: ffff8881242f71b0
[44376.438179] RBP: ffff88811cbb0940 R08: 0000000000000400 R09: 0000000000000001
[44376.438786] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88812c944000
[44376.439393] R13: ffff8881242f71a8 R14: ffff8881222b4000 R15: 0000000000000000
[44376.439998] FS: 00007f0451104800(0000) GS:ffff88852cb80000(0000) knlGS:0000000000000000
[44376.440714] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44376.441225] CR2: 0000000000489108 CR3: 0000000123a79003 CR4: 0000000000370ea0
[44376.441843] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44376.442471] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
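The locking change can be illustrated with a user-space analogue that uses a pthread mutex. All names here (struct flow, unready_lock, list_removals, flow_remove_unready) are hypothetical; the sketch assumes, as a simplification, that removal from the list also clears the flag.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical flow with a "not ready" flag that mirrors list membership. */
struct flow {
	bool not_ready;
};

pthread_mutex_t unready_lock = PTHREAD_MUTEX_INITIALIZER;

/* Number of list removals performed; a second removal of the same flow
 * would correspond to the crash in the trace above.
 */
int list_removals;

/* Test the flag only while holding the lock, so that a concurrent
 * remover that already took the flow off the list is observed.
 */
void flow_remove_unready(struct flow *f)
{
	pthread_mutex_lock(&unready_lock);
	if (f->not_ready) {
		f->not_ready = false;	/* stands in for the list removal */
		list_removals++;
	}
	pthread_mutex_unlock(&unready_lock);
}
```

Because the test and the removal form one critical section, two callers racing on the same flow cannot both see the flag set.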
Fixes: ad86755b18d5 ("net/mlx5e: Protect unready flows with dedicated lock")
Signed-off-by: Vlad Buslov vladbu@nvidia.com
Reviewed-by: Roi Dayan roid@nvidia.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index b9b1da751a3b8..ed05ac8ae1de5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1639,7 +1639,8 @@ static void remove_unready_flow(struct mlx5e_tc_flow *flow)
 	uplink_priv = &rpriv->uplink_priv;

 	mutex_lock(&uplink_priv->unready_flows_lock);
-	unready_flow_del(flow);
+	if (flow_flag_test(flow, NOT_READY))
+		unready_flow_del(flow);
 	mutex_unlock(&uplink_priv->unready_flows_lock);
 }

@@ -1932,8 +1933,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
 	esw_attr = attr->esw_attr;
 	mlx5e_put_flow_tunnel_id(flow);

-	if (flow_flag_test(flow, NOT_READY))
-		remove_unready_flow(flow);
+	remove_unready_flow(flow);

 	if (mlx5e_is_offloaded_flow(flow)) {
 		if (flow_flag_test(flow, SLOW))
From: Yevgeny Kliteynik kliteyn@nvidia.com
[ Upstream commit f7a485115ad4cfc560833942014bf791abf1f827 ]
A non-clear CT action causes a flow rule split, while a CT clear action doesn't and is just a header rewrite to the current flow rule. But ct offload is done in post_parse and is per ct action instance, so ct clear offload is parsed multiple times, while it is deleted only once.
Fix this by post_parsing the ct action only once per flow attribute (which is per flow rule) by using an offloaded ct_attr flag.
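The effect of the flag can be shown with a small model in which repeated offload calls on the same attribute take a hardware reference only once. The names (struct ct_attr_model, ct_offload, ct_delete, hw_refs) are hypothetical, and clearing the flag on delete is a simplification of this sketch, not a statement about the driver.

```c
#include <stdbool.h>

/* Hypothetical per-flow-attribute CT state. */
struct ct_attr_model {
	bool offloaded;	/* set once the action has been offloaded */
	int hw_refs;	/* resources taken on behalf of this attribute */
};

/* Idempotent offload: a repeated call for the same attribute is a no-op. */
int ct_offload(struct ct_attr_model *a)
{
	if (a->offloaded)
		return 0;

	a->hw_refs++;
	a->offloaded = true;
	return 0;
}

/* Delete undoes the single offload, and only if one happened. */
void ct_delete(struct ct_attr_model *a)
{
	if (!a->offloaded)
		return;

	a->hw_refs--;
	a->offloaded = false;
}
```

With the guard in place, one delete balances any number of offload calls on the same attribute.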
Fixes: 08fe94ec5f77 ("net/mlx5e: TC, Remove special handling of CT action") Signed-off-by: Paul Blakey paulb@nvidia.com Signed-off-by: Yevgeny Kliteynik kliteyn@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 14 +++++++++++--- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h | 1 + 2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index a254e728ac954..fadfa8b50bebe 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -1545,7 +1545,8 @@ mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
attr->ct_attr.ct_action |= act->ct.action; /* So we can have clear + ct */ attr->ct_attr.zone = act->ct.zone; - attr->ct_attr.nf_ft = act->ct.flow_table; + if (!(act->ct.action & TCA_CT_ACT_CLEAR)) + attr->ct_attr.nf_ft = act->ct.flow_table; attr->ct_attr.act_miss_cookie = act->miss_cookie;
return 0; @@ -1990,6 +1991,9 @@ mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *att if (!priv) return -EOPNOTSUPP;
+ if (attr->ct_attr.offloaded) + return 0; + if (attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR) { err = mlx5_tc_ct_entry_set_registers(priv, &attr->parse_attr->mod_hdr_acts, 0, 0, 0, 0); @@ -1999,11 +2003,15 @@ mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *att attr->action |= MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; }
- if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not ct_clear,ct() */ + if (!attr->ct_attr.nf_ft) { /* means only ct clear action, and not ct_clear,ct() */ + attr->ct_attr.offloaded = true; return 0; + }
mutex_lock(&priv->control_lock); err = __mlx5_tc_ct_flow_offload(priv, attr); + if (!err) + attr->ct_attr.offloaded = true; mutex_unlock(&priv->control_lock);
return err; @@ -2021,7 +2029,7 @@ void mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *attr) { - if (!attr->ct_attr.ft) /* no ct action, return */ + if (!attr->ct_attr.offloaded) /* no ct action, return */ return; if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not ct_clear,ct() */ return; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h index 8e9316fa46d4b..b66c5f98067f7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h @@ -29,6 +29,7 @@ struct mlx5_ct_attr { u32 ct_labels_id; u32 act_miss_mapping; u64 act_miss_cookie; + bool offloaded; struct mlx5_ct_ft *ft; };
From: Maher Sanalla msanalla@nvidia.com
[ Upstream commit 6496357aa5f710eec96f91345b9da1b37c3231f6 ]
On vport enable, where fw's hca caps are queried, the driver queries hca_caps_2 without checking if fw truly supports them, causing a false failure of vfs vport load and blocking SRIOV enablement on old devices such as CX4 where hca_caps_2 support is missing.
Thus, add a check for the said caps support before accessing them.
Fixes: e5b9642a33be ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable")
Signed-off-by: Maher Sanalla msanalla@nvidia.com
Reviewed-by: Shay Drory shayd@nvidia.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 901c53751b0aa..f81c6d8d5e0f4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -800,6 +800,9 @@ static int mlx5_esw_vport_caps_get(struct mlx5_eswitch *esw, struct mlx5_vport *
 	hca_caps = MLX5_ADDR_OF(query_hca_cap_out, query_ctx, capability);
 	vport->info.roce_enabled = MLX5_GET(cmd_hca_cap, hca_caps, roce);

+	if (!MLX5_CAP_GEN_MAX(esw->dev, hca_cap_2))
+		goto out_free;
+
 	memset(query_ctx, 0, query_out_sz);
 	err = mlx5_vport_get_other_func_cap(esw->dev, vport->vport, query_ctx,
 					    MLX5_CAP_GENERAL_2);
From: Dragos Tatulea dtatulea@nvidia.com
[ Upstream commit 7abd955a58fb0fcd4e756fa2065c03ae488fcfa7 ]
Currently mlx5e releases pages directly to the page_pool for XDP_TX and does page fragment counting for XDP_REDIRECT. RX pages from the page_pool are leaking on XDP_REDIRECT because the xdp core will release only one fragment out of MLX5E_PAGECNT_BIAS_MAX and subsequently the page is marked as "skip release" which avoids the driver release.
A fix would be to take an extra fragment for XDP_REDIRECT and not set the "skip release" bit so that the release on the driver side can handle the remaining bias fragments. But this would be a shortsighted solution. Instead, this patch converges the two XDP paths (XDP_TX and XDP_REDIRECT) to always do fragment tracking. The "skip release" bit is no longer necessary for XDP.
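A toy model of the fragment accounting may help when reading the diff below. It is a sketch, not the driver's code: EX_BIAS_MAX stands in for MLX5E_PAGECNT_BIAS_MAX with an arbitrary value, and the ex_* helpers are hypothetical. The page starts with a bias of references, every fragment handed to a consumer is counted in frags, and the driver's release drops only the references that were never handed out.

```c
#include <stdbool.h>

#define EX_BIAS_MAX 8	/* assumption: arbitrary stand-in for the real bias */

struct ex_frag_page {
	int refs;	/* outstanding fragment references on the page */
	int frags;	/* fragments handed to consumers (skb, XDP_TX, XDP_REDIRECT) */
};

static void ex_frag_page_init(struct ex_frag_page *p)
{
	p->refs = EX_BIAS_MAX;
	p->frags = 0;
}

/* A consumer (TX completion or the XDP core) drops its single fragment.
 * Returns true when the page has no references left.
 */
static bool ex_consumer_put(struct ex_frag_page *p)
{
	return --p->refs == 0;
}

/* Driver-side release: drop every bias reference that was not handed out. */
static bool ex_driver_release(struct ex_frag_page *p)
{
	p->refs -= EX_BIAS_MAX - p->frags;
	return p->refs == 0;
}
```

In this model, skipping the driver release after a redirect leaves EX_BIAS_MAX - 1 references behind, which corresponds to the leak the commit message describes.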
Fixes: 6f5742846053 ("net/mlx5e: RX, Enable skb page recycling through the page_pool")
Signed-off-by: Dragos Tatulea dtatulea@nvidia.com
Reviewed-by: Tariq Toukan tariqt@nvidia.com
Signed-off-by: Saeed Mahameed saeedm@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c | 3 +-
 .../net/ethernet/mellanox/mlx5/core/en_rx.c | 32 +++++++------------
 2 files changed, 13 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index f0e6095809faf..40589cebb7730 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -662,8 +662,7 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, /* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE) * as we know this is a page_pool page. */ - page_pool_put_defragged_page(page->pp, - page, -1, true); + page_pool_recycle_direct(page->pp, page); } while (++n < num);
break; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 111f6a4a64b64..08e08489f4220 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1753,11 +1753,11 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
prog = rcu_dereference(rq->xdp_prog); if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) { - if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { struct mlx5e_wqe_frag_info *pwi;
for (pwi = head_wi; pwi < wi; pwi++) - pwi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); + pwi->frag_page->frags++; } return NULL; /* page/packet was consumed by XDP */ } @@ -1827,12 +1827,8 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) rq, wi, cqe, cqe_bcnt); if (!skb) { /* probably for XDP */ - if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { - /* do not return page to cache, - * it will be returned on XDP_TX completion. - */ - wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); - } + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) + wi->frag_page->frags++; goto wq_cyc_pop; }
@@ -1878,12 +1874,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) rq, wi, cqe, cqe_bcnt); if (!skb) { /* probably for XDP */ - if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { - /* do not return page to cache, - * it will be returned on XDP_TX completion. - */ - wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); - } + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) + wi->frag_page->frags++; goto wq_cyc_pop; }
@@ -2062,12 +2054,12 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w if (prog) { if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { - int i; + struct mlx5e_frag_page *pfp; + + for (pfp = head_page; pfp < frag_page; pfp++) + pfp->frags++;
- for (i = 0; i < sinfo->nr_frags; i++) - /* non-atomic */ - __set_bit(page_idx + i, wi->skip_release_bitmap); - return NULL; + wi->linear_page.frags++; } mlx5e_page_release_fragmented(rq, &wi->linear_page); return NULL; /* page/packet was consumed by XDP */ @@ -2165,7 +2157,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, cqe_bcnt, &mxbuf); if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) - __set_bit(page_idx, wi->skip_release_bitmap); /* non-atomic */ + frag_page->frags++; return NULL; /* page/packet was consumed by XDP */ }
From: Prasad Koya prasad@arista.com
[ Upstream commit 9ac3fc2f42e5ffa1e927dcbffb71b15fa81459e2 ]
Set the TP bit in the 'supported' and 'advertising' fields. i225/i226 parts only support twisted pair copper.
Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Signed-off-by: Prasad Koya prasad@arista.com
Acked-by: Sasha Neftin sasha.neftin@intel.com
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c index 0e2cb00622d1a..93bce729be76a 100644 --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c @@ -1708,6 +1708,8 @@ static int igc_ethtool_get_link_ksettings(struct net_device *netdev, /* twisted pair */ cmd->base.port = PORT_TP; cmd->base.phy_address = hw->phy.addr; + ethtool_link_ksettings_add_link_mode(cmd, supported, TP); + ethtool_link_ksettings_add_link_mode(cmd, advertising, TP);
/* advertising link modes */ if (hw->phy.autoneg_advertised & ADVERTISE_10_HALF)
From: Tan Tee Min tee.min.tan@linux.intel.com
[ Upstream commit 25102893e409bc02761ab82dbcfa092006404790 ]
IEEE 802.1Q does not have clear definitions of what constitutes an SDU (Service Data Unit), but IEEE Std 802.3 clause 3.1.2 does define the MAC service primitives and clause 3.2.7 does define the MAC Client Data for Q-tagged frames.
It shows that the mac_service_data_unit (MSDU) does NOT contain the preamble, destination and source address, or FCS. The MSDU does contain the length/type field, MAC client data, VLAN tag and any padding data (prior to the FCS).
Thus, the maximum 802.3 frame size that is allowed to be transmitted should be QueueMaxSDU (MSDU) + 16 (6 byte SA + 6 byte DA + 4 byte FCS).
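The arithmetic can be written out as a small worked example. The helper names are hypothetical. The second helper rests on an assumption not stated in the commit message: that the byte count the driver compares against excludes the FCS, which would explain why the diff adds hard_header_len - ETH_TLEN (14 - 2 = 12) rather than 16.

```c
#define EX_ETH_ALEN	6	/* one MAC address */
#define EX_ETH_FCS_LEN	4	/* frame check sequence */

/* Largest 802.3 frame on the wire for a given QueueMaxSDU: SDU + DA + SA + FCS. */
static unsigned int ex_max_wire_frame(unsigned int queue_max_sdu)
{
	return queue_max_sdu + 2 * EX_ETH_ALEN + EX_ETH_FCS_LEN;
}

/* Same limit as seen in a buffer without the FCS (assumption, see above). */
static unsigned int ex_max_buffer_bytes(unsigned int queue_max_sdu)
{
	return queue_max_sdu + 2 * EX_ETH_ALEN;
}
```

For a QueueMaxSDU of 1500 this gives 1516 bytes on the wire and 1512 bytes in the buffer.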
Fixes: 92a0dcb8427d ("igc: offload queue max SDU from tc-taprio")
Signed-off-by: Tan Tee Min tee.min.tan@linux.intel.com
Reviewed-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_main.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 826556e609800..e7bd2c60ee383 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1575,16 +1575,9 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, if (adapter->qbv_transition || tx_ring->oper_gate_closed) goto out_drop;
- if (tx_ring->max_sdu > 0) { - u32 max_sdu = 0; - - max_sdu = tx_ring->max_sdu + - (skb_vlan_tagged(first->skb) ? VLAN_HLEN : 0); - - if (first->bytecount > max_sdu) { - adapter->stats.txdrop++; - goto out_drop; - } + if (tx_ring->max_sdu > 0 && first->bytecount > tx_ring->max_sdu) { + adapter->stats.txdrop++; + goto out_drop; }
if (unlikely(test_bit(IGC_RING_FLAG_TX_HWTSTAMP, &tx_ring->flags) && @@ -6215,7 +6208,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, struct net_device *dev = adapter->netdev;
if (qopt->max_sdu[i]) - ring->max_sdu = qopt->max_sdu[i] + dev->hard_header_len; + ring->max_sdu = qopt->max_sdu[i] + dev->hard_header_len - ETH_TLEN; else ring->max_sdu = 0; }
From: Aravindhan Gunasekaran aravindhan.gunasekaran@intel.com
[ Upstream commit 84a192e46106355de1a314d709e657231d4b1026 ]
I225/6 hardware can be programmed to start PPS output once the time in the Target Time registers is reached. The time programmed in these registers should always be in the future. Only then is PPS output triggered when the SYSTIM register reaches the programmed value. There are two modes in i225/6 hardware to program PPS: pulse mode and clock mode.
There were issues reported where PPS is not generated when the start time is in the past.
Example 1, "echo 0 0 0 2 0 > /sys/class/ptp/ptp0/period"
In the current implementation, a value of '0' is programmed into the Target time registers and PPS output is in pulse mode. As a result, the interrupt that should be triggered when the SYSTIM register reaches the Target time never fires, so no PPS output is generated.
Example 2, "echo 0 0 0 1 0 > /sys/class/ptp/ptp0/period"
In the above case, a value of '0' is programmed into the Target time registers and PPS output is in clock mode. Here, the HW tries to catch up with the current time by incrementing the Target Time register. This catch-up time seems to vary according to the programmed PPS period, as per the HW design. In my experiments, the delay ranged from a few tens of seconds to a few minutes. The PPS output is only generated after the Target time register reaches the current time.
In my experiments, I also observed that PPS stopped working with the test below and could not recover until the module was removed and loaded again.
1) echo 0 <future time> 0 1 0 > /sys/class/ptp/ptp1/period
2) echo 0 0 0 1 0 > /sys/class/ptp/ptp1/period
3) echo 0 0 0 1 0 > /sys/class/ptp/ptp1/period
After this, PPS did not work even if I re-programmed it with proper values. I could only get it working again by reloading the driver.
This patch takes care of calculating and programming an appropriate future time value into the Target Time registers.
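The clamping rule used by the fix can be sketched on its own. The ex_time type and ex_pick_start() are hypothetical names; the +2 second margin and the fact that only the seconds field is clamped follow the diff below.

```c
#include <stdint.h>

struct ex_time {
	int64_t sec;
	long nsec;
};

/* Never program a start time earlier than "now + 2 seconds". */
static struct ex_time ex_pick_start(struct ex_time now, struct ex_time requested)
{
	struct ex_time start = requested;	/* nanoseconds are taken as requested */
	int64_t safe_sec = now.sec + 2;

	if (requested.sec < safe_sec)
		start.sec = safe_sec;

	return start;
}
```

A request of "0 0" made at second 100 is therefore programmed as second 102, while a request that is already far enough in the future passes through unchanged.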
Fixes: 5e91c72e560c ("igc: Fix PPS delta between two synchronized end-points")
Signed-off-by: Aravindhan Gunasekaran aravindhan.gunasekaran@intel.com
Reviewed-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_ptp.c | 25 +++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c index 32ef112f8291a..f0b979a706552 100644 --- a/drivers/net/ethernet/intel/igc/igc_ptp.c +++ b/drivers/net/ethernet/intel/igc/igc_ptp.c @@ -356,16 +356,35 @@ static int igc_ptp_feature_enable_i225(struct ptp_clock_info *ptp, tsim &= ~IGC_TSICR_TT0; } if (on) { + struct timespec64 safe_start; int i = rq->perout.index;
igc_pin_perout(igc, i, pin, use_freq); - igc->perout[i].start.tv_sec = rq->perout.start.sec; + igc_ptp_read(igc, &safe_start); + + /* PPS output start time is triggered by Target time(TT) + * register. Programming any past time value into TT + * register will cause PPS to never start. Need to make + * sure we program the TT register a time ahead in + * future. There isn't a stringent need to fire PPS out + * right away. Adding +2 seconds should take care of + * corner cases. Let's say if the SYSTIML is close to + * wrap up and the timer keeps ticking as we program the + * register, adding +2seconds is safe bet. + */ + safe_start.tv_sec += 2; + + if (rq->perout.start.sec < safe_start.tv_sec) + igc->perout[i].start.tv_sec = safe_start.tv_sec; + else + igc->perout[i].start.tv_sec = rq->perout.start.sec; igc->perout[i].start.tv_nsec = rq->perout.start.nsec; igc->perout[i].period.tv_sec = ts.tv_sec; igc->perout[i].period.tv_nsec = ts.tv_nsec; - wr32(trgttimh, rq->perout.start.sec); + wr32(trgttimh, (u32)igc->perout[i].start.tv_sec); /* For now, always select timer 0 as source. */ - wr32(trgttiml, rq->perout.start.nsec | IGC_TT_IO_TIMER_SEL_SYSTIM0); + wr32(trgttiml, (u32)(igc->perout[i].start.tv_nsec | + IGC_TT_IO_TIMER_SEL_SYSTIM0)); if (use_freq) wr32(freqout, ns); tsauxc |= tsauxc_mask;
From: Eric Biggers ebiggers@google.com
[ Upstream commit 2fb48d88e77f29bf9d278f25bcfe82cf59a0e09b ]
When a device-mapper device is passing through the inline encryption support of an underlying device, calls to blk_crypto_evict_key() take the blk_crypto_profile::lock of the device-mapper device, then take the blk_crypto_profile::lock of the underlying device (nested). This isn't a real deadlock, but it causes a lockdep report because there is only one lock class for all instances of this lock.
Lockdep subclasses don't really work here because the hierarchy of block devices is dynamic and could have more than 2 levels.
Instead, register a dynamic lock class for each blk_crypto_profile, and associate that with the lock.
This avoids false-positive lockdep reports like the following:
============================================
WARNING: possible recursive locking detected
6.4.0-rc5 #2 Not tainted
--------------------------------------------
fscryptctl/1421 is trying to acquire lock:
ffffff80829ca418 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0x44/0x1c0

but task is already holding lock:
ffffff8086b68ca8 (&profile->lock){++++}-{3:3}, at: __blk_crypto_evict_key+0xc8/0x1c0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&profile->lock);
  lock(&profile->lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation
Fixes: 1b2628397058 ("block: Keyslot Manager for Inline Encryption")
Reported-by: Bart Van Assche bvanassche@acm.org
Signed-off-by: Eric Biggers ebiggers@google.com
Reviewed-by: Bart Van Assche bvanassche@acm.org
Link: https://lore.kernel.org/r/20230610061139.212085-1-ebiggers@kernel.org
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 block/blk-crypto-profile.c | 12 ++++++++++--
 include/linux/blk-crypto-profile.h | 1 +
 2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/block/blk-crypto-profile.c b/block/blk-crypto-profile.c index 2a67d3fb63e5c..7fabc883e39f1 100644 --- a/block/blk-crypto-profile.c +++ b/block/blk-crypto-profile.c @@ -79,7 +79,14 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile, unsigned int slot_hashtable_size;
memset(profile, 0, sizeof(*profile)); - init_rwsem(&profile->lock); + + /* + * profile->lock of an underlying device can nest inside profile->lock + * of a device-mapper device, so use a dynamic lock class to avoid + * false-positive lockdep reports. + */ + lockdep_register_key(&profile->lockdep_key); + __init_rwsem(&profile->lock, "&profile->lock", &profile->lockdep_key);
if (num_slots == 0) return 0; @@ -89,7 +96,7 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile, profile->slots = kvcalloc(num_slots, sizeof(profile->slots[0]), GFP_KERNEL); if (!profile->slots) - return -ENOMEM; + goto err_destroy;
profile->num_slots = num_slots;
@@ -435,6 +442,7 @@ void blk_crypto_profile_destroy(struct blk_crypto_profile *profile) { if (!profile) return; + lockdep_unregister_key(&profile->lockdep_key); kvfree(profile->slot_hashtable); kvfree_sensitive(profile->slots, sizeof(profile->slots[0]) * profile->num_slots); diff --git a/include/linux/blk-crypto-profile.h b/include/linux/blk-crypto-profile.h index e6802b69cdd64..90ab33cb5d0ef 100644 --- a/include/linux/blk-crypto-profile.h +++ b/include/linux/blk-crypto-profile.h @@ -111,6 +111,7 @@ struct blk_crypto_profile { * keyslots while ensuring that they can't be changed concurrently. */ struct rw_semaphore lock; + struct lock_class_key lockdep_key;
/* List of idle slots, with least recently used slot at front */ wait_queue_head_t idle_slots_wait_queue;
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit e579b007eff3ff8d29d59d16214cd85fb9e573f7 ]
This should be negative -EAGAIN instead of positive. The callers treat non-zero error codes the same so it doesn't really impact runtime beyond some trivial differences to debug output.
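For readers less familiar with the convention, here is a minimal sketch of it. The ex_* functions are hypothetical; only the use of a negative errno value as the failure return mirrors the fix.

```c
#include <errno.h>

/* Returns 0 on success or a negative errno value on failure. */
static int ex_start(int slot_available)
{
	if (!slot_available)
		return -EAGAIN;	/* negative, as the kernel convention expects */
	return 0;
}

/* A caller that follows the convention strictly and tests for "< 0". */
static int ex_failed(int rval)
{
	return rval < 0;
}
```

A positive EAGAIN would slip past a caller like ex_failed(). The callers in this driver test for non-zero instead, which is why the commit message describes the runtime impact as trivial.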
Fixes: 80676d054e5a ("scsi: qla2xxx: Fix session cleanup hang")
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Link: https://lore.kernel.org/r/49866d28-4cfe-47b0-842b-78f110e61aab@moroto.mounta...
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/scsi/qla2xxx/qla_iocb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index b9b3e6f80ea9b..1ed13199f27ce 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -3892,7 +3892,7 @@ qla2x00_start_sp(srb_t *sp)
 
 	pkt = __qla2x00_alloc_iocbs(sp->qpair, sp);
 	if (!pkt) {
-		rval = EAGAIN;
+		rval = -EAGAIN;
 		ql_log(ql_log_warn, vha, 0x700c,
 		    "qla2x00_alloc_iocbs failed.\n");
 		goto done;
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 89f7ef7f2b23b2a7b8ce346c23161916eae5b15c ]
When RESET_CONTROLLER is not set, kconfig complains about missing dependencies for RESET_TI_SYSCON, so add the missing dependency just as is done above for SCSI_UFS_QCOM.
Silences this kconfig warning:
WARNING: unmet direct dependencies detected for RESET_TI_SYSCON
  Depends on [n]: RESET_CONTROLLER [=n] && HAS_IOMEM [=y]
  Selected by [m]:
  - SCSI_UFS_MEDIATEK [=m] && SCSI_UFSHCD [=y] && SCSI_UFSHCD_PLATFORM [=y] && ARCH_MEDIATEK [=y]
Fixes: de48898d0cb6 ("scsi: ufs-mediatek: Create reset control device_link")
Signed-off-by: Randy Dunlap rdunlap@infradead.org
Link: lore.kernel.org/r/202306020859.1wHg9AaT-lkp@intel.com
Link: https://lore.kernel.org/r/20230701052348.28046-1-rdunlap@infradead.org
Cc: Stanley Chu stanley.chu@mediatek.com
Cc: Peter Wang peter.wang@mediatek.com
Cc: Paul Gazzillo paul@pgazz.com
Cc: Necip Fazil Yildiran fazilyildiran@gmail.com
Cc: linux-scsi@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mediatek@lists.infradead.org
Cc: "James E.J. Bottomley" jejb@linux.ibm.com
Cc: "Martin K. Petersen" martin.petersen@oracle.com
Reported-by: kernel test robot lkp@intel.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ufs/host/Kconfig | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
index 8793e34335806..f11e98c9e6652 100644
--- a/drivers/ufs/host/Kconfig
+++ b/drivers/ufs/host/Kconfig
@@ -72,6 +72,7 @@ config SCSI_UFS_QCOM
 config SCSI_UFS_MEDIATEK
 	tristate "Mediatek specific hooks to UFS controller platform driver"
 	depends on SCSI_UFSHCD_PLATFORM && ARCH_MEDIATEK
+	depends on RESET_CONTROLLER
 	select PHY_MTK_UFS
 	select RESET_TI_SYSCON
 	help
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit 5415ccd50a8620c8cbaa32d6f18c946c453566f5 ]
The check_max_stack_depth pass happens after the verifier's symbolic execution, and attempts to walk the call graph of the BPF program, ensuring that the stack usage stays within bounds for all possible call chains. There are two cases to consider: bpf_pseudo_func and bpf_pseudo_call. In the former case, the callback pointer is loaded into a register, and it is assumed that it is later passed to some helper which calls it (although there is no way to be sure), so the check remains conservative and accounts for the stack usage anyway. For this particular case, asynchronous callbacks are skipped, as they execute asynchronously when their corresponding event fires.
The case of bpf_pseudo_call is simpler and we know that the call is definitely made, hence the stack depth of the subprog is accounted for.
However, the current check still skips an asynchronous callback even if a bpf_pseudo_call was made for it. This is erroneous, as it will miss accounting for the stack usage of the asynchronous callback, which can be used to breach the maximum stack depth limit.
Fix this by only skipping asynchronous callbacks when the instruction is not a pseudo call to the subprog.
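The decision being changed can be isolated in a few lines. This is a simplified model rather than verifier code: ex_subprog and both helpers are hypothetical, and the real pass walks instructions and call frames instead of taking a flag.

```c
#include <stdbool.h>

struct ex_subprog {
	int stack_depth;
	bool is_async_cb;
};

/* Should the callee's stack be added to the current call chain? */
static bool ex_accounts_stack(const struct ex_subprog *callee, bool is_pseudo_call)
{
	/* Async callbacks are skipped only when merely referenced (pseudo func),
	 * never when they are the target of a direct call.
	 */
	if (callee->is_async_cb && !is_pseudo_call)
		return false;
	return true;
}

static int ex_chain_depth(int caller_depth, const struct ex_subprog *callee,
			  bool is_pseudo_call)
{
	if (!ex_accounts_stack(callee, is_pseudo_call))
		return caller_depth;
	return caller_depth + callee->stack_depth;
}
```

Before the fix, the model's first branch would have fired for any async callback, including one reached through a direct call.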
Fixes: 7ddc80a476c2 ("bpf: Teach stack depth check about async callbacks.")
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com
Link: https://lore.kernel.org/r/20230705144730.235802-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov ast@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/bpf/verifier.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 30fabae47a07b..aac31e33323bb 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5450,8 +5450,9 @@ static int check_max_stack_depth(struct bpf_verifier_env *env) verbose(env, "verifier bug. subprog has tail_call and async cb\n"); return -EFAULT; } - /* async callbacks don't increase bpf prog stack size */ - continue; + /* async callbacks don't increase bpf prog stack size unless called directly */ + if (!bpf_pseudo_call(insn + i)) + continue; } i = next_insn;
From: Klaus Kudielka klaus.kudielka@gmail.com
[ Upstream commit 21327f81db6337c8843ce755b01523c7d3df715b ]
If we boot with mvneta.txq_number=1, the txq_map is set incorrectly: MVNETA_CPU_TXQ_ACCESS(1) refers to TX queue 1, but only TX queue 0 is initialized. Fix this.
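The off-by-one is easy to see with the bit layout written out. In the sketch below, EX_TXQ_ACCESS() assumes that the per-queue access bit sits at position txq + 8, which is how I understand the driver's macro but is not shown in this patch; ex_map_is_valid() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: TX queue N is enabled for a CPU by setting bit N + 8. */
#define EX_TXQ_ACCESS(txq) (UINT32_C(1) << ((txq) + 8))

/* True when the map only references queues 0 .. txq_number - 1. */
static bool ex_map_is_valid(uint32_t txq_map, unsigned int txq_number)
{
	uint32_t valid = ((UINT32_C(1) << txq_number) - 1) << 8;

	return (txq_map & ~valid) == 0;
}
```

With txq_number = 1, only queue 0 exists, so EX_TXQ_ACCESS(1) points at a queue that was never initialized, while EX_TXQ_ACCESS(0) is the only valid choice.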
Fixes: 50bf8cb6fc9c ("net: mvneta: Configure XPS support")
Signed-off-by: Klaus Kudielka klaus.kudielka@gmail.com
Reviewed-by: Michal Kubiak michal.kubiak@intel.com
Link: https://lore.kernel.org/r/20230705053712.3914-1-klaus.kudielka@gmail.com
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/marvell/mvneta.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index 2cad76d0a50ef..4401fad31fb98 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1505,7 +1505,7 @@ static void mvneta_defaults_set(struct mvneta_port *pp) */ if (txq_number == 1) txq_map = (cpu == pp->rxq_def) ? - MVNETA_CPU_TXQ_ACCESS(1) : 0; + MVNETA_CPU_TXQ_ACCESS(0) : 0;
} else { txq_map = MVNETA_CPU_TXQ_ACCESS_ALL_MASK; @@ -4295,7 +4295,7 @@ static void mvneta_percpu_elect(struct mvneta_port *pp) */ if (txq_number == 1) txq_map = (cpu == elected_cpu) ? - MVNETA_CPU_TXQ_ACCESS(1) : 0; + MVNETA_CPU_TXQ_ACCESS(0) : 0; else txq_map = mvreg_read(pp, MVNETA_CPU_MAP(cpu)) & MVNETA_CPU_TXQ_ACCESS_ALL_MASK;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit c60819149b637d0f9f7f66e110d2a0d90a3993ea ]
In a future change we will need to make ocelot_port_update_active_preemptible_tcs() call vsc9959_tas_guard_bands_update(), but that is currently not possible, since the ocelot switch lib does not have access to functions private to the DSA wrapper.
Move the pointer to vsc9959_tas_guard_bands_update() from felix->info (which is private to the DSA driver) to ocelot->ops (which is also visible to the ocelot switch lib).
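The layering idea can be shown with a small function-pointer table. All ex_* names are hypothetical; the sketch only illustrates why a hook has to live in a structure the shared library can see before the library can call it.

```c
#include <stddef.h>

struct ex_switch;

/* Ops table visible to both the shared switch library and the wrapper driver. */
struct ex_ops {
	void (*tas_guard_bands_update)(struct ex_switch *sw, int port);
};

struct ex_switch {
	const struct ex_ops *ops;
	int updates;	/* counts hook invocations, for illustration only */
};

/* Wrapper-driver implementation of the hook. */
static void ex_guard_bands_update(struct ex_switch *sw, int port)
{
	(void)port;
	sw->updates++;
}

/* Library code: it can reach the hook only through the shared ops table. */
static void ex_lib_preemptible_tcs_changed(struct ex_switch *sw, int port)
{
	if (sw->ops && sw->ops->tas_guard_bands_update)
		sw->ops->tas_guard_bands_update(sw, port);
}
```

A hook stored in a wrapper-private structure would be unreachable from ex_lib_preemptible_tcs_changed(), which is the situation the commit message describes.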
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com
Message-ID: 20230705104422.49025-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Stable-dep-of: c6efb4ae387c ("net: mscc: ocelot: fix oversize frame dropping for preemptible TCs")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/ocelot/felix.c | 5 ++---
 drivers/net/dsa/ocelot/felix.h | 1 -
 drivers/net/dsa/ocelot/felix_vsc9959.c | 2 +-
 include/soc/mscc/ocelot.h | 1 +
 4 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 70c0e2b1936b3..8348da2b3c97a 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -1786,14 +1786,13 @@ static int felix_change_mtu(struct dsa_switch *ds, int port, int new_mtu) { struct ocelot *ocelot = ds->priv; struct ocelot_port *ocelot_port = ocelot->ports[port]; - struct felix *felix = ocelot_to_felix(ocelot);
ocelot_port_set_maxlen(ocelot, port, new_mtu);
mutex_lock(&ocelot->tas_lock);
- if (ocelot_port->taprio && felix->info->tas_guard_bands_update) - felix->info->tas_guard_bands_update(ocelot, port); + if (ocelot_port->taprio && ocelot->ops->tas_guard_bands_update) + ocelot->ops->tas_guard_bands_update(ocelot, port);
mutex_unlock(&ocelot->tas_lock);
diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h index 96008c046da53..1d4befe7cfe8e 100644 --- a/drivers/net/dsa/ocelot/felix.h +++ b/drivers/net/dsa/ocelot/felix.h @@ -57,7 +57,6 @@ struct felix_info { void (*mdio_bus_free)(struct ocelot *ocelot); int (*port_setup_tc)(struct dsa_switch *ds, int port, enum tc_setup_type type, void *type_data); - void (*tas_guard_bands_update)(struct ocelot *ocelot, int port); void (*port_sched_speed_set)(struct ocelot *ocelot, int port, u32 speed); void (*phylink_mac_config)(struct ocelot *ocelot, int port, diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index d172a3e9736c4..219fb672a68d7 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -2600,6 +2600,7 @@ static const struct ocelot_ops vsc9959_ops = { .cut_through_fwd = vsc9959_cut_through_fwd, .tas_clock_adjust = vsc9959_tas_clock_adjust, .update_stats = vsc9959_update_stats, + .tas_guard_bands_update = vsc9959_tas_guard_bands_update, };
static const struct felix_info felix_info_vsc9959 = { @@ -2625,7 +2626,6 @@ static const struct felix_info felix_info_vsc9959 = { .port_modes = vsc9959_port_modes, .port_setup_tc = vsc9959_port_setup_tc, .port_sched_speed_set = vsc9959_sched_speed_set, - .tas_guard_bands_update = vsc9959_tas_guard_bands_update, };
/* The INTB interrupt is shared between for PTP TX timestamp availability diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h index 22aae505c813b..85a726fb006ca 100644 --- a/include/soc/mscc/ocelot.h +++ b/include/soc/mscc/ocelot.h @@ -663,6 +663,7 @@ struct ocelot_ops { struct flow_stats *stats); void (*cut_through_fwd)(struct ocelot *ocelot); void (*tas_clock_adjust)(struct ocelot *ocelot); + void (*tas_guard_bands_update)(struct ocelot *ocelot, int port); void (*update_stats)(struct ocelot *ocelot); };
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit c6efb4ae387c79bf0d4da286108c810b7b40de3c ]
This switch implements Hold/Release in a strange way, with no control from the user as required by IEEE 802.1Q-2018 through Set-And-Hold-MAC and Set-And-Release-MAC, but rather, it emits HOLD requests implicitly based on the schedule.
Namely, when the gate of a preemptible TC is about to close (actually QSYS::PREEMPTION_CFG.HOLD_ADVANCE octet times in advance of this event), the QSYS seems to emit a HOLD request pulse towards the MAC which preempts the currently transmitted packet, and further packets are held back in the queue system.
This allows large frames to be squeezed through small time slots, because HOLD requests initiated by the gate events result in the frame being segmented in multiple fragments, the bit time of which is equal to the size of the time slot.
It has been reported that the vsc9959_tas_guard_bands_update() logic breaks this, because it doesn't take preemptible TCs into account, and enables oversized frame dropping when the time slot doesn't allow a full MTU to be sent, but it does allow 2*minFragSize to be sent (128B). Packets larger than 128B are dropped instead of being sent in multiple fragments.
Confusingly, the manual says:
| For guard band, SDU calculation of a traffic class of a port, if
| preemption is enabled (through 'QSYS::PREEMPTION_CFG.P_QUEUES') then
| QSYS::PREEMPTION_CFG.HOLD_ADVANCE is used, otherwise
| QSYS::QMAXSDU_CFG_*.QMAXSDU_* is used.
but this only refers to the static guard band durations, and the QMAXSDU_CFG_* registers have dual purpose - the other being oversized frame dropping, which takes place irrespective of whether frames are preemptible or express.
So, to fix the problem, we need to call vsc9959_tas_guard_bands_update() from ocelot_port_update_active_preemptible_tcs(), and modify the guard band logic to consider a different (lower) oversize limit for preemptible traffic classes.
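The two thresholds and the per-TC choice between them can be sketched as follows. The ex_* helpers are hypothetical. The 24-byte overhead and the factor of two on the minimum fragment size are copied from the diff below; the minimum fragment size is passed in as a parameter because its derivation from the hardware register is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Time needed to pass one full-size frame plus per-frame overhead. */
static uint64_t ex_needed_bit_time_ps(uint32_t maxlen, uint64_t picos_per_byte)
{
	return (uint64_t)(maxlen + 24) * picos_per_byte;
}

/* Time needed to pass two minimum-size fragments plus per-frame overhead. */
static uint64_t ex_needed_min_frag_time_ps(uint32_t min_frag_size,
					   uint64_t picos_per_byte)
{
	return (uint64_t)(24 + 2 * min_frag_size) * picos_per_byte;
}

/* True when oversized frame dropping can be disabled for this TC. */
static bool ex_gate_fits(uint64_t remaining_gate_len_ps,
			 uint64_t needed_bit_time_ps,
			 uint64_t needed_min_frag_time_ps,
			 uint32_t active_preemptible_tcs, int tc)
{
	bool preemptible = (active_preemptible_tcs >> tc) & 1;
	uint64_t needed = preemptible ? needed_min_frag_time_ps
				      : needed_bit_time_ps;

	return remaining_gate_len_ps > needed;
}
```

As a worked case at 8000 ps per byte (1 Gbps): a 1518-byte frame needs 12,336,000 ps, while two 64-byte fragments need 1,216,000 ps, so a 2,000,000 ps gate is too short for an express TC but long enough for a preemptible one.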
Fixes: 403ffc2c34de ("net: mscc: ocelot: add support for preemptible traffic classes")
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com
Message-ID: 20230705104422.49025-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/ocelot/felix_vsc9959.c | 21 +++++++++++++++++----
 drivers/net/ethernet/mscc/ocelot_mm.c | 7 +++++--
 2 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index 219fb672a68d7..bd11f9fb95e54 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -1221,11 +1221,13 @@ static u32 vsc9959_tas_tc_max_sdu(struct tc_taprio_qopt_offload *taprio, int tc) static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port) { struct ocelot_port *ocelot_port = ocelot->ports[port]; + struct ocelot_mm_state *mm = &ocelot->mm[port]; struct tc_taprio_qopt_offload *taprio; u64 min_gate_len[OCELOT_NUM_TC]; + u32 val, maxlen, add_frag_size; + u64 needed_min_frag_time_ps; int speed, picos_per_byte; u64 needed_bit_time_ps; - u32 val, maxlen; u8 tas_speed; int tc;
@@ -1265,9 +1267,18 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port) */ needed_bit_time_ps = (u64)(maxlen + 24) * picos_per_byte;
+ /* Preemptible TCs don't need to pass a full MTU, the port will + * automatically emit a HOLD request when a preemptible TC gate closes + */ + val = ocelot_read_rix(ocelot, QSYS_PREEMPTION_CFG, port); + add_frag_size = QSYS_PREEMPTION_CFG_MM_ADD_FRAG_SIZE_X(val); + needed_min_frag_time_ps = picos_per_byte * + (u64)(24 + 2 * ethtool_mm_frag_size_add_to_min(add_frag_size)); + dev_dbg(ocelot->dev, - "port %d: max frame size %d needs %llu ps at speed %d\n", - port, maxlen, needed_bit_time_ps, speed); + "port %d: max frame size %d needs %llu ps, %llu ps for mPackets at speed %d\n", + port, maxlen, needed_bit_time_ps, needed_min_frag_time_ps, + speed);
vsc9959_tas_min_gate_lengths(taprio, min_gate_len);
@@ -1281,7 +1292,9 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port) remaining_gate_len_ps = vsc9959_tas_remaining_gate_len_ps(min_gate_len[tc]);
- if (remaining_gate_len_ps > needed_bit_time_ps) { + if ((mm->active_preemptible_tcs & BIT(tc)) ? + remaining_gate_len_ps > needed_min_frag_time_ps : + remaining_gate_len_ps > needed_bit_time_ps) { /* Setting QMAXSDU_CFG to 0 disables oversized frame * dropping. */ diff --git a/drivers/net/ethernet/mscc/ocelot_mm.c b/drivers/net/ethernet/mscc/ocelot_mm.c index fb3145118d686..99b29d1e62449 100644 --- a/drivers/net/ethernet/mscc/ocelot_mm.c +++ b/drivers/net/ethernet/mscc/ocelot_mm.c @@ -67,10 +67,13 @@ void ocelot_port_update_active_preemptible_tcs(struct ocelot *ocelot, int port) val = mm->preemptible_tcs;
/* Cut through switching doesn't work for preemptible priorities, - * so first make sure it is disabled. + * so first make sure it is disabled. Also, changing the preemptible + * TCs affects the oversized frame dropping logic, so that needs to be + * re-triggered. And since tas_guard_bands_update() also implicitly + * calls cut_through_fwd(), we don't need to explicitly call it. */ mm->active_preemptible_tcs = val; - ocelot->ops->cut_through_fwd(ocelot); + ocelot->ops->tas_guard_bands_update(ocelot, port);
dev_dbg(ocelot->dev, "port %d %s/%s, MM TX %s, preemptible TCs 0x%x, active 0x%x\n",
From: M A Ramdhan ramdhan@starlabs.sg
[ Upstream commit 0323bce598eea038714f941ce2b22541c46d488f ]
In the event of a failure in tcf_change_indev(), fw_set_parms() will immediately return an error after incrementing or decrementing the reference counter in tcf_bind_filter(). If an attacker can drive the reference counter to zero and get the reference freed, this leads to a use-after-free.
In order to prevent this, move the point of possible failure above the point where the TC_FW_CLASSID is handled.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: M A Ramdhan ramdhan@starlabs.sg Signed-off-by: M A Ramdhan ramdhan@starlabs.sg Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Pedro Tammela pctammela@mojatatu.com Message-ID: 20230705161530.52003-1-ramdhan@starlabs.sg Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_fw.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c index ae9439a6c56c9..8641f80593179 100644 --- a/net/sched/cls_fw.c +++ b/net/sched/cls_fw.c @@ -212,11 +212,6 @@ static int fw_set_parms(struct net *net, struct tcf_proto *tp, if (err < 0) return err;
- if (tb[TCA_FW_CLASSID]) { - f->res.classid = nla_get_u32(tb[TCA_FW_CLASSID]); - tcf_bind_filter(tp, &f->res, base); - } - if (tb[TCA_FW_INDEV]) { int ret; ret = tcf_change_indev(net, tb[TCA_FW_INDEV], extack); @@ -233,6 +228,11 @@ static int fw_set_parms(struct net *net, struct tcf_proto *tp, } else if (head->mask != 0xFFFFFFFF) return err;
+ if (tb[TCA_FW_CLASSID]) { + f->res.classid = nla_get_u32(tb[TCA_FW_CLASSID]); + tcf_bind_filter(tp, &f->res, base); + } + return 0; }
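The reordering in the cls_fw patch above follows a general rule: run every step that can fail before the step that takes a reference, so an early return has nothing to undo. A minimal userspace sketch of that ordering follows. The demo_* names and the struct are hypothetical stand-ins, not the kernel's types, and the real fw_set_parms() has more steps than are modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a filter; not the kernel's struct fw_filter. */
struct demo_filter {
	int bind_count;	/* models the reference taken by tcf_bind_filter() */
	int ifindex;
};

/* Models tcf_change_indev(): returns a negative errno-style value on failure. */
static int demo_change_indev(int requested_ifindex)
{
	return requested_ifindex > 0 ? requested_ifindex : -22; /* -EINVAL */
}

/* Fallible work first, reference-taking side effect last. */
static int demo_set_parms(struct demo_filter *f, int requested_ifindex,
			  bool has_classid)
{
	int ret = demo_change_indev(requested_ifindex);

	if (ret < 0)
		return ret;	/* no reference taken yet, nothing to undo */
	f->ifindex = ret;

	if (has_classid)
		f->bind_count++;	/* reached only when no later step can fail */

	return 0;
}
```

With this ordering a failed call leaves bind_count untouched, which is the property the patch restores.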
From: Junfeng Guo junfeng.guo@intel.com
[ Upstream commit 0503efeadbf6bb8bf24397613a73b67e665eac5f ]
The duplex mode was left unset in the driver, so it defaulted to 0, which corresponds to half duplex. This could mislead users about the driver's transmission capabilities. Set the default duplex configuration to full, as the driver runs in full duplex mode at this point.
Fixes: 7e074d5a76ca ("gve: Enable Link Speed Reporting in the driver.") Signed-off-by: Junfeng Guo junfeng.guo@intel.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Message-ID: 20230706044128.2726747-1-junfeng.guo@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/google/gve/gve_ethtool.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c index cfd4b8d284d12..50162ec9424df 100644 --- a/drivers/net/ethernet/google/gve/gve_ethtool.c +++ b/drivers/net/ethernet/google/gve/gve_ethtool.c @@ -590,6 +590,9 @@ static int gve_get_link_ksettings(struct net_device *netdev, err = gve_adminq_report_link_speed(priv);
cmd->base.speed = priv->link_speed; + + cmd->base.duplex = DUPLEX_FULL; + return err; }
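To illustrate why an unset field reports half duplex: DUPLEX_HALF is 0x00 and DUPLEX_FULL is 0x01 in include/uapi/linux/ethtool.h, so a zero-initialised settings structure reads as half duplex unless the driver writes the field. The sketch below redefines those two values under demo names so it builds in userspace. The struct and function are hypothetical and only model the gve change.

```c
#include <assert.h>
#include <string.h>

/* Same numeric values as DUPLEX_HALF / DUPLEX_FULL in the ethtool uapi header. */
#define DEMO_DUPLEX_HALF 0x00
#define DEMO_DUPLEX_FULL 0x01

/* Hypothetical, much reduced stand-in for struct ethtool_link_ksettings. */
struct demo_link_settings {
	unsigned int speed;
	unsigned char duplex;
};

/* set_duplex == 0 models the old behaviour, set_duplex != 0 the fixed one. */
static void demo_get_link_ksettings(struct demo_link_settings *cmd,
				    unsigned int link_speed, int set_duplex)
{
	memset(cmd, 0, sizeof(*cmd));	/* caller-provided buffer starts zeroed */
	cmd->speed = link_speed;
	if (set_duplex)
		cmd->duplex = DEMO_DUPLEX_FULL;
}
```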
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 15008052b34efaa86c1d56190ac73c4bf8c462f9 ]
As of commit 6c80a93be62d398e ("drm/fb-helper: Initialize fb-helper's preferred BPP in prepare function"), the preferred_bpp parameter of drm_fb_helper_prepare() defaults to 32 instead of drm_mode_config.preferred_depth. Hence this also applies to drm_fbdev_dma_setup(), which just passes its own preferred_bpp parameter.
Fixes: b79fe9abd58bab73 ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Thomas Zimmermann tzimmermann@suse.de Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/91f093ffe436a9f94d58fb2bfbc140... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/drm_fbdev_dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_fbdev_dma.c b/drivers/gpu/drm/drm_fbdev_dma.c index 728deffcc0d92..e85cdf69cd6c4 100644 --- a/drivers/gpu/drm/drm_fbdev_dma.c +++ b/drivers/gpu/drm/drm_fbdev_dma.c @@ -218,7 +218,7 @@ static const struct drm_client_funcs drm_fbdev_dma_client_funcs = { * drm_fbdev_dma_setup() - Setup fbdev emulation for GEM DMA helpers * @dev: DRM device * @preferred_bpp: Preferred bits per pixel for the device. - * @dev->mode_config.preferred_depth is used if this is zero. + * 32 is used if this is zero. * * This function sets up fbdev emulation for GEM DMA drivers that support * dumb buffers with a virtual address and that can be mmap'ed.
From: Ratheesh Kannoth rkannoth@marvell.com
[ Upstream commit af42088bdaf292060b8d8a00d8644ca7b2b3f2d1 ]
In legacy silicon, promiscuous mode is only modified through CGX mbox messages. In CN10KB silicon, it is modified from CGX mbox and NIX. This breaks legacy application behaviour. Fix this by removing call from NIX.
Fixes: d6c9784baf59 ("octeontx2-af: Invoke exact match functions if supported") Signed-off-by: Ratheesh Kannoth rkannoth@marvell.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Reviewed-by: Michal Kubiak michal.kubiak@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/marvell/octeontx2/af/rvu_nix.c | 11 ++------- .../marvell/octeontx2/af/rvu_npc_hash.c | 23 +++++++++++++++++-- 2 files changed, 23 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c index f01d057ad025a..8cdf91a5bf44f 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c @@ -3815,21 +3815,14 @@ int rvu_mbox_handler_nix_set_rx_mode(struct rvu *rvu, struct nix_rx_mode *req, }
/* install/uninstall promisc entry */ - if (promisc) { + if (promisc) rvu_npc_install_promisc_entry(rvu, pcifunc, nixlf, pfvf->rx_chan_base, pfvf->rx_chan_cnt); - - if (rvu_npc_exact_has_match_table(rvu)) - rvu_npc_exact_promisc_enable(rvu, pcifunc); - } else { + else if (!nix_rx_multicast) rvu_npc_enable_promisc_entry(rvu, pcifunc, nixlf, false);
- if (rvu_npc_exact_has_match_table(rvu)) - rvu_npc_exact_promisc_disable(rvu, pcifunc); - } - return 0; }
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c index 9f11c1e407373..6fe67f3a7f6f1 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c @@ -1164,8 +1164,10 @@ static u16 __rvu_npc_exact_cmd_rules_cnt_update(struct rvu *rvu, int drop_mcam_i { struct npc_exact_table *table; u16 *cnt, old_cnt; + bool promisc;
table = rvu->hw->table; + promisc = table->promisc_mode[drop_mcam_idx];
cnt = &table->cnt_cmd_rules[drop_mcam_idx]; old_cnt = *cnt; @@ -1177,13 +1179,18 @@ static u16 __rvu_npc_exact_cmd_rules_cnt_update(struct rvu *rvu, int drop_mcam_i
*enable_or_disable_cam = false;
- /* If all rules are deleted, disable cam */ + if (promisc) + goto done; + + /* If all rules are deleted and not already in promisc mode; + * disable cam + */ if (!*cnt && val < 0) { *enable_or_disable_cam = true; goto done; }
- /* If rule got added, enable cam */ + /* If rule got added and not already in promisc mode; enable cam */ if (!old_cnt && val > 0) { *enable_or_disable_cam = true; goto done; @@ -1462,6 +1469,12 @@ int rvu_npc_exact_promisc_disable(struct rvu *rvu, u16 pcifunc) *promisc = false; mutex_unlock(&table->lock);
+ /* Enable drop rule */ + rvu_npc_enable_mcam_by_entry_index(rvu, drop_mcam_idx, NIX_INTF_RX, + true); + + dev_dbg(rvu->dev, "%s: disabled promisc mode (cgx=%d lmac=%d)\n", + __func__, cgx_id, lmac_id); return 0; }
@@ -1503,6 +1516,12 @@ int rvu_npc_exact_promisc_enable(struct rvu *rvu, u16 pcifunc) *promisc = true; mutex_unlock(&table->lock);
+ /* disable drop rule */ + rvu_npc_enable_mcam_by_entry_index(rvu, drop_mcam_idx, NIX_INTF_RX, + false); + + dev_dbg(rvu->dev, "%s: Enabled promisc mode (cgx=%d lmac=%d)\n", + __func__, cgx_id, lmac_id); return 0; }
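The counter logic changed in rvu_npc_hash.c can be summarised as: while promiscuous mode is on, adding or removing rules must not toggle the drop entry, because promiscuous mode has already disabled it. A simplified userspace sketch under that reading follows. The struct and demo_* names are hypothetical, and the per-index arrays of the real table are reduced to a single entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-entry reduction of the exact-match table state. */
struct demo_exact_table {
	unsigned short cnt_cmd_rules;	/* rules behind one drop entry */
	bool promisc_mode;
};

/*
 * Apply a rule-count delta and report through *toggle_cam whether the caller
 * should enable or disable the drop entry. Returns the new count.
 */
static unsigned short demo_rules_cnt_update(struct demo_exact_table *t,
					    int val, bool *toggle_cam)
{
	unsigned short old_cnt = t->cnt_cmd_rules;

	t->cnt_cmd_rules = (unsigned short)(old_cnt + val);
	*toggle_cam = false;

	/* In promisc mode the drop entry stays as promisc mode left it. */
	if (t->promisc_mode)
		return t->cnt_cmd_rules;

	if (t->cnt_cmd_rules == 0 && val < 0)
		*toggle_cam = true;	/* last rule removed */
	else if (old_cnt == 0 && val > 0)
		*toggle_cam = true;	/* first rule added */

	return t->cnt_cmd_rules;
}
```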
From: Sai Krishna saikrishnag@marvell.com
[ Upstream commit 7709fbd4922c197efabda03660d93e48a3e80323 ]
Moved PTP pointer validation before its use to avoid smatch warning. Also used kzalloc/kfree instead of devm_kzalloc/devm_kfree.
Fixes: 2ef4e45d99b1 ("octeontx2-af: Add PTP PPS Errata workaround on CN10K silicon") Signed-off-by: Naveen Mamindlapalli naveenm@marvell.com Signed-off-by: Sunil Goutham sgoutham@marvell.com Signed-off-by: Sai Krishna saikrishnag@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/marvell/octeontx2/af/ptp.c | 19 +++++++++---------- .../net/ethernet/marvell/octeontx2/af/rvu.c | 2 +- 2 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/ptp.c b/drivers/net/ethernet/marvell/octeontx2/af/ptp.c index 3411e2e47d46b..0ee420a489fc4 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/ptp.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/ptp.c @@ -208,7 +208,7 @@ struct ptp *ptp_get(void) /* Check driver is bound to PTP block */ if (!ptp) ptp = ERR_PTR(-EPROBE_DEFER); - else + else if (!IS_ERR(ptp)) pci_dev_get(ptp->pdev);
return ptp; @@ -388,11 +388,10 @@ static int ptp_extts_on(struct ptp *ptp, int on) static int ptp_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { - struct device *dev = &pdev->dev; struct ptp *ptp; int err;
- ptp = devm_kzalloc(dev, sizeof(*ptp), GFP_KERNEL); + ptp = kzalloc(sizeof(*ptp), GFP_KERNEL); if (!ptp) { err = -ENOMEM; goto error; @@ -428,20 +427,19 @@ static int ptp_probe(struct pci_dev *pdev, return 0;
error_free: - devm_kfree(dev, ptp); + kfree(ptp);
error: /* For `ptp_get()` we need to differentiate between the case * when the core has not tried to probe this device and the case when - * the probe failed. In the later case we pretend that the - * initialization was successful and keep the error in + * the probe failed. In the later case we keep the error in * `dev->driver_data`. */ pci_set_drvdata(pdev, ERR_PTR(err)); if (!first_ptp_block) first_ptp_block = ERR_PTR(err);
- return 0; + return err; }
static void ptp_remove(struct pci_dev *pdev) @@ -449,16 +447,17 @@ static void ptp_remove(struct pci_dev *pdev) struct ptp *ptp = pci_get_drvdata(pdev); u64 clock_cfg;
- if (cn10k_ptp_errata(ptp) && hrtimer_active(&ptp->hrtimer)) - hrtimer_cancel(&ptp->hrtimer); - if (IS_ERR_OR_NULL(ptp)) return;
+ if (cn10k_ptp_errata(ptp) && hrtimer_active(&ptp->hrtimer)) + hrtimer_cancel(&ptp->hrtimer); + /* Disable PTP clock */ clock_cfg = readq(ptp->reg_base + PTP_CLOCK_CFG); clock_cfg &= ~PTP_CLOCK_CFG_PTP_EN; writeq(clock_cfg, ptp->reg_base + PTP_CLOCK_CFG); + kfree(ptp); }
static const struct pci_device_id ptp_id_table[] = { diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c index b26b013216933..73932e2755bca 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c @@ -3253,7 +3253,7 @@ static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id) rvu->ptp = ptp_get(); if (IS_ERR(rvu->ptp)) { err = PTR_ERR(rvu->ptp); - if (err == -EPROBE_DEFER) + if (err) goto err_release_regions; rvu->ptp = NULL; }
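The ptp_remove() change above moves the IS_ERR_OR_NULL() check ahead of the first dereference. As a rough userspace illustration of why that matters, the sketch below re-creates the error-pointer encoding with demo_* helpers. These are assumptions modelled on the kernel's ERR_PTR() family, not the kernel's own definitions, and struct demo_ptp is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno-style value in a pointer, as ERR_PTR() does. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True for pointers in the top DEMO_MAX_ERRNO bytes of the address space. */
static bool demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static bool demo_is_err_or_null(const void *p)
{
	return !p || demo_is_err(p);
}

/* Hypothetical device state. */
struct demo_ptp {
	bool timer_active;
	bool clock_enabled;
};

/* Returns true if teardown touched the device. */
static bool demo_remove(struct demo_ptp *ptp)
{
	/* Validate first: drvdata may hold NULL or an encoded probe error. */
	if (demo_is_err_or_null(ptp))
		return false;

	ptp->timer_active = false;	/* safe to dereference from here on */
	ptp->clock_enabled = false;
	return true;
}
```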
From: Nitya Sunkad nitya.sunkad@amd.com
[ Upstream commit abfb2a58a5377ebab717d4362d6180f901b6e5c1 ]
Remove unnecessary early code development check and the WARN_ON that it uses. The irq alloc and free paths have long been cleaned up and this check shouldn't have stuck around so long.
Fixes: 77ceb68e29cc ("ionic: Add notifyq support") Signed-off-by: Nitya Sunkad nitya.sunkad@amd.com Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/pensando/ionic/ionic_lif.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c index 957027e546b30..e03a94f2469ab 100644 --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c @@ -474,11 +474,6 @@ static void ionic_qcqs_free(struct ionic_lif *lif) static void ionic_link_qcq_interrupts(struct ionic_qcq *src_qcq, struct ionic_qcq *n_qcq) { - if (WARN_ON(n_qcq->flags & IONIC_QCQ_F_INTR)) { - ionic_intr_free(n_qcq->cq.lif->ionic, n_qcq->intr.index); - n_qcq->flags &= ~IONIC_QCQ_F_INTR; - } - n_qcq->intr.vector = src_qcq->intr.vector; n_qcq->intr.index = src_qcq->intr.index; n_qcq->napi_qcq = src_qcq->napi_qcq;
From: Ivan Babrou ivan@cloudflare.com
[ Upstream commit 8139dccd464aaee4a2c351506ff883733c6ca5a3 ]
The tracepoint has existed for 12 years, but it only covered udp over the legacy IPv4 protocol. Having it enabled for udp6 removes the unnecessary difference in error visibility.
Signed-off-by: Ivan Babrou ivan@cloudflare.com Fixes: 296f7ea75b45 ("udp: add tracepoints for queueing skb to rcvbuf") Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/net-traces.c | 2 ++ net/ipv6/udp.c | 2 ++ 2 files changed, 4 insertions(+)
diff --git a/net/core/net-traces.c b/net/core/net-traces.c index 805b7385dd8da..6aef976bc1da2 100644 --- a/net/core/net-traces.c +++ b/net/core/net-traces.c @@ -63,4 +63,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(napi_poll); EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_send_reset); EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_bad_csum);
+EXPORT_TRACEPOINT_SYMBOL_GPL(udp_fail_queue_rcv_skb); + EXPORT_TRACEPOINT_SYMBOL_GPL(sk_data_ready); diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index e5a337e6b9705..debb98fb23c0b 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -45,6 +45,7 @@ #include <net/tcp_states.h> #include <net/ip6_checksum.h> #include <net/ip6_tunnel.h> +#include <trace/events/udp.h> #include <net/xfrm.h> #include <net/inet_hashtables.h> #include <net/inet6_hashtables.h> @@ -680,6 +681,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) } UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite); kfree_skb_reason(skb, drop_reason); + trace_udp_fail_queue_rcv_skb(rc, sk); return -1; }
From: Rafał Miłecki rafal@milecki.pl
[ Upstream commit e7731194fdf085f46d58b1adccfddbd0dfee4873 ]
Turning IRQs off is done by accessing Ethernet controller registers. That can't be done until device's clock is enabled. It results in a SoC hang otherwise.
This bug remained unnoticed for years as most bootloaders keep all Ethernet interfaces turned on. It seems to only affect a niche SoC family BCM47189. It has two Ethernet controllers but CFE bootloader uses only the first one.
Fixes: 34322615cbaa ("net: bgmac: Mask interrupts during probe") Signed-off-by: Rafał Miłecki rafal@milecki.pl Reviewed-by: Michal Kubiak michal.kubiak@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bgmac.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c index 1761df8fb7f96..10c7c232cc4ec 100644 --- a/drivers/net/ethernet/broadcom/bgmac.c +++ b/drivers/net/ethernet/broadcom/bgmac.c @@ -1492,8 +1492,6 @@ int bgmac_enet_probe(struct bgmac *bgmac)
bgmac->in_init = true;
- bgmac_chip_intrs_off(bgmac); - net_dev->irq = bgmac->irq; SET_NETDEV_DEV(net_dev, bgmac->dev); dev_set_drvdata(bgmac->dev, bgmac); @@ -1511,6 +1509,8 @@ int bgmac_enet_probe(struct bgmac *bgmac) */ bgmac_clk_enable(bgmac, 0);
+ bgmac_chip_intrs_off(bgmac); + /* This seems to be fixing IRQ by assigning OOB #6 to the core */ if (!(bgmac->feature_flags & BGMAC_FEAT_IDM_MASK)) { if (bgmac->feature_flags & BGMAC_FEAT_IRQ_ID_OOB_6)
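The bgmac fix is purely an ordering change: the register write that masks interrupts is moved after the clock is enabled. A toy model of that dependency is sketched below. All demo_* names are hypothetical, and the "hang" is represented by a flag rather than a real stalled bus access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a MAC core whose registers need a running clock. */
struct demo_mac {
	bool clk_enabled;
	bool intrs_masked;
	bool hung;
};

static void demo_clk_enable(struct demo_mac *m)
{
	m->clk_enabled = true;
}

/* Models a register write: touching an unclocked core hangs the SoC. */
static void demo_intrs_off(struct demo_mac *m)
{
	if (!m->clk_enabled) {
		m->hung = true;
		return;
	}
	m->intrs_masked = true;
}

/* clock_first == false models the old probe order, true the fixed one. */
static void demo_probe(struct demo_mac *m, bool clock_first)
{
	if (!clock_first)
		demo_intrs_off(m);
	demo_clk_enable(m);
	if (clock_first)
		demo_intrs_off(m);
}
```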
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit c329b261afe71197d9da83c1f18eb45a7e97e089 ]
Ian reported several skb corruptions triggered by rx-gro-list, producing various oopses like this one:

[ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[ 62.631083] #PF: supervisor read access in kernel mode
[ 62.636312] #PF: error_code(0x0000) - not-present page
[ 62.641541] PGD 0 P4D 0
[ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
[ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F, BIOS 1.7a 10/13/2022
[ 62.663344] RIP: 0010:__udp_gso_segment (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23 net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261 net/ipv4/udp_offload.c:277)
[ 62.687193] RSP: 0018:ffffbd3a83b4f868 EFLAGS: 00010246
[ 62.692515] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[ 62.699743] RDX: ffffa124def8a000 RSI: 0000000000000079 RDI: ffffa125952a14d4
[ 62.706970] RBP: ffffa124def8a000 R08: 0000000000000022 R09: 00002000001558c9
[ 62.714199] R10: 0000000000000000 R11: 00000000be554639 R12: 00000000000000e2
[ 62.721426] R13: ffffa125952a1400 R14: ffffa125952a1400 R15: 00002000001558c9
[ 62.728654] FS: 0000000000000000(0000) GS:ffffa127efa40000(0000) knlGS:0000000000000000
[ 62.736852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 62.742702] CR2: 00000000000000c0 CR3: 00000001034b0000 CR4: 00000000003526e0
[ 62.749948] Call Trace:
[ 62.752498] <TASK>
[ 62.779267] inet_gso_segment (net/ipv4/af_inet.c:1398)
[ 62.787605] skb_mac_gso_segment (net/core/gro.c:141)
[ 62.791906] __skb_gso_segment (net/core/dev.c:3403 (discriminator 2))
[ 62.800492] validate_xmit_skb (./include/linux/netdevice.h:4862 net/core/dev.c:3659)
[ 62.804695] validate_xmit_skb_list (net/core/dev.c:3710)
[ 62.809158] sch_direct_xmit (net/sched/sch_generic.c:330)
[ 62.813198] __dev_queue_xmit (net/core/dev.c:3805 net/core/dev.c:4210) net/netfilter/core.c:626)
[ 62.821093] br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 62.825652] maybe_deliver (net/bridge/br_forward.c:193)
[ 62.829420] br_flood (net/bridge/br_forward.c:233)
[ 62.832758] br_handle_frame_finish (net/bridge/br_input.c:215)
[ 62.837403] br_handle_frame (net/bridge/br_input.c:298 net/bridge/br_input.c:416)
[ 62.851417] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 62.866114] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 62.871367] netif_receive_skb_list_internal (net/core/dev.c:5638 net/core/dev.c:5727)
[ 62.876795] napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 62.881004] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[ 62.893534] __napi_poll (net/core/dev.c:6498)
[ 62.897133] napi_threaded_poll (./include/linux/netpoll.h:89 net/core/dev.c:6640)
[ 62.905276] kthread (kernel/kthread.c:379)
[ 62.913435] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 62.917119] </TASK>
In the critical scenario, rx-gro-list GRO-ed packets are fed, via a bridge, both to the local input path and to an egress device (tun).
The segmentation of such packets unsafely writes to the cloned skbs with shared heads.
This change addresses the issue by uncloning the to-be-segmented skbs as needed.
Reported-by: Ian Kumlien ian.kumlien@gmail.com Tested-by: Ian Kumlien ian.kumlien@gmail.com Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/skbuff.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index cea28d30abb55..1b6a1d99869dc 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4270,6 +4270,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
skb_push(skb, -skb_network_offset(skb) + offset);
+ /* Ensure the head is writeable before touching the shared info */ + err = skb_unclone(skb, GFP_ATOMIC); + if (err) + goto err_linearize; + skb_shinfo(skb)->frag_list = NULL;
while (list_skb) {
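The skb_unclone() call added above is a copy-on-write step: if the data area is shared with a clone, take a private copy before writing to it. The userspace sketch below models that idea with a reference-counted buffer. The demo_* types are hypothetical, and the real helper works on struct sk_buff and its shared info rather than on a struct like this.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared data area; refs counts the buffers pointing at it. */
struct demo_data {
	int refs;
	int frag_list;
};

struct demo_buf {
	struct demo_data *data;
};

/* Give b a private copy of its data if anyone else still references it. */
static int demo_unclone(struct demo_buf *b)
{
	struct demo_data *copy;

	if (b->data->refs == 1)
		return 0;		/* already exclusive, write in place */

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -12;		/* -ENOMEM */

	*copy = *b->data;
	copy->refs = 1;
	b->data->refs--;		/* drop our reference on the shared area */
	b->data = copy;
	return 0;
}

/* Writer: unclone first, then modify, so other holders are not affected. */
static int demo_clear_frag_list(struct demo_buf *b)
{
	int err = demo_unclone(b);

	if (err)
		return err;
	b->data->frag_list = 0;
	return 0;
}
```

In this model the second holder keeps seeing its original frag_list value after the first holder writes.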
From: Niklas Schnelle schnelle@linux.ibm.com
[ Upstream commit 6b5c13b591d753c6022fbd12f8c0c0a9a07fc065 ]
The clients array references all registered clients and is protected by the clients_lock. Besides its use as general list of clients the clients array is accessed in ism_handle_irq() to forward ISM device events to clients.
While the clients_lock is taken in the IRQ handler when calling handle_event(), it is incorrectly not held during the client->handle_irq() call or for the preceding clients[] access, leaving both unprotected against concurrent client (un-)registration.
Furthermore the accesses to ism->sba_client_arr[] in ism_register_dmb() and ism_unregister_dmb() are not protected by any lock. This is especially problematic as the client ID from the ism->sba_client_arr[] is not checked against NO_CLIENT and neither is the client pointer checked.
Instead of expanding the use of the clients_lock further, add a separate array in struct ism_dev which references clients subscribed to the device's events and IRQs. This array is protected by ism->lock, which is already taken in ism_handle_irq() and can be taken outside the IRQ handler when adding/removing subscribers or accessing ism->sba_client_arr[]. This also means that the clients_lock is no longer taken in IRQ context.
Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Reviewed-by: Alexandra Winter wintera@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/net/ism_drv.c | 44 +++++++++++++++++++++++++++++++------- include/linux/ism.h | 1 + 2 files changed, 37 insertions(+), 8 deletions(-)
diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c index c2096e4bba319..216eb4b386286 100644 --- a/drivers/s390/net/ism_drv.c +++ b/drivers/s390/net/ism_drv.c @@ -47,6 +47,15 @@ static struct ism_dev_list ism_dev_list = { .mutex = __MUTEX_INITIALIZER(ism_dev_list.mutex), };
+static void ism_setup_forwarding(struct ism_client *client, struct ism_dev *ism) +{ + unsigned long flags; + + spin_lock_irqsave(&ism->lock, flags); + ism->subs[client->id] = client; + spin_unlock_irqrestore(&ism->lock, flags); +} + int ism_register_client(struct ism_client *client) { struct ism_dev *ism; @@ -71,6 +80,7 @@ int ism_register_client(struct ism_client *client) list_for_each_entry(ism, &ism_dev_list.list, list) { ism->priv[i] = NULL; client->add(ism); + ism_setup_forwarding(client, ism); } } mutex_unlock(&ism_dev_list.mutex); @@ -92,6 +102,9 @@ int ism_unregister_client(struct ism_client *client) max_client--; spin_unlock_irqrestore(&clients_lock, flags); list_for_each_entry(ism, &ism_dev_list.list, list) { + spin_lock_irqsave(&ism->lock, flags); + /* Stop forwarding IRQs and events */ + ism->subs[client->id] = NULL; for (int i = 0; i < ISM_NR_DMBS; ++i) { if (ism->sba_client_arr[i] == client->id) { pr_err("%s: attempt to unregister client '%s'" @@ -101,6 +114,7 @@ int ism_unregister_client(struct ism_client *client) goto out; } } + spin_unlock_irqrestore(&ism->lock, flags); } out: mutex_unlock(&ism_dev_list.mutex); @@ -328,6 +342,7 @@ int ism_register_dmb(struct ism_dev *ism, struct ism_dmb *dmb, struct ism_client *client) { union ism_reg_dmb cmd; + unsigned long flags; int ret;
ret = ism_alloc_dmb(ism, dmb); @@ -351,7 +366,9 @@ int ism_register_dmb(struct ism_dev *ism, struct ism_dmb *dmb, goto out; } dmb->dmb_tok = cmd.response.dmb_tok; + spin_lock_irqsave(&ism->lock, flags); ism->sba_client_arr[dmb->sba_idx - ISM_DMB_BIT_OFFSET] = client->id; + spin_unlock_irqrestore(&ism->lock, flags); out: return ret; } @@ -360,6 +377,7 @@ EXPORT_SYMBOL_GPL(ism_register_dmb); int ism_unregister_dmb(struct ism_dev *ism, struct ism_dmb *dmb) { union ism_unreg_dmb cmd; + unsigned long flags; int ret;
memset(&cmd, 0, sizeof(cmd)); @@ -368,7 +386,9 @@ int ism_unregister_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
cmd.request.dmb_tok = dmb->dmb_tok;
+ spin_lock_irqsave(&ism->lock, flags); ism->sba_client_arr[dmb->sba_idx - ISM_DMB_BIT_OFFSET] = NO_CLIENT; + spin_unlock_irqrestore(&ism->lock, flags);
ret = ism_cmd(ism, &cmd); if (ret && ret != ISM_ERROR) @@ -491,6 +511,7 @@ static u16 ism_get_chid(struct ism_dev *ism) static void ism_handle_event(struct ism_dev *ism) { struct ism_event *entry; + struct ism_client *clt; int i;
while ((ism->ieq_idx + 1) != READ_ONCE(ism->ieq->header.idx)) { @@ -499,21 +520,21 @@ static void ism_handle_event(struct ism_dev *ism)
entry = &ism->ieq->entry[ism->ieq_idx]; debug_event(ism_debug_info, 2, entry, sizeof(*entry)); - spin_lock(&clients_lock); - for (i = 0; i < max_client; ++i) - if (clients[i]) - clients[i]->handle_event(ism, entry); - spin_unlock(&clients_lock); + for (i = 0; i < max_client; ++i) { + clt = ism->subs[i]; + if (clt) + clt->handle_event(ism, entry); + } } }
static irqreturn_t ism_handle_irq(int irq, void *data) { struct ism_dev *ism = data; - struct ism_client *clt; unsigned long bit, end; unsigned long *bv; u16 dmbemask; + u8 client_id;
bv = (void *) &ism->sba->dmb_bits[ISM_DMB_WORD_OFFSET]; end = sizeof(ism->sba->dmb_bits) * BITS_PER_BYTE - ISM_DMB_BIT_OFFSET; @@ -530,8 +551,10 @@ static irqreturn_t ism_handle_irq(int irq, void *data) dmbemask = ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET]; ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET] = 0; barrier(); - clt = clients[ism->sba_client_arr[bit]]; - clt->handle_irq(ism, bit + ISM_DMB_BIT_OFFSET, dmbemask); + client_id = ism->sba_client_arr[bit]; + if (unlikely(client_id == NO_CLIENT || !ism->subs[client_id])) + continue; + ism->subs[client_id]->handle_irq(ism, bit + ISM_DMB_BIT_OFFSET, dmbemask); }
if (ism->sba->e) { @@ -554,6 +577,7 @@ static void ism_dev_add_work_func(struct work_struct *work) add_work);
client->add(client->tgt_ism); + ism_setup_forwarding(client, client->tgt_ism); atomic_dec(&client->tgt_ism->add_dev_cnt); wake_up(&client->tgt_ism->waitq); } @@ -691,7 +715,11 @@ static void ism_dev_remove_work_func(struct work_struct *work) { struct ism_client *client = container_of(work, struct ism_client, remove_work); + unsigned long flags;
+ spin_lock_irqsave(&client->tgt_ism->lock, flags); + client->tgt_ism->subs[client->id] = NULL; + spin_unlock_irqrestore(&client->tgt_ism->lock, flags); client->remove(client->tgt_ism); atomic_dec(&client->tgt_ism->free_clients_cnt); wake_up(&client->tgt_ism->waitq); diff --git a/include/linux/ism.h b/include/linux/ism.h index ea2bcdae74012..5160d47e5ea9e 100644 --- a/include/linux/ism.h +++ b/include/linux/ism.h @@ -44,6 +44,7 @@ struct ism_dev { u64 local_gid; int ieq_idx;
+ struct ism_client *subs[MAX_CLIENTS]; atomic_t free_clients_cnt; atomic_t add_dev_cnt; wait_queue_head_t waitq;
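The IRQ path in the ISM patch above now looks up the subscriber through a per-device array and skips the interrupt when the slot is NO_CLIENT or the subscriber pointer is NULL. A reduced userspace sketch of that guarded lookup follows. Locking is left out on purpose, the constants are demo values chosen for the example, and the extra bounds check is an addition of this sketch rather than part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CLIENTS 8
#define DEMO_NO_CLIENT   0xff	/* demo sentinel, must be >= DEMO_MAX_CLIENTS */
#define DEMO_NR_DMBS     4

/* Hypothetical client that only counts the IRQs forwarded to it. */
struct demo_client {
	int irqs_seen;
};

/* Hypothetical reduction of the per-device state touched by the patch. */
struct demo_ism_dev {
	struct demo_client *subs[DEMO_MAX_CLIENTS];	/* per-device subscribers */
	unsigned char sba_client_arr[DEMO_NR_DMBS];	/* buffer slot to client id */
};

/* Returns 1 if the IRQ for buffer slot `bit` was forwarded, 0 if skipped. */
static int demo_forward_irq(struct demo_ism_dev *ism, unsigned int bit)
{
	unsigned char client_id = ism->sba_client_arr[bit];

	if (client_id == DEMO_NO_CLIENT || client_id >= DEMO_MAX_CLIENTS ||
	    !ism->subs[client_id])
		return 0;	/* no owner or owner already unsubscribed */

	ism->subs[client_id]->irqs_seen++;
	return 1;
}
```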
From: Niklas Schnelle schnelle@linux.ibm.com
[ Upstream commit 76631ffa2fd2d45bae5ad717eef716b94144e0e7 ]
Previously the clients_lock was protecting the clients array against concurrent addition/removal of clients but was also accessed from IRQ context. This meant that it had to be a spinlock and that the add() and remove() callbacks in which clients need to do allocation and take mutexes can't be called under the clients_lock. To work around this these callbacks were moved to workqueues. This not only introduced significant complexity but is also subtly broken in at least one way.
In ism_dev_init() and ism_dev_exit() clients[i]->tgt_ism is used to communicate the added/removed ISM device to the work function. While write access to clients[i]->tgt_ism is protected by the clients_lock and the code waits until there is no pending add/remove work before and after setting clients[i]->tgt_ism, this is not enough. The problem is that the wait happens based on per ISM device counters. Thus a concurrent ism_dev_init()/ism_dev_exit() for a different ISM device may overwrite a clients[i]->tgt_ism between unlocking the clients_lock and the subsequent wait for the work to finish.
Thankfully with the clients_lock no longer held in IRQ context it can be turned into a mutex which can be held during the calls to add()/remove() completely removing the need for the workqueues and the associated broken housekeeping including the per ISM device counters and the clients[i]->tgt_ism.
Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/net/ism_drv.c | 86 +++++++++++--------------------------- include/linux/ism.h | 6 --- 2 files changed, 24 insertions(+), 68 deletions(-)
diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c index 216eb4b386286..d65571b3d5cad 100644 --- a/drivers/s390/net/ism_drv.c +++ b/drivers/s390/net/ism_drv.c @@ -36,7 +36,7 @@ static const struct smcd_ops ism_ops; static struct ism_client *clients[MAX_CLIENTS]; /* use an array rather than */ /* a list for fast mapping */ static u8 max_client; -static DEFINE_SPINLOCK(clients_lock); +static DEFINE_MUTEX(clients_lock); struct ism_dev_list { struct list_head list; struct mutex mutex; /* protects ism device list */ @@ -59,11 +59,10 @@ static void ism_setup_forwarding(struct ism_client *client, struct ism_dev *ism) int ism_register_client(struct ism_client *client) { struct ism_dev *ism; - unsigned long flags; int i, rc = -ENOSPC;
mutex_lock(&ism_dev_list.mutex); - spin_lock_irqsave(&clients_lock, flags); + mutex_lock(&clients_lock); for (i = 0; i < MAX_CLIENTS; ++i) { if (!clients[i]) { clients[i] = client; @@ -74,7 +73,8 @@ int ism_register_client(struct ism_client *client) break; } } - spin_unlock_irqrestore(&clients_lock, flags); + mutex_unlock(&clients_lock); + if (i < MAX_CLIENTS) { /* initialize with all devices that we got so far */ list_for_each_entry(ism, &ism_dev_list.list, list) { @@ -96,11 +96,11 @@ int ism_unregister_client(struct ism_client *client) int rc = 0;
mutex_lock(&ism_dev_list.mutex); - spin_lock_irqsave(&clients_lock, flags); + mutex_lock(&clients_lock); clients[client->id] = NULL; if (client->id + 1 == max_client) max_client--; - spin_unlock_irqrestore(&clients_lock, flags); + mutex_unlock(&clients_lock); list_for_each_entry(ism, &ism_dev_list.list, list) { spin_lock_irqsave(&ism->lock, flags); /* Stop forwarding IRQs and events */ @@ -571,21 +571,9 @@ static u64 ism_get_local_gid(struct ism_dev *ism) return ism->local_gid; }
-static void ism_dev_add_work_func(struct work_struct *work) -{ - struct ism_client *client = container_of(work, struct ism_client, - add_work); - - client->add(client->tgt_ism); - ism_setup_forwarding(client, client->tgt_ism); - atomic_dec(&client->tgt_ism->add_dev_cnt); - wake_up(&client->tgt_ism->waitq); -} - static int ism_dev_init(struct ism_dev *ism) { struct pci_dev *pdev = ism->pdev; - unsigned long flags; int i, ret;
ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI); @@ -618,25 +606,16 @@ static int ism_dev_init(struct ism_dev *ism) /* hardware is V2 capable */ ism_create_system_eid();
- init_waitqueue_head(&ism->waitq); - atomic_set(&ism->free_clients_cnt, 0); - atomic_set(&ism->add_dev_cnt, 0); - - wait_event(ism->waitq, !atomic_read(&ism->add_dev_cnt)); - spin_lock_irqsave(&clients_lock, flags); - for (i = 0; i < max_client; ++i) + mutex_lock(&ism_dev_list.mutex); + mutex_lock(&clients_lock); + for (i = 0; i < max_client; ++i) { if (clients[i]) { - INIT_WORK(&clients[i]->add_work, - ism_dev_add_work_func); - clients[i]->tgt_ism = ism; - atomic_inc(&ism->add_dev_cnt); - schedule_work(&clients[i]->add_work); + clients[i]->add(ism); + ism_setup_forwarding(clients[i], ism); } - spin_unlock_irqrestore(&clients_lock, flags); - - wait_event(ism->waitq, !atomic_read(&ism->add_dev_cnt)); + } + mutex_unlock(&clients_lock);
- mutex_lock(&ism_dev_list.mutex); list_add(&ism->list, &ism_dev_list.list); mutex_unlock(&ism_dev_list.mutex);
@@ -711,40 +690,24 @@ static int ism_probe(struct pci_dev *pdev, const struct pci_device_id *id) return ret; }
-static void ism_dev_remove_work_func(struct work_struct *work) -{ - struct ism_client *client = container_of(work, struct ism_client, - remove_work); - unsigned long flags; - - spin_lock_irqsave(&client->tgt_ism->lock, flags); - client->tgt_ism->subs[client->id] = NULL; - spin_unlock_irqrestore(&client->tgt_ism->lock, flags); - client->remove(client->tgt_ism); - atomic_dec(&client->tgt_ism->free_clients_cnt); - wake_up(&client->tgt_ism->waitq); -} - -/* Callers must hold ism_dev_list.mutex */ static void ism_dev_exit(struct ism_dev *ism) { struct pci_dev *pdev = ism->pdev; unsigned long flags; int i;
- wait_event(ism->waitq, !atomic_read(&ism->free_clients_cnt)); - spin_lock_irqsave(&clients_lock, flags); + spin_lock_irqsave(&ism->lock, flags); for (i = 0; i < max_client; ++i) - if (clients[i]) { - INIT_WORK(&clients[i]->remove_work, - ism_dev_remove_work_func); - clients[i]->tgt_ism = ism; - atomic_inc(&ism->free_clients_cnt); - schedule_work(&clients[i]->remove_work); - } - spin_unlock_irqrestore(&clients_lock, flags); + ism->subs[i] = NULL; + spin_unlock_irqrestore(&ism->lock, flags);
- wait_event(ism->waitq, !atomic_read(&ism->free_clients_cnt)); + mutex_lock(&ism_dev_list.mutex); + mutex_lock(&clients_lock); + for (i = 0; i < max_client; ++i) { + if (clients[i]) + clients[i]->remove(ism); + } + mutex_unlock(&clients_lock);
if (SYSTEM_EID.serial_number[0] != '0' || SYSTEM_EID.type[0] != '0') @@ -755,15 +718,14 @@ static void ism_dev_exit(struct ism_dev *ism) kfree(ism->sba_client_arr); pci_free_irq_vectors(pdev); list_del_init(&ism->list); + mutex_unlock(&ism_dev_list.mutex); }
static void ism_remove(struct pci_dev *pdev) { struct ism_dev *ism = dev_get_drvdata(&pdev->dev);
- mutex_lock(&ism_dev_list.mutex); ism_dev_exit(ism); - mutex_unlock(&ism_dev_list.mutex);
pci_release_mem_regions(pdev); pci_disable_device(pdev); diff --git a/include/linux/ism.h b/include/linux/ism.h index 5160d47e5ea9e..9a4c204df3da1 100644 --- a/include/linux/ism.h +++ b/include/linux/ism.h @@ -45,9 +45,6 @@ struct ism_dev { int ieq_idx;
struct ism_client *subs[MAX_CLIENTS]; - atomic_t free_clients_cnt; - atomic_t add_dev_cnt; - wait_queue_head_t waitq; };
struct ism_event { @@ -69,9 +66,6 @@ struct ism_client { */ void (*handle_irq)(struct ism_dev *dev, unsigned int bit, u16 dmbemask); /* Private area - don't touch! */ - struct work_struct remove_work; - struct work_struct add_work; - struct ism_dev *tgt_ism; u8 id; };
From: Niklas Schnelle schnelle@linux.ibm.com
[ Upstream commit 266deeea34ffd28c6b6a63edf2af9b5a07161c24 ]
When ism_unregister_client() is called but the client still has DMBs registered it returns -EBUSY and prints an error. This only happens after the client has already been unregistered however. This is unexpected as the unregister claims to have failed. Furthermore as this implies a client bug a WARN() is more appropriate. Thus move the deregistration after the check and use WARN().
Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration")
Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/s390/net/ism_drv.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c
index d65571b3d5cad..6db5cf7e901f9 100644
--- a/drivers/s390/net/ism_drv.c
+++ b/drivers/s390/net/ism_drv.c
@@ -96,29 +96,32 @@ int ism_unregister_client(struct ism_client *client)
 	int rc = 0;
 
 	mutex_lock(&ism_dev_list.mutex);
-	mutex_lock(&clients_lock);
-	clients[client->id] = NULL;
-	if (client->id + 1 == max_client)
-		max_client--;
-	mutex_unlock(&clients_lock);
 	list_for_each_entry(ism, &ism_dev_list.list, list) {
 		spin_lock_irqsave(&ism->lock, flags);
 		/* Stop forwarding IRQs and events */
 		ism->subs[client->id] = NULL;
 		for (int i = 0; i < ISM_NR_DMBS; ++i) {
 			if (ism->sba_client_arr[i] == client->id) {
-				pr_err("%s: attempt to unregister client '%s'"
-				       "with registered dmb(s)\n", __func__,
-				       client->name);
+				WARN(1, "%s: attempt to unregister '%s' with registered dmb(s)\n",
+				     __func__, client->name);
 				rc = -EBUSY;
-				goto out;
+				goto err_reg_dmb;
 			}
 		}
 		spin_unlock_irqrestore(&ism->lock, flags);
 	}
-out:
 	mutex_unlock(&ism_dev_list.mutex);
 
+	mutex_lock(&clients_lock);
+	clients[client->id] = NULL;
+	if (client->id + 1 == max_client)
+		max_client--;
+	mutex_unlock(&clients_lock);
+	return rc;
+
+err_reg_dmb:
+	spin_unlock_irqrestore(&ism->lock, flags);
+	mutex_unlock(&ism_dev_list.mutex);
 	return rc;
 }
 EXPORT_SYMBOL_GPL(ism_unregister_client);
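The ordering change in this patch (check for busy buffers first, deregister only afterwards) can be illustrated with a small userspace sketch. This is not the driver code: `MAX_CLIENTS_SKETCH`, `NR_DMBS_SKETCH`, `struct client_sketch` and `unregister_client_sketch()` are hypothetical names, and locking is left out entirely.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_CLIENTS_SKETCH 8	/* hypothetical table size */
#define NR_DMBS_SKETCH     4	/* hypothetical number of buffers */

struct client_sketch {
	int id;
};

/* Registration table and buffer ownership (-1 means "unused"). */
static struct client_sketch *clients_sketch[MAX_CLIENTS_SKETCH];
static int dmb_owner_sketch[NR_DMBS_SKETCH] = { -1, -1, -1, -1 };

/*
 * Returns 0 on success and -EBUSY if the client still owns a buffer.
 * On -EBUSY the client stays registered, so the error code matches
 * the actual state.
 */
static int unregister_client_sketch(struct client_sketch *c)
{
	for (int i = 0; i < NR_DMBS_SKETCH; i++) {
		if (dmb_owner_sketch[i] == c->id)
			return -EBUSY;	/* nothing has been torn down yet */
	}

	clients_sketch[c->id] = NULL;	/* deregister only after the check */
	return 0;
}
```

With the old ordering the table entry would already be cleared by the time -EBUSY is returned, which is the inconsistency the commit message describes.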
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 2aaa8a15de73874847d62eb595c6683bface80fd ]
With some IPv6 Ext Hdr (RPL, SRv6, etc.), we can send a packet that has the link-local address as src and dst IP and will be forwarded to an external IP in the IPv6 Ext Hdr.
For example, the script below generates a packet whose src IP is the link-local address and dst is updated to 11::.
# for f in $(find /proc/sys/net/ -name *seg6_enabled*); do echo 1 > $f; done
# python3

from socket import *
from scapy.all import *

SRC_ADDR = DST_ADDR = "fe80::5054:ff:fe12:3456"

pkt = IPv6(src=SRC_ADDR, dst=DST_ADDR)
pkt /= IPv6ExtHdrSegmentRouting(type=4, addresses=["11::", "22::"], segleft=1)

sk = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW)
sk.sendto(bytes(pkt), (DST_ADDR, 0))
For such a packet, we call ip6_route_input() to look up a route for the next destination in these three functions depending on the header type.
  * ipv6_rthdr_rcv()
  * ipv6_rpl_srh_rcv()
  * ipv6_srh_rcv()
If no route is found, ip6_null_entry is set to skb, and the following dst_input(skb) calls ip6_pkt_drop().
Finally, in icmp6_dev(), we dereference skb_rt6_info(skb)->rt6i_idev->dev because the input device is the loopback interface. We therefore have to check whether skb_rt6_info(skb)->rt6i_idev is NULL to avoid a NULL pointer dereference for ip6_null_entry.
BUG: kernel NULL pointer dereference, address: 0000000000000000 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 157 Comm: python3 Not tainted 6.4.0-11996-gb121d614371c #35 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:icmp6_send (net/ipv6/icmp.c:436 net/ipv6/icmp.c:503) Code: fe ff ff 48 c7 40 30 c0 86 5d 83 e8 c6 44 1c 00 e9 c8 fc ff ff 49 8b 46 58 48 83 e0 fe 0f 84 4a fb ff ff 48 8b 80 d0 00 00 00 <48> 8b 00 44 8b 88 e0 00 00 00 e9 34 fb ff ff 4d 85 ed 0f 85 69 01 RSP: 0018:ffffc90000003c70 EFLAGS: 00000286 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000e0 RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff888006d72a18 RBP: ffffc90000003d80 R08: 0000000000000000 R09: 0000000000000001 R10: ffffc90000003d98 R11: 0000000000000040 R12: ffff888006d72a10 R13: 0000000000000000 R14: ffff8880057fb800 R15: ffffffff835d86c0 FS: 00007f9dc72ee740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000057b2000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <IRQ> ip6_pkt_drop (net/ipv6/route.c:4513) ipv6_rthdr_rcv (net/ipv6/exthdrs.c:640 net/ipv6/exthdrs.c:686) ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5)) ip6_input_finish (./include/linux/rcupdate.h:781 net/ipv6/ip6_input.c:483) __netif_receive_skb_one_core (net/core/dev.c:5455) process_backlog (./include/linux/rcupdate.h:781 net/core/dev.c:5895) __napi_poll (net/core/dev.c:6460) net_rx_action (net/core/dev.c:6529 net/core/dev.c:6660) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554) do_softirq (kernel/softirq.c:454 kernel/softirq.c:441) </IRQ> <TASK> __local_bh_enable_ip (kernel/softirq.c:381) __dev_queue_xmit (net/core/dev.c:4231) 
ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:135) rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914) sock_sendmsg (net/socket.c:725 net/socket.c:748) __sys_sendto (net/socket.c:2134) __x64_sys_sendto (net/socket.c:2146 net/socket.c:2142 net/socket.c:2142) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) RIP: 0033:0x7f9dc751baea Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffe98712c38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007ffe98712cf8 RCX: 00007f9dc751baea RDX: 0000000000000060 RSI: 00007f9dc6460b90 RDI: 0000000000000003 RBP: 00007f9dc56e8be0 R08: 00007ffe98712d70 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f9dc6af5d1b </TASK> Modules linked in: CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:icmp6_send (net/ipv6/icmp.c:436 net/ipv6/icmp.c:503) Code: fe ff ff 48 c7 40 30 c0 86 5d 83 e8 c6 44 1c 00 e9 c8 fc ff ff 49 8b 46 58 48 83 e0 fe 0f 84 4a fb ff ff 48 8b 80 d0 00 00 00 <48> 8b 00 44 8b 88 e0 00 00 00 e9 34 fb ff ff 4d 85 ed 0f 85 69 01 RSP: 0018:ffffc90000003c70 EFLAGS: 00000286 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00000000000000e0 RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff888006d72a18 RBP: ffffc90000003d80 R08: 0000000000000000 R09: 0000000000000001 R10: ffffc90000003d98 R11: 0000000000000040 R12: ffff888006d72a10 R13: 0000000000000000 R14: ffff8880057fb800 R15: ffffffff835d86c0 FS: 00007f9dc72ee740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000057b2000 CR4: 
00000000007506f0 PKRU: 55555554 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: disabled
Fixes: 4832c30d5458 ("net: ipv6: put host and anycast routes on device with address")
Reported-by: Wang Yufen wangyufen@huawei.com
Closes: https://lore.kernel.org/netdev/c41403a9-c2f6-3b7e-0c96-e1901e605cd0@huawei.c...
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com
Reviewed-by: David Ahern dsahern@kernel.org
Reviewed-by: Eric Dumazet edumazet@google.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/ipv6/icmp.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 9edf1f45b1ed6..65fa5014bc85e 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -424,7 +424,10 @@ static struct net_device *icmp6_dev(const struct sk_buff *skb)
 	if (unlikely(dev->ifindex == LOOPBACK_IFINDEX || netif_is_l3_master(skb->dev))) {
 		const struct rt6_info *rt6 = skb_rt6_info(skb);
 
-		if (rt6)
+		/* The destination could be an external IP in Ext Hdr (SRv6, RPL, etc.),
+		 * and ip6_null_entry could be set to skb if no route is found.
+		 */
+		if (rt6 && rt6->rt6i_idev)
 			dev = rt6->rt6i_idev->dev;
 	}
 
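The shape of the fix (test each pointer in the chain before following it) can be sketched in plain C. The struct and function names below are hypothetical stand-ins for rt6_info, inet6_dev and net_device, not the kernel definitions.

```c
#include <stddef.h>

struct net_dev_sketch   { int ifindex; };
struct inet6_dev_sketch { struct net_dev_sketch *dev; };
struct route_sketch     { struct inet6_dev_sketch *idev; };	/* stands in for rt6_info */

/*
 * Prefer the device attached to the route. A "null entry" route has
 * no idev, so fall back to the caller's device in that case instead
 * of dereferencing a NULL pointer.
 */
static struct net_dev_sketch *pick_dev_sketch(struct net_dev_sketch *fallback,
					       const struct route_sketch *rt)
{
	if (rt && rt->idev)
		return rt->idev->dev;

	return fallback;
}
```

Checking only `rt` corresponds to the code before the patch: the route object exists for ip6_null_entry, but its idev member does not.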
From: Eric Dumazet edumazet@google.com
[ Upstream commit 51d03e2f2203e76ed02d33fb5ffbb5fc85ffaf54 ]
Amit Klein reported that udp6_ehash_secret was initialized but never used.
Fixes: 1bbdceef1e53 ("inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once")
Reported-by: Amit Klein aksecurity@gmail.com
Signed-off-by: Eric Dumazet edumazet@google.com
Cc: Willy Tarreau w@1wt.eu
Cc: Willem de Bruijn willemdebruijn.kernel@gmail.com
Cc: David Ahern dsahern@kernel.org
Cc: Hannes Frederic Sowa hannes@stressinduktion.org
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/ipv6/udp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index debb98fb23c0b..d594a0425749b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -91,7 +91,7 @@ static u32 udp6_ehashfn(const struct net *net,
 	fhash = __ipv6_addr_jhash(faddr, udp_ipv6_hash_secret);
 
 	return __inet6_ehashfn(lhash, lport, fhash, fport,
-			       udp_ipv6_hash_secret + net_hash_mix(net));
+			       udp6_ehash_secret + net_hash_mix(net));
 }
 
 int udp_v6_get_port(struct sock *sk, unsigned short snum)
From: Yuan Can yuancan@huawei.com
[ Upstream commit c012968259b451dc4db407f2310fe131eaefd800 ]
Creating the ntb_hw_idt debugfs directory can fail with the following log:

 [ 1236.637636] IDT PCI-E Non-Transparent Bridge Driver 2.0
 [ 1236.639292] debugfs: Directory 'ntb_hw_idt' with parent '/' already present!

The reason is that idt_pci_driver_init() returns the result of pci_register_driver() directly without checking it. If pci_register_driver() fails, the function returns without destroying the newly created debugfs directory, so the ntb_hw_idt debugfs directory can never be created again later.

 idt_pci_driver_init()
   debugfs_create_dir() # create debugfs directory
   pci_register_driver()
     driver_register()
       bus_add_driver()
         priv = kzalloc(...) # OOM happened
   # return without destroy debugfs directory

Fix this by removing the debugfs directory when pci_register_driver() returns an error.
Fixes: bf2a952d31d2 ("NTB: Add IDT 89HPESxNTx PCIe-switches support")
Signed-off-by: Yuan Can yuancan@huawei.com
Signed-off-by: Jon Mason jdmason@kudzu.us
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ntb/hw/idt/ntb_hw_idt.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
index 0ed6f809ff2ee..51799fccf8404 100644
--- a/drivers/ntb/hw/idt/ntb_hw_idt.c
+++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
@@ -2891,6 +2891,7 @@ static struct pci_driver idt_pci_driver = {
 
 static int __init idt_pci_driver_init(void)
 {
+	int ret;
 	pr_info("%s %s\n", NTB_DESC, NTB_VER);
 
 	/* Create the top DebugFS directory if the FS is initialized */
@@ -2898,7 +2899,11 @@ static int __init idt_pci_driver_init(void)
 		dbgfs_topdir = debugfs_create_dir(KBUILD_MODNAME, NULL);
 
 	/* Register the NTB hardware driver to handle the PCI device */
-	return pci_register_driver(&idt_pci_driver);
+	ret = pci_register_driver(&idt_pci_driver);
+	if (ret)
+		debugfs_remove_recursive(dbgfs_topdir);
+
+	return ret;
 }
 module_init(idt_pci_driver_init);
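The same roll-back-on-error pattern is applied to all three NTB hardware drivers in this series. A minimal userspace sketch of it follows; the directory and the registration call are simulated with hypothetical helpers (`create_dir_sketch()`, `register_driver_sketch()` and so on), nothing here is the debugfs or PCI API.

```c
#include <stdbool.h>

static bool dir_exists_sketch;		/* stands in for the debugfs directory */
static int  already_present_sketch;	/* counts "already present" complaints */
static int  register_result_sketch;	/* what the simulated registration returns */

static void create_dir_sketch(void)
{
	if (dir_exists_sketch)
		already_present_sketch++;	/* the situation the log line reports */
	dir_exists_sketch = true;
}

static void remove_dir_sketch(void)
{
	dir_exists_sketch = false;
}

static int register_driver_sketch(void)
{
	return register_result_sketch;
}

/* Init with roll-back: undo the directory creation if registration fails. */
static int driver_init_sketch(void)
{
	int ret;

	create_dir_sketch();

	ret = register_driver_sketch();
	if (ret)
		remove_dir_sketch();

	return ret;
}
```

Without the `remove_dir_sketch()` call, a failed init followed by a second attempt would hit the "already present" branch, which mirrors the log shown in the commit message.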
From: Yuan Can yuancan@huawei.com
[ Upstream commit 98af0a33c1101c29b3ce4f0cf4715fd927c717f9 ]
Creating the ntb_hw_amd debugfs directory can fail with the following log:

 [ 618.431232] AMD(R) PCI-E Non-Transparent Bridge Driver 1.0
 [ 618.433284] debugfs: Directory 'ntb_hw_amd' with parent '/' already present!

The reason is that amd_ntb_pci_driver_init() returns the result of pci_register_driver() directly without checking it. If pci_register_driver() fails, the function returns without destroying the newly created debugfs directory, so the ntb_hw_amd debugfs directory can never be created again later.

 amd_ntb_pci_driver_init()
   debugfs_create_dir() # create debugfs directory
   pci_register_driver()
     driver_register()
       bus_add_driver()
         priv = kzalloc(...) # OOM happened
   # return without destroy debugfs directory

Fix this by removing the debugfs directory when pci_register_driver() returns an error.
Fixes: a1b3695820aa ("NTB: Add support for AMD PCI-Express Non-Transparent Bridge")
Signed-off-by: Yuan Can yuancan@huawei.com
Signed-off-by: Jon Mason jdmason@kudzu.us
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ntb/hw/amd/ntb_hw_amd.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
index 04550b1f984c6..730f2103b91d1 100644
--- a/drivers/ntb/hw/amd/ntb_hw_amd.c
+++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
@@ -1338,12 +1338,17 @@ static struct pci_driver amd_ntb_pci_driver = {
 
 static int __init amd_ntb_pci_driver_init(void)
 {
+	int ret;
 	pr_info("%s %s\n", NTB_DESC, NTB_VER);
 
 	if (debugfs_initialized())
 		debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
 
-	return pci_register_driver(&amd_ntb_pci_driver);
+	ret = pci_register_driver(&amd_ntb_pci_driver);
+	if (ret)
+		debugfs_remove_recursive(debugfs_dir);
+
+	return ret;
 }
 module_init(amd_ntb_pci_driver_init);
From: Yuan Can yuancan@huawei.com
[ Upstream commit 4c3c796aca02883ad35bb117468938cc4022ca41 ]
Creating the ntb_hw_intel debugfs directory can fail with the following log:

 [ 273.112733] Intel(R) PCI-E Non-Transparent Bridge Driver 2.0
 [ 273.115342] debugfs: Directory 'ntb_hw_intel' with parent '/' already present!

The reason is that intel_ntb_pci_driver_init() returns the result of pci_register_driver() directly without checking it. If pci_register_driver() fails, the function returns without destroying the newly created debugfs directory, so the ntb_hw_intel debugfs directory can never be created again later.

 intel_ntb_pci_driver_init()
   debugfs_create_dir() # create debugfs directory
   pci_register_driver()
     driver_register()
       bus_add_driver()
         priv = kzalloc(...) # OOM happened
   # return without destroy debugfs directory

Fix this by removing the debugfs directory when pci_register_driver() returns an error.
Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Yuan Can yuancan@huawei.com
Acked-by: Dave Jiang dave.jiang@intel.com
Signed-off-by: Jon Mason jdmason@kudzu.us
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ntb/hw/intel/ntb_hw_gen1.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c b/drivers/ntb/hw/intel/ntb_hw_gen1.c
index 84772013812bf..60a4ebc7bf35a 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -2064,12 +2064,17 @@ static struct pci_driver intel_ntb_pci_driver = {
 
 static int __init intel_ntb_pci_driver_init(void)
 {
+	int ret;
 	pr_info("%s %s\n", NTB_DESC, NTB_VER);
 
 	if (debugfs_initialized())
 		debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
 
-	return pci_register_driver(&intel_ntb_pci_driver);
+	ret = pci_register_driver(&intel_ntb_pci_driver);
+	if (ret)
+		debugfs_remove_recursive(debugfs_dir);
+
+	return ret;
 }
 module_init(intel_ntb_pci_driver_init);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 8623ccbfc55d962e19a3537652803676ad7acb90 ]
If device_register() returns an error, the name allocated by dev_set_name() needs to be freed. As the comment of device_register() says, put_device() should be used to give up the reference in the error path. So fix this by calling put_device(); the name is then freed in kobject_cleanup(), and client_dev is freed in ntb_transport_client_release().
Fixes: fce8a7bb5b4b ("PCI-Express Non-Transparent Bridge Support")
Signed-off-by: Yang Yingliang yangyingliang@huawei.com
Reviewed-by: Dave Jiang dave.jiang@intel.com
Signed-off-by: Jon Mason jdmason@kudzu.us
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ntb/ntb_transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index a9b97ebc71ac5..2abd2235bbcab 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -410,7 +410,7 @@ int ntb_transport_register_client_dev(char *device_name)
 
 		rc = device_register(dev);
 		if (rc) {
-			kfree(client_dev);
+			put_device(dev);
 			goto err;
 		}
 
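Why dropping the reference is the right cleanup, rather than freeing the container directly, can be sketched with a tiny reference-counted object in userspace C. All names (`struct obj_sketch`, `obj_create_sketch()`, `obj_put_sketch()`) are hypothetical; this only models the idea that the release callback is the one place that knows everything the object owns.

```c
#include <stdlib.h>
#include <string.h>

struct obj_sketch {
	int refs;
	char *name;				/* separately allocated, like a device name */
	void (*release)(struct obj_sketch *);
};

static int released_sketch;			/* counts release calls, for demonstration */

static void obj_release_sketch(struct obj_sketch *o)
{
	free(o->name);				/* owned resources go first */
	free(o);
	released_sketch++;
}

static struct obj_sketch *obj_create_sketch(const char *name)
{
	struct obj_sketch *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;

	o->name = malloc(strlen(name) + 1);
	if (!o->name) {
		free(o);
		return NULL;
	}
	strcpy(o->name, name);

	o->refs = 1;
	o->release = obj_release_sketch;
	return o;
}

/* Drop one reference; the last put runs the release callback. */
static void obj_put_sketch(struct obj_sketch *o)
{
	if (--o->refs == 0)
		o->release(o);
}
```

Calling `free(o)` directly in an error path would skip `free(o->name)`, which is the leak of the dev_set_name() allocation that the commit message describes.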
From: Jiasheng Jiang jiasheng@iscas.ac.cn
[ Upstream commit 2790143f09938776a3b4f69685b380bae8fd06c7 ]
Since devm_kcalloc() may return a NULL pointer, add a check for its return value, in the same way as is done for the others.
Fixes: 7f46c8b3a552 ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn
Reviewed-by: Serge Semin fancer.lancer@gmail.com
Reviewed-by: Dave Jiang dave.jiang@intel.com
Signed-off-by: Jon Mason jdmason@kudzu.us
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/ntb/test/ntb_tool.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
index 5ee0afa621a95..eeeb4b1c97d2c 100644
--- a/drivers/ntb/test/ntb_tool.c
+++ b/drivers/ntb/test/ntb_tool.c
@@ -998,6 +998,8 @@ static int tool_init_mws(struct tool_ctx *tc)
 		tc->peers[pidx].outmws =
 			devm_kcalloc(&tc->ntb->dev, tc->peers[pidx].outmw_cnt,
 				   sizeof(*tc->peers[pidx].outmws), GFP_KERNEL);
+		if (tc->peers[pidx].outmws == NULL)
+			return -ENOMEM;
 
 		for (widx = 0; widx < tc->peers[pidx].outmw_cnt; widx++) {
 			tc->peers[pidx].outmws[widx].pidx = pidx;
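As a rough userspace analogue of this check, the sketch below allocates a per-peer array with calloc() and bails out with -ENOMEM before the loop that writes into it. `struct peer_sketch` and `init_peers_sketch()` are hypothetical. Unlike devm_kcalloc(), plain calloc() has no automatic cleanup, so a real caller would have to free the earlier arrays on failure.

```c
#include <errno.h>
#include <stdlib.h>

struct peer_sketch {
	size_t cnt;	/* number of windows for this peer */
	int *windows;	/* allocated here, one entry per window */
};

static int init_peers_sketch(struct peer_sketch *peers, size_t npeers)
{
	for (size_t p = 0; p < npeers; p++) {
		peers[p].windows = calloc(peers[p].cnt, sizeof(*peers[p].windows));

		/* calloc(0, ...) may legitimately return NULL, so only
		 * treat NULL as an error when entries were requested. */
		if (peers[p].cnt && peers[p].windows == NULL)
			return -ENOMEM;

		/* Safe to write: the allocation is known to be valid. */
		for (size_t w = 0; w < peers[p].cnt; w++)
			peers[p].windows[w] = (int)p;
	}

	return 0;
}
```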
From: Ziyang Xuan william.xuanziyang@huawei.com
[ Upstream commit 06a0716949c22e2aefb648526580671197151acc ]
Currently, addrconf_mod_rs_timer() takes a reference on idev only if rs_timer is not pending, and then modifies the rs_timer timeout.

There is a time gap at [1]. If the pending rs_timer stops being pending during that gap, idev is not held even though rs_timer is then activated. The rs_timer callback addrconf_rs_timer() is executed later and puts idev without a matching hold, which can cause a refcount underflow on idev.

	if (!timer_pending(&idev->rs_timer))
		in6_dev_hold(idev);
		  <--------------[1]
	mod_timer(&idev->rs_timer, jiffies + when);

To fix the issue, hold idev if mod_timer() returns 0.
Fixes: b7b1bfce0bb6 ("ipv6: split duplicate address detection and router solicitation timer")
Suggested-by: Eric Dumazet edumazet@google.com
Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com
Reviewed-by: Eric Dumazet edumazet@google.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/ipv6/addrconf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3797917237d03..5affca8e2f53a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -318,9 +318,8 @@ static void addrconf_del_dad_work(struct inet6_ifaddr *ifp)
 static void addrconf_mod_rs_timer(struct inet6_dev *idev,
 				  unsigned long when)
 {
-	if (!timer_pending(&idev->rs_timer))
+	if (!mod_timer(&idev->rs_timer, jiffies + when))
 		in6_dev_hold(idev);
-	mod_timer(&idev->rs_timer, jiffies + when);
 }
 
 static void addrconf_mod_dad_work(struct inet6_ifaddr *ifp,
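The reference accounting behind this fix can be modelled in a single-threaded sketch. It relies on the fact that mod_timer() returns 0 when the timer was inactive and 1 when it was already pending; everything else (`struct timer_sketch`, `mod_timer_sketch()`, `arm_rs_timer_sketch()`) is hypothetical. The sketch cannot reproduce the race itself, it only shows why keying the hold on the return value keeps holds and puts balanced.

```c
#include <stdbool.h>

struct timer_sketch { bool pending; };

struct dev_sketch {
	int refs;
	struct timer_sketch rs_timer;
};

/* Arms the timer and reports, in the same step, whether it was
 * already pending (1) or inactive (0). */
static int mod_timer_sketch(struct timer_sketch *t)
{
	int was_pending = t->pending;

	t->pending = true;
	return was_pending;
}

/* Take a reference only when this call is the one that activated the timer. */
static void arm_rs_timer_sketch(struct dev_sketch *d)
{
	if (!mod_timer_sketch(&d->rs_timer))
		d->refs++;
}

/* The timer callback drops the reference taken when the timer was armed. */
static void rs_timer_fired_sketch(struct dev_sketch *d)
{
	d->rs_timer.pending = false;
	d->refs--;
}
```

Because the "was it pending" answer comes from the same operation that arms the timer, there is no window between the check and the arming, unlike the timer_pending() followed by mod_timer() sequence.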
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 5f151364b1da6bd217632fd4ee8cc24eaf66a497 ]
A previous patch addressed the fortified memcpy warning for most builds, but I still see this one with gcc-9:
In file included from include/linux/string.h:254,
                 from drivers/hid/hid-hyperv.c:8:
In function 'fortify_memcpy_chk',
    inlined from 'mousevsc_on_receive' at drivers/hid/hid-hyperv.c:272:3:
include/linux/fortify-string.h:583:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  583 |    __write_overflow_field(p_size_field, size);
      |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
My guess is that the WARN_ON() itself is what confuses gcc, so it no longer sees that there is a correct range check. Rework the code in a way that helps readability and avoids the warning.
Fixes: 542f25a94471 ("HID: hyperv: Replace one-element array with flexible-array member")
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Michael Kelley mikelley@microsoft.com
Link: https://lore.kernel.org/r/20230705140242.844167-1-arnd@kernel.org
Signed-off-by: Benjamin Tissoires bentiss@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/hid/hid-hyperv.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c
index 49d4a26895e76..f33485d83d24f 100644
--- a/drivers/hid/hid-hyperv.c
+++ b/drivers/hid/hid-hyperv.c
@@ -258,19 +258,17 @@ static void mousevsc_on_receive(struct hv_device *device,
 
 	switch (hid_msg_hdr->type) {
 	case SYNTH_HID_PROTOCOL_RESPONSE:
+		len = struct_size(pipe_msg, data, pipe_msg->size);
+
 		/*
 		 * While it will be impossible for us to protect against
 		 * malicious/buggy hypervisor/host, add a check here to
 		 * ensure we don't corrupt memory.
 		 */
-		if (struct_size(pipe_msg, data, pipe_msg->size)
-			> sizeof(struct mousevsc_prt_msg)) {
-			WARN_ON(1);
+		if (WARN_ON(len > sizeof(struct mousevsc_prt_msg)))
 			break;
-		}
 
-		memcpy(&input_dev->protocol_resp, pipe_msg,
-				struct_size(pipe_msg, data, pipe_msg->size));
+		memcpy(&input_dev->protocol_resp, pipe_msg, len);
 		complete(&input_dev->wait_event);
 		break;
 
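The reworked code computes the length once, validates it, and reuses the same value for the copy. A userspace sketch of that shape is below. The types `struct pipe_msg_sketch` and `struct resp_buf_sketch` and the 64-byte destination are assumptions for illustration, and the overflow-safe comparison is written by hand because struct_size() is a kernel helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pipe_msg_sketch {
	uint32_t type;
	uint32_t size;		/* payload length claimed by the sender */
	unsigned char data[];	/* flexible array member */
};

struct resp_buf_sketch {
	unsigned char bytes[64];	/* hypothetical fixed-size destination */
};

/* Returns 0 on success, -1 if the message would not fit. */
static int copy_response_sketch(struct resp_buf_sketch *dst,
				const struct pipe_msg_sketch *msg)
{
	size_t len;

	/* Compare against the remaining room so that the addition
	 * below cannot wrap around. */
	if (msg->size > sizeof(*dst) - sizeof(*msg))
		return -1;

	len = sizeof(*msg) + msg->size;	/* header plus payload, computed once */
	memcpy(dst, msg, len);
	return 0;
}
```

Keeping the check and the memcpy() on the same `len` variable also makes the bound visible to the compiler, which is what the commit message suspects gcc-9 was missing.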
From: Jiasheng Jiang jiasheng@iscas.ac.cn
[ Upstream commit 87355b7c3da9bfd81935caba0ab763355147f7b0 ]
Add a check for the return value of skb_copy() in order to avoid a NULL pointer dereference.
Fixes: 2cd548566384 ("net: dsa: qca8k: add support for phy read/write with mgmt Ethernet")
Signed-off-by: Jiasheng Jiang jiasheng@iscas.ac.cn
Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/qca/qca8k-8xxx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
index 6d5ac7588a691..d775a14784f7e 100644
--- a/drivers/net/dsa/qca/qca8k-8xxx.c
+++ b/drivers/net/dsa/qca/qca8k-8xxx.c
@@ -588,6 +588,9 @@ qca8k_phy_eth_busy_wait(struct qca8k_mgmt_eth_data *mgmt_eth_data,
 	bool ack;
 	int ret;
 
+	if (!skb)
+		return -ENOMEM;
+
 	reinit_completion(&mgmt_eth_data->rw_done);
 
 	/* Increment seq_num and set it in the copy pkt */
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 04505bbbbb15da950ea0239e328a76a3ad2376e0 ]
Alyssa noticed that when building the kernel with CFI_CLANG+IBT and booting on IBT enabled hardware to obtain FineIBT, the indirect functions look like:
__cfi_foo:
	endbr64
	subl	$hash, %r10d
	jz	1f
	ud2
	nop
1:
foo:
	endbr64
This is because the compiler generates code for kCFI+IBT. In that case the caller does the hash check and will jump to +0, so there must be an ENDBR there. The compiler doesn't know about FineIBT at all; also it is possible to actually use kCFI+IBT when booting with 'cfi=kcfi' on IBT enabled hardware.
Having this second ENDBR however makes it possible to elide the CFI check. Therefore, we should poison this second ENDBR when switching to FineIBT mode.
Fixes: 931ab63664f0 ("x86/ibt: Implement FineIBT")
Reported-by: "Milburn, Alyssa" alyssa.milburn@intel.com
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org
Reviewed-by: Kees Cook keescook@chromium.org
Reviewed-by: Sami Tolvanen samitolvanen@google.com
Link: https://lore.kernel.org/r/20230615193722.194131053@infradead.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/x86/kernel/alternative.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index f615e0cb6d932..4e2c70f88e05b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -940,6 +940,17 @@ static int cfi_rewrite_preamble(s32 *start, s32 *end)
 	return 0;
 }
 
+static void cfi_rewrite_endbr(s32 *start, s32 *end)
+{
+	s32 *s;
+
+	for (s = start; s < end; s++) {
+		void *addr = (void *)s + *s;
+
+		poison_endbr(addr+16, false);
+	}
+}
+
 /* .retpoline_sites */
 static int cfi_rand_callers(s32 *start, s32 *end)
 {
@@ -1034,14 +1045,19 @@ static void __apply_fineibt(s32 *start_retpoline, s32 *end_retpoline,
 		return;
 
 	case CFI_FINEIBT:
+		/* place the FineIBT preamble at func()-16 */
 		ret = cfi_rewrite_preamble(start_cfi, end_cfi);
 		if (ret)
 			goto err;
 
+		/* rewrite the callers to target func()-16 */
 		ret = cfi_rewrite_callers(start_retpoline, end_retpoline);
 		if (ret)
 			goto err;
 
+		/* now that nobody targets func()+0, remove ENDBR there */
+		cfi_rewrite_endbr(start_cfi, end_cfi);
+
 		if (builtin)
 			pr_info("Using FineIBT CFI\n");
 		return;
On Fri, Jul 21, 2023 at 06:02:56PM +0200, Greg Kroah-Hartman wrote:
If you take this patch you should also take the patches from Brian that moves ret_from_fork() into C, otherwise you end up with a non-bootable kernel.
On Fri, Jul 21, 2023 at 08:51:35PM +0200, Peter Zijlstra wrote:
If you take this patch you should also take the patches from Brian that moves ret_from_fork() into C, otherwise you end up with a non-bootable kernel.
Thanks for letting me know, I've just dropped this patch instead for now.
greg k-h
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 028e6e204ace1f080cfeacd72c50397eb8ae8883 ]
The while-loop may break on one of two conditions: either the ID string is empty or the GUID matches. The second one may never be reached if the parsed string is not a valid GUID. In such a case the loop never advances to check the next ID.

Break the possible infinite loop by factoring out a guid_parse_and_compare() helper, which may later be moved to a generic header for everyone and which prevents similar mistakes in the future.

Interestingly, the bug first appeared when WMI was turned into a bus driver. Later, when the check for duplicate GUIDs was added, the while-loop was replaced by a for-loop, so the mistake was not made again there.
Fixes: a48e23385fcf ("platform/x86: wmi: add context pointer field to struct wmi_device_id") Fixes: 844af950da94 ("platform/x86: wmi: Turn WMI into a bus driver") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230621151155.78279-1-andriy.shevchenko@linux.int... Tested-by: Armin Wolf W_Armin@gmx.de Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/wmi.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index d81319a502efc..e1a3bfeeed529 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -136,6 +136,16 @@ static acpi_status find_guid(const char *guid_string, struct wmi_block **out)
 	return AE_NOT_FOUND;
 }
 
+static bool guid_parse_and_compare(const char *string, const guid_t *guid)
+{
+	guid_t guid_input;
+
+	if (guid_parse(string, &guid_input))
+		return false;
+
+	return guid_equal(&guid_input, guid);
+}
+
 static const void *find_guid_context(struct wmi_block *wblock,
 				     struct wmi_driver *wdriver)
 {
@@ -146,11 +156,7 @@ static const void *find_guid_context(struct wmi_block *wblock,
 		return NULL;
 
 	while (*id->guid_string) {
-		guid_t guid_input;
-
-		if (guid_parse(id->guid_string, &guid_input))
-			continue;
-		if (guid_equal(&wblock->gblock.guid, &guid_input))
+		if (guid_parse_and_compare(id->guid_string, &wblock->gblock.guid))
 			return id->context;
 		id++;
 	}
@@ -827,11 +833,7 @@ static int wmi_dev_match(struct device *dev, struct device_driver *driver)
 		return 0;
 
 	while (*id->guid_string) {
-		guid_t driver_guid;
-
-		if (WARN_ON(guid_parse(id->guid_string, &driver_guid)))
-			continue;
-		if (guid_equal(&driver_guid, &wblock->gblock.guid))
+		if (guid_parse_and_compare(id->guid_string, &wblock->gblock.guid))
 			return 1;
 
 		id++;
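For readers skimming the diff, the failure mode and the fix can be sketched in plain userspace C. This is only an illustration under stated assumptions: demo_id, demo_parse(), demo_parse_and_compare() and demo_find_context() are hypothetical stand-ins and not the kernel's guid_parse()/guid_equal() or the WMI structures. The point is that the old loop hit "continue" on a malformed ID before reaching "id++", while the helper folds the parse failure into "no match" so the cursor always advances.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ID table terminated by an empty string. */
struct demo_id {
	const char *guid_string;
	int context;
};

/* Hypothetical parser: accepts only 36-character strings and returns 0 on
 * success, mirroring the 0-on-success convention used in the patch. */
static int demo_parse(const char *s, const char **out)
{
	if (strlen(s) != 36)
		return -1;
	*out = s;
	return 0;
}

/* Parse and compare in one step; a malformed ID simply does not match. */
static bool demo_parse_and_compare(const char *string, const char *wanted)
{
	const char *parsed;

	if (demo_parse(string, &parsed))
		return false;
	return strcmp(parsed, wanted) == 0;
}

/* Returns the context of the first matching entry, or -1 if none matches.
 * There is no "continue" before "id++", so a malformed entry cannot
 * stall the loop. */
static int demo_find_context(const struct demo_id *id, const char *wanted)
{
	while (*id->guid_string) {
		if (demo_parse_and_compare(id->guid_string, wanted))
			return id->context;
		id++;
	}
	return -1;
}
```

With a table whose first entry is malformed, demo_find_context() still reaches the later entries, which is the behaviour the patch restores for find_guid_context() and wmi_dev_match().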
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 2d800bc500fb3fb07a0fb42e2d0a1356fb9e1e8f ]
Inspired by struct flow_cls_offload :: cmd: in order for taprio to be able to report statistics (which is future work), it seems that we need to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO) multiplexing and pass the command as part of the common portion of the muxed structure.
Since we already have an "enable" variable in tc_taprio_qopt_offload, refactor all drivers to check for "cmd" instead of "enable", and reject every other command except "replace" and "destroy" - to be future proof.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com
Reviewed-by: Horatiu Vultur horatiu.vultur@microchip.com # for lan966x
Acked-by: Kurt Kanzenbach kurt@linutronix.de # hellcreek
Reviewed-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
Reviewed-by: Gerhard Engleder gerhard@engleder-embedded.com
Signed-off-by: David S. Miller davem@davemloft.net
Stable-dep-of: 8046063df887 ("igc: Rename qbv_enable to taprio_offload_enable")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/dsa/hirschmann/hellcreek.c             | 14 +++++++++-----
 drivers/net/dsa/ocelot/felix_vsc9959.c             |  4 +++-
 drivers/net/dsa/sja1105/sja1105_tas.c              |  7 +++++--
 drivers/net/ethernet/engleder/tsnep_selftests.c    | 12 ++++++------
 drivers/net/ethernet/engleder/tsnep_tc.c           |  4 +++-
 drivers/net/ethernet/freescale/enetc/enetc_qos.c   |  6 +++++-
 drivers/net/ethernet/intel/igc/igc_main.c          | 13 +++++++++++--
 .../net/ethernet/microchip/lan966x/lan966x_tc.c    | 10 ++++++++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c    |  7 +++++--
 drivers/net/ethernet/ti/am65-cpsw-qos.c            | 11 ++++++++---
 include/net/pkt_sched.h                            |  7 ++++++-
 net/sched/sch_taprio.c                             |  4 ++--
 12 files changed, 71 insertions(+), 28 deletions(-)
diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c index 595a548bb0a80..af50001ccdd4e 100644 --- a/drivers/net/dsa/hirschmann/hellcreek.c +++ b/drivers/net/dsa/hirschmann/hellcreek.c @@ -1885,13 +1885,17 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port, case TC_SETUP_QDISC_TAPRIO: { struct tc_taprio_qopt_offload *taprio = type_data;
- if (!hellcreek_validate_schedule(hellcreek, taprio)) - return -EOPNOTSUPP; + switch (taprio->cmd) { + case TAPRIO_CMD_REPLACE: + if (!hellcreek_validate_schedule(hellcreek, taprio)) + return -EOPNOTSUPP;
- if (taprio->enable) return hellcreek_port_set_schedule(ds, port, taprio); - - return hellcreek_port_del_schedule(ds, port); + case TAPRIO_CMD_DESTROY: + return hellcreek_port_del_schedule(ds, port); + default: + return -EOPNOTSUPP; + } } default: return -EOPNOTSUPP; diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index bd11f9fb95e54..772f8b817390b 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -1436,7 +1436,7 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
mutex_lock(&ocelot->tas_lock);
- if (!taprio->enable) { + if (taprio->cmd == TAPRIO_CMD_DESTROY) { ocelot_port_mqprio(ocelot, port, &taprio->mqprio); ocelot_rmw_rix(ocelot, 0, QSYS_TAG_CONFIG_ENABLE, QSYS_TAG_CONFIG, port); @@ -1448,6 +1448,8 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
mutex_unlock(&ocelot->tas_lock); return 0; + } else if (taprio->cmd != TAPRIO_CMD_REPLACE) { + return -EOPNOTSUPP; }
ret = ocelot_port_mqprio(ocelot, port, &taprio->mqprio); diff --git a/drivers/net/dsa/sja1105/sja1105_tas.c b/drivers/net/dsa/sja1105/sja1105_tas.c index e6153848a9509..d7818710bc028 100644 --- a/drivers/net/dsa/sja1105/sja1105_tas.c +++ b/drivers/net/dsa/sja1105/sja1105_tas.c @@ -516,10 +516,11 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port, /* Can't change an already configured port (must delete qdisc first). * Can't delete the qdisc from an unconfigured port. */ - if (!!tas_data->offload[port] == admin->enable) + if ((!!tas_data->offload[port] && admin->cmd == TAPRIO_CMD_REPLACE) || + (!tas_data->offload[port] && admin->cmd == TAPRIO_CMD_DESTROY)) return -EINVAL;
- if (!admin->enable) { + if (admin->cmd == TAPRIO_CMD_DESTROY) { taprio_offload_free(tas_data->offload[port]); tas_data->offload[port] = NULL;
@@ -528,6 +529,8 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port, return rc;
return sja1105_static_config_reload(priv, SJA1105_SCHEDULING); + } else if (admin->cmd != TAPRIO_CMD_REPLACE) { + return -EOPNOTSUPP; }
/* The cycle time extension is the amount of time the last cycle from diff --git a/drivers/net/ethernet/engleder/tsnep_selftests.c b/drivers/net/ethernet/engleder/tsnep_selftests.c index 1581d6b222320..8a9145f93147c 100644 --- a/drivers/net/ethernet/engleder/tsnep_selftests.c +++ b/drivers/net/ethernet/engleder/tsnep_selftests.c @@ -329,7 +329,7 @@ static bool disable_taprio(struct tsnep_adapter *adapter) int retval;
memset(&qopt, 0, sizeof(qopt)); - qopt.enable = 0; + qopt.cmd = TAPRIO_CMD_DESTROY; retval = tsnep_tc_setup(adapter->netdev, TC_SETUP_QDISC_TAPRIO, &qopt); if (retval) return false; @@ -360,7 +360,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter) for (i = 0; i < 255; i++) qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
- qopt->enable = 1; + qopt->cmd = TAPRIO_CMD_REPLACE; qopt->base_time = ktime_set(0, 0); qopt->cycle_time = 1500000; qopt->cycle_time_extension = 0; @@ -382,7 +382,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter) if (!run_taprio(adapter, qopt, 100)) goto failed;
- qopt->enable = 1; + qopt->cmd = TAPRIO_CMD_REPLACE; qopt->base_time = ktime_set(0, 0); qopt->cycle_time = 411854; qopt->cycle_time_extension = 0; @@ -406,7 +406,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter) if (!run_taprio(adapter, qopt, 100)) goto failed;
- qopt->enable = 1; + qopt->cmd = TAPRIO_CMD_REPLACE; qopt->base_time = ktime_set(0, 0); delay_base_time(adapter, qopt, 12); qopt->cycle_time = 125000; @@ -457,7 +457,7 @@ static bool tsnep_test_taprio_change(struct tsnep_adapter *adapter) for (i = 0; i < 255; i++) qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
- qopt->enable = 1; + qopt->cmd = TAPRIO_CMD_REPLACE; qopt->base_time = ktime_set(0, 0); qopt->cycle_time = 100000; qopt->cycle_time_extension = 0; @@ -610,7 +610,7 @@ static bool tsnep_test_taprio_extension(struct tsnep_adapter *adapter) for (i = 0; i < 255; i++) qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
- qopt->enable = 1; + qopt->cmd = TAPRIO_CMD_REPLACE; qopt->base_time = ktime_set(0, 0); qopt->cycle_time = 100000; qopt->cycle_time_extension = 50000; diff --git a/drivers/net/ethernet/engleder/tsnep_tc.c b/drivers/net/ethernet/engleder/tsnep_tc.c index d083e6684f120..745b191a55402 100644 --- a/drivers/net/ethernet/engleder/tsnep_tc.c +++ b/drivers/net/ethernet/engleder/tsnep_tc.c @@ -325,7 +325,7 @@ static int tsnep_taprio(struct tsnep_adapter *adapter, if (!adapter->gate_control) return -EOPNOTSUPP;
- if (!qopt->enable) { + if (qopt->cmd == TAPRIO_CMD_DESTROY) { /* disable gate control if active */ mutex_lock(&adapter->gate_control_lock);
@@ -337,6 +337,8 @@ static int tsnep_taprio(struct tsnep_adapter *adapter, mutex_unlock(&adapter->gate_control_lock);
return 0; + } else if (qopt->cmd != TAPRIO_CMD_REPLACE) { + return -EOPNOTSUPP; }
retval = tsnep_validate_gcl(qopt); diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c index 126007ab70f61..dfec50106106f 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c @@ -65,7 +65,7 @@ static int enetc_setup_taprio(struct net_device *ndev, gcl_len = admin_conf->num_entries;
tge = enetc_rd(hw, ENETC_PTGCR); - if (!admin_conf->enable) { + if (admin_conf->cmd == TAPRIO_CMD_DESTROY) { enetc_wr(hw, ENETC_PTGCR, tge & ~ENETC_PTGCR_TGE); enetc_reset_ptcmsdur(hw);
@@ -138,6 +138,10 @@ int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data) struct enetc_ndev_priv *priv = netdev_priv(ndev); int err, i;
+ if (taprio->cmd != TAPRIO_CMD_REPLACE && + taprio->cmd != TAPRIO_CMD_DESTROY) + return -EOPNOTSUPP; + /* TSD and Qbv are mutually exclusive in hardware */ for (i = 0; i < priv->num_tx_rings; i++) if (priv->tx_ring[i]->tsd_enable) diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index e7bd2c60ee383..ae986e44a4718 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -6117,9 +6117,18 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter, size_t n; int i;
- adapter->qbv_enable = qopt->enable; + switch (qopt->cmd) { + case TAPRIO_CMD_REPLACE: + adapter->qbv_enable = true; + break; + case TAPRIO_CMD_DESTROY: + adapter->qbv_enable = false; + break; + default: + return -EOPNOTSUPP; + }
- if (!qopt->enable) + if (!adapter->qbv_enable) return igc_tsn_clear_schedule(adapter);
if (qopt->base_time < 0) diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c index cf0cc7562d042..ee652f2d23595 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c @@ -21,8 +21,14 @@ static int lan966x_tc_setup_qdisc_mqprio(struct lan966x_port *port, static int lan966x_tc_setup_qdisc_taprio(struct lan966x_port *port, struct tc_taprio_qopt_offload *taprio) { - return taprio->enable ? lan966x_taprio_add(port, taprio) : - lan966x_taprio_del(port); + switch (taprio->cmd) { + case TAPRIO_CMD_REPLACE: + return lan966x_taprio_add(port, taprio); + case TAPRIO_CMD_DESTROY: + return lan966x_taprio_del(port); + default: + return -EOPNOTSUPP; + } }
static int lan966x_tc_setup_qdisc_tbf(struct lan966x_port *port, diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c index 9d55226479b4a..ac41ef4cbd2f0 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c @@ -966,8 +966,11 @@ static int tc_setup_taprio(struct stmmac_priv *priv, return -EOPNOTSUPP; }
- if (!qopt->enable) + if (qopt->cmd == TAPRIO_CMD_DESTROY) goto disable; + else if (qopt->cmd != TAPRIO_CMD_REPLACE) + return -EOPNOTSUPP; + if (qopt->num_entries >= dep) return -EINVAL; if (!qopt->cycle_time) @@ -988,7 +991,7 @@ static int tc_setup_taprio(struct stmmac_priv *priv,
mutex_lock(&priv->plat->est->lock); priv->plat->est->gcl_size = size; - priv->plat->est->enable = qopt->enable; + priv->plat->est->enable = qopt->cmd == TAPRIO_CMD_REPLACE; mutex_unlock(&priv->plat->est->lock);
for (i = 0; i < size; i++) { diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.c b/drivers/net/ethernet/ti/am65-cpsw-qos.c index 3a908db6e5b22..eced87fa261c9 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-qos.c +++ b/drivers/net/ethernet/ti/am65-cpsw-qos.c @@ -450,7 +450,7 @@ static int am65_cpsw_configure_taprio(struct net_device *ndev,
am65_cpsw_est_update_state(ndev);
- if (!est_new->taprio.enable) { + if (est_new->taprio.cmd == TAPRIO_CMD_DESTROY) { am65_cpsw_stop_est(ndev); return ret; } @@ -476,7 +476,7 @@ static int am65_cpsw_configure_taprio(struct net_device *ndev, am65_cpsw_est_set_sched_list(ndev, est_new); am65_cpsw_port_est_assign_buf_num(ndev, est_new->buf);
- am65_cpsw_est_set(ndev, est_new->taprio.enable); + am65_cpsw_est_set(ndev, est_new->taprio.cmd == TAPRIO_CMD_REPLACE);
if (tact == TACT_PROG) { ret = am65_cpsw_timer_set(ndev, est_new); @@ -520,7 +520,7 @@ static int am65_cpsw_set_taprio(struct net_device *ndev, void *type_data) am65_cpsw_cp_taprio(taprio, &est_new->taprio); ret = am65_cpsw_configure_taprio(ndev, est_new); if (!ret) { - if (taprio->enable) { + if (taprio->cmd == TAPRIO_CMD_REPLACE) { devm_kfree(&ndev->dev, port->qos.est_admin);
port->qos.est_admin = est_new; @@ -564,8 +564,13 @@ static void am65_cpsw_est_link_up(struct net_device *ndev, int link_speed) static int am65_cpsw_setup_taprio(struct net_device *ndev, void *type_data) { struct am65_cpsw_port *port = am65_ndev_to_port(ndev); + struct tc_taprio_qopt_offload *taprio = type_data; struct am65_cpsw_common *common = port->common;
+ if (taprio->cmd != TAPRIO_CMD_REPLACE && + taprio->cmd != TAPRIO_CMD_DESTROY) + return -EOPNOTSUPP; + if (!IS_ENABLED(CONFIG_TI_AM65_CPSW_TAS)) return -ENODEV;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index 5722931d83d43..7dba1c3a7b801 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -187,6 +187,11 @@ struct tc_taprio_caps { bool broken_mqprio:1; };
+enum tc_taprio_qopt_cmd { + TAPRIO_CMD_REPLACE, + TAPRIO_CMD_DESTROY, +}; + struct tc_taprio_sched_entry { u8 command; /* TC_TAPRIO_CMD_* */
@@ -198,7 +203,7 @@ struct tc_taprio_sched_entry { struct tc_taprio_qopt_offload { struct tc_mqprio_qopt_offload mqprio; struct netlink_ext_ack *extack; - u8 enable; + enum tc_taprio_qopt_cmd cmd; ktime_t base_time; u64 cycle_time; u64 cycle_time_extension; diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index cf0e61ed92253..4caf80ddc6721 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -1527,7 +1527,7 @@ static int taprio_enable_offload(struct net_device *dev, "Not enough memory for enabling offload mode"); return -ENOMEM; } - offload->enable = 1; + offload->cmd = TAPRIO_CMD_REPLACE; offload->extack = extack; mqprio_qopt_reconstruct(dev, &offload->mqprio.qopt); offload->mqprio.extack = extack; @@ -1575,7 +1575,7 @@ static int taprio_disable_offload(struct net_device *dev, "Not enough memory to disable offload mode"); return -ENOMEM; } - offload->enable = 0; + offload->cmd = TAPRIO_CMD_DESTROY;
err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload); if (err < 0) {
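The driver-side pattern this patch applies everywhere can be summarised with a small standalone sketch. The enum mirrors the one added to include/net/pkt_sched.h above; struct demo_port and demo_setup_taprio() are hypothetical and only illustrate the shape of the dispatch, not any particular driver.

```c
#include <errno.h>

/* Mirrors the enum introduced by this patch. */
enum tc_taprio_qopt_cmd {
	TAPRIO_CMD_REPLACE,
	TAPRIO_CMD_DESTROY,
};

/* Hypothetical per-port state, for illustration only. */
struct demo_port {
	int schedule_active;
};

/* Dispatch on the command instead of testing a boolean "enable".
 * Any command the driver does not know about is rejected, so commands
 * added later cannot be silently misread as "replace" or "destroy". */
static int demo_setup_taprio(struct demo_port *port,
			     enum tc_taprio_qopt_cmd cmd)
{
	switch (cmd) {
	case TAPRIO_CMD_REPLACE:
		port->schedule_active = 1;
		return 0;
	case TAPRIO_CMD_DESTROY:
		port->schedule_active = 0;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

With the old "u8 enable" field, a driver using "if (!enable) ... else ..." would have treated any future nonzero value as a request to install a schedule; the explicit default branch is what makes the conversion future proof.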
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit 8046063df887bee35c002224267ba46f41be7cf6 ]
In the current implementation the flags adapter->qbv_enable and IGC_FLAG_TSN_QBV_ENABLED have a similar name, but do not have the same meaning. The first one is used only to indicate taprio offload (i.e. when igc_save_qbv_schedule was called), while the second one corresponds to the Qbv mode of the hardware. However, the second one is also used to support the TX launchtime feature, i.e. ETF qdisc offload. This leads to situations where adapter->qbv_enable is false, but the flag IGC_FLAG_TSN_QBV_ENABLED is set. This is prone to confusion.
The rename should reduce this confusion. Since it is a pure rename, it has no impact on functionality.
Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
Signed-off-by: Florian Kauer florian.kauer@linutronix.de
Reviewed-by: Kurt Kanzenbach kurt@linutronix.de
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc.h      | 2 +-
 drivers/net/ethernet/intel/igc/igc_main.c | 6 +++---
 drivers/net/ethernet/intel/igc/igc_tsn.c  | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index c0a07af36cb23..345d3a4e8ed44 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -191,7 +191,7 @@ struct igc_adapter {
 	int tc_setup_type;
 	ktime_t base_time;
 	ktime_t cycle_time;
-	bool qbv_enable;
+	bool taprio_offload_enable;
 	u32 qbv_config_change_errors;
 	bool qbv_transition;
 	unsigned int qbv_count;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index ae986e44a4718..6bed12224120f 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6119,16 +6119,16 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 
 	switch (qopt->cmd) {
 	case TAPRIO_CMD_REPLACE:
-		adapter->qbv_enable = true;
+		adapter->taprio_offload_enable = true;
 		break;
 	case TAPRIO_CMD_DESTROY:
-		adapter->qbv_enable = false;
+		adapter->taprio_offload_enable = false;
 		break;
 	default:
 		return -EOPNOTSUPP;
 	}
 
-	if (!adapter->qbv_enable)
+	if (!adapter->taprio_offload_enable)
 		return igc_tsn_clear_schedule(adapter);
 
 	if (qopt->base_time < 0)
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 3cdb0c9887283..b76ebfc10b1d5 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -37,7 +37,7 @@ static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter)
 {
 	unsigned int new_flags = adapter->flags & ~IGC_FLAG_TSN_ANY_ENABLED;
 
-	if (adapter->qbv_enable)
+	if (adapter->taprio_offload_enable)
 		new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
 
 	if (is_any_launchtime(adapter))
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit 82ff5f29b7377d614f0c01fd74b5d0cb225f0adc ]
Only set adapter->taprio_offload_enable after validating the arguments. Otherwise, it stays set even if the offload was not enabled. Since the subsequent code does not get executed in case of invalid arguments, it will not be read at first. However, by activating and then deactivating another offload (e.g. ETF/TX launchtime offload), taprio_offload_enable is read and erroneously keeps the offload feature of the NIC enabled.
This can be reproduced as follows:
# TAPRIO offload (flags == 0x2) and negative base-time leading to expected -ERANGE
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
	num_tc 1 \
	map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	queues 1@0 \
	base-time -1000 \
	sched-entry S 01 300000 \
	flags 0x2

# IGC_TQAVCTRL is 0x0 as expected (iomem=relaxed for reading register)
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1

# Activate ETF offload
sudo tc qdisc replace dev enp1s0 parent root handle 6666 mqprio \
	num_tc 3 \
	map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
	queues 1@0 1@1 2@2 \
	hw 0
sudo tc qdisc add dev enp1s0 parent 6666:1 etf \
	clockid CLOCK_TAI \
	delta 500000 \
	offload

# IGC_TQAVCTRL is 0x9 as expected
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1

# Deactivate ETF offload again
sudo tc qdisc delete dev enp1s0 parent 6666:1

# IGC_TQAVCTRL should now be 0x0 again, but is observed as 0x9
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1
Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
Signed-off-by: Florian Kauer florian.kauer@linutronix.de
Reviewed-by: Kurt Kanzenbach kurt@linutronix.de
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_main.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 6bed12224120f..f051ca733af1b 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6090,6 +6090,7 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
 
 	adapter->base_time = 0;
 	adapter->cycle_time = NSEC_PER_SEC;
+	adapter->taprio_offload_enable = false;
 	adapter->qbv_config_change_errors = 0;
 	adapter->qbv_transition = false;
 	adapter->qbv_count = 0;
@@ -6117,20 +6118,12 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 	size_t n;
 	int i;
 
-	switch (qopt->cmd) {
-	case TAPRIO_CMD_REPLACE:
-		adapter->taprio_offload_enable = true;
-		break;
-	case TAPRIO_CMD_DESTROY:
-		adapter->taprio_offload_enable = false;
-		break;
-	default:
-		return -EOPNOTSUPP;
-	}
-
-	if (!adapter->taprio_offload_enable)
+	if (qopt->cmd == TAPRIO_CMD_DESTROY)
 		return igc_tsn_clear_schedule(adapter);
 
+	if (qopt->cmd != TAPRIO_CMD_REPLACE)
+		return -EOPNOTSUPP;
+
 	if (qopt->base_time < 0)
 		return -ERANGE;
 
@@ -6142,6 +6135,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 
 	adapter->cycle_time = qopt->cycle_time;
 	adapter->base_time = qopt->base_time;
+	adapter->taprio_offload_enable = true;
 
 	igc_ptp_read(adapter, &now);
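The ordering problem fixed here is easy to see in isolation. The following userspace sketch is an illustration only: struct demo_adapter and both functions are hypothetical and reduce igc_save_qbv_schedule() to the single base_time check quoted in the diff.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced adapter state. */
struct demo_adapter {
	bool taprio_offload_enable;
	long long base_time;
};

/* Old ordering: the flag is set before the arguments are validated,
 * so it stays set when validation fails. */
static int demo_save_schedule_old(struct demo_adapter *a, long long base_time)
{
	a->taprio_offload_enable = true;
	if (base_time < 0)
		return -ERANGE;
	a->base_time = base_time;
	return 0;
}

/* New ordering: the flag is only set once the request has been accepted. */
static int demo_save_schedule_new(struct demo_adapter *a, long long base_time)
{
	if (base_time < 0)
		return -ERANGE;
	a->base_time = base_time;
	a->taprio_offload_enable = true;
	return 0;
}
```

After a rejected request, the old variant leaves taprio_offload_enable set, which is the stale state that a later flag recomputation (igc_tsn_new_flags() in the driver) then picks up.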
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit e5d88c53d03f8df864776431175d08c053645f50 ]
Since commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv") it is possible to enable taprio offload with a basetime of 0. However, the check if taprio offload is already enabled (and thus -EALREADY should be returned for igc_save_qbv_schedule) still relied on adapter->base_time > 0.
This can be reproduced as follows:
# TAPRIO offload (flags == 0x2) and base-time = 0
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
	num_tc 1 \
	map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	queues 1@0 \
	base-time 0 \
	sched-entry S 01 300000 \
	flags 0x2

# The second call should fail with "Error: Device failed to setup taprio offload."
# But that only happens if base-time was != 0
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
	num_tc 1 \
	map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	queues 1@0 \
	base-time 0 \
	sched-entry S 01 300000 \
	flags 0x2
Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
Signed-off-by: Florian Kauer florian.kauer@linutronix.de
Reviewed-by: Kurt Kanzenbach kurt@linutronix.de
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index f051ca733af1b..97eb3c390de9a 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6127,7 +6127,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 	if (qopt->base_time < 0)
 		return -ERANGE;
 
-	if (igc_is_device_id_i225(hw) && adapter->base_time)
+	if (igc_is_device_id_i225(hw) && adapter->taprio_offload_enable)
 		return -EALREADY;
 
 	if (!validate_schedule(adapter, qopt))
From: Tzvetomir Stoyanov (VMware) tz.stoyanov@gmail.com
[ Upstream commit cf0a624dc706c306294c14e6b3e7694702f25191 ]
The enable_trace_eprobe() function enables all event probes attached to a given trace probe. If an error occurs while enabling one of the event probes, all the others should be rolled back. There is a bug in that rollback logic: instead of all event probes, only the failed one is disabled.
Link: https://lore.kernel.org/all/20230703042853.1427493-1-tz.stoyanov@gmail.com/
Reported-by: Dan Carpenter dan.carpenter@linaro.org
Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Tzvetomir Stoyanov (VMware) tz.stoyanov@gmail.com
Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org
Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/trace/trace_eprobe.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index 67e854979d53e..3f04f0ffe0d70 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -675,6 +675,7 @@ static int enable_trace_eprobe(struct trace_event_call *call,
 	struct trace_eprobe *ep;
 	bool enabled;
 	int ret = 0;
+	int cnt = 0;
 
 	tp = trace_probe_primary_from_call(call);
 	if (WARN_ON_ONCE(!tp))
@@ -698,12 +699,25 @@ static int enable_trace_eprobe(struct trace_event_call *call,
 		if (ret)
 			break;
 		enabled = true;
+		cnt++;
 	}
 
 	if (ret) {
 		/* Failed to enable one of them. Roll back all */
-		if (enabled)
-			disable_eprobe(ep, file->tr);
+		if (enabled) {
+			/*
+			 * It's a bug if one failed for something other than memory
+			 * not being available but another eprobe succeeded.
+			 */
+			WARN_ON_ONCE(ret != -ENOMEM);
+
+			list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
+				ep = container_of(pos, struct trace_eprobe, tp);
+				disable_eprobe(ep, file->tr);
+				if (!--cnt)
+					break;
+			}
+		}
 		if (file)
 			trace_probe_remove_file(tp, file);
 		else
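The counting rollback can be sketched outside the kernel as follows. This is a simplified model under stated assumptions: an array replaces the probe list, and struct demo_probe, demo_enable_one() and demo_enable_all() are hypothetical names that do not exist in trace_eprobe.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: will_fail forces the enable step to fail. */
struct demo_probe {
	bool enabled;
	bool will_fail;
};

static int demo_enable_one(struct demo_probe *p)
{
	if (p->will_fail)
		return -ENOMEM;
	p->enabled = true;
	return 0;
}

/* Enable every probe in order. On failure, walk the same sequence again
 * from the start and disable exactly the cnt probes that had been
 * enabled, rather than only touching the one that failed. */
static int demo_enable_all(struct demo_probe *probes, size_t n)
{
	size_t cnt = 0, i;
	int ret = 0;

	for (i = 0; i < n; i++) {
		ret = demo_enable_one(&probes[i]);
		if (ret)
			break;
		cnt++;
	}

	if (ret) {
		for (i = 0; i < cnt; i++)
			probes[i].enabled = false;
	}
	return ret;
}
```

Counting successes and replaying the walk from the head works because the rollback visits entries in the same order in which they were enabled, which is also what the list_for_each_entry() loop in the patch relies on.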
From: Ze Gao zegao2021@gmail.com
[ Upstream commit 5f0c584daf7464f04114c65dd07269ee2bfedc13 ]
Unlock the ftrace recursion lock when fprobe_kprobe_handler() bails out because a kprobe is already running.
Link: https://lore.kernel.org/all/20230703092336.268371-1-zegao@tencent.com/
Fixes: 3cc4e2c5fbae ("fprobe: make fprobe_kprobe_handler recursion free")
Reported-by: Yafang laoar.shao@gmail.com
Closes: https://lore.kernel.org/linux-trace-kernel/CALOAHbC6UpfFOOibdDiC7xFc5YFUgZnk...
Signed-off-by: Ze Gao zegao@tencent.com
Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Acked-by: Yafang Shao laoar.shao@gmail.com
Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org
Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/trace/fprobe.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 18d36842faf57..93b3e361bb97a 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -102,12 +102,14 @@ static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip,
 
 	if (unlikely(kprobe_running())) {
 		fp->nmissed++;
-		return;
+		goto recursion_unlock;
 	}
 
 	kprobe_busy_begin();
 	__fprobe_handler(ip, parent_ip, ops, fregs);
 	kprobe_busy_end();
+
+recursion_unlock:
 	ftrace_test_recursion_unlock(bit);
 }
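The single-exit pattern used by the fix can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: struct demo_state and demo_handler() are hypothetical, and a plain counter stands in for the ftrace recursion lock.

```c
#include <stdbool.h>

/* Hypothetical state: lock_depth stands in for the recursion lock. */
struct demo_state {
	int lock_depth;
	unsigned long nmissed;
	int handled;
};

/* Every path that runs after the lock is taken leaves through the same
 * label, so the early "kprobe is running" exit cannot leak the lock. */
static void demo_handler(struct demo_state *s, bool kprobe_running)
{
	s->lock_depth++;	/* stand-in for taking the recursion lock */

	if (kprobe_running) {
		s->nmissed++;
		goto recursion_unlock;
	}

	s->handled++;		/* stand-in for the real handler work */

recursion_unlock:
	s->lock_depth--;	/* stand-in for releasing the lock */
}
```

With the old bare "return", the equivalent of lock_depth would stay at 1 after a missed invocation, which is the imbalance the patch removes.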
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit 8b86f10ab64eca0287ea8f7c94e9ad8b2e101c01 ]
The flags IGC_TXQCTL_STRICT_CYCLE and IGC_TXQCTL_STRICT_END prevent the packet transmission over slot and cycle boundaries. This is important for taprio offload where the slots and cycles correspond to the slots and cycles configured for the network.
However, the Qbv offload feature of the i225 is also used for enabling TX launchtime / ETF offload. In that case, however, the cycle has no meaning for the network and is only used internally to adapt the base time register after a second has passed.
Enabling strict mode in this case would unnecessarily prevent the transmission of certain packets (i.e. at the boundary of a second) and thus interferes with the ETF qdisc that promises transmission at a certain point in time.
Similar to ETF, this also applies to CBS offload that also should not be influenced by strict mode unless taprio offload would be enabled at the same time.
This fully reverts commit d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling") but its commit message only describes what was already implemented before that commit. The difference to a plain revert of that commit is that it now copes with the base_time = 0 case that was fixed with commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
In particular, enabling strict mode leads to TX hang situations under high traffic if taprio is applied WITHOUT taprio offload but WITH ETF offload, e.g. as in
sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
	num_tc 1 \
	map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
	queues 1@0 \
	base-time 0 \
	sched-entry S 01 300000 \
	flags 0x1 \
	txtime-delay 500000 \
	clockid CLOCK_TAI
sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
	clockid CLOCK_TAI \
	delta 500000 \
	offload \
	skip_sock_check
and traffic generator
sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns
with traffic.cfg
#define ETH_P_IP 0x0800

{
  /* Ethernet Header */
  0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed
  0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed
  const16(ETH_P_IP),

  /* IPv4 Header */
  0b01000101, 0, # IPv4 version, IHL, TOS
  const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
  const16(2), # IPv4 ident
  0b01000000, 0, # IPv4 flags, fragmentation off
  64, # IPv4 TTL
  17, # Protocol UDP
  csumip(14, 33), # IPv4 checksum

  /* UDP Header */
  10, 0, 48, 1, # IP Src - adapt as needed
  10, 0, 48, 10, # IP Dest - adapt as needed
  const16(5555), # UDP Src Port
  const16(6666), # UDP Dest Port
  const16(1008), # UDP length (UDP header 8 bytes + payload length)
  csumudp(14, 34), # UDP checksum

  /* Payload */
  fill('W', 1000),
}
and the observed message with that is for example
igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
  Tx Queue <0>
  TDH <d0>
  TDT <f0>
  next_to_use <f0>
  next_to_clean <d0>
buffer_info[next_to_clean]
  time_stamp <ffff661f>
  next_to_watch <00000000245a4efb>
  jiffies <ffff6e48>
  desc.status <1048000>
Fixes: d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling")
Signed-off-by: Florian Kauer florian.kauer@linutronix.de
Reviewed-by: Kurt Kanzenbach kurt@linutronix.de
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_tsn.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index b76ebfc10b1d5..a9c08321aca90 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -132,8 +132,28 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		wr32(IGC_STQT(i), ring->start_time);
 		wr32(IGC_ENDQT(i), ring->end_time);
 
-		txqctl |= IGC_TXQCTL_STRICT_CYCLE |
-			IGC_TXQCTL_STRICT_END;
+		if (adapter->taprio_offload_enable) {
+			/* If taprio_offload_enable is set we are in "taprio"
+			 * mode and we need to be strict about the
+			 * cycles: only transmit a packet if it can be
+			 * completed during that cycle.
+			 *
+			 * If taprio_offload_enable is NOT true when
+			 * enabling TSN offload, the cycle should have
+			 * no external effects, but is only used internally
+			 * to adapt the base time register after a second
+			 * has passed.
+			 *
+			 * Enabling strict mode in this case would
+			 * unnecessarily prevent the transmission of
+			 * certain packets (i.e. at the boundary of a
+			 * second) and thus interfere with the launchtime
+			 * feature that promises transmission at a
+			 * certain point in time.
+			 */
+			txqctl |= IGC_TXQCTL_STRICT_CYCLE |
+				IGC_TXQCTL_STRICT_END;
+		}
 
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit c1bca9ac0bcb355be11354c2e68bc7bf31f5ac5a ]
It is possible (verified on a running system) that frames are processed by igc_tx_launchtime with a txtime before the start of the cycle (baset_est).
However, the result of txtime - baset_est is written into a u32, leading to a wrap around to a positive number. The following launchtime > 0 check will only branch to executing launchtime = 0 if launchtime is already 0.
Fix it by using a s32 before checking launchtime > 0.
Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
Signed-off-by: Florian Kauer florian.kauer@linutronix.de
Reviewed-by: Kurt Kanzenbach kurt@linutronix.de
Tested-by: Naama Meir naamax.meir@linux.intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 97eb3c390de9a..96a2f6e6f6b8a 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1016,7 +1016,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime, ktime_t base_time = adapter->base_time; ktime_t now = ktime_get_clocktai(); ktime_t baset_est, end_of_cycle; - u32 launchtime; + s32 launchtime; s64 n;
n = div64_s64(ktime_sub_ns(now, base_time), cycle_time);
From: Florian Kauer florian.kauer@linutronix.de
[ Upstream commit 0bcc62858d6ba62cbade957d69745e6adeed5f3d ]
The insertion of an empty frame was introduced with commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") in order to ensure that the current cycle has at least one packet if there is some packet to be scheduled for the next cycle.
However, the current implementation does not properly check if a packet is already scheduled for the current cycle. Currently, an empty packet is always inserted if and only if txtime >= end_of_cycle && txtime > last_tx_cycle but since last_tx_cycle is always either the end of the current cycle (end_of_cycle) or the end of a previous cycle, the second part (txtime > last_tx_cycle) is always true unless txtime == last_tx_cycle.
What actually needs to be checked here is if the last_tx_cycle was already written within the current cycle, so an empty frame should only be inserted if and only if txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.
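The difference between the two conditions can be sketched as a pair of predicates. This is only an illustration under stated assumptions: the function names are hypothetical, and plain `int64_t` values stand in for the `ktime_t` comparisons done with ktime_compare() in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition before the fix: compares txtime against last_tx_cycle. */
static bool insert_empty_old(int64_t txtime, int64_t end_of_cycle,
			     int64_t last_tx_cycle)
{
	return txtime >= end_of_cycle && txtime > last_tx_cycle;
}

/* Condition after the fix: compares end_of_cycle against last_tx_cycle. */
static bool insert_empty_new(int64_t txtime, int64_t end_of_cycle,
			     int64_t last_tx_cycle)
{
	return txtime >= end_of_cycle && end_of_cycle > last_tx_cycle;
}
```

When last_tx_cycle already equals end_of_cycle (a packet was written in the current cycle) and txtime lies in the next cycle, the old predicate still requests an empty frame, whereas the new one does not.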
This patch not only avoids an unnecessary insertion. Inserting an empty packet when packets are already scheduled in the current cycle can actually be harmful, because it can lead to a situation where the empty packet is processed as the first packet in the upcoming cycle, shifting the packet with the first_flag one more cycle into the future and finally leading to a TX hang.
The TX hang can be reproduced on a i225 with:
    sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
        num_tc 1 \
        map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
        queues 1@0 \
        base-time 0 \
        sched-entry S 01 300000 \
        flags 0x1 \
        txtime-delay 500000 \
        clockid CLOCK_TAI
    sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
        clockid CLOCK_TAI \
        delta 500000 \
        offload \
        skip_sock_check
and traffic generator
sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns
with traffic.cfg
    #define ETH_P_IP 0x0800

    {
      /* Ethernet Header */
      0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e,  # MAC Dest - adapt as needed
      0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36,  # MAC Src  - adapt as needed
      const16(ETH_P_IP),

      /* IPv4 Header */
      0b01000101, 0,   # IPv4 version, IHL, TOS
      const16(1028),   # IPv4 total length (UDP length + 20 bytes (IP header))
      const16(2),      # IPv4 ident
      0b01000000, 0,   # IPv4 flags, fragmentation off
      64,              # IPv4 TTL
      17,              # Protocol UDP
      csumip(14, 33),  # IPv4 checksum

      /* UDP Header */
      10,  0, 48,  1,  # IP Src - adapt as needed
      10,  0, 48, 10,  # IP Dest - adapt as needed
      const16(5555),   # UDP Src Port
      const16(6666),   # UDP Dest Port
      const16(1008),   # UDP length (UDP header 8 bytes + payload length)
      csumudp(14, 34), # UDP checksum

      /* Payload */
      fill('W', 1000),
    }
and the observed message with that is for example
    igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
      Tx Queue             <0>
      TDH                  <32>
      TDT                  <3c>
      next_to_use          <3c>
      next_to_clean        <32>
    buffer_info[next_to_clean]
      time_stamp           <ffff26a8>
      next_to_watch        <00000000632a1828>
      jiffies              <ffff27f8>
      desc.status          <1048000>
Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit") Signed-off-by: Florian Kauer florian.kauer@linutronix.de Reviewed-by: Kurt Kanzenbach kurt@linutronix.de Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 96a2f6e6f6b8a..44aa4342cbbb5 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -1029,7 +1029,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime, *first_flag = true; ring->last_ff_cycle = baset_est;
- if (ktime_compare(txtime, ring->last_tx_cycle) > 0) + if (ktime_compare(end_of_cycle, ring->last_tx_cycle) > 0) *insert_empty = true; } }
From: Ankit Kumar ankit.kumar@samsung.com
[ Upstream commit b938e6603660652dc3db66d3c915fbfed3bce21d ]
As per NVMe command set specification 1.0c Storage tag size is 7 bits.
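The effect of the narrower mask can be sketched as follows. This is an illustrative fragment, not kernel code: the helper names and the sample field value are assumptions, while the mask and shift values are taken from the diff in this patch.

```c
#include <stdint.h>

enum {
	STS_MASK_OLD = 0x3f,	/* 6 bits: drops bit 6 of the storage tag size */
	STS_MASK_NEW = 0x7f,	/* 7 bits, as stated in the commit message */
	GUARD_SHIFT  = 7,
	GUARD_MASK   = 0x3,
};

/* Storage tag size as extracted with the old 6-bit mask. */
static unsigned int sts_old(uint32_t elbaf)
{
	return elbaf & STS_MASK_OLD;
}

/* Storage tag size as extracted with the corrected 7-bit mask. */
static unsigned int sts_new(uint32_t elbaf)
{
	return elbaf & STS_MASK_NEW;
}

/* Guard type occupies the bits directly above the storage tag size. */
static unsigned int guard_type(uint32_t elbaf)
{
	return (elbaf >> GUARD_SHIFT) & GUARD_MASK;
}
```

For a field value carrying a storage tag size of 64, the old mask reports 0, while the new mask reports 64 and leaves the guard type bits untouched.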
Fixes: 4020aad85c67 ("nvme: add support for enhanced metadata") Signed-off-by: Ankit Kumar ankit.kumar@samsung.com Reviewed-by: Kanchan Joshi joshi.k@samsung.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/nvme.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/nvme.h b/include/linux/nvme.h index 779507ac750b8..2819d6c3a6b5d 100644 --- a/include/linux/nvme.h +++ b/include/linux/nvme.h @@ -473,7 +473,7 @@ struct nvme_id_ns_nvm { };
enum { - NVME_ID_NS_NVM_STS_MASK = 0x3f, + NVME_ID_NS_NVM_STS_MASK = 0x7f, NVME_ID_NS_NVM_GUARD_SHIFT = 7, NVME_ID_NS_NVM_GUARD_MASK = 0x3, };
From: Stafford Horne shorne@gmail.com
[ Upstream commit dceaafd668812115037fc13a1893d068b7b880f5 ]
With commit 27267655c531 ("openrisc: Support floating point user api") I added an entry to the struct sigcontext which caused an unwanted change to the userspace ABI.
To fix this, we use the space of the previously unused oldmask field for the floating point fpcsr state. We do this with a union, which restores the pre-v6.4 kernel ABI while keeping API compatibility.
This does mean if there is some code somewhere that is setting oldmask in an OpenRISC specific userspace sighandler it would end up setting the floating point register status, but I think it's unlikely as oldmask was never functional before.
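The layout argument can be checked with a small userspace sketch. The struct names and the `regs_stub` stand-in for struct user_regs_struct are hypothetical and chosen only for illustration; the point is that placing fpcsr in a union with oldmask leaves the size and member offsets of the structure unchanged.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct user_regs_struct. */
struct regs_stub {
	unsigned long gpr[32];
	unsigned long pc;
	unsigned long sr;
};

/* Layout before floating point support was added. */
struct sigcontext_pre_fpu {
	struct regs_stub regs;
	unsigned long oldmask;
};

/* Layout after this fix: fpcsr shares storage with the unused oldmask. */
struct sigcontext_fixed {
	struct regs_stub regs;
	union {
		unsigned long fpcsr;
		unsigned long oldmask;	/* unused */
	};
};

/* Returns 1 if fpcsr and oldmask occupy the same offset. */
static int fpcsr_aliases_oldmask(void)
{
	return offsetof(struct sigcontext_fixed, fpcsr) ==
	       offsetof(struct sigcontext_fixed, oldmask);
}
```

Because both structures have the same size and the same offset for the trailing word, userspace built against the older layout keeps working.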
Fixes: 27267655c531 ("openrisc: Support floating point user api") Reported-by: Szabolcs Nagy nsz@port70.net Closes: https://lore.kernel.org/openrisc/20230626213840.GA1236108@port70.net/ Signed-off-by: Stafford Horne shorne@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/openrisc/include/uapi/asm/sigcontext.h | 6 ++++-- arch/openrisc/kernel/signal.c | 4 ++-- 2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/openrisc/include/uapi/asm/sigcontext.h b/arch/openrisc/include/uapi/asm/sigcontext.h index ca585e4af6b8e..e7ffb58ff58fb 100644 --- a/arch/openrisc/include/uapi/asm/sigcontext.h +++ b/arch/openrisc/include/uapi/asm/sigcontext.h @@ -28,8 +28,10 @@
struct sigcontext { struct user_regs_struct regs; /* needs to be first */ - struct __or1k_fpu_state fpu; - unsigned long oldmask; + union { + unsigned long fpcsr; + unsigned long oldmask; /* unused */ + }; };
#endif /* __ASM_OPENRISC_SIGCONTEXT_H */ diff --git a/arch/openrisc/kernel/signal.c b/arch/openrisc/kernel/signal.c index 4664a18f0787d..2e7257a433ff4 100644 --- a/arch/openrisc/kernel/signal.c +++ b/arch/openrisc/kernel/signal.c @@ -50,7 +50,7 @@ static int restore_sigcontext(struct pt_regs *regs, err |= __copy_from_user(regs, sc->regs.gpr, 32 * sizeof(unsigned long)); err |= __copy_from_user(&regs->pc, &sc->regs.pc, sizeof(unsigned long)); err |= __copy_from_user(&regs->sr, &sc->regs.sr, sizeof(unsigned long)); - err |= __copy_from_user(&regs->fpcsr, &sc->fpu.fpcsr, sizeof(unsigned long)); + err |= __copy_from_user(&regs->fpcsr, &sc->fpcsr, sizeof(unsigned long));
/* make sure the SM-bit is cleared so user-mode cannot fool us */ regs->sr &= ~SPR_SR_SM; @@ -113,7 +113,7 @@ static int setup_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc) err |= __copy_to_user(sc->regs.gpr, regs, 32 * sizeof(unsigned long)); err |= __copy_to_user(&sc->regs.pc, &regs->pc, sizeof(unsigned long)); err |= __copy_to_user(&sc->regs.sr, &regs->sr, sizeof(unsigned long)); - err |= __copy_to_user(&sc->fpu.fpcsr, &regs->fpcsr, sizeof(unsigned long)); + err |= __copy_to_user(&sc->fpcsr, &regs->fpcsr, sizeof(unsigned long));
return err; }
From: Björn Töpel bjorn@rivosinc.com
[ Upstream commit c56fb2aab23505bb7160d06097c8de100b82b851 ]
In order to generate the prologue and epilogue, the BPF JIT needs to know which registers are clobbered. Therefore, during the pre-final passes, the prologue is generated after the body of the program: body-prologue-epilogue. Then, in the final pass, a proper prologue-body-epilogue JITted image is generated.
This scheme has worked most of the time. However, for some large programs with many jumps, e.g. the test_kmod.sh BPF selftest with hardening enabled (blinding constants), it has been shown to be incorrect. In the final pass, when the proper prologue-body-epilogue is generated, the image has not converged. As a result, the final image has incorrect jump offsets. The following is an excerpt from an incorrect image:
  | ...
  |     3b8:  00c50663    beq         a0,a2,3c4 <.text+0x3c4>
  |     3bc:  0020e317    auipc       t1,0x20e
  |     3c0:  49630067    jalr        zero,1174(t1) # 20e852 <.text+0x20e852>
  | ...
  |  20e84c:  8796        c.mv        a5,t0
  |  20e84e:  6422        c.ldsp      s0,8(sp)    # Epilogue start
  |  20e850:  6141        c.addi16sp  sp,16
  |  20e852:  853e        c.mv        a0,a5       # Incorrect jump target
  |  20e854:  8082        c.jr        ra
The image has shrunk, and the epilogue offset is incorrect in the final pass.
Correct the problem by always generating proper prologue-body-epilogue outputs, which means that the first pass will only generate the body to track which registers are touched.
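The role of the prologue length in the branch offset computation can be sketched as follows. This is a simplified model of the rv_offset() change in this patch, under stated assumptions: the struct and function names are hypothetical, the result is left in instruction units (the conversion to a byte offset is omitted), and offset[i] is taken to be the position just past BPF instruction i, counted from the start of the image including the prologue.

```c
/* Hypothetical, reduced JIT context: only what the offset computation needs. */
struct jit_ctx_stub {
	int prologue_len;	/* number of RV instructions in the prologue */
	const int *offset;	/* offset[i]: end of BPF insn i, prologue included */
};

/* Distance from BPF insn 'insn' to the target of a branch with offset 'off'. */
static int branch_delta(int insn, int off, const struct jit_ctx_stub *ctx)
{
	int from, to;

	off++; /* BPF branch is from PC+1, RV is from PC */
	/* The first BPF instruction starts right after the prologue, not at 0. */
	from = (insn > 0) ? ctx->offset[insn - 1] : ctx->prologue_len;
	to = (insn + off > 0) ? ctx->offset[insn + off - 1] : ctx->prologue_len;
	return to - from;
}
```

With the prologue emitted first in every pass, the same offset table serves both the converging passes and the final image, so no separate prologue correction is needed at the end.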
Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G") Signed-off-by: Björn Töpel bjorn@rivosinc.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20230710074131.19596-1-bjorn@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/net/bpf_jit.h | 6 +++--- arch/riscv/net/bpf_jit_core.c | 19 +++++++++++++------ 2 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h index bf9802a63061d..2717f54904287 100644 --- a/arch/riscv/net/bpf_jit.h +++ b/arch/riscv/net/bpf_jit.h @@ -69,7 +69,7 @@ struct rv_jit_context { struct bpf_prog *prog; u16 *insns; /* RV insns */ int ninsns; - int body_len; + int prologue_len; int epilogue_offset; int *offset; /* BPF to RV */ int nexentries; @@ -216,8 +216,8 @@ static inline int rv_offset(int insn, int off, struct rv_jit_context *ctx) int from, to;
off++; /* BPF branch is from PC+1, RV is from PC */ - from = (insn > 0) ? ctx->offset[insn - 1] : 0; - to = (insn + off > 0) ? ctx->offset[insn + off - 1] : 0; + from = (insn > 0) ? ctx->offset[insn - 1] : ctx->prologue_len; + to = (insn + off > 0) ? ctx->offset[insn + off - 1] : ctx->prologue_len; return ninsns_rvoff(to - from); }
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c index 737baf8715da7..7a26a3e1c73cf 100644 --- a/arch/riscv/net/bpf_jit_core.c +++ b/arch/riscv/net/bpf_jit_core.c @@ -44,7 +44,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) unsigned int prog_size = 0, extable_size = 0; bool tmp_blinded = false, extra_pass = false; struct bpf_prog *tmp, *orig_prog = prog; - int pass = 0, prev_ninsns = 0, prologue_len, i; + int pass = 0, prev_ninsns = 0, i; struct rv_jit_data *jit_data; struct rv_jit_context *ctx;
@@ -83,6 +83,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) prog = orig_prog; goto out_offset; } + + if (build_body(ctx, extra_pass, NULL)) { + prog = orig_prog; + goto out_offset; + } + for (i = 0; i < prog->len; i++) { prev_ninsns += 32; ctx->offset[i] = prev_ninsns; @@ -91,12 +97,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) for (i = 0; i < NR_JIT_ITERATIONS; i++) { pass++; ctx->ninsns = 0; + + bpf_jit_build_prologue(ctx); + ctx->prologue_len = ctx->ninsns; + if (build_body(ctx, extra_pass, ctx->offset)) { prog = orig_prog; goto out_offset; } - ctx->body_len = ctx->ninsns; - bpf_jit_build_prologue(ctx); + ctx->epilogue_offset = ctx->ninsns; bpf_jit_build_epilogue(ctx);
@@ -162,10 +171,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
if (!prog->is_func || extra_pass) { bpf_jit_binary_lock_ro(jit_data->header); - prologue_len = ctx->epilogue_offset - ctx->body_len; for (i = 0; i < prog->len; i++) - ctx->offset[i] = ninsns_rvoff(prologue_len + - ctx->offset[i]); + ctx->offset[i] = ninsns_rvoff(ctx->offset[i]); bpf_prog_fill_jited_linfo(prog, ctx->offset); out_offset: kfree(ctx->offset);
From: Wei Fang wei.fang@nxp.com
[ Upstream commit 2ae9c66b04554bf5b3eeaab8c12a0bfb9f28ebde ]
This patch is a cleanup for fec driver. The fec_enet_reset_skb() is used to free skb buffers for tx queues and is only invoked in fec_restart(). However, fec_enet_bd_init() also resets skb buffers and is invoked in fec_restart() too. So fec_enet_reset_skb() is redundant and useless.
Signed-off-by: Wei Fang wei.fang@nxp.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 20f797399035 ("net: fec: recycle pages for transmitted XDP frames") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 21 --------------------- 1 file changed, 21 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 38e5b5abe067c..c08331f7da7b3 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1011,24 +1011,6 @@ static void fec_enet_enable_ring(struct net_device *ndev) } }
-static void fec_enet_reset_skb(struct net_device *ndev) -{ - struct fec_enet_private *fep = netdev_priv(ndev); - struct fec_enet_priv_tx_q *txq; - int i, j; - - for (i = 0; i < fep->num_tx_queues; i++) { - txq = fep->tx_queue[i]; - - for (j = 0; j < txq->bd.ring_size; j++) { - if (txq->tx_skbuff[j]) { - dev_kfree_skb_any(txq->tx_skbuff[j]); - txq->tx_skbuff[j] = NULL; - } - } - } -} - /* * This function is called to start or restart the FEC during a link * change, transmit timeout, or to reconfigure the FEC. The network @@ -1071,9 +1053,6 @@ fec_restart(struct net_device *ndev)
fec_enet_enable_ring(ndev);
- /* Reset tx SKB buffers. */ - fec_enet_reset_skb(ndev); - /* Enable MII mode */ if (fep->full_duplex == DUPLEX_FULL) { /* FD enable */
From: Wei Fang wei.fang@nxp.com
[ Upstream commit bc638eabfed90fdc798fd5765e67e41abea76152 ]
The last_bdp is initialized to bdp, and both last_bdp and bdp are not changed. That is to say that last_bdp and bdp are always equal. So bdp can be used directly.
Signed-off-by: Wei Fang wei.fang@nxp.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230529022615.669589-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Stable-dep-of: 20f797399035 ("net: fec: recycle pages for transmitted XDP frames") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index c08331f7da7b3..40d71be45f604 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3770,7 +3770,7 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, struct xdp_frame *frame) { unsigned int index, status, estatus; - struct bufdesc *bdp, *last_bdp; + struct bufdesc *bdp; dma_addr_t dma_addr; int entries_free;
@@ -3782,7 +3782,6 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
/* Fill in a Tx ring entry */ bdp = txq->bd.cur; - last_bdp = bdp; status = fec16_to_cpu(bdp->cbd_sc); status &= ~BD_ENET_TX_STATS;
@@ -3810,7 +3809,6 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, ebdp->cbd_esc = cpu_to_fec32(estatus); }
- index = fec_enet_get_bd_index(last_bdp, &txq->bd); txq->tx_skbuff[index] = NULL;
/* Make sure the updates to rest of the descriptor are performed before @@ -3825,7 +3823,7 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, bdp->cbd_sc = cpu_to_fec16(status);
/* If this was the last BD in the ring, start at the beginning again. */ - bdp = fec_enet_get_nextdesc(last_bdp, &txq->bd); + bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
/* Make sure the update to bdp are performed before txq->bd.cur. */ dma_wmb();
From: Wei Fang wei.fang@nxp.com
[ Upstream commit 20f797399035a8052dbd7297fdbe094079a9482e ]
Once the XDP frames have been successfully transmitted through the ndo_xdp_xmit() interface, it is the driver's responsibility to free the frames so that the page_pool can recycle the pages and reuse them. However, this is not implemented in the fec driver. This leads to a user-visible problem: the console prints the following warning log.
[ 157.568851] page_pool_release_retry() stalled pool shutdown 1389 inflight 60 sec
[ 217.983446] page_pool_release_retry() stalled pool shutdown 1389 inflight 120 sec
[ 278.399006] page_pool_release_retry() stalled pool shutdown 1389 inflight 181 sec
[ 338.812885] page_pool_release_retry() stalled pool shutdown 1389 inflight 241 sec
[ 399.226946] page_pool_release_retry() stalled pool shutdown 1389 inflight 302 sec
Therefore, to solve this issue, we free XDP frames via xdp_return_frame() while cleaning the tx BD ring.
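The per-entry bookkeeping this patch introduces is a tagged union: each tx ring slot records whether it holds an skb or an XDP frame, and the cleanup path releases it accordingly. The following is a minimal userspace sketch of that pattern. All `_stub` names are hypothetical stand-ins; the real driver calls dev_kfree_skb_any() and xdp_return_frame() and also handles DMA unmapping, which is omitted here.

```c
#include <stddef.h>

struct sk_buff_stub { int freed; };		/* stand-in for struct sk_buff */
struct xdp_frame_stub { int returned; };	/* stand-in for struct xdp_frame */

enum txbuf_type_stub {
	TXBUF_T_SKB_STUB,	/* default type */
	TXBUF_T_XDP_NDO_STUB,
};

struct tx_buffer_stub {
	union {
		struct sk_buff_stub *skb;
		struct xdp_frame_stub *xdp;
	};
	enum txbuf_type_stub type;
};

static void free_skb_stub(struct sk_buff_stub *skb) { skb->freed = 1; }
static void return_xdp_stub(struct xdp_frame_stub *xdp) { xdp->returned = 1; }

/* Release whatever the slot holds and restore the default (skb) type. */
static void tx_buffer_release(struct tx_buffer_stub *buf)
{
	if (buf->type == TXBUF_T_SKB_STUB) {
		if (buf->skb) {
			free_skb_stub(buf->skb);
			buf->skb = NULL;
		}
	} else {
		if (buf->xdp) {
			return_xdp_stub(buf->xdp);
			buf->xdp = NULL;
		}
		buf->type = TXBUF_T_SKB_STUB;
	}
}
```

Resetting the type to the skb default after releasing an XDP frame mirrors what the diff does in fec_enet_bd_init(), fec_enet_tx_queue() and fec_enet_free_buffers().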
Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang wei.fang@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec.h | 15 ++- drivers/net/ethernet/freescale/fec_main.c | 148 +++++++++++++++------- 2 files changed, 115 insertions(+), 48 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 9939ccafb5566..8c0226d061fec 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -544,10 +544,23 @@ enum { XDP_STATS_TOTAL, };
+enum fec_txbuf_type { + FEC_TXBUF_T_SKB, + FEC_TXBUF_T_XDP_NDO, +}; + +struct fec_tx_buffer { + union { + struct sk_buff *skb; + struct xdp_frame *xdp; + }; + enum fec_txbuf_type type; +}; + struct fec_enet_priv_tx_q { struct bufdesc_prop bd; unsigned char *tx_bounce[TX_RING_SIZE]; - struct sk_buff *tx_skbuff[TX_RING_SIZE]; + struct fec_tx_buffer tx_buf[TX_RING_SIZE];
unsigned short tx_stop_threshold; unsigned short tx_wake_threshold; diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 40d71be45f604..e6ed36e5daefa 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -397,7 +397,7 @@ static void fec_dump(struct net_device *ndev) fec16_to_cpu(bdp->cbd_sc), fec32_to_cpu(bdp->cbd_bufaddr), fec16_to_cpu(bdp->cbd_datlen), - txq->tx_skbuff[index]); + txq->tx_buf[index].skb); bdp = fec_enet_get_nextdesc(bdp, &txq->bd); index++; } while (bdp != txq->bd.base); @@ -654,7 +654,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
index = fec_enet_get_bd_index(last_bdp, &txq->bd); /* Save skb pointer */ - txq->tx_skbuff[index] = skb; + txq->tx_buf[index].skb = skb;
/* Make sure the updates to rest of the descriptor are performed before * transferring ownership. @@ -672,9 +672,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
skb_tx_timestamp(skb);
- /* Make sure the update to bdp and tx_skbuff are performed before - * txq->bd.cur. - */ + /* Make sure the update to bdp is performed before txq->bd.cur. */ wmb(); txq->bd.cur = bdp;
@@ -862,7 +860,7 @@ static int fec_enet_txq_submit_tso(struct fec_enet_priv_tx_q *txq, }
/* Save skb pointer */ - txq->tx_skbuff[index] = skb; + txq->tx_buf[index].skb = skb;
skb_tx_timestamp(skb); txq->bd.cur = bdp; @@ -952,16 +950,33 @@ static void fec_enet_bd_init(struct net_device *dev) for (i = 0; i < txq->bd.ring_size; i++) { /* Initialize the BD for every fragment in the page. */ bdp->cbd_sc = cpu_to_fec16(0); - if (bdp->cbd_bufaddr && - !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr))) - dma_unmap_single(&fep->pdev->dev, - fec32_to_cpu(bdp->cbd_bufaddr), - fec16_to_cpu(bdp->cbd_datlen), - DMA_TO_DEVICE); - if (txq->tx_skbuff[i]) { - dev_kfree_skb_any(txq->tx_skbuff[i]); - txq->tx_skbuff[i] = NULL; + if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) { + if (bdp->cbd_bufaddr && + !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr))) + dma_unmap_single(&fep->pdev->dev, + fec32_to_cpu(bdp->cbd_bufaddr), + fec16_to_cpu(bdp->cbd_datlen), + DMA_TO_DEVICE); + if (txq->tx_buf[i].skb) { + dev_kfree_skb_any(txq->tx_buf[i].skb); + txq->tx_buf[i].skb = NULL; + } + } else { + if (bdp->cbd_bufaddr) + dma_unmap_single(&fep->pdev->dev, + fec32_to_cpu(bdp->cbd_bufaddr), + fec16_to_cpu(bdp->cbd_datlen), + DMA_TO_DEVICE); + + if (txq->tx_buf[i].xdp) { + xdp_return_frame(txq->tx_buf[i].xdp); + txq->tx_buf[i].xdp = NULL; + } + + /* restore default tx buffer type: FEC_TXBUF_T_SKB */ + txq->tx_buf[i].type = FEC_TXBUF_T_SKB; } + bdp->cbd_bufaddr = cpu_to_fec32(0); bdp = fec_enet_get_nextdesc(bdp, &txq->bd); } @@ -1360,6 +1375,7 @@ static void fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) { struct fec_enet_private *fep; + struct xdp_frame *xdpf; struct bufdesc *bdp; unsigned short status; struct sk_buff *skb; @@ -1387,16 +1403,31 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
index = fec_enet_get_bd_index(bdp, &txq->bd);
- skb = txq->tx_skbuff[index]; - txq->tx_skbuff[index] = NULL; - if (!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr))) - dma_unmap_single(&fep->pdev->dev, - fec32_to_cpu(bdp->cbd_bufaddr), - fec16_to_cpu(bdp->cbd_datlen), - DMA_TO_DEVICE); - bdp->cbd_bufaddr = cpu_to_fec32(0); - if (!skb) - goto skb_done; + if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) { + skb = txq->tx_buf[index].skb; + txq->tx_buf[index].skb = NULL; + if (bdp->cbd_bufaddr && + !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr))) + dma_unmap_single(&fep->pdev->dev, + fec32_to_cpu(bdp->cbd_bufaddr), + fec16_to_cpu(bdp->cbd_datlen), + DMA_TO_DEVICE); + bdp->cbd_bufaddr = cpu_to_fec32(0); + if (!skb) + goto tx_buf_done; + } else { + xdpf = txq->tx_buf[index].xdp; + if (bdp->cbd_bufaddr) + dma_unmap_single(&fep->pdev->dev, + fec32_to_cpu(bdp->cbd_bufaddr), + fec16_to_cpu(bdp->cbd_datlen), + DMA_TO_DEVICE); + bdp->cbd_bufaddr = cpu_to_fec32(0); + if (!xdpf) { + txq->tx_buf[index].type = FEC_TXBUF_T_SKB; + goto tx_buf_done; + } + }
/* Check for errors. */ if (status & (BD_ENET_TX_HB | BD_ENET_TX_LC | @@ -1415,21 +1446,11 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) ndev->stats.tx_carrier_errors++; } else { ndev->stats.tx_packets++; - ndev->stats.tx_bytes += skb->len; - }
- /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who - * are to time stamp the packet, so we still need to check time - * stamping enabled flag. - */ - if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS && - fep->hwts_tx_en) && - fep->bufdesc_ex) { - struct skb_shared_hwtstamps shhwtstamps; - struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; - - fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps); - skb_tstamp_tx(skb, &shhwtstamps); + if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) + ndev->stats.tx_bytes += skb->len; + else + ndev->stats.tx_bytes += xdpf->len; }
/* Deferred means some collisions occurred during transmit, @@ -1438,10 +1459,32 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) if (status & BD_ENET_TX_DEF) ndev->stats.collisions++;
- /* Free the sk buffer associated with this last transmit */ - dev_kfree_skb_any(skb); -skb_done: - /* Make sure the update to bdp and tx_skbuff are performed + if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) { + /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who + * are to time stamp the packet, so we still need to check time + * stamping enabled flag. + */ + if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS && + fep->hwts_tx_en) && fep->bufdesc_ex) { + struct skb_shared_hwtstamps shhwtstamps; + struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; + + fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps); + skb_tstamp_tx(skb, &shhwtstamps); + } + + /* Free the sk buffer associated with this last transmit */ + dev_kfree_skb_any(skb); + } else { + xdp_return_frame(xdpf); + + txq->tx_buf[index].xdp = NULL; + /* restore default tx buffer type: FEC_TXBUF_T_SKB */ + txq->tx_buf[index].type = FEC_TXBUF_T_SKB; + } + +tx_buf_done: + /* Make sure the update to bdp and tx_buf are performed * before dirty_tx */ wmb(); @@ -3247,9 +3290,19 @@ static void fec_enet_free_buffers(struct net_device *ndev) for (i = 0; i < txq->bd.ring_size; i++) { kfree(txq->tx_bounce[i]); txq->tx_bounce[i] = NULL; - skb = txq->tx_skbuff[i]; - txq->tx_skbuff[i] = NULL; - dev_kfree_skb(skb); + + if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) { + skb = txq->tx_buf[i].skb; + txq->tx_buf[i].skb = NULL; + dev_kfree_skb(skb); + } else { + if (txq->tx_buf[i].xdp) { + xdp_return_frame(txq->tx_buf[i].xdp); + txq->tx_buf[i].xdp = NULL; + } + + txq->tx_buf[i].type = FEC_TXBUF_T_SKB; + } } } } @@ -3809,7 +3862,8 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep, ebdp->cbd_esc = cpu_to_fec32(estatus); }
- txq->tx_skbuff[index] = NULL; + txq->tx_buf[index].type = FEC_TXBUF_T_XDP_NDO; + txq->tx_buf[index].xdp = frame;
/* Make sure the updates to rest of the descriptor are performed before * transferring ownership.
From: Wei Fang wei.fang@nxp.com
[ Upstream commit 56b3c6ba53d0e9649ea5e4089b39cadde13aaef8 ]
When the XDP feature is enabled and many XDP frames are to be transmitted, there is a considerable probability that the available tx BDs are insufficient. This causes some XDP frames to be discarded, and the "NOT enough BD for SG!" error log appears in the console (as shown below).
[ 160.013112] fec 30be0000.ethernet eth0: NOT enough BD for SG!
[ 160.023116] fec 30be0000.ethernet eth0: NOT enough BD for SG!
[ 160.028926] fec 30be0000.ethernet eth0: NOT enough BD for SG!
[ 160.038946] fec 30be0000.ethernet eth0: NOT enough BD for SG!
[ 160.044758] fec 30be0000.ethernet eth0: NOT enough BD for SG!
In the case of heavy XDP traffic, tx BDs may sometimes be recycled more slowly than XDP frames are sent. There may be several specific reasons: for example, the interrupt is not serviced in time, or the NAPI callback is inefficient because all the queues (tx queues and rx queues) share the same NAPI.
After trying various methods, I think that increasing the size of the tx BD ring is simple and effective. The best solution may be to allocate a NAPI instance for each queue to improve the efficiency of the NAPI callback, but that change is rather large and I did not try it. Perhaps it will be implemented in a future patch.
This patch also updates the tx_wake_threshold of the tx ring, which in the previous logic depended on the size of the tx ring. Otherwise, the tx_wake_threshold would be too high (403 BDs), which is more likely to impact the slow path in the case of heavy XDP traffic, because the XDP path and the slow path share the tx BD rings. Following Jakub's suggestion, the tx_wake_threshold should be at least tx_stop_threshold + 2 * MAX_SKB_FRAGS: if a queue of hundreds of entries is overflowing, we should be able to apply a hysteresis of a few tens of entries.
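The threshold arithmetic can be sketched as follows. The constant values are assumptions and not stated in this patch: MAX_SKB_FRAGS is taken as 17 and FEC_MAX_SKB_DESCS as 100 * 2 + MAX_SKB_FRAGS = 217. These values are consistent with the 403 BDs quoted above for a 1024-entry ring, but they should be checked against the tree in use.

```c
/* Assumed values, see the note above; not taken from this patch. */
#define MAX_SKB_FRAGS_ASSUMED		17
#define FEC_MAX_TSO_SEGS_ASSUMED	100
#define FEC_MAX_SKB_DESCS_ASSUMED \
	(FEC_MAX_TSO_SEGS_ASSUMED * 2 + MAX_SKB_FRAGS_ASSUMED)

/* Previous logic: the wake threshold scales with the ring size. */
static int wake_threshold_old(int ring_size)
{
	return (ring_size - FEC_MAX_SKB_DESCS_ASSUMED) / 2;
}

/* New logic: stop threshold plus a fixed hysteresis of 2 * MAX_SKB_FRAGS. */
static int wake_threshold_new(void)
{
	return FEC_MAX_SKB_DESCS_ASSUMED + 2 * MAX_SKB_FRAGS_ASSUMED;
}
```

Under these assumptions, doubling the ring to 1024 entries would have raised the old wake threshold from 147 to 403 BDs, while the new formula stays at 251 regardless of the ring size.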
Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support") Signed-off-by: Wei Fang wei.fang@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec.h | 2 +- drivers/net/ethernet/freescale/fec_main.c | 3 +-- 2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 8c0226d061fec..63a053dea819d 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -355,7 +355,7 @@ struct bufdesc_ex { #define RX_RING_SIZE (FEC_ENET_RX_FRPPG * FEC_ENET_RX_PAGES) #define FEC_ENET_TX_FRSIZE 2048 #define FEC_ENET_TX_FRPPG (PAGE_SIZE / FEC_ENET_TX_FRSIZE) -#define TX_RING_SIZE 512 /* Must be power of two */ +#define TX_RING_SIZE 1024 /* Must be power of two */ #define TX_RING_MOD_MASK 511 /* for this to work */
#define BD_ENET_RX_INT 0x00800000 diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index e6ed36e5daefa..7659888a96917 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -3347,8 +3347,7 @@ static int fec_enet_alloc_queue(struct net_device *ndev) fep->total_tx_ring_size += fep->tx_queue[i]->bd.ring_size;
txq->tx_stop_threshold = FEC_MAX_SKB_DESCS; - txq->tx_wake_threshold = - (txq->bd.ring_size - txq->tx_stop_threshold) / 2; + txq->tx_wake_threshold = FEC_MAX_SKB_DESCS + 2 * MAX_SKB_FRAGS;
txq->tso_hdrs = dma_alloc_coherent(&fep->pdev->dev, txq->bd.ring_size * TSO_HEADER_SIZE,
From: Stanislav Lisovskiy stanislav.lisovskiy@intel.com
[ Upstream commit 5c413188c68da0e4bffc93de1c80257e20741e69 ]
If we are using Bigjoiner, dpll_hw_state is supposed to be exactly the same as for the master crtc, so there is no need to save its state for the slave crtc.
Signed-off-by: Stanislav Lisovskiy stanislav.lisovskiy@intel.com Fixes: 0ff0e219d9b8 ("drm/i915: Compute clocks earlier") Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230628141017.18937-1-stanisl... (cherry picked from commit cbaf758809952c95ec00e796695049babb08bb60) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/display/intel_display.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 7749f95d5d02a..a805b57f3d912 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -4968,7 +4968,6 @@ copy_bigjoiner_crtc_state_modeset(struct intel_atomic_state *state, saved_state->uapi = slave_crtc_state->uapi; saved_state->scaler_state = slave_crtc_state->scaler_state; saved_state->shared_dpll = slave_crtc_state->shared_dpll; - saved_state->dpll_hw_state = slave_crtc_state->dpll_hw_state; saved_state->crc_enabled = slave_crtc_state->crc_enabled;
intel_crtc_free_hw_state(slave_crtc_state);
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
[ Upstream commit 113899c2669dff148b2a5bea4780123811aecc13 ]
Commit a4d86249c773 ("drm/i915/gt: Provide a utility to create a scratch buffer") mistakenly passed in uapi I915_CACHING_CACHED as argument to i915_gem_object_set_cache_coherency(), which actually takes internal enum i915_cache_level.
There is no functional issue since the value matches I915_CACHE_LLC (1 == 1), which is the intended caching mode, but let's clean it up nevertheless.
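Why the mix-up compiled and behaved correctly can be illustrated with a small sketch. The `_STUB` names below are hypothetical stand-ins, not the real i915 definitions; the only fact taken from the commit message is that both constants have the value 1.

```c
/* Stand-in for the internal enum i915_cache_level. */
enum cache_level_stub {
	CACHE_NONE_STUB = 0,
	CACHE_LLC_STUB = 1,
};

/* Stand-ins for the uapi caching constants, which are plain integers. */
#define CACHING_NONE_STUB	0
#define CACHING_CACHED_STUB	1

/* Takes the internal enum; C converts a plain integer argument silently. */
static enum cache_level_stub set_cache_level(enum cache_level_stub level)
{
	return level;
}
```

Passing the uapi constant is accepted by the compiler without a diagnostic and happens to select the intended level, which is why the bug was harmless but still worth cleaning up.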
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Fixes: a4d86249c773 ("drm/i915/gt: Provide a utility to create a scratch buffer") Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Reviewed-by: Tejas Upadhyay tejas.upadhyay@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230707125503.3965817-1-tvrtk... (cherry picked from commit 49c60b2f0867ac36fd54d513882a48431aeccae7) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 4f436ba7a3c83..123b82f29a1bf 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -625,7 +625,7 @@ __vm_create_scratch_for_read(struct i915_address_space *vm, unsigned long size) if (IS_ERR(obj)) return ERR_CAST(obj);
- i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED); + i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
vma = i915_vma_instance(obj, vm, NULL); if (IS_ERR(vma)) {
From: Lu Hongfei luhongfei@vivo.com
[ Upstream commit 04499f28b40bfc24f20b0e2331008bb90a54a6cf ]
Remove the unnecessary of_node_put() from the continue path so that the child node is not released twice, which avoids a reference count imbalance or other unexpected issues.
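To illustrate why the extra put is a double release, here is a minimal userspace sketch. It assumes the child iterator drops the reference on the previous child before taking one on the next; struct fake_node, fake_next_child() and fake_walk() are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device tree node: only the refcount matters. */
struct fake_node {
	int refcount;
};

/*
 * Mimics the assumed iterator contract: drop the reference held on the
 * previous child, then take one on the next child (NULL at the end).
 */
static struct fake_node *fake_next_child(struct fake_node *nodes, size_t n,
					 struct fake_node *prev)
{
	size_t idx = 0;

	if (prev) {
		idx = (size_t)(prev - nodes) + 1;
		prev->refcount--;	/* the iterator's own put */
	}
	if (idx >= n)
		return NULL;
	nodes[idx].refcount++;		/* the iterator's get */
	return &nodes[idx];
}

/* Skip every child; optionally add the extra put before moving on. */
static void fake_walk(struct fake_node *nodes, size_t n, int extra_put)
{
	struct fake_node *child;

	assert(nodes != NULL);
	for (child = fake_next_child(nodes, n, NULL); child;
	     child = fake_next_child(nodes, n, child)) {
		if (extra_put)
			child->refcount--;	/* second put for one get */
		/* "continue": the iterator puts this child again */
	}
}
```

With every node starting at a refcount of 1, a walk without the extra put leaves the counts unchanged, while a walk with it drives them to 0.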
Signed-off-by: Lu Hongfei luhongfei@vivo.com Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com Fixes: de879a016a94 ("net: dsa: felix: add functionality when not all ports are supported") Link: https://lore.kernel.org/r/20230710031859.36784-1-luhongfei@vivo.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/ocelot/felix.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 8348da2b3c97a..d78b4bd4787e8 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -1286,7 +1286,6 @@ static int felix_parse_ports_node(struct felix *felix, if (err < 0) { dev_info(dev, "Unsupported PHY mode %s on port %d\n", phy_modes(phy_mode), port); - of_node_put(child);
/* Leave port_phy_modes[port] = 0, which is also * PHY_INTERFACE_MODE_NA. This will perform a
From: Suman Ghosh sumang@marvell.com
[ Upstream commit 8278ee2a2646b9acf747317895e47a640ba933c9 ]
Due to a hardware limitation, an MCAM drop rule with ether_type == 802.1Q and vlan_id == 0 is not supported. Reject such rules.
Fixes: dce677da57c0 ("octeontx2-pf: Add vlan-etype to ntuple filters") Signed-off-by: Suman Ghosh sumang@marvell.com Link: https://lore.kernel.org/r/20230710103027.2244139-1-sumang@marvell.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/marvell/octeontx2/nic/otx2_flows.c | 8 ++++++++ .../net/ethernet/marvell/octeontx2/nic/otx2_tc.c | 15 +++++++++++++++ 2 files changed, 23 insertions(+)
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c index 10e11262d48a0..2d7713a1a1539 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c @@ -872,6 +872,14 @@ static int otx2_prepare_flow_request(struct ethtool_rx_flow_spec *fsp, return -EINVAL;
vlan_etype = be16_to_cpu(fsp->h_ext.vlan_etype); + + /* Drop rule with vlan_etype == 802.1Q + * and vlan_id == 0 is not supported + */ + if (vlan_etype == ETH_P_8021Q && !fsp->m_ext.vlan_tci && + fsp->ring_cookie == RX_CLS_FLOW_DISC) + return -EINVAL; + /* Only ETH_P_8021Q and ETH_P_802AD types supported */ if (vlan_etype != ETH_P_8021Q && vlan_etype != ETH_P_8021AD) diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c index 8392f63e433fc..293bd3f29b077 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c @@ -604,6 +604,21 @@ static int otx2_tc_prepare_flow(struct otx2_nic *nic, struct otx2_tc_flow *node, return -EOPNOTSUPP; }
+ if (!match.mask->vlan_id) { + struct flow_action_entry *act; + int i; + + flow_action_for_each(i, act, &rule->action) { + if (act->id == FLOW_ACTION_DROP) { + netdev_err(nic->netdev, + "vlan tpid 0x%x with vlan_id %d is not supported for DROP rule.\n", + ntohs(match.key->vlan_tpid), + match.key->vlan_id); + return -EOPNOTSUPP; + } + } + } + if (match.mask->vlan_id || match.mask->vlan_dei || match.mask->vlan_priority) {
From: Chunhai Guo guochunhai@vivo.com
[ Upstream commit 936aa701d82d397c2d1afcd18ce2c739471d978d ]
z_erofs_pcluster_readmore() may take a long time to loop when the page offset is large enough, which is unnecessary and should be prevented.

For example, when the following case is encountered, it will loop 4691368 times, taking about 27 seconds:
    - offset = 19217289215
    - inode_size = 1442672
Signed-off-by: Chunhai Guo guochunhai@vivo.com Fixes: 386292919c25 ("erofs: introduce readmore decompression strategy") Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Yue Hu huyue2@coolpad.com Reviewed-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/r/20230710042531.28761-1-guochunhai@vivo.com Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 502893e3da010..bedfff5d45faf 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1807,7 +1807,7 @@ static void z_erofs_pcluster_readmore(struct z_erofs_decompress_frontend *f, }
cur = map->m_la + map->m_llen - 1; - while (cur >= end) { + while ((cur >= end) && (cur < i_size_read(inode))) { pgoff_t index = cur >> PAGE_SHIFT; struct page *page;
From: Chunhai Guo guochunhai@vivo.com
[ Upstream commit 8191213a5835b0317c5e4d0d337ae1ae00c75253 ]
z_erofs_do_read_page() may loop infinitely due to inappropriate truncation in the statement below: the offset is 64 bits wide, but min_t() truncates the result to 32 bits. The solution is to replace unsigned int with a 64-bit type, such as erofs_off_t.

    cur = end - min_t(unsigned int, offset + end - map->m_la, end);
- For example:
    - offset = 0x400160000
    - end = 0x370
    - map->m_la = 0x160370
    - offset + end - map->m_la = 0x400000000
    - offset + end - map->m_la = 0x00000000 (truncated as unsigned int)
- Expected result:
    - cur = 0
- Actual result:
    - cur = 0x370
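The arithmetic in the example can be reproduced in userspace. This is a minimal sketch, not the erofs code: the helper names are hypothetical, and min_t() is modelled as "cast both operands to the named type, then take the smaller one".

```c
#include <assert.h>
#include <stdint.h>

/* Models min_t(unsigned int, ...): operands are narrowed to 32 bits first. */
static uint32_t cur_with_u32_min(uint64_t offset, uint32_t end, uint64_t m_la)
{
	uint32_t a = (uint32_t)(offset + end - m_la);	/* high bits are lost */
	uint32_t b = end;

	return end - (a < b ? a : b);
}

/* Models min_t(erofs_off_t, ...): the comparison is done in 64 bits. */
static uint32_t cur_with_u64_min(uint64_t offset, uint32_t end, uint64_t m_la)
{
	uint64_t a, b = end;

	assert(m_la <= offset + end);	/* the difference must not underflow */
	a = offset + end - m_la;

	/* The minimum never exceeds end, so narrowing it is safe. */
	return end - (uint32_t)(a < b ? a : b);
}
```

For offset = 0x400160000, end = 0x370 and m_la = 0x160370, the 32-bit variant returns 0x370 and the 64-bit variant returns 0, matching the actual and expected results listed above.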
Signed-off-by: Chunhai Guo guochunhai@vivo.com Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support") Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/r/20230710093410.44071-1-guochunhai@vivo.com Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index bedfff5d45faf..997ca4b32e87f 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -990,7 +990,7 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe, */ tight &= (fe->mode > Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
- cur = end - min_t(unsigned int, offset + end - map->m_la, end); + cur = end - min_t(erofs_off_t, offset + end - map->m_la, end); if (!(map->m_flags & EROFS_MAP_MAPPED)) { zero_user_segment(page, cur, end); goto next_part;
From: Xin Yin yinxin.x@bytedance.com
[ Upstream commit 18bddc5b67038722cb88fcf51fbf41a0277092cb ]
DAX can be used to share page cache between VMs, reducing guest memory overhead. The chunk-based data format is widely used for VM and container images, so enable DAX support for it to make erofs better suited to VM scenarios.
Fixes: c5aa903a59db ("erofs: support reading chunk-based uncompressed files") Signed-off-by: Xin Yin yinxin.x@bytedance.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/r/20230711062130.7860-1-yinxin.x@bytedance.com Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/inode.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c index d70b12b81507f..e12592727a546 100644 --- a/fs/erofs/inode.c +++ b/fs/erofs/inode.c @@ -183,7 +183,8 @@ static void *erofs_read_inode(struct erofs_buf *buf,
inode->i_flags &= ~S_DAX; if (test_opt(&sbi->opt, DAX_ALWAYS) && S_ISREG(inode->i_mode) && - vi->datalayout == EROFS_INODE_FLAT_PLAIN) + (vi->datalayout == EROFS_INODE_FLAT_PLAIN || + vi->datalayout == EROFS_INODE_CHUNK_BASED)) inode->i_flags |= S_DAX;
if (!nblks)
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 9373771aaed17f5c2c38485f785568abe3a9f8c1 ]
Quieten a gcc (11.3.0) build error or warning by checking the function call status and returning -EBUSY if the function call failed. This is similar to what several other wireless drivers do for the SIOCGIWRATE ioctl call when there is a locking problem.
drivers/net/wireless/cisco/airo.c: error: 'status_rid.currentXmitRate' is used uninitialized [-Werror=uninitialized]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reported-by: Geert Uytterhoeven geert@linux-m68k.org Link: https://lore.kernel.org/r/39abf2c7-24a-f167-91da-ed4c5435d1c4@linux-m68k.org Link: https://lore.kernel.org/r/20230709133154.26206-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/cisco/airo.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c index 7c4cc5f5e1eb4..dbd13f7aa3e6e 100644 --- a/drivers/net/wireless/cisco/airo.c +++ b/drivers/net/wireless/cisco/airo.c @@ -6157,8 +6157,11 @@ static int airo_get_rate(struct net_device *dev, struct iw_param *vwrq = &wrqu->bitrate; struct airo_info *local = dev->ml_priv; StatusRid status_rid; /* Card status info */ + int ret;
- readStatusRid(local, &status_rid, 1); + ret = readStatusRid(local, &status_rid, 1); + if (ret) + return -EBUSY;
vwrq->value = le16_to_cpu(status_rid.currentXmitRate) * 500000; /* If more than one rate, set auto */
From: Pu Lehui pulehui@huawei.com
[ Upstream commit 4369016497319a9635702da010d02af1ebb1849d ]
Syzkaller reported a memory leak as follows:
BUG: memory leak unreferenced object 0xff110001198ef748 (size 192): comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s) hex dump (first 32 bytes): 00 00 00 00 4a 19 00 00 80 ad e3 e4 fe ff c0 00 ....J........... 00 b2 d3 0c 01 00 11 ff 28 f5 8e 19 01 00 11 ff ........(....... backtrace: [<ffffffffadd28087>] __cpu_map_entry_alloc+0xf7/0xb00 [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0 [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520 [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720 [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90 [<ffffffffb029cc80>] do_syscall_64+0x30/0x40 [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak unreferenced object 0xff110001198ef528 (size 192): comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffadd281f0>] __cpu_map_entry_alloc+0x260/0xb00 [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0 [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520 [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720 [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90 [<ffffffffb029cc80>] do_syscall_64+0x30/0x40 [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak unreferenced object 0xff1100010fd93d68 (size 8): comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s) hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace: [<ffffffffade5db3e>] kvmalloc_node+0x11e/0x170 [<ffffffffadd28280>] __cpu_map_entry_alloc+0x2f0/0xb00 [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0 [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520 [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720 [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90 [<ffffffffb029cc80>] do_syscall_64+0x30/0x40 [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
In the cpu_map_update_elem flow, when kthread_stop is called before the threadfn of rcpu->kthread has started, the KTHREAD_SHOULD_STOP bit of the kthread has already been set by kthread_stop, so the threadfn of rcpu->kthread will never be executed and rcpu->refcnt will never reach 0. As a result, the allocated rcpu, rcpu->queue and rcpu->queue->queue cannot be released.
Calling kthread_stop before executing kthread's threadfn will return -EINTR. We can complete the release of memory resources in this state.
Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP") Signed-off-by: Pu Lehui pulehui@huawei.com Acked-by: Jesper Dangaard Brouer hawk@kernel.org Acked-by: Hou Tao houtao1@huawei.com Link: https://lore.kernel.org/r/20230711115848.2701559-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/cpumap.c | 40 ++++++++++++++++++++++++---------------- 1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index 8ec18faa74ac3..3da63be602d1c 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -126,22 +126,6 @@ static void get_cpu_map_entry(struct bpf_cpu_map_entry *rcpu) atomic_inc(&rcpu->refcnt); }
-/* called from workqueue, to workaround syscall using preempt_disable */ -static void cpu_map_kthread_stop(struct work_struct *work) -{ - struct bpf_cpu_map_entry *rcpu; - - rcpu = container_of(work, struct bpf_cpu_map_entry, kthread_stop_wq); - - /* Wait for flush in __cpu_map_entry_free(), via full RCU barrier, - * as it waits until all in-flight call_rcu() callbacks complete. - */ - rcu_barrier(); - - /* kthread_stop will wake_up_process and wait for it to complete */ - kthread_stop(rcpu->kthread); -} - static void __cpu_map_ring_cleanup(struct ptr_ring *ring) { /* The tear-down procedure should have made sure that queue is @@ -169,6 +153,30 @@ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu) } }
+/* called from workqueue, to workaround syscall using preempt_disable */ +static void cpu_map_kthread_stop(struct work_struct *work) +{ + struct bpf_cpu_map_entry *rcpu; + int err; + + rcpu = container_of(work, struct bpf_cpu_map_entry, kthread_stop_wq); + + /* Wait for flush in __cpu_map_entry_free(), via full RCU barrier, + * as it waits until all in-flight call_rcu() callbacks complete. + */ + rcu_barrier(); + + /* kthread_stop will wake_up_process and wait for it to complete */ + err = kthread_stop(rcpu->kthread); + if (err) { + /* kthread_stop may be called before cpu_map_kthread_run + * is executed, so we need to release the memory related + * to rcpu. + */ + put_cpu_map_entry(rcpu); + } +} + static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu, struct list_head *listp, struct xdp_cpumap_stats *stats)
From: Larysa Zaremba larysa.zaremba@intel.com
[ Upstream commit 2e06c57d66d3f6c26faa5f5b479fb3add34ce85a ]
Currently, verifier does not reject XDP programs that pass NULL pointer to hints functions. At the same time, this case is not handled in any driver implementation (including veth). For example, changing
bpf_xdp_metadata_rx_timestamp(ctx, &timestamp);
to
bpf_xdp_metadata_rx_timestamp(ctx, NULL);
in xdp_metadata test successfully crashes the system.
Add KF_TRUSTED_ARGS flag to hints kfunc definitions, so driver code does not have to worry about getting invalid pointers.
Fixes: 3d76a4d3d4e5 ("bpf: XDP metadata RX kfuncs") Reported-by: Stanislav Fomichev sdf@google.com Closes: https://lore.kernel.org/bpf/ZKWo0BbpLfkZHbyE@google.com/ Signed-off-by: Larysa Zaremba larysa.zaremba@intel.com Acked-by: Jesper Dangaard Brouer hawk@kernel.org Acked-by: Stanislav Fomichev sdf@google.com Link: https://lore.kernel.org/r/20230711105930.29170-1-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/xdp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/xdp.c b/net/core/xdp.c index 41e5ca8643ec9..8362130bf085d 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -741,7 +741,7 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash, __diag_pop();
BTF_SET8_START(xdp_metadata_kfunc_ids) -#define XDP_METADATA_KFUNC(_, name) BTF_ID_FLAGS(func, name, 0) +#define XDP_METADATA_KFUNC(_, name) BTF_ID_FLAGS(func, name, KF_TRUSTED_ARGS) XDP_METADATA_KFUNC_xxx #undef XDP_METADATA_KFUNC BTF_SET8_END(xdp_metadata_kfunc_ids)
From: Ido Schimmel idosch@nvidia.com
[ Upstream commit d3f87278bcb80bd7f9519669d928b43320363d4f ]
The kernel does not currently validate that both the minimum and maximum ports of a port range are specified. This can lead user space to think that a filter matching on a port range was successfully added, when in fact it was not. For example, with a patched (buggy) iproute2 that only sends the minimum port, the following commands do not return an error:
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
# tc filter show dev swp1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto udp
  not_in_hw
        action order 1: gact action pass
         random type none pass val 0
         index 1 ref 1 bind 1

filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto udp
  not_in_hw
        action order 1: gact action pass
         random type none pass val 0
         index 2 ref 1 bind 1
Fix by returning an error unless both ports are specified:
# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
Error: Both min and max source ports must be specified.
We have an error talking to the kernel

# tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
Error: Both min and max destination ports must be specified.
We have an error talking to the kernel
Fixes: 5c72299fba9d ("net: sched: cls_flower: Classify packets using port ranges") Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Petr Machata petrm@nvidia.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_flower.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 815c3e416bc54..652158f612fc2 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -799,6 +799,16 @@ static int fl_set_key_port_range(struct nlattr **tb, struct fl_flow_key *key, TCA_FLOWER_KEY_PORT_SRC_MAX, &mask->tp_range.tp_max.src, TCA_FLOWER_UNSPEC, sizeof(key->tp_range.tp_max.src));
+ if (mask->tp_range.tp_min.dst != mask->tp_range.tp_max.dst) { + NL_SET_ERR_MSG(extack, + "Both min and max destination ports must be specified"); + return -EINVAL; + } + if (mask->tp_range.tp_min.src != mask->tp_range.tp_max.src) { + NL_SET_ERR_MSG(extack, + "Both min and max source ports must be specified"); + return -EINVAL; + } if (mask->tp_range.tp_min.dst && mask->tp_range.tp_max.dst && ntohs(key->tp_range.tp_max.dst) <= ntohs(key->tp_range.tp_min.dst)) {
From: Jisheng Zhang jszhang@kernel.org
[ Upstream commit b690e266dae2f85f4dfea21fa6a05e3500a51054 ]
lkp reports the sparse warning below when building for RV32:

arch/riscv/mm/init.c:1204:48: sparse: warning: cast truncates bits from constant value (100000000 becomes 0)

IMO, the reason we didn't see this truncation bug in the real world is that "0" means MEMBLOCK_ALLOC_ACCESSIBLE in memblock and there is no RV32 HW with more than 4GB of memory.
Fix it anyway to make sparse happy.
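The truncation can be shown with a small userspace sketch that models the 32-bit unsigned long of RV32 with uint32_t. The rv32_ulong typedef and the helper names are hypothetical; SZ_4G is redefined locally so the example is self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL	/* local copy for this sketch */

typedef uint32_t rv32_ulong;	/* stand-in for unsigned long on RV32 */

static rv32_ulong min_rv32(rv32_ulong a, rv32_ulong b)
{
	return a < b ? a : b;
}

/* Before the fix: the cast turns 4G into 0, so the minimum is always 0. */
static rv32_ulong search_limit_before(rv32_ulong search_end)
{
	return min_rv32(search_end, (rv32_ulong)SZ_4G);
}

/* After the fix: 4G - 1 fits in 32 bits and acts as a real upper bound. */
static rv32_ulong search_limit_after(rv32_ulong search_end)
{
	rv32_ulong cap = (rv32_ulong)(SZ_4G - 1);

	assert(cap == UINT32_MAX);
	return min_rv32(search_end, cap);
}
```

With a search_end of 0x80000000, the first helper yields 0 and the second yields 0x80000000.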
Fixes: decf89f86ecd ("riscv: try to allocate crashkern region from 32bit addressible memory") Signed-off-by: Jisheng Zhang jszhang@kernel.org Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202306080034.SLiCiOMn-lkp@intel.com/ Link: https://lore.kernel.org/r/20230709171036.1906-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 1306149aad57a..93e7bb9f67fd4 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1346,7 +1346,7 @@ static void __init reserve_crashkernel(void) */ crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE, search_start, - min(search_end, (unsigned long) SZ_4G)); + min(search_end, (unsigned long)(SZ_4G - 1))); if (crash_base == 0) { /* Try again without restricting region to 32bit addressible memory */ crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
From: Karol Herbst kherbst@redhat.com
[ Upstream commit d94303699921bda8141ad33554ae55b615ddd149 ]
Cc: Ben Skeggs bskeggs@redhat.com Cc: Lyude Paul lyude@redhat.com Fixes: f530bc60a30b ("drm/nouveau/disp: move HDMI config into acquire + infoframe methods") Signed-off-by: Karol Herbst kherbst@redhat.com Reviewed-by: Ben Skeggs bskeggs@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230628212248.3798605-1-kherb... Signed-off-by: Karol Herbst kherbst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c index a2c7c6f83dcdb..506ffbe7b8421 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c @@ -125,7 +125,7 @@ gt215_sor_hdmi_infoframe_avi(struct nvkm_ior *ior, int head, void *data, u32 siz pack_hdmi_infoframe(&avi, data, size);
nvkm_mask(device, 0x61c520 + soff, 0x00000001, 0x00000000); - if (size) + if (!size) return;
nvkm_wr32(device, 0x61c528 + soff, avi.header);
From: Karol Herbst kherbst@redhat.com
[ Upstream commit c177872cb056e0b499af4717d8d1977017fd53df ]
Cc: Ben Skeggs bskeggs@redhat.com Cc: Lyude Paul lyude@redhat.com Fixes: f530bc60a30b ("drm/nouveau/disp: move HDMI config into acquire + infoframe methods") Signed-off-by: Karol Herbst kherbst@redhat.com Reviewed-by: Ben Skeggs bskeggs@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230630160645.3984596-1-kherb... Signed-off-by: Karol Herbst kherbst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c index a4853c4e5ee3a..67ef889a0c5f4 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c @@ -295,6 +295,7 @@ g94_sor = { .clock = nv50_sor_clock, .war_2 = g94_sor_war_2, .war_3 = g94_sor_war_3, + .hdmi = &g84_sor_hdmi, .dp = &g94_sor_dp, };
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit f72207a5c0dbaaf6921cf9a6c0d2fd0bc249ea78 ]
The simple_write_to_buffer() function is designed to handle partial writes. It returns negatives on error, otherwise it returns the number of bytes that were able to be copied. This code doesn't check the return properly. We only know that the first byte is written, the rest of the buffer might be uninitialized.
There is no need to use the simple_write_to_buffer() function. Partial writes are prohibited by the "if (*ppos != 0)" check at the start of the function. Just use memdup_user() and copy the whole buffer.
Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://lore.kernel.org/r/7c1f950b-3a7d-4252-82a6-876e53078ef7@moroto.mounta... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/netdevsim/dev.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c index 6045bece2654d..b4d3b9cde8bd6 100644 --- a/drivers/net/netdevsim/dev.c +++ b/drivers/net/netdevsim/dev.c @@ -184,13 +184,10 @@ static ssize_t nsim_dev_trap_fa_cookie_write(struct file *file, cookie_len = (count - 1) / 2; if ((count - 1) % 2) return -EINVAL; - buf = kmalloc(count, GFP_KERNEL | __GFP_NOWARN); - if (!buf) - return -ENOMEM;
- ret = simple_write_to_buffer(buf, count, ppos, data, count); - if (ret < 0) - goto free_buf; + buf = memdup_user(data, count); + if (IS_ERR(buf)) + return PTR_ERR(buf);
fa_cookie = kmalloc(sizeof(*fa_cookie) + cookie_len, GFP_KERNEL | __GFP_NOWARN);
From: Karol Herbst kherbst@redhat.com
[ Upstream commit 938a06c8b7913455073506c33ae3bff029c3c4ef ]
This fixes a NULL pointer access inside nvkm_acr_oneinit in case necessary firmware files couldn't be loaded.
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/212 Fixes: 4b569ded09fd ("drm/nouveau/acr/ga102: initial support") Signed-off-by: Karol Herbst kherbst@redhat.com Reviewed-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230522201838.1496622-1-kherb... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c index 795f3a649b122..9b8ca4e898f90 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c @@ -224,7 +224,7 @@ nvkm_acr_oneinit(struct nvkm_subdev *subdev) u64 falcons; int ret, i;
- if (list_empty(&acr->hsfw)) { + if (list_empty(&acr->hsfw) || !acr->func || !acr->func->wpr_layout) { nvkm_debug(subdev, "No HSFW(s)\n"); nvkm_acr_cleanup(acr); return 0;
From: Karol Herbst kherbst@redhat.com
[ Upstream commit 835a65f51790e1f72b1ab106ec89db9ac15b47d6 ]
1ba6113a90a0 removed a lot of the kernel GPU channel, but method 0x128 was important as otherwise the GPU spams us with `CACHE_ERROR` messages.
We use the blit subchannel inside our vblank handling, so we should keep at least this part.
v2: Only do it for NV11+ GPUs
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/201 Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers") Signed-off-by: Karol Herbst kherbst@redhat.com Reviewed-by: Ben Skeggs bskeggs@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230526091052.2169044-1-kherb... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_chan.c | 1 + drivers/gpu/drm/nouveau/nouveau_chan.h | 1 + drivers/gpu/drm/nouveau/nouveau_drm.c | 20 +++++++++++++++++--- 3 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c b/drivers/gpu/drm/nouveau/nouveau_chan.c index e648ecd0c1a03..3dfbc374478e6 100644 --- a/drivers/gpu/drm/nouveau/nouveau_chan.c +++ b/drivers/gpu/drm/nouveau/nouveau_chan.c @@ -90,6 +90,7 @@ nouveau_channel_del(struct nouveau_channel **pchan) if (cli) nouveau_svmm_part(chan->vmm->svmm, chan->inst);
+ nvif_object_dtor(&chan->blit); nvif_object_dtor(&chan->nvsw); nvif_object_dtor(&chan->gart); nvif_object_dtor(&chan->vram); diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.h b/drivers/gpu/drm/nouveau/nouveau_chan.h index e06a8ffed31a8..bad7466bd0d59 100644 --- a/drivers/gpu/drm/nouveau/nouveau_chan.h +++ b/drivers/gpu/drm/nouveau/nouveau_chan.h @@ -53,6 +53,7 @@ struct nouveau_channel { u32 user_put;
struct nvif_object user; + struct nvif_object blit;
struct nvif_event kill; atomic_t killed; diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 7aac9384600ed..40fb9a8349180 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -375,15 +375,29 @@ nouveau_accel_gr_init(struct nouveau_drm *drm) ret = nvif_object_ctor(&drm->channel->user, "drmNvsw", NVDRM_NVSW, nouveau_abi16_swclass(drm), NULL, 0, &drm->channel->nvsw); + + if (ret == 0 && device->info.chipset >= 0x11) { + ret = nvif_object_ctor(&drm->channel->user, "drmBlit", + 0x005f, 0x009f, + NULL, 0, &drm->channel->blit); + } + if (ret == 0) { struct nvif_push *push = drm->channel->chan.push; - ret = PUSH_WAIT(push, 2); - if (ret == 0) + ret = PUSH_WAIT(push, 8); + if (ret == 0) { + if (device->info.chipset >= 0x11) { + PUSH_NVSQ(push, NV05F, 0x0000, drm->channel->blit.handle); + PUSH_NVSQ(push, NV09F, 0x0120, 0, + 0x0124, 1, + 0x0128, 2); + } PUSH_NVSQ(push, NV_SW, 0x0000, drm->channel->nvsw.handle); + } }
if (ret) { - NV_ERROR(drm, "failed to allocate sw class, %d\n", ret); + NV_ERROR(drm, "failed to allocate sw or blit class, %d\n", ret); nouveau_accel_gr_fini(drm); return; }
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 150e33e62c1fa4af5aaab02776b6c3812711d478 ]
Eric Dumazet says[1]:
-------
Speaking of psched_mtu(), I see that net/sched/sch_pie.c is using it without holding RTNL, so dev->mtu can be changed underneath. KCSAN could issue a warning.
-------

Annotate dev->mtu with READ_ONCE() so KCSAN doesn't issue a warning.
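A rough userspace approximation of the annotated helper is sketched below. The READ_ONCE() macro here is a simplified stand-in for the kernel one (a volatile access that relies on the GNU __typeof__ extension), and struct fake_net_device and psched_mtu_sketch() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel macro: force a single volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in carrying only the two fields used here. */
struct fake_net_device {
	unsigned int mtu;		/* may be changed concurrently */
	unsigned short hard_header_len;
};

static inline unsigned int psched_mtu_sketch(const struct fake_net_device *dev)
{
	assert(dev != NULL);

	/* mtu is read exactly once, so the compiler cannot re-load it. */
	return READ_ONCE(dev->mtu) + dev->hard_header_len;
}
```

The annotation does not add locking; it only marks the lockless read as intentional, so the access is performed once and is not reported as a plain racy load.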
[1] https://lore.kernel.org/all/CANn89iJoJO5VtaJ-2=_d2aOQhb0Xw8iBT_Cxqp2HyuS-zj6...
v1 -> v2: Fix commit message
Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme") Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230711021634.561598-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/pkt_sched.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index 7dba1c3a7b801..2465d1e79d10e 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -134,7 +134,7 @@ extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1]; */ static inline unsigned int psched_mtu(const struct net_device *dev) { - return dev->mtu + dev->hard_header_len; + return READ_ONCE(dev->mtu) + dev->hard_header_len; }
static inline struct net *qdisc_net(struct Qdisc *q)
From: Jiawen Wu jiawenwu@trustnetic.com
[ Upstream commit aa846677a9fb19a0f2c58154c140398aa92a87ba ]
For some device types like TXGBE_ID_XAUI, *checksum computed in txgbe_calc_eeprom_checksum() is larger than TXGBE_EEPROM_SUM. Remove the limit on the size of *checksum.
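A short sketch of why the subtraction still gives a usable result once the limit is removed: in 16-bit unsigned arithmetic the difference wraps modulo 65536, so the partial sum plus the result still adds up to the target value. The constant 0xBABA is an assumed value for TXGBE_EEPROM_SUM, and finalize_checksum() is a hypothetical helper, not the driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EEPROM_SUM_SKETCH 0xBABA	/* assumed value of TXGBE_EEPROM_SUM */

/* In-place update, as in the final step of the checksum calculation. */
static int finalize_checksum(uint16_t *checksum)
{
	assert(checksum != NULL);

	/* Wraps modulo 2^16 when *checksum exceeds the target sum. */
	*checksum = (uint16_t)(EEPROM_SUM_SKETCH - *checksum);
	return 0;
}
```

For a partial sum of 0xC000 the result is 0xFABA, and 0xC000 + 0xFABA truncated to 16 bits is 0xBABA again.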
Fixes: 049fe5365324 ("net: txgbe: Add operations to interact with firmware") Fixes: 5e2ea7801fac ("net: txgbe: Fix unsigned comparison to zero in txgbe_calc_eeprom_checksum()") Signed-off-by: Jiawen Wu jiawenwu@trustnetic.com Link: https://lore.kernel.org/r/20230711063414.3311-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c index ebc46f3be0569..fc37af2e71ffc 100644 --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c @@ -196,9 +196,6 @@ static int txgbe_calc_eeprom_checksum(struct wx *wx, u16 *checksum) if (eeprom_ptrs) kvfree(eeprom_ptrs);
- if (*checksum > TXGBE_EEPROM_SUM) - return -EINVAL; - *checksum = TXGBE_EEPROM_SUM - *checksum;
return 0;
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 4f4626cd049576af1276c7568d5b44eb3f7bb1b1 ]
If rtw89_fw_h2c_raw() fails, rtw89_debug_priv_send_h2c_set() should return a negative error code instead of the positive value count. Fix this bug by returning the correct error code.
Fixes: e3ec7017f6a2 ("rtw89: add Realtek 802.11ax driver") Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Acked-by: Ping-Ke Shih pkshih@realtek.com Link: https://lore.kernel.org/r/tencent_AD09A61BC4DA92AD1EB0790F5C850E544D07@qq.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtw89/debug.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw89/debug.c b/drivers/net/wireless/realtek/rtw89/debug.c index 1e5b7a9987163..858494ddfb12e 100644 --- a/drivers/net/wireless/realtek/rtw89/debug.c +++ b/drivers/net/wireless/realtek/rtw89/debug.c @@ -2998,17 +2998,18 @@ static ssize_t rtw89_debug_priv_send_h2c_set(struct file *filp, struct rtw89_debugfs_priv *debugfs_priv = filp->private_data; struct rtw89_dev *rtwdev = debugfs_priv->rtwdev; u8 *h2c; + int ret; u16 h2c_len = count / 2;
h2c = rtw89_hex2bin_user(rtwdev, user_buf, count); if (IS_ERR(h2c)) return -EFAULT;
- rtw89_fw_h2c_raw(rtwdev, h2c, h2c_len); + ret = rtw89_fw_h2c_raw(rtwdev, h2c, h2c_len);
kfree(h2c);
- return count; + return ret ? ret : count; }
static int
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 158810b261d02fc7dd92ca9c392d8f8a211a2401 ]
Commit 25369891fcef deleted the check for the case where no 'lmax' is specified, which commit 3037933448f6 had previously added because 'lmax' could be set to the device's MTU without any bounds checking against QFQ_MIN_LMAX and QFQ_MAX_LMAX. Therefore, reintroduce the check.
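A minimal sketch of the reintroduced check, written as stand-alone C rather than kernel code. The `DEMO_` bounds and the `demo_pick_lmax()` helper are assumptions for illustration; the real limits are defined in net/sched/sch_qfq.c.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative bounds only; the real values live in net/sched/sch_qfq.c. */
#define DEMO_MIN_LMAX 512u
#define DEMO_MAX_LMAX (1u << 16)

/*
 * Pick lmax from the user attribute if present, else from the MTU-derived
 * default, rejecting a default that falls outside the allowed range.
 */
int demo_pick_lmax(int have_attr, unsigned int attr_lmax,
		   unsigned int dev_mtu, unsigned int *lmax)
{
	if (have_attr) {
		*lmax = attr_lmax;	/* assumed to be validated elsewhere */
		return 0;
	}
	if (dev_mtu < DEMO_MIN_LMAX || dev_mtu > DEMO_MAX_LMAX)
		return -EINVAL;
	*lmax = dev_mtu;
	return 0;
}
```

The point of the sketch is only that the fallback path needs its own range check, since the MTU is user controlled.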
Fixes: 25369891fcef ("net/sched: sch_qfq: refactor parsing of netlink parameters") Acked-by: Jamal Hadi Salim jhs@mojatatu.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_qfq.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index dfd9a99e62570..63a5b277c117f 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -423,10 +423,17 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid, else weight = 1;
- if (tb[TCA_QFQ_LMAX]) + if (tb[TCA_QFQ_LMAX]) { lmax = nla_get_u32(tb[TCA_QFQ_LMAX]); - else + } else { + /* MTU size is user controlled */ lmax = psched_mtu(qdisc_dev(sch)); + if (lmax < QFQ_MIN_LMAX || lmax > QFQ_MAX_LMAX) { + NL_SET_ERR_MSG_MOD(extack, + "MTU size out of bounds for qfq"); + return -EINVAL; + } + }
inv_w = ONE_FP / weight; weight = ONE_FP / inv_w;
From: Pedro Tammela pctammela@mojatatu.com
[ Upstream commit 3e337087c3b5805fe0b8a46ba622a962880b5d64 ]
Lion says:
-------
In the QFQ scheduler a similar issue to CVE-2023-31436 persists.
Consider the following code in net/sched/sch_qfq.c:
static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
		       struct sk_buff **to_free)
{
	unsigned int len = qdisc_pkt_len(skb), gso_segs;

	// ...

	if (unlikely(cl->agg->lmax < len)) {
		pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
			 cl->agg->lmax, len, cl->common.classid);
		err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
		if (err) {
			cl->qstats.drops++;
			return qdisc_drop(skb, sch, to_free);
		}

	// ...

}
Similarly to CVE-2023-31436, "lmax" is increased without any bounds checks according to the packet length "len". Usually this would not pose a problem because packet sizes are naturally limited.
This is however not the actual packet length, rather the "qdisc_pkt_len(skb)" which might apply size transformations according to "struct qdisc_size_table" as created by "qdisc_get_stab()" in net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.
A user may choose virtually any size using such a table.
As a result the same issue as in CVE-2023-31436 can occur, allowing heap out-of-bounds reads / writes in the kmalloc-8192 cache.
-------
We can create the issue with the following commands:
tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
	overhead 999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2
This is caused by incorrectly assuming that qdisc_pkt_len() returns a length within the range QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
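To make the failure mode concrete, here is a simplified stand-alone sketch, assuming a size transformation that just adds a configured overhead (the real size table logic is more involved). `demo_pkt_len()`, `demo_change_agg()` and `DEMO_MAX_LMAX` are hypothetical names for this example.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_MAX_LMAX (1u << 16)	/* illustrative upper bound */

/* Hypothetical, simplified size transformation: real length plus overhead. */
unsigned int demo_pkt_len(unsigned int real_len, unsigned int overhead)
{
	return real_len + overhead;
}

/* Refuse to grow the aggregate's lmax beyond the upper bound. */
int demo_change_agg(unsigned int *agg_lmax, unsigned int new_lmax)
{
	if (new_lmax > DEMO_MAX_LMAX)
		return -EINVAL;
	*agg_lmax = new_lmax;
	return 0;
}
```

With an overhead of 999999999, as in the reproducer, even a small ping packet yields a transformed length far above the bound, so the change is rejected and the packet would be dropped instead of resizing the aggregate.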
Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: Lion nnamrec@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Pedro Tammela pctammela@mojatatu.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_qfq.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index 63a5b277c117f..befaf74b33caa 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -381,8 +381,13 @@ static int qfq_change_agg(struct Qdisc *sch, struct qfq_class *cl, u32 weight, u32 lmax) { struct qfq_sched *q = qdisc_priv(sch); - struct qfq_aggregate *new_agg = qfq_find_agg(q, lmax, weight); + struct qfq_aggregate *new_agg;
+ /* 'lmax' can range from [QFQ_MIN_LMAX, pktlen + stab overhead] */ + if (lmax > QFQ_MAX_LMAX) + return -EINVAL; + + new_agg = qfq_find_agg(q, lmax, weight); if (new_agg == NULL) { /* create new aggregate */ new_agg = kzalloc(sizeof(*new_agg), GFP_ATOMIC); if (new_agg == NULL)
From: Ming Lei ming.lei@redhat.com
[ Upstream commit b8f6446b6853768cb99e7c201bddce69ca60c15e ]
dma_unmap_page() must be passed the DMA direction, not the request's data direction, when unmapping integrity data.

Fix the DMA direction. The problem was reported by Guangwu's test.
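The difference between the two direction values can be sketched in plain C. The enum values below mirror the kernel's enum dma_data_direction; the `DEMO_` names and `demo_rq_dma_dir()` are stand-ins invented for this example, not kernel symbols.

```c
#include <assert.h>

/* Values mirror the kernel's enum dma_data_direction. */
enum demo_dma_dir {
	DEMO_DMA_BIDIRECTIONAL = 0,
	DEMO_DMA_TO_DEVICE = 1,
	DEMO_DMA_FROM_DEVICE = 2,
};

/* Block-layer data direction of a request. */
enum { DEMO_READ = 0, DEMO_WRITE = 1 };

/* Simplified analogue of rq_dma_dir(): map data direction to DMA direction. */
enum demo_dma_dir demo_rq_dma_dir(int data_dir)
{
	return data_dir == DEMO_WRITE ? DEMO_DMA_TO_DEVICE : DEMO_DMA_FROM_DEVICE;
}
```

Passing the raw data direction happens to work for writes (1 in both encodings), but for reads it passes 0, which is the bidirectional value rather than the from-device value used at map time.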
Reported-by: Guangwu Zhang guazhang@redhat.com Fixes: 4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data") Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 492f319ebdf37..5b5303f0e2c20 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -968,7 +968,7 @@ static __always_inline void nvme_pci_unmap_rq(struct request *req) struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
dma_unmap_page(dev->dev, iod->meta_dma, - rq_integrity_vec(req)->bv_len, rq_data_dir(req)); + rq_integrity_vec(req)->bv_len, rq_dma_dir(req)); }
if (blk_rq_nr_phys_segments(req))
From: Paulo Alcantara pc@manguebit.com
commit 5f2a0afa9890e728428db2ed9281bddca242e90b upstream.
Some servers may return error codes from REQ_GET_DFS_REFERRAL requests that are unexpected by the client, so to make it easier, assume non-DFS mounts when the client can't get the initial DFS referral of @ctx->UNC in dfs_mount_share().
Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/dfs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/smb/client/dfs.c +++ b/fs/smb/client/dfs.c @@ -296,8 +296,9 @@ int dfs_mount_share(struct cifs_mount_ct if (!nodfs) { rc = dfs_get_referral(mnt_ctx, ctx->UNC + 1, NULL, NULL); if (rc) { - if (rc != -ENOENT && rc != -EOPNOTSUPP && rc != -EIO) - return rc; + cifs_dbg(FYI, "%s: no dfs referral for %s: %d\n", + __func__, ctx->UNC + 1, rc); + cifs_dbg(FYI, "%s: assuming non-dfs mount...\n", __func__); nodfs = true; } }
From: Winston Wen wentao@uniontech.com
commit 66be5c48ee1b5b8c919cc329fe6d32e16badaa40 upstream.
Check the session state and skip it if it's exiting.
Signed-off-by: Winston Wen wentao@uniontech.com Reviewed-by: Shyam Prasad N sprasad@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/smb2transport.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -153,7 +153,14 @@ smb2_find_smb_ses_unlocked(struct TCP_Se list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) { if (ses->Suid != ses_id) continue; + + spin_lock(&ses->ses_lock); + if (ses->ses_status == SES_EXITING) { + spin_unlock(&ses->ses_lock); + continue; + } ++ses->ses_count; + spin_unlock(&ses->ses_lock); return ses; }
From: Paulo Alcantara pc@manguebit.com
commit 49024ec8795ed2bd7217c249ef50a70c4e25d662 upstream.
Handle trailing and leading separators when parsing UNC and prefix paths in smb3_parse_devname(). Then, store the sanitised paths in smb3_fs_context::source.
This fixes the following cases:

$ mount //srv/share// /mnt/1 -o ...
$ cat /mnt/1/d0/f0
cat: /mnt/1/d0/f0: Invalid argument

The -EINVAL was returned because the client sent SMB2_CREATE "\\d0\f0" rather than SMB2_CREATE "\d0\f0".

$ mount //srv//share /mnt/1 -o ...
mount: Invalid argument
The -EINVAL was returned correctly although the client only realised it after sending a couple of bad requests rather than bailing out earlier when parsing mount options.
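The kind of separator handling described above can be sketched as a small stand-alone function. This is an illustrative approximation, not the code of cifs_sanitize_prepath(); `demo_sanitize_path()` and `DEMO_IS_DELIM` are names assumed for this example.

```c
#include <assert.h>
#include <string.h>

#define DEMO_IS_DELIM(c) ((c) == '/' || (c) == '\\')

/*
 * Strip leading/trailing separators and collapse repeated ones, in place.
 * Returns the resulting length; 0 means there is no prefix path left.
 */
size_t demo_sanitize_path(char *path)
{
	char *src = path, *dst = path;

	while (DEMO_IS_DELIM(*src))		/* skip leading separators */
		src++;
	while (*src) {
		if (DEMO_IS_DELIM(*src) && DEMO_IS_DELIM(src[1])) {
			src++;			/* collapse a run of separators */
			continue;
		}
		*dst++ = *src++;
	}
	if (dst > path && DEMO_IS_DELIM(dst[-1]))	/* drop trailing separator */
		dst--;
	*dst = '\0';
	return (size_t)(dst - path);
}
```

Treating an empty result as "no prefix path" matches the idea in the patch that an empty prefix is distinct from an allocation failure.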
Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifs_dfs_ref.c | 20 ++++++++++---- fs/smb/client/cifsproto.h | 2 + fs/smb/client/dfs.c | 38 ++------------------------- fs/smb/client/fs_context.c | 59 ++++++++++++++++++++++++++++++++++++------- fs/smb/client/misc.c | 17 ++++++++---- 5 files changed, 80 insertions(+), 56 deletions(-)
--- a/fs/smb/client/cifs_dfs_ref.c +++ b/fs/smb/client/cifs_dfs_ref.c @@ -118,12 +118,12 @@ cifs_build_devname(char *nodename, const return dev; }
-static int set_dest_addr(struct smb3_fs_context *ctx, const char *full_path) +static int set_dest_addr(struct smb3_fs_context *ctx) { struct sockaddr *addr = (struct sockaddr *)&ctx->dstaddr; int rc;
- rc = dns_resolve_server_name_to_ip(full_path, addr, NULL); + rc = dns_resolve_server_name_to_ip(ctx->source, addr, NULL); if (!rc) cifs_set_port(addr, ctx->port); return rc; @@ -171,10 +171,9 @@ static struct vfsmount *cifs_dfs_do_auto mnt = ERR_CAST(full_path); goto out; } - cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path);
tmp = *cur_ctx; - tmp.source = full_path; + tmp.source = NULL; tmp.leaf_fullpath = NULL; tmp.UNC = tmp.prepath = NULL; tmp.dfs_root_ses = NULL; @@ -185,13 +184,22 @@ static struct vfsmount *cifs_dfs_do_auto goto out; }
- rc = set_dest_addr(ctx, full_path); + rc = smb3_parse_devname(full_path, ctx); if (rc) { mnt = ERR_PTR(rc); goto out; }
- rc = smb3_parse_devname(full_path, ctx); + ctx->source = smb3_fs_context_fullpath(ctx, '/'); + if (IS_ERR(ctx->source)) { + mnt = ERR_CAST(ctx->source); + ctx->source = NULL; + goto out; + } + cifs_dbg(FYI, "%s: ctx: source=%s UNC=%s prepath=%s dstaddr=%pISpc\n", + __func__, ctx->source, ctx->UNC, ctx->prepath, &ctx->dstaddr); + + rc = set_dest_addr(ctx); if (!rc) mnt = fc_mount(fc); else --- a/fs/smb/client/cifsproto.h +++ b/fs/smb/client/cifsproto.h @@ -85,6 +85,8 @@ extern void release_mid(struct mid_q_ent extern void cifs_wake_up_task(struct mid_q_entry *mid); extern int cifs_handle_standard(struct TCP_Server_Info *server, struct mid_q_entry *mid); +extern char *smb3_fs_context_fullpath(const struct smb3_fs_context *ctx, + char dirsep); extern int smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx); extern int smb3_parse_opt(const char *options, const char *key, char **val); extern int cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs); --- a/fs/smb/client/dfs.c +++ b/fs/smb/client/dfs.c @@ -54,39 +54,6 @@ out: return rc; }
-/* - * cifs_build_path_to_root returns full path to root when we do not have an - * existing connection (tcon) - */ -static char *build_unc_path_to_root(const struct smb3_fs_context *ctx, - const struct cifs_sb_info *cifs_sb, bool useppath) -{ - char *full_path, *pos; - unsigned int pplen = useppath && ctx->prepath ? strlen(ctx->prepath) + 1 : 0; - unsigned int unc_len = strnlen(ctx->UNC, MAX_TREE_SIZE + 1); - - if (unc_len > MAX_TREE_SIZE) - return ERR_PTR(-EINVAL); - - full_path = kmalloc(unc_len + pplen + 1, GFP_KERNEL); - if (full_path == NULL) - return ERR_PTR(-ENOMEM); - - memcpy(full_path, ctx->UNC, unc_len); - pos = full_path + unc_len; - - if (pplen) { - *pos = CIFS_DIR_SEP(cifs_sb); - memcpy(pos + 1, ctx->prepath, pplen); - pos += pplen; - } - - *pos = '\0'; /* add trailing null */ - convert_delimiter(full_path, CIFS_DIR_SEP(cifs_sb)); - cifs_dbg(FYI, "%s: full_path=%s\n", __func__, full_path); - return full_path; -} - static int get_session(struct cifs_mount_ctx *mnt_ctx, const char *full_path) { struct smb3_fs_context *ctx = mnt_ctx->fs_ctx; @@ -179,6 +146,7 @@ static int __dfs_mount_share(struct cifs struct TCP_Server_Info *server; struct cifs_tcon *tcon; char *origin_fullpath = NULL; + char sep = CIFS_DIR_SEP(cifs_sb); int num_links = 0; int rc;
@@ -186,7 +154,7 @@ static int __dfs_mount_share(struct cifs if (IS_ERR(ref_path)) return PTR_ERR(ref_path);
- full_path = build_unc_path_to_root(ctx, cifs_sb, true); + full_path = smb3_fs_context_fullpath(ctx, sep); if (IS_ERR(full_path)) { rc = PTR_ERR(full_path); full_path = NULL; @@ -228,7 +196,7 @@ static int __dfs_mount_share(struct cifs kfree(full_path); ref_path = full_path = NULL;
- full_path = build_unc_path_to_root(ctx, cifs_sb, true); + full_path = smb3_fs_context_fullpath(ctx, sep); if (IS_ERR(full_path)) { rc = PTR_ERR(full_path); full_path = NULL; --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -441,14 +441,17 @@ out: * but there are some bugs that prevent rename from working if there are * multiple delimiters. * - * Returns a sanitized duplicate of @path. @gfp indicates the GFP_* flags - * for kstrdup. + * Return a sanitized duplicate of @path or NULL for empty prefix paths. + * Otherwise, return ERR_PTR. + * + * @gfp indicates the GFP_* flags for kstrdup. * The caller is responsible for freeing the original. */ #define IS_DELIM(c) ((c) == '/' || (c) == '\\') char *cifs_sanitize_prepath(char *prepath, gfp_t gfp) { char *cursor1 = prepath, *cursor2 = prepath; + char *s;
/* skip all prepended delimiters */ while (IS_DELIM(*cursor1)) @@ -469,8 +472,39 @@ char *cifs_sanitize_prepath(char *prepat if (IS_DELIM(*(cursor2 - 1))) cursor2--;
- *(cursor2) = '\0'; - return kstrdup(prepath, gfp); + *cursor2 = '\0'; + if (!*prepath) + return NULL; + s = kstrdup(prepath, gfp); + if (!s) + return ERR_PTR(-ENOMEM); + return s; +} + +/* + * Return full path based on the values of @ctx->{UNC,prepath}. + * + * It is assumed that both values were already parsed by smb3_parse_devname(). + */ +char *smb3_fs_context_fullpath(const struct smb3_fs_context *ctx, char dirsep) +{ + size_t ulen, plen; + char *s; + + ulen = strlen(ctx->UNC); + plen = ctx->prepath ? strlen(ctx->prepath) + 1 : 0; + + s = kmalloc(ulen + plen + 1, GFP_KERNEL); + if (!s) + return ERR_PTR(-ENOMEM); + memcpy(s, ctx->UNC, ulen); + if (plen) { + s[ulen] = dirsep; + memcpy(s + ulen + 1, ctx->prepath, plen); + } + s[ulen + plen] = '\0'; + convert_delimiter(s, dirsep); + return s; }
/* @@ -484,6 +518,7 @@ smb3_parse_devname(const char *devname, char *pos; const char *delims = "/\\"; size_t len; + int rc;
if (unlikely(!devname || !*devname)) { cifs_dbg(VFS, "Device name not specified\n"); @@ -511,6 +546,8 @@ smb3_parse_devname(const char *devname,
/* now go until next delimiter or end of string */ len = strcspn(pos, delims); + if (!len) + return -EINVAL;
/* move "pos" up to delimiter or NULL */ pos += len; @@ -533,8 +570,11 @@ smb3_parse_devname(const char *devname, return 0;
ctx->prepath = cifs_sanitize_prepath(pos, GFP_KERNEL); - if (!ctx->prepath) - return -ENOMEM; + if (IS_ERR(ctx->prepath)) { + rc = PTR_ERR(ctx->prepath); + ctx->prepath = NULL; + return rc; + }
return 0; } @@ -1146,12 +1186,13 @@ static int smb3_fs_context_parse_param(s cifs_errorf(fc, "Unknown error parsing devname\n"); goto cifs_parse_mount_err; } - ctx->source = kstrdup(param->string, GFP_KERNEL); - if (ctx->source == NULL) { + ctx->source = smb3_fs_context_fullpath(ctx, '/'); + if (IS_ERR(ctx->source)) { + ctx->source = NULL; cifs_errorf(fc, "OOM when copying UNC string\n"); goto cifs_parse_mount_err; } - fc->source = kstrdup(param->string, GFP_KERNEL); + fc->source = kstrdup(ctx->source, GFP_KERNEL); if (fc->source == NULL) { cifs_errorf(fc, "OOM when copying UNC string\n"); goto cifs_parse_mount_err; --- a/fs/smb/client/misc.c +++ b/fs/smb/client/misc.c @@ -1211,16 +1211,21 @@ int match_target_ip(struct TCP_Server_In
int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix) { + int rc; + kfree(cifs_sb->prepath); + cifs_sb->prepath = NULL;
if (prefix && *prefix) { cifs_sb->prepath = cifs_sanitize_prepath(prefix, GFP_ATOMIC); - if (!cifs_sb->prepath) - return -ENOMEM; - - convert_delimiter(cifs_sb->prepath, CIFS_DIR_SEP(cifs_sb)); - } else - cifs_sb->prepath = NULL; + if (IS_ERR(cifs_sb->prepath)) { + rc = PTR_ERR(cifs_sb->prepath); + cifs_sb->prepath = NULL; + return rc; + } + if (cifs_sb->prepath) + convert_delimiter(cifs_sb->prepath, CIFS_DIR_SEP(cifs_sb)); + }
cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_USE_PREFIX_PATH; return 0;
From: Thomas Zimmermann tzimmermann@suse.de
commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5 upstream.
Generate a hotplug event after registering a client to allow the client to configure its display. Remove the hotplug calls from the existing clients for fbdev emulation. This change fixes a concurrency bug between registering a client and receiving events from the DRM core. The bug is present in the fbdev emulation of all drivers.
The fbdev emulation currently generates a hotplug event before registering the client to the device. For each new output, the DRM core sends an additional hotplug event to each registered client.
If the DRM core detects an output between sending the artificial hotplug event and registering the client, the output's hotplug event gets lost. If this is the first output, the fbdev console display remains dark. This has been observed with amdgpu and fbdev-generic.
Fix this by adding hotplug generation directly to the client's register helper drm_client_register(). Registering the client and receiving events are serialized by struct drm_device.clientlist_mutex. So an output is either configured by the initial hotplug event, or the client has already been registered.
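The ordering argument can be illustrated with a single-threaded toy model. All `demo_` names are hypothetical, and the locking that the kernel performs with clientlist_mutex is deliberately left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_client {
	int registered;
	int outputs_seen;	/* outputs the client has configured */
};

struct demo_dev {
	int outputs;		/* outputs currently detected */
	struct demo_client *client;
};

/* Hotplug callback: the client picks up whatever outputs exist right now. */
static void demo_hotplug(struct demo_dev *dev, struct demo_client *c)
{
	c->outputs_seen = dev->outputs;
}

/* The core detects a new output and notifies the registered client, if any. */
void demo_output_detected(struct demo_dev *dev)
{
	dev->outputs++;
	if (dev->client)
		demo_hotplug(dev, dev->client);
}

/*
 * Register first, then fire the initial hotplug. In the kernel both steps
 * run under clientlist_mutex; locking is omitted in this sketch.
 */
void demo_client_register(struct demo_dev *dev, struct demo_client *c)
{
	dev->client = c;
	c->registered = 1;
	demo_hotplug(dev, c);
}
```

An output detected before registration is picked up by the initial hotplug, and one detected afterwards reaches the client through the normal notification path, so neither case is lost.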
The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done"), in which adding a client and receiving a hotplug event switched order. It was hidden, as most hardware and drivers have at least one static output configured. Other drivers didn't use the internal DRM client or still had struct drm_mode_config_funcs.output_poll_changed set. That callback handled hotplug events as well. After not setting the callback in amdgpu in commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct drm_driver.output_poll_changed"), amdgpu did not show a framebuffer console if output events got lost. The bug got copy-pasted from fbdev-generic into the other fbdev emulation.
Reported-by: Moritz Duge MoritzDuge@kolahilft.de Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649 Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done") Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file") Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers") Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel client") Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel client") Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation") Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client") Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel client") Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation") Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client") Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct drm_driver.output_poll_changed") Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Tested-by: Moritz Duge MoritzDuge@kolahilft.de Tested-by: Torsten Krah krah.tm@gmail.com Tested-by: Paul Schyska pschyska@gmail.com Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: David Airlie airlied@gmail.com Cc: Noralf Trønnes noralf@tronnes.org Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Maxime Ripard mripard@kernel.org Cc: Javier Martinez Canillas javierm@redhat.com Cc: Russell King linux@armlinux.org.uk Cc: Inki Dae inki.dae@samsung.com Cc: Seung-Woo Kim sw0312.kim@samsung.com Cc: Kyungmin Park kyungmin.park@samsung.com Cc: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Cc: Patrik Jakobsson patrik.r.jakobsson@gmail.com Cc: Rob Clark robdclark@gmail.com Cc: Abhinav Kumar quic_abhinavk@quicinc.com Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Cc: Alex Deucher alexander.deucher@amd.com Cc: "Christian König" christian.koenig@amd.com Cc: 
"Pan, Xinhui" Xinhui.Pan@amd.com Cc: Thierry Reding thierry.reding@gmail.com Cc: Mikko Perttunen mperttunen@nvidia.com Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-tegra@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Javier Martinez Canillas javierm@redhat.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org # msm Link: https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmer... [ Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as 174c3c38e3a2 drm/armada: Initialize fbdev DRM client was introduced in 6.5-rc1 ] Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_client.c | 21 +++++++++++++++++++++ drivers/gpu/drm/drm_fbdev_dma.c | 4 ---- drivers/gpu/drm/drm_fbdev_generic.c | 4 ---- drivers/gpu/drm/exynos/exynos_drm_fbdev.c | 4 ---- drivers/gpu/drm/gma500/fbdev.c | 4 ---- drivers/gpu/drm/msm/msm_fbdev.c | 4 ---- drivers/gpu/drm/omapdrm/omap_fbdev.c | 4 ---- drivers/gpu/drm/radeon/radeon_fbdev.c | 4 ---- drivers/gpu/drm/tegra/fbdev.c | 4 ---- 9 files changed, 21 insertions(+), 32 deletions(-)
--- a/drivers/gpu/drm/drm_client.c +++ b/drivers/gpu/drm/drm_client.c @@ -122,13 +122,34 @@ EXPORT_SYMBOL(drm_client_init); * drm_client_register() it is no longer permissible to call drm_client_release() * directly (outside the unregister callback), instead cleanup will happen * automatically on driver unload. + * + * Registering a client generates a hotplug event that allows the client + * to set up its display from pre-existing outputs. The client must have + * initialized its state to able to handle the hotplug event successfully. */ void drm_client_register(struct drm_client_dev *client) { struct drm_device *dev = client->dev; + int ret;
mutex_lock(&dev->clientlist_mutex); list_add(&client->list, &dev->clientlist); + + if (client->funcs && client->funcs->hotplug) { + /* + * Perform an initial hotplug event to pick up the + * display configuration for the client. This step + * has to be performed *after* registering the client + * in the list of clients, or a concurrent hotplug + * event might be lost; leaving the display off. + * + * Hold the clientlist_mutex as for a regular hotplug + * event. + */ + ret = client->funcs->hotplug(client); + if (ret) + drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); + } mutex_unlock(&dev->clientlist_mutex); } EXPORT_SYMBOL(drm_client_register); --- a/drivers/gpu/drm/drm_fbdev_dma.c +++ b/drivers/gpu/drm/drm_fbdev_dma.c @@ -253,10 +253,6 @@ void drm_fbdev_dma_setup(struct drm_devi goto err_drm_client_init; }
- ret = drm_fbdev_dma_client_hotplug(&fb_helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&fb_helper->client);
return; --- a/drivers/gpu/drm/drm_fbdev_generic.c +++ b/drivers/gpu/drm/drm_fbdev_generic.c @@ -340,10 +340,6 @@ void drm_fbdev_generic_setup(struct drm_ goto err_drm_client_init; }
- ret = drm_fbdev_generic_client_hotplug(&fb_helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&fb_helper->client);
return; --- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c +++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c @@ -216,10 +216,6 @@ void exynos_drm_fbdev_setup(struct drm_d if (ret) goto err_drm_client_init;
- ret = exynos_drm_fbdev_client_hotplug(&fb_helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&fb_helper->client);
return; --- a/drivers/gpu/drm/gma500/fbdev.c +++ b/drivers/gpu/drm/gma500/fbdev.c @@ -330,10 +330,6 @@ void psb_fbdev_setup(struct drm_psb_priv goto err_drm_fb_helper_unprepare; }
- ret = psb_fbdev_client_hotplug(&fb_helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&fb_helper->client);
return; --- a/drivers/gpu/drm/msm/msm_fbdev.c +++ b/drivers/gpu/drm/msm/msm_fbdev.c @@ -227,10 +227,6 @@ void msm_fbdev_setup(struct drm_device * goto err_drm_fb_helper_unprepare; }
- ret = msm_fbdev_client_hotplug(&helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&helper->client);
return; --- a/drivers/gpu/drm/omapdrm/omap_fbdev.c +++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c @@ -323,10 +323,6 @@ void omap_fbdev_setup(struct drm_device
INIT_WORK(&fbdev->work, pan_worker);
- ret = omap_fbdev_client_hotplug(&helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&helper->client);
return; --- a/drivers/gpu/drm/radeon/radeon_fbdev.c +++ b/drivers/gpu/drm/radeon/radeon_fbdev.c @@ -386,10 +386,6 @@ void radeon_fbdev_setup(struct radeon_de goto err_drm_client_init; }
- ret = radeon_fbdev_client_hotplug(&fb_helper->client); - if (ret) - drm_dbg_kms(rdev->ddev, "client hotplug ret=%d\n", ret); - drm_client_register(&fb_helper->client);
return; --- a/drivers/gpu/drm/tegra/fbdev.c +++ b/drivers/gpu/drm/tegra/fbdev.c @@ -227,10 +227,6 @@ void tegra_fbdev_setup(struct drm_device if (ret) goto err_drm_client_init;
- ret = tegra_fbdev_client_hotplug(&helper->client); - if (ret) - drm_dbg_kms(dev, "client hotplug ret=%d\n", ret); - drm_client_register(&helper->client);
return;
From: Chao Yu chao@kernel.org
commit 458c15dfbce62c35fefd9ca637b20a051309c9f1 upstream.
syzbot reports a bug as below:
general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] PREEMPT SMP KASAN
RIP: 0010:__lock_acquire+0x69/0x2000 kernel/locking/lockdep.c:4942
Call Trace:
 lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5691
 __raw_write_lock include/linux/rwlock_api_smp.h:209 [inline]
 _raw_write_lock+0x2e/0x40 kernel/locking/spinlock.c:300
 __drop_extent_tree+0x3ac/0x660 fs/f2fs/extent_cache.c:1100
 f2fs_drop_extent_tree+0x17/0x30 fs/f2fs/extent_cache.c:1116
 f2fs_insert_range+0x2d5/0x3c0 fs/f2fs/file.c:1664
 f2fs_fallocate+0x4e4/0x6d0 fs/f2fs/file.c:1838
 vfs_fallocate+0x54b/0x6b0 fs/open.c:324
 ksys_fallocate fs/open.c:347 [inline]
 __do_sys_fallocate fs/open.c:355 [inline]
 __se_sys_fallocate fs/open.c:353 [inline]
 __x64_sys_fallocate+0xbd/0x100 fs/open.c:353
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause is a race condition, as below:
- Since the remount targets a rw filesystem, do_remount won't call sb_prepare_remount_readonly to block fallocate, so remount and fallocate may race.
- In f2fs_remount(), default_options() resets the mount options to their defaults and then updates them based on the result of parse_options(), which leaves a window in which the race can happen.
Thread A                               Thread B
- f2fs_fill_super
 - parse_options
  - clear_opt(READ_EXTENT_CACHE)

- f2fs_remount
 - default_options
  - set_opt(READ_EXTENT_CACHE)
                                       - f2fs_fallocate
                                        - f2fs_insert_range
                                         - f2fs_drop_extent_tree
                                          - __drop_extent_tree
                                           - __may_extent_tree
                                            - test_opt(READ_EXTENT_CACHE) return true
                                           - write_lock(&et->lock) access NULL pointer
 - parse_options
  - clear_opt(READ_EXTENT_CACHE)
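A reduced sketch of the approach taken by the fix, assuming a plain bitmask for the mount options. The `DEMO_OPT_` flags and `demo_default_options()` are hypothetical names for this example and cover only two options.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_OPT_READ_EXTENT_CACHE (1u << 0)
#define DEMO_OPT_INLINE_DATA       (1u << 1)

/*
 * Reset options to defaults. On remount, leave the extent-cache bit alone so
 * a concurrent reader never observes it being set and then cleared again.
 */
void demo_default_options(unsigned int *opts, bool remount)
{
	if (!remount)
		*opts |= DEMO_OPT_READ_EXTENT_CACHE;
	*opts |= DEMO_OPT_INLINE_DATA;
}
```

On the initial mount the bit is set as before. On remount, an option that was cleared at mount time stays cleared, which closes the window shown in the diagram above.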
Cc: stable@vger.kernel.org Reported-by: syzbot+d015b6c2fbb5c383bf08@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/20230522124203.3838360-1-chao@kerne... Signed-off-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/super.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-)
--- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -2086,9 +2086,22 @@ static int f2fs_show_options(struct seq_ return 0; }
-static void default_options(struct f2fs_sb_info *sbi) +static void default_options(struct f2fs_sb_info *sbi, bool remount) { /* init some FS parameters */ + if (!remount) { + set_opt(sbi, READ_EXTENT_CACHE); + clear_opt(sbi, DISABLE_CHECKPOINT); + + if (f2fs_hw_support_discard(sbi) || f2fs_hw_should_discard(sbi)) + set_opt(sbi, DISCARD); + + if (f2fs_sb_has_blkzoned(sbi)) + F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_SECTION; + else + F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_BLOCK; + } + if (f2fs_sb_has_readonly(sbi)) F2FS_OPTION(sbi).active_logs = NR_CURSEG_RO_TYPE; else @@ -2118,23 +2131,16 @@ static void default_options(struct f2fs_ set_opt(sbi, INLINE_XATTR); set_opt(sbi, INLINE_DATA); set_opt(sbi, INLINE_DENTRY); - set_opt(sbi, READ_EXTENT_CACHE); set_opt(sbi, NOHEAP); - clear_opt(sbi, DISABLE_CHECKPOINT); set_opt(sbi, MERGE_CHECKPOINT); F2FS_OPTION(sbi).unusable_cap = 0; sbi->sb->s_flags |= SB_LAZYTIME; if (!f2fs_is_readonly(sbi)) set_opt(sbi, FLUSH_MERGE); - if (f2fs_hw_support_discard(sbi) || f2fs_hw_should_discard(sbi)) - set_opt(sbi, DISCARD); - if (f2fs_sb_has_blkzoned(sbi)) { + if (f2fs_sb_has_blkzoned(sbi)) F2FS_OPTION(sbi).fs_mode = FS_MODE_LFS; - F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_SECTION; - } else { + else F2FS_OPTION(sbi).fs_mode = FS_MODE_ADAPTIVE; - F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_BLOCK; - }
#ifdef CONFIG_F2FS_FS_XATTR set_opt(sbi, XATTR_USER); @@ -2306,7 +2312,7 @@ static int f2fs_remount(struct super_blo clear_sbi_flag(sbi, SBI_NEED_SB_WRITE); }
- default_options(sbi); + default_options(sbi, true);
/* parse mount options */ err = parse_options(sb, data, true); @@ -4357,7 +4363,7 @@ try_onemore: sbi->s_chksum_seed = f2fs_chksum(sbi, ~0, raw_super->uuid, sizeof(raw_super->uuid));
- default_options(sbi); + default_options(sbi, false); /* parse mount options */ options = kstrdup((const char *)data, GFP_KERNEL); if (data && !options) {
From: Jaegeuk Kim jaegeuk@kernel.org
commit 5eda1ad1aaffdfebdecf7a164e586060a210f74f upstream.
Thread #1:

[122554.641906][ T92] f2fs_getxattr+0xd4/0x5fc
   -> waiting for f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);

[122554.641927][ T92] __f2fs_get_acl+0x50/0x284
[122554.641948][ T92] f2fs_init_acl+0x84/0x54c
[122554.641969][ T92] f2fs_init_inode_metadata+0x460/0x5f0
[122554.641990][ T92] f2fs_add_inline_entry+0x11c/0x350
   -> Locked dir->inode_page by f2fs_get_node_page()

[122554.642009][ T92] f2fs_do_add_link+0x100/0x1e4
[122554.642025][ T92] f2fs_create+0xf4/0x22c
[122554.642047][ T92] vfs_create+0x130/0x1f4

Thread #2:

[123996.386358][ T92] __get_node_page+0x8c/0x504
   -> waiting for dir->inode_page lock

[123996.386383][ T92] read_all_xattrs+0x11c/0x1f4
[123996.386405][ T92] __f2fs_setxattr+0xcc/0x528
[123996.386424][ T92] f2fs_setxattr+0x158/0x1f4
   -> f2fs_down_write(&F2FS_I(inode)->i_xattr_sem);

[123996.386443][ T92] __f2fs_set_acl+0x328/0x430
[123996.386618][ T92] f2fs_set_acl+0x38/0x50
[123996.386642][ T92] posix_acl_chmod+0xc8/0x1c8
[123996.386669][ T92] f2fs_setattr+0x5e0/0x6bc
[123996.386689][ T92] notify_change+0x4d8/0x580
[123996.386717][ T92] chmod_common+0xd8/0x184
[123996.386748][ T92] do_fchmodat+0x60/0x124
[123996.386766][ T92] __arm64_sys_fchmodat+0x28/0x3c
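The two traces show an inverted lock order: thread #1 holds the inode page lock and waits for i_xattr_sem, while thread #2 holds i_xattr_sem and waits for the inode page lock. A tiny rank check, using hypothetical `demo_` names, expresses the order the fix aims to keep (i_xattr_sem before the inode page lock).

```c
#include <assert.h>

/* Ranks encode the intended order: i_xattr_sem before the inode page lock. */
enum demo_lock_rank {
	DEMO_NONE = 0,
	DEMO_XATTR_SEM = 1,
	DEMO_INODE_PAGE = 2,
};

/* Nonzero if acquiring `next` while holding `held` follows the order. */
int demo_order_ok(enum demo_lock_rank held, enum demo_lock_rank next)
{
	return next > held;
}
```

Under this ranking, thread #2's sequence is acceptable, while thread #1's sequence (page lock first, then the semaphore) is the one that violates the order.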
Cc: stable@vger.kernel.org Fixes: 27161f13e3c3 "f2fs: avoid race in between read xattr & write xattr" Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/dir.c | 9 ++++++++- fs/f2fs/xattr.c | 6 ++++-- 2 files changed, 12 insertions(+), 3 deletions(-)
--- a/fs/f2fs/dir.c +++ b/fs/f2fs/dir.c @@ -775,8 +775,15 @@ int f2fs_add_dentry(struct inode *dir, c { int err = -EAGAIN;
- if (f2fs_has_inline_dentry(dir)) + if (f2fs_has_inline_dentry(dir)) { + /* + * Should get i_xattr_sem to keep the lock order: + * i_xattr_sem -> inode_page lock used by f2fs_setxattr. + */ + f2fs_down_read(&F2FS_I(dir)->i_xattr_sem); err = f2fs_add_inline_entry(dir, fname, inode, ino, mode); + f2fs_up_read(&F2FS_I(dir)->i_xattr_sem); + } if (err == -EAGAIN) err = f2fs_add_regular_entry(dir, fname, inode, ino, mode);
--- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -528,10 +528,12 @@ int f2fs_getxattr(struct inode *inode, i if (len > F2FS_NAME_LEN) return -ERANGE;
- f2fs_down_read(&F2FS_I(inode)->i_xattr_sem); + if (!ipage) + f2fs_down_read(&F2FS_I(inode)->i_xattr_sem); error = lookup_all_xattrs(inode, ipage, index, len, name, &entry, &base_addr, &base_size, &is_inline); - f2fs_up_read(&F2FS_I(inode)->i_xattr_sem); + if (!ipage) + f2fs_up_read(&F2FS_I(inode)->i_xattr_sem); if (error) return error;
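Reviewer's sketch, not part of the patch: the two traces above form a classic lock-order inversion, and the fix makes the add-link path take i_xattr_sem before the directory's inode page. The toy rank checker below illustrates that rule in user space. The names `lock_tracker`, `tracker_acquire` and the rank values are hypothetical, and it models only the ordering, not the real f2fs locks (in the patch, f2fs_getxattr() additionally skips the semaphore when the caller passes `ipage`).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock ranks: a lower rank must always be taken first. */
enum lock_rank {
	RANK_XATTR_SEM = 1,	/* stands in for i_xattr_sem */
	RANK_INODE_PAGE = 2,	/* stands in for the dir inode page lock */
};

struct lock_tracker {
	int highest_held;	/* highest rank currently held, 0 if none */
};

/* Record an acquisition; return false if it inverts the agreed order. */
bool tracker_acquire(struct lock_tracker *t, enum lock_rank rank)
{
	assert(t);
	if ((int)rank <= t->highest_held)
		return false;
	t->highest_held = (int)rank;
	return true;
}

/* Old add-link path: page lock first, i_xattr_sem later in getxattr. */
bool old_add_link_order_ok(void)
{
	struct lock_tracker t = { 0 };

	return tracker_acquire(&t, RANK_INODE_PAGE) &&
	       tracker_acquire(&t, RANK_XATTR_SEM);
}

/* Patched path: i_xattr_sem is taken before the inode page is locked. */
bool new_add_link_order_ok(void)
{
	struct lock_tracker t = { 0 };

	return tracker_acquire(&t, RANK_XATTR_SEM) &&
	       tracker_acquire(&t, RANK_INODE_PAGE);
}
```

In this model the old path is rejected and the patched path is accepted, which matches the order f2fs_setxattr already uses (i_xattr_sem, then inode page).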
From: Masahiro Yamada masahiroy@kernel.org
commit 8ae071fc216a25f4f797f33c56857f4dd6b4408e upstream.
Josh Triplett reports that initramfs-tools needs modules.builtin and modules.builtin.modinfo to create a working initramfs for a non-modular kernel.
If this is a general tooling issue not limited to Debian, I think it makes sense to change modules_install.
This commit changes the targets as follows when CONFIG_MODULES=n.
In-tree builds:
  make modules -> no-op
  make modules_install -> install modules.builtin(.modinfo)

External module builds:
  make modules -> show error message like before
  make modules_install -> show error message like before
Link: https://lore.kernel.org/lkml/36a4014c73a52af27d930d3ca31d362b60f4461c.168635... Reported-by: Josh Triplett josh@joshtriplett.org Signed-off-by: Masahiro Yamada masahiroy@kernel.org Reviewed-by: Nicolas Schier nicolas@fjasle.eu Tested-by: Nicolas Schier nicolas@fjasle.eu Reviewed-by: Josh Triplett josh@joshtriplett.org Tested-by: Josh Triplett josh@joshtriplett.org Stable-dep-of: 4243afdb9326 ("kbuild: builddeb: always make modules_install, to install modules.builtin*") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Makefile | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-)
--- a/Makefile +++ b/Makefile @@ -1561,6 +1561,8 @@ modules_sign_only := y endif endif
+endif # CONFIG_MODULES + modinst_pre := ifneq ($(filter modules_install,$(MAKECMDGOALS)),) modinst_pre := __modinst_pre @@ -1571,18 +1573,18 @@ PHONY += __modinst_pre __modinst_pre: @rm -rf $(MODLIB)/kernel @rm -f $(MODLIB)/source - @mkdir -p $(MODLIB)/kernel + @mkdir -p $(MODLIB) +ifdef CONFIG_MODULES @ln -s $(abspath $(srctree)) $(MODLIB)/source @if [ ! $(objtree) -ef $(MODLIB)/build ]; then \ rm -f $(MODLIB)/build ; \ ln -s $(CURDIR) $(MODLIB)/build ; \ fi @sed 's:^\(.*\)\.o$$:kernel/\1.ko:' modules.order > $(MODLIB)/modules.order +endif @cp -f modules.builtin $(MODLIB)/ @cp -f $(objtree)/modules.builtin.modinfo $(MODLIB)/
-endif # CONFIG_MODULES - ### # Cleaning is done on three levels. # make clean Delete most generated files @@ -1924,6 +1926,13 @@ help: @echo ' clean - remove generated files in module directory only' @echo ''
+__external_modules_error: + @echo >&2 '***' + @echo >&2 '*** The present kernel disabled CONFIG_MODULES.' + @echo >&2 '*** You cannot build or install external modules.' + @echo >&2 '***' + @false + endif # KBUILD_EXTMOD
# --------------------------------------------------------------------------- @@ -1960,13 +1969,10 @@ else # CONFIG_MODULES # Modules not configured # ---------------------------------------------------------------------------
-modules modules_install: - @echo >&2 '***' - @echo >&2 '*** The present kernel configuration has modules disabled.' - @echo >&2 '*** To use the module feature, please run "make menuconfig" etc.' - @echo >&2 '*** to enable CONFIG_MODULES.' - @echo >&2 '***' - @exit 1 +PHONY += __external_modules_error + +modules modules_install: __external_modules_error + @:
KBUILD_MODULES :=
From: Mario Limonciello mario.limonciello@amd.com
commit 968ab9261627fa305307e3935ca1a32fcddd36cb upstream.
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe") had a mistake: in loop iteration 63 it would clear offset 0xFC instead of 0x100. Offset 0xFC is actually `WAKE_INT_MASTER_REG`. This cleared bits 13 and 15 of the register, which significantly changed the expected handling of GPIO0 on some platforms.
commit b26cd9325be4 ("pinctrl: amd: Disable and mask interrupts on resume") actually fixed this bug, but led to regressions on Lenovo Z13 and some other systems. This is because the driver had no handling for the bit 15 debounce behavior.
Quoting a public BKDG:
```
EnWinBlueBtn. Read-write. Reset: 0.
0=GPIO0 detect debounced power button; Power button override is 4 seconds.
1=GPIO0 detect debounced power button in S3/S5/S0i3, and detect "pressed less than 2 seconds" and "pressed 2~10 seconds" in S0; Power button override is 10 seconds
```
Cross-referencing the same master register in Windows, it's obvious that Windows doesn't use debounce values in this configuration. So align the Linux driver to do this as well. This fixes wake on lid when WAKE_INT_MASTER_REG is properly programmed.
Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217315 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230421120625.3366-2-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 7 +++++++ drivers/pinctrl/pinctrl-amd.h | 1 + 2 files changed, 8 insertions(+)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -125,6 +125,12 @@ static int amd_gpio_set_debounce(struct struct amd_gpio *gpio_dev = gpiochip_get_data(gc);
raw_spin_lock_irqsave(&gpio_dev->lock, flags); + + /* Use special handling for Pin0 debounce */ + pin_reg = readl(gpio_dev->base + WAKE_INT_MASTER_REG); + if (pin_reg & INTERNAL_GPIO0_DEBOUNCE) + debounce = 0; + pin_reg = readl(gpio_dev->base + offset * 4);
if (debounce) { @@ -219,6 +225,7 @@ static void amd_gpio_dbg_show(struct seq char *debounce_enable; char *wake_cntrlz;
+ seq_printf(s, "WAKE_INT_MASTER_REG: 0x%08x\n", readl(gpio_dev->base + WAKE_INT_MASTER_REG)); for (bank = 0; bank < gpio_dev->hwbank_num; bank++) { unsigned int time = 0; unsigned int unit = 0; --- a/drivers/pinctrl/pinctrl-amd.h +++ b/drivers/pinctrl/pinctrl-amd.h @@ -17,6 +17,7 @@ #define AMD_GPIO_PINS_BANK3 32
#define WAKE_INT_MASTER_REG 0xfc +#define INTERNAL_GPIO0_DEBOUNCE (1 << 15) #define EOI_MASK (1 << 29)
#define WAKE_INT_STATUS_REG0 0x2f8
From: Mario Limonciello mario.limonciello@amd.com
commit a855724dc08b8cb0c13ab1e065a4922f1e5a7552 upstream.
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe") had a mistake: in loop iteration 63 it would clear offset 0xFC instead of 0x100. Offset 0xFC is actually `WAKE_INT_MASTER_REG`. This cleared bits 13 and 15 of the register, which significantly changed the expected handling of GPIO0 on some platforms.
Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217315 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230421120625.3366-3-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -897,9 +897,9 @@ static void amd_gpio_irq_init(struct amd
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
- pin_reg = readl(gpio_dev->base + i * 4); + pin_reg = readl(gpio_dev->base + pin * 4); pin_reg &= ~mask; - writel(pin_reg, gpio_dev->base + i * 4); + writel(pin_reg, gpio_dev->base + pin * 4);
raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); }
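Reviewer's sketch, not part of the patch: the arithmetic behind "iteration 63" can be checked in a few lines. Each pin has one 32-bit register, so the byte offset is the pin number times 4. The helper `pin_number_at` is hypothetical and assumes, as the commit message implies, that the pin table has no pin 63, so table entry 63 describes pin 64.

```c
#include <assert.h>
#include <stdint.h>

#define WAKE_INT_MASTER_REG 0xfc	/* value from pinctrl-amd.h */

/* One 32-bit register per pin: byte offset is pin * 4. */
uint32_t pin_reg_offset(unsigned int pin)
{
	assert(pin <= UINT32_MAX / 4);	/* keep the multiplication in range */
	return (uint32_t)pin * 4;
}

/*
 * Hypothetical pin table lookup. Assumption taken from the commit
 * message: pin number 63 is skipped, so index 63 holds pin 64.
 */
unsigned int pin_number_at(unsigned int index)
{
	return index < 63 ? index : index + 1;
}
```

Indexing by the loop counter gives `pin_reg_offset(63)`, which is 0xFC (the wake master register), while indexing by the pin number gives `pin_reg_offset(pin_number_at(63))`, which is 0x100.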
From: Kornel Dulęba korneld@chromium.org
commit 0cf9e48ff22e15f3f0882991f33d23ccc5ae1d01 upstream.
Leverage gpiochip_line_is_irq to check whether a pin has an irq associated with it. The previous check ("irq == 0") didn't make much sense. The irq variable refers to the pinctrl irq, and has nothing to do with an individual pin.
On some systems, during a suspend/resume cycle, the firmware leaves an interrupt enabled on a pin that is not used by the kernel. Without this patch that caused an interrupt storm.
Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217315 Signed-off-by: Kornel Dulęba korneld@chromium.org Reviewed-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230421120625.3366-4-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -660,21 +660,21 @@ static bool do_amd_gpio_irq_handler(int * We must read the pin register again, in case the * value was changed while executing * generic_handle_domain_irq() above. - * If we didn't find a mapping for the interrupt, - * disable it in order to avoid a system hang caused - * by an interrupt storm. + * If the line is not an irq, disable it in order to + * avoid a system hang caused by an interrupt storm. */ raw_spin_lock_irqsave(&gpio_dev->lock, flags); regval = readl(regs + i); - if (irq == 0) { - regval &= ~BIT(INTERRUPT_ENABLE_OFF); + if (!gpiochip_line_is_irq(gc, irqnr + i)) { + regval &= ~BIT(INTERRUPT_MASK_OFF); dev_dbg(&gpio_dev->pdev->dev, "Disabling spurious GPIO IRQ %d\n", irqnr + i); + } else { + ret = true; } writel(regval, regs + i); raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); - ret = true; } } /* did not cause wake on resume context for shared IRQ */
From: Mario Limonciello mario.limonciello@amd.com
commit 65f6c7c91cb2ebacbf155e0f881f81e79f90d138 upstream.
commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe") was well intentioned to mask a firmware issue on a surface laptop, but it has a few problems:
1. It had a bug in the loop handling for iteration 63 that led to other problems with GPIO0 handling.
2. It disables interrupts that are used internally by the SOC but masked by default.
3. It masked a real firmware problem in some chromebooks that should have been caught during development but wasn't.
There has been a lot of other development around s2idle; particularly around handling of the spurious wakeups. If there is still a problem on the original reported surface laptop it should be avoided by adding a quirk to gpiolib-acpi for that system instead.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230421120625.3366-5-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 31 ------------------------------- 1 file changed, 31 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -877,34 +877,6 @@ static const struct pinconf_ops amd_pinc .pin_config_group_set = amd_pinconf_group_set, };
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev) -{ - struct pinctrl_desc *desc = gpio_dev->pctrl->desc; - unsigned long flags; - u32 pin_reg, mask; - int i; - - mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) | - BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) | - BIT(WAKE_CNTRL_OFF_S4); - - for (i = 0; i < desc->npins; i++) { - int pin = desc->pins[i].number; - const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin); - - if (!pd) - continue; - - raw_spin_lock_irqsave(&gpio_dev->lock, flags); - - pin_reg = readl(gpio_dev->base + pin * 4); - pin_reg &= ~mask; - writel(pin_reg, gpio_dev->base + pin * 4); - - raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); - } -} - #ifdef CONFIG_PM_SLEEP static bool amd_gpio_should_save(struct amd_gpio *gpio_dev, unsigned int pin) { @@ -1142,9 +1114,6 @@ static int amd_gpio_probe(struct platfor return PTR_ERR(gpio_dev->pctrl); }
- /* Disable and mask interrupts */ - amd_gpio_irq_init(gpio_dev); - girq = &gpio_dev->gc.irq; gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip); /* This will let us handle the parent IRQ in the driver */
From: Mario Limonciello mario.limonciello@amd.com
commit 0d5ace1a07f7e846d0f6d972af60d05515599d0b upstream.
It's uncommon to use debounce on any other pin, but technically we should only set debounce to 0 when working off GPIO0.
Cc: stable@vger.kernel.org Tested-by: Jan Visser starquake@linuxeverywhere.org Fixes: 968ab9261627 ("pinctrl: amd: Detect internal GPIO0 debounce handling") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230705133005.577-2-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -127,9 +127,11 @@ static int amd_gpio_set_debounce(struct raw_spin_lock_irqsave(&gpio_dev->lock, flags);
/* Use special handling for Pin0 debounce */ - pin_reg = readl(gpio_dev->base + WAKE_INT_MASTER_REG); - if (pin_reg & INTERNAL_GPIO0_DEBOUNCE) - debounce = 0; + if (offset == 0) { + pin_reg = readl(gpio_dev->base + WAKE_INT_MASTER_REG); + if (pin_reg & INTERNAL_GPIO0_DEBOUNCE) + debounce = 0; + }
pin_reg = readl(gpio_dev->base + offset * 4);
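Reviewer's sketch, not part of the patch: the decision this hunk implements can be expressed as a pure function. `struct fake_gpio_regs` and `effective_debounce` are hypothetical names; the struct holds a snapshot of WAKE_INT_MASTER_REG, which the driver itself obtains with readl().

```c
#include <assert.h>
#include <stdint.h>

#define INTERNAL_GPIO0_DEBOUNCE (UINT32_C(1) << 15)	/* bit 15, as in the patch */

/* Hypothetical snapshot of the one register this decision depends on. */
struct fake_gpio_regs {
	uint32_t wake_int_master;
};

/*
 * Return the debounce value that would go on to be programmed.
 * Only pin 0 is affected, and only when the hardware already
 * debounces GPIO0 internally.
 */
uint32_t effective_debounce(const struct fake_gpio_regs *regs,
			    unsigned int offset, uint32_t debounce)
{
	assert(regs);
	if (offset == 0 && (regs->wake_int_master & INTERNAL_GPIO0_DEBOUNCE))
		return 0;
	return debounce;
}
```

With bit 15 set, a request on pin 0 collapses to 0 while the same request on any other pin is passed through unchanged, which is the behavioural difference from the earlier version of this check.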
From: Mario Limonciello mario.limonciello@amd.com
commit 635a750d958e158e17af0f524bedc484b27fbb93 upstream.
On ASUS TUF A16 it is reported that the ITE5570 ACPI device connected to GPIO 7 is causing an interrupt storm. This issue doesn't happen on Windows.
Comparing the GPIO register configuration between Windows and Linux, bit 20 has been configured as a pull up on Windows, but not on Linux. Checking the GPIO declaration from the firmware, it is clear it *should* have been a pull up on Linux as well.
```
GpioInt (Level, ActiveLow, Exclusive, PullUp, 0x0000,
         "\_SB.GPIO", 0x00, ResourceConsumer, ,)
{   // Pin list
    0x0007
}
```
On Linux amd_gpio_set_config() is currently only used for programming the debounce. Actually the GPIO core calls it with all the arguments that are supported by a GPIO, but pinctrl-amd just responds with `-ENOTSUPP`.
To solve this issue expand amd_gpio_set_config() to support the other arguments amd_pinconf_set() supports, namely `PIN_CONFIG_BIAS_PULL_DOWN`, `PIN_CONFIG_BIAS_PULL_UP`, and `PIN_CONFIG_DRIVE_STRENGTH`.
Reported-by: Nik P npliashechnikov@gmail.com Reported-by: Nathan Schulte nmschulte@gmail.com Reported-by: Friedrich Vock friedrich.vock@gmx.de Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217336 Reported-by: dridri85@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217493 Link: https://lore.kernel.org/linux-input/20230530154058.17594-1-friedrich.vock@gm... Tested-by: Jan Visser starquake@linuxeverywhere.org Fixes: 2956b5d94a76 ("pinctrl / gpio: Introduce .set_config() callback for GPIO chips") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230705133005.577-3-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -188,18 +188,6 @@ static int amd_gpio_set_debounce(struct return ret; }
-static int amd_gpio_set_config(struct gpio_chip *gc, unsigned offset, - unsigned long config) -{ - u32 debounce; - - if (pinconf_to_config_param(config) != PIN_CONFIG_INPUT_DEBOUNCE) - return -ENOTSUPP; - - debounce = pinconf_to_config_argument(config); - return amd_gpio_set_debounce(gc, offset, debounce); -} - #ifdef CONFIG_DEBUG_FS static void amd_gpio_dbg_show(struct seq_file *s, struct gpio_chip *gc) { @@ -782,7 +770,7 @@ static int amd_pinconf_get(struct pinctr }
static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin, - unsigned long *configs, unsigned num_configs) + unsigned long *configs, unsigned int num_configs) { int i; u32 arg; @@ -872,6 +860,20 @@ static int amd_pinconf_group_set(struct return 0; }
+static int amd_gpio_set_config(struct gpio_chip *gc, unsigned int pin, + unsigned long config) +{ + struct amd_gpio *gpio_dev = gpiochip_get_data(gc); + + if (pinconf_to_config_param(config) == PIN_CONFIG_INPUT_DEBOUNCE) { + u32 debounce = pinconf_to_config_argument(config); + + return amd_gpio_set_debounce(gc, pin, debounce); + } + + return amd_pinconf_set(gpio_dev->pctrl, pin, &config, 1); +} + static const struct pinconf_ops amd_pinconf_ops = { .pin_config_get = amd_pinconf_get, .pin_config_set = amd_pinconf_set,
From: Mario Limonciello mario.limonciello@amd.com
commit 3f62312d04d4c68aace9cd06fc135e09573325f3 upstream.
pinctrl-amd currently tries to program bit 19 of all GPIOs to select either a 4kΩ or 8kΩ pull up, but this isn't what bit 19 does. Bit 19 is marked as reserved, even in the latest platform documentation.
Drop this programming functionality.
Tested-by: Jan Visser starquake@linuxeverywhere.org Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230705133005.577-4-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 16 ++++------------ drivers/pinctrl/pinctrl-amd.h | 1 - 2 files changed, 4 insertions(+), 13 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -209,7 +209,6 @@ static void amd_gpio_dbg_show(struct seq char *pin_sts; char *interrupt_sts; char *wake_sts; - char *pull_up_sel; char *orientation; char debounce_value[40]; char *debounce_enable; @@ -317,14 +316,9 @@ static void amd_gpio_dbg_show(struct seq seq_printf(s, " %s|", wake_sts);
if (pin_reg & BIT(PULL_UP_ENABLE_OFF)) { - if (pin_reg & BIT(PULL_UP_SEL_OFF)) - pull_up_sel = "8k"; - else - pull_up_sel = "4k"; - seq_printf(s, "%s ↑|", - pull_up_sel); + seq_puts(s, " ↑ |"); } else if (pin_reg & BIT(PULL_DOWN_ENABLE_OFF)) { - seq_puts(s, " ↓|"); + seq_puts(s, " ↓ |"); } else { seq_puts(s, " |"); } @@ -751,7 +745,7 @@ static int amd_pinconf_get(struct pinctr break;
case PIN_CONFIG_BIAS_PULL_UP: - arg = (pin_reg >> PULL_UP_SEL_OFF) & (BIT(0) | BIT(1)); + arg = (pin_reg >> PULL_UP_ENABLE_OFF) & BIT(0); break;
case PIN_CONFIG_DRIVE_STRENGTH: @@ -798,10 +792,8 @@ static int amd_pinconf_set(struct pinctr break;
case PIN_CONFIG_BIAS_PULL_UP: - pin_reg &= ~BIT(PULL_UP_SEL_OFF); - pin_reg |= (arg & BIT(0)) << PULL_UP_SEL_OFF; pin_reg &= ~BIT(PULL_UP_ENABLE_OFF); - pin_reg |= ((arg>>1) & BIT(0)) << PULL_UP_ENABLE_OFF; + pin_reg |= (arg & BIT(0)) << PULL_UP_ENABLE_OFF; break;
case PIN_CONFIG_DRIVE_STRENGTH: --- a/drivers/pinctrl/pinctrl-amd.h +++ b/drivers/pinctrl/pinctrl-amd.h @@ -36,7 +36,6 @@ #define WAKE_CNTRL_OFF_S4 15 #define PIN_STS_OFF 16 #define DRV_STRENGTH_SEL_OFF 17 -#define PULL_UP_SEL_OFF 19 #define PULL_UP_ENABLE_OFF 20 #define PULL_DOWN_ENABLE_OFF 21 #define OUTPUT_VALUE_OFF 22
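Reviewer's sketch, not part of the patch: after this change the pull-up setting is a single enable bit (bit 20) and bit 19 is left alone. The helpers below are hypothetical user-space stand-ins that operate on a copy of the pin register value rather than on MMIO.

```c
#include <assert.h>
#include <stdint.h>

#define PULL_UP_ENABLE_OFF 20	/* value from pinctrl-amd.h */

/* Set or clear the pull-up enable bit; every other bit is preserved. */
void pin_reg_set_pull_up(uint32_t *pin_reg, uint32_t arg)
{
	assert(pin_reg);
	*pin_reg &= ~(UINT32_C(1) << PULL_UP_ENABLE_OFF);
	*pin_reg |= (arg & UINT32_C(1)) << PULL_UP_ENABLE_OFF;
}

/* Report whether the pull-up is enabled (1) or not (0). */
uint32_t pin_reg_get_pull_up(uint32_t pin_reg)
{
	return (pin_reg >> PULL_UP_ENABLE_OFF) & UINT32_C(1);
}
```

The point of the sketch is that a value already present in reserved bit 19 survives both enabling and disabling the pull-up.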
From: Mario Limonciello mario.limonciello@amd.com
commit 283c5ce7da0a676f46539094d40067ad17c4f294 upstream.
Debounce handling is done in two different entry points in the driver. Unify this to make sure that it's always handled the same.
Tested-by: Jan Visser starquake@linuxeverywhere.org Signed-off-by: Mario Limonciello mario.limonciello@amd.com Link: https://lore.kernel.org/r/20230705133005.577-5-mario.limonciello@amd.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -115,16 +115,12 @@ static void amd_gpio_set_value(struct gp raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); }
-static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset, - unsigned debounce) +static int amd_gpio_set_debounce(struct amd_gpio *gpio_dev, unsigned int offset, + unsigned int debounce) { u32 time; u32 pin_reg; int ret = 0; - unsigned long flags; - struct amd_gpio *gpio_dev = gpiochip_get_data(gc); - - raw_spin_lock_irqsave(&gpio_dev->lock, flags);
/* Use special handling for Pin0 debounce */ if (offset == 0) { @@ -183,7 +179,6 @@ static int amd_gpio_set_debounce(struct pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF); } writel(pin_reg, gpio_dev->base + offset * 4); - raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
return ret; } @@ -782,9 +777,8 @@ static int amd_pinconf_set(struct pinctr
switch (param) { case PIN_CONFIG_INPUT_DEBOUNCE: - pin_reg &= ~DB_TMR_OUT_MASK; - pin_reg |= arg & DB_TMR_OUT_MASK; - break; + ret = amd_gpio_set_debounce(gpio_dev, pin, arg); + goto out_unlock;
case PIN_CONFIG_BIAS_PULL_DOWN: pin_reg &= ~BIT(PULL_DOWN_ENABLE_OFF); @@ -811,6 +805,7 @@ static int amd_pinconf_set(struct pinctr
writel(pin_reg, gpio_dev->base + pin*4); } +out_unlock: raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
return ret; @@ -857,12 +852,6 @@ static int amd_gpio_set_config(struct gp { struct amd_gpio *gpio_dev = gpiochip_get_data(gc);
- if (pinconf_to_config_param(config) == PIN_CONFIG_INPUT_DEBOUNCE) { - u32 debounce = pinconf_to_config_argument(config); - - return amd_gpio_set_debounce(gc, pin, debounce); - } - return amd_pinconf_set(gpio_dev->pctrl, pin, &config, 1); }
From: Valentin David valentin.david@gmail.com
commit b1c1b98962d17a922989aa3b2822946bbb5c091f upstream.
For Pluton TPM devices, it was assumed that there were no ACPI memory regions. This is not true for ASUS ROG Ally. ACPI advertises 0xfd500000-0xfd5fffff.
Since remapping is already done in `crb_map_pluton`, remapping again in `crb_map_io` causes an EBUSY error:
[ 3.510453] tpm_crb MSFT0101:00: can't request region for resource [mem 0xfd500000-0xfd5fffff]
[ 3.510463] tpm_crb: probe of MSFT0101:00 failed with error -16
Cc: stable@vger.kernel.org # v6.3+ Fixes: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton") Signed-off-by: Valentin David valentin.david@gmail.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_crb.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
--- a/drivers/char/tpm/tpm_crb.c +++ b/drivers/char/tpm/tpm_crb.c @@ -563,15 +563,18 @@ static int crb_map_io(struct acpi_device u32 rsp_size; int ret;
- INIT_LIST_HEAD(&acpi_resource_list); - ret = acpi_dev_get_resources(device, &acpi_resource_list, - crb_check_resource, iores_array); - if (ret < 0) - return ret; - acpi_dev_free_resource_list(&acpi_resource_list); - - /* Pluton doesn't appear to define ACPI memory regions */ + /* + * Pluton sometimes does not define ACPI memory regions. + * Mapping is then done in crb_map_pluton + */ if (priv->sm != ACPI_TPM2_COMMAND_BUFFER_WITH_PLUTON) { + INIT_LIST_HEAD(&acpi_resource_list); + ret = acpi_dev_get_resources(device, &acpi_resource_list, + crb_check_resource, iores_array); + if (ret < 0) + return ret; + acpi_dev_free_resource_list(&acpi_resource_list); + if (resource_type(iores_array) != IORESOURCE_MEM) { dev_err(dev, FW_BUG "TPM2 ACPI table does not define a memory resource\n"); return -EINVAL;
From: Jarkko Sakkinen jarkko.sakkinen@tuni.fi
commit f4032d615f90970d6c3ac1d9c0bce3351eb4445c upstream.
/dev/vtpmx is made visible before 'workqueue' is initialized, which can lead to memory corruption in the worst-case scenario.
Address this by initializing 'workqueue' as the very first step of the driver initialization.
Cc: stable@vger.kernel.org Fixes: 6f99612e2500 ("tpm: Proxy driver for supporting multiple emulated TPMs") Reviewed-by: Stefan Berger stefanb@linux.ibm.com Signed-off-by: Jarkko Sakkinen jarkko.sakkinen@tuni.fi Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_vtpm_proxy.c | 30 +++++++----------------------- 1 file changed, 7 insertions(+), 23 deletions(-)
--- a/drivers/char/tpm/tpm_vtpm_proxy.c +++ b/drivers/char/tpm/tpm_vtpm_proxy.c @@ -683,37 +683,21 @@ static struct miscdevice vtpmx_miscdev = .fops = &vtpmx_fops, };
-static int vtpmx_init(void) -{ - return misc_register(&vtpmx_miscdev); -} - -static void vtpmx_cleanup(void) -{ - misc_deregister(&vtpmx_miscdev); -} - static int __init vtpm_module_init(void) { int rc;
- rc = vtpmx_init(); - if (rc) { - pr_err("couldn't create vtpmx device\n"); - return rc; - } - workqueue = create_workqueue("tpm-vtpm"); if (!workqueue) { pr_err("couldn't create workqueue\n"); - rc = -ENOMEM; - goto err_vtpmx_cleanup; + return -ENOMEM; }
- return 0; - -err_vtpmx_cleanup: - vtpmx_cleanup(); + rc = misc_register(&vtpmx_miscdev); + if (rc) { + pr_err("couldn't create vtpmx device\n"); + destroy_workqueue(workqueue); + }
return rc; } @@ -721,7 +705,7 @@ err_vtpmx_cleanup: static void __exit vtpm_module_exit(void) { destroy_workqueue(workqueue); - vtpmx_cleanup(); + misc_deregister(&vtpmx_miscdev); }
module_init(vtpm_module_init);
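Reviewer's sketch, not part of the patch: the ordering rule here is "create what the device's handlers depend on first, expose the device last, and undo the first step if exposing fails". The model below uses a hypothetical `struct fake_driver` with boolean flags in place of the real workqueue and misc device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver's global state. */
struct fake_driver {
	bool workqueue_ready;	/* stands in for create_workqueue() */
	bool device_visible;	/* stands in for misc_register() */
	bool fail_register;	/* test knob: make registration fail */
};

/* Returns 0 on success, -1 if the device could not be exposed. */
int fake_module_init(struct fake_driver *d)
{
	assert(d);

	/* Step 1: set up the resource the device handlers rely on. */
	d->workqueue_ready = true;

	/* Step 2: only now make the device visible to user space. */
	if (d->fail_register) {
		d->workqueue_ready = false;	/* roll back step 1 */
		return -1;
	}
	d->device_visible = true;
	return 0;
}
```

At no point in this sequence is `device_visible` true while `workqueue_ready` is false, which is the window the original init order left open.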
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
commit edb13d7bb034c4d5523f15e9aeea31c504af6f91 upstream.
Further restrict with DMI_PRODUCT_VERSION.
Cc: stable@vger.kernel.org # v6.4+ Link: https://lore.kernel.org/linux-integrity/20230517122931.22385-1-peter.ujfalus... Fixes: 95a9359ee22f ("tpm: tpm_tis: Disable interrupts for AEON UPX-i11") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 7db3593941ea..9cb4e81fc548 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -143,6 +143,7 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = { .ident = "UPX-TGL", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "AAEON"), + DMI_MATCH(DMI_PRODUCT_VERSION, "UPX-TGL"), }, }, {}
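Reviewer's sketch, not part of the patch: a DMI table entry applies only when every listed field matches, and DMI_MATCH() compares by substring. The simplified model below shows why a vendor-only entry is too broad. The `fake_dmi_*` types are hypothetical stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a machine's DMI strings. */
struct fake_dmi_info {
	const char *sys_vendor;
	const char *product_version;
};

/* Hypothetical quirk rule; a NULL field means "do not check". */
struct fake_dmi_rule {
	const char *sys_vendor;
	const char *product_version;
};

/* Every non-NULL rule field must occur as a substring of the machine's. */
bool fake_dmi_matches(const struct fake_dmi_rule *rule,
		      const struct fake_dmi_info *info)
{
	assert(rule && info);
	if (rule->sys_vendor &&
	    !strstr(info->sys_vendor, rule->sys_vendor))
		return false;
	if (rule->product_version &&
	    !strstr(info->product_version, rule->product_version))
		return false;
	return true;
}
```

A rule that names only the vendor would apply the quirk to every board from that vendor; adding the product version restricts it to the intended one.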
From: Alexander Sverdlin alexander.sverdlin@siemens.com
commit f3b70b6e3390bfdf18fdd7d278a72a12784fdcce upstream.
Underlying I2C bus drivers do not always support longer transfers; imx-lpi2c, for instance, doesn't. SLB 9673 offers 427-byte packets.
Visible symptoms are:
tpm tpm0: Error left over data
tpm tpm0: tpm_transmit: tpm_recv: error -5
tpm_tis_i2c: probe of 1-002e failed with error -5
Cc: stable@vger.kernel.org # v5.20+ Fixes: bbc23a07b072 ("tpm: Add tpm_tis_i2c backend for tpm_tis_core") Tested-by: Michael Haener michael.haener@siemens.com Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis_i2c.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-)
--- a/drivers/char/tpm/tpm_tis_i2c.c +++ b/drivers/char/tpm/tpm_tis_i2c.c @@ -189,21 +189,28 @@ static int tpm_tis_i2c_read_bytes(struct int ret;
for (i = 0; i < TPM_RETRY; i++) { - /* write register */ - msg.len = sizeof(reg); - msg.buf = &reg; - msg.flags = 0; - ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); - if (ret < 0) - return ret; + u16 read = 0;
- /* read data */ - msg.buf = result; - msg.len = len; - msg.flags = I2C_M_RD; - ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); - if (ret < 0) - return ret; + while (read < len) { + /* write register */ + msg.len = sizeof(reg); + msg.buf = &reg; + msg.flags = 0; + ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); + if (ret < 0) + return ret; + + /* read data */ + msg.buf = result + read; + msg.len = len - read; + msg.flags = I2C_M_RD; + if (msg.len > I2C_SMBUS_BLOCK_MAX) + msg.len = I2C_SMBUS_BLOCK_MAX; + ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); + if (ret < 0) + return ret; + read += msg.len; + }
ret = tpm_tis_i2c_sanity_check_read(reg, len, result); if (ret == 0)
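Reviewer's sketch, not part of the patch: the read loop above caps every transfer at I2C_SMBUS_BLOCK_MAX (32) bytes. The user-space model below reproduces that loop against a hypothetical `fake_fifo` device whose transfer function rejects anything longer, the way a limited bus driver might.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define I2C_SMBUS_BLOCK_MAX 32	/* same value as the kernel's uapi header */
#define FAKE_EIO 5

/* Hypothetical device: a FIFO register backed by a byte array. */
struct fake_fifo {
	const uint8_t *data;
	size_t len;
	size_t pos;
	unsigned int transfers;	/* number of read transfers issued */
};

/* Stand-in for one I2C read message; refuses over-long transfers. */
int fake_i2c_read(struct fake_fifo *dev, uint8_t *buf, size_t len)
{
	if (len > I2C_SMBUS_BLOCK_MAX || len > dev->len - dev->pos)
		return -FAKE_EIO;
	memcpy(buf, dev->data + dev->pos, len);
	dev->pos += len;
	dev->transfers++;
	return 0;
}

/* Read len bytes in chunks of at most I2C_SMBUS_BLOCK_MAX bytes. */
int chunked_read(struct fake_fifo *dev, uint8_t *result, size_t len)
{
	size_t read = 0;

	assert(dev && result);
	while (read < len) {
		size_t chunk = len - read;
		int ret;

		if (chunk > I2C_SMBUS_BLOCK_MAX)
			chunk = I2C_SMBUS_BLOCK_MAX;
		ret = fake_i2c_read(dev, result + read, chunk);
		if (ret < 0)
			return ret;
		read += chunk;
	}
	return 0;
}
```

For the 427-byte packet size mentioned in the commit message, this loop issues 14 transfers: 13 full 32-byte chunks and one 11-byte tail.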
From: Christian Hesse mail@eworm.de
commit 08b0af4478bacb8bb701c172c99a34ea32da89f5 upstream.
This device suffers from an irq storm, so add it to tpm_tis_dmi_table to force polling.
Cc: stable@vger.kernel.org # v6.4+ Link: https://community.frame.work/t/boot-and-shutdown-hangs-with-arch-linux-kerne... Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Reported-by: roubro1991@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217631 Signed-off-by: Christian Hesse mail@eworm.de Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 9cb4e81fc548..5dd391ed3320 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -114,6 +114,14 @@ static int tpm_tis_disable_irq(const struct dmi_system_id *d) }
static const struct dmi_system_id tpm_tis_dmi_table[] = { + { + .callback = tpm_tis_disable_irq, + .ident = "Framework Laptop (12th Gen Intel Core)", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Framework"), + DMI_MATCH(DMI_PRODUCT_NAME, "Laptop (12th Gen Intel Core)"), + }, + }, { .callback = tpm_tis_disable_irq, .ident = "ThinkPad T490s",
From: Alexander Sverdlin alexander.sverdlin@siemens.com
commit 83e7e5d89f04d1c417492940f7922bc8416a8cc4 upstream.
Underlying I2C bus drivers do not always support longer transfers; imx-lpi2c, for instance, doesn't. The fix is symmetric to the previous patch, which fixed the read direction.
Cc: stable@vger.kernel.org # v5.20+ Fixes: bbc23a07b072 ("tpm: Add tpm_tis_i2c backend for tpm_tis_core") Tested-by: Michael Haener michael.haener@siemens.com Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis_i2c.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
--- a/drivers/char/tpm/tpm_tis_i2c.c +++ b/drivers/char/tpm/tpm_tis_i2c.c @@ -230,19 +230,27 @@ static int tpm_tis_i2c_write_bytes(struc struct i2c_msg msg = { .addr = phy->i2c_client->addr }; u8 reg = tpm_tis_i2c_address_to_register(addr); int ret; + u16 wrote = 0;
if (len > TPM_BUFSIZE - 1) return -EIO;
- /* write register and data in one go */ phy->io_buf[0] = reg; - memcpy(phy->io_buf + sizeof(reg), value, len); - - msg.len = sizeof(reg) + len; msg.buf = phy->io_buf; - ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); - if (ret < 0) - return ret; + while (wrote < len) { + /* write register and data in one go */ + msg.len = sizeof(reg) + len - wrote; + if (msg.len > I2C_SMBUS_BLOCK_MAX) + msg.len = I2C_SMBUS_BLOCK_MAX; + + memcpy(phy->io_buf + sizeof(reg), value + wrote, + msg.len - sizeof(reg)); + + ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg); + if (ret < 0) + return ret; + wrote += msg.len - sizeof(reg); + }
return 0; }
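Reviewer's sketch, not part of the patch: on the write side every message starts with the register byte, so each transfer carries at most 31 payload bytes. The model below mirrors the loop above against a hypothetical `fake_sink` that records the payloads and rejects messages longer than I2C_SMBUS_BLOCK_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define I2C_SMBUS_BLOCK_MAX 32	/* same value as the kernel's uapi header */
#define FAKE_EIO 5

/* Hypothetical sink that records the payload of every message. */
struct fake_sink {
	uint8_t data[512];
	size_t len;
	unsigned int transfers;
};

/* Stand-in for one I2C write message: byte 0 is the register. */
int fake_i2c_write(struct fake_sink *dev, const uint8_t *msg, size_t msg_len)
{
	size_t payload;

	if (msg_len < 1 || msg_len > I2C_SMBUS_BLOCK_MAX)
		return -FAKE_EIO;
	payload = msg_len - 1;
	if (payload > sizeof(dev->data) - dev->len)
		return -FAKE_EIO;
	memcpy(dev->data + dev->len, msg + 1, payload);
	dev->len += payload;
	dev->transfers++;
	return 0;
}

/* Write len bytes, re-sending the register byte with every chunk. */
int chunked_write(struct fake_sink *dev, uint8_t reg,
		  const uint8_t *value, size_t len)
{
	uint8_t io_buf[I2C_SMBUS_BLOCK_MAX];
	size_t wrote = 0;

	assert(dev && value);
	io_buf[0] = reg;
	while (wrote < len) {
		size_t msg_len = sizeof(reg) + len - wrote;
		int ret;

		if (msg_len > I2C_SMBUS_BLOCK_MAX)
			msg_len = I2C_SMBUS_BLOCK_MAX;
		memcpy(io_buf + sizeof(reg), value + wrote,
		       msg_len - sizeof(reg));
		ret = fake_i2c_write(dev, io_buf, msg_len);
		if (ret < 0)
			return ret;
		wrote += msg_len - sizeof(reg);
	}
	return 0;
}
```

Because one byte of each message is the register, a 427-byte payload needs 14 messages here as well: 13 carrying 31 bytes and one carrying the remaining 24.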
From: Jerry Snitselaar jsnitsel@redhat.com
commit ecff6813d2bcf0c670881a9ba3f51cb032dd405a upstream.
tpm_amd_is_rng_defective is for dealing with an issue related to the AMD firmware TPM, so on non-x86 architectures just have it inline and return false.
Cc: stable@vger.kernel.org # v6.3+ Reported-by: Sachin Sant sachinp@linux.ibm.com Reported-by: Aneesh Kumar K. V aneesh.kumar@linux.ibm.com Closes: https://lore.kernel.org/lkml/99B81401-DB46-49B9-B321-CF832B50CAC3@linux.ibm.... Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Signed-off-by: Jerry Snitselaar jsnitsel@redhat.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm-chip.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/char/tpm/tpm-chip.c +++ b/drivers/char/tpm/tpm-chip.c @@ -518,6 +518,7 @@ static int tpm_add_legacy_sysfs(struct t * 6.x.y.z series: 6.0.18.6 + * 3.x.y.z series: 3.57.y.5 + */ +#ifdef CONFIG_X86 static bool tpm_amd_is_rng_defective(struct tpm_chip *chip) { u32 val1, val2; @@ -566,6 +567,12 @@ release:
return true; } +#else +static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip) +{ + return false; +} +#endif /* CONFIG_X86 */
static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait) {
From: Christian Hesse mail@eworm.de
commit bc825e851c2fe89c127cac1e0e5cf344c4940619 upstream.
This device suffers from an irq storm, so add it to tpm_tis_dmi_table to force polling.
Cc: stable@vger.kernel.org # v6.4+ Link: https://community.frame.work/t/boot-and-shutdown-hangs-with-arch-linux-kerne... Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Reported-by: roubro1991@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217631 Signed-off-by: Christian Hesse mail@eworm.de Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 5dd391ed3320..4e4426965cd0 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -122,6 +122,14 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Laptop (12th Gen Intel Core)"), }, }, + { + .callback = tpm_tis_disable_irq, + .ident = "Framework Laptop (13th Gen Intel Core)", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Framework"), + DMI_MATCH(DMI_PRODUCT_NAME, "Laptop (13th Gen Intel Core)"), + }, + }, { .callback = tpm_tis_disable_irq, .ident = "ThinkPad T490s",
From: Lino Sanfilippo l.sanfilippo@kunbus.com
commit 481c2d14627de8ecbb54dd125466e4b4a5069b47 upstream.
After activation of interrupts for TPM TIS drivers, 0-day reports an interrupt storm on an Inspur NF5180M6 server.
Fix this by detecting the storm and falling back to polling: count the number of unhandled interrupts within a 10 ms time interval. If more than 1000 were unhandled, deactivate interrupts entirely, deregister the handler and use polling instead.
Also print a note to point to the tpm_tis_dmi_table.
Since the interrupt deregistration function devm_free_irq() waits for all interrupt handlers to finish, only trigger a worker in the interrupt handler and do the unregistration in the worker to avoid a deadlock.
Note: the storm detection logic mirrors the implementation in note_interrupt(), which uses timestamps and counters stored in struct irq_desc. Since this structure is private to the generic interrupt core, the TPM TIS core uses its own timestamps and counters. Furthermore, the TPM interrupt handler always returns IRQ_HANDLED to prevent the generic interrupt core from processing the interrupt storm.
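The counting scheme described above can be sketched in user space as follows. This is only an illustration, not the driver code: struct storm_state, storm_note_unhandled() and WINDOW_TICKS are made-up names, and a plain tick counter stands in for jiffies. As in the diff below, the counter restarts whenever the gap since the previous unhandled interrupt exceeds the window (HZ/10 in the patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_UNHANDLED_IRQS 1000 /* mirrors TPM_TIS_MAX_UNHANDLED_IRQS */
#define WINDOW_TICKS       100  /* assumed window length, arbitrary ticks */

/* Illustration-only bookkeeping, kept by the driver rather than irq_desc. */
struct storm_state {
	unsigned long last_unhandled; /* tick of the last unhandled IRQ */
	unsigned int unhandled;       /* unhandled IRQs in the current burst */
};

/*
 * Record one unhandled interrupt at time 'now'. Returns true once more
 * than MAX_UNHANDLED_IRQS have arrived without a gap longer than
 * WINDOW_TICKS between two of them, i.e. when a storm is assumed.
 */
static bool storm_note_unhandled(struct storm_state *s, unsigned long now)
{
	assert(s != NULL);

	if (now - s->last_unhandled > WINDOW_TICKS)
		s->unhandled = 1; /* quiet period: start a new burst */
	else
		s->unhandled++;

	s->last_unhandled = now;

	return s->unhandled > MAX_UNHANDLED_IRQS;
}
```

In the real handler the "true" case does not free the IRQ directly; it only masks interrupts and schedules a worker, for the deadlock reason given above.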
Cc: stable@vger.kernel.org # v6.4+ Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Reported-by: kernel test robot yujie.liu@intel.com Closes: https://lore.kernel.org/oe-lkp/202305041325.ae8b0c43-yujie.liu@intel.com/ Suggested-by: Lukas Wunner lukas@wunner.de Signed-off-by: Lino Sanfilippo l.sanfilippo@kunbus.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis_core.c | 103 +++++++++++++++++++++++++++----- drivers/char/tpm/tpm_tis_core.h | 4 ++ 2 files changed, 92 insertions(+), 15 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c index 558144fa707a..88a5384c09c0 100644 --- a/drivers/char/tpm/tpm_tis_core.c +++ b/drivers/char/tpm/tpm_tis_core.c @@ -24,9 +24,12 @@ #include <linux/wait.h> #include <linux/acpi.h> #include <linux/freezer.h> +#include <linux/dmi.h> #include "tpm.h" #include "tpm_tis_core.h"
+#define TPM_TIS_MAX_UNHANDLED_IRQS 1000 + static void tpm_tis_clkrun_enable(struct tpm_chip *chip, bool value);
static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask, @@ -468,25 +471,29 @@ static int tpm_tis_send_data(struct tpm_chip *chip, const u8 *buf, size_t len) return rc; }
-static void disable_interrupts(struct tpm_chip *chip) +static void __tpm_tis_disable_interrupts(struct tpm_chip *chip) +{ + struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev); + u32 int_mask = 0; + + tpm_tis_read32(priv, TPM_INT_ENABLE(priv->locality), &int_mask); + int_mask &= ~TPM_GLOBAL_INT_ENABLE; + tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), int_mask); + + chip->flags &= ~TPM_CHIP_FLAG_IRQ; +} + +static void tpm_tis_disable_interrupts(struct tpm_chip *chip) { struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev); - u32 intmask; - int rc;
if (priv->irq == 0) return;
- rc = tpm_tis_read32(priv, TPM_INT_ENABLE(priv->locality), &intmask); - if (rc < 0) - intmask = 0; - - intmask &= ~TPM_GLOBAL_INT_ENABLE; - rc = tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), intmask); + __tpm_tis_disable_interrupts(chip);
devm_free_irq(chip->dev.parent, priv->irq, chip); priv->irq = 0; - chip->flags &= ~TPM_CHIP_FLAG_IRQ; }
/* @@ -552,7 +559,7 @@ static int tpm_tis_send(struct tpm_chip *chip, u8 *buf, size_t len) if (!test_bit(TPM_TIS_IRQ_TESTED, &priv->flags)) tpm_msleep(1); if (!test_bit(TPM_TIS_IRQ_TESTED, &priv->flags)) - disable_interrupts(chip); + tpm_tis_disable_interrupts(chip); set_bit(TPM_TIS_IRQ_TESTED, &priv->flags); return rc; } @@ -752,6 +759,57 @@ static bool tpm_tis_req_canceled(struct tpm_chip *chip, u8 status) return status == TPM_STS_COMMAND_READY; }
+static irqreturn_t tpm_tis_revert_interrupts(struct tpm_chip *chip) +{ + struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev); + const char *product; + const char *vendor; + + dev_warn(&chip->dev, FW_BUG + "TPM interrupt storm detected, polling instead\n"); + + vendor = dmi_get_system_info(DMI_SYS_VENDOR); + product = dmi_get_system_info(DMI_PRODUCT_VERSION); + + if (vendor && product) { + dev_info(&chip->dev, + "Consider adding the following entry to tpm_tis_dmi_table:\n"); + dev_info(&chip->dev, "\tDMI_SYS_VENDOR: %s\n", vendor); + dev_info(&chip->dev, "\tDMI_PRODUCT_VERSION: %s\n", product); + } + + if (tpm_tis_request_locality(chip, 0) != 0) + return IRQ_NONE; + + __tpm_tis_disable_interrupts(chip); + tpm_tis_relinquish_locality(chip, 0); + + schedule_work(&priv->free_irq_work); + + return IRQ_HANDLED; +} + +static irqreturn_t tpm_tis_update_unhandled_irqs(struct tpm_chip *chip) +{ + struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev); + irqreturn_t irqret = IRQ_HANDLED; + + if (!(chip->flags & TPM_CHIP_FLAG_IRQ)) + return IRQ_HANDLED; + + if (time_after(jiffies, priv->last_unhandled_irq + HZ/10)) + priv->unhandled_irqs = 1; + else + priv->unhandled_irqs++; + + priv->last_unhandled_irq = jiffies; + + if (priv->unhandled_irqs > TPM_TIS_MAX_UNHANDLED_IRQS) + irqret = tpm_tis_revert_interrupts(chip); + + return irqret; +} + static irqreturn_t tis_int_handler(int dummy, void *dev_id) { struct tpm_chip *chip = dev_id; @@ -761,10 +819,10 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt); if (rc < 0) - return IRQ_NONE; + goto err;
if (interrupt == 0) - return IRQ_NONE; + goto err;
set_bit(TPM_TIS_IRQ_TESTED, &priv->flags); if (interrupt & TPM_INTF_DATA_AVAIL_INT) @@ -780,10 +838,13 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id) rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt); tpm_tis_relinquish_locality(chip, 0); if (rc < 0) - return IRQ_NONE; + goto err;
tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt); return IRQ_HANDLED; + +err: + return tpm_tis_update_unhandled_irqs(chip); }
static void tpm_tis_gen_interrupt(struct tpm_chip *chip) @@ -804,6 +865,15 @@ static void tpm_tis_gen_interrupt(struct tpm_chip *chip) chip->flags &= ~TPM_CHIP_FLAG_IRQ; }
+static void tpm_tis_free_irq_func(struct work_struct *work) +{ + struct tpm_tis_data *priv = container_of(work, typeof(*priv), free_irq_work); + struct tpm_chip *chip = priv->chip; + + devm_free_irq(chip->dev.parent, priv->irq, chip); + priv->irq = 0; +} + /* Register the IRQ and issue a command that will cause an interrupt. If an * irq is seen then leave the chip setup for IRQ operation, otherwise reverse * everything and leave in polling mode. Returns 0 on success. @@ -816,6 +886,7 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask, int rc; u32 int_status;
+ INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);
rc = devm_request_threaded_irq(chip->dev.parent, irq, NULL, tis_int_handler, IRQF_ONESHOT | flags, @@ -918,6 +989,7 @@ void tpm_tis_remove(struct tpm_chip *chip) interrupt = 0;
tpm_tis_write32(priv, reg, ~TPM_GLOBAL_INT_ENABLE & interrupt); + flush_work(&priv->free_irq_work);
tpm_tis_clkrun_enable(chip, false);
@@ -1021,6 +1093,7 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq, chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX); chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX); chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX); + priv->chip = chip; priv->timeout_min = TPM_TIMEOUT_USECS_MIN; priv->timeout_max = TPM_TIMEOUT_USECS_MAX; priv->phy_ops = phy_ops; @@ -1179,7 +1252,7 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq, rc = tpm_tis_request_locality(chip, 0); if (rc < 0) goto out_err; - disable_interrupts(chip); + tpm_tis_disable_interrupts(chip); tpm_tis_relinquish_locality(chip, 0); } } diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h index 610bfadb6acf..b1a169d7d1ca 100644 --- a/drivers/char/tpm/tpm_tis_core.h +++ b/drivers/char/tpm/tpm_tis_core.h @@ -91,11 +91,15 @@ enum tpm_tis_flags { };
struct tpm_tis_data { + struct tpm_chip *chip; u16 manufacturer_id; struct mutex locality_count_mutex; unsigned int locality_count; int locality; int irq; + struct work_struct free_irq_work; + unsigned long last_unhandled_irq; + unsigned int unhandled_irqs; unsigned int int_mask; unsigned long flags; void __iomem *ilb_base_addr;
From: Florian Bezdeka florian@bezdeka.de
commit 393f362389cecc2e4f2e3520a6c8ee9dbb1e3d15 upstream.
The Lenovo L590 suffers from an irq storm issue like the T490, T490s and P360 Tiny, so add an entry for it to tpm_tis_dmi_table and force polling.
Cc: stable@vger.kernel.org # v6.4+ Link: https://bugzilla.redhat.com/show_bug.cgi?id=2214069#c0 Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Signed-off-by: Florian Bezdeka florian@bezdeka.de Reviewed-by: Jerry Snitselaar jsnitsel@redhat.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 4e4426965cd0..cc42cf3de960 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -154,6 +154,14 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L490"), }, }, + { + .callback = tpm_tis_disable_irq, + .ident = "ThinkPad L590", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L590"), + }, + }, { .callback = tpm_tis_disable_irq, .ident = "UPX-TGL",
From: Arseniy Krasnov AVKrasnov@sberdevices.ru
commit 98480a181a08ceeede417e5b28f6d0429d8ae156 upstream.
The Meson NAND controller requires 8-byte alignment for DMA addresses; otherwise it "aligns" the passed address by itself and thus accesses an invalid location in the provided buffer. This patch treats unaligned buffers as not DMA-safe, so that they are reallocated and become valid.
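The alignment test itself is small enough to try in user space. The sketch below reuses the DMA_ADDR_ALIGN value from the patch; buffer_is_dma_aligned() and choose_dma_buffer() are hypothetical helper names, and the bounce buffer stands in for whatever reallocation the caller performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_ADDR_ALIGN 8 /* same value as in the patch */

/* True if the buffer address is a multiple of DMA_ADDR_ALIGN. */
static bool buffer_is_dma_aligned(const void *buffer)
{
	return ((uintptr_t)buffer % DMA_ADDR_ALIGN) == 0;
}

/*
 * Hypothetical helper: use the caller's buffer when it is aligned,
 * otherwise fall back to an aligned bounce buffer. Copying data into
 * or out of the bounce buffer is left to the caller.
 */
static const void *choose_dma_buffer(const void *buffer, const void *bounce)
{
	assert(buffer_is_dma_aligned(bounce));

	return buffer_is_dma_aligned(buffer) ? buffer : bounce;
}
```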
Fixes: 8fae856c5350 ("mtd: rawnand: meson: add support for Amlogic NAND flash controller") Cc: Stable@vger.kernel.org Signed-off-by: Arseniy Krasnov AVKrasnov@sberdevices.ru Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20230615080815.3291006-1-AVKrasnov@sberdev... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/meson_nand.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/mtd/nand/raw/meson_nand.c +++ b/drivers/mtd/nand/raw/meson_nand.c @@ -76,6 +76,7 @@ #define GENCMDIADDRH(aih, addr) ((aih) | (((addr) >> 16) & 0xffff))
#define DMA_DIR(dir) ((dir) ? NFC_CMD_N2M : NFC_CMD_M2N) +#define DMA_ADDR_ALIGN 8
#define ECC_CHECK_RETURN_FF (-1)
@@ -842,6 +843,9 @@ static int meson_nfc_read_oob(struct nan
static bool meson_nfc_is_buffer_dma_safe(const void *buffer) { + if ((uintptr_t)buffer % DMA_ADDR_ALIGN) + return false; + if (virt_addr_valid(buffer) && (!object_is_on_stack(buffer))) return true; return false;
From: Florian Fainelli florian.fainelli@broadcom.com
commit 1b5ea7ffb7a3bdfffb4b7f40ce0d20a3372ee405 upstream.
With support for Ethernet PHY LEDs having been added, while unregistering an MDIO bus and its child devices like PHYs there may be "late" accesses to the MDIO bus. One typical use case is setting the PHY LEDs' brightness to OFF, for instance.
We need to ensure that the MDIO bus controller remains entirely functional since it runs off the main GENET adapter clock.
Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230617155500.4005881-1-andrew@lunn.ch/ Fixes: 9a4e79697009 ("net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driver") Signed-off-by: Florian Fainelli florian.fainelli@broadcom.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230622103107.1760280-1-florian.fainelli@broadcom... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/broadcom/genet/bcmmii.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c @@ -673,5 +673,7 @@ void bcmgenet_mii_exit(struct net_device if (of_phy_is_fixed_link(dn)) of_phy_deregister_fixed_link(dn); of_node_put(priv->phy_dn); + clk_prepare_enable(priv->clk); platform_device_unregister(priv->mii_pdev); + clk_disable_unprepare(priv->clk); }
From: Oleksij Rempel o.rempel@pengutronix.de
commit fc0649395dca81f2b3b02d9b248acb38cbcee55c upstream.
Fix an issue where the kernel would stall during netboot, showing the "sched: RT throttling activated" message. This stall was triggered by the behavior of the mii_interrupt bit (Bit 7 - DP83TD510E_STS_MII_INT) in the DP83TD510E's PHY_STS Register (Address = 0x10). The DP83TD510E datasheet (2020) states that the bit clears on write, however, in practice, the bit clears on read.
This discrepancy had significant implications for the driver's interrupt handling. The PHY_STS Register was used by handle_interrupt() to check for pending interrupts and by read_status() to get the current link status. The call to read_status() was unintentionally clearing the mii_interrupt status bit without deasserting the IRQ pin, causing handle_interrupt() to miss other pending interrupts. This issue was most apparent during netboot.
The fix refrains from using the PHY_STS Register for interrupt handling. Instead, we now solely rely on the INTERRUPT_REG_1 Register (Address = 0x12) and INTERRUPT_REG_2 Register (Address = 0x13) for this purpose. These registers directly influence the IRQ pin state and are latched high until read.
Note: The INTERRUPT_REG_2 Register (Address = 0x13) exists and can also be used for interrupt handling, specifically for "Aneg page received interrupt" and "Polarity change interrupt". However, these features are currently not supported by this driver.
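The clear-on-read hazard can be reproduced with a toy register model. Everything below is a simulation under stated assumptions: struct fake_phy and the helper names are invented for illustration, and the two fields only model "PHY_STS bit 7 clears on read" and "INTERRUPT_REG_1 is latched until read" as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STS_MII_INT (1u << 7) /* bit 7 of PHY_STS, as in the driver */

/* Toy model of the two registers involved; not the real hardware. */
struct fake_phy {
	uint16_t phy_sts;          /* bit 7 clears on read */
	uint16_t int_reg1_latched; /* latched until read */
};

/* Reading PHY_STS returns the value and clears the mii_interrupt bit. */
static uint16_t read_phy_sts(struct fake_phy *p)
{
	uint16_t v;

	assert(p != NULL);
	v = p->phy_sts;
	p->phy_sts &= (uint16_t)~STS_MII_INT;
	return v;
}

/* Reading the latched interrupt register returns and clears it. */
static uint16_t read_int_reg1(struct fake_phy *p)
{
	uint16_t v;

	assert(p != NULL);
	v = p->int_reg1_latched;
	p->int_reg1_latched = 0;
	return v;
}

/* Old scheme: decide from PHY_STS whether an interrupt is pending. */
static bool old_handler_sees_irq(struct fake_phy *p)
{
	return (read_phy_sts(p) & STS_MII_INT) != 0;
}

/* New scheme: rely on the latched interrupt register only. */
static bool new_handler_sees_irq(struct fake_phy *p)
{
	return read_int_reg1(p) != 0;
}
```

If a status read (standing in for read_status()) happens before the handler runs, the old check finds bit 7 already cleared, while the latched register still reports the event.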
Fixes: 165cd04fe253 ("net: phy: dp83td510: Add support for the DP83TD510 Ethernet PHY") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20230621043848.3806124-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/dp83td510.c | 23 +++++------------------ 1 file changed, 5 insertions(+), 18 deletions(-)
--- a/drivers/net/phy/dp83td510.c +++ b/drivers/net/phy/dp83td510.c @@ -12,6 +12,11 @@
/* MDIO_MMD_VEND2 registers */ #define DP83TD510E_PHY_STS 0x10 +/* Bit 7 - mii_interrupt, active high. Clears on read. + * Note: Clearing does not necessarily deactivate IRQ pin if interrupts pending. + * This differs from the DP83TD510E datasheet (2020) which states this bit + * clears on write 0. + */ #define DP83TD510E_STS_MII_INT BIT(7) #define DP83TD510E_LINK_STATUS BIT(0)
@@ -53,12 +58,6 @@ static int dp83td510_config_intr(struct int ret;
if (phydev->interrupts == PHY_INTERRUPT_ENABLED) { - /* Clear any pending interrupts */ - ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS, - 0x0); - if (ret) - return ret; - ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_INTERRUPT_REG_1, DP83TD510E_INT1_LINK_EN); @@ -81,10 +80,6 @@ static int dp83td510_config_intr(struct DP83TD510E_GENCFG_INT_EN); if (ret) return ret; - - /* Clear any pending interrupts */ - ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS, - 0x0); }
return ret; @@ -94,14 +89,6 @@ static irqreturn_t dp83td510_handle_inte { int ret;
- ret = phy_read_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS); - if (ret < 0) { - phy_error(phydev); - return IRQ_NONE; - } else if (!(ret & DP83TD510E_STS_MII_INT)) { - return IRQ_NONE; - } - /* Read the current enabled interrupts */ ret = phy_read_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_INTERRUPT_REG_1); if (ret < 0) {
From: Arnd Bergmann arnd@arndb.de
commit fb646a4cd3f0ff27d19911bef7b6622263723df6 upstream.
The kasan sw-tags implementation contains one function that is only called from assembler and has no prototype in a header. This causes a W=1 warning:
mm/kasan/sw_tags.c:171:6: warning: no previous prototype for 'kasan_tag_mismatch' [-Wmissing-prototypes] 171 | void kasan_tag_mismatch(unsigned long addr, unsigned long access_info,
Add a prototype in the local header to get a clean build.
Link: https://lkml.kernel.org/r/20230509145735.9263-1-arnd@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Alexander Potapenko glider@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Marco Elver elver@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kasan/kasan.h | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/kasan/kasan.h +++ b/mm/kasan/kasan.h @@ -646,4 +646,7 @@ void *__hwasan_memset(void *addr, int c, void *__hwasan_memmove(void *dest, const void *src, size_t len); void *__hwasan_memcpy(void *dest, const void *src, size_t len);
+void kasan_tag_mismatch(unsigned long addr, unsigned long access_info, + unsigned long ret_ip); + #endif /* __MM_KASAN_KASAN_H */
From: Arnd Bergmann arnd@arndb.de
commit bb6e04a173f06e51819a4bb512e127dfbc50dcfa upstream.
gcc-13 warns about function definitions for builtin interfaces that have a different prototype, e.g.:
In file included from kasan_test.c:31: kasan.h:574:6: error: conflicting types for built-in function '__asan_register_globals'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 574 | void __asan_register_globals(struct kasan_global *globals, size_t size); kasan.h:577:6: error: conflicting types for built-in function '__asan_alloca_poison'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch] 577 | void __asan_alloca_poison(unsigned long addr, size_t size); kasan.h:580:6: error: conflicting types for built-in function '__asan_load1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 580 | void __asan_load1(unsigned long addr); kasan.h:581:6: error: conflicting types for built-in function '__asan_store1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch] 581 | void __asan_store1(unsigned long addr); kasan.h:643:6: error: conflicting types for built-in function '__hwasan_tag_memory'; expected 'void(void *, unsigned char, long int)' [-Werror=builtin-declaration-mismatch] 643 | void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
The two problems are:
- Addresses are passed as 'unsigned long' in the kernel, but gcc-13 expects a 'void *'.
- Sizes are meant to use a signed ssize_t rather than size_t.
Change all the prototypes to match these. Using 'void *' consistently for addresses gets rid of a couple of type casts, so push that down to the leaf functions where possible.
This now passes all randconfig builds on arm, arm64 and x86, but I have not tested it on the other architectures that support kasan, since they tend to fail randconfig builds in other ways. This might fail if any of the 32-bit architectures expect a 'long' instead of 'int' for the size argument.
The __asan_allocas_unpoison() function prototype is somewhat weird, since it uses a pointer for 'stack_top' and a size_t for 'stack_bottom'. This looks like it is meant to be 'addr' and 'size' like the others, but the implementation clearly treats them as 'top' and 'bottom'.
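As a rough illustration of "push the casts down to the leaf functions": the helpers below take a 'const void *' and convert to an integer only where bit arithmetic is needed. GRANULE_SIZE, offset_in_granule() and crosses_granule() are illustration-only names; the value 8 is assumed here to match the generic KASAN granule size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_SIZE 8UL /* assumed to match the generic KASAN granule */
#define GRANULE_MASK (GRANULE_SIZE - 1)

/* Leaf helper: the only place where the pointer becomes an integer. */
static unsigned long offset_in_granule(const void *addr)
{
	return (uintptr_t)addr & GRANULE_MASK;
}

/*
 * True if an access of 'size' bytes (1 <= size <= GRANULE_SIZE)
 * starting at 'addr' spills into the next granule. Same expression
 * shape as the boundary check in memory_is_poisoned_2_4_8().
 */
static bool crosses_granule(const void *addr, unsigned long size)
{
	assert(size >= 1 && size <= GRANULE_SIZE);

	return (((uintptr_t)addr + size - 1) & GRANULE_MASK) < size - 1;
}
```

Callers pass pointers straight through, so no cast is needed at each call site.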
Link: https://lkml.kernel.org/r/20230509145735.9263-2-arnd@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Cc: Alexander Potapenko glider@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Marco Elver elver@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/traps.c | 2 arch/arm64/mm/fault.c | 2 include/linux/kasan.h | 2 mm/kasan/common.c | 2 mm/kasan/generic.c | 72 ++++++++++----------- mm/kasan/kasan.h | 156 +++++++++++++++++++++++----------------------- mm/kasan/report.c | 17 ++--- mm/kasan/report_generic.c | 12 +-- mm/kasan/report_hw_tags.c | 2 mm/kasan/report_sw_tags.c | 2 mm/kasan/shadow.c | 36 +++++----- mm/kasan/sw_tags.c | 20 ++--- 12 files changed, 162 insertions(+), 163 deletions(-)
--- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -1044,7 +1044,7 @@ static int kasan_handler(struct pt_regs bool recover = esr & KASAN_ESR_RECOVER; bool write = esr & KASAN_ESR_WRITE; size_t size = KASAN_ESR_SIZE(esr); - u64 addr = regs->regs[0]; + void *addr = (void *)regs->regs[0]; u64 pc = regs->pc;
kasan_report(addr, size, write, pc); --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -317,7 +317,7 @@ static void report_tag_fault(unsigned lo * find out access size. */ bool is_write = !!(esr & ESR_ELx_WNR); - kasan_report(addr, 0, is_write, regs->pc); + kasan_report((void *)addr, 0, is_write, regs->pc); } #else /* Tag faults aren't enabled without CONFIG_KASAN_HW_TAGS. */ --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -343,7 +343,7 @@ static inline void *kasan_reset_tag(cons * @is_write: whether the bad access is a write or a read * @ip: instruction pointer for the accessibility check or the bad access itself */ -bool kasan_report(unsigned long addr, size_t size, +bool kasan_report(const void *addr, size_t size, bool is_write, unsigned long ip);
#else /* CONFIG_KASAN_SW_TAGS || CONFIG_KASAN_HW_TAGS */ --- a/mm/kasan/common.c +++ b/mm/kasan/common.c @@ -445,7 +445,7 @@ void * __must_check __kasan_krealloc(con bool __kasan_check_byte(const void *address, unsigned long ip) { if (!kasan_byte_accessible(address)) { - kasan_report((unsigned long)address, 1, false, ip); + kasan_report(address, 1, false, ip); return false; } return true; --- a/mm/kasan/generic.c +++ b/mm/kasan/generic.c @@ -40,39 +40,39 @@ * depending on memory access size X. */
-static __always_inline bool memory_is_poisoned_1(unsigned long addr) +static __always_inline bool memory_is_poisoned_1(const void *addr) { - s8 shadow_value = *(s8 *)kasan_mem_to_shadow((void *)addr); + s8 shadow_value = *(s8 *)kasan_mem_to_shadow(addr);
if (unlikely(shadow_value)) { - s8 last_accessible_byte = addr & KASAN_GRANULE_MASK; + s8 last_accessible_byte = (unsigned long)addr & KASAN_GRANULE_MASK; return unlikely(last_accessible_byte >= shadow_value); }
return false; }
-static __always_inline bool memory_is_poisoned_2_4_8(unsigned long addr, +static __always_inline bool memory_is_poisoned_2_4_8(const void *addr, unsigned long size) { - u8 *shadow_addr = (u8 *)kasan_mem_to_shadow((void *)addr); + u8 *shadow_addr = (u8 *)kasan_mem_to_shadow(addr);
/* * Access crosses 8(shadow size)-byte boundary. Such access maps * into 2 shadow bytes, so we need to check them both. */ - if (unlikely(((addr + size - 1) & KASAN_GRANULE_MASK) < size - 1)) + if (unlikely((((unsigned long)addr + size - 1) & KASAN_GRANULE_MASK) < size - 1)) return *shadow_addr || memory_is_poisoned_1(addr + size - 1);
return memory_is_poisoned_1(addr + size - 1); }
-static __always_inline bool memory_is_poisoned_16(unsigned long addr) +static __always_inline bool memory_is_poisoned_16(const void *addr) { - u16 *shadow_addr = (u16 *)kasan_mem_to_shadow((void *)addr); + u16 *shadow_addr = (u16 *)kasan_mem_to_shadow(addr);
/* Unaligned 16-bytes access maps into 3 shadow bytes. */ - if (unlikely(!IS_ALIGNED(addr, KASAN_GRANULE_SIZE))) + if (unlikely(!IS_ALIGNED((unsigned long)addr, KASAN_GRANULE_SIZE))) return *shadow_addr || memory_is_poisoned_1(addr + 15);
return *shadow_addr; @@ -120,26 +120,25 @@ static __always_inline unsigned long mem return bytes_is_nonzero(start, (end - start) % 8); }
-static __always_inline bool memory_is_poisoned_n(unsigned long addr, - size_t size) +static __always_inline bool memory_is_poisoned_n(const void *addr, size_t size) { unsigned long ret;
- ret = memory_is_nonzero(kasan_mem_to_shadow((void *)addr), - kasan_mem_to_shadow((void *)addr + size - 1) + 1); + ret = memory_is_nonzero(kasan_mem_to_shadow(addr), + kasan_mem_to_shadow(addr + size - 1) + 1);
if (unlikely(ret)) { - unsigned long last_byte = addr + size - 1; - s8 *last_shadow = (s8 *)kasan_mem_to_shadow((void *)last_byte); + const void *last_byte = addr + size - 1; + s8 *last_shadow = (s8 *)kasan_mem_to_shadow(last_byte);
if (unlikely(ret != (unsigned long)last_shadow || - ((long)(last_byte & KASAN_GRANULE_MASK) >= *last_shadow))) + (((long)last_byte & KASAN_GRANULE_MASK) >= *last_shadow))) return true; } return false; }
-static __always_inline bool memory_is_poisoned(unsigned long addr, size_t size) +static __always_inline bool memory_is_poisoned(const void *addr, size_t size) { if (__builtin_constant_p(size)) { switch (size) { @@ -159,7 +158,7 @@ static __always_inline bool memory_is_po return memory_is_poisoned_n(addr, size); }
-static __always_inline bool check_region_inline(unsigned long addr, +static __always_inline bool check_region_inline(const void *addr, size_t size, bool write, unsigned long ret_ip) { @@ -172,7 +171,7 @@ static __always_inline bool check_region if (unlikely(addr + size < addr)) return !kasan_report(addr, size, write, ret_ip);
- if (unlikely(!addr_has_metadata((void *)addr))) + if (unlikely(!addr_has_metadata(addr))) return !kasan_report(addr, size, write, ret_ip);
if (likely(!memory_is_poisoned(addr, size))) @@ -181,7 +180,7 @@ static __always_inline bool check_region return !kasan_report(addr, size, write, ret_ip); }
-bool kasan_check_range(unsigned long addr, size_t size, bool write, +bool kasan_check_range(const void *addr, size_t size, bool write, unsigned long ret_ip) { return check_region_inline(addr, size, write, ret_ip); @@ -221,36 +220,37 @@ static void register_global(struct kasan KASAN_GLOBAL_REDZONE, false); }
-void __asan_register_globals(struct kasan_global *globals, size_t size) +void __asan_register_globals(void *ptr, ssize_t size) { int i; + struct kasan_global *globals = ptr;
for (i = 0; i < size; i++) register_global(&globals[i]); } EXPORT_SYMBOL(__asan_register_globals);
-void __asan_unregister_globals(struct kasan_global *globals, size_t size) +void __asan_unregister_globals(void *ptr, ssize_t size) { } EXPORT_SYMBOL(__asan_unregister_globals);
#define DEFINE_ASAN_LOAD_STORE(size) \ - void __asan_load##size(unsigned long addr) \ + void __asan_load##size(void *addr) \ { \ check_region_inline(addr, size, false, _RET_IP_); \ } \ EXPORT_SYMBOL(__asan_load##size); \ __alias(__asan_load##size) \ - void __asan_load##size##_noabort(unsigned long); \ + void __asan_load##size##_noabort(void *); \ EXPORT_SYMBOL(__asan_load##size##_noabort); \ - void __asan_store##size(unsigned long addr) \ + void __asan_store##size(void *addr) \ { \ check_region_inline(addr, size, true, _RET_IP_); \ } \ EXPORT_SYMBOL(__asan_store##size); \ __alias(__asan_store##size) \ - void __asan_store##size##_noabort(unsigned long); \ + void __asan_store##size##_noabort(void *); \ EXPORT_SYMBOL(__asan_store##size##_noabort)
DEFINE_ASAN_LOAD_STORE(1); @@ -259,24 +259,24 @@ DEFINE_ASAN_LOAD_STORE(4); DEFINE_ASAN_LOAD_STORE(8); DEFINE_ASAN_LOAD_STORE(16);
-void __asan_loadN(unsigned long addr, size_t size) +void __asan_loadN(void *addr, ssize_t size) { kasan_check_range(addr, size, false, _RET_IP_); } EXPORT_SYMBOL(__asan_loadN);
__alias(__asan_loadN) -void __asan_loadN_noabort(unsigned long, size_t); +void __asan_loadN_noabort(void *, ssize_t); EXPORT_SYMBOL(__asan_loadN_noabort);
-void __asan_storeN(unsigned long addr, size_t size) +void __asan_storeN(void *addr, ssize_t size) { kasan_check_range(addr, size, true, _RET_IP_); } EXPORT_SYMBOL(__asan_storeN);
__alias(__asan_storeN) -void __asan_storeN_noabort(unsigned long, size_t); +void __asan_storeN_noabort(void *, ssize_t); EXPORT_SYMBOL(__asan_storeN_noabort);
/* to shut up compiler complaints */ @@ -284,7 +284,7 @@ void __asan_handle_no_return(void) {} EXPORT_SYMBOL(__asan_handle_no_return);
/* Emitted by compiler to poison alloca()ed objects. */ -void __asan_alloca_poison(unsigned long addr, size_t size) +void __asan_alloca_poison(void *addr, ssize_t size) { size_t rounded_up_size = round_up(size, KASAN_GRANULE_SIZE); size_t padding_size = round_up(size, KASAN_ALLOCA_REDZONE_SIZE) - @@ -295,7 +295,7 @@ void __asan_alloca_poison(unsigned long KASAN_ALLOCA_REDZONE_SIZE); const void *right_redzone = (const void *)(addr + rounded_up_size);
- WARN_ON(!IS_ALIGNED(addr, KASAN_ALLOCA_REDZONE_SIZE)); + WARN_ON(!IS_ALIGNED((unsigned long)addr, KASAN_ALLOCA_REDZONE_SIZE));
kasan_unpoison((const void *)(addr + rounded_down_size), size - rounded_down_size, false); @@ -307,18 +307,18 @@ void __asan_alloca_poison(unsigned long EXPORT_SYMBOL(__asan_alloca_poison);
/* Emitted by compiler to unpoison alloca()ed areas when the stack unwinds. */ -void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom) +void __asan_allocas_unpoison(void *stack_top, ssize_t stack_bottom) { - if (unlikely(!stack_top || stack_top > stack_bottom)) + if (unlikely(!stack_top || stack_top > (void *)stack_bottom)) return;
- kasan_unpoison(stack_top, stack_bottom - stack_top, false); + kasan_unpoison(stack_top, (void *)stack_bottom - stack_top, false); } EXPORT_SYMBOL(__asan_allocas_unpoison);
/* Emitted by the compiler to [un]poison local variables. */ #define DEFINE_ASAN_SET_SHADOW(byte) \ - void __asan_set_shadow_##byte(const void *addr, size_t size) \ + void __asan_set_shadow_##byte(const void *addr, ssize_t size) \ { \ __memset((void *)addr, 0x##byte, size); \ } \ --- a/mm/kasan/kasan.h +++ b/mm/kasan/kasan.h @@ -198,13 +198,13 @@ enum kasan_report_type { struct kasan_report_info { /* Filled in by kasan_report_*(). */ enum kasan_report_type type; - void *access_addr; + const void *access_addr; size_t access_size; bool is_write; unsigned long ip;
/* Filled in by the common reporting code. */ - void *first_bad_addr; + const void *first_bad_addr; struct kmem_cache *cache; void *object; size_t alloc_size; @@ -311,7 +311,7 @@ static __always_inline bool addr_has_met * @ret_ip: return address * @return: true if access was valid, false if invalid */ -bool kasan_check_range(unsigned long addr, size_t size, bool write, +bool kasan_check_range(const void *addr, size_t size, bool write, unsigned long ret_ip);
#else /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */ @@ -323,7 +323,7 @@ static __always_inline bool addr_has_met
#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
-void *kasan_find_first_bad_addr(void *addr, size_t size); +const void *kasan_find_first_bad_addr(const void *addr, size_t size); size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache); void kasan_complete_mode_report_info(struct kasan_report_info *info); void kasan_metadata_fetch_row(char *buffer, void *row); @@ -346,7 +346,7 @@ void kasan_print_aux_stacks(struct kmem_ static inline void kasan_print_aux_stacks(struct kmem_cache *cache, const void *object) { } #endif
-bool kasan_report(unsigned long addr, size_t size, +bool kasan_report(const void *addr, size_t size, bool is_write, unsigned long ip); void kasan_report_invalid_free(void *object, unsigned long ip, enum kasan_report_type type);
@@ -571,82 +571,82 @@ void kasan_restore_multi_shot(bool enabl */
asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); -void __asan_register_globals(struct kasan_global *globals, size_t size); -void __asan_unregister_globals(struct kasan_global *globals, size_t size); +void __asan_register_globals(void *globals, ssize_t size); +void __asan_unregister_globals(void *globals, ssize_t size); void __asan_handle_no_return(void); -void __asan_alloca_poison(unsigned long addr, size_t size); -void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom); +void __asan_alloca_poison(void *, ssize_t size); +void __asan_allocas_unpoison(void *stack_top, ssize_t stack_bottom);
-void __asan_load1(unsigned long addr); -void __asan_store1(unsigned long addr); -void __asan_load2(unsigned long addr); -void __asan_store2(unsigned long addr); -void __asan_load4(unsigned long addr); -void __asan_store4(unsigned long addr); -void __asan_load8(unsigned long addr); -void __asan_store8(unsigned long addr); -void __asan_load16(unsigned long addr); -void __asan_store16(unsigned long addr); -void __asan_loadN(unsigned long addr, size_t size); -void __asan_storeN(unsigned long addr, size_t size); - -void __asan_load1_noabort(unsigned long addr); -void __asan_store1_noabort(unsigned long addr); -void __asan_load2_noabort(unsigned long addr); -void __asan_store2_noabort(unsigned long addr); -void __asan_load4_noabort(unsigned long addr); -void __asan_store4_noabort(unsigned long addr); -void __asan_load8_noabort(unsigned long addr); -void __asan_store8_noabort(unsigned long addr); -void __asan_load16_noabort(unsigned long addr); -void __asan_store16_noabort(unsigned long addr); -void __asan_loadN_noabort(unsigned long addr, size_t size); -void __asan_storeN_noabort(unsigned long addr, size_t size); - -void __asan_report_load1_noabort(unsigned long addr); -void __asan_report_store1_noabort(unsigned long addr); -void __asan_report_load2_noabort(unsigned long addr); -void __asan_report_store2_noabort(unsigned long addr); -void __asan_report_load4_noabort(unsigned long addr); -void __asan_report_store4_noabort(unsigned long addr); -void __asan_report_load8_noabort(unsigned long addr); -void __asan_report_store8_noabort(unsigned long addr); -void __asan_report_load16_noabort(unsigned long addr); -void __asan_report_store16_noabort(unsigned long addr); -void __asan_report_load_n_noabort(unsigned long addr, size_t size); -void __asan_report_store_n_noabort(unsigned long addr, size_t size); - -void __asan_set_shadow_00(const void *addr, size_t size); -void __asan_set_shadow_f1(const void *addr, size_t size); -void __asan_set_shadow_f2(const void *addr, size_t 
size); -void __asan_set_shadow_f3(const void *addr, size_t size); -void __asan_set_shadow_f5(const void *addr, size_t size); -void __asan_set_shadow_f8(const void *addr, size_t size); - -void *__asan_memset(void *addr, int c, size_t len); -void *__asan_memmove(void *dest, const void *src, size_t len); -void *__asan_memcpy(void *dest, const void *src, size_t len); - -void __hwasan_load1_noabort(unsigned long addr); -void __hwasan_store1_noabort(unsigned long addr); -void __hwasan_load2_noabort(unsigned long addr); -void __hwasan_store2_noabort(unsigned long addr); -void __hwasan_load4_noabort(unsigned long addr); -void __hwasan_store4_noabort(unsigned long addr); -void __hwasan_load8_noabort(unsigned long addr); -void __hwasan_store8_noabort(unsigned long addr); -void __hwasan_load16_noabort(unsigned long addr); -void __hwasan_store16_noabort(unsigned long addr); -void __hwasan_loadN_noabort(unsigned long addr, size_t size); -void __hwasan_storeN_noabort(unsigned long addr, size_t size); - -void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size); - -void *__hwasan_memset(void *addr, int c, size_t len); -void *__hwasan_memmove(void *dest, const void *src, size_t len); -void *__hwasan_memcpy(void *dest, const void *src, size_t len); +void __asan_load1(void *); +void __asan_store1(void *); +void __asan_load2(void *); +void __asan_store2(void *); +void __asan_load4(void *); +void __asan_store4(void *); +void __asan_load8(void *); +void __asan_store8(void *); +void __asan_load16(void *); +void __asan_store16(void *); +void __asan_loadN(void *, ssize_t size); +void __asan_storeN(void *, ssize_t size); + +void __asan_load1_noabort(void *); +void __asan_store1_noabort(void *); +void __asan_load2_noabort(void *); +void __asan_store2_noabort(void *); +void __asan_load4_noabort(void *); +void __asan_store4_noabort(void *); +void __asan_load8_noabort(void *); +void __asan_store8_noabort(void *); +void __asan_load16_noabort(void *); +void 
__asan_store16_noabort(void *); +void __asan_loadN_noabort(void *, ssize_t size); +void __asan_storeN_noabort(void *, ssize_t size); + +void __asan_report_load1_noabort(void *); +void __asan_report_store1_noabort(void *); +void __asan_report_load2_noabort(void *); +void __asan_report_store2_noabort(void *); +void __asan_report_load4_noabort(void *); +void __asan_report_store4_noabort(void *); +void __asan_report_load8_noabort(void *); +void __asan_report_store8_noabort(void *); +void __asan_report_load16_noabort(void *); +void __asan_report_store16_noabort(void *); +void __asan_report_load_n_noabort(void *, ssize_t size); +void __asan_report_store_n_noabort(void *, ssize_t size); + +void __asan_set_shadow_00(const void *addr, ssize_t size); +void __asan_set_shadow_f1(const void *addr, ssize_t size); +void __asan_set_shadow_f2(const void *addr, ssize_t size); +void __asan_set_shadow_f3(const void *addr, ssize_t size); +void __asan_set_shadow_f5(const void *addr, ssize_t size); +void __asan_set_shadow_f8(const void *addr, ssize_t size); + +void *__asan_memset(void *addr, int c, ssize_t len); +void *__asan_memmove(void *dest, const void *src, ssize_t len); +void *__asan_memcpy(void *dest, const void *src, ssize_t len); + +void __hwasan_load1_noabort(void *); +void __hwasan_store1_noabort(void *); +void __hwasan_load2_noabort(void *); +void __hwasan_store2_noabort(void *); +void __hwasan_load4_noabort(void *); +void __hwasan_store4_noabort(void *); +void __hwasan_load8_noabort(void *); +void __hwasan_store8_noabort(void *); +void __hwasan_load16_noabort(void *); +void __hwasan_store16_noabort(void *); +void __hwasan_loadN_noabort(void *, ssize_t size); +void __hwasan_storeN_noabort(void *, ssize_t size); + +void __hwasan_tag_memory(void *, u8 tag, ssize_t size); + +void *__hwasan_memset(void *addr, int c, ssize_t len); +void *__hwasan_memmove(void *dest, const void *src, ssize_t len); +void *__hwasan_memcpy(void *dest, const void *src, ssize_t len);
-void kasan_tag_mismatch(unsigned long addr, unsigned long access_info, +void kasan_tag_mismatch(void *addr, unsigned long access_info, unsigned long ret_ip);
#endif /* __MM_KASAN_KASAN_H */ --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -211,7 +211,7 @@ static void start_report(unsigned long * pr_err("==================================================================\n"); }
-static void end_report(unsigned long *flags, void *addr) +static void end_report(unsigned long *flags, const void *addr) { if (addr) trace_error_report_end(ERROR_DETECTOR_KASAN, @@ -450,8 +450,8 @@ static void print_memory_metadata(const
static void print_report(struct kasan_report_info *info) { - void *addr = kasan_reset_tag(info->access_addr); - u8 tag = get_tag(info->access_addr); + void *addr = kasan_reset_tag((void *)info->access_addr); + u8 tag = get_tag((void *)info->access_addr);
print_error_description(info); if (addr_has_metadata(addr)) @@ -468,12 +468,12 @@ static void print_report(struct kasan_re
static void complete_report_info(struct kasan_report_info *info) { - void *addr = kasan_reset_tag(info->access_addr); + void *addr = kasan_reset_tag((void *)info->access_addr); struct slab *slab;
if (info->type == KASAN_REPORT_ACCESS) info->first_bad_addr = kasan_find_first_bad_addr( - info->access_addr, info->access_size); + (void *)info->access_addr, info->access_size); else info->first_bad_addr = addr;
@@ -544,11 +544,10 @@ void kasan_report_invalid_free(void *ptr * user_access_save/restore(): kasan_report_invalid_free() cannot be called * from a UACCESS region, and kasan_report_async() is not used on x86. */ -bool kasan_report(unsigned long addr, size_t size, bool is_write, +bool kasan_report(const void *addr, size_t size, bool is_write, unsigned long ip) { bool ret = true; - void *ptr = (void *)addr; unsigned long ua_flags = user_access_save(); unsigned long irq_flags; struct kasan_report_info info; @@ -562,7 +561,7 @@ bool kasan_report(unsigned long addr, si
memset(&info, 0, sizeof(info)); info.type = KASAN_REPORT_ACCESS; - info.access_addr = ptr; + info.access_addr = addr; info.access_size = size; info.is_write = is_write; info.ip = ip; @@ -571,7 +570,7 @@ bool kasan_report(unsigned long addr, si
print_report(&info);
- end_report(&irq_flags, ptr); + end_report(&irq_flags, (void *)addr);
out: user_access_restore(ua_flags); --- a/mm/kasan/report_generic.c +++ b/mm/kasan/report_generic.c @@ -30,9 +30,9 @@ #include "kasan.h" #include "../slab.h"
-void *kasan_find_first_bad_addr(void *addr, size_t size) +const void *kasan_find_first_bad_addr(const void *addr, size_t size) { - void *p = addr; + const void *p = addr;
if (!addr_has_metadata(p)) return p; @@ -362,14 +362,14 @@ void kasan_print_address_stack_frame(con #endif /* CONFIG_KASAN_STACK */
#define DEFINE_ASAN_REPORT_LOAD(size) \ -void __asan_report_load##size##_noabort(unsigned long addr) \ +void __asan_report_load##size##_noabort(void *addr) \ { \ kasan_report(addr, size, false, _RET_IP_); \ } \ EXPORT_SYMBOL(__asan_report_load##size##_noabort)
#define DEFINE_ASAN_REPORT_STORE(size) \ -void __asan_report_store##size##_noabort(unsigned long addr) \ +void __asan_report_store##size##_noabort(void *addr) \ { \ kasan_report(addr, size, true, _RET_IP_); \ } \ @@ -386,13 +386,13 @@ DEFINE_ASAN_REPORT_STORE(4); DEFINE_ASAN_REPORT_STORE(8); DEFINE_ASAN_REPORT_STORE(16);
-void __asan_report_load_n_noabort(unsigned long addr, size_t size) +void __asan_report_load_n_noabort(void *addr, ssize_t size) { kasan_report(addr, size, false, _RET_IP_); } EXPORT_SYMBOL(__asan_report_load_n_noabort);
-void __asan_report_store_n_noabort(unsigned long addr, size_t size) +void __asan_report_store_n_noabort(void *addr, ssize_t size) { kasan_report(addr, size, true, _RET_IP_); } --- a/mm/kasan/report_hw_tags.c +++ b/mm/kasan/report_hw_tags.c @@ -15,7 +15,7 @@
#include "kasan.h"
-void *kasan_find_first_bad_addr(void *addr, size_t size) +const void *kasan_find_first_bad_addr(const void *addr, size_t size) { /* * Hardware Tag-Based KASAN only calls this function for normal memory --- a/mm/kasan/report_sw_tags.c +++ b/mm/kasan/report_sw_tags.c @@ -30,7 +30,7 @@ #include "kasan.h" #include "../slab.h"
-void *kasan_find_first_bad_addr(void *addr, size_t size) +const void *kasan_find_first_bad_addr(const void *addr, size_t size) { u8 tag = get_tag(addr); void *p = kasan_reset_tag(addr); --- a/mm/kasan/shadow.c +++ b/mm/kasan/shadow.c @@ -28,13 +28,13 @@
bool __kasan_check_read(const volatile void *p, unsigned int size) { - return kasan_check_range((unsigned long)p, size, false, _RET_IP_); + return kasan_check_range((void *)p, size, false, _RET_IP_); } EXPORT_SYMBOL(__kasan_check_read);
bool __kasan_check_write(const volatile void *p, unsigned int size) { - return kasan_check_range((unsigned long)p, size, true, _RET_IP_); + return kasan_check_range((void *)p, size, true, _RET_IP_); } EXPORT_SYMBOL(__kasan_check_write);
@@ -50,7 +50,7 @@ EXPORT_SYMBOL(__kasan_check_write); #undef memset void *memset(void *addr, int c, size_t len) { - if (!kasan_check_range((unsigned long)addr, len, true, _RET_IP_)) + if (!kasan_check_range(addr, len, true, _RET_IP_)) return NULL;
return __memset(addr, c, len); @@ -60,8 +60,8 @@ void *memset(void *addr, int c, size_t l #undef memmove void *memmove(void *dest, const void *src, size_t len) { - if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) || - !kasan_check_range((unsigned long)dest, len, true, _RET_IP_)) + if (!kasan_check_range(src, len, false, _RET_IP_) || + !kasan_check_range(dest, len, true, _RET_IP_)) return NULL;
return __memmove(dest, src, len); @@ -71,17 +71,17 @@ void *memmove(void *dest, const void *sr #undef memcpy void *memcpy(void *dest, const void *src, size_t len) { - if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) || - !kasan_check_range((unsigned long)dest, len, true, _RET_IP_)) + if (!kasan_check_range(src, len, false, _RET_IP_) || + !kasan_check_range(dest, len, true, _RET_IP_)) return NULL;
return __memcpy(dest, src, len); } #endif
-void *__asan_memset(void *addr, int c, size_t len) +void *__asan_memset(void *addr, int c, ssize_t len) { - if (!kasan_check_range((unsigned long)addr, len, true, _RET_IP_)) + if (!kasan_check_range(addr, len, true, _RET_IP_)) return NULL;
return __memset(addr, c, len); @@ -89,10 +89,10 @@ void *__asan_memset(void *addr, int c, s EXPORT_SYMBOL(__asan_memset);
#ifdef __HAVE_ARCH_MEMMOVE -void *__asan_memmove(void *dest, const void *src, size_t len) +void *__asan_memmove(void *dest, const void *src, ssize_t len) { - if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) || - !kasan_check_range((unsigned long)dest, len, true, _RET_IP_)) + if (!kasan_check_range(src, len, false, _RET_IP_) || + !kasan_check_range(dest, len, true, _RET_IP_)) return NULL;
return __memmove(dest, src, len); @@ -100,10 +100,10 @@ void *__asan_memmove(void *dest, const v EXPORT_SYMBOL(__asan_memmove); #endif
-void *__asan_memcpy(void *dest, const void *src, size_t len) +void *__asan_memcpy(void *dest, const void *src, ssize_t len) { - if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) || - !kasan_check_range((unsigned long)dest, len, true, _RET_IP_)) + if (!kasan_check_range(src, len, false, _RET_IP_) || + !kasan_check_range(dest, len, true, _RET_IP_)) return NULL;
return __memcpy(dest, src, len); @@ -111,13 +111,13 @@ void *__asan_memcpy(void *dest, const vo EXPORT_SYMBOL(__asan_memcpy);
#ifdef CONFIG_KASAN_SW_TAGS -void *__hwasan_memset(void *addr, int c, size_t len) __alias(__asan_memset); +void *__hwasan_memset(void *addr, int c, ssize_t len) __alias(__asan_memset); EXPORT_SYMBOL(__hwasan_memset); #ifdef __HAVE_ARCH_MEMMOVE -void *__hwasan_memmove(void *dest, const void *src, size_t len) __alias(__asan_memmove); +void *__hwasan_memmove(void *dest, const void *src, ssize_t len) __alias(__asan_memmove); EXPORT_SYMBOL(__hwasan_memmove); #endif -void *__hwasan_memcpy(void *dest, const void *src, size_t len) __alias(__asan_memcpy); +void *__hwasan_memcpy(void *dest, const void *src, ssize_t len) __alias(__asan_memcpy); EXPORT_SYMBOL(__hwasan_memcpy); #endif
--- a/mm/kasan/sw_tags.c +++ b/mm/kasan/sw_tags.c @@ -70,8 +70,8 @@ u8 kasan_random_tag(void) return (u8)(state % (KASAN_TAG_MAX + 1)); }
-bool kasan_check_range(unsigned long addr, size_t size, bool write, - unsigned long ret_ip) +bool kasan_check_range(const void *addr, size_t size, bool write, + unsigned long ret_ip) { u8 tag; u8 *shadow_first, *shadow_last, *shadow; @@ -133,12 +133,12 @@ bool kasan_byte_accessible(const void *a }
#define DEFINE_HWASAN_LOAD_STORE(size) \ - void __hwasan_load##size##_noabort(unsigned long addr) \ + void __hwasan_load##size##_noabort(void *addr) \ { \ - kasan_check_range(addr, size, false, _RET_IP_); \ + kasan_check_range(addr, size, false, _RET_IP_); \ } \ EXPORT_SYMBOL(__hwasan_load##size##_noabort); \ - void __hwasan_store##size##_noabort(unsigned long addr) \ + void __hwasan_store##size##_noabort(void *addr) \ { \ kasan_check_range(addr, size, true, _RET_IP_); \ } \ @@ -150,25 +150,25 @@ DEFINE_HWASAN_LOAD_STORE(4); DEFINE_HWASAN_LOAD_STORE(8); DEFINE_HWASAN_LOAD_STORE(16);
-void __hwasan_loadN_noabort(unsigned long addr, unsigned long size) +void __hwasan_loadN_noabort(void *addr, ssize_t size) { kasan_check_range(addr, size, false, _RET_IP_); } EXPORT_SYMBOL(__hwasan_loadN_noabort);
-void __hwasan_storeN_noabort(unsigned long addr, unsigned long size) +void __hwasan_storeN_noabort(void *addr, ssize_t size) { kasan_check_range(addr, size, true, _RET_IP_); } EXPORT_SYMBOL(__hwasan_storeN_noabort);
-void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size) +void __hwasan_tag_memory(void *addr, u8 tag, ssize_t size) { - kasan_poison((void *)addr, size, tag, false); + kasan_poison(addr, size, tag, false); } EXPORT_SYMBOL(__hwasan_tag_memory);
-void kasan_tag_mismatch(unsigned long addr, unsigned long access_info, +void kasan_tag_mismatch(void *addr, unsigned long access_info, unsigned long ret_ip) { kasan_report(addr, 1 << (access_info & 0xf), access_info & 0x10,
From: Andrey Konovalov andreyknvl@google.com
commit fdb54d96600aafe45951f549866cd6fc1af59954 upstream.
Commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") added precise kmalloc redzone poisoning to the slub_debug functionality.
However, this commit didn't account for HW_TAGS KASAN fully initializing the object via its built-in memory initialization feature. Even though HW_TAGS KASAN memory initialization contains special memory initialization handling for when slub_debug is enabled, it does not account for in-object slub_debug redzones. As a result, HW_TAGS KASAN can overwrite these redzones and cause false-positive slub_debug reports.
To fix the issue, avoid HW_TAGS KASAN memory initialization when slub_debug is enabled altogether. Implement this by moving the __slub_debug_enabled check to slab_post_alloc_hook. Common slab code seems like a more appropriate place for a slub_debug check anyway.
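The resulting decision can be modelled outside the kernel as a small truth table. The sketch below is only an illustration: both function names are hypothetical, and the three boolean inputs are stand-ins for init, __slub_debug_enabled() and kasan_has_integrated_init().

```c
#include <stdbool.h>

/* Hypothetical model: should KASAN be asked to initialize the object? */
static bool kasan_init_requested(bool init, bool slub_debug)
{
	/* With slub_debug on, KASAN is never asked to zero the object. */
	return slub_debug ? false : init;
}

/* Hypothetical model: does the slab code need to run its own memset()? */
static bool slab_memset_needed(bool init, bool slub_debug, bool integrated_init)
{
	bool kasan_init = kasan_init_requested(init, slub_debug);

	return init && (!kasan_init || !integrated_init);
}
```

With slub_debug enabled and init requested, the model always selects the explicit memset, which is sized by the slab code rather than rounded up to a KASAN granule.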
Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.168856101...
Fixes: 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reported-by: Will Deacon will@kernel.org
Acked-by: Marco Elver elver@google.com
Cc: Mark Rutland mark.rutland@arm.com
Cc: Alexander Potapenko glider@google.com
Cc: Andrey Ryabinin ryabinin.a.a@gmail.com
Cc: Catalin Marinas catalin.marinas@arm.com
Cc: Christoph Lameter cl@linux.com
Cc: David Rientjes rientjes@google.com
Cc: Dmitry Vyukov dvyukov@google.com
Cc: Feng Tang feng.tang@intel.com
Cc: Hyeonggon Yoo 42.hyeyoo@gmail.com
Cc: Joonsoo Kim iamjoonsoo.kim@lge.com
Cc: kasan-dev@googlegroups.com
Cc: Pekka Enberg penberg@kernel.org
Cc: Peter Collingbourne pcc@google.com
Cc: Roman Gushchin roman.gushchin@linux.dev
Cc: Vincenzo Frascino vincenzo.frascino@arm.com
Cc: Vlastimil Babka vbabka@suse.cz
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/kasan/kasan.h |   12 ------------
 mm/slab.h        |   16 ++++++++++++++--
 2 files changed, 14 insertions(+), 14 deletions(-)

--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -466,18 +466,6 @@ static inline void kasan_unpoison(const
 
 	if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
 		return;
-	/*
-	 * Explicitly initialize the memory with the precise object size to
-	 * avoid overwriting the slab redzone. This disables initialization in
-	 * the arch code and may thus lead to performance penalty. This penalty
-	 * does not affect production builds, as slab redzones are not enabled
-	 * there.
-	 */
-	if (__slub_debug_enabled() &&
-	    init && ((unsigned long)size & KASAN_GRANULE_MASK)) {
-		init = false;
-		memzero_explicit((void *)addr, size);
-	}
 	size = round_up(size, KASAN_GRANULE_SIZE);
 
 	hw_set_mem_tag_range((void *)addr, size, tag, init);
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -684,6 +684,7 @@ static inline void slab_post_alloc_hook(
 					unsigned int orig_size)
 {
 	unsigned int zero_size = s->object_size;
+	bool kasan_init = init;
 	size_t i;
 
 	flags &= gfp_allowed_mask;
@@ -701,6 +702,17 @@ static inline void slab_post_alloc_hook(
 		zero_size = orig_size;
 
 	/*
+	 * When slub_debug is enabled, avoid memory initialization integrated
+	 * into KASAN and instead zero out the memory via the memset below with
+	 * the proper size. Otherwise, KASAN might overwrite SLUB redzones and
+	 * cause false-positive reports. This does not lead to a performance
+	 * penalty on production builds, as slub_debug is not intended to be
+	 * enabled there.
+	 */
+	if (__slub_debug_enabled())
+		kasan_init = false;
+
+	/*
 	 * As memory initialization might be integrated into KASAN,
 	 * kasan_slab_alloc and initialization memset must be
 	 * kept together to avoid discrepancies in behavior.
@@ -708,8 +720,8 @@ static inline void slab_post_alloc_hook(
 	 * As p[i] might get tagged, memset and kmemleak hook come after KASAN.
 	 */
 	for (i = 0; i < size; i++) {
-		p[i] = kasan_slab_alloc(s, p[i], flags, init);
-		if (p[i] && init && !kasan_has_integrated_init())
+		p[i] = kasan_slab_alloc(s, p[i], flags, kasan_init);
+		if (p[i] && init && (!kasan_init || !kasan_has_integrated_init()))
 			memset(p[i], 0, zero_size);
 		kmemleak_alloc_recursive(p[i], s->object_size, 1,
 					 s->flags, flags);
From: Andrey Konovalov andreyknvl@google.com
commit 05c56e7b4319d7f6352f27da876a1acdc8fa5cc4 upstream.
Commit bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins") introduced a bug into the memory_is_poisoned_n implementation: it effectively removed the cast to a signed integer type after applying KASAN_GRANULE_MASK.
As a result, KASAN started failing to properly check memset, memcpy, and other similar functions.
Fix the bug by adding the cast back (through an additional signed integer variable to make the code more readable).
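Why the signedness matters can be shown in plain userspace C. This is a minimal sketch, not the kernel code: GRANULE_MASK and both function names are hypothetical stand-ins, and the shadow byte is modelled as int8_t, where a negative value means "poisoned".

```c
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_MASK 7UL	/* hypothetical stand-in for KASAN_GRANULE_MASK */

/*
 * Buggy form: the masked offset is unsigned long, so the signed shadow
 * value is converted to unsigned long before the comparison. A negative
 * shadow byte turns into a huge number and the check never fires.
 */
static bool poisoned_unsigned(unsigned long last_byte, int8_t shadow)
{
	return (last_byte & GRANULE_MASK) >= shadow;
}

/*
 * Fixed form: the offset is stored in a signed 8-bit variable first, so
 * both operands are promoted to int and compared as signed values.
 */
static bool poisoned_signed(unsigned long last_byte, int8_t shadow)
{
	int8_t last_accessible_byte = last_byte & GRANULE_MASK;

	return last_accessible_byte >= shadow;
}
```

For a last byte at offset 3 within its granule and a shadow value of -1, the unsigned variant reports the access as valid while the signed variant reports it as poisoned. For a non-negative shadow value such as 5, both variants agree.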
Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.168843186...
Fixes: bb6e04a173f0 ("kasan: use internal prototypes matching gcc-13 builtins")
Signed-off-by: Andrey Konovalov andreyknvl@google.com
Cc: Alexander Potapenko glider@google.com
Cc: Andrey Ryabinin ryabinin.a.a@gmail.com
Cc: Arnd Bergmann arnd@arndb.de
Cc: Dmitry Vyukov dvyukov@google.com
Cc: Marco Elver elver@google.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/kasan/generic.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -130,9 +130,10 @@ static __always_inline bool memory_is_po
 	if (unlikely(ret)) {
 		const void *last_byte = addr + size - 1;
 		s8 *last_shadow = (s8 *)kasan_mem_to_shadow(last_byte);
+		s8 last_accessible_byte = (unsigned long)last_byte & KASAN_GRANULE_MASK;
 
 		if (unlikely(ret != (unsigned long)last_shadow ||
-			(((long)last_byte & KASAN_GRANULE_MASK) >= *last_shadow)))
+			     last_accessible_byte >= *last_shadow))
 			return true;
 	}
 	return false;
From: sunliming sunliming@kylinos.cn
commit f6d026eea390d59787a6cdc2ef5c983d02e029d0 upstream.
The write operation returns the count of writes regardless of whether the event is enabled or disabled. Switch it to return -EBADF to indicate that the event is disabled.
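The change in return value can be illustrated with a small userspace model. This is a sketch under stated assumptions: write_core_model() is a hypothetical stand-in and does not reproduce the kernel's user_events_write_core().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the write path; not the kernel function. */
static long write_core_model(bool event_enabled, size_t len)
{
	if (!event_enabled)
		return -EBADF;	/* new behaviour: report the disabled event */

	return (long)len;	/* enabled: report how much was written */
}
```

A caller can now tell a write to a disabled event apart from a successful one instead of seeing the same count in both cases.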
Link: https://lkml.kernel.org/r/20230626111344.19136-2-sunliming@kylinos.cn
Cc: stable@vger.kernel.org
7f5a08c79df35 ("user_events: Add minimal support for trace_event into ftrace")
Acked-by: Beau Belgrave beaub@linux.microsoft.com
Signed-off-by: sunliming sunliming@kylinos.cn
Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 kernel/trace/trace_events_user.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/trace/trace_events_user.c
+++ b/kernel/trace/trace_events_user.c
@@ -2096,7 +2096,8 @@ static ssize_t user_events_write_core(st
 
 		if (unlikely(faulted))
 			return -EFAULT;
-	}
+	} else
+		return -EBADF;
 
 	return ret;
 }
From: Naveen N Rao naveen@kernel.org
commit 25ea739ea1d4d3de41acc4f4eb2d1a97eee0eb75 upstream.
binutils v2.37 drops unused section symbols, which prevents recordmcount from capturing mcount locations in sections that have no non-weak symbols. This results in a build failure with a message such as:

	Cannot find symbol for section 12: .text.perf_callchain_kernel.
	kernel/events/callchain.o: failed
The change to binutils was reverted for v2.38, so this behavior is specific to binutils v2.37: https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=c09c8b42021180...
Objtool is able to cope with such sections, so this issue is specific to recordmcount.
Fail the build and print a warning if binutils v2.37 is detected and if we are using recordmcount.
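The check added by this patch compares CONFIG_LD_VERSION against the string "23700". A minimal sketch of how such a number maps to a release, assuming the usual 10000 * major + 100 * minor + patch encoding; both helper names are hypothetical.

```c
/* Hypothetical helper: canonical version number, e.g. 2.37.0 -> 23700. */
static int ld_version_code(int major, int minor, int patch)
{
	return 10000 * major + 100 * minor + patch;
}

/* Mirrors the exact-match comparison: only 23700 is rejected. */
static int is_affected_binutils(int version_code)
{
	return version_code == 23700;
}
```

Under that assumption, 2.36 and 2.38 encode to 23600 and 23800 and pass the check, which matches the statement that the behavior is specific to v2.37.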
Cc: stable@vger.kernel.org
Suggested-by: Joel Stanley joel@jms.id.au
Signed-off-by: Naveen N Rao naveen@kernel.org
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://msgid.link/20230530061436.56925-1-naveen@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/Makefile |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -409,3 +409,11 @@ checkbin:
 		echo -n '*** Please use a different binutils version.' ; \
 		false ; \
 	fi
+	@if test "x${CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT}" = "xy" -a \
+		"x${CONFIG_LD_IS_BFD}" = "xy" -a \
+		"${CONFIG_LD_VERSION}" = "23700" ; then \
+		echo -n '*** binutils 2.37 drops unused section symbols, which recordmcount ' ; \
+		echo 'is unable to handle.' ; \
+		echo '*** Please use a different binutils version.' ; \
+		false ; \
+	fi
From: Ekansh Gupta quic_ekangupt@quicinc.com
commit 0b4e32df3e09406b835d8230b9331273f2805058 upstream.
A process can spawn a PD on the DSP with attributes that are associated with the PD during spawn and run. On the DSP side, the invocation corresponding to the create request with attributes takes a total of 4 buffers. If this number is not correct, the invocation is expected to fail on the DSP. Use the correct buffer count when creating the fastrpc scalar.
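How a buffer count ends up inside the scalar value can be sketched as follows. The bit layout here is an assumption modelled on the driver's FASTRPC_SCALARS() macro and should be checked against drivers/misc/fastrpc.c; the function names and the method id 7 used in the note are hypothetical.

```c
#include <stdint.h>

/*
 * Assumed layout: method id in bits 24-28, input buffer count in
 * bits 16-23, output buffer count in bits 8-15.
 */
static uint32_t scalars_sketch(uint32_t method, uint32_t in, uint32_t out)
{
	return ((method & 0x1f) << 24) | ((in & 0xff) << 16) | ((out & 0xff) << 8);
}

/* Extract the input buffer count the remote side would see. */
static uint32_t scalars_inbufs(uint32_t sc)
{
	return (sc >> 16) & 0xff;
}
```

Under this layout, scalars_sketch(7, 6, 0) and scalars_sketch(7, 4, 0) differ only in the input-count field, so a receiver that expects 4 buffers would see a mismatch with the old value.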
Fixes: d73f71c7c6ee ("misc: fastrpc: Add support for create remote init process")
Cc: stable stable@kernel.org
Tested-by: Ekansh Gupta quic_ekangupt@quicinc.com
Signed-off-by: Ekansh Gupta quic_ekangupt@quicinc.com
Message-ID: 1686743685-21715-1-git-send-email-quic_ekangupt@quicinc.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/misc/fastrpc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 30d4d0476248..37f32d20fcd0 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -1437,7 +1437,7 @@ static int fastrpc_init_create_process(struct fastrpc_user *fl,
 
 	sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE, 4, 0);
 	if (init.attrs)
-		sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 6, 0);
+		sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 4, 0);
 
 	err = fastrpc_internal_invoke(fl, true, FASTRPC_INIT_HANDLE,
 				      sc, args);
From: Michael Ellerman mpe@ellerman.id.au
commit 5bcedc5931e7bd6928a2d8207078d4cb476b3b55 upstream.
Nageswara reported that /proc/self/status was showing "vulnerable" for the Speculation_Store_Bypass feature on Power10, eg:
  $ grep Speculation_Store_Bypass: /proc/self/status
  Speculation_Store_Bypass:	vulnerable
But at the same time the sysfs files, and lscpu, were showing "Not affected".
This turns out to simply be a bug in the reporting of the Speculation_Store_Bypass, aka. PR_SPEC_STORE_BYPASS, case.
When SEC_FTR_STF_BARRIER was added, so that firmware could communicate the vulnerability was not present, the code in ssb_prctl_get() was not updated to check the new flag.
So add the check for SEC_FTR_STF_BARRIER being disabled. Rather than adding the new check to the existing if block and expanding the comment to cover both cases, rewrite the three cases to be separate so they can be commented separately for clarity.
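The three separate cases can be modelled as a small standalone function. This is a sketch only: the enums and the function name are hypothetical stand-ins for the PR_SPEC_* return values, the STF_BARRIER_* types and security_ftr_enabled(SEC_FTR_STF_BARRIER).

```c
#include <stdbool.h>

enum ssb_state { SSB_NOT_AFFECTED, SSB_VULNERABLE };	/* stand-ins for PR_SPEC_* */
enum barrier_type { BARRIER_NONE, BARRIER_PRESENT };	/* stand-ins for STF_BARRIER_* */

static enum ssb_state ssb_state_model(bool stf_barrier_feature,
				      enum barrier_type type)
{
	/* Case 1: firmware switched the feature off, so not vulnerable. */
	if (!stf_barrier_feature)
		return SSB_NOT_AFFECTED;

	/* Case 2: no known barrier for this CPU, assume not vulnerable. */
	if (type == BARRIER_NONE)
		return SSB_NOT_AFFECTED;

	/* Case 3: a barrier type exists, so the CPU is vulnerable. */
	return SSB_VULNERABLE;
}
```

The first case is the one that was missing: with the feature disabled by firmware, the model no longer falls through to "vulnerable" even when a barrier type is set.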
Fixes: 84ed26fd00c5 ("powerpc/security: Add a security feature for STF barrier")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Nageswara R Sastry rnsastry@linux.ibm.com
Tested-by: Nageswara R Sastry rnsastry@linux.ibm.com
Reviewed-by: Russell Currey ruscur@russell.cc
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://msgid.link/20230517074945.53188-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/kernel/security.c |   35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -364,26 +364,27 @@ ssize_t cpu_show_spec_store_bypass(struc
 
 static int ssb_prctl_get(struct task_struct *task)
 {
+	/*
+	 * The STF_BARRIER feature is on by default, so if it's off that means
+	 * firmware has explicitly said the CPU is not vulnerable via either
+	 * the hypercall or device tree.
+	 */
+	if (!security_ftr_enabled(SEC_FTR_STF_BARRIER))
+		return PR_SPEC_NOT_AFFECTED;
+
+	/*
+	 * If the system's CPU has no known barrier (see setup_stf_barrier())
+	 * then assume that the CPU is not vulnerable.
+	 */
 	if (stf_enabled_flush_types == STF_BARRIER_NONE)
-		/*
-		 * We don't have an explicit signal from firmware that we're
-		 * vulnerable or not, we only have certain CPU revisions that
-		 * are known to be vulnerable.
-		 *
-		 * We assume that if we're on another CPU, where the barrier is
-		 * NONE, then we are not vulnerable.
-		 */
 		return PR_SPEC_NOT_AFFECTED;
-	else
-		/*
-		 * If we do have a barrier type then we are vulnerable. The
-		 * barrier is not a global or per-process mitigation, so the
-		 * only value we can report here is PR_SPEC_ENABLE, which
-		 * appears as "vulnerable" in /proc.
-		 */
-		return PR_SPEC_ENABLE;
 
-	return -EINVAL;
+	/*
+	 * Otherwise the CPU is vulnerable. The barrier is not a global or
+	 * per-process mitigation, so the only value that can be reported here
+	 * is PR_SPEC_ENABLE, which appears as "vulnerable" in /proc.
+	 */
+	return PR_SPEC_ENABLE;
 }
 
 int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
From: Michael Ellerman mpe@ellerman.id.au
commit 8bbe9fee5848371d4af101be445303cac8d880c5 upstream.
Lockdep warns that the use of the hpte_lock in native_hpte_remove() is not safe against an IRQ coming in:
  ================================
  WARNING: inconsistent lock state
  6.4.0-rc2-g0c54f4d30ecc #1 Not tainted
  --------------------------------
  inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
  qemu-system-ppc/93865 [HC0[0]:SC0[0]:HE1:SE1] takes:
  c0000000021f5180 (hpte_lock){+.?.}-{0:0}, at: native_lock_hpte+0x8/0xd0
  {IN-SOFTIRQ-W} state was registered at:
    lock_acquire+0x134/0x3f0
    native_lock_hpte+0x44/0xd0
    native_hpte_insert+0xd4/0x2a0
    __hash_page_64K+0x218/0x4f0
    hash_page_mm+0x464/0x840
    do_hash_fault+0x11c/0x260
    data_access_common_virt+0x210/0x220
    __ip_select_ident+0x140/0x150
    ...
    net_rx_action+0x3bc/0x440
    __do_softirq+0x180/0x534
    ...
    sys_sendmmsg+0x34/0x50
    system_call_exception+0x128/0x320
    system_call_common+0x160/0x2e4
  ...
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(hpte_lock);
    <Interrupt>
      lock(hpte_lock);

   *** DEADLOCK ***
  ...
  Call Trace:
    dump_stack_lvl+0x98/0xe0 (unreliable)
    print_usage_bug.part.0+0x250/0x278
    mark_lock+0xc9c/0xd30
    __lock_acquire+0x440/0x1ca0
    lock_acquire+0x134/0x3f0
    native_lock_hpte+0x44/0xd0
    native_hpte_remove+0xb0/0x190
    kvmppc_mmu_map_page+0x650/0x698 [kvm_pr]
    kvmppc_handle_pagefault+0x534/0x6e8 [kvm_pr]
    kvmppc_handle_exit_pr+0x6d8/0xe90 [kvm_pr]
    after_sprg3_load+0x80/0x90 [kvm_pr]
    kvmppc_vcpu_run_pr+0x108/0x270 [kvm_pr]
    kvmppc_vcpu_run+0x34/0x48 [kvm]
    kvm_arch_vcpu_ioctl_run+0x340/0x470 [kvm]
    kvm_vcpu_ioctl+0x338/0x8b8 [kvm]
    sys_ioctl+0x7c4/0x13e0
    system_call_exception+0x128/0x320
    system_call_common+0x160/0x2e4
I suspect kvm_pr is the only caller that doesn't already have IRQs disabled, which is why this hasn't been reported previously.
Fix it by disabling IRQs in native_hpte_remove().
Fixes: 35159b5717fa ("powerpc/64s: make HPTE lock and native_tlbie_lock irq-safe")
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://msgid.link/20230517123033.18430-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/mm/book3s64/hash_native.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -328,10 +328,12 @@ static long native_hpte_insert(unsigned

 static long native_hpte_remove(unsigned long hpte_group)
 {
+	unsigned long hpte_v, flags;
 	struct hash_pte *hptep;
 	int i;
 	int slot_offset;
-	unsigned long hpte_v;
+
+	local_irq_save(flags);

 	DBG_LOW(" remove(group=%lx)\n", hpte_group);

@@ -356,13 +358,16 @@ static long native_hpte_remove(unsigned
 		slot_offset &= 0x7;
 	}

-	if (i == HPTES_PER_GROUP)
-		return -1;
+	if (i == HPTES_PER_GROUP) {
+		i = -1;
+		goto out;
+	}

 	/* Invalidate the hpte. NOTE: this also unlocks it */
 	release_hpte_lock();
 	hptep->v = 0;
-
+out:
+	local_irq_restore(flags);
 	return i;
 }
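
The native_hpte_remove() change above turns an early "return -1" into a jump to a single exit label so that the saved IRQ state is always restored. A minimal userspace sketch of that shape follows. It is not kernel code: fake_irq_save(), fake_irq_restore(), irq_disable_depth and SLOTS_PER_GROUP are hypothetical stand-ins for local_irq_save(), local_irq_restore(), the CPU's interrupt state and HPTES_PER_GROUP.

```c
#include <assert.h>
#include <stdbool.h>

#define SLOTS_PER_GROUP 8	/* hypothetical stand-in for HPTES_PER_GROUP */

/* Hypothetical model of "interrupts disabled": 0 means enabled. */
static int irq_disable_depth;

/* Stand-in for local_irq_save(): remember the old state, then disable. */
static void fake_irq_save(unsigned long *flags)
{
	*flags = (unsigned long)irq_disable_depth;
	irq_disable_depth++;
}

/* Stand-in for local_irq_restore(): put back the remembered state. */
static void fake_irq_restore(unsigned long flags)
{
	/* A restore without a matching save would be a bug. */
	assert(irq_disable_depth > 0);
	irq_disable_depth = (int)flags;
}

/* Clear the first valid slot and return its index, or -1 if none is valid. */
static long remove_first_valid(bool valid[SLOTS_PER_GROUP])
{
	unsigned long flags;
	long i;

	fake_irq_save(&flags);

	for (i = 0; i < SLOTS_PER_GROUP; i++) {
		if (valid[i])
			break;
	}

	if (i == SLOTS_PER_GROUP) {
		i = -1;
		goto out;	/* the early exit must still restore the state */
	}

	valid[i] = false;
out:
	fake_irq_restore(flags);
	return i;
}
```

With this shape, both the "found a slot" and the "group is empty" paths leave irq_disable_depth where it started, which is the property the patch needs from local_irq_save()/local_irq_restore().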
From: Hamza Mahfooz hamza.mahfooz@amd.com
commit af22d6a869cc26b519bfdcd54293c53f2e491870 upstream.
Currently, it is possible for us to access memory that we shouldn't, since we acquire (possibly dangling) pointers to dirty rectangles before doing a bounds check to make sure we can actually accommodate the number of dirty rectangles userspace has requested to fill. This issue is especially evident if a compositor requests both MPO and damage clips at the same time, in which case I have observed a soft-hang. So, to avoid this issue, perform the bounds check before filling a single dirty rectangle and WARN() about it, if it is ever attempted in fill_dc_dirty_rect().
Cc: stable@vger.kernel.org # 6.1+
Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
Reviewed-by: Leo Li sunpeng.li@amd.com
Signed-off-by: Hamza Mahfooz hamza.mahfooz@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5057,11 +5057,7 @@ static inline void fill_dc_dirty_rect(st
 				      s32 y, s32 width, s32 height,
 				      int *i, bool ffu)
 {
-	if (*i > DC_MAX_DIRTY_RECTS)
-		return;
-
-	if (*i == DC_MAX_DIRTY_RECTS)
-		goto out;
+	WARN_ON(*i >= DC_MAX_DIRTY_RECTS);

 	dirty_rect->x = x;
 	dirty_rect->y = y;
@@ -5077,7 +5073,6 @@ static inline void fill_dc_dirty_rect(st
 			"[PLANE:%d] PSR SU dirty rect at (%d, %d) size (%d, %d)",
 			plane->base.id, x, y, width, height);

-out:
 	(*i)++;
 }

@@ -5164,6 +5159,9 @@ static void fill_dc_dirty_rects(struct d

 	*dirty_regions_changed = bb_changed;

+	if ((num_clips + (bb_changed ? 2 : 0)) > DC_MAX_DIRTY_RECTS)
+		goto ffu;
+
 	if (bb_changed) {
 		fill_dc_dirty_rect(new_plane_state->plane, &dirty_rects[i],
 				   new_plane_state->crtc_x,
@@ -5193,9 +5191,6 @@ static void fill_dc_dirty_rects(struct d
 				   new_plane_state->crtc_h, &i, false);
 	}

-	if (i > DC_MAX_DIRTY_RECTS)
-		goto ffu;
-
 	flip_addrs->dirty_rect_count = i;
 	return;
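
The ordering problem fixed above (forming &dirty_rects[i] before knowing that i is in range) can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: MAX_DIRTY_RECTS stands in for DC_MAX_DIRTY_RECTS, assert() plays the role of WARN_ON(), and a return value of -1 models the "goto ffu" fallback to a full-frame update.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DIRTY_RECTS 3	/* hypothetical stand-in for DC_MAX_DIRTY_RECTS */

struct rect {
	int x, y, w, h;
};

/* Fill one slot. The caller has already proved that *i is in range. */
static void fill_rect(struct rect *out, int x, int y, int w, int h, int *i)
{
	assert(*i < MAX_DIRTY_RECTS);	/* plays the role of WARN_ON() */

	out->x = x;
	out->y = y;
	out->w = w;
	out->h = h;
	(*i)++;
}

/*
 * Returns the number of rectangles written, or -1 when the request does
 * not fit and the caller should fall back to a full-frame update.
 */
static int fill_rects(struct rect out[MAX_DIRTY_RECTS],
		      const struct rect *clips, int num_clips, bool bb_changed)
{
	int i = 0;
	int c;

	/* Bounds check first, before any &out[i] is formed. */
	if (num_clips + (bb_changed ? 2 : 0) > MAX_DIRTY_RECTS)
		return -1;

	if (bb_changed) {
		/* Old and new bounding boxes, zeroed here for brevity. */
		fill_rect(&out[i], 0, 0, 0, 0, &i);
		fill_rect(&out[i], 0, 0, 0, 0, &i);
	}

	for (c = 0; c < num_clips; c++)
		fill_rect(&out[i], clips[c].x, clips[c].y,
			  clips[c].w, clips[c].h, &i);

	return i;
}
```

Checking the total up front means no out-of-range element address is ever computed, which is the point of moving the check ahead of the first fill_dc_dirty_rect() call.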
From: Jiaxun Yang jiaxun.yang@flygoat.com
commit 5487a7b60695a92cf998350e4beac17144c91fcd upstream.
Some CPU feature macros were using current_cpu_type to mark feature availability.
However, current_cpu_type uses smp_processor_id, which is prohibited in preemptible context.
Since those features are uniform across all CPUs in an SMP system, use boot_cpu_type instead of current_cpu_type to fix preemptible kernels.
Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang jiaxun.yang@flygoat.com
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/include/asm/cpu-features.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/mips/include/asm/cpu-features.h
+++ b/arch/mips/include/asm/cpu-features.h
@@ -125,7 +125,7 @@
 ({									\
 	int __res;							\
 									\
-	switch (current_cpu_type()) {					\
+	switch (boot_cpu_type()) {					\
 	case CPU_CAVIUM_OCTEON:						\
 	case CPU_CAVIUM_OCTEON_PLUS:					\
 	case CPU_CAVIUM_OCTEON2:					\
@@ -368,7 +368,7 @@
 ({									\
 	int __res;							\
 									\
-	switch (current_cpu_type()) {					\
+	switch (boot_cpu_type()) {					\
 	case CPU_M14KC:							\
 	case CPU_74K:							\
 	case CPU_1074K:							\
From: Huacai Chen chenhuacai@loongson.cn
commit 65fee014dc41a774bcd94896f3fb380bc39d8dda upstream.
Commit 7db5e9e9e5e6c10d7d ("MIPS: loongson64: fix FTLB configuration") moved decode_configs() from the beginning of cpu_probe_loongson() to the end in order to fix the FTLB configuration. However, it breaks CPUCFG decoding: because decode_configs() uses "c->options = xxxx" rather than "c->options |= xxxx", all information obtained from CPUCFG by decode_cpucfg() is lost.
This causes an error when creating a KVM guest on Loongson-3A4000:
  Exception Code: 4 not handled @ PC: 0000000087ad5981, inst: 0xcb7a1898 BadVaddr: 0x0 Status: 0x0
Fix this by moving the c->cputype setting to the beginning and moving decode_configs() after that.
Fixes: 7db5e9e9e5e6c10d7d ("MIPS: loongson64: fix FTLB configuration")
Cc: stable@vger.kernel.org
Cc: Huang Pei huangpei@loongson.cn
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/kernel/cpu-probe.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1677,7 +1677,10 @@ static inline void decode_cpucfg(struct

 static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
 {
+	c->cputype = CPU_LOONGSON64;
+
 	/* All Loongson processors covered here define ExcCode 16 as GSExc. */
+	decode_configs(c);
 	c->options |= MIPS_CPU_GSEXCEX;

 	switch (c->processor_id & PRID_IMP_MASK) {
@@ -1687,7 +1690,6 @@ static inline void cpu_probe_loongson(st
 		case PRID_REV_LOONGSON2K_R1_1:
 		case PRID_REV_LOONGSON2K_R1_2:
 		case PRID_REV_LOONGSON2K_R1_3:
-			c->cputype = CPU_LOONGSON64;
 			__cpu_name[cpu] = "Loongson-2K";
 			set_elf_platform(cpu, "gs264e");
 			set_isa(c, MIPS_CPU_ISA_M64R2);
@@ -1700,14 +1702,12 @@ static inline void cpu_probe_loongson(st
 		switch (c->processor_id & PRID_REV_MASK) {
 		case PRID_REV_LOONGSON3A_R2_0:
 		case PRID_REV_LOONGSON3A_R2_1:
-			c->cputype = CPU_LOONGSON64;
 			__cpu_name[cpu] = "ICT Loongson-3";
 			set_elf_platform(cpu, "loongson3a");
 			set_isa(c, MIPS_CPU_ISA_M64R2);
 			break;
 		case PRID_REV_LOONGSON3A_R3_0:
 		case PRID_REV_LOONGSON3A_R3_1:
-			c->cputype = CPU_LOONGSON64;
 			__cpu_name[cpu] = "ICT Loongson-3";
 			set_elf_platform(cpu, "loongson3a");
 			set_isa(c, MIPS_CPU_ISA_M64R2);
@@ -1727,7 +1727,6 @@ static inline void cpu_probe_loongson(st
 		c->ases &= ~MIPS_ASE_VZ; /* VZ of Loongson-3A2000/3000 is incomplete */
 		break;
 	case PRID_IMP_LOONGSON_64G:
-		c->cputype = CPU_LOONGSON64;
 		__cpu_name[cpu] = "ICT Loongson-3";
 		set_elf_platform(cpu, "loongson3a");
 		set_isa(c, MIPS_CPU_ISA_M64R2);
@@ -1737,8 +1736,6 @@ static inline void cpu_probe_loongson(st
 		panic("Unknown Loongson Processor ID!");
 		break;
 	}
-
-	decode_configs(c);
 }
 #else
 static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu) { }
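
The ordering bug described in this commit message (a step that assigns the flag word running after steps that OR bits into it) can be reproduced in a few lines of standalone C. This is only a sketch: the OPT_* values, struct cpuinfo_sketch and the *_sketch helpers are hypothetical and merely mimic the "=" versus "|=" behaviour the message attributes to decode_configs() and decode_cpucfg().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only. */
#define OPT_FROM_CONFIG	0x1u
#define OPT_FROM_CPUCFG	0x2u
#define OPT_GSEXCEX	0x4u

struct cpuinfo_sketch {
	unsigned int options;
};

/* Overwrites the flag word, like the "c->options = xxxx" in decode_configs(). */
static void decode_configs_sketch(struct cpuinfo_sketch *c)
{
	assert(c != NULL);
	c->options = OPT_FROM_CONFIG;
}

/* Accumulates into the flag word, like "c->options |= xxxx". */
static void decode_cpucfg_sketch(struct cpuinfo_sketch *c)
{
	assert(c != NULL);
	c->options |= OPT_FROM_CPUCFG;
}

/* Broken order: the overwriting step runs last and discards earlier bits. */
static unsigned int probe_broken(void)
{
	struct cpuinfo_sketch c = { 0 };

	c.options |= OPT_GSEXCEX;
	decode_cpucfg_sketch(&c);
	decode_configs_sketch(&c);
	return c.options;
}

/* Fixed order: the overwriting step runs first, then bits are ORed in. */
static unsigned int probe_fixed(void)
{
	struct cpuinfo_sketch c = { 0 };

	decode_configs_sketch(&c);
	c.options |= OPT_GSEXCEX;
	decode_cpucfg_sketch(&c);
	return c.options;
}
```

In the broken order only OPT_FROM_CONFIG survives; in the fixed order all three bits are set, which mirrors why the patch moves decode_configs() to the top of cpu_probe_loongson().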
From: Huacai Chen chenhuacai@loongson.cn
commit 531b3d1195d096f14e030c4b01ec3a53b80276bf upstream.
After commit 0e96ea5c3eb5904e5dc2f ("MIPS: Loongson64: Clean up use of cc-ifversion") we get a build error when running make modules_install:
cc1: error: '-mloongson-mmi' must be used with '-mhard-float'
The reason is that when running make modules_install, 'call cc-option' doesn't work in the $(KBUILD_CFLAGS) used by 'CHECKFLAGS'. As a result, -mno-loongson-mmi is not applied and -march=loongson3a enables MMI instructions.
In detail, the error message comes from the CHECKFLAGS invocation of $(CC), but it has no impact on the final result of make modules_install; it is purely a cosmetic issue. The error occurs because cc-option is defined in scripts/Makefile.compiler, which is not included in Makefile when running 'make modules_install', as install targets are not supposed to require the compiler; see commit 805b2e1d427aab4b ("kbuild: include Makefile.compiler only when compiler is needed"). As a result, the call to check for '-mno-loongson-mmi' just never happens.
Fix this by partially reverting to the old logic: use 'call cc-option' to conditionally apply -march=loongson3a and -march=mips64r2.
By the way, Loongson-2E/2F was also broken by commit 13ceb48bc19c563e05f4 ("MIPS: Loongson2ef: Remove unnecessary {as,cc}-option calls"), so fix it together.
Fixes: 13ceb48bc19c563e05f4 ("MIPS: Loongson2ef: Remove unnecessary {as,cc}-option calls")
Fixes: 0e96ea5c3eb5904e5dc2 ("MIPS: Loongson64: Clean up use of cc-ifversion")
Cc: stable@vger.kernel.org
Cc: Feiyang Chen chenfeiyang@loongson.cn
Cc: Nathan Chancellor nathan@kernel.org
Cc: Nick Desaulniers ndesaulniers@google.com
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Reviewed-by: Nathan Chancellor nathan@kernel.org
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/Makefile | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -181,16 +181,12 @@ endif
 cflags-$(CONFIG_CAVIUM_CN63XXP1) += -Wa,-mfix-cn63xxp1
 cflags-$(CONFIG_CPU_BMIPS) += -march=mips32 -Wa,-mips32 -Wa,--trap

-cflags-$(CONFIG_CPU_LOONGSON2E) += -march=loongson2e -Wa,--trap
-cflags-$(CONFIG_CPU_LOONGSON2F) += -march=loongson2f -Wa,--trap
+cflags-$(CONFIG_CPU_LOONGSON2E) += $(call cc-option,-march=loongson2e) -Wa,--trap
+cflags-$(CONFIG_CPU_LOONGSON2F) += $(call cc-option,-march=loongson2f) -Wa,--trap
+cflags-$(CONFIG_CPU_LOONGSON64) += $(call cc-option,-march=loongson3a,-march=mips64r2) -Wa,--trap
 # Some -march= flags enable MMI instructions, and GCC complains about that
 # support being enabled alongside -msoft-float. Thus explicitly disable MMI.
 cflags-$(CONFIG_CPU_LOONGSON2EF) += $(call cc-option,-mno-loongson-mmi)
-ifdef CONFIG_CPU_LOONGSON64
-cflags-$(CONFIG_CPU_LOONGSON64) += -Wa,--trap
-cflags-$(CONFIG_CC_IS_GCC) += -march=loongson3a
-cflags-$(CONFIG_CC_IS_CLANG) += -march=mips64r2
-endif
 cflags-$(CONFIG_CPU_LOONGSON64) += $(call cc-option,-mno-loongson-mmi)

 cflags-$(CONFIG_CPU_R4000_WORKAROUNDS) += $(call cc-option,-mfix-r4000,)
From: Huacai Chen chenhuacai@loongson.cn
commit e4de2057698636c0ee709e545d19b169d2069fa3 upstream.
After commit 45c7e8af4a5e3f0bea4ac209 ("MIPS: Remove KVM_TE support") we get a NULL pointer dereference when creating a KVM guest:
[ 146.243409] Starting KVM with MIPS VZ extensions
[ 149.849151] CPU 3 Unable to handle kernel paging request at virtual address 0000000000000300, epc == ffffffffc06356ec, ra == ffffffffc063568c
[ 149.849177] Oops[#1]:
[ 149.849182] CPU: 3 PID: 2265 Comm: qemu-system-mip Not tainted 6.4.0-rc3+ #1671
[ 149.849188] Hardware name: THTF CX TL630 Series/THTF-LS3A4000-7A1000-ML4A, BIOS KL4.1F.TF.D.166.201225.R 12/25/2020
[ 149.849192] $ 0 : 0000000000000000 000000007400cce0 0000000000400004 ffffffff8119c740
[ 149.849209] $ 4 : 000000007400cce1 000000007400cce1 0000000000000000 0000000000000000
[ 149.849221] $ 8 : 000000240058bb36 ffffffff81421ac0 0000000000000000 0000000000400dc0
[ 149.849233] $12 : 9800000102a07cc8 ffffffff80e40e38 0000000000000001 0000000000400dc0
[ 149.849245] $16 : 0000000000000000 9800000106cd0000 9800000106cd0000 9800000100cce000
[ 149.849257] $20 : ffffffffc0632b28 ffffffffc05b31b0 9800000100ccca00 0000000000400000
[ 149.849269] $24 : 9800000106cd09ce ffffffff802f69d0
[ 149.849281] $28 : 9800000102a04000 9800000102a07cd0 98000001106a8000 ffffffffc063568c
[ 149.849293] Hi : 00000335b2111e66
[ 149.849295] Lo : 6668d90061ae0ae9
[ 149.849298] epc : ffffffffc06356ec kvm_vz_vcpu_setup+0xc4/0x328 [kvm]
[ 149.849324] ra : ffffffffc063568c kvm_vz_vcpu_setup+0x64/0x328 [kvm]
[ 149.849336] Status: 7400cce3 KX SX UX KERNEL EXL IE
[ 149.849351] Cause : 1000000c (ExcCode 03)
[ 149.849354] BadVA : 0000000000000300
[ 149.849357] PrId : 0014c004 (ICT Loongson-3)
[ 149.849360] Modules linked in: kvm nfnetlink_queue nfnetlink_log nfnetlink fuse sha256_generic libsha256 cfg80211 rfkill binfmt_misc vfat fat snd_hda_codec_hdmi input_leds led_class snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_pcm snd_timer snd serio_raw xhci_pci radeon drm_suballoc_helper drm_display_helper xhci_hcd ip_tables x_tables
[ 149.849432] Process qemu-system-mip (pid: 2265, threadinfo=00000000ae2982d2, task=0000000038e09ad4, tls=000000ffeba16030)
[ 149.849439] Stack : 9800000000000003 9800000100ccca00 9800000100ccc000 ffffffffc062cef4
[ 149.849453] 9800000102a07d18 c89b63a7ab338e00 0000000000000000 ffffffff811a0000
[ 149.849465] 0000000000000000 9800000106cd0000 ffffffff80e59938 98000001106a8920
[ 149.849476] ffffffff80e57f30 ffffffffc062854c ffffffff811a0000 9800000102bf4240
[ 149.849488] ffffffffc05b0000 ffffffff80e3a798 000000ff78000000 000000ff78000010
[ 149.849500] 0000000000000255 98000001021f7de0 98000001023f0078 ffffffff81434000
[ 149.849511] 0000000000000000 0000000000000000 9800000102ae0000 980000025e92ae28
[ 149.849523] 0000000000000000 c89b63a7ab338e00 0000000000000001 ffffffff8119dce0
[ 149.849535] 000000ff78000010 ffffffff804f3d3c 9800000102a07eb0 0000000000000255
[ 149.849546] 0000000000000000 ffffffff8049460c 000000ff78000010 0000000000000255
[ 149.849558] ...
[ 149.849565] Call Trace:
[ 149.849567] [<ffffffffc06356ec>] kvm_vz_vcpu_setup+0xc4/0x328 [kvm]
[ 149.849586] [<ffffffffc062cef4>] kvm_arch_vcpu_create+0x184/0x228 [kvm]
[ 149.849605] [<ffffffffc062854c>] kvm_vm_ioctl+0x64c/0xf28 [kvm]
[ 149.849623] [<ffffffff805209c0>] sys_ioctl+0xc8/0x118
[ 149.849631] [<ffffffff80219eb0>] syscall_common+0x34/0x58
The root cause is that the deletion of kvm_mips_commpage_init() leaves vcpu->arch.cop0 NULL. Fix it by changing cop0 from a pointer to an embedded object.
Fixes: 45c7e8af4a5e3f0bea4ac209 ("MIPS: Remove KVM_TE support")
Cc: stable@vger.kernel.org
Reported-by: Yu Zhao yuzhao@google.com
Suggested-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/include/asm/kvm_host.h |  6 +++---
 arch/mips/kvm/emulate.c          | 22 +++++++++++-----------
 arch/mips/kvm/mips.c             | 16 ++++++++--------
 arch/mips/kvm/trace.h            |  8 ++++----
 arch/mips/kvm/vz.c               | 20 ++++++++++----------
 5 files changed, 36 insertions(+), 36 deletions(-)
--- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -317,7 +317,7 @@ struct kvm_vcpu_arch { unsigned int aux_inuse;
/* COP0 State */ - struct mips_coproc *cop0; + struct mips_coproc cop0;
/* Resume PC after MMIO completion */ unsigned long io_pc; @@ -698,7 +698,7 @@ static inline bool kvm_mips_guest_can_ha static inline bool kvm_mips_guest_has_fpu(struct kvm_vcpu_arch *vcpu) { return kvm_mips_guest_can_have_fpu(vcpu) && - kvm_read_c0_guest_config1(vcpu->cop0) & MIPS_CONF1_FP; + kvm_read_c0_guest_config1(&vcpu->cop0) & MIPS_CONF1_FP; }
static inline bool kvm_mips_guest_can_have_msa(struct kvm_vcpu_arch *vcpu) @@ -710,7 +710,7 @@ static inline bool kvm_mips_guest_can_ha static inline bool kvm_mips_guest_has_msa(struct kvm_vcpu_arch *vcpu) { return kvm_mips_guest_can_have_msa(vcpu) && - kvm_read_c0_guest_config3(vcpu->cop0) & MIPS_CONF3_MSA; + kvm_read_c0_guest_config3(&vcpu->cop0) & MIPS_CONF3_MSA; }
struct kvm_mips_callbacks { --- a/arch/mips/kvm/emulate.c +++ b/arch/mips/kvm/emulate.c @@ -312,7 +312,7 @@ int kvm_get_badinstrp(u32 *opc, struct k */ int kvm_mips_count_disabled(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0;
return (vcpu->arch.count_ctl & KVM_REG_MIPS_COUNT_CTL_DC) || (kvm_read_c0_guest_cause(cop0) & CAUSEF_DC); @@ -384,7 +384,7 @@ static inline ktime_t kvm_mips_count_tim */ static u32 kvm_mips_read_count_running(struct kvm_vcpu *vcpu, ktime_t now) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; ktime_t expires, threshold; u32 count, compare; int running; @@ -444,7 +444,7 @@ static u32 kvm_mips_read_count_running(s */ u32 kvm_mips_read_count(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0;
/* If count disabled just read static copy of count */ if (kvm_mips_count_disabled(vcpu)) @@ -502,7 +502,7 @@ ktime_t kvm_mips_freeze_hrtimer(struct k static void kvm_mips_resume_hrtimer(struct kvm_vcpu *vcpu, ktime_t now, u32 count) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; u32 compare; u64 delta; ktime_t expire; @@ -603,7 +603,7 @@ resume: */ void kvm_mips_write_count(struct kvm_vcpu *vcpu, u32 count) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; ktime_t now;
/* Calculate bias */ @@ -649,7 +649,7 @@ void kvm_mips_init_count(struct kvm_vcpu */ int kvm_mips_set_count_hz(struct kvm_vcpu *vcpu, s64 count_hz) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; int dc; ktime_t now; u32 count; @@ -696,7 +696,7 @@ int kvm_mips_set_count_hz(struct kvm_vcp */ void kvm_mips_write_compare(struct kvm_vcpu *vcpu, u32 compare, bool ack) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; int dc; u32 old_compare = kvm_read_c0_guest_compare(cop0); s32 delta = compare - old_compare; @@ -779,7 +779,7 @@ void kvm_mips_write_compare(struct kvm_v */ static ktime_t kvm_mips_count_disable(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; u32 count; ktime_t now;
@@ -806,7 +806,7 @@ static ktime_t kvm_mips_count_disable(st */ void kvm_mips_count_disable_cause(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0;
kvm_set_c0_guest_cause(cop0, CAUSEF_DC); if (!(vcpu->arch.count_ctl & KVM_REG_MIPS_COUNT_CTL_DC)) @@ -826,7 +826,7 @@ void kvm_mips_count_disable_cause(struct */ void kvm_mips_count_enable_cause(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; u32 count;
kvm_clear_c0_guest_cause(cop0, CAUSEF_DC); @@ -852,7 +852,7 @@ void kvm_mips_count_enable_cause(struct */ int kvm_mips_set_count_ctl(struct kvm_vcpu *vcpu, s64 count_ctl) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; s64 changed = count_ctl ^ vcpu->arch.count_ctl; s64 delta; ktime_t expire, now; --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -649,7 +649,7 @@ static int kvm_mips_copy_reg_indices(str static int kvm_mips_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; struct mips_fpu_struct *fpu = &vcpu->arch.fpu; int ret; s64 v; @@ -761,7 +761,7 @@ static int kvm_mips_get_reg(struct kvm_v static int kvm_mips_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; struct mips_fpu_struct *fpu = &vcpu->arch.fpu; s64 v; s64 vs[2]; @@ -1086,7 +1086,7 @@ int kvm_vm_ioctl_check_extension(struct int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu) { return kvm_mips_pending_timer(vcpu) || - kvm_read_c0_guest_cause(vcpu->arch.cop0) & C_TI; + kvm_read_c0_guest_cause(&vcpu->arch.cop0) & C_TI; }
int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu) @@ -1110,7 +1110,7 @@ int kvm_arch_vcpu_dump_regs(struct kvm_v kvm_debug("\thi: 0x%08lx\n", vcpu->arch.hi); kvm_debug("\tlo: 0x%08lx\n", vcpu->arch.lo);
- cop0 = vcpu->arch.cop0; + cop0 = &vcpu->arch.cop0; kvm_debug("\tStatus: 0x%08x, Cause: 0x%08x\n", kvm_read_c0_guest_status(cop0), kvm_read_c0_guest_cause(cop0)); @@ -1232,7 +1232,7 @@ static int __kvm_mips_handle_exit(struct
case EXCCODE_TLBS: kvm_debug("TLB ST fault: cause %#x, status %#x, PC: %p, BadVaddr: %#lx\n", - cause, kvm_read_c0_guest_status(vcpu->arch.cop0), opc, + cause, kvm_read_c0_guest_status(&vcpu->arch.cop0), opc, badvaddr);
++vcpu->stat.tlbmiss_st_exits; @@ -1304,7 +1304,7 @@ static int __kvm_mips_handle_exit(struct kvm_get_badinstr(opc, vcpu, &inst); kvm_err("Exception Code: %d, not yet handled, @ PC: %p, inst: 0x%08x BadVaddr: %#lx Status: %#x\n", exccode, opc, inst, badvaddr, - kvm_read_c0_guest_status(vcpu->arch.cop0)); + kvm_read_c0_guest_status(&vcpu->arch.cop0)); kvm_arch_vcpu_dump_regs(vcpu); run->exit_reason = KVM_EXIT_INTERNAL_ERROR; ret = RESUME_HOST; @@ -1377,7 +1377,7 @@ int noinstr kvm_mips_handle_exit(struct /* Enable FPU for guest and restore context */ void kvm_own_fpu(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; unsigned int sr, cfg5;
preempt_disable(); @@ -1421,7 +1421,7 @@ void kvm_own_fpu(struct kvm_vcpu *vcpu) /* Enable MSA for guest and restore context */ void kvm_own_msa(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; unsigned int sr, cfg5;
preempt_disable(); --- a/arch/mips/kvm/trace.h +++ b/arch/mips/kvm/trace.h @@ -322,11 +322,11 @@ TRACE_EVENT_FN(kvm_guest_mode_change, ),
TP_fast_assign( - __entry->epc = kvm_read_c0_guest_epc(vcpu->arch.cop0); + __entry->epc = kvm_read_c0_guest_epc(&vcpu->arch.cop0); __entry->pc = vcpu->arch.pc; - __entry->badvaddr = kvm_read_c0_guest_badvaddr(vcpu->arch.cop0); - __entry->status = kvm_read_c0_guest_status(vcpu->arch.cop0); - __entry->cause = kvm_read_c0_guest_cause(vcpu->arch.cop0); + __entry->badvaddr = kvm_read_c0_guest_badvaddr(&vcpu->arch.cop0); + __entry->status = kvm_read_c0_guest_status(&vcpu->arch.cop0); + __entry->cause = kvm_read_c0_guest_cause(&vcpu->arch.cop0); ),
TP_printk("EPC: 0x%08lx PC: 0x%08lx Status: 0x%08x Cause: 0x%08x BadVAddr: 0x%08lx", --- a/arch/mips/kvm/vz.c +++ b/arch/mips/kvm/vz.c @@ -422,7 +422,7 @@ static void _kvm_vz_restore_htimer(struc */ static void kvm_vz_restore_timer(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; u32 cause, compare;
compare = kvm_read_sw_gc0_compare(cop0); @@ -517,7 +517,7 @@ static void _kvm_vz_save_htimer(struct k */ static void kvm_vz_save_timer(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; u32 gctl0, compare, cause;
gctl0 = read_c0_guestctl0(); @@ -863,7 +863,7 @@ static unsigned long mips_process_maar(u
static void kvm_write_maari(struct kvm_vcpu *vcpu, unsigned long val) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0;
val &= MIPS_MAARI_INDEX; if (val == MIPS_MAARI_INDEX) @@ -876,7 +876,7 @@ static enum emulation_result kvm_vz_gpsi u32 *opc, u32 cause, struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; enum emulation_result er = EMULATE_DONE; u32 rt, rd, sel; unsigned long curr_pc; @@ -1911,7 +1911,7 @@ static int kvm_vz_get_one_reg(struct kvm const struct kvm_one_reg *reg, s64 *v) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; unsigned int idx;
switch (reg->id) { @@ -2081,7 +2081,7 @@ static int kvm_vz_get_one_reg(struct kvm case KVM_REG_MIPS_CP0_MAARI: if (!cpu_guest_has_maar || cpu_guest_has_dyn_maar) return -EINVAL; - *v = kvm_read_sw_gc0_maari(vcpu->arch.cop0); + *v = kvm_read_sw_gc0_maari(&vcpu->arch.cop0); break; #ifdef CONFIG_64BIT case KVM_REG_MIPS_CP0_XCONTEXT: @@ -2135,7 +2135,7 @@ static int kvm_vz_set_one_reg(struct kvm const struct kvm_one_reg *reg, s64 v) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; unsigned int idx; int ret = 0; unsigned int cur, change; @@ -2562,7 +2562,7 @@ static void kvm_vz_vcpu_load_tlb(struct
static int kvm_vz_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; bool migrated, all;
/* @@ -2704,7 +2704,7 @@ static int kvm_vz_vcpu_load(struct kvm_v
static int kvm_vz_vcpu_put(struct kvm_vcpu *vcpu, int cpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0;
if (current->flags & PF_VCPU) kvm_vz_vcpu_save_wired(vcpu); @@ -3076,7 +3076,7 @@ static void kvm_vz_vcpu_uninit(struct kv
static int kvm_vz_vcpu_setup(struct kvm_vcpu *vcpu) { - struct mips_coproc *cop0 = vcpu->arch.cop0; + struct mips_coproc *cop0 = &vcpu->arch.cop0; unsigned long count_hz = 100*1000*1000; /* default to 100 MHz */
/*
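
The cop0 change in this patch swaps a separately initialised pointer for a struct embedded in its parent. As a rough standalone illustration of why that removes the NULL case, the sketch below uses hypothetical types (coproc_sketch, arch_with_pointer, arch_with_embedded) that are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register file, for illustration only. */
struct coproc_sketch {
	unsigned long reg[4];
};

/* Before: the member is a pointer that some init step must set up. */
struct arch_with_pointer {
	struct coproc_sketch *cop0;
};

/* After: the storage lives inside the parent object itself. */
struct arch_with_embedded {
	struct coproc_sketch cop0;
};

/*
 * With the embedded layout, callers take the address of the member.
 * That address is valid whenever the parent object is valid.
 */
static struct coproc_sketch *cop0_of(struct arch_with_embedded *arch)
{
	assert(arch != NULL);
	return &arch->cop0;
}
```

A zero-initialised arch_with_pointer has cop0 == NULL until something assigns it, whereas cop0_of() on a zero-initialised arch_with_embedded already yields usable storage. That is why the many call sites in the diff only change from vcpu->arch.cop0 to &vcpu->arch.cop0.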
From: Zhihao Cheng chengzhihao1@huawei.com
commit 26fb5290240dc31cae99b8b4dd2af7f46dfcba6b upstream.
The following sequence makes ext4 load stale buffer heads from the last failed mount in a new mount operation:

mount_bdev
 ext4_fill_super
 | ext4_load_and_init_journal
 |  ext4_load_journal
 |   jbd2_journal_load
 |    load_superblock
 |     journal_get_superblock
 |      set_buffer_verified(bh) // buffer head is verified
 |    jbd2_journal_recover // fails due to EIO
 |  goto failed_mount3a // skip 'sb->s_root' initialization
 deactivate_locked_super
  kill_block_super
   generic_shutdown_super
    if (sb->s_root)
    // false, skip ext4_put_super->invalidate_bdev->
    // invalidate_mapping_pages->mapping_evict_folio->
    // filemap_release_folio->try_to_free_buffers, which
    // cannot drop buffer head.
   blkdev_put
    blkdev_put_whole
     if (atomic_dec_and_test(&bdev->bd_openers))
     // false, systemd-udev happens to open the device. Then
     // blkdev_flush_mapping->kill_bdev->truncate_inode_pages->
     // truncate_inode_folio->truncate_cleanup_folio->
     // folio_invalidate->block_invalidate_folio->
     // filemap_release_folio->try_to_free_buffers will be skipped,
     // dropping buffer head is missed again.

Second mount:
ext4_fill_super
 ext4_load_and_init_journal
  ext4_load_journal
   ext4_get_journal
    jbd2_journal_init_inode
     journal_init_common
      bh = getblk_unmovable
       bh = __find_get_block // Found stale bh from the last failed mount
      journal->j_sb_buffer = bh
   jbd2_journal_load
    load_superblock
     journal_get_superblock
      if (buffer_verified(bh))
      // true, skip journal->j_format_version = 2, value is 0
    jbd2_journal_recover
     do_one_pass
      next_log_block += count_tags(journal, bh)
      // According to journal_tag_bytes(), the 'tag_bytes' calculation is
      // affected by jbd2_has_feature_csum3(). jbd2_has_feature_csum3()
      // returns false because 'j->j_format_version >= 2' is not true,
      // so we get a wrong next_log_block. do_one_pass may then exit
      // early when it finds a non-JBD2_MAGIC_NUMBER block at 'next_log_block'.
The filesystem is corrupted at this point: the journal is only partially replayed, and the new journal sequence number has already been used by the previous mount.
invalidate_bdev() can drop all buffer heads even when racing with a bare read of the block device (e.g. by systemd-udev), so fix this by invalidating the bdev in the error handling path of __ext4_fill_super().
A reproducer is available at [Link].
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217171
Fixes: 25ed6e8a54df ("jbd2: enable journal clients to enable v2 checksumming")
Cc: stable@vger.kernel.org # v3.5
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230315013128.3911115-2-chengzhihao1@huawei.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/ext4/super.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1128,6 +1128,12 @@ static void ext4_blkdev_remove(struct ex
 	struct block_device *bdev;
 	bdev = sbi->s_journal_bdev;
 	if (bdev) {
+		/*
+		 * Invalidate the journal device's buffers. We don't want them
+		 * floating about in memory - the physical journal device may
+		 * hotswapped, and it breaks the `ro-after' testing code.
+		 */
+		invalidate_bdev(bdev);
 		ext4_blkdev_put(bdev);
 		sbi->s_journal_bdev = NULL;
 	}
@@ -1328,13 +1334,7 @@ static void ext4_put_super(struct super_
 	sync_blockdev(sb->s_bdev);
 	invalidate_bdev(sb->s_bdev);
 	if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
-		/*
-		 * Invalidate the journal device's buffers. We don't want them
-		 * floating about in memory - the physical journal device may
-		 * hotswapped, and it breaks the `ro-after' testing code.
-		 */
 		sync_blockdev(sbi->s_journal_bdev);
-		invalidate_bdev(sbi->s_journal_bdev);
 		ext4_blkdev_remove(sbi);
 	}

@@ -5645,6 +5645,7 @@ failed_mount:
 	brelse(sbi->s_sbh);
 	ext4_blkdev_remove(sbi);
 out_fail:
+	invalidate_bdev(sb->s_bdev);
 	sb->s_fs_info = NULL;
 	return err;
 }
From: Kemeng Shi shikemeng@huaweicloud.com
commit 247c3d214c23dfeeeb892e91a82ac1188bdaec9f upstream.
ext4_issue_discard() expects its count in clusters. Pass count_clusters instead of count to fix the mismatch.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com
Cc: stable@kernel.org
Reviewed-by: Ojaswin Mujoo ojaswin@linux.ibm.com
Link: https://lore.kernel.org/r/20230603150327.3596033-11-shikemeng@huaweicloud.co...
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/ext4/mballoc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6243,8 +6243,8 @@ do_more:
 	 * them with group lock_held
 	 */
 	if (test_opt(sb, DISCARD)) {
-		err = ext4_issue_discard(sb, block_group, bit, count,
-					 NULL);
+		err = ext4_issue_discard(sb, block_group, bit,
+					 count_clusters, NULL);
 		if (err && err != -EOPNOTSUPP)
 			ext4_msg(sb, KERN_WARNING, "discard request in"
 				 " group:%u block:%d count:%lu failed"
From: Kemeng Shi shikemeng@huaweicloud.com
commit 11b6890be0084ad4df0e06d89a9fdcc948472c65 upstream.
ext4_free_blocks() retrieves the block from bh if the block parameter is zero. Do this before calling ext4_free_blocks_simple() to avoid potentially passing the wrong block to ext4_free_blocks_simple().
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com
Cc: stable@kernel.org
Reviewed-by: Ojaswin Mujoo ojaswin@linux.ibm.com
Link: https://lore.kernel.org/r/20230603150327.3596033-9-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/ext4/mballoc.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6328,12 +6328,6 @@ void ext4_free_blocks(handle_t *handle,

 	sbi = EXT4_SB(sb);

-	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
-		ext4_free_blocks_simple(inode, block, count);
-		return;
-	}
-
-	might_sleep();
 	if (bh) {
 		if (block)
 			BUG_ON(block != bh->b_blocknr);
@@ -6341,6 +6335,13 @@ void ext4_free_blocks(handle_t *handle,
 			block = bh->b_blocknr;
 	}

+	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
+		ext4_free_blocks_simple(inode, block, count);
+		return;
+	}
+
+	might_sleep();
+
 	if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
 	    !ext4_inode_block_valid(inode, block, count)) {
 		ext4_error(sb, "Freeing blocks not in datazone - "
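
The reordering in this patch amounts to "normalise the block number first, then dispatch". A small standalone sketch of the normalisation step follows; struct bh_sketch and resolve_block() are hypothetical names, and assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one buffer_head field used here. */
struct bh_sketch {
	unsigned long long b_blocknr;
};

/*
 * Return the block number to free. If a buffer head is supplied and the
 * caller passed block == 0, the number is taken from the buffer head;
 * if both are supplied they must agree.
 */
static unsigned long long resolve_block(const struct bh_sketch *bh,
					unsigned long long block)
{
	if (bh != NULL) {
		if (block)
			assert(block == bh->b_blocknr);	/* BUG_ON() stand-in */
		else
			block = bh->b_blocknr;
	}
	return block;
}
```

Any path that consumes the block number, including the fast-commit replay path that calls ext4_free_blocks_simple(), should only run after this step. Otherwise a caller that passed a buffer head with block == 0 would hand 0 to the replay path.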
From: Kemeng Shi shikemeng@huaweicloud.com
commit 2ec6d0a5ea72689a79e6f725fd8b443a788ae279 upstream.
ext4_free_blocks_simple() expects its count in clusters, while ext4_free_blocks() accepts a count in blocks. Convert the count to clusters to fix the mismatch.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Cc: stable@kernel.org Reviewed-by: Ojaswin Mujoo ojaswin@linux.ibm.com Link: https://lore.kernel.org/r/20230603150327.3596033-12-shikemeng@huaweicloud.co... Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/mballoc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -6336,7 +6336,7 @@ void ext4_free_blocks(handle_t *handle,
 	}
 
 	if (sbi->s_mount_state & EXT4_FC_REPLAY) {
-		ext4_free_blocks_simple(inode, block, count);
+		ext4_free_blocks_simple(inode, block, EXT4_NUM_B2C(sbi, count));
 		return;
 	}
 
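For readers unfamiliar with the block/cluster distinction, here is a minimal userspace sketch of the conversion. It assumes a cluster is 2^cluster_bits blocks and that the count is rounded up to whole clusters; struct geom and num_b2c() are hypothetical names for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the cluster geometry kept in the superblock info. */
struct geom {
	unsigned int cluster_bits;	/* log2(blocks per cluster) */
};

/* Round a block count up to a count of whole clusters. */
static uint64_t num_b2c(const struct geom *g, uint64_t blocks)
{
	uint64_t ratio;

	assert(g->cluster_bits < 32);
	ratio = (uint64_t)1 << g->cluster_bits;
	return (blocks + ratio - 1) >> g->cluster_bits;
}
```

With 16 blocks per cluster (cluster_bits = 4), a count of 17 blocks becomes 2 clusters. Passing the raw block count where clusters are expected would overstate the range by the cluster ratio.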
From: Theodore Ts'o tytso@mit.edu
commit 2ef6c32a914b85217b44a0a2418e830e520b085e upstream.
This was reported by a user who noticed that the mtime of a file backing a loopback device was getting bumped when the loopback device is mounted read/only. Note: This doesn't show up when doing a loopback mount of a file directly, via "mount -o ro /tmp/foo.img /mnt", since the loop device is set read-only when mount automatically creates the loop device. However, this is noticeable for a LUKS loop device like this:
% cryptsetup luksOpen /tmp/foo.img test
% mount -o ro /dev/loop0 /mnt ; umount /mnt
or, if LUKS is not in use, if the user manually creates the loop device like this:
% losetup /dev/loop0 /tmp/foo.img
% mount -o ro /dev/loop0 /mnt ; umount /mnt
The modified mtime causes rsync to do a rolling checksum scan of the file on the local and remote side, incrementally increasing the time to rsync the not-modified-but-touched image file.
Fixes: eee00237fa5e ("ext4: commit super block if fs record error when journal record without error") Cc: stable@kernel.org Link: https://lore.kernel.org/r/ZIauBR7YiV3rVAHL@glitch Reported-by: Sean Greenslade sean@seangreenslade.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5980,19 +5980,27 @@ static int ext4_load_journal(struct supe err = jbd2_journal_wipe(journal, !really_read_only); if (!err) { char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL); + __le16 orig_state; + bool changed = false;
if (save) memcpy(save, ((char *) es) + EXT4_S_ERR_START, EXT4_S_ERR_LEN); err = jbd2_journal_load(journal); - if (save) + if (save && memcmp(((char *) es) + EXT4_S_ERR_START, + save, EXT4_S_ERR_LEN)) { memcpy(((char *) es) + EXT4_S_ERR_START, save, EXT4_S_ERR_LEN); + changed = true; + } kfree(save); + orig_state = es->s_state; es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS); + if (orig_state != es->s_state) + changed = true; /* Write out restored error information to the superblock */ - if (!bdev_read_only(sb->s_bdev)) { + if (changed && !really_read_only) { int err2; err2 = ext4_commit_super(sb); err = err ? : err2;
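The "only write if something changed" logic of this patch can be sketched in userspace as follows. This is an illustration under simplifying assumptions: struct fake_sb, ERR_LEN and restore_and_commit() are hypothetical, and a counter stands in for the real superblock write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ERR_LEN 8	/* hypothetical size of the saved error region */

struct fake_sb {
	unsigned char err_info[ERR_LEN];
	uint16_t state;
	int writes;	/* counts simulated superblock writes */
};

/*
 * Restore the saved error region and OR in the error flag, then write
 * the superblock only if something changed and the mount is writable.
 * Returns true when a write was issued.
 */
static bool restore_and_commit(struct fake_sb *sb, const unsigned char *save,
			       uint16_t error_flag, bool really_read_only)
{
	bool changed = false;
	uint16_t orig_state;

	assert(sb && save);
	if (memcmp(sb->err_info, save, ERR_LEN)) {
		memcpy(sb->err_info, save, ERR_LEN);
		changed = true;
	}
	orig_state = sb->state;
	sb->state |= error_flag;
	if (orig_state != sb->state)
		changed = true;

	if (changed && !really_read_only) {
		sb->writes++;
		return true;
	}
	return false;
}
```

When the restored region is identical and no new flag is set, no write is issued, so the backing file is not touched on a read-only mount.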
From: Chao Yu chao@kernel.org
commit c4d13222afd8a64bf11bc7ec68645496ee8b54b9 upstream.
freeze_bdev() can fail for many reasons, so its return value needs to be checked before any further processing.
Fixes: 783d94854499 ("ext4: add EXT4_IOC_GOINGDOWN ioctl") Cc: stable@kernel.org Signed-off-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/r/20230606073203.1310389-1-chao@kernel.org Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ioctl.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -797,6 +797,7 @@ static int ext4_shutdown(struct super_bl
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	__u32 flags;
+	int ret;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -815,7 +816,9 @@ static int ext4_shutdown(struct super_bl
 
 	switch (flags) {
 	case EXT4_GOING_FLAGS_DEFAULT:
-		freeze_bdev(sb->s_bdev);
+		ret = freeze_bdev(sb->s_bdev);
+		if (ret)
+			return ret;
 		set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
 		thaw_bdev(sb->s_bdev);
 		break;
From: Baokun Li libaokun1@huawei.com
commit d13f99632748462c32fc95d729f5e754bab06064 upstream.
Yi found during a review of the patch "ext4: don't BUG on inconsistent journal feature" that when ext4_mark_recovery_complete() returns an error value, the error handling path does not turn off the enabled quotas, which triggers the following kmemleak:
================================================================
unreferenced object 0xffff8cf68678e7c0 (size 64):
  comm "mount", pid 746, jiffies 4294871231 (age 11.540s)
  hex dump (first 32 bytes):
    00 90 ef 82 f6 8c ff ff 00 00 00 00 41 01 00 00  ............A...
    c7 00 00 00 bd 00 00 00 0a 00 00 00 48 00 00 00  ............H...
  backtrace:
    [<00000000c561ef24>] __kmem_cache_alloc_node+0x4d4/0x880
    [<00000000d4e621d7>] kmalloc_trace+0x39/0x140
    [<00000000837eee74>] v2_read_file_info+0x18a/0x3a0
    [<0000000088f6c877>] dquot_load_quota_sb+0x2ed/0x770
    [<00000000340a4782>] dquot_load_quota_inode+0xc6/0x1c0
    [<0000000089a18bd5>] ext4_enable_quotas+0x17e/0x3a0 [ext4]
    [<000000003a0268fa>] __ext4_fill_super+0x3448/0x3910 [ext4]
    [<00000000b0f2a8a8>] ext4_fill_super+0x13d/0x340 [ext4]
    [<000000004a9489c4>] get_tree_bdev+0x1dc/0x370
    [<000000006e723bf1>] ext4_get_tree+0x1d/0x30 [ext4]
    [<00000000c7cb663d>] vfs_get_tree+0x31/0x160
    [<00000000320e1bed>] do_new_mount+0x1d5/0x480
    [<00000000c074654c>] path_mount+0x22e/0xbe0
    [<0000000003e97a8e>] do_mount+0x95/0xc0
    [<000000002f3d3736>] __x64_sys_mount+0xc4/0x160
    [<0000000027d2140c>] do_syscall_64+0x3f/0x90
================================================================
To solve this problem, we add a "failed_mount10" label and call ext4_quota_off_umount() under it to release the enabled quotas.
Fixes: 11215630aada ("ext4: don't BUG on inconsistent journal feature") Cc: stable@kernel.org Signed-off-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230327141630.156875-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5567,7 +5567,7 @@ static int __ext4_fill_super(struct fs_c
 		ext4_msg(sb, KERN_INFO, "recovery complete");
 		err = ext4_mark_recovery_complete(sb, es);
 		if (err)
-			goto failed_mount9;
+			goto failed_mount10;
 	}
 
 	if (test_opt(sb, DISCARD) && !bdev_max_discard_sectors(sb->s_bdev))
@@ -5586,7 +5586,9 @@ static int __ext4_fill_super(struct fs_c
 
 	return 0;
 
-failed_mount9:
+failed_mount10:
+	ext4_quota_off_umount(sb);
+failed_mount9: __maybe_unused
 	ext4_release_orphan_info(sb);
 failed_mount8:
 	ext4_unregister_sysfs(sb);
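The shape of this fix, an extra unwind label placed above the existing one, can be shown with a small userspace sketch. The struct, the fail_at parameter and the function name are hypothetical; only the label ordering mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical resources acquired in order during a mount-like setup. */
struct mount_state {
	int orphan_info;
	int quotas;
};

/* fail_at: 0 = succeed, 1 = fail before quotas are on, 2 = fail after. */
static int fill_super_sketch(struct mount_state *st, int fail_at)
{
	int err;

	assert(st != NULL);
	st->orphan_info = 1;
	if (fail_at == 1) {
		err = -EIO;
		goto failed9;
	}
	st->quotas = 1;
	if (fail_at == 2) {
		err = -EIO;
		goto failed10;	/* late failure must also undo quotas */
	}
	return 0;

failed10:
	st->quotas = 0;
failed9:
	st->orphan_info = 0;
	return err;
}
```

A failure after quotas are enabled falls through both labels, so everything acquired so far is released in reverse order.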
From: Baokun Li libaokun1@huawei.com
commit de25d6e9610a8b30cce9bbb19b50615d02ebca02 upstream.
In our fault injection test, we create an ext4 file, migrate it to non-extent based file, then punch a hole and finally trigger a WARN_ON in the ext4_da_update_reserve_space():
EXT4-fs warning (device sda): ext4_da_update_reserve_space:369: ino 14, used 11 with only 10 reserved data blocks
When writing back a non-extent based file, if we enable delalloc, the number of reserved blocks will be subtracted from the number of blocks mapped by ext4_ind_map_blocks(), and the extent status tree will be updated. We update the extent status tree by first removing the old extent_status and then inserting the new extent_status. If the block range we remove happens to be in an extent, then we need to allocate another extent_status with ext4_es_alloc_extent().
   use old    to remove   to add new
|----------|------------|------------|
           old extent_status
The problem is that the allocation of a new extent_status failed due to a fault injection, and __es_shrink() did not get free memory, resulting in a return of -ENOMEM. Then do_writepages() retries after receiving -ENOMEM, we map to the same extent again, and the number of reserved blocks is again subtracted from the number of blocks in that extent. Since the blocks in the same extent are subtracted twice, we end up triggering WARN_ON at ext4_da_update_reserve_space() because used > ei->i_reserved_data_blocks.
For non-extent based file, we update the number of reserved blocks after ext4_ind_map_blocks() is executed, which causes a problem that when we call ext4_ind_map_blocks() to create a block, it doesn't always create a block, but we always reduce the number of reserved blocks. So we move the logic for updating reserved blocks to ext4_ind_map_blocks() to ensure that the number of reserved blocks is updated only after we do succeed in allocating some new blocks.
Fixes: 5f634d064c70 ("ext4: Fix quota accounting error with fallocate") Cc: stable@kernel.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230424033846.4732-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/indirect.c | 8 ++++++++ fs/ext4/inode.c | 10 ---------- 2 files changed, 8 insertions(+), 10 deletions(-)
--- a/fs/ext4/indirect.c +++ b/fs/ext4/indirect.c @@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle
ext4_update_inode_fsync_trans(handle, inode, 1); count = ar.len; + + /* + * Update reserved blocks/metadata blocks after successful block + * allocation which had been deferred till now. + */ + if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) + ext4_da_update_reserve_space(inode, count, 1); + got_it: map->m_flags |= EXT4_MAP_MAPPED; map->m_pblk = le32_to_cpu(chain[depth-1].key); --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -632,16 +632,6 @@ found: */ ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE); } - - /* - * Update reserved blocks/metadata blocks after successful - * block allocation which had been deferred till now. We don't - * support fallocate for non extent files. So we can update - * reserve space here. - */ - if ((retval > 0) && - (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)) - ext4_da_update_reserve_space(inode, retval, 1); }
if (retval > 0) {
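The double subtraction described in this commit message can be modelled in a few lines of userspace C. This is a deliberately simplified sketch: struct inode_sim and all three functions are hypothetical, and the "retry" is just a second call.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of an inode with delalloc reservations. */
struct inode_sim {
	int reserved;		/* reserved data blocks */
	bool allocated;		/* true once the blocks exist on disk */
};

/*
 * Model of the indirect mapping helper: it allocates only on the first
 * call, but reports 'count' mapped blocks on every call.
 */
static int ind_map_blocks(struct inode_sim *in, int count, bool account_here)
{
	assert(count > 0);
	if (!in->allocated) {
		in->allocated = true;
		if (account_here)
			in->reserved -= count;	/* only after a real allocation */
	}
	return count;
}

/* Old behaviour: the caller subtracts whenever the helper returns > 0. */
static int map_blocks_old(struct inode_sim *in, int count)
{
	int ret = ind_map_blocks(in, count, false);

	if (ret > 0)
		in->reserved -= ret;
	return ret;
}

/* New behaviour: accounting lives next to the allocation itself. */
static int map_blocks_new(struct inode_sim *in, int count)
{
	return ind_map_blocks(in, count, true);
}
```

Calling map_blocks_old() twice on the same range drives the reservation negative, while map_blocks_new() subtracts only once.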
From: Alexander Aring aahringo@redhat.com
commit c6b6d6dcc7f32767d57740e0552337c8de40610b upstream.
This patch reverts commit 2c3fa6ae4d52 ("dlm: check required context while close"). The function dlm_midcomms_close(), which later calls dlm_lowcomms_close(), is called when the cluster manager reports that a node got fenced, which means that the midcomms/lowcomms layer must disconnect the node from cluster communication. The node can rejoin the cluster later. The reverted patch ensured that no new messages could be triggered while in the close() function context. This was done by checking whether the lockspace has been stopped. However, a check is missing: only the specific lockspaces that the fenced node is a member of need to be checked. This is currently complicated because there is no easy way to check whether a node is part of a specific lockspace without stopping the recovery. For now we just revert this commit, as it is only a check for finding possible leaks of stopping lockspaces before close() is called.
Cc: stable@vger.kernel.org Fixes: 2c3fa6ae4d52 ("dlm: check required context while close") Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/dlm/lockspace.c | 12 ------------ fs/dlm/lockspace.h | 1 - fs/dlm/midcomms.c | 3 --- 3 files changed, 16 deletions(-)
--- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -935,15 +935,3 @@ void dlm_stop_lockspaces(void) log_print("dlm user daemon left %d lockspaces", count); }
-void dlm_stop_lockspaces_check(void) -{ - struct dlm_ls *ls; - - spin_lock(&lslist_lock); - list_for_each_entry(ls, &lslist, ls_list) { - if (WARN_ON(!rwsem_is_locked(&ls->ls_in_recovery) || - !dlm_locking_stopped(ls))) - break; - } - spin_unlock(&lslist_lock); -} --- a/fs/dlm/lockspace.h +++ b/fs/dlm/lockspace.h @@ -27,7 +27,6 @@ struct dlm_ls *dlm_find_lockspace_local( struct dlm_ls *dlm_find_lockspace_device(int minor); void dlm_put_lockspace(struct dlm_ls *ls); void dlm_stop_lockspaces(void); -void dlm_stop_lockspaces_check(void); int dlm_new_user_lockspace(const char *name, const char *cluster, uint32_t flags, int lvblen, const struct dlm_lockspace_ops *ops, --- a/fs/dlm/midcomms.c +++ b/fs/dlm/midcomms.c @@ -136,7 +136,6 @@ #include <net/tcp.h>
#include "dlm_internal.h" -#include "lockspace.h" #include "lowcomms.h" #include "config.h" #include "memory.h" @@ -1491,8 +1490,6 @@ int dlm_midcomms_close(int nodeid) if (nodeid == dlm_our_nodeid()) return 0;
- dlm_stop_lockspaces_check(); - idx = srcu_read_lock(&nodes_srcu); /* Abort pending close/remove operation */ node = nodeid2node(nodeid, 0);
From: David Woodhouse dwmw@amazon.co.uk
commit 6c26bd4384da24841bac4f067741bbca18b0fb74 upstream.
If mas_store_gfp() in the gather loop failed, the 'error' variable that ultimately gets returned was not being set. In many cases, its original value of -ENOMEM was still in place, and that was fine. But if VMAs had been split at the start or end of the range, then 'error' could be zero.
Change to the 'error = foo(); if (error) goto …' idiom to fix the bug.
Also clean up a later case which avoided the same bug by *explicitly* setting error = -ENOMEM right before calling the function that might return -ENOMEM.
In a final cosmetic change, move the 'Point of no return' comment to *after* the goto. That's been in the wrong place since the preallocation was removed, and this new error path was added.
Fixes: 606c812eb1d5 ("mm/mmap: Fix error path in do_vmi_align_munmap()") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Liam R. Howlett Liam.Howlett@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/mmap.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2480,7 +2480,8 @@ do_vmi_align_munmap(struct vma_iterator
 		}
 		vma_start_write(next);
 		mas_set_range(&mas_detach, next->vm_start, next->vm_end - 1);
-		if (mas_store_gfp(&mas_detach, next, GFP_KERNEL))
+		error = mas_store_gfp(&mas_detach, next, GFP_KERNEL);
+		if (error)
 			goto munmap_gather_failed;
 		vma_mark_detached(next, true);
 		if (next->vm_flags & VM_LOCKED)
@@ -2529,12 +2530,12 @@ do_vmi_align_munmap(struct vma_iterator
 		BUG_ON(count != test_count);
 	}
 #endif
-	/* Point of no return */
-	error = -ENOMEM;
 	vma_iter_set(vmi, start);
-	if (vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL))
+	error = vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL);
+	if (error)
 		goto clear_tree_failed;
 
+	/* Point of no return */
 	mm->locked_vm -= locked_vm;
 	mm->map_count -= count;
 	/*
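The class of bug fixed here is easy to reproduce outside the kernel. The sketch below uses hypothetical functions (store_sketch(), gather_buggy(), gather_fixed()) to contrast testing a return value without saving it against the 'error = foo(); if (error) goto' idiom.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that can fail with -ENOMEM. */
static int store_sketch(int fail)
{
	assert(fail == 0 || fail == 1);
	return fail ? -ENOMEM : 0;
}

/* Buggy: the result is tested but never stored in 'error'. */
static int gather_buggy(int split_done, int fail)
{
	int error = -ENOMEM;

	if (split_done)
		error = 0;	/* an earlier successful step overwrote it */
	if (store_sketch(fail))
		goto failed;
	return 0;
failed:
	return error;		/* may be 0 even though the store failed */
}

/* Fixed: always capture the result before branching on it. */
static int gather_fixed(int split_done, int fail)
{
	int error = -ENOMEM;

	if (split_done)
		error = 0;
	error = store_sketch(fail);
	if (error)
		goto failed;
	return 0;
failed:
	return error;
}
```

With split_done set and a failing store, gather_buggy() reports success while gather_fixed() returns -ENOMEM.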
From: Christian Marangi ansuelsmth@gmail.com
commit bcb889891371c3cf767f2b9e8768cfe2fdd3810f upstream.
Commit ebeb20a9cd3f ("soc: qcom: mdt_loader: Always invoke PAS mem_setup") dropped the relocate check and made pas_mem_setup run unconditionally. The code was later moved with commit f4e526ff7e38 ("soc: qcom: mdt_loader: Extract PAS operations") to qcom_mdt_pas_init() effectively losing track of what was actually done.
The assumption that PAS mem_setup can be done at any time was wrong, with no good reason behind it, and this caused a regression on some SoCs that use remoteproc to bring up ath11k. One example is the IPQ8074 SoC, which effectively broke: remoteproc silently dies and ath11k does not work.
On this SoC FW relocate is not enabled and PAS mem_setup was correctly skipped in previous kernel version resulting in correct bringup and function of remoteproc and ath11k.
To fix the regression, reintroduce the relocate check in qcom_mdt_pas_init() and correctly skip PAS mem_setup where relocate is not enabled.
Fixes: ebeb20a9cd3f ("soc: qcom: mdt_loader: Always invoke PAS mem_setup") Tested-by: Robert Marko robimarko@gmail.com Co-developed-by: Robert Marko robimarko@gmail.com Signed-off-by: Robert Marko robimarko@gmail.com Signed-off-by: Christian Marangi ansuelsmth@gmail.com Cc: stable@vger.kernel.org Reviewed-by: Mukesh Ojha quic_mojha@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20230526115511.3328-1-ansuelsmth@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soc/qcom/mdt_loader.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
--- a/drivers/soc/qcom/mdt_loader.c +++ b/drivers/soc/qcom/mdt_loader.c @@ -210,6 +210,7 @@ int qcom_mdt_pas_init(struct device *dev const struct elf32_hdr *ehdr; phys_addr_t min_addr = PHYS_ADDR_MAX; phys_addr_t max_addr = 0; + bool relocate = false; size_t metadata_len; void *metadata; int ret; @@ -224,6 +225,9 @@ int qcom_mdt_pas_init(struct device *dev if (!mdt_phdr_valid(phdr)) continue;
+ if (phdr->p_flags & QCOM_MDT_RELOCATABLE) + relocate = true; + if (phdr->p_paddr < min_addr) min_addr = phdr->p_paddr;
@@ -246,11 +250,13 @@ int qcom_mdt_pas_init(struct device *dev goto out; }
- ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr); - if (ret) { - /* Unable to set up relocation */ - dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name); - goto out; + if (relocate) { + ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr); + if (ret) { + /* Unable to set up relocation */ + dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name); + goto out; + } }
out:
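The reintroduced relocate check boils down to scanning the program headers for a flag bit. Here is a minimal userspace sketch; the flag value (bit 27), struct phdr_sketch and needs_mem_setup() are assumptions made for illustration, and the validity filter applied by the real loader is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bit, standing in for QCOM_MDT_RELOCATABLE. */
#define RELOCATABLE_FLAG (UINT32_C(1) << 27)

/* Hypothetical, trimmed-down program header. */
struct phdr_sketch {
	uint32_t p_flags;
};

/* True if any segment asks for relocation, i.e. mem_setup is needed. */
static bool needs_mem_setup(const struct phdr_sketch *ph, size_t n)
{
	assert(ph != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ph[i].p_flags & RELOCATABLE_FLAG)
			return true;
	}
	return false;
}
```

Firmware whose segments never set the flag skips the setup call entirely, which matches the behaviour described for IPQ8074 before the regression.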
From: Ritesh Harjani (IBM) ritesh.list@gmail.com
commit fcced95b6ba2a507a83b8b3e0358a8ac16b13e35 upstream.
PAGE_ALIGN(x) macro gives the next highest value which is multiple of pagesize. But if x is already page aligned then it simply returns x. So, if x passed is 0 in dax_zero_range() function, that means the length gets passed as 0 to ->iomap_begin().
In ext2 it then calls ext2_get_blocks -> max_blocks as 0 and hits bug_on here in ext2_get_blocks(). BUG_ON(maxblocks == 0);
Instead we should be calling dax_truncate_page() here which takes care of it. i.e. it only calls dax_zero_range if the offset is not page/block aligned.
This can be easily triggered with following on fsdax mounted pmem device.
dd if=/dev/zero of=file count=1 bs=512
truncate -s 0 file
[79.525838] EXT2-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[79.529376] ext2 filesystem being mounted at /mnt1/test supports timestamps until 2038 (0x7fffffff)
[93.793207] ------------[ cut here ]------------
[93.795102] kernel BUG at fs/ext2/inode.c:637!
[93.796904] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[93.798659] CPU: 0 PID: 1192 Comm: truncate Not tainted 6.3.0-rc2-xfstests-00056-g131086faa369 #139
[93.806459] RIP: 0010:ext2_get_blocks.constprop.0+0x524/0x610
<...>
[93.835298] Call Trace:
[93.836253] <TASK>
[93.837103] ? lock_acquire+0xf8/0x110
[93.838479] ? d_lookup+0x69/0xd0
[93.839779] ext2_iomap_begin+0xa7/0x1c0
[93.841154] iomap_iter+0xc7/0x150
[93.842425] dax_zero_range+0x6e/0xa0
[93.843813] ext2_setsize+0x176/0x1b0
[93.845164] ext2_setattr+0x151/0x200
[93.846467] notify_change+0x341/0x4e0
[93.847805] ? lock_acquire+0xf8/0x110
[93.849143] ? do_truncate+0x74/0xe0
[93.850452] ? do_truncate+0x84/0xe0
[93.851739] do_truncate+0x84/0xe0
[93.852974] do_sys_ftruncate+0x2b4/0x2f0
[93.854404] do_syscall_64+0x3f/0x90
[93.855789] entry_SYSCALL_64_after_hwframe+0x72/0xdc
CC: stable@vger.kernel.org Fixes: 2aa3048e03d3 ("iomap: switch iomap_zero_range to use iomap_iter") Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 046a58317f29d9603d1068b2bbae47c2332c17ae.1682069716.git.ritesh.list@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext2/inode.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1259,9 +1259,8 @@ static int ext2_setsize(struct inode *in
 	inode_dio_wait(inode);
 
 	if (IS_DAX(inode))
-		error = dax_zero_range(inode, newsize,
-				       PAGE_ALIGN(newsize) - newsize, NULL,
-				       &ext2_iomap_ops);
+		error = dax_truncate_page(inode, newsize, NULL,
+					  &ext2_iomap_ops);
 	else
 		error = block_truncate_page(inode->i_mapping,
 				newsize, ext2_get_block);
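The zero-length case is plain arithmetic and can be checked in userspace. The sketch below assumes a 4096-byte page; SKETCH_PAGE_SIZE, page_align() and zero_len() are hypothetical names that mimic the expression PAGE_ALIGN(newsize) - newsize from the removed code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096)	/* assumed page size */

/* Round x up to the next multiple of the page size (x itself if aligned). */
static uint64_t page_align(uint64_t x)
{
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Length that the old code passed on for zeroing the tail of the page. */
static uint64_t zero_len(uint64_t newsize)
{
	return page_align(newsize) - newsize;
}
```

For newsize = 0, or any other page-aligned size, zero_len() is 0, which is the length that ended up as maxblocks == 0 in ext2_get_blocks(). For newsize = 512 it is 3584.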
From: Siddh Raman Pant code@siddh.me
commit 11509910c599cbd04585ec35a6d5e1a0053d84c1 upstream.
In jfs_dmap.c at line 381, BLKTODMAP is used to get a logical block number inside dbFree(). db_l2nbperpage, which is the log2 number of blocks per page, is passed as an argument to BLKTODMAP which uses it for shifting.
Syzbot reported a shift out-of-bounds crash because db_l2nbperpage is too big. This happens because the large value is set without any validation in dbMount() at line 181.
Thus, make sure that db_l2nbperpage is correct while mounting.
Max number of blocks per page = Page size / Min block size
=> log2(Max num_block per page) = log2(Page size / Min block size)
                                = log2(Page size) - log2(Min block size)

=> Max db_l2nbperpage = L2PSIZE - L2MINBLOCKSIZE
Reported-and-tested-by: syzbot+d2cd27dcf8e04b232eb2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2a70a453331db32ed491f5cbb07e81bf2d22571... Cc: stable@vger.kernel.org Suggested-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Siddh Raman Pant code@siddh.me Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jfs/jfs_dmap.c | 6 ++++++ fs/jfs/jfs_filsys.h | 2 ++ 2 files changed, 8 insertions(+)
--- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -178,7 +178,13 @@ int dbMount(struct inode *ipbmap) dbmp_le = (struct dbmap_disk *) mp->data; bmp->db_mapsize = le64_to_cpu(dbmp_le->dn_mapsize); bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree); + bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage); + if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) { + err = -EINVAL; + goto err_release_metapage; + } + bmp->db_numag = le32_to_cpu(dbmp_le->dn_numag); if (!bmp->db_numag) { err = -EINVAL; --- a/fs/jfs/jfs_filsys.h +++ b/fs/jfs/jfs_filsys.h @@ -122,7 +122,9 @@ #define NUM_INODE_PER_IAG INOSPERIAG
#define MINBLOCKSIZE 512 +#define L2MINBLOCKSIZE 9 #define MAXBLOCKSIZE 4096 +#define L2MAXBLOCKSIZE 12 #define MAXFILESIZE ((s64)1 << 52)
#define JFS_LINK_MAX 0xffffffff
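The bound derived in the commit message can be exercised with a small userspace sketch. It assumes a 4096-byte page (log2 = 12) and the 512-byte minimum block size (log2 = 9) from the patch; the SKETCH_ macros and check_l2nbperpage() are hypothetical names, and only the upper bound is checked, as in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_L2PSIZE		12	/* assumed: log2 of a 4096-byte page */
#define SKETCH_L2MINBLOCKSIZE	9	/* log2 of the 512-byte minimum block */

/* Reject an on-disk log2(blocks per page) that exceeds the derived maximum. */
static int check_l2nbperpage(int32_t l2nbperpage)
{
	assert(SKETCH_L2PSIZE >= SKETCH_L2MINBLOCKSIZE);
	if (l2nbperpage > SKETCH_L2PSIZE - SKETCH_L2MINBLOCKSIZE)
		return -EINVAL;
	return 0;
}
```

Under these assumptions the largest accepted value is 3, i.e. 8 blocks of 512 bytes per 4096-byte page; anything larger is refused at mount time instead of being used as a shift count later.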
From: Frank Wunderlich frank-w@public-files.de
commit 7afe7b5969329175ac4f55a6b9c13ba4f6dc267e upstream.
Storing an uncompressed bl2 requires more space than the partition currently provides.

There is currently no known usage of the reserved partition. OpenWrt uses the same partition layout.

We added the same change to u-boot with commit d7bb1099 [1].
[1] https://source.denx.de/u-boot/u-boot/-/commit/d7bb109900c1ca754a0198b9afb50e...
Cc: stable@vger.kernel.org Fixes: 8e01fb15b815 ("arm64: dts: mt7986: add Bananapi R3") Signed-off-by: Frank Wunderlich frank-w@public-files.de Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Reviewed-by: Daniel Golle daniel@makrotopia.org Link: https://lore.kernel.org/r/20230528113343.7649-1-linux@fw-web.de Signed-off-by: Matthias Brugger matthias.bgg@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso index 84aa229e80f3..e48881be4ed6 100644 --- a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso +++ b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso @@ -27,15 +27,10 @@ partitions {
partition@0 { label = "bl2"; - reg = <0x0 0x20000>; + reg = <0x0 0x40000>; read-only; };
- partition@20000 { - label = "reserved"; - reg = <0x20000 0x20000>; - }; - partition@40000 { label = "u-boot-env"; reg = <0x40000 0x40000>;
From: Sinthu Raja sinthu.raja@ti.com
commit 6bc829ceea4158c7aeb3a9e73d5c52634d78fb6f upstream.
The WKUP_PADCONFIG register region in J721S2 has multiple non-addressable regions, accordingly split the existing wkup_pmx region as follows to avoid the non-addressable regions and include the rest of valid WKUP_PADCONFIG registers. Also update references to old nodes with new ones.
wkup_pmx0 -> 13 pins (WKUP_PADCONFIG 0 - 12)
wkup_pmx1 -> 11 pins (WKUP_PADCONFIG 14 - 24)
wkup_pmx2 -> 72 pins (WKUP_PADCONFIG 26 - 97)
wkup_pmx3 -> 1 pin (WKUP_PADCONFIG 100)
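The region bases, sizes and rebased pin offsets in this patch follow from the pad numbers above, assuming one 32-bit (4-byte) register per pad. The sketch below reproduces that arithmetic; struct pmx_region and the helper names are hypothetical.

```c
#include <assert.h>

#define PADCFG_STRIDE 4u	/* assumed: one 32-bit register per pad */

/* Inclusive range of WKUP_PADCONFIG indices covered by one pinctrl node. */
struct pmx_region {
	unsigned int first_pad;
	unsigned int last_pad;
};

/* Byte offset of the region from the start of the original wkup_pmx0. */
static unsigned int region_base(struct pmx_region r)
{
	return r.first_pad * PADCFG_STRIDE;
}

/* Size in bytes of the region's register window. */
static unsigned int region_size(struct pmx_region r)
{
	assert(r.last_pad >= r.first_pad);
	return (r.last_pad - r.first_pad + 1) * PADCFG_STRIDE;
}

/* Convert an offset in the old single region to one in the new region. */
static unsigned int rebase_offset(unsigned int old_off, struct pmx_region r)
{
	assert(old_off >= region_base(r));
	assert(old_off < region_base(r) + region_size(r));
	return old_off - region_base(r);
}
```

For wkup_pmx2 (pads 26 to 97) this gives a base of 0x68 and a size of 0x120, so the old offset 0x094 becomes 0x02c and 0x134 becomes 0x0cc, matching the values in the diff.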
Fixes: b8545f9d3a54 ("arm64: dts: ti: Add initial support for J721S2 SoC") Cc: stable@vger.kernel.org # 6.3 Signed-off-by: Sinthu Raja sinthu.raja@ti.com Signed-off-by: Thejasvi Konduru t-konduru@ti.com Signed-off-by: Nishanth Menon nm@ti.com Reviewed-by: Udit Kumar u-kumar1@ti.com Link: https://lore.kernel.org/r/20230602153554.1571128-2-nm@ti.com Signed-off-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts | 42 ++++----- arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts | 76 ++++++++--------- arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi | 29 ++++++ 3 files changed, 87 insertions(+), 60 deletions(-)
--- a/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts +++ b/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts @@ -175,49 +175,49 @@ }; };
-&wkup_pmx0 { +&wkup_pmx2 { mcu_cpsw_pins_default: mcu-cpsw-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x094, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */ - J721S2_WKUP_IOPAD(0x090, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */ - J721S2_WKUP_IOPAD(0x08c, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */ - J721S2_WKUP_IOPAD(0x088, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */ - J721S2_WKUP_IOPAD(0x084, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */ - J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */ - J721S2_WKUP_IOPAD(0x07c, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */ - J721S2_WKUP_IOPAD(0x078, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */ - J721S2_WKUP_IOPAD(0x074, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */ - J721S2_WKUP_IOPAD(0x070, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */ - J721S2_WKUP_IOPAD(0x080, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */ - J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */ + J721S2_WKUP_IOPAD(0x02C, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */ + J721S2_WKUP_IOPAD(0x028, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */ + J721S2_WKUP_IOPAD(0x024, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */ + J721S2_WKUP_IOPAD(0x020, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */ + J721S2_WKUP_IOPAD(0x01C, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */ + J721S2_WKUP_IOPAD(0x004, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */ + J721S2_WKUP_IOPAD(0x014, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */ + J721S2_WKUP_IOPAD(0x010, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */ + J721S2_WKUP_IOPAD(0x00C, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */ + J721S2_WKUP_IOPAD(0x008, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */ + J721S2_WKUP_IOPAD(0x018, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */ + J721S2_WKUP_IOPAD(0x000, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */ >; };
mcu_mdio_pins_default: mcu-mdio-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x09c, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */ - J721S2_WKUP_IOPAD(0x098, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */ + J721S2_WKUP_IOPAD(0x034, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */ + J721S2_WKUP_IOPAD(0x030, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */ >; };
mcu_mcan0_pins_default: mcu-mcan0-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0bc, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */ - J721S2_WKUP_IOPAD(0x0b8, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */ + J721S2_WKUP_IOPAD(0x054, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */ + J721S2_WKUP_IOPAD(0x050, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */ >; };
mcu_mcan1_pins_default: mcu-mcan1-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */ - J721S2_WKUP_IOPAD(0x0d0, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX*/ + J721S2_WKUP_IOPAD(0x06C, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */ + J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX*/ >; };
mcu_i2c1_pins_default: mcu-i2c1-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0e0, PIN_INPUT, 0) /* (F24) WKUP_GPIO0_8.MCU_I2C1_SCL */ - J721S2_WKUP_IOPAD(0x0e4, PIN_INPUT, 0) /* (H26) WKUP_GPIO0_9.MCU_I2C1_SDA */ + J721S2_WKUP_IOPAD(0x078, PIN_INPUT, 0) /* (F24) WKUP_GPIO0_8.MCU_I2C1_SCL */ + J721S2_WKUP_IOPAD(0x07c, PIN_INPUT, 0) /* (H26) WKUP_GPIO0_9.MCU_I2C1_SDA */ >; }; }; --- a/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts +++ b/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts @@ -146,81 +146,81 @@ }; };
-&wkup_pmx0 { +&wkup_pmx2 { mcu_cpsw_pins_default: mcu-cpsw-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x094, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */ - J721S2_WKUP_IOPAD(0x090, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */ - J721S2_WKUP_IOPAD(0x08c, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */ - J721S2_WKUP_IOPAD(0x088, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */ - J721S2_WKUP_IOPAD(0x084, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */ - J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */ - J721S2_WKUP_IOPAD(0x07c, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */ - J721S2_WKUP_IOPAD(0x078, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */ - J721S2_WKUP_IOPAD(0x074, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */ - J721S2_WKUP_IOPAD(0x070, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */ - J721S2_WKUP_IOPAD(0x080, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */ - J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */ + J721S2_WKUP_IOPAD(0x02c, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */ + J721S2_WKUP_IOPAD(0x028, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */ + J721S2_WKUP_IOPAD(0x024, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */ + J721S2_WKUP_IOPAD(0x020, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */ + J721S2_WKUP_IOPAD(0x01c, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */ + J721S2_WKUP_IOPAD(0x004, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */ + J721S2_WKUP_IOPAD(0x014, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */ + J721S2_WKUP_IOPAD(0x010, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */ + J721S2_WKUP_IOPAD(0x00c, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */ + J721S2_WKUP_IOPAD(0x008, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */ + J721S2_WKUP_IOPAD(0x018, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */ + J721S2_WKUP_IOPAD(0x000, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */ >; };
mcu_mdio_pins_default: mcu-mdio-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x09c, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */ - J721S2_WKUP_IOPAD(0x098, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */ + J721S2_WKUP_IOPAD(0x034, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */ + J721S2_WKUP_IOPAD(0x030, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */ >; };
mcu_mcan0_pins_default: mcu-mcan0-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0bc, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */ - J721S2_WKUP_IOPAD(0x0b8, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */ + J721S2_WKUP_IOPAD(0x054, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */ + J721S2_WKUP_IOPAD(0x050, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */ >; };
mcu_mcan1_pins_default: mcu-mcan1-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */ - J721S2_WKUP_IOPAD(0x0d0, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX */ + J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */ + J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /*(C23) WKUP_GPIO0_4.MCU_MCAN1_TX */ >; };
mcu_mcan0_gpio_pins_default: mcu-mcan0-gpio-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0c0, PIN_INPUT, 7) /* (D26) WKUP_GPIO0_0 */ - J721S2_WKUP_IOPAD(0x0a8, PIN_INPUT, 7) /* (B25) MCU_SPI0_D1.WKUP_GPIO0_69 */ + J721S2_WKUP_IOPAD(0x058, PIN_INPUT, 7) /* (D26) WKUP_GPIO0_0 */ + J721S2_WKUP_IOPAD(0x040, PIN_INPUT, 7) /* (B25) MCU_SPI0_D1.WKUP_GPIO0_69 */ >; };
mcu_mcan1_gpio_pins_default: mcu-mcan1-gpio-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x0c8, PIN_INPUT, 7) /* (C28) WKUP_GPIO0_2 */ + J721S2_WKUP_IOPAD(0x060, PIN_INPUT, 7) /* (C28) WKUP_GPIO0_2 */ >; };
mcu_adc0_pins_default: mcu-adc0-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x134, PIN_INPUT, 0) /* (L25) MCU_ADC0_AIN0 */ - J721S2_WKUP_IOPAD(0x138, PIN_INPUT, 0) /* (K25) MCU_ADC0_AIN1 */ - J721S2_WKUP_IOPAD(0x13c, PIN_INPUT, 0) /* (M24) MCU_ADC0_AIN2 */ - J721S2_WKUP_IOPAD(0x140, PIN_INPUT, 0) /* (L24) MCU_ADC0_AIN3 */ - J721S2_WKUP_IOPAD(0x144, PIN_INPUT, 0) /* (L27) MCU_ADC0_AIN4 */ - J721S2_WKUP_IOPAD(0x148, PIN_INPUT, 0) /* (K24) MCU_ADC0_AIN5 */ - J721S2_WKUP_IOPAD(0x14c, PIN_INPUT, 0) /* (M27) MCU_ADC0_AIN6 */ - J721S2_WKUP_IOPAD(0x150, PIN_INPUT, 0) /* (M26) MCU_ADC0_AIN7 */ + J721S2_WKUP_IOPAD(0x0cc, PIN_INPUT, 0) /* (L25) MCU_ADC0_AIN0 */ + J721S2_WKUP_IOPAD(0x0d0, PIN_INPUT, 0) /* (K25) MCU_ADC0_AIN1 */ + J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (M24) MCU_ADC0_AIN2 */ + J721S2_WKUP_IOPAD(0x0d8, PIN_INPUT, 0) /* (L24) MCU_ADC0_AIN3 */ + J721S2_WKUP_IOPAD(0x0dc, PIN_INPUT, 0) /* (L27) MCU_ADC0_AIN4 */ + J721S2_WKUP_IOPAD(0x0e0, PIN_INPUT, 0) /* (K24) MCU_ADC0_AIN5 */ + J721S2_WKUP_IOPAD(0x0e4, PIN_INPUT, 0) /* (M27) MCU_ADC0_AIN6 */ + J721S2_WKUP_IOPAD(0x0e8, PIN_INPUT, 0) /* (M26) MCU_ADC0_AIN7 */ >; };
mcu_adc1_pins_default: mcu-adc1-pins-default { pinctrl-single,pins = < - J721S2_WKUP_IOPAD(0x154, PIN_INPUT, 0) /* (P25) MCU_ADC1_AIN0 */ - J721S2_WKUP_IOPAD(0x158, PIN_INPUT, 0) /* (R25) MCU_ADC1_AIN1 */ - J721S2_WKUP_IOPAD(0x15c, PIN_INPUT, 0) /* (P28) MCU_ADC1_AIN2 */ - J721S2_WKUP_IOPAD(0x160, PIN_INPUT, 0) /* (P27) MCU_ADC1_AIN3 */ - J721S2_WKUP_IOPAD(0x164, PIN_INPUT, 0) /* (N25) MCU_ADC1_AIN4 */ - J721S2_WKUP_IOPAD(0x168, PIN_INPUT, 0) /* (P26) MCU_ADC1_AIN5 */ - J721S2_WKUP_IOPAD(0x16c, PIN_INPUT, 0) /* (N26) MCU_ADC1_AIN6 */ - J721S2_WKUP_IOPAD(0x170, PIN_INPUT, 0) /* (N27) MCU_ADC1_AIN7 */ + J721S2_WKUP_IOPAD(0x0ec, PIN_INPUT, 0) /* (P25) MCU_ADC1_AIN0 */ + J721S2_WKUP_IOPAD(0x0f0, PIN_INPUT, 0) /* (R25) MCU_ADC1_AIN1 */ + J721S2_WKUP_IOPAD(0x0f4, PIN_INPUT, 0) /* (P28) MCU_ADC1_AIN2 */ + J721S2_WKUP_IOPAD(0x0f8, PIN_INPUT, 0) /* (P27) MCU_ADC1_AIN3 */ + J721S2_WKUP_IOPAD(0x0fc, PIN_INPUT, 0) /* (N25) MCU_ADC1_AIN4 */ + J721S2_WKUP_IOPAD(0x100, PIN_INPUT, 0) /* (P26) MCU_ADC1_AIN5 */ + J721S2_WKUP_IOPAD(0x104, PIN_INPUT, 0) /* (N26) MCU_ADC1_AIN6 */ + J721S2_WKUP_IOPAD(0x108, PIN_INPUT, 0) /* (N27) MCU_ADC1_AIN7 */ >; }; }; --- a/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi @@ -50,7 +50,34 @@ wkup_pmx0: pinctrl@4301c000 { compatible = "pinctrl-single"; /* Proxy 0 addressing */ - reg = <0x00 0x4301c000 0x00 0x178>; + reg = <0x00 0x4301c000 0x00 0x034>; + #pinctrl-cells = <1>; + pinctrl-single,register-width = <32>; + pinctrl-single,function-mask = <0xffffffff>; + }; + + wkup_pmx1: pinctrl@4301c038 { + compatible = "pinctrl-single"; + /* Proxy 0 addressing */ + reg = <0x00 0x4301c038 0x00 0x02C>; + #pinctrl-cells = <1>; + pinctrl-single,register-width = <32>; + pinctrl-single,function-mask = <0xffffffff>; + }; + + wkup_pmx2: pinctrl@4301c068 { + compatible = "pinctrl-single"; + /* Proxy 0 addressing */ + reg = <0x00 0x4301c068 0x00 0x120>; + #pinctrl-cells = <1>; + pinctrl-single,register-width = 
<32>; + pinctrl-single,function-mask = <0xffffffff>; + }; + + wkup_pmx3: pinctrl@4301c190 { + compatible = "pinctrl-single"; + /* Proxy 0 addressing */ + reg = <0x00 0x4301c190 0x00 0x004>; #pinctrl-cells = <1>; pinctrl-single,register-width = <32>; pinctrl-single,function-mask = <0xffffffff>;
From: Martin Kaiser martin@kaiser.cx
commit d744ae7477190967a3ddc289e2cd4ae59e8b1237 upstream.
Fix the timeout that is used for the initialisation and for the self test. wait_for_completion_timeout expects a timeout in jiffies, but RNGC_TIMEOUT is in milliseconds. Call msecs_to_jiffies to do the conversion.
Cc: stable@vger.kernel.org
Fixes: 1d5449445bd0 ("hwrng: mx-rngc - add a driver for Freescale RNGC")
Signed-off-by: Martin Kaiser martin@kaiser.cx
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/char/hw_random/imx-rngc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/char/hw_random/imx-rngc.c +++ b/drivers/char/hw_random/imx-rngc.c @@ -110,7 +110,7 @@ static int imx_rngc_self_test(struct imx cmd = readl(rngc->base + RNGC_COMMAND); writel(cmd | RNGC_CMD_SELF_TEST, rngc->base + RNGC_COMMAND);
- ret = wait_for_completion_timeout(&rngc->rng_op_done, RNGC_TIMEOUT); + ret = wait_for_completion_timeout(&rngc->rng_op_done, msecs_to_jiffies(RNGC_TIMEOUT)); imx_rngc_irq_mask_clear(rngc); if (!ret) return -ETIMEDOUT; @@ -187,9 +187,7 @@ static int imx_rngc_init(struct hwrng *r cmd = readl(rngc->base + RNGC_COMMAND); writel(cmd | RNGC_CMD_SEED, rngc->base + RNGC_COMMAND);
- ret = wait_for_completion_timeout(&rngc->rng_op_done, - RNGC_TIMEOUT); - + ret = wait_for_completion_timeout(&rngc->rng_op_done, msecs_to_jiffies(RNGC_TIMEOUT)); if (!ret) { ret = -ETIMEDOUT; goto err;
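To illustrate the unit mismatch fixed above, here is a minimal userspace sketch in C. It is not driver code: SKETCH_HZ and SKETCH_TIMEOUT_MS are assumed values chosen for illustration (the real tick rate comes from the kernel's CONFIG_HZ, and the value of RNGC_TIMEOUT is not shown in this patch), and both helpers are hypothetical stand-ins for the idea behind msecs_to_jiffies().

```c
#include <assert.h>

/* Assumed tick rate for illustration only; the kernel uses CONFIG_HZ. */
#define SKETCH_HZ 100UL
/* Assumed timeout in milliseconds; not taken from the driver. */
#define SKETCH_TIMEOUT_MS 3000UL

/* Convert milliseconds to ticks, rounding up so short waits never become 0. */
static unsigned long sketch_msecs_to_ticks(unsigned long ms)
{
	/* Guard against overflow in the multiplication below. */
	assert(ms <= (~0UL - 999UL) / SKETCH_HZ);
	return (ms * SKETCH_HZ + 999UL) / 1000UL;
}

/* How long a given tick count really waits, in milliseconds. */
static unsigned long sketch_ticks_to_msecs(unsigned long ticks)
{
	return ticks * 1000UL / SKETCH_HZ;
}
```

With these assumed values, sketch_msecs_to_ticks(SKETCH_TIMEOUT_MS) is 300 ticks, whereas passing 3000 directly as a tick count waits sketch_ticks_to_msecs(3000) = 30000 ms, ten times longer than intended.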
From: Mikulas Patocka mpatocka@redhat.com
commit 6d50eb4725934fd22f5eeccb401000687c790fd0 upstream.
It was reported that dm-integrity runs out of vmalloc space on 32-bit architectures. On x86, there is only 128MiB vmalloc space and dm-integrity consumes it quickly because it has a 64MiB journal and 8MiB recalculate buffer.
Fix this by reducing the size of the journal to 4MiB and the size of the recalculate buffer to 1MiB, so that multiple dm-integrity devices can be created and activated on 32-bit architectures.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka mpatocka@redhat.com
Signed-off-by: Mike Snitzer snitzer@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/md/dm-integrity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -34,11 +34,11 @@ #define DEFAULT_BUFFER_SECTORS 128 #define DEFAULT_JOURNAL_WATERMARK 50 #define DEFAULT_SYNC_MSEC 10000 -#define DEFAULT_MAX_JOURNAL_SECTORS 131072 +#define DEFAULT_MAX_JOURNAL_SECTORS (IS_ENABLED(CONFIG_64BIT) ? 131072 : 8192) #define MIN_LOG2_INTERLEAVE_SECTORS 3 #define MAX_LOG2_INTERLEAVE_SECTORS 31 #define METADATA_WORKQUEUE_MAX_ACTIVE 16 -#define RECALC_SECTORS 32768 +#define RECALC_SECTORS (IS_ENABLED(CONFIG_64BIT) ? 32768 : 2048) #define RECALC_WRITE_SUPER 16 #define BITMAP_BLOCK_SIZE 4096 /* don't change it */ #define BITMAP_FLUSH_INTERVAL (10 * HZ)
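The macro values in the hunk above are sector counts. The following is a minimal sketch of the size arithmetic, assuming 512-byte sectors; the helper names are hypothetical and only the four numeric constants come from the patch.

```c
#include <assert.h>

/* Assumption: one sector is 512 bytes (shift of 9). */
#define SKETCH_SECTOR_SHIFT 9

/* Convert a sector count to whole MiB. */
static unsigned long long sketch_sectors_to_mib(unsigned long long sectors)
{
	unsigned long long bytes = sectors << SKETCH_SECTOR_SHIFT;

	/* The shift must be reversible, otherwise it overflowed. */
	assert((bytes >> SKETCH_SECTOR_SHIFT) == sectors);
	return bytes >> 20;
}

/* Mirrors the new DEFAULT_MAX_JOURNAL_SECTORS selection from the patch. */
static unsigned long long sketch_max_journal_sectors(int is_64bit)
{
	return is_64bit ? 131072ULL : 8192ULL;
}

/* Mirrors the new RECALC_SECTORS selection from the patch. */
static unsigned long long sketch_recalc_sectors(int is_64bit)
{
	return is_64bit ? 32768ULL : 2048ULL;
}
```

Under that assumption the journal default is 64 MiB on 64-bit and 4 MiB on 32-bit, and the recalculate buffer works out to 16 MiB and 1 MiB respectively (the commit text quotes 8MiB for the old recalculate buffer, so treat either that figure or the 512-byte assumption with care).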
From: Sathya Prakash sathya.prakash@broadcom.com
commit f762326b2baa86ae647e2ba6832bc87e238f68ad upstream.
When the firmware completes a SCSI I/O command that was sent through the admin queue and returns sense data, copy that sense data to the driver's internal buffer for further use.
Fixes: 506bc1a0d6ba ("scsi: mpi3mr: Add support for MPT commands")
Cc: stable@vger.kernel.org
Signed-off-by: Sathya Prakash sathya.prakash@broadcom.com
Signed-off-by: Sumit Saxena sumit.saxena@broadcom.com
Link: https://lore.kernel.org/r/20230531184025.3803-1-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/mpi3mr/mpi3mr_fw.c | 5 +++++
 1 file changed, 5 insertions(+)
--- a/drivers/scsi/mpi3mr/mpi3mr_fw.c +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c @@ -402,6 +402,11 @@ static void mpi3mr_process_admin_reply_d memcpy((u8 *)cmdptr->reply, (u8 *)def_reply, mrioc->reply_sz); } + if (sense_buf && cmdptr->sensebuf) { + cmdptr->is_sense = 1; + memcpy(cmdptr->sensebuf, sense_buf, + MPI3MR_SENSE_BUF_SZ); + } if (cmdptr->is_waiting) { complete(&cmdptr->done); cmdptr->is_waiting = 0;
From: Harald Freudenberger freude@linux.ibm.com
commit af40322e90d4e0093569eceb7d3a28ab635f3e75 upstream.
Administrative requests of any kind should not be retried. Some card firmware detects this and assumes a replay attack. This patch checks on failure whether the low-level functions indicate a retry (EAGAIN) and whether the ADMIN flag is set on the request message. If both are true, the response code for this message is changed to EIO to make sure the zcrypt API layer does not attempt to retry the request. As of now the ADMIN flag is set for a request message when
- for EP11, the field 'flags' of the EP11 CPRB struct has the leftmost bit set.
- for CCA, the CPRB minor version is 'T3', 'T5', 'T6' or 'T7'.

Please note that the do-not-retry rule only applies to a request which has been sent to the card (= has been successfully enqueued) but whose reply indicates some kind of failure, so that by default it would be retried. It is totally fine to retry a request if a previous attempt to enqueue the msg into the firmware queue had some kind of failure and thus the card has never seen this request.
Reported-by: Frank Uhlig Frank.Uhlig1@ibm.com
Signed-off-by: Harald Freudenberger freude@linux.ibm.com
Reviewed-by: Holger Dengler dengler@linux.ibm.com
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/s390/crypto/zcrypt_msgtype6.c | 6 ++++++
 1 file changed, 6 insertions(+)
--- a/drivers/s390/crypto/zcrypt_msgtype6.c +++ b/drivers/s390/crypto/zcrypt_msgtype6.c @@ -1143,6 +1143,9 @@ static long zcrypt_msgtype6_send_cprb(bo ap_cancel_message(zq->queue, ap_msg); }
+ if (rc == -EAGAIN && ap_msg->flags & AP_MSG_FLAG_ADMIN) + rc = -EIO; /* do not retry administrative requests */ + out: if (rc) ZCRYPT_DBF_DBG("%s send cprb at dev=%02x.%04x rc=%d\n", @@ -1263,6 +1266,9 @@ static long zcrypt_msgtype6_send_ep11_cp ap_cancel_message(zq->queue, ap_msg); }
+ if (rc == -EAGAIN && ap_msg->flags & AP_MSG_FLAG_ADMIN) + rc = -EIO; /* do not retry administrative requests */ + out: if (rc) ZCRYPT_DBF_DBG("%s send cprb at dev=%02x.%04x rc=%d\n",
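The return-code remapping added in both hunks can be sketched in isolation as follows. This is a userspace illustration, not the zcrypt code: struct sketch_ap_msg, SKETCH_MSG_FLAG_ADMIN and its bit value are hypothetical, and only the EAGAIN-to-EIO rule is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical bit value; the real AP_MSG_FLAG_ADMIN is defined elsewhere. */
#define SKETCH_MSG_FLAG_ADMIN 0x4u

/* Hypothetical, reduced stand-in for the request message. */
struct sketch_ap_msg {
	unsigned int flags;
};

/*
 * Turn a "please retry" result into a hard I/O error for administrative
 * requests, so that a caller which retries on -EAGAIN never resends them.
 */
static int sketch_finalize_rc(const struct sketch_ap_msg *msg, int rc)
{
	assert(msg != NULL);
	if (rc == -EAGAIN && (msg->flags & SKETCH_MSG_FLAG_ADMIN))
		return -EIO; /* do not retry administrative requests */
	return rc;
}
```

Non-administrative requests keep their -EAGAIN and remain eligible for a retry, and any other return code passes through unchanged.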
From: Ondrej Zary linux@zary.sk
commit 9e30fd26f43b89cb6b4e850a86caa2e50dedb454 upstream.
The quirk for Elo i2 introduced in commit 92597f97a40b ("PCI/PM: Avoid putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2/H2, which use the same Continental Z2 board.

Change the quirk to match the board instead of the system.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215715
Link: https://lore.kernel.org/r/20230614074253.22318-1-linux@zary.sk
Signed-off-by: Ondrej Zary linux@zary.sk
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/pci.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2949,13 +2949,13 @@ static const struct dmi_system_id bridge { /* * Downstream device is not accessible after putting a root port - * into D3cold and back into D0 on Elo i2. + * into D3cold and back into D0 on Elo Continental Z2 board */ - .ident = "Elo i2", + .ident = "Elo Continental Z2", .matches = { - DMI_MATCH(DMI_SYS_VENDOR, "Elo Touch Solutions"), - DMI_MATCH(DMI_PRODUCT_NAME, "Elo i2"), - DMI_MATCH(DMI_PRODUCT_VERSION, "RevB"), + DMI_MATCH(DMI_BOARD_VENDOR, "Elo Touch Solutions"), + DMI_MATCH(DMI_BOARD_NAME, "Geminilake"), + DMI_MATCH(DMI_BOARD_VERSION, "Continental Z2"), }, }, #endif
From: Ross Lagerwall ross.lagerwall@citrix.com
commit e54223275ba1bc6f704a6bab015fcd2ae4f72572 upstream.
When contiguous windows are coalesced by pci_register_host_bridge(), the second resource is expanded to include the first, and the first is invalidated and consequently not added to the bus. However, it remains in the resource hierarchy. For example, these windows:
  fec00000-fec7ffff : PCI Bus 0000:00
  fec80000-fecbffff : PCI Bus 0000:00
are coalesced into this, where the first resource remains in the tree with start/end zeroed out:
  00000000-00000000 : PCI Bus 0000:00
  fec00000-fecbffff : PCI Bus 0000:00
In some cases (e.g. the Xen scratch region), this causes future calls to allocate_resource() to choose an inappropriate location which the caller cannot handle.
Fix by releasing the zeroed-out resource and removing it from the resource hierarchy.
[bhelgaas: commit log]
Fixes: 7c3855c423b1 ("PCI: Coalesce host bridge contiguous apertures")
Link: https://lore.kernel.org/r/20230525153248.712779-1-ross.lagerwall@citrix.com
Signed-off-by: Ross Lagerwall ross.lagerwall@citrix.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/probe.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -997,8 +997,10 @@ static int pci_register_host_bridge(stru resource_list_for_each_entry_safe(window, n, &resources) { offset = window->offset; res = window->res; - if (!res->flags && !res->start && !res->end) + if (!res->flags && !res->start && !res->end) { + release_resource(res); continue; + }
list_move_tail(&window->node, &bridge->windows);
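The coalescing behaviour described in the commit message can be modelled with a small userspace sketch. Everything here is hypothetical (struct sketch_window, the in_tree marker, SKETCH_FLAG_MEM and both helpers); it only mirrors the idea that the first of two contiguous windows is zeroed out and must then also be dropped from the resource tree rather than merely skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value marking a memory window. */
#define SKETCH_FLAG_MEM 0x200UL

/* Hypothetical, reduced stand-in for a host bridge window. */
struct sketch_window {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
	int in_tree; /* 1 while the window is linked in the resource tree */
};

/* Expand b to cover a when b starts right after a ends, then zero a. */
static int sketch_coalesce(struct sketch_window *a, struct sketch_window *b)
{
	assert(a != NULL && b != NULL);
	if (a->flags != b->flags || a->end + 1 != b->start)
		return 0;
	b->start = a->start;
	a->start = a->end = a->flags = 0;
	return 1;
}

/* Register windows; zeroed ones are released from the tree and skipped. */
static size_t sketch_register(struct sketch_window *w, size_t n)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++) {
		if (!w[i].flags && !w[i].start && !w[i].end) {
			w[i].in_tree = 0; /* models release_resource() */
			continue;
		}
		kept++;
	}
	return kept;
}
```

Applied to the two windows from the commit message, the sketch leaves one registered window covering fec00000-fecbffff and no zeroed entry left in the tree.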
From: Robin Murphy robin.murphy@arm.com
commit 88d341716b83abd355558523186ca488918627ee upstream.
Marvell's own product brief implies the 92xx series are a closely related family, and sure enough it turns out that 9235 seems to need the same quirk as the other three, although possibly only when certain ports are used.
Link: https://lore.kernel.org/linux-iommu/2a699a99-545c-1324-e052-7d2f41fed1ae@yah...
Link: https://lore.kernel.org/r/731507e05d70239aec96fcbfab6e65d8ce00edd2.168615716...
Reported-by: Jason Adriaanse jason_a69@yahoo.co.uk
Signed-off-by: Robin Murphy robin.murphy@arm.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Christoph Hellwig hch@lst.de
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/quirks.c | 2 ++
 1 file changed, 2 insertions(+)
--- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -4174,6 +4174,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_M /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230, quirk_dma_func1_alias); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235, + quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642, quirk_dma_func1_alias); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
From: Igor Mammedov imammedo@redhat.com
commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 upstream.
When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough.
Reproducer:
  $ qemu-kvm -monitor stdio -M q35 -m 4G \
      -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \
      -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \
      disk_image

wait till linux guest boots, then hotplug device:
  (qemu) device_add qxl,bus=rp1

hotplug on guest side fails with:
  pci 0000:01:00.0: [1b36:0100] type 00 class 0x038000
  pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff]
  pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x03ffffff]
  pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00001fff]
  pci 0000:01:00.0: reg 0x1c: [io 0x0000-0x001f]
  pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000]
  pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000]
  pci 0000:01:00.0: BAR 1: no space for [mem size 0x04000000]
  pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x04000000]
  pci 0000:01:00.0: BAR 2: assigned [mem 0xfe800000-0xfe801fff]
  pci 0000:01:00.0: BAR 3: assigned [io 0x1000-0x101f]
  qxl 0000:01:00.0: enabling device (0000 -> 0003)
  Unable to create vram_mapping
  qxl: probe of 0000:01:00.0 failed with error -12
However, when using native PCIe hotplug ('-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'), it works fine, since the kernel attempts to reassign unused resources.
Use the same machinery as native PCIe hotplug to (re)assign resources.
Link: https://lore.kernel.org/r/20230424191557.2464760-1-imammedo@redhat.com
Signed-off-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Acked-by: Michael S. Tsirkin mst@redhat.com
Acked-by: Rafael J. Wysocki rafael@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/hotplug/acpiphp_glue.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/pci/hotplug/acpiphp_glue.c +++ b/drivers/pci/hotplug/acpiphp_glue.c @@ -498,7 +498,6 @@ static void enable_slot(struct acpiphp_s acpiphp_native_scan_bridge(dev); } } else { - LIST_HEAD(add_list); int max, pass;
acpiphp_rescan_slot(slot); @@ -512,12 +511,10 @@ static void enable_slot(struct acpiphp_s if (pass && dev->subordinate) { check_hotplug_bridge(slot, dev); pcibios_resource_survey_bus(dev->subordinate); - __pci_bus_size_bridges(dev->subordinate, - &add_list); } } } - __pci_bus_assign_resources(bus, &add_list, NULL); + pci_assign_unassigned_bridge_resources(bus->self); }
acpiphp_sanitize_bus(bus);
On 21.07.23 18:04, Greg Kroah-Hartman wrote:
From: Igor Mammedov imammedo@redhat.com
commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 upstream.
When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough.
Reproducer: $ qemu-kvm -monitor stdio -M q35 -m 4G \ -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \ -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \ disk_image
wait till linux guest boots, then hotplug device: (qemu) device_add qxl,bus=rp1
[...]
Greg, just so you know, that patch (which is also queued for 6.1 and 5.15) is known to cause a regression in 6.5-rc. To quote https://lore.kernel.org/all/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com/
``` Laptop shows a kernel crash trace after a first suspend to ram, on a second attempt to suspend it becomes frozen solid. This is 100% repeatable with a 6.5-rc2 kernel, not happening with a 6.4 kernel - see the attached dmesg output.
I have bisected the kernel builds and it points to: [40613da52b13fb21c5566f10b287e0ca8c12c4e9] PCI: acpiphp: Reassign resources on bridge if necessary
Reversing this patch seems to fix the kernel crash problem on my laptop. ```
Ciao, Thorsten
P.S.: I only noticed this by chance and yet again wonder how to handle these situations better. I guess exporting the list of regressions regzbot tracks in some simple format might be a start; then the stable scripts could simply look up commit ids there when a patch is queued and warn if they find it (which won't help in cases where the regression is reported after the patch is queued :-/ ).
On 23.07.23 11:17, Thorsten Leemhuis wrote:
On 21.07.23 18:04, Greg Kroah-Hartman wrote:
From: Igor Mammedov imammedo@redhat.com
commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 upstream.
When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough.
[...]
Greg, just so you know, that patch (which is also queued for 6.1 and 5.15) is known to cause a regression in 6.5-rc. To quote https://lore.kernel.org/all/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com/
Laptop shows a kernel crash trace after a first suspend to ram, on a second attempt to suspend it becomes frozen solid. This is 100% repeatable with a 6.5-rc2 kernel, not happening with a 6.4 kernel - see the attached dmesg output. I have bisected the kernel builds and it points to: [40613da52b13fb21c5566f10b287e0ca8c12c4e9] PCI: acpiphp: Reassign resources on bridge if necessary Reversing this patch seems to fix the kernel crash problem on my laptop.
Forgot to mention the reply from Bjorn:
``` I queued up a revert of 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") (on my for-linus branch for v6.5).
It looks like a NULL pointer dereference; hopefully the fix is obvious and I can drop the revert and replace it with the fix. ```
Ciao, Thorsten
On Sun, Jul 23, 2023 at 11:25:29AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
On 23.07.23 11:17, Thorsten Leemhuis wrote:
On 21.07.23 18:04, Greg Kroah-Hartman wrote:
From: Igor Mammedov imammedo@redhat.com
commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 upstream.
When using ACPI PCI hotplug, hotplugging a device with large BARs may fail if bridge windows programmed by firmware are not large enough.
[...]
Greg, just so you know, that patch (which is also queued for 6.1 and 5.15) is known to cause a regression in 6.5-rc. To quote https://lore.kernel.org/all/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com/
Laptop shows a kernel crash trace after a first suspend to ram, on a second attempt to suspend it becomes frozen solid. This is 100% repeatable with a 6.5-rc2 kernel, not happening with a 6.4 kernel - see the attached dmesg output. I have bisected the kernel builds and it points to: [40613da52b13fb21c5566f10b287e0ca8c12c4e9] PCI: acpiphp: Reassign resources on bridge if necessary Reversing this patch seems to fix the kernel crash problem on my laptop.
Forgot to mention the reply from Bjorn:
I queued up a revert of 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") (on my for-linus branch for v6.5). It looks like a NULL pointer dereference; hopefully the fix is obvious and I can drop the revert and replace it with the fix.
Thanks, I've dropped this from the stable queues now.
greg k-h
From: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org
commit a33d700e8eea76c62120cb3dbf5e01328f18319a upstream.
In the post init sequence of v2.9.0, write access to read-only registers is not disabled after updating the registers. Fix it by disabling the access after the register update.
Link: https://lore.kernel.org/r/20230619150408.8468-2-manivannan.sadhasivam@linaro...
Fixes: 5d76117f070d ("PCI: qcom: Add support for IPQ8074 PCIe controller")
Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/dwc/pcie-qcom.c | 2 ++
 1 file changed, 2 insertions(+)
--- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -834,6 +834,8 @@ static int qcom_pcie_post_init_2_3_3(str writel(PCI_EXP_DEVCTL2_COMP_TMOUT_DIS, pci->dbi_base + offset + PCI_EXP_DEVCTL2);
+ dw_pcie_dbi_ro_wr_dis(pci); + return 0; }
From: Damien Le Moal dlemoal@kernel.org
commit 4aca56f8eae8aa44867ddd6aa107e06f7613226f upstream.
Reinitialize the transfer_complete DMA transfer completion before calling tx_submit(), to avoid seeing the DMA transfer complete before the completion is initialized, thus potentially losing the completion notification.
Link: https://lore.kernel.org/r/20230415023542.77601-4-dlemoal@kernel.org
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Manivannan Sadhasivam mani@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/endpoint/functions/pci-epf-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/endpoint/functions/pci-epf-test.c +++ b/drivers/pci/endpoint/functions/pci-epf-test.c @@ -151,10 +151,10 @@ static int pci_epf_test_data_transfer(st return -EIO; }
+ reinit_completion(&epf_test->transfer_complete); tx->callback = pci_epf_test_dma_callback; tx->callback_param = epf_test; cookie = tx->tx_submit(tx); - reinit_completion(&epf_test->transfer_complete);
ret = dma_submit_error(cookie); if (ret) {
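Why the ordering matters can be shown with a single-threaded userspace model. This is a deliberate simplification and not the endpoint test driver: struct sketch_completion and all helpers are hypothetical, and sketch_submit_fast() assumes the worst case in which the completion callback has already run by the time the submit call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of a completion: a counter of signals. */
struct sketch_completion {
	unsigned int done;
};

static void sketch_reinit(struct sketch_completion *c)
{
	assert(c != NULL);
	c->done = 0; /* forget any earlier signal */
}

static void sketch_complete(struct sketch_completion *c)
{
	assert(c != NULL);
	c->done++; /* record one completion signal */
}

/* Worst case: the transfer finishes before submit returns. */
static void sketch_submit_fast(struct sketch_completion *c)
{
	sketch_complete(c);
}

/* Old order: submit first, reinitialize afterwards. */
static unsigned int sketch_old_order(void)
{
	struct sketch_completion c = { 0 };

	sketch_submit_fast(&c);
	sketch_reinit(&c); /* wipes the signal that already arrived */
	return c.done;
}

/* New order from the patch: reinitialize first, then submit. */
static unsigned int sketch_new_order(void)
{
	struct sketch_completion c = { 0 };

	sketch_reinit(&c);
	sketch_submit_fast(&c);
	return c.done;
}
```

In this model the old order ends with no pending signal, so a waiter would block, while the new order keeps the signal.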
From: Damien Le Moal dlemoal@kernel.org
commit 933f31a2fe1f20e5b1ee065579f652cd1b317183 upstream.
pci_epf_test_data_transfer() and pci_epf_test_dma_callback() are not handling DMA transfer completion correctly, leading to completion notifications to the RC side that are too early. This problem can be detected when the RC side is running an IOMMU with messages such as:
pci-endpoint-test 0000:0b:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x001c address=0xfff00000 flags=0x0000]
When running the pcitest.sh tests: the address used for a previous test transfer generates the above error while the next test transfer is running.
Fix this by testing the DMA transfer status in pci_epf_test_dma_callback() and notifying the completion only when the transfer status is DMA_COMPLETE or DMA_ERROR. Furthermore, in pci_epf_test_data_transfer(), be paranoid and check again the transfer status and always call dmaengine_terminate_sync() before returning.
Link: https://lore.kernel.org/r/20230415023542.77601-5-dlemoal@kernel.org
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Manivannan Sadhasivam mani@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/endpoint/functions/pci-epf-test.c | 36 ++++++++++++++++++--------
 1 file changed, 26 insertions(+), 10 deletions(-)
--- a/drivers/pci/endpoint/functions/pci-epf-test.c +++ b/drivers/pci/endpoint/functions/pci-epf-test.c @@ -54,6 +54,9 @@ struct pci_epf_test { struct delayed_work cmd_handler; struct dma_chan *dma_chan_tx; struct dma_chan *dma_chan_rx; + struct dma_chan *transfer_chan; + dma_cookie_t transfer_cookie; + enum dma_status transfer_status; struct completion transfer_complete; bool dma_supported; bool dma_private; @@ -85,8 +88,14 @@ static size_t bar_size[] = { 512, 512, 1 static void pci_epf_test_dma_callback(void *param) { struct pci_epf_test *epf_test = param; + struct dma_tx_state state;
- complete(&epf_test->transfer_complete); + epf_test->transfer_status = + dmaengine_tx_status(epf_test->transfer_chan, + epf_test->transfer_cookie, &state); + if (epf_test->transfer_status == DMA_COMPLETE || + epf_test->transfer_status == DMA_ERROR) + complete(&epf_test->transfer_complete); }
/** @@ -120,7 +129,6 @@ static int pci_epf_test_data_transfer(st struct dma_async_tx_descriptor *tx; struct dma_slave_config sconf = {}; struct device *dev = &epf->dev; - dma_cookie_t cookie; int ret;
if (IS_ERR_OR_NULL(chan)) { @@ -152,25 +160,33 @@ static int pci_epf_test_data_transfer(st }
reinit_completion(&epf_test->transfer_complete); + epf_test->transfer_chan = chan; tx->callback = pci_epf_test_dma_callback; tx->callback_param = epf_test; - cookie = tx->tx_submit(tx); + epf_test->transfer_cookie = tx->tx_submit(tx);
- ret = dma_submit_error(cookie); + ret = dma_submit_error(epf_test->transfer_cookie); if (ret) { - dev_err(dev, "Failed to do DMA tx_submit %d\n", cookie); - return -EIO; + dev_err(dev, "Failed to do DMA tx_submit %d\n", ret); + goto terminate; }
dma_async_issue_pending(chan); ret = wait_for_completion_interruptible(&epf_test->transfer_complete); if (ret < 0) { - dmaengine_terminate_sync(chan); - dev_err(dev, "DMA wait_for_completion_timeout\n"); - return -ETIMEDOUT; + dev_err(dev, "DMA wait_for_completion interrupted\n"); + goto terminate; + } + + if (epf_test->transfer_status == DMA_ERROR) { + dev_err(dev, "DMA transfer failed\n"); + ret = -EIO; }
- return 0; +terminate: + dmaengine_terminate_sync(chan); + + return ret; }
struct epf_dma_filter {
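The status filtering introduced by this patch can be summarised in a small userspace sketch. The enum, the struct and both helpers are hypothetical stand-ins; only the rule "signal completion on COMPLETE or ERROR, and report -EIO on ERROR" is taken from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the transfer states the callback may observe. */
enum sketch_dma_status {
	SKETCH_DMA_COMPLETE,
	SKETCH_DMA_IN_PROGRESS,
	SKETCH_DMA_PAUSED,
	SKETCH_DMA_ERROR
};

/* Hypothetical, reduced stand-in for the test state. */
struct sketch_test {
	enum sketch_dma_status status;
	unsigned int done; /* number of completion signals delivered */
};

/* Callback: record the status, signal only for a final state. */
static void sketch_dma_callback(struct sketch_test *t,
				enum sketch_dma_status now)
{
	assert(t != NULL);
	t->status = now;
	if (now == SKETCH_DMA_COMPLETE || now == SKETCH_DMA_ERROR)
		t->done++;
}

/* After the wait: map the recorded status to a return code. */
static int sketch_transfer_result(const struct sketch_test *t)
{
	assert(t != NULL);
	return t->status == SKETCH_DMA_ERROR ? -EIO : 0;
}
```

A callback that fires while the transfer is still in progress therefore no longer wakes the waiter early.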
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit f397fd4ac1fa3afcabd8cee030f953ccaed2a364 upstream.
Assert PCI Configuration Enable bit after probe. When this bit is left to 0 in the endpoint mode, the RK3399 PCIe endpoint core will generate configuration request retry status (CRS) messages back to the root complex. Assert this bit after probe to allow the RK3399 PCIe endpoint core to reply to configuration requests from the root complex. This is documented in section 17.5.8.1.2 of the RK3399 TRM.
Link: https://lore.kernel.org/r/20230418074700.1083505-4-rick.wertenbroek@gmail.co...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Tested-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip-ep.c | 3 +++
 1 file changed, 3 insertions(+)
--- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -631,6 +631,9 @@ static int rockchip_pcie_ep_probe(struct
ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
+ rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE, + PCIE_CLIENT_CONFIG); + return 0; err_epc_mem_exit: pci_epc_mem_exit(epc);
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit 1f1c42ece18de365c976a060f3c8eb481b038e3a upstream.
Write PCI Device ID (DID) to the correct register. The Device ID was not updated through the correct register. Device ID was written to a read-only register and therefore did not work. The Device ID is now set through the correct register. This is documented in the RK3399 TRM section 17.6.6.1.1.
Link: https://lore.kernel.org/r/20230418074700.1083505-3-rick.wertenbroek@gmail.co...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Tested-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip-ep.c | 6 ++++--
 drivers/pci/controller/pcie-rockchip.h | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -125,6 +125,7 @@ static void rockchip_pcie_prog_ep_ob_atu static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn, struct pci_epf_header *hdr) { + u32 reg; struct rockchip_pcie_ep *ep = epc_get_drvdata(epc); struct rockchip_pcie *rockchip = &ep->rockchip;
@@ -137,8 +138,9 @@ static int rockchip_pcie_ep_write_header PCIE_CORE_CONFIG_VENDOR); }
- rockchip_pcie_write(rockchip, hdr->deviceid << 16, - ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + PCI_VENDOR_ID); + reg = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_DID_VID); + reg = (reg & 0xFFFF) | (hdr->deviceid << 16); + rockchip_pcie_write(rockchip, reg, PCIE_EP_CONFIG_DID_VID);
rockchip_pcie_write(rockchip, hdr->revid | --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -133,6 +133,8 @@ #define PCIE_RC_RP_ATS_BASE 0x400000 #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000 #define PCIE_RC_CONFIG_BASE 0xa00000 +#define PCIE_EP_CONFIG_BASE 0xa00000 +#define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00) #define PCIE_RC_CONFIG_RID_CCR (PCIE_RC_CONFIG_BASE + 0x08) #define PCIE_RC_CONFIG_DCR (PCIE_RC_CONFIG_BASE + 0xc4) #define PCIE_RC_CONFIG_DCR_CSPL_SHIFT 18
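The read-modify-write on the combined Device ID / Vendor ID register can be sketched as a pure function. The helper name is hypothetical; the mask and shift follow the hunk above, with an explicit cast added so the shift is done on an unsigned 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replace the upper 16 bits (Device ID) of a DID/VID register value
 * while keeping the lower 16 bits (Vendor ID) untouched.
 */
static uint32_t sketch_set_device_id(uint32_t did_vid, uint16_t device_id)
{
	uint32_t reg = (did_vid & 0xFFFFu) | ((uint32_t)device_id << 16);

	/* The vendor half must survive the update. */
	assert((reg & 0xFFFFu) == (did_vid & 0xFFFFu));
	return reg;
}
```

For example, updating a register value of 0x12341d87 with Device ID 0xabcd yields 0xabcd1d87: the old Device ID is replaced and the Vendor ID 0x1d87 is preserved.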
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit 9dd3c7c4c8c3f7f010d9cdb7c3f42506d93c9527 upstream.
The RK3399 PCIe controller should wait until the PHY PLLs are locked. Add poll and timeout to wait for PHY PLLs to be locked. If they cannot be locked, generate an error message and jump to the error handler. Accessing registers in the PHY clock domain when PLLs are not locked causes a hang. The PHY PLLs status is checked through a side channel register. This is documented in the TRM section 17.5.8.1 "PCIe Initialization Sequence".

Link: https://lore.kernel.org/r/20230418074700.1083505-5-rick.wertenbroek@gmail.co...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Tested-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip.c | 17 +++++++++++++++++
 drivers/pci/controller/pcie-rockchip.h | 2 ++
 2 files changed, 19 insertions(+)
--- a/drivers/pci/controller/pcie-rockchip.c +++ b/drivers/pci/controller/pcie-rockchip.c @@ -14,6 +14,7 @@ #include <linux/clk.h> #include <linux/delay.h> #include <linux/gpio/consumer.h> +#include <linux/iopoll.h> #include <linux/of_pci.h> #include <linux/phy/phy.h> #include <linux/platform_device.h> @@ -153,6 +154,12 @@ int rockchip_pcie_parse_dt(struct rockch } EXPORT_SYMBOL_GPL(rockchip_pcie_parse_dt);
+#define rockchip_pcie_read_addr(addr) rockchip_pcie_read(rockchip, addr) +/* 100 ms max wait time for PHY PLLs to lock */ +#define RK_PHY_PLL_LOCK_TIMEOUT_US 100000 +/* Sleep should be less than 20ms */ +#define RK_PHY_PLL_LOCK_SLEEP_US 1000 + int rockchip_pcie_init_port(struct rockchip_pcie *rockchip) { struct device *dev = rockchip->dev; @@ -254,6 +261,16 @@ int rockchip_pcie_init_port(struct rockc } }
+ err = readx_poll_timeout(rockchip_pcie_read_addr, + PCIE_CLIENT_SIDE_BAND_STATUS, + regs, !(regs & PCIE_CLIENT_PHY_ST), + RK_PHY_PLL_LOCK_SLEEP_US, + RK_PHY_PLL_LOCK_TIMEOUT_US); + if (err) { + dev_err(dev, "PHY PLLs could not lock, %d\n", err); + goto err_power_off_phy; + } + /* * Please don't reorder the deassert sequence of the following * four reset pins. --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -38,6 +38,8 @@ #define PCIE_CLIENT_MODE_EP HIWORD_UPDATE(0x0040, 0) #define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0) #define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080) +#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20) +#define PCIE_CLIENT_PHY_ST BIT(12) #define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c) #define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0) #define PCIE_CLIENT_DEBUG_LTSSM_L1 0x18
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit 166e89d99dd85a856343cca51eee781b793801f2 upstream.
Fix legacy IRQ generation for RK3399 PCIe endpoint core according to the technical reference manual (TRM). Assert and deassert legacy interrupt (INTx) through the legacy interrupt control register ("PCIE_CLIENT_LEGACY_INT_CTRL") instead of manually generating a PCIe message. The generation of the legacy interrupt was tested and validated with the PCIe endpoint test driver.
Link: https://lore.kernel.org/r/20230418074700.1083505-8-rick.wertenbroek@gmail.co...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Tested-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip-ep.c | 45 +++++++-----------------------
 drivers/pci/controller/pcie-rockchip.h    |  6 +++-
 2 files changed, 16 insertions(+), 35 deletions(-)
--- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -347,48 +347,25 @@ static int rockchip_pcie_ep_get_msi(stru }
static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep, u8 fn, - u8 intx, bool is_asserted) + u8 intx, bool do_assert) { struct rockchip_pcie *rockchip = &ep->rockchip; - u32 r = ep->max_regions - 1; - u32 offset; - u32 status; - u8 msg_code; - - if (unlikely(ep->irq_pci_addr != ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR || - ep->irq_pci_fn != fn)) { - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r, - AXI_WRAPPER_NOR_MSG, - ep->irq_phys_addr, 0, 0); - ep->irq_pci_addr = ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR; - ep->irq_pci_fn = fn; - }
intx &= 3; - if (is_asserted) { + + if (do_assert) { ep->irq_pending |= BIT(intx); - msg_code = ROCKCHIP_PCIE_MSG_CODE_ASSERT_INTA + intx; + rockchip_pcie_write(rockchip, + PCIE_CLIENT_INT_IN_ASSERT | + PCIE_CLIENT_INT_PEND_ST_PEND, + PCIE_CLIENT_LEGACY_INT_CTRL); } else { ep->irq_pending &= ~BIT(intx); - msg_code = ROCKCHIP_PCIE_MSG_CODE_DEASSERT_INTA + intx; + rockchip_pcie_write(rockchip, + PCIE_CLIENT_INT_IN_DEASSERT | + PCIE_CLIENT_INT_PEND_ST_NORMAL, + PCIE_CLIENT_LEGACY_INT_CTRL); } - - status = rockchip_pcie_read(rockchip, - ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + - ROCKCHIP_PCIE_EP_CMD_STATUS); - status &= ROCKCHIP_PCIE_EP_CMD_STATUS_IS; - - if ((status != 0) ^ (ep->irq_pending != 0)) { - status ^= ROCKCHIP_PCIE_EP_CMD_STATUS_IS; - rockchip_pcie_write(rockchip, status, - ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + - ROCKCHIP_PCIE_EP_CMD_STATUS); - } - - offset = - ROCKCHIP_PCIE_MSG_ROUTING(ROCKCHIP_PCIE_MSG_ROUTING_LOCAL_INTX) | - ROCKCHIP_PCIE_MSG_CODE(msg_code) | ROCKCHIP_PCIE_MSG_NO_DATA; - writel(0, ep->irq_cpu_addr + offset); }
static int rockchip_pcie_ep_send_legacy_irq(struct rockchip_pcie_ep *ep, u8 fn, --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -38,6 +38,11 @@ #define PCIE_CLIENT_MODE_EP HIWORD_UPDATE(0x0040, 0) #define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0) #define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080) +#define PCIE_CLIENT_LEGACY_INT_CTRL (PCIE_CLIENT_BASE + 0x0c) +#define PCIE_CLIENT_INT_IN_ASSERT HIWORD_UPDATE_BIT(0x0002) +#define PCIE_CLIENT_INT_IN_DEASSERT HIWORD_UPDATE(0x0002, 0) +#define PCIE_CLIENT_INT_PEND_ST_PEND HIWORD_UPDATE_BIT(0x0001) +#define PCIE_CLIENT_INT_PEND_ST_NORMAL HIWORD_UPDATE(0x0001, 0) #define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20) #define PCIE_CLIENT_PHY_ST BIT(12) #define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c) @@ -227,7 +232,6 @@ #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24) #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1 -#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3 #define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12)) #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \ (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008)
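The new PCIE_CLIENT_INT_* values are built with the HIWORD_UPDATE helpers, where the upper 16 bits act as a per-bit write enable for the lower 16 bits. The sketch below is only an illustration: the two helper macros are assumed to match those in pcie-rockchip.h, and `masked_write()` is a hypothetical software model of how such a register reacts to a write, not driver code.

```c
#include <stdint.h>

/* Assumed to match the helpers in pcie-rockchip.h. */
#define HIWORD_UPDATE(mask, val)	((uint32_t)(((mask) << 16) | (val)))
#define HIWORD_UPDATE_BIT(val)		HIWORD_UPDATE(val, val)

/* Same bit positions as the definitions added by the patch. */
#define INT_IN_ASSERT		HIWORD_UPDATE_BIT(0x0002)
#define INT_IN_DEASSERT		HIWORD_UPDATE(0x0002, 0)
#define INT_PEND_ST_PEND	HIWORD_UPDATE_BIT(0x0001)
#define INT_PEND_ST_NORMAL	HIWORD_UPDATE(0x0001, 0)

/* Hypothetical model of a write-masked register: only the low bits
 * whose write-enable bit (in the upper half of value) is set change. */
static uint16_t masked_write(uint16_t current, uint32_t value)
{
	uint16_t mask = (uint16_t)(value >> 16);
	uint16_t bits = (uint16_t)(value & 0xffff);

	return (uint16_t)((current & ~mask) | (bits & mask));
}
```

Under this model, writing `INT_IN_ASSERT | INT_PEND_ST_PEND` sets bits 1 and 0 and leaves every other bit alone, and writing `INT_IN_DEASSERT | INT_PEND_ST_NORMAL` clears the same two bits, so no read-modify-write is needed.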
From: Rick Wertenbroek rick.wertenbroek@gmail.com
commit 8962b2cb39119cbda4fc69a1f83957824f102f81 upstream.
Previously, u16 variables were used to access 32-bit registers, so not all of the data was read from the registers. In addition, a left shift of more than 16 bits moved data out of the variable. Use u32 variables to access 32-bit registers.
Link: https://lore.kernel.org/r/20230418074700.1083505-10-rick.wertenbroek@gmail.c...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Tested-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip-ep.c | 10 +++++-----
 drivers/pci/controller/pcie-rockchip.h    |  1 +
 2 files changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -314,15 +314,15 @@ static int rockchip_pcie_ep_set_msi(stru { struct rockchip_pcie_ep *ep = epc_get_drvdata(epc); struct rockchip_pcie *rockchip = &ep->rockchip; - u16 flags; + u32 flags;
flags = rockchip_pcie_read(rockchip, ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + ROCKCHIP_PCIE_EP_MSI_CTRL_REG); flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK; flags |= - ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) | - PCI_MSI_FLAGS_64BIT; + (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) | + (PCI_MSI_FLAGS_64BIT << ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET); flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP; rockchip_pcie_write(rockchip, flags, ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + @@ -334,7 +334,7 @@ static int rockchip_pcie_ep_get_msi(stru { struct rockchip_pcie_ep *ep = epc_get_drvdata(epc); struct rockchip_pcie *rockchip = &ep->rockchip; - u16 flags; + u32 flags;
flags = rockchip_pcie_read(rockchip, ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + @@ -395,7 +395,7 @@ static int rockchip_pcie_ep_send_msi_irq u8 interrupt_num) { struct rockchip_pcie *rockchip = &ep->rockchip; - u16 flags, mme, data, data_mask; + u32 flags, mme, data, data_mask; u8 msi_count; u64 pci_addr, pci_addr_mask = 0xff;
--- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -225,6 +225,7 @@ #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4 #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19) #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90 +#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
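The effect of the too-narrow variable can be reproduced outside the kernel. In this sketch the offsets mirror the ROCKCHIP_PCIE_EP_MSI_* definitions from the patch and 0x0080 is the value of PCI_MSI_FLAGS_64BIT; the function names are hypothetical and exist only for the comparison.

```c
#include <stdint.h>

#define MSI_FLAGS_OFFSET	16	/* ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET */
#define MSI_CTRL_MMC_OFFSET	17	/* ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET */
#define MSI_FLAGS_64BIT		0x0080	/* value of PCI_MSI_FLAGS_64BIT */

/* Build the MSI control bits in a 32-bit variable, as the fix does. */
static uint32_t build_flags_u32(uint32_t multi_msg_cap)
{
	return (multi_msg_cap << MSI_CTRL_MMC_OFFSET) |
	       ((uint32_t)MSI_FLAGS_64BIT << MSI_FLAGS_OFFSET);
}

/* Same computation stored in 16 bits: everything at bit 16 and above
 * is discarded by the conversion. */
static uint16_t build_flags_u16(uint32_t multi_msg_cap)
{
	return (uint16_t)build_flags_u32(multi_msg_cap);
}
```

For `multi_msg_cap = 3` the 32-bit result is 0x00860000, while the 16-bit variant is 0, which is the data loss the commit message describes.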
From: Damien Le Moal dlemoal@kernel.org
commit 7e6689b34a815bd379dfdbe9855d36f395ef056c upstream.
The address translation unit of the rockchip EP controller does not use the lower 8 bits of a PCIe-space address to map local memory. Thus we must set the align feature field to 256 to let the user know about this constraint.
Link: https://lore.kernel.org/r/20230418074700.1083505-12-rick.wertenbroek@gmail.c...
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/controller/pcie-rockchip-ep.c | 1 +
 1 file changed, 1 insertion(+)
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -485,6 +485,7 @@ static const struct pci_epc_features roc
 	.linkup_notifier = false,
 	.msi_capable = true,
 	.msix_capable = false,
+	.align = 256,
 };
 
 static const struct pci_epc_features*
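Because the low 8 bits of a PCIe-space address are not used by the address translation unit, a caller has to map from a 256-byte aligned base and apply the remainder as an offset. The sketch below shows that split; the function names are hypothetical, and how a given endpoint function driver consumes the align field is an assumption here.

```c
#include <stdint.h>

#define EP_ALIGN 256u	/* same value as the new .align field */

/* Aligned base address that would be programmed into the ATU. */
static uint64_t map_base(uint64_t pci_addr)
{
	return pci_addr & ~(uint64_t)(EP_ALIGN - 1);
}

/* Offset to add to the mapped local address to reach pci_addr. */
static uint64_t map_offset(uint64_t pci_addr)
{
	return pci_addr & (EP_ALIGN - 1);
}
```

For example, a PCI address of 0x12345678 splits into base 0x12345600 and offset 0x78.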
From: Damien Le Moal dlemoal@kernel.org
commit f61b7634a3249d12b9daa36ffbdb9965b6f24c6c upstream.
In pci_endpoint_test_remove(), freeing the IRQs after removing the device creates a small race window for IRQs to be received with the test device memory already released, causing the IRQ handler to access invalid memory, resulting in an oops.
Free the device IRQs before removing the device to avoid this issue.
Link: https://lore.kernel.org/r/20230415023542.77601-15-dlemoal@kernel.org
Fixes: e03327122e2c ("pci_endpoint_test: Add 2 ioctl commands")
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Manivannan Sadhasivam mani@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/misc/pci_endpoint_test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -938,6 +938,9 @@ static void pci_endpoint_test_remove(str
 	if (id < 0)
 		return;
 
+	pci_endpoint_test_release_irq(test);
+	pci_endpoint_test_free_irq_vectors(test);
+
 	misc_deregister(&test->miscdev);
 	kfree(misc_device->name);
 	kfree(test->name);
@@ -947,9 +950,6 @@ static void pci_endpoint_test_remove(str
 			pci_iounmap(pdev, test->bar[bar]);
 	}
 
-	pci_endpoint_test_release_irq(test);
-	pci_endpoint_test_free_irq_vectors(test);
-
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 }
From: Damien Le Moal dlemoal@kernel.org
commit fb620ae73b70c2f57b9d3e911fc24c024ba2324f upstream.
The irq_raised completion used to detect the end of a test case is initialized when the test device is probed, but never reinitialized again before a test case. As a result, the irq_raised completion synchronization is effective only for the first ioctl test case executed. Any subsequent call to wait_for_completion() by another ioctl() call will immediately return, potentially too early, leading to false positive failures.
Fix this by reinitializing the irq_raised completion before starting a new ioctl() test command.
Link: https://lore.kernel.org/r/20230415023542.77601-16-dlemoal@kernel.org
Fixes: 2c156ac71c6b ("misc: Add host side PCI driver for PCI test function device")
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Manivannan Sadhasivam mani@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/misc/pci_endpoint_test.c | 4 ++++
 1 file changed, 4 insertions(+)
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -729,6 +729,10 @@ static long pci_endpoint_test_ioctl(stru
 	struct pci_dev *pdev = test->pdev;
 
 	mutex_lock(&test->mutex);
+
+	reinit_completion(&test->irq_raised);
+	test->last_irq = -ENODATA;
+
 	switch (cmd) {
 	case PCITEST_BAR:
 		bar = arg;
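Why a stale completion matters can be shown with a small counter-based model. This is a simplified, single-threaded sketch with hypothetical names, assuming only that a completion behaves like a counter of pending "done" events; it is not the kernel's struct completion.

```c
#include <stdbool.h>

/* Simplified model: done counts completions that nobody has waited for. */
struct completion_sketch {
	unsigned int done;
};

static void sketch_complete(struct completion_sketch *c)
{
	c->done++;
}

static void sketch_reinit(struct completion_sketch *c)
{
	c->done = 0;
}

/* Non-blocking wait: consumes one pending completion if there is one. */
static bool sketch_try_wait(struct completion_sketch *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}
```

If an IRQ from an earlier test case leaves `done` non-zero, the next wait succeeds immediately even though the new test has not raised its IRQ yet. Calling the reinit step before each command, as the patch does with reinit_completion(), discards that leftover state.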
From: Johan Hovold johan+linaro@kernel.org
commit d420c9886f5369697047b880221789bf0054e438 upstream.
Add the missing module device table alias so that the driver can be autoloaded when built as a module.
Cc: stable@vger.kernel.org # 5.14
Fixes: 6b149f3310a4 ("mfd: pm8008: Add driver for QCOM PM8008 PMIC")
Signed-off-by: Johan Hovold johan+linaro@kernel.org
Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org
Signed-off-by: Lee Jones lee@kernel.org
Link: https://lore.kernel.org/r/20230526091646.17318-2-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/mfd/qcom-pm8008.c | 1 +
 1 file changed, 1 insertion(+)
--- a/drivers/mfd/qcom-pm8008.c
+++ b/drivers/mfd/qcom-pm8008.c
@@ -199,6 +199,7 @@ static const struct of_device_id pm8008_
 	{ .compatible = "qcom,pm8008", },
 	{ },
 };
+MODULE_DEVICE_TABLE(of, pm8008_match);
 
 static struct i2c_driver pm8008_mfd_driver = {
 	.driver = {
From: Jason Baron jbaron@akamai.com
commit e836007089ba8fdf24e636ef2b007651fb4582e6 upstream.
We've found that using raid0 with the 'original' layout and discard enabled with different disk sizes (such that at least two zones are created) can result in data corruption. This is due to the fact that the discard handling in 'raid0_handle_discard()' assumes the 'alternate' layout. We've seen this corruption using ext4 but other filesystems are likely susceptible as well.
More specifically, while multiple zones are necessary to create the corruption, the corruption may not occur with multiple zones if they are laid out in such a way that the layout matches what the 'alternate' layout would have produced. Thus, not all raid0 devices with the 'original' layout, different size disks and discard enabled will encounter this corruption.
The 3.14 kernel inadvertently changed the raid0 disk layout for different size disks. Thus, running a pre-3.14 kernel and post-3.14 kernel on the same raid0 array could corrupt data. This led to the creation of the 'original' layout (to match the pre-3.14 layout) and the 'alternate' layout (to match the post-3.14 layout) in the 5.4 kernel time frame and an option to tell the kernel which layout to use (since it couldn't be autodetected). However, when the 'original' layout was added back in 5.4, discard support for the 'original' layout was not added, leading to this issue.
I've been able to reliably reproduce the corruption with the following test case:
1. create raid0 array with different size disks using original layout
2. mkfs
3. mount -o discard
4. create lots of files
5. remove 1/2 the files
6. fstrim -a (or just the mount point for the raid0 array)
7. umount
8. fsck -fn /dev/md0 (spews all sorts of corruptions)
Let's fix this by adding proper discard support to the 'original' layout. The fix 'maps' the 'original' layout disks to the order in which they are read/written such that we can compare the disks in the same way that the current 'alternate' layout does. A 'disk_shift' field is added to 'struct strip_zone'. This could be computed on the fly in raid0_handle_discard() but by adding this field, we save some computation in the discard path.
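The mapping described above is a modular rotation of the disk index. The standalone sketch below uses the same arithmetic as the map_disk_shift() helper in the patch; `map_all_disks()` is a hypothetical wrapper added only to show the whole mapping at once.

```c
/* Rotate disk_index so that the first disk used in the zone
 * (disk_shift) maps to position 0 in read/write order. */
static int map_disk_shift(int disk_index, int num_disks, int disk_shift)
{
	return (disk_index + num_disks - disk_shift) % num_disks;
}

/* Hypothetical helper: fill out[i] with the mapped position of disk i. */
static void map_all_disks(int num_disks, int disk_shift, int *out)
{
	for (int i = 0; i < num_disks; i++)
		out[i] = map_disk_shift(i, num_disks, disk_shift);
}
```

With 4 disks and writes starting at disk 3, disks 0,1,2,3 map to 1,2,3,0, so disk 3 compares as the first disk in the stripe and disk 2 as the last.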
Note we could also potentially fix this by re-ordering the disks in the zones that follow the first one, and then always read/writing them using the 'alternate' layout. However, that is seen as a more substantial change, and we are attempting the least invasive fix at this time to remedy the corruption.
I've verified the change using the reproducer mentioned above. Typically, the corruption is seen after less than 3 iterations, while the patch has run 500+ iterations.
Cc: NeilBrown neilb@suse.de
Cc: Song Liu song@kernel.org
Fixes: c84a1372df92 ("md/raid0: avoid RAID0 data corruption due to layout confusion.")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Baron jbaron@akamai.com
Signed-off-by: Song Liu song@kernel.org
Link: https://lore.kernel.org/r/20230623180523.1901230-1-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/md/raid0.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++-------
 drivers/md/raid0.h | 1
 2 files changed, 55 insertions(+), 8 deletions(-)
--- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -270,6 +270,18 @@ static int create_strip_zones(struct mdd goto abort; }
+ if (conf->layout == RAID0_ORIG_LAYOUT) { + for (i = 1; i < conf->nr_strip_zones; i++) { + sector_t first_sector = conf->strip_zone[i-1].zone_end; + + sector_div(first_sector, mddev->chunk_sectors); + zone = conf->strip_zone + i; + /* disk_shift is first disk index used in the zone */ + zone->disk_shift = sector_div(first_sector, + zone->nb_dev); + } + } + pr_debug("md/raid0:%s: done.\n", mdname(mddev)); *private_conf = conf;
@@ -431,6 +443,20 @@ exit_acct_set: return ret; }
+/* + * Convert disk_index to the disk order in which it is read/written. + * For example, if we have 4 disks, they are numbered 0,1,2,3. If we + * write the disks starting at disk 3, then the read/write order would + * be disk 3, then 0, then 1, and then disk 2 and we want map_disk_shift() + * to map the disks as follows 0,1,2,3 => 1,2,3,0. So disk 0 would map + * to 1, 1 to 2, 2 to 3, and 3 to 0. That way we can compare disks in + * that 'output' space to understand the read/write disk ordering. + */ +static int map_disk_shift(int disk_index, int num_disks, int disk_shift) +{ + return ((disk_index + num_disks - disk_shift) % num_disks); +} + static void raid0_handle_discard(struct mddev *mddev, struct bio *bio) { struct r0conf *conf = mddev->private; @@ -444,7 +470,9 @@ static void raid0_handle_discard(struct sector_t end_disk_offset; unsigned int end_disk_index; unsigned int disk; + sector_t orig_start, orig_end;
+ orig_start = start; zone = find_zone(conf, &start);
if (bio_end_sector(bio) > zone->zone_end) { @@ -458,6 +486,7 @@ static void raid0_handle_discard(struct } else end = bio_end_sector(bio);
+ orig_end = end; if (zone != conf->strip_zone) end = end - zone[-1].zone_end;
@@ -469,13 +498,26 @@ static void raid0_handle_discard(struct last_stripe_index = end; sector_div(last_stripe_index, stripe_size);
- start_disk_index = (int)(start - first_stripe_index * stripe_size) / - mddev->chunk_sectors; + /* In the first zone the original and alternate layouts are the same */ + if ((conf->layout == RAID0_ORIG_LAYOUT) && (zone != conf->strip_zone)) { + sector_div(orig_start, mddev->chunk_sectors); + start_disk_index = sector_div(orig_start, zone->nb_dev); + start_disk_index = map_disk_shift(start_disk_index, + zone->nb_dev, + zone->disk_shift); + sector_div(orig_end, mddev->chunk_sectors); + end_disk_index = sector_div(orig_end, zone->nb_dev); + end_disk_index = map_disk_shift(end_disk_index, + zone->nb_dev, zone->disk_shift); + } else { + start_disk_index = (int)(start - first_stripe_index * stripe_size) / + mddev->chunk_sectors; + end_disk_index = (int)(end - last_stripe_index * stripe_size) / + mddev->chunk_sectors; + } start_disk_offset = ((int)(start - first_stripe_index * stripe_size) % mddev->chunk_sectors) + first_stripe_index * mddev->chunk_sectors; - end_disk_index = (int)(end - last_stripe_index * stripe_size) / - mddev->chunk_sectors; end_disk_offset = ((int)(end - last_stripe_index * stripe_size) % mddev->chunk_sectors) + last_stripe_index * mddev->chunk_sectors; @@ -483,18 +525,22 @@ static void raid0_handle_discard(struct for (disk = 0; disk < zone->nb_dev; disk++) { sector_t dev_start, dev_end; struct md_rdev *rdev; + int compare_disk; + + compare_disk = map_disk_shift(disk, zone->nb_dev, + zone->disk_shift);
- if (disk < start_disk_index) + if (compare_disk < start_disk_index) dev_start = (first_stripe_index + 1) * mddev->chunk_sectors; - else if (disk > start_disk_index) + else if (compare_disk > start_disk_index) dev_start = first_stripe_index * mddev->chunk_sectors; else dev_start = start_disk_offset;
- if (disk < end_disk_index) + if (compare_disk < end_disk_index) dev_end = (last_stripe_index + 1) * mddev->chunk_sectors; - else if (disk > end_disk_index) + else if (compare_disk > end_disk_index) dev_end = last_stripe_index * mddev->chunk_sectors; else dev_end = end_disk_offset; --- a/drivers/md/raid0.h +++ b/drivers/md/raid0.h @@ -6,6 +6,7 @@ struct strip_zone { sector_t zone_end; /* Start of the next zone (in sectors) */ sector_t dev_start; /* Zone offset in real dev (in sectors) */ int nb_dev; /* # of devices attached to the zone */ + int disk_shift; /* start disk for the original layout */ };
/* Linux 3.14 (20d0189b101) made an unintended change to
From: Alexander Aring aahringo@redhat.com
commit 92655fbda5c05950a411eaabc19e025e86e2a291 upstream.
The GETLK pid values have all been negated since commit 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks"). Revert this for local pids, and leave in place negative pids for remote owners.
Cc: stable@vger.kernel.org
Fixes: 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks")
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/plock.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -360,7 +360,9 @@ int dlm_posix_get(dlm_lockspace_t *locks
 		locks_init_lock(fl);
 		fl->fl_type = (op->info.ex) ? F_WRLCK : F_RDLCK;
 		fl->fl_flags = FL_POSIX;
-		fl->fl_pid = -op->info.pid;
+		fl->fl_pid = op->info.pid;
+		if (op->info.nodeid != dlm_our_nodeid())
+			fl->fl_pid = -fl->fl_pid;
 		fl->fl_start = op->info.start;
 		fl->fl_end = op->info.end;
 		rv = 0;
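The resulting sign rule is small enough to state as a pure function. This is a sketch with a hypothetical name and plain int parameters, mirroring only the two added lines of the hunk:

```c
/* Pid reported by GETLK: positive for a lock holder on this node,
 * negated for a holder on a remote node. */
static int getlk_reported_pid(int owner_pid, int owner_nodeid,
			      int our_nodeid)
{
	int pid = owner_pid;

	if (owner_nodeid != our_nodeid)
		pid = -pid;
	return pid;
}
```

A holder with pid 1234 on the local node is reported as 1234, and the same pid on another node is reported as -1234.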
From: Alexander Aring aahringo@redhat.com
commit c847f4e203046a2c93d8a1cf0348315c0b655a60 upstream.
Immediately clean up a posix lock request if it is interrupted while waiting for a result from user space (dlm_controld.) This largely reverts the recent commit b92a4e3f86b1 ("fs: dlm: change posix lock sigint handling"). That previous commit attempted to defer lock cleanup to the point in time when a result from user space arrived. The deferred approach was not reliable because some dlm plock ops may not receive replies.
Cc: stable@vger.kernel.org
Fixes: b92a4e3f86b1 ("fs: dlm: change posix lock sigint handling")
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/plock.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)
--- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -30,8 +30,6 @@ struct plock_async_data { struct plock_op { struct list_head list; int done; - /* if lock op got interrupted while waiting dlm_controld reply */ - bool sigint; struct dlm_plock_info info; /* if set indicates async handling */ struct plock_async_data *data; @@ -167,12 +165,14 @@ int dlm_posix_lock(dlm_lockspace_t *lock spin_unlock(&ops_lock); goto do_lock_wait; } - - op->sigint = true; + list_del(&op->list); spin_unlock(&ops_lock); + log_debug(ls, "%s: wait interrupted %x %llx pid %d", __func__, ls->ls_global_id, (unsigned long long)number, op->info.pid); + do_unlock_close(&op->info); + dlm_release_plock_op(op); goto out; }
@@ -434,19 +434,6 @@ static ssize_t dev_write(struct file *fi if (iter->info.fsid == info.fsid && iter->info.number == info.number && iter->info.owner == info.owner) { - if (iter->sigint) { - list_del(&iter->list); - spin_unlock(&ops_lock); - - pr_debug("%s: sigint cleanup %x %llx pid %d", - __func__, iter->info.fsid, - (unsigned long long)iter->info.number, - iter->info.pid); - do_unlock_close(&iter->info); - memcpy(&iter->info, &info, sizeof(info)); - dlm_release_plock_op(iter); - return count; - } list_del_init(&iter->list); memcpy(&iter->info, &info, sizeof(info)); if (iter->data) @@ -465,8 +452,8 @@ static ssize_t dev_write(struct file *fi else wake_up(&recv_wq); } else - log_print("%s: no op %x %llx", __func__, - info.fsid, (unsigned long long)info.number); + pr_debug("%s: no op %x %llx", __func__, + info.fsid, (unsigned long long)info.number); return count; }
From: Alexander Aring aahringo@redhat.com
commit 59e45c758ca1b9893ac923dd63536da946ac333b upstream.
If a posix lock request is waiting for a result from user space (dlm_controld), do not let it be interrupted unless the process is killed. This reverts commit a6b1533e9a57 ("dlm: make posix locks interruptible"). The problem with the interruptible change is that all locks were cleared on any signal interrupt. If a signal was received that did not terminate the process, the process could continue running after all its dlm posix locks had been cleared. A future patch will add cancelation to allow proper interruption.
Cc: stable@vger.kernel.org
Fixes: a6b1533e9a57 ("dlm: make posix locks interruptible")
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/plock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -155,7 +155,7 @@ int dlm_posix_lock(dlm_lockspace_t *lock
 
 	send_op(op);
 
-	rv = wait_event_interruptible(recv_wq, (op->done != 0));
+	rv = wait_event_killable(recv_wq, (op->done != 0));
 	if (rv == -ERESTARTSYS) {
 		spin_lock(&ops_lock);
 		/* recheck under ops_lock if we got a done != 0,
From: Alexander Aring aahringo@redhat.com
commit 0f2b1cb89ccdbdcedf7143f4153a4da700a05f48 upstream.
While a non-waiting posix lock request (F_SETLK) is waiting for user space processing (in dlm_controld), wait for that processing to complete with an unkillable wait_event(). This makes F_SETLK behave the same way for F_RDLCK, F_WRLCK and F_UNLCK. F_SETLKW continues to use wait_event_killable().
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/plock.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)
--- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -155,25 +155,29 @@ int dlm_posix_lock(dlm_lockspace_t *lock
send_op(op);
- rv = wait_event_killable(recv_wq, (op->done != 0)); - if (rv == -ERESTARTSYS) { - spin_lock(&ops_lock); - /* recheck under ops_lock if we got a done != 0, - * if so this interrupt case should be ignored - */ - if (op->done != 0) { + if (op->info.wait) { + rv = wait_event_killable(recv_wq, (op->done != 0)); + if (rv == -ERESTARTSYS) { + spin_lock(&ops_lock); + /* recheck under ops_lock if we got a done != 0, + * if so this interrupt case should be ignored + */ + if (op->done != 0) { + spin_unlock(&ops_lock); + goto do_lock_wait; + } + list_del(&op->list); spin_unlock(&ops_lock); - goto do_lock_wait; - } - list_del(&op->list); - spin_unlock(&ops_lock);
- log_debug(ls, "%s: wait interrupted %x %llx pid %d", - __func__, ls->ls_global_id, - (unsigned long long)number, op->info.pid); - do_unlock_close(&op->info); - dlm_release_plock_op(op); - goto out; + log_debug(ls, "%s: wait interrupted %x %llx pid %d", + __func__, ls->ls_global_id, + (unsigned long long)number, op->info.pid); + do_unlock_close(&op->info); + dlm_release_plock_op(op); + goto out; + } + } else { + wait_event(recv_wq, (op->done != 0)); }
do_lock_wait:
From: Alexander Aring aahringo@redhat.com
commit 57e2c2f2d94cfd551af91cedfa1af6d972487197 upstream.
When a waiting plock request (F_SETLKW) is sent to userspace for processing (dlm_controld), the result is returned at a later time. That result could be incorrectly matched to a different waiting request in cases where the owner field is the same (e.g. different threads in a process.) This is fixed by comparing all the properties in the request and reply.
The results for non-waiting plock requests are now matched based on list order because the results are returned in the same order they were sent.
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/plock.c | 58 ++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 45 insertions(+), 13 deletions(-)
--- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -395,7 +395,7 @@ static ssize_t dev_read(struct file *fil if (op->info.flags & DLM_PLOCK_FL_CLOSE) list_del(&op->list); else - list_move(&op->list, &recv_list); + list_move_tail(&op->list, &recv_list); memcpy(&info, &op->info, sizeof(info)); } spin_unlock(&ops_lock); @@ -433,20 +433,52 @@ static ssize_t dev_write(struct file *fi if (check_version(&info)) return -EINVAL;
+ /* + * The results for waiting ops (SETLKW) can be returned in any + * order, so match all fields to find the op. The results for + * non-waiting ops are returned in the order that they were sent + * to userspace, so match the result with the first non-waiting op. + */ spin_lock(&ops_lock); - list_for_each_entry(iter, &recv_list, list) { - if (iter->info.fsid == info.fsid && - iter->info.number == info.number && - iter->info.owner == info.owner) { - list_del_init(&iter->list); - memcpy(&iter->info, &info, sizeof(info)); - if (iter->data) - do_callback = 1; - else - iter->done = 1; - op = iter; - break; + if (info.wait) { + list_for_each_entry(iter, &recv_list, list) { + if (iter->info.fsid == info.fsid && + iter->info.number == info.number && + iter->info.owner == info.owner && + iter->info.pid == info.pid && + iter->info.start == info.start && + iter->info.end == info.end && + iter->info.ex == info.ex && + iter->info.wait) { + op = iter; + break; + } } + } else { + list_for_each_entry(iter, &recv_list, list) { + if (!iter->info.wait) { + op = iter; + break; + } + } + } + + if (op) { + /* Sanity check that op and info match. */ + if (info.wait) + WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK); + else + WARN_ON(op->info.fsid != info.fsid || + op->info.number != info.number || + op->info.owner != info.owner || + op->info.optype != info.optype); + + list_del_init(&op->list); + memcpy(&op->info, &info, sizeof(info)); + if (op->data) + do_callback = 1; + else + op->done = 1; } spin_unlock(&ops_lock);
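The matching rule in this hunk can be sketched over a plain array. The struct and function names below are hypothetical, and the array stands in for recv_list in send order. A waiting reply must match every field of a waiting request, while a non-waiting reply is paired with the first non-waiting request.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the fields compared by the patch. */
struct plock_info_sketch {
	uint32_t fsid;
	uint64_t number;
	uint64_t owner;
	uint32_t pid;
	uint64_t start;
	uint64_t end;
	bool ex;
	bool wait;
};

static bool same_request(const struct plock_info_sketch *a,
			 const struct plock_info_sketch *b)
{
	return a->fsid == b->fsid && a->number == b->number &&
	       a->owner == b->owner && a->pid == b->pid &&
	       a->start == b->start && a->end == b->end &&
	       a->ex == b->ex;
}

/* Returns the index of the pending op the reply belongs to, or -1. */
static int find_op(const struct plock_info_sketch *pending, size_t n,
		   const struct plock_info_sketch *reply)
{
	for (size_t i = 0; i < n; i++) {
		if (reply->wait) {
			/* Waiting results may arrive in any order. */
			if (pending[i].wait &&
			    same_request(&pending[i], reply))
				return (int)i;
		} else if (!pending[i].wait) {
			/* Non-waiting results arrive in send order. */
			return (int)i;
		}
	}
	return -1;
}
```

Two waiting requests that share an owner but differ in pid are told apart by the full-field comparison, which is the mismatch case the commit message describes for threads in one process.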
From: Alexander Aring aahringo@redhat.com
commit 7a931477bff1c7548aa8492bccf600f5f29452b1 upstream.
This patch clears the DLM_IFL_CB_PENDING_BIT flag, which is set when callback work is queued, in the case where there was no callback to dequeue. This is a buggy case that should never happen, which is why there is a WARN_ON(). However, if it does happen, we are prepared to recover from it.
Cc: stable@vger.kernel.org
Fixes: 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks")
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/ast.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 700ff2e0515a..ff0ef4653535 100644
--- a/fs/dlm/ast.c
+++ b/fs/dlm/ast.c
@@ -181,10 +181,12 @@ void dlm_callback_work(struct work_struct *work)
 
 	spin_lock(&lkb->lkb_cb_lock);
 	rv = dlm_dequeue_lkb_callback(lkb, &cb);
-	spin_unlock(&lkb->lkb_cb_lock);
-
-	if (WARN_ON_ONCE(rv == DLM_DEQUEUE_CALLBACK_EMPTY))
+	if (WARN_ON_ONCE(rv == DLM_DEQUEUE_CALLBACK_EMPTY)) {
+		clear_bit(DLM_IFL_CB_PENDING_BIT, &lkb->lkb_iflags);
+		spin_unlock(&lkb->lkb_cb_lock);
 		goto out;
+	}
+	spin_unlock(&lkb->lkb_cb_lock);
 
 	for (;;) {
 		castfn = lkb->lkb_astfn;
From: Alexander Aring aahringo@redhat.com
commit f68bb23cad1f128198074ed7b3a4c5fb03dbd9d2 upstream.
This patch sets the process_dlm_messages_pending boolean to false when there was no message to process. This case should not happen, but if it does, we are prepared to recover from it by setting the pending boolean to false.
Cc: stable@vger.kernel.org
Fixes: dbb751ffab0b ("fs: dlm: parallelize lowcomms socket handling")
Signed-off-by: Alexander Aring aahringo@redhat.com
Signed-off-by: David Teigland teigland@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/dlm/lowcomms.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 3d3802c47b8b..5aad4d4842eb 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -898,6 +898,7 @@ static void process_dlm_messages(struct work_struct *work)
 	pentry = list_first_entry_or_null(&processqueue,
 					  struct processqueue_entry, list);
 	if (WARN_ON_ONCE(!pentry)) {
+		process_dlm_messages_pending = false;
 		spin_unlock(&processqueue_lock);
 		return;
 	}
From: Justin Tee justin.tee@broadcom.com
commit 97f975823f8196d970bd795087b514271214677a upstream.
Smatch detected a double free path because lpfc_nlp_not_used() releases an ndlp object before reaching lpfc_nlp_put() at the end of lpfc_cmpl_els_logo_acc().
Remove the outdated lpfc_nlp_not_used() routine. In lpfc_mbx_cmpl_ns_reg_login(), replace the call with lpfc_nlp_put(). In lpfc_cmpl_els_logo_acc(), replace the call with lpfc_unreg_rpi() and keep the lpfc_nlp_put() at the end of the routine. If ndlp's rpi was registered, then lpfc_unreg_rpi()'s completion routine performs the final ndlp clean up after lpfc_nlp_put() is called from lpfc_cmpl_els_logo_acc(). Otherwise if ndlp has no rpi registered, the lpfc_nlp_put() at the end of lpfc_cmpl_els_logo_acc() is the final ndlp clean up.
Fixes: 4430f7fd09ec ("scsi: lpfc: Rework locations of ndlp reference taking") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Dan Carpenter error27@gmail.com Link: https://lore.kernel.org/all/Y3OefhyyJNKH%2Fiaf@kili/ Signed-off-by: Justin Tee justin.tee@broadcom.com Link: https://lore.kernel.org/r/20230417191558.83100-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/lpfc/lpfc_crtn.h | 1 - drivers/scsi/lpfc/lpfc_els.c | 30 +++++++----------------------- drivers/scsi/lpfc/lpfc_hbadisc.c | 24 +++--------------------- 3 files changed, 10 insertions(+), 45 deletions(-)
--- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -134,7 +134,6 @@ void lpfc_check_nlp_post_devloss(struct struct lpfc_nodelist *ndlp); void lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, struct lpfc_iocbq *rspiocb); -int lpfc_nlp_not_used(struct lpfc_nodelist *ndlp); struct lpfc_nodelist *lpfc_setup_disc_node(struct lpfc_vport *, uint32_t); void lpfc_disc_list_loopmap(struct lpfc_vport *); void lpfc_disc_start(struct lpfc_vport *); --- a/drivers/scsi/lpfc/lpfc_els.c +++ b/drivers/scsi/lpfc/lpfc_els.c @@ -5205,14 +5205,9 @@ lpfc_els_free_iocb(struct lpfc_hba *phba * * This routine is the completion callback function to the Logout (LOGO) * Accept (ACC) Response ELS command. This routine is invoked to indicate - * the completion of the LOGO process. It invokes the lpfc_nlp_not_used() to - * release the ndlp if it has the last reference remaining (reference count - * is 1). If succeeded (meaning ndlp released), it sets the iocb ndlp - * field to NULL to inform the following lpfc_els_free_iocb() routine no - * ndlp reference count needs to be decremented. Otherwise, the ndlp - * reference use-count shall be decremented by the lpfc_els_free_iocb() - * routine. Finally, the lpfc_els_free_iocb() is invoked to release the - * IOCB data structure. + * the completion of the LOGO process. If the node has transitioned to NPR, + * this routine unregisters the RPI if it is still registered. The + * lpfc_els_free_iocb() is invoked to release the IOCB data structure. **/ static void lpfc_cmpl_els_logo_acc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb, @@ -5253,19 +5248,9 @@ lpfc_cmpl_els_logo_acc(struct lpfc_hba * (ndlp->nlp_last_elscmd == ELS_CMD_PLOGI)) goto out;
- /* NPort Recovery mode or node is just allocated */ - if (!lpfc_nlp_not_used(ndlp)) { - /* A LOGO is completing and the node is in NPR state. - * Just unregister the RPI because the node is still - * required. - */ + if (ndlp->nlp_flag & NLP_RPI_REGISTERED) lpfc_unreg_rpi(vport, ndlp); - } else { - /* Indicate the node has already released, should - * not reference to it from within lpfc_els_free_iocb. - */ - cmdiocb->ndlp = NULL; - } + } out: /* @@ -5285,9 +5270,8 @@ lpfc_cmpl_els_logo_acc(struct lpfc_hba * * RPI (Remote Port Index) mailbox command to the @phba. It simply releases * the associated lpfc Direct Memory Access (DMA) buffer back to the pool and * decrements the ndlp reference count held for this completion callback - * function. After that, it invokes the lpfc_nlp_not_used() to check - * whether there is only one reference left on the ndlp. If so, it will - * perform one more decrement and trigger the release of the ndlp. + * function. After that, it invokes the lpfc_drop_node to check + * whether it is appropriate to release the node. **/ void lpfc_mbx_cmpl_dflt_rpi(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb) --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -4333,13 +4333,14 @@ out:
/* If the node is not registered with the scsi or nvme * transport, remove the fabric node. The failed reg_login - * is terminal. + * is terminal and forces the removal of the last node + * reference. */ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) { spin_lock_irq(&ndlp->lock); ndlp->nlp_flag &= ~NLP_NPR_2B_DISC; spin_unlock_irq(&ndlp->lock); - lpfc_nlp_not_used(ndlp); + lpfc_nlp_put(ndlp); }
if (phba->fc_topology == LPFC_TOPOLOGY_LOOP) { @@ -6704,25 +6705,6 @@ lpfc_nlp_put(struct lpfc_nodelist *ndlp) return ndlp ? kref_put(&ndlp->kref, lpfc_nlp_release) : 0; }
-/* This routine free's the specified nodelist if it is not in use - * by any other discovery thread. This routine returns 1 if the - * ndlp has been freed. A return value of 0 indicates the ndlp is - * not yet been released. - */ -int -lpfc_nlp_not_used(struct lpfc_nodelist *ndlp) -{ - lpfc_debugfs_disc_trc(ndlp->vport, LPFC_DISC_TRC_NODE, - "node not used: did:x%x flg:x%x refcnt:x%x", - ndlp->nlp_DID, ndlp->nlp_flag, - kref_read(&ndlp->kref)); - - if (kref_read(&ndlp->kref) == 1) - if (lpfc_nlp_put(ndlp)) - return 1; - return 0; -} - /** * lpfc_fcf_inuse - Check if FCF can be unregistered. * @phba: Pointer to hba context object.
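The double free described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct node, node_put(), node_not_used(), old_completion() and new_completion() are hypothetical names that mimic the shape of the kref handling, not the lpfc code itself, and a put on an already released node is merely counted instead of corrupting memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a reference-counted node. */
struct node {
	int refcount;
	int bad_puts;	/* puts issued after the node was already released */
};

/* Drop one reference; returns true when this put released the node. */
bool node_put(struct node *n)
{
	assert(n != NULL);
	if (n->refcount == 0) {
		n->bad_puts++;	/* in the kernel: use-after-free / double free */
		return false;
	}
	return --n->refcount == 0;
}

/* Logic of the removed helper: drop the reference only if it is the last. */
bool node_not_used(struct node *n)
{
	assert(n != NULL);
	return n->refcount == 1 && node_put(n);
}

/* Old shape: conditional release followed by an unconditional put. */
void old_completion(struct node *n)
{
	node_not_used(n);
	node_put(n);
}

/* New shape: exactly one put for the reference this path holds. */
void new_completion(struct node *n)
{
	node_put(n);
}
```

With a starting refcount of 1, old_completion() releases the node in the helper and then puts it again, while new_completion() drops the single reference once.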
From: Brian Norris briannorris@chromium.org
commit 9d0e3cac3517942a6e00eeecfe583a98715edb16 upstream.
The self-refresh helper framework overloads "disable" to sometimes mean "go into self-refresh mode," and this mode activates automatically (e.g., after some period of unchanging display output). In such cases, the display pipe is still considered "on", and user-space is not aware that we went into self-refresh mode. Thus, users may expect that vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work properly.
However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave vblank enabled.
Add a different expectation: that CRTCs *should* leave vblank enabled when going into self-refresh.
This patch is preparation for another patch -- "drm/rockchip: vop: Leave vblank enabled in self-refresh" -- which resolves conflicts between the above self-refresh behavior and the API tests in IGT's kms_vblank test module.
== Some alternatives discussed: ==
It's likely that on many display controllers, vblank interrupts will turn off when the CRTC is disabled, and so in some cases, self-refresh may not support vblank. To support such cases, we might consider additions to the generic helpers such that we fire vblank events based on a timer.
However, there is currently only one driver using the common self-refresh helpers (i.e., rockchip), and at least as of commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh"), the CRTC hardware is powered enough to continue to generate vblank interrupts.
So we chose the simpler option of leaving vblank interrupts enabled. We can reevaluate this decision and perhaps augment the helpers if/when we gain a second driver that has different requirements.
v3: * include discussion summary
v2: * add 'ret != 0' warning case for self-refresh * describe failing test case and relation to drm/rockchip patch better
Cc: stable@vger.kernel.org # dependency for "drm/rockchip: vop: Leave # vblank enabled in self-refresh" Signed-off-by: Brian Norris briannorris@chromium.org Signed-off-by: Sean Paul seanpaul@chromium.org Link: https://patchwork.freedesktop.org/patch/msgid/20230109171809.v3.1.I3904f6978... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_atomic_helper.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1209,7 +1209,16 @@ disable_outputs(struct drm_device *dev, continue;
ret = drm_crtc_vblank_get(crtc); - WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n"); + /* + * Self-refresh is not a true "disable"; ensure vblank remains + * enabled. + */ + if (new_crtc_state->self_refresh_active) + WARN_ONCE(ret != 0, + "driver disabled vblank in self-refresh\n"); + else + WARN_ONCE(ret != -EINVAL, + "driver forgot to call drm_crtc_vblank_off()\n"); if (ret == 0) drm_crtc_vblank_put(crtc); }
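The new expectation in disable_outputs() can be restated as a small predicate. The following is a sketch for illustration only; vblank_state_ok() is a hypothetical helper, and it assumes the drm_crtc_vblank_get() return value is either 0 or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Returns true when the result of a vblank "get" matches what the patched
 * check expects, i.e. when neither WARN_ONCE() would fire.
 */
bool vblank_state_ok(bool self_refresh_active, int ret)
{
	assert(ret <= 0);	/* 0 on success, negative errno otherwise */

	if (self_refresh_active)
		return ret == 0;	/* vblank must still be enabled */

	return ret == -EINVAL;		/* vblank must have been turned off */
}
```

So a driver that calls drm_crtc_vblank_off() on a real disable still passes, and a driver that leaves vblank enabled while entering self-refresh no longer triggers the warning.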
From: Brian Norris briannorris@chromium.org
commit 2bdba9d4a3baa758c2ca7f5b37b35c7b3391dc42 upstream.
If we disable vblank when entering self-refresh, vblank APIs (like DRM_IOCTL_WAIT_VBLANK) no longer work. But user space is not aware when we enter self-refresh, so this appears to be an API violation -- that DRM_IOCTL_WAIT_VBLANK fails with EINVAL whenever the display is idle and enters self-refresh.
The downstream driver used by many of these systems never used to disable vblank for PSR, and in fact, even upstream, we didn't do that until radically redesigning the state machine in commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR").
Thus, it seems like a reasonable API fix to simply restore that behavior, and leave vblank enabled.
Note that this appears to potentially unbalance the drm_crtc_vblank_{off,on}() calls in some cases, but: (a) drm_crtc_vblank_on() documents this as OK and (b) if I do the naive balancing, I find state machine issues such that we're not in sync properly; so it's easier to take advantage of (a).
This issue was exposed by IGT's kms_vblank tests, and reported by KernelCI. The bug has been around a while (longer than KernelCI noticed), but was only exposed once self-refresh was bugfixed more recently, and so KernelCI could properly test it. Some other notes in:
https://lore.kernel.org/dri-devel/Y6OCg9BPnJvimQLT@google.com/ Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
== Backporting notes: ==
Marking as 'Fixes' commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), but it probably depends on commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh") as well.
We also need the previous patch ("drm/atomic: Allow vblank-enabled + self-refresh "disable""), of course.
v3: * no update
v2: * skip unnecessary lock/unlock
Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: stable@vger.kernel.org Reported-by: "kernelci.org bot" bot@kernelci.org Link: https://lore.kernel.org/dri-devel/Y5itf0+yNIQa6fU4@sirena.org.uk/ Signed-off-by: Brian Norris briannorris@chromium.org Signed-off-by: Sean Paul seanpaul@chromium.org Link: https://patchwork.freedesktop.org/patch/msgid/20230109171809.v3.2.Ic07cba4ab... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -714,13 +714,13 @@ static void vop_crtc_atomic_disable(stru if (crtc->state->self_refresh_active) rockchip_drm_set_win_enabled(crtc, false);
+ if (crtc->state->self_refresh_active) + goto out; + mutex_lock(&vop->vop_lock);
drm_crtc_vblank_off(crtc);
- if (crtc->state->self_refresh_active) - goto out; - /* * Vop standby will take effect at end of current frame, * if dsp hold valid irq happen, it means standby complete. @@ -754,9 +754,9 @@ static void vop_crtc_atomic_disable(stru vop_core_clks_disable(vop); pm_runtime_put(vop->dev);
-out: mutex_unlock(&vop->vop_lock);
+out: if (crtc->state->event && !crtc->state->active) { spin_lock_irq(&crtc->dev->event_lock); drm_crtc_send_vblank_event(crtc, crtc->state->event);
From: Wayne Lin Wayne.Lin@amd.com
commit 72f1de49ffb90b29748284f27f1d6b829ab1de95 upstream.
[Why] The sequence for collecting down_reply from source perspective should be:
Request_n->repeat (get partial reply of Request_n->clear message ready flag to ack DPRX that the message is received) till all partial replies for Request_n are received->new Request_n+1.
Currently there is a chance that drm_dp_mst_hpd_irq() will fire a new down request from the tx queue while the down reply is still incomplete. The source is not allowed to generate interleaved message transactions, so we should avoid this.
Also, while assembling partial reply packets, reading out the DPCD DOWN_REP Sideband MSG buffer and clearing the DOWN_REP_MSG_RDY flag should be treated as one complete operation for reading out a reply packet. Kicking off a new request before clearing the DOWN_REP_MSG_RDY flag is risky. For example, if the reply to the new request overwrites the DPRX DOWN_REP Sideband MSG buffer before the source writes 1 to clear the DOWN_REP_MSG_RDY flag, the source unintentionally flushes the reply to the new request. Up requests should be handled in the same way.
[How] Separate drm_dp_mst_hpd_irq() into 2 steps. After acking the MST IRQ event, the driver calls drm_dp_mst_hpd_irq_send_new_request(), which triggers drm_dp_mst_kick_tx() only when there is no ongoing message transaction.
Changes since v1:
* Reworked on review comments received
  -> Adjust the fix to let driver explicitly kick off new down request when mst irq event is handled and acked
  -> Adjust the commit message
Changes since v2:
* Adjust the commit message
* Adjust the naming of the divided 2 functions and add a new input parameter "ack".
* Adjust code flow as per review comments.
Changes since v3: * Update the function description of drm_dp_mst_hpd_irq_handle_event
Changes since v4: * Change the ack parameter of drm_dp_mst_hpd_irq_handle_event() to be an array matching the size of esi[]
Signed-off-by: Wayne Lin Wayne.Lin@amd.com Reviewed-by: Lyude Paul lyude@redhat.com Acked-by: Jani Nikula jani.nikula@intel.com Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 30 ++++++------ drivers/gpu/drm/display/drm_dp_mst_topology.c | 54 +++++++++++++++++++--- drivers/gpu/drm/i915/display/intel_dp.c | 7 +- drivers/gpu/drm/nouveau/dispnv50/disp.c | 12 +++- include/drm/display/drm_dp_mst_helper.h | 7 ++ 5 files changed, 80 insertions(+), 30 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -3263,6 +3263,7 @@ static void dm_handle_mst_sideband_msg(s
while (dret == dpcd_bytes_to_read && process_count < max_process_count) { + u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {}; u8 retry; dret = 0;
@@ -3271,28 +3272,29 @@ static void dm_handle_mst_sideband_msg(s DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]); /* handle HPD short pulse irq */ if (aconnector->mst_mgr.mst_state) - drm_dp_mst_hpd_irq( - &aconnector->mst_mgr, - esi, - &new_irq_handled); + drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr, + esi, + ack, + &new_irq_handled);
if (new_irq_handled) { /* ACK at DPCD to notify down stream */ - const int ack_dpcd_bytes_to_write = - dpcd_bytes_to_read - 1; - for (retry = 0; retry < 3; retry++) { - u8 wret; + ssize_t wret;
- wret = drm_dp_dpcd_write( - &aconnector->dm_dp_aux.aux, - dpcd_addr + 1, - &esi[1], - ack_dpcd_bytes_to_write); - if (wret == ack_dpcd_bytes_to_write) + wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux, + dpcd_addr + 1, + ack[1]); + if (wret == 1) break; }
+ if (retry == 3) { + DRM_ERROR("Failed to ack MST event.\n"); + return; + } + + drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr); /* check if there is new irq to be handled */ dret = drm_dp_dpcd_read( &aconnector->dm_dp_aux.aux, --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c @@ -4053,17 +4053,28 @@ out: }
/** - * drm_dp_mst_hpd_irq() - MST hotplug IRQ notify + * drm_dp_mst_hpd_irq_handle_event() - MST hotplug IRQ handle MST event * @mgr: manager to notify irq for. * @esi: 4 bytes from SINK_COUNT_ESI + * @ack: 4 bytes used to ack events starting from SINK_COUNT_ESI * @handled: whether the hpd interrupt was consumed or not * - * This should be called from the driver when it detects a short IRQ, + * This should be called from the driver when it detects a HPD IRQ, * along with the value of the DEVICE_SERVICE_IRQ_VECTOR_ESI0. The - * topology manager will process the sideband messages received as a result - * of this. + * topology manager will process the sideband messages received + * as indicated in the DEVICE_SERVICE_IRQ_VECTOR_ESI0 and set the + * corresponding flags that Driver has to ack the DP receiver later. + * + * Note that driver shall also call + * drm_dp_mst_hpd_irq_send_new_request() if the 'handled' is set + * after calling this function, to try to kick off a new request in + * the queue if the previous message transaction is completed. + * + * See also: + * drm_dp_mst_hpd_irq_send_new_request() */ -int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handled) +int drm_dp_mst_hpd_irq_handle_event(struct drm_dp_mst_topology_mgr *mgr, const u8 *esi, + u8 *ack, bool *handled) { int ret = 0; int sc; @@ -4078,18 +4089,47 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst if (esi[1] & DP_DOWN_REP_MSG_RDY) { ret = drm_dp_mst_handle_down_rep(mgr); *handled = true; + ack[1] |= DP_DOWN_REP_MSG_RDY; }
if (esi[1] & DP_UP_REQ_MSG_RDY) { ret |= drm_dp_mst_handle_up_req(mgr); *handled = true; + ack[1] |= DP_UP_REQ_MSG_RDY; }
- drm_dp_mst_kick_tx(mgr); return ret; } -EXPORT_SYMBOL(drm_dp_mst_hpd_irq); +EXPORT_SYMBOL(drm_dp_mst_hpd_irq_handle_event); + +/** + * drm_dp_mst_hpd_irq_send_new_request() - MST hotplug IRQ kick off new request + * @mgr: manager to notify irq for. + * + * This should be called from the driver when mst irq event is handled + * and acked. Note that new down request should only be sent when + * previous message transaction is completed. Source is not supposed to generate + * interleaved message transactions. + */ +void drm_dp_mst_hpd_irq_send_new_request(struct drm_dp_mst_topology_mgr *mgr) +{ + struct drm_dp_sideband_msg_tx *txmsg; + bool kick = true;
+ mutex_lock(&mgr->qlock); + txmsg = list_first_entry_or_null(&mgr->tx_msg_downq, + struct drm_dp_sideband_msg_tx, next); + /* If last transaction is not completed yet*/ + if (!txmsg || + txmsg->state == DRM_DP_SIDEBAND_TX_START_SEND || + txmsg->state == DRM_DP_SIDEBAND_TX_SENT) + kick = false; + mutex_unlock(&mgr->qlock); + + if (kick) + drm_dp_mst_kick_tx(mgr); +} +EXPORT_SYMBOL(drm_dp_mst_hpd_irq_send_new_request); /** * drm_dp_mst_detect_port() - get connection status for an MST port * @connector: DRM connector for this port --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -3940,9 +3940,7 @@ intel_dp_mst_hpd_irq(struct intel_dp *in { bool handled = false;
- drm_dp_mst_hpd_irq(&intel_dp->mst_mgr, esi, &handled); - if (handled) - ack[1] |= esi[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY); + drm_dp_mst_hpd_irq_handle_event(&intel_dp->mst_mgr, esi, ack, &handled);
if (esi[1] & DP_CP_IRQ) { intel_hdcp_handle_cp_irq(intel_dp->attached_connector); @@ -4017,6 +4015,9 @@ intel_dp_check_mst_status(struct intel_d
if (!intel_dp_ack_sink_irq_esi(intel_dp, ack)) drm_dbg_kms(&i915->drm, "Failed to ack ESI\n"); + + if (ack[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY)) + drm_dp_mst_hpd_irq_send_new_request(&intel_dp->mst_mgr); }
return link_ok; --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -1359,22 +1359,26 @@ nv50_mstm_service(struct nouveau_drm *dr u8 esi[8] = {};
while (handled) { + u8 ack[8] = {}; + rc = drm_dp_dpcd_read(aux, DP_SINK_COUNT_ESI, esi, 8); if (rc != 8) { ret = false; break; }
- drm_dp_mst_hpd_irq(&mstm->mgr, esi, &handled); + drm_dp_mst_hpd_irq_handle_event(&mstm->mgr, esi, ack, &handled); if (!handled) break;
- rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1, &esi[1], - 3); - if (rc != 3) { + rc = drm_dp_dpcd_writeb(aux, DP_SINK_COUNT_ESI + 1, ack[1]); + + if (rc != 1) { ret = false; break; } + + drm_dp_mst_hpd_irq_send_new_request(&mstm->mgr); }
if (!ret) --- a/include/drm/display/drm_dp_mst_helper.h +++ b/include/drm/display/drm_dp_mst_helper.h @@ -815,8 +815,11 @@ void drm_dp_mst_topology_mgr_destroy(str bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]); int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool mst_state);
-int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handled); - +int drm_dp_mst_hpd_irq_handle_event(struct drm_dp_mst_topology_mgr *mgr, + const u8 *esi, + u8 *ack, + bool *handled); +void drm_dp_mst_hpd_irq_send_new_request(struct drm_dp_mst_topology_mgr *mgr);
int drm_dp_mst_detect_port(struct drm_connector *connector,
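The two-step flow introduced by this patch (collect the ack bits, write the ack, then decide whether a new down request may be sent) can be sketched in userspace as follows. Everything here is an assumption made for illustration: the bit values are placeholders rather than the DRM header definitions, enum tx_state only mirrors the state names used in the patch, and handle_event() ignores the sink-count handling done by the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative placeholder bits, not copied from the DRM headers. */
#define DOWN_REP_MSG_RDY 0x10
#define UP_REQ_MSG_RDY   0x20

/* Hypothetical mirror of the sideband TX states named in the patch. */
enum tx_state { TX_QUEUED, TX_START_SEND, TX_SENT, TX_RX, TX_TIMEOUT };

/* Step 1: record which events were handled so the driver can ack them. */
bool handle_event(const uint8_t *esi, uint8_t *ack)
{
	bool handled = false;

	assert(esi != NULL && ack != NULL);
	if (esi[1] & DOWN_REP_MSG_RDY) {
		ack[1] |= DOWN_REP_MSG_RDY;
		handled = true;
	}
	if (esi[1] & UP_REQ_MSG_RDY) {
		ack[1] |= UP_REQ_MSG_RDY;
		handled = true;
	}
	return handled;
}

/*
 * Step 2, called only after the ack was written: head points at the state
 * of the first queued message, or is NULL when the queue is empty.
 */
bool may_kick_tx(const enum tx_state *head)
{
	if (head == NULL)
		return false;	/* nothing to send */
	if (*head == TX_START_SEND || *head == TX_SENT)
		return false;	/* previous transaction still in flight */
	return true;
}
```

A driver following this shape would call handle_event(), write ack[1] back to the sink, and only then consult may_kick_tx(), which matches the ordering used in the amdgpu, i915 and nouveau hunks above.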
From: Alvin Lee Alvin.Lee2@amd.com
commit ee7be8f3de1ccc9665281fe996f9b6d45191ec1a upstream.
- Due to hardware-related QoS issues, we need to limit certain SKUs with fewer memory channels to DPM1 and above.
- At DPM0 with a workload running, the urgent return latency can exceed 15us (the expected maximum is 4us), which results in underflow.
Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: Saaem Rizvi SyedSaaem.Rizvi@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alvin Lee Alvin.Lee2@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c | 2 ++ drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 15 +++++++++++++++ drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h | 2 ++ 3 files changed, 19 insertions(+)
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c @@ -1888,6 +1888,8 @@ bool dcn32_validate_bandwidth(struct dc
dc->res_pool->funcs->calculate_wm_and_dlg(dc, context, pipes, pipe_cnt, vlevel);
+ dcn32_override_min_req_memclk(dc, context); + BW_VAL_TRACE_END_WATERMARKS();
goto validate_out; --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c @@ -2882,3 +2882,18 @@ void dcn32_set_clock_limits(const struct dc_assert_fp_enabled(); dcn3_2_soc.clock_limits[0].dcfclk_mhz = 1200.0; } + +void dcn32_override_min_req_memclk(struct dc *dc, struct dc_state *context) +{ + // WA: restrict FPO and SubVP to use first non-strobe mode (DCN32 BW issue) + if ((context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching || dcn32_subvp_in_use(dc, context)) && + dc->dml.soc.num_chans <= 8) { + int num_mclk_levels = dc->clk_mgr->bw_params->clk_table.num_entries_per_clk.num_memclk_levels; + + if (context->bw_ctx.dml.vba.DRAMSpeed <= dc->clk_mgr->bw_params->clk_table.entries[0].memclk_mhz * 16 && + num_mclk_levels > 1) { + context->bw_ctx.dml.vba.DRAMSpeed = dc->clk_mgr->bw_params->clk_table.entries[1].memclk_mhz * 16; + context->bw_ctx.bw.dcn.clk.dramclk_khz = context->bw_ctx.dml.vba.DRAMSpeed * 1000 / 16; + } + } +} --- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h @@ -80,6 +80,8 @@ void dcn32_assign_fpo_vactive_candidate(
bool dcn32_find_vactive_pipe(struct dc *dc, const struct dc_state *context, uint32_t vactive_margin_req);
+void dcn32_override_min_req_memclk(struct dc *dc, struct dc_state *context); + void dcn32_set_clock_limits(const struct _vcs_dpi_soc_bounding_box_st *soc_bb);
#endif
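The arithmetic in dcn32_override_min_req_memclk() can be followed with a reduced model. This is a sketch only: struct clk_state and override_min_memclk() are hypothetical, the two conditions on FPO and SubVP are folded into a single flag, and the factor of 16 between the memclk table entries and DRAMSpeed is taken directly from the patch without further interpretation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced state; the real fields live in struct dc_state. */
struct clk_state {
	double dram_speed;	/* models vba.DRAMSpeed */
	int dramclk_khz;	/* models clk.dramclk_khz */
};

/* Raise the memory clock to the second table entry when the WA applies. */
void override_min_memclk(struct clk_state *s, const unsigned int *memclk_mhz,
			 int num_levels, int num_chans, bool fpo_or_subvp)
{
	assert(s != NULL && memclk_mhz != NULL && num_levels >= 1);

	/* The workaround targets FPO/SubVP on parts with at most 8 channels. */
	if (!fpo_or_subvp || num_chans > 8)
		return;

	/* Only bump when currently at (or below) the lowest level. */
	if (num_levels > 1 && s->dram_speed <= memclk_mhz[0] * 16) {
		s->dram_speed = memclk_mhz[1] * 16;
		s->dramclk_khz = (int)(s->dram_speed * 1000 / 16);
	}
}
```

With made-up table entries of 100 and 625, a state at dram_speed 1600 is moved to 10000 and dramclk_khz becomes 625000; the same state on a part with more than 8 channels is left untouched.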
From: Alan Liu HaoPing.Liu@amd.com
commit f477c7b5ec3e4ef87606671b340abf3bdb0cccff upstream.
[Why & How] We need to store CRTC information in secure_display_ctx, so postpone the call to amdgpu_dm_crtc_secure_display_create_contexts() until we initialize all CRTCs.
Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: Wayne Lin Wayne.Lin@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alan Liu HaoPing.Liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 +++++------ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h | 2 +- 2 files changed, 6 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1776,12 +1776,6 @@ static int amdgpu_dm_init(struct amdgpu_
dc_init_callbacks(adev->dm.dc, &init_params); } -#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) - adev->dm.secure_display_ctxs = amdgpu_dm_crtc_secure_display_create_contexts(adev); - if (!adev->dm.secure_display_ctxs) { - DRM_ERROR("amdgpu: failed to initialize secure_display_ctxs.\n"); - } -#endif if (dc_is_dmub_outbox_supported(adev->dm.dc)) { init_completion(&adev->dm.dmub_aux_transfer_done); adev->dm.dmub_notify = kzalloc(sizeof(struct dmub_notification), GFP_KERNEL); @@ -1840,6 +1834,11 @@ static int amdgpu_dm_init(struct amdgpu_ goto error; }
+#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY) + adev->dm.secure_display_ctxs = amdgpu_dm_crtc_secure_display_create_contexts(adev); + if (!adev->dm.secure_display_ctxs) + DRM_ERROR("amdgpu: failed to initialize secure display contexts.\n"); +#endif
DRM_DEBUG_DRIVER("KMS initialized.\n");
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h @@ -100,7 +100,7 @@ struct secure_display_context *amdgpu_dm #else #define amdgpu_dm_crc_window_is_activated(x) #define amdgpu_dm_crtc_handle_crc_window_irq(x) -#define amdgpu_dm_crtc_secure_display_create_contexts() +#define amdgpu_dm_crtc_secure_display_create_contexts(x) #endif
#endif /* AMD_DAL_DEV_AMDGPU_DM_AMDGPU_DM_CRC_H_ */
From: Dmytro Laktyushkin dmytro.laktyushkin@amd.com
commit 75c2b7ed080d7421157c03064be82275364136e7 upstream.
Add missing programming and function pointers
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Dmytro Laktyushkin dmytro.laktyushkin@amd.com Reviewed-by: Charlene Liu Charlene.Liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 11 +++++++++++ drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c | 2 +- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h | 1 + 3 files changed, 13 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -1732,6 +1732,17 @@ static void dcn20_program_pipe(
if (hws->funcs.setup_vupdate_interrupt) hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx); + + if (hws->funcs.calculate_dccg_k1_k2_values && dc->res_pool->dccg->funcs->set_pixel_rate_div) { + unsigned int k1_div, k2_div; + + hws->funcs.calculate_dccg_k1_k2_values(pipe_ctx, &k1_div, &k2_div); + + dc->res_pool->dccg->funcs->set_pixel_rate_div( + dc->res_pool->dccg, + pipe_ctx->stream_res.tg->inst, + k1_div, k2_div); + } }
if (pipe_ctx->update_flags.bits.odm) --- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c @@ -98,7 +98,7 @@ static void optc32_set_odm_combine(struc optc1->opp_count = opp_cnt; }
-static void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode) +void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode) { struct optc *optc1 = DCN10TG_FROM_TG(optc);
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h @@ -179,5 +179,6 @@ SF(OTG0_OTG_DRR_CONTROL, OTG_V_TOTAL_LAST_USED_BY_DRR, mask_sh)
void dcn32_timing_generator_init(struct optc *optc1); +void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode);
#endif /* __DC_OPTC_DCN32_H__ */
From: Hersen Wu hersenxs.wu@amd.com
commit 7a0e005c7957931689a327b2a4e7333a19f13f95 upstream.
[Why] Most eDP panels support only timings from their EDID. Applying non-EDID timings, especially timings that exceed the eDP bandwidth, may damage the eDP panel.
[How] Do not add non-EDID timings for eDP.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Hersen Wu hersenxs.wu@amd.com Reviewed-by: Roman Li roman.li@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -7192,7 +7192,13 @@ static int amdgpu_dm_connector_get_modes drm_add_modes_noedid(connector, 1920, 1080); } else { amdgpu_dm_connector_ddc_get_modes(connector, edid); - amdgpu_dm_connector_add_common_modes(encoder, connector); + /* most eDP supports only timings from its edid, + * usually only detailed timings are available + * from eDP edid. timings which are not from edid + * may damage eDP + */ + if (connector->connector_type != DRM_MODE_CONNECTOR_eDP) + amdgpu_dm_connector_add_common_modes(encoder, connector); amdgpu_dm_connector_add_freesync_modes(connector, edid); } amdgpu_dm_fbc_init(connector);
From: Austin Zheng austin.zheng@amd.com
commit 1966bbfdfe476d271b338336254854c5edd5a907 upstream.
[Why] K1 and K2 are not being set properly when SubVP is active.
[How] Have phantom pipes use the same programming as the main pipes without checking the paired stream.
Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: Alvin Lee alvin.lee2@amd.com Acked-by: Rodrigo Siqueira rodrigo.siqueira@amd.com Signed-off-by: Austin Zheng austin.zheng@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c @@ -1125,10 +1125,6 @@ unsigned int dcn32_calculate_dccg_k1_k2_ unsigned int odm_combine_factor = 0; bool two_pix_per_container = false;
- // For phantom pipes, use the same programming as the main pipes - if (pipe_ctx->stream->mall_stream_config.type == SUBVP_PHANTOM) { - stream = pipe_ctx->stream->mall_stream_config.paired_stream; - } two_pix_per_container = optc2_is_two_pixels_per_containter(&stream->timing); odm_combine_factor = get_odm_config(pipe_ctx, NULL);
From: Leo Chen sancchen@amd.com
commit 26518b39181876064850209ecdab48c0ee5924b1 upstream.
[Why & How] Having seamless boot on while forcing the debug option ODM combine 2 to 1 causes corruption because some programming is missing.
Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Leo Chen sancchen@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1602,6 +1602,9 @@ bool dc_validate_boot_timing(const struc return false; }
+ if (dc->debug.force_odm_combine) + return false; + /* Check for enabled DIG to identify enabled display */ if (!link->link_enc->funcs->is_dig_enabled(link->link_enc)) return false;
From: Samuel Pitoiset samuel.pitoiset@gmail.com
commit ea2c3c08554601b051d91403a241266e1cf490a5 upstream.
Per-VM BOs must be marked as moved; otherwise their ranges are not updated on use, which might be necessary when the replace operation splits mappings.
This fixes random GPU hangs when replacing sparse mappings from userspace, while OP_MAP/OP_UNMAP works fine because always-valid BOs are correctly handled there.
Cc: stable@vger.kernel.org Signed-off-by: Samuel Pitoiset samuel.pitoiset@gmail.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1683,18 +1683,30 @@ int amdgpu_vm_bo_clear_mappings(struct a
/* Insert partial mapping before the range */ if (!list_empty(&before->list)) { + struct amdgpu_bo *bo = before->bo_va->base.bo; + amdgpu_vm_it_insert(before, &vm->va); if (before->flags & AMDGPU_PTE_PRT) amdgpu_vm_prt_get(adev); + + if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv && + !before->bo_va->base.moved) + amdgpu_vm_bo_moved(&before->bo_va->base); } else { kfree(before); }
/* Insert partial mapping after the range */ if (!list_empty(&after->list)) { + struct amdgpu_bo *bo = after->bo_va->base.bo; + amdgpu_vm_it_insert(after, &vm->va); if (after->flags & AMDGPU_PTE_PRT) amdgpu_vm_prt_get(adev); + + if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv && + !after->bo_va->base.moved) + amdgpu_vm_bo_moved(&after->bo_va->base); } else { kfree(after); }
From: Mario Limonciello mario.limonciello@amd.com
commit 072030b1783056b5de8b0fac5303a5e9dbc6cfde upstream.
A number of users have reported random hangs caused by PSR-SU, specifically on panels that contain the Parade 0803 TCON. Users have been able to work around the issue by disabling PSR entirely.
To avoid these hangs, disable PSR-SU when this TCON is found.
Cc: stable@vger.kernel.org Cc: Sean Wang sean.ns.wang@amd.com Cc: Marc Rossi Marc.Rossi@amd.com Cc: Hamza Mahfooz Hamza.Mahfooz@amd.com Suggested-by: Tsung-hua (Ryan) Lin Tsung-hua.Lin@amd.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2443 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c +++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c @@ -818,6 +818,8 @@ bool is_psr_su_specific_panel(struct dc_ ((dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x08) || (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x07))) isPSRSUSupported = false; + else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x03) + isPSRSUSupported = false; else if (dpcd_caps->psr_info.force_psrsu_cap == 0x1) isPSRSUSupported = true; }
From: Sung-huai Wang danny.wang@amd.com
commit 0f48a4b83610cb0e4e0bc487800ab69f51b4aca6 upstream.
[Why & How]
We have to check that the stream is properly initialized before calling find_matching_pll(), otherwise we might end up trying to dereference a NULL pointer.
Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Sung-huai Wang danny.wang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c @@ -970,10 +970,12 @@ enum dc_status resource_map_phy_clock_re || dc_is_virtual_signal(pipe_ctx->stream->signal)) pipe_ctx->clock_source = dc->res_pool->dp_clock_source; - else - pipe_ctx->clock_source = find_matching_pll( - &context->res_ctx, dc->res_pool, - stream); + else { + if (stream && stream->link && stream->link->link_enc) + pipe_ctx->clock_source = find_matching_pll( + &context->res_ctx, dc->res_pool, + stream); + }
if (pipe_ctx->clock_source == NULL) return DC_NO_CLOCK_SOURCE_RESOURCE;
From: Ilya Bakoulin ilya.bakoulin@amd.com
commit ed83fe2abcace898fdec5c2ba0455703178ac9a3 upstream.
[Why] We don't check 128b132b-specific bits in LANE_ALIGN_STATUS_UPDATED DPCD registers when parsing link loss status, which can cause us to miss a link loss notification from some sinks.
[How] Add a 128b132b-specific status bit check.
Cc: stable@vger.kernel.org # 6.3+ Reviewed-by: Wenjing Liu wenjing.liu@amd.com Acked-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Ilya Bakoulin ilya.bakoulin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../display/dc/link/protocols/link_dp_irq_handler.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c index ba95facc4ee8..b1b11eb0f9bb 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c @@ -82,8 +82,15 @@ bool dp_parse_link_loss_status( }
/* Check interlane align.*/ - if (sink_status_changed || - !hpd_irq_dpcd_data->bytes.lane_status_updated.bits.INTERLANE_ALIGN_DONE) { + if (link_dp_get_encoding_format(&link->cur_link_settings) == DP_128b_132b_ENCODING && + (!hpd_irq_dpcd_data->bytes.lane_status_updated.bits.EQ_INTERLANE_ALIGN_DONE_128b_132b || + !hpd_irq_dpcd_data->bytes.lane_status_updated.bits.CDS_INTERLANE_ALIGN_DONE_128b_132b)) { + sink_status_changed = true; + } else if (!hpd_irq_dpcd_data->bytes.lane_status_updated.bits.INTERLANE_ALIGN_DONE) { + sink_status_changed = true; + } + + if (sink_status_changed) {
DC_LOG_HW_HPD_IRQ("%s: Link Status changed.\n", __func__);
From: Mario Limonciello mario.limonciello@amd.com
commit 274d205cb59f43815542e04b42a9e6d0b9b95eff upstream.
The `DMUB_FW_VERSION` macro has a mistake in that the revision field is off by one byte. The last byte is typically used for other purposes and not a revision.
Cc: stable@vger.kernel.org Cc: Sean Wang sean.ns.wang@amd.com Cc: Marc Rossi Marc.Rossi@amd.com Cc: Hamza Mahfooz Hamza.Mahfooz@amd.com Cc: Tsung-hua (Ryan) Lin Tsung-hua.Lin@amd.com Reviewed-by: Leo Li sunpeng.li@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -490,7 +490,7 @@ struct dmub_notification { * of a firmware to know if feature or functionality is supported or present. */ #define DMUB_FW_VERSION(major, minor, revision) \ - ((((major) & 0xFF) << 24) | (((minor) & 0xFF) << 16) | ((revision) & 0xFFFF)) + ((((major) & 0xFF) << 24) | (((minor) & 0xFF) << 16) | (((revision) & 0xFF) << 8))
/** * dmub_srv_create() - creates the DMUB service.
From: Aurabindo Pillai aurabindo.pillai@amd.com
commit 613a7956deb3b1ffa2810c6d4c90ee9c3d743dbb upstream.
Disable FAMS on a Samsung Odyssey G9 monitor. Experiments show that this monitor does not work well under some use cases, which is likely an implementation-specific bug in the monitor's firmware.
Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Siqueira rodrigo.siqueira@amd.com Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 26 ++++++++++++++ 1 file changed, 26 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -47,6 +47,30 @@ /* MST Dock */ static const uint8_t SYNAPTICS_DEVICE_ID[] = "SYNA";
+static u32 edid_extract_panel_id(struct edid *edid) +{ + return (u32)edid->mfg_id[0] << 24 | + (u32)edid->mfg_id[1] << 16 | + (u32)EDID_PRODUCT_ID(edid); +} + +static void apply_edid_quirks(struct edid *edid, struct dc_edid_caps *edid_caps) +{ + uint32_t panel_id = edid_extract_panel_id(edid); + + switch (panel_id) { + /* Workaround for some monitors which does not work well with FAMS */ + case drm_edid_encode_panel_id('S', 'A', 'M', 0x0E5E): + case drm_edid_encode_panel_id('S', 'A', 'M', 0x7053): + case drm_edid_encode_panel_id('S', 'A', 'M', 0x71AC): + DRM_DEBUG_DRIVER("Disabling FAMS on monitor with panel id %X\n", panel_id); + edid_caps->panel_patch.disable_fams = true; + break; + default: + return; + } +} + /* dm_helpers_parse_edid_caps * * Parse edid caps @@ -118,6 +142,8 @@ enum dc_edid_status dm_helpers_parse_edi else edid_caps->speaker_flags = DEFAULT_SPEAKER_LOCATION;
+ apply_edid_quirks(edid_buf, edid_caps); + kfree(sads); kfree(sadb);
From: gaba gaba@amd.com
commit 8a774fe912ff09e39c2d3a3589c729330113f388 upstream.
In the restore-process worker, a pinned BO causes the PTE update to fail, and the function then reschedules restore_work. This results in an endless loop.
Signed-off-by: gaba gaba@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -2792,6 +2792,9 @@ int amdgpu_amdkfd_gpuvm_restore_process_ if (!attachment->is_mapped) continue;
+ if (attachment->bo_va->base.bo->tbo.pin_count) + continue; + kfd_mem_dmaunmap_attachment(mem, attachment); ret = update_gpuvm_pte(mem, attachment, &sync_obj); if (ret) {
From: Yang Wang kevinyang.wang@amd.com
commit d934e537c14bfe1227ced6341472571f354383e8 upstream.
The SMU driver_table is used for all types of SMU table data transactions (e.g. PPTable, Metrics, I2C, ECC).
It is necessary to hold this lock to avoid data tampering during the I2C read operation.
Signed-off-by: Yang Wang kevinyang.wang@amd.com Reviewed-by: Lijo Lazar lijo.lazar@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 2 +- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c @@ -2113,7 +2113,6 @@ static int arcturus_i2c_xfer(struct i2c_ } mutex_lock(&adev->pm.mutex); r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -2130,6 +2129,7 @@ static int arcturus_i2c_xfer(struct i2c_ } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; } --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -3021,7 +3021,6 @@ static int navi10_i2c_xfer(struct i2c_ad } mutex_lock(&adev->pm.mutex); r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -3038,6 +3037,7 @@ static int navi10_i2c_xfer(struct i2c_ad } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; } --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -3842,7 +3842,6 @@ static int sienna_cichlid_i2c_xfer(struc } mutex_lock(&adev->pm.mutex); r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -3859,6 +3858,7 @@ static int sienna_cichlid_i2c_xfer(struc } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; } --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c @@ -1525,7 +1525,6 @@ static int aldebaran_i2c_xfer(struct i2c } mutex_lock(&adev->pm.mutex); r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -1542,6 +1541,7 @@ static int aldebaran_i2c_xfer(struct i2c } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; } --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -1838,7 +1838,6 @@ static int smu_v13_0_0_i2c_xfer(struct i } mutex_lock(&adev->pm.mutex); r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -1855,6 +1854,7 @@ static int smu_v13_0_0_i2c_xfer(struct i } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; } --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c @@ -1639,7 +1639,6 @@ static int smu_v13_0_6_i2c_xfer(struct i } mutex_lock(&adev->pm.mutex); r = smu_v13_0_6_request_i2c_xfer(smu, req); - mutex_unlock(&adev->pm.mutex); if (r) goto fail;
@@ -1656,6 +1655,7 @@ static int smu_v13_0_6_i2c_xfer(struct i } r = num_msgs; fail: + mutex_unlock(&adev->pm.mutex); kfree(req); return r; }
From: Thomas Hellström thomas.hellstrom@linux.intel.com
commit e8188c461ee015ba0b9ab2fc82dbd5ebca5a5532 upstream.
On eviction errors other than -EMULTIHOP we were leaking a resource. Fix.
v2: - Avoid yet another goto (Andi Shyti)
Fixes: 403797925768 ("drm/ttm: Fix multihop assert on eviction.") Cc: Andrey Grodzovsky andrey.grodzovsky@amd.com Cc: Christian König christian.koenig@amd.com Cc: Christian Koenig christian.koenig@amd.com Cc: Huang Rui ray.huang@amd.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Nirmoy Das nirmoy.das@intel.com #v1 Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20230626091450.14757-4-thomas.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -458,18 +458,18 @@ static int ttm_bo_evict(struct ttm_buffe goto out; }
-bounce: - ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); - if (ret == -EMULTIHOP) { + do { + ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); + if (ret != -EMULTIHOP) + break; + ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); - if (ret) { - if (ret != -ERESTARTSYS && ret != -EINTR) - pr_err("Buffer eviction failed\n"); - ttm_resource_free(bo, &evict_mem); - goto out; - } - /* try and move to final place now. */ - goto bounce; + } while (!ret); + + if (ret) { + ttm_resource_free(bo, &evict_mem); + if (ret != -ERESTARTSYS && ret != -EINTR) + pr_err("Buffer eviction failed\n"); } out: return ret;
From: Thomas Hellström thomas.hellstrom@linux.intel.com
commit a590f03d8de7c4cb7ce4916dc7f2fd10711faabe upstream.
If moving the bo to system for swapout failed, we were leaking a resource. Fix.
Fixes: bfa3357ef9ab ("drm/ttm: allocate resource object instead of embedding it v2") Cc: Christian König christian.koenig@amd.com Cc: "Christian König" ckoenig.leichtzumerken@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.14+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Nirmoy Das nirmoy.das@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20230626091450.14757-5-thomas.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/ttm/ttm_bo.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1167,6 +1167,7 @@ int ttm_bo_swapout(struct ttm_buffer_obj ret = ttm_bo_handle_move_mem(bo, evict_mem, true, &ctx, &hop); if (unlikely(ret != 0)) { WARN(ret == -EMULTIHOP, "Unexpected multihop in swaput - likely driver bug.\n"); + ttm_resource_free(bo, &evict_mem); goto out; } }
From: Christian König christian.koenig@amd.com
commit a2848d08742c8e8494675892c02c0d22acbe3cf8 upstream.
There is a small window where we have already incremented the pin count but not yet moved the bo from the lru to the pinned list.
Signed-off-by: Christian König christian.koenig@amd.com Reported-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com Tested-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230707120826.3701-1-christia... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -517,6 +517,12 @@ static bool ttm_bo_evict_swapout_allowab { bool ret = false;
+ if (bo->pin_count) { + *locked = false; + *busy = false; + return false; + } + if (bo->base.resv == ctx->resv) { dma_resv_assert_held(bo->base.resv); if (ctx->allow_res_evict)
From: Dan Carpenter dan.carpenter@linaro.org
commit 27a826837ec9a3e94cc44bd9328b8289b0fcecd7 upstream.
The atmel_complete_tx_dma() function disables IRQs at the start of the function by calling spin_lock_irqsave(&port->lock, flags); There is no need to disable them a second time using the spin_lock_irq() function and, in fact, doing so is a bug because it will enable IRQs prematurely when we call spin_unlock_irq().
Just use spin_lock/unlock() instead without disabling or enabling IRQs.
Fixes: 08f738be88bb ("serial: at91: add tx dma support") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Jiri Slaby jirislaby@kernel.org Acked-by: Richard Genoud richard.genoud@gmail.com Link: https://lore.kernel.org/r/cb7c39a9-c004-4673-92e1-be4e34b85368@moroto.mounta... Cc: stable stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/atmel_serial.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/tty/serial/atmel_serial.c +++ b/drivers/tty/serial/atmel_serial.c @@ -868,11 +868,11 @@ static void atmel_complete_tx_dma(void * dmaengine_terminate_all(chan); uart_xmit_advance(port, atmel_port->tx_len);
- spin_lock_irq(&atmel_port->lock_tx); + spin_lock(&atmel_port->lock_tx); async_tx_ack(atmel_port->desc_tx); atmel_port->cookie_tx = -EINVAL; atmel_port->desc_tx = NULL; - spin_unlock_irq(&atmel_port->lock_tx); + spin_unlock(&atmel_port->lock_tx);
if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS) uart_write_wakeup(port);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit a9c09546e903f1068acfa38e1ee18bded7114b37 upstream.
If clk_get_rate() fails, the clk that has just been allocated needs to be freed.
Cc: stable@vger.kernel.org # v3.3+ Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Andi Shyti andi.shyti@kernel.org Fixes: 5f5a7a5578c5 ("serial: samsung: switch to clkdev based clock lookup") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Jiri Slaby jirislaby@kernel.org Message-ID: e4baf6039368f52e5a5453982ddcb9a330fc689e.1686412569.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/samsung_tty.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/tty/serial/samsung_tty.c +++ b/drivers/tty/serial/samsung_tty.c @@ -1459,8 +1459,12 @@ static unsigned int s3c24xx_serial_getcl continue;
rate = clk_get_rate(clk); - if (!rate) + if (!rate) { + dev_err(ourport->port.dev, + "Failed to get clock rate for %s.\n", clkname); + clk_put(clk); continue; + }
if (ourport->info->has_divslot) { unsigned long div = rate / req_baud;
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit 832e231cff476102e8204a9e7bddfe5c6154a375 upstream.
When the best clk is searched, we iterate over all possible clk.
If we find a better match, the previous one, if any, needs to be freed. If a better match has already been found, we still need to free the new one, otherwise it leaks.
Cc: stable@vger.kernel.org # v3.3+ Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Andi Shyti andi.shyti@kernel.org Fixes: 5f5a7a5578c5 ("serial: samsung: switch to clkdev based clock lookup") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Jiri Slaby jirislaby@kernel.org Message-ID: cf3e0053d2fc7391b2d906a86cd01a5ef15fb9dc.1686412569.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/samsung_tty.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c index a92a23e1964e..0b37019820b4 100644 --- a/drivers/tty/serial/samsung_tty.c +++ b/drivers/tty/serial/samsung_tty.c @@ -1490,10 +1490,18 @@ static unsigned int s3c24xx_serial_getclk(struct s3c24xx_uart_port *ourport, calc_deviation = -calc_deviation;
if (calc_deviation < deviation) { + /* + * If we find a better clk, release the previous one, if + * any. + */ + if (!IS_ERR(*best_clk)) + clk_put(*best_clk); *best_clk = clk; best_quot = quot; *clk_num = cnt; deviation = calc_deviation; + } else { + clk_put(clk); } }
From: Martin Fuzzey martin.fuzzey@flowbird.group
commit 639949a7031e04c59ec91614eceb9543e9120f43 upstream.
Since commit 79d0224f6bf2 ("tty: serial: imx: Handle RS485 DE signal active high") RS485 reception no longer works after a transmission.
The following scenario shows the problem:
1) Open a port in RS485 mode
2) Receive data from remote (OK)
3) Transmit data to remote (OK)
4) Receive data from remote (Nothing received)
In RS485 mode, imx_uart_start_tx() calls imx_uart_stop_rx() and, when the transmission is complete, imx_uart_stop_tx() calls imx_uart_start_rx().
Since the above commit imx_uart_stop_rx() now sets the loopback bit but imx_uart_start_rx() does not clear it causing the hardware to remain in loopback mode and not receive external data.
Fix this by moving the existing loopback disable code to a helper function and calling it from imx_uart_start_rx() too.
Fixes: 79d0224f6bf2 ("tty: serial: imx: Handle RS485 DE signal active high") Cc: stable@vger.kernel.org Signed-off-by: Martin Fuzzey martin.fuzzey@flowbird.group Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20230616104838.2729694-1-martin.fuzzey@flowbird.gr... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/imx.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
--- a/drivers/tty/serial/imx.c +++ b/drivers/tty/serial/imx.c @@ -369,6 +369,16 @@ static void imx_uart_soft_reset(struct i sport->idle_counter = 0; }
+static void imx_uart_disable_loopback_rs485(struct imx_port *sport) +{ + unsigned int uts; + + /* See SER_RS485_ENABLED/UTS_LOOP comment in imx_uart_probe() */ + uts = imx_uart_readl(sport, imx_uart_uts_reg(sport)); + uts &= ~UTS_LOOP; + imx_uart_writel(sport, uts, imx_uart_uts_reg(sport)); +} + /* called with port.lock taken and irqs off */ static void imx_uart_start_rx(struct uart_port *port) { @@ -390,6 +400,7 @@ static void imx_uart_start_rx(struct uar /* Write UCR2 first as it includes RXEN */ imx_uart_writel(sport, ucr2, UCR2); imx_uart_writel(sport, ucr1, UCR1); + imx_uart_disable_loopback_rs485(sport); }
/* called with port.lock taken and irqs off */ @@ -1422,7 +1433,7 @@ static int imx_uart_startup(struct uart_ int retval; unsigned long flags; int dma_is_inited = 0; - u32 ucr1, ucr2, ucr3, ucr4, uts; + u32 ucr1, ucr2, ucr3, ucr4;
retval = clk_prepare_enable(sport->clk_per); if (retval) @@ -1521,10 +1532,7 @@ static int imx_uart_startup(struct uart_ imx_uart_writel(sport, ucr2, UCR2); }
- /* See SER_RS485_ENABLED/UTS_LOOP comment in imx_uart_probe() */ - uts = imx_uart_readl(sport, imx_uart_uts_reg(sport)); - uts &= ~UTS_LOOP; - imx_uart_writel(sport, uts, imx_uart_uts_reg(sport)); + imx_uart_disable_loopback_rs485(sport);
spin_unlock_irqrestore(&sport->port.lock, flags);
From: Hui Li caelli@tencent.com
commit 4903fde8047a28299d1fc79c1a0dcc255e928f12 upstream.
It is possible to hang pty devices in the following case: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken inside n_tty_write on the slave side, and the write buffer on the tty_port was full. We found that the reader and writer would never be woken again and blocked forever.
The problem was caused by a race between reader and kworker:
n_tty_read(reader):  n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
                    |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                    |room <= 0
n_tty_kick_worker() |
                    |ldata->no_room = true
After writing to slave device, writer wakes up kworker to flush data on tty_port to reader, and the kworker finds that reader has no room to store data so room <= 0 is met. At this moment, reader consumes all the data on reader buffer and calls n_tty_kick_worker to check ldata->no_room which is false and reader quits reading. Then kworker sets ldata->no_room=true and quits too.
If write buffer is not full, writer will wake kworker to flush data again after following writes, but if write buffer is full and writer goes to sleep, kworker will never be woken again and tty device is blocked.
This problem can be solved with a check for read buffer size inside n_tty_receive_buf_common, if read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker is necessary to keep flushing data to reader.
Cc: stable@vger.kernel.org Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader") Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Hui Li caelli@tencent.com Message-ID: 1680749090-14106-1-git-send-email-caelli@tencent.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/n_tty.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-)
--- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -203,8 +203,8 @@ static void n_tty_kick_worker(struct tty struct n_tty_data *ldata = tty->disc_data;
/* Did the input worker stop? Restart it */ - if (unlikely(ldata->no_room)) { - ldata->no_room = 0; + if (unlikely(READ_ONCE(ldata->no_room))) { + WRITE_ONCE(ldata->no_room, 0);
WARN_RATELIMIT(tty->port->itty == NULL, "scheduling with invalid itty\n"); @@ -1697,7 +1697,7 @@ n_tty_receive_buf_common(struct tty_stru if (overflow && room < 0) ldata->read_head--; room = overflow; - ldata->no_room = flow && !room; + WRITE_ONCE(ldata->no_room, flow && !room); } else overflow = 0;
@@ -1728,6 +1728,17 @@ n_tty_receive_buf_common(struct tty_stru } else n_tty_check_throttle(tty);
+ if (unlikely(ldata->no_room)) { + /* + * Barrier here is to ensure to read the latest read_tail in + * chars_in_buffer() and to make sure that read_tail is not loaded + * before ldata->no_room is set. + */ + smp_mb(); + if (!chars_in_buffer(tty)) + n_tty_kick_worker(tty); + } + up_read(&tty->termios_rwsem);
return rcvd; @@ -2281,8 +2292,14 @@ more_to_be_read: if (time) timeout = time; } - if (old_tail != ldata->read_tail) + if (old_tail != ldata->read_tail) { + /* + * Make sure no_room is not read in n_tty_kick_worker() + * before setting ldata->read_tail in copy_from_read_buf(). + */ + smp_mb(); n_tty_kick_worker(tty); + } up_read(&tty->termios_rwsem);
remove_wait_queue(&tty->read_wait, &wait);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
commit 1995f15590ca222f91193ed11461862b450abfd6 upstream.
svc_create_memory_pool() is only called from stratix10_svc_drv_probe(). Most of resources in the probe are managed, but not this memremap() call.
There is also no memunmap() call in the file.
So switch to devm_memremap() to avoid a resource leak.
Cc: stable@vger.kernel.org Fixes: 7ca5ce896524 ("firmware: add Intel Stratix10 service layer driver") Link: https://lore.kernel.org/all/783e9dfbba34e28505c9efa8bba41f97fd0fa1dc.1686109... Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Dinh Nguyen dinguyen@kernel.org Message-ID: 20230613211521.16366-1-dinguyen@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/stratix10-svc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c index 80f4e2d14e04..2d674126160f 100644 --- a/drivers/firmware/stratix10-svc.c +++ b/drivers/firmware/stratix10-svc.c @@ -755,7 +755,7 @@ svc_create_memory_pool(struct platform_device *pdev, end = rounddown(sh_memory->addr + sh_memory->size, PAGE_SIZE); paddr = begin; size = end - begin; - va = memremap(paddr, size, MEMREMAP_WC); + va = devm_memremap(dev, paddr, size, MEMREMAP_WC); if (!va) { dev_err(dev, "fail to remap shared memory\n"); return ERR_PTR(-EINVAL);
From: Ilya Dryomov idryomov@gmail.com
commit a282a2f10539dce2aa619e71e1817570d557fc97 upstream.
ceph_frame_desc::fd_lens is an int array. decode_preamble() thus effectively casts u32 -> int but the checks for segment lengths are written as if on unsigned values. While reading in HELLO or one of the AUTH frames (before authentication is completed), arithmetic in head_onwire_len() can get duped by negative ctrl_len and produce head_len which is less than CEPH_PREAMBLE_LEN but still positive. This would lead to a buffer overrun in prepare_read_control() as the preamble gets copied to the newly allocated buffer of size head_len.
Cc: stable@vger.kernel.org Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Reported-by: Thelford Williams thelford@google.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Xiubo Li xiubli@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ceph/messenger_v2.c | 41 ++++++++++++++++++++++++++--------------- 1 file changed, 26 insertions(+), 15 deletions(-)
--- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -391,6 +391,8 @@ static int head_onwire_len(int ctrl_len, int head_len; int rem_len;
+ BUG_ON(ctrl_len < 0 || ctrl_len > CEPH_MSG_MAX_CONTROL_LEN); + if (secure) { head_len = CEPH_PREAMBLE_SECURE_LEN; if (ctrl_len > CEPH_PREAMBLE_INLINE_LEN) { @@ -409,6 +411,10 @@ static int head_onwire_len(int ctrl_len, static int __tail_onwire_len(int front_len, int middle_len, int data_len, bool secure) { + BUG_ON(front_len < 0 || front_len > CEPH_MSG_MAX_FRONT_LEN || + middle_len < 0 || middle_len > CEPH_MSG_MAX_MIDDLE_LEN || + data_len < 0 || data_len > CEPH_MSG_MAX_DATA_LEN); + if (!front_len && !middle_len && !data_len) return 0;
@@ -521,29 +527,34 @@ static int decode_preamble(void *p, stru desc->fd_aligns[i] = ceph_decode_16(&p); }
- /* - * This would fire for FRAME_TAG_WAIT (it has one empty - * segment), but we should never get it as client. - */ - if (!desc->fd_lens[desc->fd_seg_cnt - 1]) { - pr_err("last segment empty\n"); + if (desc->fd_lens[0] < 0 || + desc->fd_lens[0] > CEPH_MSG_MAX_CONTROL_LEN) { + pr_err("bad control segment length %d\n", desc->fd_lens[0]); return -EINVAL; } - - if (desc->fd_lens[0] > CEPH_MSG_MAX_CONTROL_LEN) { - pr_err("control segment too big %d\n", desc->fd_lens[0]); + if (desc->fd_lens[1] < 0 || + desc->fd_lens[1] > CEPH_MSG_MAX_FRONT_LEN) { + pr_err("bad front segment length %d\n", desc->fd_lens[1]); return -EINVAL; } - if (desc->fd_lens[1] > CEPH_MSG_MAX_FRONT_LEN) { - pr_err("front segment too big %d\n", desc->fd_lens[1]); + if (desc->fd_lens[2] < 0 || + desc->fd_lens[2] > CEPH_MSG_MAX_MIDDLE_LEN) { + pr_err("bad middle segment length %d\n", desc->fd_lens[2]); return -EINVAL; } - if (desc->fd_lens[2] > CEPH_MSG_MAX_MIDDLE_LEN) { - pr_err("middle segment too big %d\n", desc->fd_lens[2]); + if (desc->fd_lens[3] < 0 || + desc->fd_lens[3] > CEPH_MSG_MAX_DATA_LEN) { + pr_err("bad data segment length %d\n", desc->fd_lens[3]); return -EINVAL; } - if (desc->fd_lens[3] > CEPH_MSG_MAX_DATA_LEN) { - pr_err("data segment too big %d\n", desc->fd_lens[3]); + + /* + * This would fire for FRAME_TAG_WAIT (it has one empty + * segment), but we should never get it as client. + */ + if (!desc->fd_lens[desc->fd_seg_cnt - 1]) { + pr_err("last segment empty, segment count %d\n", + desc->fd_seg_cnt); return -EINVAL; }
From: Xiubo Li xiubli@redhat.com
commit 23ee27dce30e7d3091d6c3143b79f48dab6f9a3e upstream.
We need to save the 'f_ra.ra_pages' to expand the readahead window later.
Cc: stable@vger.kernel.org Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead") Link: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu... Link: https://www.spinics.net/lists/ceph-users/msg76183.html Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-and-tested-by: Hu Weiwen sehuww@mail.scut.edu.cn Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/addr.c | 45 ++++++++++++++++++++++++++++++++++----------- fs/ceph/super.h | 13 +++++++++++++ 2 files changed, 47 insertions(+), 11 deletions(-)
--- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -362,18 +362,28 @@ static int ceph_init_request(struct netf { struct inode *inode = rreq->inode; int got = 0, want = CEPH_CAP_FILE_CACHE; + struct ceph_netfs_request_data *priv; int ret = 0;
if (rreq->origin != NETFS_READAHEAD) return 0;
+ priv = kzalloc(sizeof(*priv), GFP_NOFS); + if (!priv) + return -ENOMEM; + if (file) { struct ceph_rw_context *rw_ctx; struct ceph_file_info *fi = file->private_data;
+ priv->file_ra_pages = file->f_ra.ra_pages; + priv->file_ra_disabled = file->f_mode & FMODE_RANDOM; + rw_ctx = ceph_find_rw_context(fi); - if (rw_ctx) + if (rw_ctx) { + rreq->netfs_priv = priv; return 0; + } }
/* @@ -383,27 +393,40 @@ static int ceph_init_request(struct netf ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got); if (ret < 0) { dout("start_read %p, error getting cap\n", inode); - return ret; + goto out; }
if (!(got & want)) { dout("start_read %p, no cache cap\n", inode); - return -EACCES; + ret = -EACCES; + goto out; + } + if (ret == 0) { + ret = -EACCES; + goto out; } - if (ret == 0) - return -EACCES;
- rreq->netfs_priv = (void *)(uintptr_t)got; - return 0; + priv->caps = got; + rreq->netfs_priv = priv; + +out: + if (ret < 0) + kfree(priv); + + return ret; }
static void ceph_netfs_free_request(struct netfs_io_request *rreq) { - struct ceph_inode_info *ci = ceph_inode(rreq->inode); - int got = (uintptr_t)rreq->netfs_priv; + struct ceph_netfs_request_data *priv = rreq->netfs_priv; + + if (!priv) + return;
- if (got) - ceph_put_cap_refs(ci, got); + if (priv->caps) + ceph_put_cap_refs(ceph_inode(rreq->inode), priv->caps); + kfree(priv); + rreq->netfs_priv = NULL; }
const struct netfs_request_ops ceph_netfs_ops = { --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -451,6 +451,19 @@ struct ceph_inode_info { unsigned long i_work_mask; };
+struct ceph_netfs_request_data { + int caps; + + /* + * Maximum size of a file readahead request. + * The fadvise could update the bdi's default ra_pages. + */ + unsigned int file_ra_pages; + + /* Set it if fadvise disables file readahead entirely */ + bool file_ra_disabled; +}; + static inline struct ceph_inode_info * ceph_inode(const struct inode *inode) {
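The allocate-then-unwind pattern this patch introduces (a private struct allocated up front, freed on every error path through a single `out:` label, and a free callback that tolerates a missing struct) can be sketched in user space as follows. The names `req_data`, `request`, `request_init()` and `request_free()` are hypothetical, and `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-request private data and request. */
struct req_data {
	int caps;
	unsigned int ra_pages;
};

struct request {
	struct req_data *priv;
};

/*
 * Allocate the private data first, then attach it to the request only
 * on success.  Every failure after the allocation goes through "out"
 * so the allocation cannot leak.
 */
static int request_init(struct request *rq, int got_caps,
			unsigned int ra_pages)
{
	struct req_data *priv;
	int ret = 0;

	priv = calloc(1, sizeof(*priv));
	if (!priv)
		return -ENOMEM;

	priv->ra_pages = ra_pages;

	if (got_caps <= 0) {
		ret = -EACCES;
		goto out;
	}

	priv->caps = got_caps;
	rq->priv = priv;

out:
	if (ret < 0)
		free(priv);

	return ret;
}

/* Safe to call when init failed or when it has already run once. */
static void request_free(struct request *rq)
{
	if (!rq->priv)
		return;

	free(rq->priv);
	rq->priv = NULL;
}
```

Clearing the pointer after freeing mirrors the `rreq->netfs_priv = NULL` assignment in the patch and makes a second call harmless.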
From: Xiubo Li xiubli@redhat.com
commit dc94bb8f271c079f69583d0f12a489aaf5202751 upstream.
Blindly expanding the readahead window causes unnecessary pagecache thrashing and adds network load. Do not expand the window when readahead is disabled, and do not expand it beyond the maximum readahead size.
Expand forward first, rather than backward, to favor possible sequential reads.
Bound `rreq->len` to the actual file size to restore the previous page cache usage.
posix_fadvise() may change the maximum size of a file's readahead.
Cc: stable@vger.kernel.org Fixes: 49870056005c ("ceph: convert ceph_readpages to ceph_readahead") Link: https://lore.kernel.org/ceph-devel/20230504082510.247-1-sehuww@mail.scut.edu... Link: https://www.spinics.net/lists/ceph-users/msg76183.html Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-and-tested-by: Hu Weiwen sehuww@mail.scut.edu.cn Reviewed-by: Milind Changire mchangir@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/addr.c | 40 +++++++++++++++++++++++++++++++++------- 1 file changed, 33 insertions(+), 7 deletions(-)
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -187,16 +187,42 @@ static void ceph_netfs_expand_readahead(
 	struct inode *inode = rreq->inode;
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_file_layout *lo = &ci->i_layout;
+	unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
+	loff_t end = rreq->start + rreq->len, new_end;
+	struct ceph_netfs_request_data *priv = rreq->netfs_priv;
+	unsigned long max_len;
 	u32 blockoff;
-	u64 blockno;
 
-	/* Expand the start downward */
-	blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
-	rreq->start = blockno * lo->stripe_unit;
-	rreq->len += blockoff;
+	if (priv) {
+		/* Readahead is disabled by posix_fadvise POSIX_FADV_RANDOM */
+		if (priv->file_ra_disabled)
+			max_pages = 0;
+		else
+			max_pages = priv->file_ra_pages;
 
-	/* Now, round up the length to the next block */
-	rreq->len = roundup(rreq->len, lo->stripe_unit);
+	}
+
+	/* Readahead is disabled */
+	if (!max_pages)
+		return;
+
+	max_len = max_pages << PAGE_SHIFT;
+
+	/*
+	 * Try to expand the length forward by rounding up it to the next
+	 * block, but do not exceed the file size, unless the original
+	 * request already exceeds it.
+	 */
+	new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);
+	if (new_end > end && new_end <= rreq->start + max_len)
+		rreq->len = new_end - rreq->start;
+
+	/* Try to expand the start downward */
+	div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+	if (rreq->len + blockoff <= max_len) {
+		rreq->start -= blockoff;
+		rreq->len += blockoff;
+	}
 }
 
 static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)
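The bounded expansion in this patch is easier to follow with concrete numbers. The sketch below redoes the arithmetic in user space; `struct ra_req`, `round_up_u64()` and `expand_readahead()` are hypothetical names, and the rounding helper uses plain division rather than the kernel's `round_up()` macro.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields of the request the patch touches. */
struct ra_req {
	uint64_t start;   /* byte offset of the request */
	uint64_t len;     /* length in bytes */
	uint64_t i_size;  /* file size in bytes */
};

/* Round x up to the next multiple of unit (unit must be non-zero). */
static uint64_t round_up_u64(uint64_t x, uint64_t unit)
{
	return (x + unit - 1) / unit * unit;
}

/*
 * Expand forward to the next stripe boundary first, capped at the file
 * size, then expand the start downward.  Each step is applied only if
 * the result still fits in max_len; max_len == 0 means readahead is
 * disabled and the request is left untouched.
 */
static void expand_readahead(struct ra_req *req, uint64_t stripe_unit,
			     uint64_t max_len)
{
	uint64_t end = req->start + req->len;
	uint64_t new_end, blockoff;

	if (!max_len)
		return;

	new_end = round_up_u64(end, stripe_unit);
	if (new_end > req->i_size)
		new_end = req->i_size;
	if (new_end > end && new_end <= req->start + max_len)
		req->len = new_end - req->start;

	blockoff = req->start % stripe_unit;
	if (req->len + blockoff <= max_len) {
		req->start -= blockoff;
		req->len += blockoff;
	}
}
```

For example, with a 4096-byte stripe unit and a 16384-byte limit, a request for bytes 5000..5999 of a large file grows forward to 8192 and then backward to 4096, ending up as exactly one stripe unit.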
From: Xiubo Li xiubli@redhat.com
commit 257e6172ab36ebbe295a6c9ee9a9dd0fe54c1dc2 upstream.
If a client sends out a cap update dropping caps with the prior 'seq' just before an incoming cap revoke request, then the client may drop the revoke because it believes it's already released the requested capabilities.
This causes the MDS to wait indefinitely for the client to respond to the revoke. It's therefore always a good idea to ack the cap revoke request with the bumped up 'seq'.
Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/61782 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Milind Changire mchangir@redhat.com Reviewed-by: Patrick Donnelly pdonnell@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/caps.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3560,6 +3560,15 @@ static void handle_cap_grant(struct inod
 	}
 	BUG_ON(cap->issued & ~cap->implemented);
 
+	/* don't let check_caps skip sending a response to MDS for revoke msgs */
+	if (le32_to_cpu(grant->op) == CEPH_CAP_OP_REVOKE) {
+		cap->mds_wanted = 0;
+		if (cap == ci->i_auth_cap)
+			check_caps = 1; /* check auth cap only */
+		else
+			check_caps = 2; /* check all caps */
+	}
+
 	if (extra_info->inline_version > 0 &&
 	    extra_info->inline_version >= ci->i_inline_version) {
 		ci->i_inline_version = extra_info->inline_version;
From: Yinjun Zhang yinjun.zhang@corigine.com
commit cc7eab25b1cf3f9594fe61142d3523ce4d14a788 upstream.
When a device is moved from one namespace to another, its mc addresses are cleaned up in software but not removed from the application firmware. The addresses therefore remain in firmware and cause a resource leak.
Use `__dev_mc_unsync` to clean up the mc addresses when closing the port.
Fixes: e20aa071cd95 ("nfp: fix schedule in atomic context when sync mc address") Cc: stable@vger.kernel.org Signed-off-by: Yinjun Zhang yinjun.zhang@corigine.com Acked-by: Simon Horman simon.horman@corigine.com Signed-off-by: Louis Peens louis.peens@corigine.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Message-ID: 20230705052818.7122-1-louis.peens@corigine.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index 49f2f081ebb5..6b1fb5708434 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -53,6 +53,8 @@ #include "crypto/crypto.h" #include "crypto/fw.h"
+static int nfp_net_mc_unsync(struct net_device *netdev, const unsigned char *addr); + /** * nfp_net_get_fw_version() - Read and parse the FW version * @fw_ver: Output fw_version structure to read to @@ -1084,6 +1086,9 @@ static int nfp_net_netdev_close(struct net_device *netdev)
/* Step 2: Tell NFP */ + if (nn->cap_w1 & NFP_NET_CFG_CTRL_MCAST_FILTER) + __dev_mc_unsync(netdev, nfp_net_mc_unsync); + nfp_net_clear_config_and_disable(nn); nfp_port_configure(netdev, false);
From: Oliver Upton oliver.upton@linux.dev
commit 6df696cd9bc1ceed0e92e36908f88bbd16d18255 upstream.
AmpereOne has an erratum in its implementation of FEAT_HAFDBS that required disabling the feature on the design. This was done by reporting the feature as not implemented in the ID register, although the corresponding control bits were not actually RES0. This does not align well with the requirements of the architecture, which mandates these bits be RES0 if HAFDBS isn't implemented.
The kernel's use of stage-1 is unaffected, as the HA and HD bits are only set if HAFDBS is detected in the ID register. KVM, on the other hand, relies on the RES0 behavior at stage-2 to use the same value for VTCR_EL2 on any cpu in the system. Mitigate the non-RES0 behavior by leaving VTCR_EL2.HA clear on affected systems.
Cc: stable@vger.kernel.org Cc: D Scott Phillips scott@os.amperecomputing.com Cc: Darren Hart darren@os.amperecomputing.com Acked-by: D Scott Phillips scott@os.amperecomputing.com Acked-by: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/r/20230609220104.1836988-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/arm64/silicon-errata.rst | 3 +++ arch/arm64/Kconfig | 19 +++++++++++++++++++ arch/arm64/kernel/cpu_errata.c | 7 +++++++ arch/arm64/kvm/hyp/pgtable.c | 14 +++++++++++--- arch/arm64/tools/cpucaps | 1 + 5 files changed, 41 insertions(+), 3 deletions(-)
--- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -52,6 +52,9 @@ stable kernels. | Allwinner | A64/R18 | UNKNOWN1 | SUN50I_ERRATUM_UNKNOWN1 | +----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ +| Ampere | AmpereOne | AC03_CPU_38 | AMPERE_ERRATUM_AC03_CPU_38 | ++----------------+-----------------+-----------------+-----------------------------+ ++----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 | +----------------+-----------------+-----------------+-----------------------------+ | ARM | Cortex-A510 | #2064142 | ARM64_ERRATUM_2064142 | --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -407,6 +407,25 @@ menu "Kernel Features"
menu "ARM errata workarounds via the alternatives framework"
+config AMPERE_ERRATUM_AC03_CPU_38 + bool "AmpereOne: AC03_CPU_38: Certain bits in the Virtualization Translation Control Register and Translation Control Registers do not follow RES0 semantics" + default y + help + This option adds an alternative code sequence to work around Ampere + erratum AC03_CPU_38 on AmpereOne. + + The affected design reports FEAT_HAFDBS as not implemented in + ID_AA64MMFR1_EL1.HAFDBS, but (V)TCR_ELx.{HA,HD} are not RES0 + as required by the architecture. The unadvertised HAFDBS + implementation suffers from an additional erratum where hardware + A/D updates can occur after a PTE has been marked invalid. + + The workaround forces KVM to explicitly set VTCR_EL2.HA to 0, + which avoids enabling unadvertised hardware Access Flag management + at stage-2. + + If unsure, say Y. + config ARM64_WORKAROUND_CLEAN_CACHE bool
--- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -730,6 +730,13 @@ const struct arm64_cpu_capabilities arm6 .cpu_enable = cpu_clear_bf16_from_user_emulation, }, #endif +#ifdef CONFIG_AMPERE_ERRATUM_AC03_CPU_38 + { + .desc = "AmpereOne erratum AC03_CPU_38", + .capability = ARM64_WORKAROUND_AMPERE_AC03_CPU_38, + ERRATA_MIDR_ALL_VERSIONS(MIDR_AMPERE1), + }, +#endif { } }; --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -623,10 +623,18 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u #ifdef CONFIG_ARM64_HW_AFDBM /* * Enable the Hardware Access Flag management, unconditionally - * on all CPUs. The features is RES0 on CPUs without the support - * and must be ignored by the CPUs. + * on all CPUs. In systems that have asymmetric support for the feature + * this allows KVM to leverage hardware support on the subset of cores + * that implement the feature. + * + * The architecture requires VTCR_EL2.HA to be RES0 (thus ignored by + * hardware) on implementations that do not advertise support for the + * feature. As such, setting HA unconditionally is safe, unless you + * happen to be running on a design that has unadvertised support for + * HAFDBS. Here be dragons. */ - vtcr |= VTCR_EL2_HA; + if (!cpus_have_final_cap(ARM64_WORKAROUND_AMPERE_AC03_CPU_38)) + vtcr |= VTCR_EL2_HA; #endif /* CONFIG_ARM64_HW_AFDBM */
/* Set the vmid bits */ --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -77,6 +77,7 @@ WORKAROUND_2077057 WORKAROUND_2457168 WORKAROUND_2645198 WORKAROUND_2658417 +WORKAROUND_AMPERE_AC03_CPU_38 WORKAROUND_TRBE_OVERWRITE_FILL_MODE WORKAROUND_TSB_FLUSH_FAILURE WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
From: Weitao Wang WeitaoWang-oc@zhaoxin.com
commit f927728186f0de1167262d6a632f9f7e96433d1a upstream.
On the ZHAOXIN ZX-100 project, xHCI does not work normally after resuming from a system Sx state. To fix this, reinitialize xHCI on resume instead of restoring its state: add the XHCI_RESET_ON_RESUME quirk for ZX-100.
Cc: stable@vger.kernel.org Signed-off-by: Weitao Wang WeitaoWang-oc@zhaoxin.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Message-ID: 20230602144009.1225632-9-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -528,6 +528,11 @@ static void xhci_pci_quirks(struct devic
 	     pdev->device == PCI_DEVICE_ID_AMD_PROMONTORYA_4))
 		xhci->quirks |= XHCI_NO_SOFT_RETRY;
 
+	if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) {
+		if (pdev->device == 0x9202)
+			xhci->quirks |= XHCI_RESET_ON_RESUME;
+	}
+
 	/* xHC spec requires PCI devices to support D3hot and D3cold */
 	if (xhci->hci_version >= 0x120)
 		xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
From: Weitao Wang WeitaoWang-oc@zhaoxin.com
commit 2a865a652299f5666f3b785cbe758c5f57453036 upstream.
On some ZHAOXIN hosts, xHCI prefetches TRBs to improve performance. This prefetch may cross a page boundary and access memory not allocated by the xHCI driver. To fix this, allocate two pages for each segment and use only the first one. Add the quirk XHCI_ZHAOXIN_TRB_FETCH to enable this.
Cc: stable@vger.kernel.org Signed-off-by: Weitao Wang WeitaoWang-oc@zhaoxin.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Message-ID: 20230602144009.1225632-10-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-mem.c | 8 ++++++-- drivers/usb/host/xhci-pci.c | 7 ++++++- drivers/usb/host/xhci.h | 1 + 3 files changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -2352,8 +2352,12 @@ int xhci_mem_init(struct xhci_hcd *xhci, * and our use of dma addresses in the trb_address_map radix tree needs * TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need. */ - xhci->segment_pool = dma_pool_create("xHCI ring segments", dev, - TRB_SEGMENT_SIZE, TRB_SEGMENT_SIZE, xhci->page_size); + if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH) + xhci->segment_pool = dma_pool_create("xHCI ring segments", dev, + TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2); + else + xhci->segment_pool = dma_pool_create("xHCI ring segments", dev, + TRB_SEGMENT_SIZE, TRB_SEGMENT_SIZE, xhci->page_size);
/* See Table 46 and Note on Figure 55 */ xhci->device_pool = dma_pool_create("xHCI input/output contexts", dev, --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -529,8 +529,13 @@ static void xhci_pci_quirks(struct devic xhci->quirks |= XHCI_NO_SOFT_RETRY;
if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) { - if (pdev->device == 0x9202) + if (pdev->device == 0x9202) { xhci->quirks |= XHCI_RESET_ON_RESUME; + xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH; + } + + if (pdev->device == 0x9203) + xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH; }
/* xHC spec requires PCI devices to support D3hot and D3cold */ --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1905,6 +1905,7 @@ struct xhci_hcd { #define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42) #define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43) #define XHCI_RESET_TO_DEFAULT BIT_ULL(44) +#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
unsigned int num_active_eps; unsigned int limit_active_eps;
From: Weitao Wang WeitaoWang-oc@zhaoxin.com
commit d9b0328d0b8b8298dfdc97cd8e0e2371d4bcc97b upstream.
Some ZHAOXIN xHCI controllers follow the USB 3.1 spec but only support the Gen1 speed of 5 Gbps. In the Linux kernel, however, if xHCI supports USB 3.1, the root hub speed is reported as 10 Gbps. To fix this on ZHAOXIN xHCI platforms, read the USB speed IDs supported by xHCI to determine the root hub speed. Add the quirk XHCI_ZHAOXIN_HOST for this.
[fix warning about uninitialized symbol -Mathias]
Suggested-by: Mathias Nyman mathias.nyman@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Weitao Wang WeitaoWang-oc@zhaoxin.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Message-ID: 20230602144009.1225632-11-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-mem.c | 31 ++++++++++++++++++++++++------- drivers/usb/host/xhci-pci.c | 2 ++ drivers/usb/host/xhci.h | 1 + 3 files changed, 27 insertions(+), 7 deletions(-)
--- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -1968,7 +1968,7 @@ static void xhci_add_in_port(struct xhci { u32 temp, port_offset, port_count; int i; - u8 major_revision, minor_revision; + u8 major_revision, minor_revision, tmp_minor_revision; struct xhci_hub *rhub; struct device *dev = xhci_to_hcd(xhci)->self.sysdev; struct xhci_port_cap *port_cap; @@ -1988,6 +1988,15 @@ static void xhci_add_in_port(struct xhci */ if (minor_revision > 0x00 && minor_revision < 0x10) minor_revision <<= 4; + /* + * Some zhaoxin's xHCI controller that follow usb3.1 spec + * but only support Gen1. + */ + if (xhci->quirks & XHCI_ZHAOXIN_HOST) { + tmp_minor_revision = minor_revision; + minor_revision = 0; + } + } else if (major_revision <= 0x02) { rhub = &xhci->usb2_rhub; } else { @@ -1996,10 +2005,6 @@ static void xhci_add_in_port(struct xhci /* Ignoring port protocol we can't understand. FIXME */ return; } - rhub->maj_rev = XHCI_EXT_PORT_MAJOR(temp); - - if (rhub->min_rev < minor_revision) - rhub->min_rev = minor_revision;
/* Port offset and count in the third dword, see section 7.2 */ temp = readl(addr + 2); @@ -2017,8 +2022,6 @@ static void xhci_add_in_port(struct xhci if (xhci->num_port_caps > max_caps) return;
- port_cap->maj_rev = major_revision; - port_cap->min_rev = minor_revision; port_cap->psi_count = XHCI_EXT_PORT_PSIC(temp);
if (port_cap->psi_count) { @@ -2039,6 +2042,11 @@ static void xhci_add_in_port(struct xhci XHCI_EXT_PORT_PSIV(port_cap->psi[i - 1]))) port_cap->psi_uid_count++;
+ if (xhci->quirks & XHCI_ZHAOXIN_HOST && + major_revision == 0x03 && + XHCI_EXT_PORT_PSIV(port_cap->psi[i]) >= 5) + minor_revision = tmp_minor_revision; + xhci_dbg(xhci, "PSIV:%d PSIE:%d PLT:%d PFD:%d LP:%d PSIM:%d\n", XHCI_EXT_PORT_PSIV(port_cap->psi[i]), XHCI_EXT_PORT_PSIE(port_cap->psi[i]), @@ -2048,6 +2056,15 @@ static void xhci_add_in_port(struct xhci XHCI_EXT_PORT_PSIM(port_cap->psi[i])); } } + + rhub->maj_rev = major_revision; + + if (rhub->min_rev < minor_revision) + rhub->min_rev = minor_revision; + + port_cap->maj_rev = major_revision; + port_cap->min_rev = minor_revision; + /* cache usb2 port capabilities */ if (major_revision < 0x03 && xhci->num_ext_caps < max_caps) xhci->ext_caps[xhci->num_ext_caps++] = temp; --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -529,6 +529,8 @@ static void xhci_pci_quirks(struct devic xhci->quirks |= XHCI_NO_SOFT_RETRY;
if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) { + xhci->quirks |= XHCI_ZHAOXIN_HOST; + if (pdev->device == 0x9202) { xhci->quirks |= XHCI_RESET_ON_RESUME; xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH; --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1906,6 +1906,7 @@ struct xhci_hcd { #define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43) #define XHCI_RESET_TO_DEFAULT BIT_ULL(44) #define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45) +#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
unsigned int num_active_eps; unsigned int limit_active_eps;
From: George Stark gnstark@sberdevices.ru
commit c57fa0037024c92c2ca34243e79e857da5d2c0a9 upstream.
According to the datasheets of the supported meson SoCs, the ADC_CLK_DIV field is 6 bits wide. Although all supported SoCs document the register with that field, later SoCs use an external clock rather than the ADC's internal clock, so this patch affects only the meson8 family (S8* SoCs).
Fixes: 3adbf3427330 ("iio: adc: add a driver for the SAR ADC found in Amlogic Meson SoCs") Signed-off-by: George Stark GNStark@sberdevices.ru Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Link: https://lore.kernel.org/r/20230606165357.42417-1-gnstark@sberdevices.ru Cc: stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/meson_saradc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/meson_saradc.c
+++ b/drivers/iio/adc/meson_saradc.c
@@ -72,7 +72,7 @@
 	#define MESON_SAR_ADC_REG3_PANEL_DETECT_COUNT_MASK	GENMASK(20, 18)
 	#define MESON_SAR_ADC_REG3_PANEL_DETECT_FILTER_TB_MASK	GENMASK(17, 16)
 	#define MESON_SAR_ADC_REG3_ADC_CLK_DIV_SHIFT		10
-	#define MESON_SAR_ADC_REG3_ADC_CLK_DIV_WIDTH		5
+	#define MESON_SAR_ADC_REG3_ADC_CLK_DIV_WIDTH		6
 	#define MESON_SAR_ADC_REG3_BLOCK_DLY_SEL_MASK		GENMASK(9, 8)
 	#define MESON_SAR_ADC_REG3_BLOCK_DLY_MASK		GENMASK(7, 0)
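To see what the one-bit width change means for the register field, the sketch below computes the field mask and the largest raw field value for both widths. The helper names are hypothetical, and the sketch deals only with raw field values: how the clock framework maps a divider onto the field is outside its scope.

```c
#include <stdint.h>

#define CLK_DIV_SHIFT 10	/* same shift as in the patch */

/* Register mask of a field that is `width` bits wide at CLK_DIV_SHIFT. */
static uint32_t clk_div_mask(unsigned int width)
{
	return ((1u << width) - 1) << CLK_DIV_SHIFT;
}

/* Largest raw value the field can hold. */
static uint32_t clk_div_max(unsigned int width)
{
	return (1u << width) - 1;
}

/* Raw value a plain masked write of `val` would keep in the field. */
static uint32_t clk_div_stored(uint32_t val, unsigned int width)
{
	return ((val << CLK_DIV_SHIFT) & clk_div_mask(width)) >> CLK_DIV_SHIFT;
}
```

With a width of 5 the mask covers bits 14:10 and the largest field value is 31; with a width of 6 it covers bits 15:10 and the largest value is 63. A raw value of 40 would lose its top bit in the narrower field.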
From: Stephan Gerhold stephan.gerhold@kernkonzept.com
commit b2a2ab039bd58f51355e33d7d3fc64605d7f870d upstream.
When dev_pm_opp_of_find_icc_paths() in _allocate_opp_table() returns -EPROBE_DEFER, the opp_table is freed again, to wait until all the interconnect paths are available.
However, if the OPP table is using required-opps then it may already have been added to the global lazy_opp_tables list. The error path does not remove the opp_table from the list again.
This can cause crashes later when the provider of the required-opps is added, since we will iterate over OPP tables that have already been freed. E.g.:
  Unable to handle kernel NULL pointer dereference when read
  CPU: 0 PID: 7 Comm: kworker/0:0 Not tainted 6.4.0-rc3
  PC is at _of_add_opp_table_v2 (include/linux/of.h:949
  drivers/opp/of.c:98 drivers/opp/of.c:344 drivers/opp/of.c:404
  drivers/opp/of.c:1032) -> lazy_link_required_opp_table()
Fix this by calling _of_clear_opp_table() to remove the opp_table from the list and clear other allocated resources. While at it, also add the missing mutex_destroy() calls in the error path.
Cc: stable@vger.kernel.org Suggested-by: Viresh Kumar viresh.kumar@linaro.org Fixes: 7eba0c7641b0 ("opp: Allow lazy-linking of required-opps") Signed-off-by: Stephan Gerhold stephan.gerhold@kernkonzept.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/opp/core.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1358,7 +1358,10 @@ static struct opp_table *_allocate_opp_t
 	return opp_table;
 
 remove_opp_dev:
+	_of_clear_opp_table(opp_table);
 	_remove_opp_dev(opp_dev, opp_table);
+	mutex_destroy(&opp_table->genpd_virt_dev_lock);
+	mutex_destroy(&opp_table->lock);
 err:
 	kfree(opp_table);
 	return ERR_PTR(ret);
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 490937d479abe5f6584e69b96df066bc87be92e9 upstream.
The 'qcom_swrm_ctrl->pconfig' has size of QCOM_SDW_MAX_PORTS (14), however we index it starting from 1, not 0, to match real port numbers. This can lead to writing port config past 'pconfig' bounds and overwriting next member of 'qcom_swrm_ctrl' struct. Reported also by smatch:
drivers/soundwire/qcom.c:1269 qcom_swrm_get_port_config() error: buffer overflow 'ctrl->pconfig' 14 <= 14
Fixes: 9916c02ccd74 ("soundwire: qcom: cleanup internal port config indexing") Cc: stable@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Reported-by: Dan Carpenter error27@gmail.com Link: https://lore.kernel.org/r/202305201301.sCJ8UDKV-lkp@intel.com/ Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230601102525.609627-1-krzysztof.kozlowski@linaro... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/soundwire/qcom.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -171,7 +171,8 @@ struct qcom_swrm_ctrl {
 	u32 intr_mask;
 	u8 rcmd_id;
 	u8 wcmd_id;
-	struct qcom_swrm_port_config pconfig[QCOM_SDW_MAX_PORTS];
+	/* Port numbers are 1 - 14 */
+	struct qcom_swrm_port_config pconfig[QCOM_SDW_MAX_PORTS + 1];
 	struct sdw_stream_runtime *sruntime[SWRM_MAX_DAIS];
 	enum sdw_slave_status status[SDW_MAX_DEVICES + 1];
 	int (*reg_read)(struct qcom_swrm_ctrl *ctrl, int reg, u32 *val);
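The off-by-one fixed here comes from indexing an array by 1-based port numbers. A minimal user-space sketch of the corrected layout follows; `MAX_PORTS`, `struct port_config`, `struct ctrl` and `port_slot()` are hypothetical names, and the range check in `port_slot()` is an addition for illustration that the patch itself does not make.

```c
#include <stddef.h>

#define MAX_PORTS 14	/* mirrors the value of QCOM_SDW_MAX_PORTS */

struct port_config {
	unsigned char si;
	unsigned char off1;
	unsigned char off2;
};

struct ctrl {
	/* Port numbers are 1..MAX_PORTS, so slot 0 stays unused. */
	struct port_config pconfig[MAX_PORTS + 1];
	int sentinel;	/* stands in for "the next member of the struct" */
};

/* Return the slot for a 1-based port number, or NULL if out of range. */
static struct port_config *port_slot(struct ctrl *c, unsigned int port)
{
	if (port < 1 || port > MAX_PORTS)
		return NULL;

	return &c->pconfig[port];
}
```

With only `MAX_PORTS` elements, a write for port 14 would land one element past the end of the array, on whatever member follows it; the extra element keeps that write inside `pconfig`.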
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 950e9a295b984b011bcbfb90af167e4e20a077f3 upstream.
The value of the V4L2_SUBDEV_ROUTE_FL_ACTIVE is 1, not 0. Use hexadecimal numbers as is done elsewhere in the documentation.
Cc: stable@vger.kernel.org # for >= v6.3 Fixes: ea73eda50813 ("media: Documentation: Add GS_ROUTING documentation") Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Jacopo Mondi jacopo.mondi@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- .../userspace-api/media/v4l/vidioc-subdev-g-routing.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst index 68ca343c3b44..2d6e3bbdd040 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst @@ -122,7 +122,7 @@ for all the route entries and call ``VIDIOC_SUBDEV_G_ROUTING`` again. :widths: 3 1 4
* - V4L2_SUBDEV_ROUTE_FL_ACTIVE - - 0 + - 0x0001 - The route is enabled. Set by applications.
Return Value
From: Jiaqing Zhao jiaqing.zhao@linux.intel.com
commit a82d62f708545d22859584e0e0620da8e3759bbc upstream.
This reverts commit eb26dfe8aa7eeb5a5aa0b7574550125f8aa4c3b3.
Commit eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug"), merged on Jul 13, 2012, adds a quirk for PCI_VENDOR_ID_ASIX (0x9710). But that ID is the same as PCI_VENDOR_ID_NETMOS, defined in 1f8b061050c7 ("[PATCH] Netmos parallel/serial/combo support"), merged on Mar 28, 2005. In the pci_serial_quirks array, the NetMos entry has taken precedence over the ASIX entry ever since the latter was merged, so the code in that commit has always been unreachable.
In my tests, adding the FIFO workaround to pci_netmos_init() makes no difference, and the vendor driver does not have such a workaround either. Given that the code has gone unused for over a decade, it's safe to revert it.
Also, the real PCI_VENDOR_ID_ASIX should be 0x125b, which is used on their newer AX99100 PCIe serial controllers released in 2016. The FIFO workaround is unlikely to be intended for these newer controllers, and it was never implemented in the vendor driver.
Fixes: eb26dfe8aa7e ("8250: add support for ASIX devices with a FIFO bug") Cc: stable stable@kernel.org Signed-off-by: Jiaqing Zhao jiaqing.zhao@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230619155743.827859-1-jiaqing.zhao@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250.h | 1 - drivers/tty/serial/8250/8250_pci.c | 19 ------------------- drivers/tty/serial/8250/8250_port.c | 11 +++-------- include/linux/serial_8250.h | 1 - 4 files changed, 3 insertions(+), 29 deletions(-)
--- a/drivers/tty/serial/8250/8250.h +++ b/drivers/tty/serial/8250/8250.h @@ -91,7 +91,6 @@ struct serial8250_config { #define UART_BUG_TXEN BIT(1) /* UART has buggy TX IIR status */ #define UART_BUG_NOMSR BIT(2) /* UART has buggy MSR status bits (Au1x00) */ #define UART_BUG_THRE BIT(3) /* UART has buggy THRE reassertion */ -#define UART_BUG_PARITY BIT(4) /* UART mishandles parity if FIFO enabled */ #define UART_BUG_TXRACE BIT(5) /* UART Tx fails to set remote DR */
--- a/drivers/tty/serial/8250/8250_pci.c +++ b/drivers/tty/serial/8250/8250_pci.c @@ -1232,14 +1232,6 @@ static int pci_oxsemi_tornado_setup(stru return pci_default_setup(priv, board, up, idx); }
-static int pci_asix_setup(struct serial_private *priv, - const struct pciserial_board *board, - struct uart_8250_port *port, int idx) -{ - port->bugs |= UART_BUG_PARITY; - return pci_default_setup(priv, board, port, idx); -} - #define QPCR_TEST_FOR1 0x3F #define QPCR_TEST_GET1 0x00 #define QPCR_TEST_FOR2 0x40 @@ -1955,7 +1947,6 @@ pci_moxa_setup(struct serial_private *pr #define PCI_DEVICE_ID_WCH_CH355_4S 0x7173 #define PCI_VENDOR_ID_AGESTAR 0x5372 #define PCI_DEVICE_ID_AGESTAR_9375 0x6872 -#define PCI_VENDOR_ID_ASIX 0x9710 #define PCI_DEVICE_ID_BROADCOM_TRUMANAGE 0x160a #define PCI_DEVICE_ID_AMCC_ADDIDATA_APCI7800 0x818e
@@ -2601,16 +2592,6 @@ static struct pci_serial_quirk pci_seria .setup = pci_wch_ch38x_setup, }, /* - * ASIX devices with FIFO bug - */ - { - .vendor = PCI_VENDOR_ID_ASIX, - .device = PCI_ANY_ID, - .subvendor = PCI_ANY_ID, - .subdevice = PCI_ANY_ID, - .setup = pci_asix_setup, - }, - /* * Broadcom TruManage (NetXtreme) */ { --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c @@ -2636,11 +2636,8 @@ static unsigned char serial8250_compute_
if (c_cflag & CSTOPB) cval |= UART_LCR_STOP; - if (c_cflag & PARENB) { + if (c_cflag & PARENB) cval |= UART_LCR_PARITY; - if (up->bugs & UART_BUG_PARITY) - up->fifo_bug = true; - } if (!(c_cflag & PARODD)) cval |= UART_LCR_EPAR; if (c_cflag & CMSPAR) @@ -2801,8 +2798,7 @@ serial8250_do_set_termios(struct uart_po up->lcr = cval; /* Save computed LCR */
if (up->capabilities & UART_CAP_FIFO && port->fifosize > 1) { - /* NOTE: If fifo_bug is not set, a user can set RX_trigger. */ - if ((baud < 2400 && !up->dma) || up->fifo_bug) { + if (baud < 2400 && !up->dma) { up->fcr &= ~UART_FCR_TRIGGER_MASK; up->fcr |= UART_FCR_TRIGGER_1; } @@ -3138,8 +3134,7 @@ static int do_set_rxtrig(struct tty_port struct uart_8250_port *up = up_to_u8250p(uport); int rxtrig;
- if (!(up->capabilities & UART_CAP_FIFO) || uport->fifosize <= 1 || - up->fifo_bug) + if (!(up->capabilities & UART_CAP_FIFO) || uport->fifosize <= 1) return -EINVAL;
rxtrig = bytes_to_fcr_rxtrig(up, bytes); --- a/include/linux/serial_8250.h +++ b/include/linux/serial_8250.h @@ -98,7 +98,6 @@ struct uart_8250_port { struct list_head list; /* ports on this IRQ */ u32 capabilities; /* port capabilities */ unsigned short bugs; /* port bugs */ - bool fifo_bug; /* min RX trigger if enabled */ unsigned int tx_loadsz; /* transmit fifo load size */ unsigned char acr; unsigned char fcr;
From: Jonas Gorski jonas.gorski@gmail.com
commit 6722e46513e0af8e2fff4698f7cb78bc50a9f13f upstream.
The IXP4XX_EXP_T1_MASK was shifted one bit to the right, overlapping IXP4XX_EXP_T2_MASK and leaving bit 29 unused. The offset being wrong is also confirmed at least by the datasheet of IXP45X/46X [1].
Fix this by aligning it to IXP4XX_EXP_T1_SHIFT.
[1] https://www.intel.com/content/dam/www/public/us/en/documents/manuals/ixp45x-...
Cc: stable@vger.kernel.org Fixes: 1c953bda90ca ("bus: ixp4xx: Add a driver for IXP4xx expansion bus") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Link: https://lore.kernel.org/r/20230624112958.27727-1-jonas.gorski@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20230624122139.3229642-1-linus.walleij@linaro.org Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/intel-ixp4xx-eb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/bus/intel-ixp4xx-eb.c +++ b/drivers/bus/intel-ixp4xx-eb.c @@ -33,7 +33,7 @@ #define IXP4XX_EXP_TIMING_STRIDE 0x04 #define IXP4XX_EXP_CS_EN BIT(31) #define IXP456_EXP_PAR_EN BIT(30) /* Only on IXP45x and IXP46x */ -#define IXP4XX_EXP_T1_MASK GENMASK(28, 27) +#define IXP4XX_EXP_T1_MASK GENMASK(29, 28) #define IXP4XX_EXP_T1_SHIFT 28 #define IXP4XX_EXP_T2_MASK GENMASK(27, 26) #define IXP4XX_EXP_T2_SHIFT 26
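The overlap described in the commit message can be checked with plain integer arithmetic. What follows is a minimal userspace sketch, not kernel code: GENMASK_U32() is a stand-in written for this illustration (the kernel's own GENMASK() is not used here), and the helper names t1_t2_overlap() and mask_matches_shift() are hypothetical.

```c
#include <stdint.h>

/* Illustrative 32-bit stand-in for GENMASK(h, l): bits l..h set (assumption). */
#define GENMASK_U32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

#define T1_SHIFT	28
#define T1_MASK_OLD	GENMASK_U32(28, 27)	/* value before the fix */
#define T1_MASK_NEW	GENMASK_U32(29, 28)	/* value after the fix */
#define T2_MASK		GENMASK_U32(27, 26)

/* Bits shared between a T1 mask and the T2 mask; zero means no overlap. */
static uint32_t t1_t2_overlap(uint32_t t1_mask)
{
	return t1_mask & T2_MASK;
}

/* Nonzero when the lowest set bit of the mask sits exactly at T1_SHIFT. */
static int mask_matches_shift(uint32_t t1_mask)
{
	uint32_t lowest = t1_mask & (~t1_mask + 1);

	return lowest == (UINT32_C(1) << T1_SHIFT);
}
```

With these definitions the old mask shares bit 27 with the T2 field and does not start at T1_SHIFT, while the new mask does neither.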
From: Heiko Carstens hca@linux.ibm.com
commit 938f0c35d7d93a822ab9c9728e3205e8e57409d0 upstream.
Nathan Chancellor reported a kernel build error on Fedora 39:
$ clang --version | head -1
clang version 16.0.5 (Fedora 16.0.5-1.fc39)

$ s390x-linux-gnu-ld --version | head -1
GNU ld version 2.40-1.fc39

$ make -skj"$(nproc)" ARCH=s390 CC=clang CROSS_COMPILE=s390x-linux-gnu- olddefconfig all
s390x-linux-gnu-ld: arch/s390/boot/startup.o(.text+0x5b4): misaligned symbol `_decompressor_end' (0x35b0f) for relocation R_390_PC32DBL
make[3]: *** [.../arch/s390/boot/Makefile:78: arch/s390/boot/vmlinux] Error 1
It turned out that the problem with misaligned symbols on s390 was fixed with commit 80ddf5ce1c92 ("s390: always build relocatable kernel") for the kernel image, but did not take into account that the decompressor uses its own set of CFLAGS, which come without -fPIE.
Add the -fPIE flag to the decompressor CFLAGS as well to fix this.
Reported-by: Nathan Chancellor nathan@kernel.org Tested-by: Nathan Chancellor nathan@kernel.org Reported-by: CKI cki-project@redhat.com Suggested-by: Ulrich Weigand Ulrich.Weigand@de.ibm.com Link: https://github.com/ClangBuiltLinux/linux/issues/1747 Link: https://lore.kernel.org/32935.123062114500601371@us-mta-9.us.mimecast.lan/ Link: https://lore.kernel.org/r/20230622125508.1068457-1-hca@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -27,6 +27,7 @@ KBUILD_CFLAGS_DECOMPRESSOR += -fno-delet KBUILD_CFLAGS_DECOMPRESSOR += -fno-asynchronous-unwind-tables KBUILD_CFLAGS_DECOMPRESSOR += -ffreestanding KBUILD_CFLAGS_DECOMPRESSOR += -fno-stack-protector +KBUILD_CFLAGS_DECOMPRESSOR += -fPIE KBUILD_CFLAGS_DECOMPRESSOR += $(call cc-disable-warning, address-of-packed-member) KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO),-g) KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option, -gdwarf-4,))
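As background for the linker error quoted in the commit message: as far as the s390 ELF ABI describes it, an R_390_PC32DBL relocation stores the PC-relative offset in halfwords, so its target has to be 2-byte aligned. The sketch below only illustrates that alignment rule in userspace C; the helper name halfword_aligned() is hypothetical and is not taken from the linker or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when addr is 2-byte aligned, which is the
 * precondition assumed here for a relocation encoded in halfwords.
 */
static bool halfword_aligned(uint64_t addr)
{
	return (addr & 1) == 0;
}
```

The address from the error message, 0x35b0f, is odd and therefore fails this check, which matches the "misaligned symbol" diagnostic.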
From: Matthias Kaehlcke mka@chromium.org
commit 47f04616f2c9b2f4f0c9127e30ca515a078db591 upstream.
Add a NULL check for the 'bdev' parameter of dm_verity_loadpin_is_bdev_trusted(). The function is called by loadpin_check(), which passes the block device that corresponds to the super block of the file system from which a file is being loaded. Generally a super_block structure has an associated block device, however that is not always the case (e.g. tmpfs).
Cc: stable@vger.kernel.org # v6.0+ Fixes: b6c1c5745ccc ("dm: Add verity helpers for LoadPin") Signed-off-by: Matthias Kaehlcke mka@chromium.org Link: https://lore.kernel.org/r/20230627202800.1.Id63f7f59536d20f1ab83e1abdc1fda14... Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-verity-loadpin.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/md/dm-verity-loadpin.c +++ b/drivers/md/dm-verity-loadpin.c @@ -58,6 +58,9 @@ bool dm_verity_loadpin_is_bdev_trusted(s int srcu_idx; bool trusted = false;
+ if (bdev == NULL) + return false; + if (list_empty(&dm_verity_loadpin_trusted_root_digests)) return false;
From: Mohamed Khalfella mkhalfella@purestorage.com
commit 6018b585e8c6fa7d85d4b38d9ce49a5b67be7078 upstream.
Hist triggers can have referenced variables without having direct variable fields. This can be the case if referenced variables are added for trigger actions. In this case the newly added references will not have field variables. Not taking such referenced variables into consideration can result in a bug where it is possible to remove a hist trigger whose variables are still being referenced. The bug is easily reproducible like so:

$ cd /sys/kernel/tracing
$ echo 'synthetic_sys_enter char[] comm; long id' >> synthetic_events
$ echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
$ echo 'hist:keys=common_pid.execname,id.syscall:onmatch(raw_syscalls.sys_enter).synthetic_sys_enter($comm, id)' >> events/raw_syscalls/sys_enter/trigger
$ echo '!hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
[ 100.263533] ================================================================== [ 100.264634] BUG: KASAN: slab-use-after-free in resolve_var_refs+0xc7/0x180 [ 100.265520] Read of size 8 at addr ffff88810375d0f0 by task bash/439 [ 100.266320] [ 100.266533] CPU: 2 PID: 439 Comm: bash Not tainted 6.5.0-rc1 #4 [ 100.267277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014 [ 100.268561] Call Trace: [ 100.268902] <TASK> [ 100.269189] dump_stack_lvl+0x4c/0x70 [ 100.269680] print_report+0xc5/0x600 [ 100.270165] ? resolve_var_refs+0xc7/0x180 [ 100.270697] ? kasan_complete_mode_report_info+0x80/0x1f0 [ 100.271389] ? resolve_var_refs+0xc7/0x180 [ 100.271913] kasan_report+0xbd/0x100 [ 100.272380] ? resolve_var_refs+0xc7/0x180 [ 100.272920] __asan_load8+0x71/0xa0 [ 100.273377] resolve_var_refs+0xc7/0x180 [ 100.273888] event_hist_trigger+0x749/0x860 [ 100.274505] ? kasan_save_stack+0x2a/0x50 [ 100.275024] ? kasan_set_track+0x29/0x40 [ 100.275536] ? __pfx_event_hist_trigger+0x10/0x10 [ 100.276138] ? ksys_write+0xd1/0x170 [ 100.276607] ? do_syscall_64+0x3c/0x90 [ 100.277099] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 100.277771] ? destroy_hist_data+0x446/0x470 [ 100.278324] ? event_hist_trigger_parse+0xa6c/0x3860 [ 100.278962] ? __pfx_event_hist_trigger_parse+0x10/0x10 [ 100.279627] ? __kasan_check_write+0x18/0x20 [ 100.280177] ? mutex_unlock+0x85/0xd0 [ 100.280660] ? __pfx_mutex_unlock+0x10/0x10 [ 100.281200] ? kfree+0x7b/0x120 [ 100.281619] ? ____kasan_slab_free+0x15d/0x1d0 [ 100.282197] ? event_trigger_write+0xac/0x100 [ 100.282764] ? __kasan_slab_free+0x16/0x20 [ 100.283293] ? __kmem_cache_free+0x153/0x2f0 [ 100.283844] ? sched_mm_cid_remote_clear+0xb1/0x250 [ 100.284550] ? __pfx_sched_mm_cid_remote_clear+0x10/0x10 [ 100.285221] ? event_trigger_write+0xbc/0x100 [ 100.285781] ? __kasan_check_read+0x15/0x20 [ 100.286321] ? __bitmap_weight+0x66/0xa0 [ 100.286833] ? _find_next_bit+0x46/0xe0 [ 100.287334] ? 
task_mm_cid_work+0x37f/0x450 [ 100.287872] event_triggers_call+0x84/0x150 [ 100.288408] trace_event_buffer_commit+0x339/0x430 [ 100.289073] ? ring_buffer_event_data+0x3f/0x60 [ 100.292189] trace_event_raw_event_sys_enter+0x8b/0xe0 [ 100.295434] syscall_trace_enter.constprop.0+0x18f/0x1b0 [ 100.298653] syscall_enter_from_user_mode+0x32/0x40 [ 100.301808] do_syscall_64+0x1a/0x90 [ 100.304748] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 100.307775] RIP: 0033:0x7f686c75c1cb [ 100.310617] Code: 73 01 c3 48 8b 0d 65 3c 10 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 21 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 3c 10 00 f7 d8 64 89 01 48 [ 100.317847] RSP: 002b:00007ffc60137a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000021 [ 100.321200] RAX: ffffffffffffffda RBX: 000055f566469ea0 RCX: 00007f686c75c1cb [ 100.324631] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 000000000000000a [ 100.328104] RBP: 00007ffc60137ac0 R08: 00007f686c818460 R09: 000000000000000a [ 100.331509] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009 [ 100.334992] R13: 0000000000000007 R14: 000000000000000a R15: 0000000000000007 [ 100.338381] </TASK>
We hit the bug because when the second hist trigger was created, has_hist_vars() returned false since that trigger did not have variables of its own. As a result, save_hist_vars() was not called to add the trigger to trace_array->hist_vars. Later on, when we attempted to remove the first histogram, find_any_var_ref() failed to detect that it was still in use because it did not find the second trigger in the hist_vars list.
With this change we wait until trigger actions are created so we can take into consideration if hist trigger has variable references. Also, now we check the return value of save_hist_vars() and fail trigger creation if save_hist_vars() fails.
Link: https://lore.kernel.org/linux-trace-kernel/20230712223021.636335-1-mkhalfell...
Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Signed-off-by: Mohamed Khalfella mkhalfella@purestorage.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_hist.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -6663,13 +6663,15 @@ static int event_hist_trigger_parse(stru if (get_named_trigger_data(trigger_data)) goto enable;
- if (has_hist_vars(hist_data)) - save_hist_vars(hist_data); - ret = create_actions(hist_data); if (ret) goto out_unreg;
+ if (has_hist_vars(hist_data) || hist_data->n_var_refs) { + if (save_hist_vars(hist_data)) + goto out_unreg; + } + ret = tracing_map_init(hist_data->map); if (ret) goto out_unreg;
From: Zheng Yejian zhengyejian1@huawei.com
commit d5a821896360cc8b93a15bd888fabc858c038dc0 upstream.
kmemleak reports: unreferenced object 0xffff88814d14e200 (size 256): comm "cat", pid 336, jiffies 4294871818 (age 779.490s) hex dump (first 32 bytes): 04 00 01 03 00 00 00 00 08 00 00 00 00 00 00 00 ................ 0c d8 c8 9b ff ff ff ff 04 5a ca 9b ff ff ff ff .........Z...... backtrace: [<ffffffff9bdff18f>] __kmalloc+0x4f/0x140 [<ffffffff9bc9238b>] trace_find_next_entry+0xbb/0x1d0 [<ffffffff9bc9caef>] trace_print_lat_context+0xaf/0x4e0 [<ffffffff9bc94490>] print_trace_line+0x3e0/0x950 [<ffffffff9bc95499>] tracing_read_pipe+0x2d9/0x5a0 [<ffffffff9bf03a43>] vfs_read+0x143/0x520 [<ffffffff9bf04c2d>] ksys_read+0xbd/0x160 [<ffffffff9d0f0edf>] do_syscall_64+0x3f/0x90 [<ffffffff9d2000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This happens when reading the file 'trace_pipe': 'iter->temp' is allocated or reallocated in trace_find_next_entry() but is not freed before 'trace_pipe' is closed.
To fix it, free 'iter->temp' in tracing_release_pipe().
Link: https://lore.kernel.org/linux-trace-kernel/20230713141435.1133021-1-zhengyej...
Cc: stable@vger.kernel.org Fixes: ff895103a84ab ("tracing: Save off entry when peeking at next entry") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6753,6 +6753,7 @@ static int tracing_release_pipe(struct i
free_cpumask_var(iter->started); kfree(iter->fmt); + kfree(iter->temp); mutex_destroy(&iter->mutex); kfree(iter);
From: Christoph Hellwig hch@lst.de
commit ac522fc6c3165fd0daa2f8da7e07d5f800586daa upstream.
While duplicate IDs are still very harmful, including the potential to easily see changing devices in /dev/disk/by-id, it turns out they are extremely common for cheap end user NVMe devices.
Relax our check for them so that it doesn't reject the probe on single-ported PCIe devices, but prints a big warning instead. In doubt we'd still like to see quirk entries to disable the potential for changing supposedly stable device identifier links, but this will at least allow users who have two (or more) of these devices to use them without having to manually add a new PCI ID entry with the quirk through sysfs or by patching the kernel.
Fixes: 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique") Cc: stable@vger.kernel.org # 6.0+ Co-developed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/core.c | 36 +++++++++++++++++++++++++++++++++--- 1 file changed, 33 insertions(+), 3 deletions(-)
--- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4226,10 +4226,40 @@ static int nvme_init_ns_head(struct nvme
ret = nvme_global_check_duplicate_ids(ctrl->subsys, &info->ids); if (ret) { - dev_err(ctrl->device, - "globally duplicate IDs for nsid %d\n", info->nsid); + /* + * We've found two different namespaces on two different + * subsystems that report the same ID. This is pretty nasty + * for anything that actually requires unique device + * identification. In the kernel we need this for multipathing, + * and in user space the /dev/disk/by-id/ links rely on it. + * + * If the device also claims to be multi-path capable back off + * here now and refuse the probe the second device as this is a + * recipe for data corruption. If not this is probably a + * cheap consumer device if on the PCIe bus, so let the user + * proceed and use the shiny toy, but warn that with changing + * probing order (which due to our async probing could just be + * device taking longer to startup) the other device could show + * up at any time. + */ nvme_print_device_info(ctrl); - return ret; + if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */ + ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) && + info->is_shared)) { + dev_err(ctrl->device, + "ignoring nsid %d because of duplicate IDs\n", + info->nsid); + return ret; + } + + dev_err(ctrl->device, + "clearing duplicate IDs for nsid %d\n", info->nsid); + dev_err(ctrl->device, + "use of /dev/disk/by-id/ may cause data corruption\n"); + memset(&info->ids.nguid, 0, sizeof(info->ids.nguid)); + memset(&info->ids.uuid, 0, sizeof(info->ids.uuid)); + memset(&info->ids.eui64, 0, sizeof(info->ids.eui64)); + ctrl->quirks |= NVME_QUIRK_BOGUS_NID; }
mutex_lock(&ctrl->subsys->lock);
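The decision made by the hunk above can be summarized as a small pure function. This is only a sketch of the branching, under the assumption that the three inputs correspond to the NVME_F_FABRICS flag, the NVME_CTRL_CMIC_MULTI_CTRL bit, and info->is_shared; the enum and function names are hypothetical and do not exist in the driver.

```c
#include <stdbool.h>

/* Hypothetical outcome of finding globally duplicate namespace IDs. */
enum dup_id_action {
	DUP_ID_REJECT,		/* refuse the namespace, as before the patch */
	DUP_ID_CLEAR_AND_WARN,	/* zero the IDs, set the bogus-NID quirk, warn */
};

/*
 * Reject on fabrics transports, or when the subsystem is multi-controller
 * capable and the namespace is shared; otherwise clear the IDs and go on.
 */
static enum dup_id_action on_duplicate_ids(bool is_fabrics,
					   bool multi_ctrl_capable,
					   bool ns_is_shared)
{
	if (is_fabrics || (multi_ctrl_capable && ns_is_shared))
		return DUP_ID_REJECT;

	return DUP_ID_CLEAR_AND_WARN;
}
```

Note that a multi-controller capable subsystem with a private (non-shared) namespace still falls into the relaxed path in this reading of the condition.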
From: Florent Revest revest@chromium.org
commit 8564c315876ab86fcaf8e7f558d6a84cb2ce5590 upstream.
The ftrace-direct-too sample traces the handle_mm_fault function whose signature changed since the introduction of the sample. Since: commit bce617edecad ("mm: do page fault accounting in handle_mm_fault") handle_mm_fault now has 4 arguments. Therefore, the sample trampoline should save 4 argument registers.
s390 saves all argument registers already so it does not need a change but x86_64 needs an extra push and pop.
This also evolves the signature of the tracing function to make it mirror the signature of the traced function.
Link: https://lkml.kernel.org/r/20230427140700.625241-2-revest@chromium.org
Cc: stable@vger.kernel.org Fixes: bce617edecad ("mm: do page fault accounting in handle_mm_fault") Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: Mark Rutland mark.rutland@arm.com Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Florent Revest revest@chromium.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- samples/ftrace/ftrace-direct-too.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/samples/ftrace/ftrace-direct-too.c +++ b/samples/ftrace/ftrace-direct-too.c @@ -5,14 +5,14 @@ #include <linux/ftrace.h> #include <asm/asm-offsets.h>
-extern void my_direct_func(struct vm_area_struct *vma, - unsigned long address, unsigned int flags); +extern void my_direct_func(struct vm_area_struct *vma, unsigned long address, + unsigned int flags, struct pt_regs *regs);
-void my_direct_func(struct vm_area_struct *vma, - unsigned long address, unsigned int flags) +void my_direct_func(struct vm_area_struct *vma, unsigned long address, + unsigned int flags, struct pt_regs *regs) { - trace_printk("handle mm fault vma=%p address=%lx flags=%x\n", - vma, address, flags); + trace_printk("handle mm fault vma=%p address=%lx flags=%x regs=%p\n", + vma, address, flags, regs); }
extern void my_tramp(void *); @@ -34,7 +34,9 @@ asm ( " pushq %rdi\n" " pushq %rsi\n" " pushq %rdx\n" +" pushq %rcx\n" " call my_direct_func\n" +" popq %rcx\n" " popq %rdx\n" " popq %rsi\n" " popq %rdi\n"
From: Eric Lin eric.lin@sifive.com
commit 66843b14fb71825fdd73ab12f6594f2243b402be upstream.
Since commit 096b52fd2bb4 ("perf: RISC-V: throttle perf events") the perf_sample_event_took() function was added to report time spent in overflow interrupts. If the interrupt takes too long, the perf framework will lower the sysctl_perf_event_sample_rate and max_samples_per_tick. When hwc->interrupts is larger than max_samples_per_tick, the hwc->interrupts will be set to MAX_INTERRUPTS, and events will be throttled within the __perf_event_account_interrupt() function.
However, the RISC-V PMU driver doesn't call riscv_pmu_stop() to update the PERF_HES_STOPPED flag after perf_event_overflow() in pmu_sbi_ovf_handler() function to avoid throttling. When the perf framework unthrottled the event in the timer interrupt handler, it triggers riscv_pmu_start() function and causes a WARN_ON_ONCE() warning, as shown below:
------------[ cut here ]------------ WARNING: CPU: 0 PID: 240 at drivers/perf/riscv_pmu.c:184 riscv_pmu_start+0x7c/0x8e Modules linked in: CPU: 0 PID: 240 Comm: ls Not tainted 6.4-rc4-g19d0788e9ef2 #1 Hardware name: SiFive (DT) epc : riscv_pmu_start+0x7c/0x8e ra : riscv_pmu_start+0x28/0x8e epc : ffffffff80aef864 ra : ffffffff80aef810 sp : ffff8f80004db6f0 gp : ffffffff81c83750 tp : ffffaf80069f9bc0 t0 : ffff8f80004db6c0 t1 : 0000000000000000 t2 : 000000000000001f s0 : ffff8f80004db720 s1 : ffffaf8008ca1068 a0 : 0000ffffffffffff a1 : 0000000000000000 a2 : 0000000000000001 a3 : 0000000000000870 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000840 a7 : 0000000000000030 s2 : 0000000000000000 s3 : ffffaf8005165800 s4 : ffffaf800424da00 s5 : ffffffffffffffff s6 : ffffffff81cc7590 s7 : 0000000000000000 s8 : 0000000000000006 s9 : 0000000000000001 s10: ffffaf807efbc340 s11: ffffaf807efbbf00 t3 : ffffaf8006a16028 t4 : 00000000dbfbb796 t5 : 0000000700000000 t6 : ffffaf8005269870 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff80aef864>] riscv_pmu_start+0x7c/0x8e [<ffffffff80185b56>] perf_adjust_freq_unthr_context+0x15e/0x174 [<ffffffff80188642>] perf_event_task_tick+0x88/0x9c [<ffffffff800626a8>] scheduler_tick+0xfe/0x27c [<ffffffff800b5640>] update_process_times+0x9a/0xba [<ffffffff800c5bd4>] tick_sched_handle+0x32/0x66 [<ffffffff800c5e0c>] tick_sched_timer+0x64/0xb0 [<ffffffff800b5e50>] __hrtimer_run_queues+0x156/0x2f4 [<ffffffff800b6bdc>] hrtimer_interrupt+0xe2/0x1fe [<ffffffff80acc9e8>] riscv_timer_interrupt+0x38/0x42 [<ffffffff80090a16>] handle_percpu_devid_irq+0x90/0x1d2 [<ffffffff8008a9f4>] generic_handle_domain_irq+0x28/0x36
Other PMU drivers such as Arm, Loongarch, Csky, and Mips neither call *_pmu_stop() to set the PERF_HES_STOPPED flag after perf_event_overflow(), nor check the PERF_HES_STOPPED flag in *_pmu_start(), so they do not hit this warning.
Thus, it's recommended to remove this unnecessary check in riscv_pmu_start() function to prevent this warning.
Signed-off-by: Eric Lin eric.lin@sifive.com Link: https://lore.kernel.org/r/20230710154328.19574-1-eric.lin@sifive.com Fixes: 096b52fd2bb4 ("perf: RISC-V: throttle perf events") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/perf/riscv_pmu.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/perf/riscv_pmu.c +++ b/drivers/perf/riscv_pmu.c @@ -181,9 +181,6 @@ void riscv_pmu_start(struct perf_event * uint64_t max_period = riscv_pmu_ctr_get_width_mask(event); u64 init_val;
- if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED))) - return; - if (flags & PERF_EF_RELOAD) WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
From: Isaac J. Manjarres isaacmanjarres@google.com
commit 963b54df82b6d6206d7def273390bf3f7af558e1 upstream.
When allocating the 2D array for handling IRQ type registers in regmap_add_irq_chip_fwnode(), the intent is to allocate a matrix with num_config_bases rows and num_config_regs columns.
This is currently handled by allocating a buffer to hold a pointer for each row (i.e. num_config_bases). After that, the logic attempts to allocate the memory required to hold the register configuration for each row. However, instead of doing this allocation for each row (i.e. num_config_bases allocations), the logic erroneously does this allocation num_config_regs number of times.
This scenario can lead to out-of-bounds accesses when num_config_regs is greater than num_config_bases. Fix this by updating the terminating condition of the loop that allocates the memory for holding the register configuration to allocate memory only for each row in the matrix.
Amit Pundir reported a crash that was occurring on his db845c device due to memory corruption (see "Closes" tag for Amit's report). The KASAN report below helped narrow it down to this issue:
[ 14.033877][ T1] ================================================================== [ 14.042507][ T1] BUG: KASAN: invalid-access in regmap_add_irq_chip_fwnode+0x594/0x1364 [ 14.050796][ T1] Write of size 8 at addr 06ffff8081021850 by task init/1
[ 14.242004][ T1] The buggy address belongs to the object at ffffff8081021850 [ 14.242004][ T1] which belongs to the cache kmalloc-8 of size 8 [ 14.255669][ T1] The buggy address is located 0 bytes inside of [ 14.255669][ T1] 8-byte region [ffffff8081021850, ffffff8081021858)
Fixes: faa87ce9196d ("regmap-irq: Introduce config registers for irq types") Reported-by: Amit Pundir amit.pundir@linaro.org Closes: https://lore.kernel.org/all/CAMi1Hd04mu6JojT3y6wyN2YeVkPR5R3qnkKJ8iR8if_YByC... Tested-by: John Stultz jstultz@google.com Tested-by: Amit Pundir amit.pundir@linaro.org # tested on Dragonboard 845c Cc: stable@vger.kernel.org # v6.0+ Cc: Aidan MacDonald aidanmacdonald.0x0@gmail.com Cc: Saravana Kannan saravanak@google.com Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: "Isaac J. Manjarres" isaacmanjarres@google.com Link: https://lore.kernel.org/r/20230711193059.2480971-1-isaacmanjarres@google.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/regmap/regmap-irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/base/regmap/regmap-irq.c +++ b/drivers/base/regmap/regmap-irq.c @@ -852,7 +852,7 @@ int regmap_add_irq_chip_fwnode(struct fw if (!d->config_buf) goto err_alloc;
- for (i = 0; i < chip->num_config_regs; i++) { + for (i = 0; i < chip->num_config_bases; i++) { d->config_buf[i] = kcalloc(chip->num_config_regs, sizeof(**d->config_buf), GFP_KERNEL);
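The allocation pattern being fixed can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the regmap code: calloc()/free() stand in for the kernel allocators, and alloc_matrix() and free_matrix() are hypothetical names. The point is that the outer loop runs once per row, while the column count is only used as the size of each row.

```c
#include <stdlib.h>

/* Free a matrix with 'rows' row pointers; safe on partially built ones. */
static void free_matrix(unsigned int **m, size_t rows)
{
	size_t i;

	if (!m)
		return;
	for (i = 0; i < rows; i++)
		free(m[i]);	/* free(NULL) is a no-op */
	free(m);
}

/* Allocate a zeroed rows x cols matrix as an array of row pointers. */
static unsigned int **alloc_matrix(size_t rows, size_t cols)
{
	unsigned int **m = calloc(rows, sizeof(*m));
	size_t i;

	if (!m)
		return NULL;

	/* Loop over rows: the pointer array has exactly 'rows' slots. */
	for (i = 0; i < rows; i++) {
		m[i] = calloc(cols, sizeof(**m));
		if (!m[i]) {
			free_matrix(m, rows);
			return NULL;
		}
	}
	return m;
}
```

Had the loop bound been cols instead of rows, a 2 x 5 matrix would write five row pointers into a two-slot array, which is the kind of out-of-bounds write the KASAN report shows.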
From: Krister Johansen kjlx@templeofstupid.com
commit 1e9cb763e9bacf0c932aa948f50dcfca6f519a26 upstream.
The ENA adapters on our instances occasionally reset. One recently logged a UBSAN failure to console in the process:
UBSAN: shift-out-of-bounds in build/linux/drivers/net/ethernet/amazon/ena/ena_com.c:540:13 shift exponent 32 is too large for 32-bit type 'unsigned int' CPU: 28 PID: 70012 Comm: kworker/u72:2 Kdump: loaded not tainted 5.15.117 Hardware name: Amazon EC2 c5d.9xlarge/, BIOS 1.0 10/16/2017 Workqueue: ena ena_fw_reset_device [ena] Call Trace: <TASK> dump_stack_lvl+0x4a/0x63 dump_stack+0x10/0x16 ubsan_epilogue+0x9/0x36 __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e ? __const_udelay+0x43/0x50 ena_delay_exponential_backoff_us.cold+0x16/0x1e [ena] wait_for_reset_state+0x54/0xa0 [ena] ena_com_dev_reset+0xc8/0x110 [ena] ena_down+0x3fe/0x480 [ena] ena_destroy_device+0xeb/0xf0 [ena] ena_fw_reset_device+0x30/0x50 [ena] process_one_work+0x22b/0x3d0 worker_thread+0x4d/0x3f0 ? process_one_work+0x3d0/0x3d0 kthread+0x12a/0x150 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x22/0x30 </TASK>
Apparently, the reset delays are getting so large they can trigger a UBSAN panic.
Looking at the code, the current timeout is capped at 5000us. Using a base value of 100us, the current code will overflow after (1<<29). Even at values before 32, this function wraps around, perhaps unintentionally.
Cap the value of the exponent used for this backoff at (1<<16) which is larger than currently necessary, but large enough to support bigger values in the future.
Cc: stable@vger.kernel.org Fixes: 4bb7f4cf60e3 ("net: ena: reduce driver load time") Signed-off-by: Krister Johansen kjlx@templeofstupid.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Reviewed-by: Shay Agroskin shayagr@amazon.com Link: https://lore.kernel.org/r/20230711013621.GE1926@templeofstupid.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/amazon/ena/ena_com.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/ethernet/amazon/ena/ena_com.c +++ b/drivers/net/ethernet/amazon/ena/ena_com.c @@ -35,6 +35,8 @@
#define ENA_REGS_ADMIN_INTR_MASK 1
+#define ENA_MAX_BACKOFF_DELAY_EXP 16U + #define ENA_MIN_ADMIN_POLL_US 100
#define ENA_MAX_ADMIN_POLL_US 5000 @@ -536,6 +538,7 @@ static int ena_com_comp_status_to_errno(
static void ena_delay_exponential_backoff_us(u32 exp, u32 delay_us) { + exp = min_t(u32, exp, ENA_MAX_BACKOFF_DELAY_EXP); delay_us = max_t(u32, ENA_MIN_ADMIN_POLL_US, delay_us); delay_us = min_t(u32, delay_us * (1U << exp), ENA_MAX_ADMIN_POLL_US); usleep_range(delay_us, 2 * delay_us);
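The effect of capping the exponent can be tried out in userspace with a pure function that returns the delay instead of sleeping. This is a sketch only: backoff_delay_us() is a hypothetical name, the constants mirror the values in the hunk above, and plain comparisons replace min_t()/max_t().

```c
#include <stdint.h>

#define MAX_BACKOFF_DELAY_EXP	16u	/* mirrors ENA_MAX_BACKOFF_DELAY_EXP */
#define MIN_ADMIN_POLL_US	100u	/* mirrors ENA_MIN_ADMIN_POLL_US */
#define MAX_ADMIN_POLL_US	5000u	/* mirrors ENA_MAX_ADMIN_POLL_US */

/* Compute the clamped backoff delay in microseconds (no sleeping here). */
static uint32_t backoff_delay_us(uint32_t exp, uint32_t delay_us)
{
	/* Cap the exponent first so the shift below is always defined. */
	if (exp > MAX_BACKOFF_DELAY_EXP)
		exp = MAX_BACKOFF_DELAY_EXP;

	if (delay_us < MIN_ADMIN_POLL_US)
		delay_us = MIN_ADMIN_POLL_US;

	delay_us *= UINT32_C(1) << exp;

	if (delay_us > MAX_ADMIN_POLL_US)
		delay_us = MAX_ADMIN_POLL_US;

	return delay_us;
}
```

With a base of 100us the result already reaches the 5000us ceiling at exp = 6, so the cap at 16 changes no practical delay; it only keeps the shift within the width of the type. Like the original, the sketch does not guard against the multiplication wrapping for very large delay_us inputs.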
From: Zheng Yejian zhengyejian1@huawei.com
commit 7e42907f3a7b4ce3a2d1757f6d78336984daf8f5 upstream.
Soft lockup occurs when reading file 'trace_pipe':
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488] [...] RIP: 0010:ring_buffer_empty_cpu+0xed/0x170 RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218 RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901 R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000 [...] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __find_next_entry+0x1a8/0x4b0 ? peek_next_entry+0x250/0x250 ? down_write+0xa5/0x120 ? down_write_killable+0x130/0x130 trace_find_next_entry_inc+0x3b/0x1d0 tracing_read_pipe+0x423/0xae0 ? tracing_splice_read_pipe+0xcb0/0xcb0 vfs_read+0x16b/0x490 ksys_read+0x105/0x210 ? __ia32_sys_pwrite64+0x200/0x200 ? switch_fpu_return+0x108/0x220 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x61/0xc6
Through the vmcore, I found that in tracing_read_pipe(), ring_buffer_empty_cpu() reports that some buffer is not empty, but nothing can be read from it because "rb_num_of_entries() == 0" is always true. The procedure then loops forever because the user buffer is never filled; see the following code path:

  tracing_read_pipe() {
    ... ...
    waitagain:
      tracing_wait_pipe() // 1. find non-empty buffer here
      trace_find_next_entry_inc() // 2. loop here try to find an entry
        __find_next_entry()
          ring_buffer_empty_cpu(); // 3. find non-empty buffer
          peek_next_entry() // 4. but peek always return NULL
            ring_buffer_peek()
              rb_buffer_peek()
                rb_get_reader_page()
                  // 5. because rb_num_of_entries() == 0 always true here
                  //    then return NULL
      // 6. user buffer not been filled so goto 'waitagain'
      //    and eventually leads to an endless loop in kernel!!!
  }

After some analysis, I found that when the ring buffer is reset, the 'entries' of its pages are not all cleared (see rb_reset_cpu()). When the ring buffer is then shrunk, any removed pages that still hold stale 'entries' data have it added to 'cpu_buffer->overrun' (see rb_remove_pages()), which produces a wrong 'overrun' count and eventually causes the endless loop.

To fix it, we need to clear every page in rb_reset_cpu().
Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyej...
Cc: stable@vger.kernel.org Fixes: a5fb833172eca ("ring-buffer: Fix uninitialized read_stamp") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ring_buffer.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-)
--- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -5242,28 +5242,34 @@ unsigned long ring_buffer_size(struct tr } EXPORT_SYMBOL_GPL(ring_buffer_size);
+static void rb_clear_buffer_page(struct buffer_page *page) +{ + local_set(&page->write, 0); + local_set(&page->entries, 0); + rb_init_page(page->page); + page->read = 0; +} + static void rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer) { + struct buffer_page *page; + rb_head_page_deactivate(cpu_buffer);
cpu_buffer->head_page = list_entry(cpu_buffer->pages, struct buffer_page, list); - local_set(&cpu_buffer->head_page->write, 0); - local_set(&cpu_buffer->head_page->entries, 0); - local_set(&cpu_buffer->head_page->page->commit, 0); - - cpu_buffer->head_page->read = 0; + rb_clear_buffer_page(cpu_buffer->head_page); + list_for_each_entry(page, cpu_buffer->pages, list) { + rb_clear_buffer_page(page); + }
cpu_buffer->tail_page = cpu_buffer->head_page; cpu_buffer->commit_page = cpu_buffer->head_page;
INIT_LIST_HEAD(&cpu_buffer->reader_page->list); INIT_LIST_HEAD(&cpu_buffer->new_pages); - local_set(&cpu_buffer->reader_page->write, 0); - local_set(&cpu_buffer->reader_page->entries, 0); - local_set(&cpu_buffer->reader_page->page->commit, 0); - cpu_buffer->reader_page->read = 0; + rb_clear_buffer_page(cpu_buffer->reader_page);
local_set(&cpu_buffer->entries_bytes, 0); local_set(&cpu_buffer->overrun, 0);
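The idea of the fix, clearing the per-page counters on every page of the ring rather than only on the head page, can be sketched with a simplified circular list. This is an assumption-laden illustration: struct page_node and the helpers below are hypothetical and much simpler than struct buffer_page and the kernel's list API.

```c
#include <stddef.h>

/* Simplified stand-in for a ring buffer page (hypothetical layout). */
struct page_node {
	struct page_node *next;	/* circular: last page points to the first */
	unsigned long entries;
	unsigned long write;
	unsigned long read;
};

static void clear_page_node(struct page_node *p)
{
	p->entries = 0;
	p->write = 0;
	p->read = 0;
}

/* Clear the head page and then every other page in the ring. */
static void reset_all_pages(struct page_node *head)
{
	struct page_node *p;

	clear_page_node(head);
	for (p = head->next; p != head; p = p->next)
		clear_page_node(p);
}

/* Sum of 'entries' over the ring; nonzero after a reset means stale data. */
static unsigned long stale_entries(struct page_node *head)
{
	unsigned long sum = head->entries;
	struct page_node *p;

	for (p = head->next; p != head; p = p->next)
		sum += p->entries;
	return sum;
}
```

In this sketch the head page is cleared separately because the walk starts at its successor and stops when it comes back around, which is one way to read why the patch calls rb_clear_buffer_page() on head_page before iterating the list.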
From: Zheng Yejian zhengyejian1@huawei.com
commit 26efd79c4624294e553aeaa3439c646729bad084 upstream.
As the comment in ftrace_process_locs() notes, there may be NULL pointers in the mcount_loc section:
Some architecture linkers will pad between the different mcount_loc sections of different object files to satisfy alignments. Skip any NULL pointers.
After commit 20e5227e9f55 ("ftrace: allow NULL pointers in mcount_loc"), NULL pointers will be accounted when allocating ftrace pages but skipped before adding into ftrace pages, this may result in some pages not being used. Then after commit 706c81f87f84 ("ftrace: Remove extra helper functions"), warning may occur at: WARN_ON(pg->next);
To fix it, only warn when no pointers were skipped but the pages were still not used up, then free the unused pages after releasing ftrace_lock.
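The accounting problem described above can be sketched in user space. This is only a model under stated assumptions: `struct page_group`, `RECS_PER_GROUP`, `alloc_groups()`, `fill_groups()` and `free_groups()` are hypothetical names for illustration, not the kernel's structures or helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RECS_PER_GROUP 4	/* hypothetical capacity, far smaller than a real page */

struct page_group {
	unsigned long recs[RECS_PER_GROUP];
	int index;			/* number of records in use */
	struct page_group *next;
};

/* Allocate enough groups for `count` entries, NULL padding included. */
static struct page_group *alloc_groups(size_t count)
{
	struct page_group *head = NULL, **tail = &head;
	size_t n = (count + RECS_PER_GROUP - 1) / RECS_PER_GROUP;

	while (n--) {
		*tail = calloc(1, sizeof(**tail));
		assert(*tail != NULL);
		tail = &(*tail)->next;
	}
	return head;
}

/* Free a chain of groups and report how many were released. */
static size_t free_groups(struct page_group *pg)
{
	size_t freed = 0;

	while (pg) {
		struct page_group *next = pg->next;

		free(pg);
		pg = next;
		freed++;
	}
	return freed;
}

/*
 * Store the non-NULL addresses. Groups that were never reached are
 * detached and handed back through `unused`; `skipped` counts the
 * NULL entries that were dropped.
 */
static void fill_groups(struct page_group *pg, const unsigned long *addrs,
			size_t count, size_t *skipped,
			struct page_group **unused)
{
	assert(pg != NULL);
	*skipped = 0;
	for (size_t i = 0; i < count; i++) {
		if (!addrs[i]) {
			(*skipped)++;
			continue;
		}
		if (pg->index == RECS_PER_GROUP)
			pg = pg->next;
		pg->recs[pg->index++] = addrs[i];
	}
	*unused = pg->next;
	pg->next = NULL;
}
```

With eight entries of which four are NULL, two groups are allocated but only one is filled, so the second comes back through `unused`. A leftover group with `skipped == 0` would be the case that still deserves a warning.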
Link: https://lore.kernel.org/linux-trace-kernel/20230712060452.3175675-1-zhengyej...
Cc: stable@vger.kernel.org Fixes: 706c81f87f84 ("ftrace: Remove extra helper functions") Suggested-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ftrace.c | 45 +++++++++++++++++++++++++++++++-------------- 1 file changed, 31 insertions(+), 14 deletions(-)
--- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -3305,6 +3305,22 @@ static int ftrace_allocate_records(struc return cnt; }
+static void ftrace_free_pages(struct ftrace_page *pages) +{ + struct ftrace_page *pg = pages; + + while (pg) { + if (pg->records) { + free_pages((unsigned long)pg->records, pg->order); + ftrace_number_of_pages -= 1 << pg->order; + } + pages = pg->next; + kfree(pg); + pg = pages; + ftrace_number_of_groups--; + } +} + static struct ftrace_page * ftrace_allocate_pages(unsigned long num_to_init) { @@ -3343,17 +3359,7 @@ ftrace_allocate_pages(unsigned long num_ return start_pg;
free_pages: - pg = start_pg; - while (pg) { - if (pg->records) { - free_pages((unsigned long)pg->records, pg->order); - ftrace_number_of_pages -= 1 << pg->order; - } - start_pg = pg->next; - kfree(pg); - pg = start_pg; - ftrace_number_of_groups--; - } + ftrace_free_pages(start_pg); pr_info("ftrace: FAILED to allocate memory for functions\n"); return NULL; } @@ -6434,9 +6440,11 @@ static int ftrace_process_locs(struct mo unsigned long *start, unsigned long *end) { + struct ftrace_page *pg_unuse = NULL; struct ftrace_page *start_pg; struct ftrace_page *pg; struct dyn_ftrace *rec; + unsigned long skipped = 0; unsigned long count; unsigned long *p; unsigned long addr; @@ -6499,8 +6507,10 @@ static int ftrace_process_locs(struct mo * object files to satisfy alignments. * Skip any NULL pointers. */ - if (!addr) + if (!addr) { + skipped++; continue; + }
end_offset = (pg->index+1) * sizeof(pg->records[0]); if (end_offset > PAGE_SIZE << pg->order) { @@ -6514,8 +6524,10 @@ static int ftrace_process_locs(struct mo rec->ip = addr; }
- /* We should have used all pages */ - WARN_ON(pg->next); + if (pg->next) { + pg_unuse = pg->next; + pg->next = NULL; + }
/* Assign the last page to ftrace_pages */ ftrace_pages = pg; @@ -6537,6 +6549,11 @@ static int ftrace_process_locs(struct mo out: mutex_unlock(&ftrace_lock);
+ /* We should have used all pages unless we skipped some */ + if (pg_unuse) { + WARN_ON(!skipped); + ftrace_free_pages(pg_unuse); + } return ret; }
From: Evan Quan evan.quan@amd.com
commit dcb489bae65d92cfd26da22c7a0d6665b06ecc63 upstream.
Share the PCIe parameter update code so that SMU13.0.0 and SMU13.0.7 do not each need their own copy.
Signed-off-by: Evan Quan evan.quan@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 4 ++ drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 31 +++++++++++++++++ drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 33 ------------------- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c | 33 ------------------- 4 files changed, 37 insertions(+), 64 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h @@ -303,5 +303,9 @@ int smu_v13_0_get_pptable_from_firmware( uint32_t *size, uint32_t pptable_id);
+int smu_v13_0_update_pcie_parameters(struct smu_context *smu, + uint32_t pcie_gen_cap, + uint32_t pcie_width_cap); + #endif #endif --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2453,3 +2453,34 @@ int smu_v13_0_mode1_reset(struct smu_con
return ret; } + +int smu_v13_0_update_pcie_parameters(struct smu_context *smu, + uint32_t pcie_gen_cap, + uint32_t pcie_width_cap) +{ + struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; + struct smu_13_0_pcie_table *pcie_table = + &dpm_context->dpm_tables.pcie_table; + uint32_t smu_pcie_arg; + int ret, i; + + for (i = 0; i < pcie_table->num_of_link_levels; i++) { + if (pcie_table->pcie_gen[i] > pcie_gen_cap) + pcie_table->pcie_gen[i] = pcie_gen_cap; + if (pcie_table->pcie_lane[i] > pcie_width_cap) + pcie_table->pcie_lane[i] = pcie_width_cap; + + smu_pcie_arg = i << 16; + smu_pcie_arg |= pcie_table->pcie_gen[i] << 8; + smu_pcie_arg |= pcie_table->pcie_lane[i]; + + ret = smu_cmn_send_smc_msg_with_param(smu, + SMU_MSG_OverridePcieParameters, + smu_pcie_arg, + NULL); + if (ret) + return ret; + } + + return 0; +} --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c @@ -1235,37 +1235,6 @@ static int smu_v13_0_0_force_clk_levels( return ret; }
-static int smu_v13_0_0_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap) -{ - struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; - struct smu_13_0_pcie_table *pcie_table = - &dpm_context->dpm_tables.pcie_table; - uint32_t smu_pcie_arg; - int ret, i; - - for (i = 0; i < pcie_table->num_of_link_levels; i++) { - if (pcie_table->pcie_gen[i] > pcie_gen_cap) - pcie_table->pcie_gen[i] = pcie_gen_cap; - if (pcie_table->pcie_lane[i] > pcie_width_cap) - pcie_table->pcie_lane[i] = pcie_width_cap; - - smu_pcie_arg = i << 16; - smu_pcie_arg |= pcie_table->pcie_gen[i] << 8; - smu_pcie_arg |= pcie_table->pcie_lane[i]; - - ret = smu_cmn_send_smc_msg_with_param(smu, - SMU_MSG_OverridePcieParameters, - smu_pcie_arg, - NULL); - if (ret) - return ret; - } - - return 0; -} - static const struct smu_temperature_range smu13_thermal_policy[] = { {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000}, { 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000}, @@ -2172,7 +2141,7 @@ static const struct pptable_funcs smu_v1 .feature_is_enabled = smu_cmn_feature_is_enabled, .print_clk_levels = smu_v13_0_0_print_clk_levels, .force_clk_levels = smu_v13_0_0_force_clk_levels, - .update_pcie_parameters = smu_v13_0_0_update_pcie_parameters, + .update_pcie_parameters = smu_v13_0_update_pcie_parameters, .get_thermal_temperature_range = smu_v13_0_0_get_thermal_temperature_range, .register_irq_handler = smu_v13_0_register_irq_handler, .enable_thermal_alert = smu_v13_0_enable_thermal_alert, --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c @@ -1225,37 +1225,6 @@ static int smu_v13_0_7_force_clk_levels( return ret; }
-static int smu_v13_0_7_update_pcie_parameters(struct smu_context *smu, - uint32_t pcie_gen_cap, - uint32_t pcie_width_cap) -{ - struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; - struct smu_13_0_pcie_table *pcie_table = - &dpm_context->dpm_tables.pcie_table; - uint32_t smu_pcie_arg; - int ret, i; - - for (i = 0; i < pcie_table->num_of_link_levels; i++) { - if (pcie_table->pcie_gen[i] > pcie_gen_cap) - pcie_table->pcie_gen[i] = pcie_gen_cap; - if (pcie_table->pcie_lane[i] > pcie_width_cap) - pcie_table->pcie_lane[i] = pcie_width_cap; - - smu_pcie_arg = i << 16; - smu_pcie_arg |= pcie_table->pcie_gen[i] << 8; - smu_pcie_arg |= pcie_table->pcie_lane[i]; - - ret = smu_cmn_send_smc_msg_with_param(smu, - SMU_MSG_OverridePcieParameters, - smu_pcie_arg, - NULL); - if (ret) - return ret; - } - - return 0; -} - static const struct smu_temperature_range smu13_thermal_policy[] = { {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000}, @@ -1752,7 +1721,7 @@ static const struct pptable_funcs smu_v1 .feature_is_enabled = smu_cmn_feature_is_enabled, .print_clk_levels = smu_v13_0_7_print_clk_levels, .force_clk_levels = smu_v13_0_7_force_clk_levels, - .update_pcie_parameters = smu_v13_0_7_update_pcie_parameters, + .update_pcie_parameters = smu_v13_0_update_pcie_parameters, .get_thermal_temperature_range = smu_v13_0_7_get_thermal_temperature_range, .register_irq_handler = smu_v13_0_register_irq_handler, .enable_thermal_alert = smu_v13_0_enable_thermal_alert,
From: Mario Limonciello mario.limonciello@amd.com
commit 31c7a3b378a136adc63296a2ff17645896fcf303 upstream.
Intel platforms such as Sapphire Rapids and Raptor Lake don't support dynamic pcie lane or speed switching.
This limitation seems to carry over from one generation to another. To be safer, disable dynamic pcie lane width and speed switching when running on an Intel platform.
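The difference between the two code paths can be sketched in plain C. This is a simplified model, not the driver code: `struct pcie_table`, `MAX_LEVELS`, `apply_pcie_caps()` and `pack_pcie_arg()` are hypothetical names, and the gen and lane numbers used with it are illustrative rather than real hardware encodings. The bit layout in `pack_pcie_arg()` follows the `smu_pcie_arg` construction in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LEVELS 3	/* hypothetical table size for this sketch */

struct pcie_table {
	uint32_t num_levels;
	uint32_t gen[MAX_LEVELS];
	uint32_t lane[MAX_LEVELS];
};

/* Level index in bits 16 and up, gen in bits 8-15, lane in bits 0-7. */
static uint32_t pack_pcie_arg(uint32_t level, uint32_t gen, uint32_t lane)
{
	return (level << 16) | (gen << 8) | lane;
}

static void apply_pcie_caps(struct pcie_table *t, uint32_t gen_cap,
			    uint32_t lane_cap, bool dynamic_switching)
{
	uint32_t i, top;

	assert(t->num_levels > 0 && t->num_levels <= MAX_LEVELS);
	top = t->num_levels - 1;

	if (!dynamic_switching) {
		/* Never exceed what the highest table level already allows. */
		if (t->gen[top] < gen_cap)
			gen_cap = t->gen[top];
		if (t->lane[top] < lane_cap)
			lane_cap = t->lane[top];

		/* Pin every level to the same settings. */
		for (i = 0; i < t->num_levels; i++) {
			t->gen[i] = gen_cap;
			t->lane[i] = lane_cap;
		}
	} else {
		/* Levels stay distinct; each one is only clamped to the caps. */
		for (i = 0; i < t->num_levels; i++) {
			if (t->gen[i] > gen_cap)
				t->gen[i] = gen_cap;
			if (t->lane[i] > lane_cap)
				t->lane[i] = lane_cap;
		}
	}
}
```

With dynamic switching disabled, every level ends up with identical values, so the link has nothing to switch between at runtime.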
Link: https://edc.intel.com/content/www/us/en/design/products/platforms/details/ra... Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2663 Co-developed-by: Evan Quan evan.quan@amd.com Signed-off-by: Evan Quan evan.quan@amd.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 42 +++++++++++++++++++++++-- 1 file changed, 39 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2454,6 +2454,25 @@ int smu_v13_0_mode1_reset(struct smu_con return ret; }
+/* + * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic + * speed switching. Until we have confirmation from Intel that a specific host + * supports it, it's safer that we keep it disabled for all. + * + * https://edc.intel.com/content/www/us/en/design/products/platforms/details/ra... + * https://gitlab.freedesktop.org/drm/amd/-/issues/2663 + */ +static bool smu_v13_0_is_pcie_dynamic_switching_supported(void) +{ +#if IS_ENABLED(CONFIG_X86) + struct cpuinfo_x86 *c = &cpu_data(0); + + if (c->x86_vendor == X86_VENDOR_INTEL) + return false; +#endif + return true; +} + int smu_v13_0_update_pcie_parameters(struct smu_context *smu, uint32_t pcie_gen_cap, uint32_t pcie_width_cap) @@ -2461,15 +2480,32 @@ int smu_v13_0_update_pcie_parameters(str struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; struct smu_13_0_pcie_table *pcie_table = &dpm_context->dpm_tables.pcie_table; + int num_of_levels = pcie_table->num_of_link_levels; uint32_t smu_pcie_arg; int ret, i;
- for (i = 0; i < pcie_table->num_of_link_levels; i++) { - if (pcie_table->pcie_gen[i] > pcie_gen_cap) + if (!smu_v13_0_is_pcie_dynamic_switching_supported()) { + if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap) + pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1]; + + if (pcie_table->pcie_lane[num_of_levels - 1] < pcie_width_cap) + pcie_width_cap = pcie_table->pcie_lane[num_of_levels - 1]; + + /* Force all levels to use the same settings */ + for (i = 0; i < num_of_levels; i++) { pcie_table->pcie_gen[i] = pcie_gen_cap; - if (pcie_table->pcie_lane[i] > pcie_width_cap) pcie_table->pcie_lane[i] = pcie_width_cap; + } + } else { + for (i = 0; i < num_of_levels; i++) { + if (pcie_table->pcie_gen[i] > pcie_gen_cap) + pcie_table->pcie_gen[i] = pcie_gen_cap; + if (pcie_table->pcie_lane[i] > pcie_width_cap) + pcie_table->pcie_lane[i] = pcie_width_cap; + } + }
+ for (i = 0; i < num_of_levels; i++) { smu_pcie_arg = i << 16; smu_pcie_arg |= pcie_table->pcie_gen[i] << 8; smu_pcie_arg |= pcie_table->pcie_lane[i];
From: Bharath SM bharathsm@microsoft.com
commit df9d70c18616760c6504b97fec66b6379c172dbb upstream.
If the deferred close timeout value is set to 0, there is no need to add files to the deferred close list and use the delayed worker to close them. Instead, we can close them immediately.
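The resulting decision can be written as a single predicate. The sketch below is a user-space model: `struct close_ctx` and `should_defer_close()` are hypothetical names, and each boolean stands in for one of the conditions tested in cifs_close().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattening of the state that cifs_close() inspects. */
struct close_ctx {
	unsigned int closetimeo;	/* deferred close timeout; 0 disables deferral */
	bool has_rhw_oplock;		/* oplock == CIFS_CACHE_RHW_FLG */
	bool lease_granted;
	bool close_on_lock;		/* CIFS_INO_CLOSE_ON_LOCK is set */
	bool have_dclose;		/* the bookkeeping allocation succeeded */
};

static bool should_defer_close(const struct close_ctx *c)
{
	assert(c != NULL);
	return c->closetimeo != 0 &&
	       c->has_rhw_oplock &&
	       c->lease_granted &&
	       !c->close_on_lock &&
	       c->have_dclose;
}
```

Only the first term is new in this patch. With a timeout of 0 the predicate is false regardless of the other conditions, so the file is closed right away.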
Signed-off-by: Bharath SM bharathsm@microsoft.com Reviewed-by: Shyam Prasad N sprasad@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -1080,8 +1080,8 @@ int cifs_close(struct inode *inode, stru
 		cfile = file->private_data;
 		file->private_data = NULL;
 		dclose = kmalloc(sizeof(struct cifs_deferred_close), GFP_KERNEL);
-		if ((cinode->oplock == CIFS_CACHE_RHW_FLG) &&
-		    cinode->lease_granted &&
+		if ((cifs_sb->ctx->closetimeo && cinode->oplock == CIFS_CACHE_RHW_FLG)
+		    && cinode->lease_granted &&
 		    !test_bit(CIFS_INO_CLOSE_ON_LOCK, &cinode->flags) &&
 		    dclose) {
 			if (test_and_clear_bit(CIFS_INO_MODIFIED_ATTR, &cinode->flags)) {
From: Max Filippov jcmvbkbc@gmail.com
commit bc8d5916541fa19ca5bc598eb51a5f78eb891a36 upstream.
split_if_spec expects a NULL pointer as an end marker for the argument list, but tuntap_probe never supplied that terminating NULL. As a result, an incorrectly formatted interface specification string may cause a crash through a random memory access. Fix that by adding a NULL terminator to the split_if_spec argument list.
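The sentinel convention is easy to get wrong, so here is a minimal user-space sketch of a NULL-terminated variadic splitter. `split_fields()` is a hypothetical function written for illustration; it is modeled loosely on the behavior described above and is not a copy of split_if_spec.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a comma-separated string in place. Each variadic argument is a
 * `char **` output slot. The list MUST end with a NULL pointer, because
 * that is the only way the loop can tell where the arguments stop.
 * Returns the unparsed remainder, or NULL if the input was used up.
 */
static char *split_fields(char *str, ...)
{
	char **slot, *end, *rest = str;
	va_list ap;

	assert(str != NULL);
	va_start(ap, str);
	while ((slot = va_arg(ap, char **)) != NULL) {
		if (*rest == '\0') {
			rest = NULL;
			break;
		}
		end = strchr(rest, ',');
		if (end != rest)
			*slot = rest;	/* non-empty field */
		if (end == NULL) {
			rest = NULL;	/* last field, nothing left over */
			break;
		}
		*end = '\0';
		rest = end + 1;
	}
	va_end(ap);
	return rest;
}
```

A call looks like `split_fields(spec, &mac, &dev, (char **)NULL)`. Without the final argument, `va_arg()` keeps reading whatever happens to follow the real arguments, which is the random memory access the commit message refers to.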
Cc: stable@vger.kernel.org Fixes: 7282bee78798 ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 8") Signed-off-by: Max Filippov jcmvbkbc@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/xtensa/platforms/iss/network.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/xtensa/platforms/iss/network.c
+++ b/arch/xtensa/platforms/iss/network.c
@@ -237,7 +237,7 @@ static int tuntap_probe(struct iss_net_p
 
 	init += sizeof(TRANSPORT_TUNTAP_NAME) - 1;
 	if (*init == ',') {
-		rem = split_if_spec(init + 1, &mac_str, &dev_name);
+		rem = split_if_spec(init + 1, &mac_str, &dev_name, NULL);
 		if (rem != NULL) {
 			pr_err("%s: extra garbage on specification : '%s'\n",
 			       dev->name, rem);
From: Namhyung Kim namhyung@kernel.org
commit 27c68c216ee1f1b086e789a64486e6511e380b8a upstream.
On SPR, the load latency event needs an auxiliary event in the same group to work properly. There's a check in intel_pmu_hw_config() for this to iterate sibling events and find a mem-loads-aux event.
for_each_sibling_event() has a lockdep assertion that checks that either hardirqs are disabled or leader->ctx->mutex is held. This works well if the given event has a separate leader event, since perf_try_init_event() grabs the leader->ctx->mutex to protect the sibling list. But it can cause a problem when the event itself is a leader, since the event is not initialized yet and there's no ctx for the event.
Actually I got a lockdep warning when I ran the command below on SPR, but I guess it could also be a NULL pointer dereference.
$ perf record -d -e cpu/mem-loads/uP true
The code path to the warning is:
  sys_perf_event_open()
    perf_event_alloc()
      perf_init_event()
        perf_try_init_event()
          x86_pmu_event_init()
            hsw_hw_config()
              intel_pmu_hw_config()
                for_each_sibling_event()
                  lockdep_assert_event_ctx()
We don't need for_each_sibling_event() when it's a standalone event. Let's return the error code directly.
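The control flow of the check can be modeled without any perf internals. In this sketch `struct event` and `check_mem_loads_group()` are hypothetical names; the real code works on struct perf_event and its sibling list, and the early return is the part this patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct perf_event. */
struct event {
	struct event *leader;		/* points to itself for a group leader */
	struct event *next_sibling;	/* on a leader: first sibling in the group */
	bool is_aux;			/* true for the mem-loads-aux event */
};

/* Return 0 if an aux event is present in the group, -ENODATA otherwise. */
static int check_mem_loads_group(const struct event *ev)
{
	const struct event *s;

	assert(ev != NULL && ev->leader != NULL);

	/* A standalone event has no group yet, so no aux event precedes it. */
	if (ev->leader == ev)
		return -ENODATA;

	if (ev->leader->is_aux)
		return 0;

	for (s = ev->leader->next_sibling; s; s = s->next_sibling)
		if (s->is_aux)
			return 0;

	return -ENODATA;
}
```

The early return means the sibling walk, and with it the lockdep assertion, is only reached for events that already belong to an initialized group.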
Fixes: f3c0eba28704 ("perf: Add a few assertions") Reported-by: Greg Thelen gthelen@google.com Signed-off-by: Namhyung Kim namhyung@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20230704181516.3293665-1-namhyung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/events/intel/core.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3993,6 +3993,13 @@ static int intel_pmu_hw_config(struct pe
 		struct perf_event *leader = event->group_leader;
 		struct perf_event *sibling = NULL;
 
+		/*
+		 * When this memload event is also the first event (no group
+		 * exists yet), then there is no aux event before it.
+		 */
+		if (leader == event)
+			return -ENODATA;
+
 		if (!is_mem_loads_aux_event(leader)) {
 			for_each_sibling_event(sibling, leader) {
 				if (is_mem_loads_aux_event(sibling))
From: Chungkai Yang Chung-kai.Yang@mediatek.com
commit 3a8395b565b5b4f019b3dc182be4c4541eb35ac8 upstream.
Commit 8d36694245f2 ("PM: QoS: Add check to make sure CPU freq is non-negative") rejects negative CPU frequency values so that they are not converted to an unsigned data type. However, PM_QOS_DEFAULT_VALUE is a special case: when it is passed, pm_qos_update_target uses c->default_value instead, which is set to FREQ_QOS_MIN/MAX_DEFAULT_VALUE when cpufreq_policy_alloc is executed.
Add a check for PM_QOS_DEFAULT_VALUE so that the default setting keeps working.
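A stand-alone sketch of the validity rule follows. `DEFAULT_SENTINEL` stands in for PM_QOS_DEFAULT_VALUE and is assumed here to be -1; `value_invalid()` mirrors the helper added by the patch, while `resolve_value()` is a hypothetical illustration of how a sentinel gets replaced by a stored default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_SENTINEL (-1)	/* assumed value of PM_QOS_DEFAULT_VALUE */

/* Negative values are rejected, except for the "use the default" sentinel. */
static bool value_invalid(int32_t value)
{
	return value < 0 && value != DEFAULT_SENTINEL;
}

/* Hypothetical helper: map the sentinel to the constraint's default. */
static int32_t resolve_value(int32_t value, int32_t default_value)
{
	assert(!value_invalid(value));
	return value == DEFAULT_SENTINEL ? default_value : value;
}
```

Before the fix, the plain `value < 0` test rejected the sentinel as well, so a request for the default setting failed with -EINVAL.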
Fixes: 8d36694245f2 ("PM: QoS: Add check to make sure CPU freq is non-negative") Link: https://lore.kernel.org/lkml/20230626035144.19717-1-Chung-kai.Yang@mediatek.... Link: https://lore.kernel.org/lkml/20230627071727.16646-1-Chung-kai.Yang@mediatek.... Link: https://lore.kernel.org/lkml/CAJZ5v0gxNOWhC58PHeUhW_tgf6d1fGJVZ1x91zkDdht11y... Signed-off-by: Chungkai Yang Chung-kai.Yang@mediatek.com Cc: 6.0+ stable@vger.kernel.org # 6.0+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/power/qos.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -426,6 +426,11 @@ late_initcall(cpu_latency_qos_init);
 
 /* Definitions related to the frequency QoS below. */
 
+static inline bool freq_qos_value_invalid(s32 value)
+{
+	return value < 0 && value != PM_QOS_DEFAULT_VALUE;
+}
+
 /**
  * freq_constraints_init - Initialize frequency QoS constraints.
  * @qos: Frequency QoS constraints to initialize.
@@ -531,7 +536,7 @@ int freq_qos_add_request(struct freq_con
 {
 	int ret;
 
-	if (IS_ERR_OR_NULL(qos) || !req || value < 0)
+	if (IS_ERR_OR_NULL(qos) || !req || freq_qos_value_invalid(value))
 		return -EINVAL;
 
 	if (WARN(freq_qos_request_active(req),
@@ -563,7 +568,7 @@ EXPORT_SYMBOL_GPL(freq_qos_add_request);
  */
 int freq_qos_update_request(struct freq_qos_request *req, s32 new_value)
 {
-	if (!req || new_value < 0)
+	if (!req || freq_qos_value_invalid(new_value))
 		return -EINVAL;
 
 	if (WARN(!freq_qos_request_active(req),
From: Heiner Kallweit hkallweit1@gmail.com
commit 6b9352f3f8a1a35faf0efc1ad1807ee303467796 upstream.
I don't see a reason why we should treat the case lo < hi differently and return 0 as period and duty_cycle. The current logic was added with c375bcbaabdb ("pwm: meson: Read the full hardware state in meson_pwm_get_state()"); Martin, the original author, doesn't remember why it was implemented this way back then. So let's handle it as a normal use case and also remove the optimization for lo == 0. I think the improved readability is worth it.
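The simplified readback amounts to two conversions. In the sketch below `struct pwm_readback`, `cnt_to_ns()` and `read_state()` are hypothetical names, and the tick length is passed in directly rather than derived from the clock rate and pre-divider as the driver does.

```c
#include <assert.h>
#include <stdint.h>

struct pwm_readback {
	uint64_t period_ns;
	uint64_t duty_ns;
};

/* Convert a counter value to nanoseconds for a given tick length. */
static uint64_t cnt_to_ns(uint32_t cnt, uint64_t tick_ns)
{
	return (uint64_t)cnt * tick_ns;
}

/* period = lo + hi and duty = hi, with no special cases. */
static struct pwm_readback read_state(uint16_t lo, uint16_t hi,
				      uint64_t tick_ns)
{
	struct pwm_readback s;

	assert(tick_ns > 0);
	s.period_ns = cnt_to_ns((uint32_t)lo + hi, tick_ns);
	s.duty_ns = cnt_to_ns(hi, tick_ns);
	return s;
}
```

For lo = 10, hi = 30 and a 100 ns tick this yields a 4000 ns period and a 3000 ns duty cycle, where the old code reported 0 for both. The lo == 0 case needs no branch either, since the general formula already gives period == duty.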
Fixes: c375bcbaabdb ("pwm: meson: Read the full hardware state in meson_pwm_get_state()") Reviewed-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Dmitry Rokosov ddrokosov@sberdevices.ru Acked-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-meson.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-)
--- a/drivers/pwm/pwm-meson.c
+++ b/drivers/pwm/pwm-meson.c
@@ -351,18 +351,8 @@ static int meson_pwm_get_state(struct pw
 	channel->lo = FIELD_GET(PWM_LOW_MASK, value);
 	channel->hi = FIELD_GET(PWM_HIGH_MASK, value);
 
-	if (channel->lo == 0) {
-		state->period = meson_pwm_cnt_to_ns(chip, pwm, channel->hi);
-		state->duty_cycle = state->period;
-	} else if (channel->lo >= channel->hi) {
-		state->period = meson_pwm_cnt_to_ns(chip, pwm,
-						    channel->lo + channel->hi);
-		state->duty_cycle = meson_pwm_cnt_to_ns(chip, pwm,
-							channel->hi);
-	} else {
-		state->period = 0;
-		state->duty_cycle = 0;
-	}
+	state->period = meson_pwm_cnt_to_ns(chip, pwm, channel->lo + channel->hi);
+	state->duty_cycle = meson_pwm_cnt_to_ns(chip, pwm, channel->hi);
 
 	state->polarity = PWM_POLARITY_NORMAL;
 
From: Heiner Kallweit hkallweit1@gmail.com
commit 87a2cbf02d7701255f9fcca7e5bd864a7bb397cf upstream.
state->period/duty are of type u64, and if their value is greater than UINT_MAX, then the cast to uint will cause problems. Fix this by changing the type of the respective local variables to u64.
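The effect of the narrowing cast can be shown with a small calculation. `period_to_cnt_u32()` and `period_to_cnt_u64()` are hypothetical helpers that reduce the driver's arithmetic to one division; the 1000 Hz input clock in the usage note is chosen only to keep the numbers small.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Old behavior: the period passes through a 32-bit local first. */
static uint64_t period_to_cnt_u32(uint64_t fin_freq, uint64_t period_ns)
{
	uint32_t period = (uint32_t)period_ns;	/* silently drops the high bits */

	return fin_freq * (uint64_t)period / NSEC_PER_SEC;
}

/* Fixed behavior: keep the full 64-bit period throughout. */
static uint64_t period_to_cnt_u64(uint64_t fin_freq, uint64_t period_ns)
{
	/* The sketch does not handle overflow of the product. */
	assert(fin_freq > 0 && period_ns <= UINT64_MAX / fin_freq);
	return fin_freq * period_ns / NSEC_PER_SEC;
}
```

A 5 s period is 5000000000 ns, which does not fit in 32 bits and truncates to 705032704. At 1000 Hz the 64-bit version returns 5000 ticks while the truncating version returns 705.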
Fixes: b79c3670e120 ("pwm: meson: Don't duplicate the polarity internally") Cc: stable@vger.kernel.org Suggested-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-meson.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
--- a/drivers/pwm/pwm-meson.c +++ b/drivers/pwm/pwm-meson.c @@ -156,8 +156,9 @@ static int meson_pwm_calc(struct meson_p const struct pwm_state *state) { struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm]; - unsigned int duty, period, pre_div, cnt, duty_cnt; + unsigned int pre_div, cnt, duty_cnt; unsigned long fin_freq; + u64 duty, period;
duty = state->duty_cycle; period = state->period; @@ -179,19 +180,19 @@ static int meson_pwm_calc(struct meson_p
dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
- pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL); + pre_div = div64_u64(fin_freq * period, NSEC_PER_SEC * 0xffffLL); if (pre_div > MISC_CLK_DIV_MASK) { dev_err(meson->chip.dev, "unable to get period pre_div\n"); return -EINVAL; }
- cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1)); + cnt = div64_u64(fin_freq * period, NSEC_PER_SEC * (pre_div + 1)); if (cnt > 0xffff) { dev_err(meson->chip.dev, "unable to get period cnt\n"); return -EINVAL; }
- dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period, + dev_dbg(meson->chip.dev, "period=%llu pre_div=%u cnt=%u\n", period, pre_div, cnt);
if (duty == period) { @@ -204,14 +205,13 @@ static int meson_pwm_calc(struct meson_p channel->lo = cnt; } else { /* Then check is we can have the duty with the same pre_div */ - duty_cnt = div64_u64(fin_freq * (u64)duty, - NSEC_PER_SEC * (pre_div + 1)); + duty_cnt = div64_u64(fin_freq * duty, NSEC_PER_SEC * (pre_div + 1)); if (duty_cnt > 0xffff) { dev_err(meson->chip.dev, "unable to get duty cycle\n"); return -EINVAL; }
- dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n", + dev_dbg(meson->chip.dev, "duty=%llu pre_div=%u duty_cnt=%u\n", duty, pre_div, duty_cnt);
channel->pre_div = pre_div;
From: Karol Wachowski karol.wachowski@linux.intel.com
commit 020b527b556a35cf636015c1c3cbdfe7c7acd5f0 upstream.
An incorrect REGB_WR32() macro was used to access a VPUIP register. Use the correct REGV_WR32().
Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU") Cc: stable@vger.kernel.org # 6.3.x Signed-off-by: Karol Wachowski karol.wachowski@linux.intel.com Reviewed-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Signed-off-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230703080725.2065635-1-stani... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/ivpu/ivpu_hw_mtl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/accel/ivpu/ivpu_hw_mtl.c b/drivers/accel/ivpu/ivpu_hw_mtl.c
index fef35422c6f0..3485be27138a 100644
--- a/drivers/accel/ivpu/ivpu_hw_mtl.c
+++ b/drivers/accel/ivpu/ivpu_hw_mtl.c
@@ -885,7 +885,7 @@ static void ivpu_hw_mtl_irq_disable(struct ivpu_device *vdev)
 	REGB_WR32(MTL_BUTTRESS_GLOBAL_INT_MASK, 0x1);
 	REGB_WR32(MTL_BUTTRESS_LOCAL_INT_MASK, BUTTRESS_IRQ_DISABLE_MASK);
 	REGV_WR64(MTL_VPU_HOST_SS_ICB_ENABLE_0, 0x0ull);
-	REGB_WR32(MTL_VPU_HOST_SS_FW_SOC_IRQ_EN, 0x0);
+	REGV_WR32(MTL_VPU_HOST_SS_FW_SOC_IRQ_EN, 0x0);
 }
 
 static void ivpu_hw_mtl_irq_wdt_nce_handler(struct ivpu_device *vdev)
From: Karol Wachowski karol.wachowski@linux.intel.com
commit 7f34e01f77f811ecb2ef83e60301b38cf89af466 upstream.
The MTL C0 stepping fixed an issue related to buttress interrupt status clearing: to clear an interrupt status, it is now required to write 1 to the specific status bit field. This allows a read-modify-write routine to be used.
Writing 0 will not clear the interrupt and will cause interrupt storm.
Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU") Cc: stable@vger.kernel.org # 6.3.x Signed-off-by: Karol Wachowski karol.wachowski@linux.intel.com Reviewed-by: Jacek Lawrynowicz jacek.lawrynowicz@linux.intel.com Signed-off-by: Stanislaw Gruszka stanislaw.gruszka@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230703080725.2065635-2-stani... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/accel/ivpu/ivpu_drv.h | 1 + drivers/accel/ivpu/ivpu_hw_mtl.c | 18 ++++++++++++------ 2 files changed, 13 insertions(+), 6 deletions(-)
--- a/drivers/accel/ivpu/ivpu_drv.h +++ b/drivers/accel/ivpu/ivpu_drv.h @@ -75,6 +75,7 @@ struct ivpu_wa_table { bool punit_disabled; bool clear_runtime_mem; bool d3hot_after_power_off; + bool interrupt_clear_with_0; };
struct ivpu_hw_info; --- a/drivers/accel/ivpu/ivpu_hw_mtl.c +++ b/drivers/accel/ivpu/ivpu_hw_mtl.c @@ -101,6 +101,9 @@ static void ivpu_hw_wa_init(struct ivpu_ vdev->wa.punit_disabled = ivpu_is_fpga(vdev); vdev->wa.clear_runtime_mem = false; vdev->wa.d3hot_after_power_off = true; + + if (ivpu_device_id(vdev) == PCI_DEVICE_ID_MTL && ivpu_revision(vdev) < 4) + vdev->wa.interrupt_clear_with_0 = true; }
static void ivpu_hw_timeouts_init(struct ivpu_device *vdev) @@ -973,12 +976,15 @@ static u32 ivpu_hw_mtl_irqb_handler(stru schedule_recovery = true; }
- /* - * Clear local interrupt status by writing 0 to all bits. - * This must be done after interrupts are cleared at the source. - * Writing 1 triggers an interrupt, so we can't perform read update write. - */ - REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, 0x0); + /* This must be done after interrupts are cleared at the source. */ + if (IVPU_WA(interrupt_clear_with_0)) + /* + * Writing 1 triggers an interrupt, so we can't perform read update write. + * Clear local interrupt status by writing 0 to all bits. + */ + REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, 0x0); + else + REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, status);
/* Re-enable global interrupt */ REGB_WR32(MTL_BUTTRESS_GLOBAL_INT_MASK, 0x0);
From: Jiri Olsa jolsa@kernel.org
commit 5f81018753dfd4989e33ece1f0cb6b8aae498b82 upstream.
While running bpf selftests it's possible to get the following fault:
general protection fault, probably for non-canonical address \
0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
...
Call Trace:
 <TASK>
 fprobe_handler+0xc1/0x270
 ? __pfx_bpf_testmod_init+0x10/0x10
 ? __pfx_bpf_testmod_init+0x10/0x10
 ? bpf_fentry_test1+0x5/0x10
 ? bpf_fentry_test1+0x5/0x10
 ? bpf_testmod_init+0x22/0x80
 ? do_one_initcall+0x63/0x2e0
 ? rcu_is_watching+0xd/0x40
 ? kmalloc_trace+0xaf/0xc0
 ? do_init_module+0x60/0x250
 ? __do_sys_finit_module+0xac/0x120
 ? do_syscall_64+0x37/0x90
 ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
 </TASK>
In unregister_fprobe() we can't release fp->rethook while some of its users may still be running on another CPU.
Move the rethook_free() call after fp->ops is unregistered with unregister_ftrace_function().
Link: https://lore.kernel.org/all/20230615115236.3476617-1-jolsa@kernel.org/
Fixes: 5b0ab78998e3 ("fprobe: Add exit_handler support") Cc: stable@vger.kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Jiri Olsa jolsa@kernel.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/fprobe.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-)
--- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -366,19 +366,13 @@ int unregister_fprobe(struct fprobe *fp) fp->ops.saved_func != fprobe_kprobe_handler)) return -EINVAL;
- /* - * rethook_free() starts disabling the rethook, but the rethook handlers - * may be running on other processors at this point. To make sure that all - * current running handlers are finished, call unregister_ftrace_function() - * after this. - */ - if (fp->rethook) - rethook_free(fp->rethook); - ret = unregister_ftrace_function(&fp->ops); if (ret < 0) return ret;
+ if (fp->rethook) + rethook_free(fp->rethook); + ftrace_free_filter(&fp->ops);
return ret;
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit 195b9cb5b288fec1c871ef89f78cc9a7461aad3a upstream.
Ensure running fprobe_exit_handler() has finished before calling rethook_free() in the unregister_fprobe() so that caller can free the fprobe right after unregister_fprobe().
unregister_fprobe() used to ensure that all running fprobe_entry/exit_handler() calls had finished by calling unregister_ftrace_function(), which synchronizes RCU. But commit 5f81018753df ("fprobe: Release rethook after the ftrace_ops is unregistered") moved the rethook_free() call after unregister_ftrace_function(). So call rethook_stop() to disable the rethook before unregister_ftrace_function(), which restores that guarantee.
Here is the possible code flow that can call the exit handler after unregister_fprobe().
------
 CPU1                              CPU2
 call unregister_fprobe(fp)
 ...
                                   __fprobe_handler()
                                   rethook_hook() on probed function
 unregister_ftrace_function()
                                   return from probed function
                                   rethook hooks
                                   find rh->handler == fprobe_exit_handler
                                   call fprobe_exit_handler()
 rethook_free():
   set rh->handler = NULL;
 return from unregister_fprobe;
                                   call fp->exit_handler() <- (*)
------
(*) At this point, the exit handler is called after returning from unregister_fprobe().
This fixes it as follows:
------
 CPU1                              CPU2
 call unregister_fprobe()
 ...
 rethook_stop():
   set rh->handler = NULL;
                                   __fprobe_handler()
                                   rethook_hook() on probed function
 unregister_ftrace_function()
                                   return from probed function
                                   rethook hooks
                                   find rh->handler == NULL
                                   return from rethook
 rethook_free()
 return from unregister_fprobe;
------
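The ordering can be modeled in user space with C11 atomics. This is a single-threaded sketch of the handler check only, not of the RCU synchronization: `struct hook`, `hook_init()`, `hook_stop()`, `hook_return_path()` and `count_exit()` are hypothetical names, and `atomic_store()` stands in for the WRITE_ONCE() used in rethook_stop().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*exit_handler_t)(int *calls);

/* Hypothetical stand-in for struct rethook: just the handler pointer. */
struct hook {
	_Atomic(exit_handler_t) handler;
};

static void hook_init(struct hook *h, exit_handler_t fn)
{
	assert(h != NULL);
	atomic_init(&h->handler, fn);
}

/* Counterpart of rethook_stop(): disable the handler, free nothing yet. */
static void hook_stop(struct hook *h)
{
	exit_handler_t none = NULL;

	atomic_store(&h->handler, none);
}

/* Return-side path: run the exit handler only if it is still installed. */
static void hook_return_path(struct hook *h, int *calls)
{
	exit_handler_t fn = atomic_load(&h->handler);

	if (fn)
		fn(calls);
}

/* Example exit handler that just counts invocations. */
static void count_exit(int *calls)
{
	(*calls)++;
}
```

Once `hook_stop()` has run, a late return from a probed function finds no handler and does nothing, so the caller's exit handler cannot fire after unregistration has returned. In the kernel, the wait for handlers that already loaded the pointer is provided by unregister_ftrace_function().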
Link: https://lore.kernel.org/all/168873859949.156157.13039240432299335849.stgit@d...
Fixes: 5f81018753df ("fprobe: Release rethook after the ftrace_ops is unregistered") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/rethook.h | 1 + kernel/trace/fprobe.c | 3 +++ kernel/trace/rethook.c | 13 +++++++++++++ 3 files changed, 17 insertions(+)
--- a/include/linux/rethook.h +++ b/include/linux/rethook.h @@ -59,6 +59,7 @@ struct rethook_node { };
struct rethook *rethook_alloc(void *data, rethook_handler_t handler); +void rethook_stop(struct rethook *rh); void rethook_free(struct rethook *rh); void rethook_add_node(struct rethook *rh, struct rethook_node *node); struct rethook_node *rethook_try_get(struct rethook *rh); --- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -366,6 +366,9 @@ int unregister_fprobe(struct fprobe *fp) fp->ops.saved_func != fprobe_kprobe_handler)) return -EINVAL;
+ if (fp->rethook) + rethook_stop(fp->rethook); + ret = unregister_ftrace_function(&fp->ops); if (ret < 0) return ret; --- a/kernel/trace/rethook.c +++ b/kernel/trace/rethook.c @@ -54,6 +54,19 @@ static void rethook_free_rcu(struct rcu_ }
/** + * rethook_stop() - Stop using a rethook. + * @rh: the struct rethook to stop. + * + * Stop using a rethook to prepare for freeing it. If you want to wait for + * all running rethook handler before calling rethook_free(), you need to + * call this first and wait RCU, and call rethook_free(). + */ +void rethook_stop(struct rethook *rh) +{ + WRITE_ONCE(rh->handler, NULL); +} + +/** * rethook_free() - Free struct rethook. * @rh: the struct rethook to be freed. *
From: Mateusz Stachyra m.stachyra@samsung.com
commit 02b0095e2fbbc060560c1065f86a211d91e27b26 upstream.
Fix an issue in function 'tracing_err_log_open'. The function doesn't call 'seq_open' if the file is opened only with write permissions, which results in 'file->private_data' being left as null. If we then use 'lseek' on that opened file, 'seq_lseek' dereferences 'file->private_data' in 'mutex_lock(&m->lock)', resulting in a kernel panic. Writing to this node requires root privileges, therefore this bug has very little security impact.
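The failure mode and the guard can be sketched with a small model. The names `struct open_file`, `struct seq_state`, `seq_style_seek()` and `guarded_seek()` are hypothetical, and the sketch assumes the guarded variant simply reports position 0 for files that were not opened for reading.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct seq_state {
	long pos;
};

struct open_file {
	bool readable;			/* opened with read permission */
	long f_pos;
	struct seq_state *private_data;	/* only set up for readable opens */
};

/* Models a seek that unconditionally trusts private_data. */
static long seq_style_seek(struct open_file *f, long offset)
{
	assert(f->private_data != NULL);	/* NULL here is the reported crash */
	f->private_data->pos = offset;
	f->f_pos = offset;
	return offset;
}

/* Models the guarded seek: only touch seq state if it was set up. */
static long guarded_seek(struct open_file *f, long offset)
{
	if (f->readable)
		return seq_style_seek(f, offset);
	f->f_pos = 0;
	return 0;
}
```

For a write-only open, `guarded_seek()` never reaches the code that dereferences `private_data`, which is why switching the `.llseek` callback is enough to avoid the panic.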
Tracefs node: /sys/kernel/tracing/error_log
Example Kernel panic:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
Call trace:
 mutex_lock+0x30/0x110
 seq_lseek+0x34/0xb8
 __arm64_sys_lseek+0x6c/0xb8
 invoke_syscall+0x58/0x13c
 el0_svc_common+0xc4/0x10c
 do_el0_svc+0x24/0x98
 el0_svc+0x24/0x88
 el0t_64_sync_handler+0x84/0xe4
 el0t_64_sync+0x1b4/0x1b8
Code: d503201f aa0803e0 aa1f03e1 aa0103e9 (c8e97d02)
---[ end trace 561d1b49c12cf8a5 ]---
Kernel panic - not syncing: Oops: Fatal exception
Link: https://lore.kernel.org/linux-trace-kernel/20230703155237eucms1p4dfb6a19caa1... Link: https://lore.kernel.org/linux-trace-kernel/20230704102706eucms1p30d7ecdcc287...
Cc: stable@vger.kernel.org Fixes: 8a062902be725 ("tracing: Add tracing error log") Signed-off-by: Mateusz Stachyra m.stachyra@samsung.com Suggested-by: Steven Rostedt rostedt@goodmis.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -8136,7 +8136,7 @@ static const struct file_operations trac .open = tracing_err_log_open, .write = tracing_err_log_write, .read = seq_read, - .llseek = seq_lseek, + .llseek = tracing_lseek, .release = tracing_err_log_release, };
From: Paolo Abeni pabeni@redhat.com
commit 0226436acf2495cde4b93e7400e5a87305c26054 upstream.
Since the blamed commit, closing the first subflow resets the first subflow socket state to SS_UNCONNECTED.
The current mptcp listen implementation relies only on such state to prevent touching not-fully-disconnected sockets.
Incoming mptcp fastclose (or paired endpoint removal) unconditionally closes the first subflow.
All the above allows an incoming fastclose followed by a listen() call to successfully race with a blocking recvmsg(), potentially causing the latter to hit a divide by zero bug in cleanup_rbuf/__tcp_select_window().
Address the issue by explicitly checking the msk socket state in mptcp_listen(). An alternative solution would be moving the first subflow socket state update into mptcp_disconnect(), but in the long term the first subflow socket should be removed, so it is better to avoid relying on it for internal consistency checks.
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/414 Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3697,6 +3697,11 @@ static int mptcp_listen(struct socket *s pr_debug("msk=%p", msk);
lock_sock(sk); + + err = -EINVAL; + if (sock->state != SS_UNCONNECTED || sock->type != SOCK_STREAM) + goto unlock; + ssock = __mptcp_nmpc_socket(msk); if (IS_ERR(ssock)) { err = PTR_ERR(ssock);
From: Paolo Abeni pabeni@redhat.com
commit 3fffa15bfef48b0ad6424779c03e68ae8ace5acb upstream.
While taking care of the mptcp-level listener I unintentionally moved the subflow level unhash after the subflow listener backlog cleanup.
That could cause some nasty race and makes the code harder to read.
Address the issue by restoring the proper order of operations.
Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2908,10 +2908,10 @@ static void mptcp_check_listen_stop(stru return;
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); + tcp_set_state(ssk, TCP_CLOSE); mptcp_subflow_queue_clean(sk, ssk); inet_csk_listen_stop(ssk); mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED); - tcp_set_state(ssk, TCP_CLOSE); release_sock(ssk); }
From: Matthieu Baerts matthieu.baerts@tessares.net
commit a5a5990c099dd354e05e89ee77cd2dbf6655d4a1 upstream.
IPTables commands using 'iptables-nft' fail on old kernels, at least on v5.15, because they don't see the default IPTables chains:
$ iptables -L
iptables/1.8.2 Failed to initialize nft: Protocol not supported
As a first step before switching to NFTables, we can use iptables-legacy if available.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: dc65fe82fb07 ("selftests: mptcp: add packet mark test case") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh @@ -12,6 +12,8 @@ ksft_skip=4 timeout_poll=30 timeout_test=$((timeout_poll * 2 + 1)) mptcp_connect="" +iptables="iptables" +ip6tables="ip6tables"
sec=$(date +%s) rndh=$(printf %x $sec)-$(mktemp -u XXXXXX) @@ -25,7 +27,7 @@ add_mark_rules() local m=$2
local t - for t in iptables ip6tables; do + for t in ${iptables} ${ip6tables}; do # just to debug: check we have multiple subflows connection requests ip netns exec $ns $t -A OUTPUT -p tcp --syn -m mark --mark $m -j ACCEPT
@@ -95,14 +97,14 @@ if [ $? -ne 0 ];then exit $ksft_skip fi
-iptables -V > /dev/null 2>&1 -if [ $? -ne 0 ];then +# Use the legacy version if available to support old kernel versions +if iptables-legacy -V &> /dev/null; then + iptables="iptables-legacy" + ip6tables="ip6tables-legacy" +elif ! iptables -V &> /dev/null; then echo "SKIP: Could not run all tests without iptables tool" exit $ksft_skip -fi - -ip6tables -V > /dev/null 2>&1 -if [ $? -ne 0 ];then +elif ! ip6tables -V &> /dev/null; then echo "SKIP: Could not run all tests without ip6tables tool" exit $ksft_skip fi @@ -112,10 +114,10 @@ check_mark() local ns=$1 local af=$2
- local tables=iptables + local tables=${iptables}
if [ $af -eq 6 ];then - tables=ip6tables + tables=${ip6tables} fi
local counters values
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 221e4550454a822f9a11834e30694c7d1d65747c upstream.
In case of "external" errors when preparing the environment for the TProxy tests, the subtests were marked as skipped.
This is fine but it means these errors are ignored. On MPTCP Public CI, we do want to catch such issues and mark the selftest as failed when they occur. We can use the recently added mptcp_lib_fail_if_expected_feature() helper to fail if needed.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5fb62e9cd3ad ("selftests: mptcp: add tproxy test case") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_connect.sh | 3 +++ 1 file changed, 3 insertions(+)
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh @@ -718,6 +718,7 @@ table inet mangle { EOF if [ $? -ne 0 ]; then echo "SKIP: $msg, could not load nft ruleset" + mptcp_lib_fail_if_expected_feature "nft rules" return fi
@@ -733,6 +734,7 @@ EOF if [ $? -ne 0 ]; then ip netns exec "$listener_ns" nft flush ruleset echo "SKIP: $msg, ip $r6flag rule failed" + mptcp_lib_fail_if_expected_feature "ip rule" return fi
@@ -741,6 +743,7 @@ EOF ip netns exec "$listener_ns" nft flush ruleset ip -net "$listener_ns" $r6flag rule del fwmark 1 lookup 100 echo "SKIP: $msg, ip route add local $local_addr failed" + mptcp_lib_fail_if_expected_feature "ip route" return fi
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 9ac4c28eb70cd5ea5472a5e1c495dcdd597d4597 upstream.
When an error was detected when checking the marks, a message was correctly printed mentioning the error but followed by another one saying everything was OK and the selftest was not marked as failed as expected.
Now the 'ret' variable is directly set to 1 in order to make sure the exit is done with an error, similar to what is done in other functions. While at it, the error is correctly propagated to the caller.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: dc65fe82fb07 ("selftests: mptcp: add packet mark test case") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh @@ -128,6 +128,7 @@ check_mark() for v in $values; do if [ $v -ne 0 ]; then echo "FAIL: got $tables $values in ns $ns , not 0 - not all expected packets marked" 1>&2 + ret=1 return 1 fi done @@ -227,11 +228,11 @@ do_transfer() fi
if [ $local_addr = "::" ];then - check_mark $listener_ns 6 - check_mark $connector_ns 6 + check_mark $listener_ns 6 || retc=1 + check_mark $connector_ns 6 || retc=1 else - check_mark $listener_ns 4 - check_mark $connector_ns 4 + check_mark $listener_ns 4 || retc=1 + check_mark $connector_ns 4 || retc=1 fi
check_transfer $cin $sout "file received by server"
From: Matthieu Baerts matthieu.baerts@tessares.net
commit d8566d0e03922217f70d9be2d401fcb860986374 upstream.
"server4_port" variable is not set but "app4_port" is the server port in v4 and the correct variable name to use.
The port is optional so there was no visible impact.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ca188a25d43f ("selftests: mptcp: userspace PM support for MP_PRIO signals") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/userspace_pm.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/net/mptcp/userspace_pm.sh +++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh @@ -848,7 +848,7 @@ test_prio() local count
# Send MP_PRIO signal from client to server machine - ip netns exec "$ns2" ./pm_nl_ctl set 10.0.1.2 port "$client4_port" flags backup token "$client4_token" rip 10.0.1.1 rport "$server4_port" + ip netns exec "$ns2" ./pm_nl_ctl set 10.0.1.2 port "$client4_port" flags backup token "$client4_token" rip 10.0.1.1 rport "$app4_port" sleep 0.5
# Check TX
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 966c6c3adfb1257ea8a839cdfad2b74092cc5532 upstream.
A message was mentioning an issue with the "remove" tests but the selftest was not marked as failed.
Directly exit with an error like it is done everywhere else in this selftest.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/userspace_pm.sh | 2 ++ 1 file changed, 2 insertions(+)
--- a/tools/testing/selftests/net/mptcp/userspace_pm.sh +++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh @@ -423,6 +423,7 @@ test_remove() stdbuf -o0 -e0 printf "[OK]\n" else stdbuf -o0 -e0 printf "[FAIL]\n" + exit 1 fi
# RM_ADDR using an invalid addr id should result in no action @@ -437,6 +438,7 @@ test_remove() stdbuf -o0 -e0 printf "[OK]\n" else stdbuf -o0 -e0 printf "[FAIL]\n" + exit 1 fi
# RM_ADDR from the client to server machine
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 6c8880fcaa5c45355179b759c1d11737775e31fc upstream.
MPTCP selftests have been using TCP SYN Cookies for quite a while now, since v5.9.
Some CIs don't have this config option enabled and this is causing issues in the tests:
# ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 167ms) sysctl: cannot stat /proc/sys/net/ipv4/tcp_syncookies: No such file or directory
# [ OK ]./mptcp_connect.sh: line 554: [: -eq: unary operator expected
There is no impact in the results but the test is not doing what it is supposed to do.
Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/config | 1 + 1 file changed, 1 insertion(+)
--- a/tools/testing/selftests/net/mptcp/config +++ b/tools/testing/selftests/net/mptcp/config @@ -6,6 +6,7 @@ CONFIG_INET_DIAG=m CONFIG_INET_MPTCP_DIAG=m CONFIG_VETH=y CONFIG_NET_SCH_NETEM=m +CONFIG_SYN_COOKIES=y CONFIG_NETFILTER=y CONFIG_NETFILTER_ADVANCED=y CONFIG_NETFILTER_NETLINK=m
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 61d9658050260dbcbf9055479b7ac5bbbe1e8831 upstream.
When using pm_nl_ctl to validate userspace path-manager's behaviours, it was failing on 32-bit architectures ~half of the time.
pm_nl_ctl was not reporting any error but the command was not doing what it was expected to do. As a result, the expected linked event was not triggered after and the test failed.
This is because the token passed as an argument to the application was parsed as an integer with atoi(): on a 32-bit arch, if the number was bigger than INT_MAX, 2147483647 was used instead.
This can simply be fixed by using strtoul() instead of atoi().
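A minimal sketch of token parsing with strtoul() follows. It is not the pm_nl_ctl code: `parse_token()` is a hypothetical helper, and the end-pointer, errno and range checks are additions for illustration (the patch itself passes NULL as the end pointer and does no such checking).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a decimal 32-bit token; returns 0 on success, -1 on bad input. */
int parse_token(const char *s, uint32_t *out)
{
	char *end;
	unsigned long v;

	/* strtoul() would silently accept and negate a minus sign. */
	if (s == NULL || *s == '\0' || strchr(s, '-') != NULL)
		return -1;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0')
		return -1;	/* overflow, no digits, or trailing junk */
	if (v > UINT32_MAX)	/* only reachable where unsigned long is 64-bit */
		return -1;

	*out = (uint32_t)v;
	return 0;
}
```

A token such as "3000000000" is above INT_MAX but still fits in 32 bits unsigned, so it parses cleanly here on both 32-bit and 64-bit builds.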
The errors have been seen "by chance" when manually looking at the results from LKFT.
Fixes: 9a0b36509df0 ("selftests: mptcp: support MPTCP_PM_CMD_ANNOUNCE") Cc: stable@vger.kernel.org Fixes: ecd2a77d672f ("selftests: mptcp: support MPTCP_PM_CMD_REMOVE") Fixes: cf8d0a6dfd64 ("selftests: mptcp: support MPTCP_PM_CMD_SUBFLOW_CREATE") Fixes: 57cc361b8d38 ("selftests: mptcp: support MPTCP_PM_CMD_SUBFLOW_DESTROY") Fixes: ca188a25d43f ("selftests: mptcp: userspace PM support for MP_PRIO signals") Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/pm_nl_ctl.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c +++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c @@ -425,7 +425,7 @@ int dsf(int fd, int pm_family, int argc, }
/* token */ - token = atoi(params[4]); + token = strtoul(params[4], NULL, 10); rta = (void *)(data + off); rta->rta_type = MPTCP_PM_ATTR_TOKEN; rta->rta_len = RTA_LENGTH(4); @@ -551,7 +551,7 @@ int csf(int fd, int pm_family, int argc, }
/* token */ - token = atoi(params[4]); + token = strtoul(params[4], NULL, 10); rta = (void *)(data + off); rta->rta_type = MPTCP_PM_ATTR_TOKEN; rta->rta_len = RTA_LENGTH(4); @@ -598,7 +598,7 @@ int remove_addr(int fd, int pm_family, i if (++arg >= argc) error(1, 0, " missing token value");
- token = atoi(argv[arg]); + token = strtoul(argv[arg], NULL, 10); rta = (void *)(data + off); rta->rta_type = MPTCP_PM_ATTR_TOKEN; rta->rta_len = RTA_LENGTH(4); @@ -710,7 +710,7 @@ int announce_addr(int fd, int pm_family, if (++arg >= argc) error(1, 0, " missing token value");
- token = atoi(argv[arg]); + token = strtoul(argv[arg], NULL, 10); } else error(1, 0, "unknown keyword %s", argv[arg]); } @@ -1347,7 +1347,7 @@ int set_flags(int fd, int pm_family, int error(1, 0, " missing token value");
/* token */ - token = atoi(argv[arg]); + token = strtoul(argv[arg], NULL, 10); } else if (!strcmp(argv[arg], "flags")) { char *tok, *str;
From: Gustavo A. R. Silva gustavoars@kernel.org
commit f1f047bd7ce0d73788e04ac02268060a565f7ecb upstream.
pSMB->hdr.Protocol is an array of size 4 bytes, hence when the compiler analyzes this line of code
parm_data = ((char *) &pSMB->hdr.Protocol) + offset;
it legitimately complains about the fact that offset points outside the bounds of the array. Notice that the compiler gives priority to the object as an array, rather than merely the address of one more byte in a structure to which offset should be added (which seems to be the actual intention of the original implementation).
Fix this by explicitly instructing the compiler to treat the code as a sequence of bytes in struct smb_com_transaction2_spi_req, and not as an array accessed through pointer notation.
Notice that ((char *)pSMB) + sizeof(pSMB->hdr.smb_buf_length) points to the same address as ((char *) &pSMB->hdr.Protocol), therefore this results in no differences in binary output.
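The address equivalence can be checked with a small stand-in structure. The layout below is an assumption for illustration only: `struct mock_hdr` and `struct mock_req` are hypothetical and merely mimic a 4-byte length field followed by a 4-byte array; they are not the real SMB definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start of the SMB header. */
struct mock_hdr {
	uint32_t smb_buf_length;
	uint8_t protocol[4];
	uint8_t command;
};

struct mock_req {
	struct mock_hdr hdr;
	uint8_t payload[32];
};

/* New form: a byte offset computed from the start of the whole request. */
const char *parm_data_ptr(const struct mock_req *r, size_t offset)
{
	return (const char *)r + sizeof(r->hdr.smb_buf_length) + offset;
}

/* Returns 1 when the new base matches the address of the 4-byte array. */
int same_base(const struct mock_req *r)
{
	return parm_data_ptr(r, 0) == (const char *)&r->hdr.protocol;
}
```

Because the pointer is derived from the enclosing object rather than from the 4-byte member, the compiler has no reason to assume the result must stay within that member.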
Fixes the following -Wstringop-overflow warnings when building for the s390 architecture with defconfig (GCC 13):
  CC [M]  fs/smb/client/cifssmb.o
In function 'cifs_init_ace',
    inlined from 'posix_acl_to_cifs' at fs/smb/client/cifssmb.c:3046:3,
    inlined from 'cifs_do_set_acl' at fs/smb/client/cifssmb.c:3191:15:
fs/smb/client/cifssmb.c:2987:31: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 2987 |         cifs_ace->cifs_e_perm = local_ace->e_perm;
      |         ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
In file included from fs/smb/client/cifssmb.c:27:
fs/smb/client/cifspdu.h: In function 'cifs_do_set_acl':
fs/smb/client/cifspdu.h:384:14: note: at offset [7, 11] into destination object 'Protocol' of size 4
  384 |         __u8 Protocol[4];
      |              ^~~~~~~~
In function 'cifs_init_ace',
    inlined from 'posix_acl_to_cifs' at fs/smb/client/cifssmb.c:3046:3,
    inlined from 'cifs_do_set_acl' at fs/smb/client/cifssmb.c:3191:15:
fs/smb/client/cifssmb.c:2988:30: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 2988 |         cifs_ace->cifs_e_tag = local_ace->e_tag;
      |         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
fs/smb/client/cifspdu.h: In function 'cifs_do_set_acl':
fs/smb/client/cifspdu.h:384:14: note: at offset [6, 10] into destination object 'Protocol' of size 4
  384 |         __u8 Protocol[4];
      |              ^~~~~~~~
This helps with the ongoing efforts to globally enable -Wstringop-overflow.
Link: https://github.com/KSPP/linux/issues/310 Fixes: dc1af4c4b472 ("cifs: implement set acl method") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/cifssmb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c index 19f7385abeec..9dee267f1893 100644 --- a/fs/smb/client/cifssmb.c +++ b/fs/smb/client/cifssmb.c @@ -3184,7 +3184,7 @@ setAclRetry: param_offset = offsetof(struct smb_com_transaction2_spi_req, InformationLevel) - 4; offset = param_offset + params; - parm_data = ((char *) &pSMB->hdr.Protocol) + offset; + parm_data = ((char *)pSMB) + sizeof(pSMB->hdr.smb_buf_length) + offset; pSMB->ParameterOffset = cpu_to_le16(param_offset);
/* convert to on the wire format for POSIX ACL */
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit 66bcf65d6cf0ca6540e2341e88ee7ef02dbdda08 upstream.
If an array is specified with the ustring or symstr type, the length of each string is accumulated in both 'ret' and 'total', which means the length is double counted. Assign the length to 'ret' instead of adding to it, to avoid the double counting.
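The effect of adding to 'ret' versus assigning it can be seen in a reduced model of the array loop. This is an illustration only: both functions are hypothetical, the per-element lengths are passed in directly, and the model assumes 'ret' carries over from one loop pass to the next as the description above implies.

```c
#include <stddef.h>

/* Pre-fix shape: 'ret' keeps growing and is added to 'total' on each pass. */
int total_len_accumulating(const int *lens, size_t n)
{
	int ret = 0, total = 0;

	for (size_t i = 0; i < n; i++) {
		ret += lens[i];		/* earlier lengths are still in 'ret' */
		total += ret;
	}
	return total;
}

/* Post-fix shape: 'ret' holds only the current element's length. */
int total_len_assigned(const int *lens, size_t n)
{
	int ret = 0, total = 0;

	for (size_t i = 0; i < n; i++) {
		ret = lens[i];
		total += ret;
	}
	return total;
}
```

For element lengths 3, 4 and 5, the accumulating form yields 22 while the assigning form yields the expected 12.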
Link: https://lore.kernel.org/all/168908492917.123124.15076463491122036025.stgit@d...
Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/all/8819b154-2ba1-43c3-98a2-cbde20892023@moroto.moun... Fixes: 88903c464321 ("tracing/probe: Add ustring type for user-space string") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_probe_tmpl.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -156,11 +156,11 @@ stage3: code++; goto array; case FETCH_OP_ST_USTRING: - ret += fetch_store_strlen_user(val + code->offset); + ret = fetch_store_strlen_user(val + code->offset); code++; goto array; case FETCH_OP_ST_SYMSTR: - ret += fetch_store_symstrlen(val + code->offset); + ret = fetch_store_symstrlen(val + code->offset); code++; goto array; default:
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit b41326b5e0f82e93592c4366359917b5d67b529f upstream.
Do not add the error code (which is a negative value) to the total used length of the array, because it can corrupt the return value of process_fetch_insn_bottom(). Also clear the 'ret' value, because it is used for calculating the next data_loc entry.
Link: https://lore.kernel.org/all/168908493827.123124.2175257289106364229.stgit@de...
Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/all/8819b154-2ba1-43c3-98a2-cbde20892023@moroto.moun... Fixes: 9b960a38835f ("tracing: probeevent: Unify fetch_insn processing common part") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_probe_tmpl.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -204,6 +204,8 @@ stage3: array: /* the last stage: Loop on array */ if (code->op == FETCH_OP_LP_ARRAY) { + if (ret < 0) + ret = 0; total += ret; if (++i < code->param) { code = s3;
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit e38e2c6a9efc435f9de344b7c91f7697e01b47d5 upstream.
Update the dynamic data counter ('dyndata') and max length ('maxlen') only if the fetcharg uses dynamic data. Also move arg->dynamic out of unlikely(). Without this change, the dynamic data address becomes wrong if process_fetch_insn() returns an error in the !arg->dynamic case.
Link: https://lore.kernel.org/all/168908494781.123124.8160245359962103684.stgit@de...
Suggested-by: Steven Rostedt rostedt@goodmis.org Link: https://lore.kernel.org/all/20230710233400.5aaf024e@gandalf.local.home/ Fixes: 9178412ddf5a ("tracing: probeevent: Return consumed bytes of dynamic area") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_probe_tmpl.h | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -267,11 +267,13 @@ store_trace_args(void *data, struct trac if (unlikely(arg->dynamic)) *dl = make_data_loc(maxlen, dyndata - base); ret = process_fetch_insn(arg->code, rec, dl, base); - if (unlikely(ret < 0 && arg->dynamic)) { - *dl = make_data_loc(0, dyndata - base); - } else { - dyndata += ret; - maxlen -= ret; + if (arg->dynamic) { + if (unlikely(ret < 0)) { + *dl = make_data_loc(0, dyndata - base); + } else { + dyndata += ret; + maxlen -= ret; + } } } }
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit 4ed8f337dee32df71435689c19d22e4ee846e15a upstream.
This reverts commit 2e9906f84fc7c99388bb7123ade167250d50f1c0.
It turned out that commit 2e9906f84fc7 ("tracing: Add "(fault)" name injection to kernel probes") did not work correctly and probe events still show just '(fault)' (instead of '"(fault)"'). Also, the current '(fault)' makes it more explicit that the access faulted.
This also moves FAULT_STRING macro to trace.h so that synthetic event can keep using it, and uses it in trace_probe.c too.
Link: https://lore.kernel.org/all/168908495772.123124.1250788051922100079.stgit@de... Link: https://lore.kernel.org/all/20230706230642.3793a593@rorschach.local.home/
Cc: stable@vger.kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Tom Zanussi zanussi@kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.h | 2 ++ kernel/trace/trace_probe.c | 2 +- kernel/trace/trace_probe_kernel.h | 31 ++++++------------------------- 3 files changed, 9 insertions(+), 26 deletions(-)
--- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -113,6 +113,8 @@ enum trace_type { #define MEM_FAIL(condition, fmt, ...) \ DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
+#define FAULT_STRING "(fault)" + #define HIST_STACKTRACE_DEPTH 16 #define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long)) #define HIST_STACKTRACE_SKIP 5 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -65,7 +65,7 @@ int PRINT_TYPE_FUNC_NAME(string)(struct int len = *(u32 *)data >> 16;
if (!len) - trace_seq_puts(s, "(fault)"); + trace_seq_puts(s, FAULT_STRING); else trace_seq_printf(s, "\"%s\"", (const char *)get_loc_data(data, ent)); --- a/kernel/trace/trace_probe_kernel.h +++ b/kernel/trace/trace_probe_kernel.h @@ -2,8 +2,6 @@ #ifndef __TRACE_PROBE_KERNEL_H_ #define __TRACE_PROBE_KERNEL_H_
-#define FAULT_STRING "(fault)" - /* * This depends on trace_probe.h, but can not include it due to * the way trace_probe_tmpl.h is used by trace_kprobe.c and trace_eprobe.c. @@ -15,16 +13,8 @@ static nokprobe_inline int fetch_store_strlen_user(unsigned long addr) { const void __user *uaddr = (__force const void __user *)addr; - int ret;
- ret = strnlen_user_nofault(uaddr, MAX_STRING_SIZE); - /* - * strnlen_user_nofault returns zero on fault, insert the - * FAULT_STRING when that occurs. - */ - if (ret <= 0) - return strlen(FAULT_STRING) + 1; - return ret; + return strnlen_user_nofault(uaddr, MAX_STRING_SIZE); }
/* Return the length of string -- including null terminal byte */ @@ -44,18 +34,7 @@ fetch_store_strlen(unsigned long addr) len++; } while (c && ret == 0 && len < MAX_STRING_SIZE);
- /* For faults, return enough to hold the FAULT_STRING */ - return (ret < 0) ? strlen(FAULT_STRING) + 1 : len; -} - -static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base, int len) -{ - if (ret >= 0) { - *(u32 *)dest = make_data_loc(ret, __dest - base); - } else { - strscpy(__dest, FAULT_STRING, len); - ret = strlen(__dest) + 1; - } + return (ret < 0) ? ret : len; }
/* @@ -76,7 +55,8 @@ fetch_store_string_user(unsigned long ad __dest = get_loc_data(dest, base);
ret = strncpy_from_user_nofault(__dest, uaddr, maxlen); - set_data_loc(ret, dest, __dest, base, maxlen); + if (ret >= 0) + *(u32 *)dest = make_data_loc(ret, __dest - base);
return ret; } @@ -107,7 +87,8 @@ fetch_store_string(unsigned long addr, v * probing. */ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen); - set_data_loc(ret, dest, __dest, base, maxlen); + if (ret >= 0) + *(u32 *)dest = make_data_loc(ret, __dest - base);
return ret; }
From: Masami Hiramatsu (Google) mhiramat@kernel.org
commit 797311bce5c2ac90b8d65e357603cfd410d36ebb upstream.
Record 0-length data to data_loc in fetch_store_string*() if it fails to get the string data. Currently those functions expect that the data_loc is updated by store_trace_args() when they return an error code. However, that does not work correctly if the argument is an array of strings. In that case, store_trace_args() only clears the first entry of the array (which may have no error) and leaves the other entries untouched. So the entry should be cleared by fetch_store_string*() itself. Also, 'dyndata' and 'maxlen' in store_trace_args() should be updated only if they are used (ret > 0 and the argument is dynamic data).
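The "always write the slot, clamp errors to length 0" idea can be sketched in userspace. The helpers below are hypothetical mock_* stand-ins; the packing (length in the upper 16 bits, offset in the lower 16 bits of a 32-bit word) is an assumption made for this illustration, chosen to agree with the reader side in this series that obtains the length by shifting right by 16.

```c
#include <stdint.h>

/* Pack a length (upper 16 bits) and an offset (lower 16 bits). */
static uint32_t mock_make_data_loc(uint32_t len, uint32_t offs)
{
	return (len << 16) | (offs & 0xffffu);
}

uint32_t mock_loc_len(uint32_t dl)
{
	return dl >> 16;
}

uint32_t mock_loc_offs(uint32_t dl)
{
	return dl & 0xffffu;
}

/* Always write the slot; a negative (error) result becomes length 0. */
void mock_set_data_loc(int ret, uint32_t *slot, uint32_t offs)
{
	if (ret < 0)
		ret = 0;
	*slot = mock_make_data_loc((uint32_t)ret, offs);
}
```

Since every array element gets its slot written, a reader that treats length 0 as a fault never sees a stale value left over from the initial max-length setup.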
Link: https://lore.kernel.org/all/168908496683.123124.4761206188794205601.stgit@de...
Fixes: 40b53b771806 ("tracing: probeevent: Add array type support") Cc: stable@vger.kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_probe_kernel.h | 13 +++++++++---- kernel/trace/trace_probe_tmpl.h | 10 +++------- kernel/trace/trace_uprobe.c | 3 ++- 3 files changed, 14 insertions(+), 12 deletions(-)
--- a/kernel/trace/trace_probe_kernel.h +++ b/kernel/trace/trace_probe_kernel.h @@ -37,6 +37,13 @@ fetch_store_strlen(unsigned long addr) return (ret < 0) ? ret : len; }
+static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base) +{ + if (ret < 0) + ret = 0; + *(u32 *)dest = make_data_loc(ret, __dest - base); +} + /* * Fetch a null-terminated string from user. Caller MUST set *(u32 *)buf * with max length and relative data location. @@ -55,8 +62,7 @@ fetch_store_string_user(unsigned long ad __dest = get_loc_data(dest, base);
ret = strncpy_from_user_nofault(__dest, uaddr, maxlen); - if (ret >= 0) - *(u32 *)dest = make_data_loc(ret, __dest - base); + set_data_loc(ret, dest, __dest, base);
return ret; } @@ -87,8 +93,7 @@ fetch_store_string(unsigned long addr, v * probing. */ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen); - if (ret >= 0) - *(u32 *)dest = make_data_loc(ret, __dest - base); + set_data_loc(ret, dest, __dest, base);
return ret; } --- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -267,13 +267,9 @@ store_trace_args(void *data, struct trac if (unlikely(arg->dynamic)) *dl = make_data_loc(maxlen, dyndata - base); ret = process_fetch_insn(arg->code, rec, dl, base); - if (arg->dynamic) { - if (unlikely(ret < 0)) { - *dl = make_data_loc(0, dyndata - base); - } else { - dyndata += ret; - maxlen -= ret; - } + if (arg->dynamic && likely(ret > 0)) { + dyndata += ret; + maxlen -= ret; } } } --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -170,7 +170,8 @@ fetch_store_string(unsigned long addr, v */ ret++; *(u32 *)dest = make_data_loc(ret, (void *)dst - base); - } + } else + *(u32 *)dest = make_data_loc(0, (void *)dst - base);
return ret; }
From: Beau Belgrave beaub@linux.microsoft.com
commit d0a3022f30629a208e5944022caeca3568add9e7 upstream.
When users register an event, the name of the event and its arguments are checked to ensure they match if the event already exists. Normally all arguments are in the form of "type name", except when the type starts with "struct ". In those cases, the size of the struct is passed in addition to the name, e.g. "struct my_struct a 20" for an argument that is of type "struct my_struct" with a field name of "a" and a size of 20 bytes.
The current code does not honor the above case properly when comparing a match. This causes the event register to fail even when the same string was used for events that contain a struct argument within them. The example above "struct my_struct a 20" generates a match string of "struct my_struct a" omitting the size field.
Add the struct size of the existing field when generating a comparison string for a struct field to ensure proper match checking.
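A minimal sketch of the comparison-string rule follows. `mock_field_string()` is a hypothetical helper, not the kernel function, and it leaves out the optional trailing separator that the real code can append.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "type name" for ordinary fields and "type name size" for fields
 * whose type starts with "struct ". Returns the snprintf()-style length.
 */
int mock_field_string(char *buf, size_t len, const char *type,
		      const char *name, int size)
{
	if (strncmp(type, "struct ", 7) == 0)
		return snprintf(buf, len, "%s %s %d", type, name, size);

	return snprintf(buf, len, "%s %s", type, name);
}
```

With type "struct my_struct", name "a" and size 20 this produces "struct my_struct a 20", so a second registration using the same string compares equal.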
Link: https://lkml.kernel.org/r/20230629235049.581-2-beaub@linux.microsoft.com
Cc: stable@vger.kernel.org Fixes: e6f89a149872 ("tracing/user_events: Ensure user provided strings are safely formatted") Signed-off-by: Beau Belgrave beaub@linux.microsoft.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_user.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/kernel/trace/trace_events_user.c +++ b/kernel/trace/trace_events_user.c @@ -1317,6 +1317,9 @@ static int user_field_set_string(struct pos += snprintf(buf + pos, LEN_OR_ZERO, " "); pos += snprintf(buf + pos, LEN_OR_ZERO, "%s", field->name);
+ if (str_has_prefix(field->type, "struct ")) + pos += snprintf(buf + pos, LEN_OR_ZERO, " %d", field->size); + if (colon) pos += snprintf(buf + pos, LEN_OR_ZERO, ";");
From: Quinn Tran qutran@marvell.com
commit 6a87679626b51b53fbb6be417ad8eb083030b617 upstream.
Task management command failed with status 2Ch, which is the result of too many task management commands being sent to the same target. Hence, limit task management commands to 8 per target.
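For illustration only, not part of the patch: a minimal userspace sketch of capping the number of in-flight commands at 8. It is an assumption-laden simplification: the patch sleeps and queues waiters in FIFO order under ha->tgt.sess_lock, whereas this sketch is a non-blocking "try" variant built on C11 atomics. The demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_MAX_ACTIVE_TMF 8	/* same limit as MAX_ACTIVE_TMF in the patch */

/* Try to take one slot; returns 0 on success, -1 when the limit is reached. */
static int demo_tmf_try_get(atomic_uint *active)
{
	unsigned int cur;

	assert(active);
	cur = atomic_load(active);
	while (cur < DEMO_MAX_ACTIVE_TMF) {
		/* On failure, cur is reloaded with the current value and we retry. */
		if (atomic_compare_exchange_weak(active, &cur, cur + 1))
			return 0;
	}
	return -1;
}

/* Release a slot taken by demo_tmf_try_get(). */
static void demo_tmf_put(atomic_uint *active)
{
	assert(active);
	atomic_fetch_sub(active, 1);
}
```

A caller would pair every successful demo_tmf_try_get() with exactly one demo_tmf_put(), mirroring the qla_get_tmf()/qla_put_tmf() pairing in the diff.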
Reported-by: kernel test robot lkp@intel.com
Link: https://lore.kernel.org/oe-kbuild-all/202304271952.NKNmoFzv-lkp@intel.com/
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran qutran@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230428075339.32551-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_def.h | 3 +
 drivers/scsi/qla2xxx/qla_init.c | 63 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 61 insertions(+), 5 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -2542,6 +2542,7 @@ enum rscn_addr_format { typedef struct fc_port { struct list_head list; struct scsi_qla_host *vha; + struct list_head tmf_pending;
unsigned int conf_compl_supported:1; unsigned int deleted:2; @@ -2562,6 +2563,8 @@ typedef struct fc_port { unsigned int do_prli_nvme:1;
uint8_t nvme_flag; + uint8_t active_tmf; +#define MAX_ACTIVE_TMF 8
uint8_t node_name[WWN_SIZE]; uint8_t port_name[WWN_SIZE]; --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -2149,6 +2149,54 @@ done: return rval; }
+static void qla_put_tmf(fc_port_t *fcport) +{ + struct scsi_qla_host *vha = fcport->vha; + struct qla_hw_data *ha = vha->hw; + unsigned long flags; + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + fcport->active_tmf--; + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); +} + +static +int qla_get_tmf(fc_port_t *fcport) +{ + struct scsi_qla_host *vha = fcport->vha; + struct qla_hw_data *ha = vha->hw; + unsigned long flags; + int rc = 0; + LIST_HEAD(tmf_elem); + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + list_add_tail(&tmf_elem, &fcport->tmf_pending); + + while (fcport->active_tmf >= MAX_ACTIVE_TMF) { + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); + + msleep(1); + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + if (fcport->deleted) { + rc = EIO; + break; + } + if (fcport->active_tmf < MAX_ACTIVE_TMF && + list_is_first(&tmf_elem, &fcport->tmf_pending)) + break; + } + + list_del(&tmf_elem); + + if (!rc) + fcport->active_tmf++; + + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); + + return rc; +} + int qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, uint32_t tag) @@ -2156,18 +2204,19 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, struct scsi_qla_host *vha = fcport->vha; struct qla_qpair *qpair; struct tmf_arg a; - struct completion comp; int i, rval;
- init_completion(&comp); a.vha = fcport->vha; a.fcport = fcport; a.lun = lun; - - if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) + if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) { a.modifier = MK_SYNC_ID_LUN; - else + + if (qla_get_tmf(fcport)) + return QLA_FUNCTION_FAILED; + } else { a.modifier = MK_SYNC_ID; + }
if (vha->hw->mqenable) { for (i = 0; i < vha->hw->num_qpairs; i++) { @@ -2186,6 +2235,9 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, a.flags = flags; rval = __qla2x00_async_tm_cmd(&a);
+ if (a.modifier == MK_SYNC_ID_LUN) + qla_put_tmf(fcport); + return rval; }
@@ -5400,6 +5452,7 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vh INIT_WORK(&fcport->reg_work, qla_register_fcport_fn); INIT_LIST_HEAD(&fcport->gnl_entry); INIT_LIST_HEAD(&fcport->list); + INIT_LIST_HEAD(&fcport->tmf_pending);
INIT_LIST_HEAD(&fcport->sess_cmd_list); spin_lock_init(&fcport->sess_cmd_lock);
From: Quinn Tran qutran@marvell.com
commit 9ae615c5bfd37bd091772969b1153de5335ea986 upstream.
Task management command hangs where a side band chip reset failed to nudge the TMF from its current send path.
Add an additional error check to block the TMF from entering during chip reset and along the TMF path to cause it to bail out, skipping over the abort of the marker.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran qutran@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230428075339.32551-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_def.h | 4 ++
 drivers/scsi/qla2xxx/qla_init.c | 60 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 3 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -5516,4 +5516,8 @@ struct ql_vnd_tgt_stats_resp { _fp->disc_state, _fp->scan_state, _fp->loop_id, _fp->deleted, \ _fp->flags
+#define TMF_NOT_READY(_fcport) \ + (!_fcport || IS_SESSION_DELETED(_fcport) || atomic_read(&_fcport->state) != FCS_ONLINE || \ + !_fcport->vha->hw->flags.fw_started) + #endif --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1996,6 +1996,11 @@ qla2x00_tmf_iocb_timeout(void *data) int rc, h; unsigned long flags;
+ if (sp->type == SRB_MARKER) { + complete(&tmf->u.tmf.comp); + return; + } + rc = qla24xx_async_abort_cmd(sp, false); if (rc) { spin_lock_irqsave(sp->qpair->qp_lock_ptr, flags); @@ -2023,6 +2028,7 @@ static void qla_marker_sp_done(srb_t *sp sp->handle, sp->fcport->d_id.b24, sp->u.iocb_cmd.u.tmf.flags, sp->u.iocb_cmd.u.tmf.lun, sp->qpair->id);
+ sp->u.iocb_cmd.u.tmf.data = res; complete(&tmf->u.tmf.comp); }
@@ -2039,6 +2045,11 @@ static void qla_marker_sp_done(srb_t *sp } while (cnt); \ }
+/** + * qla26xx_marker: send marker IOCB and wait for the completion of it. + * @arg: pointer to argument list. + * It is assume caller will provide an fcport pointer and modifier + */ static int qla26xx_marker(struct tmf_arg *arg) { @@ -2048,6 +2059,14 @@ qla26xx_marker(struct tmf_arg *arg) int rval = QLA_FUNCTION_FAILED; fc_port_t *fcport = arg->fcport;
+ if (TMF_NOT_READY(arg->fcport)) { + ql_dbg(ql_dbg_taskm, vha, 0x8039, + "FC port not ready for marker loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n", + fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, arg->qpair->id); + return QLA_SUSPENDED; + } + /* ref: INIT */ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL); if (!sp) @@ -2074,11 +2093,19 @@ qla26xx_marker(struct tmf_arg *arg)
if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x8031, - "Marker IOCB failed (%x).\n", rval); + "Marker IOCB send failure (%x).\n", rval); goto done_free_sp; }
wait_for_completion(&tm_iocb->u.tmf.comp); + rval = tm_iocb->u.tmf.data; + + if (rval != QLA_SUCCESS) { + ql_log(ql_log_warn, vha, 0x8019, + "Marker failed hdl=%x loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d rval %d.\n", + sp->handle, fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, sp->qpair->id, rval); + }
done_free_sp: /* ref: INIT */ @@ -2091,6 +2118,8 @@ static void qla2x00_tmf_sp_done(srb_t *s { struct srb_iocb *tmf = &sp->u.iocb_cmd;
+ if (res) + tmf->u.tmf.data = res; complete(&tmf->u.tmf.comp); }
@@ -2104,6 +2133,14 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
fc_port_t *fcport = arg->fcport;
+ if (TMF_NOT_READY(arg->fcport)) { + ql_dbg(ql_dbg_taskm, vha, 0x8032, + "FC port not ready for TM command loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n", + fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, arg->qpair->id); + return QLA_SUSPENDED; + } + /* ref: INIT */ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL); if (!sp) @@ -2178,7 +2215,9 @@ int qla_get_tmf(fc_port_t *fcport) msleep(1);
spin_lock_irqsave(&ha->tgt.sess_lock, flags); - if (fcport->deleted) { + if (TMF_NOT_READY(fcport)) { + ql_log(ql_log_warn, vha, 0x802c, + "Unable to acquire TM resource due to disruption.\n"); rc = EIO; break; } @@ -2204,7 +2243,10 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, struct scsi_qla_host *vha = fcport->vha; struct qla_qpair *qpair; struct tmf_arg a; - int i, rval; + int i, rval = QLA_SUCCESS; + + if (TMF_NOT_READY(fcport)) + return QLA_SUSPENDED;
a.vha = fcport->vha; a.fcport = fcport; @@ -2223,6 +2265,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, qpair = vha->hw->queue_pair_map[i]; if (!qpair) continue; + + if (TMF_NOT_READY(fcport)) { + ql_log(ql_log_warn, vha, 0x8026, + "Unable to send TM due to disruption.\n"); + rval = QLA_SUSPENDED; + break; + } + a.qpair = qpair; a.flags = flags|TCF_NOTMCMD_TO_TARGET; rval = __qla2x00_async_tm_cmd(&a); @@ -2231,10 +2281,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, } }
+ if (rval) + goto bailout; + a.qpair = vha->hw->base_qpair; a.flags = flags; rval = __qla2x00_async_tm_cmd(&a);
+bailout: if (a.modifier == MK_SYNC_ID_LUN) qla_put_tmf(fcport);
From: Quinn Tran qutran@marvell.com
commit fc0cba0c7be8261a1625098bd1d695077ec621c9 upstream.
System crash due to use after free. Current code allows terminate_rport_io to exit before making sure all IOs have returned. For FCP-2 devices, IOs can hang on in HW because the driver has not torn down the session in FW at the first sign of a cable pull. When the dev_loss_tmo timer pops, terminate_rport_io is called and the upper layer is about to free various resources. Terminate_rport_io triggers qla to do the final cleanup, but the cleanup might not be fast enough, leaving qla still holding on to the same resource.
Wait for IOs to return to the upper layer before resources are freed.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran qutran@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230428075339.32551-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_attr.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
--- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -2750,6 +2750,7 @@ static void qla2x00_terminate_rport_io(struct fc_rport *rport) { fc_port_t *fcport = *(fc_port_t **)rport->dd_data; + scsi_qla_host_t *vha;
if (!fcport) return; @@ -2759,9 +2760,12 @@ qla2x00_terminate_rport_io(struct fc_rpo
if (test_bit(ABORT_ISP_ACTIVE, &fcport->vha->dpc_flags)) return; + vha = fcport->vha;
if (unlikely(pci_channel_offline(fcport->vha->hw->pdev))) { qla2x00_abort_all_cmds(fcport->vha, DID_NO_CONNECT << 16); + qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24, + 0, WAIT_TARGET); return; } /* @@ -2786,6 +2790,15 @@ qla2x00_terminate_rport_io(struct fc_rpo qla2x00_port_logout(fcport->vha, fcport); } } + + /* check for any straggling io left behind */ + if (qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24, 0, WAIT_TARGET)) { + ql_log(ql_log_warn, vha, 0x300b, + "IO not return. Resetting. \n"); + set_bit(ISP_ABORT_NEEDED, &vha->dpc_flags); + qla2xxx_wake_dpc(vha); + qla2x00_wait_for_chip_reset(vha); + } }
static int
From: Quinn Tran qutran@marvell.com
commit b843adde8d490934d042fbe9e3e46697cb3a64d2 upstream.
System crash, where the driver is accessing the SCSI layer's memory (scsi_cmnd->device->host) to search for a well-known internal pointer (vha). The scsi_cmnd was released back to the upper layer, which could have freed it, but the driver is still accessing it.
7 [ffffa8e8d2c3f8d0] page_fault at ffffffff86c010fe
  [exception RIP: __qla2x00_eh_wait_for_pending_commands+240]
  RIP: ffffffffc0642350 RSP: ffffa8e8d2c3f988 RFLAGS: 00010286
  RAX: 0000000000000165 RBX: 0000000000000002 RCX: 00000000000036d8
  RDX: 0000000000000000 RSI: ffff9c5c56535188 RDI: 0000000000000286
  RBP: ffff9c5bf7aa4a58 R8: ffff9c589aecdb70 R9: 00000000000003d1
  R10: 0000000000000001 R11: 0000000000380000 R12: ffff9c5c5392bc78
  R13: ffff9c57044ff5c0 R14: ffff9c56b5a3aa00 R15: 00000000000006db
  ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
8 [ffffa8e8d2c3f9c8] qla2x00_eh_wait_for_pending_commands at ffffffffc0646dd5 [qla2xxx]
9 [ffffa8e8d2c3fa00] __qla2x00_async_tm_cmd at ffffffffc0658094 [qla2xxx]
Remove access of freed memory. Currently the driver checks whether scsi_done was called by seeing if sp->type has changed. Instead, check whether the command has left the outstanding_cmds[] array as a sign that scsi_done was called.
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran qutran@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230428075339.32551-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_isr.c | 38 +++++++++--
 drivers/scsi/qla2xxx/qla_os.c | 130 ++++++++++++++++++++---------------------
 2 files changed, 95 insertions(+), 73 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -1862,9 +1862,9 @@ qla2x00_process_completed_request(struct } }
-srb_t * -qla2x00_get_sp_from_handle(scsi_qla_host_t *vha, const char *func, - struct req_que *req, void *iocb) +static srb_t * +qla_get_sp_from_handle(scsi_qla_host_t *vha, const char *func, + struct req_que *req, void *iocb, u16 *ret_index) { struct qla_hw_data *ha = vha->hw; sts_entry_t *pkt = iocb; @@ -1899,12 +1899,25 @@ qla2x00_get_sp_from_handle(scsi_qla_host return NULL; }
- req->outstanding_cmds[index] = NULL; - + *ret_index = index; qla_put_fw_resources(sp->qpair, &sp->iores); return sp; }
+srb_t * +qla2x00_get_sp_from_handle(scsi_qla_host_t *vha, const char *func, + struct req_que *req, void *iocb) +{ + uint16_t index; + srb_t *sp; + + sp = qla_get_sp_from_handle(vha, func, req, iocb, &index); + if (sp) + req->outstanding_cmds[index] = NULL; + + return sp; +} + static void qla2x00_mbx_iocb_entry(scsi_qla_host_t *vha, struct req_que *req, struct mbx_entry *mbx) @@ -3237,13 +3250,13 @@ qla2x00_status_entry(scsi_qla_host_t *vh return; }
- req->outstanding_cmds[handle] = NULL; cp = GET_CMD_SP(sp); if (cp == NULL) { ql_dbg(ql_dbg_io, vha, 0x3018, "Command already returned (0x%x/%p).\n", sts->handle, sp);
+ req->outstanding_cmds[handle] = NULL; return; }
@@ -3514,6 +3527,9 @@ out:
if (rsp->status_srb == NULL) sp->done(sp, res); + + /* for io's, clearing of outstanding_cmds[handle] means scsi_done was called */ + req->outstanding_cmds[handle] = NULL; }
/** @@ -3590,6 +3606,7 @@ qla2x00_error_entry(scsi_qla_host_t *vha uint16_t que = MSW(pkt->handle); struct req_que *req = NULL; int res = DID_ERROR << 16; + u16 index;
ql_dbg(ql_dbg_async, vha, 0x502a, "iocb type %xh with error status %xh, handle %xh, rspq id %d\n", @@ -3608,7 +3625,6 @@ qla2x00_error_entry(scsi_qla_host_t *vha
switch (pkt->entry_type) { case NOTIFY_ACK_TYPE: - case STATUS_TYPE: case STATUS_CONT_TYPE: case LOGINOUT_PORT_IOCB_TYPE: case CT_IOCB_TYPE: @@ -3628,6 +3644,14 @@ qla2x00_error_entry(scsi_qla_host_t *vha case CTIO_TYPE7: case CTIO_CRC2: return 1; + case STATUS_TYPE: + sp = qla_get_sp_from_handle(vha, func, req, pkt, &index); + if (sp) { + sp->done(sp, res); + req->outstanding_cmds[index] = NULL; + return 0; + } + break; } fatal: ql_log(ql_log_warn, vha, 0x5030, --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1079,43 +1079,6 @@ qc24_fail_command: }
/* - * qla2x00_eh_wait_on_command - * Waits for the command to be returned by the Firmware for some - * max time. - * - * Input: - * cmd = Scsi Command to wait on. - * - * Return: - * Completed in time : QLA_SUCCESS - * Did not complete in time : QLA_FUNCTION_FAILED - */ -static int -qla2x00_eh_wait_on_command(struct scsi_cmnd *cmd) -{ -#define ABORT_POLLING_PERIOD 1000 -#define ABORT_WAIT_ITER ((2 * 1000) / (ABORT_POLLING_PERIOD)) - unsigned long wait_iter = ABORT_WAIT_ITER; - scsi_qla_host_t *vha = shost_priv(cmd->device->host); - struct qla_hw_data *ha = vha->hw; - srb_t *sp = scsi_cmd_priv(cmd); - int ret = QLA_SUCCESS; - - if (unlikely(pci_channel_offline(ha->pdev)) || ha->flags.eeh_busy) { - ql_dbg(ql_dbg_taskm, vha, 0x8005, - "Return:eh_wait.\n"); - return ret; - } - - while (sp->type && wait_iter--) - msleep(ABORT_POLLING_PERIOD); - if (sp->type) - ret = QLA_FUNCTION_FAILED; - - return ret; -} - -/* * qla2x00_wait_for_hba_online * Wait till the HBA is online after going through * <= MAX_RETRIES_OF_ISP_ABORT or @@ -1365,6 +1328,9 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd) return ret; }
+#define ABORT_POLLING_PERIOD 1000 +#define ABORT_WAIT_ITER ((2 * 1000) / (ABORT_POLLING_PERIOD)) + /* * Returns: QLA_SUCCESS or QLA_FUNCTION_FAILED. */ @@ -1378,41 +1344,73 @@ __qla2x00_eh_wait_for_pending_commands(s struct req_que *req = qpair->req; srb_t *sp; struct scsi_cmnd *cmd; + unsigned long wait_iter = ABORT_WAIT_ITER; + bool found; + struct qla_hw_data *ha = vha->hw;
status = QLA_SUCCESS;
- spin_lock_irqsave(qpair->qp_lock_ptr, flags); - for (cnt = 1; status == QLA_SUCCESS && - cnt < req->num_outstanding_cmds; cnt++) { - sp = req->outstanding_cmds[cnt]; - if (!sp) - continue; - if (sp->type != SRB_SCSI_CMD) - continue; - if (vha->vp_idx != sp->vha->vp_idx) - continue; - match = 0; - cmd = GET_CMD_SP(sp); - switch (type) { - case WAIT_HOST: - match = 1; - break; - case WAIT_TARGET: - match = cmd->device->id == t; - break; - case WAIT_LUN: - match = (cmd->device->id == t && - cmd->device->lun == l); - break; - } - if (!match) - continue; + while (wait_iter--) { + found = false;
- spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); - status = qla2x00_eh_wait_on_command(cmd); spin_lock_irqsave(qpair->qp_lock_ptr, flags); + for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { + sp = req->outstanding_cmds[cnt]; + if (!sp) + continue; + if (sp->type != SRB_SCSI_CMD) + continue; + if (vha->vp_idx != sp->vha->vp_idx) + continue; + match = 0; + cmd = GET_CMD_SP(sp); + switch (type) { + case WAIT_HOST: + match = 1; + break; + case WAIT_TARGET: + if (sp->fcport) + match = sp->fcport->d_id.b24 == t; + else + match = 0; + break; + case WAIT_LUN: + if (sp->fcport) + match = (sp->fcport->d_id.b24 == t && + cmd->device->lun == l); + else + match = 0; + break; + } + if (!match) + continue; + + spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); + + if (unlikely(pci_channel_offline(ha->pdev)) || + ha->flags.eeh_busy) { + ql_dbg(ql_dbg_taskm, vha, 0x8005, + "Return:eh_wait.\n"); + return status; + } + + /* + * SRB_SCSI_CMD is still in the outstanding_cmds array. + * it means scsi_done has not called. Wait for it to + * clear from outstanding_cmds. + */ + msleep(ABORT_POLLING_PERIOD); + spin_lock_irqsave(qpair->qp_lock_ptr, flags); + found = true; + } + spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); + + if (!found) + break; } - spin_unlock_irqrestore(qpair->qp_lock_ptr, flags); + + if (!wait_iter && found) + status = QLA_FUNCTION_FAILED;
return status; }
From: Nilesh Javali njavali@marvell.com
commit d721b591b95cf3f290f8a7cbe90aa2ee0368388d upstream.
Klocwork reports array 'vha->host_str' of size 16 may use index value(s) 16..19. Use snprintf() instead of sprintf().
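For illustration only, not part of the patch: a minimal userspace sketch of why the bounded call matters. snprintf() never writes past the size it is given and returns the length the full string would have needed, so truncation can be detected. The 16-byte size comes from the Klocwork report above; the function name and the driver string passed in are assumptions.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_HOST_STR_SIZE 16	/* size Klocwork reports for vha->host_str */

/*
 * Format "<driver>_<host_no>" into a fixed 16-byte buffer.
 * Returns the length the full string would need; a value
 * >= DEMO_HOST_STR_SIZE means the output was truncated.
 */
static int demo_format_host_str(char out[DEMO_HOST_STR_SIZE],
				const char *driver, unsigned long host_no)
{
	assert(out && driver);
	return snprintf(out, DEMO_HOST_STR_SIZE, "%s_%lu", driver, host_no);
}
```

With sprintf() the same long host number would have written past the end of the buffer; here the result is cut to 15 characters plus the terminating NUL.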
Cc: stable@vger.kernel.org
Co-developed-by: Bikash Hazarika bhazarika@marvell.com
Signed-off-by: Bikash Hazarika bhazarika@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_os.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -5088,7 +5088,8 @@ struct scsi_qla_host *qla2x00_create_hos } INIT_DELAYED_WORK(&vha->scan.scan_work, qla_scan_work_fn);
- sprintf(vha->host_str, "%s_%lu", QLA2XXX_DRIVER_NAME, vha->host_no); + snprintf(vha->host_str, sizeof(vha->host_str), "%s_%lu", + QLA2XXX_DRIVER_NAME, vha->host_no); ql_dbg(ql_dbg_init, vha, 0x0041, "Allocated the host=%p hw=%p vha=%p dev_name=%s", vha->host, vha->hw, vha,
From: Nilesh Javali njavali@marvell.com
commit 6b504d06976fe4a61cc05dedc68b84fadb397f77 upstream.
Klocwork reported a warning that a NULL pointer may be dereferenced. The routine exits when sa_ctl is NULL, but fcport is assigned only after that exit call, so the NULL fcport pointer is dereferenced at the time of exit.
To avoid the fcport pointer dereference, return directly from the routine when sa_ctl is NULL.
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_edif.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_edif.c +++ b/drivers/scsi/qla2xxx/qla_edif.c @@ -2361,8 +2361,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_h if (!sa_ctl) { ql_dbg(ql_dbg_edif, vha, 0x70e6, "sa_ctl allocation failed\n"); - rval = -ENOMEM; - goto done; + rval = -ENOMEM; + return rval; }
fcport = sa_ctl->fcport;
From: Quinn Tran qutran@marvell.com
commit b68710a8094fdffe8dd4f7a82c82649f479bb453 upstream.
Klocwork warning: Buffer Overflow - Array Index Out of Bounds
The driver uses fc_els_flogi to calculate the size of the buffer. The actual buffer is nested inside fc_els_flogi and is smaller.
Replace structure name to allow proper size calculation.
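For illustration only, not part of the patch: a minimal sketch of sizing a copy by the nested payload type rather than the enclosing frame type. The struct layouts below are hypothetical and do not match the real fc_els_flogi or fc_els_csp definitions; only the relationship (payload nested in a larger frame) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: a small payload nested inside a larger frame. */
struct demo_csp {
	uint8_t params[16];
};

struct demo_flogi {
	uint8_t cmd;
	uint8_t resv[3];
	struct demo_csp csp;	/* the part the buffer actually holds */
	uint8_t names[16];
};

static_assert(sizeof(struct demo_csp) < sizeof(struct demo_flogi),
	      "nested payload must be smaller than the enclosing frame");

/* Clamp a requested length to the destination buffer size. */
static size_t demo_clamp_len(size_t wanted, size_t buf_size)
{
	return wanted < buf_size ? wanted : buf_size;
}
```

Passing sizeof(struct demo_flogi) as the wanted length would request more bytes than the nested payload holds, which is the kind of overrun the one-line change avoids.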
Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran qutran@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -5549,7 +5549,7 @@ static void qla_get_login_template(scsi_ __be32 *q;
memset(ha->init_cb, 0, ha->init_cb_size); - sz = min_t(int, sizeof(struct fc_els_flogi), ha->init_cb_size); + sz = min_t(int, sizeof(struct fc_els_csp), ha->init_cb_size); rval = qla24xx_get_port_login_templ(vha, ha->init_cb_dma, ha->init_cb, sz); if (rval != QLA_SUCCESS) {
From: Bikash Hazarika bhazarika@marvell.com
commit 464ea494a40c6e3e0e8f91dd325408aaf21515ba upstream.
Klocwork tool reported that 'cur_dsd' may be dereferenced. Add a fix to validate the pointer before dereferencing it.
Cc: stable@vger.kernel.org
Signed-off-by: Bikash Hazarika bhazarika@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_iocb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_iocb.c +++ b/drivers/scsi/qla2xxx/qla_iocb.c @@ -607,7 +607,8 @@ qla24xx_build_scsi_type_6_iocbs(srb_t *s put_unaligned_le32(COMMAND_TYPE_6, &cmd_pkt->entry_type);
/* No data transfer */ - if (!scsi_bufflen(cmd) || cmd->sc_data_direction == DMA_NONE) { + if (!scsi_bufflen(cmd) || cmd->sc_data_direction == DMA_NONE || + tot_dsds == 0) { cmd_pkt->byte_count = cpu_to_le32(0); return 0; }
From: Nilesh Javali njavali@marvell.com
commit af73f23a27206ffb3c477cac75b5fcf03410556e upstream.
Klocwork reported a warning that rport may be NULL and will be dereferenced. The rport returned by the call to fc_bsg_to_rport() could be NULL and would then be dereferenced.
Check valid rport returned by fc_bsg_to_rport().
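For illustration only, not part of the patch: a minimal userspace sketch of validating a looked-up pointer before its first dereference. The struct and both functions are hypothetical stand-ins; demo_lookup() merely plays the role of a call that may return NULL, as fc_bsg_to_rport() is reported to do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical remote-port object carrying driver-private data. */
struct demo_rport {
	void *dd_data;
};

/* Hypothetical lookup that may fail and return NULL. */
static struct demo_rport *demo_lookup(struct demo_rport *candidate, int valid)
{
	return valid ? candidate : NULL;
}

/* Fetch the private data, bailing out before any dereference on failure. */
static int demo_process(struct demo_rport *candidate, int valid, void **priv)
{
	struct demo_rport *rport;

	assert(priv);
	rport = demo_lookup(candidate, valid);
	if (!rport)
		return -ENOMEM;	/* same error value the patch uses */

	*priv = rport->dd_data;
	return 0;
}
```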
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_bsg.c | 4 ++++
 1 file changed, 4 insertions(+)
--- a/drivers/scsi/qla2xxx/qla_bsg.c +++ b/drivers/scsi/qla2xxx/qla_bsg.c @@ -283,6 +283,10 @@ qla2x00_process_els(struct bsg_job *bsg_
if (bsg_request->msgcode == FC_BSG_RPT_ELS) { rport = fc_bsg_to_rport(bsg_job); + if (!rport) { + rval = -ENOMEM; + goto done; + } fcport = *(fc_port_t **) rport->dd_data; host = rport_to_shost(rport); vha = shost_priv(host);
From: Bikash Hazarika bhazarika@marvell.com
commit b1b9d3825df4c757d653d0b1df66f084835db9c3 upstream.
Klocwork reported array 'port_dstate_str' of size 10 may use index value(s) 10..15.
Add a fix to correct the index of array.
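For illustration only, not part of the patch: a minimal sketch of guarding a string-table lookup with the table's element count and falling back to "Unknown". The table contents and names are hypothetical; the real port_dstate_str table has different entries.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical state names; the real port_dstate_str table differs. */
static const char *const demo_state_str[] = {
	"Unknown", "Deleted", "Login pending",
};

/* A 4-bit mask (as in the patched function) can yield indexes 0..15. */
static_assert(DEMO_ARRAY_SIZE(demo_state_str) <= 16,
	      "a 4-bit mask can index at most 16 states");

/* Return the name for idx, or "Unknown" when idx is past the table. */
static const char *demo_state_name(unsigned int idx)
{
	return idx < DEMO_ARRAY_SIZE(demo_state_str) ? demo_state_str[idx]
						      : "Unknown";
}
```

The masked value can be larger than the table, so the bound has to come from the table itself rather than from the mask width.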
Cc: stable@vger.kernel.org
Signed-off-by: Bikash Hazarika bhazarika@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_inline.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_inline.h +++ b/drivers/scsi/qla2xxx/qla_inline.h @@ -109,11 +109,13 @@ qla2x00_set_fcport_disc_state(fc_port_t { int old_val; uint8_t shiftbits, mask; + uint8_t port_dstate_str_sz;
/* This will have to change when the max no. of states > 16 */ shiftbits = 4; mask = (1 << shiftbits) - 1;
+ port_dstate_str_sz = sizeof(port_dstate_str) / sizeof(char *); fcport->disc_state = state; while (1) { old_val = atomic_read(&fcport->shadow_disc_state); @@ -121,7 +123,8 @@ qla2x00_set_fcport_disc_state(fc_port_t old_val, (old_val << shiftbits) | state)) { ql_dbg(ql_dbg_disc, fcport->vha, 0x2134, "FCPort %8phC disc_state transition: %s to %s - portid=%06x.\n", - fcport->port_name, port_dstate_str[old_val & mask], + fcport->port_name, (old_val & mask) < port_dstate_str_sz ? + port_dstate_str[old_val & mask] : "Unknown", port_dstate_str[state], fcport->d_id.b24); return; }
From: Shreyas Deodhar sdeodhar@marvell.com
commit 00eca15319d9ce8c31cdf22f32a3467775423df4 upstream.
Klocwork tool reported pointer 'rport' returned from call to function fc_bsg_to_rport() may be NULL and will be dereferenced.
Add a fix to validate rport before dereferencing.
Cc: stable@vger.kernel.org
Signed-off-by: Shreyas Deodhar sdeodhar@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230607113843.37185-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_bsg.c | 2 ++
 1 file changed, 2 insertions(+)
--- a/drivers/scsi/qla2xxx/qla_bsg.c +++ b/drivers/scsi/qla2xxx/qla_bsg.c @@ -2996,6 +2996,8 @@ qla24xx_bsg_request(struct bsg_job *bsg_
if (bsg_request->msgcode == FC_BSG_RPT_ELS) { rport = fc_bsg_to_rport(bsg_job); + if (!rport) + return ret; host = rport_to_shost(rport); vha = shost_priv(host); } else {
From: Manish Rangankar mrangankar@marvell.com
commit 20fce500b232b970e40312a9c97e7f3b6d7a709c upstream.
System crash when qla2x00_start_sp(sp) returns error code EAGAIN and wake_up gets called for the uninitialized wait queue sp->nvme_ls_waitq.
qla2xxx [0000:37:00.1]-2121:5: Returning existing qpair of ffff8ae2c0513400 for idx=0
qla2xxx [0000:37:00.1]-700e:5: qla2x00_start_sp failed = 11
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021
Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
RIP: 0010:__wake_up_common+0x4c/0x190
RSP: 0018:ffff95f3e0cb7cd0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff8b08d3b26328 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8b08d3b26320
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffe8
R10: 0000000000000000 R11: ffff95f3e0cb7a60 R12: ffff95f3e0cb7d20
R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8b2fdf6c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000002f1e410002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 __wake_up_common_lock+0x7c/0xc0
 qla_nvme_ls_req+0x355/0x4c0 [qla2xxx]
 ? __nvme_fc_send_ls_req+0x260/0x380 [nvme_fc]
 ? nvme_fc_send_ls_req.constprop.42+0x1a/0x45 [nvme_fc]
 ? nvme_fc_connect_ctrl_work.cold.63+0x1e3/0xa7d [nvme_fc]
Remove the unused nvme_ls_waitq wait queue. The nvme_ls_waitq logic was removed previously in the commits tagged Fixes: below.
Fixes: 219d27d7147e ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands")
Fixes: 5621b0dd7453 ("scsi: qla2xxx: Simpify unregistration of FC-NVMe local/remote ports")
Cc: stable@vger.kernel.org
Signed-off-by: Manish Rangankar mrangankar@marvell.com
Signed-off-by: Nilesh Javali njavali@marvell.com
Link: https://lore.kernel.org/r/20230615074633.12721-1-njavali@marvell.com
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_def.h | 1 -
 drivers/scsi/qla2xxx/qla_nvme.c | 3 ---
 2 files changed, 4 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -703,7 +703,6 @@ typedef struct srb { struct iocb_resource iores; struct kref cmd_kref; /* need to migrate ref_count over to this */ void *priv; - wait_queue_head_t nvme_ls_waitq; struct fc_port *fcport; struct scsi_qla_host *vha; unsigned int start_timer:1; --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -360,7 +360,6 @@ static int qla_nvme_ls_req(struct nvme_f if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x700e, "qla2x00_start_sp failed = %d\n", rval); - wake_up(&sp->nvme_ls_waitq); sp->priv = NULL; priv->sp = NULL; qla2x00_rel_sp(sp); @@ -652,7 +651,6 @@ static int qla_nvme_post_cmd(struct nvme if (!sp) return -EBUSY;
- init_waitqueue_head(&sp->nvme_ls_waitq); kref_init(&sp->cmd_kref); spin_lock_init(&priv->cmd_lock); sp->priv = priv; @@ -671,7 +669,6 @@ static int qla_nvme_post_cmd(struct nvme if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x212d, "qla2x00_start_nvme_mq failed = %d\n", rval); - wake_up(&sp->nvme_ls_waitq); sp->priv = NULL; priv->sp = NULL; qla2xxx_rel_qpair_sp(sp->qpair, sp);
From: Dan Carpenter dan.carpenter@linaro.org
commit 339020091e246e708c1381acf74c5f8e3fe4d2b5 upstream.
This loop will exit successfully when "found" is false or in the failure case it times out with "wait_iter" set to -1. The test for timeouts is impossible as is.
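For illustration only, not part of the patch: a minimal userspace sketch of the loop shape described above. With an unsigned counter and "while (wait_iter--)", running out of iterations leaves the counter wrapped to ULONG_MAX (which compares equal to -1 after conversion), never 0, so a "!wait_iter" timeout test cannot fire. The function name and the pending_at input array are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * pending_at[i] is a hypothetical input: whether work was still pending
 * at poll i. The caller must supply at least max_iter entries.
 * Returns 0 when pending work cleared in time, -1 on timeout.
 */
static int demo_wait_bounded(const bool *pending_at, unsigned long max_iter)
{
	unsigned long wait_iter = max_iter;
	size_t poll = 0;

	assert(pending_at);
	while (wait_iter--) {
		if (!pending_at[poll++])
			break;		/* nothing pending: success */
	}

	/*
	 * If the loop ran out, the final post-decrement wrapped the unsigned
	 * counter from 0 to ULONG_MAX, which is what "== -1" matches after
	 * integer conversion. It is never 0 on timeout.
	 */
	return wait_iter == (unsigned long)-1 ? -1 : 0;
}
```

Note that clearing on the very last allowed poll leaves the counter at 0 and is still reported as success, which matches the break path in the patched function.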
Fixes: b843adde8d49 ("scsi: qla2xxx: Fix mem access after free")
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Link: https://lore.kernel.org/r/cea5a62f-b873-4347-8f8e-c67527ced8d2@kili.mountain
Signed-off-by: Martin K. Petersen martin.petersen@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/scsi/qla2xxx/qla_os.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1409,7 +1409,7 @@ __qla2x00_eh_wait_for_pending_commands(s break; }
- if (!wait_iter && found) + if (wait_iter == -1) status = QLA_FUNCTION_FAILED;
return status;
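The reasoning in the commit message can be checked with a small standalone sketch. It assumes the driver loop has the common `while (wait_iter--)` shape; the function name `demo_wait_for_pending` and its `busy_polls` parameter are hypothetical and are not taken from qla2xxx.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a bounded polling loop (not the driver code).
 * busy_polls is the number of polls that still find pending work.
 * Returns the value of wait_iter after the loop.
 */
static int demo_wait_for_pending(int wait_iter, int busy_polls)
{
	bool found = true;

	while (wait_iter--) {
		found = busy_polls-- > 0;
		if (!found)
			break;	/* success: nothing pending any more */
	}

	/*
	 * On timeout the loop condition sees 0 and the post-decrement
	 * still runs, so wait_iter ends at -1. If wait_iter is 0 here,
	 * the loop was left through "break", which means found is false.
	 * "!wait_iter && found" therefore never holds.
	 */
	return wait_iter;
}
```

Under these assumptions, `demo_wait_for_pending(3, 10)` runs out of iterations and returns -1, while `demo_wait_for_pending(3, 1)` breaks out early with a non-negative count, which is why the patched test compares against -1.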
From: Dan Carpenter dan.carpenter@linaro.org
commit cad7526f33ce1e7d387d1d0568a089e41deec5c2 upstream.
This error path needs to call mutex_unlock(&ocelot->tas_lock) before returning.
Fixes: 2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum")
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/dsa/ocelot/felix_vsc9959.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1449,7 +1449,8 @@ static int vsc9959_qos_port_tas_set(stru
 		mutex_unlock(&ocelot->tas_lock);
 		return 0;
 	} else if (taprio->cmd != TAPRIO_CMD_REPLACE) {
-		return -EOPNOTSUPP;
+		ret = -EOPNOTSUPP;
+		goto err_unlock;
 	}
 
 	ret = ocelot_port_mqprio(ocelot, port, &taprio->mqprio);
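The shape of this fix is the usual single-exit unlock pattern. The following is a minimal userspace sketch of that pattern only; `struct demo_lock`, `demo_port_set()` and the `DEMO_CMD_*` values are hypothetical, and the toy lock merely records whether it is held instead of being a real mutex.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held; stands in for a mutex. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l) { l->held = true; }
static void demo_lock_release(struct demo_lock *l) { l->held = false; }

enum demo_cmd { DEMO_CMD_DESTROY, DEMO_CMD_REPLACE, DEMO_CMD_OTHER };

static int demo_port_set(struct demo_lock *lock, enum demo_cmd cmd)
{
	int ret = 0;

	demo_lock_acquire(lock);

	if (cmd == DEMO_CMD_DESTROY) {
		/* early success path drops the lock itself */
		demo_lock_release(lock);
		return 0;
	} else if (cmd != DEMO_CMD_REPLACE) {
		/* a bare "return -EOPNOTSUPP" here would leave the lock held */
		ret = -EOPNOTSUPP;
		goto err_unlock;
	}

	/* ... configuration work that may also set ret and jump ... */

err_unlock:
	demo_lock_release(lock);
	return ret;
}
```

Routing every late failure through one label keeps the release in a single place, so a newly added error branch cannot silently skip it.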
From: Thomas Bogendoerfer tsbogend@alpha.franken.de
commit 3a6dbb691782e88e07e5c70b327495dbd58a2e7f upstream.
Commit e4de20576986 ("MIPS: KVM: Fix NULL pointer dereference") missed converting one place accessing cop0 registers, which results in a build error, if KVM_MIPS_DEBUG_COP0_COUNTERS is enabled.
Fixes: e4de20576986 ("MIPS: KVM: Fix NULL pointer dereference")
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/mips/kvm/stats.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/mips/kvm/stats.c
+++ b/arch/mips/kvm/stats.c
@@ -54,9 +54,9 @@ void kvm_mips_dump_stats(struct kvm_vcpu
 	kvm_info("\nKVM VCPU[%d] COP0 Access Profile:\n", vcpu->vcpu_id);
 	for (i = 0; i < N_MIPS_COPROC_REGS; i++) {
 		for (j = 0; j < N_MIPS_COPROC_SEL; j++) {
-			if (vcpu->arch.cop0->stat[i][j])
+			if (vcpu->arch.cop0.stat[i][j])
 				kvm_info("%s[%d]: %lu\n", kvm_cop0_str[i], j,
-					 vcpu->arch.cop0->stat[i][j]);
+					 vcpu->arch.cop0.stat[i][j]);
 		}
 	}
 #endif
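Why such a leftover can survive normal build testing is easier to see in a reduced example. This sketch assumes only what the diff shows, namely that `cop0` is accessed as an embedded member and that the affected loop sits behind a debug-only preprocessor guard; every `demo_`/`DEMO_` name is hypothetical.

```c
#include <stddef.h>

#define DEMO_N_REGS 2
#define DEMO_N_SEL  2

/*
 * Comment this out and the loop below is never compiled, so a stale
 * "arch->cop0->stat" inside it would go unnoticed by the build.
 */
#define DEMO_DEBUG_COUNTERS 1

struct demo_cop0 {
	unsigned long stat[DEMO_N_REGS][DEMO_N_SEL];
};

struct demo_vcpu_arch {
	struct demo_cop0 cop0;	/* embedded member: use ".", not "->" */
};

/* Counts the non-zero access counters, as a stand-in for printing them. */
static size_t demo_count_nonzero(const struct demo_vcpu_arch *arch)
{
	size_t n = 0;
#ifdef DEMO_DEBUG_COUNTERS
	int i, j;

	for (i = 0; i < DEMO_N_REGS; i++) {
		for (j = 0; j < DEMO_N_SEL; j++) {
			if (arch->cop0.stat[i][j])
				n++;
		}
	}
#else
	(void)arch;
#endif
	return n;
}
```

Code behind a rarely enabled option is only type-checked when that option is set, which matches the commit message's note that the error appears only with KVM_MIPS_DEBUG_COP0_COUNTERS enabled.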
From: Mario Limonciello mario.limonciello@amd.com
commit 1e66a17ce546eabad753178bbd4175cb52bafca8 upstream.
This reverts commit 072030b1783056b5de8b0fac5303a5e9dbc6cfde. This is no longer necessary when using newer DMUB F/W.
Cc: stable@vger.kernel.org
Cc: Sean Wang sean.ns.wang@amd.com
Cc: Marc Rossi Marc.Rossi@amd.com
Cc: Hamza Mahfooz Hamza.Mahfooz@amd.com
Cc: Tsung-hua (Ryan) Lin Tsung-hua.Lin@amd.com
Reviewed-by: Leo Li sunpeng.li@amd.com
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/display/modules/power/power_helpers.c | 2 --
 1 file changed, 2 deletions(-)

--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -818,8 +818,6 @@ bool is_psr_su_specific_panel(struct dc_
 			((dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x08) ||
 			(dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x07)))
 			isPSRSUSupported = false;
-		else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x03)
-			isPSRSUSupported = false;
 		else if (dpcd_caps->psr_info.force_psrsu_cap == 0x1)
 			isPSRSUSupported = true;
 	}
Hi,
On Fri, 21 Jul 2023 18:01:49 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
I confirmed that this rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below.
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr
Thanks, SJ
---
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: debugfs_rm_non_contexts.sh
ok 8 selftests: damon: sysfs.sh
ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh
ok 10 selftests: damon: reclaim.sh
ok 11 selftests: damon: lru_sort.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_m68k.sh
ok 12 selftests: damon-tests: build_arm64.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
[...]
Hi Greg
On Sat, Jul 22, 2023 at 1:09 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
6.4.5-rc1 tested.
Build successfully completed. Boot successfully completed. No dmesg regressions. Video output normal. Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10 (Intel i7-1260P (x86_64), Arch Linux)
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
On Fri, Jul 21, 2023 at 06:01:49PM +0200, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Tested rc1 against the Fedora build system (aarch64, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On 7/21/23 9:01 AM, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Fri, Jul 21, 2023 at 06:01:49PM +0200, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully compiled and installed bindeb-pkgs on my computer (Acer Aspire E15, Intel Core i3 Haswell). No noticeable regressions.
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On Fri, 21 Jul 2023 at 21:39, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE: The following kernel warning was noticed while booting qemu-arm64 with these configs enabled on stable rc 6.4.5-rc1.
CONFIG_ARM64_64K_PAGES=y CONFIG_KFENCE=y
This warning is not easily reproducible.
boot logs:
--------
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
[ 0.000000] Linux version 6.4.5-rc1 (tuxmake@tuxmake) (aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT @1689957802
[ 0.000000] random: crng init done
[ 0.000000] Machine model: linux,dummy-virt
...
<6>[ 0.006821] kfence: initialized - using 33554432 bytes for 255 objects at 0x(____ptrval____)-0x(____ptrval____)
...
<4>[ 7.726994] ------------[ cut here ]------------
<4>[ 7.727704] WARNING: CPU: 1 PID: 1 at mm/kfence/core.c:1097 __kfence_free+0x84/0xc8
<4>[ 7.730078] Modules linked in: ip_tables x_tables
<4>[ 7.732637] CPU: 1 PID: 1 Comm: systemd Not tainted 6.4.5-rc1 #1
<4>[ 7.733334] Hardware name: linux,dummy-virt (DT)
<4>[ 7.734765] pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
<4>[ 7.735323] pc : __kfence_free+0x84/0xc8
<4>[ 7.736036] lr : __slab_free+0x490/0x508
<4>[ 7.736374] sp : ffff8000080afb40
<4>[ 7.736657] x29: ffff8000080afb40 x28: ffffffc0003fa100 x27: 0000000000000000
<4>[ 7.738294] x26: 0000000000000000 x25: 0000000000000001 x24: 0000000000000000
<4>[ 7.739138] x23: ffffcd8ea7099000 x22: ffffcd8ea45cac38 x21: ffff0000fe840000
<4>[ 7.739961] x20: ffffcd8ea45cac38 x19: ffff0000c0012300 x18: 0000000000000000
<4>[ 7.740778] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
<4>[ 7.741636] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
<4>[ 7.742474] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffcd8ea45601e0
<4>[ 7.743407] x8 : ffff8000080afc20 x7 : 0000000000000000 x6 : 0000000000009901
<4>[ 7.744268] x5 : ffffcd8ea45cac38 x4 : ffffcd8ea7099000 x3 : ffffcd8ea76e7aa0
<4>[ 7.745093] x2 : ffff0000c162e000 x1 : ffffcd8ea76fa5b0 x0 : ffff0000fe840000
<4>[ 7.746478] Call trace:
<4>[ 7.746776] __kfence_free+0x84/0xc8
<4>[ 7.747134] __slab_free+0x490/0x508
<4>[ 7.748063] __kmem_cache_free+0x2b4/0x2d0
<4>[ 7.748377] kfree+0x78/0x140
<4>[ 7.748638] single_release+0x40/0x60
<4>[ 7.750664] __fput+0x78/0x260
<4>[ 7.751065] ____fput+0x18/0x30
<4>[ 7.752086] task_work_run+0x80/0xe0
<4>[ 7.753122] do_notify_resume+0x200/0x1398
<4>[ 7.754292] el0_svc+0xec/0x100
<4>[ 7.754573] el0t_64_sync_handler+0xf4/0x120
<4>[ 7.755559] el0t_64_sync+0x190/0x198
<4>[ 7.756643] ---[ end trace 0000000000000000 ]---
Links
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.4-...
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.4-...
 - https://storage.tuxsuite.com/public/linaro/lkft/builds/2StEPFnEfoD076PRu8fIx...

## Build
* kernel: 6.4.5-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-6.4.y
* git commit: 4f44255da83d4e0d6c39114e6d90f43705c9159d
* git describe: v6.4.4-293-g4f44255da83d
* test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.4-...
## Test Regressions (compared to v6.4.4-150-g698271d38e0b)
## Metric Regressions (compared to v6.4.4-150-g698271d38e0b)
## Test Fixes (compared to v6.4.4-150-g698271d38e0b)
## Metric Fixes (compared to v6.4.4-150-g698271d38e0b)
## Test result summary
total: 165616, pass: 143054, fail: 2212, skip: 20190, xfail: 160

## Build Summary
* arc: 5 total, 5 passed, 0 failed
* arm: 141 total, 141 passed, 0 failed
* arm64: 50 total, 50 passed, 0 failed
* i386: 37 total, 37 passed, 0 failed
* mips: 26 total, 26 passed, 0 failed
* parisc: 3 total, 3 passed, 0 failed
* powerpc: 34 total, 34 passed, 0 failed
* riscv: 22 total, 22 passed, 0 failed
* s390: 12 total, 12 passed, 0 failed
* sh: 12 total, 12 passed, 0 failed
* sparc: 6 total, 6 passed, 0 failed
* x86_64: 42 total, 42 passed, 0 failed
## Test suites summary
* boot
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers-dma-buf
* kselftest-efivarfs
* kselftest-exec
* kselftest-filesystems
* kselftest-filesystems-binderfs
* kselftest-filesytems-epoll
* kselftest-firmware
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-vm
* kselftest-watchdog
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* perf
* rcutorture
* v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On 7/21/2023 9:01 AM, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Fri, Jul 21, 2023 at 06:01:49PM +0200, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
Build results:
	total: 157 pass: 157 fail: 0
Qemu test results:
	total: 523 pass: 523 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On Fri, 21 Jul 2023 18:01:49 +0200, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Sun, 23 Jul 2023 16:04:29 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.5-rc1.g...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests passing for Tegra ...
Test results for stable-v6.4:
    11 builds:	11 pass, 0 fail
    28 boots:	28 pass, 0 fail
    130 tests:	130 pass, 0 fail

Linux version:	6.4.5-rc1-g698271d38e0b
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
                tegra20-ventana, tegra210-p2371-2180,
                tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Fri, Jul 21, 2023 at 06:01:49PM +0200, Greg Kroah-Hartman wrote:

> This is the start of the stable review cycle for the 6.4.5 release. There are 292 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
linux-stable-mirror@lists.linaro.org