This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.19.13-rc1
Levi Yun ppbuk5246@gmail.com damon/sysfs: fix possible memleak on damon_sysfs_add_target
Nadav Amit namit@vmware.com x86/alternative: Fix race in try_get_desc()
Borislav Petkov bp@suse.de x86/cacheinfo: Add a cpu_llc_shared_mask() UP variant
Jim Mattson jmattson@google.com KVM: x86: Hide IA32_PLATFORM_DCA_CAP[31:0] from the guest
Arnaldo Carvalho de Melo acme@redhat.com perf tests record: Fail the test if the 'errs' counter is not zero
Zhengjun Xing zhengjun.xing@linux.intel.com perf test: Fix test case 87 ("perf record tests") for hybrid systems
Daniel Golle daniel@makrotopia.org net: ethernet: mtk_eth_soc: fix mask of RX_DMA_GET_SPORT{,_V2}
Vladimir Oltean vladimir.oltean@nxp.com net: mscc: ocelot: fix tagged VLAN refusal while under a VLAN-unaware bridge
Peng Fan peng.fan@nxp.com clk: imx93: drop of_match_ptr
Florian Fainelli f.fainelli@gmail.com clk: iproc: Do not rely on node name for correct PLL setup
Ashutosh Dixit ashutosh.dixit@intel.com drm/i915/gt: Perf_limit_reasons are only available for Gen11+
Han Xu han.xu@nxp.com clk: imx: imx6sx: remove the SET_RATE_PARENT flag for QSPI clocks
Al Viro viro@zeniv.linux.org.uk don't use __kernel_write() on kmap_local_page()
Eli Cohen elic@nvidia.com vdpa/mlx5: Fix MQ to support non power of two num queues
Suwan Kim suwan.kim027@gmail.com virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()
Angus Chen angus.chen@jaguarmicro.com vdpa/ifcvf: fix the calculation of queuepair
Maciej Fijalkowski maciej.fijalkowski@intel.com ice: xsk: drop power of 2 ring size restriction for AF_XDP
Maciej Fijalkowski maciej.fijalkowski@intel.com ice: xsk: change batched Tx descriptor cleaning
Wang Yufen wangyufen@huawei.com selftests: Fix the if conditions of in test_extra_filter()
Lukas Wunner lukas@wunner.de net: phy: Don't WARN for PHY_UP state in mdio_bus_phy_resume()
Junxiao Chang junxiao.chang@intel.com net: stmmac: power up/down serdes in stmmac_open/release
Paweł Lenkow pawel.lenkow@camlingroup.com wifi: mac80211: fix memory corruption in minstrel_ht_update_rates()
Hans de Goede hdegoede@redhat.com wifi: mac80211: fix regression with non-QoS drivers
Tamizh Chelvam Raja quic_tamizhr@quicinc.com wifi: cfg80211: fix MCS divisor value
Michael Kelley mikelley@microsoft.com nvme: Fix IOC_PR_CLEAR and IOC_PR_RELEASE ioctls for nvme devices
Peng Wu wupeng58@huawei.com net/mlxbf_gige: Fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe
Rafael Mendonca rafaelmendsr@gmail.com cxgb4: fix missing unlock on ETHOFLD desc collect fail path
Hangyu Hua hbh25y@gmail.com net: sched: act_ct: fix possible refcount leak in tcf_ct_init()
Peilin Ye peilin.ye@bytedance.com usbnet: Fix memory leak in usbnet_disconnect()
Zhengjun Xing zhengjun.xing@linux.intel.com perf parse-events: Remove "not supported" hybrid cache events
Zhengjun Xing zhengjun.xing@linux.intel.com perf print-events: Fix "perf list" can not display the PMU prefix for some hybrid cache events
Ian Rogers irogers@google.com perf parse-events: Break out tracepoint and printing
Pali Rohár pali@kernel.org gpio: mvebu: Fix check for pwm support on non-A8K platforms
Yang Yingliang yangyingliang@huawei.com Input: melfas_mip4 - fix return value check in mip4_probe()
Brian Norris briannorris@chromium.org Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time"
Radhey Shyam Pandey radhey.shyam.pandey@amd.com net: macb: Fix ZynqMP SGMII non-wakeup source resume failure
Francesco Dolcini francesco.dolcini@toradex.com drm/bridge: lt8912b: fix corrupted image output
Philippe Schenker philippe.schenker@toradex.com drm/bridge: lt8912b: set hdmi or dvi mode
Philippe Schenker philippe.schenker@toradex.com drm/bridge: lt8912b: add vsync hsync
Martin Povišer povik+lin@cutebit.org ASoC: tas2770: Reinit regcache on reset
Johan Hovold johan+linaro@kernel.org arm64: dts: qcom: sm8350: fix UFS PHY serdes size
Conor Dooley conor.dooley@microchip.com clk: microchip: mpfs: make the rtc's ahb clock critical
Conor Dooley conor.dooley@microchip.com clk: microchip: mpfs: fix clk_cfg array bounds violation
Shengjiu Wang shengjiu.wang@nxp.com ASoC: imx-card: Fix refcount issue with of_node_put
Samuel Holland samuel@sholland.org soc: sunxi: sram: Fix debugfs info for A64 SRAM C
Samuel Holland samuel@sholland.org soc: sunxi: sram: Fix probe function ordering issues
Samuel Holland samuel@sholland.org soc: sunxi: sram: Prevent the driver from being unbound
Samuel Holland samuel@sholland.org soc: sunxi: sram: Actually claim SRAM regions
Romain Naour romain.naour@skf.com ARM: dts: am5748: keep usb4_tm disabled
Richard Zhu hongxing.zhu@nxp.com reset: imx7: Fix the iMX8MP PCIe PHY PERST support
YuTong Chang mtwget@gmail.com ARM: dts: am33xx: Fix MMCHS0 dma properties
Hans Verkuil hverkuil-cisco@xs4all.nl media: v4l2-compat-ioctl32.c: zero buffer passed to v4l2_compat_get_array_args()
Nícolas F. R. A. Prado nfraprado@collabora.com media: mediatek: vcodec: Drop platform_get_resource(IORESOURCE_IRQ)
Nicolas Dufresne nicolas.dufresne@collabora.com media: rkvdec: Disable H.264 error detection
Hangyu Hua hbh25y@gmail.com media: dvb_vb2: fix possible out of bound access
Shuai Xue xueshuai@linux.alibaba.com mm,hwpoison: check mm when killing accessing process
Doug Berger opendmb@gmail.com mm/hugetlb: correct demote page offset logic
Sergei Antonov saproj@gmail.com mm: bring back update_mmu_cache() to finish_fault()
Minchan Kim minchan@kernel.org mm: fix madivse_pageout mishandling on non-LRU page
Alistair Popple apopple@nvidia.com mm/migrate_device.c: copy pte dirty bit to page
Alistair Popple apopple@nvidia.com mm/migrate_device.c: add missing flush_cache_page()
Alistair Popple apopple@nvidia.com mm/migrate_device.c: flush TLB while holding PTL
Binyi Han dantengknight@gmail.com mm: fix dereferencing possible ERR_PTR
Zi Yan ziy@nvidia.com mm/page_isolation: fix isolate_single_pageblock() isolation behavior
Maurizio Lombardi mlombard@redhat.com mm: prevent page_frag_alloc() from corrupting the memory
Mel Gorman mgorman@techsingularity.net mm/page_alloc: fix race condition between build_all_zonelists and page allocation
Yang Shi shy828301@gmail.com mm: gup: fix the fast GUP race against THP collapse
Wenchao Chen wenchao.chen@unisoc.com mmc: hsq: Fix data stomping during mmc recovery
Sergei Antonov saproj@gmail.com mmc: moxart: fix 4-bit bus width and remove 8-bit bus width
Menglong Dong imagedong@tencent.com mptcp: fix unreleased socket in accept queue
Menglong Dong imagedong@tencent.com mptcp: factor out __mptcp_close() without socket lock
Florian Westphal fw@strlen.de mm: fix BUG splat with kvmalloc + GFP_ATOMIC
Niklas Cassel niklas.cassel@wdc.com libata: add ATA_HORKAGE_NOLPM for Pioneer BDR-207M and BDR-205
Maxime Coquelin maxime.coquelin@redhat.com vduse: prevent uninitialized memory accesses
Bokun Zhang Bokun.Zhang@amd.com drm/amdgpu: Add amdgpu suspend-resume code path under SRIOV
Chris Wilson chris@chris-wilson.co.uk drm/i915/gt: Restrict forced preemption to the active context
Yang Shi shy828301@gmail.com powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush
Ulf Hansson ulf.hansson@linaro.org Revert "firmware: arm_scmi: Add clock management to the SCMI power domain"
Alexander Couzens lynxis@fe80.eu net: mt7531: only do PLL once after the reset
Greg Kroah-Hartman gregkh@linuxfoundation.org mm/damon/dbgfs: fix memory leak when using debugfs_lookup()
Kees Cook keescook@chromium.org x86/uaccess: avoid check_object_size() in copy_from_user_nmi()
ChenXiaoSong chenxiaosong2@huawei.com ntfs: fix BUG_ON in ntfs_lookup_inode_by_name()
Linus Walleij linus.walleij@linaro.org ARM: dts: integrator: Tag PCI host with device_type
Christoph Hellwig hch@lst.de frontswap: don't call ->init if no ops are registered
Jarkko Sakkinen jarkko@kernel.org x86/sgx: Do not fail on incomplete sanitization on premature stop of ksgxd
Alexander Wetzel alexander@wetzel-home.de wifi: mac80211: ensure vif queues are operational after start
Aidan MacDonald aidanmacdonald.0x0@gmail.com clk: ingenic-tcu: Properly enable registers before accessing timers
Marc Kleine-Budde mkl@pengutronix.de can: c_can: don't cache TX messages for C_CAN cores
Sebastian Krzyszkowiak sebastian.krzyszkowiak@puri.sm Input: snvs_pwrkey - fix SNVS_HPVIDR1 register address
Frank Wunderlich frank-w@public-files.de net: usb: qmi_wwan: Add new usb-id for Dell branded EM7455
Mario Limonciello mario.limonciello@amd.com thunderbolt: Explicitly reset plug events delay back to USB4 spec value
Heikki Krogerus heikki.krogerus@linux.intel.com usb: typec: ucsi: Remove incorrect warning
Hongling Zeng zenghongling@kylinos.cn uas: ignore UAS for Thinkplus chips
Hongling Zeng zenghongling@kylinos.cn usb-storage: Add Hiksemi USB3-FW to IGNORE_UAS
Hongling Zeng zenghongling@kylinos.cn uas: add no-uas quirk for Hiksemi usb_disk
William Breathitt Gray william.gray@linaro.org counter: 104-quad-8: Fix skipped IRQ lines during events configuration
William Breathitt Gray william.gray@linaro.org counter: 104-quad-8: Implement and utilize register structures
William Breathitt Gray william.gray@linaro.org counter: 104-quad-8: Utilize iomap interface
Adrian Hunter adrian.hunter@intel.com perf record: Fix cpu mask bit setting for mixed mmaps
Athira Rajeev atrajeev@linux.vnet.ibm.com tools/perf: Fix out of bound access to cpu mask array
Heiko Stuebner heiko@sntech.de riscv: make t-head erratas depend on MMU
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/am33xx-l4.dtsi | 3 +- arch/arm/boot/dts/am5748.dtsi | 4 + arch/arm/boot/dts/integratorap.dts | 1 + arch/arm64/boot/dts/qcom/sm8350.dtsi | 2 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 9 - arch/riscv/Kconfig.erratas | 2 +- arch/x86/include/asm/smp.h | 25 +- arch/x86/kernel/alternative.c | 45 +- arch/x86/kernel/cpu/sgx/main.c | 15 +- arch/x86/kvm/cpuid.c | 2 - arch/x86/lib/usercopy.c | 2 +- drivers/ata/libata-core.c | 4 + drivers/block/virtio_blk.c | 11 +- drivers/clk/bcm/clk-iproc-pll.c | 12 +- drivers/clk/imx/clk-imx6sx.c | 4 +- drivers/clk/imx/clk-imx93.c | 2 +- drivers/clk/ingenic/tcu.c | 15 +- drivers/clk/microchip/clk-mpfs.c | 11 +- drivers/counter/104-quad-8.c | 209 +++--- drivers/firmware/arm_scmi/scmi_pm_domain.c | 26 - drivers/gpio/gpio-mvebu.c | 15 +- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 4 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 +- drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 13 - drivers/gpu/drm/bridge/lontium-lt8912b.c | 13 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 + .../gpu/drm/i915/gt/intel_execlists_submission.c | 21 +- drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c | 15 +- drivers/input/keyboard/snvs_pwrkey.c | 2 +- drivers/input/touchscreen/melfas_mip4.c | 2 +- drivers/media/dvb-core/dvb_vb2.c | 11 + .../platform/mediatek/vcodec/mtk_vcodec_enc_drv.c | 9 +- drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 2 + drivers/mmc/host/mmc_hsq.c | 2 +- drivers/mmc/host/moxart-mmc.c | 17 +- drivers/net/can/c_can/c_can.h | 17 +- drivers/net/can/c_can/c_can_main.c | 11 +- drivers/net/dsa/mt7530.c | 15 +- drivers/net/ethernet/cadence/macb_main.c | 4 + drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c | 28 +- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 163 ++--- drivers/net/ethernet/intel/ice/ice_xsk.h | 7 +- drivers/net/ethernet/mediatek/mtk_eth_soc.h | 4 +- .../ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c | 4 +- drivers/net/ethernet/mscc/ocelot.c | 7 + drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 23 +- drivers/net/phy/phy_device.c | 10 +- drivers/net/usb/qmi_wwan.c | 1 + drivers/net/usb/usbnet.c | 7 +- drivers/nvme/host/core.c | 6 +- drivers/reset/reset-imx7.c | 1 + drivers/soc/sunxi/sunxi_sram.c | 23 +- drivers/staging/media/rkvdec/rkvdec-h264.c | 4 +- drivers/thunderbolt/switch.c | 1 + drivers/usb/storage/unusual_uas.h | 21 + drivers/usb/typec/ucsi/ucsi.c | 2 - drivers/vdpa/ifcvf/ifcvf_base.c | 4 +- drivers/vdpa/mlx5/net/mlx5_vnet.c | 17 +- drivers/vdpa/vdpa_user/vduse_dev.c | 9 +- fs/coredump.c | 38 +- fs/internal.h | 3 + fs/ntfs/super.c | 3 +- fs/read_write.c | 22 +- mm/damon/dbgfs.c | 19 +- mm/damon/sysfs.c | 2 +- mm/frontswap.c | 3 + mm/gup.c | 34 +- mm/hugetlb.c | 14 +- mm/khugepaged.c | 10 +- mm/madvise.c | 7 +- mm/memory-failure.c | 3 + mm/memory.c | 14 +- mm/migrate_device.c | 16 +- mm/page_alloc.c | 65 +- mm/page_isolation.c | 25 +- mm/secretmem.c | 2 +- mm/util.c | 4 + net/mac80211/rc80211_minstrel_ht.c | 6 +- net/mac80211/tx.c | 4 + net/mac80211/util.c | 4 +- net/mptcp/protocol.c | 16 +- net/mptcp/protocol.h | 2 + net/mptcp/subflow.c | 33 +- net/sched/act_ct.c | 5 +- net/wireless/util.c | 4 +- sound/soc/codecs/tas2770.c | 3 + sound/soc/fsl/imx-card.c | 4 + tools/perf/builtin-list.c | 2 +- tools/perf/builtin-lock.c | 1 + tools/perf/builtin-record.c | 28 +- tools/perf/builtin-timechart.c | 1 + tools/perf/builtin-trace.c | 1 + tools/perf/tests/perf-record.c | 2 +- tools/perf/tests/shell/record.sh | 2 +- tools/perf/util/Build | 2 + tools/perf/util/parse-events-hybrid.c | 21 +- tools/perf/util/parse-events.c | 734 ++------------------- tools/perf/util/parse-events.h | 32 +- tools/perf/util/print-events.c | 533 +++++++++++++++ tools/perf/util/print-events.h | 22 + tools/perf/util/trace-event-info.c | 96 +++ tools/perf/util/tracepoint.c | 63 ++ tools/perf/util/tracepoint.h | 25 + tools/testing/selftests/net/reuseport_bpf.c | 2 +- 106 files changed, 1628 insertions(+), 1271 deletions(-)
From: Heiko Stuebner heiko@sntech.de
[ Upstream commit 2a2018c3ac84c2dc7cfbad117ce9339ea0914622 ]
Both basic extensions of SVPBMT and ZICBOM depend on CONFIG_MMU. Make the T-Head errata implementations of the similar functionality also depend on it to prevent build errors.
Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head") Fixes: d20ec7529236 ("riscv: implement cache-management errata for T-Head SoCs") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Guo Ren guoren@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220907154932.2858518-1-heiko@sntech.de Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/Kconfig.erratas | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas index 457ac72c9b36..e59a770b4432 100644 --- a/arch/riscv/Kconfig.erratas +++ b/arch/riscv/Kconfig.erratas @@ -46,7 +46,7 @@ config ERRATA_THEAD
config ERRATA_THEAD_PBMT bool "Apply T-Head memory type errata" - depends on ERRATA_THEAD && 64BIT + depends on ERRATA_THEAD && 64BIT && MMU select RISCV_ALTERNATIVE_EARLY default y help
From: Athira Rajeev atrajeev@linux.vnet.ibm.com
[ Upstream commit cbd7bfc7fd99acdde58ec2b0bce990158fba1654 ]
The cpu mask init code in "record__mmap_cpu_mask_init" function access "bits" array part of "struct mmap_cpu_mask". The size of this array is the value from cpu__max_cpu().cpu. This array is used to contain the cpumask value for each cpu. While setting bit for each cpu, it calls "set_bit" function which access index in "bits" array.
If we provide a command line option to -C which is greater than the number of CPU's present in the system, the set_bit could access an array member which is out-of the array size. This is because currently, there is no boundary check for the CPU. This will result in seg fault:
<<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Segmentation fault (core dumped) <<>>
Debugging with gdb, points to function flow as below:
<<>> set_bit record__mmap_cpu_mask_init record__init_thread_default_masks record__init_thread_masks cmd_record <<>>
Fix this by adding boundary check for the array.
After the patch:
<<>> ./perf record -C 12341234 ls Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS Failed to initialize parallel data streaming masks <<>>
With this fix, if -C is given a non-exsiting CPU, perf record will fail with:
<<>> ./perf record -C 50 ls Failed to initialize parallel data streaming masks <<>>
Reported-by: Nageswara R Sastry rnsastry@linux.ibm.com Signed-off-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Tested-by: Nageswara R Sastry rnsastry@linux.ibm.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kajol Jain kjain@linux.ibm.com Cc: Madhavan Srinivasan maddy@linux.vnet.ibm.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220905141929.7171-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: ca76d7d2812b ("perf record: Fix cpu mask bit setting for mixed mmaps") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-record.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 68c878b4e5e4..708880a1c83c 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -3335,16 +3335,22 @@ static struct option __record_options[] = {
struct option *record_options = __record_options;
-static void record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus) +static int record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cpu_map *cpus) { struct perf_cpu cpu; int idx;
if (cpu_map__is_dummy(cpus)) - return; + return 0;
- perf_cpu_map__for_each_cpu(cpu, idx, cpus) + perf_cpu_map__for_each_cpu(cpu, idx, cpus) { + /* Return ENODEV is input cpu is greater than max cpu */ + if ((unsigned long)cpu.cpu > mask->nbits) + return -ENODEV; set_bit(cpu.cpu, mask->bits); + } + + return 0; }
static int record__mmap_cpu_mask_init_spec(struct mmap_cpu_mask *mask, const char *mask_spec) @@ -3356,7 +3362,9 @@ static int record__mmap_cpu_mask_init_spec(struct mmap_cpu_mask *mask, const cha return -ENOMEM;
bitmap_zero(mask->bits, mask->nbits); - record__mmap_cpu_mask_init(mask, cpus); + if (record__mmap_cpu_mask_init(mask, cpus)) + return -ENODEV; + perf_cpu_map__put(cpus);
return 0; @@ -3438,7 +3446,12 @@ static int record__init_thread_masks_spec(struct record *rec, struct perf_cpu_ma pr_err("Failed to allocate CPUs mask\n"); return ret; } - record__mmap_cpu_mask_init(&cpus_mask, cpus); + + ret = record__mmap_cpu_mask_init(&cpus_mask, cpus); + if (ret) { + pr_err("Failed to init cpu mask\n"); + goto out_free_cpu_mask; + }
ret = record__thread_mask_alloc(&full_mask, cpu__max_cpu().cpu); if (ret) { @@ -3679,7 +3692,8 @@ static int record__init_thread_default_masks(struct record *rec, struct perf_cpu if (ret) return ret;
- record__mmap_cpu_mask_init(&rec->thread_masks->maps, cpus); + if (record__mmap_cpu_mask_init(&rec->thread_masks->maps, cpus)) + return -ENODEV;
rec->nr_threads = 1;
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit ca76d7d2812b46124291f99c9b50aaf63a936f23 ]
With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value -1 must be skipped when setting CPU mask bits.
Prior to commit cbd7bfc7fd99acdd ("tools/perf: Fix out of bound access to cpu mask array") the invalid setting went unnoticed, but since then it causes perf record to fail with an error.
Example:
Before:
$ perf record -e intel_pt// --per-thread uname Failed to initialize parallel data streaming masks
After:
$ perf record -e intel_pt// --per-thread uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.068 MB perf.data ]
Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Athira Rajeev atrajeev@linux.vnet.ibm.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-record.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 708880a1c83c..7fbc85c1da81 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -3344,6 +3344,8 @@ static int record__mmap_cpu_mask_init(struct mmap_cpu_mask *mask, struct perf_cp return 0;
perf_cpu_map__for_each_cpu(cpu, idx, cpus) { + if (cpu.cpu == -1) + continue; /* Return ENODEV is input cpu is greater than max cpu */ if ((unsigned long)cpu.cpu > mask->nbits) return -ENODEV;
From: William Breathitt Gray william.gray@linaro.org
[ Upstream commit b6e9cded90d46b1066a1ca260d8b5ecf67787aba ]
This driver doesn't need to access I/O ports directly via inb()/outb() and friends. This patch abstracts such access by calling ioport_map() to enable the use of more typical ioread8()/iowrite8() I/O memory accessor calls.
Link: https://lore.kernel.org/r/861c003318dce3d2bef4061711643bb04f5ec14f.165220192... Cc: Syed Nayyar Waris syednwaris@gmail.com Suggested-by: David Laight David.Laight@ACULAB.COM Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: William Breathitt Gray william.gray@linaro.org Link: https://lore.kernel.org/r/e971b897cacfac4cb2eca478f5533d2875f5cadd.165781347... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 2bc54aaa65d2 ("counter: 104-quad-8: Fix skipped IRQ lines during events configuration") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/counter/104-quad-8.c | 169 ++++++++++++++++++----------------- 1 file changed, 89 insertions(+), 80 deletions(-)
diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index a17e51d65aca..43dde9abfdcf 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -63,7 +63,7 @@ struct quad8 { unsigned int synchronous_mode[QUAD8_NUM_COUNTERS]; unsigned int index_polarity[QUAD8_NUM_COUNTERS]; unsigned int cable_fault_enable; - unsigned int base; + void __iomem *base; };
#define QUAD8_REG_INTERRUPT_STATUS 0x10 @@ -118,8 +118,8 @@ static int quad8_signal_read(struct counter_device *counter, if (signal->id < 16) return -EINVAL;
- state = inb(priv->base + QUAD8_REG_INDEX_INPUT_LEVELS) - & BIT(signal->id - 16); + state = ioread8(priv->base + QUAD8_REG_INDEX_INPUT_LEVELS) & + BIT(signal->id - 16);
*level = (state) ? COUNTER_SIGNAL_LEVEL_HIGH : COUNTER_SIGNAL_LEVEL_LOW;
@@ -130,14 +130,14 @@ static int quad8_count_read(struct counter_device *counter, struct counter_count *count, u64 *val) { struct quad8 *const priv = counter_priv(counter); - const int base_offset = priv->base + 2 * count->id; + void __iomem *const base_offset = priv->base + 2 * count->id; unsigned int flags; unsigned int borrow; unsigned int carry; unsigned long irqflags; int i;
- flags = inb(base_offset + 1); + flags = ioread8(base_offset + 1); borrow = flags & QUAD8_FLAG_BT; carry = !!(flags & QUAD8_FLAG_CT);
@@ -147,11 +147,11 @@ static int quad8_count_read(struct counter_device *counter, spin_lock_irqsave(&priv->lock, irqflags);
/* Reset Byte Pointer; transfer Counter to Output Latch */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_CNTR_OUT, - base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_CNTR_OUT, + base_offset + 1);
for (i = 0; i < 3; i++) - *val |= (unsigned long)inb(base_offset) << (8 * i); + *val |= (unsigned long)ioread8(base_offset) << (8 * i);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -162,7 +162,7 @@ static int quad8_count_write(struct counter_device *counter, struct counter_count *count, u64 val) { struct quad8 *const priv = counter_priv(counter); - const int base_offset = priv->base + 2 * count->id; + void __iomem *const base_offset = priv->base + 2 * count->id; unsigned long irqflags; int i;
@@ -173,27 +173,27 @@ static int quad8_count_write(struct counter_device *counter, spin_lock_irqsave(&priv->lock, irqflags);
/* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1);
/* Counter can only be set via Preset Register */ for (i = 0; i < 3; i++) - outb(val >> (8 * i), base_offset); + iowrite8(val >> (8 * i), base_offset);
/* Transfer Preset Register to Counter */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_PRESET_CNTR, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_PRESET_CNTR, base_offset + 1);
/* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1);
/* Set Preset Register back to original value */ val = priv->preset[count->id]; for (i = 0; i < 3; i++) - outb(val >> (8 * i), base_offset); + iowrite8(val >> (8 * i), base_offset);
/* Reset Borrow, Carry, Compare, and Sign flags */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); /* Reset Error flag */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -246,7 +246,7 @@ static int quad8_function_write(struct counter_device *counter, unsigned int *const quadrature_mode = priv->quadrature_mode + id; unsigned int *const scale = priv->quadrature_scale + id; unsigned int *const synchronous_mode = priv->synchronous_mode + id; - const int base_offset = priv->base + 2 * id + 1; + void __iomem *const base_offset = priv->base + 2 * id + 1; unsigned long irqflags; unsigned int mode_cfg; unsigned int idr_cfg; @@ -266,7 +266,7 @@ static int quad8_function_write(struct counter_device *counter, if (*synchronous_mode) { *synchronous_mode = 0; /* Disable synchronous function mode */ - outb(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset); } } else { *quadrature_mode = 1; @@ -292,7 +292,7 @@ static int quad8_function_write(struct counter_device *counter, }
/* Load mode configuration to Counter Mode Register */ - outb(QUAD8_CTR_CMR | mode_cfg, base_offset); + iowrite8(QUAD8_CTR_CMR | mode_cfg, base_offset);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -305,10 +305,10 @@ static int quad8_direction_read(struct counter_device *counter, { const struct quad8 *const priv = counter_priv(counter); unsigned int ud_flag; - const unsigned int flag_addr = priv->base + 2 * count->id + 1; + void __iomem *const flag_addr = priv->base + 2 * count->id + 1;
/* U/D flag: nonzero = up, zero = down */ - ud_flag = inb(flag_addr) & QUAD8_FLAG_UD; + ud_flag = ioread8(flag_addr) & QUAD8_FLAG_UD;
*direction = (ud_flag) ? COUNTER_COUNT_DIRECTION_FORWARD : COUNTER_COUNT_DIRECTION_BACKWARD; @@ -402,7 +402,7 @@ static int quad8_events_configure(struct counter_device *counter) struct counter_event_node *event_node; unsigned int next_irq_trigger; unsigned long ior_cfg; - unsigned long base_offset; + void __iomem *base_offset;
spin_lock_irqsave(&priv->lock, irqflags);
@@ -438,13 +438,13 @@ static int quad8_events_configure(struct counter_device *counter) priv->preset_enable[event_node->channel] << 1 | priv->irq_trigger[event_node->channel] << 3; base_offset = priv->base + 2 * event_node->channel + 1; - outb(QUAD8_CTR_IOR | ior_cfg, base_offset); + iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset);
/* Enable IRQ line */ irq_enabled |= BIT(event_node->channel); }
- outb(irq_enabled, priv->base + QUAD8_REG_INDEX_INTERRUPT); + iowrite8(irq_enabled, priv->base + QUAD8_REG_INDEX_INTERRUPT);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -508,7 +508,7 @@ static int quad8_index_polarity_set(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id - 16; - const int base_offset = priv->base + 2 * channel_id + 1; + void __iomem *const base_offset = priv->base + 2 * channel_id + 1; unsigned long irqflags; unsigned int idr_cfg = index_polarity << 1;
@@ -519,7 +519,7 @@ static int quad8_index_polarity_set(struct counter_device *counter, priv->index_polarity[channel_id] = index_polarity;
/* Load Index Control configuration to Index Control Register */ - outb(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -549,7 +549,7 @@ static int quad8_synchronous_mode_set(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id - 16; - const int base_offset = priv->base + 2 * channel_id + 1; + void __iomem *const base_offset = priv->base + 2 * channel_id + 1; unsigned long irqflags; unsigned int idr_cfg = synchronous_mode;
@@ -566,7 +566,7 @@ static int quad8_synchronous_mode_set(struct counter_device *counter, priv->synchronous_mode[channel_id] = synchronous_mode;
/* Load Index Control configuration to Index Control Register */ - outb(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -614,7 +614,7 @@ static int quad8_count_mode_write(struct counter_device *counter, struct quad8 *const priv = counter_priv(counter); unsigned int count_mode; unsigned int mode_cfg; - const int base_offset = priv->base + 2 * count->id + 1; + void __iomem *const base_offset = priv->base + 2 * count->id + 1; unsigned long irqflags;
/* Map Generic Counter count mode to 104-QUAD-8 count mode */ @@ -648,7 +648,7 @@ static int quad8_count_mode_write(struct counter_device *counter, mode_cfg |= (priv->quadrature_scale[count->id] + 1) << 3;
/* Load mode configuration to Counter Mode Register */ - outb(QUAD8_CTR_CMR | mode_cfg, base_offset); + iowrite8(QUAD8_CTR_CMR | mode_cfg, base_offset);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -669,7 +669,7 @@ static int quad8_count_enable_write(struct counter_device *counter, struct counter_count *count, u8 enable) { struct quad8 *const priv = counter_priv(counter); - const int base_offset = priv->base + 2 * count->id; + void __iomem *const base_offset = priv->base + 2 * count->id; unsigned long irqflags; unsigned int ior_cfg;
@@ -681,7 +681,7 @@ static int quad8_count_enable_write(struct counter_device *counter, priv->irq_trigger[count->id] << 3;
/* Load I/O control configuration */ - outb(QUAD8_CTR_IOR | ior_cfg, base_offset + 1); + iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset + 1);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -697,9 +697,9 @@ static int quad8_error_noise_get(struct counter_device *counter, struct counter_count *count, u32 *noise_error) { const struct quad8 *const priv = counter_priv(counter); - const int base_offset = priv->base + 2 * count->id + 1; + void __iomem *const base_offset = priv->base + 2 * count->id + 1;
- *noise_error = !!(inb(base_offset) & QUAD8_FLAG_E); + *noise_error = !!(ioread8(base_offset) & QUAD8_FLAG_E);
return 0; } @@ -717,17 +717,17 @@ static int quad8_count_preset_read(struct counter_device *counter, static void quad8_preset_register_set(struct quad8 *const priv, const int id, const unsigned int preset) { - const unsigned int base_offset = priv->base + 2 * id; + void __iomem *const base_offset = priv->base + 2 * id; int i;
priv->preset[id] = preset;
/* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1);
/* Set Preset Register */ for (i = 0; i < 3; i++) - outb(preset >> (8 * i), base_offset); + iowrite8(preset >> (8 * i), base_offset); }
static int quad8_count_preset_write(struct counter_device *counter, @@ -816,7 +816,7 @@ static int quad8_count_preset_enable_write(struct counter_device *counter, u8 preset_enable) { struct quad8 *const priv = counter_priv(counter); - const int base_offset = priv->base + 2 * count->id + 1; + void __iomem *const base_offset = priv->base + 2 * count->id + 1; unsigned long irqflags; unsigned int ior_cfg;
@@ -831,7 +831,7 @@ static int quad8_count_preset_enable_write(struct counter_device *counter, priv->irq_trigger[count->id] << 3;
/* Load I/O control configuration to Input / Output Control Register */ - outb(QUAD8_CTR_IOR | ior_cfg, base_offset); + iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -858,7 +858,7 @@ static int quad8_signal_cable_fault_read(struct counter_device *counter, }
/* Logic 0 = cable fault */ - status = inb(priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); + status = ioread8(priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -899,7 +899,8 @@ static int quad8_signal_cable_fault_enable_write(struct counter_device *counter, /* Enable is active low in Differential Encoder Cable Status register */ cable_fault_enable = ~priv->cable_fault_enable;
- outb(cable_fault_enable, priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); + iowrite8(cable_fault_enable, + priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -923,7 +924,7 @@ static int quad8_signal_fck_prescaler_write(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id / 2; - const int base_offset = priv->base + 2 * channel_id; + void __iomem *const base_offset = priv->base + 2 * channel_id; unsigned long irqflags;
spin_lock_irqsave(&priv->lock, irqflags); @@ -931,12 +932,12 @@ static int quad8_signal_fck_prescaler_write(struct counter_device *counter, priv->fck_prescaler[channel_id] = prescaler;
/* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1);
/* Set filter clock factor */ - outb(prescaler, base_offset); - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, - base_offset + 1); + iowrite8(prescaler, base_offset); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, + base_offset + 1);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -1084,12 +1085,12 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) { struct counter_device *counter = private; struct quad8 *const priv = counter_priv(counter); - const unsigned long base = priv->base; + void __iomem *const base = priv->base; unsigned long irq_status; unsigned long channel; u8 event;
- irq_status = inb(base + QUAD8_REG_INTERRUPT_STATUS); + irq_status = ioread8(base + QUAD8_REG_INTERRUPT_STATUS); if (!irq_status) return IRQ_NONE;
@@ -1118,17 +1119,43 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) }
/* Clear pending interrupts on device */ - outb(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, base + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, base + QUAD8_REG_CHAN_OP);
return IRQ_HANDLED; }
+static void quad8_init_counter(void __iomem *const base_offset) +{ + unsigned long i; + + /* Reset Byte Pointer */ + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + /* Reset filter clock factor */ + iowrite8(0, base_offset); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, + base_offset + 1); + /* Reset Byte Pointer */ + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + /* Reset Preset Register */ + for (i = 0; i < 3; i++) + iowrite8(0x00, base_offset); + /* Reset Borrow, Carry, Compare, and Sign flags */ + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); + /* Reset Error flag */ + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1); + /* Binary encoding; Normal count; non-quadrature mode */ + iowrite8(QUAD8_CTR_CMR, base_offset + 1); + /* Disable A and B inputs; preset on index; FLG1 as Carry */ + iowrite8(QUAD8_CTR_IOR, base_offset + 1); + /* Disable index function; negative index polarity */ + iowrite8(QUAD8_CTR_IDR, base_offset + 1); +} + static int quad8_probe(struct device *dev, unsigned int id) { struct counter_device *counter; struct quad8 *priv; - int i, j; - unsigned int base_offset; + unsigned long i; int err;
if (!devm_request_region(dev, base[id], QUAD8_EXTENT, dev_name(dev))) { @@ -1142,6 +1169,10 @@ static int quad8_probe(struct device *dev, unsigned int id) return -ENOMEM; priv = counter_priv(counter);
+ priv->base = devm_ioport_map(dev, base[id], QUAD8_EXTENT); + if (!priv->base) + return -ENOMEM; + /* Initialize Counter device and driver data */ counter->name = dev_name(dev); counter->parent = dev; @@ -1150,43 +1181,21 @@ static int quad8_probe(struct device *dev, unsigned int id) counter->num_counts = ARRAY_SIZE(quad8_counts); counter->signals = quad8_signals; counter->num_signals = ARRAY_SIZE(quad8_signals); - priv->base = base[id];
spin_lock_init(&priv->lock);
/* Reset Index/Interrupt Register */ - outb(0x00, base[id] + QUAD8_REG_INDEX_INTERRUPT); + iowrite8(0x00, priv->base + QUAD8_REG_INDEX_INTERRUPT); /* Reset all counters and disable interrupt function */ - outb(QUAD8_CHAN_OP_RESET_COUNTERS, base[id] + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_RESET_COUNTERS, priv->base + QUAD8_REG_CHAN_OP); /* Set initial configuration for all counters */ - for (i = 0; i < QUAD8_NUM_COUNTERS; i++) { - base_offset = base[id] + 2 * i; - /* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); - /* Reset filter clock factor */ - outb(0, base_offset); - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, - base_offset + 1); - /* Reset Byte Pointer */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); - /* Reset Preset Register */ - for (j = 0; j < 3; j++) - outb(0x00, base_offset); - /* Reset Borrow, Carry, Compare, and Sign flags */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); - /* Reset Error flag */ - outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1); - /* Binary encoding; Normal count; non-quadrature mode */ - outb(QUAD8_CTR_CMR, base_offset + 1); - /* Disable A and B inputs; preset on index; FLG1 as Carry */ - outb(QUAD8_CTR_IOR, base_offset + 1); - /* Disable index function; negative index polarity */ - outb(QUAD8_CTR_IDR, base_offset + 1); - } + for (i = 0; i < QUAD8_NUM_COUNTERS; i++) + quad8_init_counter(priv->base + 2 * i); /* Disable Differential Encoder Cable Status for all channels */ - outb(0xFF, base[id] + QUAD8_DIFF_ENCODER_CABLE_STATUS); + iowrite8(0xFF, priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); /* Enable all counters and enable interrupt function */ - outb(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, base[id] + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, + priv->base + QUAD8_REG_CHAN_OP);
err = devm_request_irq(&counter->dev, irq[id], quad8_irq_handler, IRQF_SHARED, counter->name, counter);
From: William Breathitt Gray william.gray@linaro.org
[ Upstream commit daae1ee572d1f99daddef26afe6c6fc7aeea741d ]
Reduce magic numbers and improve code readability by implementing and utilizing named register data structures.
Link: https://lore.kernel.org/r/20220707171709.36010-1-william.gray@linaro.org/ Cc: Syed Nayyar Waris syednwaris@gmail.com Tested-by: Fred Eckert Frede@cmslaser.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: William Breathitt Gray william.gray@linaro.org Link: https://lore.kernel.org/r/285fdc7c03892251f50bdbf2c28c19998243a6a3.165781347... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 2bc54aaa65d2 ("counter: 104-quad-8: Fix skipped IRQ lines during events configuration") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/counter/104-quad-8.c | 166 ++++++++++++++++++++--------------- 1 file changed, 93 insertions(+), 73 deletions(-)
diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 43dde9abfdcf..62c2b7ac4339 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -33,6 +33,36 @@ MODULE_PARM_DESC(irq, "ACCES 104-QUAD-8 interrupt line numbers");
#define QUAD8_NUM_COUNTERS 8
+/** + * struct channel_reg - channel register structure + * @data: Count data + * @control: Channel flags and control + */ +struct channel_reg { + u8 data; + u8 control; +}; + +/** + * struct quad8_reg - device register structure + * @channel: quadrature counter data and control + * @interrupt_status: channel interrupt status + * @channel_oper: enable/reset counters and interrupt functions + * @index_interrupt: enable channel interrupts + * @reserved: reserved for Factory Use + * @index_input_levels: index signal logical input level + * @cable_status: differential encoder cable status + */ +struct quad8_reg { + struct channel_reg channel[QUAD8_NUM_COUNTERS]; + u8 interrupt_status; + u8 channel_oper; + u8 index_interrupt; + u8 reserved[3]; + u8 index_input_levels; + u8 cable_status; +}; + /** * struct quad8 - device private data structure * @lock: lock to prevent clobbering device states during R/W ops @@ -48,7 +78,7 @@ MODULE_PARM_DESC(irq, "ACCES 104-QUAD-8 interrupt line numbers"); * @synchronous_mode: array of index function synchronous mode configurations * @index_polarity: array of index function polarity configurations * @cable_fault_enable: differential encoder cable status enable configurations - * @base: base port address of the device + * @reg: I/O address offset for the device registers */ struct quad8 { spinlock_t lock; @@ -63,14 +93,9 @@ struct quad8 { unsigned int synchronous_mode[QUAD8_NUM_COUNTERS]; unsigned int index_polarity[QUAD8_NUM_COUNTERS]; unsigned int cable_fault_enable; - void __iomem *base; + struct quad8_reg __iomem *reg; };
-#define QUAD8_REG_INTERRUPT_STATUS 0x10 -#define QUAD8_REG_CHAN_OP 0x11 -#define QUAD8_REG_INDEX_INTERRUPT 0x12 -#define QUAD8_REG_INDEX_INPUT_LEVELS 0x16 -#define QUAD8_DIFF_ENCODER_CABLE_STATUS 0x17 /* Borrow Toggle flip-flop */ #define QUAD8_FLAG_BT BIT(0) /* Carry Toggle flip-flop */ @@ -118,8 +143,7 @@ static int quad8_signal_read(struct counter_device *counter, if (signal->id < 16) return -EINVAL;
- state = ioread8(priv->base + QUAD8_REG_INDEX_INPUT_LEVELS) & - BIT(signal->id - 16); + state = ioread8(&priv->reg->index_input_levels) & BIT(signal->id - 16);
*level = (state) ? COUNTER_SIGNAL_LEVEL_HIGH : COUNTER_SIGNAL_LEVEL_LOW;
@@ -130,14 +154,14 @@ static int quad8_count_read(struct counter_device *counter, struct counter_count *count, u64 *val) { struct quad8 *const priv = counter_priv(counter); - void __iomem *const base_offset = priv->base + 2 * count->id; + struct channel_reg __iomem *const chan = priv->reg->channel + count->id; unsigned int flags; unsigned int borrow; unsigned int carry; unsigned long irqflags; int i;
- flags = ioread8(base_offset + 1); + flags = ioread8(&chan->control); borrow = flags & QUAD8_FLAG_BT; carry = !!(flags & QUAD8_FLAG_CT);
@@ -148,10 +172,10 @@ static int quad8_count_read(struct counter_device *counter,
/* Reset Byte Pointer; transfer Counter to Output Latch */ iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_CNTR_OUT, - base_offset + 1); + &chan->control);
for (i = 0; i < 3; i++) - *val |= (unsigned long)ioread8(base_offset) << (8 * i); + *val |= (unsigned long)ioread8(&chan->data) << (8 * i);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -162,7 +186,7 @@ static int quad8_count_write(struct counter_device *counter, struct counter_count *count, u64 val) { struct quad8 *const priv = counter_priv(counter); - void __iomem *const base_offset = priv->base + 2 * count->id; + struct channel_reg __iomem *const chan = priv->reg->channel + count->id; unsigned long irqflags; int i;
@@ -173,27 +197,27 @@ static int quad8_count_write(struct counter_device *counter, spin_lock_irqsave(&priv->lock, irqflags);
/* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control);
/* Counter can only be set via Preset Register */ for (i = 0; i < 3; i++) - iowrite8(val >> (8 * i), base_offset); + iowrite8(val >> (8 * i), &chan->data);
/* Transfer Preset Register to Counter */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_PRESET_CNTR, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_PRESET_CNTR, &chan->control);
/* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control);
/* Set Preset Register back to original value */ val = priv->preset[count->id]; for (i = 0; i < 3; i++) - iowrite8(val >> (8 * i), base_offset); + iowrite8(val >> (8 * i), &chan->data);
/* Reset Borrow, Carry, Compare, and Sign flags */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, &chan->control); /* Reset Error flag */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, &chan->control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -246,7 +270,7 @@ static int quad8_function_write(struct counter_device *counter, unsigned int *const quadrature_mode = priv->quadrature_mode + id; unsigned int *const scale = priv->quadrature_scale + id; unsigned int *const synchronous_mode = priv->synchronous_mode + id; - void __iomem *const base_offset = priv->base + 2 * id + 1; + u8 __iomem *const control = &priv->reg->channel[id].control; unsigned long irqflags; unsigned int mode_cfg; unsigned int idr_cfg; @@ -266,7 +290,7 @@ static int quad8_function_write(struct counter_device *counter, if (*synchronous_mode) { *synchronous_mode = 0; /* Disable synchronous function mode */ - iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, control); } } else { *quadrature_mode = 1; @@ -292,7 +316,7 @@ static int quad8_function_write(struct counter_device *counter, }
/* Load mode configuration to Counter Mode Register */ - iowrite8(QUAD8_CTR_CMR | mode_cfg, base_offset); + iowrite8(QUAD8_CTR_CMR | mode_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -305,7 +329,7 @@ static int quad8_direction_read(struct counter_device *counter, { const struct quad8 *const priv = counter_priv(counter); unsigned int ud_flag; - void __iomem *const flag_addr = priv->base + 2 * count->id + 1; + u8 __iomem *const flag_addr = &priv->reg->channel[count->id].control;
/* U/D flag: nonzero = up, zero = down */ ud_flag = ioread8(flag_addr) & QUAD8_FLAG_UD; @@ -402,7 +426,6 @@ static int quad8_events_configure(struct counter_device *counter) struct counter_event_node *event_node; unsigned int next_irq_trigger; unsigned long ior_cfg; - void __iomem *base_offset;
spin_lock_irqsave(&priv->lock, irqflags);
@@ -437,14 +460,14 @@ static int quad8_events_configure(struct counter_device *counter) ior_cfg = priv->ab_enable[event_node->channel] | priv->preset_enable[event_node->channel] << 1 | priv->irq_trigger[event_node->channel] << 3; - base_offset = priv->base + 2 * event_node->channel + 1; - iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset); + iowrite8(QUAD8_CTR_IOR | ior_cfg, + &priv->reg->channel[event_node->channel].control);
/* Enable IRQ line */ irq_enabled |= BIT(event_node->channel); }
- iowrite8(irq_enabled, priv->base + QUAD8_REG_INDEX_INTERRUPT); + iowrite8(irq_enabled, &priv->reg->index_interrupt);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -508,7 +531,7 @@ static int quad8_index_polarity_set(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id - 16; - void __iomem *const base_offset = priv->base + 2 * channel_id + 1; + u8 __iomem *const control = &priv->reg->channel[channel_id].control; unsigned long irqflags; unsigned int idr_cfg = index_polarity << 1;
@@ -519,7 +542,7 @@ static int quad8_index_polarity_set(struct counter_device *counter, priv->index_polarity[channel_id] = index_polarity;
/* Load Index Control configuration to Index Control Register */ - iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -549,7 +572,7 @@ static int quad8_synchronous_mode_set(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id - 16; - void __iomem *const base_offset = priv->base + 2 * channel_id + 1; + u8 __iomem *const control = &priv->reg->channel[channel_id].control; unsigned long irqflags; unsigned int idr_cfg = synchronous_mode;
@@ -566,7 +589,7 @@ static int quad8_synchronous_mode_set(struct counter_device *counter, priv->synchronous_mode[channel_id] = synchronous_mode;
/* Load Index Control configuration to Index Control Register */ - iowrite8(QUAD8_CTR_IDR | idr_cfg, base_offset); + iowrite8(QUAD8_CTR_IDR | idr_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -614,7 +637,7 @@ static int quad8_count_mode_write(struct counter_device *counter, struct quad8 *const priv = counter_priv(counter); unsigned int count_mode; unsigned int mode_cfg; - void __iomem *const base_offset = priv->base + 2 * count->id + 1; + u8 __iomem *const control = &priv->reg->channel[count->id].control; unsigned long irqflags;
/* Map Generic Counter count mode to 104-QUAD-8 count mode */ @@ -648,7 +671,7 @@ static int quad8_count_mode_write(struct counter_device *counter, mode_cfg |= (priv->quadrature_scale[count->id] + 1) << 3;
/* Load mode configuration to Counter Mode Register */ - iowrite8(QUAD8_CTR_CMR | mode_cfg, base_offset); + iowrite8(QUAD8_CTR_CMR | mode_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -669,7 +692,7 @@ static int quad8_count_enable_write(struct counter_device *counter, struct counter_count *count, u8 enable) { struct quad8 *const priv = counter_priv(counter); - void __iomem *const base_offset = priv->base + 2 * count->id; + u8 __iomem *const control = &priv->reg->channel[count->id].control; unsigned long irqflags; unsigned int ior_cfg;
@@ -681,7 +704,7 @@ static int quad8_count_enable_write(struct counter_device *counter, priv->irq_trigger[count->id] << 3;
/* Load I/O control configuration */ - iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset + 1); + iowrite8(QUAD8_CTR_IOR | ior_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -697,9 +720,9 @@ static int quad8_error_noise_get(struct counter_device *counter, struct counter_count *count, u32 *noise_error) { const struct quad8 *const priv = counter_priv(counter); - void __iomem *const base_offset = priv->base + 2 * count->id + 1; + u8 __iomem *const flag_addr = &priv->reg->channel[count->id].control;
- *noise_error = !!(ioread8(base_offset) & QUAD8_FLAG_E); + *noise_error = !!(ioread8(flag_addr) & QUAD8_FLAG_E);
return 0; } @@ -717,17 +740,17 @@ static int quad8_count_preset_read(struct counter_device *counter, static void quad8_preset_register_set(struct quad8 *const priv, const int id, const unsigned int preset) { - void __iomem *const base_offset = priv->base + 2 * id; + struct channel_reg __iomem *const chan = priv->reg->channel + id; int i;
priv->preset[id] = preset;
/* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control);
/* Set Preset Register */ for (i = 0; i < 3; i++) - iowrite8(preset >> (8 * i), base_offset); + iowrite8(preset >> (8 * i), &chan->data); }
static int quad8_count_preset_write(struct counter_device *counter, @@ -816,7 +839,7 @@ static int quad8_count_preset_enable_write(struct counter_device *counter, u8 preset_enable) { struct quad8 *const priv = counter_priv(counter); - void __iomem *const base_offset = priv->base + 2 * count->id + 1; + u8 __iomem *const control = &priv->reg->channel[count->id].control; unsigned long irqflags; unsigned int ior_cfg;
@@ -831,7 +854,7 @@ static int quad8_count_preset_enable_write(struct counter_device *counter, priv->irq_trigger[count->id] << 3;
/* Load I/O control configuration to Input / Output Control Register */ - iowrite8(QUAD8_CTR_IOR | ior_cfg, base_offset); + iowrite8(QUAD8_CTR_IOR | ior_cfg, control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -858,7 +881,7 @@ static int quad8_signal_cable_fault_read(struct counter_device *counter, }
/* Logic 0 = cable fault */ - status = ioread8(priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); + status = ioread8(&priv->reg->cable_status);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -899,8 +922,7 @@ static int quad8_signal_cable_fault_enable_write(struct counter_device *counter, /* Enable is active low in Differential Encoder Cable Status register */ cable_fault_enable = ~priv->cable_fault_enable;
- iowrite8(cable_fault_enable, - priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); + iowrite8(cable_fault_enable, &priv->reg->cable_status);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -924,7 +946,7 @@ static int quad8_signal_fck_prescaler_write(struct counter_device *counter, { struct quad8 *const priv = counter_priv(counter); const size_t channel_id = signal->id / 2; - void __iomem *const base_offset = priv->base + 2 * channel_id; + struct channel_reg __iomem *const chan = priv->reg->channel + channel_id; unsigned long irqflags;
spin_lock_irqsave(&priv->lock, irqflags); @@ -932,12 +954,12 @@ static int quad8_signal_fck_prescaler_write(struct counter_device *counter, priv->fck_prescaler[channel_id] = prescaler;
/* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control);
/* Set filter clock factor */ - iowrite8(prescaler, base_offset); + iowrite8(prescaler, &chan->data); iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, - base_offset + 1); + &chan->control);
spin_unlock_irqrestore(&priv->lock, irqflags);
@@ -1085,12 +1107,11 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) { struct counter_device *counter = private; struct quad8 *const priv = counter_priv(counter); - void __iomem *const base = priv->base; unsigned long irq_status; unsigned long channel; u8 event;
- irq_status = ioread8(base + QUAD8_REG_INTERRUPT_STATUS); + irq_status = ioread8(&priv->reg->interrupt_status); if (!irq_status) return IRQ_NONE;
@@ -1119,36 +1140,36 @@ static irqreturn_t quad8_irq_handler(int irq, void *private) }
/* Clear pending interrupts on device */ - iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, base + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, &priv->reg->channel_oper);
return IRQ_HANDLED; }
-static void quad8_init_counter(void __iomem *const base_offset) +static void quad8_init_counter(struct channel_reg __iomem *const chan) { unsigned long i;
/* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control); /* Reset filter clock factor */ - iowrite8(0, base_offset); + iowrite8(0, &chan->data); iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_PRESET_PSC, - base_offset + 1); + &chan->control); /* Reset Byte Pointer */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, &chan->control); /* Reset Preset Register */ for (i = 0; i < 3; i++) - iowrite8(0x00, base_offset); + iowrite8(0x00, &chan->data); /* Reset Borrow, Carry, Compare, and Sign flags */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, &chan->control); /* Reset Error flag */ - iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1); + iowrite8(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, &chan->control); /* Binary encoding; Normal count; non-quadrature mode */ - iowrite8(QUAD8_CTR_CMR, base_offset + 1); + iowrite8(QUAD8_CTR_CMR, &chan->control); /* Disable A and B inputs; preset on index; FLG1 as Carry */ - iowrite8(QUAD8_CTR_IOR, base_offset + 1); + iowrite8(QUAD8_CTR_IOR, &chan->control); /* Disable index function; negative index polarity */ - iowrite8(QUAD8_CTR_IDR, base_offset + 1); + iowrite8(QUAD8_CTR_IDR, &chan->control); }
static int quad8_probe(struct device *dev, unsigned int id) @@ -1169,8 +1190,8 @@ static int quad8_probe(struct device *dev, unsigned int id) return -ENOMEM; priv = counter_priv(counter);
- priv->base = devm_ioport_map(dev, base[id], QUAD8_EXTENT); - if (!priv->base) + priv->reg = devm_ioport_map(dev, base[id], QUAD8_EXTENT); + if (!priv->reg) return -ENOMEM;
/* Initialize Counter device and driver data */ @@ -1185,17 +1206,16 @@ static int quad8_probe(struct device *dev, unsigned int id) spin_lock_init(&priv->lock);
/* Reset Index/Interrupt Register */ - iowrite8(0x00, priv->base + QUAD8_REG_INDEX_INTERRUPT); + iowrite8(0x00, &priv->reg->index_interrupt); /* Reset all counters and disable interrupt function */ - iowrite8(QUAD8_CHAN_OP_RESET_COUNTERS, priv->base + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_RESET_COUNTERS, &priv->reg->channel_oper); /* Set initial configuration for all counters */ for (i = 0; i < QUAD8_NUM_COUNTERS; i++) - quad8_init_counter(priv->base + 2 * i); + quad8_init_counter(priv->reg->channel + i); /* Disable Differential Encoder Cable Status for all channels */ - iowrite8(0xFF, priv->base + QUAD8_DIFF_ENCODER_CABLE_STATUS); + iowrite8(0xFF, &priv->reg->cable_status); /* Enable all counters and enable interrupt function */ - iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, - priv->base + QUAD8_REG_CHAN_OP); + iowrite8(QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC, &priv->reg->channel_oper);
err = devm_request_irq(&counter->dev, irq[id], quad8_irq_handler, IRQF_SHARED, counter->name, counter);
From: William Breathitt Gray william.gray@linaro.org
[ Upstream commit 2bc54aaa65d2126ae629919175708a28ce7ef06e ]
IRQ trigger configuration is skipped if it has already been set before; however, the IRQ line still needs to be OR'd to irq_enabled because irq_enabled is reset for every events_configure call. This patch moves the irq_enabled OR operation update to before the irq_trigger check so that IRQ line enablement is not skipped.
Fixes: c95cc0d95702 ("counter: 104-quad-8: Fix persistent enabled events bug") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/20220815122301.2750-1-william.gray@linaro.org/ Signed-off-by: William Breathitt Gray william.gray@linaro.org Link: https://lore.kernel.org/r/179eed11eaf225dbd908993b510df0c8f67b1230.166384477... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/counter/104-quad-8.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 62c2b7ac4339..4407203e0c9b 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -449,6 +449,9 @@ static int quad8_events_configure(struct counter_device *counter) return -EINVAL; }
+ /* Enable IRQ line */ + irq_enabled |= BIT(event_node->channel); + /* Skip configuration if it is the same as previously set */ if (priv->irq_trigger[event_node->channel] == next_irq_trigger) continue; @@ -462,9 +465,6 @@ static int quad8_events_configure(struct counter_device *counter) priv->irq_trigger[event_node->channel] << 3; iowrite8(QUAD8_CTR_IOR | ior_cfg, &priv->reg->channel[event_node->channel].control); - - /* Enable IRQ line */ - irq_enabled |= BIT(event_node->channel); }
iowrite8(irq_enabled, &priv->reg->index_interrupt);
From: Hongling Zeng zenghongling@kylinos.cn
commit a625a4b8806cc1e928b7dd2cca1fee709c9de56e upstream.
The UAS mode of Hiksemi is reported to fail to work on several platforms with the following error message, then after re-connecting the device will be offlined and not working at all.
[ 592.518442][ 2] sd 8:0:0:0: [sda] tag#17 uas_eh_abort_handler 0 uas-tag 18 inflight: CMD [ 592.527575][ 2] sd 8:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 03 6f 88 00 00 04 00 00 [ 592.536330][ 2] sd 8:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD [ 592.545266][ 2] sd 8:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 07 44 1a 88 00 00 08 00
These disks have a broken uas implementation, the tag field of the status iu-s is not set properly,so we need to fall-back to usb-storage.
Acked-by: Alan Stern stern@rowland.harvard.edu Cc: stable stable@kernel.org Signed-off-by: Hongling Zeng zenghongling@kylinos.cn Link: https://lore.kernel.org/r/1663901173-21020-1-git-send-email-zenghongling@kyl... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -52,6 +52,13 @@ UNUSUAL_DEV(0x059f, 0x1061, 0x0000, 0x99 USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_NO_REPORT_OPCODES | US_FL_NO_SAME),
+/* Reported-by: Hongling Zeng zenghongling@kylinos.cn */ +UNUSUAL_DEV(0x090c, 0x2000, 0x0000, 0x9999, + "Hiksemi", + "External HDD", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_IGNORE_UAS), + /* * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI * commands in UAS mode. Observed with the 1.28 firmware; are there others?
From: Hongling Zeng zenghongling@kylinos.cn
commit e00b488e813f0f1ad9f778e771b7cd2fe2877023 upstream.
The UAS mode of Hiksemi USB_HDD is reported to fail to work on several platforms with the following error message, then after re-connecting the device will be offlined and not working at all.
[ 592.518442][ 2] sd 8:0:0:0: [sda] tag#17 uas_eh_abort_handler 0 uas-tag 18 inflight: CMD [ 592.527575][ 2] sd 8:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 03 6f 88 00 00 04 00 00 [ 592.536330][ 2] sd 8:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD [ 592.545266][ 2] sd 8:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 07 44 1a 88 00 00 08 00
These disks have a broken uas implementation, the tag field of the status iu-s is not set properly,so we need to fall-back to usb-storage.
Acked-by: Alan Stern stern@rowland.harvard.edu Cc: stable stable@kernel.org Signed-off-by: Hongling Zeng zenghongling@kylinos.cn Link: https://lore.kernel.org/r/1663901185-21067-1-git-send-email-zenghongling@kyl... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -83,6 +83,13 @@ UNUSUAL_DEV(0x0bc2, 0x331a, 0x0000, 0x99 USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_NO_REPORT_LUNS),
+/* Reported-by: Hongling Zeng zenghongling@kylinos.cn */ +UNUSUAL_DEV(0x0bda, 0x9210, 0x0000, 0x9999, + "Hiksemi", + "External HDD", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_IGNORE_UAS), + /* Reported-by: Benjamin Tissoires benjamin.tissoires@redhat.com */ UNUSUAL_DEV(0x13fd, 0x3940, 0x0000, 0x9999, "Initio Corporation",
From: Hongling Zeng zenghongling@kylinos.cn
commit 0fb9703a3eade0bb84c635705d9c795345e55053 upstream.
The UAS mode of Thinkplus(0x17ef, 0x3899) is reported to influence performance and trigger kernel panic on several platforms with the following error message:
[ 39.702439] xhci_hcd 0000:0c:00.3: ERROR Transfer event for disabled endpoint or incorrect stream ring [ 39.702442] xhci_hcd 0000:0c:00.3: @000000026c61f810 00000000 00000000 1b000000 05038000
[ 720.545894][13] Workqueue: usb_hub_wq hub_event [ 720.550971][13] ffff88026c143c38 0000000000016300 ffff8802755bb900 ffff880 26cb80000 [ 720.559673][13] ffff88026c144000 ffff88026ca88100 0000000000000000 ffff880 26cb80000 [ 720.568374][13] ffff88026cb80000 ffff88026c143c50 ffffffff8186ae25 ffff880 26ca880f8 [ 720.577076][13] Call Trace: [ 720.580201][13] [<ffffffff8186ae25>] schedule+0x35/0x80 [ 720.586137][13] [<ffffffff8186b0ce>] schedule_preempt_disabled+0xe/0x10 [ 720.593623][13] [<ffffffff8186cb94>] __mutex_lock_slowpath+0x164/0x1e0 [ 720.601012][13] [<ffffffff8186cc3f>] mutex_lock+0x2f/0x40 [ 720.607141][13] [<ffffffff8162b8e9>] usb_disconnect+0x59/0x290
Falling back to USB mass storage can solve this problem, so ignore UAS function of this chip.
Acked-by: Alan Stern stern@rowland.harvard.edu Cc: stable stable@kernel.org Signed-off-by: Hongling Zeng zenghongling@kylinos.cn Link: https://lore.kernel.org/r/1663902249837086.19.seg@mailgw Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -132,6 +132,13 @@ UNUSUAL_DEV(0x154b, 0xf00d, 0x0000, 0x99 USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_NO_ATA_1X),
+/* Reported-by: Hongling Zeng zenghongling@kylinos.cn */ +UNUSUAL_DEV(0x17ef, 0x3899, 0x0000, 0x9999, + "Thinkplus", + "External HDD", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_IGNORE_UAS), + /* Reported-by: Hans de Goede hdegoede@redhat.com */ UNUSUAL_DEV(0x2109, 0x0711, 0x0000, 0x9999, "VIA",
From: Heikki Krogerus heikki.krogerus@linux.intel.com
commit 415ba26cb73f7d22a892043301b91b57ae54db02 upstream.
Sink only devices do not have any source capabilities, so the driver should not warn about that. Also DRP (Dual Role Power) capable devices, such as USB Type-C docking stations, do not return any source capabilities unless they are plugged to a power supply themselves.
Fixes: 1f4642b72be7 ("usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4") Reported-by: Paul Menzel pmenzel@molgen.mpg.de Cc: stable@vger.kernel.org Signed-off-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20220922145924.80667-1-heikki.krogerus@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -588,8 +588,6 @@ static int ucsi_get_pdos(struct ucsi_con num_pdos * sizeof(u32)); if (ret < 0 && ret != -ETIMEDOUT) dev_err(ucsi->dev, "UCSI_GET_PDOS failed (%d)\n", ret); - if (ret == 0 && offset == 0) - dev_warn(ucsi->dev, "UCSI_GET_PDOS returned 0 bytes\n");
return ret; }
From: Mario Limonciello mario.limonciello@amd.com
commit 31f87f705b3c1635345d8e8a493697099b43e508 upstream.
If any software has interacted with the USB4 registers before the Linux USB4 CM runs, it may have modified the plug events delay. It has been observed that if this value too large, it's possible that hotplugged devices will negotiate a fallback mode instead in Linux.
To prevent this, explicitly align the plug events delay with the USB4 spec value of 10ms.
Cc: stable@vger.kernel.org Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thunderbolt/switch.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/thunderbolt/switch.c +++ b/drivers/thunderbolt/switch.c @@ -2413,6 +2413,7 @@ int tb_switch_configure(struct tb_switch * additional capabilities. */ sw->config.cmuv = USB4_VERSION_1_0; + sw->config.plug_events_delay = 0xa;
/* Enumerate the switch */ ret = tb_sw_write(sw, (u32 *)&sw->config + 1, TB_CFG_SWITCH,
From: Frank Wunderlich frank-w@public-files.de
commit 797666cd5af041ffb66642fff62f7389f08566a2 upstream.
Add support for Dell 5811e (EM7455) with USB-id 0x413c:0x81c2.
Signed-off-by: Frank Wunderlich frank-w@public-files.de Cc: stable@vger.kernel.org Acked-by: Bjørn Mork bjorn@mork.no Link: https://lore.kernel.org/r/20220926150740.6684-3-linux@fw-web.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1399,6 +1399,7 @@ static const struct usb_device_id produc {QMI_FIXED_INTF(0x413c, 0x81b3, 8)}, /* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */ {QMI_FIXED_INTF(0x413c, 0x81b6, 8)}, /* Dell Wireless 5811e */ {QMI_FIXED_INTF(0x413c, 0x81b6, 10)}, /* Dell Wireless 5811e */ + {QMI_FIXED_INTF(0x413c, 0x81c2, 8)}, /* Dell Wireless 5811e */ {QMI_FIXED_INTF(0x413c, 0x81cc, 8)}, /* Dell Wireless 5816e */ {QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */ {QMI_FIXED_INTF(0x413c, 0x81d7, 1)}, /* Dell Wireless 5821e preproduction config */
From: Sebastian Krzyszkowiak sebastian.krzyszkowiak@puri.sm
commit e62563db857f81d75c5726a35bc0180bed6d1540 upstream.
Both i.MX6 and i.MX8 reference manuals list 0xBF8 as SNVS_HPVIDR1 (chapters 57.9 and 6.4.5 respectively).
Without this, trying to read the revision number results in 0 on all revisions, causing the i.MX6 quirk to apply on all platforms, which in turn causes the driver to synthesise power button release events instead of passing the real one as they happen even on platforms like i.MX8 where that's not wanted.
Fixes: 1a26c920717a ("Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q") Tested-by: Martin Kepplinger martin.kepplinger@puri.sm Signed-off-by: Sebastian Krzyszkowiak sebastian.krzyszkowiak@puri.sm Reviewed-by: Mattijs Korpershoek mkorpershoek@baylibre.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/4599101.ElGaqSPkdT@pliszka Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/keyboard/snvs_pwrkey.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/input/keyboard/snvs_pwrkey.c +++ b/drivers/input/keyboard/snvs_pwrkey.c @@ -20,7 +20,7 @@ #include <linux/mfd/syscon.h> #include <linux/regmap.h>
-#define SNVS_HPVIDR1_REG 0xF8 +#define SNVS_HPVIDR1_REG 0xBF8 #define SNVS_LPSR_REG 0x4C /* LP Status Register */ #define SNVS_LPCR_REG 0x38 /* LP Control Register */ #define SNVS_HPSR_REG 0x14
From: Marc Kleine-Budde mkl@pengutronix.de
commit 81d192c2ce74157e717e1fc4b68791f82f7499d4 upstream.
As Jacob noticed, the optimization introduced in 387da6bc7a82 ("can: c_can: cache frames to operate as a true FIFO") doesn't properly work on C_CAN, but on D_CAN IP cores. The exact reasons are still unknown.
For now disable caching if CAN frames in the TX path for C_CAN cores.
Fixes: 387da6bc7a82 ("can: c_can: cache frames to operate as a true FIFO") Link: https://lore.kernel.org/all/20220928083354.1062321-1-mkl@pengutronix.de Link: https://lore.kernel.org/all/15a8084b-9617-2da1-6704-d7e39d60643b@gmail.com Reported-by: Jacob Kroon jacob.kroon@gmail.com Tested-by: Jacob Kroon jacob.kroon@gmail.com Cc: stable@vger.kernel.org # v5.15 Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/c_can/c_can.h | 17 +++++++++++++++-- drivers/net/can/c_can/c_can_main.c | 11 +++++------ 2 files changed, 20 insertions(+), 8 deletions(-)
--- a/drivers/net/can/c_can/c_can.h +++ b/drivers/net/can/c_can/c_can.h @@ -235,9 +235,22 @@ static inline u8 c_can_get_tx_tail(const return ring->tail & (ring->obj_num - 1); }
-static inline u8 c_can_get_tx_free(const struct c_can_tx_ring *ring) +static inline u8 c_can_get_tx_free(const struct c_can_priv *priv, + const struct c_can_tx_ring *ring) { - return ring->obj_num - (ring->head - ring->tail); + u8 head = c_can_get_tx_head(ring); + u8 tail = c_can_get_tx_tail(ring); + + if (priv->type == BOSCH_D_CAN) + return ring->obj_num - (ring->head - ring->tail); + + /* This is not a FIFO. C/D_CAN sends out the buffers + * prioritized. The lowest buffer number wins. + */ + if (head < tail) + return 0; + + return ring->obj_num - head; }
#endif /* C_CAN_H */ --- a/drivers/net/can/c_can/c_can_main.c +++ b/drivers/net/can/c_can/c_can_main.c @@ -429,7 +429,7 @@ static void c_can_setup_receive_object(s static bool c_can_tx_busy(const struct c_can_priv *priv, const struct c_can_tx_ring *tx_ring) { - if (c_can_get_tx_free(tx_ring) > 0) + if (c_can_get_tx_free(priv, tx_ring) > 0) return false;
netif_stop_queue(priv->dev); @@ -437,7 +437,7 @@ static bool c_can_tx_busy(const struct c /* Memory barrier before checking tx_free (head and tail) */ smp_mb();
- if (c_can_get_tx_free(tx_ring) == 0) { + if (c_can_get_tx_free(priv, tx_ring) == 0) { netdev_dbg(priv->dev, "Stopping tx-queue (tx_head=0x%08x, tx_tail=0x%08x, len=%d).\n", tx_ring->head, tx_ring->tail, @@ -465,7 +465,7 @@ static netdev_tx_t c_can_start_xmit(stru
idx = c_can_get_tx_head(tx_ring); tx_ring->head++; - if (c_can_get_tx_free(tx_ring) == 0) + if (c_can_get_tx_free(priv, tx_ring) == 0) netif_stop_queue(dev);
if (idx < c_can_get_tx_tail(tx_ring)) @@ -748,7 +748,7 @@ static void c_can_do_tx(struct net_devic return;
tx_ring->tail += pkts; - if (c_can_get_tx_free(tx_ring)) { + if (c_can_get_tx_free(priv, tx_ring)) { /* Make sure that anybody stopping the queue after * this sees the new tx_ring->tail. */ @@ -760,8 +760,7 @@ static void c_can_do_tx(struct net_devic stats->tx_packets += pkts;
tail = c_can_get_tx_tail(tx_ring); - - if (tail == 0) { + if (priv->type == BOSCH_D_CAN && tail == 0) { u8 head = c_can_get_tx_head(tx_ring);
/* Start transmission for all cached messages */
From: Aidan MacDonald aidanmacdonald.0x0@gmail.com
commit 6726d552a6912e88cf63fe2bda87b2efa0efc7d0 upstream.
Access to registers is guarded by ingenic_tcu_{enable,disable}_regs() so the stop bit can be cleared before accessing a timer channel, but those functions did not clear the stop bit on SoCs with a global TCU clock gate.
Testing on the X1000 has revealed that the stop bits must be cleared _and_ the global TCU clock must be ungated to access timer registers. This appears to be the norm on Ingenic SoCs, and is specified in the documentation for the X1000 and numerous JZ47xx SoCs.
If the stop bit isn't cleared, register writes don't take effect and the system can be left in a broken state, eg. the watchdog timer may not run.
The bug probably went unnoticed because stop bits are zeroed when the SoC is reset, and the kernel does not set them unless a timer gets disabled at runtime. However, it is possible that a bootloader or a previous kernel (if using kexec) leaves the stop bits set and we should not rely on them being cleared.
Fixing this is easy: have ingenic_tcu_{enable,disable}_regs() always clear the stop bit, regardless of the presence of a global TCU gate.
Reviewed-by: Paul Cercueil paul@crapouillou.net Tested-by: Paul Cercueil paul@crapouillou.net Fixes: 4f89e4b8f121 ("clk: ingenic: Add driver for the TCU clocks") Cc: stable@vger.kernel.org Signed-off-by: Aidan MacDonald aidanmacdonald.0x0@gmail.com Link: https://lore.kernel.org/r/20220617122254.738900-1-aidanmacdonald.0x0@gmail.c... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/ingenic/tcu.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)
--- a/drivers/clk/ingenic/tcu.c +++ b/drivers/clk/ingenic/tcu.c @@ -101,15 +101,11 @@ static bool ingenic_tcu_enable_regs(stru bool enabled = false;
/* - * If the SoC has no global TCU clock, we must ungate the channel's - * clock to be able to access its registers. - * If we have a TCU clock, it will be enabled automatically as it has - * been attached to the regmap. + * According to the programming manual, a timer channel's registers can + * only be accessed when the channel's stop bit is clear. */ - if (!tcu->clk) { - enabled = !!ingenic_tcu_is_enabled(hw); - regmap_write(tcu->map, TCU_REG_TSCR, BIT(info->gate_bit)); - } + enabled = !!ingenic_tcu_is_enabled(hw); + regmap_write(tcu->map, TCU_REG_TSCR, BIT(info->gate_bit));
return enabled; } @@ -120,8 +116,7 @@ static void ingenic_tcu_disable_regs(str const struct ingenic_tcu_clk_info *info = tcu_clk->info; struct ingenic_tcu *tcu = tcu_clk->tcu;
- if (!tcu->clk) - regmap_write(tcu->map, TCU_REG_TSSR, BIT(info->gate_bit)); + regmap_write(tcu->map, TCU_REG_TSSR, BIT(info->gate_bit)); }
static u8 ingenic_tcu_get_parent(struct clk_hw *hw)
From: Alexander Wetzel alexander@wetzel-home.de
commit 527008e5e87600a389feb8a57042c928ecca195d upstream.
Make sure local->queue_stop_reasons and vif.txqs_stopped stay in sync.
When a new vif is created the queues may end up in an inconsistent state and be inoperable: Communication not using iTXQ will work, allowing to e.g. complete the association. But the 4-way handshake will time out. The sta will not send out any skbs queued in iTXQs.
All normal attempts to start the queues will fail when reaching this state. local->queue_stop_reasons will have marked all queues as operational but vif.txqs_stopped will still be set, creating an inconsistent internal state.
In reality this seems to be race between the mac80211 function ieee80211_do_open() setting SDATA_STATE_RUNNING and the wake_txqs_tasklet: Depending on the driver and the timing the queues may end up to be operational or not.
Cc: stable@vger.kernel.org Fixes: f856373e2f31 ("wifi: mac80211: do not wake queues on a vif that is being stopped") Signed-off-by: Alexander Wetzel alexander@wetzel-home.de Acked-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20220915130946.302803-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mac80211/util.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/util.c b/net/mac80211/util.c index 53826c663723..efcefb2dd882 100644 --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -301,14 +301,14 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac) local_bh_disable(); spin_lock(&fq->lock);
+ sdata->vif.txqs_stopped[ac] = false; + if (!test_bit(SDATA_STATE_RUNNING, &sdata->state)) goto out;
if (sdata->vif.type == NL80211_IFTYPE_AP) ps = &sdata->bss->ps;
- sdata->vif.txqs_stopped[ac] = false; - list_for_each_entry_rcu(sta, &local->sta_list, list) { if (sdata != sta->sdata) continue;
From: Jarkko Sakkinen jarkko@kernel.org
commit 133e049a3f8c91b175029fb6a59b6039d5e79cba upstream.
Unsanitized pages trigger WARN_ON() unconditionally, which can panic the whole computer, if /proc/sys/kernel/panic_on_warn is set.
In sgx_init(), if misc_register() fails or misc_register() succeeds but neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be prematurely stopped. This may leave unsanitized pages, which will result a false warning.
Refine __sgx_sanitize_pages() to return:
1. Zero when the sanitization process is complete or ksgxd has been requested to stop. 2. The number of unsanitized pages otherwise.
Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list") Reported-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/... Link: https://lkml.kernel.org/r/20220906000221.34286-2-jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/sgx/main.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -49,9 +49,13 @@ static LIST_HEAD(sgx_dirty_page_list); * Reset post-kexec EPC pages to the uninitialized state. The pages are removed * from the input list, and made available for the page allocator. SECS pages * prepending their children in the input list are left intact. + * + * Return 0 when sanitization was successful or kthread was stopped, and the + * number of unsanitized pages otherwise. */ -static void __sgx_sanitize_pages(struct list_head *dirty_page_list) +static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list) { + unsigned long left_dirty = 0; struct sgx_epc_page *page; LIST_HEAD(dirty); int ret; @@ -59,7 +63,7 @@ static void __sgx_sanitize_pages(struct /* dirty_page_list is thread-local, no need for a lock: */ while (!list_empty(dirty_page_list)) { if (kthread_should_stop()) - return; + return 0;
page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
@@ -92,12 +96,14 @@ static void __sgx_sanitize_pages(struct } else { /* The page is not yet clean - move to the dirty list. */ list_move_tail(&page->list, &dirty); + left_dirty++; }
cond_resched(); }
list_splice(&dirty, dirty_page_list); + return left_dirty; }
static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page) @@ -440,10 +446,7 @@ static int ksgxd(void *p) * required for SECS pages, whose child pages blocked EREMOVE. */ __sgx_sanitize_pages(&sgx_dirty_page_list); - __sgx_sanitize_pages(&sgx_dirty_page_list); - - /* sanity check: */ - WARN_ON(!list_empty(&sgx_dirty_page_list)); + WARN_ON(__sgx_sanitize_pages(&sgx_dirty_page_list));
while (!kthread_should_stop()) { if (try_to_freeze())
From: Christoph Hellwig hch@lst.de
commit 37dcc673d065d9823576cd9f2484a72531e1cba6 upstream.
If no frontswap module (i.e. zswap) was registered, frontswap_ops will be NULL. In such situation, swapon crashes with the following stack trace:
Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000020a4fab000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 96000004 [#1] SMP Modules linked in: zram fsl_dpaa2_eth pcs_lynx phylink ahci_qoriq crct10dif_ce ghash_ce sbsa_gwdt fsl_mc_dpio nvme lm90 nvme_core at803x xhci_plat_hcd rtc_fsl_ftm_alarm xgmac_mdio ahci_platform i2c_imx ip6_tables ip_tables fuse Unloaded tainted modules: cppc_cpufreq():1 CPU: 10 PID: 761 Comm: swapon Not tainted 6.0.0-rc2-00454-g22100432cf14 #1 Hardware name: SolidRun Ltd. SolidRun CEX7 Platform, BIOS EDK II Jun 21 2022 pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : frontswap_init+0x38/0x60 lr : __do_sys_swapon+0x8a8/0x9f4 sp : ffff80000969bcf0 x29: ffff80000969bcf0 x28: ffff37bee0d8fc00 x27: ffff80000a7f5000 x26: fffffcdefb971e80 x25: ffffaba797453b90 x24: 0000000000000064 x23: ffff37c1f209d1a8 x22: ffff37bee880e000 x21: ffffaba797748560 x20: ffff37bee0d8fce4 x19: ffffaba797748488 x18: 0000000000000014 x17: 0000000030ec029a x16: ffffaba795a479b0 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000001 x11: ffff37c63c0aba18 x10: 0000000000000000 x9 : ffffaba7956b8c88 x8 : ffff80000969bcd0 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffaba79730f000 x2 : ffff37bee0d8fc00 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: frontswap_init+0x38/0x60 __do_sys_swapon+0x8a8/0x9f4 __arm64_sys_swapon+0x28/0x3c invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0xd4/0xf4 do_el0_svc+0x38/0x4c el0_svc+0x34/0x10c el0t_64_sync_handler+0x11c/0x150 el0t_64_sync+0x190/0x194 Code: d000e283 910003fd f9006c41 f946d461 (f9400021) ---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20220909130829.3262926-1-hch@lst.de Fixes: 1da0d94a3ec8 ("frontswap: remove support for multiple ops") Reported-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Liu Shixin liushixin2@huawei.com Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/frontswap.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/frontswap.c +++ b/mm/frontswap.c @@ -125,6 +125,9 @@ void frontswap_init(unsigned type, unsig * p->frontswap set to something valid to work properly. */ frontswap_map_set(sis, map); + + if (!frontswap_enabled()) + return; frontswap_ops->init(type); }
From: Linus Walleij linus.walleij@linaro.org
commit 4952aa696a9f221c5e34e5961e02fca41ef67ad6 upstream.
The DT parser is dependent on the PCI device being tagged as device_type = "pci" in order to parse memory ranges properly. Fix this up.
Signed-off-by: Linus Walleij linus.walleij@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220919092608.813511-1-linus.walleij@linaro.org' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/integratorap.dts | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm/boot/dts/integratorap.dts +++ b/arch/arm/boot/dts/integratorap.dts @@ -160,6 +160,7 @@
pci: pciv3@62000000 { compatible = "arm,integrator-ap-pci", "v3,v360epc-pci"; + device_type = "pci"; #interrupt-cells = <1>; #size-cells = <2>; #address-cells = <3>;
From: ChenXiaoSong chenxiaosong2@huawei.com
commit 1b513f613731e2afc05550e8070d79fac80c661e upstream.
Syzkaller reported BUG_ON as follows:
------------[ cut here ]------------ kernel BUG at fs/ntfs/dir.c:86! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 3 PID: 758 Comm: a.out Not tainted 5.19.0-next-20220808 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:ntfs_lookup_inode_by_name+0xd11/0x2d10 Code: ff e9 b9 01 00 00 e8 1e fe d6 fe 48 8b 7d 98 49 8d 5d 07 e8 91 85 29 ff 48 c7 45 98 00 00 00 00 e9 5a fb ff ff e8 ff fd d6 fe <0f> 0b e8 f8 fd d6 fe 0f 0b e8 f1 fd d6 fe 48 8b b5 50 ff ff ff 4c RSP: 0018:ffff888079607978 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000008000 RCX: 0000000000000000 RDX: ffff88807cf10000 RSI: ffffffff82a4a081 RDI: 0000000000000003 RBP: ffff888079607a70 R08: 0000000000000001 R09: ffff88807a6d01d7 R10: ffffed100f4da03a R11: 0000000000000000 R12: ffff88800f0fb110 R13: ffff88800f0ee000 R14: ffff88800f0fb000 R15: 0000000000000001 FS: 00007f33b63c7540(0000) GS:ffff888108580000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f33b635c090 CR3: 000000000f39e005 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> load_system_files+0x1f7f/0x3620 ntfs_fill_super+0xa01/0x1be0 mount_bdev+0x36a/0x440 ntfs_mount+0x3a/0x50 legacy_get_tree+0xfb/0x210 vfs_get_tree+0x8f/0x2f0 do_new_mount+0x30a/0x760 path_mount+0x4de/0x1880 __x64_sys_mount+0x2b3/0x340 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f33b62ff9ea Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48 RSP: 002b:00007ffd0c471aa8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f33b62ff9ea RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffd0c471be0 RBP: 00007ffd0c471c60 R08: 00007ffd0c471ae0 R09: 00007ffd0c471c24 R10: 0000000000000000 R11: 0000000000000202 R12: 000055bac5afc160 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]---
Fix this by adding sanity check on extended system files' directory inode to ensure that it is directory, just like ntfs_extend_init() when mounting ntfs3.
Link: https://lkml.kernel.org/r/20220809064730.2316892-1-chenxiaosong2@huawei.com Signed-off-by: ChenXiaoSong chenxiaosong2@huawei.com Cc: Anton Altaparmakov anton@tuxera.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ntfs/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/ntfs/super.c +++ b/fs/ntfs/super.c @@ -2092,7 +2092,8 @@ get_ctx_vol_failed: // TODO: Initialize security. /* Get the extended system files' directory inode. */ vol->extend_ino = ntfs_iget(sb, FILE_Extend); - if (IS_ERR(vol->extend_ino) || is_bad_inode(vol->extend_ino)) { + if (IS_ERR(vol->extend_ino) || is_bad_inode(vol->extend_ino) || + !S_ISDIR(vol->extend_ino->i_mode)) { if (!IS_ERR(vol->extend_ino)) iput(vol->extend_ino); ntfs_error(sb, "Failed to load $Extend.");
From: Kees Cook keescook@chromium.org
commit 59298997df89e19aad426d4ae0a7e5037074da5a upstream.
The check_object_size() helper under CONFIG_HARDENED_USERCOPY is designed to skip any checks where the length is known at compile time as a reasonable heuristic to avoid "likely known-good" cases. However, it can only do this when the copy_*_user() helpers are, themselves, inline too.
Using find_vmap_area() requires taking a spinlock. The check_object_size() helper can call find_vmap_area() when the destination is in vmap memory. If show_regs() is called in interrupt context, it will attempt a call to copy_from_user_nmi(), which may call check_object_size() and then find_vmap_area(). If something in normal context happens to be in the middle of calling find_vmap_area() (with the spinlock held), the interrupt handler will hang forever.
The copy_from_user_nmi() call is actually being called with a fixed-size length, so check_object_size() should never have been called in the first place. Given the narrow constraints, just replace the __copy_from_user_inatomic() call with an open-coded version that calls only into the sanitizers and not check_object_size(), followed by a call to raw_copy_from_user().
[akpm@linux-foundation.org: no instrument_copy_from_user() in my tree...] Link: https://lkml.kernel.org/r/20220919201648.2250764-1-keescook@chromium.org Link: https://lore.kernel.org/all/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b... Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns") Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Yu Zhao yuzhao@google.com Reported-by: Florian Lehner dev@der-flo.net Suggested-by: Andrew Morton akpm@linux-foundation.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Florian Lehner dev@der-flo.net Cc: Matthew Wilcox willy@infradead.org Cc: Josh Poimboeuf jpoimboe@kernel.org Cc: Dave Hansen dave.hansen@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/lib/usercopy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/lib/usercopy.c b/arch/x86/lib/usercopy.c index ad0139d25401..f1bb18617156 100644 --- a/arch/x86/lib/usercopy.c +++ b/arch/x86/lib/usercopy.c @@ -44,7 +44,7 @@ copy_from_user_nmi(void *to, const void __user *from, unsigned long n) * called from other contexts. */ pagefault_disable(); - ret = __copy_from_user_inatomic(to, from, n); + ret = raw_copy_from_user(to, from, n); pagefault_enable();
return ret;
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 1552fd3ef7dbe07208b8ae84a0a6566adf7dfc9d upstream.
When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. Fix this up by properly calling dput().
Link: https://lkml.kernel.org/r/20220902191149.112434-1-sj@kernel.org Fixes: 75c1c2b53c78b ("mm/damon/dbgfs: support multiple contexts") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/dbgfs.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/mm/damon/dbgfs.c +++ b/mm/damon/dbgfs.c @@ -853,6 +853,7 @@ static int dbgfs_rm_context(char *name) struct dentry *root, *dir, **new_dirs; struct damon_ctx **new_ctxs; int i, j; + int ret = 0;
if (damon_nr_running_ctxs()) return -EBUSY; @@ -867,14 +868,16 @@ static int dbgfs_rm_context(char *name)
new_dirs = kmalloc_array(dbgfs_nr_ctxs - 1, sizeof(*dbgfs_dirs), GFP_KERNEL); - if (!new_dirs) - return -ENOMEM; + if (!new_dirs) { + ret = -ENOMEM; + goto out_dput; + }
new_ctxs = kmalloc_array(dbgfs_nr_ctxs - 1, sizeof(*dbgfs_ctxs), GFP_KERNEL); if (!new_ctxs) { - kfree(new_dirs); - return -ENOMEM; + ret = -ENOMEM; + goto out_new_dirs; }
for (i = 0, j = 0; i < dbgfs_nr_ctxs; i++) { @@ -894,7 +897,13 @@ static int dbgfs_rm_context(char *name) dbgfs_ctxs = new_ctxs; dbgfs_nr_ctxs--;
- return 0; + goto out_dput; + +out_new_dirs: + kfree(new_dirs); +out_dput: + dput(dir); + return ret; }
static ssize_t dbgfs_rm_context_write(struct file *file,
From: Alexander Couzens lynxis@fe80.eu
commit 42bc4fafe359ed6b73602b7a2dba0dd99588f8ce upstream.
Move the PLL init of the switch out of the pad configuration of the port 6 (usally cpu port).
Fix a unidirectional 100 mbit limitation on 1 gbit or 2.5 gbit links for outbound traffic on port 5 or port 6.
Fixes: c288575f7810 ("net: dsa: mt7530: Add the support of MT7531 switch") Cc: stable@vger.kernel.org Signed-off-by: Alexander Couzens lynxis@fe80.eu Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/mt7530.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -506,14 +506,19 @@ static bool mt7531_dual_sgmii_supported( static int mt7531_pad_setup(struct dsa_switch *ds, phy_interface_t interface) { - struct mt7530_priv *priv = ds->priv; + return 0; +} + +static void +mt7531_pll_setup(struct mt7530_priv *priv) +{ u32 top_sig; u32 hwstrap; u32 xtal; u32 val;
if (mt7531_dual_sgmii_supported(priv)) - return 0; + return;
val = mt7530_read(priv, MT7531_CREV); top_sig = mt7530_read(priv, MT7531_TOP_SIG_SR); @@ -592,8 +597,6 @@ mt7531_pad_setup(struct dsa_switch *ds, val |= EN_COREPLL; mt7530_write(priv, MT7531_PLLGP_EN, val); usleep_range(25, 35); - - return 0; }
static void @@ -2310,6 +2313,8 @@ mt7531_setup(struct dsa_switch *ds) SYS_CTRL_PHY_RST | SYS_CTRL_SW_RST | SYS_CTRL_REG_RST);
+ mt7531_pll_setup(priv); + if (mt7531_dual_sgmii_supported(priv)) { priv->p5_intf_sel = P5_INTF_SEL_GMAC5_SGMII;
@@ -2863,8 +2868,6 @@ mt7531_cpu_port_config(struct dsa_switch case 6: interface = PHY_INTERFACE_MODE_2500BASEX;
- mt7531_pad_setup(ds, interface); - priv->p6_interface = interface; break; default:
From: Ulf Hansson ulf.hansson@linaro.org
commit 3c6656337852e9f1a4079d172f3fddfbf00868f9 upstream.
This reverts commit a3b884cef873 ("firmware: arm_scmi: Add clock management to the SCMI power domain").
Using the GENPD_FLAG_PM_CLK tells genpd to gate/ungate the consumer device's clock(s) during runtime suspend/resume through the PM clock API. More precisely, in genpd_runtime_resume() the clock(s) for the consumer device would become ungated prior to the driver-level ->runtime_resume() callbacks gets invoked.
This behaviour isn't a good fit for all platforms/drivers. For example, a driver may need to make some preparations of its device in its ->runtime_resume() callback, like calling clk_set_rate() before the clock(s) should be ungated. In these cases, it's easier to let the clock(s) to be managed solely by the driver, rather than at the PM domain level.
For these reasons, let's drop the use GENPD_FLAG_PM_CLK for the SCMI PM domain, as to enable it to be more easily adopted across ARM platforms.
Fixes: a3b884cef873 ("firmware: arm_scmi: Add clock management to the SCMI power domain") Cc: Nicolas Pitre npitre@baylibre.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Tested-by: Peng Fan peng.fan@nxp.com Acked-by: Sudeep Holla sudeep.holla@arm.com Link: https://lore.kernel.org/r/20220919122033.86126-1-ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/arm_scmi/scmi_pm_domain.c | 26 -------------------------- 1 file changed, 26 deletions(-)
--- a/drivers/firmware/arm_scmi/scmi_pm_domain.c +++ b/drivers/firmware/arm_scmi/scmi_pm_domain.c @@ -8,7 +8,6 @@ #include <linux/err.h> #include <linux/io.h> #include <linux/module.h> -#include <linux/pm_clock.h> #include <linux/pm_domain.h> #include <linux/scmi_protocol.h>
@@ -53,27 +52,6 @@ static int scmi_pd_power_off(struct gene return scmi_pd_power(domain, false); }
-static int scmi_pd_attach_dev(struct generic_pm_domain *pd, struct device *dev) -{ - int ret; - - ret = pm_clk_create(dev); - if (ret) - return ret; - - ret = of_pm_clk_add_clks(dev); - if (ret >= 0) - return 0; - - pm_clk_destroy(dev); - return ret; -} - -static void scmi_pd_detach_dev(struct generic_pm_domain *pd, struct device *dev) -{ - pm_clk_destroy(dev); -} - static int scmi_pm_domain_probe(struct scmi_device *sdev) { int num_domains, i; @@ -124,10 +102,6 @@ static int scmi_pm_domain_probe(struct s scmi_pd->genpd.name = scmi_pd->name; scmi_pd->genpd.power_off = scmi_pd_power_off; scmi_pd->genpd.power_on = scmi_pd_power_on; - scmi_pd->genpd.attach_dev = scmi_pd_attach_dev; - scmi_pd->genpd.detach_dev = scmi_pd_detach_dev; - scmi_pd->genpd.flags = GENPD_FLAG_PM_CLK | - GENPD_FLAG_ACTIVE_WAKEUP;
pm_genpd_init(&scmi_pd->genpd, NULL, state == SCMI_POWER_STATE_GENERIC_OFF);
From: Yang Shi shy828301@gmail.com
commit bedf03416913d88c796288f9dca109a53608c745 upstream.
The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will move to use RCU instead of disabling local interrupts in fast-GUP. Using an IPI is the old-styled way of serializing against fast-GUP although it still works as expected now.
And fast-GUP now fixed the potential race with THP collapse by checking whether PMD is changed or not. So IPI broadcast in radix pmd collapse flush is not necessary anymore. But it is still needed for hash TLB.
Link: https://lkml.kernel.org/r/20220907180144.555485-2-shy828301@gmail.com Suggested-by: Aneesh Kumar K.V aneesh.kumar@linux.ibm.com Signed-off-by: Yang Shi shy828301@gmail.com Acked-by: David Hildenbrand david@redhat.com Acked-by: Peter Xu peterx@redhat.com Cc: Christophe Leroy christophe.leroy@csgroup.eu Cc: Hugh Dickins hughd@google.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: John Hubbard jhubbard@nvidia.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Nicholas Piggin npiggin@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/mm/book3s64/radix_pgtable.c | 9 --------- 1 file changed, 9 deletions(-)
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -937,15 +937,6 @@ pmd_t radix__pmdp_collapse_flush(struct pmd = *pmdp; pmd_clear(pmdp);
- /* - * pmdp collapse_flush need to ensure that there are no parallel gup - * walk after this call. This is needed so that we can have stable - * page ref count when collapsing a page. We don't allow a collapse page - * if we have gup taken on the page. We can ensure that by sending IPI - * because gup walk happens with IRQ disabled. - */ - serialize_against_pte_lookup(vma->vm_mm); - radix__flush_tlb_collapsed_pmd(vma->vm_mm, address);
return pmd;
From: Chris Wilson chris@chris-wilson.co.uk
commit 6ef7d362123ecb5bf6d163bb9c7fd6ba2d8c968c upstream.
When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts.
This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset.
If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context.
Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Tested-by: Andrzej Hajda andrzej.hajda@intel.com Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrz... (cherry picked from commit 107ba1a2c705f4358f2602ec2f2fd821bb651f42) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 +++++++++++++ drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 21 ++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -156,6 +156,21 @@ struct intel_engine_execlists { struct timer_list preempt;
/** + * @preempt_target: active request at the time of the preemption request + * + * We force a preemption to occur if the pending contexts have not + * been promoted to active upon receipt of the CS ack event within + * the timeout. This timeout maybe chosen based on the target, + * using a very short timeout if the context is no longer schedulable. + * That short timeout may not be applicable to other contexts, so + * if a context switch should happen within before the preemption + * timeout, we may shoot early at an innocent context. To prevent this, + * we record which context was active at the time of the preemption + * request and only reset that context upon the timeout. + */ + const struct i915_request *preempt_target; + + /** * @ccid: identifier for contexts submitted to this engine */ u32 ccid; --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -1241,6 +1241,9 @@ static unsigned long active_preempt_time if (!rq) return 0;
+ /* Only allow ourselves to force reset the currently active context */ + engine->execlists.preempt_target = rq; + /* Force a fast reset for terminated contexts (ignoring sysfs!) */ if (unlikely(intel_context_is_banned(rq->context) || bad_request(rq))) return 1; @@ -2427,8 +2430,24 @@ static void execlists_submission_tasklet GEM_BUG_ON(inactive - post > ARRAY_SIZE(post));
if (unlikely(preempt_timeout(engine))) { + const struct i915_request *rq = *engine->execlists.active; + + /* + * If after the preempt-timeout expired, we are still on the + * same active request/context as before we initiated the + * preemption, reset the engine. + * + * However, if we have processed a CS event to switch contexts, + * but not yet processed the CS event for the pending + * preemption, reset the timer allowing the new context to + * gracefully exit. + */ cancel_timer(&engine->execlists.preempt); - engine->execlists.error_interrupt |= ERROR_PREEMPT; + if (rq == engine->execlists.preempt_target) + engine->execlists.error_interrupt |= ERROR_PREEMPT; + else + set_timer_ms(&engine->execlists.preempt, + active_preempt_timeout(engine, rq)); }
if (unlikely(READ_ONCE(engine->execlists.error_interrupt))) {
From: Bokun Zhang Bokun.Zhang@amd.com
commit 3b7329cf5a767c1be38352d43066012e220ad43c upstream.
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor during the suspend time. Furthermore, we cannot request a mode 1 reset under SRIOV as VF. Therefore, we will skip it as it is called in suspend_noirq() function.
- In the resume code path, we need to send REQ_GPU_INIT to the hypervisor and also resume PSP IP block under SRIOV.
Signed-off-by: Bokun Zhang Bokun.Zhang@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++++++++++++++++++++++++++- 2 files changed, 30 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -1056,6 +1056,10 @@ bool amdgpu_acpi_should_gpu_reset(struct { if (adev->flags & AMD_IS_APU) return false; + + if (amdgpu_sriov_vf(adev)) + return false; + return pm_suspend_target_state != PM_SUSPEND_TO_IDLE; }
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3178,7 +3178,8 @@ static int amdgpu_device_ip_resume_phase continue; if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON || adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC || - adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH) { + adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH || + (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP && amdgpu_sriov_vf(adev))) {
r = adev->ip_blocks[i].version->funcs->resume(adev); if (r) { @@ -4124,12 +4125,20 @@ static void amdgpu_device_evict_resource int amdgpu_device_suspend(struct drm_device *dev, bool fbcon) { struct amdgpu_device *adev = drm_to_adev(dev); + int r = 0;
if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0;
adev->in_suspend = true;
+ if (amdgpu_sriov_vf(adev)) { + amdgpu_virt_fini_data_exchange(adev); + r = amdgpu_virt_request_full_gpu(adev, false); + if (r) + return r; + } + if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3)) DRM_WARN("smart shift update failed\n");
@@ -4153,6 +4162,9 @@ int amdgpu_device_suspend(struct drm_dev
amdgpu_device_ip_suspend_phase2(adev);
+ if (amdgpu_sriov_vf(adev)) + amdgpu_virt_release_full_gpu(adev, false); + return 0; }
@@ -4171,6 +4183,12 @@ int amdgpu_device_resume(struct drm_devi struct amdgpu_device *adev = drm_to_adev(dev); int r = 0;
+ if (amdgpu_sriov_vf(adev)) { + r = amdgpu_virt_request_full_gpu(adev, true); + if (r) + return r; + } + if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0;
@@ -4185,6 +4203,13 @@ int amdgpu_device_resume(struct drm_devi }
r = amdgpu_device_ip_resume(adev); + + /* no matter what r is, always need to properly release full GPU */ + if (amdgpu_sriov_vf(adev)) { + amdgpu_virt_init_data_exchange(adev); + amdgpu_virt_release_full_gpu(adev, true); + } + if (r) { dev_err(adev->dev, "amdgpu_device_ip_resume failed (%d).\n", r); return r;
From: Maxime Coquelin maxime.coquelin@redhat.com
commit 46f8a29272e51b6df7393d58fc5cb8967397ef2b upstream.
If the VDUSE application provides a smaller config space than the driver expects, the driver may use uninitialized memory from the stack.
This patch prevents it by initializing the buffer passed by the driver to store the config value.
This fix addresses CVE-2022-2308.
Cc: stable@vger.kernel.org # v5.15+ Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reviewed-by: Xie Yongji xieyongji@bytedance.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Maxime Coquelin maxime.coquelin@redhat.com Message-Id: 20220831154923.97809-1-maxime.coquelin@redhat.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/vdpa/vdpa_user/vduse_dev.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -662,10 +662,15 @@ static void vduse_vdpa_get_config(struct { struct vduse_dev *dev = vdpa_to_vduse(vdpa);
- if (offset > dev->config_size || - len > dev->config_size - offset) + /* Initialize the buffer in case of partial copy. */ + memset(buf, 0, len); + + if (offset > dev->config_size) return;
+ if (len > dev->config_size - offset) + len = dev->config_size - offset; + memcpy(buf, dev->config + offset, len); }
From: Niklas Cassel niklas.cassel@wdc.com
commit ea08aec7e77bfd6599489ec430f9f859ab84575a upstream.
Commit 1527f69204fe ("ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile") added an explicit entry for AMD Green Sardine AHCI controller using the board_ahci_mobile configuration (this configuration has later been renamed to board_ahci_low_power).
The board_ahci_low_power configuration enables support for low power modes.
This explicit entry takes precedence over the generic AHCI controller entry, which does not enable support for low power modes.
Therefore, when commit 1527f69204fe ("ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile") was backported to stable kernels, it make some Pioneer optical drives, which was working perfectly fine before the commit was backported, stop working.
The real problem is that the Pioneer optical drives do not handle low power modes correctly. If these optical drives would have been tested on another AHCI controller using the board_ahci_low_power configuration, this issue would have been detected earlier.
Unfortunately, the board_ahci_low_power configuration is only used in less than 15% of the total AHCI controller entries, so many devices have never been tested with an AHCI controller with low power modes.
Fixes: 1527f69204fe ("ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile") Cc: stable@vger.kernel.org Reported-by: Jaap Berkhout j.j.berkhout@staalenberk.nl Signed-off-by: Niklas Cassel niklas.cassel@wdc.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ata/libata-core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -3988,6 +3988,10 @@ static const struct ata_blacklist_entry { "PIONEER DVD-RW DVR-212D", NULL, ATA_HORKAGE_NOSETXFER }, { "PIONEER DVD-RW DVR-216D", NULL, ATA_HORKAGE_NOSETXFER },
+ /* These specific Pioneer models have LPM issues */ + { "PIONEER BD-RW BDR-207M", NULL, ATA_HORKAGE_NOLPM }, + { "PIONEER BD-RW BDR-205", NULL, ATA_HORKAGE_NOLPM }, + /* Crucial BX100 SSD 500GB has broken LPM support */ { "CT500BX100SSD1", NULL, ATA_HORKAGE_NOLPM },
From: Florian Westphal fw@strlen.de
commit 30c19366636f72515679aa10dad61a4d988d4c9a upstream.
Martin Zaharinov reports BUG with 5.19.10 kernel: kernel BUG at mm/vmalloc.c:2437! invalid opcode: 0000 [#1] SMP CPU: 28 PID: 0 Comm: swapper/28 Tainted: G W O 5.19.9 #1 [..] RIP: 0010:__get_vm_area_node+0x120/0x130 __vmalloc_node_range+0x96/0x1e0 kvmalloc_node+0x92/0xb0 bucket_table_alloc.isra.0+0x47/0x140 rhashtable_try_insert+0x3a4/0x440 rhashtable_insert_slow+0x1b/0x30 [..]
bucket_table_alloc uses kvzalloc(GPF_ATOMIC). If kmalloc fails, this now falls through to vmalloc and hits code paths that assume GFP_KERNEL.
Link: https://lkml.kernel.org/r/20220926151650.15293-1-fw@strlen.de Fixes: a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc") Signed-off-by: Florian Westphal fw@strlen.de Suggested-by: Michal Hocko mhocko@suse.com Link: https://lore.kernel.org/linux-mm/Yy3MS2uhSgjF47dy@pc636/T/#t Acked-by: Michal Hocko mhocko@suse.com Reported-by: Martin Zaharinov micron10@gmail.com Cc: Uladzislau Rezki (Sony) urezki@gmail.com Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/util.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/util.c +++ b/mm/util.c @@ -619,6 +619,10 @@ void *kvmalloc_node(size_t size, gfp_t f if (ret || size <= PAGE_SIZE) return ret;
+ /* non-sleeping allocations are not supported by vmalloc */ + if (!gfpflags_allow_blocking(flags)) + return NULL; + /* Don't even allow crazy sizes */ if (unlikely(size > INT_MAX)) { WARN_ON_ONCE(!(flags & __GFP_NOWARN));
From: Menglong Dong imagedong@tencent.com
commit 26d3e21ce1aab6cb19069c510fac8e7474445b18 upstream.
Factor out __mptcp_close() from mptcp_close(). The caller of __mptcp_close() should hold the socket lock, and cancel mptcp work when __mptcp_close() returns true.
This function will be used in the next commit.
Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets") Cc: stable@vger.kernel.org Reviewed-by: Jiang Biao benbjiang@tencent.com Reviewed-by: Mengen Sun mengensun@tencent.com Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Menglong Dong imagedong@tencent.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 14 ++++++++++++-- net/mptcp/protocol.h | 1 + 2 files changed, 13 insertions(+), 2 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2832,13 +2832,12 @@ static void __mptcp_destroy_sock(struct sock_put(sk); }
-static void mptcp_close(struct sock *sk, long timeout) +bool __mptcp_close(struct sock *sk, long timeout) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk = mptcp_sk(sk); bool do_cancel_work = false;
- lock_sock(sk); sk->sk_shutdown = SHUTDOWN_MASK;
if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) { @@ -2880,6 +2879,17 @@ cleanup: } else { mptcp_reset_timeout(msk, 0); } + + return do_cancel_work; +} + +static void mptcp_close(struct sock *sk, long timeout) +{ + bool do_cancel_work; + + lock_sock(sk); + + do_cancel_work = __mptcp_close(sk, timeout); release_sock(sk); if (do_cancel_work) mptcp_cancel_work(sk); --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -613,6 +613,7 @@ void mptcp_subflow_reset(struct sock *ss void mptcp_subflow_queue_clean(struct sock *ssk); void mptcp_sock_graft(struct sock *sk, struct socket *parent); struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); +bool __mptcp_close(struct sock *sk, long timeout);
bool mptcp_addresses_equal(const struct mptcp_addr_info *a, const struct mptcp_addr_info *b, bool use_port);
From: Menglong Dong imagedong@tencent.com
commit 30e51b923e436b631e8d5b77fa5e318c6b066dc7 upstream.
The mptcp socket and its subflow sockets in accept queue can't be released after the process exit.
While the release of a mptcp socket in listening state, the corresponding tcp socket will be released too. Meanwhile, the tcp socket in the unaccept queue will be released too. However, only init subflow is in the unaccept queue, and the joined subflow is not in the unaccept queue, which makes the joined subflow won't be released, and therefore the corresponding unaccepted mptcp socket will not be released to.
This can be reproduced easily with following steps:
1. create 2 namespace and veth: $ ip netns add mptcp-client $ ip netns add mptcp-server $ sysctl -w net.ipv4.conf.all.rp_filter=0 $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1 $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1 $ ip link add red-client netns mptcp-client type veth peer red-server \ netns mptcp-server $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client $ ip -n mptcp-server link set red-server up $ ip -n mptcp-client link set red-client up
2. configure the endpoint and limit for client and server: $ ip -n mptcp-server mptcp endpoint flush $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2 $ ip -n mptcp-client mptcp endpoint flush $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2 $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \ 1 subflow
3. listen and accept on a port, such as 9999. The nc command we used here is modified, which makes it use mptcp protocol by default. $ ip netns exec mptcp-server nc -l -k -p 9999
4. open another *two* terminal and use each of them to connect to the server with the following command: $ ip netns exec mptcp-client nc 10.0.0.1 9999 Input something after connect to trigger the connection of the second subflow. So that there are two established mptcp connections, with the second one still unaccepted.
5. exit all the nc command, and check the tcp socket in server namespace. And you will find that there is one tcp socket in CLOSE_WAIT state and can't release forever.
Fix this by closing all of the unaccepted mptcp socket in mptcp_subflow_queue_clean() with __mptcp_close().
Now, we can ensure that all unaccepted mptcp sockets will be cleaned by __mptcp_close() before they are released, so mptcp_sock_destruct(), which is used to clean the unaccepted mptcp socket, is not needed anymore.
The selftests for mptcp is ran for this commit, and no new failures.
Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets") Cc: stable@vger.kernel.org Reviewed-by: Jiang Biao benbjiang@tencent.com Reviewed-by: Mengen Sun mengensun@tencent.com Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Menglong Dong imagedong@tencent.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 2 +- net/mptcp/protocol.h | 1 + net/mptcp/subflow.c | 33 +++++++-------------------------- 3 files changed, 9 insertions(+), 27 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2692,7 +2692,7 @@ static void __mptcp_clear_xmit(struct so dfrag_clear(sk, dfrag); }
-static void mptcp_cancel_work(struct sock *sk) +void mptcp_cancel_work(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk);
--- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -614,6 +614,7 @@ void mptcp_subflow_queue_clean(struct so void mptcp_sock_graft(struct sock *sk, struct socket *parent); struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); bool __mptcp_close(struct sock *sk, long timeout); +void mptcp_cancel_work(struct sock *sk);
bool mptcp_addresses_equal(const struct mptcp_addr_info *a, const struct mptcp_addr_info *b, bool use_port); --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -602,30 +602,6 @@ static bool subflow_hmac_valid(const str return !crypto_memneq(hmac, mp_opt->hmac, MPTCPOPT_HMAC_LEN); }
-static void mptcp_sock_destruct(struct sock *sk) -{ - /* if new mptcp socket isn't accepted, it is free'd - * from the tcp listener sockets request queue, linked - * from req->sk. The tcp socket is released. - * This calls the ULP release function which will - * also remove the mptcp socket, via - * sock_put(ctx->conn). - * - * Problem is that the mptcp socket will be in - * ESTABLISHED state and will not have the SOCK_DEAD flag. - * Both result in warnings from inet_sock_destruct. - */ - if ((1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { - sk->sk_state = TCP_CLOSE; - WARN_ON_ONCE(sk->sk_socket); - sock_orphan(sk); - } - - /* We don't need to clear msk->subflow, as it's still NULL at this point */ - mptcp_destroy_common(mptcp_sk(sk), 0); - inet_sock_destruct(sk); -} - static void mptcp_force_close(struct sock *sk) { /* the msk is not yet exposed to user-space */ @@ -768,7 +744,6 @@ create_child: /* new mpc subflow takes ownership of the newly * created mptcp socket */ - new_msk->sk_destruct = mptcp_sock_destruct; mptcp_sk(new_msk)->setsockopt_seq = ctx->setsockopt_seq; mptcp_pm_new_connection(mptcp_sk(new_msk), child, 1); mptcp_token_accept(subflow_req, mptcp_sk(new_msk)); @@ -1763,13 +1738,19 @@ void mptcp_subflow_queue_clean(struct so
for (msk = head; msk; msk = next) { struct sock *sk = (struct sock *)msk; - bool slow; + bool slow, do_cancel_work;
+ sock_hold(sk); slow = lock_sock_fast_nested(sk); next = msk->dl_next; msk->first = NULL; msk->dl_next = NULL; + + do_cancel_work = __mptcp_close(sk, 0); unlock_sock_fast(sk, slow); + if (do_cancel_work) + mptcp_cancel_work(sk); + sock_put(sk); }
/* we are still under the listener msk socket lock */
From: Sergei Antonov saproj@gmail.com
commit 35ca91d1338ae158f6dcc0de5d1e86197924ffda upstream.
According to the datasheet [1] at page 377, 4-bit bus width is turned on by bit 2 of the Bus Width Register. Thus the current bitmask is wrong: define BUS_WIDTH_4 BIT(1)
BIT(1) does not work but BIT(2) works. This has been verified on real MOXA hardware with FTSDC010 controller revision 1_6_0.
The corrected value of BUS_WIDTH_4 mask collides with: define BUS_WIDTH_8 BIT(2). Additionally, 8-bit bus width mode isn't supported according to the datasheet, so let's remove the corresponding code.
[1] https://bitbucket.org/Kasreyn/mkrom-uc7112lx/src/master/documents/FIC8120_DS...
Fixes: 1b66e94e6b99 ("mmc: moxart: Add MOXA ART SD/MMC driver") Signed-off-by: Sergei Antonov saproj@gmail.com Cc: Jonas Jensen jonas.jensen@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220907205753.1577434-1-saproj@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/moxart-mmc.c | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-)
--- a/drivers/mmc/host/moxart-mmc.c +++ b/drivers/mmc/host/moxart-mmc.c @@ -111,8 +111,8 @@ #define CLK_DIV_MASK 0x7f
/* REG_BUS_WIDTH */ -#define BUS_WIDTH_8 BIT(2) -#define BUS_WIDTH_4 BIT(1) +#define BUS_WIDTH_4_SUPPORT BIT(3) +#define BUS_WIDTH_4 BIT(2) #define BUS_WIDTH_1 BIT(0)
#define MMC_VDD_360 23 @@ -524,9 +524,6 @@ static void moxart_set_ios(struct mmc_ho case MMC_BUS_WIDTH_4: writel(BUS_WIDTH_4, host->base + REG_BUS_WIDTH); break; - case MMC_BUS_WIDTH_8: - writel(BUS_WIDTH_8, host->base + REG_BUS_WIDTH); - break; default: writel(BUS_WIDTH_1, host->base + REG_BUS_WIDTH); break; @@ -651,16 +648,8 @@ static int moxart_probe(struct platform_ dmaengine_slave_config(host->dma_chan_rx, &cfg); }
- switch ((readl(host->base + REG_BUS_WIDTH) >> 3) & 3) { - case 1: + if (readl(host->base + REG_BUS_WIDTH) & BUS_WIDTH_4_SUPPORT) mmc->caps |= MMC_CAP_4_BIT_DATA; - break; - case 2: - mmc->caps |= MMC_CAP_4_BIT_DATA | MMC_CAP_8_BIT_DATA; - break; - default: - break; - }
writel(0, host->base + REG_INTERRUPT_MASK);
From: Wenchao Chen wenchao.chen@unisoc.com
commit e7afa79a3b35a27a046a2139f8b20bd6b98155c2 upstream.
The block device uses multiple queues to access emmc. There will be up to 3 requests in the hsq of the host. The current code will check whether there is a request doing recovery before entering the queue, but it will not check whether there is a request when the lock is issued. The request is in recovery mode. If there is a request in recovery, then a read and write request is initiated at this time, and the conflict between the request and the recovery request will cause the data to be trampled.
Signed-off-by: Wenchao Chen wenchao.chen@unisoc.com Fixes: 511ce378e16f ("mmc: Add MMC host software queue support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220916090506.10662-1-wenchao.chen666@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mmc_hsq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/mmc_hsq.c +++ b/drivers/mmc/host/mmc_hsq.c @@ -34,7 +34,7 @@ static void mmc_hsq_pump_requests(struct spin_lock_irqsave(&hsq->lock, flags);
/* Make sure we are not already running a request now */ - if (hsq->mrq) { + if (hsq->mrq || hsq->recovery_halt) { spin_unlock_irqrestore(&hsq->lock, flags); return; }
From: Yang Shi shy828301@gmail.com
commit 70cbc3cc78a997d8247b50389d37c4e1736019da upstream.
Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer sufficient to handle concurrent GUP-fast in all cases, it only handles traditional IPI-based GUP-fast correctly. On architectures that send an IPI broadcast on TLB flush, it works as expected. But on the architectures that do not use IPI to broadcast TLB flush, it may have the below race:
CPU A CPU B THP collapse fast GUP gup_pmd_range() <-- see valid pmd gup_pte_range() <-- work on pte pmdp_collapse_flush() <-- clear pmd and flush __collapse_huge_page_isolate() check page pinned <-- before GUP bump refcount pin the page check PTE <-- no change __collapse_huge_page_copy() copy data to huge page ptep_clear() install huge pmd for the huge page return the stale page discard the stale page
The race can be fixed by checking whether PMD is changed or not after taking the page pin in fast GUP, just like what it does for PTE. If the PMD is changed it means there may be parallel THP collapse, so GUP should back off.
Also update the stale comment about serializing against fast GUP in khugepaged.
Link: https://lkml.kernel.org/r/20220907180144.555485-1-shy828301@gmail.com Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()") Acked-by: David Hildenbrand david@redhat.com Acked-by: Peter Xu peterx@redhat.com Signed-off-by: Yang Shi shy828301@gmail.com Reviewed-by: John Hubbard jhubbard@nvidia.com Cc: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com Cc: Hugh Dickins hughd@google.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Nicholas Piggin npiggin@gmail.com Cc: Christophe Leroy christophe.leroy@csgroup.eu Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 34 ++++++++++++++++++++++++++++------ mm/khugepaged.c | 10 ++++++---- 2 files changed, 34 insertions(+), 10 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -2278,8 +2278,28 @@ static void __maybe_unused undo_dev_page }
#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL -static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +/* + * Fast-gup relies on pte change detection to avoid concurrent pgtable + * operations. + * + * To pin the page, fast-gup needs to do below in order: + * (1) pin the page (by prefetching pte), then (2) check pte not changed. + * + * For the rest of pgtable operations where pgtable updates can be racy + * with fast-gup, we need to do (1) clear pte, then (2) check whether page + * is pinned. + * + * Above will work for all pte-level operations, including THP split. + * + * For THP collapse, it's a bit more complicated because fast-gup may be + * walking a pgtable page that is being freed (pte is still valid but pmd + * can be cleared already). To avoid race in such condition, we need to + * also check pmd here to make sure pmd doesn't change (corresponds to + * pmdp_collapse_flush() in the THP collapse code path). + */ +static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned int flags, + struct page **pages, int *nr) { struct dev_pagemap *pgmap = NULL; int nr_start = *nr, ret = 0; @@ -2325,7 +2345,8 @@ static int gup_pte_range(pmd_t pmd, unsi goto pte_unmap; }
- if (unlikely(pte_val(pte) != pte_val(*ptep))) { + if (unlikely(pmd_val(pmd) != pmd_val(*pmdp)) || + unlikely(pte_val(pte) != pte_val(*ptep))) { gup_put_folio(folio, 1, flags); goto pte_unmap; } @@ -2372,8 +2393,9 @@ pte_unmap: * get_user_pages_fast_only implementation that can pin pages. Thus it's still * useful to have gup_huge_pmd even if we can't operate on ptes. */ -static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, - unsigned int flags, struct page **pages, int *nr) +static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr, + unsigned long end, unsigned int flags, + struct page **pages, int *nr) { return 0; } @@ -2697,7 +2719,7 @@ static int gup_pmd_range(pud_t *pudp, pu if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr, PMD_SHIFT, next, flags, pages, nr)) return 0; - } else if (!gup_pte_range(pmd, addr, next, flags, pages, nr)) + } else if (!gup_pte_range(pmd, pmdp, addr, next, flags, pages, nr)) return 0; } while (pmdp++, addr = next, addr != end);
--- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1121,10 +1121,12 @@ static void collapse_huge_page(struct mm
pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */ /* - * After this gup_fast can't run anymore. This also removes - * any huge TLB entry from the CPU so we won't allow - * huge and small TLB entries for the same virtual address - * to avoid the risk of CPU bugs in that area. + * This removes any huge TLB entry from the CPU so we won't allow + * huge and small TLB entries for the same virtual address to + * avoid the risk of CPU bugs in that area. + * + * Parallel fast GUP is fine since fast GUP will back off when + * it detects PMD is changed. */ _pmd = pmdp_collapse_flush(vma, address, pmd); spin_unlock(pmd_ptl);
From: Mel Gorman mgorman@techsingularity.net
commit 3d36424b3b5850bd92f3e89b953a430d7cfc88ef upstream.
Patrick Daly reported the following problem;
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - before offline operation [0] - ZONE_MOVABLE [1] - ZONE_NORMAL [2] - NULL
For a GFP_KERNEL allocation, alloc_pages_slowpath() will save the offset of ZONE_NORMAL in ac->preferred_zoneref. If a concurrent memory_offline operation removes the last page from ZONE_MOVABLE, build_all_zonelists() & build_zonerefs_node() will update node_zonelists as shown below. Only populated zones are added.
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - after offline operation [0] - ZONE_NORMAL [1] - NULL [2] - NULL
The race is simple -- page allocation could be in progress when a memory hot-remove operation triggers a zonelist rebuild that removes zones. The allocation request will still have a valid ac->preferred_zoneref that is now pointing to NULL and triggers an OOM kill.
This problem probably always existed but may be slightly easier to trigger due to 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") which distinguishes between zones that are completely unpopulated versus zones that have valid pages not managed by the buddy allocator (e.g. reserved, memblock, ballooning etc). Memory hotplug had multiple stages with timing considerations around managed/present page updates, the zonelist rebuild and the zone span updates. As David Hildenbrand puts it
memory offlining adjusts managed+present pages of the zone essentially in one go. If after the adjustments, the zone is no longer populated (present==0), we rebuild the zone lists.
Once that's done, we try shrinking the zone (start+spanned pages) -- which results in zone_start_pfn == 0 if there are no more pages. That happens *after* rebuilding the zonelists via remove_pfn_range_from_zone().
The only requirement to fix the race is that a page allocation request identifies when a zonelist rebuild has happened since the allocation request started and no page has yet been allocated. Use a seqlock_t to track zonelist updates with a lockless read-side of the zonelist and protecting the rebuild and update of the counter with a spinlock.
[akpm@linux-foundation.org: make zonelist_update_seq static] Link: https://lkml.kernel.org/r/20220824110900.vh674ltxmzb3proq@techsingularity.ne... Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") Signed-off-by: Mel Gorman mgorman@techsingularity.net Reported-by: Patrick Daly quic_pdaly@quicinc.com Acked-by: Michal Hocko mhocko@suse.com Reviewed-by: David Hildenbrand david@redhat.com Cc: stable@vger.kernel.org [4.9+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 53 +++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 10 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4623,6 +4623,30 @@ void fs_reclaim_release(gfp_t gfp_mask) EXPORT_SYMBOL_GPL(fs_reclaim_release); #endif
+/* + * Zonelists may change due to hotplug during allocation. Detect when zonelists + * have been rebuilt so allocation retries. Reader side does not lock and + * retries the allocation if zonelist changes. Writer side is protected by the + * embedded spin_lock. + */ +static DEFINE_SEQLOCK(zonelist_update_seq); + +static unsigned int zonelist_iter_begin(void) +{ + if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE)) + return read_seqbegin(&zonelist_update_seq); + + return 0; +} + +static unsigned int check_retry_zonelist(unsigned int seq) +{ + if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE)) + return read_seqretry(&zonelist_update_seq, seq); + + return seq; +} + /* Perform direct synchronous page reclaim */ static unsigned long __perform_reclaim(gfp_t gfp_mask, unsigned int order, @@ -4916,6 +4940,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u int compaction_retries; int no_progress_loops; unsigned int cpuset_mems_cookie; + unsigned int zonelist_iter_cookie; int reserve_flags;
/* @@ -4926,11 +4951,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u (__GFP_ATOMIC|__GFP_DIRECT_RECLAIM))) gfp_mask &= ~__GFP_ATOMIC;
-retry_cpuset: +restart: compaction_retries = 0; no_progress_loops = 0; compact_priority = DEF_COMPACT_PRIORITY; cpuset_mems_cookie = read_mems_allowed_begin(); + zonelist_iter_cookie = zonelist_iter_begin();
/* * The fast path uses conservative alloc_flags to succeed only until @@ -5102,9 +5128,13 @@ retry: goto retry;
- /* Deal with possible cpuset update races before we start OOM killing */ - if (check_retry_cpuset(cpuset_mems_cookie, ac)) - goto retry_cpuset; + /* + * Deal with possible cpuset update races or zonelist updates to avoid + * a unnecessary OOM kill. + */ + if (check_retry_cpuset(cpuset_mems_cookie, ac) || + check_retry_zonelist(zonelist_iter_cookie)) + goto restart;
/* Reclaim has failed us, start killing things */ page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress); @@ -5124,9 +5154,13 @@ retry: }
nopage: - /* Deal with possible cpuset update races before we fail */ - if (check_retry_cpuset(cpuset_mems_cookie, ac)) - goto retry_cpuset; + /* + * Deal with possible cpuset update races or zonelist updates to avoid + * a unnecessary OOM kill. + */ + if (check_retry_cpuset(cpuset_mems_cookie, ac) || + check_retry_zonelist(zonelist_iter_cookie)) + goto restart;
/* * Make sure that __GFP_NOFAIL request doesn't leak out and make sure @@ -6421,9 +6455,8 @@ static void __build_all_zonelists(void * int nid; int __maybe_unused cpu; pg_data_t *self = data; - static DEFINE_SPINLOCK(lock);
- spin_lock(&lock); + write_seqlock(&zonelist_update_seq);
#ifdef CONFIG_NUMA memset(node_load, 0, sizeof(node_load)); @@ -6460,7 +6493,7 @@ static void __build_all_zonelists(void * #endif }
- spin_unlock(&lock); + write_sequnlock(&zonelist_update_seq); }
static noinline void __init
From: Maurizio Lombardi mlombard@redhat.com
commit dac22531bbd4af2426c4e29e05594415ccfa365d upstream.
A number of drivers call page_frag_alloc() with a fragment's size > PAGE_SIZE.
In low memory conditions, __page_frag_cache_refill() may fail the order 3 cache allocation and fall back to order 0; In this case, the cache will be smaller than the fragment, causing memory corruptions.
Prevent this from happening by checking if the newly allocated cache is large enough for the fragment; if not, the allocation will fail and page_frag_alloc() will return NULL.
Link: https://lkml.kernel.org/r/20220715125013.247085-1-mlombard@redhat.com Fixes: b63ae8ca096d ("mm/net: Rename and move page fragment handling from net/ to mm/") Signed-off-by: Maurizio Lombardi mlombard@redhat.com Reviewed-by: Alexander Duyck alexanderduyck@fb.com Cc: Chen Lin chen45464546@163.com Cc: Jakub Kicinski kuba@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5651,6 +5651,18 @@ refill: /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; offset = size - fragsz; + if (unlikely(offset < 0)) { + /* + * The caller is trying to allocate a fragment + * with fragsz > PAGE_SIZE but the cache isn't big + * enough to satisfy the request, this may + * happen in low memory conditions. + * We don't release the cache page because + * it could make memory pressure worse + * so we simply return NULL here. + */ + return NULL; + } }
nc->pagecnt_bias--;
From: Zi Yan ziy@nvidia.com
commit 80e2b584f3abfc31c3fe5573007f0d1d10810fde upstream.
set_migratetype_isolate() does not allow isolating MIGRATE_CMA pageblocks unless it is used for CMA allocation. isolate_single_pageblock() did not have the same behavior when it is used together with set_migratetype_isolate() in start_isolate_page_range(). This allows alloc_contig_range() with migratetype other than MIGRATE_CMA, like MIGRATE_MOVABLE (used by alloc_contig_pages()), to isolate first and last pageblock but fail the rest. The failure leads to changing migratetype of the first and last pageblock to MIGRATE_MOVABLE from MIGRATE_CMA, corrupting the CMA region. This can happen during gigantic page allocations.
Like Doug said here: https://lore.kernel.org/linux-mm/a3363a52-883b-dcd1-b77f-f2bb378d6f2d@gmail...., for gigantic page allocations, the user would notice no difference, since the allocation on CMA region will fail as well as it did before. But it might hurt the performance of device drivers that use CMA, since CMA region size decreases.
Fix it by passing migratetype into isolate_single_pageblock(), so that set_migratetype_isolate() used by isolate_single_pageblock() will prevent the isolation happening.
Link: https://lkml.kernel.org/r/20220914023913.1855924-1-zi.yan@sent.com Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: Zi Yan ziy@nvidia.com Reported-by: Doug Berger opendmb@gmail.com Cc: David Hildenbrand david@redhat.com Cc: Doug Berger opendmb@gmail.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_isolation.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-)
--- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -288,6 +288,7 @@ __first_valid_page(unsigned long pfn, un * @isolate_before: isolate the pageblock before the boundary_pfn * @skip_isolation: the flag to skip the pageblock isolation in second * isolate_single_pageblock() + * @migratetype: migrate type to set in error recovery. * * Free and in-use pages can be as big as MAX_ORDER-1 and contain more than one * pageblock. When not all pageblocks within a page are isolated at the same @@ -302,9 +303,9 @@ __first_valid_page(unsigned long pfn, un * the in-use page then splitting the free page. */ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags, - gfp_t gfp_flags, bool isolate_before, bool skip_isolation) + gfp_t gfp_flags, bool isolate_before, bool skip_isolation, + int migratetype) { - unsigned char saved_mt; unsigned long start_pfn; unsigned long isolate_pageblock; unsigned long pfn; @@ -328,13 +329,13 @@ static int isolate_single_pageblock(unsi start_pfn = max(ALIGN_DOWN(isolate_pageblock, MAX_ORDER_NR_PAGES), zone->zone_start_pfn);
- saved_mt = get_pageblock_migratetype(pfn_to_page(isolate_pageblock)); + if (skip_isolation) { + int mt = get_pageblock_migratetype(pfn_to_page(isolate_pageblock));
- if (skip_isolation) - VM_BUG_ON(!is_migrate_isolate(saved_mt)); - else { - ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock), saved_mt, flags, - isolate_pageblock, isolate_pageblock + pageblock_nr_pages); + VM_BUG_ON(!is_migrate_isolate(mt)); + } else { + ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype, + flags, isolate_pageblock, isolate_pageblock + pageblock_nr_pages);
if (ret) return ret; @@ -475,7 +476,7 @@ static int isolate_single_pageblock(unsi failed: /* restore the original migratetype */ if (!skip_isolation) - unset_migratetype_isolate(pfn_to_page(isolate_pageblock), saved_mt); + unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype); return -EBUSY; }
@@ -537,7 +538,8 @@ int start_isolate_page_range(unsigned lo bool skip_isolation = false;
/* isolate [isolate_start, isolate_start + pageblock_nr_pages) pageblock */ - ret = isolate_single_pageblock(isolate_start, flags, gfp_flags, false, skip_isolation); + ret = isolate_single_pageblock(isolate_start, flags, gfp_flags, false, + skip_isolation, migratetype); if (ret) return ret;
@@ -545,7 +547,8 @@ int start_isolate_page_range(unsigned lo skip_isolation = true;
/* isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock */ - ret = isolate_single_pageblock(isolate_end, flags, gfp_flags, true, skip_isolation); + ret = isolate_single_pageblock(isolate_end, flags, gfp_flags, true, + skip_isolation, migratetype); if (ret) { unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype); return ret;
From: Binyi Han dantengknight@gmail.com
commit 4eb5bbde3ccb710d3b85bfb13466612e56393369 upstream.
Smatch checker complains that 'secretmem_mnt' dereferencing possible ERR_PTR(). Let the function return if 'secretmem_mnt' is ERR_PTR, to avoid deferencing it.
Link: https://lkml.kernel.org/r/20220904074647.GA64291@cloud-MacBookPro Fixes: 1507f51255c9f ("mm: introduce memfd_secret system call to create "secret" memory areas") Signed-off-by: Binyi Han dantengknight@gmail.com Reviewed-by: Andrew Morton akpm@linux-foudation.org Cc: Mike Rapoport rppt@kernel.org Cc: Ammar Faizi ammarfaizi2@gnuweeb.org Cc: Hagen Paul Pfeifer hagen@jauu.net Cc: James Bottomley James.Bottomley@HansenPartnership.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/secretmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -283,7 +283,7 @@ static int secretmem_init(void)
secretmem_mnt = kern_mount(&secretmem_fs); if (IS_ERR(secretmem_mnt)) - ret = PTR_ERR(secretmem_mnt); + return PTR_ERR(secretmem_mnt);
/* prevent secretmem mappings from ever getting PROT_EXEC */ secretmem_mnt->mnt_flags |= MNT_NOEXEC;
From: Alistair Popple apopple@nvidia.com
commit 60bae73708963de4a17231077285bd9ff2f41c44 upstream.
When clearing a PTE the TLB should be flushed whilst still holding the PTL to avoid a potential race with madvise/munmap/etc. For example consider the following sequence:
CPU0 CPU1 ---- ----
migrate_vma_collect_pmd() pte_unmap_unlock() madvise(MADV_DONTNEED) -> zap_pte_range() pte_offset_map_lock() [ PTE not present, TLB not flushed ] pte_unmap_unlock() [ page is still accessible via stale TLB ] flush_tlb_range()
In this case the page may still be accessed via the stale TLB entry after madvise returns. Fix this by flushing the TLB while holding the PTL.
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Link: https://lkml.kernel.org/r/9f801e9d8d830408f2ca27821f606e09aa856899.166207852... Signed-off-by: Alistair Popple apopple@nvidia.com Reported-by: Nadav Amit nadav.amit@gmail.com Reviewed-by: "Huang, Ying" ying.huang@intel.com Acked-by: David Hildenbrand david@redhat.com Acked-by: Peter Xu peterx@redhat.com Cc: Alex Sierra alex.sierra@amd.com Cc: Ben Skeggs bskeggs@redhat.com Cc: Felix Kuehling Felix.Kuehling@amd.com Cc: huang ying huang.ying.caritas@gmail.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: John Hubbard jhubbard@nvidia.com Cc: Karol Herbst kherbst@redhat.com Cc: Logan Gunthorpe logang@deltatee.com Cc: Lyude Paul lyude@redhat.com Cc: Matthew Wilcox willy@infradead.org Cc: Paul Mackerras paulus@ozlabs.org Cc: Ralph Campbell rcampbell@nvidia.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/migrate_device.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -248,13 +248,14 @@ next: migrate->dst[migrate->npages] = 0; migrate->src[migrate->npages++] = mpfn; } - arch_leave_lazy_mmu_mode(); - pte_unmap_unlock(ptep - 1, ptl);
/* Only flush the TLB if we actually modified any entries */ if (unmapped) flush_tlb_range(walk->vma, start, end);
+ arch_leave_lazy_mmu_mode(); + pte_unmap_unlock(ptep - 1, ptl); + return 0; }
From: Alistair Popple apopple@nvidia.com
commit a3589e1d5fe39c3d9fdd291b111524b93d08bc32 upstream.
Currently we only call flush_cache_page() for the anon_exclusive case, however in both cases we clear the pte so should flush the cache.
Link: https://lkml.kernel.org/r/5676f30436ab71d1a587ac73f835ed8bd2113ff5.166207852... Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Signed-off-by: Alistair Popple apopple@nvidia.com Reviewed-by: David Hildenbrand david@redhat.com Acked-by: Peter Xu peterx@redhat.com Cc: Alex Sierra alex.sierra@amd.com Cc: Ben Skeggs bskeggs@redhat.com Cc: Felix Kuehling Felix.Kuehling@amd.com Cc: huang ying huang.ying.caritas@gmail.com Cc: "Huang, Ying" ying.huang@intel.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: John Hubbard jhubbard@nvidia.com Cc: Karol Herbst kherbst@redhat.com Cc: Logan Gunthorpe logang@deltatee.com Cc: Lyude Paul lyude@redhat.com Cc: Matthew Wilcox willy@infradead.org Cc: Nadav Amit nadav.amit@gmail.com Cc: Paul Mackerras paulus@ozlabs.org Cc: Ralph Campbell rcampbell@nvidia.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/migrate_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -187,9 +187,9 @@ again: bool anon_exclusive; pte_t swp_pte;
+ flush_cache_page(vma, addr, pte_pfn(*ptep)); anon_exclusive = PageAnon(page) && PageAnonExclusive(page); if (anon_exclusive) { - flush_cache_page(vma, addr, pte_pfn(*ptep)); ptep_clear_flush(vma, addr, ptep);
if (page_try_share_anon_rmap(page)) {
From: Alistair Popple apopple@nvidia.com
commit fd35ca3d12cc9922d7d9a35f934e72132dbc4853 upstream.
migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that installs migration entries directly if it can lock the migrating page. When removing a dirty pte the dirty bit is supposed to be carried over to the underlying page to prevent it being lost.
Currently migrate_vma_*() can only be used for private anonymous mappings. That means loss of the dirty bit usually doesn't result in data loss because these pages are typically not file-backed. However pages may be backed by swap storage which can result in data loss if an attempt is made to migrate a dirty page that doesn't yet have the PageDirty flag set.
In this case migration will fail due to unexpected references but the dirty pte bit will be lost. If the page is subsequently reclaimed data won't be written back to swap storage as it is considered uptodate, resulting in data loss if the page is subsequently accessed.
Prevent this by copying the dirty bit to the page when removing the pte to match what try_to_migrate_one() does.
Link: https://lkml.kernel.org/r/dd48e4882ce859c295c1a77612f66d198b0403f9.166207852... Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Signed-off-by: Alistair Popple apopple@nvidia.com Acked-by: Peter Xu peterx@redhat.com Reviewed-by: "Huang, Ying" ying.huang@intel.com Reported-by: "Huang, Ying" ying.huang@intel.com Acked-by: David Hildenbrand david@redhat.com Cc: Alex Sierra alex.sierra@amd.com Cc: Ben Skeggs bskeggs@redhat.com Cc: Felix Kuehling Felix.Kuehling@amd.com Cc: huang ying huang.ying.caritas@gmail.com Cc: Jason Gunthorpe jgg@nvidia.com Cc: John Hubbard jhubbard@nvidia.com Cc: Karol Herbst kherbst@redhat.com Cc: Logan Gunthorpe logang@deltatee.com Cc: Lyude Paul lyude@redhat.com Cc: Matthew Wilcox willy@infradead.org Cc: Nadav Amit nadav.amit@gmail.com Cc: Paul Mackerras paulus@ozlabs.org Cc: Ralph Campbell rcampbell@nvidia.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/migrate_device.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -7,6 +7,7 @@ #include <linux/export.h> #include <linux/memremap.h> #include <linux/migrate.h> +#include <linux/mm.h> #include <linux/mm_inline.h> #include <linux/mmu_notifier.h> #include <linux/oom.h> @@ -190,7 +191,7 @@ again: flush_cache_page(vma, addr, pte_pfn(*ptep)); anon_exclusive = PageAnon(page) && PageAnonExclusive(page); if (anon_exclusive) { - ptep_clear_flush(vma, addr, ptep); + pte = ptep_clear_flush(vma, addr, ptep);
if (page_try_share_anon_rmap(page)) { set_pte_at(mm, addr, ptep, pte); @@ -200,11 +201,15 @@ again: goto next; } } else { - ptep_get_and_clear(mm, addr, ptep); + pte = ptep_get_and_clear(mm, addr, ptep); }
migrate->cpages++;
+ /* Set the dirty flag on the folio now the pte is gone. */ + if (pte_dirty(pte)) + folio_mark_dirty(page_folio(page)); + /* Setup special migration page table entry */ if (mpfn & MIGRATE_PFN_WRITE) entry = make_writable_migration_entry(
From: Minchan Kim minchan@kernel.org
commit 58d426a7ba92870d489686dfdb9d06b66815a2ab upstream.
MADV_PAGEOUT tries to isolate non-LRU pages and gets a warning from isolate_lru_page below.
Fix it by checking PageLRU in advance.
------------[ cut here ]------------ trying to isolate tail page WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140 Modules linked in: CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:isolate_lru_page+0x130/0x140
Link: https://lore.kernel.org/linux-mm/485f8c33.2471b.182d5726afb.Coremail.hantian... Link: https://lkml.kernel.org/r/20220908151204.762596-1-minchan@kernel.org Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT") Signed-off-by: Minchan Kim minchan@kernel.org Reported-by: 韩天ç` hantianshuo@iie.ac.cn Suggested-by: Yang Shi shy828301@gmail.com Acked-by: Yang Shi shy828301@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/madvise.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/madvise.c +++ b/mm/madvise.c @@ -451,8 +451,11 @@ regular_page: continue; }
- /* Do not interfere with other mappings of this page */ - if (page_mapcount(page) != 1) + /* + * Do not interfere with other mappings of this page and + * non-LRU page. + */ + if (!PageLRU(page) || page_mapcount(page) != 1) continue;
VM_BUG_ON_PAGE(PageTransCompound(page), page);
From: Sergei Antonov saproj@gmail.com
commit 70427f6e9ecfc8c5f977b21dd9f846b3bda02500 upstream.
Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug.
int main() { unsigned i; char paragon[SIZE]; void* ptr;
memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE); }
In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16.
Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it.
Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit.
On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug.
Link: https://lkml.kernel.org/r/20220908204809.2012451-1-saproj@gmail.com Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov saproj@gmail.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/mm/memory.c +++ b/mm/memory.c @@ -4378,14 +4378,20 @@ vm_fault_t finish_fault(struct vm_fault
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); - ret = 0; + /* Re-check under ptl */ - if (likely(!vmf_pte_changed(vmf))) + if (likely(!vmf_pte_changed(vmf))) { do_set_pte(vmf, page, vmf->address); - else + + /* no need to invalidate: a not-present page won't be cached */ + update_mmu_cache(vma, vmf->address, vmf->pte); + + ret = 0; + } else { + update_mmu_tlb(vma, vmf->address, vmf->pte); ret = VM_FAULT_NOPAGE; + }
- update_mmu_tlb(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); return ret; }
From: Doug Berger opendmb@gmail.com
commit 317314527d173e1f139ceaf8cb87cb1746abf240 upstream.
With gigantic pages it may not be true that struct page structures are contiguous across the entire gigantic page. The nth_page macro is used here in place of direct pointer arithmetic to correct for this.
Mike said:
: This error could cause addressing exceptions. However, this is only : possible in configurations where CONFIG_SPARSEMEM && : !CONFIG_SPARSEMEM_VMEMMAP. Such a configuration option is rare and : unknown to be the default anywhere.
Link: https://lkml.kernel.org/r/20220914190917.3517663-1-opendmb@gmail.com Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support") Signed-off-by: Doug Berger opendmb@gmail.com Reviewed-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: Oscar Salvador osalvador@suse.de Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Cc: Muchun Song songmuchun@bytedance.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3418,6 +3418,7 @@ static int demote_free_huge_page(struct { int i, nid = page_to_nid(page); struct hstate *target_hstate; + struct page *subpage; int rc = 0;
target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order); @@ -3451,15 +3452,16 @@ static int demote_free_huge_page(struct mutex_lock(&target_hstate->resize_lock); for (i = 0; i < pages_per_huge_page(h); i += pages_per_huge_page(target_hstate)) { + subpage = nth_page(page, i); if (hstate_is_gigantic(target_hstate)) - prep_compound_gigantic_page_for_demote(page + i, + prep_compound_gigantic_page_for_demote(subpage, target_hstate->order); else - prep_compound_page(page + i, target_hstate->order); - set_page_private(page + i, 0); - set_page_refcounted(page + i); - prep_new_huge_page(target_hstate, page + i, nid); - put_page(page + i); + prep_compound_page(subpage, target_hstate->order); + set_page_private(subpage, 0); + set_page_refcounted(subpage); + prep_new_huge_page(target_hstate, subpage, nid); + put_page(subpage); } mutex_unlock(&target_hstate->resize_lock);
From: Shuai Xue xueshuai@linux.alibaba.com
commit 77677cdbc2aa4b5d5d839562793d3d126201d18d upstream.
The GHES code calls memory_failure_queue() from IRQ context to queue work into workqueue and schedule it on the current CPU. Then the work is processed in memory_failure_work_func() by kworker and calls memory_failure().
When a page is already poisoned, commit a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") make memory_failure() call kill_accessing_process() that:
- holds mmap locking of current->mm - does pagetable walk to find the error virtual address - and sends SIGBUS to the current process with error info.
However, the mm of kworker is not valid, resulting in a null-pointer dereference. So check mm when killing the accessing process.
[akpm@linux-foundation.org: remove unrelated whitespace alteration] Link: https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@linux.alibaba.com Fixes: a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Miaohe Lin linmiaohe@huawei.com Acked-by: Naoya Horiguchi naoya.horiguchi@nec.com Cc: Huang Ying ying.huang@intel.com Cc: Baolin Wang baolin.wang@linux.alibaba.com Cc: Bixuan Cui cuibixuan@linux.alibaba.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory-failure.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -697,6 +697,9 @@ static int kill_accessing_process(struct }; priv.tk.tsk = p;
+ if (!p->mm) + return -EFAULT; + mmap_read_lock(p->mm); ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops, (void *)&priv);
From: Hangyu Hua hbh25y@gmail.com
commit 37238699073e7e93f05517e529661151173cd458 upstream.
vb2_core_qbuf and vb2_core_querybuf don't check the range of b->index controlled by the user.
Fix this by adding range checking code before using them.
Fixes: 57868acc369a ("media: videobuf2: Add new uAPI for DVB streaming I/O") Signed-off-by: Hangyu Hua hbh25y@gmail.com Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/dvb-core/dvb_vb2.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/media/dvb-core/dvb_vb2.c +++ b/drivers/media/dvb-core/dvb_vb2.c @@ -354,6 +354,12 @@ int dvb_vb2_reqbufs(struct dvb_vb2_ctx *
int dvb_vb2_querybuf(struct dvb_vb2_ctx *ctx, struct dmx_buffer *b) { + struct vb2_queue *q = &ctx->vb_q; + + if (b->index >= q->num_buffers) { + dprintk(1, "[%s] buffer index out of range\n", ctx->name); + return -EINVAL; + } vb2_core_querybuf(&ctx->vb_q, b->index, b); dprintk(3, "[%s] index=%d\n", ctx->name, b->index); return 0; @@ -378,8 +384,13 @@ int dvb_vb2_expbuf(struct dvb_vb2_ctx *c
int dvb_vb2_qbuf(struct dvb_vb2_ctx *ctx, struct dmx_buffer *b) { + struct vb2_queue *q = &ctx->vb_q; int ret;
+ if (b->index >= q->num_buffers) { + dprintk(1, "[%s] buffer index out of range\n", ctx->name); + return -EINVAL; + } ret = vb2_core_qbuf(&ctx->vb_q, b->index, b, NULL); if (ret) { dprintk(1, "[%s] index=%d errno=%d\n", ctx->name,
From: Nicolas Dufresne nicolas.dufresne@collabora.com
commit 3a99c4474112f49a5459933d8758614002ca0ddc upstream.
Quite often, the HW get stuck in error condition if a stream error was detected. As documented, the HW should stop immediately and self reset. There is likely a problem or a miss-understanding of the self reset mechanism, as unless we make a long pause, the next command will then report an error even if there is no error in it.
Disabling error detection fixes the issue, and let the decoder continue after an error. This patch is safe for backport into older kernels.
Fixes: cd33c830448b ("media: rkvdec: Add the rkvdec driver") Signed-off-by: Nicolas Dufresne nicolas.dufresne@collabora.com Reviewed-by: Brian Norris briannorris@chromium.org Tested-by: Brian Norris briannorris@chromium.org Reviewed-by: Ezequiel Garcia ezequiel@vanguardiasur.com.ar Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/media/rkvdec/rkvdec-h264.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/staging/media/rkvdec/rkvdec-h264.c +++ b/drivers/staging/media/rkvdec/rkvdec-h264.c @@ -1175,8 +1175,8 @@ static int rkvdec_h264_run(struct rkvdec
schedule_delayed_work(&rkvdec->watchdog_work, msecs_to_jiffies(2000));
- writel(0xffffffff, rkvdec->regs + RKVDEC_REG_STRMD_ERR_EN); - writel(0xffffffff, rkvdec->regs + RKVDEC_REG_H264_ERR_E); + writel(0, rkvdec->regs + RKVDEC_REG_STRMD_ERR_EN); + writel(0, rkvdec->regs + RKVDEC_REG_H264_ERR_E); writel(1, rkvdec->regs + RKVDEC_REG_PREF_LUMA_CACHE_COMMAND); writel(1, rkvdec->regs + RKVDEC_REG_PREF_CHR_CACHE_COMMAND);
From: Nícolas F. R. A. Prado nfraprado@collabora.com
commit a2d2e593d39bc2f29a1cd5e3779af457fd26490c upstream.
Commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") removed support for calling platform_get_resource(..., IORESOURCE_IRQ, ...) on DT-based drivers, but the probe() function of mtk-vcodec's encoder was still making use of it. This caused the encoder driver to fail probe.
Since the platform_get_resource() call was only being used to check for the presence of the interrupt (its returned resource wasn't even used) and platform_get_irq() was already being used to get the IRQ, simply drop the use of platform_get_resource(IORESOURCE_IRQ) and handle the failure of platform_get_irq(), to get the driver probing again.
[hverkuil: drop unused struct resource *res]
Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") Signed-off-by: Nícolas F. R. A. Prado nfraprado@collabora.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc_drv.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc_drv.c +++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc_drv.c @@ -228,7 +228,6 @@ static int mtk_vcodec_probe(struct platf { struct mtk_vcodec_dev *dev; struct video_device *vfd_enc; - struct resource *res; phandle rproc_phandle; enum mtk_vcodec_fw_type fw_type; int ret; @@ -272,14 +271,12 @@ static int mtk_vcodec_probe(struct platf goto err_res; }
- res = platform_get_resource(pdev, IORESOURCE_IRQ, 0); - if (res == NULL) { - dev_err(&pdev->dev, "failed to get irq resource"); - ret = -ENOENT; + dev->enc_irq = platform_get_irq(pdev, 0); + if (dev->enc_irq < 0) { + ret = dev->enc_irq; goto err_res; }
- dev->enc_irq = platform_get_irq(pdev, 0); irq_set_status_flags(dev->enc_irq, IRQ_NOAUTOEN); ret = devm_request_irq(&pdev->dev, dev->enc_irq, mtk_vcodec_enc_irq_handler,
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 4e768c8e34e639cff66a0f175bc4aebf472e4305 upstream.
The v4l2_compat_get_array_args() function can leave uninitialized memory in the buffer it is passed. So zero it before copying array elements from userspace into the buffer.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reported-by: syzbot+ff18193ff05f3f87f226@syzkaller.appspotmail.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/media/v4l2-core/v4l2-compat-ioctl32.c +++ b/drivers/media/v4l2-core/v4l2-compat-ioctl32.c @@ -1040,6 +1040,8 @@ int v4l2_compat_get_array_args(struct fi { int err = 0;
+ memset(mbuf, 0, array_size); + switch (cmd) { case VIDIOC_G_FMT32: case VIDIOC_S_FMT32:
From: YuTong Chang mtwget@gmail.com
[ Upstream commit 2eb502f496f7764027b7958d4e74356fed918059 ]
According to technical manual(table 11-24), the DMA of MMCHS0 should be direct mapped.
Fixes: b5e509066074 ("ARM: DTS: am33xx: Use the new DT bindings for the eDMA3") Signed-off-by: YuTong Chang mtwget@gmail.com Message-Id: 20220620124146.5330-1-mtwget@gmail.com Acked-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am33xx-l4.dtsi | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm/boot/dts/am33xx-l4.dtsi b/arch/arm/boot/dts/am33xx-l4.dtsi index 7da42a5b959c..7e50fe633d8a 100644 --- a/arch/arm/boot/dts/am33xx-l4.dtsi +++ b/arch/arm/boot/dts/am33xx-l4.dtsi @@ -1502,8 +1502,7 @@ mmc1: mmc@0 { compatible = "ti,am335-sdhci"; ti,needs-special-reset; - dmas = <&edma_xbar 24 0 0 - &edma_xbar 25 0 0>; + dmas = <&edma 24 0>, <&edma 25 0>; dma-names = "tx", "rx"; interrupts = <64>; reg = <0x0 0x1000>;
From: Richard Zhu hongxing.zhu@nxp.com
[ Upstream commit 051d9eb403887bb11852b7a4f744728a6a4b1b58 ]
On i.MX7/iMX8MM/iMX8MQ, the initialized default value of PERST bit(BIT3) of SRC_PCIEPHY_RCR is 1b'1. But i.MX8MP has one inversed default value 1b'0 of PERST bit.
And the PERST bit should be kept 1b'1 after power and clocks are stable. So fix the i.MX8MP PCIe PHY PERST support here.
Fixes: e08672c03981 ("reset: imx7: Add support for i.MX8MP SoC") Signed-off-by: Richard Zhu hongxing.zhu@nxp.com Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Tested-by: Marek Vasut marex@denx.de Tested-by: Richard Leitner richard.leitner@skidata.com Tested-by: Alexander Stein alexander.stein@ew.tq-group.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Link: https://lore.kernel.org/r/1661845564-11373-5-git-send-email-hongxing.zhu@nxp... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/reset/reset-imx7.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/reset/reset-imx7.c b/drivers/reset/reset-imx7.c index 185a333df66c..d2408725eb2c 100644 --- a/drivers/reset/reset-imx7.c +++ b/drivers/reset/reset-imx7.c @@ -329,6 +329,7 @@ static int imx8mp_reset_set(struct reset_controller_dev *rcdev, break;
case IMX8MP_RESET_PCIE_CTRL_APPS_EN: + case IMX8MP_RESET_PCIEPHY_PERST: value = assert ? 0 : bit; break; }
From: Romain Naour romain.naour@skf.com
[ Upstream commit 6a6d9ecff14a2a46c1deeffa3eb3825349639bdd ]
Commit bcbb63b80284 ("ARM: dts: dra7: Separate AM57 dtsi files") disabled usb4_tm for am5748 devices since USB4 IP is not present in this SoC.
The commit log explained the difference between AM5 and DRA7 families:
AM5 and DRA7 SoC families have different set of modules in them so the SoC sepecific dtsi files need to be separated.
e.g. Some of the major differences between AM576 and DRA76
DRA76x AM576x
USB3 x USB4 x ATL x VCP x MLB x ISS x PRU-ICSS1 x PRU-ICSS2 x
Then commit 176f26bcd41a ("ARM: dts: Add support for dra762 abz package") removed usb4_tm part from am5748.dtsi and introcuded new ti-sysc errors in dmesg:
ti-sysc 48940000.target-module: clock get error for fck: -2 ti-sysc: probe of 48940000.target-module failed with error -2
Fixes: 176f26bcd41a ("ARM: dts: Add support for dra762 abz package")
Signed-off-by: Romain Naour romain.naour@skf.com Signed-off-by: Romain Naour romain.naour@smile.fr Message-Id: 20220823072742.351368-1-romain.naour@smile.fr Reviewed-by: Roger Quadros rogerq@kernel.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/am5748.dtsi | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm/boot/dts/am5748.dtsi b/arch/arm/boot/dts/am5748.dtsi index c260aa1a85bd..a1f029e9d1f3 100644 --- a/arch/arm/boot/dts/am5748.dtsi +++ b/arch/arm/boot/dts/am5748.dtsi @@ -25,6 +25,10 @@ status = "disabled"; };
+&usb4_tm { + status = "disabled"; +}; + &atl_tm { status = "disabled"; };
From: Samuel Holland samuel@sholland.org
[ Upstream commit fd362baad2e659ef0fb5652f023a606b248f1781 ]
sunxi_sram_claim() checks the sram_desc->claimed flag before updating the register, with the intent that only one device can claim a region. However, this was ineffective because the flag was never set.
Fixes: 4af34b572a85 ("drivers: soc: sunxi: Introduce SoC driver to map SRAMs") Reviewed-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Samuel Holland samuel@sholland.org Reviewed-by: Heiko Stuebner heiko@sntech.de Tested-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Link: https://lore.kernel.org/r/20220815041248.53268-4-samuel@sholland.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/sunxi/sunxi_sram.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/soc/sunxi/sunxi_sram.c b/drivers/soc/sunxi/sunxi_sram.c index a8f3876963a0..f3d3f9259df9 100644 --- a/drivers/soc/sunxi/sunxi_sram.c +++ b/drivers/soc/sunxi/sunxi_sram.c @@ -254,6 +254,7 @@ int sunxi_sram_claim(struct device *dev) writel(val | ((device << sram_data->offset) & mask), base + sram_data->reg);
+ sram_desc->claimed = true; spin_unlock(&sram_lock);
return 0;
From: Samuel Holland samuel@sholland.org
[ Upstream commit 90e10a1fcd9b24b4ba8c0d35136127473dcd829e ]
This driver exports a regmap tied to the platform device (as opposed to a syscon, which exports a regmap tied to the OF node). Because of this, the driver can never be unbound, as that would destroy the regmap. Use builtin_platform_driver_probe() to enforce this limitation.
Fixes: 5828729bebbb ("soc: sunxi: export a regmap for EMAC clock reg on A64") Reviewed-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Samuel Holland samuel@sholland.org Reviewed-by: Heiko Stuebner heiko@sntech.de Tested-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Link: https://lore.kernel.org/r/20220815041248.53268-5-samuel@sholland.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/sunxi/sunxi_sram.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/soc/sunxi/sunxi_sram.c b/drivers/soc/sunxi/sunxi_sram.c index f3d3f9259df9..a858a37fcdd4 100644 --- a/drivers/soc/sunxi/sunxi_sram.c +++ b/drivers/soc/sunxi/sunxi_sram.c @@ -330,7 +330,7 @@ static struct regmap_config sunxi_sram_emac_clock_regmap = { .writeable_reg = sunxi_sram_regmap_accessible_reg, };
-static int sunxi_sram_probe(struct platform_device *pdev) +static int __init sunxi_sram_probe(struct platform_device *pdev) { struct dentry *d; struct regmap *emac_clock; @@ -410,9 +410,8 @@ static struct platform_driver sunxi_sram_driver = { .name = "sunxi-sram", .of_match_table = sunxi_sram_dt_match, }, - .probe = sunxi_sram_probe, }; -module_platform_driver(sunxi_sram_driver); +builtin_platform_driver_probe(sunxi_sram_driver, sunxi_sram_probe);
MODULE_AUTHOR("Maxime Ripard maxime.ripard@free-electrons.com"); MODULE_DESCRIPTION("Allwinner sunXi SRAM Controller Driver");
From: Samuel Holland samuel@sholland.org
[ Upstream commit 49fad91a7b8941979c3e9a35f9894ac45bc5d3d6 ]
Errors from debugfs are intended to be non-fatal, and should not prevent the driver from probing.
Since debugfs file creation is treated as infallible, move it below the parts of the probe function that can fail. This prevents an error elsewhere in the probe function from causing the file to leak. Do the same for the call to of_platform_populate().
Finally, checkpatch suggests an octal literal for the file permissions.
Fixes: 4af34b572a85 ("drivers: soc: sunxi: Introduce SoC driver to map SRAMs") Fixes: 5828729bebbb ("soc: sunxi: export a regmap for EMAC clock reg on A64") Reviewed-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Samuel Holland samuel@sholland.org Tested-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Link: https://lore.kernel.org/r/20220815041248.53268-6-samuel@sholland.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/sunxi/sunxi_sram.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/soc/sunxi/sunxi_sram.c b/drivers/soc/sunxi/sunxi_sram.c index a858a37fcdd4..52d07bed7664 100644 --- a/drivers/soc/sunxi/sunxi_sram.c +++ b/drivers/soc/sunxi/sunxi_sram.c @@ -332,9 +332,9 @@ static struct regmap_config sunxi_sram_emac_clock_regmap = {
static int __init sunxi_sram_probe(struct platform_device *pdev) { - struct dentry *d; struct regmap *emac_clock; const struct sunxi_sramc_variant *variant; + struct device *dev = &pdev->dev;
sram_dev = &pdev->dev;
@@ -346,13 +346,6 @@ static int __init sunxi_sram_probe(struct platform_device *pdev) if (IS_ERR(base)) return PTR_ERR(base);
- of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev); - - d = debugfs_create_file("sram", S_IRUGO, NULL, NULL, - &sunxi_sram_fops); - if (!d) - return -ENOMEM; - if (variant->num_emac_clocks > 0) { emac_clock = devm_regmap_init_mmio(&pdev->dev, base, &sunxi_sram_emac_clock_regmap); @@ -361,6 +354,10 @@ static int __init sunxi_sram_probe(struct platform_device *pdev) return PTR_ERR(emac_clock); }
+ of_platform_populate(dev->of_node, NULL, NULL, dev); + + debugfs_create_file("sram", 0444, NULL, NULL, &sunxi_sram_fops); + return 0; }
From: Samuel Holland samuel@sholland.org
[ Upstream commit e3c95edb1bd8b9c2cb0caa6ae382fc8080f6a0ed ]
The labels were backward with respect to the register values. The SRAM is mapped to the CPU when the register value is 1.
Fixes: 5e4fb6429761 ("drivers: soc: sunxi: add support for A64 and its SRAM C") Acked-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Samuel Holland samuel@sholland.org Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Link: https://lore.kernel.org/r/20220815041248.53268-7-samuel@sholland.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/sunxi/sunxi_sram.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/soc/sunxi/sunxi_sram.c b/drivers/soc/sunxi/sunxi_sram.c index 52d07bed7664..09754cd1d57d 100644 --- a/drivers/soc/sunxi/sunxi_sram.c +++ b/drivers/soc/sunxi/sunxi_sram.c @@ -78,8 +78,8 @@ static struct sunxi_sram_desc sun4i_a10_sram_d = {
static struct sunxi_sram_desc sun50i_a64_sram_c = { .data = SUNXI_SRAM_DATA("C", 0x4, 24, 1, - SUNXI_SRAM_MAP(0, 1, "cpu"), - SUNXI_SRAM_MAP(1, 0, "de2")), + SUNXI_SRAM_MAP(1, 0, "cpu"), + SUNXI_SRAM_MAP(0, 1, "de2")), };
static const struct of_device_id sunxi_sram_dt_ids[] = {
From: Shengjiu Wang shengjiu.wang@nxp.com
[ Upstream commit d56ba9a04d7548d4149c46ec86a0e3cc41a70f4a ]
imx_card_parse_of will search all the node with loop, if there is defer probe happen in the middle of loop, the previous released codec node will be released twice, then cause refcount issue.
Here assign NULL to pointer of released nodes to fix the issue.
Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver") Signed-off-by: Shengjiu Wang shengjiu.wang@nxp.com Link: https://lore.kernel.org/r/1663059601-29259-1-git-send-email-shengjiu.wang@nx... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/imx-card.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/sound/soc/fsl/imx-card.c b/sound/soc/fsl/imx-card.c index 4a8609b0d700..5153af3281d2 100644 --- a/sound/soc/fsl/imx-card.c +++ b/sound/soc/fsl/imx-card.c @@ -698,6 +698,10 @@ static int imx_card_parse_of(struct imx_card_data *data) of_node_put(cpu); of_node_put(codec); of_node_put(platform); + + cpu = NULL; + codec = NULL; + platform = NULL; }
return 0;
From: Conor Dooley conor.dooley@microchip.com
[ Upstream commit 5da39ac5d648cdbfdfa8bea0e0cde279ded5c7c2 ]
There is an array bounds violation present during clock registration, triggered by current code by only specific toolchains. This seems to fail gracefully in v6.0-rc1, using a toolchain build from the riscv- gnu-toolchain repo and with clang-15, and life carries on. While converting the driver to use standard clock structs/ops, kernel panics were seen during boot when built with clang-15:
[ 0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1 [ 0.591520] Oops [#1] [ 0.594045] Modules linked in: [ 0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1 [ 0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) [ 0.613012] epc : __clk_register+0x4a6/0x85c [ 0.617759] ra : __clk_register+0x49e/0x85c [ 0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720 [ 0.630466] gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800 [ 0.638443] t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0 [ 0.646420] s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000 [ 0.654396] a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.662373] a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006 [ 0.670350] s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028 [ 0.678327] s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100 [ 0.686304] s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008 [ 0.694281] s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007 [ 0.702258] t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0 [ 0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d [ 0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa [ 0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244
In v6.0-rc1 and later, this issue is visible without the follow on patches doing the conversion using toolchains provided by our Yocto meta layer too.
It fails on "clk_periph_timer" - which uses a different parent, that it tries to find using the macro: #define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)
If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw which is well beyond the end of the array. Amazingly, builds with GCC 11.1 see no problem here, booting correctly and hooking the parent up etc. Builds with clang-15 do not, with the above panic.
Change the macro to use specific offsets depending on the parent rather than the dt-binding's clock IDs.
Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control") CC: Nathan Chancellor nathan@kernel.org Signed-off-by: Conor Dooley conor.dooley@microchip.com Reviewed-by: Claudiu Beznea claudiu.beznea@microchip.com Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Link: https://lore.kernel.org/r/20220909123123.2699583-2-conor.dooley@microchip.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/microchip/clk-mpfs.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c index 070c3b896559..f0f9c9a1cc48 100644 --- a/drivers/clk/microchip/clk-mpfs.c +++ b/drivers/clk/microchip/clk-mpfs.c @@ -239,6 +239,11 @@ static const struct clk_ops mpfs_clk_cfg_ops = { .hw.init = CLK_HW_INIT(_name, _parent, &mpfs_clk_cfg_ops, 0), \ }
+#define CLK_CPU_OFFSET 0u +#define CLK_AXI_OFFSET 1u +#define CLK_AHB_OFFSET 2u +#define CLK_RTCREF_OFFSET 3u + static struct mpfs_cfg_hw_clock mpfs_cfg_clks[] = { CLK_CFG(CLK_CPU, "clk_cpu", "clk_msspll", 0, 2, mpfs_div_cpu_axi_table, 0, REG_CLOCK_CONFIG_CR), @@ -362,7 +367,7 @@ static const struct clk_ops mpfs_periph_clk_ops = { _flags), \ }
-#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].hw) +#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT##_OFFSET].hw)
/* * Critical clocks:
From: Conor Dooley conor.dooley@microchip.com
[ Upstream commit 05d27090b6dc88bce71a608d1271536e582b73d1 ]
The onboard RTC's AHB bus clock must be kept running as the RTC will stop & lose track of time if the AHB interface clock is disabled.
Fixes: 635e5e73370e ("clk: microchip: Add driver for Microchip PolarFire SoC") Signed-off-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Claudiu Beznea claudiu.beznea@microchip.com Link: https://lore.kernel.org/r/20220909123123.2699583-3-conor.dooley@microchip.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/microchip/clk-mpfs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c index f0f9c9a1cc48..b6b89413e090 100644 --- a/drivers/clk/microchip/clk-mpfs.c +++ b/drivers/clk/microchip/clk-mpfs.c @@ -375,6 +375,8 @@ static const struct clk_ops mpfs_periph_clk_ops = { * trap handler * - CLK_MMUART0: reserved by the hss * - CLK_DDRC: provides clock to the ddr subsystem + * - CLK_RTC: the onboard RTC's AHB bus clock must be kept running as the rtc will stop + * if the AHB interface clock is disabled * - CLK_FICx: these provide the processor side clocks to the "FIC" (Fabric InterConnect) * clock domain crossers which provide the interface to the FPGA fabric. Disabling them * causes the FPGA fabric to go into reset. @@ -399,7 +401,7 @@ static struct mpfs_periph_hw_clock mpfs_periph_clks[] = { CLK_PERIPH(CLK_CAN0, "clk_periph_can0", PARENT_CLK(AHB), 14, 0), CLK_PERIPH(CLK_CAN1, "clk_periph_can1", PARENT_CLK(AHB), 15, 0), CLK_PERIPH(CLK_USB, "clk_periph_usb", PARENT_CLK(AHB), 16, 0), - CLK_PERIPH(CLK_RTC, "clk_periph_rtc", PARENT_CLK(AHB), 18, 0), + CLK_PERIPH(CLK_RTC, "clk_periph_rtc", PARENT_CLK(AHB), 18, CLK_IS_CRITICAL), CLK_PERIPH(CLK_QSPI, "clk_periph_qspi", PARENT_CLK(AHB), 19, 0), CLK_PERIPH(CLK_GPIO0, "clk_periph_gpio0", PARENT_CLK(AHB), 20, 0), CLK_PERIPH(CLK_GPIO1, "clk_periph_gpio1", PARENT_CLK(AHB), 21, 0),
From: Johan Hovold johan+linaro@kernel.org
[ Upstream commit 40e9541959100e017533e18e44d07eed44f91dc5 ]
The size of the UFS PHY serdes register region is 0x1c4 and the corresponding 'reg' property should specifically not include the adjacent regions that are defined in the child node (e.g. tx and rx).
Fixes: 59c7cf814783 ("arm64: dts: qcom: sm8350: Add UFS nodes") Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20220916093603.24263-1-johan+linaro@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sm8350.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi index 3293f76478df..0e5a4fbb5eb1 100644 --- a/arch/arm64/boot/dts/qcom/sm8350.dtsi +++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi @@ -2128,7 +2128,7 @@
ufs_mem_phy: phy@1d87000 { compatible = "qcom,sm8350-qmp-ufs-phy"; - reg = <0 0x01d87000 0 0xe10>; + reg = <0 0x01d87000 0 0x1c4>; #address-cells = <2>; #size-cells = <2>; ranges;
From: Martin Povišer povik+lin@cutebit.org
[ Upstream commit 0a0342ede303fc420f3a388e1ae82da3ae8ff6bd ]
On probe of the ASoC component, the device is reset but the regcache is retained. This means the regcache gets out of sync if the codec is rebound to a sound card for a second time. Fix it by reinitializing the regcache to defaults after the device is reset.
Fixes: b0bcbe615756 ("ASoC: tas2770: Fix calling reset in probe") Signed-off-by: Martin Povišer povik+lin@cutebit.org Link: https://lore.kernel.org/r/20220919173453.84292-1-povik+lin@cutebit.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/tas2770.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/codecs/tas2770.c b/sound/soc/codecs/tas2770.c index 9ea2aca65e89..e02ad765351b 100644 --- a/sound/soc/codecs/tas2770.c +++ b/sound/soc/codecs/tas2770.c @@ -495,6 +495,8 @@ static struct snd_soc_dai_driver tas2770_dai_driver[] = { }, };
+static const struct regmap_config tas2770_i2c_regmap; + static int tas2770_codec_probe(struct snd_soc_component *component) { struct tas2770_priv *tas2770 = @@ -508,6 +510,7 @@ static int tas2770_codec_probe(struct snd_soc_component *component) }
tas2770_reset(tas2770); + regmap_reinit_cache(tas2770->regmap, &tas2770_i2c_regmap);
return 0; }
From: Philippe Schenker philippe.schenker@toradex.com
[ Upstream commit da73a94fa282f78d485bd0aab36c8ac15b6f792c ]
Currently the bridge driver does not take care whether or not the display needs positive/negative vertical/horizontal syncs. Pass these two flags to the bridge from the EDID that was read out from the display.
Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge") Signed-off-by: Philippe Schenker philippe.schenker@toradex.com Acked-by: Adrien Grassein adrien.grassein@gmail.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-2-dev@psc... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/lontium-lt8912b.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index c642d1e02b2f..e011a2763621 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -266,7 +266,7 @@ static int lt8912_video_setup(struct lt8912 *lt) u32 hactive, h_total, hpw, hfp, hbp; u32 vactive, v_total, vpw, vfp, vbp; u8 settle = 0x08; - int ret; + int ret, hsync_activehigh, vsync_activehigh;
if (!lt) return -EINVAL; @@ -276,12 +276,14 @@ static int lt8912_video_setup(struct lt8912 *lt) hpw = lt->mode.hsync_len; hbp = lt->mode.hback_porch; h_total = hactive + hfp + hpw + hbp; + hsync_activehigh = lt->mode.flags & DISPLAY_FLAGS_HSYNC_HIGH;
vactive = lt->mode.vactive; vfp = lt->mode.vfront_porch; vpw = lt->mode.vsync_len; vbp = lt->mode.vback_porch; v_total = vactive + vfp + vpw + vbp; + vsync_activehigh = lt->mode.flags & DISPLAY_FLAGS_VSYNC_HIGH;
if (vactive <= 600) settle = 0x04; @@ -315,6 +317,11 @@ static int lt8912_video_setup(struct lt8912 *lt) ret |= regmap_write(lt->regmap[I2C_CEC_DSI], 0x3e, hfp & 0xff); ret |= regmap_write(lt->regmap[I2C_CEC_DSI], 0x3f, hfp >> 8);
+ ret |= regmap_update_bits(lt->regmap[I2C_MAIN], 0xab, BIT(0), + vsync_activehigh ? BIT(0) : 0); + ret |= regmap_update_bits(lt->regmap[I2C_MAIN], 0xab, BIT(1), + hsync_activehigh ? BIT(1) : 0); + return ret; }
From: Philippe Schenker philippe.schenker@toradex.com
[ Upstream commit 6dd1de12e1243f2013e4fabf31e99e63b1a860d0 ]
The Lontium LT8912 does have a setting for DVI or HDMI. This patch reads from EDID what the display needs and sets it accordingly.
Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge") Signed-off-by: Philippe Schenker philippe.schenker@toradex.com Acked-by: Adrien Grassein adrien.grassein@gmail.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-3-dev@psc... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/lontium-lt8912b.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index e011a2763621..bab3772c8407 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -321,6 +321,8 @@ static int lt8912_video_setup(struct lt8912 *lt) vsync_activehigh ? BIT(0) : 0); ret |= regmap_update_bits(lt->regmap[I2C_MAIN], 0xab, BIT(1), hsync_activehigh ? BIT(1) : 0); + ret |= regmap_update_bits(lt->regmap[I2C_MAIN], 0xb2, BIT(0), + lt->connector.display_info.is_hdmi ? BIT(0) : 0);
return ret; }
From: Francesco Dolcini francesco.dolcini@toradex.com
[ Upstream commit 051ad2788d35ca07aec8402542e5d38429f2426a ]
Correct I2C address for the register list in lt8912_write_lvds_config(), these registers are on the first I2C address (0x48), the current function is just writing garbage to the wrong registers and this creates multiple issues (artifacts and output completely corrupted) on some HDMI displays.
Correct I2C address comes from Lontium documentation and it is the one used on other out-of-tree LT8912B drivers [1].
[1] https://github.com/boundarydevices/linux/blob/boundary-imx_5.10.x_2.0.0/driv...
Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge") Signed-off-by: Francesco Dolcini francesco.dolcini@toradex.com Signed-off-by: Philippe Schenker philippe.schenker@toradex.com Acked-by: Adrien Grassein adrien.grassein@gmail.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-4-dev@psc... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/lontium-lt8912b.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/bridge/lontium-lt8912b.c b/drivers/gpu/drm/bridge/lontium-lt8912b.c index bab3772c8407..167cd7d85dbb 100644 --- a/drivers/gpu/drm/bridge/lontium-lt8912b.c +++ b/drivers/gpu/drm/bridge/lontium-lt8912b.c @@ -186,7 +186,7 @@ static int lt8912_write_lvds_config(struct lt8912 *lt) {0x03, 0xff}, };
- return regmap_multi_reg_write(lt->regmap[I2C_CEC_DSI], seq, ARRAY_SIZE(seq)); + return regmap_multi_reg_write(lt->regmap[I2C_MAIN], seq, ARRAY_SIZE(seq)); };
static inline struct lt8912 *bridge_to_lt8912(struct drm_bridge *b)
From: Radhey Shyam Pandey radhey.shyam.pandey@amd.com
[ Upstream commit f22bd29ba19a43e758b192429613e04aa7abb70d ]
When GEM is in SGMII mode and disabled as a wakeup source, the power management controller can power down the entire full power domain(FPD) if none of the FPD devices are in use.
Incase of FPD off, there are below ethernet link up issues on non-wakeup suspend/resume. To fix it add phy_exit() in suspend and phy_init() in the resume path which reinitializes PS GTR SGMII lanes.
$ echo +20 > /sys/class/rtc/rtc0/wakealarm $ echo mem > /sys/power/state
After resume:
$ ifconfig eth0 up xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout phy phy-fd400000.phy.0: phy poweron failed --> -110 xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout SIOCSIFFLAGS: Connection timed out phy phy-fd400000.phy.0: phy poweron failed --> -110
Fixes: 8b73fa3ae02b ("net: macb: Added ZynqMP-specific initialization") Signed-off-by: Radhey Shyam Pandey radhey.shyam.pandey@amd.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb_main.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index d89098f4ede8..e9aa41949a4b 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -5092,6 +5092,7 @@ static int __maybe_unused macb_suspend(struct device *dev) if (!(bp->wol & MACB_WOL_ENABLED)) { rtnl_lock(); phylink_stop(bp->phylink); + phy_exit(bp->sgmii_phy); rtnl_unlock(); spin_lock_irqsave(&bp->lock, flags); macb_reset_hw(bp); @@ -5181,6 +5182,9 @@ static int __maybe_unused macb_resume(struct device *dev) macb_set_rx_mode(netdev); macb_restore_features(bp); rtnl_lock(); + if (!device_may_wakeup(&bp->dev->dev)) + phy_init(bp->sgmii_phy); + phylink_start(bp->phylink); rtnl_unlock();
From: Brian Norris briannorris@chromium.org
[ Upstream commit cc62d98bd56d45de4531844ca23913a15136c05b ]
This reverts commit 211f276ed3d96e964d2d1106a198c7f4a4b3f4c0.
For quite some time, core DRM helpers already ensure that any relevant connectors/CRTCs/etc. are disabled, as well as their associated components (e.g., bridges) when suspending the system. Thus, analogix_dp_bridge_{enable,disable}() already get called, which in turn call drm_panel_{prepare,unprepare}(). This makes these drm_panel_*() calls redundant.
Besides redundancy, there are a few problems with this handling:
(1) drm_panel_{prepare,unprepare}() are *not* reference-counted APIs and are not in general designed to be handled by multiple callers -- although some panel drivers have a coarse 'prepared' flag that mitigates some damage, at least. So at a minimum this is redundant and confusing, but in some cases, this could be actively harmful.
(2) The error-handling is a bit non-standard. We ignored errors in suspend(), but handled errors in resume(). And recently, people noticed that the clk handling is unbalanced in error paths, and getting *that* right is not actually trivial, given the current way errors are mostly ignored.
(3) In the particular way analogix_dp_{suspend,resume}() get used (e.g., in rockchip_dp_*(), as a late/early callback), we don't necessarily have a proper PM relationship between the DP/bridge device and the panel device. So while the DP bridge gets resumed, the panel's parent device (e.g., platform_device) may still be suspended, and so any prepare() calls may fail.
So remove the superfluous, possibly-harmful suspend()/resume() handling of panel state.
Fixes: 211f276ed3d9 ("drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time") Link: https://lore.kernel.org/all/Yv2CPBD3Picg%2FgVe@google.com/ Signed-off-by: Brian Norris briannorris@chromium.org Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://patchwork.freedesktop.org/patch/msgid/20220822180729.1.I8ac5abe3a4c1... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 13 ------------- 1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index 01c8b80e34ec..41431b9d55bd 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1863,12 +1863,6 @@ EXPORT_SYMBOL_GPL(analogix_dp_remove); int analogix_dp_suspend(struct analogix_dp_device *dp) { clk_disable_unprepare(dp->clock); - - if (dp->plat_data->panel) { - if (drm_panel_unprepare(dp->plat_data->panel)) - DRM_ERROR("failed to turnoff the panel\n"); - } - return 0; } EXPORT_SYMBOL_GPL(analogix_dp_suspend); @@ -1883,13 +1877,6 @@ int analogix_dp_resume(struct analogix_dp_device *dp) return ret; }
- if (dp->plat_data->panel) { - if (drm_panel_prepare(dp->plat_data->panel)) { - DRM_ERROR("failed to setup the panel\n"); - return -EBUSY; - } - } - return 0; } EXPORT_SYMBOL_GPL(analogix_dp_resume);
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit a54dc27bd25f20ee3ea2009584b3166d25178243 ]
devm_gpiod_get_optional() may return ERR_PTR(-EPROBE_DEFER), add a minus sign to fix it.
Fixes: 6ccb1d8f78bd ("Input: add MELFAS MIP4 Touchscreen driver") Signed-off-by: Yang Yingliang yangyingliang@huawei.com Link: https://lore.kernel.org/r/20220924030715.1653538-1-yangyingliang@huawei.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/touchscreen/melfas_mip4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/input/touchscreen/melfas_mip4.c b/drivers/input/touchscreen/melfas_mip4.c index 2745bf1aee38..83f4be05e27b 100644 --- a/drivers/input/touchscreen/melfas_mip4.c +++ b/drivers/input/touchscreen/melfas_mip4.c @@ -1453,7 +1453,7 @@ static int mip4_probe(struct i2c_client *client, const struct i2c_device_id *id) "ce", GPIOD_OUT_LOW); if (IS_ERR(ts->gpio_ce)) { error = PTR_ERR(ts->gpio_ce); - if (error != EPROBE_DEFER) + if (error != -EPROBE_DEFER) dev_err(&client->dev, "Failed to get gpio: %d\n", error); return error;
From: Pali Rohár pali@kernel.org
[ Upstream commit 4335417da2b8d6d9b2d4411b5f9e248e5bb2d380 ]
pwm support incompatible with Armada 80x0/70x0 API is not only in Armada 370, but also in Armada XP, 38x and 39x. So basically every non-A8K platform. Fix check for pwm support appropriately.
Fixes: 85b7d8abfec7 ("gpio: mvebu: add pwm support for Armada 8K/7K") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-mvebu.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c index 2db19cd640a4..de1e7a1a76f2 100644 --- a/drivers/gpio/gpio-mvebu.c +++ b/drivers/gpio/gpio-mvebu.c @@ -793,8 +793,12 @@ static int mvebu_pwm_probe(struct platform_device *pdev, u32 offset; u32 set;
- if (of_device_is_compatible(mvchip->chip.of_node, - "marvell,armada-370-gpio")) { + if (mvchip->soc_variant == MVEBU_GPIO_SOC_VARIANT_A8K) { + int ret = of_property_read_u32(dev->of_node, + "marvell,pwm-offset", &offset); + if (ret < 0) + return 0; + } else { /* * There are only two sets of PWM configuration registers for * all the GPIO lines on those SoCs which this driver reserves @@ -804,13 +808,6 @@ static int mvebu_pwm_probe(struct platform_device *pdev, if (!platform_get_resource_byname(pdev, IORESOURCE_MEM, "pwm")) return 0; offset = 0; - } else if (mvchip->soc_variant == MVEBU_GPIO_SOC_VARIANT_A8K) { - int ret = of_property_read_u32(dev->of_node, - "marvell,pwm-offset", &offset); - if (ret < 0) - return 0; - } else { - return 0; }
if (IS_ERR(mvchip->clk))
From: Ian Rogers irogers@google.com
[ Upstream commit 9b7c7728f4e4ba8dd75269fb111fa187faa018c6 ]
Move print_*_events functions out of parse-events.c into a new print-events.c. Move tracepoint code into tracepoint.c or trace-event-info.c (sole user). This reduces the dependencies of parse-events.c and makes it more amenable to being a library in the future.
Remove some unnecessary definitions from parse-events.h. Fix a checkpatch.pl warning on using unsigned rather than unsigned int. Fix some line length warnings too.
Signed-off-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com [ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: 71c86cda750b ("perf parse-events: Remove "not supported" hybrid cache events") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-list.c | 2 +- tools/perf/builtin-lock.c | 1 + tools/perf/builtin-timechart.c | 1 + tools/perf/builtin-trace.c | 1 + tools/perf/util/Build | 2 + tools/perf/util/parse-events.c | 713 +---------------------------- tools/perf/util/parse-events.h | 31 -- tools/perf/util/print-events.c | 572 +++++++++++++++++++++++ tools/perf/util/print-events.h | 22 + tools/perf/util/trace-event-info.c | 96 ++++ tools/perf/util/tracepoint.c | 63 +++ tools/perf/util/tracepoint.h | 25 + 12 files changed, 791 insertions(+), 738 deletions(-) create mode 100644 tools/perf/util/print-events.c create mode 100644 tools/perf/util/print-events.h create mode 100644 tools/perf/util/tracepoint.c create mode 100644 tools/perf/util/tracepoint.h
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c index 468958154ed9..744dd3520584 100644 --- a/tools/perf/builtin-list.c +++ b/tools/perf/builtin-list.c @@ -10,7 +10,7 @@ */ #include "builtin.h"
-#include "util/parse-events.h" +#include "util/print-events.h" #include "util/pmu.h" #include "util/pmu-hybrid.h" #include "util/debug.h" diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c index 23a33ac15e68..dcc079a80585 100644 --- a/tools/perf/builtin-lock.c +++ b/tools/perf/builtin-lock.c @@ -13,6 +13,7 @@ #include <subcmd/pager.h> #include <subcmd/parse-options.h> #include "util/trace-event.h" +#include "util/tracepoint.h"
#include "util/debug.h" #include "util/session.h" diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c index afce731cec16..e2e9ad929baf 100644 --- a/tools/perf/builtin-timechart.c +++ b/tools/perf/builtin-timechart.c @@ -36,6 +36,7 @@ #include "util/data.h" #include "util/debug.h" #include "util/string2.h" +#include "util/tracepoint.h" #include <linux/err.h>
#ifdef LACKS_OPEN_MEMSTREAM_PROTOTYPE diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index f075cf37a65e..1e1f10a1971d 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -53,6 +53,7 @@ #include "trace-event.h" #include "util/parse-events.h" #include "util/bpf-loader.h" +#include "util/tracepoint.h" #include "callchain.h" #include "print_binary.h" #include "string2.h" diff --git a/tools/perf/util/Build b/tools/perf/util/Build index a51267d88ca9..038e4cf8f488 100644 --- a/tools/perf/util/Build +++ b/tools/perf/util/Build @@ -26,6 +26,8 @@ perf-y += mmap.o perf-y += memswap.o perf-y += parse-events.o perf-y += parse-events-hybrid.o +perf-y += print-events.o +perf-y += tracepoint.o perf-y += perf_regs.o perf-y += path.o perf-y += print_binary.o diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 700c95eafd62..3acf7452572c 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -5,18 +5,12 @@ #include <dirent.h> #include <errno.h> #include <sys/ioctl.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <fcntl.h> #include <sys/param.h> #include "term.h" -#include "build-id.h" #include "evlist.h" #include "evsel.h" -#include <subcmd/pager.h> #include <subcmd/parse-options.h> #include "parse-events.h" -#include <subcmd/exec-cmd.h> #include "string2.h" #include "strlist.h" #include "bpf-loader.h" @@ -27,20 +21,22 @@ #define YY_EXTRA_TYPE void* #include "parse-events-flex.h" #include "pmu.h" -#include "thread_map.h" -#include "probe-file.h" #include "asm/bug.h" #include "util/parse-branch-options.h" -#include "metricgroup.h" #include "util/evsel_config.h" #include "util/event.h" -#include "util/pfm.h" +#include "perf.h" #include "util/parse-events-hybrid.h" #include "util/pmu-hybrid.h" -#include "perf.h" +#include "tracepoint.h"
#define MAX_NAME_LEN 100
+struct perf_pmu_event_symbol { + char *symbol; + enum perf_pmu_event_symbol_type type; +}; + #ifdef PARSER_DEBUG extern int parse_events_debug; #endif @@ -154,21 +150,6 @@ struct event_symbol event_symbols_sw[PERF_COUNT_SW_MAX] = { }, };
-struct event_symbol event_symbols_tool[PERF_TOOL_MAX] = { - [PERF_TOOL_DURATION_TIME] = { - .symbol = "duration_time", - .alias = "", - }, - [PERF_TOOL_USER_TIME] = { - .symbol = "user_time", - .alias = "", - }, - [PERF_TOOL_SYSTEM_TIME] = { - .symbol = "system_time", - .alias = "", - }, -}; - #define __PERF_EVENT_FIELD(config, name) \ ((config & PERF_EVENT_##name##_MASK) >> PERF_EVENT_##name##_SHIFT)
@@ -177,121 +158,6 @@ struct event_symbol event_symbols_tool[PERF_TOOL_MAX] = { #define PERF_EVENT_TYPE(config) __PERF_EVENT_FIELD(config, TYPE) #define PERF_EVENT_ID(config) __PERF_EVENT_FIELD(config, EVENT)
-#define for_each_subsystem(sys_dir, sys_dirent) \ - while ((sys_dirent = readdir(sys_dir)) != NULL) \ - if (sys_dirent->d_type == DT_DIR && \ - (strcmp(sys_dirent->d_name, ".")) && \ - (strcmp(sys_dirent->d_name, ".."))) - -static int tp_event_has_id(const char *dir_path, struct dirent *evt_dir) -{ - char evt_path[MAXPATHLEN]; - int fd; - - snprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, evt_dir->d_name); - fd = open(evt_path, O_RDONLY); - if (fd < 0) - return -EINVAL; - close(fd); - - return 0; -} - -#define for_each_event(dir_path, evt_dir, evt_dirent) \ - while ((evt_dirent = readdir(evt_dir)) != NULL) \ - if (evt_dirent->d_type == DT_DIR && \ - (strcmp(evt_dirent->d_name, ".")) && \ - (strcmp(evt_dirent->d_name, "..")) && \ - (!tp_event_has_id(dir_path, evt_dirent))) - -#define MAX_EVENT_LENGTH 512 - -struct tracepoint_path *tracepoint_id_to_path(u64 config) -{ - struct tracepoint_path *path = NULL; - DIR *sys_dir, *evt_dir; - struct dirent *sys_dirent, *evt_dirent; - char id_buf[24]; - int fd; - u64 id; - char evt_path[MAXPATHLEN]; - char *dir_path; - - sys_dir = tracing_events__opendir(); - if (!sys_dir) - return NULL; - - for_each_subsystem(sys_dir, sys_dirent) { - dir_path = get_events_file(sys_dirent->d_name); - if (!dir_path) - continue; - evt_dir = opendir(dir_path); - if (!evt_dir) - goto next; - - for_each_event(dir_path, evt_dir, evt_dirent) { - - scnprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, - evt_dirent->d_name); - fd = open(evt_path, O_RDONLY); - if (fd < 0) - continue; - if (read(fd, id_buf, sizeof(id_buf)) < 0) { - close(fd); - continue; - } - close(fd); - id = atoll(id_buf); - if (id == config) { - put_events_file(dir_path); - closedir(evt_dir); - closedir(sys_dir); - path = zalloc(sizeof(*path)); - if (!path) - return NULL; - if (asprintf(&path->system, "%.*s", MAX_EVENT_LENGTH, sys_dirent->d_name) < 0) { - free(path); - return NULL; - } - if (asprintf(&path->name, "%.*s", MAX_EVENT_LENGTH, evt_dirent->d_name) < 0) { - zfree(&path->system); - free(path); - return NULL; - } - return path; - } - } - closedir(evt_dir); -next: - put_events_file(dir_path); - } - - closedir(sys_dir); - return NULL; -} - -struct tracepoint_path *tracepoint_name_to_path(const char *name) -{ - struct tracepoint_path *path = zalloc(sizeof(*path)); - char *str = strchr(name, ':'); - - if (path == NULL || str == NULL) { - free(path); - return NULL; - } - - path->system = strndup(name, str - name); - path->name = strdup(str+1); - - if (path->system == NULL || path->name == NULL) { - zfree(&path->system); - zfree(&path->name); - zfree(&path); - } - - return path; -} - const char *event_type(int type) { switch (type) { @@ -2674,571 +2540,6 @@ int exclude_perf(const struct option *opt, NULL); }
-static const char * const event_type_descriptors[] = { - "Hardware event", - "Software event", - "Tracepoint event", - "Hardware cache event", - "Raw hardware event descriptor", - "Hardware breakpoint", -}; - -static int cmp_string(const void *a, const void *b) -{ - const char * const *as = a; - const char * const *bs = b; - - return strcmp(*as, *bs); -} - -/* - * Print the events from <debugfs_mount_point>/tracing/events - */ - -void print_tracepoint_events(const char *subsys_glob, const char *event_glob, - bool name_only) -{ - DIR *sys_dir, *evt_dir; - struct dirent *sys_dirent, *evt_dirent; - char evt_path[MAXPATHLEN]; - char *dir_path; - char **evt_list = NULL; - unsigned int evt_i = 0, evt_num = 0; - bool evt_num_known = false; - -restart: - sys_dir = tracing_events__opendir(); - if (!sys_dir) - return; - - if (evt_num_known) { - evt_list = zalloc(sizeof(char *) * evt_num); - if (!evt_list) - goto out_close_sys_dir; - } - - for_each_subsystem(sys_dir, sys_dirent) { - if (subsys_glob != NULL && - !strglobmatch(sys_dirent->d_name, subsys_glob)) - continue; - - dir_path = get_events_file(sys_dirent->d_name); - if (!dir_path) - continue; - evt_dir = opendir(dir_path); - if (!evt_dir) - goto next; - - for_each_event(dir_path, evt_dir, evt_dirent) { - if (event_glob != NULL && - !strglobmatch(evt_dirent->d_name, event_glob)) - continue; - - if (!evt_num_known) { - evt_num++; - continue; - } - - snprintf(evt_path, MAXPATHLEN, "%s:%s", - sys_dirent->d_name, evt_dirent->d_name); - - evt_list[evt_i] = strdup(evt_path); - if (evt_list[evt_i] == NULL) { - put_events_file(dir_path); - goto out_close_evt_dir; - } - evt_i++; - } - closedir(evt_dir); -next: - put_events_file(dir_path); - } - closedir(sys_dir); - - if (!evt_num_known) { - evt_num_known = true; - goto restart; - } - qsort(evt_list, evt_num, sizeof(char *), cmp_string); - evt_i = 0; - while (evt_i < evt_num) { - if (name_only) { - printf("%s ", evt_list[evt_i++]); - continue; - } - printf(" %-50s [%s]\n", evt_list[evt_i++], - event_type_descriptors[PERF_TYPE_TRACEPOINT]); - } - if (evt_num && pager_in_use()) - printf("\n"); - -out_free: - evt_num = evt_i; - for (evt_i = 0; evt_i < evt_num; evt_i++) - zfree(&evt_list[evt_i]); - zfree(&evt_list); - return; - -out_close_evt_dir: - closedir(evt_dir); -out_close_sys_dir: - closedir(sys_dir); - - printf("FATAL: not enough memory to print %s\n", - event_type_descriptors[PERF_TYPE_TRACEPOINT]); - if (evt_list) - goto out_free; -} - -/* - * Check whether event is in <debugfs_mount_point>/tracing/events - */ - -int is_valid_tracepoint(const char *event_string) -{ - DIR *sys_dir, *evt_dir; - struct dirent *sys_dirent, *evt_dirent; - char evt_path[MAXPATHLEN]; - char *dir_path; - - sys_dir = tracing_events__opendir(); - if (!sys_dir) - return 0; - - for_each_subsystem(sys_dir, sys_dirent) { - dir_path = get_events_file(sys_dirent->d_name); - if (!dir_path) - continue; - evt_dir = opendir(dir_path); - if (!evt_dir) - goto next; - - for_each_event(dir_path, evt_dir, evt_dirent) { - snprintf(evt_path, MAXPATHLEN, "%s:%s", - sys_dirent->d_name, evt_dirent->d_name); - if (!strcmp(evt_path, event_string)) { - closedir(evt_dir); - closedir(sys_dir); - return 1; - } - } - closedir(evt_dir); -next: - put_events_file(dir_path); - } - closedir(sys_dir); - return 0; -} - -static bool is_event_supported(u8 type, u64 config) -{ - bool ret = true; - int open_return; - struct evsel *evsel; - struct perf_event_attr attr = { - .type = type, - .config = config, - .disabled = 1, - }; - struct perf_thread_map *tmap = thread_map__new_by_tid(0); - - if (tmap == NULL) - return false; - - evsel = evsel__new(&attr); - if (evsel) { - open_return = evsel__open(evsel, NULL, tmap); - ret = open_return >= 0; - - if (open_return == -EACCES) { - /* - * This happens if the paranoid value - * /proc/sys/kernel/perf_event_paranoid is set to 2 - * Re-run with exclude_kernel set; we don't do that - * by default as some ARM machines do not support it. - * - */ - evsel->core.attr.exclude_kernel = 1; - ret = evsel__open(evsel, NULL, tmap) >= 0; - } - evsel__delete(evsel); - } - - perf_thread_map__put(tmap); - return ret; -} - -void print_sdt_events(const char *subsys_glob, const char *event_glob, - bool name_only) -{ - struct probe_cache *pcache; - struct probe_cache_entry *ent; - struct strlist *bidlist, *sdtlist; - struct strlist_config cfg = {.dont_dupstr = true}; - struct str_node *nd, *nd2; - char *buf, *path, *ptr = NULL; - bool show_detail = false; - int ret; - - sdtlist = strlist__new(NULL, &cfg); - if (!sdtlist) { - pr_debug("Failed to allocate new strlist for SDT\n"); - return; - } - bidlist = build_id_cache__list_all(true); - if (!bidlist) { - pr_debug("Failed to get buildids: %d\n", errno); - return; - } - strlist__for_each_entry(nd, bidlist) { - pcache = probe_cache__new(nd->s, NULL); - if (!pcache) - continue; - list_for_each_entry(ent, &pcache->entries, node) { - if (!ent->sdt) - continue; - if (subsys_glob && - !strglobmatch(ent->pev.group, subsys_glob)) - continue; - if (event_glob && - !strglobmatch(ent->pev.event, event_glob)) - continue; - ret = asprintf(&buf, "%s:%s@%s", ent->pev.group, - ent->pev.event, nd->s); - if (ret > 0) - strlist__add(sdtlist, buf); - } - probe_cache__delete(pcache); - } - strlist__delete(bidlist); - - strlist__for_each_entry(nd, sdtlist) { - buf = strchr(nd->s, '@'); - if (buf) - *(buf++) = '\0'; - if (name_only) { - printf("%s ", nd->s); - continue; - } - nd2 = strlist__next(nd); - if (nd2) { - ptr = strchr(nd2->s, '@'); - if (ptr) - *ptr = '\0'; - if (strcmp(nd->s, nd2->s) == 0) - show_detail = true; - } - if (show_detail) { - path = build_id_cache__origname(buf); - ret = asprintf(&buf, "%s@%s(%.12s)", nd->s, path, buf); - if (ret > 0) { - printf(" %-50s [%s]\n", buf, "SDT event"); - free(buf); - } - free(path); - } else - printf(" %-50s [%s]\n", nd->s, "SDT event"); - if (nd2) { - if (strcmp(nd->s, nd2->s) != 0) - show_detail = false; - if (ptr) - *ptr = '@'; - } - } - strlist__delete(sdtlist); -} - -int print_hwcache_events(const char *event_glob, bool name_only) -{ - unsigned int type, op, i, evt_i = 0, evt_num = 0, npmus = 0; - char name[64], new_name[128]; - char **evt_list = NULL, **evt_pmus = NULL; - bool evt_num_known = false; - struct perf_pmu *pmu = NULL; - - if (perf_pmu__has_hybrid()) { - npmus = perf_pmu__hybrid_pmu_num(); - evt_pmus = zalloc(sizeof(char *) * npmus); - if (!evt_pmus) - goto out_enomem; - } - -restart: - if (evt_num_known) { - evt_list = zalloc(sizeof(char *) * evt_num); - if (!evt_list) - goto out_enomem; - } - - for (type = 0; type < PERF_COUNT_HW_CACHE_MAX; type++) { - for (op = 0; op < PERF_COUNT_HW_CACHE_OP_MAX; op++) { - /* skip invalid cache type */ - if (!evsel__is_cache_op_valid(type, op)) - continue; - - for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) { - unsigned int hybrid_supported = 0, j; - bool supported; - - __evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name)); - if (event_glob != NULL && !strglobmatch(name, event_glob)) - continue; - - if (!perf_pmu__has_hybrid()) { - if (!is_event_supported(PERF_TYPE_HW_CACHE, - type | (op << 8) | (i << 16))) { - continue; - } - } else { - perf_pmu__for_each_hybrid_pmu(pmu) { - if (!evt_num_known) { - evt_num++; - continue; - } - - supported = is_event_supported( - PERF_TYPE_HW_CACHE, - type | (op << 8) | (i << 16) | - ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT)); - if (supported) { - snprintf(new_name, sizeof(new_name), "%s/%s/", - pmu->name, name); - evt_pmus[hybrid_supported] = strdup(new_name); - hybrid_supported++; - } - } - - if (hybrid_supported == 0) - continue; - } - - if (!evt_num_known) { - evt_num++; - continue; - } - - if ((hybrid_supported == 0) || - (hybrid_supported == npmus)) { - evt_list[evt_i] = strdup(name); - if (npmus > 0) { - for (j = 0; j < npmus; j++) - zfree(&evt_pmus[j]); - } - } else { - for (j = 0; j < hybrid_supported; j++) { - evt_list[evt_i++] = evt_pmus[j]; - evt_pmus[j] = NULL; - } - continue; - } - - if (evt_list[evt_i] == NULL) - goto out_enomem; - evt_i++; - } - } - } - - if (!evt_num_known) { - evt_num_known = true; - goto restart; - } - - for (evt_i = 0; evt_i < evt_num; evt_i++) { - if (!evt_list[evt_i]) - break; - } - - evt_num = evt_i; - qsort(evt_list, evt_num, sizeof(char *), cmp_string); - evt_i = 0; - while (evt_i < evt_num) { - if (name_only) { - printf("%s ", evt_list[evt_i++]); - continue; - } - printf(" %-50s [%s]\n", evt_list[evt_i++], - event_type_descriptors[PERF_TYPE_HW_CACHE]); - } - if (evt_num && pager_in_use()) - printf("\n"); - -out_free: - evt_num = evt_i; - for (evt_i = 0; evt_i < evt_num; evt_i++) - zfree(&evt_list[evt_i]); - zfree(&evt_list); - - for (evt_i = 0; evt_i < npmus; evt_i++) - zfree(&evt_pmus[evt_i]); - zfree(&evt_pmus); - return evt_num; - -out_enomem: - printf("FATAL: not enough memory to print %s\n", event_type_descriptors[PERF_TYPE_HW_CACHE]); - if (evt_list) - goto out_free; - return evt_num; -} - -static void print_tool_event(const struct event_symbol *syms, const char *event_glob, - bool name_only) -{ - if (syms->symbol == NULL) - return; - - if (event_glob && !(strglobmatch(syms->symbol, event_glob) || - (syms->alias && strglobmatch(syms->alias, event_glob)))) - return; - - if (name_only) - printf("%s ", syms->symbol); - else { - char name[MAX_NAME_LEN]; - if (syms->alias && strlen(syms->alias)) - snprintf(name, MAX_NAME_LEN, "%s OR %s", syms->symbol, syms->alias); - else - strlcpy(name, syms->symbol, MAX_NAME_LEN); - printf(" %-50s [%s]\n", name, "Tool event"); - } -} - -void print_tool_events(const char *event_glob, bool name_only) -{ - // Start at 1 because the first enum entry symbols no tool event - for (int i = 1; i < PERF_TOOL_MAX; ++i) { - print_tool_event(event_symbols_tool + i, event_glob, name_only); - } - if (pager_in_use()) - printf("\n"); -} - -void print_symbol_events(const char *event_glob, unsigned type, - struct event_symbol *syms, unsigned max, - bool name_only) -{ - unsigned int i, evt_i = 0, evt_num = 0; - char name[MAX_NAME_LEN]; - char **evt_list = NULL; - bool evt_num_known = false; - -restart: - if (evt_num_known) { - evt_list = zalloc(sizeof(char *) * evt_num); - if (!evt_list) - goto out_enomem; - syms -= max; - } - - for (i = 0; i < max; i++, syms++) { - /* - * New attr.config still not supported here, the latest - * example was PERF_COUNT_SW_CGROUP_SWITCHES - */ - if (syms->symbol == NULL) - continue; - - if (event_glob != NULL && !(strglobmatch(syms->symbol, event_glob) || - (syms->alias && strglobmatch(syms->alias, event_glob)))) - continue; - - if (!is_event_supported(type, i)) - continue; - - if (!evt_num_known) { - evt_num++; - continue; - } - - if (!name_only && strlen(syms->alias)) - snprintf(name, MAX_NAME_LEN, "%s OR %s", syms->symbol, syms->alias); - else - strlcpy(name, syms->symbol, MAX_NAME_LEN); - - evt_list[evt_i] = strdup(name); - if (evt_list[evt_i] == NULL) - goto out_enomem; - evt_i++; - } - - if (!evt_num_known) { - evt_num_known = true; - goto restart; - } - qsort(evt_list, evt_num, sizeof(char *), cmp_string); - evt_i = 0; - while (evt_i < evt_num) { - if (name_only) { - printf("%s ", evt_list[evt_i++]); - continue; - } - printf(" %-50s [%s]\n", evt_list[evt_i++], event_type_descriptors[type]); - } - if (evt_num && pager_in_use()) - printf("\n"); - -out_free: - evt_num = evt_i; - for (evt_i = 0; evt_i < evt_num; evt_i++) - zfree(&evt_list[evt_i]); - zfree(&evt_list); - return; - -out_enomem: - printf("FATAL: not enough memory to print %s\n", event_type_descriptors[type]); - if (evt_list) - goto out_free; -} - -/* - * Print the help text for the event symbols: - */ -void print_events(const char *event_glob, bool name_only, bool quiet_flag, - bool long_desc, bool details_flag, bool deprecated, - const char *pmu_name) -{ - print_symbol_events(event_glob, PERF_TYPE_HARDWARE, - event_symbols_hw, PERF_COUNT_HW_MAX, name_only); - - print_symbol_events(event_glob, PERF_TYPE_SOFTWARE, - event_symbols_sw, PERF_COUNT_SW_MAX, name_only); - print_tool_events(event_glob, name_only); - - print_hwcache_events(event_glob, name_only); - - print_pmu_events(event_glob, name_only, quiet_flag, long_desc, - details_flag, deprecated, pmu_name); - - if (event_glob != NULL) - return; - - if (!name_only) { - printf(" %-50s [%s]\n", - "rNNN", - event_type_descriptors[PERF_TYPE_RAW]); - printf(" %-50s [%s]\n", - "cpu/t1=v1[,t2=v2,t3 ...]/modifier", - event_type_descriptors[PERF_TYPE_RAW]); - if (pager_in_use()) - printf(" (see 'man perf-list' on how to encode it)\n\n"); - - printf(" %-50s [%s]\n", - "mem:<addr>[/len][:access]", - event_type_descriptors[PERF_TYPE_BREAKPOINT]); - if (pager_in_use()) - printf("\n"); - } - - print_tracepoint_events(NULL, NULL, name_only); - - print_sdt_events(NULL, NULL, name_only); - - metricgroup__print(true, true, NULL, name_only, details_flag, - pmu_name); - - print_libpfm_events(name_only, long_desc); -} - int parse_events__is_hardcoded_term(struct parse_events_term *term) { return term->type_term != PARSE_EVENTS__TERM_TYPE_USER; diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index a38b8b160e80..ba9fa3ddaf6e 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -11,7 +11,6 @@ #include <linux/perf_event.h> #include <string.h>
-struct list_head; struct evsel; struct evlist; struct parse_events_error; @@ -19,14 +18,6 @@ struct parse_events_error; struct option; struct perf_pmu;
-struct tracepoint_path { - char *system; - char *name; - struct tracepoint_path *next; -}; - -struct tracepoint_path *tracepoint_id_to_path(u64 config); -struct tracepoint_path *tracepoint_name_to_path(const char *name); bool have_tracepoints(struct list_head *evlist);
const char *event_type(int type); @@ -46,8 +37,6 @@ int parse_events_terms(struct list_head *terms, const char *str); int parse_filter(const struct option *opt, const char *str, int unset); int exclude_perf(const struct option *opt, const char *arg, int unset);
-#define EVENTS_HELP_MAX (128*1024) - enum perf_pmu_event_symbol_type { PMU_EVENT_SYMBOL_ERR, /* not a PMU EVENT */ PMU_EVENT_SYMBOL, /* normal style PMU event */ @@ -56,11 +45,6 @@ enum perf_pmu_event_symbol_type { PMU_EVENT_SYMBOL_SUFFIX2, /* suffix of pre-suf2 style event */ };
-struct perf_pmu_event_symbol { - char *symbol; - enum perf_pmu_event_symbol_type type; -}; - enum { PARSE_EVENTS__TERM_TYPE_NUM, PARSE_EVENTS__TERM_TYPE_STR, @@ -219,28 +203,13 @@ void parse_events_update_lists(struct list_head *list_event, void parse_events_evlist_error(struct parse_events_state *parse_state, int idx, const char *str);
-void print_events(const char *event_glob, bool name_only, bool quiet, - bool long_desc, bool details_flag, bool deprecated, - const char *pmu_name); - struct event_symbol { const char *symbol; const char *alias; }; extern struct event_symbol event_symbols_hw[]; extern struct event_symbol event_symbols_sw[]; -void print_symbol_events(const char *event_glob, unsigned type, - struct event_symbol *syms, unsigned max, - bool name_only); -void print_tool_events(const char *event_glob, bool name_only); -void print_tracepoint_events(const char *subsys_glob, const char *event_glob, - bool name_only); -int print_hwcache_events(const char *event_glob, bool name_only); -void print_sdt_events(const char *subsys_glob, const char *event_glob, - bool name_only); -int is_valid_tracepoint(const char *event_string);
-int valid_event_mount(const char *eventfs); char *parse_events_formats_error_string(char *additional_terms);
void parse_events_error__init(struct parse_events_error *err); diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c new file mode 100644 index 000000000000..ba1ab5134685 --- /dev/null +++ b/tools/perf/util/print-events.c @@ -0,0 +1,572 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <dirent.h> +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/param.h> + +#include <api/fs/tracing_path.h> +#include <linux/stddef.h> +#include <linux/perf_event.h> +#include <linux/zalloc.h> +#include <subcmd/pager.h> + +#include "build-id.h" +#include "debug.h" +#include "evsel.h" +#include "metricgroup.h" +#include "parse-events.h" +#include "pmu.h" +#include "print-events.h" +#include "probe-file.h" +#include "string2.h" +#include "strlist.h" +#include "thread_map.h" +#include "tracepoint.h" +#include "pfm.h" +#include "pmu-hybrid.h" + +#define MAX_NAME_LEN 100 + +static const char * const event_type_descriptors[] = { + "Hardware event", + "Software event", + "Tracepoint event", + "Hardware cache event", + "Raw hardware event descriptor", + "Hardware breakpoint", +}; + +static const struct event_symbol event_symbols_tool[PERF_TOOL_MAX] = { + [PERF_TOOL_DURATION_TIME] = { + .symbol = "duration_time", + .alias = "", + }, + [PERF_TOOL_USER_TIME] = { + .symbol = "user_time", + .alias = "", + }, + [PERF_TOOL_SYSTEM_TIME] = { + .symbol = "system_time", + .alias = "", + }, +}; + +static int cmp_string(const void *a, const void *b) +{ + const char * const *as = a; + const char * const *bs = b; + + return strcmp(*as, *bs); +} + +/* + * Print the events from <debugfs_mount_point>/tracing/events + */ +void print_tracepoint_events(const char *subsys_glob, + const char *event_glob, bool name_only) +{ + DIR *sys_dir, *evt_dir; + struct dirent *sys_dirent, *evt_dirent; + char evt_path[MAXPATHLEN]; + char *dir_path; + char **evt_list = NULL; + unsigned int evt_i = 0, evt_num = 0; + bool evt_num_known = false; + +restart: + sys_dir = tracing_events__opendir(); + if (!sys_dir) + return; + + if (evt_num_known) { + evt_list = zalloc(sizeof(char *) * evt_num); + if (!evt_list) + goto out_close_sys_dir; + } + + for_each_subsystem(sys_dir, sys_dirent) { + if (subsys_glob != NULL && + !strglobmatch(sys_dirent->d_name, subsys_glob)) + continue; + + dir_path = get_events_file(sys_dirent->d_name); + if (!dir_path) + continue; + evt_dir = opendir(dir_path); + if (!evt_dir) + goto next; + + for_each_event(dir_path, evt_dir, evt_dirent) { + if (event_glob != NULL && + !strglobmatch(evt_dirent->d_name, event_glob)) + continue; + + if (!evt_num_known) { + evt_num++; + continue; + } + + snprintf(evt_path, MAXPATHLEN, "%s:%s", + sys_dirent->d_name, evt_dirent->d_name); + + evt_list[evt_i] = strdup(evt_path); + if (evt_list[evt_i] == NULL) { + put_events_file(dir_path); + goto out_close_evt_dir; + } + evt_i++; + } + closedir(evt_dir); +next: + put_events_file(dir_path); + } + closedir(sys_dir); + + if (!evt_num_known) { + evt_num_known = true; + goto restart; + } + qsort(evt_list, evt_num, sizeof(char *), cmp_string); + evt_i = 0; + while (evt_i < evt_num) { + if (name_only) { + printf("%s ", evt_list[evt_i++]); + continue; + } + printf(" %-50s [%s]\n", evt_list[evt_i++], + event_type_descriptors[PERF_TYPE_TRACEPOINT]); + } + if (evt_num && pager_in_use()) + printf("\n"); + +out_free: + evt_num = evt_i; + for (evt_i = 0; evt_i < evt_num; evt_i++) + zfree(&evt_list[evt_i]); + zfree(&evt_list); + return; + +out_close_evt_dir: + closedir(evt_dir); +out_close_sys_dir: + closedir(sys_dir); + + printf("FATAL: not enough memory to print %s\n", + event_type_descriptors[PERF_TYPE_TRACEPOINT]); + if (evt_list) + goto out_free; +} + +void print_sdt_events(const char *subsys_glob, const char *event_glob, + bool name_only) +{ + struct probe_cache *pcache; + struct probe_cache_entry *ent; + struct strlist *bidlist, *sdtlist; + struct strlist_config cfg = {.dont_dupstr = true}; + struct str_node *nd, *nd2; + char *buf, *path, *ptr = NULL; + bool show_detail = false; + int ret; + + sdtlist = strlist__new(NULL, &cfg); + if (!sdtlist) { + pr_debug("Failed to allocate new strlist for SDT\n"); + return; + } + bidlist = build_id_cache__list_all(true); + if (!bidlist) { + pr_debug("Failed to get buildids: %d\n", errno); + return; + } + strlist__for_each_entry(nd, bidlist) { + pcache = probe_cache__new(nd->s, NULL); + if (!pcache) + continue; + list_for_each_entry(ent, &pcache->entries, node) { + if (!ent->sdt) + continue; + if (subsys_glob && + !strglobmatch(ent->pev.group, subsys_glob)) + continue; + if (event_glob && + !strglobmatch(ent->pev.event, event_glob)) + continue; + ret = asprintf(&buf, "%s:%s@%s", ent->pev.group, + ent->pev.event, nd->s); + if (ret > 0) + strlist__add(sdtlist, buf); + } + probe_cache__delete(pcache); + } + strlist__delete(bidlist); + + strlist__for_each_entry(nd, sdtlist) { + buf = strchr(nd->s, '@'); + if (buf) + *(buf++) = '\0'; + if (name_only) { + printf("%s ", nd->s); + continue; + } + nd2 = strlist__next(nd); + if (nd2) { + ptr = strchr(nd2->s, '@'); + if (ptr) + *ptr = '\0'; + if (strcmp(nd->s, nd2->s) == 0) + show_detail = true; + } + if (show_detail) { + path = build_id_cache__origname(buf); + ret = asprintf(&buf, "%s@%s(%.12s)", nd->s, path, buf); + if (ret > 0) { + printf(" %-50s [%s]\n", buf, "SDT event"); + free(buf); + } + free(path); + } else + printf(" %-50s [%s]\n", nd->s, "SDT event"); + if (nd2) { + if (strcmp(nd->s, nd2->s) != 0) + show_detail = false; + if (ptr) + *ptr = '@'; + } + } + strlist__delete(sdtlist); +} + +static bool is_event_supported(u8 type, unsigned int config) +{ + bool ret = true; + int open_return; + struct evsel *evsel; + struct perf_event_attr attr = { + .type = type, + .config = config, + .disabled = 1, + }; + struct perf_thread_map *tmap = thread_map__new_by_tid(0); + + if (tmap == NULL) + return false; + + evsel = evsel__new(&attr); + if (evsel) { + open_return = evsel__open(evsel, NULL, tmap); + ret = open_return >= 0; + + if (open_return == -EACCES) { + /* + * This happens if the paranoid value + * /proc/sys/kernel/perf_event_paranoid is set to 2 + * Re-run with exclude_kernel set; we don't do that + * by default as some ARM machines do not support it. + * + */ + evsel->core.attr.exclude_kernel = 1; + ret = evsel__open(evsel, NULL, tmap) >= 0; + } + evsel__delete(evsel); + } + + perf_thread_map__put(tmap); + return ret; +} + +int print_hwcache_events(const char *event_glob, bool name_only) +{ + unsigned int type, op, i, evt_i = 0, evt_num = 0, npmus = 0; + char name[64], new_name[128]; + char **evt_list = NULL, **evt_pmus = NULL; + bool evt_num_known = false; + struct perf_pmu *pmu = NULL; + + if (perf_pmu__has_hybrid()) { + npmus = perf_pmu__hybrid_pmu_num(); + evt_pmus = zalloc(sizeof(char *) * npmus); + if (!evt_pmus) + goto out_enomem; + } + +restart: + if (evt_num_known) { + evt_list = zalloc(sizeof(char *) * evt_num); + if (!evt_list) + goto out_enomem; + } + + for (type = 0; type < PERF_COUNT_HW_CACHE_MAX; type++) { + for (op = 0; op < PERF_COUNT_HW_CACHE_OP_MAX; op++) { + /* skip invalid cache type */ + if (!evsel__is_cache_op_valid(type, op)) + continue; + + for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) { + unsigned int hybrid_supported = 0, j; + bool supported; + + __evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name)); + if (event_glob != NULL && !strglobmatch(name, event_glob)) + continue; + + if (!perf_pmu__has_hybrid()) { + if (!is_event_supported(PERF_TYPE_HW_CACHE, + type | (op << 8) | (i << 16))) { + continue; + } + } else { + perf_pmu__for_each_hybrid_pmu(pmu) { + if (!evt_num_known) { + evt_num++; + continue; + } + + supported = is_event_supported( + PERF_TYPE_HW_CACHE, + type | (op << 8) | (i << 16) | + ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT)); + if (supported) { + snprintf(new_name, sizeof(new_name), + "%s/%s/", pmu->name, name); + evt_pmus[hybrid_supported] = + strdup(new_name); + hybrid_supported++; + } + } + + if (hybrid_supported == 0) + continue; + } + + if (!evt_num_known) { + evt_num++; + continue; + } + + if ((hybrid_supported == 0) || + (hybrid_supported == npmus)) { + evt_list[evt_i] = strdup(name); + if (npmus > 0) { + for (j = 0; j < npmus; j++) + zfree(&evt_pmus[j]); + } + } else { + for (j = 0; j < hybrid_supported; j++) { + evt_list[evt_i++] = evt_pmus[j]; + evt_pmus[j] = NULL; + } + continue; + } + + if (evt_list[evt_i] == NULL) + goto out_enomem; + evt_i++; + } + } + } + + if (!evt_num_known) { + evt_num_known = true; + goto restart; + } + + for (evt_i = 0; evt_i < evt_num; evt_i++) { + if (!evt_list[evt_i]) + break; + } + + evt_num = evt_i; + qsort(evt_list, evt_num, sizeof(char *), cmp_string); + evt_i = 0; + while (evt_i < evt_num) { + if (name_only) { + printf("%s ", evt_list[evt_i++]); + continue; + } + printf(" %-50s [%s]\n", evt_list[evt_i++], + event_type_descriptors[PERF_TYPE_HW_CACHE]); + } + if (evt_num && pager_in_use()) + printf("\n"); + +out_free: + evt_num = evt_i; + for (evt_i = 0; evt_i < evt_num; evt_i++) + zfree(&evt_list[evt_i]); + zfree(&evt_list); + + for (evt_i = 0; evt_i < npmus; evt_i++) + zfree(&evt_pmus[evt_i]); + zfree(&evt_pmus); + return evt_num; + +out_enomem: + printf("FATAL: not enough memory to print %s\n", + event_type_descriptors[PERF_TYPE_HW_CACHE]); + if (evt_list) + goto out_free; + return evt_num; +} + +static void print_tool_event(const struct event_symbol *syms, const char *event_glob, + bool name_only) +{ + if (syms->symbol == NULL) + return; + + if (event_glob && !(strglobmatch(syms->symbol, event_glob) || + (syms->alias && strglobmatch(syms->alias, event_glob)))) + return; + + if (name_only) + printf("%s ", syms->symbol); + else { + char name[MAX_NAME_LEN]; + + if (syms->alias && strlen(syms->alias)) + snprintf(name, MAX_NAME_LEN, "%s OR %s", syms->symbol, syms->alias); + else + strlcpy(name, syms->symbol, MAX_NAME_LEN); + printf(" %-50s [%s]\n", name, "Tool event"); + } +} + +void print_tool_events(const char *event_glob, bool name_only) +{ + // Start at 1 because the first enum entry means no tool event. + for (int i = 1; i < PERF_TOOL_MAX; ++i) + print_tool_event(event_symbols_tool + i, event_glob, name_only); + + if (pager_in_use()) + printf("\n"); +} + +void print_symbol_events(const char *event_glob, unsigned int type, + struct event_symbol *syms, unsigned int max, + bool name_only) +{ + unsigned int i, evt_i = 0, evt_num = 0; + char name[MAX_NAME_LEN]; + char **evt_list = NULL; + bool evt_num_known = false; + +restart: + if (evt_num_known) { + evt_list = zalloc(sizeof(char *) * evt_num); + if (!evt_list) + goto out_enomem; + syms -= max; + } + + for (i = 0; i < max; i++, syms++) { + /* + * New attr.config still not supported here, the latest + * example was PERF_COUNT_SW_CGROUP_SWITCHES + */ + if (syms->symbol == NULL) + continue; + + if (event_glob != NULL && !(strglobmatch(syms->symbol, event_glob) || + (syms->alias && strglobmatch(syms->alias, event_glob)))) + continue; + + if (!is_event_supported(type, i)) + continue; + + if (!evt_num_known) { + evt_num++; + continue; + } + + if (!name_only && strlen(syms->alias)) + snprintf(name, MAX_NAME_LEN, "%s OR %s", syms->symbol, syms->alias); + else + strlcpy(name, syms->symbol, MAX_NAME_LEN); + + evt_list[evt_i] = strdup(name); + if (evt_list[evt_i] == NULL) + goto out_enomem; + evt_i++; + } + + if (!evt_num_known) { + evt_num_known = true; + goto restart; + } + qsort(evt_list, evt_num, sizeof(char *), cmp_string); + evt_i = 0; + while (evt_i < evt_num) { + if (name_only) { + printf("%s ", evt_list[evt_i++]); + continue; + } + printf(" %-50s [%s]\n", evt_list[evt_i++], event_type_descriptors[type]); + } + if (evt_num && pager_in_use()) + printf("\n"); + +out_free: + evt_num = evt_i; + for (evt_i = 0; evt_i < evt_num; evt_i++) + zfree(&evt_list[evt_i]); + zfree(&evt_list); + return; + +out_enomem: + printf("FATAL: not enough memory to print %s\n", event_type_descriptors[type]); + if (evt_list) + goto out_free; +} + +/* + * Print the help text for the event symbols: + */ +void print_events(const char *event_glob, bool name_only, bool quiet_flag, + bool long_desc, bool details_flag, bool deprecated, + const char *pmu_name) +{ + print_symbol_events(event_glob, PERF_TYPE_HARDWARE, + event_symbols_hw, PERF_COUNT_HW_MAX, name_only); + + print_symbol_events(event_glob, PERF_TYPE_SOFTWARE, + event_symbols_sw, PERF_COUNT_SW_MAX, name_only); + print_tool_events(event_glob, name_only); + + print_hwcache_events(event_glob, name_only); + + print_pmu_events(event_glob, name_only, quiet_flag, long_desc, + details_flag, deprecated, pmu_name); + + if (event_glob != NULL) + return; + + if (!name_only) { + printf(" %-50s [%s]\n", + "rNNN", + event_type_descriptors[PERF_TYPE_RAW]); + printf(" %-50s [%s]\n", + "cpu/t1=v1[,t2=v2,t3 ...]/modifier", + event_type_descriptors[PERF_TYPE_RAW]); + if (pager_in_use()) + printf(" (see 'man perf-list' on how to encode it)\n\n"); + + printf(" %-50s [%s]\n", + "mem:<addr>[/len][:access]", + event_type_descriptors[PERF_TYPE_BREAKPOINT]); + if (pager_in_use()) + printf("\n"); + } + + print_tracepoint_events(NULL, NULL, name_only); + + print_sdt_events(NULL, NULL, name_only); + + metricgroup__print(true, true, NULL, name_only, details_flag, + pmu_name); + + print_libpfm_events(name_only, long_desc); +} diff --git a/tools/perf/util/print-events.h b/tools/perf/util/print-events.h new file mode 100644 index 000000000000..1da9910d83a6 --- /dev/null +++ b/tools/perf/util/print-events.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_PRINT_EVENTS_H +#define __PERF_PRINT_EVENTS_H + +#include <stdbool.h> + +struct event_symbol; + +void print_events(const char *event_glob, bool name_only, bool quiet_flag, + bool long_desc, bool details_flag, bool deprecated, + const char *pmu_name); +int print_hwcache_events(const char *event_glob, bool name_only); +void print_sdt_events(const char *subsys_glob, const char *event_glob, + bool name_only); +void print_symbol_events(const char *event_glob, unsigned int type, + struct event_symbol *syms, unsigned int max, + bool name_only); +void print_tool_events(const char *event_glob, bool name_only); +void print_tracepoint_events(const char *subsys_glob, const char *event_glob, + bool name_only); + +#endif /* __PERF_PRINT_EVENTS_H */ diff --git a/tools/perf/util/trace-event-info.c b/tools/perf/util/trace-event-info.c index a65f65d0857e..892c323b4ac9 100644 --- a/tools/perf/util/trace-event-info.c +++ b/tools/perf/util/trace-event-info.c @@ -19,16 +19,24 @@ #include <linux/kernel.h> #include <linux/zalloc.h> #include <internal/lib.h> // page_size +#include <sys/param.h>
#include "trace-event.h" +#include "tracepoint.h" #include <api/fs/tracing_path.h> #include "evsel.h" #include "debug.h"
#define VERSION "0.6" +#define MAX_EVENT_LENGTH 512
static int output_fd;
+struct tracepoint_path { + char *system; + char *name; + struct tracepoint_path *next; +};
int bigendian(void) { @@ -400,6 +408,94 @@ put_tracepoints_path(struct tracepoint_path *tps) } }
+static struct tracepoint_path *tracepoint_id_to_path(u64 config) +{ + struct tracepoint_path *path = NULL; + DIR *sys_dir, *evt_dir; + struct dirent *sys_dirent, *evt_dirent; + char id_buf[24]; + int fd; + u64 id; + char evt_path[MAXPATHLEN]; + char *dir_path; + + sys_dir = tracing_events__opendir(); + if (!sys_dir) + return NULL; + + for_each_subsystem(sys_dir, sys_dirent) { + dir_path = get_events_file(sys_dirent->d_name); + if (!dir_path) + continue; + evt_dir = opendir(dir_path); + if (!evt_dir) + goto next; + + for_each_event(dir_path, evt_dir, evt_dirent) { + + scnprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, + evt_dirent->d_name); + fd = open(evt_path, O_RDONLY); + if (fd < 0) + continue; + if (read(fd, id_buf, sizeof(id_buf)) < 0) { + close(fd); + continue; + } + close(fd); + id = atoll(id_buf); + if (id == config) { + put_events_file(dir_path); + closedir(evt_dir); + closedir(sys_dir); + path = zalloc(sizeof(*path)); + if (!path) + return NULL; + if (asprintf(&path->system, "%.*s", + MAX_EVENT_LENGTH, sys_dirent->d_name) < 0) { + free(path); + return NULL; + } + if (asprintf(&path->name, "%.*s", + MAX_EVENT_LENGTH, evt_dirent->d_name) < 0) { + zfree(&path->system); + free(path); + return NULL; + } + return path; + } + } + closedir(evt_dir); +next: + put_events_file(dir_path); + } + + closedir(sys_dir); + return NULL; +} + +static struct tracepoint_path *tracepoint_name_to_path(const char *name) +{ + struct tracepoint_path *path = zalloc(sizeof(*path)); + char *str = strchr(name, ':'); + + if (path == NULL || str == NULL) { + free(path); + return NULL; + } + + path->system = strndup(name, str - name); + path->name = strdup(str+1); + + if (path->system == NULL || path->name == NULL) { + zfree(&path->system); + zfree(&path->name); + zfree(&path); + } + + return path; +} + static struct tracepoint_path * get_tracepoints_path(struct list_head *pattrs) { diff --git a/tools/perf/util/tracepoint.c b/tools/perf/util/tracepoint.c new file mode 100644 index 000000000000..89ef56c43311 --- /dev/null +++ b/tools/perf/util/tracepoint.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "tracepoint.h" + +#include <errno.h> +#include <fcntl.h> +#include <stdio.h> +#include <sys/param.h> +#include <unistd.h> + +#include <api/fs/tracing_path.h> + +int tp_event_has_id(const char *dir_path, struct dirent *evt_dir) +{ + char evt_path[MAXPATHLEN]; + int fd; + + snprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path, evt_dir->d_name); + fd = open(evt_path, O_RDONLY); + if (fd < 0) + return -EINVAL; + close(fd); + + return 0; +} + +/* + * Check whether event is in <debugfs_mount_point>/tracing/events + */ +int is_valid_tracepoint(const char *event_string) +{ + DIR *sys_dir, *evt_dir; + struct dirent *sys_dirent, *evt_dirent; + char evt_path[MAXPATHLEN]; + char *dir_path; + + sys_dir = tracing_events__opendir(); + if (!sys_dir) + return 0; + + for_each_subsystem(sys_dir, sys_dirent) { + dir_path = get_events_file(sys_dirent->d_name); + if (!dir_path) + continue; + evt_dir = opendir(dir_path); + if (!evt_dir) + goto next; + + for_each_event(dir_path, evt_dir, evt_dirent) { + snprintf(evt_path, MAXPATHLEN, "%s:%s", + sys_dirent->d_name, evt_dirent->d_name); + if (!strcmp(evt_path, event_string)) { + closedir(evt_dir); + closedir(sys_dir); + return 1; + } + } + closedir(evt_dir); +next: + put_events_file(dir_path); + } + closedir(sys_dir); + return 0; +} diff --git a/tools/perf/util/tracepoint.h b/tools/perf/util/tracepoint.h new file mode 100644 index 000000000000..c4a110fe87d7 --- /dev/null +++ b/tools/perf/util/tracepoint.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_TRACEPOINT_H +#define __PERF_TRACEPOINT_H + +#include <dirent.h> +#include <string.h> + +int tp_event_has_id(const char *dir_path, struct dirent *evt_dir); + +#define for_each_event(dir_path, evt_dir, evt_dirent) \ + while ((evt_dirent = readdir(evt_dir)) != NULL) \ + if (evt_dirent->d_type == DT_DIR && \ + (strcmp(evt_dirent->d_name, ".")) && \ + (strcmp(evt_dirent->d_name, "..")) && \ + (!tp_event_has_id(dir_path, evt_dirent))) + +#define for_each_subsystem(sys_dir, sys_dirent) \ + while ((sys_dirent = readdir(sys_dir)) != NULL) \ + if (sys_dirent->d_type == DT_DIR && \ + (strcmp(sys_dirent->d_name, ".")) && \ + (strcmp(sys_dirent->d_name, ".."))) + +int is_valid_tracepoint(const char *event_string); + +#endif /* __PERF_TRACEPOINT_H */
From: Zhengjun Xing zhengjun.xing@linux.intel.com
[ Upstream commit e28c07871c3f2107e316c2590d4703496bd114f4 ]
Some hybrid hardware cache events are only available on one CPU PMU. For example, 'L1-dcache-load-misses' is only available on cpu_core.
We have supported in the perf list clearly reporting this info, the function works fine before but recently the argument "config" in API is_event_supported() is changed from "u64" to "unsigned int" which caused a regression, the "perf list" then can not display the PMU prefix for some hybrid cache events.
For the hybrid systems, the PMU type ID is stored at config[63:32], define config to "unsigned int" will miss the PMU type ID information, then the regression happened, the config should be defined as "u64".
Before: # ./perf list |grep "Hardware cache event" L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] node-load-misses [Hardware cache event] node-loads [Hardware cache event]
After: # ./perf list |grep "Hardware cache event" L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] cpu_atom/L1-icache-loads/ [Hardware cache event] cpu_core/L1-dcache-load-misses/ [Hardware cache event] cpu_core/node-load-misses/ [Hardware cache event] cpu_core/node-loads/ [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event]
Fixes: 9b7c7728f4e4ba8d ("perf parse-events: Break out tracepoint and printing") Reported-by: Yi Ammy ammy.yi@intel.com Reviewed-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Xing Zhengjun zhengjun.xing@linux.intel.com Acked-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20220923030013.3726410-1-zhengjun.xing@linux.intel... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: 71c86cda750b ("perf parse-events: Remove "not supported" hybrid cache events") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/print-events.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c index ba1ab5134685..04050d4f6db8 100644 --- a/tools/perf/util/print-events.c +++ b/tools/perf/util/print-events.c @@ -239,7 +239,7 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob, strlist__delete(sdtlist); }
-static bool is_event_supported(u8 type, unsigned int config) +static bool is_event_supported(u8 type, u64 config) { bool ret = true; int open_return;
From: Zhengjun Xing zhengjun.xing@linux.intel.com
[ Upstream commit 71c86cda750b001100e0d6dc04a88449b7381a59 ]
By default, we create two hybrid cache events, one is for cpu_core, and another is for cpu_atom. But Some hybrid hardware cache events are only available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only available on cpu_core, while the 'L1-icache-loads' is only available on cpu_atom. We need to remove "not supported" hybrid cache events. By extending is_event_supported() to global API and using it to check if the hybrid cache events are supported before being created, we can remove the "not supported" hybrid cache events.
Before:
# ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
Performance counter stats for 'system wide':
52,570 cpu_core/L1-dcache-load-misses/ <not supported> cpu_atom/L1-dcache-load-misses/ <not supported> cpu_core/L1-icache-loads/ 1,471,817 cpu_atom/L1-icache-loads/
1.004915229 seconds time elapsed
After:
# ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1
Performance counter stats for 'system wide':
54,510 cpu_core/L1-dcache-load-misses/ 1,441,286 cpu_atom/L1-icache-loads/
1.005114281 seconds time elapsed
Fixes: 30def61f64bac5f5 ("perf parse-events: Create two hybrid cache events") Reported-by: Yi Ammy ammy.yi@intel.com Reviewed-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Xing Zhengjun zhengjun.xing@linux.intel.com Acked-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Ingo Molnar mingo@redhat.com Cc: Jin Yao yao.jin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20220923030013.3726410-2-zhengjun.xing@linux.intel... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/parse-events-hybrid.c | 21 ++++++++++++--- tools/perf/util/parse-events.c | 39 +++++++++++++++++++++++++++ tools/perf/util/parse-events.h | 1 + tools/perf/util/print-events.c | 39 --------------------------- 4 files changed, 57 insertions(+), 43 deletions(-)
diff --git a/tools/perf/util/parse-events-hybrid.c b/tools/perf/util/parse-events-hybrid.c index 284f8eabd3b9..7c9f9150bad5 100644 --- a/tools/perf/util/parse-events-hybrid.c +++ b/tools/perf/util/parse-events-hybrid.c @@ -33,7 +33,8 @@ static void config_hybrid_attr(struct perf_event_attr *attr, * If the PMU type ID is 0, the PERF_TYPE_RAW will be applied. */ attr->type = type; - attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT); + attr->config = (attr->config & PERF_HW_EVENT_MASK) | + ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT); }
static int create_event_hybrid(__u32 config_type, int *idx, @@ -48,13 +49,25 @@ static int create_event_hybrid(__u32 config_type, int *idx, __u64 config = attr->config;
config_hybrid_attr(attr, config_type, pmu->type); + + /* + * Some hybrid hardware cache events are only available on one CPU + * PMU. For example, the 'L1-dcache-load-misses' is only available + * on cpu_core, while the 'L1-icache-loads' is only available on + * cpu_atom. We need to remove "not supported" hybrid cache events. + */ + if (attr->type == PERF_TYPE_HW_CACHE + && !is_event_supported(attr->type, attr->config)) + return 0; + evsel = parse_events__add_event_hybrid(list, idx, attr, name, metric_id, pmu, config_terms); - if (evsel) + if (evsel) { evsel->pmu_name = strdup(pmu->name); - else + if (!evsel->pmu_name) + return -ENOMEM; + } else return -ENOMEM; - attr->type = type; attr->config = config; return 0; diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 3acf7452572c..b51c646c212e 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -29,6 +29,7 @@ #include "util/parse-events-hybrid.h" #include "util/pmu-hybrid.h" #include "tracepoint.h" +#include "thread_map.h"
#define MAX_NAME_LEN 100
@@ -158,6 +159,44 @@ struct event_symbol event_symbols_sw[PERF_COUNT_SW_MAX] = { #define PERF_EVENT_TYPE(config) __PERF_EVENT_FIELD(config, TYPE) #define PERF_EVENT_ID(config) __PERF_EVENT_FIELD(config, EVENT)
+bool is_event_supported(u8 type, u64 config) +{ + bool ret = true; + int open_return; + struct evsel *evsel; + struct perf_event_attr attr = { + .type = type, + .config = config, + .disabled = 1, + }; + struct perf_thread_map *tmap = thread_map__new_by_tid(0); + + if (tmap == NULL) + return false; + + evsel = evsel__new(&attr); + if (evsel) { + open_return = evsel__open(evsel, NULL, tmap); + ret = open_return >= 0; + + if (open_return == -EACCES) { + /* + * This happens if the paranoid value + * /proc/sys/kernel/perf_event_paranoid is set to 2 + * Re-run with exclude_kernel set; we don't do that + * by default as some ARM machines do not support it. + * + */ + evsel->core.attr.exclude_kernel = 1; + ret = evsel__open(evsel, NULL, tmap) >= 0; + } + evsel__delete(evsel); + } + + perf_thread_map__put(tmap); + return ret; +} + const char *event_type(int type) { switch (type) { diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h index ba9fa3ddaf6e..fd97bb74559e 100644 --- a/tools/perf/util/parse-events.h +++ b/tools/perf/util/parse-events.h @@ -19,6 +19,7 @@ struct option; struct perf_pmu;
bool have_tracepoints(struct list_head *evlist); +bool is_event_supported(u8 type, u64 config);
const char *event_type(int type);
diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c index 04050d4f6db8..c4d5d87fae2f 100644 --- a/tools/perf/util/print-events.c +++ b/tools/perf/util/print-events.c @@ -22,7 +22,6 @@ #include "probe-file.h" #include "string2.h" #include "strlist.h" -#include "thread_map.h" #include "tracepoint.h" #include "pfm.h" #include "pmu-hybrid.h" @@ -239,44 +238,6 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob, strlist__delete(sdtlist); }
-static bool is_event_supported(u8 type, u64 config) -{ - bool ret = true; - int open_return; - struct evsel *evsel; - struct perf_event_attr attr = { - .type = type, - .config = config, - .disabled = 1, - }; - struct perf_thread_map *tmap = thread_map__new_by_tid(0); - - if (tmap == NULL) - return false; - - evsel = evsel__new(&attr); - if (evsel) { - open_return = evsel__open(evsel, NULL, tmap); - ret = open_return >= 0; - - if (open_return == -EACCES) { - /* - * This happens if the paranoid value - * /proc/sys/kernel/perf_event_paranoid is set to 2 - * Re-run with exclude_kernel set; we don't do that - * by default as some ARM machines do not support it. - * - */ - evsel->core.attr.exclude_kernel = 1; - ret = evsel__open(evsel, NULL, tmap) >= 0; - } - evsel__delete(evsel); - } - - perf_thread_map__put(tmap); - return ret; -} - int print_hwcache_events(const char *event_glob, bool name_only) { unsigned int type, op, i, evt_i = 0, evt_num = 0, npmus = 0;
From: Peilin Ye peilin.ye@bytedance.com
[ Upstream commit a43206156263fbaf1f2b7f96257441f331e91bb7 ]
Currently usbnet_disconnect() unanchors and frees all deferred URBs using usb_scuttle_anchored_urbs(), which does not free urb->context, causing a memory leak as reported by syzbot.
Use a usb_get_from_anchor() while loop instead, similar to what we did in commit 19cfe912c37b ("Bluetooth: btusb: Fix memory leak in play_deferred"). Also free urb->sg.
Reported-and-tested-by: syzbot+dcd3e13cf4472f2e0ba1@syzkaller.appspotmail.com Fixes: 69ee472f2706 ("usbnet & cdc-ether: Autosuspend for online devices") Fixes: 638c5115a794 ("USBNET: support DMA SG") Signed-off-by: Peilin Ye peilin.ye@bytedance.com Link: https://lore.kernel.org/r/20220923042551.2745-1-yepeilin.cs@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/usbnet.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 0ed09bb91c44..bccf63aac6cd 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -1601,6 +1601,7 @@ void usbnet_disconnect (struct usb_interface *intf) struct usbnet *dev; struct usb_device *xdev; struct net_device *net; + struct urb *urb;
dev = usb_get_intfdata(intf); usb_set_intfdata(intf, NULL); @@ -1617,7 +1618,11 @@ void usbnet_disconnect (struct usb_interface *intf) net = dev->net; unregister_netdev (net);
- usb_scuttle_anchored_urbs(&dev->deferred); + while ((urb = usb_get_from_anchor(&dev->deferred))) { + dev_kfree_skb(urb->context); + kfree(urb->sg); + usb_free_urb(urb); + }
if (dev->driver_info->unbind) dev->driver_info->unbind(dev, intf);
From: Hangyu Hua hbh25y@gmail.com
[ Upstream commit 6e23ec0ba92d426c77a73a9ccab16346e5e0ef49 ]
nf_ct_put need to be called to put the refcount got by tcf_ct_fill_params to avoid possible refcount leak when tcf_ct_flow_table_get fails.
Fixes: c34b961a2492 ("net/sched: act_ct: Create nf flow table per zone") Signed-off-by: Hangyu Hua hbh25y@gmail.com Link: https://lore.kernel.org/r/20220923020046.8021-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/act_ct.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index e013253b10d1..4d44a1bf4a04 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -1393,7 +1393,7 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
err = tcf_ct_flow_table_get(params); if (err) - goto cleanup; + goto cleanup_params;
spin_lock_bh(&c->tcf_lock); goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch); @@ -1408,6 +1408,9 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
return res;
+cleanup_params: + if (params->tmpl) + nf_ct_put(params->tmpl); cleanup: if (goto_ch) tcf_chain_put_by_act(goto_ch);
From: Rafael Mendonca rafaelmendsr@gmail.com
[ Upstream commit c635ebe8d911a93bd849a9419b01a58783de76f1 ]
The label passed to the QDESC_GET for the ETHOFLD TXQ, RXQ, and FLQ, is the 'out' one, which skips the 'out_unlock' label, and thus doesn't unlock the 'uld_mutex' before returning. Additionally, since commit 5148e5950c67 ("cxgb4: add EOTID tracking and software context dump"), the access to these ETHOFLD hardware queues should be protected by the 'mqprio_mutex' instead.
Fixes: 2d0cb84dd973 ("cxgb4: add ETHOFLD hardware queue support") Fixes: 5148e5950c67 ("cxgb4: add EOTID tracking and software context dump") Signed-off-by: Rafael Mendonca rafaelmendsr@gmail.com Reviewed-by: Rahul Lakkireddy rahul.lakkireddy@chelsio.com Link: https://lore.kernel.org/r/20220922175109.764898-1-rafaelmendsr@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/chelsio/cxgb4/cudbg_lib.c | 28 +++++++++++++------ 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c index a7f291c89702..557c591a6ce3 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c @@ -14,6 +14,7 @@ #include "cudbg_entity.h" #include "cudbg_lib.h" #include "cudbg_zlib.h" +#include "cxgb4_tc_mqprio.h"
static const u32 t6_tp_pio_array[][IREG_NUM_ELEM] = { {0x7e40, 0x7e44, 0x020, 28}, /* t6_tp_pio_regs_20_to_3b */ @@ -3458,7 +3459,7 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init, for (i = 0; i < utxq->ntxq; i++) QDESC_GET_TXQ(&utxq->uldtxq[i].q, cudbg_uld_txq_to_qtype(j), - out_unlock); + out_unlock_uld); } }
@@ -3475,7 +3476,7 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init, for (i = 0; i < urxq->nrxq; i++) QDESC_GET_RXQ(&urxq->uldrxq[i].rspq, cudbg_uld_rxq_to_qtype(j), - out_unlock); + out_unlock_uld); }
/* ULD FLQ */ @@ -3487,7 +3488,7 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init, for (i = 0; i < urxq->nrxq; i++) QDESC_GET_FLQ(&urxq->uldrxq[i].fl, cudbg_uld_flq_to_qtype(j), - out_unlock); + out_unlock_uld); }
/* ULD CIQ */ @@ -3500,29 +3501,34 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init, for (i = 0; i < urxq->nciq; i++) QDESC_GET_RXQ(&urxq->uldrxq[base + i].rspq, cudbg_uld_ciq_to_qtype(j), - out_unlock); + out_unlock_uld); } } + mutex_unlock(&uld_mutex); + + if (!padap->tc_mqprio) + goto out;
+ mutex_lock(&padap->tc_mqprio->mqprio_mutex); /* ETHOFLD TXQ */ if (s->eohw_txq) for (i = 0; i < s->eoqsets; i++) QDESC_GET_TXQ(&s->eohw_txq[i].q, - CUDBG_QTYPE_ETHOFLD_TXQ, out); + CUDBG_QTYPE_ETHOFLD_TXQ, out_unlock_mqprio);
/* ETHOFLD RXQ and FLQ */ if (s->eohw_rxq) { for (i = 0; i < s->eoqsets; i++) QDESC_GET_RXQ(&s->eohw_rxq[i].rspq, - CUDBG_QTYPE_ETHOFLD_RXQ, out); + CUDBG_QTYPE_ETHOFLD_RXQ, out_unlock_mqprio);
for (i = 0; i < s->eoqsets; i++) QDESC_GET_FLQ(&s->eohw_rxq[i].fl, - CUDBG_QTYPE_ETHOFLD_FLQ, out); + CUDBG_QTYPE_ETHOFLD_FLQ, out_unlock_mqprio); }
-out_unlock: - mutex_unlock(&uld_mutex); +out_unlock_mqprio: + mutex_unlock(&padap->tc_mqprio->mqprio_mutex);
out: qdesc_info->qdesc_entry_size = sizeof(*qdesc_entry); @@ -3559,6 +3565,10 @@ int cudbg_collect_qdesc(struct cudbg_init *pdbg_init, #undef QDESC_GET
return rc; + +out_unlock_uld: + mutex_unlock(&uld_mutex); + goto out; }
int cudbg_collect_flash(struct cudbg_init *pdbg_init,
From: Peng Wu wupeng58@huawei.com
[ Upstream commit 4774db8dfc6a2e6649920ebb2fc8e2f062c2080d ]
The devm_ioremap() function returns NULL on error, it doesn't return error pointers.
Fixes: 3a1a274e933f ("mlxbf_gige: compute MDIO period based on i1clk") Signed-off-by: Peng Wu wupeng58@huawei.com Link: https://lore.kernel.org/r/20220923023640.116057-1-wupeng58@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c index 4aeb927c3715..aa780b1614a3 100644 --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c @@ -246,8 +246,8 @@ int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige *priv) }
priv->clk_io = devm_ioremap(dev, res->start, resource_size(res)); - if (IS_ERR(priv->clk_io)) - return PTR_ERR(priv->clk_io); + if (!priv->clk_io) + return -ENOMEM;
mlxbf_gige_mdio_cfg(priv);
From: Michael Kelley mikelley@microsoft.com
[ Upstream commit c292a337d0e45a292c301e3cd51c35aa0ae91e95 ]
The IOC_PR_CLEAR and IOC_PR_RELEASE ioctls are non-functional on NVMe devices because the nvme_pr_clear() and nvme_pr_release() functions set the IEKEY field incorrectly. The IEKEY field should be set only when the key is zero (i.e, not specified). The current code does it backwards.
Furthermore, the NVMe spec describes the persistent reservation "clear" function as an option on the reservation release command. The current implementation of nvme_pr_clear() erroneously uses the reservation register command.
Fix these errors. Note that NVMe version 1.3 and later specify that setting the IEKEY field will return an error of Invalid Field in Command. The fix will set IEKEY when the key is zero, which is appropriate as these ioctls consider a zero key to be "unspecified", and the intention of the spec change is to require a valid key.
Tested on a version 1.4 PCI NVMe device in an Azure VM.
Fixes: 1673f1f08c88 ("nvme: move block_device_operations and ns/ctrl freeing to common code") Fixes: 1d277a637a71 ("NVMe: Add persistent reservation ops") Signed-off-by: Michael Kelley mikelley@microsoft.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 6d76fc608b74..326ad33537ed 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2069,14 +2069,14 @@ static int nvme_pr_preempt(struct block_device *bdev, u64 old, u64 new,
static int nvme_pr_clear(struct block_device *bdev, u64 key) { - u32 cdw10 = 1 | (key ? 1 << 3 : 0); + u32 cdw10 = 1 | (key ? 0 : 1 << 3);
- return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_register); + return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release); }
static int nvme_pr_release(struct block_device *bdev, u64 key, enum pr_type type) { - u32 cdw10 = nvme_pr_type(type) << 8 | (key ? 1 << 3 : 0); + u32 cdw10 = nvme_pr_type(type) << 8 | (key ? 0 : 1 << 3);
return nvme_pr_command(bdev, cdw10, key, 0, nvme_cmd_resv_release); }
From: Tamizh Chelvam Raja quic_tamizhr@quicinc.com
[ Upstream commit 64e966d1e84b29c9fa916cfeaabbf4013703942e ]
The Bitrate for HE/EHT MCS6 is calculated wrongly due to the incorrect MCS divisor value for mcs6. Fix it with the proper value.
previous mcs_divisor value = (11769/6144) = 1.915527
fixed mcs_divisor value = (11377/6144) = 1.851725
Fixes: 9c97c88d2f4b ("cfg80211: Add support to calculate and report 4096-QAM HE rates") Signed-off-by: Tamizh Chelvam Raja quic_tamizhr@quicinc.com Link: https://lore.kernel.org/r/20220908181034.9936-1-quic_tamizhr@quicinc.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/util.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/wireless/util.c b/net/wireless/util.c index b7257862e0fe..28b7f120501a 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -1361,7 +1361,7 @@ static u32 cfg80211_calculate_bitrate_he(struct rate_info *rate) 25599, /* 4.166666... */ 17067, /* 2.777777... */ 12801, /* 2.083333... */ - 11769, /* 1.851851... */ + 11377, /* 1.851725... */ 10239, /* 1.666666... */ 8532, /* 1.388888... */ 7680, /* 1.250000... */ @@ -1444,7 +1444,7 @@ static u32 cfg80211_calculate_bitrate_eht(struct rate_info *rate) 25599, /* 4.166666... */ 17067, /* 2.777777... */ 12801, /* 2.083333... */ - 11769, /* 1.851851... */ + 11377, /* 1.851725... */ 10239, /* 1.666666... */ 8532, /* 1.388888... */ 7680, /* 1.250000... */
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit d873697ef2b7e1b6fdd8e9d449d9354bd5d29a4a ]
Commit 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port") changed ieee80211_tx_control_port() to aways call __ieee80211_select_queue() without checking local->hw.queues.
__ieee80211_select_queue() returns a queue-id between 0 and 3, which means that now ieee80211_tx_control_port() may end up setting the queue-mapping for a skb to a value higher then local->hw.queues if local->hw.queues is less then 4.
Specifically this is a problem for ralink rt2500-pci cards where local->hw.queues is 2. There this causes rt2x00queue_get_tx_queue() to return NULL and the following error to be logged: "ieee80211 phy0: rt2x00mac_tx: Error - Attempt to send packet over invalid queue 2", after which association with the AP fails.
Other callers of __ieee80211_select_queue() skip calling it when local->hw.queues < IEEE80211_NUM_ACS, add the same check to ieee80211_tx_control_port(). This fixes ralink rt2500-pci and similar cards when less then 4 tx-queues no longer working.
Fixes: 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port") Cc: Markus Theil markus.theil@tu-ilmenau.de Suggested-by: Stanislaw Gruszka stf_xl@wp.pl Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20220918192052.443529-1-hdegoede@redhat.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/tx.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 3cd24d8170d3..f6f09a3506aa 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -5761,6 +5761,9 @@ int ieee80211_tx_control_port(struct wiphy *wiphy, struct net_device *dev, skb_reset_network_header(skb); skb_reset_mac_header(skb);
+ if (local->hw.queues < IEEE80211_NUM_ACS) + goto start_xmit; + /* update QoS header to prioritize control port frames if possible, * priorization also happens for control port frames send over * AF_PACKET @@ -5776,6 +5779,7 @@ int ieee80211_tx_control_port(struct wiphy *wiphy, struct net_device *dev,
rcu_read_unlock();
+start_xmit: /* mutex lock is only needed for incrementing the cookie counter */ mutex_lock(&local->mtx);
From: Paweł Lenkow pawel.lenkow@camlingroup.com
[ Upstream commit be92292b90bfdc31f332c962882b6d3ea0285fdf ]
During our testing of WFM200 module over SDIO on i.MX6Q-based platform, we discovered a memory corruption on the system, tracing back to the wfx driver. Using kfence, it was possible to trace it back to the root cause, which is hw->max_rates set to 8 in wfx_init_common, while the maximum defined by IEEE80211_TX_TABLE_SIZE is 4.
This causes array out-of-bounds writes during updates of the rate table, as seen below:
BUG: KFENCE: memory corruption in kfree_rcu_work+0x320/0x36c
Corrupted memory at 0xe0a4ffe0 [ 0x03 0x03 0x03 0x03 0x01 0x00 0x00 0x02 0x02 0x02 0x09 0x00 0x21 0xbb 0xbb 0xbb ] (in kfence-#81): kfree_rcu_work+0x320/0x36c process_one_work+0x3ec/0x920 worker_thread+0x60/0x7a4 kthread+0x174/0x1b4 ret_from_fork+0x14/0x2c 0x0
kfence-#81: 0xe0a4ffc0-0xe0a4ffdf, size=32, cache=kmalloc-64
allocated by task 297 on cpu 0 at 631.039555s: minstrel_ht_update_rates+0x38/0x2b0 [mac80211] rate_control_tx_status+0xb4/0x148 [mac80211] ieee80211_tx_status_ext+0x364/0x1030 [mac80211] ieee80211_tx_status+0xe0/0x118 [mac80211] ieee80211_tasklet_handler+0xb0/0xe0 [mac80211] tasklet_action_common.constprop.0+0x11c/0x148 __do_softirq+0x1a4/0x61c irq_exit+0xcc/0x104 call_with_stack+0x18/0x20 __irq_svc+0x80/0xb0 wq_worker_sleeping+0x10/0x100 wq_worker_sleeping+0x10/0x100 schedule+0x50/0xe0 schedule_timeout+0x2e0/0x474 wait_for_completion+0xdc/0x1ec mmc_wait_for_req_done+0xc4/0xf8 mmc_io_rw_extended+0x3b4/0x4ec sdio_io_rw_ext_helper+0x290/0x384 sdio_memcpy_toio+0x30/0x38 wfx_sdio_copy_to_io+0x88/0x108 [wfx] wfx_data_write+0x88/0x1f0 [wfx] bh_work+0x1c8/0xcc0 [wfx] process_one_work+0x3ec/0x920 worker_thread+0x60/0x7a4 kthread+0x174/0x1b4 ret_from_fork+0x14/0x2c 0x0
After discussion on the wireless mailing list it was clarified that the issue has been introduced by: commit ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates") and fix shall be in minstrel_ht_update_rates in rc80211_minstrel_ht.c.
Fixes: ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates") Link: https://lore.kernel.org/all/12e5adcd-8aed-f0f7-70cc-4fb7b656b829@camlingroup... Link: https://lore.kernel.org/linux-wireless/20220915131445.30600-1-lech.perczak@c... Cc: Jérôme Pouiller jerome.pouiller@silabs.com Cc: Johannes Berg johannes@sipsolutions.net Cc: Peter Seiderer ps.report@gmx.net Cc: Kalle Valo kvalo@kernel.org Cc: Krzysztof Drobiński krzysztof.drobinski@camlingroup.com, Signed-off-by: Paweł Lenkow pawel.lenkow@camlingroup.com Signed-off-by: Lech Perczak lech.perczak@camlingroup.com Reviewed-by: Peter Seiderer ps.report@gmx.net Reviewed-by: Jérôme Pouiller jerome.pouiller@silabs.com Acked-by: Felix Fietkau nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/rc80211_minstrel_ht.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c index 5f27e6746762..788a82f9c74d 100644 --- a/net/mac80211/rc80211_minstrel_ht.c +++ b/net/mac80211/rc80211_minstrel_ht.c @@ -10,6 +10,7 @@ #include <linux/random.h> #include <linux/moduleparam.h> #include <linux/ieee80211.h> +#include <linux/minmax.h> #include <net/mac80211.h> #include "rate.h" #include "sta_info.h" @@ -1550,6 +1551,7 @@ minstrel_ht_update_rates(struct minstrel_priv *mp, struct minstrel_ht_sta *mi) { struct ieee80211_sta_rates *rates; int i = 0; + int max_rates = min_t(int, mp->hw->max_rates, IEEE80211_TX_RATE_TABLE_SIZE);
rates = kzalloc(sizeof(*rates), GFP_ATOMIC); if (!rates) @@ -1559,10 +1561,10 @@ minstrel_ht_update_rates(struct minstrel_priv *mp, struct minstrel_ht_sta *mi) minstrel_ht_set_rate(mp, mi, rates, i++, mi->max_tp_rate[0]);
/* Fill up remaining, keep one entry for max_probe_rate */ - for (; i < (mp->hw->max_rates - 1); i++) + for (; i < (max_rates - 1); i++) minstrel_ht_set_rate(mp, mi, rates, i, mi->max_tp_rate[i]);
- if (i < mp->hw->max_rates) + if (i < max_rates) minstrel_ht_set_rate(mp, mi, rates, i++, mi->max_prob_rate);
if (i < IEEE80211_TX_RATE_TABLE_SIZE)
From: Junxiao Chang junxiao.chang@intel.com
[ Upstream commit 49725ffc15fc4e9fae68c55b691fd25168cbe5c1 ]
This commit fixes DMA engine reset timeout issue in suspend/resume with ADLink I-Pi SMARC Plus board which dmesg shows: ... [ 54.678271] PM: suspend exit [ 54.754066] intel-eth-pci 0000:00:1d.2 enp0s29f2: PHY [stmmac-3:01] driver [Maxlinear Ethernet GPY215B] (irq=POLL) [ 54.755808] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-0 ... [ 54.780482] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-7 [ 55.784098] intel-eth-pci 0000:00:1d.2: Failed to reset the dma [ 55.784111] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_hw_setup: DMA engine initialization failed [ 55.784115] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_open: Hw setup failed ...
The issue is related with serdes which impacts clock. There is serdes in ADLink I-Pi SMARC board ethernet controller. Please refer to commit b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence") for detial. When issue is reproduced, DMA engine clock is not ready because serdes is not powered up.
To reproduce DMA engine reset timeout issue with hardware which has serdes in GBE controller, install Ubuntu. In Ubuntu GUI, click "Power Off/Log Out" -> "Suspend" menu, it disables network interface, then goes to sleep mode. When it wakes up, it enables network interface again. Stmmac driver is called in this way:
1. stmmac_release: Stop network interface. In this function, it disables DMA engine and network interface; 2. stmmac_suspend: It is called in kernel suspend flow. But because network interface has been disabled(netif_running(ndev) is false), it does nothing and returns directly; 3. System goes into S3 or S0ix state. Some time later, system is waken up by keyboard or mouse; 4. stmmac_resume: It does nothing because network interface has been disabled; 5. stmmac_open: It is called to enable network interace again. DMA engine is initialized in this API, but serdes is not power on so there will be DMA engine reset timeout issue.
Similarly, serdes powerdown should be added in stmmac_release. Network interface might be disabled by cmd "ifconfig eth0 down", DMA engine, phy and mac have been disabled in ndo_stop callback, serdes should be powered down as well. It doesn't make sense that serdes is on while other components have been turned off.
If ethernet interface is in enabled state(netif_running(ndev) is true) before suspend/resume, the issue couldn't be reproduced because serdes could be powered up in stmmac_resume.
Because serdes_powerup is added in stmmac_open, it doesn't need to be called in probe function.
Fixes: b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence") Signed-off-by: Junxiao Chang junxiao.chang@intel.com Reviewed-by: Voon Weifeng weifeng.voon@intel.com Tested-by: Jimmy JS Chen jimmyjs.chen@adlinktech.com Tested-by: Looi, Hong Aun hong.aun.looi@intel.com Link: https://lore.kernel.org/r/20220923050448.1220250-1-junxiao.chang@intel.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/stmicro/stmmac/stmmac_main.c | 23 +++++++++++-------- 1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 78f11dabca05..8d9272f01e31 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -3704,6 +3704,15 @@ static int stmmac_open(struct net_device *dev) goto init_error; }
+ if (priv->plat->serdes_powerup) { + ret = priv->plat->serdes_powerup(dev, priv->plat->bsp_priv); + if (ret < 0) { + netdev_err(priv->dev, "%s: Serdes powerup failed\n", + __func__); + goto init_error; + } + } + ret = stmmac_hw_setup(dev, true); if (ret < 0) { netdev_err(priv->dev, "%s: Hw setup failed\n", __func__); @@ -3793,6 +3802,10 @@ static int stmmac_release(struct net_device *dev) /* Disable the MAC Rx/Tx */ stmmac_mac_set(priv, priv->ioaddr, false);
+ /* Powerdown Serdes if there is */ + if (priv->plat->serdes_powerdown) + priv->plat->serdes_powerdown(dev, priv->plat->bsp_priv); + netif_carrier_off(dev);
stmmac_release_ptp(priv); @@ -7158,14 +7171,6 @@ int stmmac_dvr_probe(struct device *device, goto error_netdev_register; }
- if (priv->plat->serdes_powerup) { - ret = priv->plat->serdes_powerup(ndev, - priv->plat->bsp_priv); - - if (ret < 0) - goto error_serdes_powerup; - } - #ifdef CONFIG_DEBUG_FS stmmac_init_fs(ndev); #endif @@ -7180,8 +7185,6 @@ int stmmac_dvr_probe(struct device *device,
return ret;
-error_serdes_powerup: - unregister_netdev(ndev); error_netdev_register: phylink_destroy(priv->phylink); error_xpcs_setup:
From: Lukas Wunner lukas@wunner.de
[ Upstream commit ea64cdfad124922c931633e39287c5a31a9b14a1 ]
Commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") introduced a WARN() on resume from system sleep if a PHY is not in PHY_HALTED state.
Commit 6dbe852c379f ("net: phy: Don't WARN for PHY_READY state in mdio_bus_phy_resume()") added an exemption for PHY_READY state from the WARN().
It turns out PHY_UP state needs to be exempted as well because the following may happen on suspend:
mdio_bus_phy_suspend() phy_stop_machine() phydev->state = PHY_UP # if (phydev->state >= PHY_UP)
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") Reported-by: Marek Szyprowski m.szyprowski@samsung.com Tested-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/netdev/2b1a1588-505e-dff3-301d-bfc1fb14d685@samsung.... Signed-off-by: Lukas Wunner lukas@wunner.de Acked-by: Florian Fainelli f.fainelli@gmail.com Cc: Xiaolei Wang xiaolei.wang@windriver.com Link: https://lore.kernel.org/r/8128fdb51eeebc9efbf3776a4097363a1317aaf1.166390557... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/phy_device.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index f90a21781d8d..adc9d97cbb88 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -316,11 +316,13 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
phydev->suspended_by_mdio_bus = 0;
- /* If we manged to get here with the PHY state machine in a state neither - * PHY_HALTED nor PHY_READY this is an indication that something went wrong - * and we should most likely be using MAC managed PM and we are not. + /* If we managed to get here with the PHY state machine in a state + * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication + * that something went wrong and we should most likely be using + * MAC managed PM, but we are not. */ - WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY); + WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY && + phydev->state != PHY_UP);
ret = phy_init_hw(phydev); if (ret < 0)
From: Wang Yufen wangyufen@huawei.com
[ Upstream commit bc7a319844891746135dc1f34ab9df78d636a3ac ]
The socket 2 bind the addr in use, bind should fail with EADDRINUSE. So if bind success or errno != EADDRINUSE, testcase should be failed.
Fixes: 3ca8e4029969 ("soreuseport: BPF selection functional test") Signed-off-by: Wang Yufen wangyufen@huawei.com Link: https://lore.kernel.org/r/1663916557-10730-1-git-send-email-wangyufen@huawei... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/reuseport_bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/reuseport_bpf.c b/tools/testing/selftests/net/reuseport_bpf.c index 072d709c96b4..65aea27d761c 100644 --- a/tools/testing/selftests/net/reuseport_bpf.c +++ b/tools/testing/selftests/net/reuseport_bpf.c @@ -328,7 +328,7 @@ static void test_extra_filter(const struct test_params p) if (bind(fd1, addr, sockaddr_size())) error(1, errno, "failed to bind recv socket 1");
- if (!bind(fd2, addr, sockaddr_size()) && errno != EADDRINUSE) + if (!bind(fd2, addr, sockaddr_size()) || errno != EADDRINUSE) error(1, errno, "bind socket 2 should fail with EADDRINUSE");
free(addr);
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit 29322791bc8b4f42fc65734840826e3ddc30921e ]
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy" way - descriptors are not cleaned immediately after send. We rather hold on with cleaning until we see that free space in ring drops below particular threshold. This was supposed to reduce the amount of unnecessary work related to cleaning and instead of keeping the ring empty, ring was rather saturated.
In AF_XDP realm cleaning Tx descriptors implies producing them to CQ. This is a way of letting know user space that particular descriptor has been sent, as John points out in [0].
We tried to implement serial descriptor cleaning which would be used in conjunction with batched cleaning but it made code base more convoluted and probably harder to maintain in future. Therefore we step away from batched cleaning in a current form in favor of an approach where we set RS bit on every last descriptor from a batch and clean always at the beginning of ice_xmit_zc().
This means that we give up a bit of Tx performance, but this doesn't hurt l2fwd scenario which is way more meaningful than txonly as this can be treaten as AF_XDP based packet generator. l2fwd is not hurt due to the fact that Tx side is much faster than Rx and Rx is the one that has to catch Tx up.
FWIW Tx descriptors are still produced in a batched way.
[0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/
Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Tested-by: George Kuruvinakunnel george.kuruvinakunnel@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 143 +++++++++------------- drivers/net/ethernet/intel/ice/ice_xsk.h | 7 +- 3 files changed, 64 insertions(+), 88 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 97453d1dfafe..dd2285d4bef4 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -1467,7 +1467,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget) bool wd;
if (tx_ring->xsk_pool) - wd = ice_xmit_zc(tx_ring, ICE_DESC_UNUSED(tx_ring), budget); + wd = ice_xmit_zc(tx_ring); else if (ice_ring_is_xdp(tx_ring)) wd = true; else diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 03ce85f6e6df..8833b66b4e54 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -788,69 +788,57 @@ ice_clean_xdp_tx_buf(struct ice_tx_ring *xdp_ring, struct ice_tx_buf *tx_buf) }
/** - * ice_clean_xdp_irq_zc - Reclaim resources after transmit completes on XDP ring - * @xdp_ring: XDP ring to clean - * @napi_budget: amount of descriptors that NAPI allows us to clean - * - * Returns count of cleaned descriptors + * ice_clean_xdp_irq_zc - produce AF_XDP descriptors to CQ + * @xdp_ring: XDP Tx ring */ -static u16 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring, int napi_budget) +static void ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring) { - u16 tx_thresh = ICE_RING_QUARTER(xdp_ring); - int budget = napi_budget / tx_thresh; - u16 next_dd = xdp_ring->next_dd; - u16 ntc, cleared_dds = 0; - - do { - struct ice_tx_desc *next_dd_desc; - u16 desc_cnt = xdp_ring->count; - struct ice_tx_buf *tx_buf; - u32 xsk_frames; - u16 i; - - next_dd_desc = ICE_TX_DESC(xdp_ring, next_dd); - if (!(next_dd_desc->cmd_type_offset_bsz & - cpu_to_le64(ICE_TX_DESC_DTYPE_DESC_DONE))) - break; + u16 ntc = xdp_ring->next_to_clean; + struct ice_tx_desc *tx_desc; + u16 cnt = xdp_ring->count; + struct ice_tx_buf *tx_buf; + u16 xsk_frames = 0; + u16 last_rs; + int i;
- cleared_dds++; - xsk_frames = 0; - if (likely(!xdp_ring->xdp_tx_active)) { - xsk_frames = tx_thresh; - goto skip; - } + last_rs = xdp_ring->next_to_use ? xdp_ring->next_to_use - 1 : cnt - 1; + tx_desc = ICE_TX_DESC(xdp_ring, last_rs); + if ((tx_desc->cmd_type_offset_bsz & + cpu_to_le64(ICE_TX_DESC_DTYPE_DESC_DONE))) { + if (last_rs >= ntc) + xsk_frames = last_rs - ntc + 1; + else + xsk_frames = last_rs + cnt - ntc + 1; + }
- ntc = xdp_ring->next_to_clean; + if (!xsk_frames) + return;
- for (i = 0; i < tx_thresh; i++) { - tx_buf = &xdp_ring->tx_buf[ntc]; + if (likely(!xdp_ring->xdp_tx_active)) + goto skip;
- if (tx_buf->raw_buf) { - ice_clean_xdp_tx_buf(xdp_ring, tx_buf); - tx_buf->raw_buf = NULL; - } else { - xsk_frames++; - } + ntc = xdp_ring->next_to_clean; + for (i = 0; i < xsk_frames; i++) { + tx_buf = &xdp_ring->tx_buf[ntc];
- ntc++; - if (ntc >= xdp_ring->count) - ntc = 0; + if (tx_buf->raw_buf) { + ice_clean_xdp_tx_buf(xdp_ring, tx_buf); + tx_buf->raw_buf = NULL; + } else { + xsk_frames++; } + + ntc++; + if (ntc >= xdp_ring->count) + ntc = 0; + } skip: - xdp_ring->next_to_clean += tx_thresh; - if (xdp_ring->next_to_clean >= desc_cnt) - xdp_ring->next_to_clean -= desc_cnt; - if (xsk_frames) - xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames); - next_dd_desc->cmd_type_offset_bsz = 0; - next_dd = next_dd + tx_thresh; - if (next_dd >= desc_cnt) - next_dd = tx_thresh - 1; - } while (--budget); - - xdp_ring->next_dd = next_dd; - - return cleared_dds * tx_thresh; + tx_desc->cmd_type_offset_bsz = 0; + xdp_ring->next_to_clean += xsk_frames; + if (xdp_ring->next_to_clean >= cnt) + xdp_ring->next_to_clean -= cnt; + if (xsk_frames) + xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames); }
/** @@ -885,7 +873,6 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc, static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs, unsigned int *total_bytes) { - u16 tx_thresh = ICE_RING_QUARTER(xdp_ring); u16 ntu = xdp_ring->next_to_use; struct ice_tx_desc *tx_desc; u32 i; @@ -905,13 +892,6 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de }
xdp_ring->next_to_use = ntu; - - if (xdp_ring->next_to_use > xdp_ring->next_rs) { - tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_rs); - tx_desc->cmd_type_offset_bsz |= - cpu_to_le64(ICE_TX_DESC_CMD_RS << ICE_TXD_QW1_CMD_S); - xdp_ring->next_rs += tx_thresh; - } }
/** @@ -924,7 +904,6 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs, u32 nb_pkts, unsigned int *total_bytes) { - u16 tx_thresh = ICE_RING_QUARTER(xdp_ring); u32 batched, leftover, i;
batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH); @@ -933,54 +912,54 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes); for (; i < batched + leftover; i++) ice_xmit_pkt(xdp_ring, &descs[i], total_bytes); +}
- if (xdp_ring->next_to_use > xdp_ring->next_rs) { - struct ice_tx_desc *tx_desc; +/** + * ice_set_rs_bit - set RS bit on last produced descriptor (one behind current NTU) + * @xdp_ring: XDP ring to produce the HW Tx descriptors on + */ +static void ice_set_rs_bit(struct ice_tx_ring *xdp_ring) +{ + u16 ntu = xdp_ring->next_to_use ? xdp_ring->next_to_use - 1 : xdp_ring->count - 1; + struct ice_tx_desc *tx_desc;
- tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_rs); - tx_desc->cmd_type_offset_bsz |= - cpu_to_le64(ICE_TX_DESC_CMD_RS << ICE_TXD_QW1_CMD_S); - xdp_ring->next_rs += tx_thresh; - } + tx_desc = ICE_TX_DESC(xdp_ring, ntu); + tx_desc->cmd_type_offset_bsz |= + cpu_to_le64(ICE_TX_DESC_CMD_RS << ICE_TXD_QW1_CMD_S); }
/** * ice_xmit_zc - take entries from XSK Tx ring and place them onto HW Tx ring * @xdp_ring: XDP ring to produce the HW Tx descriptors on - * @budget: number of free descriptors on HW Tx ring that can be used - * @napi_budget: amount of descriptors that NAPI allows us to clean * * Returns true if there is no more work that needs to be done, false otherwise */ -bool ice_xmit_zc(struct ice_tx_ring *xdp_ring, u32 budget, int napi_budget) +bool ice_xmit_zc(struct ice_tx_ring *xdp_ring) { struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs; - u16 tx_thresh = ICE_RING_QUARTER(xdp_ring); u32 nb_pkts, nb_processed = 0; unsigned int total_bytes = 0; + int budget; + + ice_clean_xdp_irq_zc(xdp_ring);
- if (budget < tx_thresh) - budget += ice_clean_xdp_irq_zc(xdp_ring, napi_budget); + budget = ICE_DESC_UNUSED(xdp_ring); + budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget); if (!nb_pkts) return true;
if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { - struct ice_tx_desc *tx_desc; - nb_processed = xdp_ring->count - xdp_ring->next_to_use; ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes); - tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_rs); - tx_desc->cmd_type_offset_bsz |= - cpu_to_le64(ICE_TX_DESC_CMD_RS << ICE_TXD_QW1_CMD_S); - xdp_ring->next_rs = tx_thresh - 1; xdp_ring->next_to_use = 0; }
ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed, &total_bytes);
+ ice_set_rs_bit(xdp_ring); ice_xdp_ring_update_tail(xdp_ring); ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h index 4edbe81eb646..6fa181f080ef 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.h +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h @@ -26,13 +26,10 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count); bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi); void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring); void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring); -bool ice_xmit_zc(struct ice_tx_ring *xdp_ring, u32 budget, int napi_budget); +bool ice_xmit_zc(struct ice_tx_ring *xdp_ring); int ice_realloc_zc_buf(struct ice_vsi *vsi, bool zc); #else -static inline bool -ice_xmit_zc(struct ice_tx_ring __always_unused *xdp_ring, - u32 __always_unused budget, - int __always_unused napi_budget) +static inline bool ice_xmit_zc(struct ice_tx_ring __always_unused *xdp_ring) { return false; }
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit b3056ae2b57858b02b376b3fed6077040caf14b4 ]
We had multiple customers in the past months that reported commit 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") makes them unable to use ring size of 8160 in conjunction with AF_XDP. Remove this restriction.
Fixes: 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") CC: Alasdair McWilliam alasdair.mcwilliam@outlook.com Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Tested-by: George Kuruvinakunnel george.kuruvinakunnel@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_xsk.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 8833b66b4e54..056c904b83cc 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -392,13 +392,6 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid) goto failure; }
- if (!is_power_of_2(vsi->rx_rings[qid]->count) || - !is_power_of_2(vsi->tx_rings[qid]->count)) { - netdev_err(vsi->netdev, "Please align ring sizes to power of 2\n"); - pool_failure = -EINVAL; - goto failure; - } - if_running = netif_running(vsi->netdev) && ice_is_xdp_ena_vsi(vsi);
if (if_running) { @@ -534,11 +527,10 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count) bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count) { u16 rx_thresh = ICE_RING_QUARTER(rx_ring); - u16 batched, leftover, i, tail_bumps; + u16 leftover, i, tail_bumps;
- batched = ALIGN_DOWN(count, rx_thresh); - tail_bumps = batched / rx_thresh; - leftover = count & (rx_thresh - 1); + tail_bumps = count / rx_thresh; + leftover = count - (tail_bumps * rx_thresh);
for (i = 0; i < tail_bumps; i++) if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh)) @@ -1037,14 +1029,16 @@ bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi) */ void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring) { - u16 count_mask = rx_ring->count - 1; u16 ntc = rx_ring->next_to_clean; u16 ntu = rx_ring->next_to_use;
- for ( ; ntc != ntu; ntc = (ntc + 1) & count_mask) { + while (ntc != ntu) { struct xdp_buff *xdp = *ice_xdp_buf(rx_ring, ntc);
xsk_buff_free(xdp); + ntc++; + if (ntc >= rx_ring->count) + ntc = 0; } }
From: Angus Chen angus.chen@jaguarmicro.com
[ Upstream commit db5db1a00d0816207be3a0166fcb4f523eaf3b52 ]
The q_pair_id to address a queue pair in the lm bar should be calculated by queue_id / 2 rather than queue_id / nr_vring.
Fixes: 2ddae773c93b ("vDPA/ifcvf: detect and use the onboard number of queues directly") Signed-off-by: Angus Chen angus.chen@jaguarmicro.com Reviewed-by: Jason Wang jasowang@redhat.com Reviewed-by: Michael S. Tsirkin mst@redhat.com Acked-by: Zhu Lingshan lingshan.zhu@intel.com Message-Id: 20220923091013.191-1-angus.chen@jaguarmicro.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/ifcvf/ifcvf_base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/ifcvf/ifcvf_base.c b/drivers/vdpa/ifcvf/ifcvf_base.c index 48c4dadb0c7c..a4c1b985f79a 100644 --- a/drivers/vdpa/ifcvf/ifcvf_base.c +++ b/drivers/vdpa/ifcvf/ifcvf_base.c @@ -315,7 +315,7 @@ u16 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid) u32 q_pair_id;
ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg; - q_pair_id = qid / hw->nr_vring; + q_pair_id = qid / 2; avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2]; last_avail_idx = vp_ioread16(avail_idx_addr);
@@ -329,7 +329,7 @@ int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u16 num) u32 q_pair_id;
ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg; - q_pair_id = qid / hw->nr_vring; + q_pair_id = qid / 2; avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2]; hw->vring[qid].last_avail_idx = num; vp_iowrite16(num, avail_idx_addr);
From: Suwan Kim suwan.kim027@gmail.com
[ Upstream commit 37fafe6b61e4f15d977982635bb785f4e605f7cd ]
If a request fails at virtio_queue_rqs(), it is inserted to requeue_list and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs() despite the failure.
[ 1.890468] ------------[ cut here ]------------ [ 1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143 blk_mq_start_request+0x8a/0xe0 [ 1.891045] Modules linked in: [ 1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44 [ 1.891504] Hardware name: ChromiumOS crosvm, BIOS 0 [ 1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0 [ 1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53 20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b 41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b 01 0f [ 1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202 [ 1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000 [ 1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000 [ 1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100 [ 1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000 [ 1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00 [ 1.894060] FS: 00007ffa143a4640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 1.894412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0 [ 1.894953] Call Trace: [ 1.895139] <TASK> [ 1.895303] virtblk_prep_rq+0x1e5/0x280 [ 1.895509] virtio_queue_rq+0x5c/0x310 [ 1.895710] ? virtqueue_add_sgs+0x95/0xb0 [ 1.895905] ? _raw_spin_unlock_irqrestore+0x16/0x30 [ 1.896133] ? virtio_queue_rqs+0x340/0x390 [ 1.896453] ? sbitmap_get+0xfa/0x220 [ 1.896678] __blk_mq_issue_directly+0x41/0x180 [ 1.896906] blk_mq_plug_issue_direct+0xd8/0x2c0 [ 1.897115] blk_mq_flush_plug_list+0x115/0x180 [ 1.897342] blk_add_rq_to_plug+0x51/0x130 [ 1.897543] blk_mq_submit_bio+0x3a1/0x570 [ 1.897750] submit_bio_noacct_nocheck+0x418/0x520 [ 1.897985] ? submit_bio_noacct+0x1e/0x260 [ 1.897989] ext4_bio_write_page+0x222/0x420 [ 1.898000] mpage_process_page_bufs+0x178/0x1c0 [ 1.899451] mpage_prepare_extent_to_map+0x2d2/0x440 [ 1.899603] ext4_writepages+0x495/0x1020 [ 1.899733] do_writepages+0xcb/0x220 [ 1.899871] ? __seccomp_filter+0x171/0x7e0 [ 1.900006] file_write_and_wait_range+0xcd/0xf0 [ 1.900167] ext4_sync_file+0x72/0x320 [ 1.900308] __x64_sys_fsync+0x66/0xa0 [ 1.900449] do_syscall_64+0x31/0x50 [ 1.900595] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 1.900747] RIP: 0033:0x7ffa16ec96ea [ 1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b 44 24 [ 1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea [ 1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016 [ 1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640 [ 1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0 [ 1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000 [ 1.902763] </TASK> [ 1.902877] ---[ end trace 0000000000000000 ]---
To avoid calling blk_mq_start_request() twice, This patch moves the execution of blk_mq_start_request() to the end of virtblk_prep_rq(). And instead of requeuing failed request to plug list in the error path of virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed request state to MQ_RQ_IDLE. Then virtblk can safely handle the request on the next trial.
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Reported-by: Alexandre Courbot acourbot@chromium.org Tested-by: Alexandre Courbot acourbot@chromium.org Signed-off-by: Suwan Kim suwan.kim027@gmail.com Message-Id: 20220830150153.12627-1-suwan.kim027@gmail.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Stefan Hajnoczi stefanha@redhat.com Reviewed-by: Pankaj Raghav p.raghav@samsung.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/virtio_blk.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 59d6d5faf739..dcd639e58ff0 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -322,14 +322,14 @@ static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx, if (unlikely(status)) return status;
- blk_mq_start_request(req); - vbr->sg_table.nents = virtblk_map_data(hctx, req, vbr); if (unlikely(vbr->sg_table.nents < 0)) { virtblk_cleanup_cmd(req); return BLK_STS_RESOURCE; }
+ blk_mq_start_request(req); + return BLK_STS_OK; }
@@ -391,8 +391,7 @@ static bool virtblk_prep_rq_batch(struct request *req) }
static bool virtblk_add_req_batch(struct virtio_blk_vq *vq, - struct request **rqlist, - struct request **requeue_list) + struct request **rqlist) { unsigned long flags; int err; @@ -408,7 +407,7 @@ static bool virtblk_add_req_batch(struct virtio_blk_vq *vq, if (err) { virtblk_unmap_data(req, vbr); virtblk_cleanup_cmd(req); - rq_list_add(requeue_list, req); + blk_mq_requeue_request(req, true); } }
@@ -436,7 +435,7 @@ static void virtio_queue_rqs(struct request **rqlist)
if (!next || req->mq_hctx != next->mq_hctx) { req->rq_next = NULL; - kick = virtblk_add_req_batch(vq, rqlist, &requeue_list); + kick = virtblk_add_req_batch(vq, rqlist); if (kick) virtqueue_notify(vq->vq);
From: Eli Cohen elic@nvidia.com
[ Upstream commit a43ae8057cc154fd26a3a23c0e8643bef104d995 ]
RQT objects require that a power of two value be configured for both rqt_max_size and rqt_actual size.
For create_rqt, make sure to round up to the power of two the value of given by the user who created the vdpa device and given by ndev->rqt_size. The actual size is also rounded up to the power of two using the current number of VQs given by ndev->cur_num_vqs.
Same goes with modify_rqt where we need to make sure act size is power of two based on the new number of QPs.
Without this patch, attempt to create a device with non power of two QPs would result in error from firmware.
Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Signed-off-by: Eli Cohen elic@nvidia.com Message-Id: 20220912125019.833708-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index e85c1d71f4ed..f527cbeb1169 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1297,6 +1297,8 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, struct mlx5_vdpa_virtqueue *
static int create_rqt(struct mlx5_vdpa_net *ndev) { + int rqt_table_size = roundup_pow_of_two(ndev->rqt_size); + int act_sz = roundup_pow_of_two(ndev->cur_num_vqs / 2); __be32 *list; void *rqtc; int inlen; @@ -1304,7 +1306,7 @@ static int create_rqt(struct mlx5_vdpa_net *ndev) int i, j; int err;
- inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + ndev->rqt_size * MLX5_ST_SZ_BYTES(rq_num); + inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + rqt_table_size * MLX5_ST_SZ_BYTES(rq_num); in = kzalloc(inlen, GFP_KERNEL); if (!in) return -ENOMEM; @@ -1313,12 +1315,12 @@ static int create_rqt(struct mlx5_vdpa_net *ndev) rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q); - MLX5_SET(rqtc, rqtc, rqt_max_size, ndev->rqt_size); + MLX5_SET(rqtc, rqtc, rqt_max_size, rqt_table_size); list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]); - for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2) + for (i = 0, j = 0; i < act_sz; i++, j += 2) list[i] = cpu_to_be32(ndev->vqs[j % ndev->cur_num_vqs].virtq_id);
- MLX5_SET(rqtc, rqtc, rqt_actual_size, ndev->rqt_size); + MLX5_SET(rqtc, rqtc, rqt_actual_size, act_sz); err = mlx5_vdpa_create_rqt(&ndev->mvdev, in, inlen, &ndev->res.rqtn); kfree(in); if (err) @@ -1331,6 +1333,7 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
static int modify_rqt(struct mlx5_vdpa_net *ndev, int num) { + int act_sz = roundup_pow_of_two(num / 2); __be32 *list; void *rqtc; int inlen; @@ -1338,7 +1341,7 @@ static int modify_rqt(struct mlx5_vdpa_net *ndev, int num) int i, j; int err;
- inlen = MLX5_ST_SZ_BYTES(modify_rqt_in) + ndev->rqt_size * MLX5_ST_SZ_BYTES(rq_num); + inlen = MLX5_ST_SZ_BYTES(modify_rqt_in) + act_sz * MLX5_ST_SZ_BYTES(rq_num); in = kzalloc(inlen, GFP_KERNEL); if (!in) return -ENOMEM; @@ -1349,10 +1352,10 @@ static int modify_rqt(struct mlx5_vdpa_net *ndev, int num) MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]); - for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2) + for (i = 0, j = 0; i < act_sz; i++, j = j + 2) list[i] = cpu_to_be32(ndev->vqs[j % num].virtq_id);
- MLX5_SET(rqtc, rqtc, rqt_actual_size, ndev->rqt_size); + MLX5_SET(rqtc, rqtc, rqt_actual_size, act_sz); err = mlx5_vdpa_modify_rqt(&ndev->mvdev, in, inlen, ndev->res.rqtn); kfree(in); if (err)
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 06bbaa6dc53cb72040db952053432541acb9adc7 ]
passing kmap_local_page() result to __kernel_write() is unsafe - random ->write_iter() might (and 9p one does) get unhappy when passed ITER_KVEC with pointer that came from kmap_local_page().
Fix by providing a variant of __kernel_write() that takes an iov_iter from caller (__kernel_write() becomes a trivial wrapper) and adding dump_emit_page() that parallels dump_emit(), except that instead of __kernel_write() it uses __kernel_write_iter() with ITER_BVEC source.
Fixes: 3159ed57792b "fs/coredump: use kmap_local_page()" Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/coredump.c | 38 +++++++++++++++++++++++++++++++++----- fs/internal.h | 3 +++ fs/read_write.c | 22 ++++++++++++++-------- 3 files changed, 50 insertions(+), 13 deletions(-)
diff --git a/fs/coredump.c b/fs/coredump.c index ebc43f960b64..f1355e52614a 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -832,6 +832,38 @@ static int __dump_skip(struct coredump_params *cprm, size_t nr) } }
+static int dump_emit_page(struct coredump_params *cprm, struct page *page) +{ + struct bio_vec bvec = { + .bv_page = page, + .bv_offset = 0, + .bv_len = PAGE_SIZE, + }; + struct iov_iter iter; + struct file *file = cprm->file; + loff_t pos = file->f_pos; + ssize_t n; + + if (cprm->to_skip) { + if (!__dump_skip(cprm, cprm->to_skip)) + return 0; + cprm->to_skip = 0; + } + if (cprm->written + PAGE_SIZE > cprm->limit) + return 0; + if (dump_interrupted()) + return 0; + iov_iter_bvec(&iter, WRITE, &bvec, 1, PAGE_SIZE); + n = __kernel_write_iter(cprm->file, &iter, &pos); + if (n != PAGE_SIZE) + return 0; + file->f_pos = pos; + cprm->written += PAGE_SIZE; + cprm->pos += PAGE_SIZE; + + return 1; +} + int dump_emit(struct coredump_params *cprm, const void *addr, int nr) { if (cprm->to_skip) { @@ -863,7 +895,6 @@ int dump_user_range(struct coredump_params *cprm, unsigned long start,
for (addr = start; addr < start + len; addr += PAGE_SIZE) { struct page *page; - int stop;
/* * To avoid having to allocate page tables for virtual address @@ -874,10 +905,7 @@ int dump_user_range(struct coredump_params *cprm, unsigned long start, */ page = get_dump_page(addr); if (page) { - void *kaddr = kmap_local_page(page); - - stop = !dump_emit(cprm, kaddr, PAGE_SIZE); - kunmap_local(kaddr); + int stop = !dump_emit_page(cprm, page); put_page(page); if (stop) return 0; diff --git a/fs/internal.h b/fs/internal.h index 87e96b9024ce..3e206d3e317c 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -16,6 +16,7 @@ struct shrink_control; struct fs_context; struct user_namespace; struct pipe_inode_info; +struct iov_iter;
/* * block/bdev.c @@ -221,3 +222,5 @@ ssize_t do_getxattr(struct user_namespace *mnt_userns, int setxattr_copy(const char __user *name, struct xattr_ctx *ctx); int do_setxattr(struct user_namespace *mnt_userns, struct dentry *dentry, struct xattr_ctx *ctx); + +ssize_t __kernel_write_iter(struct file *file, struct iov_iter *from, loff_t *pos); diff --git a/fs/read_write.c b/fs/read_write.c index 397da0236607..a0a3d35e2c0f 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -509,14 +509,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t }
/* caller is responsible for file_start_write/file_end_write */ -ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) +ssize_t __kernel_write_iter(struct file *file, struct iov_iter *from, loff_t *pos) { - struct kvec iov = { - .iov_base = (void *)buf, - .iov_len = min_t(size_t, count, MAX_RW_COUNT), - }; struct kiocb kiocb; - struct iov_iter iter; ssize_t ret;
if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE))) @@ -532,8 +527,7 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
init_sync_kiocb(&kiocb, file); kiocb.ki_pos = pos ? *pos : 0; - iov_iter_kvec(&iter, WRITE, &iov, 1, iov.iov_len); - ret = file->f_op->write_iter(&kiocb, &iter); + ret = file->f_op->write_iter(&kiocb, from); if (ret > 0) { if (pos) *pos = kiocb.ki_pos; @@ -543,6 +537,18 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t inc_syscw(current); return ret; } + +/* caller is responsible for file_start_write/file_end_write */ +ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) +{ + struct kvec iov = { + .iov_base = (void *)buf, + .iov_len = min_t(size_t, count, MAX_RW_COUNT), + }; + struct iov_iter iter; + iov_iter_kvec(&iter, WRITE, &iov, 1, iov.iov_len); + return __kernel_write_iter(file, &iter, pos); +} /* * This "EXPORT_SYMBOL_GPL()" is more of a "EXPORT_SYMBOL_DONTUSE()", * but autofs is one of the few internal kernel users that actually
Hi Greg,
On Mon, Oct 3, 2022 at 9:28 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 06bbaa6dc53cb72040db952053432541acb9adc7 ]
passing kmap_local_page() result to __kernel_write() is unsafe - random ->write_iter() might (and 9p one does) get unhappy when passed ITER_KVEC with pointer that came from kmap_local_page().
Fix by providing a variant of __kernel_write() that takes an iov_iter from caller (__kernel_write() becomes a trivial wrapper) and adding dump_emit_page() that parallels dump_emit(), except that instead of __kernel_write() it uses __kernel_write_iter() with ITER_BVEC source.
Fixes: 3159ed57792b "fs/coredump: use kmap_local_page()" Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org
This will need a follow-up patch, which I have just posted[1], to not break the build if CONFIG_ELF_CORE is not set.
[1] https://lore.kernel.org/20221003090657.2053236-1-geert@linux-m68k.org
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Mon, Oct 03, 2022 at 11:09:12AM +0200, Geert Uytterhoeven wrote:
Hi Greg,
On Mon, Oct 3, 2022 at 9:28 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 06bbaa6dc53cb72040db952053432541acb9adc7 ]
passing kmap_local_page() result to __kernel_write() is unsafe - random ->write_iter() might (and 9p one does) get unhappy when passed ITER_KVEC with pointer that came from kmap_local_page().
Fix by providing a variant of __kernel_write() that takes an iov_iter from caller (__kernel_write() becomes a trivial wrapper) and adding dump_emit_page() that parallels dump_emit(), except that instead of __kernel_write() it uses __kernel_write_iter() with ITER_BVEC source.
Fixes: 3159ed57792b "fs/coredump: use kmap_local_page()" Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org
This will need a follow-up patch, which I have just posted[1], to not break the build if CONFIG_ELF_CORE is not set.
[1] https://lore.kernel.org/20221003090657.2053236-1-geert@linux-m68k.org
Thanks, now dropped from 5.19 and 5.15 queues. When this gets merged, can you ping stable@kernel.org to add them both back?
thanks,
greg k-h
From: Han Xu han.xu@nxp.com
[ Upstream commit b1ff1bfe81e763420afd5f3f25f0b3cbfd97055c ]
There is no dedicate parent clock for QSPI so SET_RATE_PARENT flag should not be used. For instance, the default parent clock for QSPI is pll2_bus, which is also the parent clock for quite a few modules, such as MMDC, once GPMI NAND set clock rate for EDO5 mode can cause system hang due to pll2_bus rate changed.
Fixes: f1541e15e38e ("clk: imx6sx: Switch to clk_hw based API") Signed-off-by: Han Xu han.xu@nxp.com Link: https://lore.kernel.org/r/20220915150959.3646702-1-han.xu@nxp.com Tested-by: Fabio Estevam festevam@denx.de Reviewed-by: Abel Vesa abel.vesa@linaro.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/imx/clk-imx6sx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/imx/clk-imx6sx.c b/drivers/clk/imx/clk-imx6sx.c index fc1bd23d4583..598f3cf4eba4 100644 --- a/drivers/clk/imx/clk-imx6sx.c +++ b/drivers/clk/imx/clk-imx6sx.c @@ -280,13 +280,13 @@ static void __init imx6sx_clocks_init(struct device_node *ccm_node) hws[IMX6SX_CLK_SSI3_SEL] = imx_clk_hw_mux("ssi3_sel", base + 0x1c, 14, 2, ssi_sels, ARRAY_SIZE(ssi_sels)); hws[IMX6SX_CLK_SSI2_SEL] = imx_clk_hw_mux("ssi2_sel", base + 0x1c, 12, 2, ssi_sels, ARRAY_SIZE(ssi_sels)); hws[IMX6SX_CLK_SSI1_SEL] = imx_clk_hw_mux("ssi1_sel", base + 0x1c, 10, 2, ssi_sels, ARRAY_SIZE(ssi_sels)); - hws[IMX6SX_CLK_QSPI1_SEL] = imx_clk_hw_mux_flags("qspi1_sel", base + 0x1c, 7, 3, qspi1_sels, ARRAY_SIZE(qspi1_sels), CLK_SET_RATE_PARENT); + hws[IMX6SX_CLK_QSPI1_SEL] = imx_clk_hw_mux("qspi1_sel", base + 0x1c, 7, 3, qspi1_sels, ARRAY_SIZE(qspi1_sels)); hws[IMX6SX_CLK_PERCLK_SEL] = imx_clk_hw_mux("perclk_sel", base + 0x1c, 6, 1, perclk_sels, ARRAY_SIZE(perclk_sels)); hws[IMX6SX_CLK_VID_SEL] = imx_clk_hw_mux("vid_sel", base + 0x20, 21, 3, vid_sels, ARRAY_SIZE(vid_sels)); hws[IMX6SX_CLK_ESAI_SEL] = imx_clk_hw_mux("esai_sel", base + 0x20, 19, 2, audio_sels, ARRAY_SIZE(audio_sels)); hws[IMX6SX_CLK_CAN_SEL] = imx_clk_hw_mux("can_sel", base + 0x20, 8, 2, can_sels, ARRAY_SIZE(can_sels)); hws[IMX6SX_CLK_UART_SEL] = imx_clk_hw_mux("uart_sel", base + 0x24, 6, 1, uart_sels, ARRAY_SIZE(uart_sels)); - hws[IMX6SX_CLK_QSPI2_SEL] = imx_clk_hw_mux_flags("qspi2_sel", base + 0x2c, 15, 3, qspi2_sels, ARRAY_SIZE(qspi2_sels), CLK_SET_RATE_PARENT); + hws[IMX6SX_CLK_QSPI2_SEL] = imx_clk_hw_mux("qspi2_sel", base + 0x2c, 15, 3, qspi2_sels, ARRAY_SIZE(qspi2_sels)); hws[IMX6SX_CLK_SPDIF_SEL] = imx_clk_hw_mux("spdif_sel", base + 0x30, 20, 2, audio_sels, ARRAY_SIZE(audio_sels)); hws[IMX6SX_CLK_AUDIO_SEL] = imx_clk_hw_mux("audio_sel", base + 0x30, 7, 2, audio_sels, ARRAY_SIZE(audio_sels)); hws[IMX6SX_CLK_ENET_PRE_SEL] = imx_clk_hw_mux("enet_pre_sel", base + 0x34, 15, 3, enet_pre_sels, ARRAY_SIZE(enet_pre_sels));
From: Ashutosh Dixit ashutosh.dixit@intel.com
[ Upstream commit 7738be973fc4e2ba22154fafd3a5d7b9666f9abf ]
Register GT0_PERF_LIMIT_REASONS (0x1381a8) is available only for Gen11+. Therefore ensure perf_limit_reasons sysfs files are created only for Gen11+. Otherwise on Gen < 5 accessing these files results in the following oops:
<1> [88.829420] BUG: unable to handle page fault for address: ffffc90000bb81a8 <1> [88.829438] #PF: supervisor read access in kernel mode <1> [88.829447] #PF: error_code(0x0000) - not-present page
This patch is a backport of the drm-tip commit 0d2d201095e9 ("drm/i915: Perf_limit_reasons are only available for Gen11+") to drm-intel-fixes. The backport is not identical to the original, it only includes the sysfs portions of if. The debugfs portion is not available in drm-intel-fixes so has not been backported.
Bspec: 20008 Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6863 Fixes: fa68bff7cf27 ("drm/i915/gt: Add sysfs throttle frequency interfaces") Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220919162401.2077713-1-ashut... (backported from commit 0d2d201095e9f141d6a9fb44320afce761f8b5c2) Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c index f76b6cf8040e..b8cb58e2819a 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c @@ -544,8 +544,7 @@ static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_ratl, RATL_MASK); static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_vr_thermalert, VR_THERMALERT_MASK); static INTEL_GT_RPS_BOOL_ATTR_RO(throttle_reason_vr_tdc, VR_TDC_MASK);
-static const struct attribute *freq_attrs[] = { - &dev_attr_punit_req_freq_mhz.attr, +static const struct attribute *throttle_reason_attrs[] = { &attr_throttle_reason_status.attr, &attr_throttle_reason_pl1.attr, &attr_throttle_reason_pl2.attr, @@ -594,9 +593,17 @@ void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj) if (!is_object_gt(kobj)) return;
- ret = sysfs_create_files(kobj, freq_attrs); + ret = sysfs_create_file(kobj, &dev_attr_punit_req_freq_mhz.attr); if (ret) drm_warn(>->i915->drm, - "failed to create gt%u throttle sysfs files (%pe)", + "failed to create gt%u punit_req_freq_mhz sysfs (%pe)", gt->info.id, ERR_PTR(ret)); + + if (GRAPHICS_VER(gt->i915) >= 11) { + ret = sysfs_create_files(kobj, throttle_reason_attrs); + if (ret) + drm_warn(>->i915->drm, + "failed to create gt%u throttle sysfs files (%pe)", + gt->info.id, ERR_PTR(ret)); + } }
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 1b24a132eba7a1c19475ba2510ec1c00af3ff914 ]
After commit 31fd9b79dc58 ("ARM: dts: BCM5301X: update CRU block description") a warning from clk-iproc-pll.c was generated due to a duplicate PLL name as well as the console stopped working. Upon closer inspection it became clear that iproc_pll_clk_setup() used the Device Tree node unit name as an unique identifier as well as a parent name to parent all clocks under the PLL.
BCM5301X was the first platform on which that got noticed because of the DT node unit name renaming but the same assumptions hold true for any user of the iproc_pll_clk_setup() function.
The first 'clock-output-names' property is always guaranteed to be unique as well as providing the actual desired PLL clock name, so we utilize that to register the PLL and as a parent name of all children clock.
Fixes: 5fe225c105fd ("clk: iproc: add initial common clock support") Signed-off-by: Florian Fainelli f.fainelli@gmail.com Acked-by: Rafał Miłecki rafal@milecki.pl Link: https://lore.kernel.org/r/20220905161504.1526-1-f.fainelli@gmail.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/bcm/clk-iproc-pll.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/bcm/clk-iproc-pll.c b/drivers/clk/bcm/clk-iproc-pll.c index 33da30f99c79..d39c44b61c52 100644 --- a/drivers/clk/bcm/clk-iproc-pll.c +++ b/drivers/clk/bcm/clk-iproc-pll.c @@ -736,6 +736,7 @@ void iproc_pll_clk_setup(struct device_node *node, const char *parent_name; struct iproc_clk *iclk_array; struct clk_hw_onecell_data *clk_data; + const char *clk_name;
if (WARN_ON(!pll_ctrl) || WARN_ON(!clk_ctrl)) return; @@ -783,7 +784,12 @@ void iproc_pll_clk_setup(struct device_node *node, iclk = &iclk_array[0]; iclk->pll = pll;
- init.name = node->name; + ret = of_property_read_string_index(node, "clock-output-names", + 0, &clk_name); + if (WARN_ON(ret)) + goto err_pll_register; + + init.name = clk_name; init.ops = &iproc_pll_ops; init.flags = 0; parent_name = of_clk_get_parent_name(node, 0); @@ -803,13 +809,11 @@ void iproc_pll_clk_setup(struct device_node *node, goto err_pll_register;
clk_data->hws[0] = &iclk->hw; + parent_name = clk_name;
/* now initialize and register all leaf clocks */ for (i = 1; i < num_clks; i++) { - const char *clk_name; - memset(&init, 0, sizeof(init)); - parent_name = node->name;
ret = of_property_read_string_index(node, "clock-output-names", i, &clk_name);
From: Peng Fan peng.fan@nxp.com
[ Upstream commit daaa2fbe678efdaced53d1c635f4d326751addf8 ]
There is build warning when CONFIG_OF is not selected.
drivers/clk/imx/clk-imx93.c:324:34: warning: 'imx93_clk_of_match' defined but not used [-Wunused-const-variable=]
324 | static const struct of_device_id imx93_clk_of_match[] = { | ^~~~~~~~~~~~~~~~~~
The driver only support DT table, no sense to use of_match_ptr.
Fixes: 24defbe194b6 ("clk: imx: add i.MX93 clk") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Peng Fan peng.fan@nxp.com Link: https://lore.kernel.org/r/20220830033137.4149542-3-peng.fan@oss.nxp.com Reviewed-by: Abel Vesa abel.vesa@linaro.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/imx/clk-imx93.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/imx/clk-imx93.c b/drivers/clk/imx/clk-imx93.c index f5c9fa40491c..dcc41d178238 100644 --- a/drivers/clk/imx/clk-imx93.c +++ b/drivers/clk/imx/clk-imx93.c @@ -332,7 +332,7 @@ static struct platform_driver imx93_clk_driver = { .driver = { .name = "imx93-ccm", .suppress_bind_attrs = true, - .of_match_table = of_match_ptr(imx93_clk_of_match), + .of_match_table = imx93_clk_of_match, }, }; module_platform_driver(imx93_clk_driver);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 276d37eb449133bc22872b8f0a6f878e120deeff ]
Currently the following set of commands fails:
$ ip link add br0 type bridge # vlan_filtering 0 $ ip link set swp0 master br0 $ bridge vlan port vlan-id swp0 1 PVID Egress Untagged $ bridge vlan add dev swp0 vid 10 Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.
Dumping ocelot->vlans, one can see that the 2 egress-untagged VLANs on swp0 are vid 1 (the bridge PVID) and vid 4094, a PVID used privately by the driver for VLAN-unaware bridging. So this is why bridge vid 10 is refused, despite 'bridge vlan' showing a single egress untagged VLAN.
As mentioned in the comment added, having this private VLAN does not impose restrictions to the hardware configuration, yet it is a bookkeeping problem.
There are 2 possible solutions.
One is to make the functions that operate on VLAN-unaware pvids: - ocelot_add_vlan_unaware_pvid() - ocelot_del_vlan_unaware_pvid() - ocelot_port_setup_dsa_8021q_cpu() - ocelot_port_teardown_dsa_8021q_cpu() call something different than ocelot_vlan_member_(add|del)(), the latter being the real problem, because it allocates a struct ocelot_bridge_vlan *vlan which it adds to ocelot->vlans. We don't really *need* the private VLANs in ocelot->vlans, it's just that we have the extra convenience of having the vlan->portmask cached in software (whereas without these structures, we'd have to create a raw ocelot_vlant_rmw_mask() procedure which reads back the current port mask from hardware).
The other solution is to filter out the private VLANs from ocelot_port_num_untagged_vlans(), since they aren't what callers care about. We only need to do this to the mentioned function and not to ocelot_port_num_tagged_vlans(), because private VLANs are never egress-tagged.
Nothing else seems to be broken in either solution, but the first one requires more rework which will conflict with the net-next change 36a0bf443585 ("net: mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity"), and I'd like to avoid that. So go with the other one.
Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://lore.kernel.org/r/20220927122042.1100231-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 68991b021c56..c250ad6dc956 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -290,6 +290,13 @@ static int ocelot_port_num_untagged_vlans(struct ocelot *ocelot, int port) if (!(vlan->portmask & BIT(port))) continue;
+ /* Ignore the VLAN added by ocelot_add_vlan_unaware_pvid(), + * because this is never active in hardware at the same time as + * the bridge VLANs, which only matter in VLAN-aware mode. + */ + if (vlan->vid >= OCELOT_RSV_VLAN_RANGE_START) + continue; + if (vlan->untagged & BIT(port)) num_untagged++; }
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit c9da02bfb1112461e048d3b736afb1873f6f4ccf ]
The bitmasks applied in RX_DMA_GET_SPORT and RX_DMA_GET_SPORT_V2 macros were swapped. Fix that.
Reported-by: Chen Minqiang ptpt52@gmail.com Fixes: 160d3a9b192985 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support") Acked-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Daniel Golle daniel@makrotopia.org Link: https://lore.kernel.org/r/YzMW+mg9UsaCdKRQ@makrotopia.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h index 98d6a6d047e3..c1fe1a2cb746 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h @@ -312,8 +312,8 @@ #define MTK_RXD5_PPE_CPU_REASON GENMASK(22, 18) #define MTK_RXD5_SRC_PORT GENMASK(29, 26)
-#define RX_DMA_GET_SPORT(x) (((x) >> 19) & 0xf) -#define RX_DMA_GET_SPORT_V2(x) (((x) >> 26) & 0x7) +#define RX_DMA_GET_SPORT(x) (((x) >> 19) & 0x7) +#define RX_DMA_GET_SPORT_V2(x) (((x) >> 26) & 0xf)
/* PDMA V2 descriptor rxd3 */ #define RX_DMA_VTAG_V2 BIT(0)
From: Zhengjun Xing zhengjun.xing@linux.intel.com
[ Upstream commit 457c8b60267054869513ab1fb5513abb0a566dd0 ]
The test case 87 ("perf record tests") failed on hybrid systems,the event "cpu/br_inst_retired.near_call/p" is only for non-hybrid system. Correct the test event to support both non-hybrid and hybrid systems.
Before:
# ./perf test 87 87: perf record tests : FAILED!
After:
# ./perf test 87 87: perf record tests : Ok
Fixes: 24f378e66021f559 ("perf test: Add basic perf record tests") Reviewed-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Xing Zhengjun zhengjun.xing@linux.intel.com Acked-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20220927051513.3768717-1-zhengjun.xing@linux.intel... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/record.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh index 00c7285ce1ac..301f95427159 100755 --- a/tools/perf/tests/shell/record.sh +++ b/tools/perf/tests/shell/record.sh @@ -61,7 +61,7 @@ test_register_capture() { echo "Register capture test [Skipped missing registers]" return fi - if ! perf record -o - --intr-regs=di,r8,dx,cx -e cpu/br_inst_retired.near_call/p \ + if ! perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p \ -c 1000 --per-thread true 2> /dev/null \ | perf script -F ip,sym,iregs -i - 2> /dev/null \ | egrep -q "DI:"
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 25c5e67cdf744cbb93fd06647611d3036218debe ]
We were just checking for the 'err' variable, when we should really see if there was some of the many checked errors that don't stop the test right away.
Detected with clang 15.0.0:
44 75.23 fedora:37 : FAIL clang version 15.0.0 (Fedora 15.0.0-2.fc37)
tests/perf-record.c:68:16: error: variable 'errs' set but not used [-Werror,-Wunused-but-set-variable] int err = -1, errs = 0, i, wakeups = 0; ^ 1 error generated.
The patch introducing this 'perf test' entry had that check:
+ return (err < 0 || errs > 0) ? -1 : 0;
But at some point we lost that:
- return (err < 0 || errs > 0) ? -1 : 0; + if (err == -EACCES) + return TEST_SKIP; + if (err < 0) + return TEST_FAIL; + return TEST_OK
Put it back.
Fixes: 2cf88f4614c996e5 ("perf test: Use skip in PERF_RECORD_*") Acked-by: Ian Rogers irogers@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Link: https://lore.kernel.org/lkml/YzR0n5QhsH9VyYB0@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/perf-record.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c index 6a001fcfed68..4952abe716f3 100644 --- a/tools/perf/tests/perf-record.c +++ b/tools/perf/tests/perf-record.c @@ -332,7 +332,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest out: if (err == -EACCES) return TEST_SKIP; - if (err < 0) + if (err < 0 || errs != 0) return TEST_FAIL; return TEST_OK; }
From: Jim Mattson jmattson@google.com
[ Upstream commit aae2e72229cdb21f90df2dbe4244c977e5d3265b ]
The only thing reported by CPUID.9 is the value of IA32_PLATFORM_DCA_CAP[31:0] in EAX. This MSR doesn't even exist in the guest, since CPUID.1:ECX.DCA[bit 18] is clear in the guest.
Clear CPUID.9 in KVM_GET_SUPPORTED_CPUID.
Fixes: 24c82e576b78 ("KVM: Sanitize cpuid") Signed-off-by: Jim Mattson jmattson@google.com Message-Id: 20220922231854.249383-1-jmattson@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/cpuid.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 3ab498165639..cb14441cee37 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -870,8 +870,6 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) entry->edx = 0; } break; - case 9: - break; case 0xa: { /* Architectural Performance Monitoring */ struct x86_pmu_capability cap; union cpuid10_eax eax;
From: Borislav Petkov bp@suse.de
commit df5b035b5683d6a25f077af889fb88e09827f8bc upstream.
On a CONFIG_SMP=n kernel, the LLC shared mask is 0, which prevents __cache_amd_cpumap_setup() from doing the L3 masks setup, and more specifically from setting up the shared_cpu_map and shared_cpu_list files in sysfs, leading to lscpu from util-linux getting confused and segfaulting.
Add a cpu_llc_shared_mask() UP variant which returns a mask with a single bit set, i.e., for CPU0.
Fixes: 2b83809a5e6d ("x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask") Reported-by: Saurabh Sengar ssengar@linux.microsoft.com Signed-off-by: Borislav Petkov bp@suse.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1660148115-302-1-git-send-email-ssengar@linux.micr... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/smp.h | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -21,16 +21,6 @@ DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_llc DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_l2c_id); DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
-static inline struct cpumask *cpu_llc_shared_mask(int cpu) -{ - return per_cpu(cpu_llc_shared_map, cpu); -} - -static inline struct cpumask *cpu_l2c_shared_mask(int cpu) -{ - return per_cpu(cpu_l2c_shared_map, cpu); -} - DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid); DECLARE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid); DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_bios_cpu_apicid); @@ -172,6 +162,16 @@ extern int safe_smp_processor_id(void); # define safe_smp_processor_id() smp_processor_id() #endif
+static inline struct cpumask *cpu_llc_shared_mask(int cpu) +{ + return per_cpu(cpu_llc_shared_map, cpu); +} + +static inline struct cpumask *cpu_l2c_shared_mask(int cpu) +{ + return per_cpu(cpu_l2c_shared_map, cpu); +} + #else /* !CONFIG_SMP */ #define wbinvd_on_cpu(cpu) wbinvd() static inline int wbinvd_on_all_cpus(void) @@ -179,6 +179,11 @@ static inline int wbinvd_on_all_cpus(voi wbinvd(); return 0; } + +static inline struct cpumask *cpu_llc_shared_mask(int cpu) +{ + return (struct cpumask *)cpumask_of(0); +} #endif /* CONFIG_SMP */
extern unsigned disabled_cpus;
From: Nadav Amit namit@vmware.com
commit efd608fa7403ba106412b437f873929e2c862e28 upstream.
I encountered some occasional crashes of poke_int3_handler() when kprobes are set, while accessing desc->vec.
The text poke mechanism claims to have an RCU-like behavior, but it does not appear that there is any quiescent state to ensure that nobody holds reference to desc. As a result, the following race appears to be possible, which can lead to memory corruption.
CPU0 CPU1 ---- ---- text_poke_bp_batch() -> smp_store_release(&bp_desc, &desc)
[ notice that desc is on the stack ]
poke_int3_handler()
[ int3 might be kprobe's so sync events are do not help ]
-> try_get_desc(descp=&bp_desc) desc = __READ_ONCE(bp_desc)
if (!desc) [false, success] WRITE_ONCE(bp_desc, NULL); atomic_dec_and_test(&desc.refs)
[ success, desc space on the stack is being reused and might have non-zero value. ] arch_atomic_inc_not_zero(&desc->refs)
[ might succeed since desc points to stack memory that was freed and might be reused. ]
Fix this issue with small backportable patch. Instead of trying to make RCU-like behavior for bp_desc, just eliminate the unnecessary level of indirection of bp_desc, and hold the whole descriptor as a global. Anyhow, there is only a single descriptor at any given moment.
Fixes: 1f676247f36a4 ("x86/alternatives: Implement a better poke_int3_handler() completion scheme") Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@kernel.org Link: https://lkml.kernel.org/r/20220920224743.3089-1-namit@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/alternative.c | 45 +++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 22 deletions(-)
--- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1319,22 +1319,23 @@ struct bp_patching_desc { atomic_t refs; };
-static struct bp_patching_desc *bp_desc; +static struct bp_patching_desc bp_desc;
static __always_inline -struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp) +struct bp_patching_desc *try_get_desc(void) { - /* rcu_dereference */ - struct bp_patching_desc *desc = __READ_ONCE(*descp); + struct bp_patching_desc *desc = &bp_desc;
- if (!desc || !arch_atomic_inc_not_zero(&desc->refs)) + if (!arch_atomic_inc_not_zero(&desc->refs)) return NULL;
return desc; }
-static __always_inline void put_desc(struct bp_patching_desc *desc) +static __always_inline void put_desc(void) { + struct bp_patching_desc *desc = &bp_desc; + smp_mb__before_atomic(); arch_atomic_dec(&desc->refs); } @@ -1367,15 +1368,15 @@ noinstr int poke_int3_handler(struct pt_
/* * Having observed our INT3 instruction, we now must observe - * bp_desc: + * bp_desc with non-zero refcount: * - * bp_desc = desc INT3 + * bp_desc.refs = 1 INT3 * WMB RMB - * write INT3 if (desc) + * write INT3 if (bp_desc.refs != 0) */ smp_rmb();
- desc = try_get_desc(&bp_desc); + desc = try_get_desc(); if (!desc) return 0;
@@ -1429,7 +1430,7 @@ noinstr int poke_int3_handler(struct pt_ ret = 1;
out_put: - put_desc(desc); + put_desc(); return ret; }
@@ -1460,18 +1461,20 @@ static int tp_vec_nr; */ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries) { - struct bp_patching_desc desc = { - .vec = tp, - .nr_entries = nr_entries, - .refs = ATOMIC_INIT(1), - }; unsigned char int3 = INT3_INSN_OPCODE; unsigned int i; int do_sync;
lockdep_assert_held(&text_mutex);
- smp_store_release(&bp_desc, &desc); /* rcu_assign_pointer */ + bp_desc.vec = tp; + bp_desc.nr_entries = nr_entries; + + /* + * Corresponds to the implicit memory barrier in try_get_desc() to + * ensure reading a non-zero refcount provides up to date bp_desc data. + */ + atomic_set_release(&bp_desc.refs, 1);
/* * Corresponding read barrier in int3 notifier for making sure the @@ -1559,12 +1562,10 @@ static void text_poke_bp_batch(struct te text_poke_sync();
/* - * Remove and synchronize_rcu(), except we have a very primitive - * refcount based completion. + * Remove and wait for refs to be zero. */ - WRITE_ONCE(bp_desc, NULL); /* RCU_INIT_POINTER */ - if (!atomic_dec_and_test(&desc.refs)) - atomic_cond_read_acquire(&desc.refs, !VAL); + if (!atomic_dec_and_test(&bp_desc.refs)) + atomic_cond_read_acquire(&bp_desc.refs, !VAL); }
static void text_poke_loc_init(struct text_poke_loc *tp, void *addr,
From: Levi Yun ppbuk5246@gmail.com
commit 1c8e2349f2d033f634d046063b704b2ca6c46972 upstream.
When damon_sysfs_add_target couldn't find proper task, New allocated damon_target structure isn't registered yet, So, it's impossible to free new allocated one by damon_sysfs_destroy_targets.
By calling damon_add_target as soon as allocating new target, Fix this possible memory leak.
Link: https://lkml.kernel.org/r/20220926160611.48536-1-sj@kernel.org Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring") Signed-off-by: Levi Yun ppbuk5246@gmail.com Signed-off-by: SeongJae Park sj@kernel.org Reviewed-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org [5.17.x] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -2181,13 +2181,13 @@ static int damon_sysfs_add_target(struct
if (!t) return -ENOMEM; + damon_add_target(ctx, t); if (ctx->ops.id == DAMON_OPS_VADDR || ctx->ops.id == DAMON_OPS_FVADDR) { t->pid = find_get_pid(sys_target->pid); if (!t->pid) goto destroy_targets_out; } - damon_add_target(ctx, t); err = damon_sysfs_set_regions(t, sys_target->regions); if (err) goto destroy_targets_out;
On Mon, Oct 03, 2022 at 09:09:56AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
Build results: total: 150 pass: 150 fail: 0 Qemu test results: total: 490 pass: 490 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 10/3/22 00:09, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On Mon, Oct 03, 2022 at 09:09:56AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Tested rc1 against the Fedora build system (aarch64, armv7, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On Oct 3, 2022, at 3:09 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
5.19.13-rc1 compiled and booted with no errors or regressions on my x86_64 test system.
Tested-by: Slade Watkins srw@sladewatkins.net
-srw
On 10/3/22 01:09, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, Oct 3, 2022 at 8:56 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Hi Greg,
Compiled and booted on my test system Lenovo P50s: Intel Core i7 No emergency and critical messages in the dmesg
./perf bench sched all # Running sched/messaging benchmark... # 20 sender and receiver processes per group # 10 groups == 400 processes run
Total time: 0.740 [sec]
# Running sched/pipe benchmark... # Executed 1000000 pipe operations between two processes
Total time: 9.647 [sec]
9.647452 usecs/op 103654 ops/sec
Tested-by: Zan Aziz zanaziz313@gmail.com
Thanks -Zan
On 10/3/22 12:09 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro's test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE: 1) Build warning 2) Boot warning on qemu-arm64 with KASAN and Kunit test Suspecting one of the recently commits causing this warning and need to bisect to confirm the commit id. mm/slab_common: fix possible double free of kmem_cache [ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
1) Following build warning found on few arm configs which do not set Kconfig # CONFIG_ELF_CORE is not set
- powerpc: tqm8xx_defconfig - arm: keystone_defconfig and omap1_defconfig - mips: ar7_defconfig fs/coredump.c:835:12: warning: 'dump_emit_page' defined but not used [-Wunused-function] 835 | static int dump_emit_page(struct coredump_params *cprm, struct page *page) | ^~~~~~~~~~~~~~
2) Following kernel boot warning noticed on qemu-arm64 with KASAN and KUNIT enabled [1]
[ 177.651182] ------------[ cut here ]------------ [ 177.652217] kmem_cache_destroy test: Slab cache still has objects when called from test_exit+0x28/0x40 [ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520 kmem_cache_destroy+0x1e8/0x20c [ 177.666237] Modules linked in: [ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 [ 177.668666] Hardware name: linux,dummy-virt (DT) [ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c [ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c [ 177.673302] sp : ffff8000080876f0 [ 177.674013] x29: ffff8000080876f0 x28: ffffb5ed1da56f38 x27: ffffb5ed1a87b480 [ 177.676478] x26: ffff800008087aa0 x25: ffff800008087ac8 x24: ffff00000c73b480 [ 177.678215] x23: 000000004c800000 x22: ffffb5ed1eca3000 x21: ffffb5ed1da381f0 [ 177.679873] x20: fdecb5ed18ea3a78 x19: ffff00000759be00 x18: 00000000ffffffff [ 177.681540] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 177.683139] x14: 0000000000000000 x13: 206d6f7266206465 x12: ffff700001010e63 [ 177.684776] x11: 1ffff00001010e62 x10: ffff700001010e62 x9 : ffffb5ed18b89514 [ 177.686554] x8 : ffff800008087317 x7 : 0000000000000001 x6 : 0000000000000001 [ 177.688238] x5 : ffffb5ed1d893000 x4 : dfff800000000000 x3 : ffffb5ed18b89520 [ 177.689912] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007150000 [ 177.691598] Call trace: [ 177.692165] kmem_cache_destroy+0x1e8/0x20c [ 177.693196] test_exit+0x28/0x40 [ 177.694158] kunit_catch_run_case+0x5c/0x120 [ 177.695177] kunit_try_catch_run+0x144/0x26c [ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0 [ 177.697353] kunit_run_tests+0x374/0x750 [ 177.698333] __kunit_test_suites_init+0x74/0xa0 [ 177.699386] kunit_run_all_tests+0x160/0x380 [ 177.700428] kernel_init_freeable+0x32c/0x388 [ 177.701497] kernel_init+0x2c/0x150 [ 177.702347] ret_from_fork+0x10/0x20 [ 177.703308] ---[ end trace 0000000000000000 ]---
[1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1Su...
--- mm/slab_common: fix possible double free of kmem_cache [ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
When doing slub_debug test, kfence's 'test_memcache_typesafe_by_rcu' kunit test case cause a use-after-free error:
BUG: KASAN: use-after-free in kobject_del+0x14/0x30 Read of size 8 at addr ffff888007679090 by task kunit_try_catch/261
CPU: 1 PID: 261 Comm: kunit_try_catch Tainted: G B N 6.0.0-rc5-next-20220916 #17 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x48 print_address_description.constprop.0+0x87/0x2a5 print_report+0x103/0x1ed kasan_report+0xb7/0x140 kobject_del+0x14/0x30 kmem_cache_destroy+0x130/0x170 test_exit+0x1a/0x30 kunit_try_run_case+0xad/0xc0 kunit_generic_run_threadfn_adapter+0x26/0x50 kthread+0x17b/0x1b0 </TASK>
The cause is inside kmem_cache_destroy():
kmem_cache_destroy acquire lock/mutex shutdown_cache schedule_work(kmem_cache_release) (if RCU flag set) release lock/mutex kmem_cache_release (if RCU flag not set)
In some certain timing, the scheduled work could be run before the next RCU flag checking, which can then get a wrong value and lead to double kmem_cache_release().
Fix it by caching the RCU flag inside protected area, just like 'refcnt'
Fixes: 0495e337b703 ("mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock") Signed-off-by: Feng Tang feng.tang@intel.com Reviewed-by: Hyeonggon Yoo 42.hyeyoo@gmail.com Reviewed-by: Waiman Long longman@redhat.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org
## Build * kernel: 5.19.13-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.19.y * git commit: 0d49bf6408c47f815c7e056a006617d5431b1bed * git describe: v5.19.12-102-g0d49bf6408c4 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.19.y/build/v5.19....
## No Test Regressions (compared to v5.19.12)
## No Metric Regressions (compared to v5.19.12)
## No Test Fixes (compared to v5.19.12)
## No Metric Fixes (compared to v5.19.12)
## Test result summary total: 119604, pass: 105318, fail: 1117, skip: 12815, xfail: 354
## Build Summary * arc: 10 total, 10 passed, 0 failed * arm: 339 total, 336 passed, 3 failed * arm64: 73 total, 70 passed, 3 failed * i386: 62 total, 55 passed, 7 failed * mips: 62 total, 59 passed, 3 failed * parisc: 14 total, 14 passed, 0 failed * powerpc: 75 total, 66 passed, 9 failed * riscv: 32 total, 27 passed, 5 failed * s390: 26 total, 24 passed, 2 failed * sh: 26 total, 24 passed, 2 failed * sparc: 14 total, 14 passed, 0 failed * x86_64: 66 total, 63 passed, 3 failed
## Test suites summary * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-arm64/arm64.btitest.bti_c_func * kselftest-arm64/arm64.btitest.bti_j_func * kselftest-arm64/arm64.btitest.bti_jc_func * kselftest-arm64/arm64.btitest.bti_none_func * kselftest-arm64/arm64.btitest.nohint_func * kselftest-arm64/arm64.btitest.paciasp_func * kselftest-arm64/arm64.nobtitest.bti_c_func * kselftest-arm64/arm64.nobtitest.bti_j_func * kselftest-arm64/arm64.nobtitest.bti_jc_func * kselftest-arm64/arm64.nobtitest.bti_none_func * kselftest-arm64/arm64.nobtitest.nohint_func * kselftest-arm64/arm64.nobtitest.paciasp_func * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-open-posix-tests * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * packetdrill * perf * perf/Zstd-perf.data-compression * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
On Tue, Oct 04, 2022 at 12:18:05PM +0530, Naresh Kamboju wrote:
On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro's test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE:
- Build warning
- Boot warning on qemu-arm64 with KASAN and Kunit test Suspecting one of the recently commits causing this warning and need to bisect to confirm the commit id. mm/slab_common: fix possible double free of kmem_cache
[ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
Hi Naresh Kamboju,
Thanks for the report!
Could you try reverting the commit and re-test it to confirm?
Also could you provide the kernel dmesg of the failure and the kernel config of the test?
I locally pulled the linux-stable source and used QEMU to test it with kasan/kfence enabled, but could not reproduce it (I only have x86 HW at hand).
- Following kernel boot warning noticed on qemu-arm64 with KASAN and
KUNIT enabled [1]
[ 177.651182] ------------[ cut here ]------------ [ 177.652217] kmem_cache_destroy test: Slab cache still has
objects when called from test_exit+0x28/0x40 [ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520 kmem_cache_destroy+0x1e8/0x20c [ 177.666237] Modules linked in: [ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 [ 177.668666] Hardware name: linux,dummy-virt (DT) [ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c [ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c [ 177.691598] Call trace: [ 177.692165] kmem_cache_destroy+0x1e8/0x20c [ 177.693196] test_exit+0x28/0x40 [ 177.694158] kunit_catch_run_case+0x5c/0x120 [ 177.695177] kunit_try_catch_run+0x144/0x26c [ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0 [ 177.697353] kunit_run_tests+0x374/0x750 [ 177.698333] __kunit_test_suites_init+0x74/0xa0 [ 177.699386] kunit_run_all_tests+0x160/0x380 [ 177.700428] kernel_init_freeable+0x32c/0x388 [ 177.701497] kernel_init+0x2c/0x150 [ 177.702347] ret_from_fork+0x10/0x20 [ 177.703308] ---[ end trace 0000000000000000 ]---
[1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1Su...
I also tried the reproduce cmmand from the above link:
tuxrun --runtime podman --device qemu-arm64 --kernel https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/Image.gz --modules https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/modules.tar.xz --rootfs https://storage.lkft.org/rootfs/oe-kirkstone/20220824-114729/juno/lkft-tux-i... --parameters SKIPFILE=skipfile-lkft.yaml --image docker.io/lavasoftware/lava-dispatcher:2022.06 --tests kunit --timeouts boot=30
Which also didn't reproduce it, but had some RCU stall problems (could also be related to the x86 HWs)
[ 321.006279] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 321.007281] ffff0000074c2300: 00 07 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 321.009283] rcu: 0-...0: (1 GPs behind) idle=40f/1/0x4000000000000000 softirq=436/437 fqs=5
[ 321.024995] rcu: rcu_preempt kthread timer wakeup didn't happen for 4464 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 [ 321.026343] rcu: Possible timer handling issue on cpu=1 timer-softirq=1426 [ 321.027340] rcu: rcu_preempt kthread starved for 4465 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 [ 321.028517] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 321.029488] rcu: RCU grace-period kthread stack dump: [ 321.030251] task:rcu_preempt state:I stack: 0 pid: 16 ppid: 2 flags:0x00000008 [ 321.031434] Call trace: [ 321.031878] __switch_to+0x140/0x1e0 [ 321.032565] __schedule+0x4f4/0xc74 [ 321.033228] schedule+0x88/0x13c [ 321.033915] schedule_timeout+0x104/0x2b0 [ 321.034646] rcu_gp_fqs_loop+0x1a0/0x784 [ 321.035119] rcu_gp_kthread+0x278/0x3a0 [ 321.035608] kthread+0x160/0x170 [ 339.882465] ret_from_fork+0x10/0x20 [ 339.883898] rcu: Stack dump where RCU GP kthread last ran:
The full .xz log is attched.
Thanks, Feng
On Wed, 5 Oct 2022 at 15:09, Feng Tang feng.tang@intel.com wrote:
On Tue, Oct 04, 2022 at 12:18:05PM +0530, Naresh Kamboju wrote:
On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro's test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
NOTE:
- Build warning
- Boot warning on qemu-arm64 with KASAN and Kunit test Suspecting one of the recently commits causing this warning and need to bisect to confirm the commit id. mm/slab_common: fix possible double free of kmem_cache
[ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
Hi Naresh Kamboju,
Thanks for the report!
Could you try reverting the commit and re-test it to confirm?
Anders re-run the tests multiple times with and without the patch reverted and was not successful in reproducing the reported problem. Which confirms that, it is not easy to reproduce.
Also could you provide the kernel dmesg of the failure and the kernel config of the test?
dmesg log attached to this email.
Here is the build link, https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/
I locally pulled the linux-stable source and used QEMU to test it with kasan/kfence enabled, but could not reproduce it (I only have x86 HW at hand).
- Following kernel boot warning noticed on qemu-arm64 with KASAN and
KUNIT enabled [1]
[ 177.651182] ------------[ cut here ]------------ [ 177.652217] kmem_cache_destroy test: Slab cache still has
objects when called from test_exit+0x28/0x40 [ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520 kmem_cache_destroy+0x1e8/0x20c [ 177.666237] Modules linked in: [ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 [ 177.668666] Hardware name: linux,dummy-virt (DT) [ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c [ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c [ 177.691598] Call trace: [ 177.692165] kmem_cache_destroy+0x1e8/0x20c [ 177.693196] test_exit+0x28/0x40 [ 177.694158] kunit_catch_run_case+0x5c/0x120 [ 177.695177] kunit_try_catch_run+0x144/0x26c [ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0 [ 177.697353] kunit_run_tests+0x374/0x750 [ 177.698333] __kunit_test_suites_init+0x74/0xa0 [ 177.699386] kunit_run_all_tests+0x160/0x380 [ 177.700428] kernel_init_freeable+0x32c/0x388 [ 177.701497] kernel_init+0x2c/0x150 [ 177.702347] ret_from_fork+0x10/0x20 [ 177.703308] ---[ end trace 0000000000000000 ]---
[1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1Su...
I also tried the reproduce cmmand from the above link:
tuxrun --runtime podman --device qemu-arm64 --kernel https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/Image.gz --modules https://builds.tuxbuild.com/2FcCwzbNgR7TlQXzJ0nu32y1CpB/modules.tar.xz --rootfs https://storage.lkft.org/rootfs/oe-kirkstone/20220824-114729/juno/lkft-tux-i... --parameters SKIPFILE=skipfile-lkft.yaml --image docker.io/lavasoftware/lava-dispatcher:2022.06 --tests kunit --timeouts boot=30
Which also didn't reproduce it, but had some RCU stall problems (could also be related to the x86 HWs)
[ 321.006279] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 321.007281] ffff0000074c2300: 00 07 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 321.009283] rcu: 0-...0: (1 GPs behind) idle=40f/1/0x4000000000000000 softirq=436/437 fqs=5
[ 321.024995] rcu: rcu_preempt kthread timer wakeup didn't happen for 4464 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 [ 321.026343] rcu: Possible timer handling issue on cpu=1 timer-softirq=1426 [ 321.027340] rcu: rcu_preempt kthread starved for 4465 jiffies! g-207 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 [ 321.028517] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. [ 321.029488] rcu: RCU grace-period kthread stack dump: [ 321.030251] task:rcu_preempt state:I stack: 0 pid: 16 ppid: 2 flags:0x00000008 [ 321.031434] Call trace: [ 321.031878] __switch_to+0x140/0x1e0 [ 321.032565] __schedule+0x4f4/0xc74 [ 321.033228] schedule+0x88/0x13c [ 321.033915] schedule_timeout+0x104/0x2b0 [ 321.034646] rcu_gp_fqs_loop+0x1a0/0x784 [ 321.035119] rcu_gp_kthread+0x278/0x3a0 [ 321.035608] kthread+0x160/0x170 [ 339.882465] ret_from_fork+0x10/0x20 [ 339.883898] rcu: Stack dump where RCU GP kthread last ran:
The full .xz log is attched.
Thanks for looking into this.
Thanks, Feng
- Naresh
On Tue, Oct 04, 2022 at 12:18:05PM +0530, Naresh Kamboju wrote:
On Mon, 3 Oct 2022 at 12:43, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.19.13-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.19.y and the diffstat can be found below.
thanks,
greg k-h
[...]
- Boot warning on qemu-arm64 with KASAN and Kunit test Suspecting one of the recently commits causing this warning and need to bisect to confirm the commit id. mm/slab_common: fix possible double free of kmem_cache
[ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
[...]
- Following kernel boot warning noticed on qemu-arm64 with KASAN and
KUNIT enabled [1]
[ 177.651182] ------------[ cut here ]------------ [ 177.652217] kmem_cache_destroy test: Slab cache still has
objects when called from test_exit+0x28/0x40 [ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520 kmem_cache_destroy+0x1e8/0x20c [ 177.666237] Modules linked in: [ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 [ 177.668666] Hardware name: linux,dummy-virt (DT) [ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c [ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c [ 177.673302] sp : ffff8000080876f0 [ 177.674013] x29: ffff8000080876f0 x28: ffffb5ed1da56f38 x27: ffffb5ed1a87b480 [ 177.676478] x26: ffff800008087aa0 x25: ffff800008087ac8 x24: ffff00000c73b480 [ 177.678215] x23: 000000004c800000 x22: ffffb5ed1eca3000 x21: ffffb5ed1da381f0 [ 177.679873] x20: fdecb5ed18ea3a78 x19: ffff00000759be00 x18: 00000000ffffffff [ 177.681540] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 177.683139] x14: 0000000000000000 x13: 206d6f7266206465 x12: ffff700001010e63 [ 177.684776] x11: 1ffff00001010e62 x10: ffff700001010e62 x9 : ffffb5ed18b89514 [ 177.686554] x8 : ffff800008087317 x7 : 0000000000000001 x6 : 0000000000000001 [ 177.688238] x5 : ffffb5ed1d893000 x4 : dfff800000000000 x3 : ffffb5ed18b89520 [ 177.689912] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007150000 [ 177.691598] Call trace: [ 177.692165] kmem_cache_destroy+0x1e8/0x20c [ 177.693196] test_exit+0x28/0x40 [ 177.694158] kunit_catch_run_case+0x5c/0x120 [ 177.695177] kunit_try_catch_run+0x144/0x26c [ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0 [ 177.697353] kunit_run_tests+0x374/0x750 [ 177.698333] __kunit_test_suites_init+0x74/0xa0 [ 177.699386] kunit_run_all_tests+0x160/0x380 [ 177.700428] kernel_init_freeable+0x32c/0x388 [ 177.701497] kernel_init+0x2c/0x150 [ 177.702347] ret_from_fork+0x10/0x20 [ 177.703308] ---[ end trace 0000000000000000 ]---
[1] https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2FcCyacq1Su...
[+Cc Marco Elver]
I can't reproduce it with the image and still not sure what caused this,
but the dmesg output [3] raises some questions: 1) What made kfence_test fail, and 2) can a failure from KFENCE test cause this SLUB warning?
2022-10-03T07:48:54.922482 <6>[ 146.564765] ok 3 - test_out_of_bounds_write 2022-10-03T07:48:54.922578 <6>[ 146.577134] # test_out_of_bounds_write-memcache: setup_test_cache: size=32, ctor=0x0 2022-10-03T07:48:54.922666 <6>[ 146.592675] # test_out_of_bounds_write-memcache: test_alloc: size=32, gfp=cc0, policy=left, cache=1 2022-10-03T07:48:54.922756 <3>[ 156.602992] # test_out_of_bounds_write-memcache: ASSERTION FAILED at mm/kfence/kfence_test.c:312 2022-10-03T07:48:54.922844 <3>[ 156.602992] Expected false to be true, but is false 2022-10-03T07:48:54.922934 <3>[ 156.602992] 2022-10-03T07:48:54.923018 <3>[ 156.602992] failed to allocate from KFENCE 2022-10-03T07:48:54.925842 <6>[ 156.864670] not ok 4 - test_out_of_bounds_write-memcache 2022-10-03T07:48:54.926038 <6>[ 156.883110] # test_use_after_free_read: test_alloc: size=32, gfp=cc0, policy=any, cache=0 2022-10-03T07:48:54.926178 <3>[ 156.920306] ==================================================================
[...]
2022-10-03T07:50:11.011619 <6>[ 163.904684] # test_free_bulk-memcache: setup_test_cache: size=223, ctor=0x0 2022-10-03T07:50:11.011811 <6>[ 163.927257] # test_free_bulk-memcache: test_alloc: size=223, gfp=cc0, policy=right, cache=1 2022-10-03T07:50:11.012007 <6>[ 163.992279] # test_free_bulk-memcache: test_alloc: size=223, gfp=cc0, policy=none, cache=1 2022-10-03T07:50:11.012200 <6>[ 164.007799] # test_free_bulk-memcache: test_alloc: size=223, gfp=cc0, policy=left, cache=1 2022-10-03T07:50:11.012392 <3>[ 176.777879] # test_free_bulk-memcache: ASSERTION FAILED at mm/kfence/kfence_test.c:312 2022-10-03T07:50:11.012592 <3>[ 176.777879] Expected false to be true, but is false 2022-10-03T07:50:21.406181 <3>[ 176.777879] 2022-10-03T07:50:21.406483 <3>[ 176.777879] failed to allocate from KFENCE 2022-10-03T07:50:21.406616 <3>[ 177.604811] ============================================================================= 2022-10-03T07:50:21.406728 <3>[ 177.608387] BUG test (Tainted: G B ): Objects remaining in test on __kmem_cache_shutdown() 2022-10-03T07:50:21.406827 <3>[ 177.609927] ----------------------------------------------------------------------------- 2022-10-03T07:50:21.406918 <3>[ 177.609927] 2022-10-03T07:50:21.407005 <3>[ 177.611424] Slab 0x000000009535baed objects=14 used=1 fp=0x00000000e8649a76 flags=0x3fffc0000000200(slab|node=0|zone=0|lastcpupid=0xffff) 2022-10-03T07:50:21.407112 <4>[ 177.613882] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 2022-10-03T07:50:21.407219 <4>[ 177.615231] Hardware name: linux,dummy-virt (DT) 2022-10-03T07:50:21.407310 <4>[ 177.616197] Call trace: 2022-10-03T07:50:21.407400 <4>[ 177.616788] dump_backtrace+0xb8/0x130 2022-10-03T07:50:21.407490 <4>[ 177.617792] show_stack+0x20/0x60 2022-10-03T07:50:21.407581 <4>[ 177.618630] dump_stack_lvl+0x8c/0xb8 2022-10-03T07:50:21.407671 <4>[ 177.619548] dump_stack+0x1c/0x38 2022-10-03T07:50:21.407761 <4>[ 177.620396] slab_err+0xa0/0xf0 2022-10-03T07:50:21.407851 <4>[ 177.621180] __kmem_cache_shutdown+0x140/0x3c0 2022-10-03T07:50:21.407935 <4>[ 177.622230] kmem_cache_destroy+0x9c/0x20c 2022-10-03T07:50:21.408017 <4>[ 177.623242] test_exit+0x28/0x40 2022-10-03T07:50:21.408100 <4>[ 177.624172] kunit_catch_run_case+0x5c/0x120 2022-10-03T07:50:21.408183 <4>[ 177.625189] kunit_try_catch_run+0x144/0x26c 2022-10-03T07:50:21.408269 <4>[ 177.626251] kunit_run_case_catch_errors+0x158/0x1e0 2022-10-03T07:50:21.408355 <4>[ 177.627359] kunit_run_tests+0x374/0x750 2022-10-03T07:50:21.408439 <4>[ 177.628316] __kunit_test_suites_init+0x74/0xa0 2022-10-03T07:50:21.408523 <4>[ 177.629376] kunit_run_all_tests+0x160/0x380 2022-10-03T07:50:21.408606 <4>[ 177.630440] kernel_init_freeable+0x32c/0x388 2022-10-03T07:50:21.408687 <4>[ 177.631517] kernel_init+0x2c/0x150 2022-10-03T07:50:21.408770 <4>[ 177.632351] ret_from_fork+0x10/0x20 2022-10-03T07:50:21.408856 <4>[ 177.633506] Disabling lock debugging due to kernel taint 2022-10-03T07:50:21.408942 <3>[ 177.634724] Object 0x00000000a1747116 @offset=2880 2022-10-03T07:50:21.409029 <4>[ 177.651182] ------------[ cut here ]------------ 2022-10-03T07:50:21.409116 <4>[ 177.652217] kmem_cache_destroy test: Slab cache still has objects when called from test_exit+0x28/0x40 2022-10-03T07:50:21.409205 <4>[ 177.654849] WARNING: CPU: 0 PID: 1 at mm/slab_common.c:520 kmem_cache_destroy+0x1e8/0x20c 2022-10-03T07:50:21.409297 <4>[ 177.666237] Modules linked in: 2022-10-03T07:50:32.517549 <4>[ 177.667325] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B 5.19.13-rc1 #1 2022-10-03T07:50:32.518598 <4>[ 177.668666] Hardware name: linux,dummy-virt (DT) 2022-10-03T07:50:32.519060 <4>[ 177.669783] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) 2022-10-03T07:50:32.519440 <4>[ 177.671120] pc : kmem_cache_destroy+0x1e8/0x20c 2022-10-03T07:50:32.519798 <4>[ 177.672217] lr : kmem_cache_destroy+0x1e8/0x20c 2022-10-03T07:50:32.520150 <4>[ 177.673302] sp : ffff8000080876f0 2022-10-03T07:50:32.520502 <4>[ 177.674013] x29: ffff8000080876f0 x28: ffffb5ed1da56f38 x27: ffffb5ed1a87b480 2022-10-03T07:50:32.520852 <4>[ 177.676478] x26: ffff800008087aa0 x25: ffff800008087ac8 x24: ffff00000c73b480 2022-10-03T07:50:32.521203 <4>[ 177.678215] x23: 000000004c800000 x22: ffffb5ed1eca3000 x21: ffffb5ed1da381f0 2022-10-03T07:50:32.521565 <4>[ 177.679873] x20: fdecb5ed18ea3a78 x19: ffff00000759be00 x18: 00000000ffffffff 2022-10-03T07:50:32.521903 <4>[ 177.681540] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 2022-10-03T07:50:32.522248 <4>[ 177.683139] x14: 0000000000000000 x13: 206d6f7266206465 x12: ffff700001010e63 2022-10-03T07:50:32.522624 <4>[ 177.684776] x11: 1ffff00001010e62 x10: ffff700001010e62 x9 : ffffb5ed18b89514 2022-10-03T07:50:32.522978 <4>[ 177.686554] x8 : ffff800008087317 x7 : 0000000000000001 x6 : 0000000000000001 2022-10-03T07:50:32.523346 <4>[ 177.688238] x5 : ffffb5ed1d893000 x4 : dfff800000000000 x3 : ffffb5ed18b89520 2022-10-03T07:50:32.523706 <4>[ 177.689912] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007150000 2022-10-03T07:50:32.524060 <4>[ 177.691598] Call trace: 2022-10-03T07:50:32.524419 <4>[ 177.692165] kmem_cache_destroy+0x1e8/0x20c 2022-10-03T07:50:32.524781 <4>[ 177.693196] test_exit+0x28/0x40 2022-10-03T07:50:32.525138 <4>[ 177.694158] kunit_catch_run_case+0x5c/0x120 2022-10-03T07:50:32.525491 <4>[ 177.695177] kunit_try_catch_run+0x144/0x26c 2022-10-03T07:50:32.525842 <4>[ 177.696211] kunit_run_case_catch_errors+0x158/0x1e0 2022-10-03T07:50:32.526203 <4>[ 177.697353] kunit_run_tests+0x374/0x750 2022-10-03T07:50:32.526583 <4>[ 177.698333] __kunit_test_suites_init+0x74/0xa0 2022-10-03T07:50:32.526944 <4>[ 177.699386] kunit_run_all_tests+0x160/0x380 2022-10-03T07:50:32.527319 <4>[ 177.700428] kernel_init_freeable+0x32c/0x388 2022-10-03T07:50:32.527677 <4>[ 177.701497] kernel_init+0x2c/0x150 2022-10-03T07:50:32.528045 <4>[ 177.702347] ret_from_fork+0x10/0x20 2022-10-03T07:50:32.528415 <4>[ 177.703308] ---[ end trace 0000000000000000 ]--- 2022-10-03T07:50:32.528777 <6>[ 180.045230] not ok 14 - test_free_bulk-memcache
[3] https://tuxapi-prod-storage-public-linaro.s3.amazonaws.com/lkft/tests/2FcCya...
mm/slab_common: fix possible double free of kmem_cache [ Upstream commit d71608a877362becdc94191f190902fac1e64d35 ]
When doing slub_debug test, kfence's 'test_memcache_typesafe_by_rcu' kunit test case cause a use-after-free error:
BUG: KASAN: use-after-free in kobject_del+0x14/0x30 Read of size 8 at addr ffff888007679090 by task kunit_try_catch/261
CPU: 1 PID: 261 Comm: kunit_try_catch Tainted: G B N 6.0.0-rc5-next-20220916 #17 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace:
<TASK> dump_stack_lvl+0x34/0x48 print_address_description.constprop.0+0x87/0x2a5 print_report+0x103/0x1ed kasan_report+0xb7/0x140 kobject_del+0x14/0x30 kmem_cache_destroy+0x130/0x170 test_exit+0x1a/0x30 kunit_try_run_case+0xad/0xc0 kunit_generic_run_threadfn_adapter+0x26/0x50 kthread+0x17b/0x1b0 </TASK>
The cause is inside kmem_cache_destroy():
kmem_cache_destroy acquire lock/mutex shutdown_cache schedule_work(kmem_cache_release) (if RCU flag set) release lock/mutex kmem_cache_release (if RCU flag not set)
In some certain timing, the scheduled work could be run before the next RCU flag checking, which can then get a wrong value and lead to double kmem_cache_release().
Fix it by caching the RCU flag inside protected area, just like 'refcnt'
Fixes: 0495e337b703 ("mm/slab_common: Deleting kobject in kmem_cache_destroy() without holding slab_mutex/cpu_hotplug_lock") Signed-off-by: Feng Tang feng.tang@intel.com Reviewed-by: Hyeonggon Yoo 42.hyeyoo@gmail.com Reviewed-by: Waiman Long longman@redhat.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org
## Build
- kernel: 5.19.13-rc1
- git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
- git branch: linux-5.19.y
- git commit: 0d49bf6408c47f815c7e056a006617d5431b1bed
- git describe: v5.19.12-102-g0d49bf6408c4
- test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.19.y/build/v5.19....
[...]
-- Linaro LKFT https://lkft.linaro.org
On Mon, Oct 03, 2022 at 09:09:56AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.1.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
Hi Greg,
On Mon, Oct 03, 2022 at 09:09:56AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.19.13 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Oct 2022 07:07:06 +0000. Anything received after that time might be too late.
Build test (gcc version 12.2.1 20220925): mips: 59 configs -> no failure arm: 99 configs -> no failure arm64: 3 configs -> no failure x86_64: 4 configs -> no failure alpha allmodconfig -> no failure csky allmodconfig -> no failure powerpc allmodconfig -> no failure riscv allmodconfig -> no failure s390 allmodconfig -> no failure xtensa allmodconfig -> no failure
Boot test: x86_64: Booted on my test laptop. No regression. x86_64: Booted on qemu. No regression. [1] arm64: Booted on rpi4b (4GB model). No regression. [2]
[1]. https://openqa.qa.codethink.co.uk/tests/1947 [2]. https://openqa.qa.codethink.co.uk/tests/1950
Tested-by: Sudip Mukherjee sudip.mukherjee@codethink.co.uk
-- Regards Sudip
Hey Greg,
Ran tests and boot tested on my system, no regressions found
Tested-by: Fenil Jain fkjainco@gmail.com
linux-stable-mirror@lists.linaro.org