This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.124-rc1
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: join: only check for ip6tables if needed
Thomas Petazzoni thomas.petazzoni@bootlin.com ASoC: cs42l51: fix driver to properly autoload with automatic module loading
Jens Axboe axboe@kernel.dk io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: sockopt: use 'iptables-legacy' if available
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: intel_pstate: Drop ACPI _PSS states table patching
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: processor: perflib: Avoid updating frequency QoS unnecessarily
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: processor: perflib: Use the "no limit" frequency QoS
Steven Rostedt (Google) rostedt@goodmis.org tracing: Fix trace_event_raw_event_synth() if else statement
Ilya Dryomov idryomov@gmail.com rbd: retrieve and check lock owner twice before blocklisting
Ilya Dryomov idryomov@gmail.com rbd: harden get_lock_owner_info() a bit
Ilya Dryomov idryomov@gmail.com rbd: make get_lock_owner_info() return a single locker or NULL
Joe Thornber ejt@redhat.com dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
Xiubo Li xiubli@redhat.com ceph: never send metrics if disable_send_metrics is set
Mark Brown broonie@kernel.org ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
Stefan Haberland sth@linux.ibm.com s390/dasd: fix hanging device after quiesce/resume
Jason Wang jasowang@redhat.com virtio-net: fix race between set queues and probe
Sean Christopherson seanjc@google.com KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Peter Zijlstra peterz@infradead.org locking/rtmutex: Fix task->pi_waiters integrity
Marc Zyngier maz@kernel.org irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
Jonas Gorski jonas.gorski@gmail.com irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
Alexander Steffen Alexander.Steffen@infineon.com tpm_tis: Explicitly check for error code
Trond Myklebust trond.myklebust@hammerspace.com nfsd: Remove incorrect check in nfsd4_validate_stateid
Christian Brauner brauner@kernel.org file: always lock position for FMODE_ATOMIC_POS
Filipe Manana fdmanana@suse.com btrfs: check for commit error at btrfs_attach_transaction_barrier()
Filipe Manana fdmanana@suse.com btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
Gilles Buloz Gilles.Buloz@kontron.com hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
Baskaran Kannan Baski.Kannan@amd.com hwmon: (k10temp) Enable AMD3255 Proc to show negative temperature
Luka Guzenko l.guzenko@web.de ALSA: hda/relatek: Enable Mute LED on HP 250 G8
Oliver Neukum oneukum@suse.com Revert "xhci: add quirk for host controllers that don't update endpoint DCS"
Chaoyuan Peng hedonistsmith@gmail.com tty: n_gsm: fix UAF in gsm_cleanup_mux
Zhang Shurong zhang_shurong@foxmail.com staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
Larry Finger Larry.Finger@lwfinger.net staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
Greg Kroah-Hartman gregkh@linuxfoundation.org Documentation: security-bugs.rst: clarify CVE handling
Greg Kroah-Hartman gregkh@linuxfoundation.org Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
Dan Carpenter dan.carpenter@linaro.org Revert "usb: xhci: tegra: Fix error check"
Ricardo Ribalda ribalda@chromium.org usb: xhci-mtk: set the dma max_seg_size
Frank Li Frank.Li@nxp.com usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
Łukasz Bartosik lb@semihalf.com USB: quirks: add quirk for Focusrite Scarlett
Guiting Shen aarongt.shen@gmail.com usb: ohci-at91: Fix the unhandle interrupt when resume
Jisheng Zhang jszhang@kernel.org usb: dwc3: don't reset device side if dwc3 was configured as host-only
Gratian Crisan gratian.crisan@ni.com usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
Jakub Vanek linuxtardis@gmail.com Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
Johan Hovold johan@kernel.org USB: serial: simple: sort driver entries
Oliver Neukum oneukum@suse.com USB: serial: simple: add Kaufmann RKS+CAN VCP
Mohsen Tahmasebi moh53n@moh53n.ir USB: serial: option: add Quectel EC200A module support
Jerry Meng jerry-meng@foxmail.com USB: serial: option: support Quectel EM060K_128
Samuel Holland samuel.holland@sifive.com serial: sifive: Fix sifive_serial_console_setup() section
Ruihong Luo colorsu1922@gmail.com serial: 8250_dw: Preserve original value of DLF register
Johan Hovold johan+linaro@kernel.org serial: qcom-geni: drop bogus runtime pm state update
Sean Christopherson seanjc@google.com KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
Sean Christopherson seanjc@google.com KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
Zqiang qiang.zhang1211@gmail.com USB: gadget: Fix the memory leak in raw_gadget driver
Frank Li Frank.Li@nxp.com usb: gadget: call usb_gadget_check_config() to verify UDC capability
Dan Carpenter dan.carpenter@linaro.org Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
Zheng Yejian zhengyejian1@huawei.com tracing: Fix warning in trace_buffered_event_disable()
Zheng Yejian zhengyejian1@huawei.com ring-buffer: Fix wrong stat of cpu_buffer->read
Arnd Bergmann arnd@arndb.de ata: pata_ns87415: mark ns87560_tf_read static
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Report correct WC error
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix an error handling mistake in psp_sw_init()
Yu Kuai yukuai3@huawei.com dm raid: protect md_stop() with 'reconfig_mutex'
Yu Kuai yukuai3@huawei.com dm raid: clean up four equivalent goto tags in raid_ctr()
Yu Kuai yukuai3@huawei.com dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
Bart Van Assche bvanassche@acm.org block: Fix a source code comment in include/uapi/linux/blkzoned.h
Matus Gajdos matuszpd@gmail.com ASoC: fsl_spdif: Silence output on stop
Gaosheng Cui cuigaosheng1@huawei.com drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Prevent handling any completions after qp destroy
Thomas Bogendoerfer tbogendoerfer@suse.de RDMA/mthca: Fix crash when polling CQ for shared QPs
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Fix data race on CQP request done
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Fix data race on CQP completion stats
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Add missing read barriers
Rob Clark robdclark@chromium.org drm/msm/adreno: Fix snapshot BINDLESS_DATA size
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
Dan Carpenter dan.carpenter@linaro.org RDMA/mlx4: Make check for invalid flags stricter
Fedor Pchelkin pchelkin@ispras.ru tipc: stop tipc crypto on failure in tipc_node_create
Yuanjun Gong ruc_gongyuanjun@163.com tipc: check return value of pskb_trim()
Yuanjun Gong ruc_gongyuanjun@163.com benet: fix return value check in be_lancer_xmit_workarounds()
Lin Ma linma@zju.edu.cn net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
Vladimir Oltean vladimir.oltean@nxp.com net/sched: mqprio: add extack to mqprio_parse_nlattr()
Vladimir Oltean vladimir.oltean@nxp.com net/sched: mqprio: refactor nlattr parsing to a separate function
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
Florian Westphal fw@strlen.de netfilter: nft_set_rbtree: fix overlap expiration walk
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Fix Kernel Panic during ndo_tx_timeout callback
Maxim Mikityanskiy maxtram95@gmail.com platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
Vincent Whitchurch vincent.whitchurch@axis.com net: stmmac: Apply redundant write work around on 4.xx too
Hangbin Liu liuhangbin@gmail.com team: reset team's flags when down link is P2P device
Hangbin Liu liuhangbin@gmail.com bonding: reset bond's flags when down link is P2P device
Jedrzej Jagielski jedrzej.jagielski@intel.com ice: Fix memory management in ice_ethtool_fdir.c
Stewart Smith trawets@amazon.com tcp: Reduce chance of collisions in inet6_hashfn().
Maciej Żenczykowski maze@google.com ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
Yuanjun Gong ruc_gongyuanjun@163.com ethernet: atheros: fix return value check in atl1e_tso_csum()
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
Jiri Benc jbenc@redhat.com vxlan: calculate correct header length for GPE
Roopa Prabhu roopa@nvidia.com vxlan: move to its own directory
Jijie Shao shaojijie@huawei.com net: hns3: fix wrong bw weight of disabled tc issue
Jijie Shao shaojijie@huawei.com net: hns3: fix wrong tc bandwidth weight data issue
Jiawen Wu jiawenwu@trustnetic.com net: phy: marvell10g: fix 88x3310 power up
Jacob Keller jacob.e.keller@intel.com iavf: check for removal state before IAVF_FLAG_PF_COMMS_FAILED
Jacob Keller jacob.e.keller@intel.com iavf: fix potential deadlock on allocation failure
Wang Ming machel@vivo.com i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
Sakari Ailus sakari.ailus@linux.intel.com media: staging: atomisp: select V4L2_FWNODE
Srinivas Kandagatla srinivas.kandagatla@linaro.org soundwire: qcom: update status correctly with mask
Adrien Thierry athierry@redhat.com phy: qcom-snps-femto-v2: properly enable ref clock
Adrien Thierry athierry@redhat.com phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org phy: qcom-snps: correct struct qcom_snps_hsphy kerneldoc
Yuan Can yuancan@huawei.com phy: qcom-snps: Use dev_err_probe() to simplify code
Zhang Yi yi.zhang@huawei.com jbd2: fix a race when checking checkpoint buffer busy
Zhang Yi yi.zhang@huawei.com jbd2: remove journal_clean_one_cp_list()
Zhang Yi yi.zhang@huawei.com jbd2: remove t_checkpoint_io_list
Guchun Chen guchun.chen@amd.com drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
Flora Cui flora.cui@amd.com drm/amdgpu: fix vkms crtc settings
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix hang in task management
Arun Easi aeasi@marvell.com scsi: qla2xxx: Add debug prints in the device remove path
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix task management cmd fail due to unavailable resource
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix task management cmd failure
Quinn Tran qutran@marvell.com scsi: qla2xxx: Multi-que support for TMF
Gaosheng Cui cuigaosheng1@huawei.com scsi: qla2xxx: Remove unused declarations for qla2xxx
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails
Masami Hiramatsu (Google) mhiramat@kernel.org Revert "tracing: Add "(fault)" name injection to kernel probes"
Steven Rostedt (Google) rostedt@goodmis.org tracing: Allow synthetic events to pass around stacktraces
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Fix to avoid double count of the string length on the array
Masami Hiramatsu (Google) mhiramat@kernel.org tracing/probes: Add symstr type for dynamic events
Heiner Kallweit hkallweit1@gmail.com pwm: meson: fix handling of period/duty if greater than UINT_MAX
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: meson: Simplify duplicated per-channel tracking
Bharath SM bharathsm@microsoft.com cifs: if deferred close is disabled then close files immediately
Namjae Jeon linkinjeon@kernel.org ksmbd: remove internal.h include
Paulo Alcantara pc@cjr.nz cifs: use fs_context for automounts
Steve French stfrench@microsoft.com cifs: missing directory in MAINTAINERS file
Christian König christian.koenig@amd.com drm/ttm: never consider pinned BOs for eviction&swap
Hui Li caelli@tencent.com tty: fix hang on tty device with no_room set
Ilpo Järvinen ilpo.jarvinen@linux.intel.com n_tty: Rename tail to old_tail in n_tty_read()
Thomas Hellström thomas.hellstrom@linux.intel.com drm/ttm: Don't leak a resource on eviction error
Thomas Hellström thomas.hellstrom@linux.intel.com drm/ttm: Don't print error message if eviction was interrupted
Alexander Aring aahringo@redhat.com fs: dlm: interrupt posix locks only when process is killed
Alexander Aring aahringo@redhat.com dlm: rearrange async condition return
Alexander Aring aahringo@redhat.com dlm: cleanup plock_op vs plock_xop
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Don't advertise MSI-X in PCIe capabilities
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Fix window mapping and address translation for endpoint
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Remove writes to unused registers
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI/ASPM: Avoid link retraining race
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI/ASPM: Factor out pcie_wait_for_retrain()
Bjorn Helgaas bhelgaas@google.com PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
Christophe JAILLET christophe.jaillet@wanadoo.fr i2c: nomadik: Remove a useless call in the remove function
Andi Shyti andi.shyti@kernel.org i2c: nomadik: Use devm_clk_get_enabled()
Andi Shyti andi.shyti@kernel.org i2c: nomadik: Remove unnecessary goto label
Markus Elfring elfring@users.sourceforge.net i2c: Improve size determinations
Markus Elfring elfring@users.sourceforge.net i2c: Delete error messages for failed memory allocations
Filipe Manana fdmanana@suse.com btrfs: fix race between quota disable and relocation
Bartosz Golaszewski bartosz.golaszewski@linaro.org gpio: mvebu: fix irq domain leak
Uwe Kleine-König u.kleine-koenig@pengutronix.de gpio: mvebu: Make use of devm_pwmchip_add
Hans de Goede hdegoede@redhat.com gpio: tps68470: Make tps68470_gpio_output() always set the initial value
Ondrej Mosnacek omosnace@redhat.com io_uring: don't audit the capability check in io_uring_create()
Claudio Imbrenda imbrenda@linux.ibm.com KVM: s390: pv: fix index value of replaced ASCE
Zhihao Cheng chengzhihao1@huawei.com jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
-------------
Diffstat:
Documentation/admin-guide/security-bugs.rst | 37 ++- Documentation/trace/kprobetrace.rst | 8 +- MAINTAINERS | 1 + Makefile | 4 +- arch/s390/mm/gmap.c | 1 + arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 3 +- arch/x86/kvm/svm/svm.c | 6 + arch/x86/kvm/vmx/vmx.c | 41 +++- arch/x86/kvm/x86.c | 34 ++- drivers/acpi/processor_perflib.c | 42 +++- drivers/ata/pata_ns87415.c | 2 +- drivers/block/rbd.c | 124 +++++++--- drivers/char/tpm/tpm_tis_core.c | 10 +- drivers/cpufreq/intel_pstate.c | 14 -- drivers/gpio/gpio-mvebu.c | 26 ++- drivers/gpio/gpio-tps68470.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 45 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h | 5 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h | 13 -- drivers/gpu/drm/ttm/ttm_bo.c | 25 +- drivers/hwmon/k10temp.c | 17 +- drivers/hwmon/nct7802.c | 2 +- drivers/i2c/busses/i2c-ibm_iic.c | 4 +- drivers/i2c/busses/i2c-nomadik.c | 42 +--- drivers/i2c/busses/i2c-sh7760.c | 3 +- drivers/i2c/busses/i2c-tiny-usb.c | 4 +- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 + drivers/infiniband/hw/bnxt_re/qplib_fp.c | 18 ++ drivers/infiniband/hw/bnxt_re/qplib_fp.h | 1 + drivers/infiniband/hw/irdma/ctrl.c | 31 ++- drivers/infiniband/hw/irdma/defs.h | 46 ++-- drivers/infiniband/hw/irdma/hw.c | 3 +- drivers/infiniband/hw/irdma/main.h | 2 +- drivers/infiniband/hw/irdma/puda.c | 6 + drivers/infiniband/hw/irdma/type.h | 2 + drivers/infiniband/hw/irdma/uk.c | 3 + drivers/infiniband/hw/irdma/utils.c | 8 +- drivers/infiniband/hw/mlx4/qp.c | 18 +- drivers/infiniband/hw/mthca/mthca_qp.c | 2 +- drivers/irqchip/irq-bcm6345-l1.c | 14 +- drivers/irqchip/irq-gic-v3-its.c | 75 +++--- drivers/md/dm-cache-policy-smq.c | 28 ++- drivers/md/dm-raid.c | 20 +- drivers/md/md.c | 2 + drivers/net/Makefile | 2 +- drivers/net/bonding/bond_main.c | 5 + drivers/net/can/usb/gs_usb.c | 2 + drivers/net/ethernet/atheros/atl1e/atl1e_main.c | 7 +- drivers/net/ethernet/emulex/benet/be_main.c | 3 +- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 17 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 3 +- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 3 +- drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 +- drivers/net/ethernet/intel/iavf/iavf_main.c | 11 +- drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c | 26 ++- drivers/net/ethernet/intel/igc/igc_main.c | 40 +++- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +- drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 +- drivers/net/phy/marvell10g.c | 7 + drivers/net/team/team.c | 9 + drivers/net/virtio_net.c | 4 +- drivers/net/vxlan/Makefile | 7 + drivers/net/{vxlan.c => vxlan/vxlan_core.c} | 23 +- drivers/pci/controller/pcie-rockchip-ep.c | 156 ++++++------- drivers/pci/controller/pcie-rockchip.h | 40 ++-- drivers/pci/pcie/aspm.c | 55 +++-- drivers/phy/hisilicon/phy-hisi-inno-usb2.c | 2 +- drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 86 ++++--- drivers/platform/x86/msi-laptop.c | 8 +- drivers/pwm/pwm-meson.c | 25 +- drivers/s390/block/dasd_ioctl.c | 1 + drivers/scsi/qla2xxx/qla_attr.c | 3 + drivers/scsi/qla2xxx/qla_def.h | 27 +++ drivers/scsi/qla2xxx/qla_gbl.h | 14 +- drivers/scsi/qla2xxx/qla_init.c | 256 +++++++++++++++++++-- drivers/scsi/qla2xxx/qla_iocb.c | 33 ++- drivers/scsi/qla2xxx/qla_isr.c | 26 ++- drivers/scsi/qla2xxx/qla_target.h | 2 - drivers/soundwire/qcom.c | 2 +- drivers/staging/ks7010/ks_wlan_net.c | 6 +- drivers/staging/media/atomisp/Kconfig | 1 + drivers/staging/rtl8712/rtl871x_xmit.c | 43 +++- drivers/staging/rtl8712/xmit_linux.c | 6 + drivers/tty/n_gsm.c | 4 +- drivers/tty/n_tty.c | 29 ++- drivers/tty/serial/8250/8250_dwlib.c | 6 +- drivers/tty/serial/qcom_geni_serial.c | 7 - drivers/tty/serial/sifive.c | 2 +- drivers/usb/cdns3/cdns3-gadget.c | 4 +- drivers/usb/core/quirks.c | 4 + drivers/usb/dwc3/core.c | 20 +- drivers/usb/dwc3/core.h | 3 - drivers/usb/dwc3/dwc3-pci.c | 6 +- drivers/usb/gadget/composite.c | 4 + drivers/usb/gadget/legacy/raw_gadget.c | 12 +- drivers/usb/gadget/udc/tegra-xudc.c | 8 +- drivers/usb/host/ohci-at91.c | 8 +- drivers/usb/host/xhci-mtk.c | 1 + drivers/usb/host/xhci-pci.c | 4 +- drivers/usb/host/xhci-ring.c | 25 +- drivers/usb/host/xhci-tegra.c | 8 +- drivers/usb/serial/option.c | 6 + drivers/usb/serial/usb-serial-simple.c | 73 +++--- fs/btrfs/qgroup.c | 18 +- fs/btrfs/transaction.c | 10 +- fs/ceph/metric.c | 2 +- fs/cifs/cifs_dfs_ref.c | 100 ++++---- fs/cifs/file.c | 4 +- fs/dlm/plock.c | 100 ++++---- fs/file.c | 6 +- fs/internal.h | 2 - fs/jbd2/checkpoint.c | 197 ++++++---------- fs/jbd2/commit.c | 3 +- fs/jbd2/transaction.c | 17 +- fs/ksmbd/vfs.c | 2 - fs/nfsd/nfs4state.c | 2 - include/linux/jbd2.h | 7 +- include/linux/namei.h | 2 + include/net/ipv6.h | 8 +- include/net/vxlan.h | 13 +- include/trace/events/jbd2.h | 12 +- include/uapi/linux/blkzoned.h | 10 +- io_uring/io_uring.c | 10 +- kernel/locking/rtmutex.c | 172 +++++++++----- kernel/locking/rtmutex_api.c | 2 +- kernel/locking/rtmutex_common.h | 47 ++-- kernel/locking/ww_mutex.h | 12 +- kernel/trace/ring_buffer.c | 22 +- kernel/trace/trace.c | 2 +- kernel/trace/trace.h | 6 + kernel/trace/trace_events.c | 14 +- kernel/trace/trace_events_hist.c | 7 +- kernel/trace/trace_events_synth.c | 80 ++++++- kernel/trace/trace_probe.c | 46 ++-- kernel/trace/trace_probe.h | 16 +- kernel/trace/trace_probe_kernel.h | 30 +-- kernel/trace/trace_probe_tmpl.h | 59 ++++- kernel/trace/trace_synth.h | 1 + kernel/trace/trace_uprobe.c | 3 +- net/ceph/messenger.c | 1 + net/ipv6/addrconf.c | 14 +- net/netfilter/nf_tables_api.c | 5 +- net/netfilter/nft_immediate.c | 27 ++- net/netfilter/nft_set_rbtree.c | 20 +- net/sched/sch_mqprio.c | 144 ++++++++---- net/tipc/crypto.c | 3 +- net/tipc/node.c | 2 +- sound/pci/hda/patch_realtek.c | 1 + sound/soc/codecs/cs42l51-i2c.c | 6 + sound/soc/codecs/cs42l51.c | 7 - sound/soc/codecs/cs42l51.h | 1 - sound/soc/codecs/wm8904.c | 3 + sound/soc/fsl/fsl_spdif.c | 2 + tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +- tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 20 +- virt/kvm/kvm_main.c | 24 ++ 160 files changed, 2097 insertions(+), 1256 deletions(-)
From: Zhihao Cheng chengzhihao1@huawei.com
[ Upstream commit e34c8dd238d0c9368b746480f313055f5bab5040 ]
Following process,
jbd2_journal_commit_transaction // there are several dirty buffer heads in transaction->t_checkpoint_list P1 wb_workfn jbd2_log_do_checkpoint if (buffer_locked(bh)) // false __block_write_full_page trylock_buffer(bh) test_clear_buffer_dirty(bh) if (!buffer_dirty(bh)) __jbd2_journal_remove_checkpoint(jh) if (buffer_write_io_error(bh)) // false >> bh IO error occurs << jbd2_cleanup_journal_tail __jbd2_update_log_tail jbd2_write_superblock // The bh won't be replayed in next mount. , which could corrupt the ext4 image, fetch a reproducer in [Link].
Since writeback process clears buffer dirty after locking buffer head, we can fix it by try locking buffer and check dirtiness while buffer is locked, the buffer head can be removed if it is neither dirty nor locked.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490 Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-5-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/checkpoint.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index fe8bb031b7d7f..d2aba55833f92 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -221,20 +221,6 @@ int jbd2_log_do_checkpoint(journal_t *journal) jh = transaction->t_checkpoint_list; bh = jh2bh(jh);
- /* - * The buffer may be writing back, or flushing out in the - * last couple of cycles, or re-adding into a new transaction, - * need to check it again until it's unlocked. - */ - if (buffer_locked(bh)) { - get_bh(bh); - spin_unlock(&journal->j_list_lock); - wait_on_buffer(bh); - /* the journal_head may have gone by now */ - BUFFER_TRACE(bh, "brelse"); - __brelse(bh); - goto retry; - } if (jh->b_transaction != NULL) { transaction_t *t = jh->b_transaction; tid_t tid = t->t_tid; @@ -269,7 +255,22 @@ int jbd2_log_do_checkpoint(journal_t *journal) spin_lock(&journal->j_list_lock); goto restart; } - if (!buffer_dirty(bh)) { + if (!trylock_buffer(bh)) { + /* + * The buffer is locked, it may be writing back, or + * flushing out in the last couple of cycles, or + * re-adding into a new transaction, need to check + * it again until it's unlocked. + */ + get_bh(bh); + spin_unlock(&journal->j_list_lock); + wait_on_buffer(bh); + /* the journal_head may have gone by now */ + BUFFER_TRACE(bh, "brelse"); + __brelse(bh); + goto retry; + } else if (!buffer_dirty(bh)) { + unlock_buffer(bh); BUFFER_TRACE(bh, "remove from checkpoint"); /* * If the transaction was released or the checkpoint @@ -279,6 +280,7 @@ int jbd2_log_do_checkpoint(journal_t *journal) !transaction->t_checkpoint_list) goto out; } else { + unlock_buffer(bh); /* * We are about to write the buffer, it could be * raced by some other transaction shrink or buffer
From: Claudio Imbrenda imbrenda@linux.ibm.com
[ Upstream commit c2fceb59bbda16468bda82b002383bff59de89ab ]
The index field of the struct page corresponding to a guest ASCE should be 0. When replacing the ASCE in s390_replace_asce(), the index of the new ASCE should also be set to 0.
Having the wrong index might lead to the wrong addresses being passed around when notifying pte invalidations, and eventually to validity intercepts (VM crash) if the prefix gets unmapped and the notifier gets called with the wrong address.
Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org Fixes: faa2f72cb356 ("KVM: s390: pv: leak the topmost page table when destroy fails") Reviewed-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Claudio Imbrenda imbrenda@linux.ibm.com Message-ID: 20230705111937.33472-3-imbrenda@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/mm/gmap.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index ff40bf92db43a..a2c872de29a66 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2791,6 +2791,7 @@ int s390_replace_asce(struct gmap *gmap) page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER); if (!page) return -ENOMEM; + page->index = 0; table = page_to_virt(page); memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
From: Ondrej Mosnacek omosnace@redhat.com
[ Upstream commit 6adc2272aaaf84f34b652cf77f770c6fcc4b8336 ]
The check being unconditional may lead to unwanted denials reported by LSMs when a process has the capability granted by DAC, but denied by an LSM. In the case of SELinux such denials are a problem, since they can't be effectively filtered out via the policy and when not silenced, they produce noise that may hide a true problem or an attack.
Since not having the capability merely means that the created io_uring context will be accounted against the current user's RLIMIT_MEMLOCK limit, we can disable auditing of denials for this check by using ns_capable_noaudit() instead of capable().
Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2193317 Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Reviewed-by: Jeff Moyer jmoyer@redhat.com Link: https://lore.kernel.org/r/20230718115607.65652-1-omosnace@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index d7f87157be9aa..f65c6811ede81 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -10602,7 +10602,7 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p, if (!ctx) return -ENOMEM; ctx->compat = in_compat_syscall(); - if (!capable(CAP_IPC_LOCK)) + if (!ns_capable_noaudit(&init_user_ns, CAP_IPC_LOCK)) ctx->user = get_uid(current_user());
/*
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 5a7adc6c1069ce31ef4f606ae9c05592c80a6ab5 ]
Make tps68470_gpio_output() call tps68470_gpio_set() for output-only pins too, so that the initial value passed to gpiod_direction_output() is honored for these pins too.
Fixes: 275b13a65547 ("gpio: Add support for TPS68470 GPIOs") Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Daniel Scally dan.scally@ideasonboard.com Tested-by: Daniel Scally dan.scally@ideasonboard.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tps68470.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpio/gpio-tps68470.c b/drivers/gpio/gpio-tps68470.c index 423b7bc30ae88..03a523a6d6fa4 100644 --- a/drivers/gpio/gpio-tps68470.c +++ b/drivers/gpio/gpio-tps68470.c @@ -91,13 +91,13 @@ static int tps68470_gpio_output(struct gpio_chip *gc, unsigned int offset, struct tps68470_gpio_data *tps68470_gpio = gpiochip_get_data(gc); struct regmap *regmap = tps68470_gpio->tps68470_regmap;
+ /* Set the initial value */ + tps68470_gpio_set(gc, offset, value); + /* rest are always outputs */ if (offset >= TPS68470_N_REGULAR_GPIO) return 0;
- /* Set the initial value */ - tps68470_gpio_set(gc, offset, value); - return regmap_update_bits(regmap, TPS68470_GPIO_CTL_REG_A(offset), TPS68470_GPIO_MODE_MASK, TPS68470_GPIO_MODE_OUT_CMOS);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 1945063eb59e64d2919cb14d54d081476d9e53bb ]
This allows to get rid of a call to pwmchip_remove() in the error path. There is no .remove function for this driver, so this change fixes a resource leak when a gpio-mvebu device is unbound.
Fixes: 757642f9a584 ("gpio: mvebu: Add limited PWM support") Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Andy Shevchenko andy@kernel.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-mvebu.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c index a245bfd5a6173..4d64b3358c50c 100644 --- a/drivers/gpio/gpio-mvebu.c +++ b/drivers/gpio/gpio-mvebu.c @@ -874,7 +874,7 @@ static int mvebu_pwm_probe(struct platform_device *pdev,
spin_lock_init(&mvpwm->lock);
- return pwmchip_add(&mvpwm->chip); + return devm_pwmchip_add(dev, &mvpwm->chip); }
#ifdef CONFIG_DEBUG_FS @@ -1244,8 +1244,7 @@ static int mvebu_gpio_probe(struct platform_device *pdev) if (!mvchip->domain) { dev_err(&pdev->dev, "couldn't allocate irq domain %s (DT).\n", mvchip->chip.label); - err = -ENODEV; - goto err_pwm; + return -ENODEV; }
err = irq_alloc_domain_generic_chips( @@ -1297,9 +1296,6 @@ static int mvebu_gpio_probe(struct platform_device *pdev)
err_domain: irq_domain_remove(mvchip->domain); -err_pwm: - pwmchip_remove(&mvchip->mvpwm->chip); - return err; }
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit 644ee70267a934be27370f9aa618b29af7290544 ]
Uwe Kleine-König pointed out we still have one resource leak in the mvebu driver triggered on driver detach. Let's address it with a custom devm action.
Fixes: 812d47889a8e ("gpio/mvebu: Use irq_domain_add_linear") Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-mvebu.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c index 4d64b3358c50c..b965513f44fea 100644 --- a/drivers/gpio/gpio-mvebu.c +++ b/drivers/gpio/gpio-mvebu.c @@ -1112,6 +1112,13 @@ static int mvebu_gpio_probe_syscon(struct platform_device *pdev, return 0; }
+static void mvebu_gpio_remove_irq_domain(void *data) +{ + struct irq_domain *domain = data; + + irq_domain_remove(domain); +} + static int mvebu_gpio_probe(struct platform_device *pdev) { struct mvebu_gpio_chip *mvchip; @@ -1247,13 +1254,18 @@ static int mvebu_gpio_probe(struct platform_device *pdev) return -ENODEV; }
+ err = devm_add_action_or_reset(&pdev->dev, mvebu_gpio_remove_irq_domain, + mvchip->domain); + if (err) + return err; + err = irq_alloc_domain_generic_chips( mvchip->domain, ngpios, 2, np->name, handle_level_irq, IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_LEVEL, 0, 0); if (err) { dev_err(&pdev->dev, "couldn't allocate irq chips %s (DT).\n", mvchip->chip.label); - goto err_domain; + return err; }
/* @@ -1293,10 +1305,6 @@ static int mvebu_gpio_probe(struct platform_device *pdev) }
return 0; - -err_domain: - irq_domain_remove(mvchip->domain); - return err; }
static struct platform_driver mvebu_gpio_driver = {
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 8a4a0b2a3eaf75ca8854f856ef29690c12b2f531 ]
If we disable quotas while we have a relocation of a metadata block group that has extents belonging to the quota root, we can cause the relocation to fail with -ENOENT. This is because relocation builds backref nodes for extents of the quota root and later needs to walk the backrefs and access the quota root - however if in between a task disables quotas, it results in deleting the quota root from the root tree (with btrfs_del_root(), called from btrfs_quota_disable().
This can be sporadically triggered by test case btrfs/255 from fstests:
$ ./check btrfs/255 FSTYP -- btrfs PLATFORM -- Linux/x86_64 debian0 6.4.0-rc6-btrfs-next-134+ #1 SMP PREEMPT_DYNAMIC Thu Jun 15 11:59:28 WEST 2023 MKFS_OPTIONS -- /dev/sdc MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1
btrfs/255 6s ... _check_dmesg: something found in dmesg (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.dmesg) - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad) # --- tests/btrfs/255.out 2023-03-02 21:47:53.876609426 +0000 # +++ /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad 2023-06-16 10:20:39.267563212 +0100 # @@ -1,2 +1,4 @@ # QA output created by 255 # +ERROR: error during balancing '/home/fdmanana/btrfs-tests/scratch_1': No such file or directory # +There may be more info in syslog - try dmesg | tail # Silence is golden # ... (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/255.out /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad' to see the entire diff) Ran: btrfs/255 Failures: btrfs/255 Failed 1 of 1 tests
To fix this make the quota disable operation take the cleaner mutex, as relocation of a block group also takes this mutex. This is also what we do when deleting a subvolume/snapshot, we take the cleaner mutex in the cleaner kthread (at cleaner_kthread()) and then we call btrfs_del_root() at btrfs_drop_snapshot() while under the protection of the cleaner mutex.
Fixes: bed92eae26cc ("Btrfs: qgroup implementation and prototypes") CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/qgroup.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index d408d1dfde7c8..d46a070275ff5 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1201,12 +1201,23 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) int ret = 0;
/* - * We need to have subvol_sem write locked, to prevent races between - * concurrent tasks trying to disable quotas, because we will unlock - * and relock qgroup_ioctl_lock across BTRFS_FS_QUOTA_ENABLED changes. + * We need to have subvol_sem write locked to prevent races with + * snapshot creation. */ lockdep_assert_held_write(&fs_info->subvol_sem);
+ /* + * Lock the cleaner mutex to prevent races with concurrent relocation, + * because relocation may be building backrefs for blocks of the quota + * root while we are deleting the root. This is like dropping fs roots + * of deleted snapshots/subvolumes, we need the same protection. + * + * This also prevents races between concurrent tasks trying to disable + * quotas, because we will unlock and relock qgroup_ioctl_lock across + * BTRFS_FS_QUOTA_ENABLED changes. + */ + mutex_lock(&fs_info->cleaner_mutex); + mutex_lock(&fs_info->qgroup_ioctl_lock); if (!fs_info->quota_root) goto out; @@ -1287,6 +1298,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) btrfs_end_transaction(trans); else if (trans) ret = btrfs_end_transaction(trans); + mutex_unlock(&fs_info->cleaner_mutex);
return ret; }
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit 6b3b21a8542fd2fb6ffc61bc13b9419f0c58ebad ]
These issues were detected by using the Coccinelle software.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-ibm_iic.c | 4 +--- drivers/i2c/busses/i2c-nomadik.c | 1 - drivers/i2c/busses/i2c-sh7760.c | 1 - drivers/i2c/busses/i2c-tiny-usb.c | 4 +--- 4 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/i2c/busses/i2c-ibm_iic.c b/drivers/i2c/busses/i2c-ibm_iic.c index 9f71daf6db64b..c073f5b8833a2 100644 --- a/drivers/i2c/busses/i2c-ibm_iic.c +++ b/drivers/i2c/busses/i2c-ibm_iic.c @@ -694,10 +694,8 @@ static int iic_probe(struct platform_device *ofdev) int ret;
dev = kzalloc(sizeof(*dev), GFP_KERNEL); - if (!dev) { - dev_err(&ofdev->dev, "failed to allocate device data\n"); + if (!dev) return -ENOMEM; - }
platform_set_drvdata(ofdev, dev);
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index a2d12a5b1c34c..05eaae5aeb180 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -972,7 +972,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
dev = devm_kzalloc(&adev->dev, sizeof(struct nmk_i2c_dev), GFP_KERNEL); if (!dev) { - dev_err(&adev->dev, "cannot allocate memory\n"); ret = -ENOMEM; goto err_no_mem; } diff --git a/drivers/i2c/busses/i2c-sh7760.c b/drivers/i2c/busses/i2c-sh7760.c index 319d1fa617c88..a0ccc5d009874 100644 --- a/drivers/i2c/busses/i2c-sh7760.c +++ b/drivers/i2c/busses/i2c-sh7760.c @@ -445,7 +445,6 @@ static int sh7760_i2c_probe(struct platform_device *pdev)
id = kzalloc(sizeof(struct cami2c), GFP_KERNEL); if (!id) { - dev_err(&pdev->dev, "no mem for private data\n"); ret = -ENOMEM; goto out0; } diff --git a/drivers/i2c/busses/i2c-tiny-usb.c b/drivers/i2c/busses/i2c-tiny-usb.c index 7279ca0eaa2d0..d1fa9ff5aeab4 100644 --- a/drivers/i2c/busses/i2c-tiny-usb.c +++ b/drivers/i2c/busses/i2c-tiny-usb.c @@ -226,10 +226,8 @@ static int i2c_tiny_usb_probe(struct usb_interface *interface,
/* allocate memory for our device state and initialize it */ dev = kzalloc(sizeof(*dev), GFP_KERNEL); - if (dev == NULL) { - dev_err(&interface->dev, "Out of memory\n"); + if (!dev) goto error; - }
dev->usb_dev = usb_get_dev(interface_to_usbdev(interface)); dev->interface = interface;
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit 06e989578232da33a7fe96b04191b862af8b2cec ]
Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 2 +- drivers/i2c/busses/i2c-sh7760.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 05eaae5aeb180..5004b9dd98563 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -970,7 +970,7 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) struct i2c_vendor_data *vendor = id->data; u32 max_fifo_threshold = (vendor->fifodepth / 2) - 1;
- dev = devm_kzalloc(&adev->dev, sizeof(struct nmk_i2c_dev), GFP_KERNEL); + dev = devm_kzalloc(&adev->dev, sizeof(*dev), GFP_KERNEL); if (!dev) { ret = -ENOMEM; goto err_no_mem; diff --git a/drivers/i2c/busses/i2c-sh7760.c b/drivers/i2c/busses/i2c-sh7760.c index a0ccc5d009874..051b904cb35f6 100644 --- a/drivers/i2c/busses/i2c-sh7760.c +++ b/drivers/i2c/busses/i2c-sh7760.c @@ -443,7 +443,7 @@ static int sh7760_i2c_probe(struct platform_device *pdev) goto out0; }
- id = kzalloc(sizeof(struct cami2c), GFP_KERNEL); + id = kzalloc(sizeof(*id), GFP_KERNEL); if (!id) { ret = -ENOMEM; goto out0;
From: Andi Shyti andi.shyti@kernel.org
[ Upstream commit 1c5d33fff0d375e4ab7c4261dc62a286babbb4c6 ]
The err_no_mem goto label doesn't do anything. Remove it.
Signed-off-by: Andi Shyti andi.shyti@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 5004b9dd98563..8b9577318388e 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -971,10 +971,9 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) u32 max_fifo_threshold = (vendor->fifodepth / 2) - 1;
dev = devm_kzalloc(&adev->dev, sizeof(*dev), GFP_KERNEL); - if (!dev) { - ret = -ENOMEM; - goto err_no_mem; - } + if (!dev) + return -ENOMEM; + dev->vendor = vendor; dev->adev = adev; nmk_i2c_of_probe(np, dev); @@ -995,30 +994,27 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
dev->virtbase = devm_ioremap(&adev->dev, adev->res.start, resource_size(&adev->res)); - if (!dev->virtbase) { - ret = -ENOMEM; - goto err_no_mem; - } + if (!dev->virtbase) + return -ENOMEM;
dev->irq = adev->irq[0]; ret = devm_request_irq(&adev->dev, dev->irq, i2c_irq_handler, 0, DRIVER_NAME, dev); if (ret) { dev_err(&adev->dev, "cannot claim the irq %d\n", dev->irq); - goto err_no_mem; + return ret; }
dev->clk = devm_clk_get(&adev->dev, NULL); if (IS_ERR(dev->clk)) { dev_err(&adev->dev, "could not get i2c clock\n"); - ret = PTR_ERR(dev->clk); - goto err_no_mem; + return PTR_ERR(dev->clk); }
ret = clk_prepare_enable(dev->clk); if (ret) { dev_err(&adev->dev, "can't prepare_enable clock\n"); - goto err_no_mem; + return ret; }
init_hw(dev); @@ -1049,7 +1045,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
err_no_adap: clk_disable_unprepare(dev->clk); - err_no_mem:
return ret; }
From: Andi Shyti andi.shyti@kernel.org
[ Upstream commit 9c7174db4cdd111e10d19eed5c36fd978a14c8a2 ]
Replace the pair of functions, devm_clk_get() and clk_prepare_enable(), with a single function devm_clk_get_enabled().
Signed-off-by: Andi Shyti andi.shyti@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 8b9577318388e..2141ba05dfece 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -1005,18 +1005,12 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) return ret; }
- dev->clk = devm_clk_get(&adev->dev, NULL); + dev->clk = devm_clk_get_enabled(&adev->dev, NULL); if (IS_ERR(dev->clk)) { - dev_err(&adev->dev, "could not get i2c clock\n"); + dev_err(&adev->dev, "could enable i2c clock\n"); return PTR_ERR(dev->clk); }
- ret = clk_prepare_enable(dev->clk); - if (ret) { - dev_err(&adev->dev, "can't prepare_enable clock\n"); - return ret; - } - init_hw(dev);
adap = &dev->adap; @@ -1037,16 +1031,11 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
ret = i2c_add_adapter(adap); if (ret) - goto err_no_adap; + return ret;
pm_runtime_put(&adev->dev);
return 0; - - err_no_adap: - clk_disable_unprepare(dev->clk); - - return ret; }
static void nmk_i2c_remove(struct amba_device *adev) @@ -1060,7 +1049,6 @@ static void nmk_i2c_remove(struct amba_device *adev) clear_all_interrupts(dev); /* disable the controller */ i2c_clr_bit(dev->virtbase + I2C_CR, I2C_CR_PE); - clk_disable_unprepare(dev->clk); release_mem_region(res->start, resource_size(res)); }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 05f933d5f7318b03ff2028c1704dc867ac16f2c7 ]
Since commit 235602146ec9 ("i2c-nomadik: turn the platform driver to an amba driver"), there is no more request_mem_region() call in this driver.
So remove the release_mem_region() call from the remove function which is likely a left over.
Fixes: 235602146ec9 ("i2c-nomadik: turn the platform driver to an amba driver") Cc: stable@vger.kernel.org # v3.6+ Acked-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 2141ba05dfece..9c5d66bd6dc1c 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -1040,7 +1040,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
static void nmk_i2c_remove(struct amba_device *adev) { - struct resource *res = &adev->res; struct nmk_i2c_dev *dev = amba_get_drvdata(adev);
i2c_del_adapter(&dev->adap); @@ -1049,7 +1048,6 @@ static void nmk_i2c_remove(struct amba_device *adev) clear_all_interrupts(dev); /* disable the controller */ i2c_clr_bit(dev->virtbase + I2C_CR, I2C_CR_PE); - release_mem_region(res->start, resource_size(res)); }
static struct i2c_vendor_data vendor_stn8815 = {
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit f5297a01ee805d7fa569d288ed65fc0f9ac9b03d ]
"pcie_retrain_link" is not a question with a true/false answer, so "bool" isn't quite the right return type. Return 0 for success or -ETIMEDOUT if the retrain failed. No functional change intended.
[bhelgaas: based on Ilpo's patch below] Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: e7e39756363a ("PCI/ASPM: Avoid link retraining race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index c58294f53fcd1..e7f742ff31c3c 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -192,7 +192,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; }
-static bool pcie_retrain_link(struct pcie_link_state *link) +static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; unsigned long end_jiffies; @@ -219,7 +219,9 @@ static bool pcie_retrain_link(struct pcie_link_state *link) break; msleep(1); } while (time_before(jiffies, end_jiffies)); - return !(reg16 & PCI_EXP_LNKSTA_LT); + if (reg16 & PCI_EXP_LNKSTA_LT) + return -ETIMEDOUT; + return 0; }
/* @@ -288,15 +290,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link) reg16 &= ~PCI_EXP_LNKCTL_CCC; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
- if (pcie_retrain_link(link)) - return; + if (pcie_retrain_link(link)) {
- /* Training failed. Restore common clock configurations */ - pci_err(parent, "ASPM: Could not configure common clock\n"); - list_for_each_entry(child, &linkbus->devices, bus_list) - pcie_capability_write_word(child, PCI_EXP_LNKCTL, + /* Training failed. Restore common clock configurations */ + pci_err(parent, "ASPM: Could not configure common clock\n"); + list_for_each_entry(child, &linkbus->devices, bus_list) + pcie_capability_write_word(child, PCI_EXP_LNKCTL, child_reg[PCI_FUNC(child->devfn)]); - pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); + pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); + } }
/* Convert L0s latency encoding to ns */
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 9c7f136433d26592cb4d9cd00b4e15c33d9797c6 ]
Factor pcie_wait_for_retrain() out from pcie_retrain_link(). No functional change intended.
[bhelgaas: split out from https: //lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: e7e39756363a ("PCI/ASPM: Avoid link retraining race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index e7f742ff31c3c..5e09cbb563366 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -192,10 +192,26 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; }
+static int pcie_wait_for_retrain(struct pci_dev *pdev) +{ + unsigned long end_jiffies; + u16 reg16; + + /* Wait for Link Training to be cleared by hardware */ + end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; + do { + pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, ®16); + if (!(reg16 & PCI_EXP_LNKSTA_LT)) + return 0; + msleep(1); + } while (time_before(jiffies, end_jiffies)); + + return -ETIMEDOUT; +} + static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; - unsigned long end_jiffies; u16 reg16;
pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); @@ -211,17 +227,7 @@ static int pcie_retrain_link(struct pcie_link_state *link) pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); }
- /* Wait for link training end. Break out after waiting for timeout */ - end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; - do { - pcie_capability_read_word(parent, PCI_EXP_LNKSTA, ®16); - if (!(reg16 & PCI_EXP_LNKSTA_LT)) - break; - msleep(1); - } while (time_before(jiffies, end_jiffies)); - if (reg16 & PCI_EXP_LNKSTA_LT) - return -ETIMEDOUT; - return 0; + return pcie_wait_for_retrain(parent); }
/*
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit e7e39756363ad5bd83ddeae1063193d0f13870fd ]
PCIe r6.0.1, sec 7.5.3.7, recommends setting the link control parameters, then waiting for the Link Training bit to be clear before setting the Retrain Link bit.
This avoids a race where the LTSSM may not use the updated parameters if it is already in the midst of link training because of other normal link activity.
Wait for the Link Training bit to be clear before toggling the Retrain Link bit to ensure that the LTSSM uses the updated link control parameters.
[bhelgaas: commit log, return 0 (success)/-ETIMEDOUT instead of bool for both pcie_wait_for_retrain() and the existing pcie_retrain_link()] Suggested-by: Lukas Wunner lukas@wunner.de Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Lukas Wunner lukas@wunner.de Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 5e09cbb563366..3078de668f911 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -212,8 +212,19 @@ static int pcie_wait_for_retrain(struct pci_dev *pdev) static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; + int rc; u16 reg16;
+ /* + * Ensure the updated LNKCTL parameters are used during link + * training by checking that there is no ongoing link training to + * avoid LTSSM race as recommended in Implementation Note at the + * end of PCIe r6.0.1 sec 7.5.3.7. + */ + rc = pcie_wait_for_retrain(parent); + if (rc) + return rc; + pcie_capability_read_word(parent, PCI_EXP_LNKCTL, ®16); reg16 |= PCI_EXP_LNKCTL_RL; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit 92a9c57c325dd51682d428ba960d961fec3c8a08 ]
Remove write accesses to registers that are marked "unused" (and therefore read-only) in the technical reference manual (TRM) (see RK3399 TRM 17.6.8.1)
Link: https://lore.kernel.org/r/20230418074700.1083505-2-rick.wertenbroek@gmail.co... Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Stable-dep-of: dc73ed0f1b8b ("PCI: rockchip: Fix window mapping and address translation for endpoint") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 827d91e73efab..9e17f3dba743a 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -61,10 +61,6 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip, ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region)); rockchip_pcie_write(rockchip, 0, ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(region)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(region)); }
static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, @@ -114,12 +110,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); addr1 = upper_32_bits(cpu_addr); } - - /* CPU bus address region */ - rockchip_pcie_write(rockchip, addr0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r)); - rockchip_pcie_write(rockchip, addr1, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r)); }
static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit dc73ed0f1b8bddd7f2bf70d123e68ffc99ad71ce ]
The RK3399 PCI endpoint core has 33 windows for PCIe space, now in the driver up to 32 fixed size (1M) windows are used and pages are allocated and mapped accordingly. The driver first used a single window and allocated space inside which caused translation issues (between CPU space and PCI space) because a window can only have a single translation at a given time, which if multiple pages are allocated inside will cause conflicts. Now each window is a single region of 1M which will always guarantee that the translation is not in conflict.
Set the translation register addresses for physical function. As documented in the technical reference manual (TRM) section 17.5.5 "PCIe Address Translation" and section 17.6.8 "Address Translation Registers Description"
Link: https://lore.kernel.org/r/20230418074700.1083505-9-rick.wertenbroek@gmail.co... Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 128 ++++++++++------------ drivers/pci/controller/pcie-rockchip.h | 35 +++--- 2 files changed, 75 insertions(+), 88 deletions(-)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 9e17f3dba743a..3d6f828d29fc2 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -64,52 +64,29 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip, }
static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, - u32 r, u32 type, u64 cpu_addr, - u64 pci_addr, size_t size) + u32 r, u64 cpu_addr, u64 pci_addr, + size_t size) { - u64 sz = 1ULL << fls64(size - 1); - int num_pass_bits = ilog2(sz); - u32 addr0, addr1, desc0, desc1; - bool is_nor_msg = (type == AXI_WRAPPER_NOR_MSG); + int num_pass_bits = fls64(size - 1); + u32 addr0, addr1, desc0;
- /* The minimal region size is 1MB */ if (num_pass_bits < 8) num_pass_bits = 8;
- cpu_addr -= rockchip->mem_res->start; - addr0 = ((is_nor_msg ? 0x10 : (num_pass_bits - 1)) & - PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | - (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); - addr1 = upper_32_bits(is_nor_msg ? cpu_addr : pci_addr); - desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type; - desc1 = 0; - - if (is_nor_msg) { - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); - rockchip_pcie_write(rockchip, desc0, - ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); - rockchip_pcie_write(rockchip, desc1, - ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); - } else { - /* PCI bus address region */ - rockchip_pcie_write(rockchip, addr0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); - rockchip_pcie_write(rockchip, addr1, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); - rockchip_pcie_write(rockchip, desc0, - ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); - rockchip_pcie_write(rockchip, desc1, - ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); - - addr0 = - ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | - (lower_32_bits(cpu_addr) & - PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); - addr1 = upper_32_bits(cpu_addr); - } + addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | + (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); + addr1 = upper_32_bits(pci_addr); + desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | AXI_WRAPPER_MEM_WRITE; + + /* PCI bus address region */ + rockchip_pcie_write(rockchip, addr0, + ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); + rockchip_pcie_write(rockchip, addr1, + ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); + rockchip_pcie_write(rockchip, desc0, + ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); + rockchip_pcie_write(rockchip, 0, + ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); }
static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn, @@ -248,26 +225,20 @@ static void rockchip_pcie_ep_clear_bar(struct pci_epc *epc, u8 fn, u8 vfn, ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar)); }
+static inline u32 rockchip_ob_region(phys_addr_t addr) +{ + return (addr >> ilog2(SZ_1M)) & 0x1f; +} + static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn, phys_addr_t addr, u64 pci_addr, size_t size) { struct rockchip_pcie_ep *ep = epc_get_drvdata(epc); struct rockchip_pcie *pcie = &ep->rockchip; - u32 r; + u32 r = rockchip_ob_region(addr);
- r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG); - /* - * Region 0 is reserved for configuration space and shouldn't - * be used elsewhere per TRM, so leave it out. - */ - if (r >= ep->max_regions - 1) { - dev_err(&epc->dev, "no free outbound region\n"); - return -EINVAL; - } - - rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, AXI_WRAPPER_MEM_WRITE, addr, - pci_addr, size); + rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, addr, pci_addr, size);
set_bit(r, &ep->ob_region_map); ep->ob_addr[r] = addr; @@ -282,15 +253,11 @@ static void rockchip_pcie_ep_unmap_addr(struct pci_epc *epc, u8 fn, u8 vfn, struct rockchip_pcie *rockchip = &ep->rockchip; u32 r;
- for (r = 0; r < ep->max_regions - 1; r++) + for (r = 0; r < ep->max_regions; r++) if (ep->ob_addr[r] == addr) break;
- /* - * Region 0 is reserved for configuration space and shouldn't - * be used elsewhere per TRM, so leave it out. - */ - if (r == ep->max_regions - 1) + if (r == ep->max_regions) return;
rockchip_pcie_clear_ep_ob_atu(rockchip, r); @@ -387,7 +354,8 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn, struct rockchip_pcie *rockchip = &ep->rockchip; u32 flags, mme, data, data_mask; u8 msi_count; - u64 pci_addr, pci_addr_mask = 0xff; + u64 pci_addr; + u32 r;
/* Check MSI enable bit */ flags = rockchip_pcie_read(&ep->rockchip, @@ -421,21 +389,20 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn, ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + ROCKCHIP_PCIE_EP_MSI_CTRL_REG + PCI_MSI_ADDRESS_LO); - pci_addr &= GENMASK_ULL(63, 2);
/* Set the outbound region if needed. */ - if (unlikely(ep->irq_pci_addr != (pci_addr & ~pci_addr_mask) || + if (unlikely(ep->irq_pci_addr != (pci_addr & PCIE_ADDR_MASK) || ep->irq_pci_fn != fn)) { - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, ep->max_regions - 1, - AXI_WRAPPER_MEM_WRITE, + r = rockchip_ob_region(ep->irq_phys_addr); + rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r, ep->irq_phys_addr, - pci_addr & ~pci_addr_mask, - pci_addr_mask + 1); - ep->irq_pci_addr = (pci_addr & ~pci_addr_mask); + pci_addr & PCIE_ADDR_MASK, + ~PCIE_ADDR_MASK + 1); + ep->irq_pci_addr = (pci_addr & PCIE_ADDR_MASK); ep->irq_pci_fn = fn; }
- writew(data, ep->irq_cpu_addr + (pci_addr & pci_addr_mask)); + writew(data, ep->irq_cpu_addr + (pci_addr & ~PCIE_ADDR_MASK)); return 0; }
@@ -517,6 +484,8 @@ static int rockchip_pcie_parse_ep_dt(struct rockchip_pcie *rockchip, if (err < 0 || ep->max_regions > MAX_REGION_LIMIT) ep->max_regions = MAX_REGION_LIMIT;
+ ep->ob_region_map = 0; + err = of_property_read_u8(dev->of_node, "max-functions", &ep->epc->max_functions); if (err < 0) @@ -537,7 +506,8 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) struct rockchip_pcie *rockchip; struct pci_epc *epc; size_t max_regions; - int err; + struct pci_epc_mem_window *windows = NULL; + int err, i;
ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL); if (!ep) @@ -584,15 +554,27 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) /* Only enable function 0 by default */ rockchip_pcie_write(rockchip, BIT(0), PCIE_CORE_PHY_FUNC_CFG);
- err = pci_epc_mem_init(epc, rockchip->mem_res->start, - resource_size(rockchip->mem_res), PAGE_SIZE); + windows = devm_kcalloc(dev, ep->max_regions, + sizeof(struct pci_epc_mem_window), GFP_KERNEL); + if (!windows) { + err = -ENOMEM; + goto err_uninit_port; + } + for (i = 0; i < ep->max_regions; i++) { + windows[i].phys_base = rockchip->mem_res->start + (SZ_1M * i); + windows[i].size = SZ_1M; + windows[i].page_size = SZ_1M; + } + err = pci_epc_multi_mem_init(epc, windows, ep->max_regions); + devm_kfree(dev, windows); + if (err < 0) { dev_err(dev, "failed to initialize the memory space\n"); goto err_uninit_port; }
ep->irq_cpu_addr = pci_epc_mem_alloc_addr(epc, &ep->irq_phys_addr, - SZ_128K); + SZ_1M); if (!ep->irq_cpu_addr) { dev_err(dev, "failed to reserve memory space for MSI\n"); err = -ENOMEM; diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h index cbd2fd25ba761..498a40251d0be 100644 --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -139,6 +139,7 @@
#define PCIE_RC_RP_ATS_BASE 0x400000 #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000 +#define PCIE_EP_PF_CONFIG_REGS_BASE 0x800000 #define PCIE_RC_CONFIG_BASE 0xa00000 #define PCIE_EP_CONFIG_BASE 0xa00000 #define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00) @@ -158,10 +159,11 @@ #define PCIE_RC_CONFIG_THP_CAP (PCIE_RC_CONFIG_BASE + 0x274) #define PCIE_RC_CONFIG_THP_CAP_NEXT_MASK GENMASK(31, 20)
+#define PCIE_ADDR_MASK 0xffffff00 #define PCIE_CORE_AXI_CONF_BASE 0xc00000 #define PCIE_CORE_OB_REGION_ADDR0 (PCIE_CORE_AXI_CONF_BASE + 0x0) #define PCIE_CORE_OB_REGION_ADDR0_NUM_BITS 0x3f -#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR 0xffffff00 +#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK #define PCIE_CORE_OB_REGION_ADDR1 (PCIE_CORE_AXI_CONF_BASE + 0x4) #define PCIE_CORE_OB_REGION_DESC0 (PCIE_CORE_AXI_CONF_BASE + 0x8) #define PCIE_CORE_OB_REGION_DESC1 (PCIE_CORE_AXI_CONF_BASE + 0xc) @@ -169,7 +171,7 @@ #define PCIE_CORE_AXI_INBOUND_BASE 0xc00800 #define PCIE_RP_IB_ADDR0 (PCIE_CORE_AXI_INBOUND_BASE + 0x0) #define PCIE_CORE_IB_REGION_ADDR0_NUM_BITS 0x3f -#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR 0xffffff00 +#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK #define PCIE_RP_IB_ADDR1 (PCIE_CORE_AXI_INBOUND_BASE + 0x4)
/* Size of one AXI Region (not Region 0) */ @@ -234,13 +236,15 @@ #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24) #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1 -#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12)) +#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3 +#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \ + (PCIE_EP_PF_CONFIG_REGS_BASE + (((fn) << 12) & GENMASK(19, 12))) +#define ROCKCHIP_PCIE_EP_VIRT_FUNC_BASE(fn) \ + (PCIE_EP_PF_CONFIG_REGS_BASE + 0x10000 + (((fn) << 12) & GENMASK(19, 12))) #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \ - (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008) + (PCIE_CORE_AXI_CONF_BASE + 0x0828 + (fn) * 0x0040 + (bar) * 0x0008) #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar) \ - (PCIE_RC_RP_ATS_BASE + 0x0844 + (fn) * 0x0040 + (bar) * 0x0008) -#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0000 + ((r) & 0x1f) * 0x0020) + (PCIE_CORE_AXI_CONF_BASE + 0x082c + (fn) * 0x0040 + (bar) * 0x0008) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN_MASK GENMASK(19, 12) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN(devfn) \ (((devfn) << 12) & \ @@ -248,20 +252,21 @@ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK GENMASK(27, 20) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS(bus) \ (((bus) << 20) & ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK) +#define PCIE_RC_EP_ATR_OB_REGIONS_1_32 (PCIE_CORE_AXI_CONF_BASE + 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0000 + ((r) & 0x1f) * 0x0020) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0004 + ((r) & 0x1f) * 0x0020) + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0004 + ((r) & 0x1f) * 0x0020) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_HARDCODED_RID BIT(23) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK GENMASK(31, 24) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(devfn) \ (((devfn) << 24) & ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0008 + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x000c + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0018 + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x001c + ((r) & 0x1f) * 0x0020) + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0008 + ((r) & 0x1f) * 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x000c + ((r) & 0x1f) * 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0010 + ((r) & 0x1f) * 0x0020)
#define ROCKCHIP_PCIE_CORE_EP_FUNC_BAR_CFG0(fn) \ (PCIE_CORE_CTRL_MGMT_BASE + 0x0240 + (fn) * 0x0008)
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit a52587e0bee14cbeeadf48a24013828cb04b8df8 ]
The RK3399 PCIe endpoint controller cannot generate MSI-X IRQs. This is documented in the RK3399 technical reference manual (TRM) section 17.5.9 "Interrupt Support".
MSI-X capability should therefore not be advertised. Remove the MSI-X capability by editing the capability linked-list. The previous entry is the MSI capability, therefore get the next entry from the MSI-X capability entry and set it as next entry for the MSI capability. This in effect removes MSI-X from the list.
Linked list before : MSI cap -> MSI-X cap -> PCIe Device cap -> ... Linked list now : MSI cap -> PCIe Device cap -> ...
Link: https://lore.kernel.org/r/20230418074700.1083505-11-rick.wertenbroek@gmail.c... Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 24 +++++++++++++++++++++++ drivers/pci/controller/pcie-rockchip.h | 5 +++++ 2 files changed, 29 insertions(+)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 3d6f828d29fc2..0af0e965fb57e 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -508,6 +508,7 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) size_t max_regions; struct pci_epc_mem_window *windows = NULL; int err, i; + u32 cfg_msi, cfg_msix_cp;
ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL); if (!ep) @@ -583,6 +584,29 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
+ /* + * MSI-X is not supported but the controller still advertises the MSI-X + * capability by default, which can lead to the Root Complex side + * allocating MSI-X vectors which cannot be used. Avoid this by skipping + * the MSI-X capability entry in the PCIe capabilities linked-list: get + * the next pointer from the MSI-X entry and set that in the MSI + * capability entry (which is the previous entry). This way the MSI-X + * entry is skipped (left out of the linked-list) and not advertised. + */ + cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE + + ROCKCHIP_PCIE_EP_MSI_CTRL_REG); + + cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK; + + cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE + + ROCKCHIP_PCIE_EP_MSIX_CAP_REG) & + ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK; + + cfg_msi |= cfg_msix_cp; + + rockchip_pcie_write(rockchip, cfg_msi, + PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG); + rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE, PCIE_CLIENT_CONFIG);
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h index 498a40251d0be..88e2bf65e433a 100644 --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -228,6 +228,8 @@ #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4 #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19) #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90 +#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8 +#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8) #define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17) @@ -235,6 +237,9 @@ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20) #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24) +#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0 +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8 +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8) #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1 #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3 #define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \
From: Alexander Aring aahringo@redhat.com
[ Upstream commit bcbb4ba6c9ba81e6975b642a2cade68044cd8a66 ]
Lately the different casting between plock_op and plock_xop and list holders which was involved showed some issues which were hard to see. This patch removes the "plock_xop" structure and introduces a "struct plock_async_data". This structure will be set in "struct plock_op" in case of asynchronous lock handling as the original "plock_xop" was made for. There is no need anymore to cast pointers around for additional fields in case of asynchronous lock handling. As disadvantage another allocation was introduces but only needed in the asynchronous case which is currently only used in combination with nfs lockd.
Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Stable-dep-of: 59e45c758ca1 ("fs: dlm: interrupt posix locks only when process is killed") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/plock.c | 77 ++++++++++++++++++++++++++++++-------------------- 1 file changed, 46 insertions(+), 31 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index edce0b25cd90e..e70e23eca03ec 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -19,20 +19,20 @@ static struct list_head recv_list; static wait_queue_head_t send_wq; static wait_queue_head_t recv_wq;
-struct plock_op { - struct list_head list; - int done; - struct dlm_plock_info info; - int (*callback)(struct file_lock *fl, int result); -}; - -struct plock_xop { - struct plock_op xop; +struct plock_async_data { void *fl; void *file; struct file_lock flc; + int (*callback)(struct file_lock *fl, int result); };
+struct plock_op { + struct list_head list; + int done; + struct dlm_plock_info info; + /* if set indicates async handling */ + struct plock_async_data *data; +};
static inline void set_version(struct dlm_plock_info *info) { @@ -58,6 +58,12 @@ static int check_version(struct dlm_plock_info *info) return 0; }
+static void dlm_release_plock_op(struct plock_op *op) +{ + kfree(op->data); + kfree(op); +} + static void send_op(struct plock_op *op) { set_version(&op->info); @@ -101,22 +107,21 @@ static void do_unlock_close(struct dlm_ls *ls, u64 number, int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, int cmd, struct file_lock *fl) { + struct plock_async_data *op_data; struct dlm_ls *ls; struct plock_op *op; - struct plock_xop *xop; int rv;
ls = dlm_find_lockspace_local(lockspace); if (!ls) return -EINVAL;
- xop = kzalloc(sizeof(*xop), GFP_NOFS); - if (!xop) { + op = kzalloc(sizeof(*op), GFP_NOFS); + if (!op) { rv = -ENOMEM; goto out; }
- op = &xop->xop; op->info.optype = DLM_PLOCK_OP_LOCK; op->info.pid = fl->fl_pid; op->info.ex = (fl->fl_type == F_WRLCK); @@ -125,22 +130,32 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, op->info.number = number; op->info.start = fl->fl_start; op->info.end = fl->fl_end; + /* async handling */ if (fl->fl_lmops && fl->fl_lmops->lm_grant) { + op_data = kzalloc(sizeof(*op_data), GFP_NOFS); + if (!op_data) { + dlm_release_plock_op(op); + rv = -ENOMEM; + goto out; + } + /* fl_owner is lockd which doesn't distinguish processes on the nfs client */ op->info.owner = (__u64) fl->fl_pid; - op->callback = fl->fl_lmops->lm_grant; - locks_init_lock(&xop->flc); - locks_copy_lock(&xop->flc, fl); - xop->fl = fl; - xop->file = file; + op_data->callback = fl->fl_lmops->lm_grant; + locks_init_lock(&op_data->flc); + locks_copy_lock(&op_data->flc, fl); + op_data->fl = fl; + op_data->file = file; + + op->data = op_data; } else { op->info.owner = (__u64)(long) fl->fl_owner; }
send_op(op);
- if (!op->callback) { + if (!op->data) { rv = wait_event_interruptible(recv_wq, (op->done != 0)); if (rv == -ERESTARTSYS) { log_debug(ls, "dlm_posix_lock: wait killed %llx", @@ -148,7 +163,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, spin_lock(&ops_lock); list_del(&op->list); spin_unlock(&ops_lock); - kfree(xop); + dlm_release_plock_op(op); do_unlock_close(ls, number, file, fl); goto out; } @@ -173,7 +188,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, (unsigned long long)number); }
- kfree(xop); + dlm_release_plock_op(op); out: dlm_put_lockspace(ls); return rv; @@ -183,11 +198,11 @@ EXPORT_SYMBOL_GPL(dlm_posix_lock); /* Returns failure iff a successful lock operation should be canceled */ static int dlm_plock_callback(struct plock_op *op) { + struct plock_async_data *op_data = op->data; struct file *file; struct file_lock *fl; struct file_lock *flc; int (*notify)(struct file_lock *fl, int result) = NULL; - struct plock_xop *xop = (struct plock_xop *)op; int rv = 0;
spin_lock(&ops_lock); @@ -199,10 +214,10 @@ static int dlm_plock_callback(struct plock_op *op) spin_unlock(&ops_lock);
/* check if the following 2 are still valid or make a copy */ - file = xop->file; - flc = &xop->flc; - fl = xop->fl; - notify = op->callback; + file = op_data->file; + flc = &op_data->flc; + fl = op_data->fl; + notify = op_data->callback;
if (op->info.rv) { notify(fl, op->info.rv); @@ -233,7 +248,7 @@ static int dlm_plock_callback(struct plock_op *op) }
out: - kfree(xop); + dlm_release_plock_op(op); return rv; }
@@ -303,7 +318,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file, rv = 0;
out_free: - kfree(op); + dlm_release_plock_op(op); out: dlm_put_lockspace(ls); fl->fl_flags = fl_flags; @@ -371,7 +386,7 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file, rv = 0; }
- kfree(op); + dlm_release_plock_op(op); out: dlm_put_lockspace(ls); return rv; @@ -407,7 +422,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count, (the process did not make an unlock call). */
if (op->info.flags & DLM_PLOCK_FL_CLOSE) - kfree(op); + dlm_release_plock_op(op);
if (copy_to_user(u, &info, sizeof(info))) return -EFAULT; @@ -439,7 +454,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count, op->info.owner == info.owner) { list_del_init(&op->list); memcpy(&op->info, &info, sizeof(info)); - if (op->callback) + if (op->data) do_callback = 1; else op->done = 1;
From: Alexander Aring aahringo@redhat.com
[ Upstream commit a800ba77fd285c6391a82819867ac64e9ab3af46 ]
This patch moves the return of FILE_LOCK_DEFERRED a little bit earlier than checking afterwards again if the request was an asynchronous request.
Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Stable-dep-of: 59e45c758ca1 ("fs: dlm: interrupt posix locks only when process is killed") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/plock.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index e70e23eca03ec..01fb7d8c0bca5 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -149,26 +149,25 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file, op_data->file = file;
op->data = op_data; + + send_op(op); + rv = FILE_LOCK_DEFERRED; + goto out; } else { op->info.owner = (__u64)(long) fl->fl_owner; }
send_op(op);
- if (!op->data) { - rv = wait_event_interruptible(recv_wq, (op->done != 0)); - if (rv == -ERESTARTSYS) { - log_debug(ls, "dlm_posix_lock: wait killed %llx", - (unsigned long long)number); - spin_lock(&ops_lock); - list_del(&op->list); - spin_unlock(&ops_lock); - dlm_release_plock_op(op); - do_unlock_close(ls, number, file, fl); - goto out; - } - } else { - rv = FILE_LOCK_DEFERRED; + rv = wait_event_interruptible(recv_wq, (op->done != 0)); + if (rv == -ERESTARTSYS) { + log_debug(ls, "%s: wait killed %llx", __func__, + (unsigned long long)number); + spin_lock(&ops_lock); + list_del(&op->list); + spin_unlock(&ops_lock); + dlm_release_plock_op(op); + do_unlock_close(ls, number, file, fl); goto out; }
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 59e45c758ca1b9893ac923dd63536da946ac333b ]
If a posix lock request is waiting for a result from user space (dlm_controld), do not let it be interrupted unless the process is killed. This reverts commit a6b1533e9a57 ("dlm: make posix locks interruptible"). The problem with the interruptible change is that all locks were cleared on any signal interrupt. If a signal was received that did not terminate the process, the process could continue running after all its dlm posix locks had been cleared. A future patch will add cancelation to allow proper interruption.
Cc: stable@vger.kernel.org Fixes: a6b1533e9a57 ("dlm: make posix locks interruptible") Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/plock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c index 01fb7d8c0bca5..f3482e936cc25 100644 --- a/fs/dlm/plock.c +++ b/fs/dlm/plock.c @@ -159,7 +159,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
send_op(op);
- rv = wait_event_interruptible(recv_wq, (op->done != 0)); + rv = wait_event_killable(recv_wq, (op->done != 0)); if (rv == -ERESTARTSYS) { log_debug(ls, "%s: wait killed %llx", __func__, (unsigned long long)number);
From: Thomas Hellström thomas.hellstrom@linux.intel.com
[ Upstream commit 8ab3b0663e279ab550bc2c0b5d602960e8b94e02 ]
Avoid printing an error message if eviction was interrupted by, for example, the user pressing CTRL-C. That may happen if eviction is waiting for something, like for example a free batch-buffer.
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20230307144621.10748-6-thomas.... Stable-dep-of: e8188c461ee0 ("drm/ttm: Don't leak a resource on eviction error") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_bo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index d5a2b69489e7f..cf390ab636978 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -557,7 +557,8 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo, if (ret == -EMULTIHOP) { ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); if (ret) { - pr_err("Buffer eviction failed\n"); + if (ret != -ERESTARTSYS && ret != -EINTR) + pr_err("Buffer eviction failed\n"); ttm_resource_free(bo, &evict_mem); goto out; }
From: Thomas Hellström thomas.hellstrom@linux.intel.com
[ Upstream commit e8188c461ee015ba0b9ab2fc82dbd5ebca5a5532 ]
On eviction errors other than -EMULTIHOP we were leaking a resource. Fix.
v2: - Avoid yet another goto (Andi Shyti)
Fixes: 403797925768 ("drm/ttm: Fix multihop assert on eviction.") Cc: Andrey Grodzovsky andrey.grodzovsky@amd.com Cc: Christian König christian.koenig@amd.com Cc: Christian Koenig christian.koenig@amd.com Cc: Huang Rui ray.huang@amd.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Nirmoy Das nirmoy.das@intel.com #v1 Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20230626091450.14757-4-thomas.... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index cf390ab636978..6080f4b5c450c 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -552,18 +552,18 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo, goto out; }
-bounce: - ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); - if (ret == -EMULTIHOP) { + do { + ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); + if (ret != -EMULTIHOP) + break; + ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); - if (ret) { - if (ret != -ERESTARTSYS && ret != -EINTR) - pr_err("Buffer eviction failed\n"); - ttm_resource_free(bo, &evict_mem); - goto out; - } - /* try and move to final place now. */ - goto bounce; + } while (!ret); + + if (ret) { + ttm_resource_free(bo, &evict_mem); + if (ret != -ERESTARTSYS && ret != -EINTR) + pr_err("Buffer eviction failed\n"); } out: return ret;
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 947d66b68f3c4e7cf8f3f3500807b9d2a0de28ce ]
The local tail variable in n_tty_read() is used for one purpose, it keeps the old tail. Thus, rename it appropriately to improve code readability.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Jiri Slaby jirislaby@kernel.org Link: https://lore.kernel.org/r/22b37499-ff9a-7fc1-f6e0-58411328d122@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 4903fde8047a ("tty: fix hang on tty device with no_room set") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/n_tty.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c index 891036bd9f897..8f57448e1ce46 100644 --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -2100,7 +2100,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, ssize_t retval = 0; long timeout; bool packet; - size_t tail; + size_t old_tail;
/* * Is this a continuation of a read started earler? @@ -2163,7 +2163,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, }
packet = tty->ctrl.packet; - tail = ldata->read_tail; + old_tail = ldata->read_tail;
add_wait_queue(&tty->read_wait, &wait); while (nr) { @@ -2252,7 +2252,7 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, if (time) timeout = time; } - if (tail != ldata->read_tail) + if (old_tail != ldata->read_tail) n_tty_kick_worker(tty); up_read(&tty->termios_rwsem);
From: Hui Li caelli@tencent.com
[ Upstream commit 4903fde8047a28299d1fc79c1a0dcc255e928f12 ]
It is possible to hang pty devices in this case, the reader was blocking at epoll on master side, the writer was sleeping at wait_woken inside n_tty_write on slave side, and the write buffer on tty_port was full, we found that the reader and writer would never be woken again and blocked forever.
The problem was caused by a race between reader and kworker: n_tty_read(reader): n_tty_receive_buf_common(kworker): copy_from_read_buf()| |room = N_TTY_BUF_SIZE - (ldata->read_head - tail) |room <= 0 n_tty_kick_worker() | |ldata->no_room = true
After writing to slave device, writer wakes up kworker to flush data on tty_port to reader, and the kworker finds that reader has no room to store data so room <= 0 is met. At this moment, reader consumes all the data on reader buffer and calls n_tty_kick_worker to check ldata->no_room which is false and reader quits reading. Then kworker sets ldata->no_room=true and quits too.
If write buffer is not full, writer will wake kworker to flush data again after following writes, but if write buffer is full and writer goes to sleep, kworker will never be woken again and tty device is blocked.
This problem can be solved with a check for read buffer size inside n_tty_receive_buf_common, if read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker is necessary to keep flushing data to reader.
Cc: stable@vger.kernel.org Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader") Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Hui Li caelli@tencent.com Message-ID: 1680749090-14106-1-git-send-email-caelli@tencent.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/n_tty.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c index 8f57448e1ce46..6259249b11670 100644 --- a/drivers/tty/n_tty.c +++ b/drivers/tty/n_tty.c @@ -202,8 +202,8 @@ static void n_tty_kick_worker(struct tty_struct *tty) struct n_tty_data *ldata = tty->disc_data;
/* Did the input worker stop? Restart it */ - if (unlikely(ldata->no_room)) { - ldata->no_room = 0; + if (unlikely(READ_ONCE(ldata->no_room))) { + WRITE_ONCE(ldata->no_room, 0);
WARN_RATELIMIT(tty->port->itty == NULL, "scheduling with invalid itty\n"); @@ -1661,7 +1661,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp, if (overflow && room < 0) ldata->read_head--; room = overflow; - ldata->no_room = flow && !room; + WRITE_ONCE(ldata->no_room, flow && !room); } else overflow = 0;
@@ -1692,6 +1692,17 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp, } else n_tty_check_throttle(tty);
+ if (unlikely(ldata->no_room)) { + /* + * Barrier here is to ensure to read the latest read_tail in + * chars_in_buffer() and to make sure that read_tail is not loaded + * before ldata->no_room is set. + */ + smp_mb(); + if (!chars_in_buffer(tty)) + n_tty_kick_worker(tty); + } + up_read(&tty->termios_rwsem);
return rcvd; @@ -2252,8 +2263,14 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file, if (time) timeout = time; } - if (old_tail != ldata->read_tail) + if (old_tail != ldata->read_tail) { + /* + * Make sure no_room is not read in n_tty_kick_worker() + * before setting ldata->read_tail in copy_from_read_buf(). + */ + smp_mb(); n_tty_kick_worker(tty); + } up_read(&tty->termios_rwsem);
remove_wait_queue(&tty->read_wait, &wait);
From: Christian König christian.koenig@amd.com
[ Upstream commit a2848d08742c8e8494675892c02c0d22acbe3cf8 ]
There is a small window where we have already incremented the pin count but not yet moved the bo from the lru to the pinned list.
Signed-off-by: Christian König christian.koenig@amd.com Reported-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com Tested-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230707120826.3701-1-christia... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 6080f4b5c450c..4d0ef5ab25319 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -604,6 +604,12 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo, { bool ret = false;
+ if (bo->pin_count) { + *locked = false; + *busy = false; + return false; + } + if (bo->base.resv == ctx->resv) { dma_resv_assert_held(bo->base.resv); if (ctx->allow_res_evict)
From: Steve French stfrench@microsoft.com
[ Upstream commit 5dd8ce24667a70bb9f7808f5eec0354bd37290c6 ]
The include/uapi/linux/cifs directory (not just fs/cifs and fs/smbfs_common) should be included in cifs entry in the MAINTAINERS file.
Reviewed-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: df9d70c18616 ("cifs: if deferred close is disabled then close files immediately") Signed-off-by: Sasha Levin sashal@kernel.org --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS index 2bf1ad0fb2a6f..e6b53e76651be 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4666,6 +4666,7 @@ T: git git://git.samba.org/sfrench/cifs-2.6.git F: Documentation/admin-guide/cifs/ F: fs/cifs/ F: fs/smbfs_common/ +F: include/uapi/linux/cifs
COMPACTPCI HOTPLUG CORE M: Scott Murray scott@spiteful.org
From: Paulo Alcantara pc@cjr.nz
[ Upstream commit 9fd29a5bae6e8f94b410374099a6fddb253d2d5f ]
Use filesystem context support to handle dfs links.
Signed-off-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: df9d70c18616 ("cifs: if deferred close is disabled then close files immediately") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifs_dfs_ref.c | 100 +++++++++++++++++------------------------ 1 file changed, 40 insertions(+), 60 deletions(-)
diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index b0864da9ef434..020e71fe1454e 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -258,61 +258,23 @@ char *cifs_compose_mount_options(const char *sb_mountdata, goto compose_mount_options_out; }
-/** - * cifs_dfs_do_mount - mounts specified path using DFS full path - * - * Always pass down @fullpath to smb3_do_mount() so we can use the root server - * to perform failover in case we failed to connect to the first target in the - * referral. - * - * @mntpt: directory entry for the path we are trying to automount - * @cifs_sb: parent/root superblock - * @fullpath: full path in UNC format - */ -static struct vfsmount *cifs_dfs_do_mount(struct dentry *mntpt, - struct cifs_sb_info *cifs_sb, - const char *fullpath) -{ - struct vfsmount *mnt; - char *mountdata; - char *devname; - - devname = kstrdup(fullpath, GFP_KERNEL); - if (!devname) - return ERR_PTR(-ENOMEM); - - convert_delimiter(devname, '/'); - - /* TODO: change to call fs_context_for_mount(), fill in context directly, call fc_mount */ - - /* See afs_mntpt_do_automount in fs/afs/mntpt.c for an example */ - - /* strip first '' from fullpath */ - mountdata = cifs_compose_mount_options(cifs_sb->ctx->mount_options, - fullpath + 1, NULL, NULL); - if (IS_ERR(mountdata)) { - kfree(devname); - return (struct vfsmount *)mountdata; - } - - mnt = vfs_submount(mntpt, &cifs_fs_type, devname, mountdata); - kfree(mountdata); - kfree(devname); - return mnt; -} - /* * Create a vfsmount that we can automount */ -static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt) +static struct vfsmount *cifs_dfs_do_automount(struct path *path) { + int rc; + struct dentry *mntpt = path->dentry; + struct fs_context *fc; struct cifs_sb_info *cifs_sb; - void *page; + void *page = NULL; + struct smb3_fs_context *ctx, *cur_ctx; + struct smb3_fs_context tmp; char *full_path; struct vfsmount *mnt;
- cifs_dbg(FYI, "in %s\n", __func__); - BUG_ON(IS_ROOT(mntpt)); + if (IS_ROOT(mntpt)) + return ERR_PTR(-ESTALE);
/* * The MSDFS spec states that paths in DFS referral requests and @@ -321,29 +283,47 @@ static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt) * gives us the latter, so we must adjust the result. */ cifs_sb = CIFS_SB(mntpt->d_sb); - if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS) { - mnt = ERR_PTR(-EREMOTE); - goto cdda_exit; - } + if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS) + return ERR_PTR(-EREMOTE); + + cur_ctx = cifs_sb->ctx; + + fc = fs_context_for_submount(path->mnt->mnt_sb->s_type, mntpt); + if (IS_ERR(fc)) + return ERR_CAST(fc); + + ctx = smb3_fc2context(fc);
page = alloc_dentry_path(); /* always use tree name prefix */ full_path = build_path_from_dentry_optional_prefix(mntpt, page, true); if (IS_ERR(full_path)) { mnt = ERR_CAST(full_path); - goto free_full_path; + goto out; }
- convert_delimiter(full_path, '\'); + convert_delimiter(full_path, '/'); cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path);
- mnt = cifs_dfs_do_mount(mntpt, cifs_sb, full_path); - cifs_dbg(FYI, "%s: cifs_dfs_do_mount:%s , mnt:%p\n", __func__, full_path + 1, mnt); + tmp = *cur_ctx; + tmp.source = full_path; + tmp.UNC = tmp.prepath = NULL; + + rc = smb3_fs_context_dup(ctx, &tmp); + if (rc) { + mnt = ERR_PTR(rc); + goto out; + } + + rc = smb3_parse_devname(full_path, ctx); + if (!rc) + mnt = fc_mount(fc); + else + mnt = ERR_PTR(rc);
-free_full_path: +out: + put_fs_context(fc); free_dentry_path(page); -cdda_exit: - cifs_dbg(FYI, "leaving %s\n" , __func__); return mnt; }
@@ -354,9 +334,9 @@ struct vfsmount *cifs_dfs_d_automount(struct path *path) { struct vfsmount *newmnt;
- cifs_dbg(FYI, "in %s\n", __func__); + cifs_dbg(FYI, "%s: %pd\n", __func__, path->dentry);
- newmnt = cifs_dfs_do_automount(path->dentry); + newmnt = cifs_dfs_do_automount(path); if (IS_ERR(newmnt)) { cifs_dbg(FYI, "leaving %s [automount failed]\n" , __func__); return newmnt;
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit 211db0ac9e3dc6c46f2dd53395b34d76af929faf ]
Since vfs_path_lookup is exported, It should not be internal. Move vfs_path_lookup prototype in internal.h to linux/namei.h.
Suggested-by: Al Viro viro@zeniv.linux.org.uk Reviewed-by: Christian Brauner brauner@kernel.org Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Al Viro viro@zeniv.linux.org.uk Stable-dep-of: df9d70c18616 ("cifs: if deferred close is disabled then close files immediately") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/internal.h | 2 -- fs/ksmbd/vfs.c | 2 -- include/linux/namei.h | 2 ++ 3 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/internal.h b/fs/internal.h index ceb154583a3c4..1ff8cfc94467b 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -58,8 +58,6 @@ extern int finish_clean_context(struct fs_context *fc); */ extern int filename_lookup(int dfd, struct filename *name, unsigned flags, struct path *path, struct path *root); -extern int vfs_path_lookup(struct dentry *, struct vfsmount *, - const char *, unsigned int, struct path *); int do_rmdir(int dfd, struct filename *name); int do_unlinkat(int dfd, struct filename *name); int may_linkat(struct user_namespace *mnt_userns, struct path *link); diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c index 52cc6a9627ed7..f76acd83c2944 100644 --- a/fs/ksmbd/vfs.c +++ b/fs/ksmbd/vfs.c @@ -19,8 +19,6 @@ #include <linux/sched/xacct.h> #include <linux/crc32c.h>
-#include "../internal.h" /* for vfs_path_lookup */ - #include "glob.h" #include "oplock.h" #include "connection.h" diff --git a/include/linux/namei.h b/include/linux/namei.h index caeb08a98536c..40c693525f796 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -63,6 +63,8 @@ extern struct dentry *kern_path_create(int, const char *, struct path *, unsigne extern struct dentry *user_path_create(int, const char __user *, struct path *, unsigned int); extern void done_path_create(struct path *, struct dentry *); extern struct dentry *kern_path_locked(const char *, struct path *); +int vfs_path_lookup(struct dentry *, struct vfsmount *, const char *, + unsigned int, struct path *);
extern struct dentry *try_lookup_one_len(const char *, struct dentry *, int); extern struct dentry *lookup_one_len(const char *, struct dentry *, int);
From: Bharath SM bharathsm@microsoft.com
[ Upstream commit df9d70c18616760c6504b97fec66b6379c172dbb ]
If defer close timeout value is set to 0, then there is no need to include files in the deferred close list and utilize the delayed worker for closing. Instead, we can close them immediately.
Signed-off-by: Bharath SM bharathsm@microsoft.com Reviewed-by: Shyam Prasad N sprasad@microsoft.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 4e4f73a90574b..e65fbae9e804b 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -880,8 +880,8 @@ int cifs_close(struct inode *inode, struct file *file) cfile = file->private_data; file->private_data = NULL; dclose = kmalloc(sizeof(struct cifs_deferred_close), GFP_KERNEL); - if ((cinode->oplock == CIFS_CACHE_RHW_FLG) && - cinode->lease_granted && + if ((cifs_sb->ctx->closetimeo && cinode->oplock == CIFS_CACHE_RHW_FLG) + && cinode->lease_granted && !test_bit(CIFS_INO_CLOSE_ON_LOCK, &cinode->flags) && dclose) { if (test_and_clear_bit(CIFS_INO_MODIFIED_ATTR, &cinode->flags)) {
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 5f97f18feac9bd5a8163b108aee52d783114b36f ]
The driver tracks per-channel data via struct pwm_device::chip_data and struct meson_pwm::channels[]. The latter holds the actual data, the former is only a pointer to the latter. So simplify by using struct meson_pwm::channels[] consistently.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Thierry Reding thierry.reding@gmail.com Stable-dep-of: 87a2cbf02d77 ("pwm: meson: fix handling of period/duty if greater than UINT_MAX") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-meson.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c index 76f702c43cbc3..37bbad88a3ee5 100644 --- a/drivers/pwm/pwm-meson.c +++ b/drivers/pwm/pwm-meson.c @@ -147,12 +147,13 @@ static int meson_pwm_request(struct pwm_chip *chip, struct pwm_device *pwm) return err; }
- return pwm_set_chip_data(pwm, channel); + return 0; }
static void meson_pwm_free(struct pwm_chip *chip, struct pwm_device *pwm) { - struct meson_pwm_channel *channel = pwm_get_chip_data(pwm); + struct meson_pwm *meson = to_meson_pwm(chip); + struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
if (channel) clk_disable_unprepare(channel->clk); @@ -161,7 +162,7 @@ static void meson_pwm_free(struct pwm_chip *chip, struct pwm_device *pwm) static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm, const struct pwm_state *state) { - struct meson_pwm_channel *channel = pwm_get_chip_data(pwm); + struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm]; unsigned int duty, period, pre_div, cnt, duty_cnt; unsigned long fin_freq;
@@ -230,7 +231,7 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
static void meson_pwm_enable(struct meson_pwm *meson, struct pwm_device *pwm) { - struct meson_pwm_channel *channel = pwm_get_chip_data(pwm); + struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm]; struct meson_pwm_channel_data *channel_data; unsigned long flags; u32 value; @@ -273,8 +274,8 @@ static void meson_pwm_disable(struct meson_pwm *meson, struct pwm_device *pwm) static int meson_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, const struct pwm_state *state) { - struct meson_pwm_channel *channel = pwm_get_chip_data(pwm); struct meson_pwm *meson = to_meson_pwm(chip); + struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm]; int err = 0;
if (!state)
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 87a2cbf02d7701255f9fcca7e5bd864a7bb397cf ]
state->period/duty are of type u64, and if their value is greater than UINT_MAX, then the cast to uint will cause problems. Fix this by changing the type of the respective local variables to u64.
Fixes: b79c3670e120 ("pwm: meson: Don't duplicate the polarity internally") Cc: stable@vger.kernel.org Suggested-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-meson.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c index 37bbad88a3ee5..ec6a544d6f526 100644 --- a/drivers/pwm/pwm-meson.c +++ b/drivers/pwm/pwm-meson.c @@ -163,8 +163,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm, const struct pwm_state *state) { struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm]; - unsigned int duty, period, pre_div, cnt, duty_cnt; + unsigned int pre_div, cnt, duty_cnt; unsigned long fin_freq; + u64 duty, period;
duty = state->duty_cycle; period = state->period; @@ -186,19 +187,19 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
- pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL); + pre_div = div64_u64(fin_freq * period, NSEC_PER_SEC * 0xffffLL); if (pre_div > MISC_CLK_DIV_MASK) { dev_err(meson->chip.dev, "unable to get period pre_div\n"); return -EINVAL; }
- cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1)); + cnt = div64_u64(fin_freq * period, NSEC_PER_SEC * (pre_div + 1)); if (cnt > 0xffff) { dev_err(meson->chip.dev, "unable to get period cnt\n"); return -EINVAL; }
- dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period, + dev_dbg(meson->chip.dev, "period=%llu pre_div=%u cnt=%u\n", period, pre_div, cnt);
if (duty == period) { @@ -211,14 +212,13 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm, channel->lo = cnt; } else { /* Then check is we can have the duty with the same pre_div */ - duty_cnt = div64_u64(fin_freq * (u64)duty, - NSEC_PER_SEC * (pre_div + 1)); + duty_cnt = div64_u64(fin_freq * duty, NSEC_PER_SEC * (pre_div + 1)); if (duty_cnt > 0xffff) { dev_err(meson->chip.dev, "unable to get duty cycle\n"); return -EINVAL; }
- dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n", + dev_dbg(meson->chip.dev, "duty=%llu pre_div=%u duty_cnt=%u\n", duty, pre_div, duty_cnt);
channel->pre_div = pre_div;
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit b26a124cbfa80f42bfc4e63e1d5643ca98159d66 ]
Add 'symstr' type for storing the kernel symbol as a string data instead of the symbol address. This allows us to filter the events by wildcard symbol name.
e.g. # echo 'e:wqfunc workqueue.workqueue_execute_start symname=$function:symstr' >> dynamic_events # cat events/eprobes/wqfunc/format name: wqfunc ID: 2110 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1;
field:__data_loc char[] symname; offset:8; size:4; signed:1;
print fmt: " symname="%s"", __get_str(symname)
Note that there is already 'symbol' type which just change the print format (so it still stores the symbol address in the tracing ring buffer.) On the other hand, 'symstr' type stores the actual "symbol+offset/size" data as a string.
Link: https://lore.kernel.org/all/166679930847.1528100.4124308529180235965.stgit@d...
Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Stable-dep-of: 66bcf65d6cf0 ("tracing/probes: Fix to avoid double count of the string length on the array") Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/trace/kprobetrace.rst | 8 +++-- kernel/trace/trace.c | 2 +- kernel/trace/trace_probe.c | 44 ++++++++++++++++++--------- kernel/trace/trace_probe.h | 16 +++++++--- kernel/trace/trace_probe_tmpl.h | 47 +++++++++++++++++++++++++++-- 5 files changed, 91 insertions(+), 26 deletions(-)
diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst index b175d88f31ebb..15e4bfa2bd83c 100644 --- a/Documentation/trace/kprobetrace.rst +++ b/Documentation/trace/kprobetrace.rst @@ -58,8 +58,8 @@ Synopsis of kprobe_events NAME=FETCHARG : Set NAME as the argument name of FETCHARG. FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types - (x8/x16/x32/x64), "string", "ustring" and bitfield - are supported. + (x8/x16/x32/x64), "string", "ustring", "symbol", "symstr" + and bitfield are supported.
(*1) only for the probe on function entry (offs == 0). (*2) only for return probe. @@ -96,6 +96,10 @@ offset, and container-size (usually 32). The syntax is::
Symbol type('symbol') is an alias of u32 or u64 type (depends on BITS_PER_LONG) which shows given pointer in "symbol+offset" style. +On the other hand, symbol-string type ('symstr') converts the given address to +"symbol+offset/symbolsize" style and stores it as a null-terminated string. +With 'symstr' type, you can filter the event with wildcard pattern of the +symbols, and you don't need to solve symbol name by yourself. For $comm, the default type is "string"; any other type is invalid.
.. _user_mem_access: diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 1dda36c7e5eb5..ae7005af78c34 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5609,7 +5609,7 @@ static const char readme_msg[] = "\t +|-[u]<offset>(<fetcharg>), \imm-value, \"imm-string"\n" "\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string, symbol,\n" "\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n" - "\t <type>\[<array-size>\]\n" + "\t symstr, <type>\[<array-size>\]\n" #ifdef CONFIG_HIST_TRIGGERS "\t field: <stype> <name>;\n" "\t stype: u8/u16/u32/u64, s8/s16/s32/s64, pid_t,\n" diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index cb8f9fe5669ad..90bae188d0928 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -76,9 +76,11 @@ const char PRINT_TYPE_FMT_NAME(string)[] = "\"%s\""; /* Fetch type information table */ static const struct fetch_type probe_fetch_types[] = { /* Special types */ - __ASSIGN_FETCH_TYPE("string", string, string, sizeof(u32), 1, + __ASSIGN_FETCH_TYPE("string", string, string, sizeof(u32), 1, 1, "__data_loc char[]"), - __ASSIGN_FETCH_TYPE("ustring", string, string, sizeof(u32), 1, + __ASSIGN_FETCH_TYPE("ustring", string, string, sizeof(u32), 1, 1, + "__data_loc char[]"), + __ASSIGN_FETCH_TYPE("symstr", string, string, sizeof(u32), 1, 1, "__data_loc char[]"), /* Basic types */ ASSIGN_FETCH_TYPE(u8, u8, 0), @@ -658,16 +660,26 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
ret = -EINVAL; /* Store operation */ - if (!strcmp(parg->type->name, "string") || - !strcmp(parg->type->name, "ustring")) { - if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF && - code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM && - code->op != FETCH_OP_DATA && code->op != FETCH_OP_TP_ARG) { - trace_probe_log_err(offset + (t ? (t - arg) : 0), - BAD_STRING); - goto fail; + if (parg->type->is_string) { + if (!strcmp(parg->type->name, "symstr")) { + if (code->op != FETCH_OP_REG && code->op != FETCH_OP_STACK && + code->op != FETCH_OP_RETVAL && code->op != FETCH_OP_ARG && + code->op != FETCH_OP_DEREF && code->op != FETCH_OP_TP_ARG) { + trace_probe_log_err(offset + (t ? (t - arg) : 0), + BAD_SYMSTRING); + goto fail; + } + } else { + if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF && + code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM && + code->op != FETCH_OP_DATA && code->op != FETCH_OP_TP_ARG) { + trace_probe_log_err(offset + (t ? (t - arg) : 0), + BAD_STRING); + goto fail; + } } - if ((code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM || + if (!strcmp(parg->type->name, "symstr") || + (code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM || code->op == FETCH_OP_DATA) || code->op == FETCH_OP_TP_ARG || parg->count) { /* @@ -675,6 +687,8 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, * must be kept, and if parg->count != 0, this is an * array of string pointers instead of string address * itself. + * For the symstr, it doesn't need to dereference, thus + * it just get the value. */ code++; if (code->op != FETCH_OP_NOP) { @@ -686,6 +700,8 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, if (!strcmp(parg->type->name, "ustring") || code->op == FETCH_OP_UDEREF) code->op = FETCH_OP_ST_USTRING; + else if (!strcmp(parg->type->name, "symstr")) + code->op = FETCH_OP_ST_SYMSTR; else code->op = FETCH_OP_ST_STRING; code->size = parg->type->size; @@ -915,8 +931,7 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len, for (i = 0; i < tp->nr_args; i++) { parg = tp->args + i; if (parg->count) { - if ((strcmp(parg->type->name, "string") == 0) || - (strcmp(parg->type->name, "ustring") == 0)) + if (parg->type->is_string) fmt = ", __get_str(%s[%d])"; else fmt = ", REC->%s[%d]"; @@ -924,8 +939,7 @@ static int __set_print_fmt(struct trace_probe *tp, char *buf, int len, pos += snprintf(buf + pos, LEN_OR_ZERO, fmt, parg->name, j); } else { - if ((strcmp(parg->type->name, "string") == 0) || - (strcmp(parg->type->name, "ustring") == 0)) + if (parg->type->is_string) fmt = ", __get_str(%s)"; else fmt = ", REC->%s"; diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index 84d495cbd876a..0f0e5005b97a0 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -99,6 +99,7 @@ enum fetch_op { FETCH_OP_ST_UMEM, /* Mem: .offset, .size */ FETCH_OP_ST_STRING, /* String: .offset, .size */ FETCH_OP_ST_USTRING, /* User String: .offset, .size */ + FETCH_OP_ST_SYMSTR, /* Kernel Symbol String: .offset, .size */ // Stage 4 (modify) op FETCH_OP_MOD_BF, /* Bitfield: .basesize, .lshift, .rshift */ // Stage 5 (loop) op @@ -134,7 +135,8 @@ struct fetch_insn { struct fetch_type { const char *name; /* Name of type */ size_t size; /* Byte size of type */ - int is_signed; /* Signed flag */ + bool is_signed; /* Signed flag */ + bool is_string; /* String flag */ print_type_func_t print; /* Print functions */ const char *fmt; /* Format string */ const char *fmttype; /* Name in format file */ @@ -178,16 +180,19 @@ DECLARE_BASIC_PRINT_TYPE_FUNC(symbol); #define _ADDR_FETCH_TYPE(t) __ADDR_FETCH_TYPE(t) #define ADDR_FETCH_TYPE _ADDR_FETCH_TYPE(BITS_PER_LONG)
-#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \ - {.name = _name, \ +#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, str, _fmttype) \ + {.name = _name, \ .size = _size, \ - .is_signed = sign, \ + .is_signed = (bool)sign, \ + .is_string = (bool)str, \ .print = PRINT_TYPE_FUNC_NAME(ptype), \ .fmt = PRINT_TYPE_FMT_NAME(ptype), \ .fmttype = _fmttype, \ } + +/* Non string types can use these macros */ #define _ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \ - __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, #_fmttype) + __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, 0, #_fmttype) #define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \ _ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, ptype)
@@ -432,6 +437,7 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call, C(ARRAY_TOO_BIG, "Array number is too big"), \ C(BAD_TYPE, "Unknown type is specified"), \ C(BAD_STRING, "String accepts only memory argument"), \ + C(BAD_SYMSTRING, "Symbol String doesn't accept data/userdata"), \ C(BAD_BITFIELD, "Invalid bitfield"), \ C(ARG_NAME_TOO_LONG, "Argument name is too long"), \ C(NO_ARG_NAME, "Argument name is not specified"), \ diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h index c293a607d5366..21799fa813ca8 100644 --- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -67,6 +67,37 @@ probe_mem_read(void *dest, void *src, size_t size); static nokprobe_inline int probe_mem_read_user(void *dest, void *src, size_t size);
+static nokprobe_inline int +fetch_store_symstrlen(unsigned long addr) +{ + char namebuf[KSYM_SYMBOL_LEN]; + int ret; + + ret = sprint_symbol(namebuf, addr); + if (ret < 0) + return 0; + + return ret + 1; +} + +/* + * Fetch a null-terminated symbol string + offset. Caller MUST set *(u32 *)buf + * with max length and relative data location. + */ +static nokprobe_inline int +fetch_store_symstring(unsigned long addr, void *dest, void *base) +{ + int maxlen = get_loc_len(*(u32 *)dest); + void *__dest; + + if (unlikely(!maxlen)) + return -ENOMEM; + + __dest = get_loc_data(dest, base); + + return sprint_symbol(__dest, addr); +} + /* From the 2nd stage, routine is same */ static nokprobe_inline int process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, @@ -99,16 +130,22 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, stage3: /* 3rd stage: store value to buffer */ if (unlikely(!dest)) { - if (code->op == FETCH_OP_ST_STRING) { + switch (code->op) { + case FETCH_OP_ST_STRING: ret = fetch_store_strlen(val + code->offset); code++; goto array; - } else if (code->op == FETCH_OP_ST_USTRING) { + case FETCH_OP_ST_USTRING: ret += fetch_store_strlen_user(val + code->offset); code++; goto array; - } else + case FETCH_OP_ST_SYMSTR: + ret += fetch_store_symstrlen(val + code->offset); + code++; + goto array; + default: return -EILSEQ; + } }
switch (code->op) { @@ -129,6 +166,10 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, loc = *(u32 *)dest; ret = fetch_store_string_user(val + code->offset, dest, base); break; + case FETCH_OP_ST_SYMSTR: + loc = *(u32 *)dest; + ret = fetch_store_symstring(val + code->offset, dest, base); + break; default: return -EILSEQ; }
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit 66bcf65d6cf0ca6540e2341e88ee7ef02dbdda08 ]
If an array is specified with the ustring or symstr, the length of the strings are accumlated on both of 'ret' and 'total', which means the length is double counted. Just set the length to the 'ret' value for avoiding double counting.
Link: https://lore.kernel.org/all/168908492917.123124.15076463491122036025.stgit@d...
Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/all/8819b154-2ba1-43c3-98a2-cbde20892023@moroto.moun... Fixes: 88903c464321 ("tracing/probe: Add ustring type for user-space string") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_probe_tmpl.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h index 21799fa813ca8..98ac09052fea4 100644 --- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -136,11 +136,11 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val, code++; goto array; case FETCH_OP_ST_USTRING: - ret += fetch_store_strlen_user(val + code->offset); + ret = fetch_store_strlen_user(val + code->offset); code++; goto array; case FETCH_OP_ST_SYMSTR: - ret += fetch_store_symstrlen(val + code->offset); + ret = fetch_store_symstrlen(val + code->offset); code++; goto array; default:
From: Steven Rostedt (Google) rostedt@goodmis.org
[ Upstream commit 00cf3d672a9dd409418647e9f98784c339c3ff63 ]
Allow a stacktrace from one event to be displayed by the end event of a synthetic event. This is very useful when looking for the longest latency of a sleep or something blocked on I/O.
# cd /sys/kernel/tracing/ # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 1||prev_state == 2' > events/sched/sched_switch/trigger # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
The above creates a "block_lat" synthetic event that take the stacktrace of when a task schedules out in either the interruptible or uninterruptible states, and on a new per process max $delta (the time it was scheduled out), will print the process id and the stacktrace.
# echo 1 > events/synthetic/block_lat/enable # cat trace # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | kworker/u16:0-767 [006] d..4. 560.645045: block_lat: pid=767 delta=66 stack=STACK: => __schedule => schedule => pipe_read => vfs_read => ksys_read => do_syscall_64 => 0x966000aa
<idle>-0 [003] d..4. 561.132117: block_lat: pid=0 delta=413787 stack=STACK: => __schedule => schedule => schedule_hrtimeout_range_clock => do_sys_poll => __x64_sys_poll => do_syscall_64 => 0x966000aa
<...>-153 [006] d..4. 562.068407: block_lat: pid=153 delta=54 stack=STACK: => __schedule => schedule => io_schedule => rq_qos_wait => wbt_wait => __rq_qos_throttle => blk_mq_submit_bio => submit_bio_noacct_nocheck => ext4_bio_write_page => mpage_submit_page => mpage_process_page_bufs => mpage_prepare_extent_to_map => ext4_do_writepages => ext4_writepages => do_writepages => __writeback_single_inode
Link: https://lkml.kernel.org/r/20230117152236.010941267@goodmis.org
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Tom Zanussi zanussi@kernel.org Cc: Ross Zwisler zwisler@google.com Cc: Ching-lin Yu chinglinyu@google.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: 797311bce5c2 ("tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.h | 4 ++ kernel/trace/trace_events_hist.c | 7 ++- kernel/trace/trace_events_synth.c | 80 ++++++++++++++++++++++++++++++- kernel/trace/trace_synth.h | 1 + 4 files changed, 87 insertions(+), 5 deletions(-)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 2c3d9b6ce1485..33c55c8826a7e 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -113,6 +113,10 @@ enum trace_type { #define MEM_FAIL(condition, fmt, ...) \ DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
+#define HIST_STACKTRACE_DEPTH 16 +#define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long)) +#define HIST_STACKTRACE_SKIP 5 + /* * syscalls are special, and need special handling, this is why * they are not included in trace_entries.h diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 1b70fc4c703f7..c32a53f089229 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -315,10 +315,6 @@ DEFINE_HIST_FIELD_FN(u8); #define for_each_hist_key_field(i, hist_data) \ for ((i) = (hist_data)->n_vals; (i) < (hist_data)->n_fields; (i)++)
-#define HIST_STACKTRACE_DEPTH 16 -#define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long)) -#define HIST_STACKTRACE_SKIP 5 - #define HITCOUNT_IDX 0 #define HIST_KEY_SIZE_MAX (MAX_FILTER_STR_VAL + HIST_STACKTRACE_SIZE)
@@ -3431,6 +3427,9 @@ static int check_synth_field(struct synth_event *event, && field->is_dynamic) return 0;
+ if (strstr(hist_field->type, "long[") && field->is_stack) + return 0; + if (strcmp(field->type, hist_field->type) != 0) { if (field->size != hist_field->size || (!field->is_string && field->is_signed != hist_field->is_signed)) diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c index 08c7df42ade7e..ce8fad956b38d 100644 --- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -165,6 +165,14 @@ static int synth_field_is_string(char *type) return false; }
+static int synth_field_is_stack(char *type) +{ + if (strstr(type, "long[") != NULL) + return true; + + return false; +} + static int synth_field_string_size(char *type) { char buf[4], *end, *start; @@ -240,6 +248,8 @@ static int synth_field_size(char *type) size = sizeof(gfp_t); else if (synth_field_is_string(type)) size = synth_field_string_size(type); + else if (synth_field_is_stack(type)) + size = 0;
return size; } @@ -284,6 +294,8 @@ static const char *synth_field_fmt(char *type) fmt = "%x"; else if (synth_field_is_string(type)) fmt = "%.*s"; + else if (synth_field_is_stack(type)) + fmt = "%s";
return fmt; } @@ -363,6 +375,23 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter, i == se->n_fields - 1 ? "" : " "); n_u64 += STR_VAR_LEN_MAX / sizeof(u64); } + } else if (se->fields[i]->is_stack) { + u32 offset, data_offset, len; + unsigned long *p, *end; + + offset = (u32)entry->fields[n_u64]; + data_offset = offset & 0xffff; + len = offset >> 16; + + p = (void *)entry + data_offset; + end = (void *)p + len - (sizeof(long) - 1); + + trace_seq_printf(s, "%s=STACK:\n", se->fields[i]->name); + + for (; *p && p < end; p++) + trace_seq_printf(s, "=> %pS\n", (void *)*p); + n_u64++; + } else { struct trace_print_flags __flags[] = { __def_gfpflag_names, {-1, NULL} }; @@ -439,6 +468,43 @@ static unsigned int trace_string(struct synth_trace_event *entry, return len; }
+static unsigned int trace_stack(struct synth_trace_event *entry, + struct synth_event *event, + long *stack, + unsigned int data_size, + unsigned int *n_u64) +{ + unsigned int len; + u32 data_offset; + void *data_loc; + + data_offset = struct_size(entry, fields, event->n_u64); + data_offset += data_size; + + for (len = 0; len < HIST_STACKTRACE_DEPTH; len++) { + if (!stack[len]) + break; + } + + /* Include the zero'd element if it fits */ + if (len < HIST_STACKTRACE_DEPTH) + len++; + + len *= sizeof(long); + + /* Find the dynamic section to copy the stack into. */ + data_loc = (void *)entry + data_offset; + memcpy(data_loc, stack, len); + + /* Fill in the field that holds the offset/len combo */ + data_offset |= len << 16; + *(u32 *)&entry->fields[*n_u64] = data_offset; + + (*n_u64)++; + + return len; +} + static notrace void trace_event_raw_event_synth(void *__data, u64 *var_ref_vals, unsigned int *var_ref_idx) @@ -491,6 +557,12 @@ static notrace void trace_event_raw_event_synth(void *__data, event->fields[i]->is_dynamic, data_size, &n_u64); data_size += len; /* only dynamic string increments */ + } if (event->fields[i]->is_stack) { + long *stack = (long *)(long)var_ref_vals[val_idx]; + + len = trace_stack(entry, event, stack, + data_size, &n_u64); + data_size += len; } else { struct synth_field *field = event->fields[i]; u64 val = var_ref_vals[val_idx]; @@ -553,6 +625,9 @@ static int __set_synth_event_print_fmt(struct synth_event *event, event->fields[i]->is_dynamic) pos += snprintf(buf + pos, LEN_OR_ZERO, ", __get_str(%s)", event->fields[i]->name); + else if (event->fields[i]->is_stack) + pos += snprintf(buf + pos, LEN_OR_ZERO, + ", __get_stacktrace(%s)", event->fields[i]->name); else pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s", event->fields[i]->name); @@ -689,7 +764,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv, ret = -EINVAL; goto free; } else if (size == 0) { - if (synth_field_is_string(field->type)) { + if (synth_field_is_string(field->type) || + synth_field_is_stack(field->type)) { char *type;
len = sizeof("__data_loc ") + strlen(field->type) + 1; @@ -720,6 +796,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv,
if (synth_field_is_string(field->type)) field->is_string = true; + else if (synth_field_is_stack(field->type)) + field->is_stack = true;
field->is_signed = synth_field_signed(field->type); out: diff --git a/kernel/trace/trace_synth.h b/kernel/trace/trace_synth.h index b29595fe3ac5a..43f6fb6078dbf 100644 --- a/kernel/trace/trace_synth.h +++ b/kernel/trace/trace_synth.h @@ -18,6 +18,7 @@ struct synth_field { bool is_signed; bool is_string; bool is_dynamic; + bool is_stack; };
struct synth_event {
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit 4ed8f337dee32df71435689c19d22e4ee846e15a ]
This reverts commit 2e9906f84fc7c99388bb7123ade167250d50f1c0.
It was turned out that commit 2e9906f84fc7 ("tracing: Add "(fault)" name injection to kernel probes") did not work correctly and probe events still show just '(fault)' (instead of '"(fault)"'). Also, current '(fault)' is more explicit that it faulted.
This also moves FAULT_STRING macro to trace.h so that synthetic event can keep using it, and uses it in trace_probe.c too.
Link: https://lore.kernel.org/all/168908495772.123124.1250788051922100079.stgit@de... Link: https://lore.kernel.org/all/20230706230642.3793a593@rorschach.local.home/
Cc: stable@vger.kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: Tom Zanussi zanussi@kernel.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Stable-dep-of: 797311bce5c2 ("tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.h | 2 ++ kernel/trace/trace_probe.c | 2 +- kernel/trace/trace_probe_kernel.h | 31 ++++++------------------------- 3 files changed, 9 insertions(+), 26 deletions(-)
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 33c55c8826a7e..43058077a4def 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -113,6 +113,8 @@ enum trace_type { #define MEM_FAIL(condition, fmt, ...) \ DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
+#define FAULT_STRING "(fault)" + #define HIST_STACKTRACE_DEPTH 16 #define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long)) #define HIST_STACKTRACE_SKIP 5 diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 90bae188d0928..0888f0644d257 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -64,7 +64,7 @@ int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent) int len = *(u32 *)data >> 16;
if (!len) - trace_seq_puts(s, "(fault)"); + trace_seq_puts(s, FAULT_STRING); else trace_seq_printf(s, ""%s"", (const char *)get_loc_data(data, ent)); diff --git a/kernel/trace/trace_probe_kernel.h b/kernel/trace/trace_probe_kernel.h index 77dbd9ff97826..1d43df29a1f8e 100644 --- a/kernel/trace/trace_probe_kernel.h +++ b/kernel/trace/trace_probe_kernel.h @@ -2,8 +2,6 @@ #ifndef __TRACE_PROBE_KERNEL_H_ #define __TRACE_PROBE_KERNEL_H_
-#define FAULT_STRING "(fault)" - /* * This depends on trace_probe.h, but can not include it due to * the way trace_probe_tmpl.h is used by trace_kprobe.c and trace_eprobe.c. @@ -15,16 +13,8 @@ static nokprobe_inline int kern_fetch_store_strlen_user(unsigned long addr) { const void __user *uaddr = (__force const void __user *)addr; - int ret;
- ret = strnlen_user_nofault(uaddr, MAX_STRING_SIZE); - /* - * strnlen_user_nofault returns zero on fault, insert the - * FAULT_STRING when that occurs. - */ - if (ret <= 0) - return strlen(FAULT_STRING) + 1; - return ret; + return strnlen_user_nofault(uaddr, MAX_STRING_SIZE); }
/* Return the length of string -- including null terminal byte */ @@ -44,18 +34,7 @@ kern_fetch_store_strlen(unsigned long addr) len++; } while (c && ret == 0 && len < MAX_STRING_SIZE);
- /* For faults, return enough to hold the FAULT_STRING */ - return (ret < 0) ? strlen(FAULT_STRING) + 1 : len; -} - -static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base, int len) -{ - if (ret >= 0) { - *(u32 *)dest = make_data_loc(ret, __dest - base); - } else { - strscpy(__dest, FAULT_STRING, len); - ret = strlen(__dest) + 1; - } + return (ret < 0) ? ret : len; }
/* @@ -76,7 +55,8 @@ kern_fetch_store_string_user(unsigned long addr, void *dest, void *base) __dest = get_loc_data(dest, base);
ret = strncpy_from_user_nofault(__dest, uaddr, maxlen); - set_data_loc(ret, dest, __dest, base, maxlen); + if (ret >= 0) + *(u32 *)dest = make_data_loc(ret, __dest - base);
return ret; } @@ -107,7 +87,8 @@ kern_fetch_store_string(unsigned long addr, void *dest, void *base) * probing. */ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen); - set_data_loc(ret, dest, __dest, base, maxlen); + if (ret >= 0) + *(u32 *)dest = make_data_loc(ret, __dest - base);
return ret; }
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit 797311bce5c2ac90b8d65e357603cfd410d36ebb ]
Fix to record 0-length data to data_loc in fetch_store_string*() if it fails to get the string data. Currently those expect that the data_loc is updated by store_trace_args() if it returns the error code. However, that does not work correctly if the argument is an array of strings. In that case, store_trace_args() only clears the first entry of the array (which may have no error) and leaves other entries. So it should be cleared by fetch_store_string*() itself. Also, 'dyndata' and 'maxlen' in store_trace_args() should be updated only if it is used (ret > 0 and argument is a dynamic data.)
Link: https://lore.kernel.org/all/168908496683.123124.4761206188794205601.stgit@de...
Fixes: 40b53b771806 ("tracing: probeevent: Add array type support") Cc: stable@vger.kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_probe_kernel.h | 13 +++++++++---- kernel/trace/trace_probe_tmpl.h | 10 +++------- kernel/trace/trace_uprobe.c | 3 ++- 3 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/kernel/trace/trace_probe_kernel.h b/kernel/trace/trace_probe_kernel.h index 1d43df29a1f8e..2da70be83831c 100644 --- a/kernel/trace/trace_probe_kernel.h +++ b/kernel/trace/trace_probe_kernel.h @@ -37,6 +37,13 @@ kern_fetch_store_strlen(unsigned long addr) return (ret < 0) ? ret : len; }
+static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base) +{ + if (ret < 0) + ret = 0; + *(u32 *)dest = make_data_loc(ret, __dest - base); +} + /* * Fetch a null-terminated string from user. Caller MUST set *(u32 *)buf * with max length and relative data location. @@ -55,8 +62,7 @@ kern_fetch_store_string_user(unsigned long addr, void *dest, void *base) __dest = get_loc_data(dest, base);
ret = strncpy_from_user_nofault(__dest, uaddr, maxlen); - if (ret >= 0) - *(u32 *)dest = make_data_loc(ret, __dest - base); + set_data_loc(ret, dest, __dest, base);
return ret; } @@ -87,8 +93,7 @@ kern_fetch_store_string(unsigned long addr, void *dest, void *base) * probing. */ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen); - if (ret >= 0) - *(u32 *)dest = make_data_loc(ret, __dest - base); + set_data_loc(ret, dest, __dest, base);
return ret; } diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h index 98ac09052fea4..3e2f5a43b974c 100644 --- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -247,13 +247,9 @@ store_trace_args(void *data, struct trace_probe *tp, void *rec, if (unlikely(arg->dynamic)) *dl = make_data_loc(maxlen, dyndata - base); ret = process_fetch_insn(arg->code, rec, dl, base); - if (arg->dynamic) { - if (unlikely(ret < 0)) { - *dl = make_data_loc(0, dyndata - base); - } else { - dyndata += ret; - maxlen -= ret; - } + if (arg->dynamic && likely(ret > 0)) { + dyndata += ret; + maxlen -= ret; } } } diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index 78ec1c16ccf4b..debc651015489 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -168,7 +168,8 @@ fetch_store_string(unsigned long addr, void *dest, void *base) */ ret++; *(u32 *)dest = make_data_loc(ret, (void *)dst - base); - } + } else + *(u32 *)dest = make_data_loc(0, (void *)dst - base);
return ret; }
From: Gaosheng Cui cuigaosheng1@huawei.com
[ Upstream commit 1b80addaae099dc33e683d971aba90eeeaf887a3 ]
qla2x00_get_fw_version_str() has been removed since commit abbd8870b9cb ("[SCSI] qla2xxx: Factor-out ISP specific functions to method-based call tables.").
qla2x00_release_nvram_protection() has been removed since commit 459c537807bd ("[SCSI] qla2xxx: Add ISP24xx flash-manipulation routines.").
qla82xx_rdmem() and qla82xx_wrmem() have been removed since commit 3711333dfbee ("[SCSI] qla2xxx: Updates for ISP82xx.").
qla25xx_rd_req_reg(), qla24xx_rd_req_reg(), qla25xx_wrt_rsp_reg(), qla24xx_wrt_rsp_reg(), qla25xx_wrt_req_reg() and qla24xx_wrt_req_reg() have been removed since commit 08029990b25b ("[SCSI] qla2xxx: Refactor request/response-queue register handling.").
qla2x00_async_login_done() has been removed since commit 726b85487067 ("qla2xxx: Add framework for async fabric discovery").
qlt_24xx_process_response_error() has been removed since commit c5419e2618b9 ("scsi: qla2xxx: Combine Active command arrays.").
Remove the declarations for them from header file.
Link: https://lore.kernel.org/r/20220913023722.547249-2-cuigaosheng1@huawei.com Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_gbl.h | 12 ------------ drivers/scsi/qla2xxx/qla_target.h | 2 -- 2 files changed, 14 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h index f82e4a348330a..4ed0777cd2a6f 100644 --- a/drivers/scsi/qla2xxx/qla_gbl.h +++ b/drivers/scsi/qla2xxx/qla_gbl.h @@ -70,8 +70,6 @@ extern int qla2x00_async_prlo(struct scsi_qla_host *, fc_port_t *); extern int qla2x00_async_adisc(struct scsi_qla_host *, fc_port_t *, uint16_t *); extern int qla2x00_async_tm_cmd(fc_port_t *, uint32_t, uint32_t, uint32_t); -extern void qla2x00_async_login_done(struct scsi_qla_host *, fc_port_t *, - uint16_t *); struct qla_work_evt *qla2x00_alloc_work(struct scsi_qla_host *, enum qla_work_type); extern int qla24xx_async_gnl(struct scsi_qla_host *, fc_port_t *); @@ -278,7 +276,6 @@ extern int qla24xx_vport_create_req_sanity_check(struct fc_vport *); extern scsi_qla_host_t *qla24xx_create_vhost(struct fc_vport *);
extern void qla2x00_sp_free_dma(srb_t *sp); -extern char *qla2x00_get_fw_version_str(struct scsi_qla_host *, char *);
extern void qla2x00_mark_device_lost(scsi_qla_host_t *, fc_port_t *, int); extern void qla2x00_mark_all_devices_lost(scsi_qla_host_t *); @@ -611,7 +608,6 @@ void __qla_consume_iocb(struct scsi_qla_host *vha, void **pkt, struct rsp_que ** /* * Global Function Prototypes in qla_sup.c source file. */ -extern void qla2x00_release_nvram_protection(scsi_qla_host_t *); extern int qla24xx_read_flash_data(scsi_qla_host_t *, uint32_t *, uint32_t, uint32_t); extern uint8_t *qla2x00_read_nvram_data(scsi_qla_host_t *, void *, uint32_t, @@ -781,12 +777,6 @@ extern void qla2x00_init_response_q_entries(struct rsp_que *); extern int qla25xx_delete_req_que(struct scsi_qla_host *, struct req_que *); extern int qla25xx_delete_rsp_que(struct scsi_qla_host *, struct rsp_que *); extern int qla25xx_delete_queues(struct scsi_qla_host *); -extern uint16_t qla24xx_rd_req_reg(struct qla_hw_data *, uint16_t); -extern uint16_t qla25xx_rd_req_reg(struct qla_hw_data *, uint16_t); -extern void qla24xx_wrt_req_reg(struct qla_hw_data *, uint16_t, uint16_t); -extern void qla25xx_wrt_req_reg(struct qla_hw_data *, uint16_t, uint16_t); -extern void qla25xx_wrt_rsp_reg(struct qla_hw_data *, uint16_t, uint16_t); -extern void qla24xx_wrt_rsp_reg(struct qla_hw_data *, uint16_t, uint16_t);
/* qlafx00 related functions */ extern int qlafx00_pci_config(struct scsi_qla_host *); @@ -871,8 +861,6 @@ extern void qla82xx_init_flags(struct qla_hw_data *); extern void qla82xx_set_drv_active(scsi_qla_host_t *); extern int qla82xx_wr_32(struct qla_hw_data *, ulong, u32); extern int qla82xx_rd_32(struct qla_hw_data *, ulong); -extern int qla82xx_rdmem(struct qla_hw_data *, u64, void *, int); -extern int qla82xx_wrmem(struct qla_hw_data *, u64, void *, int);
/* ISP 8021 IDC */ extern void qla82xx_clear_drv_active(struct qla_hw_data *); diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h index 156b950ca7e72..aa83434448377 100644 --- a/drivers/scsi/qla2xxx/qla_target.h +++ b/drivers/scsi/qla2xxx/qla_target.h @@ -1080,8 +1080,6 @@ extern void qlt_81xx_config_nvram_stage2(struct scsi_qla_host *, struct init_cb_81xx *); extern void qlt_81xx_config_nvram_stage1(struct scsi_qla_host *, struct nvram_81xx *); -extern int qlt_24xx_process_response_error(struct scsi_qla_host *, - struct sts_entry_24xx *); extern void qlt_modify_vp_config(struct scsi_qla_host *, struct vp_config_entry_24xx *); extern void qlt_probe_one_stage1(struct scsi_qla_host *, struct qla_hw_data *);
From: Quinn Tran qutran@marvell.com
[ Upstream commit 6a87679626b51b53fbb6be417ad8eb083030b617 ]
Task management command failed with status 2Ch which is a result of too many task management commands sent to the same target. Hence limit task management commands to 8 per target.
Reported-by: kernel test robot lkp@intel.com Link: https://lore.kernel.org/oe-kbuild-all/202304271952.NKNmoFzv-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20230428075339.32551-4-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 3 ++ drivers/scsi/qla2xxx/qla_init.c | 63 ++++++++++++++++++++++++++++++--- 2 files changed, 61 insertions(+), 5 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index ca63aa8f33b5d..bbeb116c16cc3 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -2523,6 +2523,7 @@ enum rscn_addr_format { typedef struct fc_port { struct list_head list; struct scsi_qla_host *vha; + struct list_head tmf_pending;
unsigned int conf_compl_supported:1; unsigned int deleted:2; @@ -2543,6 +2544,8 @@ typedef struct fc_port { unsigned int do_prli_nvme:1;
uint8_t nvme_flag; + uint8_t active_tmf; +#define MAX_ACTIVE_TMF 8
uint8_t node_name[WWN_SIZE]; uint8_t port_name[WWN_SIZE]; diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 9d9b16f7f34a6..2c42ecb2a64a5 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -2151,6 +2151,54 @@ __qla2x00_async_tm_cmd(struct tmf_arg *arg) return rval; }
+static void qla_put_tmf(fc_port_t *fcport) +{ + struct scsi_qla_host *vha = fcport->vha; + struct qla_hw_data *ha = vha->hw; + unsigned long flags; + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + fcport->active_tmf--; + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); +} + +static +int qla_get_tmf(fc_port_t *fcport) +{ + struct scsi_qla_host *vha = fcport->vha; + struct qla_hw_data *ha = vha->hw; + unsigned long flags; + int rc = 0; + LIST_HEAD(tmf_elem); + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + list_add_tail(&tmf_elem, &fcport->tmf_pending); + + while (fcport->active_tmf >= MAX_ACTIVE_TMF) { + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); + + msleep(1); + + spin_lock_irqsave(&ha->tgt.sess_lock, flags); + if (fcport->deleted) { + rc = EIO; + break; + } + if (fcport->active_tmf < MAX_ACTIVE_TMF && + list_is_first(&tmf_elem, &fcport->tmf_pending)) + break; + } + + list_del(&tmf_elem); + + if (!rc) + fcport->active_tmf++; + + spin_unlock_irqrestore(&ha->tgt.sess_lock, flags); + + return rc; +} + int qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, uint32_t tag) @@ -2158,18 +2206,19 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, struct scsi_qla_host *vha = fcport->vha; struct qla_qpair *qpair; struct tmf_arg a; - struct completion comp; int i, rval;
- init_completion(&comp); a.vha = fcport->vha; a.fcport = fcport; a.lun = lun; - - if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) + if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) { a.modifier = MK_SYNC_ID_LUN; - else + + if (qla_get_tmf(fcport)) + return QLA_FUNCTION_FAILED; + } else { a.modifier = MK_SYNC_ID; + }
if (vha->hw->mqenable) { for (i = 0; i < vha->hw->num_qpairs; i++) { @@ -2188,6 +2237,9 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, a.flags = flags; rval = __qla2x00_async_tm_cmd(&a);
+ if (a.modifier == MK_SYNC_ID_LUN) + qla_put_tmf(fcport); + return rval; }
@@ -5423,6 +5475,7 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vha, gfp_t flags) INIT_WORK(&fcport->reg_work, qla_register_fcport_fn); INIT_LIST_HEAD(&fcport->gnl_entry); INIT_LIST_HEAD(&fcport->list); + INIT_LIST_HEAD(&fcport->tmf_pending);
INIT_LIST_HEAD(&fcport->sess_cmd_list); spin_lock_init(&fcport->sess_cmd_lock);
From: Arun Easi aeasi@marvell.com
[ Upstream commit f12d2d130efc49464ef0666789bfeb9073162743 ]
Add a debug print in the devloss callback.
Link: https://lore.kernel.org/r/20220616053508.27186-9-njavali@marvell.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Stable-dep-of: 9ae615c5bfd3 ("scsi: qla2xxx: Fix hang in task management") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_attr.c | 3 +++ drivers/scsi/qla2xxx/qla_def.h | 6 ++++++ 2 files changed, 9 insertions(+)
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c index de57d45ffc5cb..4a5df867057bc 100644 --- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -2705,6 +2705,9 @@ qla2x00_dev_loss_tmo_callbk(struct fc_rport *rport) if (!fcport) return;
+ ql_dbg(ql_dbg_async, fcport->vha, 0x5101, + DBG_FCPORT_PRFMT(fcport, "dev_loss_tmo expiry, rport_state=%d", + rport->port_state));
/* * Now that the rport has been deleted, set the fcport state to diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index bbeb116c16cc3..2a2bca2fb57ee 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -5475,4 +5475,10 @@ struct ql_vnd_tgt_stats_resp { #define IS_SESSION_DELETED(_fcport) (_fcport->disc_state == DSC_DELETE_PEND || \ _fcport->disc_state == DSC_DELETED)
+#define DBG_FCPORT_PRFMT(_fp, _fmt, _args...) \ + "%s: %8phC: " _fmt " (state=%d disc_state=%d scan_state=%d loopid=0x%x deleted=%d flags=0x%x)\n", \ + __func__, _fp->port_name, ##_args, atomic_read(&_fp->state), \ + _fp->disc_state, _fp->scan_state, _fp->loop_id, _fp->deleted, \ + _fp->flags + #endif
From: Quinn Tran qutran@marvell.com
[ Upstream commit 9ae615c5bfd37bd091772969b1153de5335ea986 ]
Task management command hangs where a side band chip reset failed to nudge the TMF from it's current send path.
Add additional error check to block TMF from entering during chip reset and along the TMF path to cause it to bail out, skip over abort of marker.
Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20230428075339.32551-5-njavali@marvell.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 4 +++ drivers/scsi/qla2xxx/qla_init.c | 60 +++++++++++++++++++++++++++++++-- 2 files changed, 61 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 2a2bca2fb57ee..83228ce822af3 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -5481,4 +5481,8 @@ struct ql_vnd_tgt_stats_resp { _fp->disc_state, _fp->scan_state, _fp->loop_id, _fp->deleted, \ _fp->flags
+#define TMF_NOT_READY(_fcport) \ + (!_fcport || IS_SESSION_DELETED(_fcport) || atomic_read(&_fcport->state) != FCS_ONLINE || \ + !_fcport->vha->hw->flags.fw_started) + #endif diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index 2c42ecb2a64a5..a97872b6350ca 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -1998,6 +1998,11 @@ qla2x00_tmf_iocb_timeout(void *data) int rc, h; unsigned long flags;
+ if (sp->type == SRB_MARKER) { + complete(&tmf->u.tmf.comp); + return; + } + rc = qla24xx_async_abort_cmd(sp, false); if (rc) { spin_lock_irqsave(sp->qpair->qp_lock_ptr, flags); @@ -2025,6 +2030,7 @@ static void qla_marker_sp_done(srb_t *sp, int res) sp->handle, sp->fcport->d_id.b24, sp->u.iocb_cmd.u.tmf.flags, sp->u.iocb_cmd.u.tmf.lun, sp->qpair->id);
+ sp->u.iocb_cmd.u.tmf.data = res; complete(&tmf->u.tmf.comp); }
@@ -2041,6 +2047,11 @@ static void qla_marker_sp_done(srb_t *sp, int res) } while (cnt); \ }
+/** + * qla26xx_marker: send marker IOCB and wait for the completion of it. + * @arg: pointer to argument list. + * It is assume caller will provide an fcport pointer and modifier + */ static int qla26xx_marker(struct tmf_arg *arg) { @@ -2050,6 +2061,14 @@ qla26xx_marker(struct tmf_arg *arg) int rval = QLA_FUNCTION_FAILED; fc_port_t *fcport = arg->fcport;
+ if (TMF_NOT_READY(arg->fcport)) { + ql_dbg(ql_dbg_taskm, vha, 0x8039, + "FC port not ready for marker loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n", + fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, arg->qpair->id); + return QLA_SUSPENDED; + } + /* ref: INIT */ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL); if (!sp) @@ -2076,11 +2095,19 @@ qla26xx_marker(struct tmf_arg *arg)
if (rval != QLA_SUCCESS) { ql_log(ql_log_warn, vha, 0x8031, - "Marker IOCB failed (%x).\n", rval); + "Marker IOCB send failure (%x).\n", rval); goto done_free_sp; }
wait_for_completion(&tm_iocb->u.tmf.comp); + rval = tm_iocb->u.tmf.data; + + if (rval != QLA_SUCCESS) { + ql_log(ql_log_warn, vha, 0x8019, + "Marker failed hdl=%x loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d rval %d.\n", + sp->handle, fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, sp->qpair->id, rval); + }
done_free_sp: /* ref: INIT */ @@ -2093,6 +2120,8 @@ static void qla2x00_tmf_sp_done(srb_t *sp, int res) { struct srb_iocb *tmf = &sp->u.iocb_cmd;
+ if (res) + tmf->u.tmf.data = res; complete(&tmf->u.tmf.comp); }
@@ -2106,6 +2135,14 @@ __qla2x00_async_tm_cmd(struct tmf_arg *arg)
fc_port_t *fcport = arg->fcport;
+ if (TMF_NOT_READY(arg->fcport)) { + ql_dbg(ql_dbg_taskm, vha, 0x8032, + "FC port not ready for TM command loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n", + fcport->loop_id, fcport->d_id.b24, + arg->modifier, arg->lun, arg->qpair->id); + return QLA_SUSPENDED; + } + /* ref: INIT */ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL); if (!sp) @@ -2180,7 +2217,9 @@ int qla_get_tmf(fc_port_t *fcport) msleep(1);
spin_lock_irqsave(&ha->tgt.sess_lock, flags); - if (fcport->deleted) { + if (TMF_NOT_READY(fcport)) { + ql_log(ql_log_warn, vha, 0x802c, + "Unable to acquire TM resource due to disruption.\n"); rc = EIO; break; } @@ -2206,7 +2245,10 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, struct scsi_qla_host *vha = fcport->vha; struct qla_qpair *qpair; struct tmf_arg a; - int i, rval; + int i, rval = QLA_SUCCESS; + + if (TMF_NOT_READY(fcport)) + return QLA_SUSPENDED;
a.vha = fcport->vha; a.fcport = fcport; @@ -2225,6 +2267,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, qpair = vha->hw->queue_pair_map[i]; if (!qpair) continue; + + if (TMF_NOT_READY(fcport)) { + ql_log(ql_log_warn, vha, 0x8026, + "Unable to send TM due to disruption.\n"); + rval = QLA_SUSPENDED; + break; + } + a.qpair = qpair; a.flags = flags|TCF_NOTMCMD_TO_TARGET; rval = __qla2x00_async_tm_cmd(&a); @@ -2233,10 +2283,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun, } }
+ if (rval) + goto bailout; + a.qpair = vha->hw->base_qpair; a.flags = flags; rval = __qla2x00_async_tm_cmd(&a);
+bailout: if (a.modifier == MK_SYNC_ID_LUN) qla_put_tmf(fcport);
From: Flora Cui flora.cui@amd.com
[ Upstream commit deefd07eedb7baa25956c8365373e6a58c81565a ]
otherwise adev->mode_info.crtcs[] is NULL
Signed-off-by: Flora Cui flora.cui@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: b42ae87a7b38 ("drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 42 ++++++++++++++++-------- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h | 5 ++- 2 files changed, 30 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c index 7d58bf410be05..24251cdf95073 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -16,6 +16,8 @@ #include "ivsrcid/ivsrcid_vislands30.h" #include "amdgpu_vkms.h" #include "amdgpu_display.h" +#include "atom.h" +#include "amdgpu_irq.h"
/** * DOC: amdgpu_vkms @@ -41,14 +43,13 @@ static const u32 amdgpu_vkms_formats[] = {
static enum hrtimer_restart amdgpu_vkms_vblank_simulate(struct hrtimer *timer) { - struct amdgpu_vkms_output *output = container_of(timer, - struct amdgpu_vkms_output, - vblank_hrtimer); - struct drm_crtc *crtc = &output->crtc; + struct amdgpu_crtc *amdgpu_crtc = container_of(timer, struct amdgpu_crtc, vblank_timer); + struct drm_crtc *crtc = &amdgpu_crtc->base; + struct amdgpu_vkms_output *output = drm_crtc_to_amdgpu_vkms_output(crtc); u64 ret_overrun; bool ret;
- ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer, + ret_overrun = hrtimer_forward_now(&amdgpu_crtc->vblank_timer, output->period_ns); WARN_ON(ret_overrun != 1);
@@ -65,22 +66,21 @@ static int amdgpu_vkms_enable_vblank(struct drm_crtc *crtc) unsigned int pipe = drm_crtc_index(crtc); struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; struct amdgpu_vkms_output *out = drm_crtc_to_amdgpu_vkms_output(crtc); + struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
drm_calc_timestamping_constants(crtc, &crtc->mode);
- hrtimer_init(&out->vblank_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); - out->vblank_hrtimer.function = &amdgpu_vkms_vblank_simulate; out->period_ns = ktime_set(0, vblank->framedur_ns); - hrtimer_start(&out->vblank_hrtimer, out->period_ns, HRTIMER_MODE_REL); + hrtimer_start(&amdgpu_crtc->vblank_timer, out->period_ns, HRTIMER_MODE_REL);
return 0; }
static void amdgpu_vkms_disable_vblank(struct drm_crtc *crtc) { - struct amdgpu_vkms_output *out = drm_crtc_to_amdgpu_vkms_output(crtc); + struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
- hrtimer_cancel(&out->vblank_hrtimer); + hrtimer_cancel(&amdgpu_crtc->vblank_timer); }
static bool amdgpu_vkms_get_vblank_timestamp(struct drm_crtc *crtc, @@ -92,13 +92,14 @@ static bool amdgpu_vkms_get_vblank_timestamp(struct drm_crtc *crtc, unsigned int pipe = crtc->index; struct amdgpu_vkms_output *output = drm_crtc_to_amdgpu_vkms_output(crtc); struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; + struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
if (!READ_ONCE(vblank->enabled)) { *vblank_time = ktime_get(); return true; }
- *vblank_time = READ_ONCE(output->vblank_hrtimer.node.expires); + *vblank_time = READ_ONCE(amdgpu_crtc->vblank_timer.node.expires);
if (WARN_ON(*vblank_time == vblank->time)) return true; @@ -166,6 +167,8 @@ static const struct drm_crtc_helper_funcs amdgpu_vkms_crtc_helper_funcs = { static int amdgpu_vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc, struct drm_plane *primary, struct drm_plane *cursor) { + struct amdgpu_device *adev = drm_to_adev(dev); + struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc); int ret;
ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor, @@ -177,6 +180,17 @@ static int amdgpu_vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
drm_crtc_helper_add(crtc, &amdgpu_vkms_crtc_helper_funcs);
+ amdgpu_crtc->crtc_id = drm_crtc_index(crtc); + adev->mode_info.crtcs[drm_crtc_index(crtc)] = amdgpu_crtc; + + amdgpu_crtc->pll_id = ATOM_PPLL_INVALID; + amdgpu_crtc->encoder = NULL; + amdgpu_crtc->connector = NULL; + amdgpu_crtc->vsync_timer_enabled = AMDGPU_IRQ_STATE_DISABLE; + + hrtimer_init(&amdgpu_crtc->vblank_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + amdgpu_crtc->vblank_timer.function = &amdgpu_vkms_vblank_simulate; + return ret; }
@@ -402,7 +416,7 @@ int amdgpu_vkms_output_init(struct drm_device *dev, { struct drm_connector *connector = &output->connector; struct drm_encoder *encoder = &output->encoder; - struct drm_crtc *crtc = &output->crtc; + struct drm_crtc *crtc = &output->crtc.base; struct drm_plane *primary, *cursor = NULL; int ret;
@@ -505,8 +519,8 @@ static int amdgpu_vkms_sw_fini(void *handle) int i = 0;
for (i = 0; i < adev->mode_info.num_crtc; i++) - if (adev->amdgpu_vkms_output[i].vblank_hrtimer.function) - hrtimer_cancel(&adev->amdgpu_vkms_output[i].vblank_hrtimer); + if (adev->mode_info.crtcs[i]) + hrtimer_cancel(&adev->mode_info.crtcs[i]->vblank_timer);
kfree(adev->mode_info.bios_hardcoded_edid); kfree(adev->amdgpu_vkms_output); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h index 97f1b79c0724e..4f8722ff37c25 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.h @@ -10,15 +10,14 @@ #define YRES_MAX 16384
#define drm_crtc_to_amdgpu_vkms_output(target) \ - container_of(target, struct amdgpu_vkms_output, crtc) + container_of(target, struct amdgpu_vkms_output, crtc.base)
extern const struct amdgpu_ip_block_version amdgpu_vkms_ip_block;
struct amdgpu_vkms_output { - struct drm_crtc crtc; + struct amdgpu_crtc crtc; struct drm_encoder encoder; struct drm_connector connector; - struct hrtimer vblank_hrtimer; ktime_t period_ns; struct drm_pending_vblank_event *event; };
From: Guchun Chen guchun.chen@amd.com
[ Upstream commit b42ae87a7b3878afaf4c3852ca66c025a5b996e0 ]
In below thousands of screen rotation loop tests with virtual display enabled, a CPU hard lockup issue may happen, leading system to unresponsive and crash.
do { xrandr --output Virtual --rotate inverted xrandr --output Virtual --rotate right xrandr --output Virtual --rotate left xrandr --output Virtual --rotate normal } while (1);
NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
? hrtimer_run_softirq+0x140/0x140 ? store_vblank+0xe0/0xe0 [drm] hrtimer_cancel+0x15/0x30 amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu] drm_vblank_disable_and_save+0x185/0x1f0 [drm] drm_crtc_vblank_off+0x159/0x4c0 [drm] ? record_print_text.cold+0x11/0x11 ? wait_for_completion_timeout+0x232/0x280 ? drm_crtc_wait_one_vblank+0x40/0x40 [drm] ? bit_wait_io_timeout+0xe0/0xe0 ? wait_for_completion_interruptible+0x1d7/0x320 ? mutex_unlock+0x81/0xd0 amdgpu_vkms_crtc_atomic_disable
It's caused by a stuck in lock dependency in such scenario on different CPUs.
CPU1 CPU2 drm_crtc_vblank_off hrtimer_interrupt grab event_lock (irq disabled) __hrtimer_run_queues grab vbl_lock/vblank_time_block amdgpu_vkms_vblank_simulate amdgpu_vkms_disable_vblank drm_handle_vblank hrtimer_cancel grab dev->event_lock
So CPU1 stucks in hrtimer_cancel as timer callback is running endless on current clock base, as that timer queue on CPU2 has no chance to finish it because of failing to hold the lock. So NMI watchdog will throw the errors after its threshold, and all later CPUs are impacted/blocked.
So use hrtimer_try_to_cancel to fix this, as disable_vblank callback does not need to wait the handler to finish. And also it's not necessary to check the return value of hrtimer_try_to_cancel, because even if it's -1 which means current timer callback is running, it will be reprogrammed in hrtimer_start with calling enable_vblank to make it works.
v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes tag as well
v3: drop warn printing (Christian)
v4: drop superfluous check of blank->enabled in timer function, as it's guaranteed in drm_handle_vblank (Christian)
Fixes: 84ec374bd580 ("drm/amdgpu: create amdgpu_vkms (v4)") Cc: stable@vger.kernel.org Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c index 24251cdf95073..4e8274de8fc0c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -54,8 +54,9 @@ static enum hrtimer_restart amdgpu_vkms_vblank_simulate(struct hrtimer *timer) WARN_ON(ret_overrun != 1);
ret = drm_crtc_handle_vblank(crtc); + /* Don't queue timer again when vblank is disabled. */ if (!ret) - DRM_ERROR("amdgpu_vkms failure on handling vblank"); + return HRTIMER_NORESTART;
return HRTIMER_RESTART; } @@ -80,7 +81,7 @@ static void amdgpu_vkms_disable_vblank(struct drm_crtc *crtc) { struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
- hrtimer_cancel(&amdgpu_crtc->vblank_timer); + hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer); }
static bool amdgpu_vkms_get_vblank_timestamp(struct drm_crtc *crtc,
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit be22255360f80d3af789daad00025171a65424a5 ]
Since t_checkpoint_io_list was stop using in jbd2_log_do_checkpoint() now, it's time to remove the whole t_checkpoint_io_list logic.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/checkpoint.c | 42 ++---------------------------------------- fs/jbd2/commit.c | 3 +-- include/linux/jbd2.h | 6 ------ 3 files changed, 3 insertions(+), 48 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index d2aba55833f92..c1f543e86170a 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -27,7 +27,7 @@ * * Called with j_list_lock held. */ -static inline void __buffer_unlink_first(struct journal_head *jh) +static inline void __buffer_unlink(struct journal_head *jh) { transaction_t *transaction = jh->b_cp_transaction;
@@ -40,23 +40,6 @@ static inline void __buffer_unlink_first(struct journal_head *jh) } }
-/* - * Unlink a buffer from a transaction checkpoint(io) list. - * - * Called with j_list_lock held. - */ -static inline void __buffer_unlink(struct journal_head *jh) -{ - transaction_t *transaction = jh->b_cp_transaction; - - __buffer_unlink_first(jh); - if (transaction->t_checkpoint_io_list == jh) { - transaction->t_checkpoint_io_list = jh->b_cpnext; - if (transaction->t_checkpoint_io_list == jh) - transaction->t_checkpoint_io_list = NULL; - } -} - /* * Check a checkpoint buffer could be release or not. * @@ -505,15 +488,6 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, break; if (need_resched() || spin_needbreak(&journal->j_list_lock)) break; - if (released) - continue; - - nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_io_list, - nr_to_scan, &released); - if (*nr_to_scan == 0) - break; - if (need_resched() || spin_needbreak(&journal->j_list_lock)) - break; } while (transaction != last_transaction);
if (transaction != last_transaction) { @@ -568,17 +542,6 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) */ if (need_resched()) return; - if (ret) - continue; - /* - * It is essential that we are as careful as in the case of - * t_checkpoint_list with removing the buffer from the list as - * we can possibly see not yet submitted buffers on io_list - */ - ret = journal_clean_one_cp_list(transaction-> - t_checkpoint_io_list, destroy); - if (need_resched()) - return; /* * Stop scanning if we couldn't free the transaction. This * avoids pointless scanning of transactions which still @@ -663,7 +626,7 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) jbd2_journal_put_journal_head(jh);
/* Is this transaction empty? */ - if (transaction->t_checkpoint_list || transaction->t_checkpoint_io_list) + if (transaction->t_checkpoint_list) return 0;
/* @@ -755,7 +718,6 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact J_ASSERT(transaction->t_forget == NULL); J_ASSERT(transaction->t_shadow_list == NULL); J_ASSERT(transaction->t_checkpoint_list == NULL); - J_ASSERT(transaction->t_checkpoint_io_list == NULL); J_ASSERT(atomic_read(&transaction->t_updates) == 0); J_ASSERT(journal->j_committing_transaction != transaction); J_ASSERT(journal->j_running_transaction != transaction); diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index ac328e3321242..20294c1bbeab7 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -1184,8 +1184,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) spin_lock(&journal->j_list_lock); commit_transaction->t_state = T_FINISHED; /* Check if the transaction can be dropped now that we are finished */ - if (commit_transaction->t_checkpoint_list == NULL && - commit_transaction->t_checkpoint_io_list == NULL) { + if (commit_transaction->t_checkpoint_list == NULL) { __jbd2_journal_drop_transaction(journal, commit_transaction); jbd2_journal_free_transaction(commit_transaction); } diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index d63b8106796e2..e6cfbcde96f29 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -626,12 +626,6 @@ struct transaction_s */ struct journal_head *t_checkpoint_list;
- /* - * Doubly-linked circular list of all buffers submitted for IO while - * checkpointing. [j_list_lock] - */ - struct journal_head *t_checkpoint_io_list; - /* * Doubly-linked circular list of metadata buffers being * shadowed by log IO. The IO buffers on the iobuf list and
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit b98dba273a0e47dbfade89c9af73c5b012a4eabb ]
journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, always scan and try to free the whole checkpoint list.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/checkpoint.c | 75 +++++++++---------------------------- include/trace/events/jbd2.h | 12 ++---- 2 files changed, 21 insertions(+), 66 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index c1f543e86170a..ab72aeb766a74 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -349,50 +349,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
/* Checkpoint list management */
-/* - * journal_clean_one_cp_list - * - * Find all the written-back checkpoint buffers in the given list and - * release them. If 'destroy' is set, clean all buffers unconditionally. - * - * Called with j_list_lock held. - * Returns 1 if we freed the transaction, 0 otherwise. - */ -static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) -{ - struct journal_head *last_jh; - struct journal_head *next_jh = jh; - - if (!jh) - return 0; - - last_jh = jh->b_cpprev; - do { - jh = next_jh; - next_jh = jh->b_cpnext; - - if (!destroy && __cp_buffer_busy(jh)) - return 0; - - if (__jbd2_journal_remove_checkpoint(jh)) - return 1; - /* - * This function only frees up some memory - * if possible so we dont have an obligation - * to finish processing. Bail out if preemption - * requested: - */ - if (need_resched()) - return 0; - } while (jh != last_jh); - - return 0; -} - /* * journal_shrink_one_cp_list * - * Find 'nr_to_scan' written-back checkpoint buffers in the given list + * Find all the written-back checkpoint buffers in the given list * and try to release them. If the whole transaction is released, set * the 'released' parameter. Return the number of released checkpointed * buffers. @@ -400,15 +360,15 @@ static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) * Called with j_list_lock held. */ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, - unsigned long *nr_to_scan, - bool *released) + bool destroy, bool *released) { struct journal_head *last_jh; struct journal_head *next_jh = jh; unsigned long nr_freed = 0; int ret;
- if (!jh || *nr_to_scan == 0) + *released = false; + if (!jh) return 0;
last_jh = jh->b_cpprev; @@ -416,8 +376,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext;
- (*nr_to_scan)--; - if (__cp_buffer_busy(jh)) + if (!destroy && __cp_buffer_busy(jh)) continue;
nr_freed++; @@ -429,7 +388,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
if (need_resched()) break; - } while (jh != last_jh && *nr_to_scan); + } while (jh != last_jh);
return nr_freed; } @@ -447,11 +406,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan) { transaction_t *transaction, *last_transaction, *next_transaction; - bool released; + bool __maybe_unused released; tid_t first_tid = 0, last_tid = 0, next_tid = 0; tid_t tid = 0; unsigned long nr_freed = 0; - unsigned long nr_scanned = *nr_to_scan; + unsigned long freed;
again: spin_lock(&journal->j_list_lock); @@ -480,10 +439,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, transaction = next_transaction; next_transaction = transaction->t_cpnext; tid = transaction->t_tid; - released = false;
- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_list, - nr_to_scan, &released); + freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list, + false, &released); + nr_freed += freed; + (*nr_to_scan) -= min(*nr_to_scan, freed); if (*nr_to_scan == 0) break; if (need_resched() || spin_needbreak(&journal->j_list_lock)) @@ -504,9 +464,8 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, if (*nr_to_scan && next_tid) goto again; out: - nr_scanned -= *nr_to_scan; trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid, - nr_freed, nr_scanned, next_tid); + nr_freed, next_tid);
return nr_freed; } @@ -522,7 +481,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) { transaction_t *transaction, *last_transaction, *next_transaction; - int ret; + bool released;
transaction = journal->j_checkpoint_transactions; if (!transaction) @@ -533,8 +492,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) do { transaction = next_transaction; next_transaction = transaction->t_cpnext; - ret = journal_clean_one_cp_list(transaction->t_checkpoint_list, - destroy); + journal_shrink_one_cp_list(transaction->t_checkpoint_list, + destroy, &released); /* * This function only frees up some memory if possible so we * dont have an obligation to finish processing. Bail out if @@ -547,7 +506,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) * avoids pointless scanning of transactions which still * weren't checkpointed. */ - if (!ret) + if (!released) return; } while (transaction != last_transaction); } diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h index 29414288ea3e0..34ce197bd76e0 100644 --- a/include/trace/events/jbd2.h +++ b/include/trace/events/jbd2.h @@ -462,11 +462,9 @@ TRACE_EVENT(jbd2_shrink_scan_exit, TRACE_EVENT(jbd2_shrink_checkpoint_list,
TP_PROTO(journal_t *journal, tid_t first_tid, tid_t tid, tid_t last_tid, - unsigned long nr_freed, unsigned long nr_scanned, - tid_t next_tid), + unsigned long nr_freed, tid_t next_tid),
- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, - nr_scanned, next_tid), + TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, next_tid),
TP_STRUCT__entry( __field(dev_t, dev) @@ -474,7 +472,6 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __field(tid_t, tid) __field(tid_t, last_tid) __field(unsigned long, nr_freed) - __field(unsigned long, nr_scanned) __field(tid_t, next_tid) ),
@@ -484,15 +481,14 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __entry->tid = tid; __entry->last_tid = last_tid; __entry->nr_freed = nr_freed; - __entry->nr_scanned = nr_scanned; __entry->next_tid = next_tid; ),
TP_printk("dev %d,%d shrink transaction %u-%u(%u) freed %lu " - "scanned %lu next transaction %u", + "next transaction %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->first_tid, __entry->tid, __entry->last_tid, - __entry->nr_freed, __entry->nr_scanned, __entry->next_tid) + __entry->nr_freed, __entry->next_tid) );
#endif /* _TRACE_JBD2_H */
On Tue 01-08-23 11:19:19, Greg Kroah-Hartman wrote:
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit b98dba273a0e47dbfade89c9af73c5b012a4eabb ]
journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, always scan and try to free the whole checkpoint list.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin sashal@kernel.org
This and the following patch (46f881b5b175) have some issues [1] and cause a performance regression for some workloads and possible metadata corruption after a crash. So please drop these two patches from the stable trees for now. We can include them again later once the code has stabilized... Thanks!
Honza
[1] https://lore.kernel.org/all/20230714025528.564988-1-yi.zhang@huaweicloud.com
fs/jbd2/checkpoint.c | 75 +++++++++---------------------------- include/trace/events/jbd2.h | 12 ++---- 2 files changed, 21 insertions(+), 66 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index c1f543e86170a..ab72aeb766a74 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -349,50 +349,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal) /* Checkpoint list management */ -/*
- journal_clean_one_cp_list
- Find all the written-back checkpoint buffers in the given list and
- release them. If 'destroy' is set, clean all buffers unconditionally.
- Called with j_list_lock held.
- Returns 1 if we freed the transaction, 0 otherwise.
- */
-static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) -{
- struct journal_head *last_jh;
- struct journal_head *next_jh = jh;
- if (!jh)
return 0;
- last_jh = jh->b_cpprev;
- do {
jh = next_jh;
next_jh = jh->b_cpnext;
if (!destroy && __cp_buffer_busy(jh))
return 0;
if (__jbd2_journal_remove_checkpoint(jh))
return 1;
/*
* This function only frees up some memory
* if possible so we dont have an obligation
* to finish processing. Bail out if preemption
* requested:
*/
if (need_resched())
return 0;
- } while (jh != last_jh);
- return 0;
-}
/*
- journal_shrink_one_cp_list
- Find 'nr_to_scan' written-back checkpoint buffers in the given list
- Find all the written-back checkpoint buffers in the given list
- and try to release them. If the whole transaction is released, set
- the 'released' parameter. Return the number of released checkpointed
- buffers.
@@ -400,15 +360,15 @@ static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
- Called with j_list_lock held.
*/ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
unsigned long *nr_to_scan,
bool *released)
bool destroy, bool *released)
{ struct journal_head *last_jh; struct journal_head *next_jh = jh; unsigned long nr_freed = 0; int ret;
- if (!jh || *nr_to_scan == 0)
- *released = false;
- if (!jh) return 0;
last_jh = jh->b_cpprev; @@ -416,8 +376,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext;
(*nr_to_scan)--;
if (__cp_buffer_busy(jh))
if (!destroy && __cp_buffer_busy(jh)) continue;
nr_freed++; @@ -429,7 +388,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, if (need_resched()) break;
- } while (jh != last_jh && *nr_to_scan);
- } while (jh != last_jh);
return nr_freed; } @@ -447,11 +406,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan) { transaction_t *transaction, *last_transaction, *next_transaction;
- bool released;
- bool __maybe_unused released; tid_t first_tid = 0, last_tid = 0, next_tid = 0; tid_t tid = 0; unsigned long nr_freed = 0;
- unsigned long nr_scanned = *nr_to_scan;
- unsigned long freed;
again: spin_lock(&journal->j_list_lock); @@ -480,10 +439,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, transaction = next_transaction; next_transaction = transaction->t_cpnext; tid = transaction->t_tid;
released = false;
nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_list,
nr_to_scan, &released);
freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list,
false, &released);
nr_freed += freed;
if (*nr_to_scan == 0) break; if (need_resched() || spin_needbreak(&journal->j_list_lock))(*nr_to_scan) -= min(*nr_to_scan, freed);
@@ -504,9 +464,8 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, if (*nr_to_scan && next_tid) goto again; out:
- nr_scanned -= *nr_to_scan; trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid,
nr_freed, nr_scanned, next_tid);
nr_freed, next_tid);
return nr_freed; } @@ -522,7 +481,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) { transaction_t *transaction, *last_transaction, *next_transaction;
- int ret;
- bool released;
transaction = journal->j_checkpoint_transactions; if (!transaction) @@ -533,8 +492,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) do { transaction = next_transaction; next_transaction = transaction->t_cpnext;
ret = journal_clean_one_cp_list(transaction->t_checkpoint_list,
destroy);
journal_shrink_one_cp_list(transaction->t_checkpoint_list,
/*destroy, &released);
- This function only frees up some memory if possible so we
- dont have an obligation to finish processing. Bail out if
@@ -547,7 +506,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) * avoids pointless scanning of transactions which still * weren't checkpointed. */
if (!ret)
} while (transaction != last_transaction);if (!released) return;
} diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h index 29414288ea3e0..34ce197bd76e0 100644 --- a/include/trace/events/jbd2.h +++ b/include/trace/events/jbd2.h @@ -462,11 +462,9 @@ TRACE_EVENT(jbd2_shrink_scan_exit, TRACE_EVENT(jbd2_shrink_checkpoint_list, TP_PROTO(journal_t *journal, tid_t first_tid, tid_t tid, tid_t last_tid,
unsigned long nr_freed, unsigned long nr_scanned,
tid_t next_tid),
unsigned long nr_freed, tid_t next_tid),
- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed,
nr_scanned, next_tid),
- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, next_tid),
TP_STRUCT__entry( __field(dev_t, dev) @@ -474,7 +472,6 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __field(tid_t, tid) __field(tid_t, last_tid) __field(unsigned long, nr_freed)
__field(tid_t, next_tid) ),__field(unsigned long, nr_scanned)
@@ -484,15 +481,14 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __entry->tid = tid; __entry->last_tid = last_tid; __entry->nr_freed = nr_freed;
__entry->next_tid = next_tid; ),__entry->nr_scanned = nr_scanned;
TP_printk("dev %d,%d shrink transaction %u-%u(%u) freed %lu "
"scanned %lu next transaction %u",
MAJOR(__entry->dev), MINOR(__entry->dev), __entry->first_tid, __entry->tid, __entry->last_tid,"next transaction %u",
__entry->nr_freed, __entry->nr_scanned, __entry->next_tid)
__entry->nr_freed, __entry->next_tid)
);
#endif /* _TRACE_JBD2_H */
2.39.2
On Tue, Aug 01, 2023 at 03:09:44PM +0200, Jan Kara wrote:
On Tue 01-08-23 11:19:19, Greg Kroah-Hartman wrote:
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit b98dba273a0e47dbfade89c9af73c5b012a4eabb ]
journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, always scan and try to free the whole checkpoint list.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Sasha Levin sashal@kernel.org
This and the following patch (46f881b5b175) have some issues [1] and cause a performance regression for some workloads and possible metadata corruption after a crash. So please drop these two patches from the stable trees for now. We can include them again later once the code has stabilized... Thanks!
Thanks, turns out there were 3 patches here (2 prep patches, one fix) that needed to be dropped from all queues. Let us know when the fix is in Linus's tree so we can add them back in.
thanks,
greg k-h
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 46f881b5b1758dc4a35fba4a643c10717d0cf427 ]
Before removing checkpoint buffer from the t_checkpoint_list, we have to check both BH_Dirty and BH_Lock bits together to distinguish buffers have not been or were being written back. But __cp_buffer_busy() checks them separately, it first check lock state and then check dirty, the window between these two checks could be raced by writing back procedure, which locks buffer and clears buffer dirty before I/O completes. So it cannot guarantee checkpointing buffers been written back to disk if some error happens later. Finally, it may clean checkpoint transactions and lead to inconsistent filesystem.
jbd2_journal_forget() and __journal_try_to_free_buffer() also have the same problem (journal_unmap_buffer() escape from this issue since it's running under the buffer lock), so fix them through introducing a new helper to try holding the buffer lock and remove really clean buffer.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490 Cc: stable@vger.kernel.org Suggested-by: Jan Kara jack@suse.cz Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230606135928.434610-6-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/checkpoint.c | 38 +++++++++++++++++++++++++++++++++++--- fs/jbd2/transaction.c | 17 +++++------------ include/linux/jbd2.h | 1 + 3 files changed, 41 insertions(+), 15 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index ab72aeb766a74..fc6989e7a8c51 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -376,11 +376,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext;
- if (!destroy && __cp_buffer_busy(jh)) - continue; + if (destroy) { + ret = __jbd2_journal_remove_checkpoint(jh); + } else { + ret = jbd2_journal_try_remove_checkpoint(jh); + if (ret < 0) + continue; + }
nr_freed++; - ret = __jbd2_journal_remove_checkpoint(jh); if (ret) { *released = true; break; @@ -616,6 +620,34 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) return 1; }
+/* + * Check the checkpoint buffer and try to remove it from the checkpoint + * list if it's clean. Returns -EBUSY if it is not clean, returns 1 if + * it frees the transaction, 0 otherwise. + * + * This function is called with j_list_lock held. + */ +int jbd2_journal_try_remove_checkpoint(struct journal_head *jh) +{ + struct buffer_head *bh = jh2bh(jh); + + if (!trylock_buffer(bh)) + return -EBUSY; + if (buffer_dirty(bh)) { + unlock_buffer(bh); + return -EBUSY; + } + unlock_buffer(bh); + + /* + * Buffer is clean and the IO has finished (we held the buffer + * lock) so the checkpoint is done. We can safely remove the + * buffer from this transaction. + */ + JBUFFER_TRACE(jh, "remove from checkpoint list"); + return __jbd2_journal_remove_checkpoint(jh); +} + /* * journal_insert_checkpoint: put a committed buffer onto a checkpoint * list so that we know when it is safe to clean the transaction out of diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index ce4a5ccadeff4..62e68c5b8ec3d 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1775,8 +1775,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh) * Otherwise, if the buffer has been written to disk, * it is safe to remove the checkpoint and drop it. */ - if (!buffer_dirty(bh)) { - __jbd2_journal_remove_checkpoint(jh); + if (jbd2_journal_try_remove_checkpoint(jh) >= 0) { spin_unlock(&journal->j_list_lock); goto drop; } @@ -2103,20 +2102,14 @@ __journal_try_to_free_buffer(journal_t *journal, struct buffer_head *bh)
jh = bh2jh(bh);
- if (buffer_locked(bh) || buffer_dirty(bh)) - goto out; - if (jh->b_next_transaction != NULL || jh->b_transaction != NULL) - goto out; + return;
spin_lock(&journal->j_list_lock); - if (jh->b_cp_transaction != NULL) { - /* written-back checkpointed metadata buffer */ - JBUFFER_TRACE(jh, "remove from checkpoint list"); - __jbd2_journal_remove_checkpoint(jh); - } + /* Remove written-back checkpointed metadata buffer */ + if (jh->b_cp_transaction != NULL) + jbd2_journal_try_remove_checkpoint(jh); spin_unlock(&journal->j_list_lock); -out: return; }
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index e6cfbcde96f29..ade8a6d7acff9 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1441,6 +1441,7 @@ extern void jbd2_journal_commit_transaction(journal_t *); void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy); unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan); int __jbd2_journal_remove_checkpoint(struct journal_head *); +int jbd2_journal_try_remove_checkpoint(struct journal_head *jh); void jbd2_journal_destroy_checkpoint(journal_t *journal); void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
From: Yuan Can yuancan@huawei.com
[ Upstream commit 668dc8afce43d4bc01feb3e929d6d5ffcb14f899 ]
In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs.
Signed-off-by: Yuan Can yuancan@huawei.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Andrew Halaney ahalaney@redhat.com Link: https://lore.kernel.org/r/20220922111228.36355-8-yuancan@huawei.com Signed-off-by: Vinod Koul vkoul@kernel.org Stable-dep-of: 8a0eb8f9b9a0 ("phy: qcom-snps-femto-v2: properly enable ref clock") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index 7e61202aa234e..54846259405a9 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -304,12 +304,9 @@ static int qcom_snps_hsphy_probe(struct platform_device *pdev) return PTR_ERR(hsphy->base);
hsphy->ref_clk = devm_clk_get(dev, "ref"); - if (IS_ERR(hsphy->ref_clk)) { - ret = PTR_ERR(hsphy->ref_clk); - if (ret != -EPROBE_DEFER) - dev_err(dev, "failed to get ref clk, %d\n", ret); - return ret; - } + if (IS_ERR(hsphy->ref_clk)) + return dev_err_probe(dev, PTR_ERR(hsphy->ref_clk), + "failed to get ref clk\n");
hsphy->phy_reset = devm_reset_control_get_exclusive(&pdev->dev, NULL); if (IS_ERR(hsphy->phy_reset)) { @@ -322,12 +319,9 @@ static int qcom_snps_hsphy_probe(struct platform_device *pdev) hsphy->vregs[i].supply = qcom_snps_hsphy_vreg_names[i];
ret = devm_regulator_bulk_get(dev, num, hsphy->vregs); - if (ret) { - if (ret != -EPROBE_DEFER) - dev_err(dev, "failed to get regulator supplies: %d\n", - ret); - return ret; - } + if (ret) + return dev_err_probe(dev, ret, + "failed to get regulator supplies\n");
pm_runtime_set_active(dev); pm_runtime_enable(dev);
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 2a881183dc5ab2474ef602e48fe7af34db460d95 ]
Update kerneldoc of struct qcom_snps_hsphy to fix:
drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c:135: warning: Function parameter or member 'update_seq_cfg' not described in 'qcom_snps_hsphy'
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230507144818.193039-1-krzysztof.kozlowski@linaro... Signed-off-by: Vinod Koul vkoul@kernel.org Stable-dep-of: 8a0eb8f9b9a0 ("phy: qcom-snps-femto-v2: properly enable ref clock") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index 54846259405a9..136b45903c798 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -73,11 +73,11 @@ static const char * const qcom_snps_hsphy_vreg_names[] = { * * @cfg_ahb_clk: AHB2PHY interface clock * @ref_clk: phy reference clock - * @iface_clk: phy interface clock * @phy_reset: phy reset control * @vregs: regulator supplies bulk data * @phy_initialized: if PHY has been initialized correctly * @mode: contains the current mode the PHY is in + * @update_seq_cfg: tuning parameters for phy init */ struct qcom_snps_hsphy { struct phy *phy;
From: Adrien Thierry athierry@redhat.com
[ Upstream commit 45d89a344eb46db9dce851c28e14f5e3c635c251 ]
In the dwc3 core, both system and runtime suspend end up calling dwc3_suspend_common(). From there, what happens for the PHYs depends on the USB mode and whether the controller is entering system or runtime suspend.
HOST mode: (1) system suspend on a non-wakeup-capable controller
The [1] if branch is taken. dwc3_core_exit() is called, which ends up calling phy_power_off() and phy_exit(). Those two functions decrease the PM runtime count at some point, so they will trigger the PHY runtime sleep (assuming the count is right).
(2) runtime suspend / system suspend on a wakeup-capable controller
The [1] branch is not taken. dwc3_suspend_common() calls phy_pm_runtime_put_sync(). Assuming the ref count is right, the PHY runtime suspend op is called.
DEVICE mode: dwc3_core_exit() is called on both runtime and system sleep unless the controller is already runtime suspended.
OTG mode: (1) system suspend : dwc3_core_exit() is called
(2) runtime suspend : do nothing
In host mode, the code seems to make a distinction between 1) runtime sleep / system sleep for wakeup-capable controller, and 2) system sleep for non-wakeup-capable controller, where phy_power_off() and phy_exit() are only called for the latter. This suggests the PHY is not supposed to be in a fully powered-off state for runtime sleep and system sleep for wakeup-capable controller.
Moreover, downstream, cfg_ahb_clk only gets disabled for system suspend. The clocks are disabled by phy->set_suspend() [2] which is only called in the system sleep path through dwc3_core_exit() [3].
With that in mind, don't disable the clocks during the femto PHY runtime suspend callback. The clocks will only be disabled during system suspend for non-wakeup-capable controllers, through dwc3_core_exit().
[1] https://elixir.bootlin.com/linux/v6.4/source/drivers/usb/dwc3/core.c#L1988 [2] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-05300... [3] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-05300...
Signed-off-by: Adrien Thierry athierry@redhat.com Link: https://lore.kernel.org/r/20230629144542.14906-2-athierry@redhat.com Signed-off-by: Vinod Koul vkoul@kernel.org Stable-dep-of: 8a0eb8f9b9a0 ("phy: qcom-snps-femto-v2: properly enable ref clock") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index 136b45903c798..dfe5f09449100 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -122,22 +122,13 @@ static int qcom_snps_hsphy_suspend(struct qcom_snps_hsphy *hsphy) 0, USB2_AUTO_RESUME); }
- clk_disable_unprepare(hsphy->cfg_ahb_clk); return 0; }
static int qcom_snps_hsphy_resume(struct qcom_snps_hsphy *hsphy) { - int ret; - dev_dbg(&hsphy->phy->dev, "Resume QCOM SNPS PHY, mode\n");
- ret = clk_prepare_enable(hsphy->cfg_ahb_clk); - if (ret) { - dev_err(&hsphy->phy->dev, "failed to enable cfg ahb clock\n"); - return ret; - } - return 0; }
From: Adrien Thierry athierry@redhat.com
[ Upstream commit 8a0eb8f9b9a002291a3934acfd913660b905249e ]
The driver is not enabling the ref clock, which thus gets disabled by the clk_disable_unused() initcall. This leads to the dwc3 controller failing to initialize if probed after clk_disable_unused() is called, for instance when the driver is built as a module.
To fix this, switch to the clk_bulk API to handle both cfg_ahb and ref clocks at the proper places.
Note that the cfg_ahb clock is currently not used by any device tree instantiation of the PHY. Work needs to be done separately to fix this.
Link: https://lore.kernel.org/linux-arm-msm/ZEqvy+khHeTkC2hf@fedora/ Fixes: 51e8114f80d0 ("phy: qcom-snps: Add SNPS USB PHY driver for QCOM based SOCs") Signed-off-by: Adrien Thierry athierry@redhat.com Link: https://lore.kernel.org/r/20230629144542.14906-3-athierry@redhat.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 63 ++++++++++++++----- 1 file changed, 48 insertions(+), 15 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index dfe5f09449100..abb9264569336 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -68,11 +68,13 @@ static const char * const qcom_snps_hsphy_vreg_names[] = { /** * struct qcom_snps_hsphy - snps hs phy attributes * + * @dev: device structure + * * @phy: generic phy * @base: iomapped memory space for snps hs phy * - * @cfg_ahb_clk: AHB2PHY interface clock - * @ref_clk: phy reference clock + * @num_clks: number of clocks + * @clks: array of clocks * @phy_reset: phy reset control * @vregs: regulator supplies bulk data * @phy_initialized: if PHY has been initialized correctly @@ -80,11 +82,13 @@ static const char * const qcom_snps_hsphy_vreg_names[] = { * @update_seq_cfg: tuning parameters for phy init */ struct qcom_snps_hsphy { + struct device *dev; + struct phy *phy; void __iomem *base;
- struct clk *cfg_ahb_clk; - struct clk *ref_clk; + int num_clks; + struct clk_bulk_data *clks; struct reset_control *phy_reset; struct regulator_bulk_data vregs[SNPS_HS_NUM_VREGS];
@@ -92,6 +96,34 @@ struct qcom_snps_hsphy { enum phy_mode mode; };
+static int qcom_snps_hsphy_clk_init(struct qcom_snps_hsphy *hsphy) +{ + struct device *dev = hsphy->dev; + + hsphy->num_clks = 2; + hsphy->clks = devm_kcalloc(dev, hsphy->num_clks, sizeof(*hsphy->clks), GFP_KERNEL); + if (!hsphy->clks) + return -ENOMEM; + + /* + * TODO: Currently no device tree instantiation of the PHY is using the clock. + * This needs to be fixed in order for this code to be able to use devm_clk_bulk_get(). + */ + hsphy->clks[0].id = "cfg_ahb"; + hsphy->clks[0].clk = devm_clk_get_optional(dev, "cfg_ahb"); + if (IS_ERR(hsphy->clks[0].clk)) + return dev_err_probe(dev, PTR_ERR(hsphy->clks[0].clk), + "failed to get cfg_ahb clk\n"); + + hsphy->clks[1].id = "ref"; + hsphy->clks[1].clk = devm_clk_get(dev, "ref"); + if (IS_ERR(hsphy->clks[1].clk)) + return dev_err_probe(dev, PTR_ERR(hsphy->clks[1].clk), + "failed to get ref clk\n"); + + return 0; +} + static inline void qcom_snps_hsphy_write_mask(void __iomem *base, u32 offset, u32 mask, u32 val) { @@ -174,16 +206,16 @@ static int qcom_snps_hsphy_init(struct phy *phy) if (ret) return ret;
- ret = clk_prepare_enable(hsphy->cfg_ahb_clk); + ret = clk_bulk_prepare_enable(hsphy->num_clks, hsphy->clks); if (ret) { - dev_err(&phy->dev, "failed to enable cfg ahb clock, %d\n", ret); + dev_err(&phy->dev, "failed to enable clocks, %d\n", ret); goto poweroff_phy; }
ret = reset_control_assert(hsphy->phy_reset); if (ret) { dev_err(&phy->dev, "failed to assert phy_reset, %d\n", ret); - goto disable_ahb_clk; + goto disable_clks; }
usleep_range(100, 150); @@ -191,7 +223,7 @@ static int qcom_snps_hsphy_init(struct phy *phy) ret = reset_control_deassert(hsphy->phy_reset); if (ret) { dev_err(&phy->dev, "failed to de-assert phy_reset, %d\n", ret); - goto disable_ahb_clk; + goto disable_clks; }
qcom_snps_hsphy_write_mask(hsphy->base, USB2_PHY_USB_PHY_CFG0, @@ -237,8 +269,8 @@ static int qcom_snps_hsphy_init(struct phy *phy)
return 0;
-disable_ahb_clk: - clk_disable_unprepare(hsphy->cfg_ahb_clk); +disable_clks: + clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks); poweroff_phy: regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs);
@@ -250,7 +282,7 @@ static int qcom_snps_hsphy_exit(struct phy *phy) struct qcom_snps_hsphy *hsphy = phy_get_drvdata(phy);
reset_control_assert(hsphy->phy_reset); - clk_disable_unprepare(hsphy->cfg_ahb_clk); + clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks); regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs); hsphy->phy_initialized = false;
@@ -290,14 +322,15 @@ static int qcom_snps_hsphy_probe(struct platform_device *pdev) if (!hsphy) return -ENOMEM;
+ hsphy->dev = dev; + hsphy->base = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(hsphy->base)) return PTR_ERR(hsphy->base);
- hsphy->ref_clk = devm_clk_get(dev, "ref"); - if (IS_ERR(hsphy->ref_clk)) - return dev_err_probe(dev, PTR_ERR(hsphy->ref_clk), - "failed to get ref clk\n"); + ret = qcom_snps_hsphy_clk_init(hsphy); + if (ret) + return dev_err_probe(dev, ret, "failed to initialize clocks\n");
hsphy->phy_reset = devm_reset_control_get_exclusive(&pdev->dev, NULL); if (IS_ERR(hsphy->phy_reset)) {
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit f84d41b2a083b990cbdf70f3b24b6b108b9678ad ]
SoundWire device status can be incorrectly updated without proper mask, fix this by adding a mask before updating the status.
Fixes: c7d49c76d1d5 ("soundwire: qcom: add support to new interrupts") Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20230525133812.30841-2-srinivas.kandagatla@linaro.... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soundwire/qcom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c index 2045bcdfce1ab..e3b52d5aa411e 100644 --- a/drivers/soundwire/qcom.c +++ b/drivers/soundwire/qcom.c @@ -405,7 +405,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl) status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));
if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) { - ctrl->status[dev_num] = status; + ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK; return dev_num; } }
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit bf4c985707d3168ebb7d87d15830de66949d979c ]
Select V4L2_FWNODE as the driver depends on it.
Reported-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Fixes: aa31f6514047 ("media: atomisp: allow building the driver again") Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Tested-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/media/atomisp/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/staging/media/atomisp/Kconfig b/drivers/staging/media/atomisp/Kconfig index aeed5803dfb1e..0031d76356c1c 100644 --- a/drivers/staging/media/atomisp/Kconfig +++ b/drivers/staging/media/atomisp/Kconfig @@ -13,6 +13,7 @@ config VIDEO_ATOMISP tristate "Intel Atom Image Signal Processor Driver" depends on VIDEO_V4L2 && INTEL_ATOMISP depends on PMIC_OPREGION + select V4L2_FWNODE select IOSF_MBI select VIDEOBUF_VMALLOC select VIDEO_V4L2_SUBDEV_API
From: Wang Ming machel@vivo.com
[ Upstream commit 043b1f185fb0f3939b7427f634787706f45411c4 ]
The debugfs_create_dir() function returns error pointers. It never returns NULL. Most incorrect error checks were fixed, but the one in i40e_dbg_init() was forgotten.
Fix the remaining error check.
Fixes: 02e9c290814c ("i40e: debugfs interface") Signed-off-by: Wang Ming machel@vivo.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c index c057343165a51..7c5f874ef335a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c @@ -1839,7 +1839,7 @@ void i40e_dbg_pf_exit(struct i40e_pf *pf) void i40e_dbg_init(void) { i40e_dbg_root = debugfs_create_dir(i40e_driver_name, NULL); - if (!i40e_dbg_root) + if (IS_ERR(i40e_dbg_root)) pr_info("init of debugfs failed\n"); }
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit a2f054c10bef0b54600ec9cb776508443e941343 ]
In iavf_adminq_task(), if kzalloc() fails to allocate the event.msg_buf, the function will exit without releasing the adapter->crit_lock.
This is unlikely, but if it happens, the next access to that mutex will deadlock.
Fix this by moving the unlock to the end of the function, and adding a new label to allow jumping to the unlock portion of the function exit flow.
Fixes: fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") Signed-off-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index bcceb2ddfea63..1e349a90d21aa 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -2546,7 +2546,7 @@ static void iavf_adminq_task(struct work_struct *work) event.buf_len = IAVF_MAX_AQ_BUF_SIZE; event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL); if (!event.msg_buf) - goto out; + goto unlock;
do { ret = iavf_clean_arq_element(hw, &event, &pending); @@ -2561,7 +2561,6 @@ static void iavf_adminq_task(struct work_struct *work) if (pending != 0) memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE); } while (pending); - mutex_unlock(&adapter->crit_lock);
if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES)) { if (adapter->netdev_registered || @@ -2619,6 +2618,8 @@ static void iavf_adminq_task(struct work_struct *work)
freedom: kfree(event.msg_buf); +unlock: + mutex_unlock(&adapter->crit_lock); out: /* re-enable Admin queue interrupt cause */ iavf_misc_irq_enable(adapter);
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit 91896c8acce23d33ed078cffd46a9534b1f82be5 ]
In iavf_adminq_task(), if the function can't acquire the adapter->crit_lock, it checks if the driver is removing. If so, it simply exits without re-enabling the interrupt. This is done to ensure that the task stops processing as soon as possible once the driver is being removed.
However, if the IAVF_FLAG_PF_COMMS_FAILED is set, the function checks this before attempting to acquire the lock. In this case, the function exits early and re-enables the interrupt. This will happen even if the driver is already removing.
Avoid this, by moving the check to after the adapter->crit_lock is acquired. This way, if the driver is removing, we will not re-enable the interrupt.
Fixes: fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") Signed-off-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index 1e349a90d21aa..a87f4f1ae6845 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -2532,9 +2532,6 @@ static void iavf_adminq_task(struct work_struct *work) u32 val, oldval; u16 pending;
- if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) - goto out; - if (!mutex_trylock(&adapter->crit_lock)) { if (adapter->state == __IAVF_REMOVE) return; @@ -2543,6 +2540,9 @@ static void iavf_adminq_task(struct work_struct *work) goto out; }
+ if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) + goto unlock; + event.buf_len = IAVF_MAX_AQ_BUF_SIZE; event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL); if (!event.msg_buf)
From: Jiawen Wu jiawenwu@trustnetic.com
[ Upstream commit c7b75bea853daeb64fc831dbf39a6bbabcc402ac ]
Clear MV_V2_PORT_CTRL_PWRDOWN bit to set power up for 88x3310 PHY, it sometimes does not take effect immediately. And a read of this register causes the bit not to clear. This will cause mv3310_reset() to time out, which will fail the config initialization. So add a delay before the next access.
Fixes: c9cc1c815d36 ("net: phy: marvell10g: place in powersave mode at probe") Signed-off-by: Jiawen Wu jiawenwu@trustnetic.com Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/marvell10g.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c index df33637c5269a..1caa6d943a7b7 100644 --- a/drivers/net/phy/marvell10g.c +++ b/drivers/net/phy/marvell10g.c @@ -307,6 +307,13 @@ static int mv3310_power_up(struct phy_device *phydev) ret = phy_clear_bits_mmd(phydev, MDIO_MMD_VEND2, MV_V2_PORT_CTRL, MV_V2_PORT_CTRL_PWRDOWN);
+ /* Sometimes, the power down bit doesn't clear immediately, and + * a read of this register causes the bit not to clear. Delay + * 100us to allow the PHY to come out of power down mode before + * the next access. + */ + udelay(100); + if (phydev->drv->phy_id != MARVELL_PHY_ID_88X3310 || priv->firmware_ver < 0x00030000) return ret;
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 116d9f732eef634abbd871f2c6f613a5b4677742 ]
Currently, the weight saved by the driver is used as the query result, which may be different from the actual weight in the register. Therefore, the register value read from the firmware is used as the query result
Fixes: 0e32038dc856 ("net: hns3: refactor dump tc of debugfs") Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c index 9cda8b3562b89..dd8b73aebe6a5 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c @@ -677,8 +677,7 @@ static int hclge_dbg_dump_tc(struct hclge_dev *hdev, char *buf, int len) for (i = 0; i < HNAE3_MAX_TC; i++) { sch_mode_str = ets_weight->tc_weight[i] ? "dwrr" : "sp"; pos += scnprintf(buf + pos, len - pos, "%u %4s %3u\n", - i, sch_mode_str, - hdev->tm_info.pg_info[0].tc_dwrr[i]); + i, sch_mode_str, ets_weight->tc_weight[i]); }
return 0;
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 882481b1c55fc44861d7e2d54b4e0936b1b39f2c ]
In dwrr mode, the default bandwidth weight of disabled tc is set to 0. If the bandwidth weight is 0, the mode will change to sp. Therefore, disabled tc default bandwidth weight need changed to 1, and 0 is returned when query the bandwidth weight of disabled tc. In addition, driver need stop configure bandwidth weight if tc is disabled.
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Jie Wang wangjie125@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 17 ++++++++++++++--- .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 3 ++- 2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c index 375ebf105a9aa..87640a2e1794b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c @@ -52,7 +52,10 @@ static void hclge_tm_info_to_ieee_ets(struct hclge_dev *hdev,
for (i = 0; i < HNAE3_MAX_TC; i++) { ets->prio_tc[i] = hdev->tm_info.prio_tc[i]; - ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i]; + if (i < hdev->tm_info.num_tc) + ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i]; + else + ets->tc_tx_bw[i] = 0;
if (hdev->tm_info.tc_info[i].tc_sch_mode == HCLGE_SCH_MODE_SP) @@ -123,7 +126,8 @@ static u8 hclge_ets_tc_changed(struct hclge_dev *hdev, struct ieee_ets *ets, }
static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev, - struct ieee_ets *ets, bool *changed) + struct ieee_ets *ets, bool *changed, + u8 tc_num) { bool has_ets_tc = false; u32 total_ets_bw = 0; @@ -137,6 +141,13 @@ static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev, *changed = true; break; case IEEE_8021QAZ_TSA_ETS: + if (i >= tc_num) { + dev_err(&hdev->pdev->dev, + "tc%u is disabled, cannot set ets bw\n", + i); + return -EINVAL; + } + /* The hardware will switch to sp mode if bandwidth is * 0, so limit ets bandwidth must be greater than 0. */ @@ -176,7 +187,7 @@ static int hclge_ets_validate(struct hclge_dev *hdev, struct ieee_ets *ets, if (ret) return ret;
- ret = hclge_ets_sch_mode_validate(hdev, ets, changed); + ret = hclge_ets_sch_mode_validate(hdev, ets, changed, tc_num); if (ret) return ret;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c index 97a6864f60ef4..e7cb6a81e5b67 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c @@ -732,6 +732,7 @@ static void hclge_tm_tc_info_init(struct hclge_dev *hdev) static void hclge_tm_pg_info_init(struct hclge_dev *hdev) { #define BW_PERCENT 100 +#define DEFAULT_BW_WEIGHT 1
u8 i;
@@ -753,7 +754,7 @@ static void hclge_tm_pg_info_init(struct hclge_dev *hdev) for (k = 0; k < hdev->tm_info.num_tc; k++) hdev->tm_info.pg_info[i].tc_dwrr[k] = BW_PERCENT; for (; k < HNAE3_MAX_TC; k++) - hdev->tm_info.pg_info[i].tc_dwrr[k] = 0; + hdev->tm_info.pg_info[i].tc_dwrr[k] = DEFAULT_BW_WEIGHT; } }
From: Roopa Prabhu roopa@nvidia.com
[ Upstream commit 6765393614ea8e2c0a7b953063513823f87c9115 ]
vxlan.c has grown too long. This patch moves it to its own directory. subsequent patches add new functionality in new files.
Signed-off-by: Roopa Prabhu roopa@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 94d166c5318c ("vxlan: calculate correct header length for GPE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/Makefile | 2 +- drivers/net/vxlan/Makefile | 7 +++++++ drivers/net/{vxlan.c => vxlan/vxlan_core.c} | 0 3 files changed, 8 insertions(+), 1 deletion(-) create mode 100644 drivers/net/vxlan/Makefile rename drivers/net/{vxlan.c => vxlan/vxlan_core.c} (100%)
diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 739838623cf65..50e60852f1286 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -30,7 +30,7 @@ obj-$(CONFIG_TUN) += tun.o obj-$(CONFIG_TAP) += tap.o obj-$(CONFIG_VETH) += veth.o obj-$(CONFIG_VIRTIO_NET) += virtio_net.o -obj-$(CONFIG_VXLAN) += vxlan.o +obj-$(CONFIG_VXLAN) += vxlan/ obj-$(CONFIG_GENEVE) += geneve.o obj-$(CONFIG_BAREUDP) += bareudp.o obj-$(CONFIG_GTP) += gtp.o diff --git a/drivers/net/vxlan/Makefile b/drivers/net/vxlan/Makefile new file mode 100644 index 0000000000000..5672661335933 --- /dev/null +++ b/drivers/net/vxlan/Makefile @@ -0,0 +1,7 @@ +# +# Makefile for the vxlan driver +# + +obj-$(CONFIG_VXLAN) += vxlan.o + +vxlan-objs := vxlan_core.o diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan/vxlan_core.c similarity index 100% rename from drivers/net/vxlan.c rename to drivers/net/vxlan/vxlan_core.c
From: Jiri Benc jbenc@redhat.com
[ Upstream commit 94d166c5318c6edd1e079df8552233443e909c33 ]
VXLAN-GPE does not add an extra inner Ethernet header. Take that into account when calculating header length.
This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is cached.
In the collect_md mode (which is the only mode that VXLAN-GPE supports), there's no magic auto-setting of the tunnel interface MTU. It can't be, since the destination and thus the underlying interface may be different for each packet.
So, the administrator is responsible for setting the correct tunnel interface MTU. Apparently, the administrators are capable enough to calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36). They set the tunnel interface MTU to 1464. If you run a TCP stream over such interface, it's then segmented according to the MTU 1464, i.e. producing 1514 bytes frames. Which is okay, this still fits the lower MTU.
However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as the header size and thus incorrectly calculates the frame size to be 1528. This leads to ICMP too big message being generated (locally), PMTU of 1450 to be cached and the TCP stream to be resegmented.
The fix is to use the correct actual header size, especially for skb_tunnel_check_pmtu calculation.
Fixes: e1e5314de08ba ("vxlan: implement GPE") Signed-off-by: Jiri Benc jbenc@redhat.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +- drivers/net/vxlan/vxlan_core.c | 23 ++++++++----------- include/net/vxlan.h | 13 +++++++---- 3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 6fb9c18297bc8..af824370a2f6f 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -8398,7 +8398,7 @@ static void ixgbe_atr(struct ixgbe_ring *ring, struct ixgbe_adapter *adapter = q_vector->adapter;
if (unlikely(skb_tail_pointer(skb) < hdr.network + - VXLAN_HEADROOM)) + vxlan_headroom(0))) return;
/* verify the port is recognized as VXLAN */ diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 129e270e9a7cd..106b66570e046 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2721,7 +2721,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, }
ndst = &rt->dst; - err = skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM, + err = skb_tunnel_check_pmtu(skb, ndst, vxlan_headroom(flags & VXLAN_F_GPE), netif_is_any_bridge_port(dev)); if (err < 0) { goto tx_error; @@ -2782,7 +2782,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, goto out_unlock; }
- err = skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM, + err = skb_tunnel_check_pmtu(skb, ndst, + vxlan_headroom((flags & VXLAN_F_GPE) | VXLAN_F_IPV6), netif_is_any_bridge_port(dev)); if (err < 0) { goto tx_error; @@ -3159,14 +3160,12 @@ static int vxlan_change_mtu(struct net_device *dev, int new_mtu) struct vxlan_rdst *dst = &vxlan->default_dst; struct net_device *lowerdev = __dev_get_by_index(vxlan->net, dst->remote_ifindex); - bool use_ipv6 = !!(vxlan->cfg.flags & VXLAN_F_IPV6);
/* This check is different than dev->max_mtu, because it looks at * the lowerdev->mtu, rather than the static dev->max_mtu */ if (lowerdev) { - int max_mtu = lowerdev->mtu - - (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM); + int max_mtu = lowerdev->mtu - vxlan_headroom(vxlan->cfg.flags); if (new_mtu > max_mtu) return -EINVAL; } @@ -3788,11 +3787,11 @@ static void vxlan_config_apply(struct net_device *dev, struct vxlan_dev *vxlan = netdev_priv(dev); struct vxlan_rdst *dst = &vxlan->default_dst; unsigned short needed_headroom = ETH_HLEN; - bool use_ipv6 = !!(conf->flags & VXLAN_F_IPV6); int max_mtu = ETH_MAX_MTU; + u32 flags = conf->flags;
if (!changelink) { - if (conf->flags & VXLAN_F_GPE) + if (flags & VXLAN_F_GPE) vxlan_raw_setup(dev); else vxlan_ether_setup(dev); @@ -3818,8 +3817,7 @@ static void vxlan_config_apply(struct net_device *dev,
dev->needed_tailroom = lowerdev->needed_tailroom;
- max_mtu = lowerdev->mtu - (use_ipv6 ? VXLAN6_HEADROOM : - VXLAN_HEADROOM); + max_mtu = lowerdev->mtu - vxlan_headroom(flags); if (max_mtu < ETH_MIN_MTU) max_mtu = ETH_MIN_MTU;
@@ -3830,10 +3828,9 @@ static void vxlan_config_apply(struct net_device *dev, if (dev->mtu > max_mtu) dev->mtu = max_mtu;
- if (use_ipv6 || conf->flags & VXLAN_F_COLLECT_METADATA) - needed_headroom += VXLAN6_HEADROOM; - else - needed_headroom += VXLAN_HEADROOM; + if (flags & VXLAN_F_COLLECT_METADATA) + flags |= VXLAN_F_IPV6; + needed_headroom += vxlan_headroom(flags); dev->needed_headroom = needed_headroom;
memcpy(&vxlan->cfg, conf, sizeof(*conf)); diff --git a/include/net/vxlan.h b/include/net/vxlan.h index 08537aa14f7c3..cf1d870f7b9a8 100644 --- a/include/net/vxlan.h +++ b/include/net/vxlan.h @@ -327,10 +327,15 @@ static inline netdev_features_t vxlan_features_check(struct sk_buff *skb, return features; }
-/* IP header + UDP + VXLAN + Ethernet header */ -#define VXLAN_HEADROOM (20 + 8 + 8 + 14) -/* IPv6 header + UDP + VXLAN + Ethernet header */ -#define VXLAN6_HEADROOM (40 + 8 + 8 + 14) +static inline int vxlan_headroom(u32 flags) +{ + /* VXLAN: IP4/6 header + UDP + VXLAN + Ethernet header */ + /* VXLAN-GPE: IP4/6 header + UDP + VXLAN */ + return (flags & VXLAN_F_IPV6 ? sizeof(struct ipv6hdr) : + sizeof(struct iphdr)) + + sizeof(struct udphdr) + sizeof(struct vxlanhdr) + + (flags & VXLAN_F_GPE ? 0 : ETH_HLEN); +}
static inline struct vxlanhdr *vxlan_hdr(struct sk_buff *skb) {
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
[ Upstream commit 13c088cf3657d70893d75cf116be937f1509cc0f ]
The size of array 'priv->ports[]' is INNO_PHY_PORT_NUM.
In the for loop, 'i' is used as the index for array 'priv->ports[]' with a check (i > INNO_PHY_PORT_NUM) which indicates that INNO_PHY_PORT_NUM is allowed value for 'i' in the same loop.
This > comparison needs to be changed to >=, otherwise it potentially leads to an out of bounds write on the next iteration through the loop
Fixes: ba8b0ee81fbb ("phy: add inno-usb2-phy driver for hi3798cv200 SoC") Reported-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Link: https://lore.kernel.org/r/20230721090558.3588613-1-harshit.m.mogalapalli@ora... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/hisilicon/phy-hisi-inno-usb2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c index 34a6a9a1ceb25..897c6bb4cbb8c 100644 --- a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c +++ b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c @@ -153,7 +153,7 @@ static int hisi_inno_phy_probe(struct platform_device *pdev) phy_set_drvdata(phy, &priv->ports[i]); i++;
- if (i > INNO_PHY_PORT_NUM) { + if (i >= INNO_PHY_PORT_NUM) { dev_warn(dev, "Support %d ports in maximum\n", i); break; }
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit 69a184f7a372aac588babfb0bd681aaed9779f5b ]
in atl1e_tso_csum, it should check the return value of pskb_trim(), and return an error code if an unexpected value is returned by pskb_trim().
Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230720144219.39285-1-ruc_gongyuanjun@163.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/atheros/atl1e/atl1e_main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c index 753973ac922e9..db13311e77e73 100644 --- a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c +++ b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c @@ -1642,8 +1642,11 @@ static int atl1e_tso_csum(struct atl1e_adapter *adapter, real_len = (((unsigned char *)ip_hdr(skb) - skb->data) + ntohs(ip_hdr(skb)->tot_len));
- if (real_len < skb->len) - pskb_trim(skb, real_len); + if (real_len < skb->len) { + err = pskb_trim(skb, real_len); + if (err) + return err; + }
hdr_len = (skb_transport_offset(skb) + tcp_hdrlen(skb)); if (unlikely(skb->len == hdr_len)) {
From: Maciej Żenczykowski maze@google.com
[ Upstream commit 69172f0bcb6a09110c5d2a6d792627f5095a9018 ]
currently on 6.4 net/main:
# ip link add dummy1 type dummy # echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr # ip link set dummy1 up # ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1 # ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic valid_lft 604800sec preferred_lft 86172sec inet6 2000::1/64 scope global mngtmpaddr valid_lft forever preferred_lft forever inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever
# ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1
(can wait a few seconds if you want to, the above delete isn't [directly] the problem)
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::1/64 scope global mngtmpaddr valid_lft forever preferred_lft forever inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever
# ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1 # ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic valid_lft 604797sec preferred_lft 86169sec inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link valid_lft forever preferred_lft forever
This patch prevents this new 'global temporary dynamic' address from being created by the deletion of the related (same subnet prefix) 'mngtmpaddr' (which is triggered by there already being no temporary addresses).
Cc: Jiri Pirko jiri@resnulli.us Fixes: 53bd67491537 ("ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses") Reported-by: Xiao Ma xiaom@google.com Signed-off-by: Maciej Żenczykowski maze@google.com Reviewed-by: David Ahern dsahern@kernel.org Link: https://lore.kernel.org/r/20230720160022.1887942-1-maze@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/addrconf.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index e0d3909172a84..0c0b7969840f5 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2565,12 +2565,18 @@ static void manage_tempaddrs(struct inet6_dev *idev, ipv6_ifa_notify(0, ift); }
- if ((create || list_empty(&idev->tempaddr_list)) && - idev->cnf.use_tempaddr > 0) { + /* Also create a temporary address if it's enabled but no temporary + * address currently exists. + * However, we get called with valid_lft == 0, prefered_lft == 0, create == false + * as part of cleanup (ie. deleting the mngtmpaddr). + * We don't want that to result in creating a new temporary ip address. + */ + if (list_empty(&idev->tempaddr_list) && (valid_lft || prefered_lft)) + create = true; + + if (create && idev->cnf.use_tempaddr > 0) { /* When a new public address is created as described * in [ADDRCONF], also create a new temporary address. - * Also create a temporary address if it's enabled but - * no temporary address currently exists. */ read_unlock_bh(&idev->lock); ipv6_create_tempaddr(ifp, false);
From: Stewart Smith trawets@amazon.com
[ Upstream commit d11b0df7ddf1831f3e170972f43186dad520bfcc ]
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash table with a hash over the source & destination addresses and ports. However, the IPv6 hash is insufficient and can lead to a high rate of collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the fast jenkins hash, meaning it is possible for an external entity to ensure the hash collides, thus falling back to a linear search in the bucket, which is slow.
We take the approach of hash the full length of IPv6 address in __ipv6_addr_jhash() so that all users can benefit from a more secure version.
While this may look like it adds overhead, the reality of modern CPUs means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9 on Nehalem (base of ~173).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash") netfilter switched from a jenkins hash to a siphash, but even the faster hsiphash is a more significant overhead (~20-30%) in some preliminary testing. So, in this patch, we keep to the more conservative approach to ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the connection buckets. In both testing and real-world scenarios, we have not found any measurable performance impact.
Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp") Signed-off-by: Stewart Smith trawets@amazon.com Signed-off-by: Samuel Mendoza-Jonas samjonas@amazon.com Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20230721222410.17914-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ipv6.h | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h index e3ab99f4edab7..20930086b2288 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -664,12 +664,8 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a) /* more secured version of ipv6_addr_hash() */ static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval) { - u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1]; - - return jhash_3words(v, - (__force u32)a->s6_addr32[2], - (__force u32)a->s6_addr32[3], - initval); + return jhash2((__force const u32 *)a->s6_addr32, + ARRAY_SIZE(a->s6_addr32), initval); }
static inline bool ipv6_addr_loopback(const struct in6_addr *a)
From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit a3336056504d780590ac6d6ac94fbba829994594 ]
Fix ethtool FDIR logic to not use memory after its release. In the ice_ethtool_fdir.c file there are 2 spots where code can refer to pointers which may be missing.
In the ice_cfg_fdir_xtrct_seq() function seg may be freed but even then may be still used by memcpy(&tun_seg[1], seg, sizeof(*seg)).
In the ice_add_fdir_ethtool() function struct ice_fdir_fltr *input may first fail to be added via ice_fdir_update_list_entry() but then may be deleted by ice_fdir_update_list_entry.
Terminate in both cases when the returned value of the previous operation is other than 0, free memory and don't use it anymore.
Reported-by: Michal Schmidt mschmidt@redhat.com Link: https://bugzilla.redhat.com/show_bug.cgi?id=2208423 Fixes: cac2a27cd9ab ("ice: Support IPv4 Flow Director filters") Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://lore.kernel.org/r/20230721155854.1292805-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/intel/ice/ice_ethtool_fdir.c | 26 ++++++++++--------- 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c index 16de603b280c6..0106ea3519a01 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c @@ -1135,16 +1135,21 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp, ICE_FLOW_FLD_OFF_INVAL); }
- /* add filter for outer headers */ fltr_idx = ice_ethtool_flow_to_fltr(fsp->flow_type & ~FLOW_EXT); + + assign_bit(fltr_idx, hw->fdir_perfect_fltr, perfect_filter); + + /* add filter for outer headers */ ret = ice_fdir_set_hw_fltr_rule(pf, seg, fltr_idx, ICE_FD_HW_SEG_NON_TUN); - if (ret == -EEXIST) - /* Rule already exists, free memory and continue */ - devm_kfree(dev, seg); - else if (ret) + if (ret == -EEXIST) { + /* Rule already exists, free memory and count as success */ + ret = 0; + goto err_exit; + } else if (ret) { /* could not write filter, free memory */ goto err_exit; + }
/* make tunneled filter HW entries if possible */ memcpy(&tun_seg[1], seg, sizeof(*seg)); @@ -1159,18 +1164,13 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp, devm_kfree(dev, tun_seg); }
- if (perfect_filter) - set_bit(fltr_idx, hw->fdir_perfect_fltr); - else - clear_bit(fltr_idx, hw->fdir_perfect_fltr); - return ret;
err_exit: devm_kfree(dev, tun_seg); devm_kfree(dev, seg);
- return -EOPNOTSUPP; + return ret; }
/** @@ -1684,7 +1684,9 @@ int ice_add_fdir_ethtool(struct ice_vsi *vsi, struct ethtool_rxnfc *cmd) input->comp_report = ICE_FXD_FLTR_QW0_COMP_REPORT_SW_FAIL;
/* input struct is added to the HW filter list */ - ice_fdir_update_list_entry(pf, input, fsp->location); + ret = ice_fdir_update_list_entry(pf, input, fsp->location); + if (ret) + goto release_lock;
ret = ice_fdir_write_all_fltr(pf, input, true); if (ret)
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit da19a2b967cf1e2c426f50d28550d1915214a81d ]
When adding a point to point downlink to the bond, we neglected to reset the bond's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to the bonding.
To address this issue, let's reset the bond's flags for P2P interfaces.
Before fix: 7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188:: 8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/gre6 2006:70:10::1 brd 2006:70:10::2 inet6 fe80::200:ff:fe00:0/64 scope link valid_lft forever preferred_lft forever
After fix: 7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9:: 8: bond0: <POINTOPOINT,NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/gre6 2006:70:10::1 peer 2006:70:10::2 inet6 fe80::1/64 scope link valid_lft forever preferred_lft forever
Reported-by: Liang Li liali@redhat.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7b0b4049bd294..69cc36c0840f7 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1482,6 +1482,11 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
memcpy(bond_dev->broadcast, slave_dev->broadcast, slave_dev->addr_len); + + if (slave_dev->flags & IFF_POINTOPOINT) { + bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST); + bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP); + } }
/* On bonding slaves other than the currently active slave, suppress
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit fa532bee17d15acf8bba4bc8e2062b7a093ba801 ]
When adding a point to point downlink to team device, we neglected to reset the team's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to team device. Fix this by remove multicast/broadcast flags and add p2p and noarp flags.
After removing the none ethernet interface and adding an ethernet interface to team, we need to reset team interface flags. Unlike bonding interface, team do not need restore IFF_MASTER, IFF_SLAVE flags.
Reported-by: Liang Li liali@redhat.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 1d76efe1577b ("team: add support for non-ethernet devices") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/team/team.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index d9386d614a94c..4dfa9c610974a 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2130,6 +2130,15 @@ static void team_setup_by_port(struct net_device *dev, dev->mtu = port_dev->mtu; memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len); eth_hw_addr_inherit(dev, port_dev); + + if (port_dev->flags & IFF_POINTOPOINT) { + dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST); + dev->flags |= (IFF_POINTOPOINT | IFF_NOARP); + } else if ((port_dev->flags & (IFF_BROADCAST | IFF_MULTICAST)) == + (IFF_BROADCAST | IFF_MULTICAST)) { + dev->flags |= (IFF_BROADCAST | IFF_MULTICAST); + dev->flags &= ~(IFF_POINTOPOINT | IFF_NOARP); + } }
static int team_dev_type_check_change(struct net_device *dev,
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit 284779dbf4e98753458708783af8c35630674a21 ]
commit a3a57bf07de23fe1ff779e0fdf710aa581c3ff73 ("net: stmmac: work around sporadic tx issue on link-up") worked around a problem with TX sometimes not working after a link-up by avoiding a redundant write to MAC_CTRL_REG (aka GMAC_CONFIG), since the IP appeared to have problems with handling multiple writes to that register in some cases.
That commit however only added the work around to dwmac_lib.c (apart from the common code in stmmac_main.c), but my systems with version 4.21a of the IP exhibit the same problem, so add the work around to dwmac4_lib.c too.
Fixes: a3a57bf07de2 ("net: stmmac: work around sporadic tx issue on link-up") Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230721-stmmac-tx-workaround-v1-1-9411cbd5ee07@ax... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c index 9292a1fab7d32..7011c08d2e012 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c @@ -207,13 +207,15 @@ void stmmac_dwmac4_set_mac_addr(void __iomem *ioaddr, u8 addr[6], void stmmac_dwmac4_set_mac(void __iomem *ioaddr, bool enable) { u32 value = readl(ioaddr + GMAC_CONFIG); + u32 old_val = value;
if (enable) value |= GMAC_CONFIG_RE | GMAC_CONFIG_TE; else value &= ~(GMAC_CONFIG_TE | GMAC_CONFIG_RE);
- writel(value, ioaddr + GMAC_CONFIG); + if (value != old_val) + writel(value, ioaddr + GMAC_CONFIG); }
void stmmac_dwmac4_get_mac_addr(void __iomem *ioaddr, unsigned char *addr,
From: Maxim Mikityanskiy maxtram95@gmail.com
[ Upstream commit ad084a6d99bc182bf109c190c808e2ea073ec57b ]
Only the HW rfkill state is toggled on laptops with quirks->ec_read_only (so far only MSI Wind U90/U100). There are, however, a few issues with the implementation:
1. The initial HW state is always unblocked, regardless of the actual state on boot, because msi_init_rfkill only sets the SW state, regardless of ec_read_only.
2. The initial SW state corresponds to the actual state on boot, but it can't be changed afterwards, because set_device_state returns -EOPNOTSUPP. It confuses the userspace, making Wi-Fi and/or Bluetooth unusable if it was blocked on boot, and breaking the airplane mode if the rfkill was unblocked on boot.
Address the above issues by properly initializing the HW state on ec_read_only laptops and by allowing the userspace to toggle the SW state. Don't set the SW state ourselves and let the userspace fully control it. Toggling the SW state is a no-op, however, it allows the userspace to properly toggle the airplane mode. The actual SW radio disablement is handled by the corresponding rtl818x_pci and btusb drivers that have their own rfkills.
Tested on MSI Wind U100 Plus, BIOS ver 1.0G, EC ver 130.
Fixes: 0816392b97d4 ("msi-laptop: merge quirk tables to one") Fixes: 0de6575ad0a8 ("msi-laptop: Add MSI Wind U90/U100 support") Signed-off-by: Maxim Mikityanskiy maxtram95@gmail.com Link: https://lore.kernel.org/r/20230721145423.161057-1-maxtram95@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/msi-laptop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/platform/x86/msi-laptop.c b/drivers/platform/x86/msi-laptop.c index 0e804b6c2d242..dfb4af759aa75 100644 --- a/drivers/platform/x86/msi-laptop.c +++ b/drivers/platform/x86/msi-laptop.c @@ -210,7 +210,7 @@ static ssize_t set_device_state(const char *buf, size_t count, u8 mask) return -EINVAL;
if (quirks->ec_read_only) - return -EOPNOTSUPP; + return 0;
/* read current device state */ result = ec_read(MSI_STANDARD_EC_COMMAND_ADDRESS, &rdata); @@ -841,15 +841,15 @@ static bool msi_laptop_i8042_filter(unsigned char data, unsigned char str, static void msi_init_rfkill(struct work_struct *ignored) { if (rfk_wlan) { - rfkill_set_sw_state(rfk_wlan, !wlan_s); + msi_rfkill_set_state(rfk_wlan, !wlan_s); rfkill_wlan_set(NULL, !wlan_s); } if (rfk_bluetooth) { - rfkill_set_sw_state(rfk_bluetooth, !bluetooth_s); + msi_rfkill_set_state(rfk_bluetooth, !bluetooth_s); rfkill_bluetooth_set(NULL, !bluetooth_s); } if (rfk_threeg) { - rfkill_set_sw_state(rfk_threeg, !threeg_s); + msi_rfkill_set_state(rfk_threeg, !threeg_s); rfkill_threeg_set(NULL, !threeg_s); } }
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit d4a7ce642100765119a872d4aba1bf63e3a22c8a ]
The Xeon validation group has been carrying out some loaded tests with various HW configurations, and they have seen some transmit queue time out happening during the test. This will cause the reset adapter function to be called by igc_tx_timeout(). Similar race conditions may arise when the interface is being brought down and up in igc_reinit_locked(), an interrupt being generated, and igc_clean_tx_irq() being called to complete the TX.
When the igc_tx_timeout() function is invoked, this patch will turn off all TX ring HW queues during igc_down() process. TX ring HW queues will be activated again during the igc_configure_tx_ring() process when performing the igc_up() procedure later.
This patch also moved existing igc_disable_tx_ring_hw() to avoid using forward declaration.
Kernel trace: [ 7678.747813] ------------[ cut here ]------------ [ 7678.757914] NETDEV WATCHDOG: enp1s0 (igc): transmit queue 2 timed out [ 7678.770117] WARNING: CPU: 0 PID: 13 at net/sched/sch_generic.c:525 dev_watchdog+0x1ae/0x1f0 [ 7678.784459] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 7678.784496] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 7679.200403] RIP: 0010:dev_watchdog+0x1ae/0x1f0 [ 7679.210201] Code: 28 e9 53 ff ff ff 4c 89 e7 c6 05 06 42 b9 00 01 e8 17 d1 fb ff 44 89 e9 4c 89 e6 48 c7 c7 40 ad fb 81 48 89 c2 e8 52 62 82 ff <0f> 0b e9 72 ff ff ff 65 8b 05 80 7d 7c 7e 89 c0 48 0f a3 05 0a c1 [ 7679.245438] RSP: 0018:ffa00000001f7d90 EFLAGS: 00010282 [ 7679.256021] RAX: 0000000000000000 RBX: ff11000109938440 RCX: 0000000000000000 [ 7679.268710] RDX: ff11000361e26cd8 RSI: ff11000361e1b880 RDI: ff11000361e1b880 [ 7679.281314] RBP: ffa00000001f7da8 R08: ff1100035f8fffe8 R09: 0000000000027ffb [ 7679.293840] R10: 0000000000001f0a R11: ff1100035f840000 R12: ff11000109938000 [ 7679.306276] R13: 0000000000000002 R14: dead000000000122 R15: ffa00000001f7e18 [ 7679.318648] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 7679.332064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7679.342757] CR2: 00007ffff7fca168 CR3: 000000013b08a006 CR4: 0000000000471ef8 [ 7679.354984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 7679.367207] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 7679.379370] PKRU: 55555554 [ 7679.386446] Call Trace: [ 7679.393152] <TASK> [ 7679.399363] ? __pfx_dev_watchdog+0x10/0x10 [ 7679.407870] call_timer_fn+0x31/0x110 [ 7679.415698] expire_timers+0xb2/0x120 [ 7679.423403] run_timer_softirq+0x179/0x1e0 [ 7679.431532] ? __schedule+0x2b1/0x820 [ 7679.439078] __do_softirq+0xd1/0x295 [ 7679.446426] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 7679.454867] run_ksoftirqd+0x22/0x30 [ 7679.462058] smpboot_thread_fn+0xb7/0x160 [ 7679.469670] kthread+0xcd/0xf0 [ 7679.476097] ? __pfx_kthread+0x10/0x10 [ 7679.483211] ret_from_fork+0x29/0x50 [ 7679.490047] </TASK> [ 7679.495204] ---[ end trace 0000000000000000 ]--- [ 7679.503179] igc 0000:01:00.0 enp1s0: Register Dump [ 7679.511230] igc 0000:01:00.0 enp1s0: Register Name Value [ 7679.519892] igc 0000:01:00.0 enp1s0: CTRL 181c0641 [ 7679.528782] igc 0000:01:00.0 enp1s0: STATUS 40280683 [ 7679.537551] igc 0000:01:00.0 enp1s0: CTRL_EXT 10000040 [ 7679.546284] igc 0000:01:00.0 enp1s0: MDIC 180a3800 [ 7679.554942] igc 0000:01:00.0 enp1s0: ICR 00000081 [ 7679.563503] igc 0000:01:00.0 enp1s0: RCTL 04408022 [ 7679.571963] igc 0000:01:00.0 enp1s0: RDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.583075] igc 0000:01:00.0 enp1s0: RDH[0-3] 00000068 000000b6 0000000f 00000031 [ 7679.594162] igc 0000:01:00.0 enp1s0: RDT[0-3] 00000066 000000b2 0000000e 00000030 [ 7679.605174] igc 0000:01:00.0 enp1s0: RXDCTL[0-3] 02040808 02040808 02040808 02040808 [ 7679.616196] igc 0000:01:00.0 enp1s0: RDBAL[0-3] 1bb7c000 1bb7f000 1bb82000 0ef33000 [ 7679.627242] igc 0000:01:00.0 enp1s0: RDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.638256] igc 0000:01:00.0 enp1s0: TCTL a503f0fa [ 7679.646607] igc 0000:01:00.0 enp1s0: TDBAL[0-3] 2ba4a000 1bb6f000 1bb74000 1bb79000 [ 7679.657609] igc 0000:01:00.0 enp1s0: TDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.668551] igc 0000:01:00.0 enp1s0: TDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.679470] igc 0000:01:00.0 enp1s0: TDH[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.690406] igc 0000:01:00.0 enp1s0: TDT[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.701264] igc 0000:01:00.0 enp1s0: TXDCTL[0-3] 02100108 02100108 02100108 02100108 [ 7679.712123] igc 0000:01:00.0 enp1s0: Reset adapter [ 7683.085967] igc 0000:01:00.0 enp1s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX [ 8086.945561] ------------[ cut here ]------------ Entering kdb (current=0xffffffff8220b200, pid 0) on processor 0 Oops: (null) due to oops @ 0xffffffff81573888 RIP: 0010:dql_completed+0x148/0x160 Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? igc_poll+0x1a9/0x14d0 [igc] __napi_poll+0x2e/0x1b0 net_rx_action+0x126/0x250 __do_softirq+0xd1/0x295 irq_exit_rcu+0xc5/0xf0 common_interrupt+0x86/0xa0 </IRQ> <TASK> asm_common_interrupt+0x27/0x40 RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 cpuidle_enter+0x2e/0x50 call_cpuidle+0x23/0x40 do_idle+0x1be/0x220 cpu_startup_entry+0x20/0x30 rest_init+0xb5/0xc0 arch_call_rest_init+0xe/0x30 start_kernel+0x448/0x760 x86_64_start_kernel+0x109/0x150 secondary_startup_64_no_verify+0xe0/0xeb </TASK> more> [0]kdb>
[0]kdb> [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, type go a second time if you really want to continue [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, attempting to continue [ 8086.955689] refcount_t: underflow; use-after-free. [ 8086.955697] WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110 [ 8086.955706] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.955751] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 8086.955784] RIP: 0010:refcount_warn_saturate+0xc2/0x110 [ 8086.955788] Code: 01 e8 82 e7 b4 ff 0f 0b 5d c3 cc cc cc cc 80 3d 68 c6 eb 00 00 75 81 48 c7 c7 a0 87 f6 81 c6 05 58 c6 eb 00 01 e8 5e e7 b4 ff <0f> 0b 5d c3 cc cc cc cc 80 3d 42 c6 eb 00 00 0f 85 59 ff ff ff 48 [ 8086.955790] RSP: 0018:ffa0000000003da0 EFLAGS: 00010286 [ 8086.955793] RAX: 0000000000000000 RBX: ff1100011da40ee0 RCX: ff11000361e1b888 [ 8086.955794] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ff11000361e1b880 [ 8086.955795] RBP: ffa0000000003da0 R08: 80000000ffff9f45 R09: ffa0000000003d28 [ 8086.955796] R10: ff1100035f840000 R11: 0000000000000028 R12: ff11000319ff8000 [ 8086.955797] R13: ff1100011bb79d60 R14: 00000000ffffffd6 R15: ff1100037039cb00 [ 8086.955798] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955800] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955801] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955803] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955803] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955804] PKRU: 55555554 [ 8086.955805] Call Trace: [ 8086.955806] <IRQ> [ 8086.955808] tcp_wfree+0x112/0x130 [ 8086.955814] skb_release_head_state+0x24/0xa0 [ 8086.955818] napi_consume_skb+0x9c/0x160 [ 8086.955821] igc_poll+0x5d8/0x14d0 [igc] [ 8086.955835] __napi_poll+0x2e/0x1b0 [ 8086.955839] net_rx_action+0x126/0x250 [ 8086.955843] __do_softirq+0xd1/0x295 [ 8086.955846] irq_exit_rcu+0xc5/0xf0 [ 8086.955851] common_interrupt+0x86/0xa0 [ 8086.955857] </IRQ> [ 8086.955857] <TASK> [ 8086.955858] asm_common_interrupt+0x27/0x40 [ 8086.955862] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955866] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955867] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955869] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955870] RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955871] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955872] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955873] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955875] cpuidle_enter+0x2e/0x50 [ 8086.955880] call_cpuidle+0x23/0x40 [ 8086.955884] do_idle+0x1be/0x220 [ 8086.955887] cpu_startup_entry+0x20/0x30 [ 8086.955889] rest_init+0xb5/0xc0 [ 8086.955892] arch_call_rest_init+0xe/0x30 [ 8086.955895] start_kernel+0x448/0x760 [ 8086.955898] x86_64_start_kernel+0x109/0x150 [ 8086.955900] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955904] </TASK> [ 8086.955904] ---[ end trace 0000000000000000 ]--- [ 8086.955912] ------------[ cut here ]------------ [ 8086.955913] kernel BUG at lib/dynamic_queue_limits.c:27! [ 8086.955918] invalid opcode: 0000 [#1] SMP [ 8086.955922] RIP: 0010:dql_completed+0x148/0x160 [ 8086.955925] Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc [ 8086.955927] RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 [ 8086.955928] RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 [ 8086.955929] RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 [ 8086.955930] RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 [ 8086.955931] R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 [ 8086.955932] R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 [ 8086.955933] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955935] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955937] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955938] PKRU: 55555554 [ 8086.955939] Call Trace: [ 8086.955939] <IRQ> [ 8086.955940] ? igc_poll+0x1a9/0x14d0 [igc] [ 8086.955949] __napi_poll+0x2e/0x1b0 [ 8086.955952] net_rx_action+0x126/0x250 [ 8086.955956] __do_softirq+0xd1/0x295 [ 8086.955958] irq_exit_rcu+0xc5/0xf0 [ 8086.955961] common_interrupt+0x86/0xa0 [ 8086.955964] </IRQ> [ 8086.955965] <TASK> [ 8086.955965] asm_common_interrupt+0x27/0x40 [ 8086.955968] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955971] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955972] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955973] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955974] RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955974] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955975] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955976] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955978] cpuidle_enter+0x2e/0x50 [ 8086.955981] call_cpuidle+0x23/0x40 [ 8086.955984] do_idle+0x1be/0x220 [ 8086.955985] cpu_startup_entry+0x20/0x30 [ 8086.955987] rest_init+0xb5/0xc0 [ 8086.955990] arch_call_rest_init+0xe/0x30 [ 8086.955992] start_kernel+0x448/0x760 [ 8086.955994] x86_64_start_kernel+0x109/0x150 [ 8086.955996] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955998] </TASK> [ 8086.955999] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.956029] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [16762.543675] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.593 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.595 msecs [16762.543673] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.495 msecs [16762.543679] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.598 msecs [16762.543690] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.605 msecs [16762.543684] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543693] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.613 msecs [16762.543784] ---[ end trace 0000000000000000 ]--- [16762.849099] RIP: 0010:dql_completed+0x148/0x160 PANIC: Fatal exception in interrupt
Fixes: 9b275176270e ("igc: Add ndo_tx_timeout support") Tested-by: Alejandra Victoria Alcaraz alejandra.victoria.alcaraz@intel.com Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 40 ++++++++++++++++------- 1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index bcc1c428b4cc1..a47dce10d3a78 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -316,6 +316,33 @@ static void igc_clean_all_tx_rings(struct igc_adapter *adapter) igc_clean_tx_ring(adapter->tx_ring[i]); }
+static void igc_disable_tx_ring_hw(struct igc_ring *ring) +{ + struct igc_hw *hw = &ring->q_vector->adapter->hw; + u8 idx = ring->reg_idx; + u32 txdctl; + + txdctl = rd32(IGC_TXDCTL(idx)); + txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE; + txdctl |= IGC_TXDCTL_SWFLUSH; + wr32(IGC_TXDCTL(idx), txdctl); +} + +/** + * igc_disable_all_tx_rings_hw - Disable all transmit queue operation + * @adapter: board private structure + */ +static void igc_disable_all_tx_rings_hw(struct igc_adapter *adapter) +{ + int i; + + for (i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *tx_ring = adapter->tx_ring[i]; + + igc_disable_tx_ring_hw(tx_ring); + } +} + /** * igc_setup_tx_resources - allocate Tx resources (Descriptors) * @tx_ring: tx descriptor ring (for a specific queue) to setup @@ -4975,6 +5002,7 @@ void igc_down(struct igc_adapter *adapter) /* clear VLAN promisc flag so VFTA will be updated if necessary */ adapter->flags &= ~IGC_FLAG_VLAN_PROMISC;
+ igc_disable_all_tx_rings_hw(adapter); igc_clean_all_tx_rings(adapter); igc_clean_all_rx_rings(adapter); } @@ -7124,18 +7152,6 @@ void igc_enable_rx_ring(struct igc_ring *ring) igc_alloc_rx_buffers(ring, igc_desc_unused(ring)); }
-static void igc_disable_tx_ring_hw(struct igc_ring *ring) -{ - struct igc_hw *hw = &ring->q_vector->adapter->hw; - u8 idx = ring->reg_idx; - u32 txdctl; - - txdctl = rd32(IGC_TXDCTL(idx)); - txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE; - txdctl |= IGC_TXDCTL_SWFLUSH; - wr32(IGC_TXDCTL(idx), txdctl); -} - void igc_disable_tx_ring(struct igc_ring *ring) { igc_disable_tx_ring_hw(ring);
From: Florian Westphal fw@strlen.de
[ Upstream commit f718863aca469a109895cb855e6b81fff4827d71 ]
The lazy gc on insert that should remove timed-out entries fails to release the other half of the interval, if any.
Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0 in nftables.git and kmemleak enabled kernel.
Second bug is the use of rbe_prev vs. prev pointer. If rbe_prev() returns NULL after at least one iteration, rbe_prev points to element that is not an end interval, hence it should not be removed.
Lastly, check the genmask of the end interval if this is active in the current generation.
Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_rbtree.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 5c05c9b990fba..8d73fffd2d09d 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -217,29 +217,37 @@ static void *nft_rbtree_get(const struct net *net, const struct nft_set *set,
static int nft_rbtree_gc_elem(const struct nft_set *__set, struct nft_rbtree *priv, - struct nft_rbtree_elem *rbe) + struct nft_rbtree_elem *rbe, + u8 genmask) { struct nft_set *set = (struct nft_set *)__set; struct rb_node *prev = rb_prev(&rbe->node); - struct nft_rbtree_elem *rbe_prev = NULL; + struct nft_rbtree_elem *rbe_prev; struct nft_set_gc_batch *gcb;
gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC); if (!gcb) return -ENOMEM;
- /* search for expired end interval coming before this element. */ + /* search for end interval coming before this element. + * end intervals don't carry a timeout extension, they + * are coupled with the interval start element. + */ while (prev) { rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); - if (nft_rbtree_interval_end(rbe_prev)) + if (nft_rbtree_interval_end(rbe_prev) && + nft_set_elem_active(&rbe_prev->ext, genmask)) break;
prev = rb_prev(prev); }
- if (rbe_prev) { + if (prev) { + rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); + rb_erase(&rbe_prev->node, &priv->root); atomic_dec(&set->nelems); + nft_set_gc_batch_add(gcb, rbe_prev); }
rb_erase(&rbe->node, &priv->root); @@ -321,7 +329,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
/* perform garbage collection to avoid bogus overlap reports. */ if (nft_set_elem_expired(&rbe->ext)) { - err = nft_rbtree_gc_elem(set, priv, rbe); + err = nft_rbtree_gc_elem(set, priv, rbe, genmask); if (err < 0) return err;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 0a771f7b266b02d262900c75f1e175c7fe76fec2 ]
On error when building the rule, the immediate expression unbinds the chain, hence objects can be deactivated by the transaction records.
Otherwise, it is possible to trigger the following warning:
WARNING: CPU: 3 PID: 915 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 3 PID: 915 Comm: chain-bind-err- Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
Fixes: 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic") Reported-by: Kevin Rich kevinrich1337@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_immediate.c | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c index 6b0efab4fad09..6bf1c852e8eaa 100644 --- a/net/netfilter/nft_immediate.c +++ b/net/netfilter/nft_immediate.c @@ -125,15 +125,27 @@ static void nft_immediate_activate(const struct nft_ctx *ctx, return nft_data_hold(&priv->data, nft_dreg_to_type(priv->dreg)); }
+static void nft_immediate_chain_deactivate(const struct nft_ctx *ctx, + struct nft_chain *chain, + enum nft_trans_phase phase) +{ + struct nft_ctx chain_ctx; + struct nft_rule *rule; + + chain_ctx = *ctx; + chain_ctx.chain = chain; + + list_for_each_entry(rule, &chain->rules, list) + nft_rule_expr_deactivate(&chain_ctx, rule, phase); +} + static void nft_immediate_deactivate(const struct nft_ctx *ctx, const struct nft_expr *expr, enum nft_trans_phase phase) { const struct nft_immediate_expr *priv = nft_expr_priv(expr); const struct nft_data *data = &priv->data; - struct nft_ctx chain_ctx; struct nft_chain *chain; - struct nft_rule *rule;
if (priv->dreg == NFT_REG_VERDICT) { switch (data->verdict.code) { @@ -143,20 +155,17 @@ static void nft_immediate_deactivate(const struct nft_ctx *ctx, if (!nft_chain_binding(chain)) break;
- chain_ctx = *ctx; - chain_ctx.chain = chain; - - list_for_each_entry(rule, &chain->rules, list) - nft_rule_expr_deactivate(&chain_ctx, rule, phase); - switch (phase) { case NFT_TRANS_PREPARE_ERROR: nf_tables_unbind_chain(ctx, chain); - fallthrough; + nft_deactivate_next(ctx->net, chain); + break; case NFT_TRANS_PREPARE: + nft_immediate_chain_deactivate(ctx, chain, phase); nft_deactivate_next(ctx->net, chain); break; default: + nft_immediate_chain_deactivate(ctx, chain, phase); nft_chain_del(chain); chain->bound = false; chain->table->use--;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 0ebc1064e4874d5987722a2ddbc18f94aa53b211 ]
Bail out with EOPNOTSUPP when adding rule to bound chain via NFTA_RULE_CHAIN_ID. The following warning splat is shown when adding a rule to a deleted bound chain:
WARNING: CPU: 2 PID: 13692 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 2 PID: 13692 Comm: chain-bound-rul Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by: Kevin Rich kevinrich1337@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index e0e675313d8e1..ce9f962380b7b 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -3529,8 +3529,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN]); return PTR_ERR(chain); } - if (nft_chain_is_bound(chain)) - return -EOPNOTSUPP;
} else if (nla[NFTA_RULE_CHAIN_ID]) { chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID], @@ -3543,6 +3541,9 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return -EINVAL; }
+ if (nft_chain_is_bound(chain)) + return -EOPNOTSUPP; + if (nla[NFTA_RULE_HANDLE]) { handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_HANDLE])); rule = __nft_rule_lookup(chain, handle);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit feb2cf3dcfb930aec2ca65c66d1365543d5ba943 ]
mqprio_init() is quite large and unwieldy to add more code to. Split the netlink attribute parsing to a dedicated function.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64") Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_mqprio.c | 114 +++++++++++++++++++++++------------------ 1 file changed, 63 insertions(+), 51 deletions(-)
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index 50e15add6068f..a5df5604e0150 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -130,6 +130,67 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla, return 0; }
+static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, + struct nlattr *opt) +{ + struct mqprio_sched *priv = qdisc_priv(sch); + struct nlattr *tb[TCA_MQPRIO_MAX + 1]; + struct nlattr *attr; + int i, rem, err; + + err = parse_attr(tb, TCA_MQPRIO_MAX, opt, mqprio_policy, + sizeof(*qopt)); + if (err < 0) + return err; + + if (!qopt->hw) + return -EINVAL; + + if (tb[TCA_MQPRIO_MODE]) { + priv->flags |= TC_MQPRIO_F_MODE; + priv->mode = *(u16 *)nla_data(tb[TCA_MQPRIO_MODE]); + } + + if (tb[TCA_MQPRIO_SHAPER]) { + priv->flags |= TC_MQPRIO_F_SHAPER; + priv->shaper = *(u16 *)nla_data(tb[TCA_MQPRIO_SHAPER]); + } + + if (tb[TCA_MQPRIO_MIN_RATE64]) { + if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) + return -EINVAL; + i = 0; + nla_for_each_nested(attr, tb[TCA_MQPRIO_MIN_RATE64], + rem) { + if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64) + return -EINVAL; + if (i >= qopt->num_tc) + break; + priv->min_rate[i] = *(u64 *)nla_data(attr); + i++; + } + priv->flags |= TC_MQPRIO_F_MIN_RATE; + } + + if (tb[TCA_MQPRIO_MAX_RATE64]) { + if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) + return -EINVAL; + i = 0; + nla_for_each_nested(attr, tb[TCA_MQPRIO_MAX_RATE64], + rem) { + if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64) + return -EINVAL; + if (i >= qopt->num_tc) + break; + priv->max_rate[i] = *(u64 *)nla_data(attr); + i++; + } + priv->flags |= TC_MQPRIO_F_MAX_RATE; + } + + return 0; +} + static int mqprio_init(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) { @@ -139,9 +200,6 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt, struct Qdisc *qdisc; int i, err = -EOPNOTSUPP; struct tc_mqprio_qopt *qopt = NULL; - struct nlattr *tb[TCA_MQPRIO_MAX + 1]; - struct nlattr *attr; - int rem; int len;
BUILD_BUG_ON(TC_MAX_QUEUE != TC_QOPT_MAX_QUEUE); @@ -166,55 +224,9 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
len = nla_len(opt) - NLA_ALIGN(sizeof(*qopt)); if (len > 0) { - err = parse_attr(tb, TCA_MQPRIO_MAX, opt, mqprio_policy, - sizeof(*qopt)); - if (err < 0) + err = mqprio_parse_nlattr(sch, qopt, opt); + if (err) return err; - - if (!qopt->hw) - return -EINVAL; - - if (tb[TCA_MQPRIO_MODE]) { - priv->flags |= TC_MQPRIO_F_MODE; - priv->mode = *(u16 *)nla_data(tb[TCA_MQPRIO_MODE]); - } - - if (tb[TCA_MQPRIO_SHAPER]) { - priv->flags |= TC_MQPRIO_F_SHAPER; - priv->shaper = *(u16 *)nla_data(tb[TCA_MQPRIO_SHAPER]); - } - - if (tb[TCA_MQPRIO_MIN_RATE64]) { - if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) - return -EINVAL; - i = 0; - nla_for_each_nested(attr, tb[TCA_MQPRIO_MIN_RATE64], - rem) { - if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64) - return -EINVAL; - if (i >= qopt->num_tc) - break; - priv->min_rate[i] = *(u64 *)nla_data(attr); - i++; - } - priv->flags |= TC_MQPRIO_F_MIN_RATE; - } - - if (tb[TCA_MQPRIO_MAX_RATE64]) { - if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) - return -EINVAL; - i = 0; - nla_for_each_nested(attr, tb[TCA_MQPRIO_MAX_RATE64], - rem) { - if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64) - return -EINVAL; - if (i >= qopt->num_tc) - break; - priv->max_rate[i] = *(u64 *)nla_data(attr); - i++; - } - priv->flags |= TC_MQPRIO_F_MAX_RATE; - } }
/* pre-allocate qdisc, attachment can't fail */
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 57f21bf85400abadac0cb2a4db5de1d663f8863f ]
Netlink attribute parsing in mqprio is a minesweeper game, with many options having the possibility of being passed incorrectly and the user being none the wiser.
Try to make errors less sour by giving user space some information regarding what went wrong.
Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Ferenc Fejes fejes@inf.elte.hu Reviewed-by: Simon Horman simon.horman@corigine.com Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64") Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_mqprio.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index a5df5604e0150..4ec222a5530d1 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -131,7 +131,8 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla, }
static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, - struct nlattr *opt) + struct nlattr *opt, + struct netlink_ext_ack *extack) { struct mqprio_sched *priv = qdisc_priv(sch); struct nlattr *tb[TCA_MQPRIO_MAX + 1]; @@ -143,8 +144,11 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, if (err < 0) return err;
- if (!qopt->hw) + if (!qopt->hw) { + NL_SET_ERR_MSG(extack, + "mqprio TCA_OPTIONS can only contain netlink attributes in hardware mode"); return -EINVAL; + }
if (tb[TCA_MQPRIO_MODE]) { priv->flags |= TC_MQPRIO_F_MODE; @@ -157,13 +161,19 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, }
if (tb[TCA_MQPRIO_MIN_RATE64]) { - if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) + if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) { + NL_SET_ERR_MSG_ATTR(extack, tb[TCA_MQPRIO_MIN_RATE64], + "min_rate accepted only when shaper is in bw_rlimit mode"); return -EINVAL; + } i = 0; nla_for_each_nested(attr, tb[TCA_MQPRIO_MIN_RATE64], rem) { - if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64) + if (nla_type(attr) != TCA_MQPRIO_MIN_RATE64) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute type expected to be TCA_MQPRIO_MIN_RATE64"); return -EINVAL; + } if (i >= qopt->num_tc) break; priv->min_rate[i] = *(u64 *)nla_data(attr); @@ -173,13 +183,19 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, }
if (tb[TCA_MQPRIO_MAX_RATE64]) { - if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) + if (priv->shaper != TC_MQPRIO_SHAPER_BW_RATE) { + NL_SET_ERR_MSG_ATTR(extack, tb[TCA_MQPRIO_MAX_RATE64], + "max_rate accepted only when shaper is in bw_rlimit mode"); return -EINVAL; + } i = 0; nla_for_each_nested(attr, tb[TCA_MQPRIO_MAX_RATE64], rem) { - if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64) + if (nla_type(attr) != TCA_MQPRIO_MAX_RATE64) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute type expected to be TCA_MQPRIO_MAX_RATE64"); return -EINVAL; + } if (i >= qopt->num_tc) break; priv->max_rate[i] = *(u64 *)nla_data(attr); @@ -224,7 +240,7 @@ static int mqprio_init(struct Qdisc *sch, struct nlattr *opt,
len = nla_len(opt) - NLA_ALIGN(sizeof(*qopt)); if (len > 0) { - err = mqprio_parse_nlattr(sch, qopt, opt); + err = mqprio_parse_nlattr(sch, qopt, opt, extack); if (err) return err; }
From: Lin Ma linma@zju.edu.cn
[ Upstream commit 6c58c8816abb7b93b21fa3b1d0c1726402e5e568 ]
The nla_for_each_nested parsing in function mqprio_parse_nlattr() does not check the length of the nested attribute. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as 8 byte integer and passed to priv->max_rate/min_rate.
This patch adds the check based on nla_len() when check the nla_type(), which ensures that the length of these two attribute must equals sizeof(u64).
Fixes: 4e8b86c06269 ("mqprio: Introduce new hardware offload mode and shaper in mqprio") Reviewed-by: Victor Nogueira victor@mojatatu.com Signed-off-by: Lin Ma linma@zju.edu.cn Link: https://lore.kernel.org/r/20230725024227.426561-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_mqprio.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index 4ec222a5530d1..56d3dc5e95c7c 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -174,6 +174,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, "Attribute type expected to be TCA_MQPRIO_MIN_RATE64"); return -EINVAL; } + + if (nla_len(attr) != sizeof(u64)) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute TCA_MQPRIO_MIN_RATE64 expected to have 8 bytes length"); + return -EINVAL; + } + if (i >= qopt->num_tc) break; priv->min_rate[i] = *(u64 *)nla_data(attr); @@ -196,6 +203,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, "Attribute type expected to be TCA_MQPRIO_MAX_RATE64"); return -EINVAL; } + + if (nla_len(attr) != sizeof(u64)) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute TCA_MQPRIO_MAX_RATE64 expected to have 8 bytes length"); + return -EINVAL; + } + if (i >= qopt->num_tc) break; priv->max_rate[i] = *(u64 *)nla_data(attr);
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit 5c85f7065718a949902b238a6abd8fc907c5d3e0 ]
in be_lancer_xmit_workarounds(), it should go to label 'tx_drop' if an unexpected value is returned by pskb_trim().
Fixes: 93040ae5cc8d ("be2net: Fix to trim skb for padded vlan packets to workaround an ASIC Bug") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Link: https://lore.kernel.org/r/20230725032726.15002-1-ruc_gongyuanjun@163.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/emulex/benet/be_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 3ccb955eb6f23..c14a3dbd075cc 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1139,7 +1139,8 @@ static struct sk_buff *be_lancer_xmit_workarounds(struct be_adapter *adapter, (lancer_chip(adapter) || BE3_chip(adapter) || skb_vlan_tag_present(skb)) && is_ipv4_pkt(skb)) { ip = (struct iphdr *)ip_hdr(skb); - pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len)); + if (unlikely(pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len)))) + goto tx_drop; }
/* If vlan tag is already inlined in the packet, skip HW VLAN
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit e46e06ffc6d667a89b979701288e2264f45e6a7b ]
goto free_skb if an unexpected result is returned by pskb_tirm() in tipc_crypto_rcv_complete().
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Reviewed-by: Tung Nguyen tung.q.nguyen@dektech.com.au Link: https://lore.kernel.org/r/20230725064810.5820-1-ruc_gongyuanjun@163.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/crypto.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c index 4243d2ab8adfb..32447e8d94ac9 100644 --- a/net/tipc/crypto.c +++ b/net/tipc/crypto.c @@ -1971,7 +1971,8 @@ static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead,
skb_reset_network_header(*skb); skb_pull(*skb, tipc_ehdr_size(ehdr)); - pskb_trim(*skb, (*skb)->len - aead->authsize); + if (pskb_trim(*skb, (*skb)->len - aead->authsize)) + goto free_skb;
/* Validate TIPCv2 message */ if (unlikely(!tipc_msg_validate(skb))) {
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit de52e17326c3e9a719c9ead4adb03467b8fae0ef ]
If tipc_link_bc_create() fails inside tipc_node_create() for a newly allocated tipc node then we should stop its tipc crypto and free the resources allocated with a call to tipc_crypto_start().
As the node ref is initialized to one to that point, just put the ref on tipc_link_bc_create() error case that would lead to tipc_node_free() be eventually executed and properly clean the node and its crypto resources.
Found by Linux Verification Center (linuxtesting.org).
Fixes: cb8092d70a6f ("tipc: move bc link creation back to tipc_node_create") Suggested-by: Xin Long lucien.xin@gmail.com Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/20230725214628.25246-1-pchelkin@ispras.ru Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/node.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tipc/node.c b/net/tipc/node.c index 5e000fde80676..a9c5b6594889b 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -583,7 +583,7 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id, n->capabilities, &n->bc_entry.inputq1, &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) { pr_warn("Broadcast rcv link creation failed, no memory\n"); - kfree(n); + tipc_node_put(n); n = NULL; goto exit; }
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit d64b1ee12a168030fbb3e0aebf7bce49e9a07589 ]
This code is trying to ensure that only the flags specified in the list are allowed. The problem is that ucmd->rx_hash_fields_mask is a u64 and the flags are an enum which is treated as a u32 in this context. That means the test doesn't check whether the highest 32 bits are zero.
Fixes: 4d02ebd9bbbd ("IB/mlx4: Fix RSS hash fields restrictions") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/233ed975-982d-422a-b498-410f71d8a101@moroto.mounta... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx4/qp.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index ec545b8858cc0..43b2aad845917 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -530,15 +530,15 @@ static int set_qp_rss(struct mlx4_ib_dev *dev, struct mlx4_ib_rss *rss_ctx, return (-EOPNOTSUPP); }
- if (ucmd->rx_hash_fields_mask & ~(MLX4_IB_RX_HASH_SRC_IPV4 | - MLX4_IB_RX_HASH_DST_IPV4 | - MLX4_IB_RX_HASH_SRC_IPV6 | - MLX4_IB_RX_HASH_DST_IPV6 | - MLX4_IB_RX_HASH_SRC_PORT_TCP | - MLX4_IB_RX_HASH_DST_PORT_TCP | - MLX4_IB_RX_HASH_SRC_PORT_UDP | - MLX4_IB_RX_HASH_DST_PORT_UDP | - MLX4_IB_RX_HASH_INNER)) { + if (ucmd->rx_hash_fields_mask & ~(u64)(MLX4_IB_RX_HASH_SRC_IPV4 | + MLX4_IB_RX_HASH_DST_IPV4 | + MLX4_IB_RX_HASH_SRC_IPV6 | + MLX4_IB_RX_HASH_DST_IPV6 | + MLX4_IB_RX_HASH_SRC_PORT_TCP | + MLX4_IB_RX_HASH_DST_PORT_TCP | + MLX4_IB_RX_HASH_SRC_PORT_UDP | + MLX4_IB_RX_HASH_DST_PORT_UDP | + MLX4_IB_RX_HASH_INNER)) { pr_debug("RX Hash fields_mask has unsupported mask (0x%llx)\n", ucmd->rx_hash_fields_mask); return (-EOPNOTSUPP);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit e8383f5cf1b3573ce140a80bfbfd809278ab16d6 ]
Drop the leftover of bus-client -> interconnect conversion, the enum dpu_core_perf_data_bus_id.
Fixes: cb88482e2570 ("drm/msm/dpu: clean up references of DPU custom bus scaling") Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/546048/ Link: https://lore.kernel.org/r/20230707193942.3806526-2-dmitry.baryshkov@linaro.o... Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h | 13 ------------- 1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h index cf4b9b5964c6c..cd6c3518ba021 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h @@ -14,19 +14,6 @@
#define DPU_PERF_DEFAULT_MAX_CORE_CLK_RATE 412500000
-/** - * enum dpu_core_perf_data_bus_id - data bus identifier - * @DPU_CORE_PERF_DATA_BUS_ID_MNOC: DPU/MNOC data bus - * @DPU_CORE_PERF_DATA_BUS_ID_LLCC: MNOC/LLCC data bus - * @DPU_CORE_PERF_DATA_BUS_ID_EBI: LLCC/EBI data bus - */ -enum dpu_core_perf_data_bus_id { - DPU_CORE_PERF_DATA_BUS_ID_MNOC, - DPU_CORE_PERF_DATA_BUS_ID_LLCC, - DPU_CORE_PERF_DATA_BUS_ID_EBI, - DPU_CORE_PERF_DATA_BUS_ID_MAX, -}; - /** * struct dpu_core_perf_params - definition of performance parameters * @max_per_pipe_ib: maximum instantaneous bandwidth request
From: Rob Clark robdclark@chromium.org
[ Upstream commit bd846ceee9c478d0397428f02696602ba5eb264a ]
The incorrect size was causing "CP | AHB bus error" when snapshotting the GPU state on a6xx gen4 (a660 family).
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26 Signed-off-by: Rob Clark robdclark@chromium.org Reviewed-by: Akhil P Oommen quic_akhilpo@quicinc.com Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state") Patchwork: https://patchwork.freedesktop.org/patch/546763/ Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h index 2fb58b7098e4b..3bd2065a9d30e 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h @@ -200,7 +200,7 @@ static const struct a6xx_shader_block { SHADER(A6XX_SP_LB_3_DATA, 0x800), SHADER(A6XX_SP_LB_4_DATA, 0x800), SHADER(A6XX_SP_LB_5_DATA, 0x200), - SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000), + SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x800), SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280), SHADER(A6XX_SP_UAV_DATA, 0x80), SHADER(A6XX_SP_INST_TAG, 0x80),
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit 4984eb51453ff7eddee9e5ce816145be39c0ec5c ]
On code inspection, there are many instances in the driver where CEQE and AEQE fields written to by HW are read without guaranteeing that the polarity bit has been read and checked first.
Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields prior to checking the polarity bit.
Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/ctrl.c | 9 ++++++++- drivers/infiniband/hw/irdma/puda.c | 6 ++++++ drivers/infiniband/hw/irdma/uk.c | 3 +++ 3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index 1ac7067e21be1..ab195ae089cfd 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -3395,6 +3395,9 @@ enum irdma_status_code irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, if (polarity != ccq->cq_uk.polarity) return IRDMA_ERR_Q_EMPTY;
+ /* Ensure CEQE contents are read after valid bit is checked */ + dma_rmb(); + get_64bit_val(cqe, 8, &qp_ctx); cqp = (struct irdma_sc_cqp *)(unsigned long)qp_ctx; info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, temp); @@ -4046,13 +4049,17 @@ enum irdma_status_code irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, u8 polarity;
aeqe = IRDMA_GET_CURRENT_AEQ_ELEM(aeq); - get_64bit_val(aeqe, 0, &compl_ctx); get_64bit_val(aeqe, 8, &temp); polarity = (u8)FIELD_GET(IRDMA_AEQE_VALID, temp);
if (aeq->polarity != polarity) return IRDMA_ERR_Q_EMPTY;
+ /* Ensure AEQE contents are read after valid bit is checked */ + dma_rmb(); + + get_64bit_val(aeqe, 0, &compl_ctx); + print_hex_dump_debug("WQE: AEQ_ENTRY WQE", DUMP_PREFIX_OFFSET, 16, 8, aeqe, 16, false);
diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c index 58e7d875643b8..197eba5eb78fa 100644 --- a/drivers/infiniband/hw/irdma/puda.c +++ b/drivers/infiniband/hw/irdma/puda.c @@ -235,6 +235,9 @@ irdma_puda_poll_info(struct irdma_sc_cq *cq, struct irdma_puda_cmpl_info *info) if (valid_bit != cq_uk->polarity) return IRDMA_ERR_Q_EMPTY;
+ /* Ensure CQE contents are read after valid bit is checked */ + dma_rmb(); + if (cq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3);
@@ -248,6 +251,9 @@ irdma_puda_poll_info(struct irdma_sc_cq *cq, struct irdma_puda_cmpl_info *info) if (polarity != cq_uk->polarity) return IRDMA_ERR_Q_EMPTY;
+ /* Ensure ext CQE contents are read after ext valid bit is checked */ + dma_rmb(); + IRDMA_RING_MOVE_HEAD_NOCHECK(cq_uk->cq_ring); if (!IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring)) cq_uk->polarity = !cq_uk->polarity; diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index a348f0c010ab3..4b00a9adbe3a5 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ b/drivers/infiniband/hw/irdma/uk.c @@ -1549,6 +1549,9 @@ void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq) if (polarity != temp) break;
+ /* Ensure CQE contents are read after valid bit is checked */ + dma_rmb(); + get_64bit_val(cqe, 8, &comp_ctx); if ((void *)(unsigned long)comp_ctx == q) set_64bit_val(cqe, 8, 0);
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit f2c3037811381f9149243828c7eb9a1631df9f9c ]
CQP completion statistics is read lockesly in irdma_wait_event and irdma_check_cqp_progress while it can be updated in the completion thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports.
Make completion statistics an atomic variable to reflect coherent updates to it. This will also avoid load/store tearing logic bug potentially possible by compiler optimizations.
[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]
[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4: [77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma] [77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma] [77346.171835] cqp_compl_worker+0x1b/0x20 [irdma] [77346.172009] process_one_work+0x4d1/0xa40 [77346.172024] worker_thread+0x319/0x700 [77346.172037] kthread+0x180/0x1b0 [77346.172054] ret_from_fork+0x22/0x30
[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2: [77346.172234] irdma_handle_cqp_op+0xf4/0x4b0 [irdma] [77346.172413] irdma_cqp_aeq_cmd+0x75/0xa0 [irdma] [77346.172592] irdma_create_aeq+0x390/0x45a [irdma] [77346.172769] irdma_rt_init_hw.cold+0x212/0x85d [irdma] [77346.172944] irdma_probe+0x54f/0x620 [irdma] [77346.173122] auxiliary_bus_probe+0x66/0xa0 [77346.173137] really_probe+0x140/0x540 [77346.173154] __driver_probe_device+0xc7/0x220 [77346.173173] driver_probe_device+0x5f/0x140 [77346.173190] __driver_attach+0xf0/0x2c0 [77346.173208] bus_for_each_dev+0xa8/0xf0 [77346.173225] driver_attach+0x29/0x30 [77346.173240] bus_add_driver+0x29c/0x2f0 [77346.173255] driver_register+0x10f/0x1a0 [77346.173272] __auxiliary_driver_register+0xbc/0x140 [77346.173287] irdma_init_module+0x55/0x1000 [irdma] [77346.173460] do_one_initcall+0x7d/0x410 [77346.173475] do_init_module+0x81/0x2c0 [77346.173491] load_module+0x1232/0x12c0 [77346.173506] __do_sys_finit_module+0x101/0x180 [77346.173522] __x64_sys_finit_module+0x3c/0x50 [77346.173538] do_syscall_64+0x39/0x90 [77346.173553] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/ctrl.c | 22 +++++++------- drivers/infiniband/hw/irdma/defs.h | 46 ++++++++++++++--------------- drivers/infiniband/hw/irdma/type.h | 2 ++ drivers/infiniband/hw/irdma/utils.c | 2 +- 4 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index ab195ae089cfd..ad14c2404e94c 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2741,13 +2741,13 @@ irdma_sc_cq_modify(struct irdma_sc_cq *cq, struct irdma_modify_cq_info *info, */ void irdma_check_cqp_progress(struct irdma_cqp_timeout *timeout, struct irdma_sc_dev *dev) { - if (timeout->compl_cqp_cmds != dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]) { - timeout->compl_cqp_cmds = dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]; + u64 completed_ops = atomic64_read(&dev->cqp->completed_ops); + + if (timeout->compl_cqp_cmds != completed_ops) { + timeout->compl_cqp_cmds = completed_ops; timeout->count = 0; - } else { - if (dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] != - timeout->compl_cqp_cmds) - timeout->count++; + } else if (timeout->compl_cqp_cmds != dev->cqp->requested_ops) { + timeout->count++; } }
@@ -2790,7 +2790,7 @@ static enum irdma_status_code irdma_cqp_poll_registers(struct irdma_sc_cqp *cqp, if (newtail != tail) { /* SUCCESS */ IRDMA_RING_MOVE_TAIL(cqp->sq_ring); - cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++; + atomic64_inc(&cqp->completed_ops); return 0; } udelay(cqp->dev->hw_attrs.max_sleep_count); @@ -3152,8 +3152,8 @@ enum irdma_status_code irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, info->dev->cqp = cqp;
IRDMA_RING_INIT(cqp->sq_ring, cqp->sq_size); - cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] = 0; - cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS] = 0; + cqp->requested_ops = 0; + atomic64_set(&cqp->completed_ops, 0); /* for the cqp commands backlog. */ INIT_LIST_HEAD(&cqp->dev->cqp_cmd_head);
@@ -3306,7 +3306,7 @@ __le64 *irdma_sc_cqp_get_next_send_wqe_idx(struct irdma_sc_cqp *cqp, u64 scratch if (ret_code) return NULL;
- cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS]++; + cqp->requested_ops++; if (!*wqe_idx) cqp->polarity = !cqp->polarity; wqe = cqp->sq_base[*wqe_idx].elem; @@ -3432,7 +3432,7 @@ enum irdma_status_code irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, dma_wmb(); /* make sure shadow area is updated before moving tail */
IRDMA_RING_MOVE_TAIL(cqp->sq_ring); - ccq->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++; + atomic64_inc(&cqp->completed_ops);
return ret_code; } diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index b8c10a6ccede5..afd16a93ac69c 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -190,32 +190,30 @@ enum irdma_cqp_op_type { IRDMA_OP_MANAGE_VF_PBLE_BP = 25, IRDMA_OP_QUERY_FPM_VAL = 26, IRDMA_OP_COMMIT_FPM_VAL = 27, - IRDMA_OP_REQ_CMDS = 28, - IRDMA_OP_CMPL_CMDS = 29, - IRDMA_OP_AH_CREATE = 30, - IRDMA_OP_AH_MODIFY = 31, - IRDMA_OP_AH_DESTROY = 32, - IRDMA_OP_MC_CREATE = 33, - IRDMA_OP_MC_DESTROY = 34, - IRDMA_OP_MC_MODIFY = 35, - IRDMA_OP_STATS_ALLOCATE = 36, - IRDMA_OP_STATS_FREE = 37, - IRDMA_OP_STATS_GATHER = 38, - IRDMA_OP_WS_ADD_NODE = 39, - IRDMA_OP_WS_MODIFY_NODE = 40, - IRDMA_OP_WS_DELETE_NODE = 41, - IRDMA_OP_WS_FAILOVER_START = 42, - IRDMA_OP_WS_FAILOVER_COMPLETE = 43, - IRDMA_OP_SET_UP_MAP = 44, - IRDMA_OP_GEN_AE = 45, - IRDMA_OP_QUERY_RDMA_FEATURES = 46, - IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 47, - IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 48, - IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 49, - IRDMA_OP_CQ_MODIFY = 50, + IRDMA_OP_AH_CREATE = 28, + IRDMA_OP_AH_MODIFY = 29, + IRDMA_OP_AH_DESTROY = 30, + IRDMA_OP_MC_CREATE = 31, + IRDMA_OP_MC_DESTROY = 32, + IRDMA_OP_MC_MODIFY = 33, + IRDMA_OP_STATS_ALLOCATE = 34, + IRDMA_OP_STATS_FREE = 35, + IRDMA_OP_STATS_GATHER = 36, + IRDMA_OP_WS_ADD_NODE = 37, + IRDMA_OP_WS_MODIFY_NODE = 38, + IRDMA_OP_WS_DELETE_NODE = 39, + IRDMA_OP_WS_FAILOVER_START = 40, + IRDMA_OP_WS_FAILOVER_COMPLETE = 41, + IRDMA_OP_SET_UP_MAP = 42, + IRDMA_OP_GEN_AE = 43, + IRDMA_OP_QUERY_RDMA_FEATURES = 44, + IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 45, + IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 46, + IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 47, + IRDMA_OP_CQ_MODIFY = 48,
/* Must be last entry*/ - IRDMA_MAX_CQP_OPS = 51, + IRDMA_MAX_CQP_OPS = 49, };
/* CQP SQ WQES */ diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 1241e5988c101..8b75e2610e5ba 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -411,6 +411,8 @@ struct irdma_sc_cqp { struct irdma_dcqcn_cc_params dcqcn_params; __le64 *host_ctx; u64 *scratch_array; + u64 requested_ops; + atomic64_t completed_ops; u32 cqp_id; u32 sq_size; u32 hw_sq_size; diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 1d9280d46d087..d96779996e026 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -567,7 +567,7 @@ static enum irdma_status_code irdma_wait_event(struct irdma_pci_f *rf, bool cqp_error = false; enum irdma_status_code err_code = 0;
- cqp_timeout.compl_cqp_cmds = rf->sc_dev.cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]; + cqp_timeout.compl_cqp_cmds = atomic64_read(&rf->sc_dev.cqp->completed_ops); do { irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq); if (wait_event_timeout(cqp_request->waitq,
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit f0842bb3d38863777e3454da5653d80b5fde6321 ]
KCSAN detects a data race on cqp_request->request_done memory location which is accessed locklessly in irdma_handle_cqp_op while being updated in irdma_cqp_ce_handler.
Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any compiler optimizations like load fusing and/or KCSAN warning.
[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]
[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5: [222808.417610] irdma_cqp_ce_handler+0x21e/0x270 [irdma] [222808.417725] cqp_compl_worker+0x1b/0x20 [irdma] [222808.417827] process_one_work+0x4d1/0xa40 [222808.417835] worker_thread+0x319/0x700 [222808.417842] kthread+0x180/0x1b0 [222808.417852] ret_from_fork+0x22/0x30
[222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1: [222808.417995] irdma_wait_event+0x1e2/0x2c0 [irdma] [222808.418099] irdma_handle_cqp_op+0xae/0x170 [irdma] [222808.418202] irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma] [222808.418308] irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma] [222808.418411] irdma_rt_deinit_hw+0x179/0x1d0 [irdma] [222808.418514] irdma_ib_dealloc_device+0x11/0x40 [irdma] [222808.418618] ib_dealloc_device+0x2a/0x120 [ib_core] [222808.418823] __ib_unregister_device+0xde/0x100 [ib_core] [222808.418981] ib_unregister_device+0x22/0x40 [ib_core] [222808.419142] irdma_ib_unregister_device+0x70/0x90 [irdma] [222808.419248] i40iw_close+0x6f/0xc0 [irdma] [222808.419352] i40e_client_device_unregister+0x14a/0x180 [i40e] [222808.419450] i40iw_remove+0x21/0x30 [irdma] [222808.419554] auxiliary_bus_remove+0x31/0x50 [222808.419563] device_remove+0x69/0xb0 [222808.419572] device_release_driver_internal+0x293/0x360 [222808.419582] driver_detach+0x7c/0xf0 [222808.419592] bus_remove_driver+0x8c/0x150 [222808.419600] driver_unregister+0x45/0x70 [222808.419610] auxiliary_driver_unregister+0x16/0x30 [222808.419618] irdma_exit_module+0x18/0x1e [irdma] [222808.419733] __do_sys_delete_module.constprop.0+0x1e2/0x310 [222808.419745] __x64_sys_delete_module+0x1b/0x30 [222808.419755] do_syscall_64+0x39/0x90 [222808.419763] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[222808.419829] value changed: 0x01 -> 0x03
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 2 +- drivers/infiniband/hw/irdma/main.h | 2 +- drivers/infiniband/hw/irdma/utils.c | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 2159470d7f7f4..0221200e16541 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -2084,7 +2084,7 @@ void irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq) cqp_request->compl_info.error = info.error;
if (cqp_request->waiting) { - cqp_request->request_done = true; + WRITE_ONCE(cqp_request->request_done, true); wake_up(&cqp_request->waitq); irdma_put_cqp_request(&rf->cqp, cqp_request); } else { diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index 454b4b370386c..f2e2bc50c6f7b 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -160,8 +160,8 @@ struct irdma_cqp_request { void (*callback_fcn)(struct irdma_cqp_request *cqp_request); void *param; struct irdma_cqp_compl_info compl_info; + bool request_done; /* READ/WRITE_ONCE macros operate on it */ bool waiting:1; - bool request_done:1; bool dynamic:1; };
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index d96779996e026..a47eedb6df82f 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -481,7 +481,7 @@ void irdma_free_cqp_request(struct irdma_cqp *cqp, if (cqp_request->dynamic) { kfree(cqp_request); } else { - cqp_request->request_done = false; + WRITE_ONCE(cqp_request->request_done, false); cqp_request->callback_fcn = NULL; cqp_request->waiting = false;
@@ -515,7 +515,7 @@ irdma_free_pending_cqp_request(struct irdma_cqp *cqp, { if (cqp_request->waiting) { cqp_request->compl_info.error = true; - cqp_request->request_done = true; + WRITE_ONCE(cqp_request->request_done, true); wake_up(&cqp_request->waitq); } wait_event_timeout(cqp->remove_wq, @@ -571,7 +571,7 @@ static enum irdma_status_code irdma_wait_event(struct irdma_pci_f *rf, do { irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq); if (wait_event_timeout(cqp_request->waitq, - cqp_request->request_done, + READ_ONCE(cqp_request->request_done), msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS))) break;
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit dc52aadbc1849cbe3fcf6bc54d35f6baa396e0a1 ]
Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") introduced a new struct mthca_sqp which doesn't contain struct mthca_qp any longer. Placing a pointer of this new struct into qptable leads to crashes, because mthca_poll_one() expects a qp pointer. Fix this by putting the correct pointer into qptable.
Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mthca/mthca_qp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 69bba0ef4a5df..53f43649f7d08 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -1393,7 +1393,7 @@ int mthca_alloc_sqp(struct mthca_dev *dev, if (mthca_array_get(&dev->qp_table.qp, mqpn)) err = -EBUSY; else - mthca_array_set(&dev->qp_table.qp, mqpn, qp->sqp); + mthca_array_set(&dev->qp_table.qp, mqpn, qp); spin_unlock_irq(&dev->qp_table.lock);
if (err)
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit b5bbc6551297447d3cca55cf907079e206e9cd82 ]
HW may generate completions that indicates QP is destroyed. Driver should not be scheduling any more completion handlers for this QP, after the QP is destroyed. Since CQs are active during the QP destroy, driver may still schedule completion handlers. This can cause a race where the destroy_cq and poll_cq running simultaneously.
Snippet of kernel panic while doing bnxt_re driver load unload in loop. This indicates a poll after the CQ is freed.
[77786.481636] Call Trace: [77786.481640] <TASK> [77786.481644] bnxt_re_poll_cq+0x14a/0x620 [bnxt_re] [77786.481658] ? kvm_clock_read+0x14/0x30 [77786.481693] __ib_process_cq+0x57/0x190 [ib_core] [77786.481728] ib_cq_poll_work+0x26/0x80 [ib_core] [77786.481761] process_one_work+0x1e5/0x3f0 [77786.481768] worker_thread+0x50/0x3a0 [77786.481785] ? __pfx_worker_thread+0x10/0x10 [77786.481790] kthread+0xe2/0x110 [77786.481794] ? __pfx_kthread+0x10/0x10 [77786.481797] ret_from_fork+0x2c/0x50
To avoid this, complete all completion handlers before returning the destroy QP. If free_cq is called soon after destroy_qp, IB stack will cancel the CQ work before invoking the destroy_cq verb and this will prevent any race mentioned.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1689322969-25402-2-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 ++++++++++++ drivers/infiniband/hw/bnxt_re/qplib_fp.c | 18 ++++++++++++++++++ drivers/infiniband/hw/bnxt_re/qplib_fp.h | 1 + 3 files changed, 31 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 843d0b5d99acd..87ee616e69384 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -792,7 +792,10 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp) int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata) { struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp); + struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp; struct bnxt_re_dev *rdev = qp->rdev; + struct bnxt_qplib_nq *scq_nq = NULL; + struct bnxt_qplib_nq *rcq_nq = NULL; unsigned int flags; int rc;
@@ -826,6 +829,15 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata) ib_umem_release(qp->rumem); ib_umem_release(qp->sumem);
+ /* Flush all the entries of notification queue associated with + * given qp. + */ + scq_nq = qplib_qp->scq->nq; + rcq_nq = qplib_qp->rcq->nq; + bnxt_re_synchronize_nq(scq_nq); + if (scq_nq != rcq_nq) + bnxt_re_synchronize_nq(rcq_nq); + return 0; }
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index d44b6a5c90b57..f1aa3e19b6de6 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -386,6 +386,24 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t) spin_unlock_bh(&hwq->lock); }
+/* bnxt_re_synchronize_nq - self polling notification queue. + * @nq - notification queue pointer + * + * This function will start polling entries of a given notification queue + * for all pending entries. + * This function is useful to synchronize notification entries while resources + * are going away. + */ + +void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq) +{ + int budget = nq->budget; + + nq->budget = nq->hwq.max_elements; + bnxt_qplib_service_nq(&nq->nq_tasklet); + nq->budget = budget; +} + static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance) { struct bnxt_qplib_nq *nq = dev_instance; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h index f859710f9a7f4..49d89c0808275 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h @@ -548,6 +548,7 @@ int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe, int num_cqes); void bnxt_qplib_flush_cqn_wq(struct bnxt_qplib_qp *qp); +void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq);
static inline void *bnxt_qplib_get_swqe(struct bnxt_qplib_q *que, u32 *swq_idx) {
From: Gaosheng Cui cuigaosheng1@huawei.com
[ Upstream commit 6e8a996563ecbe68e49c49abd4aaeef69f11f2dc ]
The msm_gem_get_vaddr() returns an ERR_PTR() on failure, and a null is catastrophic here, so we should use IS_ERR_OR_NULL() to check the return value.
Fixes: 6a8bd08d0465 ("drm/msm: add sudo flag to submit ioctl") Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Reviewed-by: Akhil P Oommen quic_akhilpo@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/547712/ Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index ef62900b06128..e9c8111122bd6 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -90,7 +90,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit * since we've already mapped it once in * submit_reloc() */ - if (WARN_ON(!ptr)) + if (WARN_ON(IS_ERR_OR_NULL(ptr))) return;
for (i = 0; i < dwords; i++) {
From: Matus Gajdos matuszpd@gmail.com
[ Upstream commit 0e4c2b6b0c4a4b4014d9424c27e5e79d185229c5 ]
Clear TX registers on stop to prevent the SPDIF interface from sending last written word over and over again.
Fixes: a2388a498ad2 ("ASoC: fsl: Add S/PDIF CPU DAI driver") Signed-off-by: Matus Gajdos matuszpd@gmail.com Reviewed-by: Fabio Estevam festevam@gmail.com Link: https://lore.kernel.org/r/20230719164729.19969-1-matuszpd@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/fsl/fsl_spdif.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c index 8b5c3ba48516c..5b107f2555ddb 100644 --- a/sound/soc/fsl/fsl_spdif.c +++ b/sound/soc/fsl/fsl_spdif.c @@ -666,6 +666,8 @@ static int fsl_spdif_trigger(struct snd_pcm_substream *substream, case SNDRV_PCM_TRIGGER_PAUSE_PUSH: regmap_update_bits(regmap, REG_SPDIF_SCR, dmaen, 0); regmap_update_bits(regmap, REG_SPDIF_SIE, intr, 0); + regmap_write(regmap, REG_SPDIF_STL, 0x0); + regmap_write(regmap, REG_SPDIF_STR, 0x0); break; default: return -EINVAL;
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit e0933b526fbfd937c4a8f4e35fcdd49f0e22d411 ]
Fix the symbolic names for zone conditions in the blkzoned.h header file.
Cc: Hannes Reinecke hare@suse.de Cc: Damien Le Moal dlemoal@kernel.org Fixes: 6a0cb1bc106f ("block: Implement support for zoned block devices") Signed-off-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20230706201422.3987341-1-bvanassche@acm.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/blkzoned.h | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h index 656a326821a2b..321965feee354 100644 --- a/include/uapi/linux/blkzoned.h +++ b/include/uapi/linux/blkzoned.h @@ -51,13 +51,13 @@ enum blk_zone_type { * * The Zone Condition state machine in the ZBC/ZAC standards maps the above * deinitions as: - * - ZC1: Empty | BLK_ZONE_EMPTY + * - ZC1: Empty | BLK_ZONE_COND_EMPTY * - ZC2: Implicit Open | BLK_ZONE_COND_IMP_OPEN * - ZC3: Explicit Open | BLK_ZONE_COND_EXP_OPEN - * - ZC4: Closed | BLK_ZONE_CLOSED - * - ZC5: Full | BLK_ZONE_FULL - * - ZC6: Read Only | BLK_ZONE_READONLY - * - ZC7: Offline | BLK_ZONE_OFFLINE + * - ZC4: Closed | BLK_ZONE_COND_CLOSED + * - ZC5: Full | BLK_ZONE_COND_FULL + * - ZC6: Read Only | BLK_ZONE_COND_READONLY + * - ZC7: Offline | BLK_ZONE_COND_OFFLINE * * Conditions 0x5 to 0xC are reserved by the current ZBC/ZAC spec and should * be considered invalid.
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit bae3028799dc4f1109acc4df37c8ff06f2d8f1a0 ]
In the error paths 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns.
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index eba277bb8a1f1..d3ac3b894bdf5 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3278,15 +3278,19 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) /* Try to adjust the raid4/5/6 stripe cache size to the stripe size */ if (rs_is_raid456(rs)) { r = rs_set_raid456_stripe_cache(rs); - if (r) + if (r) { + mddev_unlock(&rs->md); goto bad_stripe_cache; + } }
/* Now do an early reshape check */ if (test_bit(RT_FLAG_RESHAPE_RS, &rs->runtime_flags)) { r = rs_check_reshape(rs); - if (r) + if (r) { + mddev_unlock(&rs->md); goto bad_check_reshape; + }
/* Restore new, ctr requested layout to perform check */ rs_config_restore(rs, &rs_layout); @@ -3295,6 +3299,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = rs->md.pers->check_reshape(&rs->md); if (r) { ti->error = "Reshape check failed"; + mddev_unlock(&rs->md); goto bad_check_reshape; } }
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit e74c874eabe2e9173a8fbdad616cd89c70eb8ffd ]
There are four equivalent goto tags in raid_ctr(), clean them up to use just one.
There is no functional change and this is preparation to fix raid_ctr()'s unprotected md_stop().
Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Stable-dep-of: 7d5fff8982a2 ("dm raid: protect md_stop() with 'reconfig_mutex'") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index d3ac3b894bdf5..6d4e287350a33 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3258,8 +3258,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = md_start(&rs->md); if (r) { ti->error = "Failed to start raid array"; - mddev_unlock(&rs->md); - goto bad_md_start; + goto bad_unlock; }
/* If raid4/5/6 journal mode explicitly requested (only possible with journal dev) -> set it */ @@ -3267,8 +3266,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = r5c_journal_mode_set(&rs->md, rs->journal_dev.mode); if (r) { ti->error = "Failed to set raid4/5/6 journal mode"; - mddev_unlock(&rs->md); - goto bad_journal_mode_set; + goto bad_unlock; } }
@@ -3278,19 +3276,15 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) /* Try to adjust the raid4/5/6 stripe cache size to the stripe size */ if (rs_is_raid456(rs)) { r = rs_set_raid456_stripe_cache(rs); - if (r) { - mddev_unlock(&rs->md); - goto bad_stripe_cache; - } + if (r) + goto bad_unlock; }
/* Now do an early reshape check */ if (test_bit(RT_FLAG_RESHAPE_RS, &rs->runtime_flags)) { r = rs_check_reshape(rs); - if (r) { - mddev_unlock(&rs->md); - goto bad_check_reshape; - } + if (r) + goto bad_unlock;
/* Restore new, ctr requested layout to perform check */ rs_config_restore(rs, &rs_layout); @@ -3299,8 +3293,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = rs->md.pers->check_reshape(&rs->md); if (r) { ti->error = "Reshape check failed"; - mddev_unlock(&rs->md); - goto bad_check_reshape; + goto bad_unlock; } } } @@ -3311,10 +3304,8 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) mddev_unlock(&rs->md); return 0;
-bad_md_start: -bad_journal_mode_set: -bad_stripe_cache: -bad_check_reshape: +bad_unlock: + mddev_unlock(&rs->md); md_stop(&rs->md); bad: raid_set_free(rs);
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 7d5fff8982a2199d49ec067818af7d84d4f95ca0 ]
__md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' except for md_stop().
Also, update md_stop() to make certain 'reconfig_mutex' is held using lockdep_assert_held().
Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 4 +++- drivers/md/md.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 6d4e287350a33..8d489933d5792 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3305,8 +3305,8 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) return 0;
bad_unlock: - mddev_unlock(&rs->md); md_stop(&rs->md); + mddev_unlock(&rs->md); bad: raid_set_free(rs);
@@ -3317,7 +3317,9 @@ static void raid_dtr(struct dm_target *ti) { struct raid_set *rs = ti->private;
+ mddev_lock_nointr(&rs->md); md_stop(&rs->md); + mddev_unlock(&rs->md); raid_set_free(rs); }
diff --git a/drivers/md/md.c b/drivers/md/md.c index 5a21aeedc1ba7..89a270d293698 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6281,6 +6281,8 @@ static void __md_stop(struct mddev *mddev)
void md_stop(struct mddev *mddev) { + lockdep_assert_held(&mddev->reconfig_mutex); + /* stop the array and free an attached data structures. * This is called from dm-raid */
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit c01aebeef3ce45f696ffa0a1303cea9b34babb45 ]
If the second call to amdgpu_bo_create_kernel() fails, the memory allocated from the first call should be cleared. If the third call fails, the memory from the second call should be cleared.
Fixes: b95b5391684b ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Lijo Lazar lijo.lazar@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index 65744c3bd3648..f305a0f8e9b9a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -341,11 +341,11 @@ static int psp_sw_init(void *handle) return 0;
failed2: - amdgpu_bo_free_kernel(&psp->fw_pri_bo, - &psp->fw_pri_mc_addr, &psp->fw_pri_buf); -failed1: amdgpu_bo_free_kernel(&psp->fence_buf_bo, &psp->fence_buf_mc_addr, &psp->fence_buf); +failed1: + amdgpu_bo_free_kernel(&psp->fw_pri_bo, + &psp->fw_pri_mc_addr, &psp->fw_pri_buf); return ret; }
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit ae463563b7a1b7d4a3d0b065b09d37a76b693937 ]
Report the correct WC error if a MW bind is performed on an already valid/bound window.
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230725155439.1057-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 0221200e16541..70dffa9a9f674 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -191,6 +191,7 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp, case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS: case IRDMA_AE_AMP_MWBIND_BIND_DISABLED: case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS: + case IRDMA_AE_AMP_MWBIND_VALID_STAG: qp->flush_code = FLUSH_MW_BIND_ERR; qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR; break;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 3fc2febb0f8ffae354820c1772ec008733237cfa ]
The global function triggers a warning because of the missing prototype
drivers/ata/pata_ns87415.c:263:6: warning: no previous prototype for 'ns87560_tf_read' [-Wmissing-prototypes] 263 | void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
There are no other references to this, so just make it static.
Fixes: c4b5b7b6c4423 ("pata_ns87415: Initial cut at 87415/87560 IDE support") Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/pata_ns87415.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ata/pata_ns87415.c b/drivers/ata/pata_ns87415.c index 9dd6bffefb485..602472d4e693e 100644 --- a/drivers/ata/pata_ns87415.c +++ b/drivers/ata/pata_ns87415.c @@ -260,7 +260,7 @@ static u8 ns87560_check_status(struct ata_port *ap) * LOCKING: * Inherited from caller. */ -void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf) +static void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf) { struct ata_ioports *ioaddr = &ap->ioaddr;
From: Zheng Yejian zhengyejian1@huawei.com
[ Upstream commit 2d093282b0d4357373497f65db6a05eb0c28b7c8 ]
When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set to 0 in order to make sure any read iterators reset themselves. However, this will mess 'entries' stating, see following steps:
# cd /sys/kernel/tracing/ # 1. Enlarge ring buffer prepare for later reducing: # echo 20 > per_cpu/cpu0/buffer_size_kb # 2. Write a log into ring buffer of cpu0: # taskset -c 0 echo "hello1" > trace_marker # 3. Read the log: # cat per_cpu/cpu0/trace_pipe <...>-332 [000] ..... 62.406844: tracing_mark_write: hello1 # 4. Stop reading and see the stats, now 0 entries, and 1 event readed: # cat per_cpu/cpu0/stats entries: 0 [...] read events: 1 # 5. Reduce the ring buffer # echo 7 > per_cpu/cpu0/buffer_size_kb # 6. Now entries became unexpected 1 because actually no entries!!! # cat per_cpu/cpu0/stats entries: 1 [...] read events: 0
To fix it, introduce 'page_removed' field to count total removed pages since last reset, then use it to let read iterators reset themselves instead of changing the 'read' pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyej...
Cc: mhiramat@kernel.org Cc: vnagarnaik@google.com Fixes: 83f40318dab0 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index ceeba8bf1265b..e1cef097b0df5 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -520,6 +520,8 @@ struct ring_buffer_per_cpu { rb_time_t before_stamp; u64 event_stamp[MAX_NEST]; u64 read_stamp; + /* pages removed since last reset */ + unsigned long pages_removed; /* ring buffer pages to update, > 0 to add, < 0 to remove */ long nr_pages_to_update; struct list_head new_pages; /* new pages to add */ @@ -555,6 +557,7 @@ struct ring_buffer_iter { struct buffer_page *head_page; struct buffer_page *cache_reader_page; unsigned long cache_read; + unsigned long cache_pages_removed; u64 read_stamp; u64 page_stamp; struct ring_buffer_event *event; @@ -1931,6 +1934,8 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages) to_remove = rb_list_head(to_remove)->next; head_bit |= (unsigned long)to_remove & RB_PAGE_HEAD; } + /* Read iterators need to reset themselves when some pages removed */ + cpu_buffer->pages_removed += nr_removed;
next_page = rb_list_head(to_remove)->next;
@@ -1952,12 +1957,6 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages) cpu_buffer->head_page = list_entry(next_page, struct buffer_page, list);
- /* - * change read pointer to make sure any read iterators reset - * themselves - */ - cpu_buffer->read = 0; - /* pages are removed, resume tracing and then free the pages */ atomic_dec(&cpu_buffer->record_disabled); raw_spin_unlock_irq(&cpu_buffer->reader_lock); @@ -4347,6 +4346,7 @@ static void rb_iter_reset(struct ring_buffer_iter *iter)
iter->cache_reader_page = iter->head_page; iter->cache_read = cpu_buffer->read; + iter->cache_pages_removed = cpu_buffer->pages_removed;
if (iter->head) { iter->read_stamp = cpu_buffer->read_stamp; @@ -4800,12 +4800,13 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts) buffer = cpu_buffer->buffer;
/* - * Check if someone performed a consuming read to - * the buffer. A consuming read invalidates the iterator - * and we need to reset the iterator in this case. + * Check if someone performed a consuming read to the buffer + * or removed some pages from the buffer. In these cases, + * iterator was invalidated and we need to reset it. */ if (unlikely(iter->cache_read != cpu_buffer->read || - iter->cache_reader_page != cpu_buffer->reader_page)) + iter->cache_reader_page != cpu_buffer->reader_page || + iter->cache_pages_removed != cpu_buffer->pages_removed)) rb_iter_reset(iter);
again: @@ -5249,6 +5250,7 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer) cpu_buffer->last_overrun = 0;
rb_head_page_activate(cpu_buffer); + cpu_buffer->pages_removed = 0; }
/* Must have disabled the cpu buffer then done a synchronize_rcu */
From: Zheng Yejian zhengyejian1@huawei.com
[ Upstream commit dea499781a1150d285c62b26659f62fb00824fce ]
Warning happened in trace_buffered_event_disable() at WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace: ? __warn+0xa5/0x1b0 ? trace_buffered_event_disable+0x189/0x1b0 __ftrace_event_enable_disable+0x19e/0x3e0 free_probe_data+0x3b/0xa0 unregister_ftrace_function_probe_func+0x6b8/0x800 event_enable_func+0x2f0/0x3d0 ftrace_process_regex.isra.0+0x12d/0x1b0 ftrace_filter_write+0xe6/0x140 vfs_write+0x1c9/0x6f0 [...]
The cause of the warning is in __ftrace_event_enable_disable(), trace_buffered_event_enable() was called once while trace_buffered_event_disable() was called twice. Reproduction script show as below, for analysis, see the comments: ``` #!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then: # 1) SOFT_DISABLED_BIT was set; # 2) trace_buffered_event_enable() was called first time; echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \ set_ftrace_filter
# 2. Enable the event registered, then: # 1) SOFT_DISABLED_BIT was cleared; # 2) trace_buffered_event_disable() was called first time; echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was # set again!!! cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then: # 1) SOFT_DISABLED_BIT was cleared again; # 2) trace_buffered_event_disable() was called second time!!! echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \ set_ftrace_filter ```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at fist time soft-mode enabled, and call trace_buffered_event_disable() at last time soft-mode disabled.
Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyeji...
Cc: mhiramat@kernel.org Fixes: 0fc1b09ff1ff ("tracing: Use temp buffer when filtering events") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_events.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 160298d285c0b..2a2a599997671 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -594,7 +594,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, { struct trace_event_call *call = file->event_call; struct trace_array *tr = file->tr; - unsigned long file_flags = file->flags; int ret = 0; int disable;
@@ -618,6 +617,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, break; disable = file->flags & EVENT_FILE_FL_SOFT_DISABLED; clear_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags); + /* Disable use of trace_buffered_event */ + trace_buffered_event_disable(); } else disable = !(file->flags & EVENT_FILE_FL_SOFT_MODE);
@@ -656,6 +657,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, if (atomic_inc_return(&file->sm_ref) > 1) break; set_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags); + /* Enable use of trace_buffered_event */ + trace_buffered_event_enable(); }
if (!(file->flags & EVENT_FILE_FL_ENABLED)) { @@ -695,15 +698,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, break; }
- /* Enable or disable use of trace_buffered_event */ - if ((file_flags & EVENT_FILE_FL_SOFT_DISABLED) != - (file->flags & EVENT_FILE_FL_SOFT_DISABLED)) { - if (file->flags & EVENT_FILE_FL_SOFT_DISABLED) - trace_buffered_event_enable(); - else - trace_buffered_event_disable(); - } - return ret; }
From: Dan Carpenter dan.carpenter@linaro.org
commit a8291be6b5dd465c22af229483dbac543a91e24e upstream.
This reverts commit f08aa7c80dac27ee00fa6827f447597d2fba5465.
The reverted commit was based on static analysis and a misunderstanding of how PTR_ERR() and NULLs are supposed to work. When a function returns both pointer errors and NULL then normally the NULL means "continue operating without a feature because it was deliberately turned off". The NULL should not be treated as a failure. If a driver cannot work when that feature is disabled then the KConfig should enforce that the function cannot return NULL. We should not need to test for it.
In this driver, the bug means that probe cannot succeed when CONFIG_PM is disabled.
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Fixes: f08aa7c80dac ("usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/ZKQoBa84U/ykEh3C@moroto Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/tegra-xudc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -3689,15 +3689,15 @@ static int tegra_xudc_powerdomain_init(s int err;
xudc->genpd_dev_device = dev_pm_domain_attach_by_name(dev, "dev"); - if (IS_ERR_OR_NULL(xudc->genpd_dev_device)) { - err = PTR_ERR(xudc->genpd_dev_device) ? : -ENODATA; + if (IS_ERR(xudc->genpd_dev_device)) { + err = PTR_ERR(xudc->genpd_dev_device); dev_err(dev, "failed to get device power domain: %d\n", err); return err; }
xudc->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "ss"); - if (IS_ERR_OR_NULL(xudc->genpd_dev_ss)) { - err = PTR_ERR(xudc->genpd_dev_ss) ? : -ENODATA; + if (IS_ERR(xudc->genpd_dev_ss)) { + err = PTR_ERR(xudc->genpd_dev_ss); dev_err(dev, "failed to get SuperSpeed power domain: %d\n", err); return err; }
From: Frank Li Frank.Li@nxp.com
commit f4fc01af5b640bc39bd9403b5fd855345a2ad5f8 upstream.
The legacy gadget driver omitted calling usb_gadget_check_config() to ensure that the USB device controller (UDC) has adequate resources, including sufficient endpoint numbers and types, to support the given configuration.
Previously, usb_add_config() was solely invoked by the legacy gadget driver. Adds the necessary usb_gadget_check_config() after the bind() operation to fix the issue.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number") Cc: stable stable@kernel.org Reported-by: Ravi Gunasekaran r-gunasekaran@ti.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20230707230015.494999-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/composite.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -1033,6 +1033,10 @@ int usb_add_config(struct usb_composite_ goto done;
status = bind(config); + + if (status == 0) + status = usb_gadget_check_config(cdev->gadget); + if (status < 0) { while (!list_empty(&config->functions)) { struct usb_function *f;
From: Zqiang qiang.zhang1211@gmail.com
commit 83e30f2bf86ef7c38fbd476ed81a88522b620628 upstream.
Currently, increasing raw_dev->count happens before invoke the raw_queue_event(), if the raw_queue_event() return error, invoke raw_release() will not trigger the dev_free() to be called.
[ 268.905865][ T5067] raw-gadget.0 gadget.0: failed to queue event [ 268.912053][ T5067] udc dummy_udc.0: failed to start USB Raw Gadget: -12 [ 268.918885][ T5067] raw-gadget.0: probe of gadget.0 failed with error -12 [ 268.925956][ T5067] UDC core: USB Raw Gadget: couldn't find an available UDC or it's busy [ 268.934657][ T5067] misc raw-gadget: fail, usb_gadget_register_driver returned -16
BUG: memory leak
[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076 [<ffffffff8347eb55>] kmalloc include/linux/slab.h:582 [inline] [<ffffffff8347eb55>] kzalloc include/linux/slab.h:703 [inline] [<ffffffff8347eb55>] dev_new drivers/usb/gadget/legacy/raw_gadget.c:191 [inline] [<ffffffff8347eb55>] raw_open+0x45/0x110 drivers/usb/gadget/legacy/raw_gadget.c:385 [<ffffffff827d1d09>] misc_open+0x1a9/0x1f0 drivers/char/misc.c:165
[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076 [<ffffffff8347cd2f>] kmalloc include/linux/slab.h:582 [inline] [<ffffffff8347cd2f>] raw_ioctl_init+0xdf/0x410 drivers/usb/gadget/legacy/raw_gadget.c:460 [<ffffffff8347dfe9>] raw_ioctl+0x5f9/0x1120 drivers/usb/gadget/legacy/raw_gadget.c:1250 [<ffffffff81685173>] vfs_ioctl fs/ioctl.c:51 [inline]
[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076 [<ffffffff833ecc6a>] kmalloc include/linux/slab.h:582 [inline] [<ffffffff833ecc6a>] kzalloc include/linux/slab.h:703 [inline] [<ffffffff833ecc6a>] dummy_alloc_request+0x5a/0xe0 drivers/usb/gadget/udc/dummy_hcd.c:665 [<ffffffff833e9132>] usb_ep_alloc_request+0x22/0xd0 drivers/usb/gadget/udc/core.c:196 [<ffffffff8347f13d>] gadget_bind+0x6d/0x370 drivers/usb/gadget/legacy/raw_gadget.c:292
This commit therefore invoke kref_get() under the condition that raw_queue_event() return success.
Reported-by: syzbot+feb045d335c1fdde5bf7@syzkaller.appspotmail.com Cc: stable stable@kernel.org Closes: https://syzkaller.appspot.com/bug?extid=feb045d335c1fdde5bf7 Signed-off-by: Zqiang qiang.zhang1211@gmail.com Reviewed-by: Andrey Konovalov andreyknvl@gmail.com Tested-by: Andrey Konovalov andreyknvl@gmail.com Link: https://lore.kernel.org/r/20230714074011.20989-1-qiang.zhang1211@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/legacy/raw_gadget.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/usb/gadget/legacy/raw_gadget.c +++ b/drivers/usb/gadget/legacy/raw_gadget.c @@ -310,13 +310,15 @@ static int gadget_bind(struct usb_gadget dev->eps_num = i; spin_unlock_irqrestore(&dev->lock, flags);
- /* Matches kref_put() in gadget_unbind(). */ - kref_get(&dev->count); - ret = raw_queue_event(dev, USB_RAW_EVENT_CONNECT, 0, NULL); - if (ret < 0) + if (ret < 0) { dev_err(&gadget->dev, "failed to queue event\n"); + set_gadget_data(gadget, NULL); + return ret; + }
+ /* Matches kref_put() in gadget_unbind(). */ + kref_get(&dev->count); return ret; }
From: Sean Christopherson seanjc@google.com
commit eed3013faa401aae662398709410a59bb0646e32 upstream.
Grab a reference to KVM prior to installing VM and vCPU stats file descriptors to ensure the underlying VM and vCPU objects are not freed until the last reference to any and all stats fds are dropped.
Note, the stats paths manually invoke fd_install() and so don't need to grab a reference before creating the file.
Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU") Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM") Reported-by: Zheng Zhang zheng.zhang@email.ucr.edu Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5... Cc: stable@vger.kernel.org Cc: Kees Cook keescook@chromium.org Signed-off-by: Sean Christopherson seanjc@google.com Reviewed-by: Kees Cook keescook@chromium.org Message-Id: 20230711230131.648752-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3804,8 +3804,17 @@ static ssize_t kvm_vcpu_stats_read(struc sizeof(vcpu->stat), user_buffer, size, offset); }
+static int kvm_vcpu_stats_release(struct inode *inode, struct file *file) +{ + struct kvm_vcpu *vcpu = file->private_data; + + kvm_put_kvm(vcpu->kvm); + return 0; +} + static const struct file_operations kvm_vcpu_stats_fops = { .read = kvm_vcpu_stats_read, + .release = kvm_vcpu_stats_release, .llseek = noop_llseek, };
@@ -3826,6 +3835,9 @@ static int kvm_vcpu_ioctl_get_stats_fd(s put_unused_fd(fd); return PTR_ERR(file); } + + kvm_get_kvm(vcpu->kvm); + file->f_mode |= FMODE_PREAD; fd_install(fd, file);
@@ -4409,8 +4421,17 @@ static ssize_t kvm_vm_stats_read(struct sizeof(kvm->stat), user_buffer, size, offset); }
+static int kvm_vm_stats_release(struct inode *inode, struct file *file) +{ + struct kvm *kvm = file->private_data; + + kvm_put_kvm(kvm); + return 0; +} + static const struct file_operations kvm_vm_stats_fops = { .read = kvm_vm_stats_read, + .release = kvm_vm_stats_release, .llseek = noop_llseek, };
@@ -4429,6 +4450,9 @@ static int kvm_vm_ioctl_get_stats_fd(str put_unused_fd(fd); return PTR_ERR(file); } + + kvm_get_kvm(kvm); + file->f_mode |= FMODE_PREAD; fd_install(fd, file);
From: Sean Christopherson seanjc@google.com
commit c4abd7352023aa96114915a0bb2b88016a425cda upstream.
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only if KVM itself is not configured to utilize unrestricted guests, i.e. don't stuff CR0/CR4 for a restricted L2 that is running as the guest of an unrestricted L1. Any attempt to VM-Enter a restricted guest with invalid CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should never observe a restricted L2 with incompatible CR0/CR4, since nested VM-Enter from L1 should have failed.
And if KVM does observe an active, restricted L2 with incompatible state, e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail does more harm than good, as KVM will often neglect to undo the side effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus the damage can easily spill over to L1. On the other hand, letting VM-Enter fail due to bad guest state is more likely to contain the damage to L2 as KVM relies on hardware to perform most guest state consistency checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into L1 irrespective of (un)restricted guest behavior.
Cc: Jim Mattson jmattson@google.com Cc: stable@vger.kernel.org Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it") Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230613203037.1968489-3-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmx.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1416,6 +1416,11 @@ void vmx_set_rflags(struct kvm_vcpu *vcp struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned long old_rflags;
+ /* + * Unlike CR0 and CR4, RFLAGS handling requires checking if the vCPU + * is an unrestricted guest in order to mark L2 as needing emulation + * if L1 runs L2 as a restricted guest. + */ if (is_unrestricted_guest(vcpu)) { kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS); vmx->rflags = rflags; @@ -3088,7 +3093,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, old_cr0_pg = kvm_read_cr0_bits(vcpu, X86_CR0_PG);
hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF); - if (is_unrestricted_guest(vcpu)) + if (enable_unrestricted_guest) hw_cr0 |= KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST; else { hw_cr0 |= KVM_VM_CR0_ALWAYS_ON; @@ -3116,7 +3121,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, } #endif
- if (enable_ept && !is_unrestricted_guest(vcpu)) { + if (enable_ept && !enable_unrestricted_guest) { /* * Ensure KVM has an up-to-date snapshot of the guest's CR3. If * the below code _enables_ CR3 exiting, vmx_cache_reg() will @@ -3239,7 +3244,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long hw_cr4;
hw_cr4 = (cr4_read_shadow() & X86_CR4_MCE) | (cr4 & ~X86_CR4_MCE); - if (is_unrestricted_guest(vcpu)) + if (enable_unrestricted_guest) hw_cr4 |= KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST; else if (vmx->rmode.vm86_active) hw_cr4 |= KVM_RMODE_VM_CR4_ALWAYS_ON; @@ -3259,7 +3264,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, vcpu->arch.cr4 = cr4; kvm_register_mark_available(vcpu, VCPU_EXREG_CR4);
- if (!is_unrestricted_guest(vcpu)) { + if (!enable_unrestricted_guest) { if (enable_ept) { if (!is_paging(vcpu)) { hw_cr4 &= ~X86_CR4_PAE;
From: Johan Hovold johan+linaro@kernel.org
commit 4dd8752a14ca0303fbdf0a6c68ff65f0a50bd2fa upstream.
The runtime PM state should not be changed by drivers that do not implement runtime PM even if it happens to work around a bug in PM core.
With the wake irq arming now fixed, drop the bogus runtime PM state update which left the device in active state (and could potentially prevent a parent device from suspending).
Fixes: f3974413cf02 ("tty: serial: qcom_geni_serial: Wakeup IRQ cleanup") Cc: 5.6+ stable@vger.kernel.org # 5.6+ Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/qcom_geni_serial.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -1455,13 +1455,6 @@ static int qcom_geni_serial_probe(struct if (ret) return ret;
- /* - * Set pm_runtime status as ACTIVE so that wakeup_irq gets - * enabled/disabled from dev_pm_arm_wake_irq during system - * suspend/resume respectively. - */ - pm_runtime_set_active(&pdev->dev); - if (port->wakeup_irq > 0) { device_init_wakeup(&pdev->dev, true); ret = dev_pm_set_dedicated_wake_irq(&pdev->dev,
From: Ruihong Luo colorsu1922@gmail.com
commit 748c5ea8b8796ae8ee80b8d3a3d940570b588d59 upstream.
Preserve the original value of the Divisor Latch Fraction (DLF) register. When the DLF register is modified without preservation, it can disrupt the baudrate settings established by firmware or bootloader, leading to data corruption and the generation of unreadable or distorted characters.
Fixes: 701c5e73b296 ("serial: 8250_dw: add fractional divisor support") Cc: stable stable@kernel.org Signed-off-by: Ruihong Luo colorsu1922@gmail.com Link: https://lore.kernel.org/stable/20230713004235.35904-1-colorsu1922%40gmail.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230713004235.35904-1-colorsu1922@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_dwlib.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/tty/serial/8250/8250_dwlib.c +++ b/drivers/tty/serial/8250/8250_dwlib.c @@ -80,7 +80,7 @@ static void dw8250_set_divisor(struct ua void dw8250_setup_port(struct uart_port *p) { struct uart_8250_port *up = up_to_u8250p(p); - u32 reg; + u32 reg, old_dlf;
/* * If the Component Version Register returns zero, we know that @@ -93,9 +93,11 @@ void dw8250_setup_port(struct uart_port dev_dbg(p->dev, "Designware UART version %c.%c%c\n", (reg >> 24) & 0xff, (reg >> 16) & 0xff, (reg >> 8) & 0xff);
+ /* Preserve value written by firmware or bootloader */ + old_dlf = dw8250_readl_ext(p, DW_UART_DLF); dw8250_writel_ext(p, DW_UART_DLF, ~0U); reg = dw8250_readl_ext(p, DW_UART_DLF); - dw8250_writel_ext(p, DW_UART_DLF, 0); + dw8250_writel_ext(p, DW_UART_DLF, old_dlf);
if (reg) { struct dw8250_port_data *d = p->private_data;
From: Samuel Holland samuel.holland@sifive.com
commit 9b8fef6345d5487137d4193bb0a0eae2203c284e upstream.
This function is called indirectly from the platform driver probe function. Even if the driver is built in, it may be probed after free_initmem() due to deferral or unbinding/binding via sysfs. Thus the function cannot be marked as __init.
Fixes: 45c054d0815b ("tty: serial: add driver for the SiFive UART") Cc: stable stable@kernel.org Signed-off-by: Samuel Holland samuel.holland@sifive.com Link: https://lore.kernel.org/r/20230624060159.3401369-1-samuel.holland@sifive.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sifive.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sifive.c +++ b/drivers/tty/serial/sifive.c @@ -843,7 +843,7 @@ static void sifive_serial_console_write( local_irq_restore(flags); }
-static int __init sifive_serial_console_setup(struct console *co, char *options) +static int sifive_serial_console_setup(struct console *co, char *options) { struct sifive_serial_port *ssp; int baud = SIFIVE_DEFAULT_BAUD_RATE;
From: Jerry Meng jerry-meng@foxmail.com
commit 4f7cab49cecee16120d27c1734cfdf3d6c0e5329 upstream.
EM060K_128 is EM060K's sub-model, having the same name "Quectel EM060K-GL"
MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL
T: Bus=03 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=0128 Rev= 5.04 S: Manufacturer=Quectel S: Product=Quectel EM060K-GL S: SerialNumber=f6fa08b6 C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none) E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Jerry Meng jerry-meng@foxmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -251,6 +251,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_EM061K_LTA 0x0123 #define QUECTEL_PRODUCT_EM061K_LMS 0x0124 #define QUECTEL_PRODUCT_EC25 0x0125 +#define QUECTEL_PRODUCT_EM060K_128 0x0128 #define QUECTEL_PRODUCT_EG91 0x0191 #define QUECTEL_PRODUCT_EG95 0x0195 #define QUECTEL_PRODUCT_BG96 0x0296 @@ -1197,6 +1198,9 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x30) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x40) },
From: Mohsen Tahmasebi moh53n@moh53n.ir
commit 857ea9005806e2a458016880278f98715873e977 upstream.
Add Quectel EC200A "DIAG, AT, MODEM":
0x6005: ECM / RNDIS + DIAG + AT + MODEM
T: Bus=01 Lev=01 Prnt=02 Port=05 Cnt=01 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=6005 Rev=03.18 S: Manufacturer=Android S: Product=Android S: SerialNumber=0000 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
Signed-off-by: Mohsen Tahmasebi moh53n@moh53n.ir Tested-by: Mostafa Ghofrani mostafaghrr@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -269,6 +269,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_RM520N 0x0801 #define QUECTEL_PRODUCT_EC200U 0x0901 #define QUECTEL_PRODUCT_EC200S_CN 0x6002 +#define QUECTEL_PRODUCT_EC200A 0x6005 #define QUECTEL_PRODUCT_EM061K_LWW 0x6008 #define QUECTEL_PRODUCT_EM061K_LCN 0x6009 #define QUECTEL_PRODUCT_EC200T 0x6026 @@ -1229,6 +1230,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM520N, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, 0x0900, 0xff, 0, 0), /* RM500U-CN */ .driver_info = ZLP }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200A, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200U, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) },
From: Oliver Neukum oneukum@suse.com
commit dd92c8a1f99bcd166204ffc219ea5a23dd65d64f upstream.
Add the device and product ID for this CAN bus interface / license dongle. The device is usable either directly from user space or can be attached to a kernel CAN interface with slcan_attach.
Reported-by: Kaufmann Automotive GmbH info@kaufmann-automotive.ch Tested-by: Kaufmann Automotive GmbH info@kaufmann-automotive.ch Signed-off-by: Oliver Neukum oneukum@suse.com [ johan: amend commit message and move entries in sort order ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/usb-serial-simple.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/serial/usb-serial-simple.c +++ b/drivers/usb/serial/usb-serial-simple.c @@ -63,6 +63,11 @@ DEVICE(flashloader, FLASHLOADER_IDS); 0x01) } DEVICE(google, GOOGLE_IDS);
+/* KAUFMANN RKS+CAN VCP */ +#define KAUFMANN_IDS() \ + { USB_DEVICE(0x16d0, 0x0870) } +DEVICE(kaufmann, KAUFMANN_IDS); + /* Libtransistor USB console */ #define LIBTRANSISTOR_IDS() \ { USB_DEVICE(0x1209, 0x8b00) } @@ -124,6 +129,7 @@ static struct usb_serial_driver * const &funsoft_device, &flashloader_device, &google_device, + &kaufmann_device, &libtransistor_device, &vivopay_device, &moto_modem_device, @@ -142,6 +148,7 @@ static const struct usb_device_id id_tab FUNSOFT_IDS(), FLASHLOADER_IDS(), GOOGLE_IDS(), + KAUFMANN_IDS(), LIBTRANSISTOR_IDS(), VIVOPAY_IDS(), MOTO_IDS(),
From: Johan Hovold johan@kernel.org
commit d245aedc00775c4d7265a9f4522cc4e1fd34d102 upstream.
Sort the driver symbols alphabetically in order to make it more obvious where new driver entries should be added.
Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/usb-serial-simple.c | 66 ++++++++++++++++----------------- 1 file changed, 33 insertions(+), 33 deletions(-)
--- a/drivers/usb/serial/usb-serial-simple.c +++ b/drivers/usb/serial/usb-serial-simple.c @@ -38,16 +38,6 @@ static struct usb_serial_driver vendor## { USB_DEVICE(0x0a21, 0x8001) } /* MMT-7305WW */ DEVICE(carelink, CARELINK_IDS);
-/* ZIO Motherboard USB driver */ -#define ZIO_IDS() \ - { USB_DEVICE(0x1CBE, 0x0103) } -DEVICE(zio, ZIO_IDS); - -/* Funsoft Serial USB driver */ -#define FUNSOFT_IDS() \ - { USB_DEVICE(0x1404, 0xcddc) } -DEVICE(funsoft, FUNSOFT_IDS); - /* Infineon Flashloader driver */ #define FLASHLOADER_IDS() \ { USB_DEVICE_INTERFACE_CLASS(0x058b, 0x0041, USB_CLASS_CDC_DATA) }, \ @@ -55,6 +45,11 @@ DEVICE(funsoft, FUNSOFT_IDS); { USB_DEVICE(0x8087, 0x0801) } DEVICE(flashloader, FLASHLOADER_IDS);
+/* Funsoft Serial USB driver */ +#define FUNSOFT_IDS() \ + { USB_DEVICE(0x1404, 0xcddc) } +DEVICE(funsoft, FUNSOFT_IDS); + /* Google Serial USB SubClass */ #define GOOGLE_IDS() \ { USB_VENDOR_AND_INTERFACE_INFO(0x18d1, \ @@ -63,6 +58,11 @@ DEVICE(flashloader, FLASHLOADER_IDS); 0x01) } DEVICE(google, GOOGLE_IDS);
+/* HP4x (48/49) Generic Serial driver */ +#define HP4X_IDS() \ + { USB_DEVICE(0x03f0, 0x0121) } +DEVICE(hp4x, HP4X_IDS); + /* KAUFMANN RKS+CAN VCP */ #define KAUFMANN_IDS() \ { USB_DEVICE(0x16d0, 0x0870) } @@ -73,11 +73,6 @@ DEVICE(kaufmann, KAUFMANN_IDS); { USB_DEVICE(0x1209, 0x8b00) } DEVICE(libtransistor, LIBTRANSISTOR_IDS);
-/* ViVOpay USB Serial Driver */ -#define VIVOPAY_IDS() \ - { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */ -DEVICE(vivopay, VIVOPAY_IDS); - /* Motorola USB Phone driver */ #define MOTO_IDS() \ { USB_DEVICE(0x05c6, 0x3197) }, /* unknown Motorola phone */ \ @@ -106,10 +101,10 @@ DEVICE(nokia, NOKIA_IDS); { USB_DEVICE(0x09d7, 0x0100) } /* NovAtel FlexPack GPS */ DEVICE_N(novatel_gps, NOVATEL_IDS, 3);
-/* HP4x (48/49) Generic Serial driver */ -#define HP4X_IDS() \ - { USB_DEVICE(0x03f0, 0x0121) } -DEVICE(hp4x, HP4X_IDS); +/* Siemens USB/MPI adapter */ +#define SIEMENS_IDS() \ + { USB_DEVICE(0x908, 0x0004) } +DEVICE(siemens_mpi, SIEMENS_IDS);
/* Suunto ANT+ USB Driver */ #define SUUNTO_IDS() \ @@ -117,47 +112,52 @@ DEVICE(hp4x, HP4X_IDS); { USB_DEVICE(0x0fcf, 0x1009) } /* Dynastream ANT USB-m Stick */ DEVICE(suunto, SUUNTO_IDS);
-/* Siemens USB/MPI adapter */ -#define SIEMENS_IDS() \ - { USB_DEVICE(0x908, 0x0004) } -DEVICE(siemens_mpi, SIEMENS_IDS); +/* ViVOpay USB Serial Driver */ +#define VIVOPAY_IDS() \ + { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */ +DEVICE(vivopay, VIVOPAY_IDS); + +/* ZIO Motherboard USB driver */ +#define ZIO_IDS() \ + { USB_DEVICE(0x1CBE, 0x0103) } +DEVICE(zio, ZIO_IDS);
/* All of the above structures mushed into two lists */ static struct usb_serial_driver * const serial_drivers[] = { &carelink_device, - &zio_device, - &funsoft_device, &flashloader_device, + &funsoft_device, &google_device, + &hp4x_device, &kaufmann_device, &libtransistor_device, - &vivopay_device, &moto_modem_device, &motorola_tetra_device, &nokia_device, &novatel_gps_device, - &hp4x_device, - &suunto_device, &siemens_mpi_device, + &suunto_device, + &vivopay_device, + &zio_device, NULL };
static const struct usb_device_id id_table[] = { CARELINK_IDS(), - ZIO_IDS(), - FUNSOFT_IDS(), FLASHLOADER_IDS(), + FUNSOFT_IDS(), GOOGLE_IDS(), + HP4X_IDS(), KAUFMANN_IDS(), LIBTRANSISTOR_IDS(), - VIVOPAY_IDS(), MOTO_IDS(), MOTOROLA_TETRA_IDS(), NOKIA_IDS(), NOVATEL_IDS(), - HP4X_IDS(), - SUUNTO_IDS(), SIEMENS_IDS(), + SUUNTO_IDS(), + VIVOPAY_IDS(), + ZIO_IDS(), { }, }; MODULE_DEVICE_TABLE(usb, id_table);
From: Marc Kleine-Budde mkl@pengutronix.de
commit f8a2da6ec2417cca169fa85a8ab15817bccbb109 upstream.
After an initial link up the CAN device is in ERROR-ACTIVE mode. Due to a missing CAN_STATE_STOPPED in gs_can_close() it doesn't change to STOPPED after a link down:
| ip link set dev can0 up | ip link set dev can0 down | ip --details link show can0 | 13: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10 | link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0 | can state ERROR-ACTIVE restart-ms 1000
Add missing assignment of CAN_STATE_STOPPED in gs_can_close().
Cc: stable@vger.kernel.org Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://lore.kernel.org/all/20230718-gs_usb-fix-can-state-v1-1-f19738ae2c23@... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -734,6 +734,8 @@ static int gs_can_close(struct net_devic usb_kill_anchored_urbs(&dev->tx_submitted); atomic_set(&dev->active_tx_urbs, 0);
+ dev->can.state = CAN_STATE_STOPPED; + /* reset the device */ rc = gs_cmd_reset(dev); if (rc < 0)
From: Jakub Vanek linuxtardis@gmail.com
commit 734ae15ab95a18d3d425fc9cb38b7a627d786f08 upstream.
This reverts commit b138e23d3dff90c0494925b4c1874227b81bddf7.
AutoRetry has been found to sometimes cause controller freezes when communicating with buggy USB devices.
This controller feature allows the controller in host mode to send non-terminating/burst retry ACKs instead of terminating retry ACKs to devices when a transaction error (CRC error or overflow) occurs.
Unfortunately, if the USB device continues to respond with a CRC error, the controller will not complete endpoint-related commands while it keeps trying to auto-retry. [3] The xHCI driver will notice this once it tries to abort the transfer using a Stop Endpoint command and does not receive a completion in time. [1] This situation is reported to dmesg:
[sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00 xhci-hcd: xHCI host not responding to stop endpoint command xhci-hcd: xHCI host controller not responding, assume dead xhci-hcd: HC died; cleaning up
Some users observed this problem on an Odroid HC2 with the JMS578 USB3-to-SATA bridge. The issue can be triggered by starting a read-heavy workload on an attached SSD. After a while, the host controller would die and the SSD would disappear from the system. [1]
Further analysis by Synopsys determined that controller revisions other than the one in Odroid HC2 are also affected by this. The recommended solution was to disable AutoRetry altogether. This change does not have a noticeable performance impact. [2]
Revert the enablement commit. This will keep the AutoRetry bit in the default state configured during SoC design [2].
Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller") Link: https://lore.kernel.org/r/a21f34c04632d250cd0a78c7c6f4a1c9c7a43142.camel@gma... [1] Link: https://lore.kernel.org/r/20230711214834.kyr6ulync32d4ktk@synopsys.com/ [2] Link: https://lore.kernel.org/r/20230712225518.2smu7wse6djc7l5o@synopsys.com/ [3] Cc: stable@vger.kernel.org Cc: Mauro Ribeiro mauro.ribeiro@hardkernel.com Cc: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Suggested-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Signed-off-by: Jakub Vanek linuxtardis@gmail.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20230714122419.27741-1-linuxtardis@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 16 ---------------- drivers/usb/dwc3/core.h | 3 --- 2 files changed, 19 deletions(-)
--- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1093,22 +1093,6 @@ static int dwc3_core_init(struct dwc3 *d dwc3_writel(dwc->regs, DWC3_GUCTL1, reg); }
- if (dwc->dr_mode == USB_DR_MODE_HOST || - dwc->dr_mode == USB_DR_MODE_OTG) { - reg = dwc3_readl(dwc->regs, DWC3_GUCTL); - - /* - * Enable Auto retry Feature to make the controller operating in - * Host mode on seeing transaction errors(CRC errors or internal - * overrun scenerios) on IN transfers to reply to the device - * with a non-terminating retry ACK (i.e, an ACK transcation - * packet with Retry=1 & Nump != 0) - */ - reg |= DWC3_GUCTL_HSTINAUTORETRY; - - dwc3_writel(dwc->regs, DWC3_GUCTL, reg); - } - /* * Must config both number of packets and max burst settings to enable * RX and/or TX threshold. --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -252,9 +252,6 @@ #define DWC3_GCTL_GBLHIBERNATIONEN BIT(1) #define DWC3_GCTL_DSBLCLKGTNG BIT(0)
-/* Global User Control Register */ -#define DWC3_GUCTL_HSTINAUTORETRY BIT(14) - /* Global User Control 1 Register */ #define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT BIT(31) #define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
From: Gratian Crisan gratian.crisan@ni.com
commit b32b8f2b9542d8039f5468303a6ca78c1b5611a5 upstream.
Hardware based on the Bay Trail / BYT SoCs require an external ULPI phy for USB device-mode. The phy chip usually has its 'reset' and 'chip select' lines connected to GPIOs described by ACPI fwnodes in the DSDT table.
Because of hardware with missing ACPI resources for the 'reset' and 'chip select' GPIOs commit 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") introduced a fallback gpiod_lookup_table with hard-coded mappings for Bay Trail devices.
However there are existing Bay Trail based devices, like the National Instruments cRIO-903x series, where the phy chip has its 'reset' and 'chip-select' lines always asserted in hardware via resistor pull-ups. On this hardware the phy chip is always enabled and the ACPI dsdt table is missing information not only for the 'chip-select' and 'reset' lines but also for the BYT GPIO controller itself "INT33FC".
With the introduction of the gpiod_lookup_table initializing the USB device-mode on these hardware now errors out. The error comes from the gpiod_get_optional() calls in dwc3_pci_quirks() which will now return an -ENOENT error due to the missing ACPI entry for the INT33FC gpio controller used in the aforementioned table.
This hardware used to work before because gpiod_get_optional() will return NULL instead of -ENOENT if no GPIO has been assigned to the requested function. The dwc3_pci_quirks() code for setting the 'cs' and 'reset' GPIOs was then skipped (due to the NULL return). This is the correct behavior in cases where the phy chip is hardwired and there are no GPIOs to control.
Since the gpiod_lookup_table relies on the presence of INT33FC fwnode in ACPI tables only add the table if we know the entry for the INT33FC gpio controller is present. This allows Bay Trail based devices with hardwired dwc3 ULPI phys to continue working.
Fixes: 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") Cc: stable stable@kernel.org Signed-off-by: Gratian Crisan gratian.crisan@ni.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230726184555.218091-2-gratian.crisan@ni.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-pci.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/dwc3-pci.c +++ b/drivers/usb/dwc3/dwc3-pci.c @@ -219,10 +219,12 @@ static int dwc3_pci_quirks(struct dwc3_p
/* * A lot of BYT devices lack ACPI resource entries for - * the GPIOs, add a fallback mapping to the reference + * the GPIOs. If the ACPI entry for the GPIO controller + * is present add a fallback mapping to the reference * design GPIOs which all boards seem to use. */ - gpiod_add_lookup_table(&platform_bytcr_gpios); + if (acpi_dev_present("INT33FC", NULL, -1)) + gpiod_add_lookup_table(&platform_bytcr_gpios);
/* * These GPIOs will turn on the USB2 PHY. Note that we have to
From: Jisheng Zhang jszhang@kernel.org
commit e835c0a4e23c38531dcee5ef77e8d1cf462658c7 upstream.
Commit c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on system_suspend in host mode") replaces check for HOST only dr_mode with current_dr_role. But during booting, the current_dr_role isn't initialized, thus the device side reset is always issued even if dwc3 was configured as host-only. What's more, on some platforms with host only dwc3, aways issuing device side reset by accessing device register block can cause kernel panic.
Fixes: c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on system_suspend in host mode") Cc: stable stable@kernel.org Signed-off-by: Jisheng Zhang jszhang@kernel.org Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20230627162018.739-1-jszhang@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -275,9 +275,9 @@ int dwc3_core_soft_reset(struct dwc3 *dw /* * We're resetting only the device side because, if we're in host mode, * XHCI driver will reset the host block. If dwc3 was configured for - * host-only mode, then we can return early. + * host-only mode or current role is host, then we can return early. */ - if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST) + if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST) return 0;
reg = dwc3_readl(dwc->regs, DWC3_DCTL);
From: Guiting Shen aarongt.shen@gmail.com
commit c55afcbeaa7a6f4fffdbc999a9bf3f0b29a5186f upstream.
The ohci_hcd_at91_drv_suspend() sets ohci->rh_state to OHCI_RH_HALTED when suspend which will let the ohci_irq() skip the interrupt after resume. And nobody to handle this interrupt.
According to the comment in ohci_hcd_at91_drv_suspend(), it need to reset when resume from suspend(MEM) to fix by setting "hibernated" argument of ohci_resume().
Signed-off-by: Guiting Shen aarongt.shen@gmail.com Cc: stable stable@kernel.org Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20230626152713.18950-1-aarongt.shen@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ohci-at91.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/ohci-at91.c +++ b/drivers/usb/host/ohci-at91.c @@ -652,7 +652,13 @@ ohci_hcd_at91_drv_resume(struct device * else at91_start_clock(ohci_at91);
- ohci_resume(hcd, false); + /* + * According to the comment in ohci_hcd_at91_drv_suspend() + * we need to do a reset if the 48Mhz clock was stopped, + * that is, if ohci_at91->wakeup is clear. Tell ohci_resume() + * to reset in this case by setting its "hibernated" flag. + */ + ohci_resume(hcd, !ohci_at91->wakeup);
return 0; }
From: Łukasz Bartosik lb@semihalf.com
commit 9dc162e22387080e2d06de708b89920c0e158c9a upstream.
The Focusrite Scarlett audio device does not behave correctly during resumes. Below is what happens during every resume (captured with Beagle 5000):
<Suspend> <Resume> <Reset>/<Chirp J>/<Tiny J> <Reset/Target disconnected> <High Speed>
The Scarlett disconnects and is enumerated again.
However from time to time it drops completely off the USB bus during resume. Below is captured occurrence of such an event:
<Suspend> <Resume> <Reset>/<Chirp J>/<Tiny J> <Reset>/<Chirp K>/<Tiny K> <High Speed> <Corrupted packet> <Reset/Target disconnected>
To fix the condition a user has to unplug and plug the device again.
With USB_QUIRK_RESET_RESUME applied ("usbcore.quirks=1235:8211:b") for the Scarlett audio device the issue still reproduces.
Applying USB_QUIRK_DISCONNECT_SUSPEND ("usbcore.quirks=1235:8211:m") fixed the issue and the Scarlett audio device didn't drop off the USB bus for ~5000 suspend/resume cycles where originally issue reproduced in ~100 or less suspend/resume cycles.
Signed-off-by: Łukasz Bartosik lb@semihalf.com Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/20230724112911.1802577-1-lb@semihalf.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/quirks.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -436,6 +436,10 @@ static const struct usb_device_id usb_qu /* novation SoundControl XL */ { USB_DEVICE(0x1235, 0x0061), .driver_info = USB_QUIRK_RESET_RESUME },
+ /* Focusrite Scarlett Solo USB */ + { USB_DEVICE(0x1235, 0x8211), .driver_info = + USB_QUIRK_DISCONNECT_SUSPEND }, + /* Huawei 4G LTE module */ { USB_DEVICE(0x12d1, 0x15bb), .driver_info = USB_QUIRK_DISCONNECT_SUSPEND },
From: Frank Li Frank.Li@nxp.com
commit 2627335a1329a0d39d8d277994678571c4f21800 upstream.
Previously, the cdns3_gadget_check_config() function in the cdns3 driver mistakenly calculated the ep_buf_size by considering only one configuration's endpoint information because "claimed" will be clear after call usb_gadget_check_config().
The fix involves checking the private flags EP_CLAIMED instead of relying on the "claimed" flag.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number") Cc: stable stable@kernel.org Reported-by: Ravi Gunasekaran r-gunasekaran@ti.com Signed-off-by: Frank Li Frank.Li@nxp.com Acked-by: Peter Chen peter.chen@kernel.org Tested-by: Ravi Gunasekaran r-gunasekaran@ti.com Link: https://lore.kernel.org/r/20230707230015.494999-2-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/cdns3/cdns3-gadget.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/cdns3/cdns3-gadget.c +++ b/drivers/usb/cdns3/cdns3-gadget.c @@ -3012,12 +3012,14 @@ static int cdns3_gadget_udc_stop(struct static int cdns3_gadget_check_config(struct usb_gadget *gadget) { struct cdns3_device *priv_dev = gadget_to_cdns3_device(gadget); + struct cdns3_endpoint *priv_ep; struct usb_ep *ep; int n_in = 0; int total;
list_for_each_entry(ep, &gadget->ep_list, ep_list) { - if (ep->claimed && (ep->address & USB_DIR_IN)) + priv_ep = ep_to_cdns3_ep(ep); + if ((priv_ep->flags & EP_CLAIMED) && (ep->address & USB_DIR_IN)) n_in++; }
From: Ricardo Ribalda ribalda@chromium.org
commit 9fd10829a9eb482e192a845675ecc5480e0bfa10 upstream.
Allow devices to have dma operations beyond 64K, and avoid warnings such as:
DMA-API: xhci-mtk 11200000.usb: mapping sg segment longer than device claims to support [len=98304] [max=65536]
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller") Cc: stable stable@kernel.org Tested-by: Zubin Mithra zsm@chromium.org Reported-by: Zubin Mithra zsm@chromium.org Signed-off-by: Ricardo Ribalda ribalda@chromium.org Link: https://lore.kernel.org/r/20230628-mtk-usb-v2-1-c8c34eb9f229@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-mtk.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/host/xhci-mtk.c +++ b/drivers/usb/host/xhci-mtk.c @@ -570,6 +570,7 @@ static int xhci_mtk_probe(struct platfor }
device_init_wakeup(dev, true); + dma_set_max_seg_size(dev, UINT_MAX);
xhci = hcd_to_xhci(hcd); xhci->main_hcd = hcd;
From: Dan Carpenter dan.carpenter@linaro.org
commit 288b4fa1798e3637a9304c6e90a93d900e02369c upstream.
This reverts commit 18fc7c435be3f17ea26a21b2e2312fcb9088e01f.
The reverted commit was based on static analysis and a misunderstanding of how PTR_ERR() and NULLs are supposed to work. When a function returns both pointer errors and NULL then normally the NULL means "continue operating without a feature because it was deliberately turned off". The NULL should not be treated as a failure. If a driver cannot work when that feature is disabled then the KConfig should enforce that the function cannot return NULL. We should not need to test for it.
In this code, the patch means that certain tegra_xusb_probe() will fail if the firmware supports power-domains but CONFIG_PM is disabled.
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Fixes: 18fc7c435be3 ("usb: xhci: tegra: Fix error check") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/8baace8d-fb4b-41a4-ad5f-848ae643a23b@moroto.mounta... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-tegra.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/host/xhci-tegra.c +++ b/drivers/usb/host/xhci-tegra.c @@ -1010,15 +1010,15 @@ static int tegra_xusb_powerdomain_init(s int err;
tegra->genpd_dev_host = dev_pm_domain_attach_by_name(dev, "xusb_host"); - if (IS_ERR_OR_NULL(tegra->genpd_dev_host)) { - err = PTR_ERR(tegra->genpd_dev_host) ? : -ENODATA; + if (IS_ERR(tegra->genpd_dev_host)) { + err = PTR_ERR(tegra->genpd_dev_host); dev_err(dev, "failed to get host pm-domain: %d\n", err); return err; }
tegra->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "xusb_ss"); - if (IS_ERR_OR_NULL(tegra->genpd_dev_ss)) { - err = PTR_ERR(tegra->genpd_dev_ss) ? : -ENODATA; + if (IS_ERR(tegra->genpd_dev_ss)) { + err = PTR_ERR(tegra->genpd_dev_ss); dev_err(dev, "failed to get superspeed pm-domain: %d\n", err); return err; }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 4fee0915e649bd0cea56dece6d96f8f4643df33c upstream.
Because the linux-distros group forces reporters to release information about reported bugs, and they impose arbitrary deadlines in having those bugs fixed despite not actually being kernel developers, the kernel security team recommends not interacting with them at all as this just causes confusion and the early-release of reported security problems.
Reviewed-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/2023063020-throat-pantyhose-f110@gregkh Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/security-bugs.rst | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-)
--- a/Documentation/admin-guide/security-bugs.rst +++ b/Documentation/admin-guide/security-bugs.rst @@ -63,20 +63,18 @@ information submitted to the security li of the report are treated confidentially even after the embargo has been lifted, in perpetuity.
-Coordination ------------- +Coordination with other groups +------------------------------
-Fixes for sensitive bugs, such as those that might lead to privilege -escalations, may need to be coordinated with the private -linux-distros@vs.openwall.org mailing list so that distribution vendors -are well prepared to issue a fixed kernel upon public disclosure of the -upstream fix. Distros will need some time to test the proposed patch and -will generally request at least a few days of embargo, and vendor update -publication prefers to happen Tuesday through Thursday. When appropriate, -the security team can assist with this coordination, or the reporter can -include linux-distros from the start. In this case, remember to prefix -the email Subject line with "[vs]" as described in the linux-distros wiki: -http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists +The kernel security team strongly recommends that reporters of potential +security issues NEVER contact the "linux-distros" mailing list until +AFTER discussing it with the kernel security team. Do not Cc: both +lists at once. You may contact the linux-distros mailing list after a +fix has been agreed on and you fully understand the requirements that +doing so will impose on you and the kernel community. + +The different lists have different goals and the linux-distros rules do +not contribute to actually fixing any potential security problems.
CVE assignment --------------
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 3c1897ae4b6bc7cc586eda2feaa2cd68325ec29c upstream.
The kernel security team does NOT assign CVEs, so document that properly and provide the "if you want one, ask MITRE for it" response that we give on a weekly basis in the document, so we don't have to constantly say it to everyone who asks.
Link: https://lore.kernel.org/r/2023063022-retouch-kerosene-7e4a@gregkh Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/security-bugs.rst | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
--- a/Documentation/admin-guide/security-bugs.rst +++ b/Documentation/admin-guide/security-bugs.rst @@ -79,13 +79,12 @@ not contribute to actually fixing any po CVE assignment --------------
-The security team does not normally assign CVEs, nor do we require them -for reports or fixes, as this can needlessly complicate the process and -may delay the bug handling. If a reporter wishes to have a CVE identifier -assigned ahead of public disclosure, they will need to contact the private -linux-distros list, described above. When such a CVE identifier is known -before a patch is provided, it is desirable to mention it in the commit -message if the reporter agrees. +The security team does not assign CVEs, nor do we require them for +reports or fixes, as this can needlessly complicate the process and may +delay the bug handling. If a reporter wishes to have a CVE identifier +assigned, they should find one by themselves, for example by contacting +MITRE directly. However under no circumstances will a patch inclusion +be delayed to wait for a CVE identifier to arrive.
Non-disclosure agreements -------------------------
From: Larry Finger Larry.Finger@lwfinger.net
commit ac83631230f77dda94154ed0ebfd368fc81c70a3 upstream.
In the above mentioned routine, memory is allocated in several places. If the first succeeds and a later one fails, the routine will leak memory. This patch fixes commit 2865d42c78a9 ("staging: r8712u: Add the new driver to the mainline kernel"). A potential memory leak in r8712_xmit_resource_alloc() is also addressed.
Fixes: 2865d42c78a9 ("staging: r8712u: Add the new driver to the mainline kernel") Reported-by: syzbot+cf71097ffb6755df8251@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/x/log.txt?x=11ac3fa0a80000 Cc: stable@vger.kernel.org Cc: Nam Cao namcaov@gmail.com Signed-off-by: Larry Finger Larry.Finger@lwfinger.net Reviewed-by: Nam Cao namcaov@gmail.com Link: https://lore.kernel.org/r/20230714175417.18578-1-Larry.Finger@lwfinger.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8712/rtl871x_xmit.c | 43 ++++++++++++++++++++++++++------- drivers/staging/rtl8712/xmit_linux.c | 6 ++++ 2 files changed, 40 insertions(+), 9 deletions(-)
--- a/drivers/staging/rtl8712/rtl871x_xmit.c +++ b/drivers/staging/rtl8712/rtl871x_xmit.c @@ -21,6 +21,7 @@ #include "osdep_intf.h" #include "usb_ops.h"
+#include <linux/usb.h> #include <linux/ieee80211.h>
static const u8 P802_1H_OUI[P80211_OUI_LEN] = {0x00, 0x00, 0xf8}; @@ -55,6 +56,7 @@ int _r8712_init_xmit_priv(struct xmit_pr sint i; struct xmit_buf *pxmitbuf; struct xmit_frame *pxframe; + int j;
memset((unsigned char *)pxmitpriv, 0, sizeof(struct xmit_priv)); spin_lock_init(&pxmitpriv->lock); @@ -117,11 +119,8 @@ int _r8712_init_xmit_priv(struct xmit_pr _init_queue(&pxmitpriv->pending_xmitbuf_queue); pxmitpriv->pallocated_xmitbuf = kmalloc(NR_XMITBUFF * sizeof(struct xmit_buf) + 4, GFP_ATOMIC); - if (!pxmitpriv->pallocated_xmitbuf) { - kfree(pxmitpriv->pallocated_frame_buf); - pxmitpriv->pallocated_frame_buf = NULL; - return -ENOMEM; - } + if (!pxmitpriv->pallocated_xmitbuf) + goto clean_up_frame_buf; pxmitpriv->pxmitbuf = pxmitpriv->pallocated_xmitbuf + 4 - ((addr_t)(pxmitpriv->pallocated_xmitbuf) & 3); pxmitbuf = (struct xmit_buf *)pxmitpriv->pxmitbuf; @@ -129,13 +128,17 @@ int _r8712_init_xmit_priv(struct xmit_pr INIT_LIST_HEAD(&pxmitbuf->list); pxmitbuf->pallocated_buf = kmalloc(MAX_XMITBUF_SZ + XMITBUF_ALIGN_SZ, GFP_ATOMIC); - if (!pxmitbuf->pallocated_buf) - return -ENOMEM; + if (!pxmitbuf->pallocated_buf) { + j = 0; + goto clean_up_alloc_buf; + } pxmitbuf->pbuf = pxmitbuf->pallocated_buf + XMITBUF_ALIGN_SZ - ((addr_t) (pxmitbuf->pallocated_buf) & (XMITBUF_ALIGN_SZ - 1)); - if (r8712_xmit_resource_alloc(padapter, pxmitbuf)) - return -ENOMEM; + if (r8712_xmit_resource_alloc(padapter, pxmitbuf)) { + j = 1; + goto clean_up_alloc_buf; + } list_add_tail(&pxmitbuf->list, &(pxmitpriv->free_xmitbuf_queue.queue)); pxmitbuf++; @@ -146,6 +149,28 @@ int _r8712_init_xmit_priv(struct xmit_pr init_hwxmits(pxmitpriv->hwxmits, pxmitpriv->hwxmit_entry); tasklet_setup(&pxmitpriv->xmit_tasklet, r8712_xmit_bh); return 0; + +clean_up_alloc_buf: + if (j) { + /* failure happened in r8712_xmit_resource_alloc() + * delete extra pxmitbuf->pallocated_buf + */ + kfree(pxmitbuf->pallocated_buf); + } + for (j = 0; j < i; j++) { + int k; + + pxmitbuf--; /* reset pointer */ + kfree(pxmitbuf->pallocated_buf); + for (k = 0; k < 8; k++) /* delete xmit urb's */ + usb_free_urb(pxmitbuf->pxmit_urb[k]); + } + kfree(pxmitpriv->pallocated_xmitbuf); + pxmitpriv->pallocated_xmitbuf = NULL; +clean_up_frame_buf: + kfree(pxmitpriv->pallocated_frame_buf); + pxmitpriv->pallocated_frame_buf = NULL; + return -ENOMEM; }
void _free_xmit_priv(struct xmit_priv *pxmitpriv) --- a/drivers/staging/rtl8712/xmit_linux.c +++ b/drivers/staging/rtl8712/xmit_linux.c @@ -118,6 +118,12 @@ int r8712_xmit_resource_alloc(struct _ad for (i = 0; i < 8; i++) { pxmitbuf->pxmit_urb[i] = usb_alloc_urb(0, GFP_KERNEL); if (!pxmitbuf->pxmit_urb[i]) { + int k; + + for (k = i - 1; k >= 0; k--) { + /* handle allocation errors part way through loop */ + usb_free_urb(pxmitbuf->pxmit_urb[k]); + } netdev_err(padapter->pnetdev, "pxmitbuf->pxmit_urb[i] == NULL\n"); return -ENOMEM; }
From: Zhang Shurong zhang_shurong@foxmail.com
commit 5f1c7031e044cb2fba82836d55cc235e2ad619dc upstream.
The "exc->key_len" is a u16 that comes from the user. If it's over IW_ENCODING_TOKEN_MAX (64) that could lead to memory corruption.
Fixes: b121d84882b9 ("staging: ks7010: simplify calls to memcpy()") Cc: stable stable@kernel.org Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/tencent_5153B668C0283CAA15AA518325346E026A09@qq.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/ks7010/ks_wlan_net.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/staging/ks7010/ks_wlan_net.c +++ b/drivers/staging/ks7010/ks_wlan_net.c @@ -1584,8 +1584,10 @@ static int ks_wlan_set_encode_ext(struct commit |= SME_WEP_FLAG; } if (enc->key_len) { - memcpy(&key->key_val[0], &enc->key[0], enc->key_len); - key->key_len = enc->key_len; + int key_len = clamp_val(enc->key_len, 0, IW_ENCODING_TOKEN_MAX); + + memcpy(&key->key_val[0], &enc->key[0], key_len); + key->key_len = key_len; commit |= (SME_WEP_VAL1 << index); } break;
From: Chaoyuan Peng hedonistsmith@gmail.com
commit 9b9c8195f3f0d74a826077fc1c01b9ee74907239 upstream.
In gsm_cleanup_mux() the 'gsm->dlci' pointer was not cleaned properly, leaving it a dangling pointer after gsm_dlci_release. This leads to use-after-free where 'gsm->dlci[0]' are freed and accessed by the subsequent gsm_cleanup_mux().
Such is the case in the following call trace:
<TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106 print_address_description+0x63/0x3b0 mm/kasan/report.c:248 __kasan_report mm/kasan/report.c:434 [inline] kasan_report+0x16b/0x1c0 mm/kasan/report.c:451 gsm_cleanup_mux+0x76a/0x850 drivers/tty/n_gsm.c:2397 gsm_config drivers/tty/n_gsm.c:2653 [inline] gsmld_ioctl+0xaae/0x15b0 drivers/tty/n_gsm.c:2986 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x61/0xcb </TASK>
Allocated by task 3501: kasan_save_stack mm/kasan/common.c:38 [inline] kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:434 [inline] ____kasan_kmalloc+0xba/0xf0 mm/kasan/common.c:513 kasan_kmalloc include/linux/kasan.h:264 [inline] kmem_cache_alloc_trace+0x143/0x290 mm/slub.c:3247 kmalloc include/linux/slab.h:591 [inline] kzalloc include/linux/slab.h:721 [inline] gsm_dlci_alloc+0x53/0x3a0 drivers/tty/n_gsm.c:1932 gsm_activate_mux+0x1c/0x330 drivers/tty/n_gsm.c:2438 gsm_config drivers/tty/n_gsm.c:2677 [inline] gsmld_ioctl+0xd46/0x15b0 drivers/tty/n_gsm.c:2986 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x61/0xcb
Freed by task 3501: kasan_save_stack mm/kasan/common.c:38 [inline] kasan_set_track+0x4b/0x80 mm/kasan/common.c:46 kasan_set_free_info+0x1f/0x40 mm/kasan/generic.c:360 ____kasan_slab_free+0xd8/0x120 mm/kasan/common.c:366 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:1705 [inline] slab_free_freelist_hook+0xdd/0x160 mm/slub.c:1731 slab_free mm/slub.c:3499 [inline] kfree+0xf1/0x270 mm/slub.c:4559 dlci_put drivers/tty/n_gsm.c:1988 [inline] gsm_dlci_release drivers/tty/n_gsm.c:2021 [inline] gsm_cleanup_mux+0x574/0x850 drivers/tty/n_gsm.c:2415 gsm_config drivers/tty/n_gsm.c:2653 [inline] gsmld_ioctl+0xaae/0x15b0 drivers/tty/n_gsm.c:2986 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x61/0xcb
Fixes: aa371e96f05d ("tty: n_gsm: fix restart handling via CLD command") Signed-off-by: Chaoyuan Peng hedonistsmith@gmail.com Cc: stable stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/n_gsm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -2411,8 +2411,10 @@ static void gsm_cleanup_mux(struct gsm_m gsm->has_devices = false; } for (i = NUM_DLCI - 1; i >= 0; i--) - if (gsm->dlci[i]) + if (gsm->dlci[i]) { gsm_dlci_release(gsm->dlci[i]); + gsm->dlci[i] = NULL; + } mutex_unlock(&gsm->mutex); /* Now wipe the queues */ tty_ldisc_flush(gsm->tty);
From: Oliver Neukum oneukum@suse.com
commit 5bef4b3cb95a5b883dfec8b3ffc0d671323d55bb upstream.
This reverts commit 5255660b208aebfdb71d574f3952cf48392f4306.
This quirk breaks at least the following hardware:
0b:00.0 0c03: 1106:3483 (rev 01) (prog-if 30 [XHCI]) Subsystem: 1106:3483 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 66 Region 0: Memory at fb400000 (64-bit, non-prefetchable) [size=4K] Capabilities: [80] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+ Address: 00000000fee007b8 Data: 0000 Capabilities: [c4] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 89W DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend- LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp- LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s, Width x1 TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR- 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Kernel driver in use: xhci_hcd Kernel modules: xhci_pci
with the quirk enabled it fails early with
[ 0.754373] pci 0000:0b:00.0: xHCI HW did not halt within 32000 usec status = 0x1000 [ 0.754419] pci 0000:0b:00.0: quirk_usb_early_handoff+0x0/0x7a0 took 31459 usecs [ 2.228048] xhci_hcd 0000:0b:00.0: xHCI Host Controller [ 2.228053] xhci_hcd 0000:0b:00.0: new USB bus registered, assigned bus number 7 [ 2.260073] xhci_hcd 0000:0b:00.0: Host halt failed, -110 [ 2.260079] xhci_hcd 0000:0b:00.0: can't setup: -110 [ 2.260551] xhci_hcd 0000:0b:00.0: USB bus 7 deregistered [ 2.260624] xhci_hcd 0000:0b:00.0: init 0000:0b:00.0 fail, -110 [ 2.260639] xhci_hcd: probe of 0000:0b:00.0 failed with error -110
The hardware in question is an external PCIe card. It looks to me like the quirk needs to be narrowed down. But this needs information about the hardware showing the issue this quirk is to fix. So for now a clean revert.
Signed-off-by: Oliver Neukum oneukum@suse.com Fixes: 5255660b208a ("xhci: add quirk for host controllers that don't update endpoint DCS") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/20230713112830.21773-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 4 +--- drivers/usb/host/xhci-ring.c | 25 +------------------------ 2 files changed, 2 insertions(+), 27 deletions(-)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -294,10 +294,8 @@ static void xhci_pci_quirks(struct devic pdev->device == 0x3432) xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483) { + if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483) xhci->quirks |= XHCI_LPM_SUPPORT; - xhci->quirks |= XHCI_EP_CTX_BROKEN_DCS; - }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA && pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) { --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -592,11 +592,8 @@ static int xhci_move_dequeue_past_td(str struct xhci_ring *ep_ring; struct xhci_command *cmd; struct xhci_segment *new_seg; - struct xhci_segment *halted_seg = NULL; union xhci_trb *new_deq; int new_cycle; - union xhci_trb *halted_trb; - int index = 0; dma_addr_t addr; u64 hw_dequeue; bool cycle_found = false; @@ -634,27 +631,7 @@ static int xhci_move_dequeue_past_td(str hw_dequeue = xhci_get_hw_deq(xhci, dev, ep_index, stream_id); new_seg = ep_ring->deq_seg; new_deq = ep_ring->dequeue; - - /* - * Quirk: xHC write-back of the DCS field in the hardware dequeue - * pointer is wrong - use the cycle state of the TRB pointed to by - * the dequeue pointer. - */ - if (xhci->quirks & XHCI_EP_CTX_BROKEN_DCS && - !(ep->ep_state & EP_HAS_STREAMS)) - halted_seg = trb_in_td(xhci, td->start_seg, - td->first_trb, td->last_trb, - hw_dequeue & ~0xf, false); - if (halted_seg) { - index = ((dma_addr_t)(hw_dequeue & ~0xf) - halted_seg->dma) / - sizeof(*halted_trb); - halted_trb = &halted_seg->trbs[index]; - new_cycle = halted_trb->generic.field[3] & 0x1; - xhci_dbg(xhci, "Endpoint DCS = %d TRB index = %d cycle = %d\n", - (u8)(hw_dequeue & 0x1), index, new_cycle); - } else { - new_cycle = hw_dequeue & 0x1; - } + new_cycle = hw_dequeue & 0x1;
/* * We want to find the pointer, segment and cycle state of the new trb
From: Luka Guzenko l.guzenko@web.de
commit d510acb610e6aa07a04b688236868b2a5fd60deb upstream.
This HP Notebook used ALC236 codec with COEF 0x07 idx 1 controlling the mute LED. Enable already existing quirk for this device.
Signed-off-by: Luka Guzenko l.guzenko@web.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230725111509.623773-1-l.guzenko@web.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9079,6 +9079,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x103c, 0x880d, "HP EliteBook 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8811, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1), SND_PCI_QUIRK(0x103c, 0x8812, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1), + SND_PCI_QUIRK(0x103c, 0x881d, "HP 250 G8 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), SND_PCI_QUIRK(0x103c, 0x8846, "HP EliteBook 850 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x8847, "HP EliteBook x360 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED), SND_PCI_QUIRK(0x103c, 0x884b, "HP EliteBook 840 Aero G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
From: Baskaran Kannan Baski.Kannan@amd.com
commit e146503ac68418859fb063a3a0cd9ec93bc52238 upstream.
Industrial processor i3255 supports temperatures -40 deg celcius to 105 deg Celcius. The current implementation of k10temp_read_temp rounds off any negative temperatures to '0'. To fix this, the following changes have been made.
A flag 'disp_negative' is added to struct k10temp_data to support AMD i3255 processors. Flag 'disp_negative' is set if 3255 processor is found during k10temp_probe. Flag 'disp_negative' is used to determine whether to round off negative temperatures to '0' in k10temp_read_temp.
Signed-off-by: Baskaran Kannan Baski.Kannan@amd.com Link: https://lore.kernel.org/r/20230727162159.1056136-1-Baski.Kannan@amd.com Fixes: aef17ca12719 ("hwmon: (k10temp) Only apply temperature offset if result is positive") Cc: stable@vger.kernel.org [groeck: Fixed multi-line comment] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/k10temp.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -97,6 +97,13 @@ static DEFINE_MUTEX(nb_smu_ind_mutex); #define F19H_M01H_CFACTOR_ICORE 1000000 /* 1A / LSB */ #define F19H_M01H_CFACTOR_ISOC 310000 /* 0.31A / LSB */
+/* + * AMD's Industrial processor 3255 supports temperature from -40 deg to 105 deg Celsius. + * Use the model name to identify 3255 CPUs and set a flag to display negative temperature. + * Do not round off to zero for negative Tctl or Tdie values if the flag is set + */ +#define AMD_I3255_STR "3255" + struct k10temp_data { struct pci_dev *pdev; void (*read_htcreg)(struct pci_dev *pdev, u32 *regval); @@ -106,6 +113,7 @@ struct k10temp_data { u32 show_temp; bool is_zen; u32 ccd_offset; + bool disp_negative; };
#define TCTL_BIT 0 @@ -220,12 +228,12 @@ static int k10temp_read_temp(struct devi switch (channel) { case 0: /* Tctl */ *val = get_raw_temp(data); - if (*val < 0) + if (*val < 0 && !data->disp_negative) *val = 0; break; case 1: /* Tdie */ *val = get_raw_temp(data) - data->temp_offset; - if (*val < 0) + if (*val < 0 && !data->disp_negative) *val = 0; break; case 2 ... 9: /* Tccd{1-8} */ @@ -417,6 +425,11 @@ static int k10temp_probe(struct pci_dev data->pdev = pdev; data->show_temp |= BIT(TCTL_BIT); /* Always show Tctl */
+ if (boot_cpu_data.x86 == 0x17 && + strstr(boot_cpu_data.x86_model_id, AMD_I3255_STR)) { + data->disp_negative = true; + } + if (boot_cpu_data.x86 == 0x15 && ((boot_cpu_data.x86_model & 0xf0) == 0x60 || (boot_cpu_data.x86_model & 0xf0) == 0x70)) {
From: Gilles Buloz Gilles.Buloz@kontron.com
commit 54685abe660a59402344d5045ce08c43c6a5ac42 upstream.
Because of hex value 0x46 used instead of decimal 46, the temp6 (PECI1) temperature is always declared visible and then displayed even if disabled in the chip
Signed-off-by: Gilles Buloz gilles.buloz@kontron.com Link: https://lore.kernel.org/r/DU0PR10MB62526435ADBC6A85243B90E08002A@DU0PR10MB62... Fixes: fcdc5739dce03 ("hwmon: (nct7802) add temperature sensor type attribute") Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/nct7802.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/hwmon/nct7802.c +++ b/drivers/hwmon/nct7802.c @@ -708,7 +708,7 @@ static umode_t nct7802_temp_is_visible(s if (index >= 38 && index < 46 && !(reg & 0x01)) /* PECI 0 */ return 0;
- if (index >= 0x46 && (!(reg & 0x02))) /* PECI 1 */ + if (index >= 46 && !(reg & 0x02)) /* PECI 1 */ return 0;
return attr->mode;
From: Filipe Manana fdmanana@suse.com
commit bf7ecbe9875061bf3fce1883e3b26b77f847d1e8 upstream.
At btrfs_wait_for_commit() we wait for a transaction to finish and then always return 0 (success) without checking if it was aborted, in which case the transaction didn't happen due to some critical error. Fix this by checking if the transaction was aborted.
Fixes: 462045928bda ("Btrfs: add START_SYNC, WAIT_SYNC ioctls") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/transaction.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -936,6 +936,7 @@ int btrfs_wait_for_commit(struct btrfs_f }
wait_for_commit(cur_trans, TRANS_STATE_COMPLETED); + ret = cur_trans->aborted; btrfs_put_transaction(cur_trans); out: return ret;
From: Filipe Manana fdmanana@suse.com
commit b28ff3a7d7e97456fd86b68d24caa32e1cfa7064 upstream.
btrfs_attach_transaction_barrier() is used to get a handle pointing to the current running transaction if the transaction has not started its commit yet (its state is < TRANS_STATE_COMMIT_START). If the transaction commit has started, then we wait for the transaction to commit and finish before returning - however we completely ignore if the transaction was aborted due to some error during its commit, we simply return ERR_PT(-ENOENT), which makes the caller assume everything is fine and no errors happened.
This could make an fsync return success (0) to user space when in fact we had a transaction abort and the target inode changes were therefore not persisted.
Fix this by checking for the return value from btrfs_wait_for_commit(), and if it returned an error, return it back to the caller.
Fixes: d4edf39bd5db ("Btrfs: fix uncompleted transaction") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/transaction.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -840,8 +840,13 @@ btrfs_attach_transaction_barrier(struct
trans = start_transaction(root, 0, TRANS_ATTACH, BTRFS_RESERVE_NO_FLUSH, true); - if (trans == ERR_PTR(-ENOENT)) - btrfs_wait_for_commit(root->fs_info, 0); + if (trans == ERR_PTR(-ENOENT)) { + int ret; + + ret = btrfs_wait_for_commit(root->fs_info, 0); + if (ret) + return ERR_PTR(ret); + }
return trans; }
From: Christian Brauner brauner@kernel.org
commit 20ea1e7d13c1b544fe67c4a8dc3943bb1ab33e6f upstream.
The pidfd_getfd() system call allows a caller with ptrace_may_access() abilities on another process to steal a file descriptor from this process. This system call is used by debuggers, container runtimes, system call supervisors, networking proxies etc. So while it is a special interest system call it is used in common tools.
That ability ends up breaking our long-time optimization in fdget_pos(), which "knew" that if we had exclusive access to the file descriptor nobody else could access it, and we didn't need the lock for the file position.
That check for file_count(file) was always fairly subtle - it depended on __fdget() not incrementing the file count for single-threaded processes and thus included that as part of the rule - but it did mean that we didn't need to take the lock in all those traditional unix process contexts.
So it's sad to see this go, and I'd love to have some way to re-instate the optimization. At the same time, the lock obviously isn't ever contended in the case we optimized, so all we were optimizing away is the atomics and the cacheline dirtying. Let's see if anybody even notices that the optimization is gone.
Link: https://lore.kernel.org/linux-fsdevel/20230724-vfs-fdget_pos-v1-1-a4abfd7103... Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall") Cc: stable@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/file.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/fs/file.c +++ b/fs/file.c @@ -1068,10 +1068,8 @@ unsigned long __fdget_pos(unsigned int f struct file *file = (struct file *)(v & ~3);
if (file && (file->f_mode & FMODE_ATOMIC_POS)) { - if (file_count(file) > 1) { - v |= FDPUT_POS_UNLOCK; - mutex_lock(&file->f_pos_lock); - } + v |= FDPUT_POS_UNLOCK; + mutex_lock(&file->f_pos_lock); } return v; }
From: Trond Myklebust trond.myklebust@hammerspace.com
commit f75546f58a70da5cfdcec5a45ffc377885ccbee8 upstream.
If the client is calling TEST_STATEID, then it is because some event occurred that requires it to check all the stateids for validity and call FREE_STATEID on the ones that have been revoked. In this case, either the stateid exists in the list of stateids associated with that nfs4_client, in which case it should be tested, or it does not. There are no additional conditions to be considered.
Reported-by: "Frank Ch. Eigler" fche@redhat.com Fixes: 7df302f75ee2 ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4state.c | 2 -- 1 file changed, 2 deletions(-)
--- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -5833,8 +5833,6 @@ static __be32 nfsd4_validate_stateid(str if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) || CLOSE_STATEID(stateid)) return status; - if (!same_clid(&stateid->si_opaque.so_clid, &cl->cl_clientid)) - return status; spin_lock(&cl->cl_lock); s = find_stateid_locked(cl, stateid); if (!s)
From: Alexander Steffen Alexander.Steffen@infineon.com
commit 513253f8c293c0c8bd46d09d337fc892bf8f9f48 upstream.
recv_data either returns the number of received bytes, or a negative value representing an error code. Adding the return value directly to the total number of received bytes therefore looks a little weird, since it might add a negative error code to a sum of bytes.
The following check for size < expected usually makes the function return ETIME in that case, so it does not cause too many problems in practice. But to make the code look cleaner and because the caller might still be interested in the original error code, explicitly check for the presence of an error code and pass that through.
Cc: stable@vger.kernel.org Fixes: cb5354253af2 ("[PATCH] tpm: spacing cleanups 2") Signed-off-by: Alexander Steffen Alexander.Steffen@infineon.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis_core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/char/tpm/tpm_tis_core.c +++ b/drivers/char/tpm/tpm_tis_core.c @@ -314,6 +314,7 @@ static int tpm_tis_recv(struct tpm_chip int size = 0; int status; u32 expected; + int rc;
if (count < TPM_HEADER_SIZE) { size = -EIO; @@ -333,8 +334,13 @@ static int tpm_tis_recv(struct tpm_chip goto out; }
- size += recv_data(chip, &buf[TPM_HEADER_SIZE], - expected - TPM_HEADER_SIZE); + rc = recv_data(chip, &buf[TPM_HEADER_SIZE], + expected - TPM_HEADER_SIZE); + if (rc < 0) { + size = rc; + goto out; + } + size += rc; if (size < expected) { dev_err(&chip->dev, "Unable to read remainder of result\n"); size = -ETIME;
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 55ad24857341c36616ecc1d9580af5626c226cf1 ]
The irq to block mapping is fixed, and interrupts from the first block will always be routed to the first parent IRQ. But the parent interrupts themselves can be routed to any available CPU.
This is used by the bootloader to map the first parent interrupt to the boot CPU, regardless wether the boot CPU is the first one or the second one.
When booting from the second CPU, the assumption that the first block's IRQ is mapped to the first CPU breaks, and the system hangs because interrupts do not get routed correctly.
Fix this by passing the appropriate bcm6434_l1_cpu to the interrupt handler instead of the chip itself, so the handler always has the right block.
Fixes: c7c42ec2baa1 ("irqchips/bmips: Add bcm6345-l1 interrupt controller") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230629072620.62527-1-jonas.gorski@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-bcm6345-l1.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c index ebc3a253f735d..7c5d8b791592e 100644 --- a/drivers/irqchip/irq-bcm6345-l1.c +++ b/drivers/irqchip/irq-bcm6345-l1.c @@ -82,6 +82,7 @@ struct bcm6345_l1_chip { };
struct bcm6345_l1_cpu { + struct bcm6345_l1_chip *intc; void __iomem *map_base; unsigned int parent_irq; u32 enable_cache[]; @@ -115,17 +116,11 @@ static inline unsigned int cpu_for_irq(struct bcm6345_l1_chip *intc,
static void bcm6345_l1_irq_handle(struct irq_desc *desc) { - struct bcm6345_l1_chip *intc = irq_desc_get_handler_data(desc); - struct bcm6345_l1_cpu *cpu; + struct bcm6345_l1_cpu *cpu = irq_desc_get_handler_data(desc); + struct bcm6345_l1_chip *intc = cpu->intc; struct irq_chip *chip = irq_desc_get_chip(desc); unsigned int idx;
-#ifdef CONFIG_SMP - cpu = intc->cpus[cpu_logical_map(smp_processor_id())]; -#else - cpu = intc->cpus[0]; -#endif - chained_irq_enter(chip, desc);
for (idx = 0; idx < intc->n_words; idx++) { @@ -257,6 +252,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn, if (!cpu) return -ENOMEM;
+ cpu->intc = intc; cpu->map_base = ioremap(res.start, sz); if (!cpu->map_base) return -ENOMEM; @@ -272,7 +268,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn, return -EINVAL; } irq_set_chained_handler_and_data(cpu->parent_irq, - bcm6345_l1_irq_handle, intc); + bcm6345_l1_irq_handle, cpu);
return 0; }
From: Marc Zyngier maz@kernel.org
[ Upstream commit 926846a703cbf5d0635cc06e67d34b228746554b ]
We normally rely on the irq_to_cpuid_[un]lock() primitives to make sure nothing will change col->idx while performing a LPI invalidation.
However, these primitives do not cover VPE doorbells, and we have some open-coded locking for that. Unfortunately, this locking is pretty bogus.
Instead, extend the above primitives to cover VPE doorbells and convert the whole thing to it.
Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") Reported-by: Kunkun Jiang jiangkunkun@huawei.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: Zenghui Yu yuzenghui@huawei.com Cc: wanghaibin.wang@huawei.com Tested-by: Kunkun Jiang jiangkunkun@huawei.com Reviewed-by: Zenghui Yu yuzenghui@huawei.com Link: https://lore.kernel.org/r/20230617073242.3199746-1-maz@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic-v3-its.c | 75 ++++++++++++++++++++------------ 1 file changed, 46 insertions(+), 29 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 59a5d06b2d3e4..490e6cfe510e6 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -267,13 +267,23 @@ static void vpe_to_cpuid_unlock(struct its_vpe *vpe, unsigned long flags) raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags); }
+static struct irq_chip its_vpe_irq_chip; + static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags) { - struct its_vlpi_map *map = get_vlpi_map(d); + struct its_vpe *vpe = NULL; int cpu;
- if (map) { - cpu = vpe_to_cpuid_lock(map->vpe, flags); + if (d->chip == &its_vpe_irq_chip) { + vpe = irq_data_get_irq_chip_data(d); + } else { + struct its_vlpi_map *map = get_vlpi_map(d); + if (map) + vpe = map->vpe; + } + + if (vpe) { + cpu = vpe_to_cpuid_lock(vpe, flags); } else { /* Physical LPIs are already locked via the irq_desc lock */ struct its_device *its_dev = irq_data_get_irq_chip_data(d); @@ -287,10 +297,18 @@ static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags)
static void irq_to_cpuid_unlock(struct irq_data *d, unsigned long flags) { - struct its_vlpi_map *map = get_vlpi_map(d); + struct its_vpe *vpe = NULL; + + if (d->chip == &its_vpe_irq_chip) { + vpe = irq_data_get_irq_chip_data(d); + } else { + struct its_vlpi_map *map = get_vlpi_map(d); + if (map) + vpe = map->vpe; + }
- if (map) - vpe_to_cpuid_unlock(map->vpe, flags); + if (vpe) + vpe_to_cpuid_unlock(vpe, flags); }
static struct its_collection *valid_col(struct its_collection *col) @@ -1427,14 +1445,29 @@ static void wait_for_syncr(void __iomem *rdbase) cpu_relax(); }
-static void direct_lpi_inv(struct irq_data *d) +static void __direct_lpi_inv(struct irq_data *d, u64 val) { - struct its_vlpi_map *map = get_vlpi_map(d); void __iomem *rdbase; unsigned long flags; - u64 val; int cpu;
+ /* Target the redistributor this LPI is currently routed to */ + cpu = irq_to_cpuid_lock(d, &flags); + raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock); + + rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base; + gic_write_lpir(val, rdbase + GICR_INVLPIR); + wait_for_syncr(rdbase); + + raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock); + irq_to_cpuid_unlock(d, flags); +} + +static void direct_lpi_inv(struct irq_data *d) +{ + struct its_vlpi_map *map = get_vlpi_map(d); + u64 val; + if (map) { struct its_device *its_dev = irq_data_get_irq_chip_data(d);
@@ -1447,15 +1480,7 @@ static void direct_lpi_inv(struct irq_data *d) val = d->hwirq; }
- /* Target the redistributor this LPI is currently routed to */ - cpu = irq_to_cpuid_lock(d, &flags); - raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock); - rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base; - gic_write_lpir(val, rdbase + GICR_INVLPIR); - - wait_for_syncr(rdbase); - raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock); - irq_to_cpuid_unlock(d, flags); + __direct_lpi_inv(d, val); }
static void lpi_update_config(struct irq_data *d, u8 clr, u8 set) @@ -3936,18 +3961,10 @@ static void its_vpe_send_inv(struct irq_data *d) { struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- if (gic_rdists->has_direct_lpi) { - void __iomem *rdbase; - - /* Target the redistributor this VPE is currently known on */ - raw_spin_lock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock); - rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base; - gic_write_lpir(d->parent_data->hwirq, rdbase + GICR_INVLPIR); - wait_for_syncr(rdbase); - raw_spin_unlock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock); - } else { + if (gic_rdists->has_direct_lpi) + __direct_lpi_inv(d, d->parent_data->hwirq); + else its_vpe_send_cmd(vpe, its_send_inv); - } }
static void its_vpe_mask_irq(struct irq_data *d)
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit f7853c34241807bb97673a5e97719123be39a09e ]
Henry reported that rt_mutex_adjust_prio_check() has an ordering problem and puts the lie to the comment in [7]. Sharing the sort key between lock->waiters and owner->pi_waiters *does* create problems, since unlike what the comment claims, holding [L] is insufficient.
Notably, consider:
A / \ M1 M2 | | B C
That is, task A owns both M1 and M2, B and C block on them. In this case a concurrent chain walk (B & C) will modify their resp. sort keys in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L] is meaningless, they're different Ls.
This then gives rise to a race condition between [7] and [11], where the requeue of pi_waiters will observe an inconsistent tree order.
B C
(holds M1->wait_lock, (holds M2->wait_lock, holds B->pi_lock) holds A->pi_lock)
[7] waiter_update_prio(); ... [8] raw_spin_unlock(B->pi_lock); ... [10] raw_spin_lock(A->pi_lock);
[11] rt_mutex_enqueue_pi(); // observes inconsistent A->pi_waiters // tree order
Fixing this means either extending the range of the owner lock from [10-13] to [6-13], with the immediate problem that this means [6-8] hold both blocked and owner locks, or duplicating the sort key.
Since the locking in chain walk is horrible enough without having to consider pi_lock nesting rules, duplicate the sort key instead.
By giving each tree their own sort key, the above race becomes harmless, if C sees B at the old location, then B will correct things (if they need correcting) when it walks up the chain and reaches A.
Fixes: fb00aca47440 ("rtmutex: Turn the plist into an rb-tree") Reported-by: Henry Wu triangletrap12@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Thomas Gleixner tglx@linutronix.de Tested-by: Henry Wu triangletrap12@gmail.com Link: https://lkml.kernel.org/r/20230707161052.GF2883469%40hirez.programming.kicks... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/rtmutex.c | 170 +++++++++++++++++++++----------- kernel/locking/rtmutex_api.c | 2 +- kernel/locking/rtmutex_common.h | 47 ++++++--- kernel/locking/ww_mutex.h | 12 +-- 4 files changed, 155 insertions(+), 76 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index b7fa3ee3aa1de..ee5be1dda0c40 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -331,21 +331,43 @@ static __always_inline int __waiter_prio(struct task_struct *task) return prio; }
+/* + * Update the waiter->tree copy of the sort keys. + */ static __always_inline void waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task) { - waiter->prio = __waiter_prio(task); - waiter->deadline = task->dl.deadline; + lockdep_assert_held(&waiter->lock->wait_lock); + lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry)); + + waiter->tree.prio = __waiter_prio(task); + waiter->tree.deadline = task->dl.deadline; +} + +/* + * Update the waiter->pi_tree copy of the sort keys (from the tree copy). + */ +static __always_inline void +waiter_clone_prio(struct rt_mutex_waiter *waiter, struct task_struct *task) +{ + lockdep_assert_held(&waiter->lock->wait_lock); + lockdep_assert_held(&task->pi_lock); + lockdep_assert(RB_EMPTY_NODE(&waiter->pi_tree.entry)); + + waiter->pi_tree.prio = waiter->tree.prio; + waiter->pi_tree.deadline = waiter->tree.deadline; }
/* - * Only use with rt_mutex_waiter_{less,equal}() + * Only use with rt_waiter_node_{less,equal}() */ +#define task_to_waiter_node(p) \ + &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline } #define task_to_waiter(p) \ - &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline } + &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
-static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left, - struct rt_mutex_waiter *right) +static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left, + struct rt_waiter_node *right) { if (left->prio < right->prio) return 1; @@ -362,8 +384,8 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left, return 0; }
-static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left, - struct rt_mutex_waiter *right) +static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left, + struct rt_waiter_node *right) { if (left->prio != right->prio) return 0; @@ -383,7 +405,7 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left, static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter, struct rt_mutex_waiter *top_waiter) { - if (rt_mutex_waiter_less(waiter, top_waiter)) + if (rt_waiter_node_less(&waiter->tree, &top_waiter->tree)) return true;
#ifdef RT_MUTEX_BUILD_SPINLOCKS @@ -391,30 +413,30 @@ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter, * Note that RT tasks are excluded from same priority (lateral) * steals to prevent the introduction of an unbounded latency. */ - if (rt_prio(waiter->prio) || dl_prio(waiter->prio)) + if (rt_prio(waiter->tree.prio) || dl_prio(waiter->tree.prio)) return false;
- return rt_mutex_waiter_equal(waiter, top_waiter); + return rt_waiter_node_equal(&waiter->tree, &top_waiter->tree); #else return false; #endif }
#define __node_2_waiter(node) \ - rb_entry((node), struct rt_mutex_waiter, tree_entry) + rb_entry((node), struct rt_mutex_waiter, tree.entry)
static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_node *b) { struct rt_mutex_waiter *aw = __node_2_waiter(a); struct rt_mutex_waiter *bw = __node_2_waiter(b);
- if (rt_mutex_waiter_less(aw, bw)) + if (rt_waiter_node_less(&aw->tree, &bw->tree)) return 1;
if (!build_ww_mutex()) return 0;
- if (rt_mutex_waiter_less(bw, aw)) + if (rt_waiter_node_less(&bw->tree, &aw->tree)) return 0;
/* NOTE: relies on waiter->ww_ctx being set before insertion */ @@ -432,48 +454,58 @@ static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_nod static __always_inline void rt_mutex_enqueue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter) { - rb_add_cached(&waiter->tree_entry, &lock->waiters, __waiter_less); + lockdep_assert_held(&lock->wait_lock); + + rb_add_cached(&waiter->tree.entry, &lock->waiters, __waiter_less); }
static __always_inline void rt_mutex_dequeue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter) { - if (RB_EMPTY_NODE(&waiter->tree_entry)) + lockdep_assert_held(&lock->wait_lock); + + if (RB_EMPTY_NODE(&waiter->tree.entry)) return;
- rb_erase_cached(&waiter->tree_entry, &lock->waiters); - RB_CLEAR_NODE(&waiter->tree_entry); + rb_erase_cached(&waiter->tree.entry, &lock->waiters); + RB_CLEAR_NODE(&waiter->tree.entry); }
-#define __node_2_pi_waiter(node) \ - rb_entry((node), struct rt_mutex_waiter, pi_tree_entry) +#define __node_2_rt_node(node) \ + rb_entry((node), struct rt_waiter_node, entry)
-static __always_inline bool -__pi_waiter_less(struct rb_node *a, const struct rb_node *b) +static __always_inline bool __pi_waiter_less(struct rb_node *a, const struct rb_node *b) { - return rt_mutex_waiter_less(__node_2_pi_waiter(a), __node_2_pi_waiter(b)); + return rt_waiter_node_less(__node_2_rt_node(a), __node_2_rt_node(b)); }
static __always_inline void rt_mutex_enqueue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter) { - rb_add_cached(&waiter->pi_tree_entry, &task->pi_waiters, __pi_waiter_less); + lockdep_assert_held(&task->pi_lock); + + rb_add_cached(&waiter->pi_tree.entry, &task->pi_waiters, __pi_waiter_less); }
static __always_inline void rt_mutex_dequeue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter) { - if (RB_EMPTY_NODE(&waiter->pi_tree_entry)) + lockdep_assert_held(&task->pi_lock); + + if (RB_EMPTY_NODE(&waiter->pi_tree.entry)) return;
- rb_erase_cached(&waiter->pi_tree_entry, &task->pi_waiters); - RB_CLEAR_NODE(&waiter->pi_tree_entry); + rb_erase_cached(&waiter->pi_tree.entry, &task->pi_waiters); + RB_CLEAR_NODE(&waiter->pi_tree.entry); }
-static __always_inline void rt_mutex_adjust_prio(struct task_struct *p) +static __always_inline void rt_mutex_adjust_prio(struct rt_mutex_base *lock, + struct task_struct *p) { struct task_struct *pi_task = NULL;
+ lockdep_assert_held(&lock->wait_lock); + lockdep_assert(rt_mutex_owner(lock) == p); lockdep_assert_held(&p->pi_lock);
if (task_has_pi_waiters(p)) @@ -562,9 +594,14 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st * Chain walk basics and protection scope * * [R] refcount on task - * [P] task->pi_lock held + * [Pn] task->pi_lock held * [L] rtmutex->wait_lock held * + * Normal locking order: + * + * rtmutex->wait_lock + * task->pi_lock + * * Step Description Protected by * function arguments: * @task [R] @@ -579,27 +616,32 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st * again: * loop_sanity_check(); * retry: - * [1] lock(task->pi_lock); [R] acquire [P] - * [2] waiter = task->pi_blocked_on; [P] - * [3] check_exit_conditions_1(); [P] - * [4] lock = waiter->lock; [P] - * [5] if (!try_lock(lock->wait_lock)) { [P] try to acquire [L] - * unlock(task->pi_lock); release [P] + * [1] lock(task->pi_lock); [R] acquire [P1] + * [2] waiter = task->pi_blocked_on; [P1] + * [3] check_exit_conditions_1(); [P1] + * [4] lock = waiter->lock; [P1] + * [5] if (!try_lock(lock->wait_lock)) { [P1] try to acquire [L] + * unlock(task->pi_lock); release [P1] * goto retry; * } - * [6] check_exit_conditions_2(); [P] + [L] - * [7] requeue_lock_waiter(lock, waiter); [P] + [L] - * [8] unlock(task->pi_lock); release [P] + * [6] check_exit_conditions_2(); [P1] + [L] + * [7] requeue_lock_waiter(lock, waiter); [P1] + [L] + * [8] unlock(task->pi_lock); release [P1] * put_task_struct(task); release [R] * [9] check_exit_conditions_3(); [L] * [10] task = owner(lock); [L] * get_task_struct(task); [L] acquire [R] - * lock(task->pi_lock); [L] acquire [P] - * [11] requeue_pi_waiter(tsk, waiters(lock));[P] + [L] - * [12] check_exit_conditions_4(); [P] + [L] - * [13] unlock(task->pi_lock); release [P] + * lock(task->pi_lock); [L] acquire [P2] + * [11] requeue_pi_waiter(tsk, waiters(lock));[P2] + [L] + * [12] check_exit_conditions_4(); [P2] + [L] + * [13] unlock(task->pi_lock); release [P2] * unlock(lock->wait_lock); release [L] * goto again; + * + * Where P1 is the blocking task and P2 is the lock owner; going up one step + * the owner becomes the next blocked task etc.. + * +* */ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, enum rtmutex_chainwalk chwalk, @@ -747,7 +789,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * enabled we continue, but stop the requeueing in the chain * walk. */ - if (rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { + if (rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) { if (!detect_deadlock) goto out_unlock_pi; else @@ -755,13 +797,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, }
/* - * [4] Get the next lock + * [4] Get the next lock; per holding task->pi_lock we can't unblock + * and guarantee @lock's existence. */ lock = waiter->lock; /* * [5] We need to trylock here as we are holding task->pi_lock, * which is the reverse lock order versus the other rtmutex * operations. + * + * Per the above, holding task->pi_lock guarantees lock exists, so + * inverting this lock order is infeasible from a life-time + * perspective. */ if (!raw_spin_trylock(&lock->wait_lock)) { raw_spin_unlock_irq(&task->pi_lock); @@ -865,17 +912,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * or * * DL CBS enforcement advancing the effective deadline. - * - * Even though pi_waiters also uses these fields, and that tree is only - * updated in [11], we can do this here, since we hold [L], which - * serializes all pi_waiters access and rb_erase() does not care about - * the values of the node being removed. */ waiter_update_prio(waiter, task);
rt_mutex_enqueue(lock, waiter);
- /* [8] Release the task */ + /* + * [8] Release the (blocking) task in preparation for + * taking the owner task in [10]. + * + * Since we hold lock->waiter_lock, task cannot unblock, even if we + * release task->pi_lock. + */ raw_spin_unlock(&task->pi_lock); put_task_struct(task);
@@ -899,7 +947,12 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, return 0; }
- /* [10] Grab the next task, i.e. the owner of @lock */ + /* + * [10] Grab the next task, i.e. the owner of @lock + * + * Per holding lock->wait_lock and checking for !owner above, there + * must be an owner and it cannot go away. + */ task = get_task_struct(rt_mutex_owner(lock)); raw_spin_lock(&task->pi_lock);
@@ -912,8 +965,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * and adjust the priority of the owner. */ rt_mutex_dequeue_pi(task, prerequeue_top_waiter); + waiter_clone_prio(waiter, task); rt_mutex_enqueue_pi(task, waiter); - rt_mutex_adjust_prio(task); + rt_mutex_adjust_prio(lock, task);
} else if (prerequeue_top_waiter == waiter) { /* @@ -928,8 +982,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, */ rt_mutex_dequeue_pi(task, waiter); waiter = rt_mutex_top_waiter(lock); + waiter_clone_prio(waiter, task); rt_mutex_enqueue_pi(task, waiter); - rt_mutex_adjust_prio(task); + rt_mutex_adjust_prio(lock, task); } else { /* * Nothing changed. No need to do any priority @@ -1142,6 +1197,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock, waiter->task = task; waiter->lock = lock; waiter_update_prio(waiter, task); + waiter_clone_prio(waiter, task);
/* Get the top priority waiter on the lock */ if (rt_mutex_has_waiters(lock)) @@ -1175,7 +1231,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock, rt_mutex_dequeue_pi(owner, top_waiter); rt_mutex_enqueue_pi(owner, waiter);
- rt_mutex_adjust_prio(owner); + rt_mutex_adjust_prio(lock, owner); if (owner->pi_blocked_on) chain_walk = 1; } else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) { @@ -1222,6 +1278,8 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh, { struct rt_mutex_waiter *waiter;
+ lockdep_assert_held(&lock->wait_lock); + raw_spin_lock(¤t->pi_lock);
waiter = rt_mutex_top_waiter(lock); @@ -1234,7 +1292,7 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh, * task unblocks. */ rt_mutex_dequeue_pi(current, waiter); - rt_mutex_adjust_prio(current); + rt_mutex_adjust_prio(lock, current);
/* * As we are waking up the top waiter, and the waiter stays @@ -1471,7 +1529,7 @@ static void __sched remove_waiter(struct rt_mutex_base *lock, if (rt_mutex_has_waiters(lock)) rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
- rt_mutex_adjust_prio(owner); + rt_mutex_adjust_prio(lock, owner);
/* Store the lock on which owner is blocked or NULL */ next_lock = task_blocked_on_lock(owner); diff --git a/kernel/locking/rtmutex_api.c b/kernel/locking/rtmutex_api.c index a461be2f873db..56d1938cb52a1 100644 --- a/kernel/locking/rtmutex_api.c +++ b/kernel/locking/rtmutex_api.c @@ -437,7 +437,7 @@ void __sched rt_mutex_adjust_pi(struct task_struct *task) raw_spin_lock_irqsave(&task->pi_lock, flags);
waiter = task->pi_blocked_on; - if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { + if (!waiter || rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) { raw_spin_unlock_irqrestore(&task->pi_lock, flags); return; } diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h index c47e8361bfb5c..1162e07cdaea1 100644 --- a/kernel/locking/rtmutex_common.h +++ b/kernel/locking/rtmutex_common.h @@ -17,27 +17,44 @@ #include <linux/rtmutex.h> #include <linux/sched/wake_q.h>
+ +/* + * This is a helper for the struct rt_mutex_waiter below. A waiter goes in two + * separate trees and they need their own copy of the sort keys because of + * different locking requirements. + * + * @entry: rbtree node to enqueue into the waiters tree + * @prio: Priority of the waiter + * @deadline: Deadline of the waiter if applicable + * + * See rt_waiter_node_less() and waiter_*_prio(). + */ +struct rt_waiter_node { + struct rb_node entry; + int prio; + u64 deadline; +}; + /* * This is the control structure for tasks blocked on a rt_mutex, * which is allocated on the kernel stack on of the blocked task. * - * @tree_entry: pi node to enqueue into the mutex waiters tree - * @pi_tree_entry: pi node to enqueue into the mutex owner waiters tree + * @tree: node to enqueue into the mutex waiters tree + * @pi_tree: node to enqueue into the mutex owner waiters tree * @task: task reference to the blocked task * @lock: Pointer to the rt_mutex on which the waiter blocks * @wake_state: Wakeup state to use (TASK_NORMAL or TASK_RTLOCK_WAIT) - * @prio: Priority of the waiter - * @deadline: Deadline of the waiter if applicable * @ww_ctx: WW context pointer + * + * @tree is ordered by @lock->wait_lock + * @pi_tree is ordered by rt_mutex_owner(@lock)->pi_lock */ struct rt_mutex_waiter { - struct rb_node tree_entry; - struct rb_node pi_tree_entry; + struct rt_waiter_node tree; + struct rt_waiter_node pi_tree; struct task_struct *task; struct rt_mutex_base *lock; unsigned int wake_state; - int prio; - u64 deadline; struct ww_acquire_ctx *ww_ctx; };
@@ -105,7 +122,7 @@ static inline bool rt_mutex_waiter_is_top_waiter(struct rt_mutex_base *lock, { struct rb_node *leftmost = rb_first_cached(&lock->waiters);
- return rb_entry(leftmost, struct rt_mutex_waiter, tree_entry) == waiter; + return rb_entry(leftmost, struct rt_mutex_waiter, tree.entry) == waiter; }
static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base *lock) @@ -113,8 +130,10 @@ static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base * struct rb_node *leftmost = rb_first_cached(&lock->waiters); struct rt_mutex_waiter *w = NULL;
+ lockdep_assert_held(&lock->wait_lock); + if (leftmost) { - w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry); + w = rb_entry(leftmost, struct rt_mutex_waiter, tree.entry); BUG_ON(w->lock != lock); } return w; @@ -127,8 +146,10 @@ static inline int task_has_pi_waiters(struct task_struct *p)
static inline struct rt_mutex_waiter *task_top_pi_waiter(struct task_struct *p) { + lockdep_assert_held(&p->pi_lock); + return rb_entry(p->pi_waiters.rb_leftmost, struct rt_mutex_waiter, - pi_tree_entry); + pi_tree.entry); }
#define RT_MUTEX_HAS_WAITERS 1UL @@ -190,8 +211,8 @@ static inline void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter) static inline void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter) { debug_rt_mutex_init_waiter(waiter); - RB_CLEAR_NODE(&waiter->pi_tree_entry); - RB_CLEAR_NODE(&waiter->tree_entry); + RB_CLEAR_NODE(&waiter->pi_tree.entry); + RB_CLEAR_NODE(&waiter->tree.entry); waiter->wake_state = TASK_NORMAL; waiter->task = NULL; } diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h index 56f139201f246..3ad2cc4823e59 100644 --- a/kernel/locking/ww_mutex.h +++ b/kernel/locking/ww_mutex.h @@ -96,25 +96,25 @@ __ww_waiter_first(struct rt_mutex *lock) struct rb_node *n = rb_first(&lock->rtmutex.waiters.rb_root); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * __ww_waiter_next(struct rt_mutex *lock, struct rt_mutex_waiter *w) { - struct rb_node *n = rb_next(&w->tree_entry); + struct rb_node *n = rb_next(&w->tree.entry); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * __ww_waiter_prev(struct rt_mutex *lock, struct rt_mutex_waiter *w) { - struct rb_node *n = rb_prev(&w->tree_entry); + struct rb_node *n = rb_prev(&w->tree.entry); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * @@ -123,7 +123,7 @@ __ww_waiter_last(struct rt_mutex *lock) struct rb_node *n = rb_last(&lock->rtmutex.waiters.rb_root); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline void
From: Sean Christopherson seanjc@google.com
[ Upstream commit 26a0652cb453c72f6aab0974bc4939e9b14f886b ]
Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid, e.g. due to setting bits 63:32, illegal combinations, or to a value that isn't allowed in VMX (non-)root mode. The VMX checks in particular are "fun" as failure to disallow Real Mode for an L2 that is configured with unrestricted guest disabled, when KVM itself has unrestricted guest enabled, will result in KVM forcing VM86 mode to virtual Real Mode for L2, but then fail to unwind the related metadata when synthesizing a nested VM-Exit back to L1 (which has unrestricted guest enabled).
Opportunistically fix a benign typo in the prototype for is_valid_cr4().
Cc: stable@vger.kernel.org Reported-by: syzbot+5feef0b9ee9c8e9e5689@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000f316b705fdf6e2b4@google.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230613203037.1968489-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/svm/svm.c | 6 ++++++ arch/x86/kvm/vmx/vmx.c | 28 ++++++++++++++++++------ arch/x86/kvm/x86.c | 34 +++++++++++++++++++----------- 5 files changed, 52 insertions(+), 20 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 23ea8a25cbbeb..4bdcb91478a51 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -34,6 +34,7 @@ KVM_X86_OP(get_segment) KVM_X86_OP(get_cpl) KVM_X86_OP(set_segment) KVM_X86_OP_NULL(get_cs_db_l_bits) +KVM_X86_OP(is_valid_cr0) KVM_X86_OP(set_cr0) KVM_X86_OP(is_valid_cr4) KVM_X86_OP(set_cr4) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 9e800d4d323c6..08cfc26ee7c67 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1333,8 +1333,9 @@ struct kvm_x86_ops { void (*set_segment)(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l); + bool (*is_valid_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); - bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr0); + bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); int (*set_efer)(struct kvm_vcpu *vcpu, u64 efer); void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 0611dac70c25c..302a4669c5a15 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1734,6 +1734,11 @@ static void svm_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) vmcb_mark_dirty(svm->vmcb, VMCB_DT); }
+static bool svm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ + return true; +} + void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); @@ -4596,6 +4601,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .set_segment = svm_set_segment, .get_cpl = svm_get_cpl, .get_cs_db_l_bits = kvm_get_cs_db_l_bits, + .is_valid_cr0 = svm_is_valid_cr0, .set_cr0 = svm_set_cr0, .is_valid_cr4 = svm_is_valid_cr4, .set_cr4 = svm_set_cr4, diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0841f9a34d1c2..89744ee06101a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2894,6 +2894,15 @@ static void enter_rmode(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
+ /* + * KVM should never use VM86 to virtualize Real Mode when L2 is active, + * as using VM86 is unnecessary if unrestricted guest is enabled, and + * if unrestricted guest is disabled, VM-Enter (from L1) with CR0.PG=0 + * should VM-Fail and KVM should reject userspace attempts to stuff + * CR0.PG=0 when L2 is active. + */ + WARN_ON_ONCE(is_guest_mode(vcpu)); + vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_TR], VCPU_SREG_TR); vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_ES], VCPU_SREG_ES); vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_DS], VCPU_SREG_DS); @@ -3084,6 +3093,17 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu) #define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \ CPU_BASED_CR3_STORE_EXITING)
+static bool vmx_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ + if (is_guest_mode(vcpu)) + return nested_guest_cr0_valid(vcpu, cr0); + + if (to_vmx(vcpu)->nested.vmxon) + return nested_host_cr0_valid(vcpu, cr0); + + return true; +} + void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -5027,18 +5047,11 @@ static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val) val = (val & ~vmcs12->cr0_guest_host_mask) | (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask);
- if (!nested_guest_cr0_valid(vcpu, val)) - return 1; - if (kvm_set_cr0(vcpu, val)) return 1; vmcs_writel(CR0_READ_SHADOW, orig_val); return 0; } else { - if (to_vmx(vcpu)->nested.vmxon && - !nested_host_cr0_valid(vcpu, val)) - return 1; - return kvm_set_cr0(vcpu, val); } } @@ -7744,6 +7757,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .set_segment = vmx_set_segment, .get_cpl = vmx_get_cpl, .get_cs_db_l_bits = vmx_get_cs_db_l_bits, + .is_valid_cr0 = vmx_is_valid_cr0, .set_cr0 = vmx_set_cr0, .is_valid_cr4 = vmx_is_valid_cr4, .set_cr4 = vmx_set_cr4, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7e1e3bc745622..285ba12be8ce3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -876,6 +876,22 @@ int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3) } EXPORT_SYMBOL_GPL(load_pdptrs);
+static bool kvm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ +#ifdef CONFIG_X86_64 + if (cr0 & 0xffffffff00000000UL) + return false; +#endif + + if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) + return false; + + if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) + return false; + + return static_call(kvm_x86_is_valid_cr0)(vcpu, cr0); +} + void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned long cr0) { if ((cr0 ^ old_cr0) & X86_CR0_PG) { @@ -898,20 +914,13 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) unsigned long old_cr0 = kvm_read_cr0(vcpu); unsigned long pdptr_bits = X86_CR0_CD | X86_CR0_NW | X86_CR0_PG;
- cr0 |= X86_CR0_ET; - -#ifdef CONFIG_X86_64 - if (cr0 & 0xffffffff00000000UL) + if (!kvm_is_valid_cr0(vcpu, cr0)) return 1; -#endif - - cr0 &= ~CR0_RESERVED_BITS;
- if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) - return 1; + cr0 |= X86_CR0_ET;
- if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) - return 1; + /* Write to CR0 reserved bits are ignored, even on Intel. */ + cr0 &= ~CR0_RESERVED_BITS;
#ifdef CONFIG_X86_64 if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) && @@ -10643,7 +10652,8 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) return false; }
- return kvm_is_valid_cr4(vcpu, sregs->cr4); + return kvm_is_valid_cr4(vcpu, sregs->cr4) && + kvm_is_valid_cr0(vcpu, sregs->cr0); }
static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
From: Jason Wang jasowang@redhat.com
commit 25266128fe16d5632d43ada34c847d7b8daba539 upstream.
A race were found where set_channels could be called after registering but before virtnet_set_queues() in virtnet_probe(). Fixing this by moving the virtnet_set_queues() before netdevice registering. While at it, use _virtnet_set_queues() to avoid holding rtnl as the device is not even registered at that time.
Cc: stable@vger.kernel.org Fixes: a220871be66f ("virtio-net: correctly enable multiqueue") Signed-off-by: Jason Wang jasowang@redhat.com Acked-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Link: https://lore.kernel.org/r/20230725072049.617289-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/virtio_net.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3319,6 +3319,8 @@ static int virtnet_probe(struct virtio_d } }
+ _virtnet_set_queues(vi, vi->curr_queue_pairs); + /* serialize netdev register + virtio_device_ready() with ndo_open() */ rtnl_lock();
@@ -3339,8 +3341,6 @@ static int virtnet_probe(struct virtio_d goto free_unregister_netdev; }
- virtnet_set_queues(vi, vi->curr_queue_pairs); - /* Assume link up if device can't report link status, otherwise get link status from config. */ netif_carrier_off(dev);
From: Stefan Haberland sth@linux.ibm.com
commit 05f1d8ed03f547054efbc4d29bb7991c958ede95 upstream.
Quiesce and resume are functions that tell the DASD driver to stop/resume issuing I/Os to a specific DASD.
On resume dasd_schedule_block_bh() is called to kick handling of IO requests again. This does unfortunately not cover internal requests which are used for path verification for example.
This could lead to a hanging device when a path event or anything else that triggers internal requests occurs on a quiesced device.
Fix by also calling dasd_schedule_device_bh() which triggers handling of internal requests on resume.
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland sth@linux.ibm.com Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Link: https://lore.kernel.org/r/20230721193647.3889634-2-sth@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd_ioctl.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/s390/block/dasd_ioctl.c +++ b/drivers/s390/block/dasd_ioctl.c @@ -131,6 +131,7 @@ static int dasd_ioctl_resume(struct dasd spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);
dasd_schedule_block_bh(block); + dasd_schedule_device_bh(base); return 0; }
From: Mark Brown broonie@kernel.org
commit f061e2be8689057cb4ec0dbffa9f03e1a23cdcb2 upstream.
The WM8904_ADC_TEST_0 register is modified as part of updating the OSR controls but does not have a cache default, leading to errors when we try to modify these controls in cache only mode with no prior read:
wm8904 3-001a: ASoC: error at snd_soc_component_update_bits on wm8904.3-001a for register: [0x000000c6] -16
Add a read of the register to probe() to fill the cache and avoid both the error messages and the misconfiguration of the chip which will result.
Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230723-asoc-fix-wm8904-adc-test-read-v1-1-2cdf2e... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/wm8904.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/soc/codecs/wm8904.c +++ b/sound/soc/codecs/wm8904.c @@ -2306,6 +2306,9 @@ static int wm8904_i2c_probe(struct i2c_c regmap_update_bits(wm8904->regmap, WM8904_BIAS_CONTROL_0, WM8904_POBCTRL, 0);
+ /* Fill the cache for the ADC test register */ + regmap_read(wm8904->regmap, WM8904_ADC_TEST_0, &val); + /* Can leave the device powered off until we need it */ regcache_cache_only(wm8904->regmap, true); regulator_bulk_disable(ARRAY_SIZE(wm8904->supplies), wm8904->supplies);
From: Xiubo Li xiubli@redhat.com
commit 50164507f6b7b7ed85d8c3ac0266849fbd908db7 upstream.
Even the 'disable_send_metrics' is true so when the session is being opened it will always trigger to send the metric for the first time.
Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Venky Shankar vshankar@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/metric.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -202,7 +202,7 @@ static void metric_delayed_work(struct w struct ceph_mds_client *mdsc = container_of(m, struct ceph_mds_client, metric);
- if (mdsc->stopping) + if (mdsc->stopping || disable_send_metrics) return;
if (!m->session || !check_session_state(m->session)) {
From: Joe Thornber ejt@redhat.com
commit 1e4ab7b4c881cf26c1c72b3f56519e03475486fb upstream.
When using the cleaner policy to decommission the cache, there is never any writeback started from the cache as it is constantly delayed due to normal I/O keeping the device busy. Meaning @idle=false was always being passed to clean_target_met()
Fix this by adding a specific 'cleaner' flag that is set when the cleaner policy is configured. This flag serves to always allow the cleaner's writeback work to be queued until the cache is decommissioned (even if the cache isn't idle).
Reported-by: David Jeffery djeffery@redhat.com Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2") Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber ejt@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-cache-policy-smq.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/drivers/md/dm-cache-policy-smq.c +++ b/drivers/md/dm-cache-policy-smq.c @@ -854,7 +854,13 @@ struct smq_policy {
struct background_tracker *bg_work;
- bool migrations_allowed; + bool migrations_allowed:1; + + /* + * If this is set the policy will try and clean the whole cache + * even if the device is not idle. + */ + bool cleaner:1; };
/*----------------------------------------------------------------*/ @@ -1133,7 +1139,7 @@ static bool clean_target_met(struct smq_ * Cache entries may not be populated. So we cannot rely on the * size of the clean queue. */ - if (idle) { + if (idle || mq->cleaner) { /* * We'd like to clean everything. */ @@ -1716,11 +1722,9 @@ static void calc_hotspot_params(sector_t *hotspot_block_size /= 2u; }
-static struct dm_cache_policy *__smq_create(dm_cblock_t cache_size, - sector_t origin_size, - sector_t cache_block_size, - bool mimic_mq, - bool migrations_allowed) +static struct dm_cache_policy * +__smq_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size, + bool mimic_mq, bool migrations_allowed, bool cleaner) { unsigned i; unsigned nr_sentinels_per_queue = 2u * NR_CACHE_LEVELS; @@ -1807,6 +1811,7 @@ static struct dm_cache_policy *__smq_cre goto bad_btracker;
mq->migrations_allowed = migrations_allowed; + mq->cleaner = cleaner;
return &mq->policy;
@@ -1830,21 +1835,24 @@ static struct dm_cache_policy *smq_creat sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, false, true); + return __smq_create(cache_size, origin_size, cache_block_size, + false, true, false); }
static struct dm_cache_policy *mq_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, true, true); + return __smq_create(cache_size, origin_size, cache_block_size, + true, true, false); }
static struct dm_cache_policy *cleaner_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, false, false); + return __smq_create(cache_size, origin_size, cache_block_size, + false, false, true); }
/*----------------------------------------------------------------*/
From: Ilya Dryomov idryomov@gmail.com
commit f38cb9d9c2045dad16eead4a2e1aedfddd94603b upstream.
Make the "num_lockers can be only 0 or 1" assumption explicit and simplify the API by getting rid of output parameters in preparation for calling get_lock_owner_info() twice before blocklisting.
Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/rbd.c | 84 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 51 insertions(+), 33 deletions(-)
--- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3851,10 +3851,17 @@ static void wake_lock_waiters(struct rbd list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list); }
-static int get_lock_owner_info(struct rbd_device *rbd_dev, - struct ceph_locker **lockers, u32 *num_lockers) +static void free_locker(struct ceph_locker *locker) +{ + if (locker) + ceph_free_lockers(locker, 1); +} + +static struct ceph_locker *get_lock_owner_info(struct rbd_device *rbd_dev) { struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc; + struct ceph_locker *lockers; + u32 num_lockers; u8 lock_type; char *lock_tag; int ret; @@ -3863,39 +3870,45 @@ static int get_lock_owner_info(struct rb
ret = ceph_cls_lock_info(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc, RBD_LOCK_NAME, - &lock_type, &lock_tag, lockers, num_lockers); - if (ret) - return ret; + &lock_type, &lock_tag, &lockers, &num_lockers); + if (ret) { + rbd_warn(rbd_dev, "failed to retrieve lockers: %d", ret); + return ERR_PTR(ret); + }
- if (*num_lockers == 0) { + if (num_lockers == 0) { dout("%s rbd_dev %p no lockers detected\n", __func__, rbd_dev); + lockers = NULL; goto out; }
if (strcmp(lock_tag, RBD_LOCK_TAG)) { rbd_warn(rbd_dev, "locked by external mechanism, tag %s", lock_tag); - ret = -EBUSY; - goto out; + goto err_busy; }
if (lock_type == CEPH_CLS_LOCK_SHARED) { rbd_warn(rbd_dev, "shared lock type detected"); - ret = -EBUSY; - goto out; + goto err_busy; }
- if (strncmp((*lockers)[0].id.cookie, RBD_LOCK_COOKIE_PREFIX, + WARN_ON(num_lockers != 1); + if (strncmp(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX, strlen(RBD_LOCK_COOKIE_PREFIX))) { rbd_warn(rbd_dev, "locked by external mechanism, cookie %s", - (*lockers)[0].id.cookie); - ret = -EBUSY; - goto out; + lockers[0].id.cookie); + goto err_busy; }
out: kfree(lock_tag); - return ret; + return lockers; + +err_busy: + kfree(lock_tag); + ceph_free_lockers(lockers, num_lockers); + return ERR_PTR(-EBUSY); }
static int find_watcher(struct rbd_device *rbd_dev, @@ -3949,51 +3962,56 @@ out: static int rbd_try_lock(struct rbd_device *rbd_dev) { struct ceph_client *client = rbd_dev->rbd_client->client; - struct ceph_locker *lockers; - u32 num_lockers; + struct ceph_locker *locker; int ret;
for (;;) { + locker = NULL; + ret = rbd_lock(rbd_dev); if (ret != -EBUSY) - return ret; + goto out;
/* determine if the current lock holder is still alive */ - ret = get_lock_owner_info(rbd_dev, &lockers, &num_lockers); - if (ret) - return ret; - - if (num_lockers == 0) + locker = get_lock_owner_info(rbd_dev); + if (IS_ERR(locker)) { + ret = PTR_ERR(locker); + locker = NULL; + goto out; + } + if (!locker) goto again;
- ret = find_watcher(rbd_dev, lockers); + ret = find_watcher(rbd_dev, locker); if (ret) goto out; /* request lock or error */
rbd_warn(rbd_dev, "breaking header lock owned by %s%llu", - ENTITY_NAME(lockers[0].id.name)); + ENTITY_NAME(locker->id.name));
ret = ceph_monc_blocklist_add(&client->monc, - &lockers[0].info.addr); + &locker->info.addr); if (ret) { - rbd_warn(rbd_dev, "blocklist of %s%llu failed: %d", - ENTITY_NAME(lockers[0].id.name), ret); + rbd_warn(rbd_dev, "failed to blocklist %s%llu: %d", + ENTITY_NAME(locker->id.name), ret); goto out; }
ret = ceph_cls_break_lock(&client->osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc, RBD_LOCK_NAME, - lockers[0].id.cookie, - &lockers[0].id.name); - if (ret && ret != -ENOENT) + locker->id.cookie, &locker->id.name); + if (ret && ret != -ENOENT) { + rbd_warn(rbd_dev, "failed to break header lock: %d", + ret); goto out; + }
again: - ceph_free_lockers(lockers, num_lockers); + free_locker(locker); }
out: - ceph_free_lockers(lockers, num_lockers); + free_locker(locker); return ret; }
From: Ilya Dryomov idryomov@gmail.com
commit 8ff2c64c9765446c3cef804fb99da04916603e27 upstream.
- we want the exclusive lock type, so test for it directly - use sscanf() to actually parse the lock cookie and avoid admitting invalid handles - bail if locker has a blank address
Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/rbd.c | 21 +++++++++++++++------ net/ceph/messenger.c | 1 + 2 files changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3864,10 +3864,9 @@ static struct ceph_locker *get_lock_owne u32 num_lockers; u8 lock_type; char *lock_tag; + u64 handle; int ret;
- dout("%s rbd_dev %p\n", __func__, rbd_dev); - ret = ceph_cls_lock_info(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc, RBD_LOCK_NAME, &lock_type, &lock_tag, &lockers, &num_lockers); @@ -3888,18 +3887,28 @@ static struct ceph_locker *get_lock_owne goto err_busy; }
- if (lock_type == CEPH_CLS_LOCK_SHARED) { - rbd_warn(rbd_dev, "shared lock type detected"); + if (lock_type != CEPH_CLS_LOCK_EXCLUSIVE) { + rbd_warn(rbd_dev, "incompatible lock type detected"); goto err_busy; }
WARN_ON(num_lockers != 1); - if (strncmp(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX, - strlen(RBD_LOCK_COOKIE_PREFIX))) { + ret = sscanf(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX " %llu", + &handle); + if (ret != 1) { rbd_warn(rbd_dev, "locked by external mechanism, cookie %s", lockers[0].id.cookie); goto err_busy; } + if (ceph_addr_is_blank(&lockers[0].info.addr)) { + rbd_warn(rbd_dev, "locker has a blank address"); + goto err_busy; + } + + dout("%s rbd_dev %p got locker %s%llu@%pISpc/%u handle %llu\n", + __func__, rbd_dev, ENTITY_NAME(lockers[0].id.name), + &lockers[0].info.addr.in_addr, + le32_to_cpu(lockers[0].info.addr.nonce), handle);
out: kfree(lock_tag); --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -1144,6 +1144,7 @@ bool ceph_addr_is_blank(const struct cep return true; } } +EXPORT_SYMBOL(ceph_addr_is_blank);
int ceph_addr_port(const struct ceph_entity_addr *addr) {
From: Ilya Dryomov idryomov@gmail.com
commit 588159009d5b7a09c3e5904cffddbe4a4e170301 upstream.
An attempt to acquire exclusive lock can race with the current lock owner closing the image:
1. lock is held by client123, rbd_lock() returns -EBUSY 2. get_lock_owner_info() returns client123 instance details 3. client123 closes the image, lock is released 4. find_watcher() returns 0 as there is no matching watcher anymore 5. client123 instance gets erroneously blocklisted
Particularly impacted is mirror snapshot scheduler in snapshot-based mirroring since it happens to open and close images a lot (images are opened only for as long as it takes to take the next mirror snapshot, the same client instance is used for all images).
To reduce the potential for erroneous blocklisting, retrieve the lock owner again after find_watcher() returns 0. If it's still there, make sure it matches the previously detected lock owner.
Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/rbd.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-)
--- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3851,6 +3851,15 @@ static void wake_lock_waiters(struct rbd list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list); }
+static bool locker_equal(const struct ceph_locker *lhs, + const struct ceph_locker *rhs) +{ + return lhs->id.name.type == rhs->id.name.type && + lhs->id.name.num == rhs->id.name.num && + !strcmp(lhs->id.cookie, rhs->id.cookie) && + ceph_addr_equal_no_type(&lhs->info.addr, &rhs->info.addr); +} + static void free_locker(struct ceph_locker *locker) { if (locker) @@ -3971,11 +3980,11 @@ out: static int rbd_try_lock(struct rbd_device *rbd_dev) { struct ceph_client *client = rbd_dev->rbd_client->client; - struct ceph_locker *locker; + struct ceph_locker *locker, *refreshed_locker; int ret;
for (;;) { - locker = NULL; + locker = refreshed_locker = NULL;
ret = rbd_lock(rbd_dev); if (ret != -EBUSY) @@ -3995,6 +4004,16 @@ static int rbd_try_lock(struct rbd_devic if (ret) goto out; /* request lock or error */
+ refreshed_locker = get_lock_owner_info(rbd_dev); + if (IS_ERR(refreshed_locker)) { + ret = PTR_ERR(refreshed_locker); + refreshed_locker = NULL; + goto out; + } + if (!refreshed_locker || + !locker_equal(locker, refreshed_locker)) + goto again; + rbd_warn(rbd_dev, "breaking header lock owned by %s%llu", ENTITY_NAME(locker->id.name));
@@ -4016,10 +4035,12 @@ static int rbd_try_lock(struct rbd_devic }
again: + free_locker(refreshed_locker); free_locker(locker); }
out: + free_locker(refreshed_locker); free_locker(locker); return ret; }
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 9971c3f944489ff7aacb9d25e0cde841a5f6018a upstream.
The test to check if the field is a stack is to be done if it is not a string. But the code had:
} if (event->fields[i]->is_stack) {
and not
} else if (event->fields[i]->is_stack) {
which would cause it to always be tested. Worse yet, this also included an "else" statement that was only to be called if the field was not a string and a stack, but this code allows it to be called if it was a string (and not a stack).
Also fixed some whitespace issues.
Link: https://lore.kernel.org/all/202301302110.mEtNwkBD-lkp@intel.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230131095237.63e3ca8d@gandalf.l...
Cc: Tom Zanussi zanussi@kernel.org Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_synth.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -557,8 +557,8 @@ static notrace void trace_event_raw_even event->fields[i]->is_dynamic, data_size, &n_u64); data_size += len; /* only dynamic string increments */ - } if (event->fields[i]->is_stack) { - long *stack = (long *)(long)var_ref_vals[val_idx]; + } else if (event->fields[i]->is_stack) { + long *stack = (long *)(long)var_ref_vals[val_idx];
len = trace_stack(entry, event, stack, data_size, &n_u64);
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit c02d5feb6e2f60affc6ba8606d8d614c071e2ba6 upstream.
When _PPC returns 0, it means that the CPU frequency is not limited by the platform firmware, so make acpi_processor_get_platform_limit() update the frequency QoS request used by it to "no limit" in that case.
This addresses a problem with limiting CPU frequency artificially on some systems after CPU offline/online to the frequency that corresponds to the first entry in the _PSS return package.
Reported-by: Pratyush Yadav ptyadav@amazon.de Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Pratyush Yadav ptyadav@amazon.de Tested-by: Pratyush Yadav ptyadav@amazon.de Tested-by: Hagar Hemdan hagarhem@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/processor_perflib.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
--- a/drivers/acpi/processor_perflib.c +++ b/drivers/acpi/processor_perflib.c @@ -53,6 +53,8 @@ static int acpi_processor_get_platform_l { acpi_status status = 0; unsigned long long ppc = 0; + s32 qos_value; + int index; int ret;
if (!pr) @@ -72,17 +74,27 @@ static int acpi_processor_get_platform_l } }
+ index = ppc; + pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr->id, - (int)ppc, ppc ? "" : "not"); + index, index ? "is" : "is not");
- pr->performance_platform_limit = (int)ppc; + pr->performance_platform_limit = index;
if (ppc >= pr->performance->state_count || unlikely(!freq_qos_request_active(&pr->perflib_req))) return 0;
- ret = freq_qos_update_request(&pr->perflib_req, - pr->performance->states[ppc].core_frequency * 1000); + /* + * If _PPC returns 0, it means that all of the available states can be + * used ("no limit"). + */ + if (index == 0) + qos_value = FREQ_QOS_MAX_DEFAULT_VALUE; + else + qos_value = pr->performance->states[index].core_frequency * 1000; + + ret = freq_qos_update_request(&pr->perflib_req, qos_value); if (ret < 0) { pr_warn("Failed to update perflib freq constraint: CPU%d (%d)\n", pr->id, ret);
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 99387b016022c29234c4ebf9abd34358c6e56532 upstream.
Modify acpi_processor_get_platform_limit() to avoid updating its frequency QoS request when the _PPC return value has not changed by comparing that value to the previous _PPC return value stored in the performance_platform_limit field of the struct acpi_processor corresponding to the given CPU.
While at it, do the _PPC return value check against the state count earlier, to avoid setting performance_platform_limit to an invalid value, and make acpi_processor_ppc_init() use FREQ_QOS_MAX_DEFAULT_VALUE as the "no limit" frequency QoS for consistency.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Tested-by: Hagar Hemdan hagarhem@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/processor_perflib.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
--- a/drivers/acpi/processor_perflib.c +++ b/drivers/acpi/processor_perflib.c @@ -76,13 +76,16 @@ static int acpi_processor_get_platform_l
index = ppc;
+ if (pr->performance_platform_limit == index || + ppc >= pr->performance->state_count) + return 0; + pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr->id, index, index ? "is" : "is not");
pr->performance_platform_limit = index;
- if (ppc >= pr->performance->state_count || - unlikely(!freq_qos_request_active(&pr->perflib_req))) + if (unlikely(!freq_qos_request_active(&pr->perflib_req))) return 0;
/* @@ -177,9 +180,16 @@ void acpi_processor_ppc_init(struct cpuf if (!pr) continue;
+ /* + * Reset performance_platform_limit in case there is a stale + * value in it, so as to make it match the "no limit" QoS value + * below. + */ + pr->performance_platform_limit = 0; + ret = freq_qos_add_request(&policy->constraints, - &pr->perflib_req, - FREQ_QOS_MAX, INT_MAX); + &pr->perflib_req, FREQ_QOS_MAX, + FREQ_QOS_MAX_DEFAULT_VALUE); if (ret < 0) pr_err("Failed to add freq constraint for CPU%d (%d)\n", cpu, ret);
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit e8a0e30b742f76ebd0f3b196973df4bf65d8fbbb upstream.
After making acpi_processor_get_platform_limit() use the "no limit" value for its frequency QoS request when _PPC returns 0, it is not necessary to replace the frequency corresponding to the first _PSS return package entry with the maximum turbo frequency of the given CPU in intel_pstate_init_acpi_perf_limits() any more, so drop the code doing that along with the comment explaining it.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Tested-by: Hagar Hemdan hagarhem@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/intel_pstate.c | 14 -------------- 1 file changed, 14 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -448,20 +448,6 @@ static void intel_pstate_init_acpi_perf_ (u32) cpu->acpi_perf_data.states[i].control); }
- /* - * The _PSS table doesn't contain whole turbo frequency range. - * This just contains +1 MHZ above the max non turbo frequency, - * with control value corresponding to max turbo ratio. But - * when cpufreq set policy is called, it will call with this - * max frequency, which will cause a reduced performance as - * this driver uses real max turbo frequency as the max - * frequency. So correct this frequency in _PSS table to - * correct max turbo frequency based on the turbo state. - * Also need to convert to MHz as _PSS freq is in MHz. - */ - if (!global.turbo_disabled) - cpu->acpi_perf_data.states[0].core_frequency = - policy->cpuinfo.max_freq / 1000; cpu->valid_pss_table = true; pr_debug("_PPC limits will be enforced\n");
From: Matthieu Baerts matthieu.baerts@tessares.net
commit a5a5990c099dd354e05e89ee77cd2dbf6655d4a1 upstream.
IPTables commands using 'iptables-nft' fail on old kernels, at least on v5.15 because it doesn't see the default IPTables chains:
$ iptables -L iptables/1.8.2 Failed to initialize nft: Protocol not supported
As a first step before switching to NFTables, we can use iptables-legacy if available.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: dc65fe82fb07 ("selftests: mptcp: add packet mark test case") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_sockopt.sh | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh @@ -13,13 +13,15 @@ timeout_poll=30 timeout_test=$((timeout_poll * 2 + 1)) mptcp_connect="" do_all_tests=1 +iptables="iptables" +ip6tables="ip6tables"
add_mark_rules() { local ns=$1 local m=$2
- for t in iptables ip6tables; do + for t in ${iptables} ${ip6tables}; do # just to debug: check we have multiple subflows connection requests ip netns exec $ns $t -A OUTPUT -p tcp --syn -m mark --mark $m -j ACCEPT
@@ -90,14 +92,14 @@ if [ $? -ne 0 ];then exit $ksft_skip fi
-iptables -V > /dev/null 2>&1 -if [ $? -ne 0 ];then +# Use the legacy version if available to support old kernel versions +if iptables-legacy -V &> /dev/null; then + iptables="iptables-legacy" + ip6tables="ip6tables-legacy" +elif ! iptables -V &> /dev/null; then echo "SKIP: Could not run all tests without iptables tool" exit $ksft_skip -fi - -ip6tables -V > /dev/null 2>&1 -if [ $? -ne 0 ];then +elif ! ip6tables -V &> /dev/null; then echo "SKIP: Could not run all tests without ip6tables tool" exit $ksft_skip fi @@ -107,10 +109,10 @@ check_mark() local ns=$1 local af=$2
- tables=iptables + tables=${iptables}
if [ $af -eq 6 ];then - tables=ip6tables + tables=${ip6tables} fi
counters=$(ip netns exec $ns $tables -v -L OUTPUT | grep DROP)
From: Jens Axboe axboe@kernel.dk
commit a9be202269580ca611c6cebac90eaf1795497800 upstream.
io-wq assumes that an issue is blocking, but it may not be if the request type has asked for a non-blocking attempt. If we get -EAGAIN for that case, then we need to treat it as a final result and not retry or arm poll for it.
Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/issues/897 Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -7066,6 +7066,14 @@ static void io_wq_submit_work(struct io_ */ if (ret != -EAGAIN || !(req->ctx->flags & IORING_SETUP_IOPOLL)) break; + + /* + * If REQ_F_NOWAIT is set, then don't wait or retry with + * poll. -EAGAIN is final for that case. + */ + if (req->flags & REQ_F_NOWAIT) + break; + cond_resched(); } while (1); }
From: Thomas Petazzoni thomas.petazzoni@bootlin.com
commit e51df4f81b02bcdd828a04de7c1eb6a92988b61e upstream.
In commit 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table pointer"), 9 years ago, some random guy fixed the cs42l51 after it was split into a core part and an I2C part to properly match based on a Device Tree compatible string.
However, the fix in this commit is wrong: the MODULE_DEVICE_TABLE(of, ....) is in the core part of the driver, not the I2C part. Therefore, automatic module loading based on module.alias, based on matching with the DT compatible string, loads the core part of the driver, but not the I2C part. And threfore, the i2c_driver is not registered, and the codec is not known to the system, nor matched with a DT node with the corresponding compatible string.
In order to fix that, we move the MODULE_DEVICE_TABLE(of, ...) into the I2C part of the driver. The cs42l51_of_match[] array is also moved as well, as it is not possible to have this definition in one file, and the MODULE_DEVICE_TABLE(of, ...) invocation in another file, due to how MODULE_DEVICE_TABLE works.
Thanks to this commit, the I2C part of the driver now properly autoloads, and thanks to its dependency on the core part, the core part gets autoloaded as well, resulting in a functional sound card without having to manually load kernel modules.
Fixes: 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table pointer") Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni thomas.petazzoni@bootlin.com Link: https://lore.kernel.org/r/20230713112112.778576-1-thomas.petazzoni@bootlin.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/cs42l51-i2c.c | 6 ++++++ sound/soc/codecs/cs42l51.c | 7 ------- sound/soc/codecs/cs42l51.h | 1 - 3 files changed, 6 insertions(+), 8 deletions(-)
--- a/sound/soc/codecs/cs42l51-i2c.c +++ b/sound/soc/codecs/cs42l51-i2c.c @@ -19,6 +19,12 @@ static struct i2c_device_id cs42l51_i2c_ }; MODULE_DEVICE_TABLE(i2c, cs42l51_i2c_id);
+const struct of_device_id cs42l51_of_match[] = { + { .compatible = "cirrus,cs42l51", }, + { } +}; +MODULE_DEVICE_TABLE(of, cs42l51_of_match); + static int cs42l51_i2c_probe(struct i2c_client *i2c, const struct i2c_device_id *id) { --- a/sound/soc/codecs/cs42l51.c +++ b/sound/soc/codecs/cs42l51.c @@ -825,13 +825,6 @@ int __maybe_unused cs42l51_resume(struct } EXPORT_SYMBOL_GPL(cs42l51_resume);
-const struct of_device_id cs42l51_of_match[] = { - { .compatible = "cirrus,cs42l51", }, - { } -}; -MODULE_DEVICE_TABLE(of, cs42l51_of_match); -EXPORT_SYMBOL_GPL(cs42l51_of_match); - MODULE_AUTHOR("Arnaud Patard arnaud.patard@rtp-net.org"); MODULE_DESCRIPTION("Cirrus Logic CS42L51 ALSA SoC Codec Driver"); MODULE_LICENSE("GPL"); --- a/sound/soc/codecs/cs42l51.h +++ b/sound/soc/codecs/cs42l51.h @@ -16,7 +16,6 @@ int cs42l51_probe(struct device *dev, st int cs42l51_remove(struct device *dev); int __maybe_unused cs42l51_suspend(struct device *dev); int __maybe_unused cs42l51_resume(struct device *dev); -extern const struct of_device_id cs42l51_of_match[];
#define CS42L51_CHIP_ID 0x1B #define CS42L51_CHIP_REV_A 0x00
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 016e7ba47f33064fbef8c4307a2485d2669dfd03 upstream.
If 'iptables-legacy' is available, 'ip6tables-legacy' command will be used instead of 'ip6tables'. So no need to look if 'ip6tables' is available in this case.
Cc: stable@vger.kernel.org Fixes: 0c4cd3f86a40 ("selftests: mptcp: join: use 'iptables-legacy' if available") Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Mat Martineau martineau@kernel.org Link: https://lore.kernel.org/r/20230725-send-net-20230725-v1-1-6f60fe7137a9@kerne... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -179,10 +179,7 @@ if iptables-legacy -V &> /dev/null; then elif ! iptables -V &> /dev/null; then echo "SKIP: Could not run all tests without iptables tool" exit $ksft_skip -fi - -ip6tables -V > /dev/null 2>&1 -if [ $? -ne 0 ];then +elif ! ip6tables -V &> /dev/null; then echo "SKIP: Could not run all tests without ip6tables tool" exit $ksft_skip fi
On Tue, 01 Aug 2023 11:18:32 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.15: 11 builds: 11 pass, 0 fail 28 boots: 28 pass, 0 fail 114 tests: 114 pass, 0 fail
Linux version: 5.15.124-rc1-gb2c388dc2443 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hello,
On Tue, 1 Aug 2023 11:18:32 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr [2] b2c388dc2443 ("Linux 5.15.124-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_m68k.sh ok 12 selftests: damon-tests: build_arm64.sh ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
On 8/1/23 02:18, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On 8/1/23 03:18, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Tue, 1 Aug 2023 at 14:52, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Following patch caused build regression on stable-rc 5.15 and stable-rc 6.1,
Regressions found on arm:
- build/gcc-12-orion5x_defconfig - build/clang-nightly-orion5x_defconfig - build/gcc-8-orion5x_defconfig - build/clang-16-orion5x_defconfig
gpio: mvebu: Make use of devm_pwmchip_add [ Upstream commit 1945063eb59e64d2919cb14d54d081476d9e53bb ]
Build log: ------ drivers/gpio/gpio-mvebu.c: In function 'mvebu_pwm_probe': drivers/gpio/gpio-mvebu.c:877:16: error: implicit declaration of function 'devm_pwmchip_add'; did you mean 'pwmchip_add'? [-Werror=implicit-function-declaration] 877 | return devm_pwmchip_add(dev, &mvpwm->chip); | ^~~~~~~~~~~~~~~~ | pwmchip_add cc1: some warnings being treated as errors
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Links: - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15....
-- Linaro LKFT https://lkft.linaro.org
On Wed, Aug 02, 2023 at 07:27:03AM +0530, Naresh Kamboju wrote:
On Tue, 1 Aug 2023 at 14:52, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Following patch caused build regression on stable-rc 5.15 and stable-rc 6.1,
Regressions found on arm:
- build/gcc-12-orion5x_defconfig
- build/clang-nightly-orion5x_defconfig
- build/gcc-8-orion5x_defconfig
- build/clang-16-orion5x_defconfig
gpio: mvebu: Make use of devm_pwmchip_add [ Upstream commit 1945063eb59e64d2919cb14d54d081476d9e53bb ]
Build log:
drivers/gpio/gpio-mvebu.c: In function 'mvebu_pwm_probe': drivers/gpio/gpio-mvebu.c:877:16: error: implicit declaration of function 'devm_pwmchip_add'; did you mean 'pwmchip_add'? [-Werror=implicit-function-declaration] 877 | return devm_pwmchip_add(dev, &mvpwm->chip); | ^~~~~~~~~~~~~~~~ | pwmchip_add cc1: some warnings being treated as errors
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Ah, looks like 88da4e811311 ("pwm: Add a stub for devm_pwmchip_add()") is needed, I'll queue that up for 5.15 and 6.1 and push out new -rc releases.
thanks,
greg k-h
Hi Greg,
On 01/08/23 2:48 pm, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.124 release. There are 155 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.124-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
No problems seen on x86_64 and aarch64.
Tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
Thanks, Harshit
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org