This is the start of the stable review cycle for the 6.4.8 release. There are 239 patches in this series; all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 03 Aug 2023 09:18:38 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.4.8-rc1.g...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.4.8-rc1
Dan Carpenter dan.carpenter@linaro.org dma-buf: fix an error pointer vs NULL bug
Christian König christian.koenig@amd.com dma-buf: keep the signaling time of merged fences v3
Jann Horn jannh@google.com mm/mempolicy: Take VMA lock before replacing policy
Sidhartha Kumar sidhartha.kumar@oracle.com mm/memory-failure: fix hardware poison check in unpoison_memory()
Jann Horn jannh@google.com mm: fix memory ordering for mm_lock_seq and vm_lock_seq
Jann Horn jannh@google.com mm: lock VMA in dup_anon_vma() before setting ->anon_vma
Ilya Dryomov idryomov@gmail.com rbd: retrieve and check lock owner twice before blocklisting
Ilya Dryomov idryomov@gmail.com rbd: harden get_lock_owner_info() a bit
Ilya Dryomov idryomov@gmail.com rbd: make get_lock_owner_info() return a single locker or NULL
Joe Thornber ejt@redhat.com dm cache policy smq: ensure IO doesn't prevent cleaner policy progress
Radhakrishna Sripada radhakrishna.sripada@intel.com drm/i915/dpt: Use shmem for dpt objects
Xiubo Li xiubli@redhat.com ceph: never send metrics if disable_send_metrics is set
Ahmad Fatoum a.fatoum@pengutronix.de thermal: of: fix double-free on unregistration
Johan Hovold johan+linaro@kernel.org PM: sleep: wakeirq: fix wake irq arming
Mark Brown broonie@kernel.org arm64/sme: Set new vector length before reallocating
Mark Brown broonie@kernel.org ASoC: wm8904: Fill the cache for WM8904_ADC_TEST_0 register
Paolo Abeni pabeni@redhat.com mptcp: more accurate NL event generation
Stefan Haberland sth@linux.ibm.com s390/dasd: print copy pair message only for the correct error
Stefan Haberland sth@linux.ibm.com s390/dasd: fix hanging device after quiesce/resume
Eric Van Hensbergen ericvh@kernel.org fs/9p: remove unnecessary invalidate_inode_pages2
Eric Van Hensbergen ericvh@kernel.org fs/9p: fix type mismatch in file cache mode helper
Eric Van Hensbergen ericvh@kernel.org fs/9p: fix typo in comparison logic for cache mode
Eric Van Hensbergen ericvh@kernel.org fs/9p: remove unnecessary and overrestrictive check
Dominique Martinet asmadeus@codewreck.org 9p: fix ignored return value in v9fs_dir_release
Chenguang Zhao zhaochenguang@kylinos.cn LoongArch: BPF: Enable bpf_probe_read{, str}() on LoongArch
Tiezhu Yang yangtiezhu@loongson.cn LoongArch: BPF: Fix check condition to call lu32id in move_imm()
WANG Rui wangrui@loongson.cn LoongArch: Fix return value underflow in exception path
Andy Shevchenko andriy.shevchenko@linux.intel.com Revert "um: Use swap() to make code cleaner"
Johan Hovold johan+linaro@kernel.org soundwire: fix enumeration completion
Matthieu Baerts matthieu.baerts@tessares.net selftests: mptcp: join: only check for ip6tables if needed
Sean Christopherson seanjc@google.com selftests/rseq: Play nice with binaries statically linked against glibc 2.35+
Jason Gunthorpe jgg@ziepe.ca iommufd: Set end correctly when doing batch carry
Jens Axboe axboe@kernel.dk io_uring: gate iowait schedule on having pending requests
Christian Marangi ansuelsmth@gmail.com net: dsa: qca8k: fix mdb add/del case with 0 VID
Christian Marangi ansuelsmth@gmail.com net: dsa: qca8k: fix broken search_and_del
Christian Marangi ansuelsmth@gmail.com net: dsa: qca8k: fix search_and_insert wrong handling of new rule
Christian Marangi ansuelsmth@gmail.com net: dsa: qca8k: enable use_single_write for qca8xxx
Alex Elder elder@linaro.org net: ipa: only reset hashed tables when supported
Jason Wang jasowang@redhat.com virtio-net: fix race between set queues and probe
Demi Marie Obenour demi@invisiblethingslab.com xen: speed up grant-table reclaim
Dan Carpenter dan.carpenter@linaro.org proc/vmcore: fix signedness bug in read_from_oldmem()
Peter Zijlstra peterz@infradead.org locking/rtmutex: Fix task->pi_waiters integrity
Marc Zyngier maz@kernel.org irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidation
Jonas Gorski jonas.gorski@gmail.com irq-bcm6345-l1: Do not assume a fixed block to cpu mapping
Alexander Steffen Alexander.Steffen@infineon.com tpm_tis: Explicitly check for error code
Guanghui Feng guanghuifeng@linux.alibaba.com ACPI/IORT: Remove erroneous id_count check in iort_node_get_rmr_info()
Namjae Jeon linkinjeon@kernel.org ksmbd: check if a mount point is crossed during path lookup
Trond Myklebust trond.myklebust@hammerspace.com nfsd: Remove incorrect check in nfsd4_validate_stateid
Christian Brauner brauner@kernel.org file: always lock position for FMODE_ATOMIC_POS
Kim Phillips kim.phillips@amd.com x86/cpu: Enable STIBP on AMD if Automatic IBRS is enabled
Yazen Ghannam yazen.ghannam@amd.com x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
Filipe Manana fdmanana@suse.com btrfs: check for commit error at btrfs_attach_transaction_barrier()
Filipe Manana fdmanana@suse.com btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
Filipe Manana fdmanana@suse.com btrfs: account block group tree when calculating global reserve size
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: do not enable async discard
Guenter Roeck linux@roeck-us.net hwmon: (pmbus_core) Fix Deadlock in pmbus_regulator_get_status
Patrick Rudolph patrick.rudolph@9elements.com hwmon: (pmbus_core) Fix NULL pointer dereference
Patrick Rudolph patrick.rudolph@9elements.com hwmon: (pmbus_core) Fix pmbus_is_enabled()
Aleksa Savic savicaleksa83@gmail.com hwmon: (aquacomputer_d5next) Fix incorrect PWM value readout
Gilles Buloz Gilles.Buloz@kontron.com hwmon: (nct7802) Fix for temp6 (PECI1) processed even if PECI1 disabled
Baskaran Kannan Baski.Kannan@amd.com hwmon: (k10temp) Enable AMD3255 Proc to show negative temperature
Luka Guzenko l.guzenko@web.de ALSA: hda/relatek: Enable Mute LED on HP 250 G8
Pavel Asyutchenko svenpavel@gmail.com ALSA: hda/realtek: Support ASUS G713PV laptop
Oliver Neukum oneukum@suse.com Revert "xhci: add quirk for host controllers that don't update endpoint DCS"
Chaoyuan Peng hedonistsmith@gmail.com tty: n_gsm: fix UAF in gsm_cleanup_mux
Zhang Shurong zhang_shurong@foxmail.com staging: ks7010: potential buffer overflow in ks_wlan_set_encode_ext()
Larry Finger Larry.Finger@lwfinger.net staging: r8712: Fix memory leak in _r8712_init_xmit_priv()
Greg Kroah-Hartman gregkh@linuxfoundation.org Documentation: security-bugs.rst: clarify CVE handling
Greg Kroah-Hartman gregkh@linuxfoundation.org Documentation: security-bugs.rst: update preferences when dealing with the linux-distros group
Dan Carpenter dan.carpenter@linaro.org Revert "usb: xhci: tegra: Fix error check"
Ricardo Ribalda ribalda@chromium.org usb: xhci-mtk: set the dma max_seg_size
Frank Li Frank.Li@nxp.com usb: cdns3: fix incorrect calculation of ep_buf_size when more than one config
Łukasz Bartosik lb@semihalf.com USB: quirks: add quirk for Focusrite Scarlett
Guiting Shen aarongt.shen@gmail.com usb: ohci-at91: Fix the unhandle interrupt when resume
Xu Yang xu.yang_2@nxp.com usb: misc: ehset: fix wrong if condition
Jisheng Zhang jszhang@kernel.org usb: dwc3: don't reset device side if dwc3 was configured as host-only
Gratian Crisan gratian.crisan@ni.com usb: dwc3: pci: skip BYT GPIO lookup table for hardwired phy
Jakub Vanek linuxtardis@gmail.com Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"
Kyle Tso kyletso@google.com usb: typec: Use sysfs_emit_at when concatenating the string
Kyle Tso kyletso@google.com usb: typec: Iterate pds array when showing the pd list
Kyle Tso kyletso@google.com usb: typec: Set port->pd before adding device for typec_port
Samuel Thibault samuel.thibault@ens-lyon.org TIOCSTI: always enable for CAP_SYS_ADMIN
Marc Kleine-Budde mkl@pengutronix.de can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED
Johan Hovold johan@kernel.org USB: serial: simple: sort driver entries
Oliver Neukum oneukum@suse.com USB: serial: simple: add Kaufmann RKS+CAN VCP
Mohsen Tahmasebi moh53n@moh53n.ir USB: serial: option: add Quectel EC200A module support
Jerry Meng jerry-meng@foxmail.com USB: serial: option: support Quectel EM060K_128
Samuel Holland samuel.holland@sifive.com serial: sifive: Fix sifive_serial_console_setup() section
Ruihong Luo colorsu1922@gmail.com serial: 8250_dw: Preserve original value of DLF register
Biju Das biju.das.jz@bp.renesas.com tty: serial: sh-sci: Fix sleeping in atomic context
Johan Hovold johan+linaro@kernel.org serial: qcom-geni: drop bogus runtime pm state update
Sean Christopherson seanjc@google.com KVM: x86: Disallow KVM_SET_SREGS{2} if incoming CR0 is invalid
Sean Christopherson seanjc@google.com KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
Sean Christopherson seanjc@google.com KVM: Grab a reference to KVM for VM and vCPU stats file descriptors
Michael Grzeschik m.grzeschik@pengutronix.de usb: gadget: core: remove unbalanced mutex_unlock in usb_gadget_activate
Zqiang qiang.zhang1211@gmail.com USB: gadget: Fix the memory leak in raw_gadget driver
Frank Li Frank.Li@nxp.com usb: gadget: call usb_gadget_check_config() to verify UDC capability
Dan Carpenter dan.carpenter@linaro.org Revert "usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()"
Zheng Yejian zhengyejian1@huawei.com tracing: Fix warning in trace_buffered_event_disable()
Zheng Yejian zhengyejian1@huawei.com ring-buffer: Fix wrong stat of cpu_buffer->read
Arnd Bergmann arnd@arndb.de ata: pata_ns87415: mark ns87560_tf_read static
Hugh Dickins hughd@google.com tmpfs: fix Documentation of noswap and huge mount options
Jason Gunthorpe jgg@ziepe.ca iommufd: IOMMUFD_DESTROY should not increase the refcount
Ming Lei ming.lei@redhat.com ublk: return -EINTR if breaking from waiting for existed users in DEL_DEV
Ming Lei ming.lei@redhat.com ublk: fail to recover device if queue setup is interrupted
Ming Lei ming.lei@redhat.com ublk: fail to start device if queue setup is interrupted
Rob Clark robdclark@chromium.org drm/msm: Disallow submit with fence id 0
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Report correct WC error
Sindhu Devale sindhu.devale@intel.com RDMA/irdma: Fix op_type reporting in CQEs
Dan Carpenter dan.carpenter@linaro.org drm/amd/display: Unlock on error path in dm_handle_mst_sideband_msg_ready_event()
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix an error handling mistake in psp_sw_init()
Yu Kuai yukuai3@huawei.com dm raid: protect md_stop() with 'reconfig_mutex'
Yu Kuai yukuai3@huawei.com dm raid: clean up four equivalent goto tags in raid_ctr()
Yu Kuai yukuai3@huawei.com dm raid: fix missing reconfig_mutex unlock in raid_ctr() error paths
Stefano Stabellini sstabellini@kernel.org xenbus: check xen_domain in xenbus_probe_initcall
Christophe JAILLET christophe.jaillet@wanadoo.fr drm/i915: Fix an error handling path in igt_write_huge()
Steve French stfrench@microsoft.com smb3: do not set NTLMSSP_VERSION flag for negotiate not auth request
Bart Van Assche bvanassche@acm.org block: Fix a source code comment in include/uapi/linux/blkzoned.h
Matus Gajdos matuszpd@gmail.com ASoC: fsl_spdif: Silence output on stop
Breno Leitao leitao@debian.org cxl/acpi: Return 'rc' instead of '0' in cxl_parse_cfmws()
Breno Leitao leitao@debian.org cxl/acpi: Fix a use-after-free in cxl_parse_cfmws()
Rob Clark robdclark@chromium.org drm/msm: Fix hw_fence error path cleanup
Gaosheng Cui cuigaosheng1@huawei.com drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()
Selvin Xavier selvin.xavier@broadcom.com RDMA/bnxt_re: Fix hang during driver unload
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: add helper function __poll_for_resp
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Simplify the function that sends the FW commands
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: use shadow qd while posting non blocking rcfw command
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Avoid the command wait if firmware is inactive
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Enhance the existing functions that wait for FW responses
Kashyap Desai kashyap.desai@broadcom.com RDMA/bnxt_re: Prevent handling any completions after qp destroy
Thomas Bogendoerfer tbogendoerfer@suse.de RDMA/mthca: Fix crash when polling CQ for shared QPs
Shiraz Saleem shiraz.saleem@intel.com RDMA/core: Update CMA destination address on rdma_resolve_addr
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Fix data race on CQP request done
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Fix data race on CQP completion stats
Shiraz Saleem shiraz.saleem@intel.com RDMA/irdma: Add missing read barriers
Rob Clark robdclark@chromium.org drm/msm/adreno: Fix snapshot BINDLESS_DATA size
Marijn Suijten marijn.suijten@somainline.org drm/msm/dsi: Drop unused regulators from QCM2290 14nm DSI PHY config
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/dpu: drop enum dpu_core_perf_data_bus_id
Jonathan Marek jonathan@marek.ca drm/msm/dpu: add missing flush and fetch bits for DMA4/DMA5 planes
Dmitry Baryshkov dmitry.baryshkov@linaro.org drm/msm/mdss: correct UBWC programming for SM8550
Dan Carpenter dan.carpenter@linaro.org RDMA/mlx4: Make check for invalid flags stricter
Christophe JAILLET christophe.jaillet@wanadoo.fr fs/9p: Fix a datatype used with V9FS_DIRECT_IO
Fedor Pchelkin pchelkin@ispras.ru tipc: stop tipc crypto on failure in tipc_node_create
Yuanjun Gong ruc_gongyuanjun@163.com tipc: check return value of pskb_trim()
Yuanjun Gong ruc_gongyuanjun@163.com benet: fix return value check in be_lancer_xmit_workarounds()
Lin Ma linma@zju.edu.cn net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
Wei Fang wei.fang@nxp.com net: fec: tx processing does not call XDP APIs if budget is 0
Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com tools: ynl-gen: fix enum index in _decode_enum(..)
Linus Torvalds torvalds@linux-foundation.org mm: suppress mm fault logging if fatal signal already pending
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
Florian Westphal fw@strlen.de netfilter: nft_set_rbtree: fix overlap expiration walk
Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com igc: Fix Kernel Panic during ndo_tx_timeout callback
Lin Ma linma@zju.edu.cn macvlan: add forgotten nla_policy for IFLA_MACVLAN_BC_CUTOFF
Kirill A. Shutemov kirill.shutemov@linux.intel.com x86/traps: Fix load_unaligned_zeropad() handling for shared TDX memory
Maxim Mikityanskiy maxtram95@gmail.com platform/x86: msi-laptop: Fix rfkill out-of-sync on MSI Wind U100
Vincent Whitchurch vincent.whitchurch@axis.com net: stmmac: Apply redundant write work around on 4.xx too
Suman Ghosh sumang@marvell.com octeontx2-af: Fix hash extraction enable configuration
Hangbin Liu liuhangbin@gmail.com team: reset team's flags when down link is P2P device
Hangbin Liu liuhangbin@gmail.com bonding: reset bond's flags when down link is P2P device
Jedrzej Jagielski jedrzej.jagielski@intel.com ice: Fix memory management in ice_ethtool_fdir.c
Stewart Smith trawets@amazon.com tcp: Reduce chance of collisions in inet6_hashfn().
Wei Fang wei.fang@nxp.com net: fec: avoid tx queue timeout when XDP is enabled
Maciej Żenczykowski maze@google.com ipv6 addrconf: fix bug where deleting a mngtmpaddr can create a new temporary address
Yuanjun Gong ruc_gongyuanjun@163.com ethernet: atheros: fix return value check in atl1e_tso_csum()
Yuanjun Gong ruc_gongyuanjun@163.com atheros: fix return value check in atl1_tso()
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
Jiri Benc jbenc@redhat.com vxlan: fix GRO with VXLAN-GPE
Jiri Benc jbenc@redhat.com vxlan: generalize vxlan_parse_gpe_hdr and remove unused args
Jiri Benc jbenc@redhat.com vxlan: calculate correct header length for GPE
Jijie Shao shaojijie@huawei.com net: hns3: fix wrong bw weight of disabled tc issue
Jijie Shao shaojijie@huawei.com net: hns3: fix wrong tc bandwidth weight data issue
Hao Lan lanhao@huawei.com net: hns3: fix the imp capability bit cannot exceed 32 bits issue
Jiawen Wu jiawenwu@trustnetic.com net: phy: marvell10g: fix 88x3310 power up
Jacob Keller jacob.e.keller@intel.com iavf: check for removal state before IAVF_FLAG_PF_COMMS_FAILED
Jacob Keller jacob.e.keller@intel.com iavf: fix potential deadlock on allocation failure
Wang Ming machel@vivo.com i40e: Fix an NULL vs IS_ERR() bug for debugfs_create_dir()
Arnd Bergmann arnd@arndb.de media: mtk_jpeg_core: avoid unused-variable warning
Randy Dunlap rdunlap@infradead.org media: mtk-jpeg: move data/code inside CONFIG_OF blocks
Nicolas Dufresne nicolas.dufresne@collabora.com media: amphion: Fix firmware path to match linux-firmware
Sakari Ailus sakari.ailus@linux.intel.com media: staging: atomisp: select V4L2_FWNODE
Sakari Ailus sakari.ailus@linux.intel.com media: tc358746: Address compiler warnings
Dan Carpenter dan.carpenter@linaro.org soundwire: amd: Fix a check for errors in probe()
Srinivas Kandagatla srinivas.kandagatla@linaro.org soundwire: qcom: update status correctly with mask
Adrien Thierry athierry@redhat.com phy: qcom-snps-femto-v2: properly enable ref clock
Adrien Thierry athierry@redhat.com phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend
Guillaume Ranquet granquet@baylibre.com phy: mediatek: hdmi: mt8195: fix prediv bad upper limit test
Dan Carpenter dan.carpenter@linaro.org phy: phy-mtk-dp: Fix an error code in probe()
Ojaswin Mujoo ojaswin@linux.ibm.com ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
Ritesh Harjani ritesh.list@gmail.com ext4: mballoc: Remove useless setting of ac_criteria
Kemeng Shi shikemeng@huaweicloud.com ext4: add EXT4_MB_HINT_GOAL_ONLY test in ext4_mb_use_preallocated
Zhang Yi yi.zhang@huawei.com jbd2: fix a race when checking checkpoint buffer busy
Zhang Yi yi.zhang@huawei.com jbd2: remove journal_clean_one_cp_list()
Zhang Yi yi.zhang@huawei.com jbd2: remove t_checkpoint_io_list
Daniel Miess daniel.miess@amd.com drm/amd/display: Prevent vtotal from being set to 0
Daniel Miess daniel.miess@amd.com drm/amd/display: Fix possible underflow for displays with large vblank
Gabe Teeger gabe.teeger@amd.com drm/amd/display: update extended blank for dcn314 onwards
Rodrigo Siqueira Rodrigo.Siqueira@amd.com drm/amd/display: Add FAMS validation before trying to use it
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: fix 32 bit mas_next testing
Liam R. Howlett Liam.Howlett@oracle.com maple_tree: add __init and __exit to test module
Christian König christian.koenig@amd.com drm/ttm: never consider pinned BOs for eviction&swap
Mario Limonciello mario.limonciello@amd.com drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix
Mario Limonciello mario.limonciello@amd.com drm/amd/display: Set minimum requirement for using PSR-SU on Rembrandt
Cruise Hung cruise.hung@amd.com drm/amd/display: Update correct DCN314 register header
Dmytro Laktyushkin dmytro.laktyushkin@amd.com drm/amd/display: fix dcn315 single stream crb allocation
Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com drm/amd/display: add pixel rate based CRB allocation support
Michael Strauss michael.strauss@amd.com drm/amd/display: Keep disable aux-i delay as 0
Michael Strauss michael.strauss@amd.com drm/amd/display: Convert Delaying Aux-I Disable To Monitor Patch
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Don't advertise MSI-X in PCIe capabilities
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Fix window mapping and address translation for endpoint
Rick Wertenbroek rick.wertenbroek@gmail.com PCI: rockchip: Remove writes to unused registers
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI/ASPM: Avoid link retraining race
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI/ASPM: Factor out pcie_wait_for_retrain()
Bjorn Helgaas bhelgaas@google.com PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
Christophe JAILLET christophe.jaillet@wanadoo.fr i2c: nomadik: Remove a useless call in the remove function
Andi Shyti andi.shyti@kernel.org i2c: nomadik: Use devm_clk_get_enabled()
Andi Shyti andi.shyti@kernel.org i2c: nomadik: Remove unnecessary goto label
Markus Elfring elfring@users.sourceforge.net i2c: Improve size determinations
Markus Elfring elfring@users.sourceforge.net i2c: Delete error messages for failed memory allocations
Filipe Manana fdmanana@suse.com btrfs: fix race between quota disable and relocation
Christoph Hellwig hch@lst.de btrfs: fix fsverify read error handling in end_page_read
Christoph Hellwig hch@lst.de btrfs: factor out a btrfs_verify_page helper
Guenter Roeck linux@roeck-us.net regmap: Disable locking for RBTREE and MAPLE unit tests
Bartosz Golaszewski bartosz.golaszewski@linaro.org gpio: mvebu: fix irq domain leak
Uwe Kleine-König u.kleine-koenig@pengutronix.de gpio: mvebu: Make use of devm_pwmchip_add
Hans de Goede hdegoede@redhat.com gpio: tps68470: Make tps68470_gpio_output() always set the initial value
Ondrej Mosnacek omosnace@redhat.com io_uring: don't audit the capability check in io_uring_create()
Sven Schnelle svens@linux.ibm.com s390/mm: fix per vma lock fault handling
Claudio Imbrenda imbrenda@linux.ibm.com KVM: s390: pv: fix index value of replaced ASCE
Claudio Imbrenda imbrenda@linux.ibm.com KVM: s390: pv: simplify shutdown and fix race
Haren Myneni haren@linux.ibm.com powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
Ross Lagerwall ross.lagerwall@citrix.com blk-mq: Fix stall due to recursive flush plug
Sudeep Holla sudeep.holla@arm.com KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
Zhihao Cheng chengzhihao1@huawei.com jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
Heiner Kallweit hkallweit1@gmail.com r8169: revert 2ab19de62d67 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll")
Mario Limonciello mario.limonciello@amd.com drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13
Mario Limonciello mario.limonciello@amd.com drm/amd: Move helper for dynamic speed switch check out of smu13
Shyam Sundar S K Shyam-sundar.S-k@amd.com platform/x86/amd/pmf: reduce verbosity of apmf_get_system_params
Shyam Sundar S K Shyam-sundar.S-k@amd.com platform/x86/amd/pmf: Notify OS power slider update
-------------
Diffstat:
 Documentation/ABI/testing/sysfs-module | 11 +
 Documentation/admin-guide/hw-vuln/spectre.rst | 11 +-
 Documentation/filesystems/tmpfs.rst | 45 +-
 Documentation/process/security-bugs.rst | 37 +-
 Makefile | 4 +-
 arch/arm64/include/asm/virt.h | 1 +
 arch/arm64/kernel/fpsimd.c | 4 +-
 arch/arm64/kvm/arm.c | 9 +-
 arch/arm64/kvm/pkvm.c | 2 +-
 arch/loongarch/Kconfig | 1 +
 arch/loongarch/lib/clear_user.S | 3 +-
 arch/loongarch/lib/copy_user.S | 3 +-
 arch/loongarch/net/bpf_jit.h | 2 +-
 arch/powerpc/platforms/pseries/vas.c | 9 +-
 arch/s390/kvm/pv.c | 8 +-
 arch/s390/mm/fault.c | 2 +
 arch/s390/mm/gmap.c | 1 +
 arch/um/os-Linux/sigio.c | 7 +-
 arch/x86/include/asm/kvm-x86-ops.h | 1 +
 arch/x86/include/asm/kvm_host.h | 3 +-
 arch/x86/kernel/cpu/bugs.c | 15 +-
 arch/x86/kernel/cpu/mce/amd.c | 4 +-
 arch/x86/kernel/traps.c | 18 +-
 arch/x86/kvm/svm/svm.c | 6 +
 arch/x86/kvm/vmx/vmx.c | 41 +-
 arch/x86/kvm/x86.c | 34 +-
 block/blk-core.c | 3 +-
 block/blk-mq.c | 9 +-
 drivers/acpi/arm64/iort.c | 3 -
 drivers/ata/pata_ns87415.c | 2 +-
 drivers/base/power/power.h | 1 +
 drivers/base/power/wakeirq.c | 12 +-
 drivers/base/regmap/regmap-kunit.c | 3 +
 drivers/block/rbd.c | 124 ++-
 drivers/block/ublk_drv.c | 11 +-
 drivers/char/tpm/tpm_tis_core.c | 9 +-
 drivers/cxl/acpi.c | 5 +-
 drivers/dma-buf/dma-fence-unwrap.c | 26 +-
 drivers/dma-buf/dma-fence.c | 7 +-
 drivers/gpio/gpio-mvebu.c | 26 +-
 drivers/gpio/gpio-tps68470.c | 6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 3 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c | 25 +-
 drivers/gpu/drm/amd/display/dc/dc.h | 3 -
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 7 +
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h | 1 +
 drivers/gpu/drm/amd/display/dc/dc_stream.h | 1 +
 drivers/gpu/drm/amd/display/dc/dc_types.h | 1 +
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 4 +-
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c | 7 +-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c | 1 +
 .../drm/amd/display/dc/dcn315/dcn315_resource.c | 106 ++-
 .../gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 23 +-
 .../gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 25 +-
 .../gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 3 +
 .../amd/display/dc/dml/dcn31/display_mode_vba_31.c | 39 +-
 .../display/dc/dml/dcn31/display_rq_dlg_calc_31.c | 3 +-
 .../gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c | 15 +-
 .../dc/dml/dcn314/display_rq_dlg_calc_314.c | 16 +-
 .../drm/amd/display/dc/dml/display_mode_structs.h | 3 +-
 .../gpu/drm/amd/display/dc/dml/display_mode_vba.c | 6 +
 .../link_dp_training_fixed_vs_pe_retimer.c | 23 +-
 drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 2 +
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 2 +-
 drivers/gpu/drm/amd/display/dmub/src/Makefile | 2 +-
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c | 5 +
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h | 2 +
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c | 67 ++
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h | 35 +
 drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 12 +-
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 89 +--
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 21 +-
 drivers/gpu/drm/drm_syncobj.c | 6 +-
 drivers/gpu/drm/i915/display/intel_dpt.c | 4 +-
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 6 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h | 13 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 8 +-
 drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c | 2 -
 drivers/gpu/drm/msm/msm_fence.c | 6 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 16 +-
 drivers/gpu/drm/msm/msm_mdss.c | 19 +-
 drivers/gpu/drm/ttm/ttm_bo.c | 6 +
 drivers/hwmon/aquacomputer_d5next.c | 2 +-
 drivers/hwmon/k10temp.c | 17 +-
 drivers/hwmon/nct7802.c | 2 +-
 drivers/hwmon/pmbus/pmbus_core.c | 20 +-
 drivers/i2c/busses/i2c-ibm_iic.c | 4 +-
 drivers/i2c/busses/i2c-nomadik.c | 42 +-
 drivers/i2c/busses/i2c-sh7760.c | 3 +-
 drivers/i2c/busses/i2c-tiny-usb.c | 4 +-
 drivers/infiniband/core/cma.c | 2 +
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 +
 drivers/infiniband/hw/bnxt_re/qplib_fp.c | 28 +-
 drivers/infiniband/hw/bnxt_re/qplib_fp.h | 1 +
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 355 ++++++---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 26 +
 drivers/infiniband/hw/irdma/ctrl.c | 31 +-
 drivers/infiniband/hw/irdma/defs.h | 46 +-
 drivers/infiniband/hw/irdma/hw.c | 3 +-
 drivers/infiniband/hw/irdma/main.h | 2 +-
 drivers/infiniband/hw/irdma/puda.c | 6 +
 drivers/infiniband/hw/irdma/type.h | 2 +
 drivers/infiniband/hw/irdma/uk.c | 5 +-
 drivers/infiniband/hw/irdma/utils.c | 8 +-
 drivers/infiniband/hw/mlx4/qp.c | 18 +-
 drivers/infiniband/hw/mthca/mthca_qp.c | 2 +-
 drivers/iommu/iommufd/device.c | 12 +-
 drivers/iommu/iommufd/iommufd_private.h | 15 +-
 drivers/iommu/iommufd/main.c | 78 +-
 drivers/iommu/iommufd/pages.c | 2 +-
 drivers/irqchip/irq-bcm6345-l1.c | 14 +-
 drivers/irqchip/irq-gic-v3-its.c | 75 +-
 drivers/md/dm-cache-policy-smq.c | 28 +-
 drivers/md/dm-raid.c | 20 +-
 drivers/md/md.c | 2 +
 drivers/media/i2c/tc358746.c | 4 +-
 drivers/media/platform/amphion/vpu_core.c | 4 +-
 .../media/platform/mediatek/jpeg/mtk_jpeg_core.c | 830 ++++++++++-----------
 .../media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c | 4 +-
 .../media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c | 4 +-
 drivers/net/bonding/bond_main.c | 5 +
 drivers/net/can/usb/gs_usb.c | 2 +
 drivers/net/dsa/qca/qca8k-8xxx.c | 7 +-
 drivers/net/dsa/qca/qca8k-common.c | 19 +-
 drivers/net/ethernet/atheros/atl1e/atl1e_main.c | 7 +-
 drivers/net/ethernet/atheros/atlx/atl1.c | 7 +-
 drivers/net/ethernet/emulex/benet/be_main.c | 3 +-
 drivers/net/ethernet/freescale/fec_main.c | 18 +-
 drivers/net/ethernet/hisilicon/hns3/hnae3.h | 3 +-
 .../hisilicon/hns3/hns3_common/hclge_comm_cmd.c | 21 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 17 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 3 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 3 +-
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 +-
 drivers/net/ethernet/intel/iavf/iavf_main.c | 11 +-
 drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c | 26 +-
 drivers/net/ethernet/intel/igc/igc_main.c | 40 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 .../ethernet/marvell/octeontx2/af/rvu_npc_hash.c | 43 +-
 .../ethernet/marvell/octeontx2/af/rvu_npc_hash.h | 8 +-
 drivers/net/ethernet/realtek/r8169_main.c | 27 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 +-
 drivers/net/ipa/ipa_table.c | 22 +-
 drivers/net/macvlan.c | 1 +
 drivers/net/phy/marvell10g.c | 7 +
 drivers/net/team/team.c | 9 +
 drivers/net/virtio_net.c | 4 +-
 drivers/net/vxlan/vxlan_core.c | 165 ++--
 drivers/pci/controller/pcie-rockchip-ep.c | 156 ++--
 drivers/pci/controller/pcie-rockchip.h | 40 +-
 drivers/pci/pcie/aspm.c | 55 +-
 drivers/phy/hisilicon/phy-hisi-inno-usb2.c | 2 +-
 drivers/phy/mediatek/phy-mtk-dp.c | 2 +-
 drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c | 2 +-
 drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 72 +-
 drivers/platform/x86/amd/pmf/acpi.c | 23 +-
 drivers/platform/x86/amd/pmf/core.c | 9 +-
 drivers/platform/x86/amd/pmf/pmf.h | 16 +
 drivers/platform/x86/amd/pmf/sps.c | 74 +-
 drivers/platform/x86/msi-laptop.c | 8 +-
 drivers/s390/block/dasd_3990_erp.c | 2 +-
 drivers/s390/block/dasd_ioctl.c | 1 +
 drivers/soundwire/amd_manager.c | 4 +-
 drivers/soundwire/bus.c | 8 +-
 drivers/soundwire/qcom.c | 2 +-
 drivers/staging/ks7010/ks_wlan_net.c | 6 +-
 drivers/staging/media/atomisp/Kconfig | 1 +
 drivers/staging/rtl8712/rtl871x_xmit.c | 43 +-
 drivers/staging/rtl8712/xmit_linux.c | 6 +
 drivers/thermal/thermal_of.c | 27 +-
 drivers/tty/n_gsm.c | 4 +-
 drivers/tty/serial/8250/8250_dwlib.c | 6 +-
 drivers/tty/serial/qcom_geni_serial.c | 7 -
 drivers/tty/serial/sh-sci.c | 2 +-
 drivers/tty/serial/sifive.c | 2 +-
 drivers/tty/tty_io.c | 2 +-
 drivers/usb/cdns3/cdns3-gadget.c | 4 +-
 drivers/usb/core/quirks.c | 4 +
 drivers/usb/dwc3/core.c | 20 +-
 drivers/usb/dwc3/core.h | 3 -
 drivers/usb/dwc3/dwc3-pci.c | 6 +-
 drivers/usb/gadget/composite.c | 4 +
 drivers/usb/gadget/legacy/raw_gadget.c | 12 +-
 drivers/usb/gadget/udc/core.c | 1 -
 drivers/usb/gadget/udc/tegra-xudc.c | 8 +-
 drivers/usb/host/ohci-at91.c | 8 +-
 drivers/usb/host/xhci-mtk.c | 1 +
 drivers/usb/host/xhci-pci.c | 4 +-
 drivers/usb/host/xhci-ring.c | 25 +-
 drivers/usb/host/xhci-tegra.c | 8 +-
 drivers/usb/misc/ehset.c | 8 +-
 drivers/usb/serial/option.c | 6 +
 drivers/usb/serial/usb-serial-simple.c | 73 +-
 drivers/usb/typec/class.c | 15 +-
 drivers/xen/grant-table.c | 40 +-
 drivers/xen/xenbus/xenbus_probe.c | 3 +
 fs/9p/fid.h | 6 +-
 fs/9p/v9fs.h | 2 +-
 fs/9p/vfs_dir.c | 5 +-
 fs/9p/vfs_file.c | 5 +-
 fs/btrfs/block-rsv.c | 5 +
 fs/btrfs/disk-io.c | 7 +-
 fs/btrfs/extent_io.c | 21 +-
 fs/btrfs/qgroup.c | 18 +-
 fs/btrfs/transaction.c | 10 +-
 fs/btrfs/zoned.c | 3 +
 fs/ceph/metric.c | 2 +-
 fs/ext4/mballoc.c | 200 ++++-
 fs/file.c | 6 +-
 fs/jbd2/checkpoint.c | 197 ++---
 fs/jbd2/commit.c | 3 +-
 fs/jbd2/transaction.c | 17 +-
 fs/nfsd/nfs4state.c | 2 -
 fs/proc/vmcore.c | 2 +-
 fs/smb/client/sess.c | 4 +-
 fs/smb/server/ksmbd_netlink.h | 3 +-
 fs/smb/server/smb2pdu.c | 27 +-
 fs/smb/server/vfs.c | 58 +-
 fs/smb/server/vfs.h | 4 +-
 include/linux/dma-fence.h | 2 +-
 include/linux/jbd2.h | 7 +-
 include/linux/mm.h | 29 +-
 include/linux/mm_types.h | 28 +
 include/linux/mmap_lock.h | 10 +-
 include/net/ipv6.h | 8 +-
 include/net/vxlan.h | 13 +-
 include/trace/events/jbd2.h | 12 +-
 include/uapi/linux/blkzoned.h | 10 +-
 io_uring/io_uring.c | 25 +-
 kernel/locking/rtmutex.c | 172 +++--
 kernel/locking/rtmutex_api.c | 2 +-
 kernel/locking/rtmutex_common.h | 47 +-
 kernel/locking/ww_mutex.h | 12 +-
 kernel/signal.c | 4 +
 kernel/trace/ring_buffer.c | 22 +-
 kernel/trace/trace_events.c | 14 +-
 lib/test_maple_tree.c | 163 ++--
 mm/memory-failure.c | 2 +-
 mm/mempolicy.c | 15 +-
 mm/mmap.c | 1 +
 net/ceph/messenger.c | 1 +
 net/ipv6/addrconf.c | 14 +-
 net/mptcp/protocol.c | 3 +-
 net/netfilter/nf_tables_api.c | 5 +-
 net/netfilter/nft_immediate.c | 27 +-
 net/netfilter/nft_set_rbtree.c | 20 +-
 net/sched/sch_mqprio.c | 14 +
 net/tipc/crypto.c | 3 +-
 net/tipc/node.c | 2 +-
 sound/pci/hda/patch_realtek.c | 2 +
 sound/soc/codecs/wm8904.c | 3 +
 sound/soc/fsl/fsl_spdif.c | 2 +
 tools/net/ynl/lib/ynl.py | 4 +-
 tools/testing/radix-tree/linux/init.h | 1 +
 tools/testing/radix-tree/maple.c | 143 ++--
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 +-
 tools/testing/selftests/rseq/rseq.c | 28 +-
 virt/kvm/kvm_main.c | 24 +
 264 files changed, 3622 insertions(+), 2136 deletions(-)
From: Shyam Sundar S K Shyam-sundar.S-k@amd.com
commit 33c9ab5b493a0e922b06c12fed4fdcb862212cda upstream.
APMF fn8 can notify EC about the OS slider position change. Add this capability to the PMF driver so that it can call the APMF fn8 based on the changes in the Platform profile events.
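The mapping from power source and platform profile to the fn8 slider event can be sketched as a small lookup table. This is a minimal standalone sketch, not the driver code: the bit positions are copied from the pmf.h hunk of this patch, while the demo_* enums and demo_slider_event() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the pmf.h hunk of this patch. */
#define DC_BEST_PERF      0
#define DC_BETTER_PERF    1
#define DC_BATTERY_SAVER  3
#define AC_BEST_PERF      4
#define AC_BETTER_PERF    5
#define AC_BETTER_BATTERY 6

/* The event is passed as a u8, so every bit position must be below 8. */
static_assert(AC_BETTER_BATTERY < 8, "slider event must fit in a u8");

/* Hypothetical stand-ins for the driver's power source and mode enums. */
enum demo_power_source { DEMO_SRC_AC, DEMO_SRC_DC };
enum demo_power_mode {
	DEMO_MODE_PERFORMANCE,
	DEMO_MODE_BALANCED,
	DEMO_MODE_POWER_SAVER,
};

/*
 * Return the one-hot slider event for (src, mode), or -1 when the
 * combination is not covered by the table.
 */
static int demo_slider_event(enum demo_power_source src,
			     enum demo_power_mode mode)
{
	static const int bit[2][3] = {
		[DEMO_SRC_AC] = { AC_BEST_PERF, AC_BETTER_PERF, AC_BETTER_BATTERY },
		[DEMO_SRC_DC] = { DC_BEST_PERF, DC_BETTER_PERF, DC_BATTERY_SAVER },
	};

	if ((unsigned)src > DEMO_SRC_DC ||
	    (unsigned)mode > DEMO_MODE_POWER_SAVER)
		return -1;

	return 1 << bit[src][mode];
}
```

A table keeps the AC and DC cases side by side; the patch itself uses two switch statements, which reads more naturally next to the existing driver code.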
Co-developed-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Patil Rajesh Reddy Patil.Reddy@amd.com
Signed-off-by: Shyam Sundar S K Shyam-sundar.S-k@amd.com
Link: https://lore.kernel.org/r/20230714144435.1239776-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/platform/x86/amd/pmf/acpi.c | 21 ++++++++++
 drivers/platform/x86/amd/pmf/core.c |  9 +++-
 drivers/platform/x86/amd/pmf/pmf.h  | 16 +++++++
 drivers/platform/x86/amd/pmf/sps.c  | 74 ++++++++++++++++++++++++++++++++++--
 4 files changed, 114 insertions(+), 6 deletions(-)
--- a/drivers/platform/x86/amd/pmf/acpi.c
+++ b/drivers/platform/x86/amd/pmf/acpi.c
@@ -106,6 +106,27 @@ int apmf_get_static_slider_granular(stru
 					 data, sizeof(*data));
 }
 
+int apmf_os_power_slider_update(struct amd_pmf_dev *pdev, u8 event)
+{
+	struct os_power_slider args;
+	struct acpi_buffer params;
+	union acpi_object *info;
+	int err = 0;
+
+	args.size = sizeof(args);
+	args.slider_event = event;
+
+	params.length = sizeof(args);
+	params.pointer = (void *)&args;
+
+	info = apmf_if_call(pdev, APMF_FUNC_OS_POWER_SLIDER_UPDATE, &params);
+	if (!info)
+		err = -EIO;
+
+	kfree(info);
+	return err;
+}
+
 static void apmf_sbios_heartbeat_notify(struct work_struct *work)
 {
 	struct amd_pmf_dev *dev = container_of(work, struct amd_pmf_dev, heart_beat.work);
--- a/drivers/platform/x86/amd/pmf/core.c
+++ b/drivers/platform/x86/amd/pmf/core.c
@@ -71,7 +71,11 @@ static int amd_pmf_pwr_src_notify_call(s
 		return NOTIFY_DONE;
 	}
- amd_pmf_set_sps_power_limits(pmf); + if (is_apmf_func_supported(pmf, APMF_FUNC_STATIC_SLIDER_GRANULAR)) + amd_pmf_set_sps_power_limits(pmf); + + if (is_apmf_func_supported(pmf, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) + amd_pmf_power_slider_update_event(pmf);
return NOTIFY_OK; } @@ -295,7 +299,8 @@ static void amd_pmf_init_features(struct int ret;
/* Enable Static Slider */ - if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR)) { + if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR) || + is_apmf_func_supported(dev, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) { amd_pmf_init_sps(dev); dev->pwr_src_notifier.notifier_call = amd_pmf_pwr_src_notify_call; power_supply_reg_notifier(&dev->pwr_src_notifier); --- a/drivers/platform/x86/amd/pmf/pmf.h +++ b/drivers/platform/x86/amd/pmf/pmf.h @@ -21,6 +21,7 @@ #define APMF_FUNC_SBIOS_HEARTBEAT 4 #define APMF_FUNC_AUTO_MODE 5 #define APMF_FUNC_SET_FAN_IDX 7 +#define APMF_FUNC_OS_POWER_SLIDER_UPDATE 8 #define APMF_FUNC_STATIC_SLIDER_GRANULAR 9 #define APMF_FUNC_DYN_SLIDER_AC 11 #define APMF_FUNC_DYN_SLIDER_DC 12 @@ -44,6 +45,14 @@ #define GET_STT_LIMIT_APU 0x20 #define GET_STT_LIMIT_HS2 0x21
+/* OS slider update notification */ +#define DC_BEST_PERF 0 +#define DC_BETTER_PERF 1 +#define DC_BATTERY_SAVER 3 +#define AC_BEST_PERF 4 +#define AC_BETTER_PERF 5 +#define AC_BETTER_BATTERY 6 + /* Fan Index for Auto Mode */ #define FAN_INDEX_AUTO 0xFFFFFFFF
@@ -193,6 +202,11 @@ struct amd_pmf_static_slider_granular { struct apmf_sps_prop_granular prop[POWER_SOURCE_MAX][POWER_MODE_MAX]; };
+struct os_power_slider { + u16 size; + u8 slider_event; +} __packed; + struct fan_table_control { bool manual; unsigned long fan_id; @@ -383,6 +397,7 @@ int amd_pmf_send_cmd(struct amd_pmf_dev int amd_pmf_init_metrics_table(struct amd_pmf_dev *dev); int amd_pmf_get_power_source(void); int apmf_install_handler(struct amd_pmf_dev *pmf_dev); +int apmf_os_power_slider_update(struct amd_pmf_dev *dev, u8 flag);
/* SPS Layer */ int amd_pmf_get_pprof_modes(struct amd_pmf_dev *pmf); @@ -393,6 +408,7 @@ void amd_pmf_deinit_sps(struct amd_pmf_d int apmf_get_static_slider_granular(struct amd_pmf_dev *pdev, struct apmf_static_slider_granular_output *output); bool is_pprof_balanced(struct amd_pmf_dev *pmf); +int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev);
int apmf_update_fan_idx(struct amd_pmf_dev *pdev, bool manual, u32 idx); --- a/drivers/platform/x86/amd/pmf/sps.c +++ b/drivers/platform/x86/amd/pmf/sps.c @@ -119,14 +119,77 @@ int amd_pmf_get_pprof_modes(struct amd_p return mode; }
+int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev) +{ + u8 mode, flag = 0; + int src; + + mode = amd_pmf_get_pprof_modes(dev); + if (mode < 0) + return mode; + + src = amd_pmf_get_power_source(); + + if (src == POWER_SOURCE_AC) { + switch (mode) { + case POWER_MODE_PERFORMANCE: + flag |= BIT(AC_BEST_PERF); + break; + case POWER_MODE_BALANCED_POWER: + flag |= BIT(AC_BETTER_PERF); + break; + case POWER_MODE_POWER_SAVER: + flag |= BIT(AC_BETTER_BATTERY); + break; + default: + dev_err(dev->dev, "unsupported platform profile\n"); + return -EOPNOTSUPP; + } + + } else if (src == POWER_SOURCE_DC) { + switch (mode) { + case POWER_MODE_PERFORMANCE: + flag |= BIT(DC_BEST_PERF); + break; + case POWER_MODE_BALANCED_POWER: + flag |= BIT(DC_BETTER_PERF); + break; + case POWER_MODE_POWER_SAVER: + flag |= BIT(DC_BATTERY_SAVER); + break; + default: + dev_err(dev->dev, "unsupported platform profile\n"); + return -EOPNOTSUPP; + } + } + + apmf_os_power_slider_update(dev, flag); + + return 0; +} + static int amd_pmf_profile_set(struct platform_profile_handler *pprof, enum platform_profile_option profile) { struct amd_pmf_dev *pmf = container_of(pprof, struct amd_pmf_dev, pprof); + int ret = 0;
pmf->current_profile = profile;
- return amd_pmf_set_sps_power_limits(pmf); + /* Notify EC about the slider position change */ + if (is_apmf_func_supported(pmf, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) { + ret = amd_pmf_power_slider_update_event(pmf); + if (ret) + return ret; + } + + if (is_apmf_func_supported(pmf, APMF_FUNC_STATIC_SLIDER_GRANULAR)) { + ret = amd_pmf_set_sps_power_limits(pmf); + if (ret) + return ret; + } + + return 0; }
int amd_pmf_init_sps(struct amd_pmf_dev *dev) @@ -134,10 +197,13 @@ int amd_pmf_init_sps(struct amd_pmf_dev int err;
dev->current_profile = PLATFORM_PROFILE_BALANCED; - amd_pmf_load_defaults_sps(dev);
- /* update SPS balanced power mode thermals */ - amd_pmf_set_sps_power_limits(dev); + if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR)) { + amd_pmf_load_defaults_sps(dev); + + /* update SPS balanced power mode thermals */ + amd_pmf_set_sps_power_limits(dev); + }
dev->pprof.profile_get = amd_pmf_profile_get; dev->pprof.profile_set = amd_pmf_profile_set;
From: Shyam Sundar S K Shyam-sundar.S-k@amd.com
commit 839e90e75e695b3d9ee17f5a2811e7ee5aea8d4a upstream.
apmf_get_system_params() failure is not a critical event, reduce its verbosity from dev_err to dev_dbg.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Shyam Sundar S K Shyam-sundar.S-k@amd.com
Link: https://lore.kernel.org/r/20230714144435.1239776-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/platform/x86/amd/pmf/acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/platform/x86/amd/pmf/acpi.c
+++ b/drivers/platform/x86/amd/pmf/acpi.c
@@ -310,7 +310,7 @@ int apmf_acpi_init(struct amd_pmf_dev *p
 
 	ret = apmf_get_system_params(pmf_dev);
 	if (ret) {
-		dev_err(pmf_dev->dev, "APMF apmf_get_system_params failed :%d\n", ret);
+		dev_dbg(pmf_dev->dev, "APMF apmf_get_system_params failed :%d\n", ret);
 		goto out;
 	}
From: Mario Limonciello mario.limonciello@amd.com
commit 188623076d0f1a500583d392b6187056bf7cc71a upstream.
This helper checks whether the connected host supports the feature. Move it into generic code so that other smu implementations can use it as well.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Reviewed-by: Evan Quan evan.quan@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Cc: stable@vger.kernel.org # 6.1.x
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h            |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c     | 19 +++++++++++++++++++
 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 21 +--------------------
 3 files changed, 21 insertions(+), 20 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1246,6 +1246,7 @@ int amdgpu_device_gpu_recover(struct amd void amdgpu_device_pci_config_reset(struct amdgpu_device *adev); int amdgpu_device_pci_reset(struct amdgpu_device *adev); bool amdgpu_device_need_post(struct amdgpu_device *adev); +bool amdgpu_device_pcie_dynamic_switching_supported(void); bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev); bool amdgpu_device_aspm_support_quirk(void);
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1352,6 +1352,25 @@ bool amdgpu_device_need_post(struct amdg return true; }
+/* + * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic + * speed switching. Until we have confirmation from Intel that a specific host + * supports it, it's safer that we keep it disabled for all. + * + * https://edc.intel.com/content/www/us/en/design/products/platforms/details/ra... + * https://gitlab.freedesktop.org/drm/amd/-/issues/2663 + */ +bool amdgpu_device_pcie_dynamic_switching_supported(void) +{ +#if IS_ENABLED(CONFIG_X86) + struct cpuinfo_x86 *c = &cpu_data(0); + + if (c->x86_vendor == X86_VENDOR_INTEL) + return false; +#endif + return true; +} + /** * amdgpu_device_should_use_aspm - check if the device should program ASPM * --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c @@ -2454,25 +2454,6 @@ int smu_v13_0_mode1_reset(struct smu_con return ret; }
-/* - * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic - * speed switching. Until we have confirmation from Intel that a specific host - * supports it, it's safer that we keep it disabled for all. - * - * https://edc.intel.com/content/www/us/en/design/products/platforms/details/ra... - * https://gitlab.freedesktop.org/drm/amd/-/issues/2663 - */ -static bool smu_v13_0_is_pcie_dynamic_switching_supported(void) -{ -#if IS_ENABLED(CONFIG_X86) - struct cpuinfo_x86 *c = &cpu_data(0); - - if (c->x86_vendor == X86_VENDOR_INTEL) - return false; -#endif - return true; -} - int smu_v13_0_update_pcie_parameters(struct smu_context *smu, uint32_t pcie_gen_cap, uint32_t pcie_width_cap) @@ -2484,7 +2465,7 @@ int smu_v13_0_update_pcie_parameters(str uint32_t smu_pcie_arg; int ret, i;
- if (!smu_v13_0_is_pcie_dynamic_switching_supported()) { + if (!amdgpu_device_pcie_dynamic_switching_supported()) { if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap) pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
From: Mario Limonciello mario.limonciello@amd.com
commit e701156ccc6c7a5f104a968dda74cd6434178712 upstream.
SMU13 overrides dynamic PCIe lane width and dynamic speed when on certain hosts. commit 38e4ced80479 ("drm/amd/pm: conditionally disable pcie lane switching for some sienna_cichlid SKUs") worked around this issue by setting up certain SKUs to set up certain limits, but the same fundamental problem with those hosts affects all SMU11 implementations as well, so align the SMU11 and SMU13 driver handling.
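The aligned handling amounts to two cases: pin every link level to one setting when the host cannot switch dynamically, or clamp each level to the caps otherwise. The following is a standalone sketch of that logic under simplifying assumptions: struct demo_pcie_table, DEMO_LEVELS and demo_update_pcie() are hypothetical stand-ins, and the real driver works on struct smu_11_0_pcie_table with NUM_LINK_LEVELS entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LEVELS 2

/* Hypothetical, simplified stand-in for the driver's PCIe DPM table. */
struct demo_pcie_table {
	uint8_t gen[DEMO_LEVELS];   /* link speed per level */
	uint8_t lane[DEMO_LEVELS];  /* lane-width code per level */
};

static void demo_update_pcie(struct demo_pcie_table *t, bool dynamic_ok,
			     uint8_t gen_cap, uint8_t width_cap)
{
	size_t i;

	assert(t != NULL);

	if (!dynamic_ok) {
		/* Never exceed what the highest level already offers. */
		if (t->gen[DEMO_LEVELS - 1] < gen_cap)
			gen_cap = t->gen[DEMO_LEVELS - 1];
		if (t->lane[DEMO_LEVELS - 1] < width_cap)
			width_cap = t->lane[DEMO_LEVELS - 1];

		/* Force all levels to use the same settings. */
		for (i = 0; i < DEMO_LEVELS; i++) {
			t->gen[i] = gen_cap;
			t->lane[i] = width_cap;
		}
	} else {
		/* Only clamp levels that exceed the caps. */
		for (i = 0; i < DEMO_LEVELS; i++) {
			if (t->gen[i] > gen_cap)
				t->gen[i] = gen_cap;
			if (t->lane[i] > width_cap)
				t->lane[i] = width_cap;
		}
	}
}
```

With the levels pinned to identical values, the SMU has nothing to switch between, which is how the patch avoids dynamic switching on the affected hosts.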
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Reviewed-by: Evan Quan evan.quan@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Cc: stable@vger.kernel.org # 6.1.x
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 93 +++-------------
 1 file changed, 20 insertions(+), 73 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -2081,89 +2081,36 @@ static int sienna_cichlid_display_disabl return ret; }
-static void sienna_cichlid_get_override_pcie_settings(struct smu_context *smu, - uint32_t *gen_speed_override, - uint32_t *lane_width_override) -{ - struct amdgpu_device *adev = smu->adev; - - *gen_speed_override = 0xff; - *lane_width_override = 0xff; - - switch (adev->pdev->device) { - case 0x73A0: - case 0x73A1: - case 0x73A2: - case 0x73A3: - case 0x73AB: - case 0x73AE: - /* Bit 7:0: PCIE lane width, 1 to 7 corresponds is x1 to x32 */ - *lane_width_override = 6; - break; - case 0x73E0: - case 0x73E1: - case 0x73E3: - *lane_width_override = 4; - break; - case 0x7420: - case 0x7421: - case 0x7422: - case 0x7423: - case 0x7424: - *lane_width_override = 3; - break; - default: - break; - } -} - -#define MAX(a, b) ((a) > (b) ? (a) : (b)) - static int sienna_cichlid_update_pcie_parameters(struct smu_context *smu, uint32_t pcie_gen_cap, uint32_t pcie_width_cap) { struct smu_11_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context; struct smu_11_0_pcie_table *pcie_table = &dpm_context->dpm_tables.pcie_table; - uint32_t gen_speed_override, lane_width_override; - uint8_t *table_member1, *table_member2; - uint32_t min_gen_speed, max_gen_speed; - uint32_t min_lane_width, max_lane_width; - uint32_t smu_pcie_arg; + u32 smu_pcie_arg; int ret, i;
- GET_PPTABLE_MEMBER(PcieGenSpeed, &table_member1); - GET_PPTABLE_MEMBER(PcieLaneCount, &table_member2); - - sienna_cichlid_get_override_pcie_settings(smu, - &gen_speed_override, - &lane_width_override); - - /* PCIE gen speed override */ - if (gen_speed_override != 0xff) { - min_gen_speed = MIN(pcie_gen_cap, gen_speed_override); - max_gen_speed = MIN(pcie_gen_cap, gen_speed_override); - } else { - min_gen_speed = MAX(0, table_member1[0]); - max_gen_speed = MIN(pcie_gen_cap, table_member1[1]); - min_gen_speed = min_gen_speed > max_gen_speed ? - max_gen_speed : min_gen_speed; - } - pcie_table->pcie_gen[0] = min_gen_speed; - pcie_table->pcie_gen[1] = max_gen_speed; - - /* PCIE lane width override */ - if (lane_width_override != 0xff) { - min_lane_width = MIN(pcie_width_cap, lane_width_override); - max_lane_width = MIN(pcie_width_cap, lane_width_override); + /* PCIE gen speed and lane width override */ + if (!amdgpu_device_pcie_dynamic_switching_supported()) { + if (pcie_table->pcie_gen[NUM_LINK_LEVELS - 1] < pcie_gen_cap) + pcie_gen_cap = pcie_table->pcie_gen[NUM_LINK_LEVELS - 1]; + + if (pcie_table->pcie_lane[NUM_LINK_LEVELS - 1] < pcie_width_cap) + pcie_width_cap = pcie_table->pcie_lane[NUM_LINK_LEVELS - 1]; + + /* Force all levels to use the same settings */ + for (i = 0; i < NUM_LINK_LEVELS; i++) { + pcie_table->pcie_gen[i] = pcie_gen_cap; + pcie_table->pcie_lane[i] = pcie_width_cap; + } } else { - min_lane_width = MAX(1, table_member2[0]); - max_lane_width = MIN(pcie_width_cap, table_member2[1]); - min_lane_width = min_lane_width > max_lane_width ? - max_lane_width : min_lane_width; + for (i = 0; i < NUM_LINK_LEVELS; i++) { + if (pcie_table->pcie_gen[i] > pcie_gen_cap) + pcie_table->pcie_gen[i] = pcie_gen_cap; + if (pcie_table->pcie_lane[i] > pcie_width_cap) + pcie_table->pcie_lane[i] = pcie_width_cap; + } } - pcie_table->pcie_lane[0] = min_lane_width; - pcie_table->pcie_lane[1] = max_lane_width;
for (i = 0; i < NUM_LINK_LEVELS; i++) { smu_pcie_arg = (i << 16 |
From: Heiner Kallweit hkallweit1@gmail.com
commit cf2ffdea0839398cb0551762af7f5efb0a6e0fea upstream.
There have been reports that on a number of systems this change breaks network connectivity. Therefore effectively revert it. Mainly affected seem to be systems where BIOS denies ASPM access to OS. Due to later changes we can't do a direct revert.
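The restored behaviour can be summarised as: decide which link state to ask the PCI core to disable, remember whether that request succeeded, and only enable ASPM in the chip when it did. Below is a standalone sketch of that decision flow. All demo_* names are hypothetical, and the numeric version thresholds are illustrative stand-ins for RTL_GIGA_MAC_VER_46 and RTL_GIGA_MAC_VER_61, whose real enum values are not shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical numeric stand-ins for RTL_GIGA_MAC_VER_46 and _61. */
#define DEMO_MAC_VER_46 46
#define DEMO_MAC_VER_61 61

enum demo_link_state { DEMO_DISABLE_NONE, DEMO_DISABLE_L1_2, DEMO_DISABLE_L1 };

struct demo_nic {
	int mac_version;
	bool vendor_tested_l1_2;  /* models the 0xc0b2 register check */
	bool aspm_manageable;
};

/* Which link state the probe path asks the PCI core to disable. */
static enum demo_link_state demo_state_to_disable(const struct demo_nic *nic)
{
	assert(nic != NULL);

	if (nic->mac_version >= DEMO_MAC_VER_61 && nic->vendor_tested_l1_2)
		return DEMO_DISABLE_NONE;
	if (nic->mac_version >= DEMO_MAC_VER_46)
		return DEMO_DISABLE_L1_2;
	return DEMO_DISABLE_L1;
}

/* rc is the result of the disable attempt; 0 means the OS controls ASPM. */
static void demo_record_disable_result(struct demo_nic *nic, int rc)
{
	nic->aspm_manageable = (rc == 0);
}

/* ASPM is switched on in the chip only if requested and manageable. */
static bool demo_chip_aspm_on(const struct demo_nic *nic, bool enable)
{
	return enable && nic->aspm_manageable;
}
```

The key point is the last function: when the BIOS denies ASPM control to the OS, the disable request fails and the chip-side ASPM enable is skipped entirely.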
Fixes: 2ab19de62d67 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/netdev/e47bac0d-e802-65e1-b311-6acb26d5cf10@freenet....
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217596
Signed-off-by: Heiner Kallweit hkallweit1@gmail.com
Link: https://lore.kernel.org/r/57f13ec0-b216-d5d8-363d-5b05528ec5fb@gmail.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/realtek/r8169_main.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -623,6 +623,7 @@ struct rtl8169_private { int cfg9346_usage_count;
unsigned supports_gmii:1; + unsigned aspm_manageable:1; dma_addr_t counters_phys_addr; struct rtl8169_counters *counters; struct rtl8169_tc_offsets tc_offset; @@ -2746,7 +2747,8 @@ static void rtl_hw_aspm_clkreq_enable(st if (tp->mac_version < RTL_GIGA_MAC_VER_32) return;
- if (enable) { + /* Don't enable ASPM in the chip if OS can't control ASPM */ + if (enable && tp->aspm_manageable) { /* On these chip versions ASPM can even harm * bus communication of other PCI devices. */ @@ -5156,6 +5158,16 @@ done: rtl_rar_set(tp, mac_addr); }
+/* register is set if system vendor successfully tested ASPM 1.2 */ +static bool rtl_aspm_is_safe(struct rtl8169_private *tp) +{ + if (tp->mac_version >= RTL_GIGA_MAC_VER_61 && + r8168_mac_ocp_read(tp, 0xc0b2) & 0xf) + return true; + + return false; +} + static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) { struct rtl8169_private *tp; @@ -5227,6 +5239,19 @@ static int rtl_init_one(struct pci_dev *
tp->mac_version = chipset;
+ /* Disable ASPM L1 as that cause random device stop working + * problems as well as full system hangs for some PCIe devices users. + * Chips from RTL8168h partially have issues with L1.2, but seem + * to work fine with L1 and L1.1. + */ + if (rtl_aspm_is_safe(tp)) + rc = 0; + else if (tp->mac_version >= RTL_GIGA_MAC_VER_46) + rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2); + else + rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1); + tp->aspm_manageable = !rc; + tp->dash_type = rtl_check_dash(tp);
tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
From: Zhihao Cheng chengzhihao1@huawei.com
[ Upstream commit e34c8dd238d0c9368b746480f313055f5bab5040 ]
The following sequence can corrupt the ext4 image (a reproducer is available at [Link]):

jbd2_journal_commit_transaction
// there are several dirty buffer heads in transaction->t_checkpoint_list
          P1                   wb_workfn
jbd2_log_do_checkpoint
 if (buffer_locked(bh)) // false
                            __block_write_full_page
                             trylock_buffer(bh)
                             test_clear_buffer_dirty(bh)
 if (!buffer_dirty(bh))
  __jbd2_journal_remove_checkpoint(jh)
   if (buffer_write_io_error(bh)) // false
                             >> bh IO error occurs <<
 jbd2_cleanup_journal_tail
  __jbd2_update_log_tail
   jbd2_write_superblock
   // The bh won't be replayed in next mount.
Since the writeback process clears the buffer's dirty bit only after locking the buffer head, we can fix this by try-locking the buffer and checking dirtiness while the buffer is locked. The buffer head can be removed if it is neither dirty nor locked.
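The idea of looking at the dirty state only while holding the buffer lock can be sketched outside the kernel with a C11 atomic flag standing in for the buffer lock bit. This is an illustrative model only: struct demo_buffer, enum demo_action and demo_checkpoint_one() are hypothetical names, and the real code uses trylock_buffer(), buffer_dirty() and unlock_buffer() on a struct buffer_head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a buffer head: a lock bit plus a dirty bit. */
struct demo_buffer {
	atomic_flag locked;
	bool dirty;
};

enum demo_action { DEMO_WAIT_AND_RETRY, DEMO_REMOVE, DEMO_WRITE_OUT };

/*
 * Decide what the checkpoint loop should do with one buffer.
 * The dirty bit is sampled only after the trylock succeeded, so a
 * writeback that has already cleared it must also have finished.
 */
static enum demo_action demo_checkpoint_one(struct demo_buffer *b)
{
	bool dirty;

	assert(b != NULL);

	/* test_and_set returns the old value: true means someone holds it. */
	if (atomic_flag_test_and_set(&b->locked))
		return DEMO_WAIT_AND_RETRY;

	dirty = b->dirty;
	atomic_flag_clear(&b->locked);

	return dirty ? DEMO_WRITE_OUT : DEMO_REMOVE;
}
```

In this model a clean buffer is reported as removable only when nobody else holds the lock, which mirrors the "neither dirty nor locked" condition in the commit message.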
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd")
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com
Signed-off-by: Zhang Yi yi.zhang@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230606135928.434610-5-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/checkpoint.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 25e3c20eb19f6..c4e0da6db7195 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -221,20 +221,6 @@ int jbd2_log_do_checkpoint(journal_t *journal) jh = transaction->t_checkpoint_list; bh = jh2bh(jh);
- /* - * The buffer may be writing back, or flushing out in the - * last couple of cycles, or re-adding into a new transaction, - * need to check it again until it's unlocked. - */ - if (buffer_locked(bh)) { - get_bh(bh); - spin_unlock(&journal->j_list_lock); - wait_on_buffer(bh); - /* the journal_head may have gone by now */ - BUFFER_TRACE(bh, "brelse"); - __brelse(bh); - goto retry; - } if (jh->b_transaction != NULL) { transaction_t *t = jh->b_transaction; tid_t tid = t->t_tid; @@ -269,7 +255,22 @@ int jbd2_log_do_checkpoint(journal_t *journal) spin_lock(&journal->j_list_lock); goto restart; } - if (!buffer_dirty(bh)) { + if (!trylock_buffer(bh)) { + /* + * The buffer is locked, it may be writing back, or + * flushing out in the last couple of cycles, or + * re-adding into a new transaction, need to check + * it again until it's unlocked. + */ + get_bh(bh); + spin_unlock(&journal->j_list_lock); + wait_on_buffer(bh); + /* the journal_head may have gone by now */ + BUFFER_TRACE(bh, "brelse"); + __brelse(bh); + goto retry; + } else if (!buffer_dirty(bh)) { + unlock_buffer(bh); BUFFER_TRACE(bh, "remove from checkpoint"); /* * If the transaction was released or the checkpoint @@ -279,6 +280,7 @@ int jbd2_log_do_checkpoint(journal_t *journal) !transaction->t_checkpoint_list) goto out; } else { + unlock_buffer(bh); /* * We are about to write the buffer, it could be * raced by some other transaction shrink or buffer
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit fa729bc7c9c8c17a2481358c841ef8ca920485d3 ]
Currently there is no synchronisation between finalize_pkvm() and kvm_arm_init() initcalls. The finalize_pkvm() proceeds happily even if kvm_arm_init() fails resulting in the following warning on all the CPUs and eventually a HYP panic:
| kvm [1]: IPA Size Limit: 48 bits
| kvm [1]: Failed to init hyp memory protection
| kvm [1]: error initializing Hyp mode: -22
|
| <snip>
|
| WARNING: CPU: 0 PID: 0 at arch/arm64/kvm/pkvm.c:226 _kvm_host_prot_finalize+0x30/0x50
| Modules linked in:
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0 #237
| Hardware name: FVP Base RevC (DT)
| pstate: 634020c5 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
| pc : _kvm_host_prot_finalize+0x30/0x50
| lr : __flush_smp_call_function_queue+0xd8/0x230
|
| Call trace:
|  _kvm_host_prot_finalize+0x3c/0x50
|  on_each_cpu_cond_mask+0x3c/0x6c
|  pkvm_drop_host_privileges+0x4c/0x78
|  finalize_pkvm+0x3c/0x5c
|  do_one_initcall+0xcc/0x240
|  do_initcall_level+0x8c/0xac
|  do_initcalls+0x54/0x94
|  do_basic_setup+0x1c/0x28
|  kernel_init_freeable+0x100/0x16c
|  kernel_init+0x20/0x1a0
|  ret_from_fork+0x10/0x20
| Failed to finalize Hyp protection: -22
| dtb=fvp-base-revc.dtb
| kvm [95]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/mem_protect.c:540!
| kvm [95]: nVHE call trace:
| kvm [95]: [<ffff800081052984>] __kvm_nvhe_hyp_panic+0xac/0xf8
| kvm [95]: [<ffff800081059644>] __kvm_nvhe_handle_host_mem_abort+0x1a0/0x2ac
| kvm [95]: [<ffff80008105511c>] __kvm_nvhe_handle_trap+0x4c/0x160
| kvm [95]: [<ffff8000810540fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4
| kvm [95]: ---[ end nVHE call trace ]---
| kvm [95]: Hyp Offset: 0xfffe8db00ffa0000
| Kernel panic - not syncing: HYP panic:
| PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800
| FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000
| VCPU:0000000000000000
| CPU: 3 PID: 95 Comm: kworker/u16:2 Tainted: G W 6.4.0 #237
| Hardware name: FVP Base RevC (DT)
| Workqueue: rpciod rpc_async_schedule
| Call trace:
|  dump_backtrace+0xec/0x108
|  show_stack+0x18/0x2c
|  dump_stack_lvl+0x50/0x68
|  dump_stack+0x18/0x24
|  panic+0x138/0x33c
|  nvhe_hyp_panic_handler+0x100/0x184
|  new_slab+0x23c/0x54c
|  ___slab_alloc+0x3e4/0x770
|  kmem_cache_alloc_node+0x1f0/0x278
|  __alloc_skb+0xdc/0x294
|  tcp_stream_alloc_skb+0x2c/0xf0
|  tcp_sendmsg_locked+0x3d0/0xda4
|  tcp_sendmsg+0x38/0x5c
|  inet_sendmsg+0x44/0x60
|  sock_sendmsg+0x1c/0x34
|  xprt_sock_sendmsg+0xdc/0x274
|  xs_tcp_send_request+0x1ac/0x28c
|  xprt_transmit+0xcc/0x300
|  call_transmit+0x78/0x90
|  __rpc_execute+0x114/0x3d8
|  rpc_async_schedule+0x28/0x48
|  process_one_work+0x1d8/0x314
|  worker_thread+0x248/0x474
|  kthread+0xfc/0x184
|  ret_from_fork+0x10/0x20
| SMP: stopping secondary CPUs
| Kernel Offset: 0x57c5cb460000 from 0xffff800080000000
| PHYS_OFFSET: 0x80000000
| CPU features: 0x00000000,1035b7a3,ccfe773f
| Memory Limit: none
| ---[ end Kernel panic - not syncing: HYP panic:
| PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800
| FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000
| VCPU:0000000000000000 ]---
Fix it by checking for the successful initialisation of kvm_arm_init() in finalize_pkvm() before proceeding any further.
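The guard is a plain "did the earlier init step succeed" flag that the later step consults before doing any work. A minimal userspace sketch of the pattern follows; the demo_* functions and the demo_finalize_runs counter are hypothetical and only model the ordering of kvm_arm_init() and finalize_pkvm(), not what either function actually does.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_initialised;          /* set only on successful init */
static unsigned demo_finalize_runs;    /* counts real finalize work */

static bool demo_is_initialised(void)
{
	return demo_initialised;
}

/* Models the early init step; simulated_err != 0 models a failure. */
static int demo_init(int simulated_err)
{
	if (simulated_err)
		return simulated_err;

	demo_initialised = true;
	return 0;
}

/* Models the later step: skip quietly unless init succeeded. */
static int demo_finalize(bool protected_mode)
{
	if (!protected_mode || !demo_is_initialised())
		return 0;

	demo_finalize_runs++;
	return 0;
}
```

Setting the flag as the very last action of the init path, as the patch does, means every earlier error exit leaves it false without extra cleanup code.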
Fixes: 87727ba2bb05 ("KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege")
Cc: Will Deacon will@kernel.org
Cc: Marc Zyngier maz@kernel.org
Cc: Oliver Upton oliver.upton@linux.dev
Cc: James Morse james.morse@arm.com
Cc: Suzuki K Poulose suzuki.poulose@arm.com
Cc: Zenghui Yu yuzenghui@huawei.com
Signed-off-by: Sudeep Holla sudeep.holla@arm.com
Acked-by: Marc Zyngier maz@kernel.org
Link: https://lore.kernel.org/r/20230704193243.3300506-1-sudeep.holla@arm.com
Signed-off-by: Oliver Upton oliver.upton@linux.dev
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/arm64/include/asm/virt.h | 1 +
 arch/arm64/kvm/arm.c          | 9 ++++++++-
 arch/arm64/kvm/pkvm.c         | 2 +-
 3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index 4eb601e7de507..06382da630123 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -78,6 +78,7 @@ extern u32 __boot_cpu_mode[2];
void __hyp_set_vectors(phys_addr_t phys_vector_base); void __hyp_reset_vectors(void); +bool is_kvm_arm_initialised(void);
DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 7d8c3dd8b7ca9..3a2606ba3e583 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -51,11 +51,16 @@ DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector); DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page); DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
-static bool vgic_present; +static bool vgic_present, kvm_arm_initialised;
static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled); DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+bool is_kvm_arm_initialised(void) +{ + return kvm_arm_initialised; +} + int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu) { return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE; @@ -2396,6 +2401,8 @@ static __init int kvm_arm_init(void) if (err) goto out_subs;
+ kvm_arm_initialised = true; + return 0;
out_subs: diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index 6e9ece1ebbe72..3895416cb15ae 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -243,7 +243,7 @@ static int __init finalize_pkvm(void) { int ret;
- if (!is_protected_kvm_enabled()) + if (!is_protected_kvm_enabled() || !is_kvm_arm_initialised()) return 0;
/*
From: Ross Lagerwall ross.lagerwall@citrix.com
[ Upstream commit 70904263512a74a3b8941dd9e6e515ca6fc57821 ]
We have seen rare IO stalls as follows:
* blk_mq_plug_issue_direct() is entered with an mq_list containing two
  requests.
* For the first request, it sets last == false and enters the driver's
  queue_rq callback.
* The driver queue_rq callback indirectly calls schedule() which calls
  blk_flush_plug(). This may happen if the driver has the
  BLK_MQ_F_BLOCKING flag set and is allowed to sleep in ->queue_rq.
* blk_flush_plug() handles the remaining request in the mq_list. mq_list
  is now empty.
* The original call to queue_rq resumes (with last == false).
* The loop in blk_mq_plug_issue_direct() terminates because there are no
  remaining requests in mq_list.
The IO is now stalled because the last request submitted to the driver had last == false and there was no subsequent call to commit_rqs().
Fix this by returning early in blk_mq_flush_plug_list() if rq_count is 0 which it will be in the recursive case, rather than checking if the mq_list is empty. At the same time, adjust one of the callers to skip the mq_list empty check as it is not necessary.
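The difference between the two early-return checks can be reproduced in a small single-threaded model. This sketch is heavily simplified and every name in it is hypothetical: struct demo_plug is not struct blk_plug, and demo_queue_rq() only records the `last` flag of the most recent submission, re-entering the flush once to model schedule() calling blk_flush_plug() from inside ->queue_rq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of a plug. */
struct demo_plug {
	unsigned pending;         /* requests still on the list */
	unsigned rq_count;        /* counter that the flush clears early */
	bool use_rq_count_check;  /* true models the fixed check */
	bool driver_sleeps;       /* first queue_rq re-enters the flush */
	bool slept;
	unsigned submitted;
	bool final_last;          /* `last` flag of the latest submission */
};

static void demo_flush(struct demo_plug *p);

/* Models ->queue_rq(): may sleep, and so re-enter the flush, first. */
static void demo_queue_rq(struct demo_plug *p, bool last)
{
	if (p->driver_sleeps && !p->slept) {
		p->slept = true;
		demo_flush(p);
	}
	p->submitted++;
	p->final_last = last;
}

static void demo_flush(struct demo_plug *p)
{
	bool nothing_to_do;

	assert(p != NULL);

	nothing_to_do = p->use_rq_count_check ? p->rq_count == 0
					      : p->pending == 0;
	if (nothing_to_do)
		return;
	p->rq_count = 0;

	while (p->pending > 0) {
		p->pending--;
		demo_queue_rq(p, p->pending == 0);
	}
}
```

With the list-empty check, the nested flush drains the list and the outer submission ends with last == false; with the rq_count check, the nested call returns at once and the outer loop issues the final request with last == true.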
Fixes: dc5fc361d891 ("block: attempt direct issue of plug list")
Signed-off-by: Ross Lagerwall ross.lagerwall@citrix.com
Reviewed-by: Bart Van Assche bvanassche@acm.org
Link: https://lore.kernel.org/r/20230714101106.3635611-1-ross.lagerwall@citrix.com
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 block/blk-core.c | 3 +--
 block/blk-mq.c   | 9 ++++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c index 3fc68b9444791..0434f5a8151fe 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1141,8 +1141,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule) { if (!list_empty(&plug->cb_list)) flush_plug_callbacks(plug, from_schedule); - if (!rq_list_empty(plug->mq_list)) - blk_mq_flush_plug_list(plug, from_schedule); + blk_mq_flush_plug_list(plug, from_schedule); /* * Unconditionally flush out cached requests, even if the unplug * event came from schedule. Since we know hold references to the diff --git a/block/blk-mq.c b/block/blk-mq.c index 73ed8ccb09ce8..58bf41e8e66c7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2754,7 +2754,14 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule) { struct request *rq;
- if (rq_list_empty(plug->mq_list)) + /* + * We may have been called recursively midway through handling + * plug->mq_list via a schedule() in the driver's queue_rq() callback. + * To avoid mq_list changing under our feet, clear rq_count early and + * bail out specifically if rq_count is 0 rather than checking + * whether the mq_list is empty. + */ + if (plug->rq_count == 0) return; plug->rq_count = 0;
From: Haren Myneni haren@linux.ibm.com
[ Upstream commit b59c9dc4d9d47b3c4572d826603fde507055b656 ]
Commit 8ef7b9e1765a ("powerpc/pseries/vas: Close windows with DLPAR core removal") unmaps the window paste address and issues an HCALL to close the window in the hypervisor for migration or DLPAR core removal events. To do so, it takes mmap_mutex and then the mmap lock before unmapping the paste address. But if user space issues an mmap of the paste address at the same time as the migration event, coproc_mmap() is called with the mmap lock already held, which can trigger a deadlock when it tries to acquire mmap_mutex in coproc_mmap().
t1: mmap() call to mmap             t2: Migration event
    window paste address

do_mmap2()                          migration_store()
 ksys_mmap_pgoff()                   pseries_migrate_partition()
  vm_mmap_pgoff()                     vas_migration_handler()
    Acquire mmap lock                  reconfig_close_windows()
    do_mmap()                            lock mmap_mutex
     mmap_region()                       Acquire mmap lock
      call_mmap()                        //Wait for mmap lock
       coproc_mmap()                       unmap vma
         lock mmap_mutex                   update window status
         //wait for mmap_mutex           Release mmap lock
          mmap vma                       unlock mmap_mutex
          update window status
         unlock mmap_mutex
    ...
    Release mmap lock
Fix this deadlock issue by holding mmap lock first before mmap_mutex in reconfig_close_windows().
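The fix is an instance of the usual rule that all paths must take a pair of locks in the same order. The rule can be checked with a tiny rank-based tracker; this is a single-threaded sketch with hypothetical demo_* names and does not use or model the kernel's lockdep. The mmap lock is given the lower rank because the mmap() path always holds it before coproc_mmap() takes mmap_mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lower value = must be taken first. */
enum demo_lock { DEMO_MMAP_LOCK = 0, DEMO_MMAP_MUTEX = 1, DEMO_NLOCKS };

/* Which locks one task currently holds. */
struct demo_task {
	bool held[DEMO_NLOCKS];
};

/*
 * Record that the task takes `l`. Returns false, without taking it,
 * when the task already holds `l` or any lock ranked after it, since
 * that acquisition would violate the agreed order.
 */
static bool demo_acquire(struct demo_task *t, enum demo_lock l)
{
	int r;

	assert(t != NULL);

	for (r = l; r < DEMO_NLOCKS; r++)
		if (t->held[r])
			return false;

	t->held[l] = true;
	return true;
}

static void demo_release(struct demo_task *t, enum demo_lock l)
{
	assert(t != NULL && t->held[l]);
	t->held[l] = false;
}
```

Replaying the pre-fix order of reconfig_close_windows() (mmap_mutex first, then the mmap lock) through demo_acquire() fails on the second call, while the post-fix order matches the mmap() path and passes.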
Fixes: 8ef7b9e1765a ("powerpc/pseries/vas: Close windows with DLPAR core removal")
Signed-off-by: Haren Myneni haren@linux.ibm.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://msgid.link/20230716100506.7833-1-haren@linux.ibm.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/powerpc/platforms/pseries/vas.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c index 9a44a98ba3420..3fbc2a6aa319d 100644 --- a/arch/powerpc/platforms/pseries/vas.c +++ b/arch/powerpc/platforms/pseries/vas.c @@ -744,6 +744,12 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds, }
task_ref = &win->vas_win.task_ref; + /* + * VAS mmap (coproc_mmap()) and its fault handler + * (vas_mmap_fault()) are called after holding mmap lock. + * So hold mmap mutex after mmap_lock to avoid deadlock. + */ + mmap_write_lock(task_ref->mm); mutex_lock(&task_ref->mmap_mutex); vma = task_ref->vma; /* @@ -752,7 +758,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds, */ win->vas_win.status |= flag;
- mmap_write_lock(task_ref->mm); /* * vma is set in the original mapping. But this mapping * is done with mmap() after the window is opened with ioctl. @@ -762,8 +767,8 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds, if (vma) zap_vma_pages(vma);
- mmap_write_unlock(task_ref->mm); mutex_unlock(&task_ref->mmap_mutex); + mmap_write_unlock(task_ref->mm); /* * Close VAS window in the hypervisor, but do not * free vas_window struct since it may be reused
From: Claudio Imbrenda imbrenda@linux.ibm.com
[ Upstream commit 5ff92181577a89ed12ad4e0e5813751faf16a139 ]
Simplify the shutdown of non-protected VMs. There is no need to do complex manipulations of the counter if it was zero.
This also fixes a very rare race which caused pages to be torn down from the address space with a non-zero counter even on older machines that don't support the UVC instruction, causing a crash.
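The "increment unless zero" semantics that the fix relies on can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the kernel's implementation of atomic_inc_not_zero(): `counter_inc_not_zero()` and `teardown_if_protected()` are hypothetical names, and the return value 1 is used here only to signal that the teardown branch ran.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Increment *v unless it is zero; return true if the increment happened. */
static bool counter_inc_not_zero(atomic_int *v)
{
	int old;

	assert(v != NULL);

	old = atomic_load(v);
	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Hypothetical caller: bail out early when the counter was already 0,
 * otherwise hold an extra reference while the teardown work runs.
 */
static int teardown_if_protected(atomic_int *protected_count)
{
	if (!counter_inc_not_zero(protected_count))
		return 0;

	/* ... teardown work would run here ... */

	atomic_fetch_sub(protected_count, 1);	/* drop the extra reference */
	return 1;
}
```

The point of the compare-and-exchange loop is that the zero check and the increment happen as one step, so a counter that is 0 is never touched.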
Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Fixes: fb491d5500a7 ("KVM: s390: pv: asynchronous destroy for reboot") Reviewed-by: Nico Boehr nrb@linux.ibm.com Signed-off-by: Claudio Imbrenda imbrenda@linux.ibm.com Message-ID: 20230705111937.33472-2-imbrenda@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kvm/pv.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c index 3ce5f4351156a..899f3b8ac0110 100644 --- a/arch/s390/kvm/pv.c +++ b/arch/s390/kvm/pv.c @@ -411,8 +411,12 @@ int kvm_s390_pv_deinit_cleanup_all(struct kvm *kvm, u16 *rc, u16 *rrc) u16 _rc, _rrc; int cc = 0;
- /* Make sure the counter does not reach 0 before calling s390_uv_destroy_range */ - atomic_inc(&kvm->mm->context.protected_count); + /* + * Nothing to do if the counter was already 0. Otherwise make sure + * the counter does not reach 0 before calling s390_uv_destroy_range. + */ + if (!atomic_inc_not_zero(&kvm->mm->context.protected_count)) + return 0;
*rc = 1; /* If the current VM is protected, destroy it */
From: Claudio Imbrenda imbrenda@linux.ibm.com
[ Upstream commit c2fceb59bbda16468bda82b002383bff59de89ab ]
The index field of the struct page corresponding to a guest ASCE should be 0. When replacing the ASCE in s390_replace_asce(), the index of the new ASCE should also be set to 0.
Having the wrong index might lead to the wrong addresses being passed around when notifying pte invalidations, and eventually to validity intercepts (VM crash) if the prefix gets unmapped and the notifier gets called with the wrong address.
Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org Fixes: faa2f72cb356 ("KVM: s390: pv: leak the topmost page table when destroy fails") Reviewed-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Claudio Imbrenda imbrenda@linux.ibm.com Message-ID: 20230705111937.33472-3-imbrenda@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/mm/gmap.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index dc90d1eb0d554..d7e8297d5642b 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2846,6 +2846,7 @@ int s390_replace_asce(struct gmap *gmap) page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER); if (!page) return -ENOMEM; + page->index = 0; table = page_to_virt(page); memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 7686762d1ed092db4d120e29b565712c969dc075 ]
With per-vma locks, handle_mm_fault() may return non-fatal error flags. In this case the code should reset the fault flags before returning.
Fixes: e06f47a16573 ("s390/mm: try VMA lock-based page fault handling first") Signed-off-by: Sven Schnelle svens@linux.ibm.com Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/mm/fault.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c index dbe8394234e2b..2f123429a291b 100644 --- a/arch/s390/mm/fault.c +++ b/arch/s390/mm/fault.c @@ -421,6 +421,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access) vma_end_read(vma); if (!(fault & VM_FAULT_RETRY)) { count_vm_vma_lock_event(VMA_LOCK_SUCCESS); + if (likely(!(fault & VM_FAULT_ERROR))) + fault = 0; goto out; } count_vm_vma_lock_event(VMA_LOCK_RETRY);
From: Ondrej Mosnacek omosnace@redhat.com
[ Upstream commit 6adc2272aaaf84f34b652cf77f770c6fcc4b8336 ]
The check being unconditional may lead to unwanted denials reported by LSMs when a process has the capability granted by DAC, but denied by an LSM. In the case of SELinux such denials are a problem, since they can't be effectively filtered out via the policy and when not silenced, they produce noise that may hide a true problem or an attack.
Since not having the capability merely means that the created io_uring context will be accounted against the current user's RLIMIT_MEMLOCK limit, we can disable auditing of denials for this check by using ns_capable_noaudit() instead of capable().
Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2193317 Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Reviewed-by: Jeff Moyer jmoyer@redhat.com Link: https://lore.kernel.org/r/20230718115607.65652-1-omosnace@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index d6667b435dd39..685cf14a7189e 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3859,7 +3859,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p, ctx->syscall_iopoll = 1;
ctx->compat = in_compat_syscall(); - if (!capable(CAP_IPC_LOCK)) + if (!ns_capable_noaudit(&init_user_ns, CAP_IPC_LOCK)) ctx->user = get_uid(current_user());
/*
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 5a7adc6c1069ce31ef4f606ae9c05592c80a6ab5 ]
Make tps68470_gpio_output() call tps68470_gpio_set() for output-only pins too, so that the initial value passed to gpiod_direction_output() is honored for these pins too.
Fixes: 275b13a65547 ("gpio: Add support for TPS68470 GPIOs") Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Daniel Scally dan.scally@ideasonboard.com Tested-by: Daniel Scally dan.scally@ideasonboard.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-tps68470.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpio/gpio-tps68470.c b/drivers/gpio/gpio-tps68470.c index aaddcabe9b359..532deaddfd4e2 100644 --- a/drivers/gpio/gpio-tps68470.c +++ b/drivers/gpio/gpio-tps68470.c @@ -91,13 +91,13 @@ static int tps68470_gpio_output(struct gpio_chip *gc, unsigned int offset, struct tps68470_gpio_data *tps68470_gpio = gpiochip_get_data(gc); struct regmap *regmap = tps68470_gpio->tps68470_regmap;
+ /* Set the initial value */ + tps68470_gpio_set(gc, offset, value); + /* rest are always outputs */ if (offset >= TPS68470_N_REGULAR_GPIO) return 0;
- /* Set the initial value */ - tps68470_gpio_set(gc, offset, value); - return regmap_update_bits(regmap, TPS68470_GPIO_CTL_REG_A(offset), TPS68470_GPIO_MODE_MASK, TPS68470_GPIO_MODE_OUT_CMOS);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 1945063eb59e64d2919cb14d54d081476d9e53bb ]
This allows getting rid of a call to pwmchip_remove() in the error path. There is no .remove function for this driver, so this change fixes a resource leak when a gpio-mvebu device is unbound.
Fixes: 757642f9a584 ("gpio: mvebu: Add limited PWM support") Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Andy Shevchenko andy@kernel.org Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-mvebu.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c index a68f682aec012..a35958e7adf60 100644 --- a/drivers/gpio/gpio-mvebu.c +++ b/drivers/gpio/gpio-mvebu.c @@ -874,7 +874,7 @@ static int mvebu_pwm_probe(struct platform_device *pdev,
spin_lock_init(&mvpwm->lock);
- return pwmchip_add(&mvpwm->chip); + return devm_pwmchip_add(dev, &mvpwm->chip); }
#ifdef CONFIG_DEBUG_FS @@ -1243,8 +1243,7 @@ static int mvebu_gpio_probe(struct platform_device *pdev) if (!mvchip->domain) { dev_err(&pdev->dev, "couldn't allocate irq domain %s (DT).\n", mvchip->chip.label); - err = -ENODEV; - goto err_pwm; + return -ENODEV; }
err = irq_alloc_domain_generic_chips( @@ -1296,9 +1295,6 @@ static int mvebu_gpio_probe(struct platform_device *pdev)
err_domain: irq_domain_remove(mvchip->domain); -err_pwm: - pwmchip_remove(&mvchip->mvpwm->chip); - return err; }
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit 644ee70267a934be27370f9aa618b29af7290544 ]
Uwe Kleine-König pointed out we still have one resource leak in the mvebu driver triggered on driver detach. Let's address it with a custom devm action.
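The idea of a device-managed cleanup action can be sketched in plain C. This is a simplified userspace model, not the kernel's devres code: `struct demo_device`, `demo_add_action_or_reset()`, `demo_release_all()`, and the fixed-size action table are all assumptions made for illustration. It models two properties of the real API: a failed registration runs the action immediately, and registered actions run in reverse order on release.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 4

typedef void (*demo_action_fn)(void *data);

struct demo_device {
	struct {
		demo_action_fn fn;
		void *data;
	} actions[DEMO_MAX_ACTIONS];
	size_t nr_actions;
};

/*
 * Register a cleanup action. If registration fails, run the action
 * immediately so the caller can simply return the error.
 */
static int demo_add_action_or_reset(struct demo_device *dev,
				    demo_action_fn fn, void *data)
{
	assert(dev != NULL && fn != NULL);

	if (dev->nr_actions == DEMO_MAX_ACTIONS) {
		fn(data);
		return -ENOMEM;
	}
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Run the registered actions in reverse order, as on driver detach. */
static void demo_release_all(struct demo_device *dev)
{
	while (dev->nr_actions > 0) {
		dev->nr_actions--;
		dev->actions[dev->nr_actions].fn(
			dev->actions[dev->nr_actions].data);
	}
}

/* Example action: counts how many times it ran. */
static void demo_count_release(void *data)
{
	int *count = data;

	(*count)++;
}
```

With this pattern a probe function needs no error labels for the resource: every early `return err` after registration leaves the cleanup to the release step.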
Fixes: 812d47889a8e ("gpio/mvebu: Use irq_domain_add_linear") Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-mvebu.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c index a35958e7adf60..67497116ce27d 100644 --- a/drivers/gpio/gpio-mvebu.c +++ b/drivers/gpio/gpio-mvebu.c @@ -1112,6 +1112,13 @@ static int mvebu_gpio_probe_syscon(struct platform_device *pdev, return 0; }
+static void mvebu_gpio_remove_irq_domain(void *data) +{ + struct irq_domain *domain = data; + + irq_domain_remove(domain); +} + static int mvebu_gpio_probe(struct platform_device *pdev) { struct mvebu_gpio_chip *mvchip; @@ -1246,13 +1253,18 @@ static int mvebu_gpio_probe(struct platform_device *pdev) return -ENODEV; }
+ err = devm_add_action_or_reset(&pdev->dev, mvebu_gpio_remove_irq_domain, + mvchip->domain); + if (err) + return err; + err = irq_alloc_domain_generic_chips( mvchip->domain, ngpios, 2, np->name, handle_level_irq, IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_LEVEL, 0, 0); if (err) { dev_err(&pdev->dev, "couldn't allocate irq chips %s (DT).\n", mvchip->chip.label); - goto err_domain; + return err; }
/* @@ -1292,10 +1304,6 @@ static int mvebu_gpio_probe(struct platform_device *pdev) }
return 0; - -err_domain: - irq_domain_remove(mvchip->domain); - return err; }
static struct platform_driver mvebu_gpio_driver = {
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit a9e26169cfda651802f88262a315146fbe4bc74c ]
REGCACHE_RBTREE and REGCACHE_MAPLE dynamically allocate memory for regmap operations. This is incompatible with spinlock based locking which is used for fast_io operations. Disable locking for the associated unit tests to avoid lockdep splashes.
Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Fixes: 2238959b6ad2 ("regmap: Add some basic kunit tests") Signed-off-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20230720032848.1306349-1-linux@roeck-us.net Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regmap-kunit.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/base/regmap/regmap-kunit.c b/drivers/base/regmap/regmap-kunit.c index f76d416881349..0b3dacc7fa424 100644 --- a/drivers/base/regmap/regmap-kunit.c +++ b/drivers/base/regmap/regmap-kunit.c @@ -58,6 +58,9 @@ static struct regmap *gen_regmap(struct regmap_config *config, int i; struct reg_default *defaults;
+ config->disable_locking = config->cache_type == REGCACHE_RBTREE || + config->cache_type == REGCACHE_MAPLE; + buf = kmalloc(size, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM);
From: Christoph Hellwig hch@lst.de
[ Upstream commit ed9ee98ecb4fdbdfe043ee3eec0a65c0745d8669 ]
Split all the conditionals for the fsverity calls in end_page_read into a btrfs_verify_page helper to keep the code readable and make additional refactoring easier.
Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: 2c14f0ffdd30 ("btrfs: fix fsverify read error handling in end_page_read") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_io.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a37a6587efaf0..496c2c9920fc6 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -478,6 +478,15 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, start, end, page_ops, NULL); }
+static bool btrfs_verify_page(struct page *page, u64 start) +{ + if (!fsverity_active(page->mapping->host) || + PageError(page) || PageUptodate(page) || + start >= i_size_read(page->mapping->host)) + return true; + return fsverity_verify_page(page); +} + static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) { struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); @@ -486,11 +495,7 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) start + len <= page_offset(page) + PAGE_SIZE);
if (uptodate) { - if (fsverity_active(page->mapping->host) && - !PageError(page) && - !PageUptodate(page) && - start < i_size_read(page->mapping->host) && - !fsverity_verify_page(page)) { + if (!btrfs_verify_page(page, start)) { btrfs_page_set_error(fs_info, page, start, len); } else { btrfs_page_set_uptodate(fs_info, page, start, len);
From: Christoph Hellwig hch@lst.de
[ Upstream commit 2c14f0ffdd30bd3d321ad5fe76fcf701746e1df6 ]
Also clear the uptodate bit to make sure the page isn't seen as uptodate in the page cache if fsverity verification fails.
Fixes: 146054090b08 ("btrfs: initial fsverity support") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent_io.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 496c2c9920fc6..82b9779deaa88 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -494,12 +494,8 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) ASSERT(page_offset(page) <= start && start + len <= page_offset(page) + PAGE_SIZE);
- if (uptodate) { - if (!btrfs_verify_page(page, start)) { - btrfs_page_set_error(fs_info, page, start, len); - } else { - btrfs_page_set_uptodate(fs_info, page, start, len); - } + if (uptodate && btrfs_verify_page(page, start)) { + btrfs_page_set_uptodate(fs_info, page, start, len); } else { btrfs_page_clear_uptodate(fs_info, page, start, len); btrfs_page_set_error(fs_info, page, start, len);
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 8a4a0b2a3eaf75ca8854f856ef29690c12b2f531 ]
If we disable quotas while we have a relocation of a metadata block group that has extents belonging to the quota root, we can cause the relocation to fail with -ENOENT. This is because relocation builds backref nodes for extents of the quota root and later needs to walk the backrefs and access the quota root - however if in between a task disables quotas, it results in deleting the quota root from the root tree (with btrfs_del_root(), called from btrfs_quota_disable()).
This can be sporadically triggered by test case btrfs/255 from fstests:
$ ./check btrfs/255 FSTYP -- btrfs PLATFORM -- Linux/x86_64 debian0 6.4.0-rc6-btrfs-next-134+ #1 SMP PREEMPT_DYNAMIC Thu Jun 15 11:59:28 WEST 2023 MKFS_OPTIONS -- /dev/sdc MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1
btrfs/255 6s ... _check_dmesg: something found in dmesg (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.dmesg) - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad) # --- tests/btrfs/255.out 2023-03-02 21:47:53.876609426 +0000 # +++ /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad 2023-06-16 10:20:39.267563212 +0100 # @@ -1,2 +1,4 @@ # QA output created by 255 # +ERROR: error during balancing '/home/fdmanana/btrfs-tests/scratch_1': No such file or directory # +There may be more info in syslog - try dmesg | tail # Silence is golden # ... (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/255.out /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad' to see the entire diff) Ran: btrfs/255 Failures: btrfs/255 Failed 1 of 1 tests
To fix this make the quota disable operation take the cleaner mutex, as relocation of a block group also takes this mutex. This is also what we do when deleting a subvolume/snapshot, we take the cleaner mutex in the cleaner kthread (at cleaner_kthread()) and then we call btrfs_del_root() at btrfs_drop_snapshot() while under the protection of the cleaner mutex.
Fixes: bed92eae26cc ("Btrfs: qgroup implementation and prototypes") CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/qgroup.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 360bf2522a871..2637d6b157ff9 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1232,12 +1232,23 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) int ret = 0;
/* - * We need to have subvol_sem write locked, to prevent races between - * concurrent tasks trying to disable quotas, because we will unlock - * and relock qgroup_ioctl_lock across BTRFS_FS_QUOTA_ENABLED changes. + * We need to have subvol_sem write locked to prevent races with + * snapshot creation. */ lockdep_assert_held_write(&fs_info->subvol_sem);
+ /* + * Lock the cleaner mutex to prevent races with concurrent relocation, + * because relocation may be building backrefs for blocks of the quota + * root while we are deleting the root. This is like dropping fs roots + * of deleted snapshots/subvolumes, we need the same protection. + * + * This also prevents races between concurrent tasks trying to disable + * quotas, because we will unlock and relock qgroup_ioctl_lock across + * BTRFS_FS_QUOTA_ENABLED changes. + */ + mutex_lock(&fs_info->cleaner_mutex); + mutex_lock(&fs_info->qgroup_ioctl_lock); if (!fs_info->quota_root) goto out; @@ -1319,6 +1330,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) btrfs_end_transaction(trans); else if (trans) ret = btrfs_end_transaction(trans); + mutex_unlock(&fs_info->cleaner_mutex);
return ret; }
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit 6b3b21a8542fd2fb6ffc61bc13b9419f0c58ebad ]
These issues were detected by using the Coccinelle software.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-ibm_iic.c | 4 +--- drivers/i2c/busses/i2c-nomadik.c | 1 - drivers/i2c/busses/i2c-sh7760.c | 1 - drivers/i2c/busses/i2c-tiny-usb.c | 4 +--- 4 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/i2c/busses/i2c-ibm_iic.c b/drivers/i2c/busses/i2c-ibm_iic.c index eeb80e34f9ad7..de3b609515e08 100644 --- a/drivers/i2c/busses/i2c-ibm_iic.c +++ b/drivers/i2c/busses/i2c-ibm_iic.c @@ -694,10 +694,8 @@ static int iic_probe(struct platform_device *ofdev) int ret;
dev = kzalloc(sizeof(*dev), GFP_KERNEL); - if (!dev) { - dev_err(&ofdev->dev, "failed to allocate device data\n"); + if (!dev) return -ENOMEM; - }
platform_set_drvdata(ofdev, dev);
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index a2d12a5b1c34c..05eaae5aeb180 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -972,7 +972,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
dev = devm_kzalloc(&adev->dev, sizeof(struct nmk_i2c_dev), GFP_KERNEL); if (!dev) { - dev_err(&adev->dev, "cannot allocate memory\n"); ret = -ENOMEM; goto err_no_mem; } diff --git a/drivers/i2c/busses/i2c-sh7760.c b/drivers/i2c/busses/i2c-sh7760.c index 319d1fa617c88..a0ccc5d009874 100644 --- a/drivers/i2c/busses/i2c-sh7760.c +++ b/drivers/i2c/busses/i2c-sh7760.c @@ -445,7 +445,6 @@ static int sh7760_i2c_probe(struct platform_device *pdev)
id = kzalloc(sizeof(struct cami2c), GFP_KERNEL); if (!id) { - dev_err(&pdev->dev, "no mem for private data\n"); ret = -ENOMEM; goto out0; } diff --git a/drivers/i2c/busses/i2c-tiny-usb.c b/drivers/i2c/busses/i2c-tiny-usb.c index 7279ca0eaa2d0..d1fa9ff5aeab4 100644 --- a/drivers/i2c/busses/i2c-tiny-usb.c +++ b/drivers/i2c/busses/i2c-tiny-usb.c @@ -226,10 +226,8 @@ static int i2c_tiny_usb_probe(struct usb_interface *interface,
/* allocate memory for our device state and initialize it */ dev = kzalloc(sizeof(*dev), GFP_KERNEL); - if (dev == NULL) { - dev_err(&interface->dev, "Out of memory\n"); + if (!dev) goto error; - }
dev->usb_dev = usb_get_dev(interface_to_usbdev(interface)); dev->interface = interface;
From: Markus Elfring elfring@users.sourceforge.net
[ Upstream commit 06e989578232da33a7fe96b04191b862af8b2cec ]
Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention.
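A short userspace sketch of the idiom, assuming a hypothetical `struct demo_dev` (neither it nor the helpers below come from the drivers touched by this patch):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_dev {
	int irq;
	char name[16];
};

static struct demo_dev *demo_dev_alloc(void)
{
	struct demo_dev *dev;

	/*
	 * sizeof(*dev) follows the pointer's type, so the size stays
	 * correct even if the declaration of dev changes later.
	 */
	dev = calloc(1, sizeof(*dev));
	return dev;
}

static void demo_dev_reset(struct demo_dev *dev)
{
	assert(dev != NULL);

	/* Same idiom: no type name to keep in sync with the pointer. */
	memset(dev, 0, sizeof(*dev));
}
```

Spelling the type out, as in `sizeof(struct demo_dev)`, yields the same value today but has to be updated by hand whenever the pointer's type is changed.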
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring elfring@users.sourceforge.net Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 2 +- drivers/i2c/busses/i2c-sh7760.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 05eaae5aeb180..5004b9dd98563 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -970,7 +970,7 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) struct i2c_vendor_data *vendor = id->data; u32 max_fifo_threshold = (vendor->fifodepth / 2) - 1;
- dev = devm_kzalloc(&adev->dev, sizeof(struct nmk_i2c_dev), GFP_KERNEL); + dev = devm_kzalloc(&adev->dev, sizeof(*dev), GFP_KERNEL); if (!dev) { ret = -ENOMEM; goto err_no_mem; diff --git a/drivers/i2c/busses/i2c-sh7760.c b/drivers/i2c/busses/i2c-sh7760.c index a0ccc5d009874..051b904cb35f6 100644 --- a/drivers/i2c/busses/i2c-sh7760.c +++ b/drivers/i2c/busses/i2c-sh7760.c @@ -443,7 +443,7 @@ static int sh7760_i2c_probe(struct platform_device *pdev) goto out0; }
- id = kzalloc(sizeof(struct cami2c), GFP_KERNEL); + id = kzalloc(sizeof(*id), GFP_KERNEL); if (!id) { ret = -ENOMEM; goto out0;
From: Andi Shyti andi.shyti@kernel.org
[ Upstream commit 1c5d33fff0d375e4ab7c4261dc62a286babbb4c6 ]
The err_no_mem goto label doesn't do anything. Remove it.
Signed-off-by: Andi Shyti andi.shyti@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 5004b9dd98563..8b9577318388e 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -971,10 +971,9 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) u32 max_fifo_threshold = (vendor->fifodepth / 2) - 1;
dev = devm_kzalloc(&adev->dev, sizeof(*dev), GFP_KERNEL); - if (!dev) { - ret = -ENOMEM; - goto err_no_mem; - } + if (!dev) + return -ENOMEM; + dev->vendor = vendor; dev->adev = adev; nmk_i2c_of_probe(np, dev); @@ -995,30 +994,27 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
dev->virtbase = devm_ioremap(&adev->dev, adev->res.start, resource_size(&adev->res)); - if (!dev->virtbase) { - ret = -ENOMEM; - goto err_no_mem; - } + if (!dev->virtbase) + return -ENOMEM;
dev->irq = adev->irq[0]; ret = devm_request_irq(&adev->dev, dev->irq, i2c_irq_handler, 0, DRIVER_NAME, dev); if (ret) { dev_err(&adev->dev, "cannot claim the irq %d\n", dev->irq); - goto err_no_mem; + return ret; }
dev->clk = devm_clk_get(&adev->dev, NULL); if (IS_ERR(dev->clk)) { dev_err(&adev->dev, "could not get i2c clock\n"); - ret = PTR_ERR(dev->clk); - goto err_no_mem; + return PTR_ERR(dev->clk); }
ret = clk_prepare_enable(dev->clk); if (ret) { dev_err(&adev->dev, "can't prepare_enable clock\n"); - goto err_no_mem; + return ret; }
init_hw(dev); @@ -1049,7 +1045,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
err_no_adap: clk_disable_unprepare(dev->clk); - err_no_mem:
return ret; }
From: Andi Shyti andi.shyti@kernel.org
[ Upstream commit 9c7174db4cdd111e10d19eed5c36fd978a14c8a2 ]
Replace the pair of functions, devm_clk_get() and clk_prepare_enable(), with a single function devm_clk_get_enabled().
Signed-off-by: Andi Shyti andi.shyti@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Stable-dep-of: 05f933d5f731 ("i2c: nomadik: Remove a useless call in the remove function") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 8b9577318388e..2141ba05dfece 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -1005,18 +1005,12 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id) return ret; }
- dev->clk = devm_clk_get(&adev->dev, NULL); + dev->clk = devm_clk_get_enabled(&adev->dev, NULL); if (IS_ERR(dev->clk)) { - dev_err(&adev->dev, "could not get i2c clock\n"); + dev_err(&adev->dev, "could enable i2c clock\n"); return PTR_ERR(dev->clk); }
- ret = clk_prepare_enable(dev->clk); - if (ret) { - dev_err(&adev->dev, "can't prepare_enable clock\n"); - return ret; - } - init_hw(dev);
adap = &dev->adap; @@ -1037,16 +1031,11 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
ret = i2c_add_adapter(adap); if (ret) - goto err_no_adap; + return ret;
pm_runtime_put(&adev->dev);
return 0; - - err_no_adap: - clk_disable_unprepare(dev->clk); - - return ret; }
static void nmk_i2c_remove(struct amba_device *adev) @@ -1060,7 +1049,6 @@ static void nmk_i2c_remove(struct amba_device *adev) clear_all_interrupts(dev); /* disable the controller */ i2c_clr_bit(dev->virtbase + I2C_CR, I2C_CR_PE); - clk_disable_unprepare(dev->clk); release_mem_region(res->start, resource_size(res)); }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 05f933d5f7318b03ff2028c1704dc867ac16f2c7 ]
Since commit 235602146ec9 ("i2c-nomadik: turn the platform driver to an amba driver"), there is no more request_mem_region() call in this driver.
So remove the release_mem_region() call from the remove function which is likely a left over.
Fixes: 235602146ec9 ("i2c-nomadik: turn the platform driver to an amba driver") Cc: stable@vger.kernel.org # v3.6+ Acked-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-nomadik.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c index 2141ba05dfece..9c5d66bd6dc1c 100644 --- a/drivers/i2c/busses/i2c-nomadik.c +++ b/drivers/i2c/busses/i2c-nomadik.c @@ -1040,7 +1040,6 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
static void nmk_i2c_remove(struct amba_device *adev) { - struct resource *res = &adev->res; struct nmk_i2c_dev *dev = amba_get_drvdata(adev);
i2c_del_adapter(&dev->adap); @@ -1049,7 +1048,6 @@ static void nmk_i2c_remove(struct amba_device *adev) clear_all_interrupts(dev); /* disable the controller */ i2c_clr_bit(dev->virtbase + I2C_CR, I2C_CR_PE); - release_mem_region(res->start, resource_size(res)); }
static struct i2c_vendor_data vendor_stn8815 = {
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit f5297a01ee805d7fa569d288ed65fc0f9ac9b03d ]
"pcie_retrain_link" is not a question with a true/false answer, so "bool" isn't quite the right return type. Return 0 for success or -ETIMEDOUT if the retrain failed. No functional change intended.
[bhelgaas: based on Ilpo's patch below] Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: e7e39756363a ("PCI/ASPM: Avoid link retraining race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index db32335039d61..88aca887e3120 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -193,7 +193,7 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; }
-static bool pcie_retrain_link(struct pcie_link_state *link) +static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; unsigned long end_jiffies; @@ -220,7 +220,9 @@ static bool pcie_retrain_link(struct pcie_link_state *link) break; msleep(1); } while (time_before(jiffies, end_jiffies)); - return !(reg16 & PCI_EXP_LNKSTA_LT); + if (reg16 & PCI_EXP_LNKSTA_LT) + return -ETIMEDOUT; + return 0; }
/* @@ -289,15 +291,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link) reg16 &= ~PCI_EXP_LNKCTL_CCC; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
- if (pcie_retrain_link(link)) - return; + if (pcie_retrain_link(link)) {
- /* Training failed. Restore common clock configurations */ - pci_err(parent, "ASPM: Could not configure common clock\n"); - list_for_each_entry(child, &linkbus->devices, bus_list) - pcie_capability_write_word(child, PCI_EXP_LNKCTL, + /* Training failed. Restore common clock configurations */ + pci_err(parent, "ASPM: Could not configure common clock\n"); + list_for_each_entry(child, &linkbus->devices, bus_list) + pcie_capability_write_word(child, PCI_EXP_LNKCTL, child_reg[PCI_FUNC(child->devfn)]); - pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); + pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg); + } }
/* Convert L0s latency encoding to ns */
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 9c7f136433d26592cb4d9cd00b4e15c33d9797c6 ]
Factor pcie_wait_for_retrain() out from pcie_retrain_link(). No functional change intended.
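The shape of the factored-out helper, a bounded polling loop that returns 0 on success or -ETIMEDOUT, can be sketched in userspace. This is an illustration only: `demo_wait_for_clear()`, `demo_read_countdown()`, and the `DEMO_STATUS_TRAINING` bit are hypothetical, and the loop is bounded by a poll count rather than by jiffies and msleep() as in the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STATUS_TRAINING 0x0800u	/* hypothetical "still training" bit */

typedef uint16_t (*demo_read_status_fn)(void *ctx);

/*
 * Poll the status until the training bit clears. Returns 0 once it is
 * clear, or -ETIMEDOUT if it is still set after max_polls reads.
 */
static int demo_wait_for_clear(demo_read_status_fn read_status, void *ctx,
			       unsigned int max_polls)
{
	unsigned int i;

	assert(read_status != NULL);

	for (i = 0; i < max_polls; i++) {
		if (!(read_status(ctx) & DEMO_STATUS_TRAINING))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Example reader: reports "training" until the countdown in ctx hits 0. */
static uint16_t demo_read_countdown(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return DEMO_STATUS_TRAINING;
}
```

Returning from inside the loop avoids re-testing the status bit after the loop ends, and a caller can treat any non-zero result as a failure.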
[bhelgaas: split out from https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.com] Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Stable-dep-of: e7e39756363a ("PCI/ASPM: Avoid link retraining race") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 88aca887e3120..517f834ac93ef 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -193,10 +193,26 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist) link->clkpm_disable = blacklist ? 1 : 0; }
+static int pcie_wait_for_retrain(struct pci_dev *pdev) +{ + unsigned long end_jiffies; + u16 reg16; + + /* Wait for Link Training to be cleared by hardware */ + end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; + do { + pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &reg16); + if (!(reg16 & PCI_EXP_LNKSTA_LT)) + return 0; + msleep(1); + } while (time_before(jiffies, end_jiffies)); + + return -ETIMEDOUT; +} + static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; - unsigned long end_jiffies; u16 reg16;
pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16); @@ -212,17 +228,7 @@ static int pcie_retrain_link(struct pcie_link_state *link) pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16); }
- /* Wait for link training end. Break out after waiting for timeout */ - end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT; - do { - pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16); - if (!(reg16 & PCI_EXP_LNKSTA_LT)) - break; - msleep(1); - } while (time_before(jiffies, end_jiffies)); - if (reg16 & PCI_EXP_LNKSTA_LT) - return -ETIMEDOUT; - return 0; + return pcie_wait_for_retrain(parent); }
/*
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit e7e39756363ad5bd83ddeae1063193d0f13870fd ]
PCIe r6.0.1, sec 7.5.3.7, recommends setting the link control parameters, then waiting for the Link Training bit to be clear before setting the Retrain Link bit.
This avoids a race where the LTSSM may not use the updated parameters if it is already in the midst of link training because of other normal link activity.
Wait for the Link Training bit to be clear before toggling the Retrain Link bit to ensure that the LTSSM uses the updated link control parameters.
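The ordering described above (wait for Link Training to clear, then set Retrain Link, then wait again) can be illustrated outside the kernel with a small simulation. This is only a sketch under stated assumptions: struct sim_link and the sim_* helpers are hypothetical stand-ins, not kernel APIs, and a poll count replaces the jiffies-based timeout. Only the LNKSTA_LT bit value mirrors PCI_EXP_LNKSTA_LT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define LNKSTA_LT 0x0800u /* Link Training bit (kernel: PCI_EXP_LNKSTA_LT) */

/* Hypothetical simulated link state; not a kernel structure. */
struct sim_link {
	int busy_polls;       /* status reads left before LT clears */
	int retrain_polls;    /* reads a requested retrain keeps LT set */
	int retrain_requests; /* times Retrain Link was "written" */
};

/* Simulated LNKSTA read: LT stays set until busy_polls runs out. */
static uint16_t sim_read_lnksta(struct sim_link *l)
{
	if (l->busy_polls > 0) {
		l->busy_polls--;
		return LNKSTA_LT;
	}
	return 0;
}

/* Poll until LT is clear; give up after max_polls reads. */
static int sim_wait_for_retrain(struct sim_link *l, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		if (!(sim_read_lnksta(l) & LNKSTA_LT))
			return 0;
	}
	return -ETIMEDOUT;
}

/*
 * Wait for any ongoing training to finish before requesting a retrain,
 * so the request is never issued while the link is still training.
 */
static int sim_retrain_link(struct sim_link *l, int max_polls)
{
	int rc = sim_wait_for_retrain(l, max_polls);

	if (rc)
		return rc;

	l->retrain_requests++;            /* stands in for setting LNKCTL.RL */
	l->busy_polls = l->retrain_polls; /* hardware starts training */

	return sim_wait_for_retrain(l, max_polls);
}
```

In this model a link that never leaves training makes sim_retrain_link() return -ETIMEDOUT without ever requesting a retrain, which is the behaviour the added early wait is meant to provide.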
[bhelgaas: commit log, return 0 (success)/-ETIMEDOUT instead of bool for both pcie_wait_for_retrain() and the existing pcie_retrain_link()] Suggested-by: Lukas Wunner lukas@wunner.de Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Lukas Wunner lukas@wunner.de Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index 517f834ac93ef..998e26de2ad76 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -213,8 +213,19 @@ static int pcie_wait_for_retrain(struct pci_dev *pdev) static int pcie_retrain_link(struct pcie_link_state *link) { struct pci_dev *parent = link->pdev; + int rc; u16 reg16;
+ /* + * Ensure the updated LNKCTL parameters are used during link + * training by checking that there is no ongoing link training to + * avoid LTSSM race as recommended in Implementation Note at the + * end of PCIe r6.0.1 sec 7.5.3.7. + */ + rc = pcie_wait_for_retrain(parent); + if (rc) + return rc; + pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16); reg16 |= PCI_EXP_LNKCTL_RL; pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit 92a9c57c325dd51682d428ba960d961fec3c8a08 ]
Remove write accesses to registers that are marked "unused" (and therefore read-only) in the technical reference manual (TRM) (see RK3399 TRM 17.6.8.1)
Link: https://lore.kernel.org/r/20230418074700.1083505-2-rick.wertenbroek@gmail.co... Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Stable-dep-of: dc73ed0f1b8b ("PCI: rockchip: Fix window mapping and address translation for endpoint") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 10 ---------- 1 file changed, 10 deletions(-)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 827d91e73efab..9e17f3dba743a 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -61,10 +61,6 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip, ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region)); rockchip_pcie_write(rockchip, 0, ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(region)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(region)); }
static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, @@ -114,12 +110,6 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); addr1 = upper_32_bits(cpu_addr); } - - /* CPU bus address region */ - rockchip_pcie_write(rockchip, addr0, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r)); - rockchip_pcie_write(rockchip, addr1, - ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r)); }
static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit dc73ed0f1b8bddd7f2bf70d123e68ffc99ad71ce ]
The RK3399 PCI endpoint core has 33 windows for PCIe space, now in the driver up to 32 fixed size (1M) windows are used and pages are allocated and mapped accordingly. The driver first used a single window and allocated space inside which caused translation issues (between CPU space and PCI space) because a window can only have a single translation at a given time, which if multiple pages are allocated inside will cause conflicts. Now each window is a single region of 1M which will always guarantee that the translation is not in conflict.
Set the translation register addresses for physical function. As documented in the technical reference manual (TRM) section 17.5.5 "PCIe Address Translation" and section 17.6.8 "Address Translation Registers Description"
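As a rough illustration of the fixed 1M window scheme described above, the sketch below derives the outbound region index from a CPU address the same way the patch's rockchip_ob_region() does (shift by 20, mask with 0x1f). The names ob_region() and window_base() and the base address used in the usage note are illustrative assumptions, not taken from the driver or the TRM. The window index and region index only agree if the memory base is aligned to 32M, which is also an assumption here.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_SHIFT 20                   /* log2 of 1M (SZ_1M in the kernel) */
#define WINDOW_SIZE  (1u << WINDOW_SHIFT) /* each window covers 1M */
#define WINDOW_COUNT 32u                  /* up to 32 windows are used */

/* Region index for a CPU address: one region per 1M window. */
static uint32_t ob_region(uint64_t addr)
{
	return (uint32_t)((addr >> WINDOW_SHIFT) & 0x1f);
}

/* Base address of window i, counted from the start of the memory resource. */
static uint64_t window_base(uint64_t mem_start, uint32_t i)
{
	assert(i < WINDOW_COUNT);
	return mem_start + (uint64_t)WINDOW_SIZE * i;
}
```

With a hypothetical 32M-aligned base of 0xfa000000, window 5 starts at 0xfa500000 and every address inside that window maps to region 5, so each window has exactly one translation at a time.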
Link: https://lore.kernel.org/r/20230418074700.1083505-9-rick.wertenbroek@gmail.co... Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 128 ++++++++++------------ drivers/pci/controller/pcie-rockchip.h | 35 +++--- 2 files changed, 75 insertions(+), 88 deletions(-)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 9e17f3dba743a..3d6f828d29fc2 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -64,52 +64,29 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip, }
static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn, - u32 r, u32 type, u64 cpu_addr, - u64 pci_addr, size_t size) + u32 r, u64 cpu_addr, u64 pci_addr, + size_t size) { - u64 sz = 1ULL << fls64(size - 1); - int num_pass_bits = ilog2(sz); - u32 addr0, addr1, desc0, desc1; - bool is_nor_msg = (type == AXI_WRAPPER_NOR_MSG); + int num_pass_bits = fls64(size - 1); + u32 addr0, addr1, desc0;
- /* The minimal region size is 1MB */ if (num_pass_bits < 8) num_pass_bits = 8;
- cpu_addr -= rockchip->mem_res->start; - addr0 = ((is_nor_msg ? 0x10 : (num_pass_bits - 1)) & - PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | - (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); - addr1 = upper_32_bits(is_nor_msg ? cpu_addr : pci_addr); - desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type; - desc1 = 0; - - if (is_nor_msg) { - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); - rockchip_pcie_write(rockchip, 0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); - rockchip_pcie_write(rockchip, desc0, - ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); - rockchip_pcie_write(rockchip, desc1, - ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); - } else { - /* PCI bus address region */ - rockchip_pcie_write(rockchip, addr0, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); - rockchip_pcie_write(rockchip, addr1, - ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); - rockchip_pcie_write(rockchip, desc0, - ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); - rockchip_pcie_write(rockchip, desc1, - ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); - - addr0 = - ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | - (lower_32_bits(cpu_addr) & - PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); - addr1 = upper_32_bits(cpu_addr); - } + addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) | + (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR); + addr1 = upper_32_bits(pci_addr); + desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | AXI_WRAPPER_MEM_WRITE; + + /* PCI bus address region */ + rockchip_pcie_write(rockchip, addr0, + ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r)); + rockchip_pcie_write(rockchip, addr1, + ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r)); + rockchip_pcie_write(rockchip, desc0, + ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r)); + rockchip_pcie_write(rockchip, 0, + ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r)); }
static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn, @@ -248,26 +225,20 @@ static void rockchip_pcie_ep_clear_bar(struct pci_epc *epc, u8 fn, u8 vfn, ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar)); }
+static inline u32 rockchip_ob_region(phys_addr_t addr) +{ + return (addr >> ilog2(SZ_1M)) & 0x1f; +} + static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn, phys_addr_t addr, u64 pci_addr, size_t size) { struct rockchip_pcie_ep *ep = epc_get_drvdata(epc); struct rockchip_pcie *pcie = &ep->rockchip; - u32 r; + u32 r = rockchip_ob_region(addr);
- r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG); - /* - * Region 0 is reserved for configuration space and shouldn't - * be used elsewhere per TRM, so leave it out. - */ - if (r >= ep->max_regions - 1) { - dev_err(&epc->dev, "no free outbound region\n"); - return -EINVAL; - } - - rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, AXI_WRAPPER_MEM_WRITE, addr, - pci_addr, size); + rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, addr, pci_addr, size);
set_bit(r, &ep->ob_region_map); ep->ob_addr[r] = addr; @@ -282,15 +253,11 @@ static void rockchip_pcie_ep_unmap_addr(struct pci_epc *epc, u8 fn, u8 vfn, struct rockchip_pcie *rockchip = &ep->rockchip; u32 r;
- for (r = 0; r < ep->max_regions - 1; r++) + for (r = 0; r < ep->max_regions; r++) if (ep->ob_addr[r] == addr) break;
- /* - * Region 0 is reserved for configuration space and shouldn't - * be used elsewhere per TRM, so leave it out. - */ - if (r == ep->max_regions - 1) + if (r == ep->max_regions) return;
rockchip_pcie_clear_ep_ob_atu(rockchip, r); @@ -387,7 +354,8 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn, struct rockchip_pcie *rockchip = &ep->rockchip; u32 flags, mme, data, data_mask; u8 msi_count; - u64 pci_addr, pci_addr_mask = 0xff; + u64 pci_addr; + u32 r;
/* Check MSI enable bit */ flags = rockchip_pcie_read(&ep->rockchip, @@ -421,21 +389,20 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn, ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + ROCKCHIP_PCIE_EP_MSI_CTRL_REG + PCI_MSI_ADDRESS_LO); - pci_addr &= GENMASK_ULL(63, 2);
/* Set the outbound region if needed. */ - if (unlikely(ep->irq_pci_addr != (pci_addr & ~pci_addr_mask) || + if (unlikely(ep->irq_pci_addr != (pci_addr & PCIE_ADDR_MASK) || ep->irq_pci_fn != fn)) { - rockchip_pcie_prog_ep_ob_atu(rockchip, fn, ep->max_regions - 1, - AXI_WRAPPER_MEM_WRITE, + r = rockchip_ob_region(ep->irq_phys_addr); + rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r, ep->irq_phys_addr, - pci_addr & ~pci_addr_mask, - pci_addr_mask + 1); - ep->irq_pci_addr = (pci_addr & ~pci_addr_mask); + pci_addr & PCIE_ADDR_MASK, + ~PCIE_ADDR_MASK + 1); + ep->irq_pci_addr = (pci_addr & PCIE_ADDR_MASK); ep->irq_pci_fn = fn; }
- writew(data, ep->irq_cpu_addr + (pci_addr & pci_addr_mask)); + writew(data, ep->irq_cpu_addr + (pci_addr & ~PCIE_ADDR_MASK)); return 0; }
@@ -517,6 +484,8 @@ static int rockchip_pcie_parse_ep_dt(struct rockchip_pcie *rockchip, if (err < 0 || ep->max_regions > MAX_REGION_LIMIT) ep->max_regions = MAX_REGION_LIMIT;
+ ep->ob_region_map = 0; + err = of_property_read_u8(dev->of_node, "max-functions", &ep->epc->max_functions); if (err < 0) @@ -537,7 +506,8 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) struct rockchip_pcie *rockchip; struct pci_epc *epc; size_t max_regions; - int err; + struct pci_epc_mem_window *windows = NULL; + int err, i;
ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL); if (!ep) @@ -584,15 +554,27 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) /* Only enable function 0 by default */ rockchip_pcie_write(rockchip, BIT(0), PCIE_CORE_PHY_FUNC_CFG);
- err = pci_epc_mem_init(epc, rockchip->mem_res->start, - resource_size(rockchip->mem_res), PAGE_SIZE); + windows = devm_kcalloc(dev, ep->max_regions, + sizeof(struct pci_epc_mem_window), GFP_KERNEL); + if (!windows) { + err = -ENOMEM; + goto err_uninit_port; + } + for (i = 0; i < ep->max_regions; i++) { + windows[i].phys_base = rockchip->mem_res->start + (SZ_1M * i); + windows[i].size = SZ_1M; + windows[i].page_size = SZ_1M; + } + err = pci_epc_multi_mem_init(epc, windows, ep->max_regions); + devm_kfree(dev, windows); + if (err < 0) { dev_err(dev, "failed to initialize the memory space\n"); goto err_uninit_port; }
ep->irq_cpu_addr = pci_epc_mem_alloc_addr(epc, &ep->irq_phys_addr, - SZ_128K); + SZ_1M); if (!ep->irq_cpu_addr) { dev_err(dev, "failed to reserve memory space for MSI\n"); err = -ENOMEM; diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h index 8e92dc3339ecc..501d859420b4c 100644 --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -139,6 +139,7 @@
#define PCIE_RC_RP_ATS_BASE 0x400000 #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000 +#define PCIE_EP_PF_CONFIG_REGS_BASE 0x800000 #define PCIE_RC_CONFIG_BASE 0xa00000 #define PCIE_EP_CONFIG_BASE 0xa00000 #define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00) @@ -157,10 +158,11 @@ #define PCIE_RC_CONFIG_THP_CAP (PCIE_RC_CONFIG_BASE + 0x274) #define PCIE_RC_CONFIG_THP_CAP_NEXT_MASK GENMASK(31, 20)
+#define PCIE_ADDR_MASK 0xffffff00 #define PCIE_CORE_AXI_CONF_BASE 0xc00000 #define PCIE_CORE_OB_REGION_ADDR0 (PCIE_CORE_AXI_CONF_BASE + 0x0) #define PCIE_CORE_OB_REGION_ADDR0_NUM_BITS 0x3f -#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR 0xffffff00 +#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK #define PCIE_CORE_OB_REGION_ADDR1 (PCIE_CORE_AXI_CONF_BASE + 0x4) #define PCIE_CORE_OB_REGION_DESC0 (PCIE_CORE_AXI_CONF_BASE + 0x8) #define PCIE_CORE_OB_REGION_DESC1 (PCIE_CORE_AXI_CONF_BASE + 0xc) @@ -168,7 +170,7 @@ #define PCIE_CORE_AXI_INBOUND_BASE 0xc00800 #define PCIE_RP_IB_ADDR0 (PCIE_CORE_AXI_INBOUND_BASE + 0x0) #define PCIE_CORE_IB_REGION_ADDR0_NUM_BITS 0x3f -#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR 0xffffff00 +#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK #define PCIE_RP_IB_ADDR1 (PCIE_CORE_AXI_INBOUND_BASE + 0x4)
/* Size of one AXI Region (not Region 0) */ @@ -233,13 +235,15 @@ #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24) #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1 -#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12)) +#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3 +#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \ + (PCIE_EP_PF_CONFIG_REGS_BASE + (((fn) << 12) & GENMASK(19, 12))) +#define ROCKCHIP_PCIE_EP_VIRT_FUNC_BASE(fn) \ + (PCIE_EP_PF_CONFIG_REGS_BASE + 0x10000 + (((fn) << 12) & GENMASK(19, 12))) #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \ - (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008) + (PCIE_CORE_AXI_CONF_BASE + 0x0828 + (fn) * 0x0040 + (bar) * 0x0008) #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar) \ - (PCIE_RC_RP_ATS_BASE + 0x0844 + (fn) * 0x0040 + (bar) * 0x0008) -#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0000 + ((r) & 0x1f) * 0x0020) + (PCIE_CORE_AXI_CONF_BASE + 0x082c + (fn) * 0x0040 + (bar) * 0x0008) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN_MASK GENMASK(19, 12) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN(devfn) \ (((devfn) << 12) & \ @@ -247,20 +251,21 @@ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK GENMASK(27, 20) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS(bus) \ (((bus) << 20) & ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK) +#define PCIE_RC_EP_ATR_OB_REGIONS_1_32 (PCIE_CORE_AXI_CONF_BASE + 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0000 + ((r) & 0x1f) * 0x0020) #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0004 + ((r) & 0x1f) * 0x0020) + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0004 + ((r) & 0x1f) * 0x0020) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_HARDCODED_RID BIT(23) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK GENMASK(31, 24) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(devfn) \ (((devfn) << 24) & 
ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK) #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0008 + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x000c + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r) \ - (PCIE_RC_RP_ATS_BASE + 0x0018 + ((r) & 0x1f) * 0x0020) -#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r) \ - (PCIE_RC_RP_ATS_BASE + 0x001c + ((r) & 0x1f) * 0x0020) + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0008 + ((r) & 0x1f) * 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x000c + ((r) & 0x1f) * 0x0020) +#define ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r) \ + (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0010 + ((r) & 0x1f) * 0x0020)
#define ROCKCHIP_PCIE_CORE_EP_FUNC_BAR_CFG0(fn) \ (PCIE_CORE_CTRL_MGMT_BASE + 0x0240 + (fn) * 0x0008)
From: Rick Wertenbroek rick.wertenbroek@gmail.com
[ Upstream commit a52587e0bee14cbeeadf48a24013828cb04b8df8 ]
The RK3399 PCIe endpoint controller cannot generate MSI-X IRQs. This is documented in the RK3399 technical reference manual (TRM) section 17.5.9 "Interrupt Support".
MSI-X capability should therefore not be advertised. Remove the MSI-X capability by editing the capability linked-list. The previous entry is the MSI capability, therefore get the next entry from the MSI-X capability entry and set it as next entry for the MSI capability. This in effect removes MSI-X from the list.
Linked list before : MSI cap -> MSI-X cap -> PCIe Device cap -> ... Linked list now : MSI cap -> PCIe Device cap -> ...
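The unlinking step described above can be sketched on plain 32-bit values: copy the next-capability pointer (bits 15:8) of the entry being skipped into the previous entry's header, leaving all other bits of the previous header untouched. The helper names and the example header values below are hypothetical; only the bit layout (capability ID in bits 7:0, next pointer in bits 15:8) follows the standard PCI capability header and the GENMASK(15, 8) masks added by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define CAP_NEXT_MASK 0x0000ff00u /* bits 15:8: next-capability pointer */

/* Capability ID lives in bits 7:0 of the header dword. */
static uint8_t cap_id(uint32_t hdr)
{
	return (uint8_t)(hdr & 0xff);
}

/* Next-capability pointer lives in bits 15:8. */
static uint8_t cap_next(uint32_t hdr)
{
	return (uint8_t)((hdr & CAP_NEXT_MASK) >> 8);
}

/*
 * Return prev_hdr with its next pointer replaced by the next pointer of
 * skipped_hdr, so the skipped capability drops out of the list.
 */
static uint32_t skip_next_cap(uint32_t prev_hdr, uint32_t skipped_hdr)
{
	/* Capability pointers are dword aligned. */
	assert((cap_next(skipped_hdr) & 0x3) == 0);

	return (prev_hdr & ~CAP_NEXT_MASK) | (skipped_hdr & CAP_NEXT_MASK);
}
```

For example, with a made-up MSI header of 0x0080b005 (ID 0x05, next 0xb0) and a made-up MSI-X header of 0x0000c011 (ID 0x11, next 0xc0), skip_next_cap() yields 0x0080c005: the MSI entry now points at 0xc0 and its upper control bits are preserved.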
Link: https://lore.kernel.org/r/20230418074700.1083505-11-rick.wertenbroek@gmail.c... Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Tested-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Rick Wertenbroek rick.wertenbroek@gmail.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pcie-rockchip-ep.c | 24 +++++++++++++++++++++++ drivers/pci/controller/pcie-rockchip.h | 5 +++++ 2 files changed, 29 insertions(+)
diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c index 3d6f828d29fc2..0af0e965fb57e 100644 --- a/drivers/pci/controller/pcie-rockchip-ep.c +++ b/drivers/pci/controller/pcie-rockchip-ep.c @@ -508,6 +508,7 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev) size_t max_regions; struct pci_epc_mem_window *windows = NULL; int err, i; + u32 cfg_msi, cfg_msix_cp;
ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL); if (!ep) @@ -583,6 +584,29 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
+ /* + * MSI-X is not supported but the controller still advertises the MSI-X + * capability by default, which can lead to the Root Complex side + * allocating MSI-X vectors which cannot be used. Avoid this by skipping + * the MSI-X capability entry in the PCIe capabilities linked-list: get + * the next pointer from the MSI-X entry and set that in the MSI + * capability entry (which is the previous entry). This way the MSI-X + * entry is skipped (left out of the linked-list) and not advertised. + */ + cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE + + ROCKCHIP_PCIE_EP_MSI_CTRL_REG); + + cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK; + + cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE + + ROCKCHIP_PCIE_EP_MSIX_CAP_REG) & + ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK; + + cfg_msi |= cfg_msix_cp; + + rockchip_pcie_write(rockchip, cfg_msi, + PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG); + rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE, PCIE_CLIENT_CONFIG);
diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h index 501d859420b4c..fe0333778fd93 100644 --- a/drivers/pci/controller/pcie-rockchip.h +++ b/drivers/pci/controller/pcie-rockchip.h @@ -227,6 +227,8 @@ #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4 #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19) #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90 +#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8 +#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8) #define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17 #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17) @@ -234,6 +236,9 @@ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20) #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16) #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24) +#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0 +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8 +#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8) #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1 #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3 #define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \
From: Michael Strauss michael.strauss@amd.com
[ Upstream commit 9fa8cc0c444562fa19e20ca20f1c70e15b9d8c13 ]
[WHY] 32ms delay was added to resolve issue with a specific sink, however this same delay also introduces erroneous link training failures with certain sink devices.
[HOW] Only apply the 32ms delay for offending devices instead of globally.
Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: Jun Lei Jun.Lei@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Michael Strauss michael.strauss@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 5a096b73c8fe ("drm/amd/display: Keep disable aux-i delay as 0") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dc.h | 1 - drivers/gpu/drm/amd/display/dc/dc_types.h | 1 + .../link_dp_training_fixed_vs_pe_retimer.c | 17 +++++++++++------ 3 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 4d93ca9c627b0..07d86b961c798 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -855,7 +855,6 @@ struct dc_debug_options { bool force_usr_allow; /* uses value at boot and disables switch */ bool disable_dtb_ref_clk_switch; - uint32_t fixed_vs_aux_delay_config_wa; bool extended_blank_optimization; union aux_wake_wa_options aux_wake_wa; uint32_t mst_start_top_delay; diff --git a/drivers/gpu/drm/amd/display/dc/dc_types.h b/drivers/gpu/drm/amd/display/dc/dc_types.h index 45ab48fe5d004..139a77acd5d02 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_types.h +++ b/drivers/gpu/drm/amd/display/dc/dc_types.h @@ -196,6 +196,7 @@ struct dc_panel_patch { unsigned int disable_fams; unsigned int skip_avmute; unsigned int mst_start_top_delay; + unsigned int delay_disable_aux_intercept_ms; };
struct dc_edid_caps { diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c index 5731c4b61f9f0..fb6c938c6dab1 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c @@ -233,7 +233,8 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy( link->dpcd_caps.lttpr_caps.phy_repeater_cnt); const uint8_t vendor_lttpr_write_data_intercept_en[4] = {0x1, 0x55, 0x63, 0x0}; const uint8_t vendor_lttpr_write_data_intercept_dis[4] = {0x1, 0x55, 0x63, 0x68}; - uint32_t pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa; + uint32_t pre_disable_intercept_delay_ms = + link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0}; uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0}; uint32_t vendor_lttpr_write_address = 0xF004F; @@ -259,7 +260,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy(
/* Certain display and cable configuration require extra delay */ if (offset > 2) - pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa * 2; + pre_disable_intercept_delay_ms = pre_disable_intercept_delay_ms * 2; }
/* Vendor specific: Reset lane settings */ @@ -380,7 +381,8 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy( 0); /* Vendor specific: Disable intercept */ for (i = 0; i < max_vendor_dpcd_retries; i++) { - msleep(pre_disable_intercept_delay_ms); + if (pre_disable_intercept_delay_ms != 0) + msleep(pre_disable_intercept_delay_ms); dpcd_status = core_link_write_dpcd( link, vendor_lttpr_write_address, @@ -591,9 +593,11 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence( const uint8_t vendor_lttpr_write_data_adicora_eq1[4] = {0x1, 0x55, 0x63, 0x2E}; const uint8_t vendor_lttpr_write_data_adicora_eq2[4] = {0x1, 0x55, 0x63, 0x01}; const uint8_t vendor_lttpr_write_data_adicora_eq3[4] = {0x1, 0x55, 0x63, 0x68}; - uint32_t pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa; uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0}; uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0}; + uint32_t pre_disable_intercept_delay_ms = + link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; +
uint32_t vendor_lttpr_write_address = 0xF004F; enum link_training_result status = LINK_TRAINING_SUCCESS; @@ -618,7 +622,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence(
/* Certain display and cable configuration require extra delay */ if (offset > 2) - pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa * 2; + pre_disable_intercept_delay_ms = pre_disable_intercept_delay_ms * 2; }
/* Vendor specific: Reset lane settings */ @@ -739,7 +743,8 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence( 0); /* Vendor specific: Disable intercept */ for (i = 0; i < max_vendor_dpcd_retries; i++) { - msleep(pre_disable_intercept_delay_ms); + if (pre_disable_intercept_delay_ms != 0) + msleep(pre_disable_intercept_delay_ms); dpcd_status = core_link_write_dpcd( link, vendor_lttpr_write_address,
From: Michael Strauss michael.strauss@amd.com
[ Upstream commit 5a096b73c8fed3a9987ba15378285df360e2284b ]
[WHY] Current Aux-I sequence checks for local_sink which isn't populated on MST links
[HOW] Leave disable aux-i delay as 0 for MST cases
Cc: stable@vger.kernel.org Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: George Shen George.Shen@amd.com Reviewed-by: Aric Cyr Aric.Cyr@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Michael Strauss michael.strauss@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../link_dp_training_fixed_vs_pe_retimer.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c index fb6c938c6dab1..15faaf645b145 100644 --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c @@ -233,8 +233,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy( link->dpcd_caps.lttpr_caps.phy_repeater_cnt); const uint8_t vendor_lttpr_write_data_intercept_en[4] = {0x1, 0x55, 0x63, 0x0}; const uint8_t vendor_lttpr_write_data_intercept_dis[4] = {0x1, 0x55, 0x63, 0x68}; - uint32_t pre_disable_intercept_delay_ms = - link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; + uint32_t pre_disable_intercept_delay_ms = 0; uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0}; uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0}; uint32_t vendor_lttpr_write_address = 0xF004F; @@ -245,6 +244,10 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy( uint8_t toggle_rate; uint8_t rate;
+ if (link->local_sink) + pre_disable_intercept_delay_ms = + link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; + /* Only 8b/10b is supported */ ASSERT(link_dp_get_encoding_format(&lt_settings->link_settings) == DP_8b_10b_ENCODING); @@ -595,10 +598,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence( const uint8_t vendor_lttpr_write_data_adicora_eq3[4] = {0x1, 0x55, 0x63, 0x68}; uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0}; uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0}; - uint32_t pre_disable_intercept_delay_ms = - link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; - - + uint32_t pre_disable_intercept_delay_ms = 0; uint32_t vendor_lttpr_write_address = 0xF004F; enum link_training_result status = LINK_TRAINING_SUCCESS; uint8_t lane = 0; @@ -607,6 +607,10 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence( uint8_t toggle_rate; uint8_t rate;
+ if (link->local_sink) + pre_disable_intercept_delay_ms = + link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms; + /* Only 8b/10b is supported */ ASSERT(link_dp_get_encoding_format(&lt_settings->link_settings) == DP_8b_10b_ENCODING);
From: Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com
[ Upstream commit 9ba90d760e9354c124fa9bbea08017d96699a82c ]
This feature is meant to unblock PSTATE for certain high end display configs on dcn315. This is achieved by allocating CRB to detile buffer based on display requirements to meet pstate latency hiding needs.
Tested-by: Daniel Wheeler daniel.wheeler@amd.com Reviewed-by: Charlene Liu Charlene.Liu@amd.com Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Dmytro Laktyushkin Dmytro.Laktyushkin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: 49f26218c344 ("drm/amd/display: fix dcn315 single stream crb allocation") Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/amd/display/dc/dcn31/dcn31_hubbub.c | 1 + .../amd/display/dc/dcn315/dcn315_resource.c | 97 ++++++++++++++++++- .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 25 ++++- .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 3 + .../dc/dml/dcn31/display_mode_vba_31.c | 39 +++++--- .../drm/amd/display/dc/dml/display_mode_vba.c | 6 ++ 6 files changed, 154 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c index 7e7cd5b64e6a1..7445ed27852a1 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c @@ -103,6 +103,7 @@ static void dcn31_program_det_size(struct hubbub *hubbub, int hubp_inst, unsigne default: break; } + DC_LOG_DEBUG("Set DET%d to %d segments\n", hubp_inst, det_size_segments); /* Should never be hit, if it is we have an erroneous hw config*/ ASSERT(hubbub2->det0_size + hubbub2->det1_size + hubbub2->det2_size + hubbub2->det3_size + hubbub2->compbuf_size_segments <= hubbub2->crb_size_segs); diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c index 41c972c8eb198..42a0157fd8133 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c @@ -136,6 +136,9 @@
#define DCN3_15_MAX_DET_SIZE 384 #define DCN3_15_CRB_SEGMENT_SIZE_KB 64 +#define DCN3_15_MAX_DET_SEGS (DCN3_15_MAX_DET_SIZE / DCN3_15_CRB_SEGMENT_SIZE_KB) +/* Minimum 2 extra segments need to be in compbuf and claimable to guarantee seamless mpo transitions */ +#define MIN_RESERVED_DET_SEGS 2
enum dcn31_clk_src_array_id { DCN31_CLK_SRC_PLL0, @@ -1636,21 +1639,57 @@ static bool is_dual_plane(enum surface_pixel_format format) return format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN || format == SURFACE_PIXEL_FORMAT_GRPH_RGBE_ALPHA; }
+static int source_format_to_bpp (enum source_format_class SourcePixelFormat) +{ + if (SourcePixelFormat == dm_444_64) + return 8; + else if (SourcePixelFormat == dm_444_16 || SourcePixelFormat == dm_444_16) + return 2; + else if (SourcePixelFormat == dm_444_8) + return 1; + else if (SourcePixelFormat == dm_rgbe_alpha) + return 5; + else if (SourcePixelFormat == dm_420_8) + return 3; + else if (SourcePixelFormat == dm_420_12) + return 6; + else + return 4; +} + +static bool allow_pixel_rate_crb(struct dc *dc, struct dc_state *context) +{ + int i; + struct resource_context *res_ctx = &context->res_ctx; + + for (i = 0; i < dc->res_pool->pipe_count; i++) { + if (!res_ctx->pipe_ctx[i].stream) + continue; + + /*Don't apply if MPO to avoid transition issues*/ + if (res_ctx->pipe_ctx[i].top_pipe && res_ctx->pipe_ctx[i].top_pipe->plane_state != res_ctx->pipe_ctx[i].plane_state) + return false; + } + return true; +} + static int dcn315_populate_dml_pipes_from_context( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, bool fast_validate) { - int i, pipe_cnt; + int i, pipe_cnt, crb_idx, crb_pipes; struct resource_context *res_ctx = &context->res_ctx; struct pipe_ctx *pipe; const int max_usable_det = context->bw_ctx.dml.ip.config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB; + int remaining_det_segs = max_usable_det / DCN3_15_CRB_SEGMENT_SIZE_KB; + bool pixel_rate_crb = allow_pixel_rate_crb(dc, context);
DC_FP_START(); dcn31x_populate_dml_pipes_from_context(dc, context, pipes, fast_validate); DC_FP_END();
- for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) { + for (i = 0, pipe_cnt = 0, crb_pipes = 0; i < dc->res_pool->pipe_count; i++) { struct dc_crtc_timing *timing;
if (!res_ctx->pipe_ctx[i].stream) @@ -1671,6 +1710,23 @@ static int dcn315_populate_dml_pipes_from_context( pipes[pipe_cnt].dout.dsc_input_bpc = 0; DC_FP_START(); dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt); + if (pixel_rate_crb && !pipe->top_pipe && !pipe->prev_odm_pipe) { + int bpp = source_format_to_bpp(pipes[pipe_cnt].pipe.src.source_format); + /* Ceil to crb segment size */ + int approx_det_segs_required_for_pstate = dcn_get_approx_det_segs_required_for_pstate( + &context->bw_ctx.dml.soc, timing->pix_clk_100hz, bpp, DCN3_15_CRB_SEGMENT_SIZE_KB); + if (approx_det_segs_required_for_pstate <= 2 * DCN3_15_MAX_DET_SEGS) { + bool split_required = approx_det_segs_required_for_pstate > DCN3_15_MAX_DET_SEGS; + split_required = split_required || timing->pix_clk_100hz >= dcn_get_max_non_odm_pix_rate_100hz(&dc->dml.soc); + split_required = split_required || (pipe->plane_state && pipe->plane_state->src_rect.width > 5120); + if (split_required) + approx_det_segs_required_for_pstate += approx_det_segs_required_for_pstate % 2; + pipes[pipe_cnt].pipe.src.det_size_override = approx_det_segs_required_for_pstate; + remaining_det_segs -= approx_det_segs_required_for_pstate; + } else + remaining_det_segs = -1; + crb_pipes++; + } DC_FP_END();
if (pipes[pipe_cnt].dout.dsc_enable) { @@ -1689,16 +1745,49 @@ static int dcn315_populate_dml_pipes_from_context( break; } } - pipe_cnt++; }
+ /* Spread remaining unreserved crb evenly among all pipes, use default policy if not enough det or single pipe */ + if (pixel_rate_crb) { + for (i = 0, pipe_cnt = 0, crb_idx = 0; i < dc->res_pool->pipe_count; i++) { + pipe = &res_ctx->pipe_ctx[i]; + if (!pipe->stream) + continue; + + if (!pipe->top_pipe && !pipe->prev_odm_pipe) { + bool split_required = pipe->stream->timing.pix_clk_100hz >= dcn_get_max_non_odm_pix_rate_100hz(&dc->dml.soc) + || (pipe->plane_state && pipe->plane_state->src_rect.width > 5120); + + if (remaining_det_segs < 0 || crb_pipes == 1) + pipes[pipe_cnt].pipe.src.det_size_override = 0; + if (remaining_det_segs > MIN_RESERVED_DET_SEGS) + pipes[pipe_cnt].pipe.src.det_size_override += (remaining_det_segs - MIN_RESERVED_DET_SEGS) / crb_pipes + + (crb_idx < (remaining_det_segs - MIN_RESERVED_DET_SEGS) % crb_pipes ? 1 : 0); + if (pipes[pipe_cnt].pipe.src.det_size_override > 2 * DCN3_15_MAX_DET_SEGS) { + /* Clamp to 2 pipe split max det segments */ + remaining_det_segs += pipes[pipe_cnt].pipe.src.det_size_override - 2 * (DCN3_15_MAX_DET_SEGS); + pipes[pipe_cnt].pipe.src.det_size_override = 2 * DCN3_15_MAX_DET_SEGS; + } + if (pipes[pipe_cnt].pipe.src.det_size_override > DCN3_15_MAX_DET_SEGS || split_required) { + /* If we are splitting we must have an even number of segments */ + remaining_det_segs += pipes[pipe_cnt].pipe.src.det_size_override % 2; + pipes[pipe_cnt].pipe.src.det_size_override -= pipes[pipe_cnt].pipe.src.det_size_override % 2; + } + /* Convert segments into size for DML use */ + pipes[pipe_cnt].pipe.src.det_size_override *= DCN3_15_CRB_SEGMENT_SIZE_KB; + crb_idx++; + } + pipe_cnt++; + } + } + if (pipe_cnt) context->bw_ctx.dml.ip.det_buffer_size_kbytes = (max_usable_det / DCN3_15_CRB_SEGMENT_SIZE_KB / pipe_cnt) * DCN3_15_CRB_SEGMENT_SIZE_KB; if (context->bw_ctx.dml.ip.det_buffer_size_kbytes > DCN3_15_MAX_DET_SIZE) context->bw_ctx.dml.ip.det_buffer_size_kbytes = DCN3_15_MAX_DET_SIZE; - ASSERT(context->bw_ctx.dml.ip.det_buffer_size_kbytes 
>= DCN3_15_DEFAULT_DET_SIZE); + dc->config.enable_4to1MPC = false; if (pipe_cnt == 1 && pipe->plane_state && !dc->debug.disable_z9_mpc) { if (is_dual_plane(pipe->plane_state->format) diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c index 59836570603ac..19d034341e640 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -483,7 +483,7 @@ void dcn31_calculate_wm_and_dlg_fp( int pipe_cnt, int vlevel) { - int i, pipe_idx, active_hubp_count = 0; + int i, pipe_idx, total_det = 0, active_hubp_count = 0; double dcfclk = context->bw_ctx.dml.vba.DCFCLKState[vlevel][context->bw_ctx.dml.vba.maxMpcComb];
dc_assert_fp_enabled(); @@ -563,6 +563,18 @@ void dcn31_calculate_wm_and_dlg_fp( if (context->res_ctx.pipe_ctx[i].stream) context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0; } + for (i = 0, pipe_idx = 0; i < dc->res_pool->pipe_count; i++) { + if (!context->res_ctx.pipe_ctx[i].stream) + continue; + + context->res_ctx.pipe_ctx[i].det_buffer_size_kb = + get_det_buffer_size_kbytes(&context->bw_ctx.dml, pipes, pipe_cnt, pipe_idx); + if (context->res_ctx.pipe_ctx[i].det_buffer_size_kb > 384) + context->res_ctx.pipe_ctx[i].det_buffer_size_kb /= 2; + total_det += context->res_ctx.pipe_ctx[i].det_buffer_size_kb; + pipe_idx++; + } + context->bw_ctx.bw.dcn.compbuf_size_kb = context->bw_ctx.dml.ip.config_return_buffer_size_in_kbytes - total_det; }
void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params) @@ -815,3 +827,14 @@ int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc) { return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0); } + +int dcn_get_approx_det_segs_required_for_pstate( + struct _vcs_dpi_soc_bounding_box_st *soc, + int pix_clk_100hz, int bpp, int seg_size_kb) +{ + /* Roughly calculate required crb to hide latency. In practice there is slightly + * more buffer available for latency hiding + */ + return (int)(soc->dram_clock_change_latency_us * pix_clk_100hz * bpp + / 10240000 + seg_size_kb - 1) / seg_size_kb; +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h index 687d3522cc33e..8f9c8faed2605 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h @@ -47,6 +47,9 @@ void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params void dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params); void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params); int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc); +int dcn_get_approx_det_segs_required_for_pstate( + struct _vcs_dpi_soc_bounding_box_st *soc, + int pix_clk_100hz, int bpp, int seg_size_kb);
int dcn31x_populate_dml_pipes_from_context(struct dc *dc, struct dc_state *context, diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c index bd674dc30df33..a0f44eef7763f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c @@ -532,7 +532,8 @@ static void CalculateStutterEfficiency( static void CalculateSwathAndDETConfiguration( bool ForceSingleDPP, int NumberOfActivePlanes, - unsigned int DETBufferSizeInKByte, + bool DETSharedByAllDPP, + unsigned int DETBufferSizeInKByte[], double MaximumSwathWidthLuma[], double MaximumSwathWidthChroma[], enum scan_direction_class SourceScan[], @@ -3118,7 +3119,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman v->SurfaceWidthC[k], v->SurfaceHeightY[k], v->SurfaceHeightC[k], - v->DETBufferSizeInKByte[0] * 1024, + v->DETBufferSizeInKByte[k] * 1024, v->BlockHeight256BytesY[k], v->BlockHeight256BytesC[k], v->SurfaceTiling[k], @@ -3313,7 +3314,8 @@ static void DisplayPipeConfiguration(struct display_mode_lib *mode_lib) CalculateSwathAndDETConfiguration( false, v->NumberOfActivePlanes, - v->DETBufferSizeInKByte[0], + mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0], + v->DETBufferSizeInKByte, dummy1, dummy2, v->SourceScan, @@ -3779,14 +3781,16 @@ static noinline void CalculatePrefetchSchedulePerPlane( &v->VReadyOffsetPix[k]); }
-static void PatchDETBufferSizeInKByte(unsigned int NumberOfActivePlanes, int NoOfDPPThisState[], unsigned int config_return_buffer_size_in_kbytes, unsigned int *DETBufferSizeInKByte) +static void PatchDETBufferSizeInKByte(unsigned int NumberOfActivePlanes, int NoOfDPPThisState[], unsigned int config_return_buffer_size_in_kbytes, unsigned int DETBufferSizeInKByte[]) { int i, total_pipes = 0; for (i = 0; i < NumberOfActivePlanes; i++) total_pipes += NoOfDPPThisState[i]; - *DETBufferSizeInKByte = ((config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB) / 64 / total_pipes) * 64; - if (*DETBufferSizeInKByte > DCN3_15_MAX_DET_SIZE) - *DETBufferSizeInKByte = DCN3_15_MAX_DET_SIZE; + DETBufferSizeInKByte[0] = ((config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB) / 64 / total_pipes) * 64; + if (DETBufferSizeInKByte[0] > DCN3_15_MAX_DET_SIZE) + DETBufferSizeInKByte[0] = DCN3_15_MAX_DET_SIZE; + for (i = 1; i < NumberOfActivePlanes; i++) + DETBufferSizeInKByte[i] = DETBufferSizeInKByte[0]; }
@@ -4026,7 +4030,8 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l CalculateSwathAndDETConfiguration( true, v->NumberOfActivePlanes, - v->DETBufferSizeInKByte[0], + mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0], + v->DETBufferSizeInKByte, v->MaximumSwathWidthLuma, v->MaximumSwathWidthChroma, v->SourceScan, @@ -4166,6 +4171,10 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l || (v->PlaneRequiredDISPCLK > v->MaxDispclkRoundedDownToDFSGranularity)) { v->DISPCLK_DPPCLK_Support[i][j] = false; } + if (mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[k] > DCN3_15_MAX_DET_SIZE && v->NoOfDPP[i][j][k] < 2) { + v->MPCCombine[i][j][k] = true; + v->NoOfDPP[i][j][k] = 2; + } } v->TotalNumberOfActiveDPP[i][j] = 0; v->TotalNumberOfSingleDPPPlanes[i][j] = 0; @@ -4642,12 +4651,13 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l v->ODMCombineEnableThisState[k] = v->ODMCombineEnablePerState[i][k]; }
- if (v->NumberOfActivePlanes > 1 && mode_lib->project == DML_PROJECT_DCN315) - PatchDETBufferSizeInKByte(v->NumberOfActivePlanes, v->NoOfDPPThisState, v->ip.config_return_buffer_size_in_kbytes, &v->DETBufferSizeInKByte[0]); + if (v->NumberOfActivePlanes > 1 && mode_lib->project == DML_PROJECT_DCN315 && !v->DETSizeOverride[0]) + PatchDETBufferSizeInKByte(v->NumberOfActivePlanes, v->NoOfDPPThisState, v->ip.config_return_buffer_size_in_kbytes, v->DETBufferSizeInKByte); CalculateSwathAndDETConfiguration( false, v->NumberOfActivePlanes, - v->DETBufferSizeInKByte[0], + mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0], + v->DETBufferSizeInKByte, v->MaximumSwathWidthLuma, v->MaximumSwathWidthChroma, v->SourceScan, @@ -6611,7 +6621,8 @@ static void CalculateStutterEfficiency( static void CalculateSwathAndDETConfiguration( bool ForceSingleDPP, int NumberOfActivePlanes, - unsigned int DETBufferSizeInKByte, + bool DETSharedByAllDPP, + unsigned int DETBufferSizeInKByteA[], double MaximumSwathWidthLuma[], double MaximumSwathWidthChroma[], enum scan_direction_class SourceScan[], @@ -6695,6 +6706,10 @@ static void CalculateSwathAndDETConfiguration(
*ViewportSizeSupport = true; for (k = 0; k < NumberOfActivePlanes; ++k) { + unsigned int DETBufferSizeInKByte = DETBufferSizeInKByteA[k]; + + if (DETSharedByAllDPP && DPPPerPlane[k]) + DETBufferSizeInKByte /= DPPPerPlane[k]; if ((SourcePixelFormat[k] == dm_444_64 || SourcePixelFormat[k] == dm_444_32 || SourcePixelFormat[k] == dm_444_16 || SourcePixelFormat[k] == dm_mono_16 || SourcePixelFormat[k] == dm_mono_8 || SourcePixelFormat[k] == dm_rgbe)) { if (SurfaceTiling[k] == dm_sw_linear diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c b/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c index f9653f511baa3..2f63ae954826c 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c +++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c @@ -571,6 +571,10 @@ static void fetch_pipe_params(struct display_mode_lib *mode_lib) mode_lib->vba.OutputLinkDPRate[mode_lib->vba.NumberOfActivePlanes] = dout->dp_rate; mode_lib->vba.ODMUse[mode_lib->vba.NumberOfActivePlanes] = dst->odm_combine_policy; mode_lib->vba.DETSizeOverride[mode_lib->vba.NumberOfActivePlanes] = src->det_size_override; + if (src->det_size_override) + mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = src->det_size_override; + else + mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = ip->det_buffer_size_kbytes; //TODO: Need to assign correct values to dp_multistream vars mode_lib->vba.OutputMultistreamEn[mode_lib->vba.NumberOfActiveSurfaces] = dout->dp_multistream_en; mode_lib->vba.OutputMultistreamId[mode_lib->vba.NumberOfActiveSurfaces] = dout->dp_multistream_id; @@ -785,6 +789,8 @@ static void fetch_pipe_params(struct display_mode_lib *mode_lib) mode_lib->vba.pipe_plane[k] = mode_lib->vba.NumberOfActivePlanes; mode_lib->vba.DPPPerPlane[mode_lib->vba.NumberOfActivePlanes]++; + if (src_k->det_size_override) + mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = src_k->det_size_override; if 
(mode_lib->vba.SourceScan[mode_lib->vba.NumberOfActivePlanes] == dm_horz) { mode_lib->vba.ViewportWidth[mode_lib->vba.NumberOfActivePlanes] +=
From: Dmytro Laktyushkin dmytro.laktyushkin@amd.com
[ Upstream commit 49f26218c344741cb3eaa740b1e44e960551a87f ]
Improve avoidance of asymmetric CRB calculations for single-stream scenarios.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Acked-by: Stylon Wang stylon.wang@amd.com Signed-off-by: Dmytro Laktyushkin dmytro.laktyushkin@amd.com Reviewed-by: Charlene Liu Charlene.Liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/amd/display/dc/dcn315/dcn315_resource.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c index 42a0157fd8133..ae99b2851e019 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c @@ -1662,6 +1662,10 @@ static bool allow_pixel_rate_crb(struct dc *dc, struct dc_state *context) int i; struct resource_context *res_ctx = &context->res_ctx;
+ /*Don't apply for single stream*/ + if (context->stream_count < 2) + return false; + for (i = 0; i < dc->res_pool->pipe_count; i++) { if (!res_ctx->pipe_ctx[i].stream) continue; @@ -1748,19 +1752,23 @@ static int dcn315_populate_dml_pipes_from_context( pipe_cnt++; }
- /* Spread remaining unreserved crb evenly among all pipes, use default policy if not enough det or single pipe */ + /* Spread remaining unreserved crb evenly among all pipes*/ if (pixel_rate_crb) { for (i = 0, pipe_cnt = 0, crb_idx = 0; i < dc->res_pool->pipe_count; i++) { pipe = &res_ctx->pipe_ctx[i]; if (!pipe->stream) continue;
+ /* Do not use asymetric crb if not enough for pstate support */ + if (remaining_det_segs < 0) { + pipes[pipe_cnt].pipe.src.det_size_override = 0; + continue; + } + if (!pipe->top_pipe && !pipe->prev_odm_pipe) { bool split_required = pipe->stream->timing.pix_clk_100hz >= dcn_get_max_non_odm_pix_rate_100hz(&dc->dml.soc) || (pipe->plane_state && pipe->plane_state->src_rect.width > 5120);
- if (remaining_det_segs < 0 || crb_pipes == 1) - pipes[pipe_cnt].pipe.src.det_size_override = 0; if (remaining_det_segs > MIN_RESERVED_DET_SEGS) pipes[pipe_cnt].pipe.src.det_size_override += (remaining_det_segs - MIN_RESERVED_DET_SEGS) / crb_pipes + (crb_idx < (remaining_det_segs - MIN_RESERVED_DET_SEGS) % crb_pipes ? 1 : 0); @@ -1776,6 +1784,7 @@ static int dcn315_populate_dml_pipes_from_context( } /* Convert segments into size for DML use */ pipes[pipe_cnt].pipe.src.det_size_override *= DCN3_15_CRB_SEGMENT_SIZE_KB; + crb_idx++; } pipe_cnt++;
From: Cruise Hung cruise.hung@amd.com
[ Upstream commit 268182606f26434c5d3ebd0e86efcb0418dec487 ]
[Why] The register header for DCN314 is not correct.
[How] Use the correct DCN314 register header.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Tom Chung chiahsuan.chung@amd.com Signed-off-by: Cruise Hung cruise.hung@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: cd2e31a9ab93 ("drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dmub/src/Makefile | 2 +- .../drm/amd/display/dmub/src/dmub_dcn314.c | 62 +++++++++++++++++++ .../drm/amd/display/dmub/src/dmub_dcn314.h | 33 ++++++++++ .../gpu/drm/amd/display/dmub/src/dmub_srv.c | 5 +- 4 files changed, 100 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c create mode 100644 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
diff --git a/drivers/gpu/drm/amd/display/dmub/src/Makefile b/drivers/gpu/drm/amd/display/dmub/src/Makefile index 0589ad4778eea..caf095aca8f3f 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/Makefile +++ b/drivers/gpu/drm/amd/display/dmub/src/Makefile @@ -22,7 +22,7 @@
DMUB = dmub_srv.o dmub_srv_stat.o dmub_reg.o dmub_dcn20.o dmub_dcn21.o DMUB += dmub_dcn30.o dmub_dcn301.o dmub_dcn302.o dmub_dcn303.o -DMUB += dmub_dcn31.o dmub_dcn315.o dmub_dcn316.o +DMUB += dmub_dcn31.o dmub_dcn314.o dmub_dcn315.o dmub_dcn316.o DMUB += dmub_dcn32.o
AMD_DAL_DMUB = $(addprefix $(AMDDALPATH)/dmub/src/,$(DMUB)) diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c new file mode 100644 index 0000000000000..48a06dbd9be78 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c @@ -0,0 +1,62 @@ +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: AMD + * + */ + +#include "../dmub_srv.h" +#include "dmub_reg.h" +#include "dmub_dcn314.h" + +#include "dcn/dcn_3_1_4_offset.h" +#include "dcn/dcn_3_1_4_sh_mask.h" + +#define DCN_BASE__INST0_SEG0 0x00000012 +#define DCN_BASE__INST0_SEG1 0x000000C0 +#define DCN_BASE__INST0_SEG2 0x000034C0 +#define DCN_BASE__INST0_SEG3 0x00009000 +#define DCN_BASE__INST0_SEG4 0x02403C00 +#define DCN_BASE__INST0_SEG5 0 + +#define BASE_INNER(seg) DCN_BASE__INST0_SEG##seg +#define CTX dmub +#define REGS dmub->regs_dcn31 +#define REG_OFFSET_EXP(reg_name) (BASE(reg##reg_name##_BASE_IDX) + reg##reg_name) + +/* Registers. */ + +const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs = { +#define DMUB_SR(reg) REG_OFFSET_EXP(reg), + { + DMUB_DCN31_REGS() + DMCUB_INTERNAL_REGS() + }, +#undef DMUB_SR + +#define DMUB_SF(reg, field) FD_MASK(reg, field), + { DMUB_DCN31_FIELDS() }, +#undef DMUB_SF + +#define DMUB_SF(reg, field) FD_SHIFT(reg, field), + { DMUB_DCN31_FIELDS() }, +#undef DMUB_SF +}; diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h new file mode 100644 index 0000000000000..674267a2940e9 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h @@ -0,0 +1,33 @@ +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#ifndef _DMUB_DCN314_H_ +#define _DMUB_DCN314_H_ + +#include "dmub_dcn31.h" + +extern const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs; + +#endif /* _DMUB_DCN314_H_ */ diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c index 92c18bfb98b3b..6d76ce327d69f 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -32,6 +32,7 @@ #include "dmub_dcn302.h" #include "dmub_dcn303.h" #include "dmub_dcn31.h" +#include "dmub_dcn314.h" #include "dmub_dcn315.h" #include "dmub_dcn316.h" #include "dmub_dcn32.h" @@ -226,7 +227,9 @@ static bool dmub_srv_hw_setup(struct dmub_srv *dmub, enum dmub_asic asic) case DMUB_ASIC_DCN314: case DMUB_ASIC_DCN315: case DMUB_ASIC_DCN316: - if (asic == DMUB_ASIC_DCN315) + if (asic == DMUB_ASIC_DCN314) + dmub->regs_dcn31 = &dmub_srv_dcn314_regs; + else if (asic == DMUB_ASIC_DCN315) dmub->regs_dcn31 = &dmub_srv_dcn315_regs; else if (asic == DMUB_ASIC_DCN316) dmub->regs_dcn31 = &dmub_srv_dcn316_regs;
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit c35b6ea8f2ecfa9d775530b70d4e727869099a9c ]
A number of Parade TCONs are causing system hangs when used with older DMUB firmware and PSR-SU. Some changes have been introduced into DMUB firmware to add resilience against these failures.
Don't allow running PSR-SU unless on the newer firmware.
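The gating pattern described here can be sketched outside the driver as follows. This is a minimal illustration under stated assumptions: every sketch_ name and the SKETCH_FW_VERSION packing are hypothetical stand-ins, not the driver's definitions. Only the (4, 0, 59) threshold and the "no hook means allowed" fallback are taken from the diff below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packing for illustration: major, minor, revision in the top three bytes. */
#define SKETCH_FW_VERSION(major, minor, rev) \
	((((uint32_t)(major) & 0xFF) << 24) | \
	 (((uint32_t)(minor) & 0xFF) << 16) | \
	 (((uint32_t)(rev) & 0xFF) << 8))

struct sketch_srv;

/* Per-ASIC hook table; the hook stays NULL where no firmware minimum applies. */
struct sketch_hw_funcs {
	bool (*is_psrsu_supported)(struct sketch_srv *srv);
};

struct sketch_srv {
	uint32_t fw_version;
	struct sketch_hw_funcs hw_funcs;
};

/* ASIC-specific check: PSR-SU only on firmware at or above the threshold. */
static bool sketch_dcn31_is_psrsu_supported(struct sketch_srv *srv)
{
	return srv->fw_version >= SKETCH_FW_VERSION(4, 0, 59);
}

/* Generic entry point: ASICs without a hook are treated as supported. */
static bool sketch_check_min_version(struct sketch_srv *srv)
{
	if (!srv->hw_funcs.is_psrsu_supported)
		return true;
	return srv->hw_funcs.is_psrsu_supported(srv);
}
```

Keeping the hook optional means an ASIC that never installs it keeps its previous behavior, so the restriction applies only where a firmware minimum is known.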
Cc: stable@vger.kernel.org Cc: Sean Wang sean.ns.wang@amd.com Cc: Marc Rossi Marc.Rossi@amd.com Cc: Hamza Mahfooz Hamza.Mahfooz@amd.com Cc: Tsung-hua (Ryan) Lin Tsung-hua.Lin@amd.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2443 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Leo Li sunpeng.li@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Stable-dep-of: cd2e31a9ab93 ("drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 3 ++- drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 7 +++++++ drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h | 1 + drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 2 ++ drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c | 5 +++++ drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h | 2 ++ drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 10 ++++++---- 7 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c index d647f68fd5630..4f61d4f257cd7 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c @@ -24,6 +24,7 @@ */
#include "amdgpu_dm_psr.h" +#include "dc_dmub_srv.h" #include "dc.h" #include "dm_helpers.h" #include "amdgpu_dm.h" @@ -50,7 +51,7 @@ static bool link_supports_psrsu(struct dc_link *link) !link->dpcd_caps.psr_info.psr2_su_y_granularity_cap) return false;
- return true; + return dc_dmub_check_min_version(dc->ctx->dmub_srv->dmub); }
/* diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c index a9b9490a532c2..ab4542b57b9a3 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c @@ -1079,3 +1079,10 @@ void dc_send_update_cursor_info_to_dmu( dc_send_cmd_to_dmu(pCtx->stream->ctx->dmub_srv, &cmd); } } + +bool dc_dmub_check_min_version(struct dmub_srv *srv) +{ + if (!srv->hw_funcs.is_psrsu_supported) + return true; + return srv->hw_funcs.is_psrsu_supported(srv); +} diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h index d34f5563df2ec..9a248ced03b9c 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h @@ -89,4 +89,5 @@ void dc_dmub_setup_subvp_dmub_command(struct dc *dc, struct dc_state *context, b void dc_dmub_srv_log_diagnostic_data(struct dc_dmub_srv *dc_dmub_srv);
void dc_send_update_cursor_info_to_dmu(struct pipe_ctx *pCtx, uint8_t pipe_idx); +bool dc_dmub_check_min_version(struct dmub_srv *srv); #endif /* _DMUB_DC_SRV_H_ */ diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h index 554ab48d4e647..9cad599b27094 100644 --- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -364,6 +364,8 @@ struct dmub_srv_hw_funcs {
bool (*is_supported)(struct dmub_srv *dmub);
+ bool (*is_psrsu_supported)(struct dmub_srv *dmub); + bool (*is_hw_init)(struct dmub_srv *dmub);
bool (*is_phy_init)(struct dmub_srv *dmub); diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c index c90b9ee42e126..89d24fb7024e2 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c @@ -297,6 +297,11 @@ bool dmub_dcn31_is_supported(struct dmub_srv *dmub) return supported; }
+bool dmub_dcn31_is_psrsu_supported(struct dmub_srv *dmub) +{ + return dmub->fw_version >= DMUB_FW_VERSION(4, 0, 59); +} + void dmub_dcn31_set_gpint(struct dmub_srv *dmub, union dmub_gpint_data_register reg) { diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h index f6db6f89d45dc..eb62410941473 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h @@ -219,6 +219,8 @@ bool dmub_dcn31_is_hw_init(struct dmub_srv *dmub);
bool dmub_dcn31_is_supported(struct dmub_srv *dmub);
+bool dmub_dcn31_is_psrsu_supported(struct dmub_srv *dmub); + void dmub_dcn31_set_gpint(struct dmub_srv *dmub, union dmub_gpint_data_register reg);
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c index 6d76ce327d69f..0f43a05a41874 100644 --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -227,14 +227,16 @@ static bool dmub_srv_hw_setup(struct dmub_srv *dmub, enum dmub_asic asic) case DMUB_ASIC_DCN314: case DMUB_ASIC_DCN315: case DMUB_ASIC_DCN316: - if (asic == DMUB_ASIC_DCN314) + if (asic == DMUB_ASIC_DCN314) { dmub->regs_dcn31 = &dmub_srv_dcn314_regs; - else if (asic == DMUB_ASIC_DCN315) + } else if (asic == DMUB_ASIC_DCN315) { dmub->regs_dcn31 = &dmub_srv_dcn315_regs; - else if (asic == DMUB_ASIC_DCN316) + } else if (asic == DMUB_ASIC_DCN316) { dmub->regs_dcn31 = &dmub_srv_dcn316_regs; - else + } else { dmub->regs_dcn31 = &dmub_srv_dcn31_regs; + funcs->is_psrsu_supported = dmub_dcn31_is_psrsu_supported; + } funcs->reset = dmub_dcn31_reset; funcs->reset_release = dmub_dcn31_reset_release; funcs->backdoor_load = dmub_dcn31_backdoor_load;
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit cd2e31a9ab93d13c412a36c6e26811e0f830985b ]
The same Parade TCON issue can potentially happen on Phoenix, and the same PSR resilience changes have been ported into the DMUB firmware.
Don't allow running PSR-SU unless on the newer firmware.
Cc: stable@vger.kernel.org
Cc: Sean Wang sean.ns.wang@amd.com
Cc: Marc Rossi Marc.Rossi@amd.com
Cc: Hamza Mahfooz Hamza.Mahfooz@amd.com
Cc: Tsung-hua (Ryan) Lin Tsung-hua.Lin@amd.com
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Reviewed-by: Leo Li sunpeng.li@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c | 5 +++++
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h | 2 ++
 drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c    | 1 +
 3 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c
index 48a06dbd9be78..f161aeb7e7c4a 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c
@@ -60,3 +60,8 @@ const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs = {
 	{ DMUB_DCN31_FIELDS() },
 #undef DMUB_SF
 };
+
+bool dmub_dcn314_is_psrsu_supported(struct dmub_srv *dmub)
+{
+	return dmub->fw_version >= DMUB_FW_VERSION(8, 0, 16);
+}
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
index 674267a2940e9..f213bd82c9110 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
@@ -30,4 +30,6 @@
 
 extern const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs;
 
+bool dmub_dcn314_is_psrsu_supported(struct dmub_srv *dmub);
+
 #endif /* _DMUB_DCN314_H_ */
diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
index 0f43a05a41874..0dab22d794808 100644
--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
@@ -229,6 +229,7 @@ static bool dmub_srv_hw_setup(struct dmub_srv *dmub, enum dmub_asic asic)
 	case DMUB_ASIC_DCN316:
 		if (asic == DMUB_ASIC_DCN314) {
 			dmub->regs_dcn31 = &dmub_srv_dcn314_regs;
+			funcs->is_psrsu_supported = dmub_dcn314_is_psrsu_supported;
 		} else if (asic == DMUB_ASIC_DCN315) {
 			dmub->regs_dcn31 = &dmub_srv_dcn315_regs;
 		} else if (asic == DMUB_ASIC_DCN316) {
From: Christian König christian.koenig@amd.com
[ Upstream commit a2848d08742c8e8494675892c02c0d22acbe3cf8 ]
There is a small window where we have already incremented the pin count but not yet moved the bo from the lru to the pinned list.
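The early return that closes this window can be sketched outside the kernel as follows. This is a simplified illustration under stated assumptions: struct example_bo, its lock_free field and example_evict_allowable() are hypothetical, and the lock_free branch is only a stand-in for the reservation checks that the real function performs.

```c
#include <stdbool.h>

/* Hypothetical, simplified buffer object. */
struct example_bo {
	unsigned int pin_count;	/* non-zero while the buffer is pinned */
	bool lock_free;		/* stand-in: would a trylock succeed? */
};

/*
 * Decide whether a buffer may be evicted or swapped out.
 * A pinned buffer is rejected first, even if it is still on the LRU,
 * and is reported as neither locked by us nor busy.
 */
static bool example_evict_allowable(const struct example_bo *bo,
				    bool *locked, bool *busy)
{
	if (bo->pin_count) {
		*locked = false;
		*busy = false;
		return false;
	}

	/* Stand-in for the remaining reservation checks. */
	if (bo->lock_free) {
		*locked = true;
		*busy = false;
		return true;
	}

	*locked = false;
	*busy = true;
	return false;
}
```

Checking the pin count before anything else means a buffer caught between "pin count incremented" and "moved to the pinned list" is never selected, regardless of its lock state.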
Signed-off-by: Christian König christian.koenig@amd.com
Reported-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com
Tested-by: Pelloux-Prayer, Pierre-Eric Pierre-eric.Pelloux-prayer@amd.com
Acked-by: Alex Deucher alexander.deucher@amd.com
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230707120826.3701-1-christia...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/ttm/ttm_bo.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 1a1cfd675cc46..7139a522b2f3b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -517,6 +517,12 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 {
 	bool ret = false;
 
+	if (bo->pin_count) {
+		*locked = false;
+		*busy = false;
+		return false;
+	}
+
 	if (bo->base.resv == ctx->resv) {
 		dma_resv_assert_held(bo->base.resv);
 		if (ctx->allow_res_evict)
From: Liam R. Howlett Liam.Howlett@oracle.com
[ Upstream commit eaf9790d3bc6e157a2134c01c7d707a5a712fab1 ]
The test functions are not needed after the module is removed, so mark them as such. Add __exit to the module removal function. Some other variables have been marked as const static as well.
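The annotation pattern can be sketched in userspace as follows. As a reminder, in the kernel __init marks code that may be discarded once initialization has finished, and __exit marks code needed only at removal. The sketch defines both as empty macros, mirroring the stub in tools/testing/radix-tree/linux/init.h that this patch extends; example_seed(), example_harvest() and example_set are hypothetical names.

```c
#include <stddef.h>

/* Userspace stand-ins: empty, as in the radix-tree test harness stub. */
#define __init
#define __exit

/* Runs once at load time, so it carries __init. */
static int __init example_seed(void)
{
	/* static const keeps the table in read-only data, off the stack. */
	static const unsigned long example_set[] = {
		5015, 5014, 5017, 25, 1000,
	};
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < sizeof(example_set) / sizeof(example_set[0]); i++)
		sum += example_set[i];

	/* 0 on success, mirroring a module init return value. */
	return sum == 16071 ? 0 : -1;
}

/* Runs only at removal time, so it carries __exit. */
static void __exit example_harvest(void)
{
}
```

Because the macros expand to nothing here, the same source builds both as a kernel module and inside the userspace test harness.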
Link: https://lkml.kernel.org/r/20230518145544.1722059-20-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com
Suggested-by: Andrew Morton akpm@linux-foundation.org
Cc: David Binderman dcb314@hotmail.com
Cc: Peng Zhang zhangpeng.00@bytedance.com
Cc: Sergey Senozhatsky senozhatsky@chromium.org
Cc: Vernon Yang vernon2gm@gmail.com
Cc: Wei Yang richard.weiyang@gmail.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Stable-dep-of: 7a93c71a6714 ("maple_tree: fix 32 bit mas_next testing")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 lib/test_maple_tree.c                 | 158 +++++++++++++-------------
 tools/testing/radix-tree/linux/init.h |   1 +
 tools/testing/radix-tree/maple.c      | 147 ++++++++++++------------
 3 files changed, 155 insertions(+), 151 deletions(-)
diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c index f1db333270e9f..261bad680f81d 100644 --- a/lib/test_maple_tree.c +++ b/lib/test_maple_tree.c @@ -30,54 +30,54 @@ #else #define cond_resched() do {} while (0) #endif -static -int mtree_insert_index(struct maple_tree *mt, unsigned long index, gfp_t gfp) +static int __init mtree_insert_index(struct maple_tree *mt, + unsigned long index, gfp_t gfp) { return mtree_insert(mt, index, xa_mk_value(index & LONG_MAX), gfp); }
-static void mtree_erase_index(struct maple_tree *mt, unsigned long index) +static void __init mtree_erase_index(struct maple_tree *mt, unsigned long index) { MT_BUG_ON(mt, mtree_erase(mt, index) != xa_mk_value(index & LONG_MAX)); MT_BUG_ON(mt, mtree_load(mt, index) != NULL); }
-static int mtree_test_insert(struct maple_tree *mt, unsigned long index, +static int __init mtree_test_insert(struct maple_tree *mt, unsigned long index, void *ptr) { return mtree_insert(mt, index, ptr, GFP_KERNEL); }
-static int mtree_test_store_range(struct maple_tree *mt, unsigned long start, - unsigned long end, void *ptr) +static int __init mtree_test_store_range(struct maple_tree *mt, + unsigned long start, unsigned long end, void *ptr) { return mtree_store_range(mt, start, end, ptr, GFP_KERNEL); }
-static int mtree_test_store(struct maple_tree *mt, unsigned long start, +static int __init mtree_test_store(struct maple_tree *mt, unsigned long start, void *ptr) { return mtree_test_store_range(mt, start, start, ptr); }
-static int mtree_test_insert_range(struct maple_tree *mt, unsigned long start, - unsigned long end, void *ptr) +static int __init mtree_test_insert_range(struct maple_tree *mt, + unsigned long start, unsigned long end, void *ptr) { return mtree_insert_range(mt, start, end, ptr, GFP_KERNEL); }
-static void *mtree_test_load(struct maple_tree *mt, unsigned long index) +static void __init *mtree_test_load(struct maple_tree *mt, unsigned long index) { return mtree_load(mt, index); }
-static void *mtree_test_erase(struct maple_tree *mt, unsigned long index) +static void __init *mtree_test_erase(struct maple_tree *mt, unsigned long index) { return mtree_erase(mt, index); }
#if defined(CONFIG_64BIT) -static noinline void check_mtree_alloc_range(struct maple_tree *mt, +static noinline void __init check_mtree_alloc_range(struct maple_tree *mt, unsigned long start, unsigned long end, unsigned long size, unsigned long expected, int eret, void *ptr) { @@ -94,7 +94,7 @@ static noinline void check_mtree_alloc_range(struct maple_tree *mt, MT_BUG_ON(mt, result != expected); }
-static noinline void check_mtree_alloc_rrange(struct maple_tree *mt, +static noinline void __init check_mtree_alloc_rrange(struct maple_tree *mt, unsigned long start, unsigned long end, unsigned long size, unsigned long expected, int eret, void *ptr) { @@ -112,8 +112,8 @@ static noinline void check_mtree_alloc_rrange(struct maple_tree *mt, } #endif
-static noinline void check_load(struct maple_tree *mt, unsigned long index, - void *ptr) +static noinline void __init check_load(struct maple_tree *mt, + unsigned long index, void *ptr) { void *ret = mtree_test_load(mt, index);
@@ -122,7 +122,7 @@ static noinline void check_load(struct maple_tree *mt, unsigned long index, MT_BUG_ON(mt, ret != ptr); }
-static noinline void check_store_range(struct maple_tree *mt, +static noinline void __init check_store_range(struct maple_tree *mt, unsigned long start, unsigned long end, void *ptr, int expected) { int ret = -EINVAL; @@ -138,7 +138,7 @@ static noinline void check_store_range(struct maple_tree *mt, check_load(mt, i, ptr); }
-static noinline void check_insert_range(struct maple_tree *mt, +static noinline void __init check_insert_range(struct maple_tree *mt, unsigned long start, unsigned long end, void *ptr, int expected) { int ret = -EINVAL; @@ -154,8 +154,8 @@ static noinline void check_insert_range(struct maple_tree *mt, check_load(mt, i, ptr); }
-static noinline void check_insert(struct maple_tree *mt, unsigned long index, - void *ptr) +static noinline void __init check_insert(struct maple_tree *mt, + unsigned long index, void *ptr) { int ret = -EINVAL;
@@ -163,7 +163,7 @@ static noinline void check_insert(struct maple_tree *mt, unsigned long index, MT_BUG_ON(mt, ret != 0); }
-static noinline void check_dup_insert(struct maple_tree *mt, +static noinline void __init check_dup_insert(struct maple_tree *mt, unsigned long index, void *ptr) { int ret = -EINVAL; @@ -173,13 +173,13 @@ static noinline void check_dup_insert(struct maple_tree *mt, }
-static noinline -void check_index_load(struct maple_tree *mt, unsigned long index) +static noinline void __init check_index_load(struct maple_tree *mt, + unsigned long index) { return check_load(mt, index, xa_mk_value(index & LONG_MAX)); }
-static inline int not_empty(struct maple_node *node) +static inline __init int not_empty(struct maple_node *node) { int i;
@@ -194,8 +194,8 @@ static inline int not_empty(struct maple_node *node) }
-static noinline void check_rev_seq(struct maple_tree *mt, unsigned long max, - bool verbose) +static noinline void __init check_rev_seq(struct maple_tree *mt, + unsigned long max, bool verbose) { unsigned long i = max, j;
@@ -227,7 +227,7 @@ static noinline void check_rev_seq(struct maple_tree *mt, unsigned long max, #endif }
-static noinline void check_seq(struct maple_tree *mt, unsigned long max, +static noinline void __init check_seq(struct maple_tree *mt, unsigned long max, bool verbose) { unsigned long i, j; @@ -256,7 +256,7 @@ static noinline void check_seq(struct maple_tree *mt, unsigned long max, #endif }
-static noinline void check_lb_not_empty(struct maple_tree *mt) +static noinline void __init check_lb_not_empty(struct maple_tree *mt) { unsigned long i, j; unsigned long huge = 4000UL * 1000 * 1000; @@ -275,13 +275,13 @@ static noinline void check_lb_not_empty(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_lower_bound_split(struct maple_tree *mt) +static noinline void __init check_lower_bound_split(struct maple_tree *mt) { MT_BUG_ON(mt, !mtree_empty(mt)); check_lb_not_empty(mt); }
-static noinline void check_upper_bound_split(struct maple_tree *mt) +static noinline void __init check_upper_bound_split(struct maple_tree *mt) { unsigned long i, j; unsigned long huge; @@ -306,7 +306,7 @@ static noinline void check_upper_bound_split(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_mid_split(struct maple_tree *mt) +static noinline void __init check_mid_split(struct maple_tree *mt) { unsigned long huge = 8000UL * 1000 * 1000;
@@ -315,7 +315,7 @@ static noinline void check_mid_split(struct maple_tree *mt) check_lb_not_empty(mt); }
-static noinline void check_rev_find(struct maple_tree *mt) +static noinline void __init check_rev_find(struct maple_tree *mt) { int i, nr_entries = 200; void *val; @@ -354,7 +354,7 @@ static noinline void check_rev_find(struct maple_tree *mt) rcu_read_unlock(); }
-static noinline void check_find(struct maple_tree *mt) +static noinline void __init check_find(struct maple_tree *mt) { unsigned long val = 0; unsigned long count; @@ -571,7 +571,7 @@ static noinline void check_find(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_find_2(struct maple_tree *mt) +static noinline void __init check_find_2(struct maple_tree *mt) { unsigned long i, j; void *entry; @@ -616,7 +616,7 @@ static noinline void check_find_2(struct maple_tree *mt)
#if defined(CONFIG_64BIT) -static noinline void check_alloc_rev_range(struct maple_tree *mt) +static noinline void __init check_alloc_rev_range(struct maple_tree *mt) { /* * Generated by: @@ -624,7 +624,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt) * awk -F "-" '{printf "0x%s, 0x%s, ", $1, $2}' */
- unsigned long range[] = { + static const unsigned long range[] = { /* Inclusive , Exclusive. */ 0x565234af2000, 0x565234af4000, 0x565234af4000, 0x565234af9000, @@ -652,7 +652,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt) 0x7fff58791000, 0x7fff58793000, };
- unsigned long holes[] = { + static const unsigned long holes[] = { /* * Note: start of hole is INCLUSIVE * end of hole is EXCLUSIVE @@ -672,7 +672,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt) * 4. number that should be returned. * 5. return value */ - unsigned long req_range[] = { + static const unsigned long req_range[] = { 0x565234af9000, /* Min */ 0x7fff58791000, /* Max */ 0x1000, /* Size */ @@ -783,7 +783,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_alloc_range(struct maple_tree *mt) +static noinline void __init check_alloc_range(struct maple_tree *mt) { /* * Generated by: @@ -791,7 +791,7 @@ static noinline void check_alloc_range(struct maple_tree *mt) * awk -F "-" '{printf "0x%s, 0x%s, ", $1, $2}' */
- unsigned long range[] = { + static const unsigned long range[] = { /* Inclusive , Exclusive. */ 0x565234af2000, 0x565234af4000, 0x565234af4000, 0x565234af9000, @@ -818,7 +818,7 @@ static noinline void check_alloc_range(struct maple_tree *mt) 0x7fff5878e000, 0x7fff58791000, 0x7fff58791000, 0x7fff58793000, }; - unsigned long holes[] = { + static const unsigned long holes[] = { /* Start of hole, end of hole, size of hole (+1) */ 0x565234afb000, 0x565234afc000, 0x1000, 0x565234afe000, 0x565235def000, 0x12F1000, @@ -833,7 +833,7 @@ static noinline void check_alloc_range(struct maple_tree *mt) * 4. number that should be returned. * 5. return value */ - unsigned long req_range[] = { + static const unsigned long req_range[] = { 0x565234af9000, /* Min */ 0x7fff58791000, /* Max */ 0x1000, /* Size */ @@ -942,10 +942,10 @@ static noinline void check_alloc_range(struct maple_tree *mt) } #endif
-static noinline void check_ranges(struct maple_tree *mt) +static noinline void __init check_ranges(struct maple_tree *mt) { int i, val, val2; - unsigned long r[] = { + static const unsigned long r[] = { 10, 15, 20, 25, 17, 22, /* Overlaps previous range. */ @@ -1210,7 +1210,7 @@ static noinline void check_ranges(struct maple_tree *mt) MT_BUG_ON(mt, mt_height(mt) != 4); }
-static noinline void check_next_entry(struct maple_tree *mt) +static noinline void __init check_next_entry(struct maple_tree *mt) { void *entry = NULL; unsigned long limit = 30, i = 0; @@ -1234,7 +1234,7 @@ static noinline void check_next_entry(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_prev_entry(struct maple_tree *mt) +static noinline void __init check_prev_entry(struct maple_tree *mt) { unsigned long index = 16; void *value; @@ -1278,7 +1278,7 @@ static noinline void check_prev_entry(struct maple_tree *mt) mas_unlock(&mas); }
-static noinline void check_root_expand(struct maple_tree *mt) +static noinline void __init check_root_expand(struct maple_tree *mt) { MA_STATE(mas, mt, 0, 0); void *ptr; @@ -1367,13 +1367,13 @@ static noinline void check_root_expand(struct maple_tree *mt) mas_unlock(&mas); }
-static noinline void check_gap_combining(struct maple_tree *mt) +static noinline void __init check_gap_combining(struct maple_tree *mt) { struct maple_enode *mn1, *mn2; void *entry; unsigned long singletons = 100; - unsigned long *seq100; - unsigned long seq100_64[] = { + static const unsigned long *seq100; + static const unsigned long seq100_64[] = { /* 0-5 */ 74, 75, 76, 50, 100, 2, @@ -1387,7 +1387,7 @@ static noinline void check_gap_combining(struct maple_tree *mt) 76, 2, 79, 85, 4, };
- unsigned long seq100_32[] = { + static const unsigned long seq100_32[] = { /* 0-5 */ 61, 62, 63, 50, 100, 2, @@ -1401,11 +1401,11 @@ static noinline void check_gap_combining(struct maple_tree *mt) 76, 2, 79, 85, 4, };
- unsigned long seq2000[] = { + static const unsigned long seq2000[] = { 1152, 1151, 1100, 1200, 2, }; - unsigned long seq400[] = { + static const unsigned long seq400[] = { 286, 318, 256, 260, 266, 270, 275, 280, 290, 398, 286, 310, @@ -1564,7 +1564,7 @@ static noinline void check_gap_combining(struct maple_tree *mt) mt_set_non_kernel(0); mtree_destroy(mt); } -static noinline void check_node_overwrite(struct maple_tree *mt) +static noinline void __init check_node_overwrite(struct maple_tree *mt) { int i, max = 4000;
@@ -1577,7 +1577,7 @@ static noinline void check_node_overwrite(struct maple_tree *mt) }
#if defined(BENCH_SLOT_STORE) -static noinline void bench_slot_store(struct maple_tree *mt) +static noinline void __init bench_slot_store(struct maple_tree *mt) { int i, brk = 105, max = 1040, brk_start = 100, count = 20000000;
@@ -1593,7 +1593,7 @@ static noinline void bench_slot_store(struct maple_tree *mt) #endif
#if defined(BENCH_NODE_STORE) -static noinline void bench_node_store(struct maple_tree *mt) +static noinline void __init bench_node_store(struct maple_tree *mt) { int i, overwrite = 76, max = 240, count = 20000000;
@@ -1612,7 +1612,7 @@ static noinline void bench_node_store(struct maple_tree *mt) #endif
#if defined(BENCH_AWALK) -static noinline void bench_awalk(struct maple_tree *mt) +static noinline void __init bench_awalk(struct maple_tree *mt) { int i, max = 2500, count = 50000000; MA_STATE(mas, mt, 1470, 1470); @@ -1629,7 +1629,7 @@ static noinline void bench_awalk(struct maple_tree *mt) } #endif #if defined(BENCH_WALK) -static noinline void bench_walk(struct maple_tree *mt) +static noinline void __init bench_walk(struct maple_tree *mt) { int i, max = 2500, count = 550000000; MA_STATE(mas, mt, 1470, 1470); @@ -1646,7 +1646,7 @@ static noinline void bench_walk(struct maple_tree *mt) #endif
#if defined(BENCH_MT_FOR_EACH) -static noinline void bench_mt_for_each(struct maple_tree *mt) +static noinline void __init bench_mt_for_each(struct maple_tree *mt) { int i, count = 1000000; unsigned long max = 2500, index = 0; @@ -1670,7 +1670,7 @@ static noinline void bench_mt_for_each(struct maple_tree *mt) #endif
/* check_forking - simulate the kernel forking sequence with the tree. */ -static noinline void check_forking(struct maple_tree *mt) +static noinline void __init check_forking(struct maple_tree *mt) {
struct maple_tree newmt; @@ -1709,7 +1709,7 @@ static noinline void check_forking(struct maple_tree *mt) mtree_destroy(&newmt); }
-static noinline void check_iteration(struct maple_tree *mt) +static noinline void __init check_iteration(struct maple_tree *mt) { int i, nr_entries = 125; void *val; @@ -1777,7 +1777,7 @@ static noinline void check_iteration(struct maple_tree *mt) mt_set_non_kernel(0); }
-static noinline void check_mas_store_gfp(struct maple_tree *mt) +static noinline void __init check_mas_store_gfp(struct maple_tree *mt) {
struct maple_tree newmt; @@ -1810,7 +1810,7 @@ static noinline void check_mas_store_gfp(struct maple_tree *mt) }
#if defined(BENCH_FORK) -static noinline void bench_forking(struct maple_tree *mt) +static noinline void __init bench_forking(struct maple_tree *mt) {
struct maple_tree newmt; @@ -1852,15 +1852,17 @@ static noinline void bench_forking(struct maple_tree *mt) } #endif
-static noinline void next_prev_test(struct maple_tree *mt) +static noinline void __init next_prev_test(struct maple_tree *mt) { int i, nr_entries; void *val; MA_STATE(mas, mt, 0, 0); struct maple_enode *mn; - unsigned long *level2; - unsigned long level2_64[] = {707, 1000, 710, 715, 720, 725}; - unsigned long level2_32[] = {1747, 2000, 1750, 1755, 1760, 1765}; + static const unsigned long *level2; + static const unsigned long level2_64[] = { 707, 1000, 710, 715, 720, + 725}; + static const unsigned long level2_32[] = { 1747, 2000, 1750, 1755, + 1760, 1765};
if (MAPLE_32BIT) { nr_entries = 500; @@ -2028,7 +2030,7 @@ static noinline void next_prev_test(struct maple_tree *mt)
/* Test spanning writes that require balancing right sibling or right cousin */ -static noinline void check_spanning_relatives(struct maple_tree *mt) +static noinline void __init check_spanning_relatives(struct maple_tree *mt) {
unsigned long i, nr_entries = 1000; @@ -2041,7 +2043,7 @@ static noinline void check_spanning_relatives(struct maple_tree *mt) mtree_store_range(mt, 9365, 9955, NULL, GFP_KERNEL); }
-static noinline void check_fuzzer(struct maple_tree *mt) +static noinline void __init check_fuzzer(struct maple_tree *mt) { /* * 1. Causes a spanning rebalance of a single root node. @@ -2438,7 +2440,7 @@ static noinline void check_fuzzer(struct maple_tree *mt) }
/* duplicate the tree with a specific gap */ -static noinline void check_dup_gaps(struct maple_tree *mt, +static noinline void __init check_dup_gaps(struct maple_tree *mt, unsigned long nr_entries, bool zero_start, unsigned long gap) { @@ -2478,7 +2480,7 @@ static noinline void check_dup_gaps(struct maple_tree *mt, }
/* Duplicate many sizes of trees. Mainly to test expected entry values */ -static noinline void check_dup(struct maple_tree *mt) +static noinline void __init check_dup(struct maple_tree *mt) { int i; int big_start = 100010; @@ -2566,7 +2568,7 @@ static noinline void check_dup(struct maple_tree *mt) } }
-static noinline void check_bnode_min_spanning(struct maple_tree *mt) +static noinline void __init check_bnode_min_spanning(struct maple_tree *mt) { int i = 50; MA_STATE(mas, mt, 0, 0); @@ -2585,7 +2587,7 @@ static noinline void check_bnode_min_spanning(struct maple_tree *mt) mt_set_non_kernel(0); }
-static noinline void check_empty_area_window(struct maple_tree *mt) +static noinline void __init check_empty_area_window(struct maple_tree *mt) { unsigned long i, nr_entries = 20; MA_STATE(mas, mt, 0, 0); @@ -2670,7 +2672,7 @@ static noinline void check_empty_area_window(struct maple_tree *mt) rcu_read_unlock(); }
-static noinline void check_empty_area_fill(struct maple_tree *mt) +static noinline void __init check_empty_area_fill(struct maple_tree *mt) { const unsigned long max = 0x25D78000; unsigned long size; @@ -2714,11 +2716,11 @@ static noinline void check_empty_area_fill(struct maple_tree *mt) }
static DEFINE_MTREE(tree); -static int maple_tree_seed(void) +static int __init maple_tree_seed(void) { - unsigned long set[] = {5015, 5014, 5017, 25, 1000, - 1001, 1002, 1003, 1005, 0, - 5003, 5002}; + unsigned long set[] = { 5015, 5014, 5017, 25, 1000, + 1001, 1002, 1003, 1005, 0, + 5003, 5002}; void *ptr = &set;
pr_info("\nTEST STARTING\n\n"); @@ -2988,7 +2990,7 @@ static int maple_tree_seed(void) return -EINVAL; }
-static void maple_tree_harvest(void) +static void __exit maple_tree_harvest(void) {
} diff --git a/tools/testing/radix-tree/linux/init.h b/tools/testing/radix-tree/linux/init.h index 1bb0afc213099..81563c3dfce79 100644 --- a/tools/testing/radix-tree/linux/init.h +++ b/tools/testing/radix-tree/linux/init.h @@ -1 +1,2 @@ #define __init +#define __exit diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c index adc5392df4009..67c56e9e92606 100644 --- a/tools/testing/radix-tree/maple.c +++ b/tools/testing/radix-tree/maple.c @@ -14,6 +14,7 @@ #include "test.h" #include <stdlib.h> #include <time.h> +#include "linux/init.h"
#define module_init(x) #define module_exit(x) @@ -81,7 +82,7 @@ static void check_mas_alloc_node_count(struct ma_state *mas) * check_new_node() - Check the creation of new nodes and error path * verification. */ -static noinline void check_new_node(struct maple_tree *mt) +static noinline void __init check_new_node(struct maple_tree *mt) {
struct maple_node *mn, *mn2, *mn3; @@ -455,7 +456,7 @@ static noinline void check_new_node(struct maple_tree *mt) /* * Check erasing including RCU. */ -static noinline void check_erase(struct maple_tree *mt, unsigned long index, +static noinline void __init check_erase(struct maple_tree *mt, unsigned long index, void *ptr) { MT_BUG_ON(mt, mtree_test_erase(mt, index) != ptr); @@ -465,24 +466,24 @@ static noinline void check_erase(struct maple_tree *mt, unsigned long index, #define erase_check_insert(mt, i) check_insert(mt, set[i], entry[i%2]) #define erase_check_erase(mt, i) check_erase(mt, set[i], entry[i%2])
-static noinline void check_erase_testset(struct maple_tree *mt) +static noinline void __init check_erase_testset(struct maple_tree *mt) { - unsigned long set[] = { 5015, 5014, 5017, 25, 1000, - 1001, 1002, 1003, 1005, 0, - 6003, 6002, 6008, 6012, 6015, - 7003, 7002, 7008, 7012, 7015, - 8003, 8002, 8008, 8012, 8015, - 9003, 9002, 9008, 9012, 9015, - 10003, 10002, 10008, 10012, 10015, - 11003, 11002, 11008, 11012, 11015, - 12003, 12002, 12008, 12012, 12015, - 13003, 13002, 13008, 13012, 13015, - 14003, 14002, 14008, 14012, 14015, - 15003, 15002, 15008, 15012, 15015, - }; - - - void *ptr = &set; + static const unsigned long set[] = { 5015, 5014, 5017, 25, 1000, + 1001, 1002, 1003, 1005, 0, + 6003, 6002, 6008, 6012, 6015, + 7003, 7002, 7008, 7012, 7015, + 8003, 8002, 8008, 8012, 8015, + 9003, 9002, 9008, 9012, 9015, + 10003, 10002, 10008, 10012, 10015, + 11003, 11002, 11008, 11012, 11015, + 12003, 12002, 12008, 12012, 12015, + 13003, 13002, 13008, 13012, 13015, + 14003, 14002, 14008, 14012, 14015, + 15003, 15002, 15008, 15012, 15015, + }; + + + void *ptr = &check_erase_testset; void *entry[2] = { ptr, mt }; void *root_node;
@@ -739,7 +740,7 @@ static noinline void check_erase_testset(struct maple_tree *mt) int mas_ce2_over_count(struct ma_state *mas_start, struct ma_state *mas_end, void *s_entry, unsigned long s_min, void *e_entry, unsigned long e_max, - unsigned long *set, int i, bool null_entry) + const unsigned long *set, int i, bool null_entry) { int count = 0, span = 0; unsigned long retry = 0; @@ -969,8 +970,8 @@ static inline void *mas_range_load(struct ma_state *mas, }
#if defined(CONFIG_64BIT) -static noinline void check_erase2_testset(struct maple_tree *mt, - unsigned long *set, unsigned long size) +static noinline void __init check_erase2_testset(struct maple_tree *mt, + const unsigned long *set, unsigned long size) { int entry_count = 0; int check = 0; @@ -1114,11 +1115,11 @@ static noinline void check_erase2_testset(struct maple_tree *mt,
/* These tests were pulled from KVM tree modifications which failed. */ -static noinline void check_erase2_sets(struct maple_tree *mt) +static noinline void __init check_erase2_sets(struct maple_tree *mt) { void *entry; unsigned long start = 0; - unsigned long set[] = { + static const unsigned long set[] = { STORE, 140737488347136, 140737488351231, STORE, 140721266458624, 140737488351231, ERASE, 140721266458624, 140737488351231, @@ -1136,7 +1137,7 @@ ERASE, 140253902692352, 140253902864383, STORE, 140253902692352, 140253902696447, STORE, 140253902696448, 140253902864383, }; - unsigned long set2[] = { + static const unsigned long set2[] = { STORE, 140737488347136, 140737488351231, STORE, 140735933583360, 140737488351231, ERASE, 140735933583360, 140737488351231, @@ -1160,7 +1161,7 @@ STORE, 140277094813696, 140277094821887, STORE, 140277094821888, 140277094825983, STORE, 140735933906944, 140735933911039, }; - unsigned long set3[] = { + static const unsigned long set3[] = { STORE, 140737488347136, 140737488351231, STORE, 140735790264320, 140737488351231, ERASE, 140735790264320, 140737488351231, @@ -1203,7 +1204,7 @@ STORE, 47135835840512, 47135835885567, STORE, 47135835885568, 47135835893759, };
- unsigned long set4[] = { + static const unsigned long set4[] = { STORE, 140737488347136, 140737488351231, STORE, 140728251703296, 140737488351231, ERASE, 140728251703296, 140737488351231, @@ -1224,7 +1225,7 @@ ERASE, 47646523277312, 47646523445247, STORE, 47646523277312, 47646523400191, };
- unsigned long set5[] = { + static const unsigned long set5[] = { STORE, 140737488347136, 140737488351231, STORE, 140726874062848, 140737488351231, ERASE, 140726874062848, 140737488351231, @@ -1357,7 +1358,7 @@ STORE, 47884791619584, 47884791623679, STORE, 47884791623680, 47884791627775, };
- unsigned long set6[] = { + static const unsigned long set6[] = { STORE, 140737488347136, 140737488351231, STORE, 140722999021568, 140737488351231, ERASE, 140722999021568, 140737488351231, @@ -1489,7 +1490,7 @@ ERASE, 47430432014336, 47430432022527, STORE, 47430432014336, 47430432018431, STORE, 47430432018432, 47430432022527, }; - unsigned long set7[] = { + static const unsigned long set7[] = { STORE, 140737488347136, 140737488351231, STORE, 140729808330752, 140737488351231, ERASE, 140729808330752, 140737488351231, @@ -1621,7 +1622,7 @@ ERASE, 47439987130368, 47439987138559, STORE, 47439987130368, 47439987134463, STORE, 47439987134464, 47439987138559, }; - unsigned long set8[] = { + static const unsigned long set8[] = { STORE, 140737488347136, 140737488351231, STORE, 140722482974720, 140737488351231, ERASE, 140722482974720, 140737488351231, @@ -1754,7 +1755,7 @@ STORE, 47708488638464, 47708488642559, STORE, 47708488642560, 47708488646655, };
- unsigned long set9[] = { + static const unsigned long set9[] = { STORE, 140737488347136, 140737488351231, STORE, 140736427839488, 140737488351231, ERASE, 140736427839488, 140736427839488, @@ -5620,7 +5621,7 @@ ERASE, 47906195480576, 47906195480576, STORE, 94641242615808, 94641242750975, };
- unsigned long set10[] = { + static const unsigned long set10[] = { STORE, 140737488347136, 140737488351231, STORE, 140736427839488, 140737488351231, ERASE, 140736427839488, 140736427839488, @@ -9484,7 +9485,7 @@ STORE, 139726599680000, 139726599684095, ERASE, 47906195480576, 47906195480576, STORE, 94641242615808, 94641242750975, }; - unsigned long set11[] = { + static const unsigned long set11[] = { STORE, 140737488347136, 140737488351231, STORE, 140732658499584, 140737488351231, ERASE, 140732658499584, 140732658499584, @@ -9510,7 +9511,7 @@ STORE, 140732658565120, 140732658569215, STORE, 140732658552832, 140732658565119, };
- unsigned long set12[] = { /* contains 12 values. */ + static const unsigned long set12[] = { /* contains 12 values. */ STORE, 140737488347136, 140737488351231, STORE, 140732658499584, 140737488351231, ERASE, 140732658499584, 140732658499584, @@ -9537,7 +9538,7 @@ STORE, 140732658552832, 140732658565119, STORE, 140014592741375, 140014592741375, /* contrived */ STORE, 140014592733184, 140014592741376, /* creates first entry retry. */ }; - unsigned long set13[] = { + static const unsigned long set13[] = { STORE, 140373516247040, 140373516251135,/*: ffffa2e7b0e10d80 */ STORE, 140373516251136, 140373516255231,/*: ffffa2e7b1195d80 */ STORE, 140373516255232, 140373516443647,/*: ffffa2e7b0e109c0 */ @@ -9550,7 +9551,7 @@ STORE, 140373518684160, 140373518688254,/*: ffffa2e7b05fec00 */ STORE, 140373518688256, 140373518692351,/*: ffffa2e7bfbdcd80 */ STORE, 140373518692352, 140373518696447,/*: ffffa2e7b0749e40 */ }; - unsigned long set14[] = { + static const unsigned long set14[] = { STORE, 140737488347136, 140737488351231, STORE, 140731667996672, 140737488351231, SNULL, 140731668000767, 140737488351231, @@ -9834,7 +9835,7 @@ SNULL, 139826136543232, 139826136809471, STORE, 139826136809472, 139826136842239, STORE, 139826136543232, 139826136809471, }; - unsigned long set15[] = { + static const unsigned long set15[] = { STORE, 140737488347136, 140737488351231, STORE, 140722061451264, 140737488351231, SNULL, 140722061455359, 140737488351231, @@ -10119,7 +10120,7 @@ STORE, 139906808958976, 139906808991743, STORE, 139906808692736, 139906808958975, };
- unsigned long set16[] = { + static const unsigned long set16[] = { STORE, 94174808662016, 94174809321471, STORE, 94174811414528, 94174811426815, STORE, 94174811426816, 94174811430911, @@ -10330,7 +10331,7 @@ STORE, 139921865613312, 139921865617407, STORE, 139921865547776, 139921865564159, };
- unsigned long set17[] = { + static const unsigned long set17[] = { STORE, 94397057224704, 94397057646591, STORE, 94397057650688, 94397057691647, STORE, 94397057691648, 94397057695743, @@ -10392,7 +10393,7 @@ STORE, 140720477511680, 140720477646847, STORE, 140720478302208, 140720478314495, STORE, 140720478314496, 140720478318591, }; - unsigned long set18[] = { + static const unsigned long set18[] = { STORE, 140737488347136, 140737488351231, STORE, 140724953673728, 140737488351231, SNULL, 140724953677823, 140737488351231, @@ -10425,7 +10426,7 @@ STORE, 140222970597376, 140222970605567, ERASE, 140222970597376, 140222970605567, STORE, 140222970597376, 140222970605567, }; - unsigned long set19[] = { + static const unsigned long set19[] = { STORE, 140737488347136, 140737488351231, STORE, 140725182459904, 140737488351231, SNULL, 140725182463999, 140737488351231, @@ -10694,7 +10695,7 @@ STORE, 140656836775936, 140656836780031, STORE, 140656787476480, 140656791920639, ERASE, 140656774639616, 140656779083775, }; - unsigned long set20[] = { + static const unsigned long set20[] = { STORE, 140737488347136, 140737488351231, STORE, 140735952392192, 140737488351231, SNULL, 140735952396287, 140737488351231, @@ -10850,7 +10851,7 @@ STORE, 140590386819072, 140590386823167, STORE, 140590386823168, 140590386827263, SNULL, 140590376591359, 140590376595455, }; - unsigned long set21[] = { + static const unsigned long set21[] = { STORE, 93874710941696, 93874711363583, STORE, 93874711367680, 93874711408639, STORE, 93874711408640, 93874711412735, @@ -10920,7 +10921,7 @@ ERASE, 140708393312256, 140708393316351, ERASE, 140708393308160, 140708393312255, ERASE, 140708393291776, 140708393308159, }; - unsigned long set22[] = { + static const unsigned long set22[] = { STORE, 93951397134336, 93951397183487, STORE, 93951397183488, 93951397728255, STORE, 93951397728256, 93951397826559, @@ -11047,7 +11048,7 @@ STORE, 140551361253376, 140551361519615, ERASE, 140551361253376, 140551361519615, };
- unsigned long set23[] = { + static const unsigned long set23[] = { STORE, 94014447943680, 94014448156671, STORE, 94014450253824, 94014450257919, STORE, 94014450257920, 94014450266111, @@ -14371,7 +14372,7 @@ SNULL, 140175956627455, 140175985139711, STORE, 140175927242752, 140175956627455, STORE, 140175956627456, 140175985139711, }; - unsigned long set24[] = { + static const unsigned long set24[] = { STORE, 140737488347136, 140737488351231, STORE, 140735281639424, 140737488351231, SNULL, 140735281643519, 140737488351231, @@ -15533,7 +15534,7 @@ ERASE, 139635393024000, 139635401412607, ERASE, 139635384627200, 139635384631295, ERASE, 139635384631296, 139635393019903, }; - unsigned long set25[] = { + static const unsigned long set25[] = { STORE, 140737488347136, 140737488351231, STORE, 140737488343040, 140737488351231, STORE, 140722547441664, 140737488351231, @@ -22321,7 +22322,7 @@ STORE, 140249652703232, 140249682087935, STORE, 140249682087936, 140249710600191, };
- unsigned long set26[] = { + static const unsigned long set26[] = { STORE, 140737488347136, 140737488351231, STORE, 140729464770560, 140737488351231, SNULL, 140729464774655, 140737488351231, @@ -22345,7 +22346,7 @@ ERASE, 140109040951296, 140109040959487, STORE, 140109040955392, 140109040959487, ERASE, 140109040955392, 140109040959487, }; - unsigned long set27[] = { + static const unsigned long set27[] = { STORE, 140737488347136, 140737488351231, STORE, 140726128070656, 140737488351231, SNULL, 140726128074751, 140737488351231, @@ -22741,7 +22742,7 @@ STORE, 140415509696512, 140415535910911, ERASE, 140415537422336, 140415562588159, STORE, 140415482433536, 140415509696511, }; - unsigned long set28[] = { + static const unsigned long set28[] = { STORE, 140737488347136, 140737488351231, STORE, 140722475622400, 140737488351231, SNULL, 140722475626495, 140737488351231, @@ -22809,7 +22810,7 @@ STORE, 139918413348864, 139918413352959, ERASE, 139918413316096, 139918413344767, STORE, 93865848528896, 93865848664063, }; - unsigned long set29[] = { + static const unsigned long set29[] = { STORE, 140737488347136, 140737488351231, STORE, 140734467944448, 140737488351231, SNULL, 140734467948543, 140737488351231, @@ -23684,7 +23685,7 @@ ERASE, 140143079972864, 140143088361471, ERASE, 140143205793792, 140143205797887, ERASE, 140143205797888, 140143214186495, }; - unsigned long set30[] = { + static const unsigned long set30[] = { STORE, 140737488347136, 140737488351231, STORE, 140733436743680, 140737488351231, SNULL, 140733436747775, 140737488351231, @@ -24566,7 +24567,7 @@ ERASE, 140165225893888, 140165225897983, ERASE, 140165225897984, 140165234286591, ERASE, 140165058105344, 140165058109439, }; - unsigned long set31[] = { + static const unsigned long set31[] = { STORE, 140737488347136, 140737488351231, STORE, 140730890784768, 140737488351231, SNULL, 140730890788863, 140737488351231, @@ -25379,7 +25380,7 @@ ERASE, 140623906590720, 140623914979327, ERASE, 140622950277120, 
140622950281215, ERASE, 140622950281216, 140622958669823, }; - unsigned long set32[] = { + static const unsigned long set32[] = { STORE, 140737488347136, 140737488351231, STORE, 140731244212224, 140737488351231, SNULL, 140731244216319, 140737488351231, @@ -26175,7 +26176,7 @@ ERASE, 140400417288192, 140400425676799, ERASE, 140400283066368, 140400283070463, ERASE, 140400283070464, 140400291459071, }; - unsigned long set33[] = { + static const unsigned long set33[] = { STORE, 140737488347136, 140737488351231, STORE, 140734562918400, 140737488351231, SNULL, 140734562922495, 140737488351231, @@ -26317,7 +26318,7 @@ STORE, 140582961786880, 140583003750399, ERASE, 140582961786880, 140583003750399, };
- unsigned long set34[] = { + static const unsigned long set34[] = { STORE, 140737488347136, 140737488351231, STORE, 140731327180800, 140737488351231, SNULL, 140731327184895, 140737488351231, @@ -27198,7 +27199,7 @@ ERASE, 140012522094592, 140012530483199, ERASE, 140012033142784, 140012033146879, ERASE, 140012033146880, 140012041535487, }; - unsigned long set35[] = { + static const unsigned long set35[] = { STORE, 140737488347136, 140737488351231, STORE, 140730536939520, 140737488351231, SNULL, 140730536943615, 140737488351231, @@ -27955,7 +27956,7 @@ ERASE, 140474471936000, 140474480324607, ERASE, 140474396430336, 140474396434431, ERASE, 140474396434432, 140474404823039, }; - unsigned long set36[] = { + static const unsigned long set36[] = { STORE, 140737488347136, 140737488351231, STORE, 140723893125120, 140737488351231, SNULL, 140723893129215, 140737488351231, @@ -28816,7 +28817,7 @@ ERASE, 140121890357248, 140121898745855, ERASE, 140121269587968, 140121269592063, ERASE, 140121269592064, 140121277980671, }; - unsigned long set37[] = { + static const unsigned long set37[] = { STORE, 140737488347136, 140737488351231, STORE, 140722404016128, 140737488351231, SNULL, 140722404020223, 140737488351231, @@ -28942,7 +28943,7 @@ STORE, 139759821246464, 139759888355327, ERASE, 139759821246464, 139759888355327, ERASE, 139759888355328, 139759955464191, }; - unsigned long set38[] = { + static const unsigned long set38[] = { STORE, 140737488347136, 140737488351231, STORE, 140730666221568, 140737488351231, SNULL, 140730666225663, 140737488351231, @@ -29752,7 +29753,7 @@ ERASE, 140613504712704, 140613504716799, ERASE, 140613504716800, 140613513105407, };
- unsigned long set39[] = { + static const unsigned long set39[] = { STORE, 140737488347136, 140737488351231, STORE, 140736271417344, 140737488351231, SNULL, 140736271421439, 140737488351231, @@ -30124,7 +30125,7 @@ STORE, 140325364428800, 140325372821503, STORE, 140325356036096, 140325364428799, SNULL, 140325364432895, 140325372821503, }; - unsigned long set40[] = { + static const unsigned long set40[] = { STORE, 140737488347136, 140737488351231, STORE, 140734309167104, 140737488351231, SNULL, 140734309171199, 140737488351231, @@ -30875,7 +30876,7 @@ ERASE, 140320289300480, 140320289304575, ERASE, 140320289304576, 140320297693183, ERASE, 140320163409920, 140320163414015, }; - unsigned long set41[] = { + static const unsigned long set41[] = { STORE, 140737488347136, 140737488351231, STORE, 140728157171712, 140737488351231, SNULL, 140728157175807, 140737488351231, @@ -31185,7 +31186,7 @@ STORE, 94376135090176, 94376135094271, STORE, 94376135094272, 94376135098367, SNULL, 94376135094272, 94377208836095, }; - unsigned long set42[] = { + static const unsigned long set42[] = { STORE, 314572800, 1388314623, STORE, 1462157312, 1462169599, STORE, 1462169600, 1462185983, @@ -33862,7 +33863,7 @@ SNULL, 3798999040, 3799101439, */ };
- unsigned long set43[] = { + static const unsigned long set43[] = { STORE, 140737488347136, 140737488351231, STORE, 140734187720704, 140737488351231, SNULL, 140734187724800, 140737488351231, @@ -34996,7 +34997,7 @@ void run_check_rcu_slowread(struct maple_tree *mt, struct rcu_test_struct *vals) MT_BUG_ON(mt, !vals->seen_entry3); MT_BUG_ON(mt, !vals->seen_both); } -static noinline void check_rcu_simulated(struct maple_tree *mt) +static noinline void __init check_rcu_simulated(struct maple_tree *mt) { unsigned long i, nr_entries = 1000; unsigned long target = 4320; @@ -35157,7 +35158,7 @@ static noinline void check_rcu_simulated(struct maple_tree *mt) rcu_unregister_thread(); }
-static noinline void check_rcu_threaded(struct maple_tree *mt) +static noinline void __init check_rcu_threaded(struct maple_tree *mt) { unsigned long i, nr_entries = 1000; struct rcu_test_struct vals; @@ -35366,7 +35367,7 @@ static void check_dfs_preorder(struct maple_tree *mt) /* End of depth first search tests */
/* Preallocation testing */ -static noinline void check_prealloc(struct maple_tree *mt) +static noinline void __init check_prealloc(struct maple_tree *mt) { unsigned long i, max = 100; unsigned long allocated; @@ -35494,7 +35495,7 @@ static noinline void check_prealloc(struct maple_tree *mt) /* End of preallocation testing */
/* Spanning writes, writes that span nodes and layers of the tree */ -static noinline void check_spanning_write(struct maple_tree *mt) +static noinline void __init check_spanning_write(struct maple_tree *mt) { unsigned long i, max = 5000; MA_STATE(mas, mt, 1200, 2380); @@ -35662,7 +35663,7 @@ static noinline void check_spanning_write(struct maple_tree *mt) /* End of spanning write testing */
/* Writes to a NULL area that are adjacent to other NULLs */ -static noinline void check_null_expand(struct maple_tree *mt) +static noinline void __init check_null_expand(struct maple_tree *mt) { unsigned long i, max = 100; unsigned char data_end; @@ -35723,7 +35724,7 @@ static noinline void check_null_expand(struct maple_tree *mt) /* End of NULL area expansions */
/* Checking for no memory is best done outside the kernel */ -static noinline void check_nomem(struct maple_tree *mt) +static noinline void __init check_nomem(struct maple_tree *mt) { MA_STATE(ms, mt, 1, 1);
@@ -35758,7 +35759,7 @@ static noinline void check_nomem(struct maple_tree *mt) mtree_destroy(mt); }
-static noinline void check_locky(struct maple_tree *mt) +static noinline void __init check_locky(struct maple_tree *mt) { MA_STATE(ms, mt, 2, 2); MA_STATE(reader, mt, 2, 2);
From: Liam R. Howlett Liam.Howlett@oracle.com
[ Upstream commit 7a93c71a6714ca1a9c03d70432dac104b0cfb815 ]
The test setup of mas_next is dependent on node entry size to create a 2 level tree, but the tests did not account for this in the expected value when shifting beyond the scope of the tree.
Fix this by making the expected value depend on the number of entries per node, which in turn depends on whether the build is 32-bit or 64-bit.
Link: https://lkml.kernel.org/r/20230712173916.168805-1-Liam.Howlett@oracle.com
Fixes: 120b116208a0 ("maple_tree: reorganize testing to restore module testing")
Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com
Reported-by: Geert Uytterhoeven geert@linux-m68k.org
Closes: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg5...
Tested-by: Geert Uytterhoeven geert@linux-m68k.org
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 lib/test_maple_tree.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c
index 261bad680f81d..fad668042f3e7 100644
--- a/lib/test_maple_tree.c
+++ b/lib/test_maple_tree.c
@@ -1863,13 +1863,16 @@ static noinline void __init next_prev_test(struct maple_tree *mt)
 				   725};
 	static const unsigned long level2_32[] = { 1747, 2000, 1750, 1755,
 						   1760, 1765};
+	unsigned long last_index;
 
 	if (MAPLE_32BIT) {
 		nr_entries = 500;
 		level2 = level2_32;
+		last_index = 0x138e;
 	} else {
 		nr_entries = 200;
 		level2 = level2_64;
+		last_index = 0x7d6;
 	}
 
 	for (i = 0; i <= nr_entries; i++)
@@ -1976,7 +1979,7 @@ static noinline void __init next_prev_test(struct maple_tree *mt)
 
 	val = mas_next(&mas, ULONG_MAX);
 	MT_BUG_ON(mt, val != NULL);
-	MT_BUG_ON(mt, mas.index != ULONG_MAX);
+	MT_BUG_ON(mt, mas.index != last_index);
 	MT_BUG_ON(mt, mas.last != ULONG_MAX);
 
 	val = mas_prev(&mas, 0);
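For reviewers wondering where 0x138e and 0x7d6 come from, here is a minimal standalone sketch. It assumes the test stores entry i over the range [i * 10, i * 10 + 5] for i = 0..nr_entries; that layout is not visible in the hunks above but is consistent with both constants. The helper name expected_last_index() is hypothetical and does not exist in lib/test_maple_tree.c.

```c
/*
 * Hypothetical helper, not part of lib/test_maple_tree.c.
 *
 * Assumption: entry i occupies [i * 10, i * 10 + 5] for i = 0..nr_entries,
 * so the first index past the last stored range is nr_entries * 10 + 6.
 * Walking beyond the tree with mas_next() should then leave mas.index at
 * that value, with mas.last at ULONG_MAX.
 */
static unsigned long expected_last_index(unsigned long nr_entries)
{
	unsigned long last_stored = nr_entries * 10 + 5;

	return last_stored + 1;
}
```

With nr_entries = 500 (the 32-bit setup) this gives 5006, and with nr_entries = 200 (the 64-bit setup) it gives 2006, matching the two values hard-coded by the patch.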
From: Rodrigo Siqueira Rodrigo.Siqueira@amd.com
[ Upstream commit e3416e872f84086667df21daf166506fab97358d ]
To ensure that FAMS can be used, DC must check if there is VRR support. This commit adds the required configuration to ensure FAMS can be executed in the target system.
Reviewed-by: Alvin Lee Alvin.Lee2@amd.com
Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com
Signed-off-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Stable-dep-of: 2a9482e55968 ("drm/amd/display: Prevent vtotal from being set to 0")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/amd/display/dc/core/dc.c          | 6 ++++++
 drivers/gpu/drm/amd/display/dc/dc_stream.h        | 1 +
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c | 7 ++++++-
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 2 +-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 6eace83c9c6f5..7f6bdad57c920 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2626,6 +2626,12 @@ static enum surface_update_type check_update_surfaces_for_stream(
 
 	if (stream_update->mst_bw_update)
 		su_flags->bits.mst_bw = 1;
+
+	if (stream_update->stream && stream_update->stream->freesync_on_desktop &&
+		(stream_update->vrr_infopacket || stream_update->allow_freesync ||
+			stream_update->vrr_active_variable))
+		su_flags->bits.fams_changed = 1;
+
 	if (stream_update->crtc_timing_adjust && dc_extended_blank_supported(dc))
 		su_flags->bits.crtc_timing_adjust = 1;
 
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h
index 25284006019c3..270282fbda4ab 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
@@ -131,6 +131,7 @@ union stream_update_flags {
 		uint32_t dsc_changed : 1;
 		uint32_t mst_bw : 1;
 		uint32_t crtc_timing_adjust : 1;
+		uint32_t fams_changed : 1;
 	} bits;
 
 	uint32_t raw;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
index c95f000b63b28..34b08d90dc1da 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
@@ -301,7 +301,12 @@ static void optc3_wait_drr_doublebuffer_pending_clear(struct timing_generator *o
 
 void optc3_set_vtotal_min_max(struct timing_generator *optc, int vtotal_min, int vtotal_max)
 {
-	optc1_set_vtotal_min_max(optc, vtotal_min, vtotal_max);
+	struct dc *dc = optc->ctx->dc;
+
+	if (dc->caps.dmub_caps.mclk_sw && !dc->debug.disable_fams)
+		dc_dmub_srv_drr_update_cmd(dc, optc->inst, vtotal_min, vtotal_max);
+	else
+		optc1_set_vtotal_min_max(optc, vtotal_min, vtotal_max);
 }
 
 void optc3_tg_init(struct timing_generator *optc)
diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 598fa1de54ce3..1c55d3b01f53e 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -360,7 +360,7 @@ union dmub_fw_boot_status {
 		uint32_t optimized_init_done : 1; /**< 1 if optimized init done */
 		uint32_t restore_required : 1; /**< 1 if driver should call restore */
 		uint32_t defer_load : 1; /**< 1 if VBIOS data is deferred programmed */
-		uint32_t reserved : 1;
+		uint32_t fams_enabled : 1; /**< 1 if VBIOS data is deferred programmed */
 		uint32_t detection_required: 1; /**< if detection need to be triggered by driver */
 		uint32_t hw_power_init_done: 1; /**< 1 if hw power init is completed */
 	} bits; /**< status bits */
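As a rough illustration of why a single new bit is enough to force a full update, the sketch below mirrors the bits/raw union pattern used by union stream_update_flags. The type update_flags_sketch and the function needs_full_update() are hypothetical, the field set is cut down to three bits, and the three VRR-related checks from the dc.c hunk are collapsed into one boolean parameter. It relies on the caller's existing "raw != 0" test, which appears as context in the following patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for union stream_update_flags. */
union update_flags_sketch {
	struct {
		uint32_t mst_bw : 1;
		uint32_t crtc_timing_adjust : 1;
		uint32_t fams_changed : 1;
	} bits;
	uint32_t raw;	/* overlays every bit above */
};

/*
 * Hypothetical helper: returns true when the flags would make the caller
 * pick a full update, i.e. when any bit is set and raw is non-zero.
 */
static bool needs_full_update(bool freesync_on_desktop, bool vrr_fields_touched)
{
	union update_flags_sketch flags = { .raw = 0 };

	if (freesync_on_desktop && vrr_fields_touched)
		flags.bits.fams_changed = 1;

	return flags.raw != 0;
}
```

Because raw aliases the whole bitfield, the existing check picks up fams_changed without any change at the call site.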
From: Gabe Teeger gabe.teeger@amd.com
[ Upstream commit 469a62938a45ef382c9cb7b9fec6c6c1fcd781c0 ]
[Why] Flickering and underflow were observed when testing extended blank on dcn314.
[What] Vstartup is constrained by vblank_nom, so adjusting it to include non-adjusted vtotal in its calculation during freesync video means that Vstartup is not changed when vtotal changes. This fixed the flickering + underflow.
dc_extended_blank_supported function was removed because extended blank is only relevant when zstate is supported. The increased vtotal during freesync can be passed to dml regardless of whether extended blank is supported or not, so this function is not needed.
Updates were made recently in dml to the calculation of min_dst_y_next_start. Dml input for dcn314 will now always use the newer calculation for min_dst_y_next_start. Dml input for older dcn versions remains untouched.
The variable optimized_min_dst_y_next_start is replaced everywhere with min_dst_y_next_start, and the updated dml allows min_dst_y_next_start to increase to an optimized value during freesync video, then return to default when freesync is disengaged.
Also removed registry key for controlling extended blank feature.
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Reviewed-by: Nicholas Kazlauskas Nicholas.Kazlauskas@amd.com
Acked-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com
Signed-off-by: Gabe Teeger gabe.teeger@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Stable-dep-of: 2a9482e55968 ("drm/amd/display: Prevent vtotal from being set to 0")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/amd/display/dc/core/dc.c      | 21 -----------------
 drivers/gpu/drm/amd/display/dc/dc.h           |  2 --
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c    |  4 ++--
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 23 +++++++++----------
 .../dc/dml/dcn31/display_rq_dlg_calc_31.c     |  3 +--
 .../amd/display/dc/dml/dcn314/dcn314_fpu.c    | 14 +++++++----
 .../dc/dml/dcn314/display_rq_dlg_calc_314.c   | 16 ++++---------
 .../amd/display/dc/dml/display_mode_structs.h |  3 +--
 8 files changed, 29 insertions(+), 57 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 7f6bdad57c920..d22095a3a265a 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -2632,9 +2632,6 @@ static enum surface_update_type check_update_surfaces_for_stream( stream_update->vrr_active_variable)) su_flags->bits.fams_changed = 1;
- if (stream_update->crtc_timing_adjust && dc_extended_blank_supported(dc)) - su_flags->bits.crtc_timing_adjust = 1; - if (su_flags->raw != 0) overall_type = UPDATE_TYPE_FULL;
@@ -4900,21 +4897,3 @@ void dc_notify_vsync_int_state(struct dc *dc, struct dc_stream_state *stream, bo if (pipe->stream_res.abm && pipe->stream_res.abm->funcs->set_abm_pause) pipe->stream_res.abm->funcs->set_abm_pause(pipe->stream_res.abm, !enable, i, pipe->stream_res.tg->inst); } - -/** - * dc_extended_blank_supported - Decide whether extended blank is supported - * - * @dc: [in] Current DC state - * - * Extended blank is a freesync optimization feature to be enabled in the - * future. During the extra vblank period gained from freesync, we have the - * ability to enter z9/z10. - * - * Return: - * Indicate whether extended blank is supported (%true or %false) - */ -bool dc_extended_blank_supported(struct dc *dc) -{ - return dc->debug.extended_blank_optimization && !dc->debug.disable_z10 - && dc->caps.zstate_support && dc->caps.is_apu; -} diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index 07d86b961c798..9279990e43694 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -2125,8 +2125,6 @@ struct dc_sink_init_data { bool converter_disable_audio; };
-bool dc_extended_blank_supported(struct dc *dc); - struct dc_sink *dc_sink_create(const struct dc_sink_init_data *init_params);
/* Newer interfaces */ diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c index c38be3c6c234e..a621b6a27c1fc 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -2128,7 +2128,7 @@ void dcn20_optimize_bandwidth( dc->clk_mgr, context, true); - if (dc_extended_blank_supported(dc) && context->bw_ctx.bw.dcn.clk.zstate_support == DCN_ZSTATE_SUPPORT_ALLOW) { + if (context->bw_ctx.bw.dcn.clk.zstate_support == DCN_ZSTATE_SUPPORT_ALLOW) { for (i = 0; i < dc->res_pool->pipe_count; ++i) { struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
@@ -2136,7 +2136,7 @@ void dcn20_optimize_bandwidth( && pipe_ctx->stream->adjust.v_total_min == pipe_ctx->stream->adjust.v_total_max && pipe_ctx->stream->adjust.v_total_max > pipe_ctx->stream->timing.v_total) pipe_ctx->plane_res.hubp->funcs->program_extended_blank(pipe_ctx->plane_res.hubp, - pipe_ctx->dlg_regs.optimized_min_dst_y_next_start); + pipe_ctx->dlg_regs.min_dst_y_next_start); } } } diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c index f1c1a4b5fcac3..7661f8946aa31 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c @@ -948,10 +948,10 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc { int plane_count; int i; - unsigned int optimized_min_dst_y_next_start_us; + unsigned int min_dst_y_next_start_us;
plane_count = 0; - optimized_min_dst_y_next_start_us = 0; + min_dst_y_next_start_us = 0; for (i = 0; i < dc->res_pool->pipe_count; i++) { if (context->res_ctx.pipe_ctx[i].plane_state) plane_count++; @@ -973,19 +973,18 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc else if (context->stream_count == 1 && context->streams[0]->signal == SIGNAL_TYPE_EDP) { struct dc_link *link = context->streams[0]->sink->link; struct dc_stream_status *stream_status = &context->stream_status[0]; + struct dc_stream_state *current_stream = context->streams[0]; int minmum_z8_residency = dc->debug.minimum_z8_residency_time > 0 ? dc->debug.minimum_z8_residency_time : 1000; bool allow_z8 = context->bw_ctx.dml.vba.StutterPeriod > (double)minmum_z8_residency; bool is_pwrseq0 = link->link_index == 0; + bool isFreesyncVideo;
- if (dc_extended_blank_supported(dc)) { - for (i = 0; i < dc->res_pool->pipe_count; i++) { - if (context->res_ctx.pipe_ctx[i].stream == context->streams[0] - && context->res_ctx.pipe_ctx[i].stream->adjust.v_total_min == context->res_ctx.pipe_ctx[i].stream->adjust.v_total_max - && context->res_ctx.pipe_ctx[i].stream->adjust.v_total_min > context->res_ctx.pipe_ctx[i].stream->timing.v_total) { - optimized_min_dst_y_next_start_us = - context->res_ctx.pipe_ctx[i].dlg_regs.optimized_min_dst_y_next_start_us; - break; - } + isFreesyncVideo = current_stream->adjust.v_total_min == current_stream->adjust.v_total_max; + isFreesyncVideo = isFreesyncVideo && current_stream->timing.v_total < current_stream->adjust.v_total_min; + for (i = 0; i < dc->res_pool->pipe_count; i++) { + if (context->res_ctx.pipe_ctx[i].stream == current_stream && isFreesyncVideo) { + min_dst_y_next_start_us = context->res_ctx.pipe_ctx[i].dlg_regs.min_dst_y_next_start_us; + break; } }
@@ -993,7 +992,7 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc if (stream_status->plane_count > 1) return DCN_ZSTATE_SUPPORT_DISALLOW;
- if (is_pwrseq0 && (context->bw_ctx.dml.vba.StutterPeriod > 5000.0 || optimized_min_dst_y_next_start_us > 5000)) + if (is_pwrseq0 && (context->bw_ctx.dml.vba.StutterPeriod > 5000.0 || min_dst_y_next_start_us > 5000)) return DCN_ZSTATE_SUPPORT_ALLOW; else if (is_pwrseq0 && link->psr_settings.psr_version == DC_PSR_VERSION_1 && !link->panel_config.psr.disable_psr) return allow_z8 ? DCN_ZSTATE_SUPPORT_ALLOW_Z8_Z10_ONLY : DCN_ZSTATE_SUPPORT_ALLOW_Z10_ONLY; diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c index 2244e4fb8c96d..fcde8f21b8be0 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c @@ -987,8 +987,7 @@ static void dml_rq_dlg_get_dlg_params(
dlg_vblank_start = interlaced ? (vblank_start / 2) : vblank_start; disp_dlg_regs->min_dst_y_next_start = (unsigned int) (((double) dlg_vblank_start) * dml_pow(2, 2)); - disp_dlg_regs->optimized_min_dst_y_next_start_us = 0; - disp_dlg_regs->optimized_min_dst_y_next_start = disp_dlg_regs->min_dst_y_next_start; + disp_dlg_regs->min_dst_y_next_start_us = 0; ASSERT(disp_dlg_regs->min_dst_y_next_start < (unsigned int)dml_pow(2, 18));
dml_print("DML_DLG: %s: min_ttu_vblank (us) = %3.2f\n", __func__, min_ttu_vblank); diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c index 9e54e3d0eb780..1d00eb9e73c62 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c @@ -286,6 +286,7 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c struct resource_context *res_ctx = &context->res_ctx; struct pipe_ctx *pipe; bool upscaled = false; + bool isFreesyncVideo = false;
dc_assert_fp_enabled();
@@ -299,9 +300,16 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c pipe = &res_ctx->pipe_ctx[i]; timing = &pipe->stream->timing;
- if (dc_extended_blank_supported(dc) && pipe->stream->adjust.v_total_max == pipe->stream->adjust.v_total_min - && pipe->stream->adjust.v_total_min > timing->v_total) + isFreesyncVideo = pipe->stream->adjust.v_total_max == pipe->stream->adjust.v_total_min; + isFreesyncVideo = isFreesyncVideo && pipe->stream->adjust.v_total_min > timing->v_total; + + if (!isFreesyncVideo) { + pipes[pipe_cnt].pipe.dest.vblank_nom = + dcn3_14_ip.VBlankNomDefaultUS / (timing->h_total / (timing->pix_clk_100hz / 10000.0)); + } else { pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min; + pipes[pipe_cnt].pipe.dest.vblank_nom = timing->v_total - pipes[pipe_cnt].pipe.dest.vactive; + }
if (pipe->plane_state && (pipe->plane_state->src_rect.height < pipe->plane_state->dst_rect.height || @@ -323,8 +331,6 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0; pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0; pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch; - pipes[pipe_cnt].pipe.dest.vblank_nom = - dcn3_14_ip.VBlankNomDefaultUS / (timing->h_total / (timing->pix_clk_100hz / 10000.0)); pipes[pipe_cnt].pipe.src.dcc_rate = 3; pipes[pipe_cnt].dout.dsc_input_bpc = 0;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c index ea4eb66066c42..4f945458b2b7e 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c @@ -1051,7 +1051,6 @@ static void dml_rq_dlg_get_dlg_params(
float vba__refcyc_per_req_delivery_pre_l = get_refcyc_per_req_delivery_pre_l_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz; // From VBA float vba__refcyc_per_req_delivery_l = get_refcyc_per_req_delivery_l_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz; // From VBA - int blank_lines = 0;
memset(disp_dlg_regs, 0, sizeof(*disp_dlg_regs)); memset(disp_ttu_regs, 0, sizeof(*disp_ttu_regs)); @@ -1075,17 +1074,10 @@ static void dml_rq_dlg_get_dlg_params( min_ttu_vblank = get_min_ttu_vblank_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx); // From VBA
dlg_vblank_start = interlaced ? (vblank_start / 2) : vblank_start; - disp_dlg_regs->optimized_min_dst_y_next_start = disp_dlg_regs->min_dst_y_next_start; - disp_dlg_regs->optimized_min_dst_y_next_start_us = 0; - disp_dlg_regs->min_dst_y_next_start = (unsigned int) (((double) dlg_vblank_start) * dml_pow(2, 2)); - blank_lines = (dst->vblank_end + dst->vtotal_min - dst->vblank_start - dst->vstartup_start - 1); - if (blank_lines < 0) - blank_lines = 0; - if (blank_lines != 0) { - disp_dlg_regs->optimized_min_dst_y_next_start = vba__min_dst_y_next_start; - disp_dlg_regs->optimized_min_dst_y_next_start_us = (disp_dlg_regs->optimized_min_dst_y_next_start * dst->hactive) / (unsigned int) dst->pixel_rate_mhz; - disp_dlg_regs->min_dst_y_next_start = disp_dlg_regs->optimized_min_dst_y_next_start; - } + disp_dlg_regs->min_dst_y_next_start_us = + (vba__min_dst_y_next_start * dst->hactive) / (unsigned int) dst->pixel_rate_mhz; + disp_dlg_regs->min_dst_y_next_start = vba__min_dst_y_next_start * dml_pow(2, 2); + ASSERT(disp_dlg_regs->min_dst_y_next_start < (unsigned int)dml_pow(2, 18));
dml_print("DML_DLG: %s: min_ttu_vblank (us) = %3.2f\n", __func__, min_ttu_vblank); diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h index 3c077164f3620..ff0246a9458fd 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h +++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h @@ -619,8 +619,7 @@ struct _vcs_dpi_display_dlg_regs_st { unsigned int refcyc_h_blank_end; unsigned int dlg_vblank_end; unsigned int min_dst_y_next_start; - unsigned int optimized_min_dst_y_next_start; - unsigned int optimized_min_dst_y_next_start_us; + unsigned int min_dst_y_next_start_us; unsigned int refcyc_per_htotal; unsigned int refcyc_x_after_scaler; unsigned int dst_y_after_scaler;
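The "freesync video" condition that this patch computes in two places can be read as a single predicate. The sketch below is only an illustration: is_freesync_video() is a hypothetical helper that does not exist in the driver, and it takes plain integers instead of the stream's adjust and timing structures.

```c
#include <stdbool.h>

/*
 * Hypothetical helper restating the isFreesyncVideo test from the patch:
 * the adjusted vtotal is fixed (min == max) and is larger than the
 * nominal vtotal of the timing.
 */
static bool is_freesync_video(unsigned int v_total_min,
			      unsigned int v_total_max,
			      unsigned int v_total)
{
	return v_total_min == v_total_max && v_total_min > v_total;
}
```

A stream with no adjustment (v_total_min and v_total_max both 0) fails the second half of the test, so it is treated as a normal stream.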
From: Daniel Miess daniel.miess@amd.com
[ Upstream commit 1a4bcdbea4319efeb26cc4b05be859a7867e02dc ]
[Why] Underflow observed when using a display with a large vblank region and low refresh rate
[How] Simplify calculation of vblank_nom
Increase value for VBlankNomDefaultUS to 800us
Reviewed-by: Jun Lei jun.lei@amd.com
Acked-by: Aurabindo Pillai aurabindo.pillai@amd.com
Signed-off-by: Daniel Miess daniel.miess@amd.com
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Stable-dep-of: 2a9482e55968 ("drm/amd/display: Prevent vtotal from being set to 0")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../amd/display/dc/dml/dcn314/dcn314_fpu.c    | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
index 1d00eb9e73c62..554152371eb53 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
@@ -33,7 +33,7 @@
 #include "dml/display_mode_vba.h"
 
 struct _vcs_dpi_ip_params_st dcn3_14_ip = {
-	.VBlankNomDefaultUS = 668,
+	.VBlankNomDefaultUS = 800,
 	.gpuvm_enable = 1,
 	.gpuvm_max_page_table_levels = 1,
 	.hostvm_enable = 1,
@@ -286,7 +286,7 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
 	struct resource_context *res_ctx = &context->res_ctx;
 	struct pipe_ctx *pipe;
 	bool upscaled = false;
-	bool isFreesyncVideo = false;
+	const unsigned int max_allowed_vblank_nom = 1023;
 
 	dc_assert_fp_enabled();
 
@@ -300,16 +300,11 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
 		pipe = &res_ctx->pipe_ctx[i];
 		timing = &pipe->stream->timing;
 
-		isFreesyncVideo = pipe->stream->adjust.v_total_max == pipe->stream->adjust.v_total_min;
-		isFreesyncVideo = isFreesyncVideo && pipe->stream->adjust.v_total_min > timing->v_total;
-
-		if (!isFreesyncVideo) {
-			pipes[pipe_cnt].pipe.dest.vblank_nom =
-				dcn3_14_ip.VBlankNomDefaultUS / (timing->h_total / (timing->pix_clk_100hz / 10000.0));
-		} else {
-			pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min;
-			pipes[pipe_cnt].pipe.dest.vblank_nom = timing->v_total - pipes[pipe_cnt].pipe.dest.vactive;
-		}
+		pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min;
+		pipes[pipe_cnt].pipe.dest.vblank_nom = timing->v_total - pipes[pipe_cnt].pipe.dest.vactive;
+		pipes[pipe_cnt].pipe.dest.vblank_nom = min(pipes[pipe_cnt].pipe.dest.vblank_nom, dcn3_14_ip.VBlankNomDefaultUS);
+		pipes[pipe_cnt].pipe.dest.vblank_nom = max(pipes[pipe_cnt].pipe.dest.vblank_nom, timing->v_sync_width);
+		pipes[pipe_cnt].pipe.dest.vblank_nom = min(pipes[pipe_cnt].pipe.dest.vblank_nom, max_allowed_vblank_nom);
 
 		if (pipe->plane_state &&
 				(pipe->plane_state->src_rect.height < pipe->plane_state->dst_rect.height ||
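The simplified vblank_nom calculation is a three-step clamp applied in a fixed order. The following sketch restates it outside the driver so the order of the min/max steps is easier to follow. clamp_vblank_nom() and the two small helpers are hypothetical names; the parameter default_limit stands in for dcn3_14_ip.VBlankNomDefaultUS, which the patch uses directly as the first upper bound, and the sketch assumes v_total >= vactive.

```c
/* Hypothetical helpers standing in for the kernel's min()/max() macros. */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int max_uint(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical restatement of the clamp sequence added by the patch.
 * Assumes v_total >= vactive so the unsigned subtraction cannot wrap.
 */
static unsigned int clamp_vblank_nom(unsigned int v_total,
				     unsigned int vactive,
				     unsigned int default_limit,
				     unsigned int v_sync_width)
{
	const unsigned int max_allowed_vblank_nom = 1023;
	unsigned int vblank_nom = v_total - vactive;

	/* 1. Never exceed the default limit. */
	vblank_nom = min_uint(vblank_nom, default_limit);
	/* 2. Never drop below the vertical sync width. */
	vblank_nom = max_uint(vblank_nom, v_sync_width);
	/* 3. Never exceed the hard cap of 1023. */
	vblank_nom = min_uint(vblank_nom, max_allowed_vblank_nom);

	return vblank_nom;
}
```

Since the hard cap is applied last, it wins over the sync-width floor if the two ever conflict.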
From: Daniel Miess daniel.miess@amd.com
[ Upstream commit 2a9482e55968ed7368afaa9c2133404069117320 ]
[Why] In dcn314 DML the destination pipe vtotal was being set to the crtc adjustment vtotal_min value even in cases where that value is 0.
[How] Only set vtotal to the crtc adjustment vtotal_min value in cases where the value is non-zero.
Cc: Mario Limonciello mario.limonciello@amd.com
Cc: Alex Deucher alexander.deucher@amd.com
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
Acked-by: Alan Liu haoping.liu@amd.com
Signed-off-by: Daniel Miess daniel.miess@amd.com
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
index 554152371eb53..b878effa2129b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
@@ -300,7 +300,11 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
 		pipe = &res_ctx->pipe_ctx[i];
 		timing = &pipe->stream->timing;
 
-		pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min;
+		if (pipe->stream->adjust.v_total_min != 0)
+			pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min;
+		else
+			pipes[pipe_cnt].pipe.dest.vtotal = timing->v_total;
+
 		pipes[pipe_cnt].pipe.dest.vblank_nom = timing->v_total - pipes[pipe_cnt].pipe.dest.vactive;
 		pipes[pipe_cnt].pipe.dest.vblank_nom = min(pipes[pipe_cnt].pipe.dest.vblank_nom, dcn3_14_ip.VBlankNomDefaultUS);
 		pipes[pipe_cnt].pipe.dest.vblank_nom = max(pipes[pipe_cnt].pipe.dest.vblank_nom, timing->v_sync_width);
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit be22255360f80d3af789daad00025171a65424a5 ]
Since t_checkpoint_io_list is no longer used in jbd2_log_do_checkpoint(), it's time to remove the whole t_checkpoint_io_list logic.
Signed-off-by: Zhang Yi yi.zhang@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230606135928.434610-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/checkpoint.c | 42 ++----------------------------------------
 fs/jbd2/commit.c     |  3 +--
 include/linux/jbd2.h |  6 ------
 3 files changed, 3 insertions(+), 48 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index c4e0da6db7195..723b4eb112828 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -27,7 +27,7 @@ * * Called with j_list_lock held. */ -static inline void __buffer_unlink_first(struct journal_head *jh) +static inline void __buffer_unlink(struct journal_head *jh) { transaction_t *transaction = jh->b_cp_transaction;
@@ -40,23 +40,6 @@ static inline void __buffer_unlink_first(struct journal_head *jh) } }
-/* - * Unlink a buffer from a transaction checkpoint(io) list. - * - * Called with j_list_lock held. - */ -static inline void __buffer_unlink(struct journal_head *jh) -{ - transaction_t *transaction = jh->b_cp_transaction; - - __buffer_unlink_first(jh); - if (transaction->t_checkpoint_io_list == jh) { - transaction->t_checkpoint_io_list = jh->b_cpnext; - if (transaction->t_checkpoint_io_list == jh) - transaction->t_checkpoint_io_list = NULL; - } -} - /* * Check a checkpoint buffer could be release or not. * @@ -505,15 +488,6 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, break; if (need_resched() || spin_needbreak(&journal->j_list_lock)) break; - if (released) - continue; - - nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_io_list, - nr_to_scan, &released); - if (*nr_to_scan == 0) - break; - if (need_resched() || spin_needbreak(&journal->j_list_lock)) - break; } while (transaction != last_transaction);
if (transaction != last_transaction) { @@ -568,17 +542,6 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) */ if (need_resched()) return; - if (ret) - continue; - /* - * It is essential that we are as careful as in the case of - * t_checkpoint_list with removing the buffer from the list as - * we can possibly see not yet submitted buffers on io_list - */ - ret = journal_clean_one_cp_list(transaction-> - t_checkpoint_io_list, destroy); - if (need_resched()) - return; /* * Stop scanning if we couldn't free the transaction. This * avoids pointless scanning of transactions which still @@ -663,7 +626,7 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) jbd2_journal_put_journal_head(jh);
/* Is this transaction empty? */ - if (transaction->t_checkpoint_list || transaction->t_checkpoint_io_list) + if (transaction->t_checkpoint_list) return 0;
/* @@ -755,7 +718,6 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact J_ASSERT(transaction->t_forget == NULL); J_ASSERT(transaction->t_shadow_list == NULL); J_ASSERT(transaction->t_checkpoint_list == NULL); - J_ASSERT(transaction->t_checkpoint_io_list == NULL); J_ASSERT(atomic_read(&transaction->t_updates) == 0); J_ASSERT(journal->j_committing_transaction != transaction); J_ASSERT(journal->j_running_transaction != transaction); diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index b33155dd70017..1073259902a60 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -1141,8 +1141,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) spin_lock(&journal->j_list_lock); commit_transaction->t_state = T_FINISHED; /* Check if the transaction can be dropped now that we are finished */ - if (commit_transaction->t_checkpoint_list == NULL && - commit_transaction->t_checkpoint_io_list == NULL) { + if (commit_transaction->t_checkpoint_list == NULL) { __jbd2_journal_drop_transaction(journal, commit_transaction); jbd2_journal_free_transaction(commit_transaction); } diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index f619bae1dcc5d..91a2cf4bc5756 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -622,12 +622,6 @@ struct transaction_s */ struct journal_head *t_checkpoint_list;
- /* - * Doubly-linked circular list of all buffers submitted for IO while - * checkpointing. [j_list_lock] - */ - struct journal_head *t_checkpoint_io_list; - /* * Doubly-linked circular list of metadata buffers being * shadowed by log IO. The IO buffers on the iobuf list and
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit b98dba273a0e47dbfade89c9af73c5b012a4eabb ]
journal_clean_one_cp_list() and journal_shrink_one_cp_list() are almost the same, so merge them into journal_shrink_one_cp_list(), remove the nr_to_scan parameter, and always scan and try to free the whole checkpoint list.
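The shape of the merged scan can be sketched in userspace as follows. This is only a sketch under stated assumptions: struct toy_buf, struct toy_txn, toy_add_tail(), toy_remove() and toy_shrink_one_list() are hypothetical names, the busy flag stands in for the real busy check, and the need_resched() bail-out is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a checkpoint buffer on a circular list. */
struct toy_buf {
    struct toy_buf *next, *prev;
    bool busy;                  /* stands in for the real busy check */
};

struct toy_txn { struct toy_buf *list; };

/* Append a buffer to the tail of the circular list. */
static void toy_add_tail(struct toy_txn *t, struct toy_buf *b)
{
    if (!t->list) {
        b->next = b->prev = b;
        t->list = b;
    } else {
        b->next = t->list;
        b->prev = t->list->prev;
        b->prev->next = b;
        t->list->prev = b;
    }
}

/* Unlink one buffer; returns true when the list became empty. */
static bool toy_remove(struct toy_txn *t, struct toy_buf *b)
{
    if (b->next == b) {
        t->list = NULL;
    } else {
        b->prev->next = b->next;
        b->next->prev = b->prev;
        if (t->list == b)
            t->list = b->next;
    }
    b->next = b->prev = NULL;
    return t->list == NULL;
}

/*
 * One scan serving both callers: with destroy == false busy buffers are
 * skipped, with destroy == true everything is removed. *released tells
 * the caller whether the whole list (the "transaction") went away.
 */
static unsigned long toy_shrink_one_list(struct toy_txn *t, bool destroy,
                                         bool *released)
{
    struct toy_buf *b, *next, *last;
    unsigned long nr_freed = 0;

    assert(released != NULL);
    *released = false;
    if (!t->list)
        return 0;

    last = t->list->prev;
    next = t->list;
    do {
        b = next;
        next = b->next;         /* saved before b may be unlinked */

        if (!destroy && b->busy)
            continue;

        nr_freed++;
        if (toy_remove(t, b)) {
            *released = true;
            break;
        }
    } while (b != last);

    return nr_freed;
}
```

The point of the sketch is that a single walk with a destroy flag and a released out-parameter covers what the two separate helpers used to do.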
Signed-off-by: Zhang Yi yi.zhang@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230606135928.434610-4-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/checkpoint.c        | 75 +++++++++----------------------------
 include/trace/events/jbd2.h | 12 ++----
 2 files changed, 21 insertions(+), 66 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 723b4eb112828..42b34cab64fbd 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -349,50 +349,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
/* Checkpoint list management */
-/* - * journal_clean_one_cp_list - * - * Find all the written-back checkpoint buffers in the given list and - * release them. If 'destroy' is set, clean all buffers unconditionally. - * - * Called with j_list_lock held. - * Returns 1 if we freed the transaction, 0 otherwise. - */ -static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) -{ - struct journal_head *last_jh; - struct journal_head *next_jh = jh; - - if (!jh) - return 0; - - last_jh = jh->b_cpprev; - do { - jh = next_jh; - next_jh = jh->b_cpnext; - - if (!destroy && __cp_buffer_busy(jh)) - return 0; - - if (__jbd2_journal_remove_checkpoint(jh)) - return 1; - /* - * This function only frees up some memory - * if possible so we dont have an obligation - * to finish processing. Bail out if preemption - * requested: - */ - if (need_resched()) - return 0; - } while (jh != last_jh); - - return 0; -} - /* * journal_shrink_one_cp_list * - * Find 'nr_to_scan' written-back checkpoint buffers in the given list + * Find all the written-back checkpoint buffers in the given list * and try to release them. If the whole transaction is released, set * the 'released' parameter. Return the number of released checkpointed * buffers. @@ -400,15 +360,15 @@ static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy) * Called with j_list_lock held. */ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, - unsigned long *nr_to_scan, - bool *released) + bool destroy, bool *released) { struct journal_head *last_jh; struct journal_head *next_jh = jh; unsigned long nr_freed = 0; int ret;
- if (!jh || *nr_to_scan == 0) + *released = false; + if (!jh) return 0;
last_jh = jh->b_cpprev; @@ -416,8 +376,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext;
- (*nr_to_scan)--; - if (__cp_buffer_busy(jh)) + if (!destroy && __cp_buffer_busy(jh)) continue;
nr_freed++; @@ -429,7 +388,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
if (need_resched()) break; - } while (jh != last_jh && *nr_to_scan); + } while (jh != last_jh);
return nr_freed; } @@ -447,11 +406,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan) { transaction_t *transaction, *last_transaction, *next_transaction; - bool released; + bool __maybe_unused released; tid_t first_tid = 0, last_tid = 0, next_tid = 0; tid_t tid = 0; unsigned long nr_freed = 0; - unsigned long nr_scanned = *nr_to_scan; + unsigned long freed;
again: spin_lock(&journal->j_list_lock); @@ -480,10 +439,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, transaction = next_transaction; next_transaction = transaction->t_cpnext; tid = transaction->t_tid; - released = false;
- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_list, - nr_to_scan, &released); + freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list, + false, &released); + nr_freed += freed; + (*nr_to_scan) -= min(*nr_to_scan, freed); if (*nr_to_scan == 0) break; if (need_resched() || spin_needbreak(&journal->j_list_lock)) @@ -504,9 +464,8 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, if (*nr_to_scan && next_tid) goto again; out: - nr_scanned -= *nr_to_scan; trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid, - nr_freed, nr_scanned, next_tid); + nr_freed, next_tid);
return nr_freed; } @@ -522,7 +481,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) { transaction_t *transaction, *last_transaction, *next_transaction; - int ret; + bool released;
transaction = journal->j_checkpoint_transactions; if (!transaction) @@ -533,8 +492,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) do { transaction = next_transaction; next_transaction = transaction->t_cpnext; - ret = journal_clean_one_cp_list(transaction->t_checkpoint_list, - destroy); + journal_shrink_one_cp_list(transaction->t_checkpoint_list, + destroy, &released); /* * This function only frees up some memory if possible so we * dont have an obligation to finish processing. Bail out if @@ -547,7 +506,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) * avoids pointless scanning of transactions which still * weren't checkpointed. */ - if (!ret) + if (!released) return; } while (transaction != last_transaction); } diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h index 8f5ee380d3093..5646ae15a957a 100644 --- a/include/trace/events/jbd2.h +++ b/include/trace/events/jbd2.h @@ -462,11 +462,9 @@ TRACE_EVENT(jbd2_shrink_scan_exit, TRACE_EVENT(jbd2_shrink_checkpoint_list,
TP_PROTO(journal_t *journal, tid_t first_tid, tid_t tid, tid_t last_tid, - unsigned long nr_freed, unsigned long nr_scanned, - tid_t next_tid), + unsigned long nr_freed, tid_t next_tid),
- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, - nr_scanned, next_tid), + TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, next_tid),
TP_STRUCT__entry( __field(dev_t, dev) @@ -474,7 +472,6 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __field(tid_t, tid) __field(tid_t, last_tid) __field(unsigned long, nr_freed) - __field(unsigned long, nr_scanned) __field(tid_t, next_tid) ),
@@ -484,15 +481,14 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list, __entry->tid = tid; __entry->last_tid = last_tid; __entry->nr_freed = nr_freed; - __entry->nr_scanned = nr_scanned; __entry->next_tid = next_tid; ),
TP_printk("dev %d,%d shrink transaction %u-%u(%u) freed %lu " - "scanned %lu next transaction %u", + "next transaction %u", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->first_tid, __entry->tid, __entry->last_tid, - __entry->nr_freed, __entry->nr_scanned, __entry->next_tid) + __entry->nr_freed, __entry->next_tid) );
#endif /* _TRACE_JBD2_H */
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 46f881b5b1758dc4a35fba4a643c10717d0cf427 ]
Before removing a checkpoint buffer from the t_checkpoint_list, we have to check both the BH_Dirty and BH_Lock bits together to distinguish buffers that have not been written back from those that are being written back. But __cp_buffer_busy() checks them separately: it first checks the lock state and then checks dirty, and the window between these two checks can be raced by the writeback procedure, which locks the buffer and clears buffer dirty before I/O completes. So it cannot guarantee that checkpointed buffers have been written back to disk if some error happens later. In the end, it may clean checkpoint transactions and lead to an inconsistent filesystem.
jbd2_journal_forget() and __journal_try_to_free_buffer() have the same problem (journal_unmap_buffer() escapes this issue since it runs under the buffer lock), so fix them by introducing a new helper that tries to take the buffer lock and removes the buffer only if it is really clean.
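The ordering the new helper relies on can be sketched in userspace like this. It is a single-threaded illustration under stated assumptions: struct toy_bh and the toy_* functions are hypothetical stand-ins for the buffer head and its lock operations, and the sketch returns 0 on success instead of reporting whether the transaction was freed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a buffer head: two state bits plus list state. */
struct toy_bh {
    bool locked;        /* stands in for BH_Lock */
    bool dirty;         /* stands in for BH_Dirty */
    bool on_cp_list;    /* still on the checkpoint list? */
};

static bool toy_trylock(struct toy_bh *bh)
{
    if (bh->locked)
        return false;
    bh->locked = true;
    return true;
}

static void toy_unlock(struct toy_bh *bh)
{
    bh->locked = false;
}

/*
 * Look at the dirty bit only while holding the buffer lock. A buffer that
 * writeback has locked (and whose dirty bit is already cleared) is
 * therefore reported busy instead of being mistaken for a clean one.
 */
static int toy_try_remove_checkpoint(struct toy_bh *bh)
{
    assert(bh != NULL);

    if (!toy_trylock(bh))
        return -EBUSY;          /* I/O may still be in flight */
    if (bh->dirty) {
        toy_unlock(bh);
        return -EBUSY;          /* not written back yet */
    }
    toy_unlock(bh);

    bh->on_cp_list = false;     /* clean and idle: safe to drop */
    return 0;
}
```

The interesting state is locked with dirty already cleared: checking the two bits separately can miss it, while checking dirty under the lock cannot.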
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Cc: stable@vger.kernel.org
Suggested-by: Jan Kara jack@suse.cz
Signed-off-by: Zhang Yi yi.zhang@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230606135928.434610-6-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/checkpoint.c  | 38 +++++++++++++++++++++++++++++++++++---
 fs/jbd2/transaction.c | 17 +++++------------
 include/linux/jbd2.h  |  1 +
 3 files changed, 41 insertions(+), 15 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 42b34cab64fbd..9ec91017a7f3c 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -376,11 +376,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext;
- if (!destroy && __cp_buffer_busy(jh)) - continue; + if (destroy) { + ret = __jbd2_journal_remove_checkpoint(jh); + } else { + ret = jbd2_journal_try_remove_checkpoint(jh); + if (ret < 0) + continue; + }
nr_freed++; - ret = __jbd2_journal_remove_checkpoint(jh); if (ret) { *released = true; break; @@ -616,6 +620,34 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh) return 1; }
+/* + * Check the checkpoint buffer and try to remove it from the checkpoint + * list if it's clean. Returns -EBUSY if it is not clean, returns 1 if + * it frees the transaction, 0 otherwise. + * + * This function is called with j_list_lock held. + */ +int jbd2_journal_try_remove_checkpoint(struct journal_head *jh) +{ + struct buffer_head *bh = jh2bh(jh); + + if (!trylock_buffer(bh)) + return -EBUSY; + if (buffer_dirty(bh)) { + unlock_buffer(bh); + return -EBUSY; + } + unlock_buffer(bh); + + /* + * Buffer is clean and the IO has finished (we held the buffer + * lock) so the checkpoint is done. We can safely remove the + * buffer from this transaction. + */ + JBUFFER_TRACE(jh, "remove from checkpoint list"); + return __jbd2_journal_remove_checkpoint(jh); +} + /* * journal_insert_checkpoint: put a committed buffer onto a checkpoint * list so that we know when it is safe to clean the transaction out of diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 18611241f4513..6ef5022949c46 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1784,8 +1784,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh) * Otherwise, if the buffer has been written to disk, * it is safe to remove the checkpoint and drop it. */ - if (!buffer_dirty(bh)) { - __jbd2_journal_remove_checkpoint(jh); + if (jbd2_journal_try_remove_checkpoint(jh) >= 0) { spin_unlock(&journal->j_list_lock); goto drop; } @@ -2112,20 +2111,14 @@ __journal_try_to_free_buffer(journal_t *journal, struct buffer_head *bh)
jh = bh2jh(bh);
- if (buffer_locked(bh) || buffer_dirty(bh)) - goto out; - if (jh->b_next_transaction != NULL || jh->b_transaction != NULL) - goto out; + return;
spin_lock(&journal->j_list_lock); - if (jh->b_cp_transaction != NULL) { - /* written-back checkpointed metadata buffer */ - JBUFFER_TRACE(jh, "remove from checkpoint list"); - __jbd2_journal_remove_checkpoint(jh); - } + /* Remove written-back checkpointed metadata buffer */ + if (jh->b_cp_transaction != NULL) + jbd2_journal_try_remove_checkpoint(jh); spin_unlock(&journal->j_list_lock); -out: return; }
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 91a2cf4bc5756..c212da35a052c 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1443,6 +1443,7 @@ extern void jbd2_journal_commit_transaction(journal_t *); void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy); unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan); int __jbd2_journal_remove_checkpoint(struct journal_head *); +int jbd2_journal_try_remove_checkpoint(struct journal_head *jh); void jbd2_journal_destroy_checkpoint(journal_t *journal); void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
From: Kemeng Shi shikemeng@huaweicloud.com
[ Upstream commit 1eff590489a213a213c57d96b86f48b32cdf8c3a ]
ext4_mb_use_preallocated() ignores the demand to allocate goal blocks even when EXT4_MB_HINT_GOAL_ONLY is requested. For group pa, ext4_mb_group_or_file() will not set EXT4_MB_HINT_GROUP_ALLOC if EXT4_MB_HINT_GOAL_ONLY is set, so we will not allocate goal blocks from a group pa if EXT4_MB_HINT_GOAL_ONLY is set. For inode pa, ext4_mb_pa_goal_check() is added to check whether the free extent in the found inode pa meets the goal blocks when EXT4_MB_HINT_GOAL_ONLY is set.
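The arithmetic behind the inode pa check can be sketched as below. This is a simplified illustration, not the ext4 code: struct toy_pa, struct toy_goal and toy_pa_meets_goal() are hypothetical names, one block per cluster is assumed so no block/cluster conversion is needed, and the caller is assumed to have already established that the goal's logical start lies inside the pa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a preallocation and of a goal extent. */
struct toy_pa {
    uint64_t pstart;    /* first physical block of the pa */
    uint32_t lstart;    /* first logical block of the pa */
    uint32_t len;       /* length in blocks (1 block == 1 cluster assumed) */
};

struct toy_goal {
    uint64_t pblock;    /* physical block the caller insists on */
    uint32_t logical;   /* logical block of the request */
    uint32_t len;       /* requested length in blocks */
};

/* Does this pa satisfy a goal-only request? */
static bool toy_pa_meets_goal(const struct toy_pa *pa,
                              const struct toy_goal *g, bool goal_only)
{
    uint32_t off;

    assert(pa != NULL && g != NULL);
    if (!goal_only)
        return true;            /* without the hint any usable pa will do */

    /* Assumed precondition: the goal's logical start lies inside the pa. */
    assert(g->logical >= pa->lstart && g->logical - pa->lstart < pa->len);
    off = g->logical - pa->lstart;

    /* The pa must map the goal's logical block to the goal's physical block. */
    if (pa->pstart + off != g->pblock)
        return false;

    /* The pa must reach far enough to cover the whole goal length. */
    if (g->len > pa->len - off)
        return false;

    return true;
}
```

For a pa starting at physical block 1000 and logical block 100 with length 16, a goal at logical 104 must sit at physical 1004 and be at most 12 blocks long to pass.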
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com
Suggested-by: Ojaswin Mujoo ojaswin@linux.ibm.com
Link: https://lore.kernel.org/r/20230603150327.3596033-6-shikemeng@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 9d3de7ee192a ("ext4: fix rbtree traversal bug in ext4_mb_use_preallocated")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/ext4/mballoc.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index fd4d12c58c3b4..1f4d00a4308dc 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4528,6 +4528,37 @@ ext4_mb_check_group_pa(ext4_fsblk_t goal_block, return pa; }
+/* + * check if found pa meets EXT4_MB_HINT_GOAL_ONLY + */ +static bool +ext4_mb_pa_goal_check(struct ext4_allocation_context *ac, + struct ext4_prealloc_space *pa) +{ + struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb); + ext4_fsblk_t start; + + if (likely(!(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))) + return true; + + /* + * If EXT4_MB_HINT_GOAL_ONLY is set, ac_g_ex will not be adjusted + * in ext4_mb_normalize_request and will keep same with ac_o_ex + * from ext4_mb_initialize_context. Choose ac_g_ex here to keep + * consistent with ext4_mb_find_by_goal. + */ + start = pa->pa_pstart + + (ac->ac_g_ex.fe_logical - pa->pa_lstart); + if (ext4_grp_offs_to_block(ac->ac_sb, &ac->ac_g_ex) != start) + return false; + + if (ac->ac_g_ex.fe_len > pa->pa_len - + EXT4_B2C(sbi, ac->ac_g_ex.fe_logical - pa->pa_lstart)) + return false; + + return true; +} + /* * search goal blocks in preallocated space */ @@ -4578,7 +4609,8 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
/* found preallocated blocks, use them */ spin_lock(&tmp_pa->pa_lock); - if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free) { + if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free && + likely(ext4_mb_pa_goal_check(ac, tmp_pa))) { atomic_inc(&tmp_pa->pa_count); ext4_mb_use_inode_pa(ac, tmp_pa); spin_unlock(&tmp_pa->pa_lock);
From: Ritesh Harjani ritesh.list@gmail.com
[ Upstream commit 569f196f1e7a14472f21734170411f75a3179db0 ]
There will be changes coming in future patches which will introduce a new criterion for block allocation. This removes the useless setting of ac_criteria. AFAIU, this might only be used to differentiate between whether preallocated blocks were allocated or the regular allocator was called for allocating blocks. Hence this also adds debug prints to identify what type of block allocation was done in ext4_mb_show_ac().
Signed-off-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Signed-off-by: Ojaswin Mujoo ojaswin@linux.ibm.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/1dbae05617519cb6202f1b299c9d1be3e7cda763.168544970...
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 9d3de7ee192a ("ext4: fix rbtree traversal bug in ext4_mb_use_preallocated")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/ext4/mballoc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 1f4d00a4308dc..d49d1a7af22db 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4614,7 +4614,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) atomic_inc(&tmp_pa->pa_count); ext4_mb_use_inode_pa(ac, tmp_pa); spin_unlock(&tmp_pa->pa_lock); - ac->ac_criteria = 10; read_unlock(&ei->i_prealloc_lock); return true; } @@ -4657,7 +4656,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) } if (cpa) { ext4_mb_use_group_pa(ac, cpa); - ac->ac_criteria = 20; return true; } return false; @@ -5431,6 +5429,10 @@ static void ext4_mb_show_ac(struct ext4_allocation_context *ac) (unsigned long)ac->ac_b_ex.fe_logical, (int)ac->ac_criteria); mb_debug(sb, "%u found", ac->ac_found); + mb_debug(sb, "used pa: %s, ", ac->ac_pa ? "yes" : "no"); + if (ac->ac_pa) + mb_debug(sb, "pa_type %s\n", ac->ac_pa->pa_type == MB_GROUP_PA ? + "group pa" : "inode pa"); ext4_mb_show_pa(sb); } #else
From: Ojaswin Mujoo ojaswin@linux.ibm.com
[ Upstream commit 9d3de7ee192a6a253f475197fe4d2e2af10a731f ]
During allocations, while looking for preallocations (PA) in the per-inode rbtree, we can't do a direct traversal of the tree because ext4_mb_discard_group_preallocation() can concurrently mark the pa deleted, and that can cause direct traversal to skip some entries. This was leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy our request and ultimately tried to create a new PA that would overlap with the missed one.
To make sure we handle that case while still keeping the performance of the rbtree, we make use of the fact that the only pa that could possibly overlap the original goal start is the one that satisfies the below conditions:
1. It must have its logical start immediately to the left of (i.e. less than) the original logical start.
2. It must not be deleted
To find this pa we use the following traversal method:
1. Descend into the rbtree normally to find the immediate neighboring PA. Here we keep descending irrespective of whether the PA is deleted or whether it overlaps with our request etc. The goal is to find an immediately adjacent PA.
2. If the found PA is to the right of the original goal, use rb_prev() to find the left adjacent PA.
3. Check if this PA is deleted and keep moving left with rb_prev() until a non-deleted PA is found.
4. This is the PA we are looking for. Now we can check if it can satisfy the original request and proceed accordingly.
This approach also takes care of having deleted PAs in the tree.
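The four steps above can be sketched over a sorted array, which is swapped in here for the rbtree purely to keep the example self-contained. struct toy_pa and toy_find_covering_pa() are hypothetical names, locking is left out entirely, and the array is assumed to be sorted by lstart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of an inode pa, kept in an array sorted by lstart. */
struct toy_pa {
    uint32_t lstart;    /* logical start */
    uint32_t len;       /* length in blocks */
    bool deleted;       /* marked deleted by a concurrent discard */
};

/*
 * Return the index of the pa that covers 'goal', or -1 if none does.
 * A binary search over the sorted array stands in for the rbtree descent.
 */
static int toy_find_covering_pa(const struct toy_pa *pas, int n, uint32_t goal)
{
    int lo = 0, hi = n - 1, i = -1;

    assert(pas != NULL || n == 0);

    /*
     * Steps 1 and 2: find the closest entry whose lstart is <= goal,
     * ignoring for now whether it is deleted or actually overlaps.
     */
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;

        if (pas[mid].lstart <= goal) {
            i = mid;
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }

    /* Step 3: keep moving left past deleted entries. */
    while (i >= 0 && pas[i].deleted)
        i--;
    if (i < 0)
        return -1;

    /* Step 4: only this pa can overlap the goal; check that it does. */
    if ((uint64_t)goal >= (uint64_t)pas[i].lstart + pas[i].len)
        return -1;

    return i;
}
```

Stopping at a deleted left neighbour instead of walking further left is how a usable pa could be missed, which is the situation described in the commit message.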
(While we are at it, also fix a possible overflow bug in calculating the end of a PA)
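The overflow mentioned in parentheses can be illustrated with plain integer arithmetic. In this sketch uint32_t stands in for the 32-bit logical block type and int64_t for the wider type used after the fix; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* End of a pa computed in 32 bits: wraps around when lstart + len >= 2^32. */
static uint32_t toy_pa_end_narrow(uint32_t lstart, uint32_t len)
{
    return lstart + len;
}

/* End of a pa computed in 64 bits: widen before adding, so no wrap. */
static int64_t toy_pa_end_wide(uint32_t lstart, uint32_t len)
{
    return (int64_t)lstart + len;
}

/* "Does the goal fall inside the pa?" using the wrapped end value. */
static bool toy_goal_in_pa_narrow(uint32_t lstart, uint32_t len, uint32_t goal)
{
    assert(lstart <= goal);     /* the left-adjacent pa was already chosen */
    return goal < toy_pa_end_narrow(lstart, len);
}

/* The same question using the widened end value. */
static bool toy_goal_in_pa_wide(uint32_t lstart, uint32_t len, uint32_t goal)
{
    assert(lstart <= goal);
    return (int64_t)goal < toy_pa_end_wide(lstart, len);
}
```

With lstart = 0xFFFFFFF0 and len = 0x20 the 32-bit end wraps to 0x10, so a goal of 0xFFFFFFF8 looks like it lies past the end of the pa even though it is inside it.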
[1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tK...
Cc: stable@kernel.org # 6.4
Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
Signed-off-by: Ojaswin Mujoo ojaswin@linux.ibm.com
Reported-by: Naresh Kamboju naresh.kamboju@linaro.org
Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Tested-by: Ritesh Harjani (IBM) ritesh.list@gmail.com
Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.169004596...
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/ext4/mballoc.c | 158 ++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 131 insertions(+), 27 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index d49d1a7af22db..3fa5de892d89d 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4569,8 +4569,8 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) int order, i; struct ext4_inode_info *ei = EXT4_I(ac->ac_inode); struct ext4_locality_group *lg; - struct ext4_prealloc_space *tmp_pa, *cpa = NULL; - ext4_lblk_t tmp_pa_start, tmp_pa_end; + struct ext4_prealloc_space *tmp_pa = NULL, *cpa = NULL; + loff_t tmp_pa_end; struct rb_node *iter; ext4_fsblk_t goal_block;
@@ -4578,47 +4578,151 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac) if (!(ac->ac_flags & EXT4_MB_HINT_DATA)) return false;
- /* first, try per-file preallocation */ + /* + * first, try per-file preallocation by searching the inode pa rbtree. + * + * Here, we can't do a direct traversal of the tree because + * ext4_mb_discard_group_preallocation() can paralelly mark the pa + * deleted and that can cause direct traversal to skip some entries. + */ read_lock(&ei->i_prealloc_lock); + + if (RB_EMPTY_ROOT(&ei->i_prealloc_node)) { + goto try_group_pa; + } + + /* + * Step 1: Find a pa with logical start immediately adjacent to the + * original logical start. This could be on the left or right. + * + * (tmp_pa->pa_lstart never changes so we can skip locking for it). + */ for (iter = ei->i_prealloc_node.rb_node; iter; iter = ext4_mb_pa_rb_next_iter(ac->ac_o_ex.fe_logical, - tmp_pa_start, iter)) { + tmp_pa->pa_lstart, iter)) { tmp_pa = rb_entry(iter, struct ext4_prealloc_space, pa_node.inode_node); + }
- /* all fields in this condition don't change, - * so we can skip locking for them */ - tmp_pa_start = tmp_pa->pa_lstart; - tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); - - /* original request start doesn't lie in this PA */ - if (ac->ac_o_ex.fe_logical < tmp_pa_start || - ac->ac_o_ex.fe_logical >= tmp_pa_end) - continue; + /* + * Step 2: The adjacent pa might be to the right of logical start, find + * the left adjacent pa. After this step we'd have a valid tmp_pa whose + * logical start is towards the left of original request's logical start + */ + if (tmp_pa->pa_lstart > ac->ac_o_ex.fe_logical) { + struct rb_node *tmp; + tmp = rb_prev(&tmp_pa->pa_node.inode_node);
- /* non-extent files can't have physical blocks past 2^32 */ - if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) && - (tmp_pa->pa_pstart + EXT4_C2B(sbi, tmp_pa->pa_len) > - EXT4_MAX_BLOCK_FILE_PHYS)) { + if (tmp) { + tmp_pa = rb_entry(tmp, struct ext4_prealloc_space, + pa_node.inode_node); + } else { /* - * Since PAs don't overlap, we won't find any - * other PA to satisfy this. + * If there is no adjacent pa to the left then finding + * an overlapping pa is not possible hence stop searching + * inode pa tree */ - break; + goto try_group_pa; } + }
- /* found preallocated blocks, use them */ + BUG_ON(!(tmp_pa && tmp_pa->pa_lstart <= ac->ac_o_ex.fe_logical)); + + /* + * Step 3: If the left adjacent pa is deleted, keep moving left to find + * the first non deleted adjacent pa. After this step we should have a + * valid tmp_pa which is guaranteed to be non deleted. + */ + for (iter = &tmp_pa->pa_node.inode_node;; iter = rb_prev(iter)) { + if (!iter) { + /* + * no non deleted left adjacent pa, so stop searching + * inode pa tree + */ + goto try_group_pa; + } + tmp_pa = rb_entry(iter, struct ext4_prealloc_space, + pa_node.inode_node); spin_lock(&tmp_pa->pa_lock); - if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free && - likely(ext4_mb_pa_goal_check(ac, tmp_pa))) { - atomic_inc(&tmp_pa->pa_count); - ext4_mb_use_inode_pa(ac, tmp_pa); + if (tmp_pa->pa_deleted == 0) { + /* + * We will keep holding the pa_lock from + * this point on because we don't want group discard + * to delete this pa underneath us. Since group + * discard is anyways an ENOSPC operation it + * should be okay for it to wait a few more cycles. + */ + break; + } else { spin_unlock(&tmp_pa->pa_lock); - read_unlock(&ei->i_prealloc_lock); - return true; } + } + + BUG_ON(!(tmp_pa && tmp_pa->pa_lstart <= ac->ac_o_ex.fe_logical)); + BUG_ON(tmp_pa->pa_deleted == 1); + + /* + * Step 4: We now have the non deleted left adjacent pa. Only this + * pa can possibly satisfy the request hence check if it overlaps + * original logical start and stop searching if it doesn't. + */ + tmp_pa_end = (loff_t)tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + + if (ac->ac_o_ex.fe_logical >= tmp_pa_end) { + spin_unlock(&tmp_pa->pa_lock); + goto try_group_pa; + } + + /* non-extent files can't have physical blocks past 2^32 */ + if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) && + (tmp_pa->pa_pstart + EXT4_C2B(sbi, tmp_pa->pa_len) > + EXT4_MAX_BLOCK_FILE_PHYS)) { + /* + * Since PAs don't overlap, we won't find any other PA to + * satisfy this. 
+ */ spin_unlock(&tmp_pa->pa_lock); + goto try_group_pa; + } + + if (tmp_pa->pa_free && likely(ext4_mb_pa_goal_check(ac, tmp_pa))) { + atomic_inc(&tmp_pa->pa_count); + ext4_mb_use_inode_pa(ac, tmp_pa); + spin_unlock(&tmp_pa->pa_lock); + read_unlock(&ei->i_prealloc_lock); + return true; + } else { + /* + * We found a valid overlapping pa but couldn't use it because + * it had no free blocks. This should ideally never happen + * because: + * + * 1. When a new inode pa is added to rbtree it must have + * pa_free > 0 since otherwise we won't actually need + * preallocation. + * + * 2. An inode pa that is in the rbtree can only have it's + * pa_free become zero when another thread calls: + * ext4_mb_new_blocks + * ext4_mb_use_preallocated + * ext4_mb_use_inode_pa + * + * 3. Further, after the above calls make pa_free == 0, we will + * immediately remove it from the rbtree in: + * ext4_mb_new_blocks + * ext4_mb_release_context + * ext4_mb_put_pa + * + * 4. Since the pa_free becoming 0 and pa_free getting removed + * from tree both happen in ext4_mb_new_blocks, which is always + * called with i_data_sem held for data allocations, we can be + * sure that another process will never see a pa in rbtree with + * pa_free == 0. + */ + WARN_ON_ONCE(tmp_pa->pa_free == 0); } + spin_unlock(&tmp_pa->pa_lock); +try_group_pa: read_unlock(&ei->i_prealloc_lock);
/* can we use group allocation? */
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 5782017cc4d0c8f3425d55b893675bb8a20c33e9 ]
Negative -EINVAL was intended instead of positive EINVAL.
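Why the sign matters can be shown with a small userspace sketch. toy_report_error() is a hypothetical stand-in for a helper that reports an error and hands the same code back to its caller; the two probe variants and toy_probe_failed() are likewise illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: report the error and return the code unchanged. */
static int toy_report_error(int err)
{
    assert(err != 0);
    return err;
}

/* Before the fix: the positive errno value is passed through. */
static int toy_probe_buggy(const void *regs)
{
    if (!regs)
        return toy_report_error(EINVAL);
    return 0;
}

/* After the fix: the conventional negative errno value is returned. */
static int toy_probe_fixed(const void *regs)
{
    if (!regs)
        return toy_report_error(-EINVAL);
    return 0;
}

/* The usual caller-side convention: a negative return means failure. */
static int toy_probe_failed(int ret)
{
    return ret < 0;
}
```

A caller following the usual "negative means failure" convention would not recognise the positive value as an error, while the negated one is handled as expected.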
Fixes: 6a23afad443a ("phy: phy-mtk-dp: Add driver for DP phy")
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Reviewed-by: Chen-Yu Tsai wenst@chromium.org
Link: https://lore.kernel.org/r/3c699e00-2883-40d9-92c3-0da1dc38fdd4@moroto.mounta...
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/phy/mediatek/phy-mtk-dp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/mediatek/phy-mtk-dp.c b/drivers/phy/mediatek/phy-mtk-dp.c
index 232fd3f1ff1b1..d7024a1443358 100644
--- a/drivers/phy/mediatek/phy-mtk-dp.c
+++ b/drivers/phy/mediatek/phy-mtk-dp.c
@@ -169,7 +169,7 @@ static int mtk_dp_phy_probe(struct platform_device *pdev)

 	regs = *(struct regmap **)dev->platform_data;
 	if (!regs)
-		return dev_err_probe(dev, EINVAL,
+		return dev_err_probe(dev, -EINVAL,
 				     "No data passed, requires struct regmap**\n");

 	dp_phy = devm_kzalloc(dev, sizeof(*dp_phy), GFP_KERNEL);
From: Guillaume Ranquet granquet@baylibre.com
[ Upstream commit 95bd315f0a5ed7d7afe771776272c5b3cdb29bc8 ]
The PLL prediv calculation searches for the smallest prediv that gets ns_hdmipll_ck into the range of 5 GHz to 12 GHz.
A typo in the upper bound test was testing for 5 GHz to 1 GHz.
Fixes: 45810d486bb44 ("phy: mediatek: add support for phy-mtk-hdmi-mt8195")
Signed-off-by: Guillaume Ranquet granquet@baylibre.com
Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com
Link: https://lore.kernel.org/r/20230529-hdmi_phy_fix-v1-1-bf65f53af533@baylibre.c...
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c b/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
index 8aa7251de4a96..bbfe11d6a69d7 100644
--- a/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
+++ b/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
@@ -253,7 +253,7 @@ static int mtk_hdmi_pll_calc(struct mtk_hdmi_phy *hdmi_phy, struct clk_hw *hw,
 	for (i = 0; i < ARRAY_SIZE(txpredivs); i++) {
 		ns_hdmipll_ck = 5 * tmds_clk * txposdiv * txpredivs[i];
 		if (ns_hdmipll_ck >= 5 * GIGA &&
-		    ns_hdmipll_ck <= 1 * GIGA)
+		    ns_hdmipll_ck <= 12 * GIGA)
 			break;
 	}
 	if (i == (ARRAY_SIZE(txpredivs) - 1) &&
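
For illustration only, not part of the patch: a small user-space sketch of the range test with the upper bound as a parameter. The ns_hdmipll_ck formula is taken from the diff above. The prediv table, the TMDS clock value and the helper name pick_prediv() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GIGA 1000000000ULL

/*
 * Returns the index of the first prediv that puts the clock in
 * [5 GHz, upper_ghz GHz], or -1 if no entry qualifies.
 */
static int pick_prediv(uint64_t tmds_clk, uint64_t txposdiv,
		       const uint64_t *predivs, size_t n, uint64_t upper_ghz)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t ck = 5 * tmds_clk * txposdiv * predivs[i];

		if (ck >= 5 * GIGA && ck <= upper_ghz * GIGA)
			return (int)i;
	}
	return -1;
}

void demo_prediv_range(void)
{
	/* Illustrative values only, not taken from the driver. */
	const uint64_t predivs[] = { 1, 2, 4, 16 };
	const uint64_t tmds_clk = 594000000ULL;

	/* Upper bound of 1 GHz: the range is empty, nothing matches. */
	assert(pick_prediv(tmds_clk, 1, predivs, 4, 1) == -1);
	/* Upper bound of 12 GHz: prediv 2 gives 5.94 GHz, index 1. */
	assert(pick_prediv(tmds_clk, 1, predivs, 4, 12) == 1);
}
```

With the 1 GHz bound the loop can never break early, whatever the table contains.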
From: Adrien Thierry athierry@redhat.com
[ Upstream commit 45d89a344eb46db9dce851c28e14f5e3c635c251 ]
In the dwc3 core, both system and runtime suspend end up calling dwc3_suspend_common(). From there, what happens for the PHYs depends on the USB mode and whether the controller is entering system or runtime suspend.
HOST mode:
(1) system suspend on a non-wakeup-capable controller
The [1] if branch is taken. dwc3_core_exit() is called, which ends up calling phy_power_off() and phy_exit(). Those two functions decrease the PM runtime count at some point, so they will trigger the PHY runtime sleep (assuming the count is right).
(2) runtime suspend / system suspend on a wakeup-capable controller
The [1] branch is not taken. dwc3_suspend_common() calls phy_pm_runtime_put_sync(). Assuming the ref count is right, the PHY runtime suspend op is called.
DEVICE mode: dwc3_core_exit() is called on both runtime and system sleep unless the controller is already runtime suspended.
OTG mode:
(1) system suspend: dwc3_core_exit() is called
(2) runtime suspend: do nothing
In host mode, the code seems to make a distinction between 1) runtime sleep / system sleep for wakeup-capable controller, and 2) system sleep for non-wakeup-capable controller, where phy_power_off() and phy_exit() are only called for the latter. This suggests the PHY is not supposed to be in a fully powered-off state for runtime sleep and system sleep for wakeup-capable controller.
Moreover, downstream, cfg_ahb_clk only gets disabled for system suspend. The clocks are disabled by phy->set_suspend() [2] which is only called in the system sleep path through dwc3_core_exit() [3].
With that in mind, don't disable the clocks during the femto PHY runtime suspend callback. The clocks will only be disabled during system suspend for non-wakeup-capable controllers, through dwc3_core_exit().
[1] https://elixir.bootlin.com/linux/v6.4/source/drivers/usb/dwc3/core.c#L1988
[2] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-05300...
[3] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-05300...
Signed-off-by: Adrien Thierry athierry@redhat.com
Link: https://lore.kernel.org/r/20230629144542.14906-2-athierry@redhat.com
Signed-off-by: Vinod Koul vkoul@kernel.org
Stable-dep-of: 8a0eb8f9b9a0 ("phy: qcom-snps-femto-v2: properly enable ref clock")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 9 ---------
 1 file changed, 9 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index 6c237f3cc66db..3335480fc395a 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -165,22 +165,13 @@ static int qcom_snps_hsphy_suspend(struct qcom_snps_hsphy *hsphy) 0, USB2_AUTO_RESUME); }
- clk_disable_unprepare(hsphy->cfg_ahb_clk); return 0; }
static int qcom_snps_hsphy_resume(struct qcom_snps_hsphy *hsphy) { - int ret; - dev_dbg(&hsphy->phy->dev, "Resume QCOM SNPS PHY, mode\n");
- ret = clk_prepare_enable(hsphy->cfg_ahb_clk); - if (ret) { - dev_err(&hsphy->phy->dev, "failed to enable cfg ahb clock\n"); - return ret; - } - return 0; }
From: Adrien Thierry athierry@redhat.com
[ Upstream commit 8a0eb8f9b9a002291a3934acfd913660b905249e ]
The driver is not enabling the ref clock, which thus gets disabled by the clk_disable_unused() initcall. This leads to the dwc3 controller failing to initialize if probed after clk_disable_unused() is called, for instance when the driver is built as a module.
To fix this, switch to the clk_bulk API to handle both cfg_ahb and ref clocks at the proper places.
Note that the cfg_ahb clock is currently not used by any device tree instantiation of the PHY. Work needs to be done separately to fix this.
Link: https://lore.kernel.org/linux-arm-msm/ZEqvy+khHeTkC2hf@fedora/
Fixes: 51e8114f80d0 ("phy: qcom-snps: Add SNPS USB PHY driver for QCOM based SOCs")
Signed-off-by: Adrien Thierry athierry@redhat.com
Link: https://lore.kernel.org/r/20230629144542.14906-3-athierry@redhat.com
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 63 ++++++++++++++-----
 1 file changed, 48 insertions(+), 15 deletions(-)
diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c index 3335480fc395a..6170f8fd118e2 100644 --- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c +++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c @@ -110,11 +110,13 @@ struct phy_override_seq { /** * struct qcom_snps_hsphy - snps hs phy attributes * + * @dev: device structure + * * @phy: generic phy * @base: iomapped memory space for snps hs phy * - * @cfg_ahb_clk: AHB2PHY interface clock - * @ref_clk: phy reference clock + * @num_clks: number of clocks + * @clks: array of clocks * @phy_reset: phy reset control * @vregs: regulator supplies bulk data * @phy_initialized: if PHY has been initialized correctly @@ -122,11 +124,13 @@ struct phy_override_seq { * @update_seq_cfg: tuning parameters for phy init */ struct qcom_snps_hsphy { + struct device *dev; + struct phy *phy; void __iomem *base;
- struct clk *cfg_ahb_clk; - struct clk *ref_clk; + int num_clks; + struct clk_bulk_data *clks; struct reset_control *phy_reset; struct regulator_bulk_data vregs[SNPS_HS_NUM_VREGS];
@@ -135,6 +139,34 @@ struct qcom_snps_hsphy { struct phy_override_seq update_seq_cfg[NUM_HSPHY_TUNING_PARAMS]; };
+static int qcom_snps_hsphy_clk_init(struct qcom_snps_hsphy *hsphy) +{ + struct device *dev = hsphy->dev; + + hsphy->num_clks = 2; + hsphy->clks = devm_kcalloc(dev, hsphy->num_clks, sizeof(*hsphy->clks), GFP_KERNEL); + if (!hsphy->clks) + return -ENOMEM; + + /* + * TODO: Currently no device tree instantiation of the PHY is using the clock. + * This needs to be fixed in order for this code to be able to use devm_clk_bulk_get(). + */ + hsphy->clks[0].id = "cfg_ahb"; + hsphy->clks[0].clk = devm_clk_get_optional(dev, "cfg_ahb"); + if (IS_ERR(hsphy->clks[0].clk)) + return dev_err_probe(dev, PTR_ERR(hsphy->clks[0].clk), + "failed to get cfg_ahb clk\n"); + + hsphy->clks[1].id = "ref"; + hsphy->clks[1].clk = devm_clk_get(dev, "ref"); + if (IS_ERR(hsphy->clks[1].clk)) + return dev_err_probe(dev, PTR_ERR(hsphy->clks[1].clk), + "failed to get ref clk\n"); + + return 0; +} + static inline void qcom_snps_hsphy_write_mask(void __iomem *base, u32 offset, u32 mask, u32 val) { @@ -365,16 +397,16 @@ static int qcom_snps_hsphy_init(struct phy *phy) if (ret) return ret;
- ret = clk_prepare_enable(hsphy->cfg_ahb_clk); + ret = clk_bulk_prepare_enable(hsphy->num_clks, hsphy->clks); if (ret) { - dev_err(&phy->dev, "failed to enable cfg ahb clock, %d\n", ret); + dev_err(&phy->dev, "failed to enable clocks, %d\n", ret); goto poweroff_phy; }
ret = reset_control_assert(hsphy->phy_reset); if (ret) { dev_err(&phy->dev, "failed to assert phy_reset, %d\n", ret); - goto disable_ahb_clk; + goto disable_clks; }
usleep_range(100, 150); @@ -382,7 +414,7 @@ static int qcom_snps_hsphy_init(struct phy *phy) ret = reset_control_deassert(hsphy->phy_reset); if (ret) { dev_err(&phy->dev, "failed to de-assert phy_reset, %d\n", ret); - goto disable_ahb_clk; + goto disable_clks; }
qcom_snps_hsphy_write_mask(hsphy->base, USB2_PHY_USB_PHY_CFG0, @@ -439,8 +471,8 @@ static int qcom_snps_hsphy_init(struct phy *phy)
return 0;
-disable_ahb_clk: - clk_disable_unprepare(hsphy->cfg_ahb_clk); +disable_clks: + clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks); poweroff_phy: regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs);
@@ -452,7 +484,7 @@ static int qcom_snps_hsphy_exit(struct phy *phy) struct qcom_snps_hsphy *hsphy = phy_get_drvdata(phy);
reset_control_assert(hsphy->phy_reset); - clk_disable_unprepare(hsphy->cfg_ahb_clk); + clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks); regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs); hsphy->phy_initialized = false;
@@ -545,14 +577,15 @@ static int qcom_snps_hsphy_probe(struct platform_device *pdev) if (!hsphy) return -ENOMEM;
+ hsphy->dev = dev; + hsphy->base = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(hsphy->base)) return PTR_ERR(hsphy->base);
- hsphy->ref_clk = devm_clk_get(dev, "ref"); - if (IS_ERR(hsphy->ref_clk)) - return dev_err_probe(dev, PTR_ERR(hsphy->ref_clk), - "failed to get ref clk\n"); + ret = qcom_snps_hsphy_clk_init(hsphy); + if (ret) + return dev_err_probe(dev, ret, "failed to initialize clocks\n");
hsphy->phy_reset = devm_reset_control_get_exclusive(&pdev->dev, NULL); if (IS_ERR(hsphy->phy_reset)) {
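
For illustration only, not part of the patch: a user-space model of the "all clocks or none" behaviour that the switch to a clock array relies on. It assumes that a bulk enable unwinds already enabled clocks on failure and that a NULL (absent optional) clock is a no-op. All model_* names are hypothetical stand-ins, not the kernel clk API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk: tracks an enable count. */
struct model_clk {
	int enable_count;
	int fail_on_enable;	/* simulates a clock that cannot be enabled */
};

/* Hypothetical stand-in for struct clk_bulk_data. */
struct model_clk_bulk {
	const char *id;
	struct model_clk *clk;	/* NULL models an absent optional clock */
};

static int model_clk_enable(struct model_clk *clk)
{
	if (!clk)
		return 0;	/* absent optional clock: nothing to do */
	if (clk->fail_on_enable)
		return -1;
	clk->enable_count++;
	return 0;
}

static void model_clk_disable(struct model_clk *clk)
{
	if (clk)
		clk->enable_count--;
}

/* Enables every clock, or none: unwinds in reverse order on failure. */
int model_bulk_enable(int num, struct model_clk_bulk *clks)
{
	int i, ret;

	for (i = 0; i < num; i++) {
		ret = model_clk_enable(clks[i].clk);
		if (ret) {
			while (--i >= 0)
				model_clk_disable(clks[i].clk);
			return ret;
		}
	}
	return 0;
}

/* Disables every clock in reverse order. */
void model_bulk_disable(int num, struct model_clk_bulk *clks)
{
	while (--num >= 0)
		model_clk_disable(clks[num].clk);
}

void demo_bulk_clocks(void)
{
	struct model_clk ref = { 0, 0 };
	struct model_clk_bulk clks[2] = {
		{ "cfg_ahb", NULL },	/* optional clock not present */
		{ "ref", &ref },
	};

	assert(model_bulk_enable(2, clks) == 0);
	assert(ref.enable_count == 1);
	model_bulk_disable(2, clks);
	assert(ref.enable_count == 0);
}
```

In this model the init and exit paths stay balanced whether or not the optional clock is present.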
From: Srinivas Kandagatla srinivas.kandagatla@linaro.org
[ Upstream commit f84d41b2a083b990cbdf70f3b24b6b108b9678ad ]
SoundWire device status can be incorrectly updated when the value is stored without a proper mask. Fix this by masking the status before updating it.
Fixes: c7d49c76d1d5 ("soundwire: qcom: add support to new interrupts")
Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org
Link: https://lore.kernel.org/r/20230525133812.30841-2-srinivas.kandagatla@linaro....
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/soundwire/qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
index e3ef5ebae6b7c..027979c66486c 100644
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -437,7 +437,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl)
 		status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));

 		if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) {
-			ctrl->status[dev_num] = status;
+			ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK;
 			return dev_num;
 		}
 	}
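
For illustration only, not part of the patch: a user-space sketch of how the shifted value still carries the bits of higher-numbered devices. The two-bits-per-device layout and the numeric status codes are assumptions for the example. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: two status bits per device. */
#define STATUS_SZ	2
#define STATUS_MASK	0x3u
#define SLAVE_ATTACHED	1u
#define SLAVE_ALERT	2u

/* Old behaviour: the shifted value is stored as is. */
static uint32_t status_unmasked(uint32_t val, unsigned int dev_num)
{
	return val >> (dev_num * STATUS_SZ);
}

/* Fixed behaviour: only this device's bits are kept. */
static uint32_t status_masked(uint32_t val, unsigned int dev_num)
{
	return (val >> (dev_num * STATUS_SZ)) & STATUS_MASK;
}

void demo_status_mask(void)
{
	/* Device 1 reports ALERT, device 2 reports ATTACHED. */
	uint32_t val = (SLAVE_ALERT << (1 * STATUS_SZ)) |
		       (SLAVE_ATTACHED << (2 * STATUS_SZ));

	assert(status_unmasked(val, 1) == 6);		/* device 2 bits leak in */
	assert(status_masked(val, 1) == SLAVE_ALERT);	/* device 1 bits only */
}
```

The comparison in the driver already masks the value. The sketch shows that the stored copy needs the same mask.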
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 7891d0a5ce6f627132d3068ba925cf86f29008b1 ]
This code has two problems:
1) The devm_ioremap() function returns NULL on failure, not an error pointer.
2) It checks the wrong variable: ->mmio instead of ->acp_mmio.
Fixes: d8f48fbdfd9a ("soundwire: amd: Add support for AMD Manager driver")
Suggested-by: "Mukunda,Vijendar" vijendar.mukunda@amd.com
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Link: https://lore.kernel.org/r/9863b2bf-0de2-4bf8-8f09-fe24dc5c63ff@moroto.mounta...
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/soundwire/amd_manager.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/soundwire/amd_manager.c b/drivers/soundwire/amd_manager.c
index 9fb7f91ca1827..21c638e38c51f 100644
--- a/drivers/soundwire/amd_manager.c
+++ b/drivers/soundwire/amd_manager.c
@@ -910,9 +910,9 @@ static int amd_sdw_manager_probe(struct platform_device *pdev)
 		return -ENOMEM;

 	amd_manager->acp_mmio = devm_ioremap(dev, res->start, resource_size(res));
-	if (IS_ERR(amd_manager->mmio)) {
+	if (!amd_manager->acp_mmio) {
 		dev_err(dev, "mmio not found\n");
-		return PTR_ERR(amd_manager->mmio);
+		return -ENOMEM;
 	}
 	amd_manager->instance = pdata->instance;
 	amd_manager->mmio = amd_manager->acp_mmio +
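
For illustration only, not part of the patch: a user-space model of the error-pointer convention, showing that an "is this an error pointer" test does not catch NULL. The model_* helpers are hypothetical stand-ins written for this example, not the kernel macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encodes a negative errno as a pointer near the top of the address space. */
static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for pointers in the last MODEL_MAX_ERRNO addresses. */
static int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

void demo_null_vs_err_ptr(void)
{
	int object = 0;

	assert(model_is_err(model_err_ptr(-ENOMEM)));	/* encoded error */
	assert(!model_is_err(&object));			/* valid pointer */
	assert(!model_is_err(NULL));			/* NULL slips through */
}
```

A function that reports failure with NULL therefore needs a plain `!ptr` test, as in the fixed code.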
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 9c2dcfc2cf0f2e4e0a0db33bc1a626e35928c475 ]
Address these compiler warnings by initialising the m_best and p_best values to 0 and 1 respectively (as the latter is used as a divisor):

drivers/media/i2c/tc358746.c: In function 'tc358746_find_pll_settings':
drivers/media/i2c/tc358746.c:817:13: warning: 'p_best' is used uninitialized [-Wuninitialized]
  817 |         u16 p_best, p;
      |             ^~~~~~
drivers/media/i2c/tc358746.c:816:13: warning: 'm_best' is used uninitialized [-Wuninitialized]
  816 |         u16 m_best, mul;
      |             ^~~~~~

The warnings may well be false positives, but it is difficult for a compiler to find out whether that truly is the case.
Closes: https://lore.kernel.org/oe-kbuild-all/202305301627.fLT3Bkds-lkp@intel.com/
Reported-by: kernel test robot lkp@intel.com
Fixes: 80a21da3605 ("media: tc358746: add Toshiba TC358746 Parallel to CSI-2 bridge driver")
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com
Reviewed-by: Marco Felsch m.felsch@pengutronix.de
Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/i2c/tc358746.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/i2c/tc358746.c b/drivers/media/i2c/tc358746.c
index ec1a193ba161a..25fbce5cabdaa 100644
--- a/drivers/media/i2c/tc358746.c
+++ b/drivers/media/i2c/tc358746.c
@@ -813,8 +813,8 @@ static unsigned long tc358746_find_pll_settings(struct tc358746 *tc358746,
 	u32 min_delta = 0xffffffff;
 	u16 prediv_max = 17;
 	u16 prediv_min = 1;
-	u16 m_best, mul;
-	u16 p_best, p;
+	u16 m_best = 0, mul;
+	u16 p_best = 1, p;
 	u8 postdiv;

 	if (fout > 1000 * HZ_PER_MHZ) {
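
For illustration only, not part of the patch: a user-space sketch of why 0 and 1 are safe defaults for a "best multiplier / best divider" pair. The search below is a hypothetical model, not the driver's algorithm. It only shows that when no candidate is accepted, the final division is still well defined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical search: tries every multiplier/divider pair and keeps
 * the one whose output is closest to fout without exceeding limit.
 */
unsigned long model_find_pll(unsigned long fin, unsigned long fout,
			     unsigned long limit, uint16_t m_max,
			     uint16_t p_max)
{
	unsigned long min_delta = ~0UL;
	uint16_t m_best = 0, m;	/* 0: no multiplier selected yet */
	uint16_t p_best = 1, p;	/* 1: safe divisor if nothing is selected */

	for (p = 1; p <= p_max; p++) {
		for (m = 1; m <= m_max; m++) {
			unsigned long f = fin * m / p;
			unsigned long delta = f > fout ? f - fout : fout - f;

			if (f > limit)
				continue;	/* candidate rejected */
			if (delta < min_delta) {
				min_delta = delta;
				m_best = m;
				p_best = p;
			}
		}
	}

	/* Well defined even when every candidate was rejected. */
	return fin * m_best / p_best;
}

void demo_pll_defaults(void)
{
	/* A candidate is found: 10 * 2 / 1 = 20 is closest to 25. */
	assert(model_find_pll(10, 25, 100, 4, 2) == 20);
	/* Every candidate exceeds the limit: defaults give 10 * 0 / 1. */
	assert(model_find_pll(10, 25, 1, 4, 2) == 0);
}
```

Without the initialisers, the second case would read both variables before any assignment.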
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit bf4c985707d3168ebb7d87d15830de66949d979c ]
Select V4L2_FWNODE as the driver depends on it.
Reported-by: Andy Shevchenko andriy.shevchenko@linux.intel.com
Fixes: aa31f6514047 ("media: atomisp: allow building the driver again")
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com
Tested-by: Andy Shevchenko andriy.shevchenko@linux.intel.com
Reviewed-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/staging/media/atomisp/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/media/atomisp/Kconfig b/drivers/staging/media/atomisp/Kconfig
index c9bff98e5309a..e9b168ba97bf1 100644
--- a/drivers/staging/media/atomisp/Kconfig
+++ b/drivers/staging/media/atomisp/Kconfig
@@ -13,6 +13,7 @@ config VIDEO_ATOMISP
 	tristate "Intel Atom Image Signal Processor Driver"
 	depends on VIDEO_DEV && INTEL_ATOMISP
 	depends on PMIC_OPREGION
+	select V4L2_FWNODE
 	select IOSF_MBI
 	select VIDEOBUF2_VMALLOC
 	select VIDEO_V4L2_SUBDEV_API
From: Nicolas Dufresne nicolas.dufresne@collabora.com
[ Upstream commit dcff0b56f661b6b42e828012b464d22cc2068c38 ]
The firmware path did not match the one under which the firmware was submitted to linux-firmware, which prevented generic distributions from having a working codec.
Fixes: 9f599f351e86 ("media: amphion: add vpu core driver")
Signed-off-by: Nicolas Dufresne nicolas.dufresne@collabora.com
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/platform/amphion/vpu_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
index de23627a119a0..82bf8b3be66a2 100644
--- a/drivers/media/platform/amphion/vpu_core.c
+++ b/drivers/media/platform/amphion/vpu_core.c
@@ -826,7 +826,7 @@ static const struct dev_pm_ops vpu_core_pm_ops = {

 static struct vpu_core_resources imx8q_enc = {
 	.type = VPU_CORE_TYPE_ENC,
-	.fwname = "vpu/vpu_fw_imx8_enc.bin",
+	.fwname = "amphion/vpu/vpu_fw_imx8_enc.bin",
 	.stride = 16,
 	.max_width = 1920,
 	.max_height = 1920,
@@ -841,7 +841,7 @@ static struct vpu_core_resources imx8q_enc = {

 static struct vpu_core_resources imx8q_dec = {
 	.type = VPU_CORE_TYPE_DEC,
-	.fwname = "vpu/vpu_fw_imx8_dec.bin",
+	.fwname = "amphion/vpu/vpu_fw_imx8_dec.bin",
 	.stride = 256,
 	.max_width = 8188,
 	.max_height = 8188,
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit da4ede4b7fd6aa341b69e3a9d2517b8df5e744fd ]
Lots of data and functions here are not needed when CONFIG_OF is not set, so move them inside #ifdef CONFIG_OF blocks to prevent the warnings.
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1645:29: warning: ‘mtk_jpeg_clocks’ defined but not used [-Wunused-variable]
 1645 | static struct clk_bulk_data mtk_jpeg_clocks[] = {
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1640:29: warning: ‘mt8173_jpeg_dec_clocks’ defined but not used [-Wunused-variable]
 1640 | static struct clk_bulk_data mt8173_jpeg_dec_clocks[] = {
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1481:20: warning: ‘mtk_jpeg_dec_irq’ defined but not used [-Wunused-function]
 1481 | static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1461:20: warning: ‘mtk_jpeg_enc_irq’ defined but not used [-Wunused-function]
 1461 | static irqreturn_t mtk_jpeg_enc_irq(int irq, void *priv)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1180:13: warning: ‘mtk_jpegdec_worker’ defined but not used [-Wunused-function]
 1180 | static void mtk_jpegdec_worker(struct work_struct *work)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:986:13: warning: ‘mtk_jpegenc_worker’ defined but not used [-Wunused-function]
  986 | static void mtk_jpegenc_worker(struct work_struct *work)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:79:28: warning: ‘mtk_jpeg_dec_formats’ defined but not used [-Wunused-variable]
   79 | static struct mtk_jpeg_fmt mtk_jpeg_dec_formats[] = {
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:31:28: warning: ‘mtk_jpeg_enc_formats’ defined but not used [-Wunused-variable]
   31 | static struct mtk_jpeg_fmt mtk_jpeg_enc_formats[] = {
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1222:20: warning: ‘mtk_jpeg_enc_done’ defined but not used [-Wunused-function]
 1222 | static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1072:12: warning: ‘mtk_jpegdec_set_hw_param’ defined but not used [-Wunused-function]
 1072 | static int mtk_jpegdec_set_hw_param(struct mtk_jpeg_ctx *ctx,
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1060:12: warning: ‘mtk_jpegdec_put_hw’ defined but not used [-Wunused-function]
 1060 | static int mtk_jpegdec_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1038:12: warning: ‘mtk_jpegdec_get_hw’ defined but not used [-Wunused-function]
 1038 | static int mtk_jpegdec_get_hw(struct mtk_jpeg_ctx *ctx)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:977:12: warning: ‘mtk_jpegenc_put_hw’ defined but not used [-Wunused-function]
  977 | static int mtk_jpegenc_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:963:12: warning: ‘mtk_jpegenc_set_hw_param’ defined but not used [-Wunused-function]
  963 | static int mtk_jpegenc_set_hw_param(struct mtk_jpeg_ctx *ctx,
../drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:941:12: warning: ‘mtk_jpegenc_get_hw’ defined but not used [-Wunused-function]
  941 | static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx)
Signed-off-by: Randy Dunlap rdunlap@infradead.org
Reported-by: kernel test robot lkp@intel.com
Link: https://lore.kernel.org/linux-media/202305042146.j4ZxuvpM-lkp@intel.com/
Cc: Bin Liu bin.liu@mediatek.com
Cc: oushixiong oushixiong@kylinos.cn
Cc: Mauro Carvalho Chehab mchehab@kernel.org
Cc: Hans Verkuil hverkuil-cisco@xs4all.nl
Cc: linux-media@vger.kernel.org
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
Stable-dep-of: 20de9fdaf488 ("media: mtk_jpeg_core: avoid unused-variable warning")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../platform/mediatek/jpeg/mtk_jpeg_core.c | 858 +++++++++---------
 1 file changed, 430 insertions(+), 428 deletions(-)
diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c index 0051f372a66cf..4768156181c99 100644 --- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c +++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c @@ -28,6 +28,7 @@ #include "mtk_jpeg_core.h" #include "mtk_jpeg_dec_parse.h"
+#if defined(CONFIG_OF) static struct mtk_jpeg_fmt mtk_jpeg_enc_formats[] = { { .fourcc = V4L2_PIX_FMT_JPEG, @@ -101,6 +102,7 @@ static struct mtk_jpeg_fmt mtk_jpeg_dec_formats[] = { .flags = MTK_JPEG_FMT_FLAG_CAPTURE, }, }; +#endif
#define MTK_JPEG_ENC_NUM_FORMATS ARRAY_SIZE(mtk_jpeg_enc_formats) #define MTK_JPEG_DEC_NUM_FORMATS ARRAY_SIZE(mtk_jpeg_dec_formats) @@ -936,148 +938,6 @@ static int mtk_jpeg_set_dec_dst(struct mtk_jpeg_ctx *ctx, return 0; }
-static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx) -{ - struct mtk_jpegenc_comp_dev *comp_jpeg; - struct mtk_jpeg_dev *jpeg = ctx->jpeg; - unsigned long flags; - int hw_id = -1; - int i; - - spin_lock_irqsave(&jpeg->hw_lock, flags); - for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) { - comp_jpeg = jpeg->enc_hw_dev[i]; - if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) { - hw_id = i; - comp_jpeg->hw_state = MTK_JPEG_HW_BUSY; - break; - } - } - spin_unlock_irqrestore(&jpeg->hw_lock, flags); - - return hw_id; -} - -static int mtk_jpegenc_set_hw_param(struct mtk_jpeg_ctx *ctx, - int hw_id, - struct vb2_v4l2_buffer *src_buf, - struct vb2_v4l2_buffer *dst_buf) -{ - struct mtk_jpegenc_comp_dev *jpeg = ctx->jpeg->enc_hw_dev[hw_id]; - - jpeg->hw_param.curr_ctx = ctx; - jpeg->hw_param.src_buffer = src_buf; - jpeg->hw_param.dst_buffer = dst_buf; - - return 0; -} - -static int mtk_jpegenc_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id) -{ - unsigned long flags; - - spin_lock_irqsave(&jpeg->hw_lock, flags); - jpeg->enc_hw_dev[hw_id]->hw_state = MTK_JPEG_HW_IDLE; - spin_unlock_irqrestore(&jpeg->hw_lock, flags); - - return 0; -} - -static void mtk_jpegenc_worker(struct work_struct *work) -{ - struct mtk_jpegenc_comp_dev *comp_jpeg[MTK_JPEGENC_HW_MAX]; - enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; - struct mtk_jpeg_src_buf *jpeg_dst_buf; - struct vb2_v4l2_buffer *src_buf, *dst_buf; - int ret, i, hw_id = 0; - unsigned long flags; - - struct mtk_jpeg_ctx *ctx = container_of(work, - struct mtk_jpeg_ctx, - jpeg_work); - struct mtk_jpeg_dev *jpeg = ctx->jpeg; - - for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) - comp_jpeg[i] = jpeg->enc_hw_dev[i]; - i = 0; - -retry_select: - hw_id = mtk_jpegenc_get_hw(ctx); - if (hw_id < 0) { - ret = wait_event_interruptible(jpeg->hw_wq, - atomic_read(&jpeg->hw_rdy) > 0); - if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) { - dev_err(jpeg->dev, "%s : %d, all HW are busy\n", - __func__, __LINE__); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - 
return; - } - - goto retry_select; - } - - atomic_dec(&jpeg->hw_rdy); - src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); - if (!src_buf) - goto getbuf_fail; - - dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); - if (!dst_buf) - goto getbuf_fail; - - v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true); - - mtk_jpegenc_set_hw_param(ctx, hw_id, src_buf, dst_buf); - ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev); - if (ret < 0) { - dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n", - __func__, __LINE__); - goto enc_end; - } - - ret = clk_prepare_enable(comp_jpeg[hw_id]->venc_clk.clks->clk); - if (ret) { - dev_err(jpeg->dev, "%s : %d, jpegenc clk_prepare_enable fail\n", - __func__, __LINE__); - goto enc_end; - } - - v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - - schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work, - msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC)); - - spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags); - jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf); - jpeg_dst_buf->curr_ctx = ctx; - jpeg_dst_buf->frame_num = ctx->total_frame_num; - ctx->total_frame_num++; - mtk_jpeg_enc_reset(comp_jpeg[hw_id]->reg_base); - mtk_jpeg_set_enc_dst(ctx, - comp_jpeg[hw_id]->reg_base, - &dst_buf->vb2_buf); - mtk_jpeg_set_enc_src(ctx, - comp_jpeg[hw_id]->reg_base, - &src_buf->vb2_buf); - mtk_jpeg_set_enc_params(ctx, comp_jpeg[hw_id]->reg_base); - mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags); - - return; - -enc_end: - v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_buf_done(src_buf, buf_state); - v4l2_m2m_buf_done(dst_buf, buf_state); -getbuf_fail: - atomic_inc(&jpeg->hw_rdy); - mtk_jpegenc_put_hw(jpeg, hw_id); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); -} - static void mtk_jpeg_enc_device_run(void *priv) { struct mtk_jpeg_ctx *ctx 
= priv; @@ -1128,206 +988,39 @@ static void mtk_jpeg_multicore_enc_device_run(void *priv) queue_work(jpeg->workqueue, &ctx->jpeg_work); }
-static int mtk_jpegdec_get_hw(struct mtk_jpeg_ctx *ctx) +static void mtk_jpeg_multicore_dec_device_run(void *priv) { - struct mtk_jpegdec_comp_dev *comp_jpeg; + struct mtk_jpeg_ctx *ctx = priv; struct mtk_jpeg_dev *jpeg = ctx->jpeg; - unsigned long flags; - int hw_id = -1; - int i; - - spin_lock_irqsave(&jpeg->hw_lock, flags); - for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) { - comp_jpeg = jpeg->dec_hw_dev[i]; - if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) { - hw_id = i; - comp_jpeg->hw_state = MTK_JPEG_HW_BUSY; - break; - } - } - spin_unlock_irqrestore(&jpeg->hw_lock, flags); - - return hw_id; -} - -static int mtk_jpegdec_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id) -{ - unsigned long flags; - - spin_lock_irqsave(&jpeg->hw_lock, flags); - jpeg->dec_hw_dev[hw_id]->hw_state = - MTK_JPEG_HW_IDLE; - spin_unlock_irqrestore(&jpeg->hw_lock, flags); - - return 0; -} - -static int mtk_jpegdec_set_hw_param(struct mtk_jpeg_ctx *ctx, - int hw_id, - struct vb2_v4l2_buffer *src_buf, - struct vb2_v4l2_buffer *dst_buf) -{ - struct mtk_jpegdec_comp_dev *jpeg = - ctx->jpeg->dec_hw_dev[hw_id]; - - jpeg->hw_param.curr_ctx = ctx; - jpeg->hw_param.src_buffer = src_buf; - jpeg->hw_param.dst_buffer = dst_buf;
- return 0; + queue_work(jpeg->workqueue, &ctx->jpeg_work); }
-static void mtk_jpegdec_worker(struct work_struct *work) +static void mtk_jpeg_dec_device_run(void *priv) { - struct mtk_jpeg_ctx *ctx = container_of(work, struct mtk_jpeg_ctx, - jpeg_work); - struct mtk_jpegdec_comp_dev *comp_jpeg[MTK_JPEGDEC_HW_MAX]; - enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; - struct mtk_jpeg_src_buf *jpeg_src_buf, *jpeg_dst_buf; - struct vb2_v4l2_buffer *src_buf, *dst_buf; + struct mtk_jpeg_ctx *ctx = priv; struct mtk_jpeg_dev *jpeg = ctx->jpeg; - int ret, i, hw_id = 0; + struct vb2_v4l2_buffer *src_buf, *dst_buf; + enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; + unsigned long flags; + struct mtk_jpeg_src_buf *jpeg_src_buf; struct mtk_jpeg_bs bs; struct mtk_jpeg_fb fb; - unsigned long flags; - - for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) - comp_jpeg[i] = jpeg->dec_hw_dev[i]; - i = 0; - -retry_select: - hw_id = mtk_jpegdec_get_hw(ctx); - if (hw_id < 0) { - ret = wait_event_interruptible_timeout(jpeg->hw_wq, - atomic_read(&jpeg->hw_rdy) > 0, - MTK_JPEG_HW_TIMEOUT_MSEC); - if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) { - dev_err(jpeg->dev, "%s : %d, all HW are busy\n", - __func__, __LINE__); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - return; - } - - goto retry_select; - } + int ret;
- atomic_dec(&jpeg->hw_rdy); src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); - if (!src_buf) - goto getbuf_fail; - dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); - if (!dst_buf) - goto getbuf_fail; - - v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true); jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf); - jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
- if (mtk_jpeg_check_resolution_change(ctx, - &jpeg_src_buf->dec_param)) { + if (mtk_jpeg_check_resolution_change(ctx, &jpeg_src_buf->dec_param)) { mtk_jpeg_queue_src_chg_event(ctx); ctx->state = MTK_JPEG_SOURCE_CHANGE; - goto getbuf_fail; + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + return; }
- jpeg_src_buf->curr_ctx = ctx; - jpeg_src_buf->frame_num = ctx->total_frame_num; - jpeg_dst_buf->curr_ctx = ctx; - jpeg_dst_buf->frame_num = ctx->total_frame_num; - - mtk_jpegdec_set_hw_param(ctx, hw_id, src_buf, dst_buf); - ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev); - if (ret < 0) { - dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n", - __func__, __LINE__); - goto dec_end; - } - - ret = clk_prepare_enable(comp_jpeg[hw_id]->jdec_clk.clks->clk); - if (ret) { - dev_err(jpeg->dev, "%s : %d, jpegdec clk_prepare_enable fail\n", - __func__, __LINE__); - goto clk_end; - } - - v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - - schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work, - msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC)); - - mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs); - if (mtk_jpeg_set_dec_dst(ctx, - &jpeg_src_buf->dec_param, - &dst_buf->vb2_buf, &fb)) { - dev_err(jpeg->dev, "%s : %d, mtk_jpeg_set_dec_dst fail\n", - __func__, __LINE__); - goto setdst_end; - } - - spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags); - ctx->total_frame_num++; - mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base); - mtk_jpeg_dec_set_config(comp_jpeg[hw_id]->reg_base, - &jpeg_src_buf->dec_param, - jpeg_src_buf->bs_size, - &bs, - &fb); - mtk_jpeg_dec_start(comp_jpeg[hw_id]->reg_base); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags); - - return; - -setdst_end: - clk_disable_unprepare(comp_jpeg[hw_id]->jdec_clk.clks->clk); -clk_end: - pm_runtime_put(comp_jpeg[hw_id]->dev); -dec_end: - v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - v4l2_m2m_buf_done(src_buf, buf_state); - v4l2_m2m_buf_done(dst_buf, buf_state); -getbuf_fail: - atomic_inc(&jpeg->hw_rdy); - mtk_jpegdec_put_hw(jpeg, hw_id); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); -} - -static void mtk_jpeg_multicore_dec_device_run(void *priv) -{ - struct 
mtk_jpeg_ctx *ctx = priv; - struct mtk_jpeg_dev *jpeg = ctx->jpeg; - - queue_work(jpeg->workqueue, &ctx->jpeg_work); -} - -static void mtk_jpeg_dec_device_run(void *priv) -{ - struct mtk_jpeg_ctx *ctx = priv; - struct mtk_jpeg_dev *jpeg = ctx->jpeg; - struct vb2_v4l2_buffer *src_buf, *dst_buf; - enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; - unsigned long flags; - struct mtk_jpeg_src_buf *jpeg_src_buf; - struct mtk_jpeg_bs bs; - struct mtk_jpeg_fb fb; - int ret; - - src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); - dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); - jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf); - - if (mtk_jpeg_check_resolution_change(ctx, &jpeg_src_buf->dec_param)) { - mtk_jpeg_queue_src_chg_event(ctx); - ctx->state = MTK_JPEG_SOURCE_CHANGE; - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - return; - } - - ret = pm_runtime_resume_and_get(jpeg->dev); - if (ret < 0) + ret = pm_runtime_resume_and_get(jpeg->dev); + if (ret < 0) goto dec_end;
schedule_delayed_work(&jpeg->job_timeout_work, @@ -1430,101 +1123,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg) jpeg->variant->clks); }
-static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg) -{ - struct mtk_jpeg_ctx *ctx; - struct vb2_v4l2_buffer *src_buf, *dst_buf; - enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; - u32 result_size; - - ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev); - if (!ctx) { - v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n"); - return IRQ_HANDLED; - } - - src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - - result_size = mtk_jpeg_enc_get_file_size(jpeg->reg_base); - vb2_set_plane_payload(&dst_buf->vb2_buf, 0, result_size); - - buf_state = VB2_BUF_STATE_DONE; - - v4l2_m2m_buf_done(src_buf, buf_state); - v4l2_m2m_buf_done(dst_buf, buf_state); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - pm_runtime_put(ctx->jpeg->dev); - return IRQ_HANDLED; -} - -static irqreturn_t mtk_jpeg_enc_irq(int irq, void *priv) -{ - struct mtk_jpeg_dev *jpeg = priv; - u32 irq_status; - irqreturn_t ret = IRQ_NONE; - - cancel_delayed_work(&jpeg->job_timeout_work); - - irq_status = readl(jpeg->reg_base + JPEG_ENC_INT_STS) & - JPEG_ENC_INT_STATUS_MASK_ALLIRQ; - if (irq_status) - writel(0, jpeg->reg_base + JPEG_ENC_INT_STS); - - if (!(irq_status & JPEG_ENC_INT_STATUS_DONE)) - return ret; - - ret = mtk_jpeg_enc_done(jpeg); - return ret; -} - -static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv) -{ - struct mtk_jpeg_dev *jpeg = priv; - struct mtk_jpeg_ctx *ctx; - struct vb2_v4l2_buffer *src_buf, *dst_buf; - struct mtk_jpeg_src_buf *jpeg_src_buf; - enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; - u32 dec_irq_ret; - u32 dec_ret; - int i; - - cancel_delayed_work(&jpeg->job_timeout_work); - - dec_ret = mtk_jpeg_dec_get_int_status(jpeg->reg_base); - dec_irq_ret = mtk_jpeg_dec_enum_result(dec_ret); - ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev); - if (!ctx) { - v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n"); - return IRQ_HANDLED; - } - - src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); - dst_buf = 
v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); - jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf); - - if (dec_irq_ret >= MTK_JPEG_DEC_RESULT_UNDERFLOW) - mtk_jpeg_dec_reset(jpeg->reg_base); - - if (dec_irq_ret != MTK_JPEG_DEC_RESULT_EOF_DONE) { - dev_err(jpeg->dev, "decode failed\n"); - goto dec_end; - } - - for (i = 0; i < dst_buf->vb2_buf.num_planes; i++) - vb2_set_plane_payload(&dst_buf->vb2_buf, i, - jpeg_src_buf->dec_param.comp_size[i]); - - buf_state = VB2_BUF_STATE_DONE; - -dec_end: - v4l2_m2m_buf_done(src_buf, buf_state); - v4l2_m2m_buf_done(dst_buf, buf_state); - v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); - pm_runtime_put(ctx->jpeg->dev); - return IRQ_HANDLED; -} - static void mtk_jpeg_set_default_params(struct mtk_jpeg_ctx *ctx) { struct mtk_jpeg_q_data *q = &ctx->out_q; @@ -1637,15 +1235,6 @@ static const struct v4l2_file_operations mtk_jpeg_fops = { .mmap = v4l2_m2m_fop_mmap, };
-static struct clk_bulk_data mt8173_jpeg_dec_clocks[] = { - { .id = "jpgdec-smi" }, - { .id = "jpgdec" }, -}; - -static struct clk_bulk_data mtk_jpeg_clocks[] = { - { .id = "jpgenc" }, -}; - static void mtk_jpeg_job_timeout_work(struct work_struct *work) { struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev, @@ -1867,6 +1456,419 @@ static const struct dev_pm_ops mtk_jpeg_pm_ops = { };
#if defined(CONFIG_OF) +static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx) +{ + struct mtk_jpegenc_comp_dev *comp_jpeg; + struct mtk_jpeg_dev *jpeg = ctx->jpeg; + unsigned long flags; + int hw_id = -1; + int i; + + spin_lock_irqsave(&jpeg->hw_lock, flags); + for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) { + comp_jpeg = jpeg->enc_hw_dev[i]; + if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) { + hw_id = i; + comp_jpeg->hw_state = MTK_JPEG_HW_BUSY; + break; + } + } + spin_unlock_irqrestore(&jpeg->hw_lock, flags); + + return hw_id; +} + +static int mtk_jpegenc_set_hw_param(struct mtk_jpeg_ctx *ctx, + int hw_id, + struct vb2_v4l2_buffer *src_buf, + struct vb2_v4l2_buffer *dst_buf) +{ + struct mtk_jpegenc_comp_dev *jpeg = ctx->jpeg->enc_hw_dev[hw_id]; + + jpeg->hw_param.curr_ctx = ctx; + jpeg->hw_param.src_buffer = src_buf; + jpeg->hw_param.dst_buffer = dst_buf; + + return 0; +} + +static int mtk_jpegenc_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id) +{ + unsigned long flags; + + spin_lock_irqsave(&jpeg->hw_lock, flags); + jpeg->enc_hw_dev[hw_id]->hw_state = MTK_JPEG_HW_IDLE; + spin_unlock_irqrestore(&jpeg->hw_lock, flags); + + return 0; +} + +static int mtk_jpegdec_get_hw(struct mtk_jpeg_ctx *ctx) +{ + struct mtk_jpegdec_comp_dev *comp_jpeg; + struct mtk_jpeg_dev *jpeg = ctx->jpeg; + unsigned long flags; + int hw_id = -1; + int i; + + spin_lock_irqsave(&jpeg->hw_lock, flags); + for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) { + comp_jpeg = jpeg->dec_hw_dev[i]; + if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) { + hw_id = i; + comp_jpeg->hw_state = MTK_JPEG_HW_BUSY; + break; + } + } + spin_unlock_irqrestore(&jpeg->hw_lock, flags); + + return hw_id; +} + +static int mtk_jpegdec_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id) +{ + unsigned long flags; + + spin_lock_irqsave(&jpeg->hw_lock, flags); + jpeg->dec_hw_dev[hw_id]->hw_state = + MTK_JPEG_HW_IDLE; + spin_unlock_irqrestore(&jpeg->hw_lock, flags); + + return 0; +} + +static int mtk_jpegdec_set_hw_param(struct mtk_jpeg_ctx *ctx, + int 
hw_id, + struct vb2_v4l2_buffer *src_buf, + struct vb2_v4l2_buffer *dst_buf) +{ + struct mtk_jpegdec_comp_dev *jpeg = + ctx->jpeg->dec_hw_dev[hw_id]; + + jpeg->hw_param.curr_ctx = ctx; + jpeg->hw_param.src_buffer = src_buf; + jpeg->hw_param.dst_buffer = dst_buf; + + return 0; +} + +static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg) +{ + struct mtk_jpeg_ctx *ctx; + struct vb2_v4l2_buffer *src_buf, *dst_buf; + enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; + u32 result_size; + + ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev); + if (!ctx) { + v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n"); + return IRQ_HANDLED; + } + + src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + + result_size = mtk_jpeg_enc_get_file_size(jpeg->reg_base); + vb2_set_plane_payload(&dst_buf->vb2_buf, 0, result_size); + + buf_state = VB2_BUF_STATE_DONE; + + v4l2_m2m_buf_done(src_buf, buf_state); + v4l2_m2m_buf_done(dst_buf, buf_state); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + pm_runtime_put(ctx->jpeg->dev); + return IRQ_HANDLED; +} + +static void mtk_jpegenc_worker(struct work_struct *work) +{ + struct mtk_jpegenc_comp_dev *comp_jpeg[MTK_JPEGENC_HW_MAX]; + enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; + struct mtk_jpeg_src_buf *jpeg_dst_buf; + struct vb2_v4l2_buffer *src_buf, *dst_buf; + int ret, i, hw_id = 0; + unsigned long flags; + + struct mtk_jpeg_ctx *ctx = container_of(work, + struct mtk_jpeg_ctx, + jpeg_work); + struct mtk_jpeg_dev *jpeg = ctx->jpeg; + + for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) + comp_jpeg[i] = jpeg->enc_hw_dev[i]; + i = 0; + +retry_select: + hw_id = mtk_jpegenc_get_hw(ctx); + if (hw_id < 0) { + ret = wait_event_interruptible(jpeg->hw_wq, + atomic_read(&jpeg->hw_rdy) > 0); + if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) { + dev_err(jpeg->dev, "%s : %d, all HW are busy\n", + __func__, __LINE__); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + return; + } + + goto 
retry_select; + } + + atomic_dec(&jpeg->hw_rdy); + src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); + if (!src_buf) + goto getbuf_fail; + + dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); + if (!dst_buf) + goto getbuf_fail; + + v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true); + + mtk_jpegenc_set_hw_param(ctx, hw_id, src_buf, dst_buf); + ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev); + if (ret < 0) { + dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n", + __func__, __LINE__); + goto enc_end; + } + + ret = clk_prepare_enable(comp_jpeg[hw_id]->venc_clk.clks->clk); + if (ret) { + dev_err(jpeg->dev, "%s : %d, jpegenc clk_prepare_enable fail\n", + __func__, __LINE__); + goto enc_end; + } + + v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + + schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work, + msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC)); + + spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags); + jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf); + jpeg_dst_buf->curr_ctx = ctx; + jpeg_dst_buf->frame_num = ctx->total_frame_num; + ctx->total_frame_num++; + mtk_jpeg_enc_reset(comp_jpeg[hw_id]->reg_base); + mtk_jpeg_set_enc_dst(ctx, + comp_jpeg[hw_id]->reg_base, + &dst_buf->vb2_buf); + mtk_jpeg_set_enc_src(ctx, + comp_jpeg[hw_id]->reg_base, + &src_buf->vb2_buf); + mtk_jpeg_set_enc_params(ctx, comp_jpeg[hw_id]->reg_base); + mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags); + + return; + +enc_end: + v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_buf_done(src_buf, buf_state); + v4l2_m2m_buf_done(dst_buf, buf_state); +getbuf_fail: + atomic_inc(&jpeg->hw_rdy); + mtk_jpegenc_put_hw(jpeg, hw_id); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); +} + +static void mtk_jpegdec_worker(struct work_struct *work) +{ + struct mtk_jpeg_ctx *ctx = 
container_of(work, struct mtk_jpeg_ctx, + jpeg_work); + struct mtk_jpegdec_comp_dev *comp_jpeg[MTK_JPEGDEC_HW_MAX]; + enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; + struct mtk_jpeg_src_buf *jpeg_src_buf, *jpeg_dst_buf; + struct vb2_v4l2_buffer *src_buf, *dst_buf; + struct mtk_jpeg_dev *jpeg = ctx->jpeg; + int ret, i, hw_id = 0; + struct mtk_jpeg_bs bs; + struct mtk_jpeg_fb fb; + unsigned long flags; + + for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) + comp_jpeg[i] = jpeg->dec_hw_dev[i]; + i = 0; + +retry_select: + hw_id = mtk_jpegdec_get_hw(ctx); + if (hw_id < 0) { + ret = wait_event_interruptible_timeout(jpeg->hw_wq, + atomic_read(&jpeg->hw_rdy) > 0, + MTK_JPEG_HW_TIMEOUT_MSEC); + if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) { + dev_err(jpeg->dev, "%s : %d, all HW are busy\n", + __func__, __LINE__); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + return; + } + + goto retry_select; + } + + atomic_dec(&jpeg->hw_rdy); + src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); + if (!src_buf) + goto getbuf_fail; + + dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); + if (!dst_buf) + goto getbuf_fail; + + v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true); + jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf); + jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf); + + if (mtk_jpeg_check_resolution_change(ctx, + &jpeg_src_buf->dec_param)) { + mtk_jpeg_queue_src_chg_event(ctx); + ctx->state = MTK_JPEG_SOURCE_CHANGE; + goto getbuf_fail; + } + + jpeg_src_buf->curr_ctx = ctx; + jpeg_src_buf->frame_num = ctx->total_frame_num; + jpeg_dst_buf->curr_ctx = ctx; + jpeg_dst_buf->frame_num = ctx->total_frame_num; + + mtk_jpegdec_set_hw_param(ctx, hw_id, src_buf, dst_buf); + ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev); + if (ret < 0) { + dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n", + __func__, __LINE__); + goto dec_end; + } + + ret = clk_prepare_enable(comp_jpeg[hw_id]->jdec_clk.clks->clk); + if (ret) { + dev_err(jpeg->dev, "%s : %d, jpegdec 
clk_prepare_enable fail\n", + __func__, __LINE__); + goto clk_end; + } + + v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + + schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work, + msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC)); + + mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs); + if (mtk_jpeg_set_dec_dst(ctx, + &jpeg_src_buf->dec_param, + &dst_buf->vb2_buf, &fb)) { + dev_err(jpeg->dev, "%s : %d, mtk_jpeg_set_dec_dst fail\n", + __func__, __LINE__); + goto setdst_end; + } + + spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags); + ctx->total_frame_num++; + mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base); + mtk_jpeg_dec_set_config(comp_jpeg[hw_id]->reg_base, + &jpeg_src_buf->dec_param, + jpeg_src_buf->bs_size, + &bs, + &fb); + mtk_jpeg_dec_start(comp_jpeg[hw_id]->reg_base); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags); + + return; + +setdst_end: + clk_disable_unprepare(comp_jpeg[hw_id]->jdec_clk.clks->clk); +clk_end: + pm_runtime_put(comp_jpeg[hw_id]->dev); +dec_end: + v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + v4l2_m2m_buf_done(src_buf, buf_state); + v4l2_m2m_buf_done(dst_buf, buf_state); +getbuf_fail: + atomic_inc(&jpeg->hw_rdy); + mtk_jpegdec_put_hw(jpeg, hw_id); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); +} + +static irqreturn_t mtk_jpeg_enc_irq(int irq, void *priv) +{ + struct mtk_jpeg_dev *jpeg = priv; + u32 irq_status; + irqreturn_t ret = IRQ_NONE; + + cancel_delayed_work(&jpeg->job_timeout_work); + + irq_status = readl(jpeg->reg_base + JPEG_ENC_INT_STS) & + JPEG_ENC_INT_STATUS_MASK_ALLIRQ; + if (irq_status) + writel(0, jpeg->reg_base + JPEG_ENC_INT_STS); + + if (!(irq_status & JPEG_ENC_INT_STATUS_DONE)) + return ret; + + ret = mtk_jpeg_enc_done(jpeg); + return ret; +} + +static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv) +{ + struct mtk_jpeg_dev *jpeg = priv; + struct mtk_jpeg_ctx 
*ctx; + struct vb2_v4l2_buffer *src_buf, *dst_buf; + struct mtk_jpeg_src_buf *jpeg_src_buf; + enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR; + u32 dec_irq_ret; + u32 dec_ret; + int i; + + cancel_delayed_work(&jpeg->job_timeout_work); + + dec_ret = mtk_jpeg_dec_get_int_status(jpeg->reg_base); + dec_irq_ret = mtk_jpeg_dec_enum_result(dec_ret); + ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev); + if (!ctx) { + v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n"); + return IRQ_HANDLED; + } + + src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf); + + if (dec_irq_ret >= MTK_JPEG_DEC_RESULT_UNDERFLOW) + mtk_jpeg_dec_reset(jpeg->reg_base); + + if (dec_irq_ret != MTK_JPEG_DEC_RESULT_EOF_DONE) { + dev_err(jpeg->dev, "decode failed\n"); + goto dec_end; + } + + for (i = 0; i < dst_buf->vb2_buf.num_planes; i++) + vb2_set_plane_payload(&dst_buf->vb2_buf, i, + jpeg_src_buf->dec_param.comp_size[i]); + + buf_state = VB2_BUF_STATE_DONE; + +dec_end: + v4l2_m2m_buf_done(src_buf, buf_state); + v4l2_m2m_buf_done(dst_buf, buf_state); + v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); + pm_runtime_put(ctx->jpeg->dev); + return IRQ_HANDLED; +} + +static struct clk_bulk_data mtk_jpeg_clocks[] = { + { .id = "jpgenc" }, +}; + +static struct clk_bulk_data mt8173_jpeg_dec_clocks[] = { + { .id = "jpgdec-smi" }, + { .id = "jpgdec" }, +}; + static const struct mtk_jpeg_variant mt8173_jpeg_drvdata = { .clks = mt8173_jpeg_dec_clocks, .num_clks = ARRAY_SIZE(mt8173_jpeg_dec_clocks),
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 20de9fdaf4883deffb0138eef28e9cbbead32cfd ]
The mtk8195_jpegenc_drvdata object was added outside of an #ifdef causing a harmless build warning.
drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c:1879:32: error: 'mtk8195_jpegenc_drvdata' defined but not used [-Werror=unused-variable]
 1879 | static struct mtk_jpeg_variant mtk8195_jpegenc_drvdata = {
      |                                ^~~~~~~~~~~~~~~~~~~~~~~
A follow-up patch moved it inside of an #ifdef, which caused more warnings, and a third patch ended up adding even more #ifdefs. These were all bogus, since the actual problem here is the incorrect use of of_match_ptr(). Since the driver (like any other modern platform driver) only works in combination with CONFIG_OF, there is no point in hiding the reference, so just remove that along with all the pointless #ifdef checks in the driver.
This improves build coverage and avoids running into the same problem again when another part of the driver gets changed that relies on the #ifdef blocks to be completely matched.
Fixes: 934e8bccac95 ("mtk-jpegenc: support jpegenc multi-hardware")
Fixes: 4ae47770d57b ("media: mtk-jpegenc: Fix a compilation issue")
Fixes: da4ede4b7fd6 ("media: mtk-jpeg: move data/code inside CONFIG_OF blocks")
Signed-off-by: Arnd Bergmann arnd@arndb.de
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl
Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c | 6 +-----
 drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c | 4 +---
 drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c | 4 +---
 3 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
index 4768156181c99..40cb3cb87ba17 100644
--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
@@ -28,7 +28,6 @@
 #include "mtk_jpeg_core.h"
 #include "mtk_jpeg_dec_parse.h"

-#if defined(CONFIG_OF)
 static struct mtk_jpeg_fmt mtk_jpeg_enc_formats[] = {
 	{
 		.fourcc = V4L2_PIX_FMT_JPEG,
@@ -102,7 +101,6 @@ static struct mtk_jpeg_fmt mtk_jpeg_dec_formats[] = {
 		.flags = MTK_JPEG_FMT_FLAG_CAPTURE,
 	},
 };
-#endif

 #define MTK_JPEG_ENC_NUM_FORMATS ARRAY_SIZE(mtk_jpeg_enc_formats)
 #define MTK_JPEG_DEC_NUM_FORMATS ARRAY_SIZE(mtk_jpeg_dec_formats)
@@ -1455,7 +1453,6 @@ static const struct dev_pm_ops mtk_jpeg_pm_ops = {
 	SET_RUNTIME_PM_OPS(mtk_jpeg_pm_suspend, mtk_jpeg_pm_resume, NULL)
 };

-#if defined(CONFIG_OF)
 static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx)
 {
 	struct mtk_jpegenc_comp_dev *comp_jpeg;
@@ -1951,14 +1948,13 @@ static const struct of_device_id mtk_jpeg_match[] = {
 };

 MODULE_DEVICE_TABLE(of, mtk_jpeg_match);
-#endif

 static struct platform_driver mtk_jpeg_driver = {
 	.probe = mtk_jpeg_probe,
 	.remove_new = mtk_jpeg_remove,
 	.driver = {
 		.name = MTK_JPEG_NAME,
-		.of_match_table = of_match_ptr(mtk_jpeg_match),
+		.of_match_table = mtk_jpeg_match,
 		.pm = &mtk_jpeg_pm_ops,
 	},
 };
diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
index 869068fac5e2f..baa7be58ce691 100644
--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
+++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
@@ -39,7 +39,6 @@ enum mtk_jpeg_color {
 	MTK_JPEG_COLOR_400 = 0x00110000
 };

-#if defined(CONFIG_OF)
 static const struct of_device_id mtk_jpegdec_hw_ids[] = {
 	{
 		.compatible = "mediatek,mt8195-jpgdec-hw",
@@ -47,7 +46,6 @@ static const struct of_device_id mtk_jpegdec_hw_ids[] = {
 	{},
 };
 MODULE_DEVICE_TABLE(of, mtk_jpegdec_hw_ids);
-#endif

 static inline int mtk_jpeg_verify_align(u32 val, int align, u32 reg)
 {
@@ -653,7 +651,7 @@ static struct platform_driver mtk_jpegdec_hw_driver = {
 	.probe = mtk_jpegdec_hw_probe,
 	.driver = {
 		.name = "mtk-jpegdec-hw",
-		.of_match_table = of_match_ptr(mtk_jpegdec_hw_ids),
+		.of_match_table = mtk_jpegdec_hw_ids,
 	},
 };

diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
index 71e85b4bbf127..244018365b6f1 100644
--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
+++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
@@ -46,7 +46,6 @@ static const struct mtk_jpeg_enc_qlt mtk_jpeg_enc_quality[] = {
 	{.quality_param = 97, .hardware_value = JPEG_ENC_QUALITY_Q97},
 };

-#if defined(CONFIG_OF)
 static const struct of_device_id mtk_jpegenc_drv_ids[] = {
 	{
 		.compatible = "mediatek,mt8195-jpgenc-hw",
@@ -54,7 +53,6 @@ static const struct of_device_id mtk_jpegenc_drv_ids[] = {
 	{},
 };
 MODULE_DEVICE_TABLE(of, mtk_jpegenc_drv_ids);
-#endif

 void mtk_jpeg_enc_reset(void __iomem *base)
 {
@@ -377,7 +375,7 @@ static struct platform_driver mtk_jpegenc_hw_driver = {
 	.probe = mtk_jpegenc_hw_probe,
 	.driver = {
 		.name = "mtk-jpegenc-hw",
-		.of_match_table = of_match_ptr(mtk_jpegenc_drv_ids),
+		.of_match_table = mtk_jpegenc_drv_ids,
 	},
 };
From: Wang Ming machel@vivo.com
[ Upstream commit 043b1f185fb0f3939b7427f634787706f45411c4 ]
The debugfs_create_dir() function returns error pointers. It never returns NULL. Most incorrect error checks were fixed, but the one in i40e_dbg_init() was forgotten.
Fix the remaining error check.
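For readers less familiar with error pointers, the difference between the two checks can be sketched in plain userspace C. This is only an illustration under stated assumptions: err_ptr(), is_err() and ptr_err() imitate the kernel helpers of similar name, and fake_create_dir() is a hypothetical stand-in for a function that, like debugfs_create_dir(), reports failure through an error pointer rather than NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer convention:
 * the last MAX_ERRNO values of the address space encode -errno. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical stand-in for a function that never returns NULL. */
static int fake_root;

void *fake_create_dir(bool fail)
{
	if (fail)
		return err_ptr(-ENOMEM);
	return &fake_root;
}

/* The old style of check: an error pointer is not NULL, so a
 * failure goes unnoticed. */
bool failed_null_check(const void *dir)
{
	return !dir;
}

/* The corrected style of check. */
bool failed_is_err_check(const void *dir)
{
	return is_err(dir);
}
```

With a failing fake_create_dir(), failed_null_check() reports success while failed_is_err_check() reports the failure, which is the situation the patch addresses.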
Fixes: 02e9c290814c ("i40e: debugfs interface")
Signed-off-by: Wang Ming machel@vivo.com
Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 9954493cd4489..62497f5565c59 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -1839,7 +1839,7 @@ void i40e_dbg_pf_exit(struct i40e_pf *pf)
 void i40e_dbg_init(void)
 {
 	i40e_dbg_root = debugfs_create_dir(i40e_driver_name, NULL);
-	if (!i40e_dbg_root)
+	if (IS_ERR(i40e_dbg_root))
 		pr_info("init of debugfs failed\n");
 }
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit a2f054c10bef0b54600ec9cb776508443e941343 ]
In iavf_adminq_task(), if kzalloc() fails to allocate the event.msg_buf, the function will exit without releasing the adapter->crit_lock.
This is unlikely, but if it happens, the next access to that mutex will deadlock.
Fix this by moving the unlock to the end of the function, and adding a new label to allow jumping to the unlock portion of the function exit flow.
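The exit-flow pattern described above can be sketched in userspace C. This is a minimal illustration, not the driver code: struct toy_lock, process_events() and the simulate_alloc_failure flag are hypothetical names, and the toy lock only tracks whether it is held so that a leaked lock becomes visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define EVENT_BUF_SIZE 64	/* arbitrary size for the sketch */

/* Toy lock that only records whether it is currently held. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *lock)
{
	assert(!lock->held);
	lock->held = true;
}

static void toy_lock_release(struct toy_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/* Returns 0 on success and -1 if the buffer could not be allocated.
 * Every path that took the lock leaves through the unlock label. */
int process_events(struct toy_lock *lock, bool simulate_alloc_failure)
{
	char *buf;
	int ret = 0;

	toy_lock_acquire(lock);

	buf = simulate_alloc_failure ? NULL : calloc(1, EVENT_BUF_SIZE);
	if (!buf) {
		ret = -1;
		goto unlock;	/* jumping past the unlock would leak the lock */
	}

	buf[0] = 1;		/* placeholder for the real event processing */

	free(buf);
unlock:
	toy_lock_release(lock);
	return ret;
}
```

Placing the release at a single label near the end of the function means a new early exit only has to pick the right label, instead of remembering to unlock by hand.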
Fixes: fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation")
Signed-off-by: Jacob Keller jacob.e.keller@intel.com
Tested-by: Rafal Romanowski rafal.romanowski@intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index ba96312feb505..6c25d240e70bc 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3300,7 +3300,7 @@ static void iavf_adminq_task(struct work_struct *work)
 	event.buf_len = IAVF_MAX_AQ_BUF_SIZE;
 	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf)
-		goto out;
+		goto unlock;

 	do {
 		ret = iavf_clean_arq_element(hw, &event, &pending);
@@ -3315,7 +3315,6 @@ static void iavf_adminq_task(struct work_struct *work)
 		if (pending != 0)
 			memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
 	} while (pending);
-	mutex_unlock(&adapter->crit_lock);

 	if (iavf_is_reset_in_progress(adapter))
 		goto freedom;
@@ -3359,6 +3358,8 @@ static void iavf_adminq_task(struct work_struct *work)

 freedom:
 	kfree(event.msg_buf);
+unlock:
+	mutex_unlock(&adapter->crit_lock);
 out:
 	/* re-enable Admin queue interrupt cause */
 	iavf_misc_irq_enable(adapter);
From: Jacob Keller jacob.e.keller@intel.com
[ Upstream commit 91896c8acce23d33ed078cffd46a9534b1f82be5 ]
In iavf_adminq_task(), if the function can't acquire the adapter->crit_lock, it checks if the driver is removing. If so, it simply exits without re-enabling the interrupt. This is done to ensure that the task stops processing as soon as possible once the driver is being removed.
However, if the IAVF_FLAG_PF_COMMS_FAILED is set, the function checks this before attempting to acquire the lock. In this case, the function exits early and re-enables the interrupt. This will happen even if the driver is already removing.
Avoid this, by moving the check to after the adapter->crit_lock is acquired. This way, if the driver is removing, we will not re-enable the interrupt.
Fixes: fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation")
Signed-off-by: Jacob Keller jacob.e.keller@intel.com
Tested-by: Rafal Romanowski rafal.romanowski@intel.com
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 6c25d240e70bc..e48810e0627d2 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3286,9 +3286,6 @@ static void iavf_adminq_task(struct work_struct *work)
 	u32 val, oldval;
 	u16 pending;

-	if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED)
-		goto out;
-
 	if (!mutex_trylock(&adapter->crit_lock)) {
 		if (adapter->state == __IAVF_REMOVE)
 			return;
@@ -3297,6 +3294,9 @@ static void iavf_adminq_task(struct work_struct *work)
 		goto out;
 	}

+	if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED)
+		goto unlock;
+
 	event.buf_len = IAVF_MAX_AQ_BUF_SIZE;
 	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf)
From: Jiawen Wu jiawenwu@trustnetic.com
[ Upstream commit c7b75bea853daeb64fc831dbf39a6bbabcc402ac ]
Clearing the MV_V2_PORT_CTRL_PWRDOWN bit to power up the 88x3310 PHY sometimes does not take effect immediately, and a read of this register causes the bit not to clear. This makes mv3310_reset() time out, which fails the config initialization. So add a delay before the next access.
Fixes: c9cc1c815d36 ("net: phy: marvell10g: place in powersave mode at probe")
Signed-off-by: Jiawen Wu jiawenwu@trustnetic.com
Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/phy/marvell10g.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
index 55d9d7acc32eb..d4bb90d768811 100644
--- a/drivers/net/phy/marvell10g.c
+++ b/drivers/net/phy/marvell10g.c
@@ -328,6 +328,13 @@ static int mv3310_power_up(struct phy_device *phydev)
 	ret = phy_clear_bits_mmd(phydev, MDIO_MMD_VEND2, MV_V2_PORT_CTRL,
 				 MV_V2_PORT_CTRL_PWRDOWN);

+	/* Sometimes, the power down bit doesn't clear immediately, and
+	 * a read of this register causes the bit not to clear. Delay
+	 * 100us to allow the PHY to come out of power down mode before
+	 * the next access.
+	 */
+	udelay(100);
+
 	if (phydev->drv->phy_id != MARVELL_PHY_ID_88X3310 ||
 	    priv->firmware_ver < 0x00030000)
 		return ret;
From: Hao Lan lanhao@huawei.com
[ Upstream commit b27d0232e8897f7c896dc8ad80c9907dd57fd3f3 ]
Currently only the first 32 bits of the capability flags are considered. When the matching capability flag bit is greater than 31, the wrong bit is read. This patch uses a bitmap to solve this issue, so that each capability bit can be handled without a bit width limit.
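The idea of widening the capability lookup from one 32-bit word to a bitmap can be sketched in userspace C. This is a rough sketch under stated assumptions: CAP_WORDS is an arbitrary value chosen for illustration, caps_to_bitmap() and cap_test_bit() are hypothetical names that only loosely mirror what the kernel's bitmap_from_arr32() and test_bit() provide, and the little-endian conversion done by the driver is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_WORDS	3	/* assumed number of 32-bit capability words */
#define CAP_BITS	(CAP_WORDS * 32)
#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)
#define CAP_LONGS	((CAP_BITS + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Copy every bit of the 32-bit word array into an unsigned long bitmap. */
void caps_to_bitmap(unsigned long *bitmap, const uint32_t *caps)
{
	size_t i;

	for (i = 0; i < CAP_LONGS; i++)
		bitmap[i] = 0;

	for (i = 0; i < CAP_BITS; i++)
		if (caps[i / 32] & (UINT32_C(1) << (i % 32)))
			bitmap[i / BITS_PER_ULONG] |= 1UL << (i % BITS_PER_ULONG);
}

/* Test any bit in the bitmap, regardless of which word it came from. */
bool cap_test_bit(const unsigned long *bitmap, unsigned int bit)
{
	assert(bit < CAP_BITS);
	return (bitmap[bit / BITS_PER_ULONG] >> (bit % BITS_PER_ULONG)) & 1UL;
}

/* Simplified model of the old lookup: only the first word is visible,
 * so a capability at bit 32 or above can never be reported correctly. */
bool cap_test_first_word_only(const uint32_t *caps, unsigned int bit)
{
	return bit < 32 && ((caps[0] >> bit) & 1U);
}
```

A capability stored in the second word (for example bit 34) is found by cap_test_bit() but is invisible to the first-word-only lookup.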
Fixes: da77aef9cc58 ("net: hns3: create common cmdq resource allocate/free/query APIs")
Signed-off-by: Hao Lan lanhao@huawei.com
Signed-off-by: Jijie Shao shaojijie@huawei.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h | 3 ++-
 .../hns3/hns3_common/hclge_comm_cmd.c | 21 ++++++++++++++++---
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 9c9c72dc57e00..06f29e80104c0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -31,6 +31,7 @@
 #include <linux/pci.h>
 #include <linux/pkt_sched.h>
 #include <linux/types.h>
+#include <linux/bitmap.h>
 #include <net/pkt_cls.h>
 #include <net/pkt_sched.h>

@@ -407,7 +408,7 @@ struct hnae3_ae_dev {
 	unsigned long hw_err_reset_req;
 	struct hnae3_dev_specs dev_specs;
 	u32 dev_version;
-	unsigned long caps[BITS_TO_LONGS(HNAE3_DEV_CAPS_MAX_NUM)];
+	DECLARE_BITMAP(caps, HNAE3_DEV_CAPS_MAX_NUM);
 	void *priv;
 };

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
index b85c412683ddc..16ba98ff2c9b1 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
@@ -171,6 +171,20 @@ static const struct hclge_comm_caps_bit_map hclge_vf_cmd_caps[] = {
 	{HCLGE_COMM_CAP_GRO_B, HNAE3_DEV_SUPPORT_GRO_B},
 };

+static void
+hclge_comm_capability_to_bitmap(unsigned long *bitmap, __le32 *caps)
+{
+	const unsigned int words = HCLGE_COMM_QUERY_CAP_LENGTH;
+	u32 val[HCLGE_COMM_QUERY_CAP_LENGTH];
+	unsigned int i;
+
+	for (i = 0; i < words; i++)
+		val[i] = __le32_to_cpu(caps[i]);
+
+	bitmap_from_arr32(bitmap, val,
+			  HCLGE_COMM_QUERY_CAP_LENGTH * BITS_PER_TYPE(u32));
+}
+
 static void
 hclge_comm_parse_capability(struct hnae3_ae_dev *ae_dev, bool is_pf,
 			    struct hclge_comm_query_version_cmd *cmd)
@@ -179,11 +193,12 @@ hclge_comm_parse_capability(struct hnae3_ae_dev *ae_dev, bool is_pf,
 				is_pf ? hclge_pf_cmd_caps : hclge_vf_cmd_caps;
 	u32 size = is_pf ? ARRAY_SIZE(hclge_pf_cmd_caps) :
 				ARRAY_SIZE(hclge_vf_cmd_caps);
-	u32 caps, i;
+	DECLARE_BITMAP(caps, HCLGE_COMM_QUERY_CAP_LENGTH * BITS_PER_TYPE(u32));
+	u32 i;

-	caps = __le32_to_cpu(cmd->caps[0]);
+	hclge_comm_capability_to_bitmap(caps, cmd->caps);
 	for (i = 0; i < size; i++)
-		if (hnae3_get_bit(caps, caps_map[i].imp_bit))
+		if (test_bit(caps_map[i].imp_bit, caps))
 			set_bit(caps_map[i].local_bit, ae_dev->caps);
 }
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 116d9f732eef634abbd871f2c6f613a5b4677742 ]
Currently, the weight saved by the driver is used as the query result, which may be different from the actual weight in the register. Therefore, the register value read from the firmware is used as the query result.
Fixes: 0e32038dc856 ("net: hns3: refactor dump tc of debugfs")
Signed-off-by: Jijie Shao shaojijie@huawei.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
index 233c132dc513e..409db2e709651 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
@@ -693,8 +693,7 @@ static int hclge_dbg_dump_tc(struct hclge_dev *hdev, char *buf, int len)
 	for (i = 0; i < HNAE3_MAX_TC; i++) {
 		sch_mode_str = ets_weight->tc_weight[i] ? "dwrr" : "sp";
 		pos += scnprintf(buf + pos, len - pos, "%u %4s %3u\n",
-				 i, sch_mode_str,
-				 hdev->tm_info.pg_info[0].tc_dwrr[i]);
+				 i, sch_mode_str, ets_weight->tc_weight[i]);
 	}

 	return 0;
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 882481b1c55fc44861d7e2d54b4e0936b1b39f2c ]
In dwrr mode, the default bandwidth weight of a disabled tc is set to 0. If the bandwidth weight is 0, the mode will change to sp. Therefore, the default bandwidth weight of a disabled tc needs to be changed to 1, and 0 is returned when querying the bandwidth weight of a disabled tc. In addition, the driver needs to stop configuring the bandwidth weight if the tc is disabled.
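The three behaviours described above (a non-zero stored default, a zero query result, and rejecting configuration of a disabled tc) can be sketched as follows. This is a simplified userspace model, not the driver code: the sketch_ function names are hypothetical and SKETCH_MAX_TC is an assumed value picked for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_TC		8	/* assumed number of traffic classes */
#define BW_PERCENT		100
#define DEFAULT_BW_WEIGHT	1

/* Enabled TCs get the full weight; disabled TCs get a small non-zero
 * weight so that a stored 0 does not flip the hardware into sp mode. */
void sketch_init_weights(uint8_t weights[SKETCH_MAX_TC], unsigned int num_tc)
{
	unsigned int k;

	assert(num_tc <= SKETCH_MAX_TC);
	for (k = 0; k < num_tc; k++)
		weights[k] = BW_PERCENT;
	for (; k < SKETCH_MAX_TC; k++)
		weights[k] = DEFAULT_BW_WEIGHT;
}

/* Queries hide the internal default and report 0 for a disabled TC. */
uint8_t sketch_query_weight(const uint8_t weights[SKETCH_MAX_TC],
			    unsigned int num_tc, unsigned int tc)
{
	return tc < num_tc ? weights[tc] : 0;
}

/* Setting an ETS bandwidth on a disabled TC is rejected. */
int sketch_validate_ets_tc(unsigned int num_tc, unsigned int tc)
{
	return tc >= num_tc ? -EINVAL : 0;
}
```

Keeping the stored value and the reported value separate lets the hardware stay in dwrr mode while userspace still sees 0 for a disabled tc.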
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Jie Wang wangjie125@huawei.com
Signed-off-by: Jijie Shao shaojijie@huawei.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c | 17 ++++++++++++++---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 3 ++-
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
index c4aded65e848b..09362823140d5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
@@ -52,7 +52,10 @@ static void hclge_tm_info_to_ieee_ets(struct hclge_dev *hdev,

 	for (i = 0; i < HNAE3_MAX_TC; i++) {
 		ets->prio_tc[i] = hdev->tm_info.prio_tc[i];
-		ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i];
+		if (i < hdev->tm_info.num_tc)
+			ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i];
+		else
+			ets->tc_tx_bw[i] = 0;

 		if (hdev->tm_info.tc_info[i].tc_sch_mode ==
 		    HCLGE_SCH_MODE_SP)
@@ -123,7 +126,8 @@ static u8 hclge_ets_tc_changed(struct hclge_dev *hdev, struct ieee_ets *ets,
 }

 static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev,
-				       struct ieee_ets *ets, bool *changed)
+				       struct ieee_ets *ets, bool *changed,
+				       u8 tc_num)
 {
 	bool has_ets_tc = false;
 	u32 total_ets_bw = 0;
@@ -137,6 +141,13 @@ static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev,
 			*changed = true;
 			break;
 		case IEEE_8021QAZ_TSA_ETS:
+			if (i >= tc_num) {
+				dev_err(&hdev->pdev->dev,
+					"tc%u is disabled, cannot set ets bw\n",
+					i);
+				return -EINVAL;
+			}
+
 			/* The hardware will switch to sp mode if bandwidth is
 			 * 0, so limit ets bandwidth must be greater than 0.
 			 */
@@ -176,7 +187,7 @@ static int hclge_ets_validate(struct hclge_dev *hdev, struct ieee_ets *ets,
 	if (ret)
 		return ret;

-	ret = hclge_ets_sch_mode_validate(hdev, ets, changed);
+	ret = hclge_ets_sch_mode_validate(hdev, ets, changed, tc_num);
 	if (ret)
 		return ret;

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
index 922c0da3660c7..150f146fa24fb 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
@@ -785,6 +785,7 @@ static void hclge_tm_tc_info_init(struct hclge_dev *hdev)
 static void hclge_tm_pg_info_init(struct hclge_dev *hdev)
 {
 #define BW_PERCENT 100
+#define DEFAULT_BW_WEIGHT 1

 	u8 i;

@@ -806,7 +807,7 @@ static void hclge_tm_pg_info_init(struct hclge_dev *hdev)
 		for (k = 0; k < hdev->tm_info.num_tc; k++)
 			hdev->tm_info.pg_info[i].tc_dwrr[k] = BW_PERCENT;
 		for (; k < HNAE3_MAX_TC; k++)
-			hdev->tm_info.pg_info[i].tc_dwrr[k] = 0;
+			hdev->tm_info.pg_info[i].tc_dwrr[k] = DEFAULT_BW_WEIGHT;
 	}
 }
From: Jiri Benc jbenc@redhat.com
[ Upstream commit 94d166c5318c6edd1e079df8552233443e909c33 ]
VXLAN-GPE does not add an extra inner Ethernet header. Take that into account when calculating header length.
This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is cached.
In the collect_md mode (which is the only mode that VXLAN-GPE supports), there's no magic auto-setting of the tunnel interface MTU. It can't be, since the destination and thus the underlying interface may be different for each packet.
So, the administrator is responsible for setting the correct tunnel interface MTU. Apparently, the administrators are capable enough to calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36). They set the tunnel interface MTU to 1464. If you run a TCP stream over such an interface, it's then segmented according to the MTU 1464, i.e. producing 1514-byte frames. That is okay, as this still fits the lower MTU.

However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as the header size and thus incorrectly calculates the frame size to be 1528. This leads to an ICMP "too big" message being generated (locally), a PMTU of 1450 being cached, and the TCP stream being resegmented.
The fix is to use the correct actual header size, especially for skb_tunnel_check_pmtu calculation.
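The arithmetic in the description can be checked with a small user-space sketch. It mirrors the shape of the vxlan_headroom() helper this patch adds, but the flag bits (SKETCH_F_IPV6, SKETCH_F_GPE), the literal header sizes, and both function names are assumptions made for illustration; it is not the kernel code.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not the kernel's VXLAN_F_* values. */
#define SKETCH_F_IPV6 0x1u
#define SKETCH_F_GPE  0x2u

enum {
	SKETCH_IPV4_HLEN  = 20,
	SKETCH_IPV6_HLEN  = 40,
	SKETCH_UDP_HLEN   = 8,
	SKETCH_VXLAN_HLEN = 8,
	SKETCH_ETH_HLEN   = 14,
};

/*
 * Encapsulation overhead: outer IP + UDP + VXLAN, plus an inner Ethernet
 * header unless the tunnel is in GPE mode.
 */
static int sketch_vxlan_headroom(unsigned int flags)
{
	int ip = (flags & SKETCH_F_IPV6) ? SKETCH_IPV6_HLEN : SKETCH_IPV4_HLEN;
	int inner_eth = (flags & SKETCH_F_GPE) ? 0 : SKETCH_ETH_HLEN;

	return ip + SKETCH_UDP_HLEN + SKETCH_VXLAN_HLEN + inner_eth;
}

/* Frame size on the lower device: tunnel MTU + overhead + outer Ethernet header. */
static int sketch_outer_frame_len(int tunnel_mtu, unsigned int flags)
{
	assert(tunnel_mtu >= 0);
	return tunnel_mtu + sketch_vxlan_headroom(flags) + SKETCH_ETH_HLEN;
}
```

With a tunnel MTU of 1464 over IPv4, the GPE overhead of 36 gives a 1514-byte frame, while the old constant of 50 gives 1528, which matches the figures quoted in the description.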
Fixes: e1e5314de08ba ("vxlan: implement GPE")
Signed-off-by: Jiri Benc jbenc@redhat.com
Reviewed-by: Simon Horman simon.horman@corigine.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +-
 drivers/net/vxlan/vxlan_core.c                | 23 ++++++++-----------
 include/net/vxlan.h                           | 13 +++++++----
 3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 1726297f2e0df..8eb9839a3ca69 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -8479,7 +8479,7 @@ static void ixgbe_atr(struct ixgbe_ring *ring, struct ixgbe_adapter *adapter = q_vector->adapter;
if (unlikely(skb_tail_pointer(skb) < hdr.network + - VXLAN_HEADROOM)) + vxlan_headroom(0))) return;
/* verify the port is recognized as VXLAN */ diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index 561fe1b314f5f..fed54702b7e21 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -2515,7 +2515,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, }
ndst = &rt->dst; - err = skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM, + err = skb_tunnel_check_pmtu(skb, ndst, vxlan_headroom(flags & VXLAN_F_GPE), netif_is_any_bridge_port(dev)); if (err < 0) { goto tx_error; @@ -2576,7 +2576,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, goto out_unlock; }
- err = skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM, + err = skb_tunnel_check_pmtu(skb, ndst, + vxlan_headroom((flags & VXLAN_F_GPE) | VXLAN_F_IPV6), netif_is_any_bridge_port(dev)); if (err < 0) { goto tx_error; @@ -2988,14 +2989,12 @@ static int vxlan_change_mtu(struct net_device *dev, int new_mtu) struct vxlan_rdst *dst = &vxlan->default_dst; struct net_device *lowerdev = __dev_get_by_index(vxlan->net, dst->remote_ifindex); - bool use_ipv6 = !!(vxlan->cfg.flags & VXLAN_F_IPV6);
/* This check is different than dev->max_mtu, because it looks at * the lowerdev->mtu, rather than the static dev->max_mtu */ if (lowerdev) { - int max_mtu = lowerdev->mtu - - (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM); + int max_mtu = lowerdev->mtu - vxlan_headroom(vxlan->cfg.flags); if (new_mtu > max_mtu) return -EINVAL; } @@ -3641,11 +3640,11 @@ static void vxlan_config_apply(struct net_device *dev, struct vxlan_dev *vxlan = netdev_priv(dev); struct vxlan_rdst *dst = &vxlan->default_dst; unsigned short needed_headroom = ETH_HLEN; - bool use_ipv6 = !!(conf->flags & VXLAN_F_IPV6); int max_mtu = ETH_MAX_MTU; + u32 flags = conf->flags;
if (!changelink) { - if (conf->flags & VXLAN_F_GPE) + if (flags & VXLAN_F_GPE) vxlan_raw_setup(dev); else vxlan_ether_setup(dev); @@ -3670,8 +3669,7 @@ static void vxlan_config_apply(struct net_device *dev,
dev->needed_tailroom = lowerdev->needed_tailroom;
- max_mtu = lowerdev->mtu - (use_ipv6 ? VXLAN6_HEADROOM : - VXLAN_HEADROOM); + max_mtu = lowerdev->mtu - vxlan_headroom(flags); if (max_mtu < ETH_MIN_MTU) max_mtu = ETH_MIN_MTU;
@@ -3682,10 +3680,9 @@ static void vxlan_config_apply(struct net_device *dev, if (dev->mtu > max_mtu) dev->mtu = max_mtu;
- if (use_ipv6 || conf->flags & VXLAN_F_COLLECT_METADATA) - needed_headroom += VXLAN6_HEADROOM; - else - needed_headroom += VXLAN_HEADROOM; + if (flags & VXLAN_F_COLLECT_METADATA) + flags |= VXLAN_F_IPV6; + needed_headroom += vxlan_headroom(flags); dev->needed_headroom = needed_headroom;
memcpy(&vxlan->cfg, conf, sizeof(*conf)); diff --git a/include/net/vxlan.h b/include/net/vxlan.h index 20bd7d893e10a..b57567296bc67 100644 --- a/include/net/vxlan.h +++ b/include/net/vxlan.h @@ -384,10 +384,15 @@ static inline netdev_features_t vxlan_features_check(struct sk_buff *skb, return features; }
-/* IP header + UDP + VXLAN + Ethernet header */
-#define VXLAN_HEADROOM (20 + 8 + 8 + 14)
-/* IPv6 header + UDP + VXLAN + Ethernet header */
-#define VXLAN6_HEADROOM (40 + 8 + 8 + 14)
+static inline int vxlan_headroom(u32 flags)
+{
+	/* VXLAN:     IP4/6 header + UDP + VXLAN + Ethernet header */
+	/* VXLAN-GPE: IP4/6 header + UDP + VXLAN */
+	return (flags & VXLAN_F_IPV6 ? sizeof(struct ipv6hdr) :
+				       sizeof(struct iphdr)) +
+	       sizeof(struct udphdr) + sizeof(struct vxlanhdr) +
+	       (flags & VXLAN_F_GPE ? 0 : ETH_HLEN);
+}
 
 static inline struct vxlanhdr *vxlan_hdr(struct sk_buff *skb)
 {
From: Jiri Benc jbenc@redhat.com
[ Upstream commit 17a0a64448b568442a101de09575f81ffdc45d15 ]
The vxlan_parse_gpe_hdr function extracts the next protocol value from the GPE header and marks GPE bits as parsed.
In order to be used in the next patch, split the function into protocol extraction and bit marking. The bit marking is meaningful only in vxlan_rcv; move it directly there.
Rename the function to vxlan_parse_gpe_proto to reflect what it now does. Remove unused arguments skb and vxflags. Move the function earlier in the file to allow it to be called from more places in the next patch.
Signed-off-by: Jiri Benc jbenc@redhat.com
Signed-off-by: David S. Miller davem@davemloft.net
Stable-dep-of: b0b672c4d095 ("vxlan: fix GRO with VXLAN-GPE")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/vxlan/vxlan_core.c | 58 ++++++++++++++++------------------
 1 file changed, 28 insertions(+), 30 deletions(-)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index fed54702b7e21..cb2d82785d900 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -623,6 +623,32 @@ static int vxlan_fdb_append(struct vxlan_fdb *f, return 1; }
+static bool vxlan_parse_gpe_proto(struct vxlanhdr *hdr, __be16 *protocol) +{ + struct vxlanhdr_gpe *gpe = (struct vxlanhdr_gpe *)hdr; + + /* Need to have Next Protocol set for interfaces in GPE mode. */ + if (!gpe->np_applied) + return false; + /* "The initial version is 0. If a receiver does not support the + * version indicated it MUST drop the packet. + */ + if (gpe->version != 0) + return false; + /* "When the O bit is set to 1, the packet is an OAM packet and OAM + * processing MUST occur." However, we don't implement OAM + * processing, thus drop the packet. + */ + if (gpe->oam_flag) + return false; + + *protocol = tun_p_to_eth_p(gpe->next_protocol); + if (!*protocol) + return false; + + return true; +} + static struct vxlanhdr *vxlan_gro_remcsum(struct sk_buff *skb, unsigned int off, struct vxlanhdr *vh, size_t hdrlen, @@ -1525,35 +1551,6 @@ static void vxlan_parse_gbp_hdr(struct vxlanhdr *unparsed, unparsed->vx_flags &= ~VXLAN_GBP_USED_BITS; }
-static bool vxlan_parse_gpe_hdr(struct vxlanhdr *unparsed, - __be16 *protocol, - struct sk_buff *skb, u32 vxflags) -{ - struct vxlanhdr_gpe *gpe = (struct vxlanhdr_gpe *)unparsed; - - /* Need to have Next Protocol set for interfaces in GPE mode. */ - if (!gpe->np_applied) - return false; - /* "The initial version is 0. If a receiver does not support the - * version indicated it MUST drop the packet. - */ - if (gpe->version != 0) - return false; - /* "When the O bit is set to 1, the packet is an OAM packet and OAM - * processing MUST occur." However, we don't implement OAM - * processing, thus drop the packet. - */ - if (gpe->oam_flag) - return false; - - *protocol = tun_p_to_eth_p(gpe->next_protocol); - if (!*protocol) - return false; - - unparsed->vx_flags &= ~VXLAN_GPE_USED_BITS; - return true; -} - static bool vxlan_set_mac(struct vxlan_dev *vxlan, struct vxlan_sock *vs, struct sk_buff *skb, __be32 vni) @@ -1655,8 +1652,9 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb) * used by VXLAN extensions if explicitly requested. */ if (vs->flags & VXLAN_F_GPE) { - if (!vxlan_parse_gpe_hdr(&unparsed, &protocol, skb, vs->flags)) + if (!vxlan_parse_gpe_proto(&unparsed, &protocol)) goto drop; + unparsed.vx_flags &= ~VXLAN_GPE_USED_BITS; raw_proto = true; }
From: Jiri Benc jbenc@redhat.com
[ Upstream commit b0b672c4d0957e5897685667fc848132b8bd2d71 ]
In VXLAN-GPE, there may not be an Ethernet header following the VXLAN header. But in GRO, the vxlan driver calls eth_gro_receive unconditionally, which means the following header is incorrectly parsed as Ethernet.
Introduce GPE specific GRO handling.
For better performance, do not check for GPE during GRO but rather install a different set of functions at setup time.
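The idea of choosing the handlers once at setup time, instead of testing for GPE on every packet, can be sketched in user space as follows. All names here (sketch_tunnel_cfg, sketch_socket_setup, SKETCH_F_GPE, and the two handlers) are hypothetical stand-ins, and the handlers only report which inner header they would parse; the real callbacks take socket and skb arguments.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_F_GPE 0x1u /* hypothetical flag bit, not the kernel's VXLAN_F_GPE value */

enum sketch_next_hdr { SKETCH_NEXT_ETHERNET, SKETCH_NEXT_FROM_GPE_PROTO };

typedef enum sketch_next_hdr (*sketch_gro_fn)(void);

/* Plain VXLAN: the inner packet always starts with an Ethernet header. */
static enum sketch_next_hdr sketch_gro_receive(void)
{
	return SKETCH_NEXT_ETHERNET;
}

/* VXLAN-GPE: the inner header type is taken from the GPE Next Protocol field. */
static enum sketch_next_hdr sketch_gpe_gro_receive(void)
{
	return SKETCH_NEXT_FROM_GPE_PROTO;
}

struct sketch_tunnel_cfg {
	sketch_gro_fn gro_receive;
};

/* Decide once at socket setup, so the per-packet path carries no GPE branch. */
static void sketch_socket_setup(struct sketch_tunnel_cfg *cfg, unsigned int flags)
{
	assert(cfg != NULL);
	cfg->gro_receive = (flags & SKETCH_F_GPE) ? sketch_gpe_gro_receive
						   : sketch_gro_receive;
}
```

The cost of the mode check is paid once per socket rather than once per received packet, which is the design choice the description refers to.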
Fixes: e1e5314de08ba ("vxlan: implement GPE")
Reported-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Jiri Benc jbenc@redhat.com
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/vxlan/vxlan_core.c | 84 ++++++++++++++++++++++++++++------
 1 file changed, 69 insertions(+), 15 deletions(-)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c index cb2d82785d900..7532cac2154c5 100644 --- a/drivers/net/vxlan/vxlan_core.c +++ b/drivers/net/vxlan/vxlan_core.c @@ -675,26 +675,24 @@ static struct vxlanhdr *vxlan_gro_remcsum(struct sk_buff *skb, return vh; }
-static struct sk_buff *vxlan_gro_receive(struct sock *sk, - struct list_head *head, - struct sk_buff *skb) +static struct vxlanhdr *vxlan_gro_prepare_receive(struct sock *sk, + struct list_head *head, + struct sk_buff *skb, + struct gro_remcsum *grc) { - struct sk_buff *pp = NULL; struct sk_buff *p; struct vxlanhdr *vh, *vh2; unsigned int hlen, off_vx; - int flush = 1; struct vxlan_sock *vs = rcu_dereference_sk_user_data(sk); __be32 flags; - struct gro_remcsum grc;
- skb_gro_remcsum_init(&grc); + skb_gro_remcsum_init(grc);
off_vx = skb_gro_offset(skb); hlen = off_vx + sizeof(*vh); vh = skb_gro_header(skb, hlen, off_vx); if (unlikely(!vh)) - goto out; + return NULL;
skb_gro_postpull_rcsum(skb, vh, sizeof(struct vxlanhdr));
@@ -702,12 +700,12 @@ static struct sk_buff *vxlan_gro_receive(struct sock *sk,
if ((flags & VXLAN_HF_RCO) && (vs->flags & VXLAN_F_REMCSUM_RX)) { vh = vxlan_gro_remcsum(skb, off_vx, vh, sizeof(struct vxlanhdr), - vh->vx_vni, &grc, + vh->vx_vni, grc, !!(vs->flags & VXLAN_F_REMCSUM_NOPARTIAL));
if (!vh) - goto out; + return NULL; }
skb_gro_pull(skb, sizeof(struct vxlanhdr)); /* pull vxlan header */ @@ -724,12 +722,48 @@ static struct sk_buff *vxlan_gro_receive(struct sock *sk, } }
- pp = call_gro_receive(eth_gro_receive, head, skb); - flush = 0; + return vh; +} + +static struct sk_buff *vxlan_gro_receive(struct sock *sk, + struct list_head *head, + struct sk_buff *skb) +{ + struct sk_buff *pp = NULL; + struct gro_remcsum grc; + int flush = 1;
-out: + if (vxlan_gro_prepare_receive(sk, head, skb, &grc)) { + pp = call_gro_receive(eth_gro_receive, head, skb); + flush = 0; + } skb_gro_flush_final_remcsum(skb, pp, flush, &grc); + return pp; +}
+static struct sk_buff *vxlan_gpe_gro_receive(struct sock *sk, + struct list_head *head, + struct sk_buff *skb) +{ + const struct packet_offload *ptype; + struct sk_buff *pp = NULL; + struct gro_remcsum grc; + struct vxlanhdr *vh; + __be16 protocol; + int flush = 1; + + vh = vxlan_gro_prepare_receive(sk, head, skb, &grc); + if (vh) { + if (!vxlan_parse_gpe_proto(vh, &protocol)) + goto out; + ptype = gro_find_receive_by_type(protocol); + if (!ptype) + goto out; + pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb); + flush = 0; + } +out: + skb_gro_flush_final_remcsum(skb, pp, flush, &grc); return pp; }
@@ -741,6 +775,21 @@ static int vxlan_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff) return eth_gro_complete(skb, nhoff + sizeof(struct vxlanhdr)); }
+static int vxlan_gpe_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff) +{ + struct vxlanhdr *vh = (struct vxlanhdr *)(skb->data + nhoff); + const struct packet_offload *ptype; + int err = -ENOSYS; + __be16 protocol; + + if (!vxlan_parse_gpe_proto(vh, &protocol)) + return err; + ptype = gro_find_complete_by_type(protocol); + if (ptype) + err = ptype->callbacks.gro_complete(skb, nhoff + sizeof(struct vxlanhdr)); + return err; +} + static struct vxlan_fdb *vxlan_fdb_alloc(struct vxlan_dev *vxlan, const u8 *mac, __u16 state, __be32 src_vni, __u16 ndm_flags) @@ -3373,8 +3422,13 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, bool ipv6, tunnel_cfg.encap_rcv = vxlan_rcv; tunnel_cfg.encap_err_lookup = vxlan_err_lookup; tunnel_cfg.encap_destroy = NULL; - tunnel_cfg.gro_receive = vxlan_gro_receive; - tunnel_cfg.gro_complete = vxlan_gro_complete; + if (vs->flags & VXLAN_F_GPE) { + tunnel_cfg.gro_receive = vxlan_gpe_gro_receive; + tunnel_cfg.gro_complete = vxlan_gpe_gro_complete; + } else { + tunnel_cfg.gro_receive = vxlan_gro_receive; + tunnel_cfg.gro_complete = vxlan_gro_complete; + }
setup_udp_tunnel_sock(net, sock, &tunnel_cfg);
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
[ Upstream commit 13c088cf3657d70893d75cf116be937f1509cc0f ]
The size of array 'priv->ports[]' is INNO_PHY_PORT_NUM.
In the for loop, 'i' is used as the index for array 'priv->ports[]' with a check (i > INNO_PHY_PORT_NUM), which indicates that INNO_PHY_PORT_NUM is an allowed value for 'i' in the same loop.

This > comparison needs to be changed to >=, otherwise it potentially leads to an out-of-bounds write on the next iteration through the loop.
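The off-by-one can be reproduced with a user-space model of the loop. SKETCH_PORT_NUM and sketch_max_index_written() are hypothetical names for this sketch; the function does not touch memory, it only records the highest index the loop would have written to.

```c
#include <assert.h>

#define SKETCH_PORT_NUM 2 /* hypothetical stand-in for INNO_PHY_PORT_NUM */

/*
 * Model the probe loop over `children` child nodes for an array of
 * SKETCH_PORT_NUM entries. Returns the highest index that would be
 * written, or -1 if nothing is written. `fixed` selects the >= check
 * from the patch; otherwise the old > check is used.
 */
static int sketch_max_index_written(int children, int fixed)
{
	int max_written = -1;
	int i = 0;

	assert(children >= 0);

	for (int c = 0; c < children; c++) {
		max_written = i; /* models the store to ports[i] */
		i++;

		if (fixed ? (i >= SKETCH_PORT_NUM) : (i > SKETCH_PORT_NUM))
			break;
	}

	return max_written;
}
```

With more children than ports, the old check lets the loop reach index SKETCH_PORT_NUM, one past the last valid slot, before breaking; the fixed check stops at SKETCH_PORT_NUM - 1.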
Fixes: ba8b0ee81fbb ("phy: add inno-usb2-phy driver for hi3798cv200 SoC")
Reported-by: Dan Carpenter dan.carpenter@linaro.org
Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
Link: https://lore.kernel.org/r/20230721090558.3588613-1-harshit.m.mogalapalli@ora...
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/phy/hisilicon/phy-hisi-inno-usb2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
index b133ae06757ab..a922fb11a1092 100644
--- a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
+++ b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
@@ -158,7 +158,7 @@ static int hisi_inno_phy_probe(struct platform_device *pdev)
 		phy_set_drvdata(phy, &priv->ports[i]);
 		i++;
 
-		if (i > INNO_PHY_PORT_NUM) {
+		if (i >= INNO_PHY_PORT_NUM) {
 			dev_warn(dev, "Support %d ports in maximum\n", i);
 			of_node_put(child);
 			break;
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit ed96824b71ed67664390890441b229423a25317f ]
In atl1_tso(), check the return value of pskb_trim(), and return an error code if pskb_trim() returns an unexpected value.
Fixes: 401c0aabec4b ("atl1: simplify tx packet descriptor")
Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com
Link: https://lore.kernel.org/r/20230722142511.12448-1-ruc_gongyuanjun@163.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/atheros/atlx/atl1.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atlx/atl1.c b/drivers/net/ethernet/atheros/atlx/atl1.c
index c8444bcdf5270..02aa6fd8ebc2d 100644
--- a/drivers/net/ethernet/atheros/atlx/atl1.c
+++ b/drivers/net/ethernet/atheros/atlx/atl1.c
@@ -2113,8 +2113,11 @@ static int atl1_tso(struct atl1_adapter *adapter, struct sk_buff *skb,
 
 			real_len = (((unsigned char *)iph - skb->data) +
 				ntohs(iph->tot_len));
-			if (real_len < skb->len)
-				pskb_trim(skb, real_len);
+			if (real_len < skb->len) {
+				err = pskb_trim(skb, real_len);
+				if (err)
+					return err;
+			}
 			hdr_len = skb_tcp_all_headers(skb);
 			if (skb->len == hdr_len) {
 				iph->check = 0;
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit 69a184f7a372aac588babfb0bd681aaed9779f5b ]
In atl1e_tso_csum(), check the return value of pskb_trim(), and return an error code if pskb_trim() returns an unexpected value.
Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver")
Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com
Reviewed-by: Simon Horman simon.horman@corigine.com
Link: https://lore.kernel.org/r/20230720144219.39285-1-ruc_gongyuanjun@163.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/atheros/atl1e/atl1e_main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
index 5db0f3495a32e..5935be190b9e2 100644
--- a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
+++ b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
@@ -1641,8 +1641,11 @@ static int atl1e_tso_csum(struct atl1e_adapter *adapter,
 		real_len = (((unsigned char *)ip_hdr(skb) - skb->data)
 				+ ntohs(ip_hdr(skb)->tot_len));
 
-		if (real_len < skb->len)
-			pskb_trim(skb, real_len);
+		if (real_len < skb->len) {
+			err = pskb_trim(skb, real_len);
+			if (err)
+				return err;
+		}
 
 		hdr_len = skb_tcp_all_headers(skb);
 		if (unlikely(skb->len == hdr_len)) {
From: Maciej Żenczykowski maze@google.com
[ Upstream commit 69172f0bcb6a09110c5d2a6d792627f5095a9018 ]
currently on 6.4 net/main:
# ip link add dummy1 type dummy
# echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr
# ip link set dummy1 up
# ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1

11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic
       valid_lft 604800sec preferred_lft 86172sec
    inet6 2000::1/64 scope global mngtmpaddr
       valid_lft forever preferred_lft forever
    inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
       valid_lft forever preferred_lft forever

# ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1

(can wait a few seconds if you want to, the above delete isn't [directly] the problem)

# ip -6 addr show dev dummy1

11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    inet6 2000::1/64 scope global mngtmpaddr
       valid_lft forever preferred_lft forever
    inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
       valid_lft forever preferred_lft forever

# ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1

11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic
       valid_lft 604797sec preferred_lft 86169sec
    inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
       valid_lft forever preferred_lft forever
This patch prevents this new 'global temporary dynamic' address from being created by the deletion of the related (same subnet prefix) 'mngtmpaddr' (which is triggered by there already being no temporary addresses).
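The change in the creation condition can be modelled as two pure predicates. This is a sketch only: the function names are hypothetical, and the list-emptiness test is passed in as a boolean rather than read from a real inet6_dev.

```c
#include <assert.h>
#include <stdbool.h>

/* Old condition: an empty temporary-address list alone triggers creation. */
static bool sketch_create_old(bool create, bool tempaddr_list_empty,
			      int use_tempaddr)
{
	return (create || tempaddr_list_empty) && use_tempaddr > 0;
}

/*
 * New condition: an empty list only triggers creation when the call
 * carries a nonzero valid or preferred lifetime.
 */
static bool sketch_create_new(bool create, bool tempaddr_list_empty,
			      unsigned int valid_lft, unsigned int prefered_lft,
			      int use_tempaddr)
{
	if (tempaddr_list_empty && (valid_lft || prefered_lft))
		create = true;

	return create && use_tempaddr > 0;
}

/* Usage example: the cleanup call (both lifetimes 0, create == false). */
static void sketch_cleanup_example(void)
{
	assert(sketch_create_old(false, true, 1));        /* old: address is created */
	assert(!sketch_create_new(false, true, 0, 0, 1)); /* new: nothing is created */
}
```

The cleanup call made while deleting the mngtmpaddr is the only case where the two predicates disagree in this model; calls with nonzero lifetimes behave as before.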
Cc: Jiri Pirko jiri@resnulli.us
Fixes: 53bd67491537 ("ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses")
Reported-by: Xiao Ma xiaom@google.com
Signed-off-by: Maciej Żenczykowski maze@google.com
Reviewed-by: David Ahern dsahern@kernel.org
Link: https://lore.kernel.org/r/20230720160022.1887942-1-maze@google.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/ipv6/addrconf.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 5affca8e2f53a..c63f1d62d60a5 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2561,12 +2561,18 @@ static void manage_tempaddrs(struct inet6_dev *idev,
 			ipv6_ifa_notify(0, ift);
 	}
 
-	if ((create || list_empty(&idev->tempaddr_list)) &&
-	    idev->cnf.use_tempaddr > 0) {
+	/* Also create a temporary address if it's enabled but no temporary
+	 * address currently exists.
+	 * However, we get called with valid_lft == 0, prefered_lft == 0, create == false
+	 * as part of cleanup (ie. deleting the mngtmpaddr).
+	 * We don't want that to result in creating a new temporary ip address.
+	 */
+	if (list_empty(&idev->tempaddr_list) && (valid_lft || prefered_lft))
+		create = true;
+
+	if (create && idev->cnf.use_tempaddr > 0) {
 		/* When a new public address is created as described
 		 * in [ADDRCONF], also create a new temporary address.
-		 * Also create a temporary address if it's enabled but
-		 * no temporary address currently exists.
 		 */
 		read_unlock_bh(&idev->lock);
 		ipv6_create_tempaddr(ifp, false);
From: Wei Fang wei.fang@nxp.com
[ Upstream commit bb7a0156365dffe2fcd63e2051145fbe4f8908b4 ]
According to the implementation of XDP in the FEC driver, the XDP path shares the transmit queues with the kernel network stack, so a tx timeout event can occur when XDP uses the tx queue almost exclusively. This event causes a reset of the FEC hardware. To avoid the timeout in this case, we use the txq_trans_cond_update() interface to update txq->trans_start to jiffies so that the watchdog won't generate a transmit timeout warning.
Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support")
Signed-off-by: Wei Fang wei.fang@nxp.com
Link: https://lore.kernel.org/r/20230721083559.2857312-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/ethernet/freescale/fec_main.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 7659888a96917..a1b0abe54a0e5 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3908,6 +3908,8 @@ static int fec_enet_xdp_xmit(struct net_device *dev,
 
 	__netif_tx_lock(nq, cpu);
 
+	/* Avoid tx timeout as XDP shares the queue with kernel stack */
+	txq_trans_cond_update(nq);
 	for (i = 0; i < num_frames; i++) {
 		if (fec_enet_txq_xmit_frame(fep, txq, frames[i]) != 0)
 			break;
From: Stewart Smith trawets@amazon.com
[ Upstream commit d11b0df7ddf1831f3e170972f43186dad520bfcc ]
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash table with a hash over the source & destination addresses and ports. However, the IPv6 hash is insufficient and can lead to a high rate of collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the fast jenkins hash, meaning it is possible for an external entity to ensure the hash collides, thus falling back to a linear search in the bucket, which is slow.
We take the approach of hashing the full length of the IPv6 address in __ipv6_addr_jhash() so that all users can benefit from a more secure version.
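Why the XOR step makes collisions easy to construct can be shown without implementing the Jenkins hash itself. The sketch below only models the old pre-mix step, in which words 0 and 1 of the address are folded together before hashing; the function names are hypothetical. Any two addresses that agree on the folded value and on words 2 and 3 present identical input to the old hash, whatever the hash function or seed is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The old pre-mix step: fold words 0 and 1 of the address into one value. */
static uint32_t sketch_fold_old(const uint32_t addr[4])
{
	return addr[0] ^ addr[1];
}

/*
 * Return nonzero when two addresses would present identical input to a
 * three-word hash of (fold, word 2, word 3), and therefore always collide.
 */
static int sketch_old_inputs_collide(const uint32_t a[4], const uint32_t b[4])
{
	assert(a != NULL && b != NULL);

	return sketch_fold_old(a) == sketch_fold_old(b) &&
	       a[2] == b[2] && a[3] == b[3];
}
```

For example, the word arrays {0x20010db8, 0x1, 0, 1} and {0x20010db9, 0x0, 0, 1} differ as addresses but fold to the same value, so a sender who controls its source address can generate as many such colliding addresses as it likes. Hashing all four words removes this shortcut.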
While this may look like it adds overhead, the reality of modern CPUs means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9 on Nehalem (base of ~173).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash") netfilter switched from a jenkins hash to a siphash, but even the faster hsiphash is a more significant overhead (~20-30%) in some preliminary testing. So, in this patch, we keep to the more conservative approach to ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the connection buckets. In both testing and real-world scenarios, we have not found any measurable performance impact.
Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp")
Signed-off-by: Stewart Smith trawets@amazon.com
Signed-off-by: Samuel Mendoza-Jonas samjonas@amazon.com
Suggested-by: Eric Dumazet edumazet@google.com
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com
Reviewed-by: Eric Dumazet edumazet@google.com
Link: https://lore.kernel.org/r/20230721222410.17914-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 include/net/ipv6.h | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44b..2acc4c808d45d 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -752,12 +752,8 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
 /* more secured version of ipv6_addr_hash() */
 static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval)
 {
-	u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
-
-	return jhash_3words(v,
-			    (__force u32)a->s6_addr32[2],
-			    (__force u32)a->s6_addr32[3],
-			    initval);
+	return jhash2((__force const u32 *)a->s6_addr32,
+		      ARRAY_SIZE(a->s6_addr32), initval);
 }
 
 static inline bool ipv6_addr_loopback(const struct in6_addr *a)
From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit a3336056504d780590ac6d6ac94fbba829994594 ]
Fix ethtool FDIR logic to not use memory after its release. In the ice_ethtool_fdir.c file there are 2 spots where code can refer to pointers which may be missing.
In the ice_cfg_fdir_xtrct_seq() function seg may be freed but even then may be still used by memcpy(&tun_seg[1], seg, sizeof(*seg)).
In the ice_add_fdir_ethtool() function struct ice_fdir_fltr *input may first fail to be added via ice_fdir_update_list_entry() but then may be deleted by ice_fdir_update_list_entry.
Terminate in both cases when the returned value of the previous operation is other than 0, free memory and don't use it anymore.
Reported-by: Michal Schmidt mschmidt@redhat.com
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2208423
Fixes: cac2a27cd9ab ("ice: Support IPv4 Flow Director filters")
Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com
Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com
Reviewed-by: Leon Romanovsky leonro@nvidia.com
Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com
Link: https://lore.kernel.org/r/20230721155854.1292805-1-anthony.l.nguyen@intel.co...
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 .../net/ethernet/intel/ice/ice_ethtool_fdir.c | 26 ++++++++++---------
 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c index ead6d50fc0adc..8c6e13f87b7d3 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c @@ -1281,16 +1281,21 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp, ICE_FLOW_FLD_OFF_INVAL); }
- /* add filter for outer headers */ fltr_idx = ice_ethtool_flow_to_fltr(fsp->flow_type & ~FLOW_EXT); + + assign_bit(fltr_idx, hw->fdir_perfect_fltr, perfect_filter); + + /* add filter for outer headers */ ret = ice_fdir_set_hw_fltr_rule(pf, seg, fltr_idx, ICE_FD_HW_SEG_NON_TUN); - if (ret == -EEXIST) - /* Rule already exists, free memory and continue */ - devm_kfree(dev, seg); - else if (ret) + if (ret == -EEXIST) { + /* Rule already exists, free memory and count as success */ + ret = 0; + goto err_exit; + } else if (ret) { /* could not write filter, free memory */ goto err_exit; + }
/* make tunneled filter HW entries if possible */ memcpy(&tun_seg[1], seg, sizeof(*seg)); @@ -1305,18 +1310,13 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp, devm_kfree(dev, tun_seg); }
- if (perfect_filter) - set_bit(fltr_idx, hw->fdir_perfect_fltr); - else - clear_bit(fltr_idx, hw->fdir_perfect_fltr); - return ret;
err_exit: devm_kfree(dev, tun_seg); devm_kfree(dev, seg);
- return -EOPNOTSUPP; + return ret; }
/** @@ -1914,7 +1914,9 @@ int ice_add_fdir_ethtool(struct ice_vsi *vsi, struct ethtool_rxnfc *cmd) input->comp_report = ICE_FXD_FLTR_QW0_COMP_REPORT_SW_FAIL;
/* input struct is added to the HW filter list */ - ice_fdir_update_list_entry(pf, input, fsp->location); + ret = ice_fdir_update_list_entry(pf, input, fsp->location); + if (ret) + goto release_lock;
ret = ice_fdir_write_all_fltr(pf, input, true); if (ret)
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit da19a2b967cf1e2c426f50d28550d1915214a81d ]
When adding a point to point downlink to the bond, we neglected to reset the bond's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to the bonding.
To address this issue, let's reset the bond's flags for P2P interfaces.
Before fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188::
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 brd 2006:70:10::2
    inet6 fe80::200:ff:fe00:0/64 scope link
       valid_lft forever preferred_lft forever

After fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9::
8: bond0: <POINTOPOINT,NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever
Reported-by: Liang Li liali@redhat.com
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Signed-off-by: Hangbin Liu liuhangbin@gmail.com
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/bonding/bond_main.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 091e035c76a6f..1a0776f9b008a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1507,6 +1507,11 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
 
 	memcpy(bond_dev->broadcast, slave_dev->broadcast,
 		slave_dev->addr_len);
+
+	if (slave_dev->flags & IFF_POINTOPOINT) {
+		bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
+		bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP);
+	}
 }
 
 /* On bonding slaves other than the currently active slave, suppress
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit fa532bee17d15acf8bba4bc8e2062b7a093ba801 ]
When adding a point to point downlink to a team device, we neglected to reset the team's flags, which were still using flags like BROADCAST and MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink interfaces, such as when adding a GRE device to a team device. Fix this by removing the multicast/broadcast flags and adding the p2p and noarp flags.

After removing the non-Ethernet interface and adding an Ethernet interface to the team, we need to reset the team interface flags. Unlike the bonding interface, team does not need to restore the IFF_MASTER and IFF_SLAVE flags.
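The flag handling described above, in both directions, can be sketched as a pure function on bit masks. The SK_* bit values and the function names are hypothetical and chosen only for this sketch; the kernel's IFF_* constants have different values.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not the kernel's IFF_* values. */
#define SK_BROADCAST   0x1u
#define SK_MULTICAST   0x2u
#define SK_POINTOPOINT 0x4u
#define SK_NOARP       0x8u

/* Return the upper device's flags after it inherits from a newly added port. */
static unsigned int sketch_inherit_flags(unsigned int dev_flags,
					 unsigned int port_flags)
{
	if (port_flags & SK_POINTOPOINT) {
		/* P2P port: drop broadcast/multicast, mark as P2P without ARP. */
		dev_flags &= ~(SK_BROADCAST | SK_MULTICAST);
		dev_flags |= SK_POINTOPOINT | SK_NOARP;
	} else if ((port_flags & (SK_BROADCAST | SK_MULTICAST)) ==
		   (SK_BROADCAST | SK_MULTICAST)) {
		/* Ethernet-like port: restore broadcast/multicast, clear P2P state. */
		dev_flags |= SK_BROADCAST | SK_MULTICAST;
		dev_flags &= ~(SK_POINTOPOINT | SK_NOARP);
	}

	return dev_flags;
}

/* Usage example: a broadcast device takes on a point-to-point port. */
static void sketch_p2p_example(void)
{
	unsigned int flags = sketch_inherit_flags(SK_BROADCAST | SK_MULTICAST,
						  SK_POINTOPOINT | SK_NOARP);

	assert(flags == (SK_POINTOPOINT | SK_NOARP));
}
```

A port that is neither point-to-point nor both broadcast and multicast leaves the flags unchanged in this model, which follows the two-branch structure of the team change.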
Reported-by: Liang Li liali@redhat.com
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438
Fixes: 1d76efe1577b ("team: add support for non-ethernet devices")
Signed-off-by: Hangbin Liu liuhangbin@gmail.com
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/net/team/team.c | 9 +++++++++
 1 file changed, 9 insertions(+)
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index 555b0b1e9a789..d3dc22509ea58 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -2135,6 +2135,15 @@ static void team_setup_by_port(struct net_device *dev, dev->mtu = port_dev->mtu; memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len); eth_hw_addr_inherit(dev, port_dev); + + if (port_dev->flags & IFF_POINTOPOINT) { + dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST); + dev->flags |= (IFF_POINTOPOINT | IFF_NOARP); + } else if ((port_dev->flags & (IFF_BROADCAST | IFF_MULTICAST)) == + (IFF_BROADCAST | IFF_MULTICAST)) { + dev->flags |= (IFF_BROADCAST | IFF_MULTICAST); + dev->flags &= ~(IFF_POINTOPOINT | IFF_NOARP); + } }
static int team_dev_type_check_change(struct net_device *dev,
From: Suman Ghosh sumang@marvell.com
[ Upstream commit 4e62c99d71e56817c934caa2a709a775c8cee078 ]
As of today, hash extraction support is enabled for all silicon variants. Because of this, we are facing initialization issues when the silicon does not support hash extraction. During creation of the hardware parsing table for IPv6 addresses, we need to consider whether hash extraction is enabled: if it is, only 32 bits are extracted; otherwise all 128 bits need to be extracted. This patch fixes the issue and configures the hardware parser based on the availability of the feature.
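A minimal sketch of the decision this patch introduces, assuming the extraction config has already been decoded into a header offset and a "length minus one" field. The helper names are hypothetical; the offsets 8 and 24 and the comparison against 3 follow the comment in the diff for this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offsets of the source and destination addresses in the IPv6 header. */
#define IPV6_SRC_OFFSET 8u
#define IPV6_DST_OFFSET 24u

/* Hypothetical helper: should this extracted field be hashed?
 * bytesm1 is the hardware-style length field (actual byte length - 1).
 */
static bool ipv6_field_uses_hash(bool hw_supports_hash,
                                 uint8_t hdr_offset, uint8_t bytesm1)
{
    if (!hw_supports_hash)
        return false;
    return (hdr_offset == IPV6_SRC_OFFSET ||
            hdr_offset == IPV6_DST_OFFSET) && bytesm1 == 3;
}

/* Hypothetical helper: number of key bits the field occupies. */
static unsigned int ipv6_key_bits(bool hw_supports_hash,
                                  uint8_t hdr_offset, uint8_t bytesm1)
{
    if (ipv6_field_uses_hash(hw_supports_hash, hdr_offset, bytesm1))
        return 32;                      /* 128-bit address hashed to 32 bits */
    return (bytesm1 + 1u) * 8u;         /* raw extraction, e.g. 16 bytes */
}
```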
Fixes: a95ab93550d3 ("octeontx2-af: Use hashed field in MCAM key") Signed-off-by: Suman Ghosh sumang@marvell.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230721061222.2632521-1-sumang@marvell.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../marvell/octeontx2/af/rvu_npc_hash.c | 43 ++++++++++++++++++- .../marvell/octeontx2/af/rvu_npc_hash.h | 8 ++-- 2 files changed, 46 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c index 6fe67f3a7f6f1..7e20282c12d00 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c @@ -218,13 +218,54 @@ void npc_config_secret_key(struct rvu *rvu, int blkaddr)
void npc_program_mkex_hash(struct rvu *rvu, int blkaddr) { + struct npc_mcam_kex_hash *mh = rvu->kpu.mkex_hash; struct hw_cap *hwcap = &rvu->hw->cap; + u8 intf, ld, hdr_offset, byte_len; struct rvu_hwinfo *hw = rvu->hw; - u8 intf; + u64 cfg;
+ /* Check if hardware supports hash extraction */ if (!hwcap->npc_hash_extract) return;
+ /* Check if IPv6 source/destination address + * should be hash enabled. + * Hashing reduces 128bit SIP/DIP fields to 32bit + * so that 224 bit X2 key can be used for IPv6 based filters as well, + * which in turn results in more number of MCAM entries available for + * use. + * + * Hashing of IPV6 SIP/DIP is enabled in below scenarios + * 1. If the silicon variant supports hashing feature + * 2. If the number of bytes of IP addr being extracted is 4 bytes ie + * 32bit. The assumption here is that if user wants 8bytes of LSB of + * IP addr or full 16 bytes then his intention is not to use 32bit + * hash. + */ + for (intf = 0; intf < hw->npc_intfs; intf++) { + for (ld = 0; ld < NPC_MAX_LD; ld++) { + cfg = rvu_read64(rvu, blkaddr, + NPC_AF_INTFX_LIDX_LTX_LDX_CFG(intf, + NPC_LID_LC, + NPC_LT_LC_IP6, + ld)); + hdr_offset = FIELD_GET(NPC_HDR_OFFSET, cfg); + byte_len = FIELD_GET(NPC_BYTESM, cfg); + /* Hashing of IPv6 source/destination address should be + * enabled if, + * hdr_offset == 8 (offset of source IPv6 address) or + * hdr_offset == 24 (offset of destination IPv6) + * address) and the number of byte to be + * extracted is 4. As per hardware configuration + * byte_len should be == actual byte_len - 1. + * Hence byte_len is checked against 3 but nor 4. 
+ */ + if ((hdr_offset == 8 || hdr_offset == 24) && byte_len == 3) + mh->lid_lt_ld_hash_en[intf][NPC_LID_LC][NPC_LT_LC_IP6][ld] = true; + } + } + + /* Update hash configuration if the field is hash enabled */ for (intf = 0; intf < hw->npc_intfs; intf++) { npc_program_mkex_hash_rx(rvu, blkaddr, intf); npc_program_mkex_hash_tx(rvu, blkaddr, intf); diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h index a1c3d987b8044..57a09328d46b5 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h @@ -70,8 +70,8 @@ static struct npc_mcam_kex_hash npc_mkex_hash_default __maybe_unused = { [NIX_INTF_RX] = { [NPC_LID_LC] = { [NPC_LT_LC_IP6] = { - true, - true, + false, + false, }, }, }, @@ -79,8 +79,8 @@ static struct npc_mcam_kex_hash npc_mkex_hash_default __maybe_unused = { [NIX_INTF_TX] = { [NPC_LID_LC] = { [NPC_LT_LC_IP6] = { - true, - true, + false, + false, }, }, },
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit 284779dbf4e98753458708783af8c35630674a21 ]
commit a3a57bf07de23fe1ff779e0fdf710aa581c3ff73 ("net: stmmac: work around sporadic tx issue on link-up") worked around a problem with TX sometimes not working after a link-up by avoiding a redundant write to MAC_CTRL_REG (aka GMAC_CONFIG), since the IP appeared to have problems with handling multiple writes to that register in some cases.
That commit, however, only added the workaround to dwmac_lib.c (apart from the common code in stmmac_main.c), but my systems with version 4.21a of the IP exhibit the same problem, so add the workaround to dwmac4_lib.c too.
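The workaround amounts to a "write only if changed" pattern. The sketch below models it against a fake register so the number of writes can be observed. The struct, the helper name, and the bit positions are placeholders for illustration and are not taken from the driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for the receive/transmit enable bits. */
#define CFG_RE (1u << 0)
#define CFG_TE (1u << 1)

/* Stand-in for the MAC: one config register plus a write counter. */
struct fake_mac {
    uint32_t config;       /* models GMAC_CONFIG */
    unsigned int writes;   /* register writes issued so far */
};

static void fake_mac_set(struct fake_mac *mac, bool enable)
{
    uint32_t value = mac->config;   /* models readl() */
    uint32_t old_val = value;

    if (enable)
        value |= CFG_RE | CFG_TE;
    else
        value &= ~(CFG_RE | CFG_TE);

    /* Skip the redundant write when nothing would change. */
    if (value != old_val) {
        mac->config = value;        /* models writel() */
        mac->writes++;
    }
}
```

Calling fake_mac_set(&mac, true) twice in a row issues a single write, which is the behaviour the patch wants for repeated link-up handling.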
Fixes: a3a57bf07de2 ("net: stmmac: work around sporadic tx issue on link-up") Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230721-stmmac-tx-workaround-v1-1-9411cbd5ee07@ax... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c index df41eac54058f..03ceb6a940732 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c @@ -240,13 +240,15 @@ void stmmac_dwmac4_set_mac_addr(void __iomem *ioaddr, const u8 addr[6], void stmmac_dwmac4_set_mac(void __iomem *ioaddr, bool enable) { u32 value = readl(ioaddr + GMAC_CONFIG); + u32 old_val = value;
if (enable) value |= GMAC_CONFIG_RE | GMAC_CONFIG_TE; else value &= ~(GMAC_CONFIG_TE | GMAC_CONFIG_RE);
- writel(value, ioaddr + GMAC_CONFIG); + if (value != old_val) + writel(value, ioaddr + GMAC_CONFIG); }
void stmmac_dwmac4_get_mac_addr(void __iomem *ioaddr, unsigned char *addr,
From: Maxim Mikityanskiy maxtram95@gmail.com
[ Upstream commit ad084a6d99bc182bf109c190c808e2ea073ec57b ]
Only the HW rfkill state is toggled on laptops with quirks->ec_read_only (so far only MSI Wind U90/U100). There are, however, a few issues with the implementation:
1. The initial HW state is always unblocked, regardless of the actual state on boot, because msi_init_rfkill only sets the SW state, regardless of ec_read_only.
2. The initial SW state corresponds to the actual state on boot, but it can't be changed afterwards, because set_device_state returns -EOPNOTSUPP. It confuses the userspace, making Wi-Fi and/or Bluetooth unusable if it was blocked on boot, and breaking the airplane mode if the rfkill was unblocked on boot.
Address the above issues by properly initializing the HW state on ec_read_only laptops and by allowing the userspace to toggle the SW state. Don't set the SW state ourselves and let the userspace fully control it. Toggling the SW state is a no-op, however, it allows the userspace to properly toggle the airplane mode. The actual SW radio disablement is handled by the corresponding rtl818x_pci and btusb drivers that have their own rfkills.
Tested on MSI Wind U100 Plus, BIOS ver 1.0G, EC ver 130.
Fixes: 0816392b97d4 ("msi-laptop: merge quirk tables to one") Fixes: 0de6575ad0a8 ("msi-laptop: Add MSI Wind U90/U100 support") Signed-off-by: Maxim Mikityanskiy maxtram95@gmail.com Link: https://lore.kernel.org/r/20230721145423.161057-1-maxtram95@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/msi-laptop.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/platform/x86/msi-laptop.c b/drivers/platform/x86/msi-laptop.c index 6b18ec543ac3a..f4c6c36e05a52 100644 --- a/drivers/platform/x86/msi-laptop.c +++ b/drivers/platform/x86/msi-laptop.c @@ -208,7 +208,7 @@ static ssize_t set_device_state(const char *buf, size_t count, u8 mask) return -EINVAL;
if (quirks->ec_read_only) - return -EOPNOTSUPP; + return 0;
/* read current device state */ result = ec_read(MSI_STANDARD_EC_COMMAND_ADDRESS, &rdata); @@ -838,15 +838,15 @@ static bool msi_laptop_i8042_filter(unsigned char data, unsigned char str, static void msi_init_rfkill(struct work_struct *ignored) { if (rfk_wlan) { - rfkill_set_sw_state(rfk_wlan, !wlan_s); + msi_rfkill_set_state(rfk_wlan, !wlan_s); rfkill_wlan_set(NULL, !wlan_s); } if (rfk_bluetooth) { - rfkill_set_sw_state(rfk_bluetooth, !bluetooth_s); + msi_rfkill_set_state(rfk_bluetooth, !bluetooth_s); rfkill_bluetooth_set(NULL, !bluetooth_s); } if (rfk_threeg) { - rfkill_set_sw_state(rfk_threeg, !threeg_s); + msi_rfkill_set_state(rfk_threeg, !threeg_s); rfkill_threeg_set(NULL, !threeg_s); } }
From: Kirill A. Shutemov kirill.shutemov@linux.intel.com
[ Upstream commit 9f9116406120638b4d8db3831ffbc430dd2e1e95 ]
Commit c4e34dd99f2e ("x86: simplify load_unaligned_zeropad() implementation") changed how exceptions around load_unaligned_zeropad() are handled. The kernel now uses the fault_address in fixup_exception() to verify the address calculations for load_unaligned_zeropad().
It works fine for #PF, but breaks on #VE since no fault address is passed down to fixup_exception().
Propagating ve_info.gla down to fixup_exception() resolves the issue.
See commit 1e7769653b06 ("x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page") for more context.
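To illustrate why a fault address of 0 defeats the fixup, here is a user-space sketch of the kind of address verification described above. It assumes 8-byte words and that the fixup accepts the fault only when the faulting address is the first byte after the aligned word containing the load address. The helper name is hypothetical and this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BYTES 8u   /* assumption: 64-bit words */

/* Hypothetical check: does fault_addr match what an unaligned word
 * load at load_addr would hit when it spills into the next word?
 */
static bool zeropad_fault_addr_ok(uint64_t load_addr, uint64_t fault_addr)
{
    uint64_t aligned = load_addr & ~(uint64_t)(WORD_BYTES - 1);

    return fault_addr == aligned + WORD_BYTES;
}
```

With a load at 0x1ffd that crosses into the page at 0x2000, the check passes for a fault address of 0x2000 and fails for 0, which is what the #VE path used to pass down.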
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Reported-by: Michael Kelley mikelley@microsoft.com Fixes: c4e34dd99f2e ("x86: simplify load_unaligned_zeropad() implementation") Acked-by: Dave Hansen dave.hansen@linux.intel.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kernel/traps.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 58b1f208eff51..4a817d20ce3bb 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -697,9 +697,10 @@ static bool try_fixup_enqcmd_gp(void) }
static bool gp_try_fixup_and_notify(struct pt_regs *regs, int trapnr, - unsigned long error_code, const char *str) + unsigned long error_code, const char *str, + unsigned long address) { - if (fixup_exception(regs, trapnr, error_code, 0)) + if (fixup_exception(regs, trapnr, error_code, address)) return true;
current->thread.error_code = error_code; @@ -759,7 +760,7 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) goto exit; }
- if (gp_try_fixup_and_notify(regs, X86_TRAP_GP, error_code, desc)) + if (gp_try_fixup_and_notify(regs, X86_TRAP_GP, error_code, desc, 0)) goto exit;
if (error_code) @@ -1357,17 +1358,20 @@ DEFINE_IDTENTRY(exc_device_not_available)
#define VE_FAULT_STR "VE fault"
-static void ve_raise_fault(struct pt_regs *regs, long error_code) +static void ve_raise_fault(struct pt_regs *regs, long error_code, + unsigned long address) { if (user_mode(regs)) { gp_user_force_sig_segv(regs, X86_TRAP_VE, error_code, VE_FAULT_STR); return; }
- if (gp_try_fixup_and_notify(regs, X86_TRAP_VE, error_code, VE_FAULT_STR)) + if (gp_try_fixup_and_notify(regs, X86_TRAP_VE, error_code, + VE_FAULT_STR, address)) { return; + }
- die_addr(VE_FAULT_STR, regs, error_code, 0); + die_addr(VE_FAULT_STR, regs, error_code, address); }
/* @@ -1431,7 +1435,7 @@ DEFINE_IDTENTRY(exc_virtualization_exception) * it successfully, treat it as #GP(0) and handle it. */ if (!tdx_handle_virt_exception(regs, &ve)) - ve_raise_fault(regs, 0); + ve_raise_fault(regs, 0, ve.gla);
cond_local_irq_disable(regs); }
From: Lin Ma linma@zju.edu.cn
[ Upstream commit 55cef78c244d0d076f5a75a35530ca63c92f4426 ]
The previous commit 954d1fa1ac93 ("macvlan: Add netlink attribute for broadcast cutoff") added one additional attribute named IFLA_MACVLAN_BC_CUTOFF to allow broadcast cutoff.
However, it forgot to describe the nla_policy at macvlan_policy (drivers/net/macvlan.c). Hence, this supposed NLA_S32 (4 bytes) integer can be faked as empty (0 bytes) by a malicious user, which could lead to an OOB read in the heap, just like CVE-2023-3773.
To fix it, this commit just completes the nla_policy description for IFLA_MACVLAN_BC_CUTOFF. This enforces the length check and avoids the potential OOB read.
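The effect of a missing policy entry can be modelled with a much-simplified validator. The types and the helper below are stand-ins invented for this sketch, not the kernel's netlink API. The model only captures the idea that an attribute without a policy type gets no minimum-length check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for NLA_UNSPEC / NLA_S32 in this simplified model. */
enum attr_type { ATTR_UNSPEC, ATTR_S32 };

struct attr_policy {
    enum attr_type type;
};

/* Hypothetical validator: is the payload long enough for its type? */
static bool attr_len_ok(const struct attr_policy *p, size_t payload_len)
{
    switch (p->type) {
    case ATTR_S32:
        return payload_len >= sizeof(int32_t);
    case ATTR_UNSPEC:
    default:
        /* No policy entry: any length, including 0, is accepted,
         * so a later 4-byte read may run past the attribute.
         */
        return true;
    }
}
```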
Fixes: 954d1fa1ac93 ("macvlan: Add netlink attribute for broadcast cutoff") Signed-off-by: Lin Ma linma@zju.edu.cn Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230723080205.3715164-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/macvlan.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 4a53debf9d7c4..ed908165a8b4e 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -1746,6 +1746,7 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = { [IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 }, [IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 }, [IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_REJECT }, + [IFLA_MACVLAN_BC_CUTOFF] = { .type = NLA_S32 }, };
int macvlan_link_register(struct rtnl_link_ops *ops)
From: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com
[ Upstream commit d4a7ce642100765119a872d4aba1bf63e3a22c8a ]
The Xeon validation group has been carrying out loaded tests with various HW configurations, and they have seen transmit queue timeouts during the tests. A timeout causes the reset adapter function to be called by igc_tx_timeout(). Similar race conditions may arise when the interface is being brought down and up in igc_reinit_locked(), an interrupt is generated, and igc_clean_tx_irq() is called to complete the TX.
When the igc_tx_timeout() function is invoked, this patch turns off all TX ring HW queues during the igc_down() process. The TX ring HW queues are activated again during the igc_configure_tx_ring() process when the igc_up() procedure is performed later.
This patch also moves the existing igc_disable_tx_ring_hw() to avoid a forward declaration.
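The per-queue disable step is a read-modify-write of the TXDCTL register: clear the queue-enable bit and set the software-flush bit. The sketch below applies that to plain integers instead of device registers. The bit values and helper names are assumptions made for illustration and should be checked against the driver headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit values for this sketch; verify against the igc headers. */
#define TXDCTL_QUEUE_ENABLE 0x02000000u
#define TXDCTL_SWFLUSH      0x04000000u

/* Hypothetical helper: new TXDCTL value for a queue being disabled. */
static uint32_t txdctl_disable(uint32_t txdctl)
{
    txdctl &= ~TXDCTL_QUEUE_ENABLE;   /* stop the hardware queue */
    txdctl |= TXDCTL_SWFLUSH;         /* ask hardware to flush descriptors */
    return txdctl;
}

/* Apply the same update to every queue's register image. */
static void txdctl_disable_all(uint32_t *regs, size_t num_queues)
{
    size_t i;

    for (i = 0; i < num_queues; i++)
        regs[i] = txdctl_disable(regs[i]);
}
```

Doing this before the rings are cleaned means the hardware is no longer completing descriptors while the software state is being torn down.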
Kernel trace: [ 7678.747813] ------------[ cut here ]------------ [ 7678.757914] NETDEV WATCHDOG: enp1s0 (igc): transmit queue 2 timed out [ 7678.770117] WARNING: CPU: 0 PID: 13 at net/sched/sch_generic.c:525 dev_watchdog+0x1ae/0x1f0 [ 7678.784459] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 7678.784496] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 7679.200403] RIP: 0010:dev_watchdog+0x1ae/0x1f0 [ 7679.210201] Code: 28 e9 53 ff ff ff 4c 89 e7 c6 05 06 42 b9 00 01 e8 17 d1 fb ff 44 89 e9 4c 89 e6 48 c7 c7 40 ad fb 81 48 89 c2 e8 52 62 82 ff <0f> 0b e9 72 ff ff ff 65 8b 05 80 7d 7c 7e 89 
c0 48 0f a3 05 0a c1 [ 7679.245438] RSP: 0018:ffa00000001f7d90 EFLAGS: 00010282 [ 7679.256021] RAX: 0000000000000000 RBX: ff11000109938440 RCX: 0000000000000000 [ 7679.268710] RDX: ff11000361e26cd8 RSI: ff11000361e1b880 RDI: ff11000361e1b880 [ 7679.281314] RBP: ffa00000001f7da8 R08: ff1100035f8fffe8 R09: 0000000000027ffb [ 7679.293840] R10: 0000000000001f0a R11: ff1100035f840000 R12: ff11000109938000 [ 7679.306276] R13: 0000000000000002 R14: dead000000000122 R15: ffa00000001f7e18 [ 7679.318648] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 7679.332064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7679.342757] CR2: 00007ffff7fca168 CR3: 000000013b08a006 CR4: 0000000000471ef8 [ 7679.354984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 7679.367207] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 7679.379370] PKRU: 55555554 [ 7679.386446] Call Trace: [ 7679.393152] <TASK> [ 7679.399363] ? __pfx_dev_watchdog+0x10/0x10 [ 7679.407870] call_timer_fn+0x31/0x110 [ 7679.415698] expire_timers+0xb2/0x120 [ 7679.423403] run_timer_softirq+0x179/0x1e0 [ 7679.431532] ? __schedule+0x2b1/0x820 [ 7679.439078] __do_softirq+0xd1/0x295 [ 7679.446426] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 7679.454867] run_ksoftirqd+0x22/0x30 [ 7679.462058] smpboot_thread_fn+0xb7/0x160 [ 7679.469670] kthread+0xcd/0xf0 [ 7679.476097] ? 
__pfx_kthread+0x10/0x10 [ 7679.483211] ret_from_fork+0x29/0x50 [ 7679.490047] </TASK> [ 7679.495204] ---[ end trace 0000000000000000 ]--- [ 7679.503179] igc 0000:01:00.0 enp1s0: Register Dump [ 7679.511230] igc 0000:01:00.0 enp1s0: Register Name Value [ 7679.519892] igc 0000:01:00.0 enp1s0: CTRL 181c0641 [ 7679.528782] igc 0000:01:00.0 enp1s0: STATUS 40280683 [ 7679.537551] igc 0000:01:00.0 enp1s0: CTRL_EXT 10000040 [ 7679.546284] igc 0000:01:00.0 enp1s0: MDIC 180a3800 [ 7679.554942] igc 0000:01:00.0 enp1s0: ICR 00000081 [ 7679.563503] igc 0000:01:00.0 enp1s0: RCTL 04408022 [ 7679.571963] igc 0000:01:00.0 enp1s0: RDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.583075] igc 0000:01:00.0 enp1s0: RDH[0-3] 00000068 000000b6 0000000f 00000031 [ 7679.594162] igc 0000:01:00.0 enp1s0: RDT[0-3] 00000066 000000b2 0000000e 00000030 [ 7679.605174] igc 0000:01:00.0 enp1s0: RXDCTL[0-3] 02040808 02040808 02040808 02040808 [ 7679.616196] igc 0000:01:00.0 enp1s0: RDBAL[0-3] 1bb7c000 1bb7f000 1bb82000 0ef33000 [ 7679.627242] igc 0000:01:00.0 enp1s0: RDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.638256] igc 0000:01:00.0 enp1s0: TCTL a503f0fa [ 7679.646607] igc 0000:01:00.0 enp1s0: TDBAL[0-3] 2ba4a000 1bb6f000 1bb74000 1bb79000 [ 7679.657609] igc 0000:01:00.0 enp1s0: TDBAH[0-3] 00000001 00000001 00000001 00000001 [ 7679.668551] igc 0000:01:00.0 enp1s0: TDLEN[0-3] 00001000 00001000 00001000 00001000 [ 7679.679470] igc 0000:01:00.0 enp1s0: TDH[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.690406] igc 0000:01:00.0 enp1s0: TDT[0-3] 000000a7 0000002d 000000bf 000000d9 [ 7679.701264] igc 0000:01:00.0 enp1s0: TXDCTL[0-3] 02100108 02100108 02100108 02100108 [ 7679.712123] igc 0000:01:00.0 enp1s0: Reset adapter [ 7683.085967] igc 0000:01:00.0 enp1s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX [ 8086.945561] ------------[ cut here ]------------ Entering kdb (current=0xffffffff8220b200, pid 0) on processor 0 Oops: (null) due to oops @ 0xffffffff81573888 RIP: 
0010:dql_completed+0x148/0x160 Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? igc_poll+0x1a9/0x14d0 [igc] __napi_poll+0x2e/0x1b0 net_rx_action+0x126/0x250 __do_softirq+0xd1/0x295 irq_exit_rcu+0xc5/0xf0 common_interrupt+0x86/0xa0 </IRQ> <TASK> asm_common_interrupt+0x27/0x40 RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 cpuidle_enter+0x2e/0x50 call_cpuidle+0x23/0x40 do_idle+0x1be/0x220 cpu_startup_entry+0x20/0x30 rest_init+0xb5/0xc0 arch_call_rest_init+0xe/0x30 start_kernel+0x448/0x760 x86_64_start_kernel+0x109/0x150 
secondary_startup_64_no_verify+0xe0/0xeb </TASK> more> [0]kdb>
[0]kdb> [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, type go a second time if you really want to continue [0]kdb> go Catastrophic error detected kdb_continue_catastrophic=0, attempting to continue [ 8086.955689] refcount_t: underflow; use-after-free. [ 8086.955697] WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0xc2/0x110 [ 8086.955706] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.955751] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [ 8086.955784] RIP: 0010:refcount_warn_saturate+0xc2/0x110 [ 8086.955788] Code: 01 e8 82 e7 b4 ff 0f 0b 5d c3 
cc cc cc cc 80 3d 68 c6 eb 00 00 75 81 48 c7 c7 a0 87 f6 81 c6 05 58 c6 eb 00 01 e8 5e e7 b4 ff <0f> 0b 5d c3 cc cc cc cc 80 3d 42 c6 eb 00 00 0f 85 59 ff ff ff 48 [ 8086.955790] RSP: 0018:ffa0000000003da0 EFLAGS: 00010286 [ 8086.955793] RAX: 0000000000000000 RBX: ff1100011da40ee0 RCX: ff11000361e1b888 [ 8086.955794] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ff11000361e1b880 [ 8086.955795] RBP: ffa0000000003da0 R08: 80000000ffff9f45 R09: ffa0000000003d28 [ 8086.955796] R10: ff1100035f840000 R11: 0000000000000028 R12: ff11000319ff8000 [ 8086.955797] R13: ff1100011bb79d60 R14: 00000000ffffffd6 R15: ff1100037039cb00 [ 8086.955798] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955800] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955801] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955803] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955803] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955804] PKRU: 55555554 [ 8086.955805] Call Trace: [ 8086.955806] <IRQ> [ 8086.955808] tcp_wfree+0x112/0x130 [ 8086.955814] skb_release_head_state+0x24/0xa0 [ 8086.955818] napi_consume_skb+0x9c/0x160 [ 8086.955821] igc_poll+0x5d8/0x14d0 [igc] [ 8086.955835] __napi_poll+0x2e/0x1b0 [ 8086.955839] net_rx_action+0x126/0x250 [ 8086.955843] __do_softirq+0xd1/0x295 [ 8086.955846] irq_exit_rcu+0xc5/0xf0 [ 8086.955851] common_interrupt+0x86/0xa0 [ 8086.955857] </IRQ> [ 8086.955857] <TASK> [ 8086.955858] asm_common_interrupt+0x27/0x40 [ 8086.955862] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955866] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955867] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955869] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955870] RDX: 
0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955871] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955872] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955873] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955875] cpuidle_enter+0x2e/0x50 [ 8086.955880] call_cpuidle+0x23/0x40 [ 8086.955884] do_idle+0x1be/0x220 [ 8086.955887] cpu_startup_entry+0x20/0x30 [ 8086.955889] rest_init+0xb5/0xc0 [ 8086.955892] arch_call_rest_init+0xe/0x30 [ 8086.955895] start_kernel+0x448/0x760 [ 8086.955898] x86_64_start_kernel+0x109/0x150 [ 8086.955900] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955904] </TASK> [ 8086.955904] ---[ end trace 0000000000000000 ]--- [ 8086.955912] ------------[ cut here ]------------ [ 8086.955913] kernel BUG at lib/dynamic_queue_limits.c:27! [ 8086.955918] invalid opcode: 0000 [#1] SMP [ 8086.955922] RIP: 0010:dql_completed+0x148/0x160 [ 8086.955925] Code: c9 00 48 89 57 58 e9 46 ff ff ff 45 85 e4 41 0f 95 c4 41 39 db 0f 95 c1 41 84 cc 74 05 45 85 ed 78 0a 44 89 c1 e9 27 ff ff ff <0f> 0b 01 f6 44 89 c1 29 f1 0f 48 ca eb 8c cc cc cc cc cc cc cc cc [ 8086.955927] RSP: 0018:ffa0000000003e00 EFLAGS: 00010287 [ 8086.955928] RAX: 000000000000006c RBX: ffa0000003eb0f78 RCX: ff11000109938000 [ 8086.955929] RDX: 0000000000000003 RSI: 0000000000000160 RDI: ff110001002e9480 [ 8086.955930] RBP: ffa0000000003ed8 R08: ff110001002e93c0 R09: ffa0000000003d28 [ 8086.955931] R10: 0000000000007cc0 R11: 0000000000007c54 R12: 00000000ffffffd9 [ 8086.955932] R13: ff1100037039cb00 R14: 00000000ffffffd9 R15: ff1100037039c048 [ 8086.955933] FS: 0000000000000000(0000) GS:ff11000361e00000(0000) knlGS:0000000000000000 [ 8086.955934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8086.955935] CR2: 00007ffff7fca168 CR3: 000000013b08a003 CR4: 0000000000471ef8 [ 8086.955936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8086.955937] DR3: 0000000000000000 
DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 8086.955938] PKRU: 55555554 [ 8086.955939] Call Trace: [ 8086.955939] <IRQ> [ 8086.955940] ? igc_poll+0x1a9/0x14d0 [igc] [ 8086.955949] __napi_poll+0x2e/0x1b0 [ 8086.955952] net_rx_action+0x126/0x250 [ 8086.955956] __do_softirq+0xd1/0x295 [ 8086.955958] irq_exit_rcu+0xc5/0xf0 [ 8086.955961] common_interrupt+0x86/0xa0 [ 8086.955964] </IRQ> [ 8086.955965] <TASK> [ 8086.955965] asm_common_interrupt+0x27/0x40 [ 8086.955968] RIP: 0010:cpuidle_enter_state+0xd3/0x3e0 [ 8086.955971] Code: 73 f1 ff ff 49 89 c6 8b 05 e2 ca a7 00 85 c0 0f 8f b3 02 00 00 31 ff e8 1b de 75 ff 80 7d d7 00 0f 85 cd 01 00 00 fb 45 85 ff <0f> 88 fd 00 00 00 49 63 cf 4c 2b 75 c8 48 8d 04 49 48 89 ca 48 8d [ 8086.955972] RSP: 0018:ffffffff82203df0 EFLAGS: 00000202 [ 8086.955973] RAX: ff11000361e2a200 RBX: 0000000000000002 RCX: 000000000000001f [ 8086.955974] RDX: 0000000000000000 RSI: 000000003cf3cf3d RDI: 0000000000000000 [ 8086.955974] RBP: ffffffff82203e28 R08: 0000075ae38471c8 R09: 0000000000000018 [ 8086.955975] R10: 000000000000031a R11: ffffffff8238dca0 R12: ffd1ffffff200000 [ 8086.955976] R13: ffffffff8238dca0 R14: 0000075ae38471c8 R15: 0000000000000002 [ 8086.955978] cpuidle_enter+0x2e/0x50 [ 8086.955981] call_cpuidle+0x23/0x40 [ 8086.955984] do_idle+0x1be/0x220 [ 8086.955985] cpu_startup_entry+0x20/0x30 [ 8086.955987] rest_init+0xb5/0xc0 [ 8086.955990] arch_call_rest_init+0xe/0x30 [ 8086.955992] start_kernel+0x448/0x760 [ 8086.955994] x86_64_start_kernel+0x109/0x150 [ 8086.955996] secondary_startup_64_no_verify+0xe0/0xeb [ 8086.955998] </TASK> [ 8086.955999] Modules linked in: xt_conntrack nft_chain_nat xt_MASQUERADE xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay dm_mod emrcha(PO) emriio(PO) rktpm(PO) cegbuf_mod(PO) patch_update(PO) se(PO) sgx_tgts(PO) mktme(PO) keylocker(PO) svtdx(PO) svfs_pci_hotplug(PO) vtd_mod(PO) davemem(PO) svmabort(PO) svindexio(PO) usbx2(PO) ehci_sched(PO) svheartbeat(PO) ioapic(PO) 
sv8259(PO) svintr(PO) lt(PO) pcierootport(PO) enginefw_mod(PO) ata(PO) smbus(PO) spiflash_cdf(PO) arden(PO) dsa_iax(PO) oobmsm_punit(PO) cpm(PO) svkdb(PO) ebg_pch(PO) pch(PO) sviotargets(PO) svbdf(PO) svmem(PO) svbios(PO) dram(PO) svtsc(PO) targets(PO) superio(PO) svkernel(PO) cswitch(PO) mcf(PO) pentiumIII_mod(PO) fs_svfs(PO) mdevdefdb(PO) svfs_os_services(O) ixgbe mdio mdio_devres libphy emeraldrapids_svdefs(PO) regsupport(O) libnvdimm nls_cp437 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep x86_pkg_temp_thermal snd_hda_core snd_pcm snd_timer isst_if_mbox_pci [ 8086.956029] input_leds isst_if_mmio sg snd isst_if_common soundcore wmi button sad9(O) drm fuse backlight configfs efivarfs ip_tables x_tables vmd sdhci led_class rtl8150 r8152 hid_generic pegasus mmc_block usbhid mmc_core hid megaraid_sas ixgb igb i2c_algo_bit ice i40e hpsa scsi_transport_sas e1000e e1000 e100 ax88179_178a usbnet xhci_pci sd_mod xhci_hcd t10_pi crc32c_intel crc64_rocksoft igc crc64 crc_t10dif usbcore crct10dif_generic ptp crct10dif_common usb_common pps_core [16762.543675] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.593 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.595 msecs [16762.543673] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.495 msecs [16762.543679] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543678] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.598 msecs [16762.543690] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.605 msecs [16762.543684] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.599 msecs [16762.543693] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 8675587.613 msecs [16762.543784] ---[ end trace 0000000000000000 ]--- [16762.849099] RIP: 0010:dql_completed+0x148/0x160 PANIC: Fatal exception in interrupt
Fixes: 9b275176270e ("igc: Add ndo_tx_timeout support") Tested-by: Alejandra Victoria Alcaraz alejandra.victoria.alcaraz@intel.com Signed-off-by: Muhammad Husaini Zulkifli muhammad.husaini.zulkifli@intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Naama Meir naamax.meir@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 40 ++++++++++++++++------- 1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 496a4eb687b00..3ccf2fedc5af7 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -316,6 +316,33 @@ static void igc_clean_all_tx_rings(struct igc_adapter *adapter) igc_clean_tx_ring(adapter->tx_ring[i]); }
+static void igc_disable_tx_ring_hw(struct igc_ring *ring) +{ + struct igc_hw *hw = &ring->q_vector->adapter->hw; + u8 idx = ring->reg_idx; + u32 txdctl; + + txdctl = rd32(IGC_TXDCTL(idx)); + txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE; + txdctl |= IGC_TXDCTL_SWFLUSH; + wr32(IGC_TXDCTL(idx), txdctl); +} + +/** + * igc_disable_all_tx_rings_hw - Disable all transmit queue operation + * @adapter: board private structure + */ +static void igc_disable_all_tx_rings_hw(struct igc_adapter *adapter) +{ + int i; + + for (i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *tx_ring = adapter->tx_ring[i]; + + igc_disable_tx_ring_hw(tx_ring); + } +} + /** * igc_setup_tx_resources - allocate Tx resources (Descriptors) * @tx_ring: tx descriptor ring (for a specific queue) to setup @@ -5056,6 +5083,7 @@ void igc_down(struct igc_adapter *adapter) /* clear VLAN promisc flag so VFTA will be updated if necessary */ adapter->flags &= ~IGC_FLAG_VLAN_PROMISC;
+ igc_disable_all_tx_rings_hw(adapter); igc_clean_all_tx_rings(adapter); igc_clean_all_rx_rings(adapter); } @@ -7274,18 +7302,6 @@ void igc_enable_rx_ring(struct igc_ring *ring) igc_alloc_rx_buffers(ring, igc_desc_unused(ring)); }
-static void igc_disable_tx_ring_hw(struct igc_ring *ring) -{ - struct igc_hw *hw = &ring->q_vector->adapter->hw; - u8 idx = ring->reg_idx; - u32 txdctl; - - txdctl = rd32(IGC_TXDCTL(idx)); - txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE; - txdctl |= IGC_TXDCTL_SWFLUSH; - wr32(IGC_TXDCTL(idx), txdctl); -} - void igc_disable_tx_ring(struct igc_ring *ring) { igc_disable_tx_ring_hw(ring);
From: Florian Westphal fw@strlen.de
[ Upstream commit f718863aca469a109895cb855e6b81fff4827d71 ]
The lazy gc on insert that should remove timed-out entries fails to release the other half of the interval, if any.
Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0 in nftables.git and kmemleak enabled kernel.
The second bug is the use of the rbe_prev vs. the prev pointer. If rb_prev() returns NULL after at least one iteration, rbe_prev points to an element that is not an end interval, hence it should not be removed.
Lastly, check the genmask of the end interval to see whether it is active in the current generation.
Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_rbtree.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 5c05c9b990fba..8d73fffd2d09d 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -217,29 +217,37 @@ static void *nft_rbtree_get(const struct net *net, const struct nft_set *set,
static int nft_rbtree_gc_elem(const struct nft_set *__set, struct nft_rbtree *priv, - struct nft_rbtree_elem *rbe) + struct nft_rbtree_elem *rbe, + u8 genmask) { struct nft_set *set = (struct nft_set *)__set; struct rb_node *prev = rb_prev(&rbe->node); - struct nft_rbtree_elem *rbe_prev = NULL; + struct nft_rbtree_elem *rbe_prev; struct nft_set_gc_batch *gcb;
gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC); if (!gcb) return -ENOMEM;
- /* search for expired end interval coming before this element. */ + /* search for end interval coming before this element. + * end intervals don't carry a timeout extension, they + * are coupled with the interval start element. + */ while (prev) { rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); - if (nft_rbtree_interval_end(rbe_prev)) + if (nft_rbtree_interval_end(rbe_prev) && + nft_set_elem_active(&rbe_prev->ext, genmask)) break;
prev = rb_prev(prev); }
- if (rbe_prev) { + if (prev) { + rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); + rb_erase(&rbe_prev->node, &priv->root); atomic_dec(&set->nelems); + nft_set_gc_batch_add(gcb, rbe_prev); }
rb_erase(&rbe->node, &priv->root); @@ -321,7 +329,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
/* perform garbage collection to avoid bogus overlap reports. */ if (nft_set_elem_expired(&rbe->ext)) { - err = nft_rbtree_gc_elem(set, priv, rbe); + err = nft_rbtree_gc_elem(set, priv, rbe, genmask); if (err < 0) return err;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 0a771f7b266b02d262900c75f1e175c7fe76fec2 ]
On error when building the rule, the immediate expression unbinds the chain, hence objects can be deactivated by the transaction records.
Otherwise, it is possible to trigger the following warning:
WARNING: CPU: 3 PID: 915 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 3 PID: 915 Comm: chain-bind-err- Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
Fixes: 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic") Reported-by: Kevin Rich kevinrich1337@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_immediate.c | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c index 407d7197f75bb..fccb3cf7749c1 100644 --- a/net/netfilter/nft_immediate.c +++ b/net/netfilter/nft_immediate.c @@ -125,15 +125,27 @@ static void nft_immediate_activate(const struct nft_ctx *ctx, return nft_data_hold(&priv->data, nft_dreg_to_type(priv->dreg)); }
+static void nft_immediate_chain_deactivate(const struct nft_ctx *ctx, + struct nft_chain *chain, + enum nft_trans_phase phase) +{ + struct nft_ctx chain_ctx; + struct nft_rule *rule; + + chain_ctx = *ctx; + chain_ctx.chain = chain; + + list_for_each_entry(rule, &chain->rules, list) + nft_rule_expr_deactivate(&chain_ctx, rule, phase); +} + static void nft_immediate_deactivate(const struct nft_ctx *ctx, const struct nft_expr *expr, enum nft_trans_phase phase) { const struct nft_immediate_expr *priv = nft_expr_priv(expr); const struct nft_data *data = &priv->data; - struct nft_ctx chain_ctx; struct nft_chain *chain; - struct nft_rule *rule;
if (priv->dreg == NFT_REG_VERDICT) { switch (data->verdict.code) { @@ -143,20 +155,17 @@ static void nft_immediate_deactivate(const struct nft_ctx *ctx, if (!nft_chain_binding(chain)) break;
- chain_ctx = *ctx; - chain_ctx.chain = chain; - - list_for_each_entry(rule, &chain->rules, list) - nft_rule_expr_deactivate(&chain_ctx, rule, phase); - switch (phase) { case NFT_TRANS_PREPARE_ERROR: nf_tables_unbind_chain(ctx, chain); - fallthrough; + nft_deactivate_next(ctx->net, chain); + break; case NFT_TRANS_PREPARE: + nft_immediate_chain_deactivate(ctx, chain, phase); nft_deactivate_next(ctx->net, chain); break; default: + nft_immediate_chain_deactivate(ctx, chain, phase); nft_chain_del(chain); chain->bound = false; nft_use_dec(&chain->table->use);
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 0ebc1064e4874d5987722a2ddbc18f94aa53b211 ]
Bail out with EOPNOTSUPP when adding rule to bound chain via NFTA_RULE_CHAIN_ID. The following warning splat is shown when adding a rule to a deleted bound chain:
WARNING: CPU: 2 PID: 13692 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables] CPU: 2 PID: 13692 Comm: chain-bound-rul Not tainted 6.1.39 #1 RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Reported-by: Kevin Rich kevinrich1337@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index ccf0b3d80fd97..da00c411a9cd4 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -3810,8 +3810,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN]); return PTR_ERR(chain); } - if (nft_chain_is_bound(chain)) - return -EOPNOTSUPP;
} else if (nla[NFTA_RULE_CHAIN_ID]) { chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID], @@ -3824,6 +3822,9 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, return -EINVAL; }
+ if (nft_chain_is_bound(chain)) + return -EOPNOTSUPP; + if (nla[NFTA_RULE_HANDLE]) { handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_HANDLE])); rule = __nft_rule_lookup(chain, handle);
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit 5f0bc0b042fc77ff70e14c790abdec960cde4ec1 ]
Commit eda0047296a1 ("mm: make the page fault mmap locking killable") intentionally made it much easier to trigger the "page fault fails because a fatal signal is pending" situation, by having the mmap locking fail early in that case.
We have long aborted page faults in other fatal cases when the actual IO for a page is interrupted by SIGKILL - which is particularly useful for the traditional case of NFS hanging due to network issues, but local filesystems could cause it too if you happened to get the SIGKILL while waiting for a page to be faulted in (eg lock_folio_maybe_drop_mmap()).
So aborting the page fault wasn't a new condition - but it now triggers earlier, before we even get to 'handle_mm_fault()'. And as a result the error doesn't go through our 'fault_signal_pending()' logic, and doesn't get filtered away there.
Normally you'd never even notice, because if a fatal signal is pending, the new SIGSEGV we send ends up being ignored anyway.
But it turns out that there is one very noticeable exception: if you enable 'show_unhandled_signals', the aborted page fault will be logged in the kernel messages, and you'll get a scary line looking something like this in your logs:
pverados[2183248]: segfault at 55e5a00f9ae0 ip 000055e5a00f9ae0 sp 00007ffc0720bea8 error 14 in perl[55e5a00d4000+195000] likely on CPU 10 (core 4, socket 0)
which is rather misleading. It's not really a segfault at all, it's just "the thread was killed before the page fault completed, so we aborted the page fault".
Fix this by just making it clear that a pending fatal signal means that any new signal coming in after that is implicitly handled. This will avoid the misleading logging, since now the signal isn't 'unhandled' any more.
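As a rough illustration of the decision described above, here is a userspace model of the check. It is only a sketch: struct task_model and its fields are hypothetical stand-ins for the kernel's task state, and the other checks done by the real unhandled_signal() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; not the real kernel structures. */
enum handler_kind { HANDLER_DEFAULT, HANDLER_IGNORE, HANDLER_USER };

struct task_model {
	enum handler_kind handler; /* disposition of the signal being sent */
	bool fatal_pending;        /* models fatal_signal_pending(tsk) */
	bool ptraced;              /* models tsk->ptrace != 0 */
};

/* Returns true when the signal would be logged as "unhandled". */
static bool unhandled_signal_model(const struct task_model *t)
{
	assert(t != NULL);

	/* A user-installed handler means the signal is handled. */
	if (t->handler == HANDLER_USER)
		return false;

	/* The new check: a dying task implicitly handles any new signal. */
	if (t->fatal_pending)
		return false;

	/* If ptraced, the tracer decides, so nothing is reported here. */
	return !t->ptraced;
}
```

With this ordering, a task that already has a fatal signal pending never reaches the final return, so the aborted page fault is not reported as a segfault.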
Reported-and-tested-by: Fiona Ebner f.ebner@proxmox.com Tested-by: Thomas Lamprecht t.lamprecht@proxmox.com Link: https://lore.kernel.org/lkml/8d063a26-43f5-0bb7-3203-c6a04dc159f8@proxmox.co... Acked-by: Oleg Nesterov oleg@redhat.com Fixes: eda0047296a1 ("mm: make the page fault mmap locking killable") Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/signal.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/kernel/signal.c b/kernel/signal.c
index 2547fa73bde51..1b39cba7dfd38 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -561,6 +561,10 @@ bool unhandled_signal(struct task_struct *tsk, int sig)
 	if (handler != SIG_IGN && handler != SIG_DFL)
 		return false;

+	/* If dying, we handle all new signals by ignoring them */
+	if (fatal_signal_pending(tsk))
+		return false;
+
 	/* if ptraced, let the tracer determine */
 	return !tsk->ptrace;
 }
From: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com
[ Upstream commit d7ddf5f4269fcaf19aafe971e635d91897423a3a ]
Remove the wrong index adjustment, which is a leftover from adding support for sparse enums. The enum.entries_by_val lookup shall not subtract the start-value, as it is indexed with the real enum value.
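The distinction between the two decode paths can be sketched in C (the real code is Python; the table, names and values below are made up for illustration). A plain enum is looked up by its real value, while flag decoding counts bit positions from 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct enum_entry {
	uint32_t value;
	const char *name;
};

/* Hypothetical enum whose values start at 3 rather than 0. */
static const struct enum_entry demo_entries[] = {
	{ 3, "low" }, { 4, "mid" }, { 5, "high" },
};

/* Look an entry up by its real value; no start-value is subtracted. */
static const char *entry_by_val(uint32_t raw)
{
	size_t i;

	for (i = 0; i < sizeof(demo_entries) / sizeof(demo_entries[0]); i++)
		if (demo_entries[i].value == raw)
			return demo_entries[i].name;
	return NULL;
}

/* Decode a flags word into the positions of its set bits, counting from 0. */
static size_t decode_flag_bits(uint32_t raw, unsigned int *bits, size_t max)
{
	size_t n = 0;
	unsigned int i = 0;

	assert(bits != NULL);
	while (raw && n < max) {
		if (raw & 1)
			bits[n++] = i;
		raw >>= 1;
		i++;
	}
	return n;
}
```

In this model, subtracting the start value (3) from a raw value of 4 would look up key 1, which does not exist; indexing with the real value finds "mid".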
Fixes: c311aaa74ca1 ("tools: ynl: fix enum-as-flags in the generic CLI") Signed-off-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Reviewed-by: Donald Hunter donald.hunter@gmail.com Link: https://lore.kernel.org/r/20230725101642.267248-2-arkadiusz.kubalewski@intel... Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/net/ynl/lib/ynl.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py
index 3144f33196be4..35462c7ce48b5 100644
--- a/tools/net/ynl/lib/ynl.py
+++ b/tools/net/ynl/lib/ynl.py
@@ -405,8 +405,8 @@ class YnlFamily(SpecFamily):
     def _decode_enum(self, rsp, attr_spec):
         raw = rsp[attr_spec['name']]
         enum = self.consts[attr_spec['enum']]
-        i = attr_spec.get('value-start', 0)
         if 'enum-as-flags' in attr_spec and attr_spec['enum-as-flags']:
+            i = 0
             value = set()
             while raw:
                 if raw & 1:
@@ -414,7 +414,7 @@ class YnlFamily(SpecFamily):
                 raw >>= 1
                 i += 1
         else:
-            value = enum.entries_by_val[raw - i].name
+            value = enum.entries_by_val[raw].name
         rsp[attr_spec['name']] = value

     def _decode_binary(self, attr, attr_spec):
From: Wei Fang wei.fang@nxp.com
[ Upstream commit 15cec633fc7bfe4cd69aa012c3b35b31acfc86f2 ]
According to the clarification [1] in the latest napi.rst, the tx processing cannot call any XDP (or page pool) APIs if the "budget" is 0. NAPI being called with a budget of 0 (as netpoll does) indicates that we may be in IRQ context, and the page pool cannot be used from IRQ context.
[1] https://lore.kernel.org/all/20230720161323.2025379-1-kuba@kernel.org/
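The rule can be pictured with a small userspace model of a Tx completion loop. This is a sketch only: struct tx_slot and enum buf_kind are hypothetical and do not mirror the fec driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer kinds: ordinary skb vs. XDP frame backed by a page pool. */
enum buf_kind { BUF_SKB, BUF_XDP };

struct tx_slot {
	enum buf_kind kind;
	bool done; /* set once the slot has been reclaimed */
};

/* Reclaim completed Tx slots in order; returns how many were reclaimed. */
static size_t tx_clean_model(struct tx_slot *ring, size_t n, int budget)
{
	size_t cleaned = 0;
	size_t i;

	assert(ring != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/*
		 * A budget of 0 may mean IRQ context, where page pool
		 * APIs must not be used: stop at the first XDP buffer
		 * and leave it for a later poll with a real budget.
		 */
		if (ring[i].kind == BUF_XDP && budget == 0)
			break;

		ring[i].done = true;
		cleaned++;
	}
	return cleaned;
}
```

Stopping at the first XDP slot, rather than skipping over it, keeps completion in ring order, which matches the "break" used in the patch.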
Fixes: 20f797399035 ("net: fec: recycle pages for transmitted XDP frames") Signed-off-by: Wei Fang wei.fang@nxp.com Suggested-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/r/20230725074148.2936402-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index a1b0abe54a0e5..92410f30ad241 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1372,7 +1372,7 @@ fec_enet_hwtstamp(struct fec_enet_private *fep, unsigned ts, }
static void -fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) +fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget) { struct fec_enet_private *fep; struct xdp_frame *xdpf; @@ -1416,6 +1416,14 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) if (!skb) goto tx_buf_done; } else { + /* Tx processing cannot call any XDP (or page pool) APIs if + * the "budget" is 0. Because NAPI is called with budget of + * 0 (such as netpoll) indicates we may be in an IRQ context, + * however, we can't use the page pool from IRQ context. + */ + if (unlikely(!budget)) + break; + xdpf = txq->tx_buf[index].xdp; if (bdp->cbd_bufaddr) dma_unmap_single(&fep->pdev->dev, @@ -1508,14 +1516,14 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) writel(0, txq->bd.reg_desc_active); }
-static void fec_enet_tx(struct net_device *ndev) +static void fec_enet_tx(struct net_device *ndev, int budget) { struct fec_enet_private *fep = netdev_priv(ndev); int i;
/* Make sure that AVB queues are processed first. */ for (i = fep->num_tx_queues - 1; i >= 0; i--) - fec_enet_tx_queue(ndev, i); + fec_enet_tx_queue(ndev, i, budget); }
static void fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq, @@ -1858,7 +1866,7 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
do { done += fec_enet_rx(ndev, budget - done); - fec_enet_tx(ndev); + fec_enet_tx(ndev, budget); } while ((done < budget) && fec_enet_collect_events(fep));
if (done < budget) {
From: Lin Ma linma@zju.edu.cn
[ Upstream commit 6c58c8816abb7b93b21fa3b1d0c1726402e5e568 ]
The nla_for_each_nested parsing in function mqprio_parse_nlattr() does not check the length of the nested attribute. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as an 8-byte integer and passed to priv->max_rate/min_rate.
This patch adds a check based on nla_len() next to the nla_type() check, which ensures that the length of each of these two attributes equals sizeof(u64).
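The idea of validating the payload length before reading a fixed-size value can be sketched in plain C. struct attr_model is a hypothetical stand-in for a netlink attribute, not the kernel's struct nlattr.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute: a payload pointer plus its declared length. */
struct attr_model {
	uint16_t payload_len;
	const unsigned char *payload;
};

/* Read a 64-bit value only if the payload is exactly 8 bytes long. */
static int read_u64_attr(const struct attr_model *a, uint64_t *out)
{
	assert(a != NULL && out != NULL);

	/* Reject short (or long) payloads instead of reading past them. */
	if (a->payload_len != sizeof(uint64_t))
		return -EINVAL;

	memcpy(out, a->payload, sizeof(*out));
	return 0;
}
```

A zero-length attribute is rejected with -EINVAL here and *out is left untouched, which is the behaviour the added nla_len() checks aim for.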
Fixes: 4e8b86c06269 ("mqprio: Introduce new hardware offload mode and shaper in mqprio") Reviewed-by: Victor Nogueira victor@mojatatu.com Signed-off-by: Lin Ma linma@zju.edu.cn Link: https://lore.kernel.org/r/20230725024227.426561-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_mqprio.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c index ab69ff7577fc7..793009f445c03 100644 --- a/net/sched/sch_mqprio.c +++ b/net/sched/sch_mqprio.c @@ -290,6 +290,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, "Attribute type expected to be TCA_MQPRIO_MIN_RATE64"); return -EINVAL; } + + if (nla_len(attr) != sizeof(u64)) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute TCA_MQPRIO_MIN_RATE64 expected to have 8 bytes length"); + return -EINVAL; + } + if (i >= qopt->num_tc) break; priv->min_rate[i] = nla_get_u64(attr); @@ -312,6 +319,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt, "Attribute type expected to be TCA_MQPRIO_MAX_RATE64"); return -EINVAL; } + + if (nla_len(attr) != sizeof(u64)) { + NL_SET_ERR_MSG_ATTR(extack, attr, + "Attribute TCA_MQPRIO_MAX_RATE64 expected to have 8 bytes length"); + return -EINVAL; + } + if (i >= qopt->num_tc) break; priv->max_rate[i] = nla_get_u64(attr);
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit 5c85f7065718a949902b238a6abd8fc907c5d3e0 ]
In be_lancer_xmit_workarounds(), it should go to label 'tx_drop' if an unexpected value is returned by pskb_trim().
Fixes: 93040ae5cc8d ("be2net: Fix to trim skb for padded vlan packets to workaround an ASIC Bug") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Link: https://lore.kernel.org/r/20230725032726.15002-1-ruc_gongyuanjun@163.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/emulex/benet/be_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 0defd519ba62e..7fa057d379c1a 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1138,7 +1138,8 @@ static struct sk_buff *be_lancer_xmit_workarounds(struct be_adapter *adapter, (lancer_chip(adapter) || BE3_chip(adapter) || skb_vlan_tag_present(skb)) && is_ipv4_pkt(skb)) { ip = (struct iphdr *)ip_hdr(skb); - pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len)); + if (unlikely(pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len)))) + goto tx_drop; }
/* If vlan tag is already inlined in the packet, skip HW VLAN
From: Yuanjun Gong ruc_gongyuanjun@163.com
[ Upstream commit e46e06ffc6d667a89b979701288e2264f45e6a7b ]
Go to free_skb if an unexpected result is returned by pskb_trim() in tipc_crypto_rcv_complete().
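The pattern shared by this patch and the be2net one above, checking the result of a trim operation and taking the error path on failure, can be sketched as follows. struct buf_model and trim_model() are hypothetical; they only imitate a trim that may fail and assume authsize does not exceed the buffer length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical buffer; trim_fails simulates an allocation failure. */
struct buf_model {
	size_t len;
	int trim_fails;
};

/* Shrink the buffer to new_len; may fail, like a trim that must reallocate. */
static int trim_model(struct buf_model *b, size_t new_len)
{
	if (b->trim_fails)
		return -ENOMEM;
	if (new_len < b->len)
		b->len = new_len;
	return 0;
}

static int rcv_complete_model(struct buf_model *b, size_t authsize)
{
	assert(b != NULL);

	/* Check the result instead of silently ignoring it. */
	if (trim_model(b, b->len - authsize))
		goto free_buf;

	/* ... validation of the trimmed message would follow here ... */
	return 0;

free_buf:
	/* Error path: the untrimmed buffer must not be processed further. */
	return -1;
}
```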
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Signed-off-by: Yuanjun Gong ruc_gongyuanjun@163.com Reviewed-by: Tung Nguyen tung.q.nguyen@dektech.com.au Link: https://lore.kernel.org/r/20230725064810.5820-1-ruc_gongyuanjun@163.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/crypto.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index 577fa5af33ec7..302fd749c4249 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -1960,7 +1960,8 @@ static void tipc_crypto_rcv_complete(struct net *net, struct tipc_aead *aead,

 	skb_reset_network_header(*skb);
 	skb_pull(*skb, tipc_ehdr_size(ehdr));
-	pskb_trim(*skb, (*skb)->len - aead->authsize);
+	if (pskb_trim(*skb, (*skb)->len - aead->authsize))
+		goto free_skb;

 	/* Validate TIPCv2 message */
 	if (unlikely(!tipc_msg_validate(skb))) {
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit de52e17326c3e9a719c9ead4adb03467b8fae0ef ]
If tipc_link_bc_create() fails inside tipc_node_create() for a newly allocated tipc node then we should stop its tipc crypto and free the resources allocated with a call to tipc_crypto_start().
As the node ref is initialized to one at that point, just put the ref in the tipc_link_bc_create() error case. That leads to tipc_node_free() eventually being executed, which properly cleans up the node and its crypto resources.
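Why dropping the reference is preferable to a bare kfree() can be shown with a small refcount model. All names below are hypothetical; the "crypto" member merely stands for a sub-resource that only the release path knows how to free.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts sub-resources that are still allocated, to make leaks visible. */
static int live_crypto;

struct node_model {
	int refcnt;
	void *crypto; /* sub-resource owned by the node */
};

/* Allocate a node with one reference and its sub-resource started. */
static struct node_model *node_alloc_model(void)
{
	struct node_model *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->crypto = malloc(16);
	if (!n->crypto) {
		free(n);
		return NULL;
	}
	live_crypto++;
	n->refcnt = 1;
	return n;
}

/* Release path: frees the sub-resource before the node itself. */
static void node_free_model(struct node_model *n)
{
	free(n->crypto);
	live_crypto--;
	free(n);
}

/* Drop one reference; the last put runs the full release path. */
static void node_put_model(struct node_model *n)
{
	assert(n != NULL && n->refcnt > 0);
	if (--n->refcnt == 0)
		node_free_model(n);
}
```

Calling free() directly on the node in this model would leave live_crypto at 1, which is the kind of leak the patch avoids by putting the reference instead.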
Found by Linux Verification Center (linuxtesting.org).
Fixes: cb8092d70a6f ("tipc: move bc link creation back to tipc_node_create") Suggested-by: Xin Long lucien.xin@gmail.com Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Reviewed-by: Xin Long lucien.xin@gmail.com Link: https://lore.kernel.org/r/20230725214628.25246-1-pchelkin@ispras.ru Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/node.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tipc/node.c b/net/tipc/node.c index 5e000fde80676..a9c5b6594889b 100644 --- a/net/tipc/node.c +++ b/net/tipc/node.c @@ -583,7 +583,7 @@ struct tipc_node *tipc_node_create(struct net *net, u32 addr, u8 *peer_id, n->capabilities, &n->bc_entry.inputq1, &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) { pr_warn("Broadcast rcv link creation failed, no memory\n"); - kfree(n); + tipc_node_put(n); n = NULL; goto exit; }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 95f41d87810083d8b3dedcce46a4e356cf4a9673 ]
The commit in Fixes has introduced some "enum p9_session_flags" values larger than a char. Such values are stored in "v9fs_session_info->flags" which is a char only.
Turn it into an int so that the "enum p9_session_flags" values can fit in it.
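The truncation is easy to reproduce in userspace. The flag values below are invented for the demonstration and are not the real enum p9_session_flags values; the only assumption that matters is that one of them does not fit in 8 bits.

```c
#include <assert.h>
#include <stdbool.h>

static_assert(sizeof(unsigned int) > sizeof(unsigned char),
	      "demo needs int to be wider than char");

/* Hypothetical flags; FLAG_WIDE does not fit in an unsigned char. */
enum demo_flags {
	FLAG_LOW  = 0x01,
	FLAG_HIGH = 0x80,
	FLAG_WIDE = 0x100,
};

struct narrow_session { unsigned char flags; };
struct wide_session   { unsigned int flags; };

/* Set a flag in an 8-bit field and report whether it can be read back. */
static bool narrow_keeps(unsigned int flag)
{
	struct narrow_session s = { 0 };

	s.flags |= flag; /* bits above bit 7 are silently dropped */
	return (s.flags & flag) != 0;
}

/* Same operation with an int-sized field. */
static bool wide_keeps(unsigned int flag)
{
	struct wide_session s = { 0 };

	s.flags |= flag;
	return (s.flags & flag) != 0;
}
```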
Fixes: 6deffc8924b5 ("fs/9p: Add new mount modes") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Dominique Martinet asmadeus@codewreck.org Reviewed-by: Christian Schoenebeck linux_oss@crudebyte.com Signed-off-by: Eric Van Hensbergen ericvh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/9p/v9fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 06a2514f0d882..698c43dd5dc86 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -108,7 +108,7 @@ enum p9_cache_bits {

 struct v9fs_session_info {
 	/* options */
-	unsigned char flags;
+	unsigned int flags;
 	unsigned char nodev;
 	unsigned short debug;
 	unsigned int afid;
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit d64b1ee12a168030fbb3e0aebf7bce49e9a07589 ]
This code is trying to ensure that only the flags specified in the list are allowed. The problem is that ucmd->rx_hash_fields_mask is a u64 and the flags are an enum which is treated as a u32 in this context. That means the test doesn't check whether the highest 32 bits are zero.
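The effect of the missing cast can be demonstrated in isolation. In this sketch the allowed flags are modelled as explicit uint32_t constants (an assumption made so the example behaves the same with any compiler); the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) > sizeof(uint32_t),
	      "demo needs a mask type wider than the flag type");

/* Hypothetical allowed flags, held in a 32-bit type. */
#define ALLOWED_A ((uint32_t)1 << 0)
#define ALLOWED_B ((uint32_t)1 << 31)

/*
 * Buggy form: the complement is computed in 32 bits and then
 * zero-extended, so bits 32..63 of the mask are never examined.
 */
static bool has_unsupported_narrow(uint64_t mask)
{
	return (mask & ~(ALLOWED_A | ALLOWED_B)) != 0;
}

/* Fixed form: widen first, then complement, so the upper bits are set. */
static bool has_unsupported_wide(uint64_t mask)
{
	return (mask & ~(uint64_t)(ALLOWED_A | ALLOWED_B)) != 0;
}
```

A mask with only bit 40 set passes the narrow check unnoticed but is rejected by the widened one, which corresponds to the ~(u64) cast added by the patch.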
Fixes: 4d02ebd9bbbd ("IB/mlx4: Fix RSS hash fields restrictions") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Link: https://lore.kernel.org/r/233ed975-982d-422a-b498-410f71d8a101@moroto.mounta... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx4/qp.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 456656617c33f..9d08aa99f3cb0 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -565,15 +565,15 @@ static int set_qp_rss(struct mlx4_ib_dev *dev, struct mlx4_ib_rss *rss_ctx, return (-EOPNOTSUPP); }
- if (ucmd->rx_hash_fields_mask & ~(MLX4_IB_RX_HASH_SRC_IPV4 | - MLX4_IB_RX_HASH_DST_IPV4 | - MLX4_IB_RX_HASH_SRC_IPV6 | - MLX4_IB_RX_HASH_DST_IPV6 | - MLX4_IB_RX_HASH_SRC_PORT_TCP | - MLX4_IB_RX_HASH_DST_PORT_TCP | - MLX4_IB_RX_HASH_SRC_PORT_UDP | - MLX4_IB_RX_HASH_DST_PORT_UDP | - MLX4_IB_RX_HASH_INNER)) { + if (ucmd->rx_hash_fields_mask & ~(u64)(MLX4_IB_RX_HASH_SRC_IPV4 | + MLX4_IB_RX_HASH_DST_IPV4 | + MLX4_IB_RX_HASH_SRC_IPV6 | + MLX4_IB_RX_HASH_DST_IPV6 | + MLX4_IB_RX_HASH_SRC_PORT_TCP | + MLX4_IB_RX_HASH_DST_PORT_TCP | + MLX4_IB_RX_HASH_SRC_PORT_UDP | + MLX4_IB_RX_HASH_DST_PORT_UDP | + MLX4_IB_RX_HASH_INNER)) { pr_debug("RX Hash fields_mask has unsupported mask (0x%llx)\n", ucmd->rx_hash_fields_mask); return (-EOPNOTSUPP);
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit a85c238c5ccd64f8d4c4560702c65cb25dee791c ]
The SM8550 platform employs a newer UBWC decoder, which requires slightly different programming.
Fixes: a2f33995c19d ("drm/msm: mdss: add support for SM8550") Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/546934/ Link: https://lore.kernel.org/r/20230712121145.1994830-3-dmitry.baryshkov@linaro.o... Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_mdss.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c index e8c93731aaa18..4ae6fac20e48c 100644 --- a/drivers/gpu/drm/msm/msm_mdss.c +++ b/drivers/gpu/drm/msm/msm_mdss.c @@ -189,6 +189,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss *msm_mdss) #define UBWC_2_0 0x20000000 #define UBWC_3_0 0x30000000 #define UBWC_4_0 0x40000000 +#define UBWC_4_3 0x40030000
static void msm_mdss_setup_ubwc_dec_20(struct msm_mdss *msm_mdss) { @@ -227,7 +228,10 @@ static void msm_mdss_setup_ubwc_dec_40(struct msm_mdss *msm_mdss) writel_relaxed(1, msm_mdss->mmio + UBWC_CTRL_2); writel_relaxed(0, msm_mdss->mmio + UBWC_PREDICTION_MODE); } else { - writel_relaxed(2, msm_mdss->mmio + UBWC_CTRL_2); + if (data->ubwc_dec_version == UBWC_4_3) + writel_relaxed(3, msm_mdss->mmio + UBWC_CTRL_2); + else + writel_relaxed(2, msm_mdss->mmio + UBWC_CTRL_2); writel_relaxed(1, msm_mdss->mmio + UBWC_PREDICTION_MODE); } } @@ -271,6 +275,7 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss) msm_mdss_setup_ubwc_dec_30(msm_mdss); break; case UBWC_4_0: + case UBWC_4_3: msm_mdss_setup_ubwc_dec_40(msm_mdss); break; default: @@ -561,6 +566,16 @@ static const struct msm_mdss_data sm8250_data = { .macrotile_mode = 1, };
+static const struct msm_mdss_data sm8550_data = { + .ubwc_version = UBWC_4_0, + .ubwc_dec_version = UBWC_4_3, + .ubwc_swizzle = 6, + .ubwc_static = 1, + /* TODO: highest_bank_bit = 2 for LP_DDR4 */ + .highest_bank_bit = 3, + .macrotile_mode = 1, +}; + static const struct of_device_id mdss_dt_match[] = { { .compatible = "qcom,mdss" }, { .compatible = "qcom,msm8998-mdss" }, @@ -575,7 +590,7 @@ static const struct of_device_id mdss_dt_match[] = { { .compatible = "qcom,sm8250-mdss", .data = &sm8250_data }, { .compatible = "qcom,sm8350-mdss", .data = &sm8250_data }, { .compatible = "qcom,sm8450-mdss", .data = &sm8250_data }, - { .compatible = "qcom,sm8550-mdss", .data = &sm8250_data }, + { .compatible = "qcom,sm8550-mdss", .data = &sm8550_data }, {} }; MODULE_DEVICE_TABLE(of, mdss_dt_match);
From: Jonathan Marek jonathan@marek.ca
[ Upstream commit ba7a94ea73120e3f72c4a9b7ed6fd5598d29c069 ]
Note that with this, DMA4/DMA5 are still non-functional, but at least display *something* in modetest instead of nothing or underflow.
Fixes: efcd0107727c ("drm/msm/dpu: add support for SM8550") Signed-off-by: Jonathan Marek jonathan@marek.ca Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Tested-by: Neil Armstrong neil.armstrong@linaro.org # on SM8550-QRD Patchwork: https://patchwork.freedesktop.org/patch/545548/ Link: https://lore.kernel.org/r/20230704160106.26055-1-jonathan@marek.ca Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c index f6270b7a0b140..5afbc16ec5bbb 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c @@ -51,7 +51,7 @@
static const u32 fetch_tbl[SSPP_MAX] = {CTL_INVALID_BIT, 16, 17, 18, 19, CTL_INVALID_BIT, CTL_INVALID_BIT, CTL_INVALID_BIT, CTL_INVALID_BIT, 0, - 1, 2, 3, CTL_INVALID_BIT, CTL_INVALID_BIT}; + 1, 2, 3, 4, 5};
static const struct dpu_ctl_cfg *_ctl_offset(enum dpu_ctl ctl, const struct dpu_mdss_cfg *m, @@ -209,6 +209,12 @@ static void dpu_hw_ctl_update_pending_flush_sspp(struct dpu_hw_ctl *ctx, case SSPP_DMA3: ctx->pending_flush_mask |= BIT(25); break; + case SSPP_DMA4: + ctx->pending_flush_mask |= BIT(13); + break; + case SSPP_DMA5: + ctx->pending_flush_mask |= BIT(14); + break; case SSPP_CURSOR0: ctx->pending_flush_mask |= BIT(22); break;
From: Dmitry Baryshkov dmitry.baryshkov@linaro.org
[ Upstream commit e8383f5cf1b3573ce140a80bfbfd809278ab16d6 ]
Drop the leftover of bus-client -> interconnect conversion, the enum dpu_core_perf_data_bus_id.
Fixes: cb88482e2570 ("drm/msm/dpu: clean up references of DPU custom bus scaling") Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/546048/ Link: https://lore.kernel.org/r/20230707193942.3806526-2-dmitry.baryshkov@linaro.o... Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h | 13 ------------- 1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h index e3795995e1454..29bb8ee2bc266 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h @@ -14,19 +14,6 @@
#define DPU_PERF_DEFAULT_MAX_CORE_CLK_RATE 412500000
-/** - * enum dpu_core_perf_data_bus_id - data bus identifier - * @DPU_CORE_PERF_DATA_BUS_ID_MNOC: DPU/MNOC data bus - * @DPU_CORE_PERF_DATA_BUS_ID_LLCC: MNOC/LLCC data bus - * @DPU_CORE_PERF_DATA_BUS_ID_EBI: LLCC/EBI data bus - */ -enum dpu_core_perf_data_bus_id { - DPU_CORE_PERF_DATA_BUS_ID_MNOC, - DPU_CORE_PERF_DATA_BUS_ID_LLCC, - DPU_CORE_PERF_DATA_BUS_ID_EBI, - DPU_CORE_PERF_DATA_BUS_ID_MAX, -}; - /** * struct dpu_core_perf_params - definition of performance parameters * @max_per_pipe_ib: maximum instantaneous bandwidth request
From: Marijn Suijten marijn.suijten@somainline.org
[ Upstream commit 97368254a08e2ca4766e7f84a45840230fe77fa3 ]
The regulator setup was likely copied from other SoCs by mistake. Just like on SM6125, the DSI PHY on this platform does not get power from a regulator but from the MX power domain.
Fixes: 572e9fd6d14a ("drm/msm/dsi: Add phy configuration for QCM2290") Signed-off-by: Marijn Suijten marijn.suijten@somainline.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/544536/ Link: https://lore.kernel.org/r/20230627-sm6125-dpu-v2-1-03e430a2078c@somainline.o... Signed-off-by: Abhinav Kumar quic_abhinavk@quicinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c index 3ce45b023e637..31deda1c664ad 100644 --- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c +++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c @@ -1087,8 +1087,6 @@ const struct msm_dsi_phy_cfg dsi_phy_14nm_8953_cfgs = {
const struct msm_dsi_phy_cfg dsi_phy_14nm_2290_cfgs = { .has_phy_lane = true, - .regulator_data = dsi_phy_14nm_17mA_regulators, - .num_regulators = ARRAY_SIZE(dsi_phy_14nm_17mA_regulators), .ops = { .enable = dsi_14nm_phy_enable, .disable = dsi_14nm_phy_disable,
From: Rob Clark robdclark@chromium.org
[ Upstream commit bd846ceee9c478d0397428f02696602ba5eb264a ]
The incorrect size was causing "CP | AHB bus error" when snapshotting the GPU state on a6xx gen4 (a660 family).
Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26 Signed-off-by: Rob Clark robdclark@chromium.org Reviewed-by: Akhil P Oommen quic_akhilpo@quicinc.com Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state") Patchwork: https://patchwork.freedesktop.org/patch/546763/ Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h index 790f55e245332..e788ed72eb0d3 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h @@ -206,7 +206,7 @@ static const struct a6xx_shader_block { SHADER(A6XX_SP_LB_3_DATA, 0x800), SHADER(A6XX_SP_LB_4_DATA, 0x800), SHADER(A6XX_SP_LB_5_DATA, 0x200), - SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000), + SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x800), SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280), SHADER(A6XX_SP_UAV_DATA, 0x80), SHADER(A6XX_SP_INST_TAG, 0x80),
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit 4984eb51453ff7eddee9e5ce816145be39c0ec5c ]
On code inspection, there are many instances in the driver where CEQE and AEQE fields written to by HW are read without guaranteeing that the polarity bit has been read and checked first.
Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields prior to checking the polarity bit.
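The ordering described above can be sketched in userspace C. This is only an analogy under stated assumptions: `demo_entry`, `demo_publish` and `demo_poll` are hypothetical names, a C11 acquire fence stands in for the kernel's dma_rmb(), and a second thread stands in for the hardware writing the entry.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue entry: a payload plus a valid (polarity) bit. */
struct demo_entry {
    uint64_t payload;
    _Atomic uint8_t valid;
};

/* Producer side (stands in for the hardware): payload first, valid bit last. */
static void demo_publish(struct demo_entry *e, uint64_t payload, uint8_t polarity)
{
    e->payload = payload;
    atomic_store_explicit(&e->valid, polarity, memory_order_release);
}

/* Consumer side: check the valid bit, fence, and only then read the payload. */
static bool demo_poll(struct demo_entry *e, uint8_t expected, uint64_t *out)
{
    uint8_t valid = atomic_load_explicit(&e->valid, memory_order_relaxed);

    if (valid != expected)
        return false; /* entry not written yet, like returning -ENOENT */

    /* Userspace stand-in for dma_rmb(): later loads stay after the check. */
    atomic_thread_fence(memory_order_acquire);

    *out = e->payload;
    return true;
}
```

Without the fence, the payload load could in principle be satisfied before the valid-bit load, which is the reordering the patch guards against.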
Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/ctrl.c | 9 ++++++++- drivers/infiniband/hw/irdma/puda.c | 6 ++++++ drivers/infiniband/hw/irdma/uk.c | 3 +++ 3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index d88c9184007ea..c91439f428c76 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -3363,6 +3363,9 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, if (polarity != ccq->cq_uk.polarity) return -ENOENT;
+ /* Ensure CEQE contents are read after valid bit is checked */ + dma_rmb(); + get_64bit_val(cqe, 8, &qp_ctx); cqp = (struct irdma_sc_cqp *)(unsigned long)qp_ctx; info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, temp); @@ -4009,13 +4012,17 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq, u8 polarity;
aeqe = IRDMA_GET_CURRENT_AEQ_ELEM(aeq); - get_64bit_val(aeqe, 0, &compl_ctx); get_64bit_val(aeqe, 8, &temp); polarity = (u8)FIELD_GET(IRDMA_AEQE_VALID, temp);
if (aeq->polarity != polarity) return -ENOENT;
+ /* Ensure AEQE contents are read after valid bit is checked */ + dma_rmb(); + + get_64bit_val(aeqe, 0, &compl_ctx); + print_hex_dump_debug("WQE: AEQ_ENTRY WQE", DUMP_PREFIX_OFFSET, 16, 8, aeqe, 16, false);
diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c index 4ec9639f1bdbf..562531712ea44 100644 --- a/drivers/infiniband/hw/irdma/puda.c +++ b/drivers/infiniband/hw/irdma/puda.c @@ -230,6 +230,9 @@ static int irdma_puda_poll_info(struct irdma_sc_cq *cq, if (valid_bit != cq_uk->polarity) return -ENOENT;
+ /* Ensure CQE contents are read after valid bit is checked */ + dma_rmb(); + if (cq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2) ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3);
@@ -243,6 +246,9 @@ static int irdma_puda_poll_info(struct irdma_sc_cq *cq, if (polarity != cq_uk->polarity) return -ENOENT;
+ /* Ensure ext CQE contents are read after ext valid bit is checked */ + dma_rmb(); + IRDMA_RING_MOVE_HEAD_NOCHECK(cq_uk->cq_ring); if (!IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring)) cq_uk->polarity = !cq_uk->polarity; diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index dd428d915c175..ea2c07751245a 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ b/drivers/infiniband/hw/irdma/uk.c @@ -1527,6 +1527,9 @@ void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq) if (polarity != temp) break;
+ /* Ensure CQE contents are read after valid bit is checked */ + dma_rmb(); + get_64bit_val(cqe, 8, &comp_ctx); if ((void *)(unsigned long)comp_ctx == q) set_64bit_val(cqe, 8, 0);
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit f2c3037811381f9149243828c7eb9a1631df9f9c ]
CQP completion statistics are read locklessly in irdma_wait_event and irdma_check_cqp_progress while they can be updated in the completion thread irdma_sc_ccq_get_cqe_info on another CPU, as KCSAN reports.
Make completion statistics an atomic variable to reflect coherent updates to it. This also avoids load/store tearing bugs that compiler optimizations could otherwise introduce.
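A minimal userspace sketch of the same idea, assuming C11 `<stdatomic.h>` as a stand-in for the kernel's atomic64_t. The `demo_*` names are hypothetical; the progress check follows the shape of irdma_check_cqp_progress after this patch.

```c
#include <stdatomic.h>
#include <stdint.h>

struct demo_cqp {
    uint64_t requested_ops;             /* only touched by the submitter */
    atomic_uint_fast64_t completed_ops; /* bumped by the completion path */
};

struct demo_timeout {
    uint64_t compl_cmds; /* completed count seen at the last check */
    uint32_t count;      /* consecutive checks without progress */
};

/* Completion path: one atomic increment, safe against concurrent readers. */
static void demo_complete_one(struct demo_cqp *cqp)
{
    atomic_fetch_add(&cqp->completed_ops, 1);
}

/* Waiter path: take a single snapshot and use it for the whole decision. */
static void demo_check_progress(struct demo_timeout *t, struct demo_cqp *cqp)
{
    uint64_t completed = atomic_load(&cqp->completed_ops);

    if (t->compl_cmds != completed) {
        t->compl_cmds = completed; /* progress was made, restart the count */
        t->count = 0;
    } else if (t->compl_cmds != cqp->requested_ops) {
        t->count++;                /* work is outstanding but nothing moved */
    }
}
```

Reading the counter once into a local also keeps the comparison and the assignment consistent with each other, which two separate plain reads would not guarantee.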
[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]
[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4: [77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma] [77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma] [77346.171835] cqp_compl_worker+0x1b/0x20 [irdma] [77346.172009] process_one_work+0x4d1/0xa40 [77346.172024] worker_thread+0x319/0x700 [77346.172037] kthread+0x180/0x1b0 [77346.172054] ret_from_fork+0x22/0x30
[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2: [77346.172234] irdma_handle_cqp_op+0xf4/0x4b0 [irdma] [77346.172413] irdma_cqp_aeq_cmd+0x75/0xa0 [irdma] [77346.172592] irdma_create_aeq+0x390/0x45a [irdma] [77346.172769] irdma_rt_init_hw.cold+0x212/0x85d [irdma] [77346.172944] irdma_probe+0x54f/0x620 [irdma] [77346.173122] auxiliary_bus_probe+0x66/0xa0 [77346.173137] really_probe+0x140/0x540 [77346.173154] __driver_probe_device+0xc7/0x220 [77346.173173] driver_probe_device+0x5f/0x140 [77346.173190] __driver_attach+0xf0/0x2c0 [77346.173208] bus_for_each_dev+0xa8/0xf0 [77346.173225] driver_attach+0x29/0x30 [77346.173240] bus_add_driver+0x29c/0x2f0 [77346.173255] driver_register+0x10f/0x1a0 [77346.173272] __auxiliary_driver_register+0xbc/0x140 [77346.173287] irdma_init_module+0x55/0x1000 [irdma] [77346.173460] do_one_initcall+0x7d/0x410 [77346.173475] do_init_module+0x81/0x2c0 [77346.173491] load_module+0x1232/0x12c0 [77346.173506] __do_sys_finit_module+0x101/0x180 [77346.173522] __x64_sys_finit_module+0x3c/0x50 [77346.173538] do_syscall_64+0x39/0x90 [77346.173553] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/ctrl.c | 22 +++++++------- drivers/infiniband/hw/irdma/defs.h | 46 ++++++++++++++--------------- drivers/infiniband/hw/irdma/type.h | 2 ++ drivers/infiniband/hw/irdma/utils.c | 2 +- 4 files changed, 36 insertions(+), 36 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c index c91439f428c76..45e3344daa048 100644 --- a/drivers/infiniband/hw/irdma/ctrl.c +++ b/drivers/infiniband/hw/irdma/ctrl.c @@ -2712,13 +2712,13 @@ static int irdma_sc_cq_modify(struct irdma_sc_cq *cq, */ void irdma_check_cqp_progress(struct irdma_cqp_timeout *timeout, struct irdma_sc_dev *dev) { - if (timeout->compl_cqp_cmds != dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]) { - timeout->compl_cqp_cmds = dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]; + u64 completed_ops = atomic64_read(&dev->cqp->completed_ops); + + if (timeout->compl_cqp_cmds != completed_ops) { + timeout->compl_cqp_cmds = completed_ops; timeout->count = 0; - } else { - if (dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] != - timeout->compl_cqp_cmds) - timeout->count++; + } else if (timeout->compl_cqp_cmds != dev->cqp->requested_ops) { + timeout->count++; } }
@@ -2761,7 +2761,7 @@ static int irdma_cqp_poll_registers(struct irdma_sc_cqp *cqp, u32 tail, if (newtail != tail) { /* SUCCESS */ IRDMA_RING_MOVE_TAIL(cqp->sq_ring); - cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++; + atomic64_inc(&cqp->completed_ops); return 0; } udelay(cqp->dev->hw_attrs.max_sleep_count); @@ -3121,8 +3121,8 @@ int irdma_sc_cqp_init(struct irdma_sc_cqp *cqp, info->dev->cqp = cqp;
IRDMA_RING_INIT(cqp->sq_ring, cqp->sq_size); - cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] = 0; - cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS] = 0; + cqp->requested_ops = 0; + atomic64_set(&cqp->completed_ops, 0); /* for the cqp commands backlog. */ INIT_LIST_HEAD(&cqp->dev->cqp_cmd_head);
@@ -3274,7 +3274,7 @@ __le64 *irdma_sc_cqp_get_next_send_wqe_idx(struct irdma_sc_cqp *cqp, u64 scratch if (ret_code) return NULL;
- cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS]++; + cqp->requested_ops++; if (!*wqe_idx) cqp->polarity = !cqp->polarity; wqe = cqp->sq_base[*wqe_idx].elem; @@ -3400,7 +3400,7 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq, dma_wmb(); /* make sure shadow area is updated before moving tail */
IRDMA_RING_MOVE_TAIL(cqp->sq_ring); - ccq->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++; + atomic64_inc(&cqp->completed_ops);
return ret_code; } diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h index 6014b9d06a9ba..d06e45d2c23fd 100644 --- a/drivers/infiniband/hw/irdma/defs.h +++ b/drivers/infiniband/hw/irdma/defs.h @@ -191,32 +191,30 @@ enum irdma_cqp_op_type { IRDMA_OP_MANAGE_VF_PBLE_BP = 25, IRDMA_OP_QUERY_FPM_VAL = 26, IRDMA_OP_COMMIT_FPM_VAL = 27, - IRDMA_OP_REQ_CMDS = 28, - IRDMA_OP_CMPL_CMDS = 29, - IRDMA_OP_AH_CREATE = 30, - IRDMA_OP_AH_MODIFY = 31, - IRDMA_OP_AH_DESTROY = 32, - IRDMA_OP_MC_CREATE = 33, - IRDMA_OP_MC_DESTROY = 34, - IRDMA_OP_MC_MODIFY = 35, - IRDMA_OP_STATS_ALLOCATE = 36, - IRDMA_OP_STATS_FREE = 37, - IRDMA_OP_STATS_GATHER = 38, - IRDMA_OP_WS_ADD_NODE = 39, - IRDMA_OP_WS_MODIFY_NODE = 40, - IRDMA_OP_WS_DELETE_NODE = 41, - IRDMA_OP_WS_FAILOVER_START = 42, - IRDMA_OP_WS_FAILOVER_COMPLETE = 43, - IRDMA_OP_SET_UP_MAP = 44, - IRDMA_OP_GEN_AE = 45, - IRDMA_OP_QUERY_RDMA_FEATURES = 46, - IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 47, - IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 48, - IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 49, - IRDMA_OP_CQ_MODIFY = 50, + IRDMA_OP_AH_CREATE = 28, + IRDMA_OP_AH_MODIFY = 29, + IRDMA_OP_AH_DESTROY = 30, + IRDMA_OP_MC_CREATE = 31, + IRDMA_OP_MC_DESTROY = 32, + IRDMA_OP_MC_MODIFY = 33, + IRDMA_OP_STATS_ALLOCATE = 34, + IRDMA_OP_STATS_FREE = 35, + IRDMA_OP_STATS_GATHER = 36, + IRDMA_OP_WS_ADD_NODE = 37, + IRDMA_OP_WS_MODIFY_NODE = 38, + IRDMA_OP_WS_DELETE_NODE = 39, + IRDMA_OP_WS_FAILOVER_START = 40, + IRDMA_OP_WS_FAILOVER_COMPLETE = 41, + IRDMA_OP_SET_UP_MAP = 42, + IRDMA_OP_GEN_AE = 43, + IRDMA_OP_QUERY_RDMA_FEATURES = 44, + IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 45, + IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 46, + IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 47, + IRDMA_OP_CQ_MODIFY = 48,
/* Must be last entry*/ - IRDMA_MAX_CQP_OPS = 51, + IRDMA_MAX_CQP_OPS = 49, };
/* CQP SQ WQES */ diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h index 5ee68604e59fc..a20709577ab0a 100644 --- a/drivers/infiniband/hw/irdma/type.h +++ b/drivers/infiniband/hw/irdma/type.h @@ -365,6 +365,8 @@ struct irdma_sc_cqp { struct irdma_dcqcn_cc_params dcqcn_params; __le64 *host_ctx; u64 *scratch_array; + u64 requested_ops; + atomic64_t completed_ops; u32 cqp_id; u32 sq_size; u32 hw_sq_size; diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 71e1c5d347092..775a79946f7df 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -567,7 +567,7 @@ static int irdma_wait_event(struct irdma_pci_f *rf, bool cqp_error = false; int err_code = 0;
- cqp_timeout.compl_cqp_cmds = rf->sc_dev.cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]; + cqp_timeout.compl_cqp_cmds = atomic64_read(&rf->sc_dev.cqp->completed_ops); do { irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq); if (wait_event_timeout(cqp_request->waitq,
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit f0842bb3d38863777e3454da5653d80b5fde6321 ]
KCSAN detects a data race on cqp_request->request_done memory location which is accessed locklessly in irdma_handle_cqp_op while being updated in irdma_cqp_ce_handler.
Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid compiler optimizations such as load fusing and to silence the KCSAN warning.
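As a rough illustration, the annotation can be approximated in userspace with volatile accesses. This is a simplified sketch, not the kernel's actual macros: `DEMO_READ_ONCE`, `DEMO_WRITE_ONCE` and `demo_request` are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
#include <stdbool.h>

/* Simplified stand-ins: force exactly one access through a volatile lvalue. */
#define DEMO_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct demo_request {
    bool request_done; /* full bool: a bitfield has no address to cast */
    bool waiting:1;
    bool dynamic:1;
};

/* Completion side: publish the flag with a single store. */
static void demo_mark_done(struct demo_request *r)
{
    DEMO_WRITE_ONCE(r->request_done, true);
}

/* Waiter side: reload the flag on every call instead of caching it. */
static bool demo_is_done(struct demo_request *r)
{
    return DEMO_READ_ONCE(r->request_done);
}
```

The macros take the address of their argument, which is why a flag accessed this way has to be a standalone member rather than a bitfield, as in the main.h hunk of this patch.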
[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]
[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5: [222808.417610] irdma_cqp_ce_handler+0x21e/0x270 [irdma] [222808.417725] cqp_compl_worker+0x1b/0x20 [irdma] [222808.417827] process_one_work+0x4d1/0xa40 [222808.417835] worker_thread+0x319/0x700 [222808.417842] kthread+0x180/0x1b0 [222808.417852] ret_from_fork+0x22/0x30
[222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1: [222808.417995] irdma_wait_event+0x1e2/0x2c0 [irdma] [222808.418099] irdma_handle_cqp_op+0xae/0x170 [irdma] [222808.418202] irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma] [222808.418308] irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma] [222808.418411] irdma_rt_deinit_hw+0x179/0x1d0 [irdma] [222808.418514] irdma_ib_dealloc_device+0x11/0x40 [irdma] [222808.418618] ib_dealloc_device+0x2a/0x120 [ib_core] [222808.418823] __ib_unregister_device+0xde/0x100 [ib_core] [222808.418981] ib_unregister_device+0x22/0x40 [ib_core] [222808.419142] irdma_ib_unregister_device+0x70/0x90 [irdma] [222808.419248] i40iw_close+0x6f/0xc0 [irdma] [222808.419352] i40e_client_device_unregister+0x14a/0x180 [i40e] [222808.419450] i40iw_remove+0x21/0x30 [irdma] [222808.419554] auxiliary_bus_remove+0x31/0x50 [222808.419563] device_remove+0x69/0xb0 [222808.419572] device_release_driver_internal+0x293/0x360 [222808.419582] driver_detach+0x7c/0xf0 [222808.419592] bus_remove_driver+0x8c/0x150 [222808.419600] driver_unregister+0x45/0x70 [222808.419610] auxiliary_driver_unregister+0x16/0x30 [222808.419618] irdma_exit_module+0x18/0x1e [irdma] [222808.419733] __do_sys_delete_module.constprop.0+0x1e2/0x310 [222808.419745] __x64_sys_delete_module+0x1b/0x30 [222808.419755] do_syscall_64+0x39/0x90 [222808.419763] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[222808.419829] value changed: 0x01 -> 0x03
Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 2 +- drivers/infiniband/hw/irdma/main.h | 2 +- drivers/infiniband/hw/irdma/utils.c | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 795f7fd4f2574..1cfc03da89e7a 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -2075,7 +2075,7 @@ void irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq) cqp_request->compl_info.error = info.error;
if (cqp_request->waiting) { - cqp_request->request_done = true; + WRITE_ONCE(cqp_request->request_done, true); wake_up(&cqp_request->waitq); irdma_put_cqp_request(&rf->cqp, cqp_request); } else { diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h index def6dd58dcd48..2323962cdeacb 100644 --- a/drivers/infiniband/hw/irdma/main.h +++ b/drivers/infiniband/hw/irdma/main.h @@ -161,8 +161,8 @@ struct irdma_cqp_request { void (*callback_fcn)(struct irdma_cqp_request *cqp_request); void *param; struct irdma_cqp_compl_info compl_info; + bool request_done; /* READ/WRITE_ONCE macros operate on it */ bool waiting:1; - bool request_done:1; bool dynamic:1; };
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 775a79946f7df..eb083f70b09ff 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -481,7 +481,7 @@ void irdma_free_cqp_request(struct irdma_cqp *cqp, if (cqp_request->dynamic) { kfree(cqp_request); } else { - cqp_request->request_done = false; + WRITE_ONCE(cqp_request->request_done, false); cqp_request->callback_fcn = NULL; cqp_request->waiting = false;
@@ -515,7 +515,7 @@ irdma_free_pending_cqp_request(struct irdma_cqp *cqp, { if (cqp_request->waiting) { cqp_request->compl_info.error = true; - cqp_request->request_done = true; + WRITE_ONCE(cqp_request->request_done, true); wake_up(&cqp_request->waitq); } wait_event_timeout(cqp->remove_wq, @@ -571,7 +571,7 @@ static int irdma_wait_event(struct irdma_pci_f *rf, do { irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq); if (wait_event_timeout(cqp_request->waitq, - cqp_request->request_done, + READ_ONCE(cqp_request->request_done), msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS))) break;
From: Shiraz Saleem shiraz.saleem@intel.com
[ Upstream commit 0e15863015d97c1ee2cc29d599abcc7fa2dc3e95 ]
8d037973d48c ("RDMA/core: Refactor rdma_bind_addr") introduces a regression on irdma devices in certain tests which use rdma CM, such as cmtime.
No connections can be established, and the MAD QP experiences a fatal error on the active side.
The cma destination address is not updated with the dst_addr when ULP on active side calls rdma_bind_addr followed by rdma_resolve_addr. The id_priv state is 'bound' in resolve_prepare_src and update is skipped.
This leaves the dgid passed into the irdma driver to create an Address Handle (AH) for the MAD QP at 0. The create AH descriptor as well as the ARP cache entry are invalid, and HW throws an asynchronous event as a result.
[ 1207.656888] resolve_prepare_src caller: ucma_resolve_addr+0xff/0x170 [rdma_ucm] daddr=200.0.4.28 id_priv->state=7 [....] [ 1207.680362] ice 0000:07:00.1 rocep7s0f1: caller: irdma_create_ah+0x3e/0x70 [irdma] ah_id=0 arp_idx=0 dest_ip=0.0.0.0 destMAC=00:00:64:ca:b7:52 ipvalid=1 raw=0000:0000:0000:0000:0000:ffff:0000:0000 [ 1207.682077] ice 0000:07:00.1 rocep7s0f1: abnormal ae_id = 0x401 bool qp=1 qp_id = 1, ae_src=5 [ 1207.691657] infiniband rocep7s0f1: Fatal error (1) on MAD QP (1)
Fix this by updating the CMA destination address when the ULP calls a resolve address with the CM state already bound.
Fixes: 8d037973d48c ("RDMA/core: Refactor rdma_bind_addr") Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230712234133.1343-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 6b3f4384e46ac..a60e587aea817 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -4062,6 +4062,8 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv, RDMA_CM_ADDR_QUERY))) return -EINVAL;
+ } else { + memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr)); }
if (cma_family(id_priv) != dst_addr->sa_family) {
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit dc52aadbc1849cbe3fcf6bc54d35f6baa396e0a1 ]
Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") introduced a new struct mthca_sqp which doesn't contain struct mthca_qp any longer. Placing a pointer of this new struct into qptable leads to crashes, because mthca_poll_one() expects a qp pointer. Fix this by putting the correct pointer into qptable.
Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mthca/mthca_qp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c index 69bba0ef4a5df..53f43649f7d08 100644 --- a/drivers/infiniband/hw/mthca/mthca_qp.c +++ b/drivers/infiniband/hw/mthca/mthca_qp.c @@ -1393,7 +1393,7 @@ int mthca_alloc_sqp(struct mthca_dev *dev, if (mthca_array_get(&dev->qp_table.qp, mqpn)) err = -EBUSY; else - mthca_array_set(&dev->qp_table.qp, mqpn, qp->sqp); + mthca_array_set(&dev->qp_table.qp, mqpn, qp); spin_unlock_irq(&dev->qp_table.lock);
if (err)
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit b5bbc6551297447d3cca55cf907079e206e9cd82 ]
HW may generate completions that indicate the QP is destroyed. The driver should not schedule any more completion handlers for this QP after the QP is destroyed. Since CQs are active during the QP destroy, the driver may still schedule completion handlers. This can cause a race where destroy_cq and poll_cq run simultaneously.
Snippet of the kernel panic seen while loading and unloading the bnxt_re driver in a loop. This indicates a poll after the CQ is freed.
[77786.481636] Call Trace: [77786.481640] <TASK> [77786.481644] bnxt_re_poll_cq+0x14a/0x620 [bnxt_re] [77786.481658] ? kvm_clock_read+0x14/0x30 [77786.481693] __ib_process_cq+0x57/0x190 [ib_core] [77786.481728] ib_cq_poll_work+0x26/0x80 [ib_core] [77786.481761] process_one_work+0x1e5/0x3f0 [77786.481768] worker_thread+0x50/0x3a0 [77786.481785] ? __pfx_worker_thread+0x10/0x10 [77786.481790] kthread+0xe2/0x110 [77786.481794] ? __pfx_kthread+0x10/0x10 [77786.481797] ret_from_fork+0x2c/0x50
To avoid this, complete all completion handlers before returning from destroy QP. If free_cq is called soon after destroy_qp, the IB stack will cancel the CQ work before invoking the destroy_cq verb, and this will prevent the race described above.
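The drain step can be sketched in isolation as follows. This is a toy model under stated assumptions: `demo_nq` is a hypothetical queue where a pending count stands in for real notification entries, and the save/raise/restore of the budget follows the pattern of bnxt_re_synchronize_nq in this patch.

```c
/* Hypothetical notification queue: a count stands in for real entries. */
struct demo_nq {
    int pending;      /* entries waiting to be handled */
    int budget;       /* max entries handled per service pass */
    int max_elements; /* total capacity of the queue */
};

/* One service pass: handle at most nq->budget entries, return how many. */
static int demo_service_nq(struct demo_nq *nq)
{
    int handled = nq->pending < nq->budget ? nq->pending : nq->budget;

    nq->pending -= handled;
    return handled;
}

/* Drain everything by temporarily raising the budget to the capacity. */
static void demo_synchronize_nq(struct demo_nq *nq)
{
    int budget = nq->budget;

    nq->budget = nq->max_elements;
    demo_service_nq(nq);
    nq->budget = budget; /* restore the normal per-pass limit */
}
```

Since a queue cannot hold more than `max_elements` entries, one pass with the raised budget is enough to empty it in this model.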
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1689322969-25402-2-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 12 ++++++++++++ drivers/infiniband/hw/bnxt_re/qplib_fp.c | 18 ++++++++++++++++++ drivers/infiniband/hw/bnxt_re/qplib_fp.h | 1 + 3 files changed, 31 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index 952811c40c54b..ebe6852c40e8c 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -797,7 +797,10 @@ static int bnxt_re_destroy_gsi_sqp(struct bnxt_re_qp *qp) int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata) { struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp); + struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp; struct bnxt_re_dev *rdev = qp->rdev; + struct bnxt_qplib_nq *scq_nq = NULL; + struct bnxt_qplib_nq *rcq_nq = NULL; unsigned int flags; int rc;
@@ -831,6 +834,15 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata) ib_umem_release(qp->rumem); ib_umem_release(qp->sumem);
+ /* Flush all the entries of notification queue associated with + * given qp. + */ + scq_nq = qplib_qp->scq->nq; + rcq_nq = qplib_qp->rcq->nq; + bnxt_re_synchronize_nq(scq_nq); + if (scq_nq != rcq_nq) + bnxt_re_synchronize_nq(rcq_nq); + return 0; }
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index 55f092c2c8a88..e589b04f953c5 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -381,6 +381,24 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t) spin_unlock_bh(&hwq->lock); }
+/* bnxt_re_synchronize_nq - self polling notification queue. + * @nq - notification queue pointer + * + * This function will start polling entries of a given notification queue + * for all pending entries. + * This function is useful to synchronize notification entries while resources + * are going away. + */ + +void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq) +{ + int budget = nq->budget; + + nq->budget = nq->hwq.max_elements; + bnxt_qplib_service_nq(&nq->nq_tasklet); + nq->budget = budget; +} + static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance) { struct bnxt_qplib_nq *nq = dev_instance; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h index a42820821c473..404b851091ca2 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h @@ -553,6 +553,7 @@ int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe, int num_cqes); void bnxt_qplib_flush_cqn_wq(struct bnxt_qplib_qp *qp); +void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq);
static inline void *bnxt_qplib_get_swqe(struct bnxt_qplib_q *que, u32 *swq_idx) {
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 8cf1d12ad56beb73d2439ccf334b7148e71de58e ]
Use a jiffies-based timewait instead of counting iterations for commands that block for FW response.
Also add a poll routine for control path commands. This is for polling completion if the waiting commands timeout. This avoids cases where the driver misses completion interrupts.
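The difference between an iteration count and a time-based bound can be sketched in portable C. This is an illustration only: standard `clock()` (processor time) stands in for jiffies, which is a reasonable proxy for a busy-polling loop, and all `demo_*` names and the -1 return value are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical completion check supplied by the caller. */
typedef bool (*demo_done_fn)(void *ctx);

/* Busy-poll until done(ctx) is true or timeout_ms of CPU time has elapsed. */
static int demo_block_for_resp(demo_done_fn done, void *ctx, unsigned timeout_ms)
{
    clock_t start = clock();
    clock_t limit = (clock_t)((double)timeout_ms * CLOCKS_PER_SEC / 1000.0);

    do {
        if (done(ctx))
            return 0;
    } while (clock() - start < limit);

    return -1; /* stands in for -ETIMEDOUT */
}

/* Example check: reports completion once the counter reaches zero. */
static bool demo_countdown_done(void *ctx)
{
    int *remaining = ctx;

    if (*remaining <= 0)
        return true;
    (*remaining)--;
    return false;
}

/* Example check that never completes, to exercise the timeout path. */
static bool demo_never_done(void *ctx)
{
    (void)ctx;
    return false;
}
```

With a time-based bound the maximum wait no longer depends on how long each loop iteration happens to take, which an iteration counter cannot guarantee.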
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1686308514-11996-6-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Stable-dep-of: 29900bf351e1 ("RDMA/bnxt_re: Fix hang during driver unload") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 65 +++++++++++++++++----- 1 file changed, 51 insertions(+), 14 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index c11b8e708844c..918e588588885 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -53,37 +53,74 @@
static void bnxt_qplib_service_creq(struct tasklet_struct *t);
-/* Hardware communication channel */ +/** + * __wait_for_resp - Don't hold the cpu context and wait for response + * @rcfw - rcfw channel instance of rdev + * @cookie - cookie to track the command + * + * Wait for command completion in sleepable context. + * + * Returns: + * 0 if command is completed by firmware. + * Non zero error code for rest of the case. + */ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) { struct bnxt_qplib_cmdq_ctx *cmdq; u16 cbit; - int rc; + int ret;
cmdq = &rcfw->cmdq; cbit = cookie % rcfw->cmdq_depth; - rc = wait_event_timeout(cmdq->waitq, - !test_bit(cbit, cmdq->cmdq_bitmap), - msecs_to_jiffies(RCFW_CMD_WAIT_TIME_MS)); - return rc ? 0 : -ETIMEDOUT; + + do { + /* Non zero means command completed */ + ret = wait_event_timeout(cmdq->waitq, + !test_bit(cbit, cmdq->cmdq_bitmap), + msecs_to_jiffies(10000)); + + if (!test_bit(cbit, cmdq->cmdq_bitmap)) + return 0; + + bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet); + + if (!test_bit(cbit, cmdq->cmdq_bitmap)) + return 0; + + } while (true); };
+/** + * __block_for_resp - hold the cpu context and wait for response + * @rcfw - rcfw channel instance of rdev + * @cookie - cookie to track the command + * + * This function will hold the cpu (non-sleepable context) and + * wait for command completion. Maximum holding interval is 8 second. + * + * Returns: + * -ETIMEOUT if command is not completed in specific time interval. + * 0 if command is completed by firmware. + */ static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) { - u32 count = RCFW_BLOCKED_CMD_WAIT_COUNT; - struct bnxt_qplib_cmdq_ctx *cmdq; + struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq; + unsigned long issue_time = 0; u16 cbit;
- cmdq = &rcfw->cmdq; cbit = cookie % rcfw->cmdq_depth; - if (!test_bit(cbit, cmdq->cmdq_bitmap)) - goto done; + issue_time = jiffies; + do { udelay(1); + bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet); - } while (test_bit(cbit, cmdq->cmdq_bitmap) && --count); -done: - return count ? 0 : -ETIMEDOUT; + if (!test_bit(cbit, cmdq->cmdq_bitmap)) + return 0; + + } while (time_before(jiffies, issue_time + (8 * HZ))); + + return -ETIMEDOUT; };
static int __send_message(struct bnxt_qplib_rcfw *rcfw,
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 3022cc15119733cebaef05feddb5d87b9e401c0e ]
Add a check to avoid waiting if driver already detects a FW timeout. Return success for resource destroy in case the device is detached. Add helper function to map timeout error code to success.
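The mapping idea can be shown with a small standalone sketch. The opcode names here are hypothetical placeholders rather than the driver's CMDQ_BASE_OPCODE_* values; only the shape (teardown opcodes map to success, everything else keeps the timeout error) is taken from the description above.

```c
#include <errno.h>

/* Hypothetical opcodes, for illustration only. */
enum demo_opcode {
    DEMO_OP_CREATE_QP,
    DEMO_OP_QUERY,
    DEMO_OP_DESTROY_QP,
    DEMO_OP_DESTROY_CQ,
};

/*
 * Once the device is known to be detached, teardown commands have nothing
 * left to undo, so report success; anything else still fails with a timeout.
 */
static int demo_map_rc(enum demo_opcode opcode)
{
    switch (opcode) {
    case DEMO_OP_DESTROY_QP:
    case DEMO_OP_DESTROY_CQ:
        return 0;
    default:
        return -ETIMEDOUT;
    }
}
```

Returning success for teardown lets callers continue releasing their own state instead of aborting partway through an unload.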
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com Link: https://lore.kernel.org/r/1686308514-11996-7-git-send-email-selvin.xavier@br... Signed-off-by: Leon Romanovsky leon@kernel.org Stable-dep-of: 29900bf351e1 ("RDMA/bnxt_re: Fix hang during driver unload") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 52 ++++++++++++++++++++-- 1 file changed, 48 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 918e588588885..bfa0f29c7abf4 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -53,10 +53,47 @@
static void bnxt_qplib_service_creq(struct tasklet_struct *t);
+/** + * bnxt_qplib_map_rc - map return type based on opcode + * @opcode - roce slow path opcode + * + * In some cases like firmware halt is detected, the driver is supposed to + * remap the error code of the timed out command. + * + * It is not safe to assume hardware is really inactive so certain opcodes + * like destroy qp etc are not safe to be returned success, but this function + * will be called when FW already reports a timeout. This would be possible + * only when FW crashes and resets. This will clear all the HW resources. + * + * Returns: + * 0 to communicate success to caller. + * Non zero error code to communicate failure to caller. + */ +static int bnxt_qplib_map_rc(u8 opcode) +{ + switch (opcode) { + case CMDQ_BASE_OPCODE_DESTROY_QP: + case CMDQ_BASE_OPCODE_DESTROY_SRQ: + case CMDQ_BASE_OPCODE_DESTROY_CQ: + case CMDQ_BASE_OPCODE_DEALLOCATE_KEY: + case CMDQ_BASE_OPCODE_DEREGISTER_MR: + case CMDQ_BASE_OPCODE_DELETE_GID: + case CMDQ_BASE_OPCODE_DESTROY_QP1: + case CMDQ_BASE_OPCODE_DESTROY_AH: + case CMDQ_BASE_OPCODE_DEINITIALIZE_FW: + case CMDQ_BASE_OPCODE_MODIFY_ROCE_CC: + case CMDQ_BASE_OPCODE_SET_LINK_AGGR_MODE: + return 0; + default: + return -ETIMEDOUT; + } +} + /** * __wait_for_resp - Don't hold the cpu context and wait for response * @rcfw - rcfw channel instance of rdev * @cookie - cookie to track the command + * @opcode - rcfw submitted for given opcode * * Wait for command completion in sleepable context. * @@ -64,7 +101,7 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t); * 0 if command is completed by firmware. * Non zero error code for rest of the case. */ -static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) +static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode) { struct bnxt_qplib_cmdq_ctx *cmdq; u16 cbit; @@ -74,6 +111,9 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) cbit = cookie % rcfw->cmdq_depth;
do { + if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags)) + return bnxt_qplib_map_rc(opcode); + /* Non zero means command completed */ ret = wait_event_timeout(cmdq->waitq, !test_bit(cbit, cmdq->cmdq_bitmap), @@ -94,6 +134,7 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) * __block_for_resp - hold the cpu context and wait for response * @rcfw - rcfw channel instance of rdev * @cookie - cookie to track the command + * @opcode - rcfw submitted for given opcode * * This function will hold the cpu (non-sleepable context) and * wait for command completion. Maximum holding interval is 8 second. @@ -102,7 +143,7 @@ static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) * -ETIMEOUT if command is not completed in specific time interval. * 0 if command is completed by firmware. */ -static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) +static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode) { struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq; unsigned long issue_time = 0; @@ -112,6 +153,9 @@ static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie) issue_time = jiffies;
do { + if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags)) + return bnxt_qplib_map_rc(opcode); + udelay(1);
bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet); @@ -267,9 +311,9 @@ int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, } while (retry_cnt--);
if (msg->block) - rc = __block_for_resp(rcfw, cookie); + rc = __block_for_resp(rcfw, cookie, opcode); else - rc = __wait_for_resp(rcfw, cookie); + rc = __wait_for_resp(rcfw, cookie, opcode); if (rc) { /* timed out */ dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x timedout (%d)msec\n",
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 65288a22ddd81422a2a2a10c15df976a5332e41b ]
When fast path IO runs in parallel with slow path resource creation or destruction, slow path command completion can show high latency.
Introduce a shadow queue depth to limit the number of outstanding requests to the FW. The driver will not allow more than #RCFW_CMD_NON_BLOCKING_SHADOW_QD non-blocking commands to be outstanding to the firmware at a time.
Shadow queue depth is a soft limit only for non-blocking commands. Blocking commands will be posted to the firmware as long as there is a free slot.
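The soft limit can be sketched in userspace C. This is a minimal sketch, not the driver code: it assumes POSIX semaphores in place of the kernel's struct semaphore, and struct channel, channel_init(), post_command() and send_message() are hypothetical names. Only the limit of 64 is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdbool.h>

/* Same value as RCFW_CMD_NON_BLOCKING_SHADOW_QD in the patch. */
#define SHADOW_QD 64

/* Hypothetical stand-in for the rcfw channel. */
struct channel {
	sem_t inflight;	/* counts free non-blocking slots */
	int posted;	/* commands handed to the "firmware" */
};

static void channel_init(struct channel *ch)
{
	int rc = sem_init(&ch->inflight, 0, SHADOW_QD);

	assert(rc == 0);
	(void)rc;
	ch->posted = 0;
}

/* Hypothetical: post one command and wait for its completion. */
static int post_command(struct channel *ch)
{
	ch->posted++;
	return 0;
}

static int send_message(struct channel *ch, bool block)
{
	int ret;

	/* Blocking commands bypass the soft limit. */
	if (block)
		return post_command(ch);

	/* Sleep until one of the SHADOW_QD slots is free. */
	while (sem_wait(&ch->inflight) == -1 && errno == EINTR)
		;
	ret = post_command(ch);
	sem_post(&ch->inflight);

	return ret;
}
```

Holding the slot until the command completes, rather than only until it is posted, is what bounds the number of commands the firmware sees at once.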
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com
Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com
Link: https://lore.kernel.org/r/1686308514-11996-8-git-send-email-selvin.xavier@br...
Signed-off-by: Leon Romanovsky leon@kernel.org
Stable-dep-of: 29900bf351e1 ("RDMA/bnxt_re: Fix hang during driver unload")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 60 +++++++++++++++++++++-
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 3 ++
 2 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index bfa0f29c7abf4..42484a1149c7c 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -281,8 +281,21 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, return 0; }
-int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, - struct bnxt_qplib_cmdqmsg *msg) +/** + * __bnxt_qplib_rcfw_send_message - qplib interface to send + * and complete rcfw command. + * @rcfw - rcfw channel instance of rdev + * @msg - qplib message internal + * + * This function does not account shadow queue depth. It will send + * all the command unconditionally as long as send queue is not full. + * + * Returns: + * 0 if command completed by firmware. + * Non zero if the command is not completed by firmware. + */ +static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, + struct bnxt_qplib_cmdqmsg *msg) { struct creq_qp_event *evnt = (struct creq_qp_event *)msg->resp; u16 cookie; @@ -331,6 +344,48 @@ int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
return rc; } + +/** + * bnxt_qplib_rcfw_send_message - qplib interface to send + * and complete rcfw command. + * @rcfw - rcfw channel instance of rdev + * @msg - qplib message internal + * + * Driver interact with Firmware through rcfw channel/slow path in two ways. + * a. Blocking rcfw command send. In this path, driver cannot hold + * the context for longer period since it is holding cpu until + * command is not completed. + * b. Non-blocking rcfw command send. In this path, driver can hold the + * context for longer period. There may be many pending command waiting + * for completion because of non-blocking nature. + * + * Driver will use shadow queue depth. Current queue depth of 8K + * (due to size of rcfw message there can be actual ~4K rcfw outstanding) + * is not optimal for rcfw command processing in firmware. + * + * Restrict at max #RCFW_CMD_NON_BLOCKING_SHADOW_QD Non-Blocking rcfw commands. + * Allow all blocking commands until there is no queue full. + * + * Returns: + * 0 if command completed by firmware. + * Non zero if the command is not completed by firmware. + */ +int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, + struct bnxt_qplib_cmdqmsg *msg) +{ + int ret; + + if (!msg->block) { + down(&rcfw->rcfw_inflight); + ret = __bnxt_qplib_rcfw_send_message(rcfw, msg); + up(&rcfw->rcfw_inflight); + } else { + ret = __bnxt_qplib_rcfw_send_message(rcfw, msg); + } + + return ret; +} + /* Completions */ static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw, struct creq_func_event *func_event) @@ -937,6 +992,7 @@ int bnxt_qplib_enable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw, return rc; }
+ sema_init(&rcfw->rcfw_inflight, RCFW_CMD_NON_BLOCKING_SHADOW_QD); bnxt_qplib_start_rcfw(rcfw);
return 0; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h index 92f7a25533d3b..675c388348827 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h @@ -67,6 +67,8 @@ static inline void bnxt_qplib_rcfw_cmd_prep(struct cmdq_base *req, req->cmd_size = cmd_size; }
+/* Shadow queue depth for non blocking command */ +#define RCFW_CMD_NON_BLOCKING_SHADOW_QD 64 #define RCFW_CMD_WAIT_TIME_MS 20000 /* 20 Seconds timeout */
/* CMDQ elements */ @@ -201,6 +203,7 @@ struct bnxt_qplib_rcfw { u64 oos_prev; u32 init_oos_stats; u32 cmdq_depth; + struct semaphore rcfw_inflight; };
struct bnxt_qplib_cmdqmsg {
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 159cf95e42a7ca7375646fab82c0056cbb71f9e9 ]
- Use the __send_message_basic_sanity helper function.
- Do not retry posting the same command if a queue full condition is detected.
- ENXIO is used to indicate controller recovery.
- In the ERR_DEVICE_DETACHED state, the driver should not post commands to the firmware, but should still return a fabricated return code.
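The ENXIO handling in the list above can be sketched as follows. This is an illustrative userspace sketch: the opcode enum and the function names are hypothetical, and the real opcode table is the one in bnxt_qplib_map_rc().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical opcodes; the driver uses CMDQ_BASE_OPCODE_* values. */
enum opcode { OP_CREATE_QP, OP_DESTROY_QP, OP_QUERY_FUNC };

/* Teardown commands report success, everything else times out. */
static int map_rc(enum opcode op)
{
	switch (op) {
	case OP_DESTROY_QP:
		return 0;
	default:
		return -ETIMEDOUT;
	}
}

/* Refuse to post while the device is detached. */
static int basic_sanity(bool detached)
{
	return detached ? -ENXIO : 0;
}

static int send_message(bool detached, enum opcode op)
{
	int rc = basic_sanity(detached);

	/* Sanity results are 0 or a negative errno. */
	assert(rc <= 0);
	if (rc)
		return rc == -ENXIO ? map_rc(op) : rc;

	/* The command would be posted to the firmware here. */
	return 0;
}
```

The caller never sees -ENXIO itself: a detached device yields either success (for teardown opcodes) or -ETIMEDOUT.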
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com
Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com
Link: https://lore.kernel.org/r/1686308514-11996-9-git-send-email-selvin.xavier@br...
Signed-off-by: Leon Romanovsky leon@kernel.org
Stable-dep-of: 29900bf351e1 ("RDMA/bnxt_re: Fix hang during driver unload")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 125 +++++++++++----------
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 22 ++++
 2 files changed, 86 insertions(+), 61 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 42484a1149c7c..f867507d427f9 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -170,34 +170,22 @@ static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode) static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct bnxt_qplib_cmdqmsg *msg) { - struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq; - struct bnxt_qplib_hwq *hwq = &cmdq->hwq; + u32 bsize, opcode, free_slots, required_slots; + struct bnxt_qplib_cmdq_ctx *cmdq; struct bnxt_qplib_crsqe *crsqe; struct bnxt_qplib_cmdqe *cmdqe; + struct bnxt_qplib_hwq *hwq; u32 sw_prod, cmdq_prod; struct pci_dev *pdev; unsigned long flags; - u32 bsize, opcode; u16 cookie, cbit; u8 *preq;
+ cmdq = &rcfw->cmdq; + hwq = &cmdq->hwq; pdev = rcfw->pdev;
opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz); - if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) && - (opcode != CMDQ_BASE_OPCODE_QUERY_FUNC && - opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW && - opcode != CMDQ_BASE_OPCODE_QUERY_VERSION)) { - dev_err(&pdev->dev, - "RCFW not initialized, reject opcode 0x%x\n", opcode); - return -EINVAL; - } - - if (test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) && - opcode == CMDQ_BASE_OPCODE_INITIALIZE_FW) { - dev_err(&pdev->dev, "RCFW already initialized!\n"); - return -EINVAL; - }
if (test_bit(FIRMWARE_TIMED_OUT, &cmdq->flags)) return -ETIMEDOUT; @@ -206,40 +194,37 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, * cmdqe */ spin_lock_irqsave(&hwq->lock, flags); - if (msg->req->cmd_size >= HWQ_FREE_SLOTS(hwq)) { - dev_err(&pdev->dev, "RCFW: CMDQ is full!\n"); + required_slots = bnxt_qplib_get_cmd_slots(msg->req); + free_slots = HWQ_FREE_SLOTS(hwq); + cookie = cmdq->seq_num & RCFW_MAX_COOKIE_VALUE; + cbit = cookie % rcfw->cmdq_depth; + + if (required_slots >= free_slots || + test_bit(cbit, cmdq->cmdq_bitmap)) { + dev_info_ratelimited(&pdev->dev, + "CMDQ is full req/free %d/%d!", + required_slots, free_slots); spin_unlock_irqrestore(&hwq->lock, flags); return -EAGAIN; } - - - cookie = cmdq->seq_num & RCFW_MAX_COOKIE_VALUE; - cbit = cookie % rcfw->cmdq_depth; if (msg->block) cookie |= RCFW_CMD_IS_BLOCKING; - set_bit(cbit, cmdq->cmdq_bitmap); __set_cmdq_base_cookie(msg->req, msg->req_sz, cpu_to_le16(cookie)); crsqe = &rcfw->crsqe_tbl[cbit]; - if (crsqe->resp) { - spin_unlock_irqrestore(&hwq->lock, flags); - return -EBUSY; - } - - /* change the cmd_size to the number of 16byte cmdq unit. - * req->cmd_size is modified here - */ bsize = bnxt_qplib_set_cmd_slots(msg->req); - - memset(msg->resp, 0, sizeof(*msg->resp)); + crsqe->free_slots = free_slots; crsqe->resp = (struct creq_qp_event *)msg->resp; crsqe->resp->cookie = cpu_to_le16(cookie); crsqe->req_size = __get_cmdq_base_cmd_size(msg->req, msg->req_sz); if (__get_cmdq_base_resp_size(msg->req, msg->req_sz) && msg->sb) { struct bnxt_qplib_rcfw_sbuf *sbuf = msg->sb; - __set_cmdq_base_resp_addr(msg->req, msg->req_sz, cpu_to_le64(sbuf->dma_addr)); + + __set_cmdq_base_resp_addr(msg->req, msg->req_sz, + cpu_to_le64(sbuf->dma_addr)); __set_cmdq_base_resp_size(msg->req, msg->req_sz, - ALIGN(sbuf->size, BNXT_QPLIB_CMDQE_UNITS)); + ALIGN(sbuf->size, + BNXT_QPLIB_CMDQE_UNITS)); }
preq = (u8 *)msg->req; @@ -247,11 +232,6 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, /* Locate the next cmdq slot */ sw_prod = HWQ_CMP(hwq->prod, hwq); cmdqe = bnxt_qplib_get_qe(hwq, sw_prod, NULL); - if (!cmdqe) { - dev_err(&pdev->dev, - "RCFW request failed with no cmdqe!\n"); - goto done; - } /* Copy a segment of the req cmd to the cmdq */ memset(cmdqe, 0, sizeof(*cmdqe)); memcpy(cmdqe, preq, min_t(u32, bsize, sizeof(*cmdqe))); @@ -275,12 +255,43 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, wmb(); writel(cmdq_prod, cmdq->cmdq_mbox.prod); writel(RCFW_CMDQ_TRIG_VAL, cmdq->cmdq_mbox.db); -done: spin_unlock_irqrestore(&hwq->lock, flags); /* Return the CREQ response pointer */ return 0; }
+static int __send_message_basic_sanity(struct bnxt_qplib_rcfw *rcfw, + struct bnxt_qplib_cmdqmsg *msg) +{ + struct bnxt_qplib_cmdq_ctx *cmdq; + u32 opcode; + + cmdq = &rcfw->cmdq; + opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz); + + /* Prevent posting if f/w is not in a state to process */ + if (test_bit(ERR_DEVICE_DETACHED, &rcfw->cmdq.flags)) + return -ENXIO; + + if (test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) && + opcode == CMDQ_BASE_OPCODE_INITIALIZE_FW) { + dev_err(&rcfw->pdev->dev, "QPLIB: RCFW already initialized!"); + return -EINVAL; + } + + if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) && + (opcode != CMDQ_BASE_OPCODE_QUERY_FUNC && + opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW && + opcode != CMDQ_BASE_OPCODE_QUERY_VERSION)) { + dev_err(&rcfw->pdev->dev, + "QPLIB: RCFW not initialized, reject opcode 0x%x", + opcode); + return -EOPNOTSUPP; + } + + return 0; +} + /** * __bnxt_qplib_rcfw_send_message - qplib interface to send * and complete rcfw command. @@ -299,29 +310,21 @@ static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw, { struct creq_qp_event *evnt = (struct creq_qp_event *)msg->resp; u16 cookie; - u8 opcode, retry_cnt = 0xFF; int rc = 0; + u8 opcode;
- /* Prevent posting if f/w is not in a state to process */ - if (test_bit(ERR_DEVICE_DETACHED, &rcfw->cmdq.flags)) - return 0; + opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz);
- do { - opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz); - rc = __send_message(rcfw, msg); - cookie = le16_to_cpu(__get_cmdq_base_cookie(msg->req, msg->req_sz)) & - RCFW_MAX_COOKIE_VALUE; - if (!rc) - break; - if (!retry_cnt || (rc != -EAGAIN && rc != -EBUSY)) { - /* send failed */ - dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x send failed\n", - cookie, opcode); - return rc; - } - msg->block ? mdelay(1) : usleep_range(500, 1000); + rc = __send_message_basic_sanity(rcfw, msg); + if (rc) + return rc == -ENXIO ? bnxt_qplib_map_rc(opcode) : rc; + + rc = __send_message(rcfw, msg); + if (rc) + return rc;
- } while (retry_cnt--); + cookie = le16_to_cpu(__get_cmdq_base_cookie(msg->req, msg->req_sz)) + & RCFW_MAX_COOKIE_VALUE;
if (msg->block) rc = __block_for_resp(rcfw, cookie, opcode); diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h index 675c388348827..43dc11febf46a 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h @@ -91,6 +91,26 @@ static inline u32 bnxt_qplib_cmdqe_page_size(u32 depth) return (bnxt_qplib_cmdqe_npages(depth) * PAGE_SIZE); }
+/* Get the number of command units required for the req. The + * function returns correct value only if called before + * setting using bnxt_qplib_set_cmd_slots + */ +static inline u32 bnxt_qplib_get_cmd_slots(struct cmdq_base *req) +{ + u32 cmd_units = 0; + + if (HAS_TLV_HEADER(req)) { + struct roce_tlv *tlv_req = (struct roce_tlv *)req; + + cmd_units = tlv_req->total_size; + } else { + cmd_units = (req->cmd_size + BNXT_QPLIB_CMDQE_UNITS - 1) / + BNXT_QPLIB_CMDQE_UNITS; + } + + return cmd_units; +} + static inline u32 bnxt_qplib_set_cmd_slots(struct cmdq_base *req) { u32 cmd_byte = 0; @@ -134,6 +154,8 @@ typedef int (*aeq_handler_t)(struct bnxt_qplib_rcfw *, void *, void *); struct bnxt_qplib_crsqe { struct creq_qp_event *resp; u32 req_size; + /* Free slots at the time of submission */ + u32 free_slots; };
struct bnxt_qplib_rcfw_sbuf {
From: Kashyap Desai kashyap.desai@broadcom.com
[ Upstream commit 354f5bd985af9515190828bc642ebdf59acea121 ]
This interface will be used if the driver has not enabled interrupt and/or interrupt is disabled for a short period of time. Completion is not possible from interrupt so this interface does self-polling.
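A self-polling wait of this kind can be sketched in userspace C. This is a simplified sketch under stated assumptions: struct cmd and service_completions() are hypothetical, and the timeout is expressed as a poll count, whereas the patch sleeps about 1 ms per iteration and gives up after 10 seconds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command state; the driver tracks this in cmdq_bitmap. */
struct cmd {
	int polls_until_done;	/* polls needed before completion shows up */
	bool pending;
};

/* Hypothetical stand-in for servicing the completion queue by hand. */
static void service_completions(struct cmd *c)
{
	if (c->polls_until_done > 0 && --c->polls_until_done == 0)
		c->pending = false;
}

/* Poll for completion without relying on an interrupt. */
static int poll_for_resp(struct cmd *c, int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls; i++) {
		service_completions(c);
		if (!c->pending)
			return 0;
	}

	return -ETIMEDOUT;
}
```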
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com
Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com
Link: https://lore.kernel.org/r/1686308514-11996-10-git-send-email-selvin.xavier@b...
Signed-off-by: Leon Romanovsky leon@kernel.org
Stable-dep-of: 29900bf351e1 ("RDMA/bnxt_re: Fix hang during driver unload")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 44 +++++++++++++++++++++-
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 1 +
 2 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index f867507d427f9..0028043bb51cd 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -260,6 +260,44 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, return 0; }
+/** + * __poll_for_resp - self poll completion for rcfw command + * @rcfw - rcfw channel instance of rdev + * @cookie - cookie to track the command + * @opcode - rcfw submitted for given opcode + * + * It works same as __wait_for_resp except this function will + * do self polling in sort interval since interrupt is disabled. + * This function can not be called from non-sleepable context. + * + * Returns: + * -ETIMEOUT if command is not completed in specific time interval. + * 0 if command is completed by firmware. + */ +static int __poll_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, + u8 opcode) +{ + struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq; + unsigned long issue_time; + u16 cbit; + + cbit = cookie % rcfw->cmdq_depth; + issue_time = jiffies; + + do { + if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags)) + return bnxt_qplib_map_rc(opcode); + + usleep_range(1000, 1001); + + bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet); + if (!test_bit(cbit, cmdq->cmdq_bitmap)) + return 0; + if (jiffies_to_msecs(jiffies - issue_time) > 10000) + return -ETIMEDOUT; + } while (true); +}; + static int __send_message_basic_sanity(struct bnxt_qplib_rcfw *rcfw, struct bnxt_qplib_cmdqmsg *msg) { @@ -328,8 +366,10 @@ static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
if (msg->block) rc = __block_for_resp(rcfw, cookie, opcode); - else + else if (atomic_read(&rcfw->rcfw_intr_enabled)) rc = __wait_for_resp(rcfw, cookie, opcode); + else + rc = __poll_for_resp(rcfw, cookie, opcode); if (rc) { /* timed out */ dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x timedout (%d)msec\n", @@ -798,6 +838,7 @@ void bnxt_qplib_rcfw_stop_irq(struct bnxt_qplib_rcfw *rcfw, bool kill) kfree(creq->irq_name); creq->irq_name = NULL; creq->requested = false; + atomic_set(&rcfw->rcfw_intr_enabled, 0); }
void bnxt_qplib_disable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw) @@ -859,6 +900,7 @@ int bnxt_qplib_rcfw_start_irq(struct bnxt_qplib_rcfw *rcfw, int msix_vector, creq->requested = true;
bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, res->cctx, true); + atomic_inc(&rcfw->rcfw_intr_enabled);
return 0; } diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h index 43dc11febf46a..4608c0ef07a87 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h @@ -225,6 +225,7 @@ struct bnxt_qplib_rcfw { u64 oos_prev; u32 init_oos_stats; u32 cmdq_depth; + atomic_t rcfw_intr_enabled; struct semaphore rcfw_inflight; };
From: Selvin Xavier selvin.xavier@broadcom.com
[ Upstream commit 29900bf351e1a7e4643da5c3c3cd9df75c577b88 ]
Driver unload hits a hang during stress testing of load/unload.
stack trace snippet -
tasklet_kill at ffffffff9aabb8b2
bnxt_qplib_nq_stop_irq at ffffffffc0a805fb [bnxt_re]
bnxt_qplib_disable_nq at ffffffffc0a80c5b [bnxt_re]
bnxt_re_dev_uninit at ffffffffc0a67d15 [bnxt_re]
bnxt_re_remove_device at ffffffffc0a6af1d [bnxt_re]
tasklet_kill can hang if the tasklet is scheduled after it is disabled.
Modified the sequences to disable the interrupt first and synchronize irq before disabling the tasklet.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai kashyap.desai@broadcom.com
Signed-off-by: Selvin Xavier selvin.xavier@broadcom.com
Link: https://lore.kernel.org/r/1689322969-25402-3-git-send-email-selvin.xavier@br...
Signed-off-by: Leon Romanovsky leon@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/infiniband/hw/bnxt_re/qplib_fp.c | 10 +++++-----
 drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 9 ++++-----
 2 files changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c index e589b04f953c5..b34cc500f51f3 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c @@ -420,19 +420,19 @@ void bnxt_qplib_nq_stop_irq(struct bnxt_qplib_nq *nq, bool kill) if (!nq->requested) return;
- tasklet_disable(&nq->nq_tasklet); + nq->requested = false; /* Mask h/w interrupt */ bnxt_qplib_ring_nq_db(&nq->nq_db.dbinfo, nq->res->cctx, false); /* Sync with last running IRQ handler */ synchronize_irq(nq->msix_vec); - if (kill) - tasklet_kill(&nq->nq_tasklet); - irq_set_affinity_hint(nq->msix_vec, NULL); free_irq(nq->msix_vec, nq); kfree(nq->name); nq->name = NULL; - nq->requested = false; + + if (kill) + tasklet_kill(&nq->nq_tasklet); + tasklet_disable(&nq->nq_tasklet); }
void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c index 0028043bb51cd..05683ce64887f 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c @@ -826,19 +826,18 @@ void bnxt_qplib_rcfw_stop_irq(struct bnxt_qplib_rcfw *rcfw, bool kill) if (!creq->requested) return;
- tasklet_disable(&creq->creq_tasklet); + creq->requested = false; /* Mask h/w interrupts */ bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, rcfw->res->cctx, false); /* Sync with last running IRQ-handler */ synchronize_irq(creq->msix_vec); - if (kill) - tasklet_kill(&creq->creq_tasklet); - free_irq(creq->msix_vec, rcfw); kfree(creq->irq_name); creq->irq_name = NULL; - creq->requested = false; atomic_set(&rcfw->rcfw_intr_enabled, 0); + if (kill) + tasklet_kill(&creq->creq_tasklet); + tasklet_disable(&creq->creq_tasklet); }
void bnxt_qplib_disable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
From: Gaosheng Cui cuigaosheng1@huawei.com
[ Upstream commit 6e8a996563ecbe68e49c49abd4aaeef69f11f2dc ]
The msm_gem_get_vaddr() returns an ERR_PTR() on failure, and a null is catastrophic here, so we should use IS_ERR_OR_NULL() to check the return value.
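The difference between a plain NULL check and IS_ERR_OR_NULL() can be shown with a userspace re-creation of the error-pointer encoding. This is a sketch for illustration: the helpers below are lower-case stand-ins written for this example, modelled on the kernel convention that the top 4095 addresses encode a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if the pointer falls in the reserved error range. */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for both failure encodings: NULL and an error pointer. */
static bool is_err_or_null(const void *ptr)
{
	return ptr == NULL || is_err(ptr);
}
```

An error pointer is non-NULL, so a check of the form `!ptr` lets it through and the caller then dereferences an invalid address.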
Fixes: 6a8bd08d0465 ("drm/msm: add sudo flag to submit ioctl")
Signed-off-by: Gaosheng Cui cuigaosheng1@huawei.com
Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org
Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com
Reviewed-by: Akhil P Oommen quic_akhilpo@quicinc.com
Patchwork: https://patchwork.freedesktop.org/patch/547712/
Signed-off-by: Rob Clark robdclark@chromium.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index a99310b687932..bbb1bf33f98ef 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -89,7 +89,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit * since we've already mapped it once in * submit_reloc() */ - if (WARN_ON(!ptr)) + if (WARN_ON(IS_ERR_OR_NULL(ptr))) return;
for (i = 0; i < dwords; i++) {
From: Rob Clark robdclark@chromium.org
[ Upstream commit 1cd0787f082e1a179f2b6e749d08daff1a9f6b1b ]
In an error path where the submit is free'd without the job being run, the hw_fence pointer is simply a kzalloc'd block of memory. In this case we should just kfree() it, rather than trying to decrement its reference count. Fortunately we can tell that this is the case by checking for a zero refcount, since if the job was run, the submit would be holding a reference to the hw_fence.
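The zero-refcount check can be sketched in userspace C. This is a minimal sketch, assuming a hypothetical struct fence with a plain counter in place of struct dma_fence and its kref; all function names are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fence; the kernel type is struct dma_fence with a kref. */
struct fence {
	unsigned int refcount;
};

/* Pre-allocate zeroed memory, as kzalloc() does for hw_fence. */
static struct fence *fence_prealloc(void)
{
	return calloc(1, sizeof(struct fence));
}

/* Called only if the job runs: from here on the fence is refcounted. */
static void fence_init(struct fence *f)
{
	assert(f->refcount == 0);
	f->refcount = 1;
}

/* Drop one reference, free on the last one. Returns 1 if freed. */
static int fence_put(struct fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		free(f);
		return 1;
	}
	return 0;
}

/* Teardown: raw memory is freed directly, a live fence is put. */
static int submit_destroy(struct fence *f)
{
	if (f->refcount == 0) {
		free(f);
		return 1;
	}
	return fence_put(f);
}
```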
Fixes: f94e6a51e17c ("drm/msm: Pre-allocate hw_fence")
Signed-off-by: Rob Clark robdclark@chromium.org
Patchwork: https://patchwork.freedesktop.org/patch/547088/
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/msm/msm_fence.c | 6 ++++++
 drivers/gpu/drm/msm/msm_gem_submit.c | 14 +++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c index 96599ec3eb783..1a5d4f1c8b422 100644 --- a/drivers/gpu/drm/msm/msm_fence.c +++ b/drivers/gpu/drm/msm/msm_fence.c @@ -191,6 +191,12 @@ msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
f->fctx = fctx;
+ /* + * Until this point, the fence was just some pre-allocated memory, + * no-one should have taken a reference to it yet. + */ + WARN_ON(kref_read(&fence->refcount)); + dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock, fctx->context, ++fctx->last_fence); } diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 9f5933c75e3df..10cad7b99bac8 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -86,7 +86,19 @@ void __msm_gem_submit_destroy(struct kref *kref) }
dma_fence_put(submit->user_fence); - dma_fence_put(submit->hw_fence); + + /* + * If the submit is freed before msm_job_run(), then hw_fence is + * just some pre-allocated memory, not a reference counted fence. + * Once the job runs and the hw_fence is initialized, it will + * have a refcount of at least one, since the submit holds a ref + * to the hw_fence. + */ + if (kref_read(&submit->hw_fence->refcount) == 0) { + kfree(submit->hw_fence); + } else { + dma_fence_put(submit->hw_fence); + }
put_pid(submit->pid); msm_submitqueue_put(submit->queue);
From: Breno Leitao leitao@debian.org
[ Upstream commit 4cf67d3cc9994a59cf77bb9c0ccf9007fe916afe ]
KASAN and KFENCE detected a use-after-free in the CXL driver. This happens in the cxl_decoder_add() fail path. KASAN prints the following error:
BUG: KASAN: slab-use-after-free in cxl_parse_cfmws (drivers/cxl/acpi.c:299)
This happens in cxl_parse_cfmws(), where put_device() is called, releasing cxld, which is accessed later.
Use the local variables in the dev_err() instead of pointing to the released memory. Since the dev_err() is printing a resource, change the open coded print format to use the %pr format specifier.
Fixes: e50fe01e1f2a ("cxl/core: Drop ->platform_res attribute for root decoders")
Signed-off-by: Breno Leitao leitao@debian.org
Link: https://lore.kernel.org/r/20230714093146.2253438-1-leitao@debian.org
Reviewed-by: Alison Schofield alison.schofield@intel.com
Reviewed-by: Dave Jiang dave.jiang@intel.com
Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Vishal Verma vishal.l.verma@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/cxl/acpi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 7e1765b09e04a..973d6747078c9 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -296,8 +296,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, else rc = cxl_decoder_autoremove(dev, cxld); if (rc) { - dev_err(dev, "Failed to add decode range [%#llx - %#llx]\n", - cxld->hpa_range.start, cxld->hpa_range.end); + dev_err(dev, "Failed to add decode range: %pr", res); return 0; } dev_dbg(dev, "add: %s node: %d range [%#llx - %#llx]\n",
From: Breno Leitao leitao@debian.org
[ Upstream commit 91019b5bc7c2c5e6f676cce80ee6d12b2753d018 ]
Driver initialization returned success (return 0) even if the initialization (cxl_decoder_add() or acpi_table_parse_cedt()) failed.
Return the error instead of swallowing it.
Fixes: f4ce1f766f1e ("cxl/acpi: Convert CFMWS parsing to ACPI sub-table helpers")
Signed-off-by: Breno Leitao leitao@debian.org
Link: https://lore.kernel.org/r/20230714093146.2253438-2-leitao@debian.org
Reviewed-by: Alison Schofield alison.schofield@intel.com
Signed-off-by: Vishal Verma vishal.l.verma@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/cxl/acpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 973d6747078c9..8757bf886207b 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -297,7 +297,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, rc = cxl_decoder_autoremove(dev, cxld); if (rc) { dev_err(dev, "Failed to add decode range: %pr", res); - return 0; + return rc; } dev_dbg(dev, "add: %s node: %d range [%#llx - %#llx]\n", dev_name(&cxld->dev),
From: Matus Gajdos matuszpd@gmail.com
[ Upstream commit 0e4c2b6b0c4a4b4014d9424c27e5e79d185229c5 ]
Clear TX registers on stop to prevent the SPDIF interface from sending last written word over and over again.
Fixes: a2388a498ad2 ("ASoC: fsl: Add S/PDIF CPU DAI driver")
Signed-off-by: Matus Gajdos matuszpd@gmail.com
Reviewed-by: Fabio Estevam festevam@gmail.com
Link: https://lore.kernel.org/r/20230719164729.19969-1-matuszpd@gmail.com
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/soc/fsl/fsl_spdif.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c index 015c3708aa04e..3fd26f2cdd60f 100644 --- a/sound/soc/fsl/fsl_spdif.c +++ b/sound/soc/fsl/fsl_spdif.c @@ -751,6 +751,8 @@ static int fsl_spdif_trigger(struct snd_pcm_substream *substream, case SNDRV_PCM_TRIGGER_PAUSE_PUSH: regmap_update_bits(regmap, REG_SPDIF_SCR, dmaen, 0); regmap_update_bits(regmap, REG_SPDIF_SIE, intr, 0); + regmap_write(regmap, REG_SPDIF_STL, 0x0); + regmap_write(regmap, REG_SPDIF_STR, 0x0); break; default: return -EINVAL;
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit e0933b526fbfd937c4a8f4e35fcdd49f0e22d411 ]
Fix the symbolic names for zone conditions in the blkzoned.h header file.
Cc: Hannes Reinecke hare@suse.de
Cc: Damien Le Moal dlemoal@kernel.org
Fixes: 6a0cb1bc106f ("block: Implement support for zoned block devices")
Signed-off-by: Bart Van Assche bvanassche@acm.org
Reviewed-by: Damien Le Moal dlemoal@kernel.org
Link: https://lore.kernel.org/r/20230706201422.3987341-1-bvanassche@acm.org
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 include/uapi/linux/blkzoned.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h index b80fcc9ea5257..f85743ef6e7d1 100644 --- a/include/uapi/linux/blkzoned.h +++ b/include/uapi/linux/blkzoned.h @@ -51,13 +51,13 @@ enum blk_zone_type { * * The Zone Condition state machine in the ZBC/ZAC standards maps the above * deinitions as: - * - ZC1: Empty | BLK_ZONE_EMPTY + * - ZC1: Empty | BLK_ZONE_COND_EMPTY * - ZC2: Implicit Open | BLK_ZONE_COND_IMP_OPEN * - ZC3: Explicit Open | BLK_ZONE_COND_EXP_OPEN - * - ZC4: Closed | BLK_ZONE_CLOSED - * - ZC5: Full | BLK_ZONE_FULL - * - ZC6: Read Only | BLK_ZONE_READONLY - * - ZC7: Offline | BLK_ZONE_OFFLINE + * - ZC4: Closed | BLK_ZONE_COND_CLOSED + * - ZC5: Full | BLK_ZONE_COND_FULL + * - ZC6: Read Only | BLK_ZONE_COND_READONLY + * - ZC7: Offline | BLK_ZONE_COND_OFFLINE * * Conditions 0x5 to 0xC are reserved by the current ZBC/ZAC spec and should * be considered invalid.
From: Steve French stfrench@microsoft.com
[ Upstream commit 19826558210b9102a7d4681c91784d137d60d71b ]
The NTLMSSP_NEGOTIATE_VERSION flag only needs to be sent during the NTLMSSP NEGOTIATE (not the AUTH) request, so filter it out for NTLMSSP AUTH requests. See MS-NLMP 2.2.1.3
This fixes a problem found by the gssntlmssp server.
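The flag filtering amounts to clearing one bit after the flag word is assembled. The sketch below is illustrative only: the macro names and bit values are assumptions chosen for the example, not copied from the kernel header, and auth_flags() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values; the real definitions live in the kernel's NTLMSSP header. */
#define NEG_VERSION	UINT32_C(0x02000000)
#define NEG_TARGET_INFO	UINT32_C(0x00800000)

/* Build the flags for an AUTH request from what the server offered. */
static uint32_t auth_flags(uint32_t server_flags)
{
	uint32_t flags = server_flags | NEG_TARGET_INFO;

	/* Version information is only sent in NEGOTIATE, so drop the bit. */
	flags &= ~NEG_VERSION;
	assert((flags & NEG_VERSION) == 0);

	return flags;
}
```

Clearing the bit last matters because the server's flags are OR-ed in first and may themselves carry the version bit.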
Link: https://github.com/gssapi/gss-ntlmssp/issues/95
Fixes: 52d005337b2c ("smb3: send NTLMSSP version information")
Acked-by: Roy Shterman roy.shterman@gmail.com
Signed-off-by: Steve French stfrench@microsoft.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/smb/client/sess.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c index 335c078c42fb5..c57ca2050b73f 100644 --- a/fs/smb/client/sess.c +++ b/fs/smb/client/sess.c @@ -1013,6 +1013,7 @@ int build_ntlmssp_smb3_negotiate_blob(unsigned char **pbuffer, }
+/* See MS-NLMP 2.2.1.3 */ int build_ntlmssp_auth_blob(unsigned char **pbuffer, u16 *buflen, struct cifs_ses *ses, @@ -1047,7 +1048,8 @@ int build_ntlmssp_auth_blob(unsigned char **pbuffer,
flags = ses->ntlmssp->server_flags | NTLMSSP_REQUEST_TARGET | NTLMSSP_NEGOTIATE_TARGET_INFO | NTLMSSP_NEGOTIATE_WORKSTATION_SUPPLIED; - + /* we only send version information in ntlmssp negotiate, so do not set this flag */ + flags = flags & ~NTLMSSP_NEGOTIATE_VERSION; tmp = *pbuffer + sizeof(AUTHENTICATE_MESSAGE); sec_blob->NegotiateFlags = cpu_to_le32(flags);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit e354f67733115b4453268f61e6e072e9b1ea7a2f ]
All error handling paths go to 'out', except this one. Be consistent and also branch to 'out' here.
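The reason for branching to 'out' rather than returning directly is that the shared label releases whatever was acquired earlier. A minimal userspace sketch of the pattern, assuming a hypothetical fill_buffers() with two allocations (none of these names come from the i915 selftest):

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical function: two allocations, one shared cleanup label. */
static int fill_buffers(size_t n, int fail_second)
{
	int err = 0;
	int *first = NULL;
	int *second = NULL;

	first = malloc(n * sizeof(*first));
	if (!first) {
		err = -ENOMEM;
		goto out;
	}

	/* fail_second simulates an allocation failure for illustration. */
	second = fail_second ? NULL : malloc(n * sizeof(*second));
	if (!second) {
		/* Branch to 'out' instead of returning, so 'first' is freed. */
		err = -ENOMEM;
		goto out;
	}

out:
	free(second);	/* free(NULL) is a no-op */
	free(first);
	return err;
}
```

A bare "return -ENOMEM" at the second check would leak 'first', which is the kind of inconsistency the patch removes.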
Fixes: c10a652e239e ("drm/i915/selftests: Rework context handling in hugepages selftests") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Andrzej Hajda andrzej.hajda@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/7a036b88671312ee9adc01c74ef5b3... (cherry picked from commit 361ecaadb1ce3c5312c7c4c419271326d43899eb) Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 99f39a5feca15..e86e75971ec60 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -1190,8 +1190,10 @@ static int igt_write_huge(struct drm_i915_private *i915, * times in succession a possibility by enlarging the permutation array. */ order = i915_random_order(count * count, &prng); - if (!order) - return -ENOMEM; + if (!order) { + err = -ENOMEM; + goto out; + }
max_page_size = rounddown_pow_of_two(obj->mm.page_sizes.sg); max = div_u64(max - size, max_page_size);
From: Stefano Stabellini sstabellini@kernel.org
[ Upstream commit 0d8f7cc8057890db08c54fe610d8a94af59da082 ]
The same way we already do in xenbus_init. Fixes the following warning:
[ 352.175563] Trying to free already-free IRQ 0
[ 352.177355] WARNING: CPU: 1 PID: 88 at kernel/irq/manage.c:1893 free_irq+0xbf/0x350
[...]
[ 352.213951] Call Trace:
[ 352.214390] <TASK>
[ 352.214717] ? __warn+0x81/0x170
[ 352.215436] ? free_irq+0xbf/0x350
[ 352.215906] ? report_bug+0x10b/0x200
[ 352.216408] ? prb_read_valid+0x17/0x20
[ 352.216926] ? handle_bug+0x44/0x80
[ 352.217409] ? exc_invalid_op+0x13/0x60
[ 352.217932] ? asm_exc_invalid_op+0x16/0x20
[ 352.218497] ? free_irq+0xbf/0x350
[ 352.218979] ? __pfx_xenbus_probe_thread+0x10/0x10
[ 352.219600] xenbus_probe+0x7a/0x80
[ 352.221030] xenbus_probe_thread+0x76/0xc0
Fixes: 5b3353949e89 ("xen: add support for initializing xenstore later as HVM domain") Signed-off-by: Stefano Stabellini stefano.stabellini@amd.com Tested-by: Petr Mladek pmladek@suse.com Reviewed-by: Oleksandr Tyshchenko oleksandr_tyshchenko@epam.com
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2307211609140.3118466@ubuntu-l... Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/xenbus/xenbus_probe.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c index 58b732dcbfb83..639bf628389ba 100644 --- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -811,6 +811,9 @@ static int xenbus_probe_thread(void *unused)
static int __init xenbus_probe_initcall(void) { + if (!xen_domain()) + return -ENODEV; + /* * Probe XenBus here in the XS_PV case, and also XS_HVM unless we * need to wait for the platform PCI device to come up or
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit bae3028799dc4f1109acc4df37c8ff06f2d8f1a0 ]
In the error paths 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns.
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index c8821fcb82998..85221a94c2073 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3271,15 +3271,19 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) /* Try to adjust the raid4/5/6 stripe cache size to the stripe size */ if (rs_is_raid456(rs)) { r = rs_set_raid456_stripe_cache(rs); - if (r) + if (r) { + mddev_unlock(&rs->md); goto bad_stripe_cache; + } }
/* Now do an early reshape check */ if (test_bit(RT_FLAG_RESHAPE_RS, &rs->runtime_flags)) { r = rs_check_reshape(rs); - if (r) + if (r) { + mddev_unlock(&rs->md); goto bad_check_reshape; + }
/* Restore new, ctr requested layout to perform check */ rs_config_restore(rs, &rs_layout); @@ -3288,6 +3292,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = rs->md.pers->check_reshape(&rs->md); if (r) { ti->error = "Reshape check failed"; + mddev_unlock(&rs->md); goto bad_check_reshape; } }
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit e74c874eabe2e9173a8fbdad616cd89c70eb8ffd ]
There are four equivalent goto tags in raid_ctr(), clean them up to use just one.
There is no functional change and this is preparation to fix raid_ctr()'s unprotected md_stop().
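The consolidation can be pictured with a userspace analogue: every failure after the lock is taken jumps to one label that drops the lock. This is only a sketch under stated assumptions; a pthread mutex stands in for 'reconfig_mutex', and configure() and lock_is_free() are hypothetical names.

```c
#include <pthread.h>

/* Stand-in for the kernel's reconfig_mutex; illustrative only. */
static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical constructor: fail_step selects which step fails (0 = none). */
static int configure(int fail_step)
{
	int r;

	pthread_mutex_lock(&cfg_lock);

	r = (fail_step == 1) ? -1 : 0;
	if (r)
		goto bad_unlock;

	r = (fail_step == 2) ? -2 : 0;
	if (r)
		goto bad_unlock;

	pthread_mutex_unlock(&cfg_lock);
	return 0;

bad_unlock:
	/* Single exit for all failures that happen with the lock held. */
	pthread_mutex_unlock(&cfg_lock);
	return r;
}

/* Returns 1 if the lock is currently free, 0 if it is still held. */
static int lock_is_free(void)
{
	if (pthread_mutex_trylock(&cfg_lock) != 0)
		return 0;
	pthread_mutex_unlock(&cfg_lock);
	return 1;
}
```

With one label, a later change to the unlock ordering only has to touch one place.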
Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Stable-dep-of: 7d5fff8982a2 ("dm raid: protect md_stop() with 'reconfig_mutex'") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 27 +++++++++------------------ 1 file changed, 9 insertions(+), 18 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 85221a94c2073..156d44f690096 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3251,8 +3251,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = md_start(&rs->md); if (r) { ti->error = "Failed to start raid array"; - mddev_unlock(&rs->md); - goto bad_md_start; + goto bad_unlock; }
/* If raid4/5/6 journal mode explicitly requested (only possible with journal dev) -> set it */ @@ -3260,8 +3259,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = r5c_journal_mode_set(&rs->md, rs->journal_dev.mode); if (r) { ti->error = "Failed to set raid4/5/6 journal mode"; - mddev_unlock(&rs->md); - goto bad_journal_mode_set; + goto bad_unlock; } }
@@ -3271,19 +3269,15 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) /* Try to adjust the raid4/5/6 stripe cache size to the stripe size */ if (rs_is_raid456(rs)) { r = rs_set_raid456_stripe_cache(rs); - if (r) { - mddev_unlock(&rs->md); - goto bad_stripe_cache; - } + if (r) + goto bad_unlock; }
/* Now do an early reshape check */ if (test_bit(RT_FLAG_RESHAPE_RS, &rs->runtime_flags)) { r = rs_check_reshape(rs); - if (r) { - mddev_unlock(&rs->md); - goto bad_check_reshape; - } + if (r) + goto bad_unlock;
/* Restore new, ctr requested layout to perform check */ rs_config_restore(rs, &rs_layout); @@ -3292,8 +3286,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) r = rs->md.pers->check_reshape(&rs->md); if (r) { ti->error = "Reshape check failed"; - mddev_unlock(&rs->md); - goto bad_check_reshape; + goto bad_unlock; } } } @@ -3304,10 +3297,8 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) mddev_unlock(&rs->md); return 0;
-bad_md_start: -bad_journal_mode_set: -bad_stripe_cache: -bad_check_reshape: +bad_unlock: + mddev_unlock(&rs->md); md_stop(&rs->md); bad: raid_set_free(rs);
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit 7d5fff8982a2199d49ec067818af7d84d4f95ca0 ]
__md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' except for md_stop().
Also, update md_stop() to make certain 'reconfig_mutex' is held using lockdep_assert_held().
Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Yu Kuai yukuai3@huawei.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-raid.c | 4 +++- drivers/md/md.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 156d44f690096..de3dd6e6bb892 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3298,8 +3298,8 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv) return 0;
bad_unlock: - mddev_unlock(&rs->md); md_stop(&rs->md); + mddev_unlock(&rs->md); bad: raid_set_free(rs);
@@ -3310,7 +3310,9 @@ static void raid_dtr(struct dm_target *ti) { struct raid_set *rs = ti->private;
+ mddev_lock_nointr(&rs->md); md_stop(&rs->md); + mddev_unlock(&rs->md); raid_set_free(rs); }
diff --git a/drivers/md/md.c b/drivers/md/md.c index 18384251399ab..32d7ba8069aef 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6260,6 +6260,8 @@ static void __md_stop(struct mddev *mddev)
void md_stop(struct mddev *mddev) { + lockdep_assert_held(&mddev->reconfig_mutex); + /* stop the array and free an attached data structures. * This is called from dm-raid */
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit c01aebeef3ce45f696ffa0a1303cea9b34babb45 ]
If the second call to amdgpu_bo_create_kernel() fails, the memory allocated from the first call should be cleared. If the third call fails, the memory from the second call should be cleared.
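The unwinding order can be sketched in userspace as follows. This is an illustration under assumptions: malloc()/free() stand in for amdgpu_bo_create_kernel()/amdgpu_bo_free_kernel(), and struct bufs, bufs_init() and the fail_step parameter are hypothetical.

```c
#include <stdlib.h>

struct bufs {
	void *a;	/* first allocation */
	void *b;	/* second allocation */
	void *c;	/* third allocation */
};

/* fail_step simulates which allocation fails (0 = none). */
static int bufs_init(struct bufs *p, int fail_step)
{
	p->a = p->b = p->c = NULL;

	p->a = (fail_step == 1) ? NULL : malloc(16);
	if (!p->a)
		return -1;

	p->b = (fail_step == 2) ? NULL : malloc(16);
	if (!p->b)
		goto failed1;

	p->c = (fail_step == 3) ? NULL : malloc(16);
	if (!p->c)
		goto failed2;

	return 0;

failed2:
	/* Third step failed: undo the second allocation, then fall through. */
	free(p->b);
	p->b = NULL;
failed1:
	/* Second (or third) step failed: undo the first allocation. */
	free(p->a);
	p->a = NULL;
	return -1;
}
```

Labels release resources in the reverse order of acquisition, so each label only frees what exists when it is reached.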
Fixes: b95b5391684b ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Lijo Lazar lijo.lazar@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c index e4757a2807d9a..db820331f2c61 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -491,11 +491,11 @@ static int psp_sw_init(void *handle) return 0;
failed2: - amdgpu_bo_free_kernel(&psp->fw_pri_bo, - &psp->fw_pri_mc_addr, &psp->fw_pri_buf); -failed1: amdgpu_bo_free_kernel(&psp->fence_buf_bo, &psp->fence_buf_mc_addr, &psp->fence_buf); +failed1: + amdgpu_bo_free_kernel(&psp->fw_pri_bo, + &psp->fw_pri_mc_addr, &psp->fw_pri_buf); return ret; }
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 38ac4e8385ffb275b1837986ca6c16f26ea028c5 ]
This error path needs to unlock the "aconnector->handle_mst_msg_ready" mutex before returning.
Fixes: 4f6d9e38c4d2 ("drm/amd/display: Add polling method to handle MST reply packet") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 888e80f498e97..9bc86deac9e8e 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -706,7 +706,7 @@ void dm_handle_mst_sideband_msg_ready_event(
if (retry == 3) { DRM_ERROR("Failed to ack MST event.\n"); - return; + break; }
drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr);
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit 3bfb25fa2b5bb9c29681e6ac861808f4be1331a9 ]
The op_type field of the CQ poll info structure is incorrectly filled in with the queue type instead of the op_type received in the CQEs. As a result, the wrong opcode could be decoded and returned to the ULP.
Copy the op_type field received in the CQE in the CQ poll info structure.
Fixes: 24419777e943 ("RDMA/irdma: Fix RQ completion opcode") Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230725155439.1057-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/uk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c index ea2c07751245a..280d633d4ec4f 100644 --- a/drivers/infiniband/hw/irdma/uk.c +++ b/drivers/infiniband/hw/irdma/uk.c @@ -1161,7 +1161,7 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, } wqe_idx = (u32)FIELD_GET(IRDMA_CQ_WQEIDX, qword3); info->qp_handle = (irdma_qp_handle)(unsigned long)qp; - info->op_type = (u8)FIELD_GET(IRDMA_CQ_SQ, qword3); + info->op_type = (u8)FIELD_GET(IRDMACQ_OP, qword3);
if (info->q_type == IRDMA_CQE_QTYPE_RQ) { u32 array_idx;
From: Sindhu Devale sindhu.devale@intel.com
[ Upstream commit ae463563b7a1b7d4a3d0b065b09d37a76b693937 ]
Report the correct WC error if a MW bind is performed on an already valid/bound window.
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu Devale sindhu.devale@intel.com Signed-off-by: Shiraz Saleem shiraz.saleem@intel.com Link: https://lore.kernel.org/r/20230725155439.1057-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/irdma/hw.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c index 1cfc03da89e7a..457368e324e10 100644 --- a/drivers/infiniband/hw/irdma/hw.c +++ b/drivers/infiniband/hw/irdma/hw.c @@ -191,6 +191,7 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp, case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS: case IRDMA_AE_AMP_MWBIND_BIND_DISABLED: case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS: + case IRDMA_AE_AMP_MWBIND_VALID_STAG: qp->flush_code = FLUSH_MW_BIND_ERR; qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR; break;
From: Rob Clark robdclark@chromium.org
[ Upstream commit 1b5d0ddcb34a605835051ae2950d5cfed0373dd8 ]
A fence id of zero is expected to be invalid, and is not removed from the fence_idr table. If userspace is requesting to specify the fence id with the FENCE_SN_IN flag, we need to reject a zero fence id value.
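The added check can be sketched as below. This is a simplified userspace model: a small boolean array stands in for the fence_idr lookup, and claim_fence_id() and MAX_FENCES are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_FENCES 8

/* Stand-in for the fence_idr table; index 0 is never a valid id. */
static bool fence_in_use[MAX_FENCES];

/* Hypothetical helper: accept a userspace-chosen fence id or reject it. */
static int claim_fence_id(uint32_t id)
{
	/* Zero is reserved as "invalid" and would never be removed again. */
	if (id == 0)
		return -EINVAL;

	/* Reject ids that are out of range or already taken. */
	if (id >= MAX_FENCES || fence_in_use[id])
		return -EINVAL;

	fence_in_use[id] = true;
	return 0;
}
```

Without the explicit zero check, a lookup for id 0 finds nothing and the request would be accepted.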
Fixes: 17154addc5c1 ("drm/msm: Add MSM_SUBMIT_FENCE_SN_IN") Signed-off-by: Rob Clark robdclark@chromium.org Patchwork: https://patchwork.freedesktop.org/patch/549180/ Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 10cad7b99bac8..1bd78041b4d0d 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -902,7 +902,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, * after the job is armed */ if ((args->flags & MSM_SUBMIT_FENCE_SN_IN) && - idr_find(&queue->fence_idr, args->fence)) { + (!args->fence || idr_find(&queue->fence_idr, args->fence))) { spin_unlock(&queue->idr_lock); idr_preload_end(); ret = -EINVAL;
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 53e7d08f6d6e214c40db1f51291bb2975c789dc2 ]
In ublk_ctrl_start_dev(), if wait_for_completion_interruptible() is interrupted by a signal, the queues have not been set up yet, so UBLK_CMD_START_DEV has to fail; otherwise a kernel oops can be triggered.
Reported by German when working on qemu-storage-daemon, which requires a single-threaded ublk daemon.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reported-by: German Maglione gmaglione@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20230726144502.566785-2-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 33d3298a0da16..dc2856d1241fc 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1632,7 +1632,8 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd) if (ublksrv_pid <= 0) return -EINVAL;
- wait_for_completion_interruptible(&ub->completion); + if (wait_for_completion_interruptible(&ub->completion) != 0) + return -EINTR;
schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 0c0cbd4ebc375ceebc75c89df04b74f215fab23a ]
In ublk_ctrl_end_recovery(), if wait_for_completion_interruptible() is interrupted by a signal, the queues have not been set up yet, so UBLK_CMD_END_USER_RECOVERY has to fail; otherwise a kernel oops can be triggered.
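The point is that an interruptible wait reports whether it actually completed, and the caller must not continue when it did not. A minimal userspace sketch, assuming a hypothetical fake_wait_interruptible() stub that returns a negative value when a signal arrived first (the kernel primitive is not available in userspace, and -EINTR is used here as the stand-in error code):

```c
#include <errno.h>

/* Hypothetical stub: 0 on completion, negative if interrupted. */
static int fake_wait_interruptible(int interrupted)
{
	return interrupted ? -EINTR : 0;
}

/* Hypothetical command handler that only proceeds after a full wait. */
static int end_recovery(int interrupted, int *queues_ready)
{
	if (fake_wait_interruptible(interrupted))
		return -EINTR;	/* queues are not set up; fail the command */

	*queues_ready = 1;	/* safe: the wait really completed */
	return 0;
}
```

Ignoring the return value would let the handler touch state that was never initialized.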
Fixes: c732a852b419 ("ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support") Reported-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Link: https://lore.kernel.org/r/20230726144502.566785-3-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index dc2856d1241fc..bf0711894c0a2 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -2107,7 +2107,9 @@ static int ublk_ctrl_end_recovery(struct ublk_device *ub, pr_devel("%s: Waiting for new ubq_daemons(nr: %d) are ready, dev id %d...\n", __func__, ub->dev_info.nr_hw_queues, header->dev_id); /* wait until new ubq_daemon sending all FETCH_REQ */ - wait_for_completion_interruptible(&ub->completion); + if (wait_for_completion_interruptible(&ub->completion)) + return -EINTR; + pr_devel("%s: All new ubq_daemons(nr: %d) are ready, dev id %d\n", __func__, ub->dev_info.nr_hw_queues, header->dev_id);
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 3e9dce80dbf91972aed972c743f539c396a34312 ]
If the user interrupts wait_event_interruptible() in ublk_ctrl_del_dev(), return -EINTR so that the user knows what happened.
Fixes: 0abe39dec065 ("block: ublk: improve handling device deletion") Reported-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Reviewed-by: Stefano Garzarella sgarzare@redhat.com Link: https://lore.kernel.org/r/20230726144502.566785-4-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index bf0711894c0a2..e6b6e5eee4dea 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1909,8 +1909,8 @@ static int ublk_ctrl_del_dev(struct ublk_device **p_ub) * - the device number is freed already, we will not find this * device via ublk_get_device_from_id() */ - wait_event_interruptible(ublk_idr_wq, ublk_idr_freed(idx)); - + if (wait_event_interruptible(ublk_idr_wq, ublk_idr_freed(idx))) + return -EINTR; return 0; }
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit 99f98a7c0d6985d5507c8130a981972e4b7b3bdc ]
syzkaller found a race where IOMMUFD_DESTROY increments the refcount:
	obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY);
	if (IS_ERR(obj))
		return PTR_ERR(obj);
	iommufd_ref_to_users(obj);
	/* See iommufd_ref_to_users() */
	if (!iommufd_object_destroy_user(ucmd->ictx, obj))

This happens as part of the sequence that joins the two existing primitives together.
Allowing the refcount to be elevated without holding the destroy_rwsem violates the assumption that all temporary refcount elevations are protected by destroy_rwsem. Racing IOMMUFD_DESTROY with iommufd_object_destroy_user() will cause spurious failures:
WARNING: CPU: 0 PID: 3076 at drivers/iommu/iommufd/device.c:477 iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:478 Modules linked in: CPU: 0 PID: 3076 Comm: syz-executor.0 Not tainted 6.3.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023 RIP: 0010:iommufd_access_destroy+0x18/0x20 drivers/iommu/iommufd/device.c:477 Code: e8 3d 4e 00 00 84 c0 74 01 c3 0f 0b c3 0f 1f 44 00 00 f3 0f 1e fa 48 89 fe 48 8b bf a8 00 00 00 e8 1d 4e 00 00 84 c0 74 01 c3 <0f> 0b c3 0f 1f 44 00 00 41 57 41 56 41 55 4c 8d ae d0 00 00 00 41 RSP: 0018:ffffc90003067e08 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888109ea0300 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000ffffffff RBP: 0000000000000004 R08: 0000000000000000 R09: ffff88810bbb3500 R10: ffff88810bbb3e48 R11: 0000000000000000 R12: ffffc90003067e88 R13: ffffc90003067ea8 R14: ffff888101249800 R15: 00000000fffffffe FS: 00007ff7254fe6c0(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555557262da8 CR3: 000000010a6fd000 CR4: 0000000000350ef0 Call Trace: <TASK> iommufd_test_create_access drivers/iommu/iommufd/selftest.c:596 [inline] iommufd_test+0x71c/0xcf0 drivers/iommu/iommufd/selftest.c:813 iommufd_fops_ioctl+0x10f/0x1b0 drivers/iommu/iommufd/main.c:337 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x84/0xc0 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The solution is to not increment the refcount on the IOMMUFD_DESTROY path at all. Instead use the xa_lock to serialize everything. The refcount check == 1 and xa_erase can be done under a single critical region. This avoids the need for any refcount incrementing.
It has the downside that if userspace races destroy with other operations it will get an EBUSY instead of waiting, but this kind of racing is already dangerous.
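The idea of checking "refcount == 1" and erasing the entry inside one critical region can be sketched in userspace. This is a deliberate simplification: a pthread mutex stands in for the xa_lock, a plain int guarded by that mutex stands in for the atomic refcount, and struct object and object_remove() are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>

struct object {
	int users;	/* reference count; 1 means only the table holds it */
	int in_table;	/* nonzero while the object is still looked up by id */
};

/* Stand-in for the xarray lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Remove obj from the table only if the table holds the sole reference.
 * The check and the removal happen under the same lock, so no temporary
 * reference has to be taken.
 */
static int object_remove(struct object *obj)
{
	int ret = 0;

	pthread_mutex_lock(&table_lock);
	if (!obj->in_table) {
		ret = -ENOENT;
	} else if (obj->users != 1) {
		ret = -EBUSY;	/* someone else is using it; do not wait */
	} else {
		obj->users = 0;
		obj->in_table = 0;
	}
	pthread_mutex_unlock(&table_lock);

	return ret;
}
```

On success the caller owns an object with a zero reference count and is responsible for destroying it, which mirrors the contract stated in the commit message.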
Fixes: 2ff4bed7fee7 ("iommufd: File descriptor, context, kconfig and makefiles") Link: https://lore.kernel.org/r/2-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.com Reviewed-by: Kevin Tian kevin.tian@intel.com Reported-by: syzbot+7574ebfe589049630608@syzkaller.appspotmail.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/iommufd/device.c | 12 +--- drivers/iommu/iommufd/iommufd_private.h | 15 ++++- drivers/iommu/iommufd/main.c | 78 +++++++++++++++++++------ 3 files changed, 75 insertions(+), 30 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 29d05663d4d17..ed2937a4e196f 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -109,10 +109,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD); */ void iommufd_device_unbind(struct iommufd_device *idev) { - bool was_destroyed; - - was_destroyed = iommufd_object_destroy_user(idev->ictx, &idev->obj); - WARN_ON(!was_destroyed); + iommufd_object_destroy_user(idev->ictx, &idev->obj); } EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
@@ -382,7 +379,7 @@ void iommufd_device_detach(struct iommufd_device *idev) mutex_unlock(&hwpt->devices_lock);
if (hwpt->auto_domain) - iommufd_object_destroy_user(idev->ictx, &hwpt->obj); + iommufd_object_deref_user(idev->ictx, &hwpt->obj); else refcount_dec(&hwpt->obj.users);
@@ -456,10 +453,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD); */ void iommufd_access_destroy(struct iommufd_access *access) { - bool was_destroyed; - - was_destroyed = iommufd_object_destroy_user(access->ictx, &access->obj); - WARN_ON(!was_destroyed); + iommufd_object_destroy_user(access->ictx, &access->obj); } EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index b38e67d1988bd..f9790983699ce 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -176,8 +176,19 @@ void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx, struct iommufd_object *obj); void iommufd_object_finalize(struct iommufd_ctx *ictx, struct iommufd_object *obj); -bool iommufd_object_destroy_user(struct iommufd_ctx *ictx, - struct iommufd_object *obj); +void __iommufd_object_destroy_user(struct iommufd_ctx *ictx, + struct iommufd_object *obj, bool allow_fail); +static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx, + struct iommufd_object *obj) +{ + __iommufd_object_destroy_user(ictx, obj, false); +} +static inline void iommufd_object_deref_user(struct iommufd_ctx *ictx, + struct iommufd_object *obj) +{ + __iommufd_object_destroy_user(ictx, obj, true); +} + struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, enum iommufd_object_type type); diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 3fbe636c3d8a6..4cf5f73f27084 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -116,14 +116,56 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id, return obj; }
+/* + * Remove the given object id from the xarray if the only reference to the + * object is held by the xarray. The caller must call ops destroy(). + */ +static struct iommufd_object *iommufd_object_remove(struct iommufd_ctx *ictx, + u32 id, bool extra_put) +{ + struct iommufd_object *obj; + XA_STATE(xas, &ictx->objects, id); + + xa_lock(&ictx->objects); + obj = xas_load(&xas); + if (xa_is_zero(obj) || !obj) { + obj = ERR_PTR(-ENOENT); + goto out_xa; + } + + /* + * If the caller is holding a ref on obj we put it here under the + * spinlock. + */ + if (extra_put) + refcount_dec(&obj->users); + + if (!refcount_dec_if_one(&obj->users)) { + obj = ERR_PTR(-EBUSY); + goto out_xa; + } + + xas_store(&xas, NULL); + if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj)) + ictx->vfio_ioas = NULL; + +out_xa: + xa_unlock(&ictx->objects); + + /* The returned object reference count is zero */ + return obj; +} + /* * The caller holds a users refcount and wants to destroy the object. Returns * true if the object was destroyed. In all cases the caller no longer has a * reference on obj. */ -bool iommufd_object_destroy_user(struct iommufd_ctx *ictx, - struct iommufd_object *obj) +void __iommufd_object_destroy_user(struct iommufd_ctx *ictx, + struct iommufd_object *obj, bool allow_fail) { + struct iommufd_object *ret; + /* * The purpose of the destroy_rwsem is to ensure deterministic * destruction of objects used by external drivers and destroyed by this @@ -131,22 +173,22 @@ bool iommufd_object_destroy_user(struct iommufd_ctx *ictx, * side of this, such as during ioctl execution. 
*/ down_write(&obj->destroy_rwsem); - xa_lock(&ictx->objects); - refcount_dec(&obj->users); - if (!refcount_dec_if_one(&obj->users)) { - xa_unlock(&ictx->objects); - up_write(&obj->destroy_rwsem); - return false; - } - __xa_erase(&ictx->objects, obj->id); - if (ictx->vfio_ioas && &ictx->vfio_ioas->obj == obj) - ictx->vfio_ioas = NULL; - xa_unlock(&ictx->objects); + ret = iommufd_object_remove(ictx, obj->id, true); up_write(&obj->destroy_rwsem);
+ if (allow_fail && IS_ERR(ret)) + return; + + /* + * If there is a bug and we couldn't destroy the object then we did put + * back the caller's refcount and will eventually try to free it again + * during close. + */ + if (WARN_ON(IS_ERR(ret))) + return; + iommufd_object_ops[obj->type].destroy(obj); kfree(obj); - return true; }
static int iommufd_destroy(struct iommufd_ucmd *ucmd) @@ -154,13 +196,11 @@ static int iommufd_destroy(struct iommufd_ucmd *ucmd) struct iommu_destroy *cmd = ucmd->cmd; struct iommufd_object *obj;
- obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY); + obj = iommufd_object_remove(ucmd->ictx, cmd->id, false); if (IS_ERR(obj)) return PTR_ERR(obj); - iommufd_ref_to_users(obj); - /* See iommufd_ref_to_users() */ - if (!iommufd_object_destroy_user(ucmd->ictx, obj)) - return -EBUSY; + iommufd_object_ops[obj->type].destroy(obj); + kfree(obj); return 0; }
From: Hugh Dickins hughd@google.com
[ Upstream commit 253e5df8b8f0145adb090f57c6f4e6efa52d738e ]
The noswap mount option is surely not one of the three options for sizing: move its description down.
The huge= mount option does not accept numeric values: those are just in an internal enum. Delete those numbers, and follow the manpage text more closely (but there's not yet any fadvise() or fcntl() which applies here).
/sys/kernel/mm/transparent_hugepage/shmem_enabled is hard to describe, and barely relevant to mounting a tmpfs: just refer to transhuge.rst (while still using the words deny and force, to help as informal reminders).
[rdunlap@infradead.org: fixup Docs table for huge mount options] Link: https://lkml.kernel.org/r/20230725052333.26857-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/986cb0bf-9780-354-9bb-4bf57aadbab@google.com Signed-off-by: Hugh Dickins hughd@google.com Signed-off-by: Randy Dunlap rdunlap@infradead.org Fixes: d0f5a85442d1 ("shmem: update documentation") Fixes: 2c6efe9cf2d7 ("shmem: add support to ignore swap") Reviewed-by: Luis Chamberlain mcgrof@kernel.org Cc: Christian Brauner brauner@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/filesystems/tmpfs.rst | 47 ++++++++++++----------------- 1 file changed, 20 insertions(+), 27 deletions(-)
diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst index f18f46be5c0c7..2cd8fa332feb7 100644 --- a/Documentation/filesystems/tmpfs.rst +++ b/Documentation/filesystems/tmpfs.rst @@ -84,8 +84,6 @@ nr_inodes The maximum number of inodes for this instance. The default is half of the number of your physical RAM pages, or (on a machine with highmem) the number of lowmem RAM pages, whichever is the lower. -noswap Disables swap. Remounts must respect the original settings. - By default swap is enabled. ========= ============================================================
These parameters accept a suffix k, m or g for kilo, mega and giga and @@ -99,36 +97,31 @@ mount with such options, since it allows any user with write access to use up all the memory on the machine; but enhances the scalability of that instance in a system with many CPUs making intensive use of it.
+tmpfs blocks may be swapped out, when there is a shortage of memory. +tmpfs has a mount option to disable its use of swap: + +====== =========================================================== +noswap Disables swap. Remounts must respect the original settings. + By default swap is enabled. +====== =========================================================== + tmpfs also supports Transparent Huge Pages which requires a kernel configured with CONFIG_TRANSPARENT_HUGEPAGE and with huge supported for your system (has_transparent_hugepage(), which is architecture specific). The mount options for this are:
-====== ============================================================ -huge=0 never: disables huge pages for the mount -huge=1 always: enables huge pages for the mount -huge=2 within_size: only allocate huge pages if the page will be - fully within i_size, also respect fadvise()/madvise() hints. -huge=3 advise: only allocate huge pages if requested with - fadvise()/madvise() -====== ============================================================ - -There is a sysfs file which you can also use to control system wide THP -configuration for all tmpfs mounts, the file is: - -/sys/kernel/mm/transparent_hugepage/shmem_enabled - -This sysfs file is placed on top of THP sysfs directory and so is registered -by THP code. It is however only used to control all tmpfs mounts with one -single knob. Since it controls all tmpfs mounts it should only be used either -for emergency or testing purposes. The values you can set for shmem_enabled are: - -== ============================================================ --1 deny: disables huge on shm_mnt and all mounts, for - emergency use --2 force: enables huge on shm_mnt and all mounts, w/o needing - option, for testing -== ============================================================ +================ ============================================================== +huge=never Do not allocate huge pages. This is the default. +huge=always Attempt to allocate huge page every time a new page is needed. +huge=within_size Only allocate huge page if it will be fully within i_size. + Also respect madvise(2) hints. +huge=advise Only allocate huge page if requested with madvise(2). +================ ============================================================== + +See also Documentation/admin-guide/mm/transhuge.rst, which describes the +sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled: which can +be used to deny huge pages on all tmpfs mounts in an emergency, or to +force huge pages on all tmpfs mounts for testing.
tmpfs has a mount option to set the NUMA memory allocation policy for all files in that instance (if CONFIG_NUMA is enabled) - which can be
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 3fc2febb0f8ffae354820c1772ec008733237cfa ]
The global function triggers a warning because of the missing prototype
drivers/ata/pata_ns87415.c:263:6: warning: no previous prototype for 'ns87560_tf_read' [-Wmissing-prototypes]
  263 | void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
There are no other references to this, so just make it static.
Fixes: c4b5b7b6c4423 ("pata_ns87415: Initial cut at 87415/87560 IDE support") Reviewed-by: Sergey Shtylyov s.shtylyov@omp.ru Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/pata_ns87415.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ata/pata_ns87415.c b/drivers/ata/pata_ns87415.c index d60e1f69d7b02..c697219a61a2d 100644 --- a/drivers/ata/pata_ns87415.c +++ b/drivers/ata/pata_ns87415.c @@ -260,7 +260,7 @@ static u8 ns87560_check_status(struct ata_port *ap) * LOCKING: * Inherited from caller. */ -void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf) +static void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf) { struct ata_ioports *ioaddr = &ap->ioaddr;
From: Zheng Yejian zhengyejian1@huawei.com
[ Upstream commit 2d093282b0d4357373497f65db6a05eb0c28b7c8 ]
When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set to 0 in order to make sure any read iterators reset themselves. However, this corrupts the 'entries' statistic; see the following steps:
  # cd /sys/kernel/tracing/
  # 1. Enlarge the ring buffer in preparation for shrinking it later:
  # echo 20 > per_cpu/cpu0/buffer_size_kb
  # 2. Write a log into the ring buffer of cpu0:
  # taskset -c 0 echo "hello1" > trace_marker
  # 3. Read the log:
  # cat per_cpu/cpu0/trace_pipe
      <...>-332 [000] ..... 62.406844: tracing_mark_write: hello1
  # 4. Stop reading and see the stats, now 0 entries, and 1 event read:
  # cat per_cpu/cpu0/stats
   entries: 0
   [...]
   read events: 1
  # 5. Reduce the ring buffer
  # echo 7 > per_cpu/cpu0/buffer_size_kb
  # 6. Now entries unexpectedly becomes 1 although there are actually no entries!!!
  # cat per_cpu/cpu0/stats
   entries: 1
   [...]
   read events: 0
To fix it, introduce a 'pages_removed' field to count the total pages removed since the last reset, then use it to let read iterators reset themselves instead of changing the 'read' pointer.
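The idea can be sketched in userspace as follows. This is a minimal model, not the kernel code: the struct and function names (cpu_buffer_model, iter_model, iter_reset, iter_is_stale, remove_pages) are hypothetical stand-ins, and only the two cached counters from the patch are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cpu_buffer_model {
	unsigned long read;          /* events consumed by readers */
	unsigned long pages_removed; /* pages removed since last reset */
};

struct iter_model {
	unsigned long cache_read;
	unsigned long cache_pages_removed;
};

/* Snapshot the buffer state, in the spirit of rb_iter_reset(). */
static void iter_reset(struct iter_model *it, const struct cpu_buffer_model *cb)
{
	it->cache_read = cb->read;
	it->cache_pages_removed = cb->pages_removed;
}

/* An iterator is stale after a consuming read or after pages were removed. */
static bool iter_is_stale(const struct iter_model *it,
			  const struct cpu_buffer_model *cb)
{
	return it->cache_read != cb->read ||
	       it->cache_pages_removed != cb->pages_removed;
}

/* Shrinking bumps the removal counter and leaves 'read' untouched. */
static void remove_pages(struct cpu_buffer_model *cb, unsigned long nr_removed)
{
	assert(nr_removed > 0);
	cb->pages_removed += nr_removed;
}
```

Because 'read' is no longer overwritten, statistics derived from it stay consistent across a resize, while iterators still notice the change through the separate counter.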
Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyej...
Cc: mhiramat@kernel.org Cc: vnagarnaik@google.com Fixes: 83f40318dab0 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ring_buffer.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 14d8001140c82..99634b29a8b82 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -523,6 +523,8 @@ struct ring_buffer_per_cpu { rb_time_t before_stamp; u64 event_stamp[MAX_NEST]; u64 read_stamp; + /* pages removed since last reset */ + unsigned long pages_removed; /* ring buffer pages to update, > 0 to add, < 0 to remove */ long nr_pages_to_update; struct list_head new_pages; /* new pages to add */ @@ -558,6 +560,7 @@ struct ring_buffer_iter { struct buffer_page *head_page; struct buffer_page *cache_reader_page; unsigned long cache_read; + unsigned long cache_pages_removed; u64 read_stamp; u64 page_stamp; struct ring_buffer_event *event; @@ -1956,6 +1959,8 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages) to_remove = rb_list_head(to_remove)->next; head_bit |= (unsigned long)to_remove & RB_PAGE_HEAD; } + /* Read iterators need to reset themselves when some pages removed */ + cpu_buffer->pages_removed += nr_removed;
next_page = rb_list_head(to_remove)->next;
@@ -1977,12 +1982,6 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages) cpu_buffer->head_page = list_entry(next_page, struct buffer_page, list);
- /* - * change read pointer to make sure any read iterators reset - * themselves - */ - cpu_buffer->read = 0; - /* pages are removed, resume tracing and then free the pages */ atomic_dec(&cpu_buffer->record_disabled); raw_spin_unlock_irq(&cpu_buffer->reader_lock); @@ -4392,6 +4391,7 @@ static void rb_iter_reset(struct ring_buffer_iter *iter)
iter->cache_reader_page = iter->head_page; iter->cache_read = cpu_buffer->read; + iter->cache_pages_removed = cpu_buffer->pages_removed;
if (iter->head) { iter->read_stamp = cpu_buffer->read_stamp; @@ -4846,12 +4846,13 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts) buffer = cpu_buffer->buffer;
/* - * Check if someone performed a consuming read to - * the buffer. A consuming read invalidates the iterator - * and we need to reset the iterator in this case. + * Check if someone performed a consuming read to the buffer + * or removed some pages from the buffer. In these cases, + * iterator was invalidated and we need to reset it. */ if (unlikely(iter->cache_read != cpu_buffer->read || - iter->cache_reader_page != cpu_buffer->reader_page)) + iter->cache_reader_page != cpu_buffer->reader_page || + iter->cache_pages_removed != cpu_buffer->pages_removed)) rb_iter_reset(iter);
again: @@ -5295,6 +5296,7 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer) cpu_buffer->last_overrun = 0;
rb_head_page_activate(cpu_buffer); + cpu_buffer->pages_removed = 0; }
/* Must have disabled the cpu buffer then done a synchronize_rcu */
From: Zheng Yejian zhengyejian1@huawei.com
[ Upstream commit dea499781a1150d285c62b26659f62fb00824fce ]
Warning happened in trace_buffered_event_disable() at WARN_ON_ONCE(!trace_buffered_event_ref)
Call Trace:
 ? __warn+0xa5/0x1b0
 ? trace_buffered_event_disable+0x189/0x1b0
 __ftrace_event_enable_disable+0x19e/0x3e0
 free_probe_data+0x3b/0xa0
 unregister_ftrace_function_probe_func+0x6b8/0x800
 event_enable_func+0x2f0/0x3d0
 ftrace_process_regex.isra.0+0x12d/0x1b0
 ftrace_filter_write+0xe6/0x140
 vfs_write+0x1c9/0x6f0
 [...]
The cause of the warning is that, in __ftrace_event_enable_disable(), trace_buffered_event_enable() was called once while trace_buffered_event_disable() was called twice. The reproduction script is shown below; for analysis, see the comments:
```
#!/bin/bash

cd /sys/kernel/tracing/

# 1. Register a 'disable_event' command, then:
#    1) SOFT_DISABLED_BIT was set;
#    2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
    set_ftrace_filter

# 2. Enable the event registered, then:
#    1) SOFT_DISABLED_BIT was cleared;
#    2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable

# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
#    set again!!!
cat /proc/cmdline

# 4. Unregister the 'disable_event' command, then:
#    1) SOFT_DISABLED_BIT was cleared again;
#    2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
    set_ftrace_filter
```
To fix it, IIUC, we can call trace_buffered_event_enable() the first time soft mode is enabled, and call trace_buffered_event_disable() the last time soft mode is disabled.
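The pairing rule described above can be illustrated with a small userspace model. The names here (event_file_model, soft_mode_get, soft_mode_put, buffered_event_ref) are hypothetical; the real code uses an atomic sm_ref and the trace_buffered_event_enable()/disable() helpers.

```c
#include <assert.h>

/* Hypothetical model: sm_ref counts soft-mode users of one event file. */
struct event_file_model {
	int sm_ref;
};

/* Use count of the shared buffer; stands in for trace_buffered_event_ref. */
static int buffered_event_ref;

static void soft_mode_get(struct event_file_model *file)
{
	/* Only the first soft-mode user enables the buffered event. */
	if (++file->sm_ref == 1)
		buffered_event_ref++;
}

static void soft_mode_put(struct event_file_model *file)
{
	assert(file->sm_ref > 0);
	/* Only the last soft-mode user disables it, so calls stay paired. */
	if (--file->sm_ref == 0) {
		assert(buffered_event_ref > 0);
		buffered_event_ref--;
	}
}
```

Tying enable and disable to the soft-mode transitions, rather than to changes of the SOFT_DISABLED bit, means the bit can toggle any number of times in between without unbalancing the count.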
Link: https://lore.kernel.org/linux-trace-kernel/20230726095804.920457-1-zhengyeji...
Cc: mhiramat@kernel.org Fixes: 0fc1b09ff1ff ("tracing: Use temp buffer when filtering events") Signed-off-by: Zheng Yejian zhengyejian1@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_events.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 57e539d479890..32f39eabc0716 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -611,7 +611,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, { struct trace_event_call *call = file->event_call; struct trace_array *tr = file->tr; - unsigned long file_flags = file->flags; int ret = 0; int disable;
@@ -635,6 +634,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, break; disable = file->flags & EVENT_FILE_FL_SOFT_DISABLED; clear_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags); + /* Disable use of trace_buffered_event */ + trace_buffered_event_disable(); } else disable = !(file->flags & EVENT_FILE_FL_SOFT_MODE);
@@ -673,6 +674,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, if (atomic_inc_return(&file->sm_ref) > 1) break; set_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags); + /* Enable use of trace_buffered_event */ + trace_buffered_event_enable(); }
if (!(file->flags & EVENT_FILE_FL_ENABLED)) { @@ -712,15 +715,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file, break; }
- /* Enable or disable use of trace_buffered_event */ - if ((file_flags & EVENT_FILE_FL_SOFT_DISABLED) != - (file->flags & EVENT_FILE_FL_SOFT_DISABLED)) { - if (file->flags & EVENT_FILE_FL_SOFT_DISABLED) - trace_buffered_event_enable(); - else - trace_buffered_event_disable(); - } - return ret; }
From: Dan Carpenter dan.carpenter@linaro.org
commit a8291be6b5dd465c22af229483dbac543a91e24e upstream.
This reverts commit f08aa7c80dac27ee00fa6827f447597d2fba5465.
The reverted commit was based on static analysis and a misunderstanding of how PTR_ERR() and NULLs are supposed to work. When a function returns both pointer errors and NULL then normally the NULL means "continue operating without a feature because it was deliberately turned off". The NULL should not be treated as a failure. If a driver cannot work when that feature is disabled then the KConfig should enforce that the function cannot return NULL. We should not need to test for it.
In this driver, the bug means that probe cannot succeed when CONFIG_PM is disabled.
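As a rough illustration of the convention, here is a userspace sketch. The model_* helpers re-create the idea behind ERR_PTR(), PTR_ERR() and IS_ERR() for demonstration only, and model_attach() is a hypothetical stand-in for a function that returns NULL when the feature is disabled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Userspace re-creations of the kernel helpers, for illustration only. */
static void *model_err_ptr(long err)
{
	assert(err < 0 && err >= -MODEL_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Hypothetical attach helper: returns NULL when the feature is disabled
 * (pm_enabled == 0), an error pointer on failure, or a valid handle.
 */
static void *model_attach(int pm_enabled, int fail)
{
	static int handle;

	if (!pm_enabled)
		return NULL;
	if (fail)
		return model_err_ptr(-ENODEV);
	return &handle;
}

/* Probe-style caller: only error pointers abort; NULL means "carry on". */
static int model_powerdomain_init(int pm_enabled, int fail)
{
	void *dev = model_attach(pm_enabled, fail);

	if (model_is_err(dev))
		return (int)model_ptr_err(dev);
	return 0;
}
```

With an IS_ERR_OR_NULL()-style check, the pm_enabled == 0 case would be turned into a failure, which is the probe regression described above.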
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Fixes: f08aa7c80dac ("usb: gadget: tegra-xudc: Fix error check in tegra_xudc_powerdomain_init()") Cc: stable stable@kernel.org Link: https://lore.kernel.org/r/ZKQoBa84U/ykEh3C@moroto Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/tegra-xudc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -3718,15 +3718,15 @@ static int tegra_xudc_powerdomain_init(s int err;
xudc->genpd_dev_device = dev_pm_domain_attach_by_name(dev, "dev"); - if (IS_ERR_OR_NULL(xudc->genpd_dev_device)) { - err = PTR_ERR(xudc->genpd_dev_device) ? : -ENODATA; + if (IS_ERR(xudc->genpd_dev_device)) { + err = PTR_ERR(xudc->genpd_dev_device); dev_err(dev, "failed to get device power domain: %d\n", err); return err; }
xudc->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "ss"); - if (IS_ERR_OR_NULL(xudc->genpd_dev_ss)) { - err = PTR_ERR(xudc->genpd_dev_ss) ? : -ENODATA; + if (IS_ERR(xudc->genpd_dev_ss)) { + err = PTR_ERR(xudc->genpd_dev_ss); dev_err(dev, "failed to get SuperSpeed power domain: %d\n", err); return err; }
From: Frank Li Frank.Li@nxp.com
commit f4fc01af5b640bc39bd9403b5fd855345a2ad5f8 upstream.
The legacy gadget driver omitted calling usb_gadget_check_config() to ensure that the USB device controller (UDC) has adequate resources, including sufficient endpoint numbers and types, to support the given configuration.
Previously, usb_add_config() was solely invoked by the legacy gadget driver. Add the necessary usb_gadget_check_config() call after the bind() operation to fix the issue.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number") Cc: stable stable@kernel.org Reported-by: Ravi Gunasekaran r-gunasekaran@ti.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20230707230015.494999-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/composite.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -1125,6 +1125,10 @@ int usb_add_config(struct usb_composite_ goto done;
status = bind(config); + + if (status == 0) + status = usb_gadget_check_config(cdev->gadget); + if (status < 0) { while (!list_empty(&config->functions)) { struct usb_function *f;
From: Zqiang qiang.zhang1211@gmail.com
commit 83e30f2bf86ef7c38fbd476ed81a88522b620628 upstream.
Currently, raw_dev->count is incremented before raw_queue_event() is invoked. If raw_queue_event() returns an error, a subsequent raw_release() does not trigger a call to dev_free().
[ 268.905865][ T5067] raw-gadget.0 gadget.0: failed to queue event
[ 268.912053][ T5067] udc dummy_udc.0: failed to start USB Raw Gadget: -12
[ 268.918885][ T5067] raw-gadget.0: probe of gadget.0 failed with error -12
[ 268.925956][ T5067] UDC core: USB Raw Gadget: couldn't find an available UDC or it's busy
[ 268.934657][ T5067] misc raw-gadget: fail, usb_gadget_register_driver returned -16
BUG: memory leak
[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076
[<ffffffff8347eb55>] kmalloc include/linux/slab.h:582 [inline]
[<ffffffff8347eb55>] kzalloc include/linux/slab.h:703 [inline]
[<ffffffff8347eb55>] dev_new drivers/usb/gadget/legacy/raw_gadget.c:191 [inline]
[<ffffffff8347eb55>] raw_open+0x45/0x110 drivers/usb/gadget/legacy/raw_gadget.c:385
[<ffffffff827d1d09>] misc_open+0x1a9/0x1f0 drivers/char/misc.c:165

[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076
[<ffffffff8347cd2f>] kmalloc include/linux/slab.h:582 [inline]
[<ffffffff8347cd2f>] raw_ioctl_init+0xdf/0x410 drivers/usb/gadget/legacy/raw_gadget.c:460
[<ffffffff8347dfe9>] raw_ioctl+0x5f9/0x1120 drivers/usb/gadget/legacy/raw_gadget.c:1250
[<ffffffff81685173>] vfs_ioctl fs/ioctl.c:51 [inline]

[<ffffffff8154bf94>] kmalloc_trace+0x24/0x90 mm/slab_common.c:1076
[<ffffffff833ecc6a>] kmalloc include/linux/slab.h:582 [inline]
[<ffffffff833ecc6a>] kzalloc include/linux/slab.h:703 [inline]
[<ffffffff833ecc6a>] dummy_alloc_request+0x5a/0xe0 drivers/usb/gadget/udc/dummy_hcd.c:665
[<ffffffff833e9132>] usb_ep_alloc_request+0x22/0xd0 drivers/usb/gadget/udc/core.c:196
[<ffffffff8347f13d>] gadget_bind+0x6d/0x370 drivers/usb/gadget/legacy/raw_gadget.c:292
This commit therefore invokes kref_get() only if raw_queue_event() returns success.
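The ordering can be modelled in userspace as below. This is only a sketch: raw_dev_model, model_queue_event() and model_gadget_bind() are hypothetical names, a plain integer stands in for the kref, and the 'bound' flag stands in for the gadget data pointer.

```c
#include <assert.h>
#include <errno.h>

struct raw_dev_model {
	int count; /* stands in for the kref; starts at 1 after open */
	int bound; /* stands in for the gadget data pointer */
};

/* Hypothetical event queueing that can be forced to fail with -ENOMEM. */
static int model_queue_event(int fail)
{
	return fail ? -ENOMEM : 0;
}

static int model_gadget_bind(struct raw_dev_model *dev, int fail_queue)
{
	int ret;

	assert(dev->count > 0);
	dev->bound = 1;

	ret = model_queue_event(fail_queue);
	if (ret < 0) {
		/* Undo the binding and take no extra reference. */
		dev->bound = 0;
		return ret;
	}

	/* Matches the put in unbind, which only runs after a successful bind. */
	dev->count++;
	return ret;
}
```

Taking the reference last means a failed bind leaves the count exactly where open left it, so the final release can drop it to zero and free the device.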
Reported-by: syzbot+feb045d335c1fdde5bf7@syzkaller.appspotmail.com Cc: stable stable@kernel.org Closes: https://syzkaller.appspot.com/bug?extid=feb045d335c1fdde5bf7 Signed-off-by: Zqiang qiang.zhang1211@gmail.com Reviewed-by: Andrey Konovalov andreyknvl@gmail.com Tested-by: Andrey Konovalov andreyknvl@gmail.com Link: https://lore.kernel.org/r/20230714074011.20989-1-qiang.zhang1211@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/legacy/raw_gadget.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/gadget/legacy/raw_gadget.c b/drivers/usb/gadget/legacy/raw_gadget.c index 2acece16b890..e549022642e5 100644 --- a/drivers/usb/gadget/legacy/raw_gadget.c +++ b/drivers/usb/gadget/legacy/raw_gadget.c @@ -310,13 +310,15 @@ static int gadget_bind(struct usb_gadget *gadget, dev->eps_num = i; spin_unlock_irqrestore(&dev->lock, flags);
- /* Matches kref_put() in gadget_unbind(). */ - kref_get(&dev->count); - ret = raw_queue_event(dev, USB_RAW_EVENT_CONNECT, 0, NULL); - if (ret < 0) + if (ret < 0) { dev_err(&gadget->dev, "failed to queue event\n"); + set_gadget_data(gadget, NULL); + return ret; + }
+ /* Matches kref_put() in gadget_unbind(). */ + kref_get(&dev->count); return ret; }
From: Michael Grzeschik m.grzeschik@pengutronix.de
commit 6237390644fb92b81f5262877fe545d0d2c7b5d7 upstream.
Commit 286d9975a838 ("usb: gadget: udc: core: Prevent soft_connect_store() race") introduced one extra mutex_unlock() of connect_lock in the usb_gadget_activate() function.
Fixes: 286d9975a838 ("usb: gadget: udc: core: Prevent soft_connect_store() race") Cc: stable stable@kernel.org Signed-off-by: Michael Grzeschik m.grzeschik@pengutronix.de Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20230721222256.1743645-1-m.grzeschik@pengutronix.d... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/core.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/usb/gadget/udc/core.c +++ b/drivers/usb/gadget/udc/core.c @@ -878,7 +878,6 @@ int usb_gadget_activate(struct usb_gadge */ if (gadget->connected) ret = usb_gadget_connect_locked(gadget); - mutex_unlock(&gadget->udc->connect_lock);
unlock: mutex_unlock(&gadget->udc->connect_lock);
From: Sean Christopherson seanjc@google.com
commit eed3013faa401aae662398709410a59bb0646e32 upstream.
Grab a reference to KVM prior to installing VM and vCPU stats file descriptors to ensure the underlying VM and vCPU objects are not freed until the last reference to any and all stats fds are dropped.
Note, the stats paths manually invoke fd_install() and so don't need to grab a reference before creating the file.
Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU") Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM") Reported-by: Zheng Zhang zheng.zhang@email.ucr.edu Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5... Cc: stable@vger.kernel.org Cc: Kees Cook keescook@chromium.org Signed-off-by: Sean Christopherson seanjc@google.com Reviewed-by: Kees Cook keescook@chromium.org Message-Id: 20230711230131.648752-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4047,8 +4047,17 @@ static ssize_t kvm_vcpu_stats_read(struc sizeof(vcpu->stat), user_buffer, size, offset); }
+static int kvm_vcpu_stats_release(struct inode *inode, struct file *file) +{ + struct kvm_vcpu *vcpu = file->private_data; + + kvm_put_kvm(vcpu->kvm); + return 0; +} + static const struct file_operations kvm_vcpu_stats_fops = { .read = kvm_vcpu_stats_read, + .release = kvm_vcpu_stats_release, .llseek = noop_llseek, };
@@ -4069,6 +4078,9 @@ static int kvm_vcpu_ioctl_get_stats_fd(s put_unused_fd(fd); return PTR_ERR(file); } + + kvm_get_kvm(vcpu->kvm); + file->f_mode |= FMODE_PREAD; fd_install(fd, file);
@@ -4712,8 +4724,17 @@ static ssize_t kvm_vm_stats_read(struct sizeof(kvm->stat), user_buffer, size, offset); }
+static int kvm_vm_stats_release(struct inode *inode, struct file *file) +{ + struct kvm *kvm = file->private_data; + + kvm_put_kvm(kvm); + return 0; +} + static const struct file_operations kvm_vm_stats_fops = { .read = kvm_vm_stats_read, + .release = kvm_vm_stats_release, .llseek = noop_llseek, };
@@ -4732,6 +4753,9 @@ static int kvm_vm_ioctl_get_stats_fd(str put_unused_fd(fd); return PTR_ERR(file); } + + kvm_get_kvm(kvm); + file->f_mode |= FMODE_PREAD; fd_install(fd, file);
From: Sean Christopherson seanjc@google.com
commit c4abd7352023aa96114915a0bb2b88016a425cda upstream.
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only if KVM itself is not configured to utilize unrestricted guests, i.e. don't stuff CR0/CR4 for a restricted L2 that is running as the guest of an unrestricted L1. Any attempt to VM-Enter a restricted guest with invalid CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should never observe a restricted L2 with incompatible CR0/CR4, since nested VM-Enter from L1 should have failed.
And if KVM does observe an active, restricted L2 with incompatible state, e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail does more harm than good, as KVM will often neglect to undo the side effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus the damage can easily spill over to L1. On the other hand, letting VM-Enter fail due to bad guest state is more likely to contain the damage to L2 as KVM relies on hardware to perform most guest state consistency checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into L1 irrespective of (un)restricted guest behavior.
Cc: Jim Mattson jmattson@google.com Cc: stable@vger.kernel.org Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it") Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230613203037.1968489-3-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/vmx/vmx.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1503,6 +1503,11 @@ void vmx_set_rflags(struct kvm_vcpu *vcp struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned long old_rflags;
+ /* + * Unlike CR0 and CR4, RFLAGS handling requires checking if the vCPU + * is an unrestricted guest in order to mark L2 as needing emulation + * if L1 runs L2 as a restricted guest. + */ if (is_unrestricted_guest(vcpu)) { kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS); vmx->rflags = rflags; @@ -3238,7 +3243,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, old_cr0_pg = kvm_read_cr0_bits(vcpu, X86_CR0_PG);
hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF); - if (is_unrestricted_guest(vcpu)) + if (enable_unrestricted_guest) hw_cr0 |= KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST; else { hw_cr0 |= KVM_VM_CR0_ALWAYS_ON; @@ -3266,7 +3271,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, } #endif
- if (enable_ept && !is_unrestricted_guest(vcpu)) { + if (enable_ept && !enable_unrestricted_guest) { /* * Ensure KVM has an up-to-date snapshot of the guest's CR3. If * the below code _enables_ CR3 exiting, vmx_cache_reg() will @@ -3397,7 +3402,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long hw_cr4;
hw_cr4 = (cr4_read_shadow() & X86_CR4_MCE) | (cr4 & ~X86_CR4_MCE); - if (is_unrestricted_guest(vcpu)) + if (enable_unrestricted_guest) hw_cr4 |= KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST; else if (vmx->rmode.vm86_active) hw_cr4 |= KVM_RMODE_VM_CR4_ALWAYS_ON; @@ -3417,7 +3422,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, vcpu->arch.cr4 = cr4; kvm_register_mark_available(vcpu, VCPU_EXREG_CR4);
- if (!is_unrestricted_guest(vcpu)) { + if (!enable_unrestricted_guest) { if (enable_ept) { if (!is_paging(vcpu)) { hw_cr4 &= ~X86_CR4_PAE;
From: Sean Christopherson seanjc@google.com
commit 26a0652cb453c72f6aab0974bc4939e9b14f886b upstream.
Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid, e.g. due to setting bits 63:32, illegal combinations, or to a value that isn't allowed in VMX (non-)root mode. The VMX checks in particular are "fun" as failure to disallow Real Mode for an L2 that is configured with unrestricted guest disabled, when KVM itself has unrestricted guest enabled, will result in KVM forcing VM86 mode to virtual Real Mode for L2, but then fail to unwind the related metadata when synthesizing a nested VM-Exit back to L1 (which has unrestricted guest enabled).
Opportunistically fix a benign typo in the prototype for is_valid_cr4().
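The generic part of the CR0 check can be sketched in userspace as follows. The bit positions are the architectural CR0 bits; model_is_valid_cr0() and model_set_sregs() are hypothetical names, and the vendor-specific VMX (non-)root mode checks are deliberately left out of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define CR0_PE (UINT64_C(1) << 0)  /* protection enable */
#define CR0_NW (UINT64_C(1) << 29) /* not write-through */
#define CR0_CD (UINT64_C(1) << 30) /* cache disable */
#define CR0_PG (UINT64_C(1) << 31) /* paging */

static bool model_is_valid_cr0(uint64_t cr0)
{
	/* Bits 63:32 must be clear. */
	if (cr0 & UINT64_C(0xffffffff00000000))
		return false;
	/* NW without CD is an illegal combination. */
	if ((cr0 & CR0_NW) && !(cr0 & CR0_CD))
		return false;
	/* Paging requires protected mode. */
	if ((cr0 & CR0_PG) && !(cr0 & CR0_PE))
		return false;
	return true;
}

/* Validate first, so a rejected value never modifies vCPU state. */
static int model_set_sregs(uint64_t *vcpu_cr0, uint64_t new_cr0)
{
	assert(vcpu_cr0 != NULL);
	if (!model_is_valid_cr0(new_cr0))
		return -EINVAL;
	*vcpu_cr0 = new_cr0;
	return 0;
}
```

Factoring the check out of the setter is what lets the same predicate be used both when setting CR0 and when validating userspace-provided sregs up front.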
Cc: stable@vger.kernel.org Reported-by: syzbot+5feef0b9ee9c8e9e5689@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000f316b705fdf6e2b4@google.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20230613203037.1968489-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/svm/svm.c | 6 ++++++ arch/x86/kvm/vmx/vmx.c | 28 +++++++++++++++++++++------- arch/x86/kvm/x86.c | 34 ++++++++++++++++++++++------------ 5 files changed, 52 insertions(+), 20 deletions(-)
--- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -37,6 +37,7 @@ KVM_X86_OP(get_segment) KVM_X86_OP(get_cpl) KVM_X86_OP(set_segment) KVM_X86_OP(get_cs_db_l_bits) +KVM_X86_OP(is_valid_cr0) KVM_X86_OP(set_cr0) KVM_X86_OP_OPTIONAL(post_set_cr3) KVM_X86_OP(is_valid_cr4) --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1566,9 +1566,10 @@ struct kvm_x86_ops { void (*set_segment)(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l); + bool (*is_valid_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0); void (*post_set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3); - bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr0); + bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4); int (*set_efer)(struct kvm_vcpu *vcpu, u64 efer); void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1799,6 +1799,11 @@ static void sev_post_set_cr3(struct kvm_ } }
+static bool svm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ + return true; +} + void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); @@ -4838,6 +4843,7 @@ static struct kvm_x86_ops svm_x86_ops __ .set_segment = svm_set_segment, .get_cpl = svm_get_cpl, .get_cs_db_l_bits = svm_get_cs_db_l_bits, + .is_valid_cr0 = svm_is_valid_cr0, .set_cr0 = svm_set_cr0, .post_set_cr3 = sev_post_set_cr3, .is_valid_cr4 = svm_is_valid_cr4, --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3045,6 +3045,15 @@ static void enter_rmode(struct kvm_vcpu struct vcpu_vmx *vmx = to_vmx(vcpu); struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
+ /* + * KVM should never use VM86 to virtualize Real Mode when L2 is active, + * as using VM86 is unnecessary if unrestricted guest is enabled, and + * if unrestricted guest is disabled, VM-Enter (from L1) with CR0.PG=0 + * should VM-Fail and KVM should reject userspace attempts to stuff + * CR0.PG=0 when L2 is active. + */ + WARN_ON_ONCE(is_guest_mode(vcpu)); + vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_TR], VCPU_SREG_TR); vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_ES], VCPU_SREG_ES); vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_DS], VCPU_SREG_DS); @@ -3234,6 +3243,17 @@ void ept_save_pdptrs(struct kvm_vcpu *vc #define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \ CPU_BASED_CR3_STORE_EXITING)
+static bool vmx_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ + if (is_guest_mode(vcpu)) + return nested_guest_cr0_valid(vcpu, cr0); + + if (to_vmx(vcpu)->nested.vmxon) + return nested_host_cr0_valid(vcpu, cr0); + + return true; +} + void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -5372,18 +5392,11 @@ static int handle_set_cr0(struct kvm_vcp val = (val & ~vmcs12->cr0_guest_host_mask) | (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask);
- if (!nested_guest_cr0_valid(vcpu, val)) - return 1; - if (kvm_set_cr0(vcpu, val)) return 1; vmcs_writel(CR0_READ_SHADOW, orig_val); return 0; } else { - if (to_vmx(vcpu)->nested.vmxon && - !nested_host_cr0_valid(vcpu, val)) - return 1; - return kvm_set_cr0(vcpu, val); } } @@ -8165,6 +8178,7 @@ static struct kvm_x86_ops vmx_x86_ops __ .set_segment = vmx_set_segment, .get_cpl = vmx_get_cpl, .get_cs_db_l_bits = vmx_get_cs_db_l_bits, + .is_valid_cr0 = vmx_is_valid_cr0, .set_cr0 = vmx_set_cr0, .is_valid_cr4 = vmx_is_valid_cr4, .set_cr4 = vmx_set_cr4, --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -906,6 +906,22 @@ int load_pdptrs(struct kvm_vcpu *vcpu, u } EXPORT_SYMBOL_GPL(load_pdptrs);
+static bool kvm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) +{ +#ifdef CONFIG_X86_64 + if (cr0 & 0xffffffff00000000UL) + return false; +#endif + + if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) + return false; + + if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) + return false; + + return static_call(kvm_x86_is_valid_cr0)(vcpu, cr0); +} + void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned long cr0) { /* @@ -952,20 +968,13 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, u { unsigned long old_cr0 = kvm_read_cr0(vcpu);
- cr0 |= X86_CR0_ET; - -#ifdef CONFIG_X86_64 - if (cr0 & 0xffffffff00000000UL) + if (!kvm_is_valid_cr0(vcpu, cr0)) return 1; -#endif - - cr0 &= ~CR0_RESERVED_BITS;
- if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD)) - return 1; + cr0 |= X86_CR0_ET;
- if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE)) - return 1; + /* Write to CR0 reserved bits are ignored, even on Intel. */ + cr0 &= ~CR0_RESERVED_BITS;
#ifdef CONFIG_X86_64 if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) && @@ -11461,7 +11470,8 @@ static bool kvm_is_valid_sregs(struct kv return false; }
- return kvm_is_valid_cr4(vcpu, sregs->cr4); + return kvm_is_valid_cr4(vcpu, sregs->cr4) && + kvm_is_valid_cr0(vcpu, sregs->cr0); }
static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
From: Johan Hovold johan+linaro@kernel.org
commit 4dd8752a14ca0303fbdf0a6c68ff65f0a50bd2fa upstream.
The runtime PM state should not be changed by drivers that do not implement runtime PM even if it happens to work around a bug in PM core.
With the wake irq arming now fixed, drop the bogus runtime PM state update which left the device in active state (and could potentially prevent a parent device from suspending).
Fixes: f3974413cf02 ("tty: serial: qcom_geni_serial: Wakeup IRQ cleanup") Cc: 5.6+ stable@vger.kernel.org # 5.6+ Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Tony Lindgren tony@atomide.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/qcom_geni_serial.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -1676,13 +1676,6 @@ static int qcom_geni_serial_probe(struct if (ret) return ret;
- /* - * Set pm_runtime status as ACTIVE so that wakeup_irq gets - * enabled/disabled from dev_pm_arm_wake_irq during system - * suspend/resume respectively. - */ - pm_runtime_set_active(&pdev->dev); - if (port->wakeup_irq > 0) { device_init_wakeup(&pdev->dev, true); ret = dev_pm_set_dedicated_wake_irq(&pdev->dev,
From: Biju Das biju.das.jz@bp.renesas.com
commit 57c984f6fe20ebb9306d6e8c09b4f67fe63298c6 upstream.
Fix the sleeping-in-atomic-context warning reported by the Smatch static checker by replacing disable_irq() with disable_irq_nosync().
Reported-by: Dan Carpenter dan.carpenter@linaro.org
Fixes: 8749061be196 ("tty: serial: sh-sci: Add RZ/G2L SCIFA DMA tx support") Cc: stable@kernel.org Signed-off-by: Biju Das biju.das.jz@bp.renesas.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://lore.kernel.org/r/20230704154818.406913-1-biju.das.jz@bp.renesas.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sh-sci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c index 7c9457962a3d..8b7a42e05d6d 100644 --- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -590,7 +590,7 @@ static void sci_start_tx(struct uart_port *port) dma_submit_error(s->cookie_tx)) { if (s->cfg->regtype == SCIx_RZ_SCIFA_REGTYPE) /* Switch irq from SCIF to DMA */ - disable_irq(s->irqs[SCIx_TXI_IRQ]); + disable_irq_nosync(s->irqs[SCIx_TXI_IRQ]);
s->cookie_tx = 0; schedule_work(&s->work_tx);
From: Ruihong Luo colorsu1922@gmail.com
commit 748c5ea8b8796ae8ee80b8d3a3d940570b588d59 upstream.
Preserve the original value of the Divisor Latch Fraction (DLF) register. When the DLF register is modified without preservation, it can disrupt the baudrate settings established by firmware or bootloader, leading to data corruption and the generation of unreadable or distorted characters.
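The save, probe and restore sequence can be modelled in userspace as below. This is a sketch under assumptions: dlf_reg_model is a hypothetical register in which only the low 'width' bits are writable, and model_fls() is a portable stand-in for the kernel's fls().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register model: only the low 'width' bits are writable. */
struct dlf_reg_model {
	uint32_t value;
	unsigned int width;
};

static void reg_write(struct dlf_reg_model *r, uint32_t v)
{
	uint32_t mask;

	assert(r->width <= 32);
	mask = r->width == 32 ? UINT32_MAX : ((UINT32_C(1) << r->width) - 1);
	r->value = v & mask;
}

static uint32_t reg_read(const struct dlf_reg_model *r)
{
	return r->value;
}

/* Position of the highest set bit, counting from 1; 0 if no bit is set. */
static unsigned int model_fls(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Probe the field width, then put back whatever firmware left there. */
static unsigned int probe_dlf_size(struct dlf_reg_model *r)
{
	uint32_t old = reg_read(r);
	uint32_t probed;

	reg_write(r, ~UINT32_C(0));
	probed = reg_read(r);
	reg_write(r, old);

	return model_fls(probed);
}
```

Writing all ones and reading back reveals which bits are implemented; restoring the saved value instead of zero keeps the divisor fraction programmed before the driver took over.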
Fixes: 701c5e73b296 ("serial: 8250_dw: add fractional divisor support") Cc: stable stable@kernel.org Signed-off-by: Ruihong Luo colorsu1922@gmail.com Link: https://lore.kernel.org/stable/20230713004235.35904-1-colorsu1922%40gmail.co... Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230713004235.35904-1-colorsu1922@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_dwlib.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/tty/serial/8250/8250_dwlib.c +++ b/drivers/tty/serial/8250/8250_dwlib.c @@ -244,7 +244,7 @@ void dw8250_setup_port(struct uart_port struct dw8250_port_data *pd = p->private_data; struct dw8250_data *data = to_dw8250_data(pd); struct uart_8250_port *up = up_to_u8250p(p); - u32 reg; + u32 reg, old_dlf;
pd->hw_rs485_support = dw8250_detect_rs485_hw(p); if (pd->hw_rs485_support) { @@ -270,9 +270,11 @@ void dw8250_setup_port(struct uart_port dev_dbg(p->dev, "Designware UART version %c.%c%c\n", (reg >> 24) & 0xff, (reg >> 16) & 0xff, (reg >> 8) & 0xff);
+ /* Preserve value written by firmware or bootloader */ + old_dlf = dw8250_readl_ext(p, DW_UART_DLF); dw8250_writel_ext(p, DW_UART_DLF, ~0U); reg = dw8250_readl_ext(p, DW_UART_DLF); - dw8250_writel_ext(p, DW_UART_DLF, 0); + dw8250_writel_ext(p, DW_UART_DLF, old_dlf);
if (reg) { pd->dlf_size = fls(reg);
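The probe-and-restore sequence in the hunk above can be sketched in user space. This is only an illustration under stated assumptions: fake_dlf, DLF_WIDTH, dlf_write(), dlf_read(), fls32() and probe_dlf_size() are hypothetical stand-ins for the hardware register, the dw8250_writel_ext()/dw8250_readl_ext() accessors and fls(), and the 4-bit register width is made up.

```c
#include <stdint.h>

/* Hypothetical model of a DLF register that implements only DLF_WIDTH bits. */
#define DLF_WIDTH 4u

static uint32_t fake_dlf;

static void dlf_write(uint32_t val)
{
	/* Bits above DLF_WIDTH are not implemented and ignore writes. */
	fake_dlf = val & ((1u << DLF_WIDTH) - 1u);
}

static uint32_t dlf_read(void)
{
	return fake_dlf;
}

/* 1-based position of the highest set bit, 0 if no bit is set. */
static unsigned int fls32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* Probe the register width, then put back whatever was there before. */
static unsigned int probe_dlf_size(void)
{
	uint32_t old_dlf = dlf_read();	/* value left by firmware/bootloader */
	uint32_t reg;

	dlf_write(~0u);			/* set every implemented bit */
	reg = dlf_read();
	dlf_write(old_dlf);		/* restore instead of writing 0 */

	return fls32(reg);
}
```

With a pre-existing fraction in the register, the probe still reports the width and the fraction survives, which is the property the patch restores.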
From: Samuel Holland samuel.holland@sifive.com
commit 9b8fef6345d5487137d4193bb0a0eae2203c284e upstream.
This function is called indirectly from the platform driver probe function. Even if the driver is built in, it may be probed after free_initmem() due to deferral or unbinding/binding via sysfs. Thus the function cannot be marked as __init.
Fixes: 45c054d0815b ("tty: serial: add driver for the SiFive UART") Cc: stable stable@kernel.org Signed-off-by: Samuel Holland samuel.holland@sifive.com Link: https://lore.kernel.org/r/20230624060159.3401369-1-samuel.holland@sifive.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sifive.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/serial/sifive.c +++ b/drivers/tty/serial/sifive.c @@ -811,7 +811,7 @@ static void sifive_serial_console_write( local_irq_restore(flags); }
-static int __init sifive_serial_console_setup(struct console *co, char *options) +static int sifive_serial_console_setup(struct console *co, char *options) { struct sifive_serial_port *ssp; int baud = SIFIVE_DEFAULT_BAUD_RATE;
From: Jerry Meng jerry-meng@foxmail.com
commit 4f7cab49cecee16120d27c1734cfdf3d6c0e5329 upstream.
EM060K_128 is EM060K's sub-model, having the same name "Quectel EM060K-GL"
MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL
T: Bus=03 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 8 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=2c7c ProdID=0128 Rev= 5.04
S: Manufacturer=Quectel
S: Product=Quectel EM060K-GL
S: SerialNumber=f6fa08b6
C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA
A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms
I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms
I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Jerry Meng jerry-meng@foxmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -251,6 +251,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_EM061K_LTA 0x0123 #define QUECTEL_PRODUCT_EM061K_LMS 0x0124 #define QUECTEL_PRODUCT_EC25 0x0125 +#define QUECTEL_PRODUCT_EM060K_128 0x0128 #define QUECTEL_PRODUCT_EG91 0x0191 #define QUECTEL_PRODUCT_EG95 0x0195 #define QUECTEL_PRODUCT_BG96 0x0296 @@ -1197,6 +1198,9 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x30) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0x00, 0x40) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0x00, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x40) },
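As a rough illustration of how the three new table entries select interfaces, the following user-space sketch matches on vendor, product and the interface class/subclass/protocol triple. struct id_entry and option_would_bind() are hypothetical simplifications, not the kernel's struct usb_device_id or its matching code; the IDs are taken from the hunk and the descriptor dump above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ID table entry. */
struct id_entry {
	uint16_t vendor, product;
	uint8_t if_class, if_subclass, if_protocol;
};

/* The three entries added for ProdID 0x0128. */
static const struct id_entry em060k_128_ids[] = {
	{ 0x2c7c, 0x0128, 0xff, 0xff, 0x30 },
	{ 0x2c7c, 0x0128, 0xff, 0x00, 0x40 },
	{ 0x2c7c, 0x0128, 0xff, 0xff, 0x40 },
};

/* True if an interface with this triple matches one of the entries. */
static bool option_would_bind(uint16_t vendor, uint16_t product,
			      uint8_t cls, uint8_t sub, uint8_t prot)
{
	size_t n = sizeof(em060k_128_ids) / sizeof(em060k_128_ids[0]);

	for (size_t i = 0; i < n; i++) {
		const struct id_entry *e = &em060k_128_ids[i];

		if (e->vendor == vendor && e->product == product &&
		    e->if_class == cls && e->if_subclass == sub &&
		    e->if_protocol == prot)
			return true;
	}
	return false;
}
```

Under this model interfaces 3, 4 and 5 from the dump match, while interfaces 2, 6 and 7 (protocols ff, 70 and 80) do not, which agrees with the Driver= column shown in the commit message.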
From: Mohsen Tahmasebi moh53n@moh53n.ir
commit 857ea9005806e2a458016880278f98715873e977 upstream.
Add Quectel EC200A "DIAG, AT, MODEM":
0x6005: ECM / RNDIS + DIAG + AT + MODEM
T: Bus=01 Lev=01 Prnt=02 Port=05 Cnt=01 Dev#= 8 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=2c7c ProdID=6005 Rev=03.18
S: Manufacturer=Android
S: Product=Android
S: SerialNumber=0000
C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms
Signed-off-by: Mohsen Tahmasebi moh53n@moh53n.ir Tested-by: Mostafa Ghofrani mostafaghrr@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -269,6 +269,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_RM520N 0x0801 #define QUECTEL_PRODUCT_EC200U 0x0901 #define QUECTEL_PRODUCT_EC200S_CN 0x6002 +#define QUECTEL_PRODUCT_EC200A 0x6005 #define QUECTEL_PRODUCT_EM061K_LWW 0x6008 #define QUECTEL_PRODUCT_EM061K_LCN 0x6009 #define QUECTEL_PRODUCT_EC200T 0x6026 @@ -1229,6 +1230,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM520N, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, 0x0900, 0xff, 0, 0), /* RM500U-CN */ .driver_info = ZLP }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200A, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200U, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) },
From: Oliver Neukum oneukum@suse.com
commit dd92c8a1f99bcd166204ffc219ea5a23dd65d64f upstream.
Add the device and product ID for this CAN bus interface / license dongle. The device is usable either directly from user space or can be attached to a kernel CAN interface with slcan_attach.
Reported-by: Kaufmann Automotive GmbH info@kaufmann-automotive.ch Tested-by: Kaufmann Automotive GmbH info@kaufmann-automotive.ch Signed-off-by: Oliver Neukum oneukum@suse.com [ johan: amend commit message and move entries in sort order ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/usb-serial-simple.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/serial/usb-serial-simple.c +++ b/drivers/usb/serial/usb-serial-simple.c @@ -63,6 +63,11 @@ DEVICE(flashloader, FLASHLOADER_IDS); 0x01) } DEVICE(google, GOOGLE_IDS);
+/* KAUFMANN RKS+CAN VCP */ +#define KAUFMANN_IDS() \ + { USB_DEVICE(0x16d0, 0x0870) } +DEVICE(kaufmann, KAUFMANN_IDS); + /* Libtransistor USB console */ #define LIBTRANSISTOR_IDS() \ { USB_DEVICE(0x1209, 0x8b00) } @@ -124,6 +129,7 @@ static struct usb_serial_driver * const &funsoft_device, &flashloader_device, &google_device, + &kaufmann_device, &libtransistor_device, &vivopay_device, &moto_modem_device, @@ -142,6 +148,7 @@ static const struct usb_device_id id_tab FUNSOFT_IDS(), FLASHLOADER_IDS(), GOOGLE_IDS(), + KAUFMANN_IDS(), LIBTRANSISTOR_IDS(), VIVOPAY_IDS(), MOTO_IDS(),
From: Johan Hovold johan@kernel.org
commit d245aedc00775c4d7265a9f4522cc4e1fd34d102 upstream.
Sort the driver symbols alphabetically in order to make it more obvious where new driver entries should be added.
Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/usb-serial-simple.c | 66 ++++++++++++++++----------------- 1 file changed, 33 insertions(+), 33 deletions(-)
--- a/drivers/usb/serial/usb-serial-simple.c +++ b/drivers/usb/serial/usb-serial-simple.c @@ -38,16 +38,6 @@ static struct usb_serial_driver vendor## { USB_DEVICE(0x0a21, 0x8001) } /* MMT-7305WW */ DEVICE(carelink, CARELINK_IDS);
-/* ZIO Motherboard USB driver */ -#define ZIO_IDS() \ - { USB_DEVICE(0x1CBE, 0x0103) } -DEVICE(zio, ZIO_IDS); - -/* Funsoft Serial USB driver */ -#define FUNSOFT_IDS() \ - { USB_DEVICE(0x1404, 0xcddc) } -DEVICE(funsoft, FUNSOFT_IDS); - /* Infineon Flashloader driver */ #define FLASHLOADER_IDS() \ { USB_DEVICE_INTERFACE_CLASS(0x058b, 0x0041, USB_CLASS_CDC_DATA) }, \ @@ -55,6 +45,11 @@ DEVICE(funsoft, FUNSOFT_IDS); { USB_DEVICE(0x8087, 0x0801) } DEVICE(flashloader, FLASHLOADER_IDS);
+/* Funsoft Serial USB driver */ +#define FUNSOFT_IDS() \ + { USB_DEVICE(0x1404, 0xcddc) } +DEVICE(funsoft, FUNSOFT_IDS); + /* Google Serial USB SubClass */ #define GOOGLE_IDS() \ { USB_VENDOR_AND_INTERFACE_INFO(0x18d1, \ @@ -63,6 +58,11 @@ DEVICE(flashloader, FLASHLOADER_IDS); 0x01) } DEVICE(google, GOOGLE_IDS);
+/* HP4x (48/49) Generic Serial driver */ +#define HP4X_IDS() \ + { USB_DEVICE(0x03f0, 0x0121) } +DEVICE(hp4x, HP4X_IDS); + /* KAUFMANN RKS+CAN VCP */ #define KAUFMANN_IDS() \ { USB_DEVICE(0x16d0, 0x0870) } @@ -73,11 +73,6 @@ DEVICE(kaufmann, KAUFMANN_IDS); { USB_DEVICE(0x1209, 0x8b00) } DEVICE(libtransistor, LIBTRANSISTOR_IDS);
-/* ViVOpay USB Serial Driver */ -#define VIVOPAY_IDS() \ - { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */ -DEVICE(vivopay, VIVOPAY_IDS); - /* Motorola USB Phone driver */ #define MOTO_IDS() \ { USB_DEVICE(0x05c6, 0x3197) }, /* unknown Motorola phone */ \ @@ -106,10 +101,10 @@ DEVICE(nokia, NOKIA_IDS); { USB_DEVICE(0x09d7, 0x0100) } /* NovAtel FlexPack GPS */ DEVICE_N(novatel_gps, NOVATEL_IDS, 3);
-/* HP4x (48/49) Generic Serial driver */ -#define HP4X_IDS() \ - { USB_DEVICE(0x03f0, 0x0121) } -DEVICE(hp4x, HP4X_IDS); +/* Siemens USB/MPI adapter */ +#define SIEMENS_IDS() \ + { USB_DEVICE(0x908, 0x0004) } +DEVICE(siemens_mpi, SIEMENS_IDS);
/* Suunto ANT+ USB Driver */ #define SUUNTO_IDS() \ @@ -117,47 +112,52 @@ DEVICE(hp4x, HP4X_IDS); { USB_DEVICE(0x0fcf, 0x1009) } /* Dynastream ANT USB-m Stick */ DEVICE(suunto, SUUNTO_IDS);
-/* Siemens USB/MPI adapter */ -#define SIEMENS_IDS() \ - { USB_DEVICE(0x908, 0x0004) } -DEVICE(siemens_mpi, SIEMENS_IDS); +/* ViVOpay USB Serial Driver */ +#define VIVOPAY_IDS() \ + { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */ +DEVICE(vivopay, VIVOPAY_IDS); + +/* ZIO Motherboard USB driver */ +#define ZIO_IDS() \ + { USB_DEVICE(0x1CBE, 0x0103) } +DEVICE(zio, ZIO_IDS);
/* All of the above structures mushed into two lists */ static struct usb_serial_driver * const serial_drivers[] = { &carelink_device, - &zio_device, - &funsoft_device, &flashloader_device, + &funsoft_device, &google_device, + &hp4x_device, &kaufmann_device, &libtransistor_device, - &vivopay_device, &moto_modem_device, &motorola_tetra_device, &nokia_device, &novatel_gps_device, - &hp4x_device, - &suunto_device, &siemens_mpi_device, + &suunto_device, + &vivopay_device, + &zio_device, NULL };
static const struct usb_device_id id_table[] = { CARELINK_IDS(), - ZIO_IDS(), - FUNSOFT_IDS(), FLASHLOADER_IDS(), + FUNSOFT_IDS(), GOOGLE_IDS(), + HP4X_IDS(), KAUFMANN_IDS(), LIBTRANSISTOR_IDS(), - VIVOPAY_IDS(), MOTO_IDS(), MOTOROLA_TETRA_IDS(), NOKIA_IDS(), NOVATEL_IDS(), - HP4X_IDS(), - SUUNTO_IDS(), SIEMENS_IDS(), + SUUNTO_IDS(), + VIVOPAY_IDS(), + ZIO_IDS(), { }, }; MODULE_DEVICE_TABLE(usb, id_table);
From: Marc Kleine-Budde mkl@pengutronix.de
commit f8a2da6ec2417cca169fa85a8ab15817bccbb109 upstream.
After an initial link up the CAN device is in ERROR-ACTIVE mode. Due to a missing CAN_STATE_STOPPED in gs_can_close() it doesn't change to STOPPED after a link down:
| ip link set dev can0 up
| ip link set dev can0 down
| ip --details link show can0
| 13: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10
|     link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
|     can state ERROR-ACTIVE restart-ms 1000
Add missing assignment of CAN_STATE_STOPPED in gs_can_close().
Cc: stable@vger.kernel.org Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://lore.kernel.org/all/20230718-gs_usb-fix-can-state-v1-1-f19738ae2c23@... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -1030,6 +1030,8 @@ static int gs_can_close(struct net_devic usb_kill_anchored_urbs(&dev->tx_submitted); atomic_set(&dev->active_tx_urbs, 0);
+ dev->can.state = CAN_STATE_STOPPED; + /* reset the device */ rc = gs_cmd_reset(dev); if (rc < 0)
From: Samuel Thibault samuel.thibault@ens-lyon.org
commit 690c8b804ad2eafbd35da5d3c95ad325ca7d5061 upstream.
83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled") broke BRLTTY's ability to simulate keypresses on the console, thus effectively breaking braille keyboards of blind users.
This restores the TIOCSTI feature for CAP_SYS_ADMIN processes, which BRLTTY is, thus fixing braille keyboards without re-opening the security issue.
Signed-off-by: Samuel Thibault samuel.thibault@ens-lyon.org Acked-by: Kees Cook keescook@chromium.org Fixes: 83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled") Cc: stable@vger.kernel.org Reported-by: Nicolas Pitre nico@fluxnic.net Link: https://lore.kernel.org/r/20230710002645.v565c7xq5iddruse@begin Acked-by: Jiri Slaby jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/tty_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -2276,7 +2276,7 @@ static int tiocsti(struct tty_struct *tt char ch, mbz = 0; struct tty_ldisc *ld;
- if (!tty_legacy_tiocsti) + if (!tty_legacy_tiocsti && !capable(CAP_SYS_ADMIN)) return -EIO;
if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
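The changed condition can be read as a small truth table. The sketch below is a user-space model only: tiocsti_gate_old() and tiocsti_gate_new() are hypothetical names, and the two boolean parameters stand in for the tty_legacy_tiocsti setting and the result of capable(CAP_SYS_ADMIN).

```c
#include <errno.h>
#include <stdbool.h>

/* Before the patch: disabling legacy TIOCSTI blocks every caller. */
static int tiocsti_gate_old(bool legacy_tiocsti, bool has_cap_sys_admin)
{
	(void)has_cap_sys_admin;	/* capability was not consulted here */
	if (!legacy_tiocsti)
		return -EIO;
	return 0;
}

/* After the patch: CAP_SYS_ADMIN callers pass even when it is disabled. */
static int tiocsti_gate_new(bool legacy_tiocsti, bool has_cap_sys_admin)
{
	if (!legacy_tiocsti && !has_cap_sys_admin)
		return -EIO;
	return 0;
}
```

The only input for which the two differ is "legacy TIOCSTI disabled, caller has CAP_SYS_ADMIN", which is the BRLTTY case described in the commit message.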
From: Kyle Tso kyletso@google.com
commit b33ebb2415e7e0a55ee3d049c2890d3a3e3805b6 upstream.
When typec_port registration calls device_add(), the usb_power_delivery handle in typec_port is checked for NULL to decide the visibility of the device attributes. At that point the handle is always NULL, because port->pd is only set in typec_port_set_usb_power_delivery(), which runs after the device_add() call.
Set port->pd before device_add and only link the device after that.
Fixes: a7cff92f0635 ("usb: typec: USB Power Delivery helpers for ports and partners") Cc: stable@vger.kernel.org Signed-off-by: Kyle Tso kyletso@google.com Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20230623151036.3955013-2-kyletso@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/class.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/class.c +++ b/drivers/usb/typec/class.c @@ -2288,6 +2288,8 @@ struct typec_port *typec_register_port(s return ERR_PTR(ret); }
+ port->pd = cap->pd; + ret = device_add(&port->dev); if (ret) { dev_err(parent, "failed to register port (%d)\n", ret); @@ -2295,7 +2297,7 @@ struct typec_port *typec_register_port(s return ERR_PTR(ret); }
- ret = typec_port_set_usb_power_delivery(port, cap->pd); + ret = usb_power_delivery_link_device(port->pd, &port->dev); if (ret) { dev_err(&port->dev, "failed to link pd\n"); device_unregister(&port->dev);
From: Kyle Tso kyletso@google.com
commit 4b642dc9829507e4afabc03d32a18abbdb192c5e upstream.
The pointers to the usb_power_delivery handles are stored in the "pds" array returned from the pd_get op, not in the memory adjacent to "pd". Get the handles from the "pds" array directly instead of deriving them from "pd".
Fixes: a7cff92f0635 ("usb: typec: USB Power Delivery helpers for ports and partners") Cc: stable@vger.kernel.org Signed-off-by: Kyle Tso kyletso@google.com Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20230623151036.3955013-3-kyletso@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/class.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/drivers/usb/typec/class.c +++ b/drivers/usb/typec/class.c @@ -1277,8 +1277,7 @@ static ssize_t select_usb_power_delivery { struct typec_port *port = to_typec_port(dev); struct usb_power_delivery **pds; - struct usb_power_delivery *pd; - int ret = 0; + int i, ret = 0;
if (!port->ops || !port->ops->pd_get) return -EOPNOTSUPP; @@ -1287,11 +1286,11 @@ static ssize_t select_usb_power_delivery if (!pds) return 0;
- for (pd = pds[0]; pd; pd++) { - if (pd == port->pd) - ret += sysfs_emit(buf + ret, "[%s] ", dev_name(&pd->dev)); + for (i = 0; pds[i]; i++) { + if (pds[i] == port->pd) + ret += sysfs_emit(buf + ret, "[%s] ", dev_name(&pds[i]->dev)); else - ret += sysfs_emit(buf + ret, "%s ", dev_name(&pd->dev)); + ret += sysfs_emit(buf + ret, "%s ", dev_name(&pds[i]->dev)); }
buf[ret - 1] = '\n';
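The difference between walking the pointer array and incrementing a single handle can be sketched as follows. struct pd and find_selected() are hypothetical; the point is only that the objects an array of pointers refers to need not be adjacent in memory, so the array itself has to be indexed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a power delivery handle. */
struct pd {
	const char *name;
};

/*
 * Walk a NULL-terminated array of pointers by index and return the
 * position of the selected handle, or -1 if it is not in the array.
 * Incrementing a "struct pd *" instead would step through memory next
 * to the first object, which is unrelated to the other array entries.
 */
static int find_selected(struct pd **pds, const struct pd *selected)
{
	for (int i = 0; pds[i]; i++) {
		if (pds[i] == selected)
			return i;
	}
	return -1;
}
```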
From: Kyle Tso kyletso@google.com
commit 609fded3f91972ada551c141c5d04a71704f8967 upstream.
The buffer address used in sysfs_emit should be aligned to PAGE_SIZE. Use sysfs_emit_at instead to offset the buffer.
Fixes: a7cff92f0635 ("usb: typec: USB Power Delivery helpers for ports and partners") Cc: stable@vger.kernel.org Signed-off-by: Kyle Tso kyletso@google.com Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/20230623151036.3955013-4-kyletso@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/class.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/typec/class.c +++ b/drivers/usb/typec/class.c @@ -1288,9 +1288,9 @@ static ssize_t select_usb_power_delivery
for (i = 0; pds[i]; i++) { if (pds[i] == port->pd) - ret += sysfs_emit(buf + ret, "[%s] ", dev_name(&pds[i]->dev)); + ret += sysfs_emit_at(buf, ret, "[%s] ", dev_name(&pds[i]->dev)); else - ret += sysfs_emit(buf + ret, "%s ", dev_name(&pds[i]->dev)); + ret += sysfs_emit_at(buf, ret, "%s ", dev_name(&pds[i]->dev)); }
buf[ret - 1] = '\n';
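A user-space approximation of the "base buffer plus offset" style of appending is sketched below. emit_at() and format_selection() are hypothetical helpers built on vsnprintf(), and BUF_SIZE is an arbitrary stand-in for PAGE_SIZE; this is not the kernel's sysfs_emit_at() implementation.

```c
#include <stdarg.h>
#include <stdio.h>

#define BUF_SIZE 64	/* arbitrary stand-in for PAGE_SIZE */

/* Append formatted text at offset "at", never writing past BUF_SIZE. */
static int emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int len;

	if (at < 0 || at >= BUF_SIZE)
		return 0;

	va_start(args, fmt);
	len = vsnprintf(buf + at, (size_t)(BUF_SIZE - at), fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= BUF_SIZE - at)	/* output was truncated */
		len = BUF_SIZE - at - 1;
	return len;			/* bytes stored, excluding the NUL */
}

/* Print all names, bracket the selected one, end with a newline. */
static int format_selection(char *buf, const char *const *names, int selected)
{
	int ret = 0;

	for (int i = 0; names[i]; i++) {
		if (i == selected)
			ret += emit_at(buf, ret, "[%s] ", names[i]);
		else
			ret += emit_at(buf, ret, "%s ", names[i]);
	}
	if (ret > 0)
		buf[ret - 1] = '\n';	/* replace the trailing space */
	return ret;
}
```

Passing the base pointer and the running offset separately lets the helper do its bounds check against the start of the buffer, which is not possible when the caller passes buf + ret.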
From: Jakub Vanek linuxtardis@gmail.com
commit 734ae15ab95a18d3d425fc9cb38b7a627d786f08 upstream.
This reverts commit b138e23d3dff90c0494925b4c1874227b81bddf7.
AutoRetry has been found to sometimes cause controller freezes when communicating with buggy USB devices.
This controller feature allows the controller in host mode to send non-terminating/burst retry ACKs instead of terminating retry ACKs to devices when a transaction error (CRC error or overflow) occurs.
Unfortunately, if the USB device continues to respond with a CRC error, the controller will not complete endpoint-related commands while it keeps trying to auto-retry. [3] The xHCI driver will notice this once it tries to abort the transfer using a Stop Endpoint command and does not receive a completion in time. [1] This situation is reported to dmesg:
[sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
xhci-hcd: xHCI host not responding to stop endpoint command
xhci-hcd: xHCI host controller not responding, assume dead
xhci-hcd: HC died; cleaning up
Some users observed this problem on an Odroid HC2 with the JMS578 USB3-to-SATA bridge. The issue can be triggered by starting a read-heavy workload on an attached SSD. After a while, the host controller would die and the SSD would disappear from the system. [1]
Further analysis by Synopsys determined that controller revisions other than the one in Odroid HC2 are also affected by this. The recommended solution was to disable AutoRetry altogether. This change does not have a noticeable performance impact. [2]
Revert the enablement commit. This will keep the AutoRetry bit in the default state configured during SoC design [2].
Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller") Link: https://lore.kernel.org/r/a21f34c04632d250cd0a78c7c6f4a1c9c7a43142.camel@gma... [1] Link: https://lore.kernel.org/r/20230711214834.kyr6ulync32d4ktk@synopsys.com/ [2] Link: https://lore.kernel.org/r/20230712225518.2smu7wse6djc7l5o@synopsys.com/ [3] Cc: stable@vger.kernel.org Cc: Mauro Ribeiro mauro.ribeiro@hardkernel.com Cc: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Suggested-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Signed-off-by: Jakub Vanek linuxtardis@gmail.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20230714122419.27741-1-linuxtardis@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 16 ---------------- drivers/usb/dwc3/core.h | 3 --- 2 files changed, 19 deletions(-)
--- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1209,22 +1209,6 @@ static int dwc3_core_init(struct dwc3 *d dwc3_writel(dwc->regs, DWC3_GUCTL1, reg); }
- if (dwc->dr_mode == USB_DR_MODE_HOST || - dwc->dr_mode == USB_DR_MODE_OTG) { - reg = dwc3_readl(dwc->regs, DWC3_GUCTL); - - /* - * Enable Auto retry Feature to make the controller operating in - * Host mode on seeing transaction errors(CRC errors or internal - * overrun scenerios) on IN transfers to reply to the device - * with a non-terminating retry ACK (i.e, an ACK transcation - * packet with Retry=1 & Nump != 0) - */ - reg |= DWC3_GUCTL_HSTINAUTORETRY; - - dwc3_writel(dwc->regs, DWC3_GUCTL, reg); - } - /* * Must config both number of packets and max burst settings to enable * RX and/or TX threshold. --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -254,9 +254,6 @@ #define DWC3_GCTL_GBLHIBERNATIONEN BIT(1) #define DWC3_GCTL_DSBLCLKGTNG BIT(0)
-/* Global User Control Register */ -#define DWC3_GUCTL_HSTINAUTORETRY BIT(14) - /* Global User Control 1 Register */ #define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT BIT(31) #define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
From: Gratian Crisan gratian.crisan@ni.com
commit b32b8f2b9542d8039f5468303a6ca78c1b5611a5 upstream.
Hardware based on the Bay Trail / BYT SoCs requires an external ULPI phy for USB device-mode. The phy chip usually has its 'reset' and 'chip select' lines connected to GPIOs described by ACPI fwnodes in the DSDT table.
Because some hardware is missing the ACPI resources for the 'reset' and 'chip select' GPIOs, commit 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") introduced a fallback gpiod_lookup_table with hard-coded mappings for Bay Trail devices.
However, there are existing Bay Trail based devices, like the National Instruments cRIO-903x series, where the phy chip has its 'reset' and 'chip-select' lines always asserted in hardware via resistor pull-ups. On this hardware the phy chip is always enabled and the ACPI DSDT table is missing information not only for the 'chip-select' and 'reset' lines but also for the BYT GPIO controller itself, "INT33FC".
With the introduction of the gpiod_lookup_table, initializing USB device-mode on this hardware now errors out. The error comes from the gpiod_get_optional() calls in dwc3_pci_quirks(), which now return -ENOENT due to the missing ACPI entry for the INT33FC gpio controller used in the aforementioned table.
This hardware used to work before because gpiod_get_optional() will return NULL instead of -ENOENT if no GPIO has been assigned to the requested function. The dwc3_pci_quirks() code for setting the 'cs' and 'reset' GPIOs was then skipped (due to the NULL return). This is the correct behavior in cases where the phy chip is hardwired and there are no GPIOs to control.
Since the gpiod_lookup_table relies on the presence of the INT33FC fwnode in the ACPI tables, only add the table if we know the entry for the INT33FC gpio controller is present. This allows Bay Trail based devices with hardwired dwc3 ULPI phys to continue working.
Fixes: 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") Cc: stable stable@kernel.org Signed-off-by: Gratian Crisan gratian.crisan@ni.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230726184555.218091-2-gratian.crisan@ni.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-pci.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/dwc3-pci.c +++ b/drivers/usb/dwc3/dwc3-pci.c @@ -233,10 +233,12 @@ static int dwc3_pci_quirks(struct dwc3_p
/* * A lot of BYT devices lack ACPI resource entries for - * the GPIOs, add a fallback mapping to the reference + * the GPIOs. If the ACPI entry for the GPIO controller + * is present add a fallback mapping to the reference * design GPIOs which all boards seem to use. */ - gpiod_add_lookup_table(&platform_bytcr_gpios); + if (acpi_dev_present("INT33FC", NULL, -1)) + gpiod_add_lookup_table(&platform_bytcr_gpios);
/* * These GPIOs will turn on the USB2 PHY. Note that we have to
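The three outcomes described in the commit message can be modelled with a small decision function. This is a sketch of the described behaviour only, not of gpiolib internals: lookup_gpio() and quirk_lookup() are hypothetical, and the return convention (1 for a found GPIO, 0 for "none assigned", negative errno for an error) is an assumption made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of an "optional" GPIO lookup as described above:
 *   no lookup table registered        -> 0 (no GPIO assigned, not an error)
 *   table registered, controller gone -> -ENOENT
 *   table registered, controller here -> 1 (GPIO found)
 */
static int lookup_gpio(bool table_registered, bool controller_present)
{
	if (!table_registered)
		return 0;
	if (!controller_present)
		return -ENOENT;
	return 1;
}

/*
 * gate_on_acpi == false models the old code, which always registered
 * the fallback table; true models the patched code, which registers it
 * only when the controller's ACPI entry is present.
 */
static int quirk_lookup(bool controller_present, bool gate_on_acpi)
{
	bool table_registered = gate_on_acpi ? controller_present : true;

	return lookup_gpio(table_registered, controller_present);
}
```

On a board without the INT33FC entry the ungated variant yields -ENOENT, while the gated variant yields "no GPIO assigned", so the quirk code that drives the 'cs' and 'reset' lines is simply skipped.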
From: Jisheng Zhang jszhang@kernel.org
commit e835c0a4e23c38531dcee5ef77e8d1cf462658c7 upstream.
Commit c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on system_suspend in host mode") replaced the check for host-only dr_mode with a check of current_dr_role. But during boot current_dr_role isn't initialized yet, so the device side reset is always issued even if dwc3 was configured as host-only. What's more, on some platforms with host-only dwc3, always issuing the device side reset by accessing the device register block can cause a kernel panic.
Fixes: c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on system_suspend in host mode") Cc: stable stable@kernel.org Signed-off-by: Jisheng Zhang jszhang@kernel.org Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20230627162018.739-1-jszhang@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -277,9 +277,9 @@ int dwc3_core_soft_reset(struct dwc3 *dw /* * We're resetting only the device side because, if we're in host mode, * XHCI driver will reset the host block. If dwc3 was configured for - * host-only mode, then we can return early. + * host-only mode or current role is host, then we can return early. */ - if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST) + if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST) return 0;
reg = dwc3_readl(dwc->regs, DWC3_DCTL);
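The early-return condition from the hunk above, reduced to a predicate. The enum names and values here are hypothetical placeholders for dwc->dr_mode and dwc->current_dr_role; ROLE_UNSET models the not-yet-initialized role during boot.

```c
#include <stdbool.h>

/* Hypothetical placeholders for the configured mode and current role. */
enum dr_mode { MODE_UNKNOWN, MODE_HOST, MODE_PERIPHERAL, MODE_OTG };
enum dr_role { ROLE_UNSET, ROLE_HOST, ROLE_DEVICE };

/* Old check: only the current role is consulted. */
static bool skip_device_reset_old(enum dr_mode mode, enum dr_role role)
{
	(void)mode;
	return role == ROLE_HOST;
}

/* New check: a host-only configuration also skips the device reset. */
static bool skip_device_reset_new(enum dr_mode mode, enum dr_role role)
{
	return mode == MODE_HOST || role == ROLE_HOST;
}
```

The case that matters is a host-only controller whose role has not been set yet: the old predicate lets the device side reset through, the new one does not.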
From: Xu Yang xu.yang_2@nxp.com
commit 7f2327666a9080e428166964e37548b0168cd5e9 upstream.
A negative ret means the host controller failed to send the USB message, and 0 means success. The existing "if (!ret)" check therefore breaks out on success instead of on failure; check for "ret < 0" instead.
Fixes: f2b42379c576 ("usb: misc: ehset: Rework test mode entry") Cc: stable stable@kernel.org Signed-off-by: Xu Yang xu.yang_2@nxp.com Link: https://lore.kernel.org/r/20230705095231.457860-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/misc/ehset.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/misc/ehset.c +++ b/drivers/usb/misc/ehset.c @@ -77,7 +77,7 @@ static int ehset_probe(struct usb_interf switch (test_pid) { case TEST_SE0_NAK_PID: ret = ehset_prepare_port_for_testing(hub_udev, portnum); - if (!ret) + if (ret < 0) break; ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE, USB_RT_PORT, USB_PORT_FEAT_TEST, @@ -86,7 +86,7 @@ static int ehset_probe(struct usb_interf break; case TEST_J_PID: ret = ehset_prepare_port_for_testing(hub_udev, portnum); - if (!ret) + if (ret < 0) break; ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE, USB_RT_PORT, USB_PORT_FEAT_TEST, @@ -95,7 +95,7 @@ static int ehset_probe(struct usb_interf break; case TEST_K_PID: ret = ehset_prepare_port_for_testing(hub_udev, portnum); - if (!ret) + if (ret < 0) break; ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE, USB_RT_PORT, USB_PORT_FEAT_TEST, @@ -104,7 +104,7 @@ static int ehset_probe(struct usb_interf break; case TEST_PACKET_PID: ret = ehset_prepare_port_for_testing(hub_udev, portnum); - if (!ret) + if (ret < 0) break; ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE, USB_RT_PORT, USB_PORT_FEAT_TEST,
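Each of the four cases above follows the same two-step pattern, sketched here with the steps abstracted away. enter_test_mode() is a hypothetical helper: prepare_ret stands in for the return value of ehset_prepare_port_for_testing(), and setting *request_sent stands in for issuing the usb_control_msg_send() request.

```c
#include <stdbool.h>

/*
 * Returns the negative error from the prepare step, or 0 after the
 * test-mode request has been "sent". With the old "if (!ret)" check
 * the request was skipped precisely when preparation succeeded.
 */
static int enter_test_mode(int prepare_ret, bool *request_sent)
{
	*request_sent = false;

	if (prepare_ret < 0)
		return prepare_ret;	/* preparation failed: stop here */

	*request_sent = true;		/* stands in for the SET_FEATURE request */
	return 0;
}
```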
From: Guiting Shen aarongt.shen@gmail.com
commit c55afcbeaa7a6f4fffdbc999a9bf3f0b29a5186f upstream.
ohci_hcd_at91_drv_suspend() sets ohci->rh_state to OHCI_RH_HALTED on suspend, which makes ohci_irq() skip the interrupt after resume, so nobody handles this interrupt.
According to the comment in ohci_hcd_at91_drv_suspend(), a reset is needed when resuming from suspend (MEM). Fix this by setting the "hibernated" argument of ohci_resume().
Signed-off-by: Guiting Shen aarongt.shen@gmail.com
Cc: stable stable@kernel.org
Reviewed-by: Alan Stern stern@rowland.harvard.edu
Link: https://lore.kernel.org/r/20230626152713.18950-1-aarongt.shen@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/host/ohci-at91.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/usb/host/ohci-at91.c
+++ b/drivers/usb/host/ohci-at91.c
@@ -673,7 +673,13 @@ ohci_hcd_at91_drv_resume(struct device *
 	else
 		at91_start_clock(ohci_at91);
 
-	ohci_resume(hcd, false);
+	/*
+	 * According to the comment in ohci_hcd_at91_drv_suspend()
+	 * we need to do a reset if the 48Mhz clock was stopped,
+	 * that is, if ohci_at91->wakeup is clear. Tell ohci_resume()
+	 * to reset in this case by setting its "hibernated" flag.
+	 */
+	ohci_resume(hcd, !ohci_at91->wakeup);
 
 	return 0;
 }
From: Łukasz Bartosik lb@semihalf.com
commit 9dc162e22387080e2d06de708b89920c0e158c9a upstream.
The Focusrite Scarlett audio device does not behave correctly during resumes. Below is what happens during every resume (captured with Beagle 5000):

<Suspend>
<Resume>
<Reset>/<Chirp J>/<Tiny J>
<Reset/Target disconnected>
<High Speed>
The Scarlett disconnects and is enumerated again.
However, from time to time it drops completely off the USB bus during resume. Below is a captured occurrence of such an event:

<Suspend>
<Resume>
<Reset>/<Chirp J>/<Tiny J>
<Reset>/<Chirp K>/<Tiny K>
<High Speed>
<Corrupted packet>
<Reset/Target disconnected>
To fix the condition a user has to unplug and plug the device again.
With USB_QUIRK_RESET_RESUME applied ("usbcore.quirks=1235:8211:b") for the Scarlett audio device the issue still reproduces.
Applying USB_QUIRK_DISCONNECT_SUSPEND ("usbcore.quirks=1235:8211:m") fixed the issue: the Scarlett audio device did not drop off the USB bus for ~5000 suspend/resume cycles, whereas originally the issue reproduced within ~100 or fewer suspend/resume cycles.
Signed-off-by: Łukasz Bartosik lb@semihalf.com
Cc: stable stable@kernel.org
Link: https://lore.kernel.org/r/20230724112911.1802577-1-lb@semihalf.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/core/quirks.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -436,6 +436,10 @@ static const struct usb_device_id usb_qu
 	/* novation SoundControl XL */
 	{ USB_DEVICE(0x1235, 0x0061), .driver_info = USB_QUIRK_RESET_RESUME },
 
+	/* Focusrite Scarlett Solo USB */
+	{ USB_DEVICE(0x1235, 0x8211), .driver_info =
+			USB_QUIRK_DISCONNECT_SUSPEND },
+
 	/* Huawei 4G LTE module */
 	{ USB_DEVICE(0x12d1, 0x15bb), .driver_info =
 			USB_QUIRK_DISCONNECT_SUSPEND },
From: Frank Li Frank.Li@nxp.com
commit 2627335a1329a0d39d8d277994678571c4f21800 upstream.
Previously, cdns3_gadget_check_config() in the cdns3 driver calculated ep_buf_size from only one configuration's endpoint information, because the "claimed" flag is cleared after usb_gadget_check_config() is called.

The fix checks the driver-private EP_CLAIMED flag instead of relying on the "claimed" flag.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number")
Cc: stable stable@kernel.org
Reported-by: Ravi Gunasekaran r-gunasekaran@ti.com
Signed-off-by: Frank Li Frank.Li@nxp.com
Acked-by: Peter Chen peter.chen@kernel.org
Tested-by: Ravi Gunasekaran r-gunasekaran@ti.com
Link: https://lore.kernel.org/r/20230707230015.494999-2-Frank.Li@nxp.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/cdns3/cdns3-gadget.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -3012,12 +3012,14 @@ static int cdns3_gadget_udc_stop(struct
 static int cdns3_gadget_check_config(struct usb_gadget *gadget)
 {
 	struct cdns3_device *priv_dev = gadget_to_cdns3_device(gadget);
+	struct cdns3_endpoint *priv_ep;
 	struct usb_ep *ep;
 	int n_in = 0;
 	int total;
 
 	list_for_each_entry(ep, &gadget->ep_list, ep_list) {
-		if (ep->claimed && (ep->address & USB_DIR_IN))
+		priv_ep = ep_to_cdns3_ep(ep);
+		if ((priv_ep->flags & EP_CLAIMED) && (ep->address & USB_DIR_IN))
 			n_in++;
 	}
 
From: Ricardo Ribalda ribalda@chromium.org
commit 9fd10829a9eb482e192a845675ecc5480e0bfa10 upstream.
Allow devices to have dma operations beyond 64K, and avoid warnings such as:
DMA-API: xhci-mtk 11200000.usb: mapping sg segment longer than device claims to support [len=98304] [max=65536]
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller")
Cc: stable stable@kernel.org
Tested-by: Zubin Mithra zsm@chromium.org
Reported-by: Zubin Mithra zsm@chromium.org
Signed-off-by: Ricardo Ribalda ribalda@chromium.org
Link: https://lore.kernel.org/r/20230628-mtk-usb-v2-1-c8c34eb9f229@chromium.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/host/xhci-mtk.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -592,6 +592,7 @@ static int xhci_mtk_probe(struct platfor
 	}
 
 	device_init_wakeup(dev, true);
+	dma_set_max_seg_size(dev, UINT_MAX);
 
 	xhci = hcd_to_xhci(hcd);
 	xhci->main_hcd = hcd;
From: Dan Carpenter dan.carpenter@linaro.org
commit 288b4fa1798e3637a9304c6e90a93d900e02369c upstream.
This reverts commit 18fc7c435be3f17ea26a21b2e2312fcb9088e01f.
The reverted commit was based on static analysis and a misunderstanding of how PTR_ERR() and NULLs are supposed to work. When a function returns both error pointers and NULL, the NULL normally means "continue operating without a feature because it was deliberately turned off". The NULL should not be treated as a failure. If a driver cannot work when that feature is disabled, then the Kconfig should enforce that the function cannot return NULL. We should not need to test for it.

In this code, the reverted patch means that tegra_xusb_probe() will fail if the firmware supports power-domains but CONFIG_PM is disabled.
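The convention described above can be sketched in plain userspace C. This is only an illustration: err_ptr(), is_err() and ptr_err() are stand-ins written for this example (not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() implementations), and attach_domain() and probe() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: the top 4095 addresses encode errno values. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct domain { int id; };
static struct domain the_domain = { 1 };

/*
 * Hypothetical attach helper with three outcomes:
 *   valid pointer  - feature present and attached
 *   NULL           - feature deliberately disabled, not an error
 *   error pointer  - a real failure
 */
static struct domain *attach_domain(int feature_enabled, int fail)
{
	if (!feature_enabled)
		return NULL;
	if (fail)
		return err_ptr(-ENODEV);
	return &the_domain;
}

/* Returns 0 when probing may continue, a negative errno on real failure. */
static int probe(int feature_enabled, int fail)
{
	struct domain *d = attach_domain(feature_enabled, fail);

	if (is_err(d))
		return (int)ptr_err(d);

	/* d may be NULL here: carry on without the optional feature. */
	return 0;
}
```

With this shape, probe(0, 0) succeeds even though the helper returned NULL, which is the behaviour the revert restores.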
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Fixes: 18fc7c435be3 ("usb: xhci: tegra: Fix error check")
Cc: stable stable@kernel.org
Link: https://lore.kernel.org/r/8baace8d-fb4b-41a4-ad5f-848ae643a23b@moroto.mounta...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/host/xhci-tegra.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -1145,15 +1145,15 @@ static int tegra_xusb_powerdomain_init(s
 	int err;
 
 	tegra->genpd_dev_host = dev_pm_domain_attach_by_name(dev, "xusb_host");
-	if (IS_ERR_OR_NULL(tegra->genpd_dev_host)) {
-		err = PTR_ERR(tegra->genpd_dev_host) ? : -ENODATA;
+	if (IS_ERR(tegra->genpd_dev_host)) {
+		err = PTR_ERR(tegra->genpd_dev_host);
 		dev_err(dev, "failed to get host pm-domain: %d\n", err);
 		return err;
 	}
 
 	tegra->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "xusb_ss");
-	if (IS_ERR_OR_NULL(tegra->genpd_dev_ss)) {
-		err = PTR_ERR(tegra->genpd_dev_ss) ? : -ENODATA;
+	if (IS_ERR(tegra->genpd_dev_ss)) {
+		err = PTR_ERR(tegra->genpd_dev_ss);
 		dev_err(dev, "failed to get superspeed pm-domain: %d\n", err);
 		return err;
 	}
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 4fee0915e649bd0cea56dece6d96f8f4643df33c upstream.
Because the linux-distros group forces reporters to release information about reported bugs, and they impose arbitrary deadlines in having those bugs fixed despite not actually being kernel developers, the kernel security team recommends not interacting with them at all as this just causes confusion and the early-release of reported security problems.
Reviewed-by: Kees Cook keescook@chromium.org
Link: https://lore.kernel.org/r/2023063020-throat-pantyhose-f110@gregkh
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/process/security-bugs.rst | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

--- a/Documentation/process/security-bugs.rst
+++ b/Documentation/process/security-bugs.rst
@@ -63,20 +63,18 @@ information submitted to the security li
 of the report are treated confidentially even after the embargo has been
 lifted, in perpetuity.
 
-Coordination
-------------
+Coordination with other groups
+------------------------------
 
-Fixes for sensitive bugs, such as those that might lead to privilege
-escalations, may need to be coordinated with the private
-linux-distros@vs.openwall.org mailing list so that distribution vendors
-are well prepared to issue a fixed kernel upon public disclosure of the
-upstream fix. Distros will need some time to test the proposed patch and
-will generally request at least a few days of embargo, and vendor update
-publication prefers to happen Tuesday through Thursday. When appropriate,
-the security team can assist with this coordination, or the reporter can
-include linux-distros from the start. In this case, remember to prefix
-the email Subject line with "[vs]" as described in the linux-distros wiki:
-http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists
+The kernel security team strongly recommends that reporters of potential
+security issues NEVER contact the "linux-distros" mailing list until
+AFTER discussing it with the kernel security team. Do not Cc: both
+lists at once. You may contact the linux-distros mailing list after a
+fix has been agreed on and you fully understand the requirements that
+doing so will impose on you and the kernel community.
+
+The different lists have different goals and the linux-distros rules do
+not contribute to actually fixing any potential security problems.
 
 CVE assignment
 --------------
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit 3c1897ae4b6bc7cc586eda2feaa2cd68325ec29c upstream.
The kernel security team does NOT assign CVEs, so document that properly and provide the "if you want one, ask MITRE for it" response that we give on a weekly basis in the document, so we don't have to constantly say it to everyone who asks.
Link: https://lore.kernel.org/r/2023063022-retouch-kerosene-7e4a@gregkh
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/process/security-bugs.rst | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/process/security-bugs.rst b/Documentation/process/security-bugs.rst
index f12ac2316ce7..5a6993795bd2 100644
--- a/Documentation/process/security-bugs.rst
+++ b/Documentation/process/security-bugs.rst
@@ -79,13 +79,12 @@ not contribute to actually fixing any potential security problems.
 CVE assignment
 --------------
 
-The security team does not normally assign CVEs, nor do we require them
-for reports or fixes, as this can needlessly complicate the process and
-may delay the bug handling. If a reporter wishes to have a CVE identifier
-assigned ahead of public disclosure, they will need to contact the private
-linux-distros list, described above. When such a CVE identifier is known
-before a patch is provided, it is desirable to mention it in the commit
-message if the reporter agrees.
+The security team does not assign CVEs, nor do we require them for
+reports or fixes, as this can needlessly complicate the process and may
+delay the bug handling. If a reporter wishes to have a CVE identifier
+assigned, they should find one by themselves, for example by contacting
+MITRE directly. However under no circumstances will a patch inclusion
+be delayed to wait for a CVE identifier to arrive.
 
 Non-disclosure agreements
 -------------------------
From: Larry Finger Larry.Finger@lwfinger.net
commit ac83631230f77dda94154ed0ebfd368fc81c70a3 upstream.
In _r8712_init_xmit_priv(), memory is allocated in several places. If the first allocation succeeds and a later one fails, the routine leaks memory. This patch fixes commit 2865d42c78a9 ("staging: r8712u: Add the new driver to the mainline kernel"). A potential memory leak in r8712_xmit_resource_alloc() is also addressed.
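The unwind pattern used by this fix can be sketched in userspace C. This is a simplified illustration under stated assumptions, not the driver code: struct pool, pool_init(), test_alloc() and test_free() are hypothetical, and the allocator wrapper exists only so that a failure can be simulated and leaks counted.

```c
#include <errno.h>
#include <stdlib.h>

#define NBUF 4

struct pool {
	void *frame_buf;
	void *bufs[NBUF];
};

/* Hypothetical test hooks: make the Nth allocation fail, count live blocks. */
static int fail_at = -1;	/* allocation index that fails, -1 = never */
static int alloc_count;
static int live_allocs;

static void *test_alloc(size_t n)
{
	if (alloc_count++ == fail_at)
		return NULL;
	live_allocs++;
	return malloc(n);
}

static void test_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* On failure, release everything allocated so far, in reverse order. */
static int pool_init(struct pool *p)
{
	int i;

	p->frame_buf = test_alloc(64);
	if (!p->frame_buf)
		return -ENOMEM;

	for (i = 0; i < NBUF; i++) {
		p->bufs[i] = test_alloc(32);
		if (!p->bufs[i])
			goto clean_up_bufs;
	}
	return 0;

clean_up_bufs:
	while (--i >= 0) {		/* only the entries that succeeded */
		test_free(p->bufs[i]);
		p->bufs[i] = NULL;
	}
	test_free(p->frame_buf);
	p->frame_buf = NULL;
	return -ENOMEM;
}
```

A single exit path with labels keeps each resource freed exactly once, which is harder to guarantee with a separate "return -ENOMEM" at every failure site.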
Fixes: 2865d42c78a9 ("staging: r8712u: Add the new driver to the mainline kernel")
Reported-by: syzbot+cf71097ffb6755df8251@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/x/log.txt?x=11ac3fa0a80000
Cc: stable@vger.kernel.org
Cc: Nam Cao namcaov@gmail.com
Signed-off-by: Larry Finger Larry.Finger@lwfinger.net
Reviewed-by: Nam Cao namcaov@gmail.com
Link: https://lore.kernel.org/r/20230714175417.18578-1-Larry.Finger@lwfinger.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/rtl8712/rtl871x_xmit.c | 43 ++++++++++++++++++++++++++-------
 drivers/staging/rtl8712/xmit_linux.c | 6 ++++
 2 files changed, 40 insertions(+), 9 deletions(-)

--- a/drivers/staging/rtl8712/rtl871x_xmit.c
+++ b/drivers/staging/rtl8712/rtl871x_xmit.c
@@ -21,6 +21,7 @@
 #include "osdep_intf.h"
 #include "usb_ops.h"
 
+#include <linux/usb.h>
 #include <linux/ieee80211.h>
 
 static const u8 P802_1H_OUI[P80211_OUI_LEN] = {0x00, 0x00, 0xf8};
@@ -55,6 +56,7 @@ int _r8712_init_xmit_priv(struct xmit_pr
 	sint i;
 	struct xmit_buf *pxmitbuf;
 	struct xmit_frame *pxframe;
+	int j;
 
 	memset((unsigned char *)pxmitpriv, 0, sizeof(struct xmit_priv));
 	spin_lock_init(&pxmitpriv->lock);
@@ -117,11 +119,8 @@ int _r8712_init_xmit_priv(struct xmit_pr
 	_init_queue(&pxmitpriv->pending_xmitbuf_queue);
 	pxmitpriv->pallocated_xmitbuf =
 		kmalloc(NR_XMITBUFF * sizeof(struct xmit_buf) + 4, GFP_ATOMIC);
-	if (!pxmitpriv->pallocated_xmitbuf) {
-		kfree(pxmitpriv->pallocated_frame_buf);
-		pxmitpriv->pallocated_frame_buf = NULL;
-		return -ENOMEM;
-	}
+	if (!pxmitpriv->pallocated_xmitbuf)
+		goto clean_up_frame_buf;
 	pxmitpriv->pxmitbuf = pxmitpriv->pallocated_xmitbuf + 4 -
 			      ((addr_t)(pxmitpriv->pallocated_xmitbuf) & 3);
 	pxmitbuf = (struct xmit_buf *)pxmitpriv->pxmitbuf;
@@ -129,13 +128,17 @@ int _r8712_init_xmit_priv(struct xmit_pr
 		INIT_LIST_HEAD(&pxmitbuf->list);
 		pxmitbuf->pallocated_buf =
 			kmalloc(MAX_XMITBUF_SZ + XMITBUF_ALIGN_SZ, GFP_ATOMIC);
-		if (!pxmitbuf->pallocated_buf)
-			return -ENOMEM;
+		if (!pxmitbuf->pallocated_buf) {
+			j = 0;
+			goto clean_up_alloc_buf;
+		}
 		pxmitbuf->pbuf = pxmitbuf->pallocated_buf + XMITBUF_ALIGN_SZ -
 				 ((addr_t) (pxmitbuf->pallocated_buf) &
 				 (XMITBUF_ALIGN_SZ - 1));
-		if (r8712_xmit_resource_alloc(padapter, pxmitbuf))
-			return -ENOMEM;
+		if (r8712_xmit_resource_alloc(padapter, pxmitbuf)) {
+			j = 1;
+			goto clean_up_alloc_buf;
+		}
 		list_add_tail(&pxmitbuf->list,
 				 &(pxmitpriv->free_xmitbuf_queue.queue));
 		pxmitbuf++;
@@ -146,6 +149,28 @@ int _r8712_init_xmit_priv(struct xmit_pr
 	init_hwxmits(pxmitpriv->hwxmits, pxmitpriv->hwxmit_entry);
 	tasklet_setup(&pxmitpriv->xmit_tasklet, r8712_xmit_bh);
 	return 0;
+
+clean_up_alloc_buf:
+	if (j) {
+		/* failure happened in r8712_xmit_resource_alloc()
+		 * delete extra pxmitbuf->pallocated_buf
+		 */
+		kfree(pxmitbuf->pallocated_buf);
+	}
+	for (j = 0; j < i; j++) {
+		int k;
+
+		pxmitbuf--;			/* reset pointer */
+		kfree(pxmitbuf->pallocated_buf);
+		for (k = 0; k < 8; k++)		/* delete xmit urb's */
+			usb_free_urb(pxmitbuf->pxmit_urb[k]);
+	}
+	kfree(pxmitpriv->pallocated_xmitbuf);
+	pxmitpriv->pallocated_xmitbuf = NULL;
+clean_up_frame_buf:
+	kfree(pxmitpriv->pallocated_frame_buf);
+	pxmitpriv->pallocated_frame_buf = NULL;
+	return -ENOMEM;
 }
 
 void _free_xmit_priv(struct xmit_priv *pxmitpriv)
--- a/drivers/staging/rtl8712/xmit_linux.c
+++ b/drivers/staging/rtl8712/xmit_linux.c
@@ -112,6 +112,12 @@ int r8712_xmit_resource_alloc(struct _ad
 	for (i = 0; i < 8; i++) {
 		pxmitbuf->pxmit_urb[i] = usb_alloc_urb(0, GFP_KERNEL);
 		if (!pxmitbuf->pxmit_urb[i]) {
+			int k;
+
+			for (k = i - 1; k >= 0; k--) {
+				/* handle allocation errors part way through loop */
+				usb_free_urb(pxmitbuf->pxmit_urb[k]);
+			}
 			netdev_err(padapter->pnetdev, "pxmitbuf->pxmit_urb[i] == NULL\n");
 			return -ENOMEM;
 		}
From: Zhang Shurong zhang_shurong@foxmail.com
commit 5f1c7031e044cb2fba82836d55cc235e2ad619dc upstream.
The "enc->key_len" is a u16 that comes from the user. If it is larger than IW_ENCODING_TOKEN_MAX (64), the memcpy() into the key buffer could lead to memory corruption.
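The idea of bounding a user-supplied length before copying can be sketched as follows. This is a userspace illustration only: KEY_MAX stands in for IW_ENCODING_TOKEN_MAX, and struct stored_key and store_key() are hypothetical names, not part of the ks7010 driver.

```c
#include <stddef.h>
#include <string.h>

#define KEY_MAX 64	/* stand-in for IW_ENCODING_TOKEN_MAX */

struct stored_key {
	unsigned char val[KEY_MAX];
	size_t len;
};

/*
 * Copy at most KEY_MAX bytes, whatever length the caller claims.
 * Returns the number of bytes actually stored.
 */
static size_t store_key(struct stored_key *dst, const unsigned char *src,
			size_t claimed_len)
{
	size_t len = claimed_len > KEY_MAX ? KEY_MAX : claimed_len;

	memcpy(dst->val, src, len);
	dst->len = len;		/* record the clamped length, not the claim */
	return len;
}
```

Storing the clamped length as well as using it for the copy matters: later readers of dst->len would otherwise run past the buffer even though the copy itself was safe.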
Fixes: b121d84882b9 ("staging: ks7010: simplify calls to memcpy()")
Cc: stable stable@kernel.org
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com
Reviewed-by: Dan Carpenter dan.carpenter@linaro.org
Link: https://lore.kernel.org/r/tencent_5153B668C0283CAA15AA518325346E026A09@qq.co...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/staging/ks7010/ks_wlan_net.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/staging/ks7010/ks_wlan_net.c
+++ b/drivers/staging/ks7010/ks_wlan_net.c
@@ -1583,8 +1583,10 @@ static int ks_wlan_set_encode_ext(struct
 			commit |= SME_WEP_FLAG;
 		}
 		if (enc->key_len) {
-			memcpy(&key->key_val[0], &enc->key[0], enc->key_len);
-			key->key_len = enc->key_len;
+			int key_len = clamp_val(enc->key_len, 0, IW_ENCODING_TOKEN_MAX);
+
+			memcpy(&key->key_val[0], &enc->key[0], key_len);
+			key->key_len = key_len;
 			commit |= (SME_WEP_VAL1 << index);
 		}
 		break;
From: Chaoyuan Peng hedonistsmith@gmail.com
commit 9b9c8195f3f0d74a826077fc1c01b9ee74907239 upstream.
In gsm_cleanup_mux() the 'gsm->dlci' pointers were not cleared properly, leaving them dangling after gsm_dlci_release(). This leads to a use-after-free where 'gsm->dlci[0]' is freed and then accessed by a subsequent gsm_cleanup_mux() call.
Such is the case in the following call trace:

 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
 print_address_description+0x63/0x3b0 mm/kasan/report.c:248
 __kasan_report mm/kasan/report.c:434 [inline]
 kasan_report+0x16b/0x1c0 mm/kasan/report.c:451
 gsm_cleanup_mux+0x76a/0x850 drivers/tty/n_gsm.c:2397
 gsm_config drivers/tty/n_gsm.c:2653 [inline]
 gsmld_ioctl+0xaae/0x15b0 drivers/tty/n_gsm.c:2986
 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
 </TASK>

Allocated by task 3501:
 kasan_save_stack mm/kasan/common.c:38 [inline]
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:434 [inline]
 ____kasan_kmalloc+0xba/0xf0 mm/kasan/common.c:513
 kasan_kmalloc include/linux/kasan.h:264 [inline]
 kmem_cache_alloc_trace+0x143/0x290 mm/slub.c:3247
 kmalloc include/linux/slab.h:591 [inline]
 kzalloc include/linux/slab.h:721 [inline]
 gsm_dlci_alloc+0x53/0x3a0 drivers/tty/n_gsm.c:1932
 gsm_activate_mux+0x1c/0x330 drivers/tty/n_gsm.c:2438
 gsm_config drivers/tty/n_gsm.c:2677 [inline]
 gsmld_ioctl+0xd46/0x15b0 drivers/tty/n_gsm.c:2986
 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x61/0xcb

Freed by task 3501:
 kasan_save_stack mm/kasan/common.c:38 [inline]
 kasan_set_track+0x4b/0x80 mm/kasan/common.c:46
 kasan_set_free_info+0x1f/0x40 mm/kasan/generic.c:360
 ____kasan_slab_free+0xd8/0x120 mm/kasan/common.c:366
 kasan_slab_free include/linux/kasan.h:230 [inline]
 slab_free_hook mm/slub.c:1705 [inline]
 slab_free_freelist_hook+0xdd/0x160 mm/slub.c:1731
 slab_free mm/slub.c:3499 [inline]
 kfree+0xf1/0x270 mm/slub.c:4559
 dlci_put drivers/tty/n_gsm.c:1988 [inline]
 gsm_dlci_release drivers/tty/n_gsm.c:2021 [inline]
 gsm_cleanup_mux+0x574/0x850 drivers/tty/n_gsm.c:2415
 gsm_config drivers/tty/n_gsm.c:2653 [inline]
 gsmld_ioctl+0xaae/0x15b0 drivers/tty/n_gsm.c:2986
 tty_ioctl+0x8ff/0xc50 drivers/tty/tty_io.c:2816
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl+0xf1/0x160 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
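The failure mode in the traces above, a cleanup routine that runs twice over an array of already released pointers, can be sketched in userspace C. The names here (struct mux, release_slot(), cleanup_mux()) are hypothetical and much simpler than the n_gsm code; the sketch only shows why clearing the slot after release makes the cleanup safe to repeat.

```c
#include <stdlib.h>

#define NUM_SLOTS 4

struct mux {
	int *slot[NUM_SLOTS];
};

static int releases;	/* counts calls to release_slot(), for illustration */

static void release_slot(int *s)
{
	releases++;
	free(s);
}

/* Safe to call more than once on the same mux. */
static void cleanup_mux(struct mux *m)
{
	int i;

	for (i = NUM_SLOTS - 1; i >= 0; i--) {
		if (m->slot[i]) {
			release_slot(m->slot[i]);
			/* Without this, a second call would release it again. */
			m->slot[i] = NULL;
		}
	}
}
```

After the first cleanup_mux() call every slot is NULL, so a second call finds nothing to release.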
Fixes: aa371e96f05d ("tty: n_gsm: fix restart handling via CLD command")
Signed-off-by: Chaoyuan Peng hedonistsmith@gmail.com
Cc: stable stable@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/tty/n_gsm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -3070,8 +3070,10 @@ static void gsm_cleanup_mux(struct gsm_m
 		gsm->has_devices = false;
 	}
 	for (i = NUM_DLCI - 1; i >= 0; i--)
-		if (gsm->dlci[i])
+		if (gsm->dlci[i]) {
 			gsm_dlci_release(gsm->dlci[i]);
+			gsm->dlci[i] = NULL;
+		}
 	mutex_unlock(&gsm->mutex);
 	/* Now wipe the queues */
 	tty_ldisc_flush(gsm->tty);
From: Oliver Neukum oneukum@suse.com
commit 5bef4b3cb95a5b883dfec8b3ffc0d671323d55bb upstream.
This reverts commit 5255660b208aebfdb71d574f3952cf48392f4306.
This quirk breaks at least the following hardware:
0b:00.0 0c03: 1106:3483 (rev 01) (prog-if 30 [XHCI])
        Subsystem: 1106:3483
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 66
        Region 0: Memory at fb400000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [80] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [90] MSI: Enable+ Count=1/4 Maskable- 64bit+
                Address: 00000000fee007b8 Data: 0000
        Capabilities: [c4] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 89W
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <2us, L1 <16us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x1
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [100 v1] Advanced Error Reporting
                UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
with the quirk enabled it fails early with
[ 0.754373] pci 0000:0b:00.0: xHCI HW did not halt within 32000 usec status = 0x1000
[ 0.754419] pci 0000:0b:00.0: quirk_usb_early_handoff+0x0/0x7a0 took 31459 usecs
[ 2.228048] xhci_hcd 0000:0b:00.0: xHCI Host Controller
[ 2.228053] xhci_hcd 0000:0b:00.0: new USB bus registered, assigned bus number 7
[ 2.260073] xhci_hcd 0000:0b:00.0: Host halt failed, -110
[ 2.260079] xhci_hcd 0000:0b:00.0: can't setup: -110
[ 2.260551] xhci_hcd 0000:0b:00.0: USB bus 7 deregistered
[ 2.260624] xhci_hcd 0000:0b:00.0: init 0000:0b:00.0 fail, -110
[ 2.260639] xhci_hcd: probe of 0000:0b:00.0 failed with error -110
The hardware in question is an external PCIe card. It looks to me like the quirk needs to be narrowed down. But this needs information about the hardware showing the issue this quirk is to fix. So for now a clean revert.
Signed-off-by: Oliver Neukum oneukum@suse.com
Fixes: 5255660b208a ("xhci: add quirk for host controllers that don't update endpoint DCS")
Cc: stable stable@kernel.org
Link: https://lore.kernel.org/r/20230713112830.21773-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/usb/host/xhci-pci.c | 4 +---
 drivers/usb/host/xhci-ring.c | 25 +------------------------
 2 files changed, 2 insertions(+), 27 deletions(-)

--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -486,10 +486,8 @@ static void xhci_pci_quirks(struct devic
 			pdev->device == 0x3432)
 		xhci->quirks |= XHCI_BROKEN_STREAMS;
 
-	if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483) {
+	if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
 		xhci->quirks |= XHCI_LPM_SUPPORT;
-		xhci->quirks |= XHCI_EP_CTX_BROKEN_DCS;
-	}
 
 	if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
 		pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -592,11 +592,8 @@ static int xhci_move_dequeue_past_td(str
 	struct xhci_ring *ep_ring;
 	struct xhci_command *cmd;
 	struct xhci_segment *new_seg;
-	struct xhci_segment *halted_seg = NULL;
 	union xhci_trb *new_deq;
 	int new_cycle;
-	union xhci_trb *halted_trb;
-	int index = 0;
 	dma_addr_t addr;
 	u64 hw_dequeue;
 	bool cycle_found = false;
@@ -634,27 +631,7 @@ static int xhci_move_dequeue_past_td(str
 	hw_dequeue = xhci_get_hw_deq(xhci, dev, ep_index, stream_id);
 	new_seg = ep_ring->deq_seg;
 	new_deq = ep_ring->dequeue;
-
-	/*
-	 * Quirk: xHC write-back of the DCS field in the hardware dequeue
-	 * pointer is wrong - use the cycle state of the TRB pointed to by
-	 * the dequeue pointer.
-	 */
-	if (xhci->quirks & XHCI_EP_CTX_BROKEN_DCS &&
-	    !(ep->ep_state & EP_HAS_STREAMS))
-		halted_seg = trb_in_td(xhci, td->start_seg,
-				       td->first_trb, td->last_trb,
-				       hw_dequeue & ~0xf, false);
-	if (halted_seg) {
-		index = ((dma_addr_t)(hw_dequeue & ~0xf) - halted_seg->dma) /
-			 sizeof(*halted_trb);
-		halted_trb = &halted_seg->trbs[index];
-		new_cycle = halted_trb->generic.field[3] & 0x1;
-		xhci_dbg(xhci, "Endpoint DCS = %d TRB index = %d cycle = %d\n",
-			 (u8)(hw_dequeue & 0x1), index, new_cycle);
-	} else {
-		new_cycle = hw_dequeue & 0x1;
-	}
+	new_cycle = hw_dequeue & 0x1;
 
 	/*
 	 * We want to find the pointer, segment and cycle state of the new trb
From: Pavel Asyutchenko svenpavel@gmail.com
commit 8019a4ab3d80c7af391a646cccff953753fc025f upstream.
The ASUS ROG Strix G17 2023 (G713PV) laptop has a CS35L41 amp connected via I2C.
With this patch speakers begin to work if the missing _DSD properties are added to ACPI tables.
Signed-off-by: Pavel Asyutchenko svenpavel@gmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230726223732.20775-1-svenpavel@gmail.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_realtek.c | 1 +
 1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9628,6 +9628,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x1043, 0x1c92, "ASUS ROG Strix G15", ALC285_FIXUP_ASUS_G533Z_PINS),
 	SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_HEADSET_MIC),
 	SND_PCI_QUIRK(0x1043, 0x1ccd, "ASUS X555UB", ALC256_FIXUP_ASUS_MIC),
+	SND_PCI_QUIRK(0x1043, 0x1d1f, "ASUS ROG Strix G17 2023 (G713PV)", ALC287_FIXUP_CS35L41_I2C_2),
 	SND_PCI_QUIRK(0x1043, 0x1d42, "ASUS Zephyrus G14 2022", ALC289_FIXUP_ASUS_GA401),
 	SND_PCI_QUIRK(0x1043, 0x1d4e, "ASUS TM420", ALC256_FIXUP_ASUS_HPE),
 	SND_PCI_QUIRK(0x1043, 0x1e02, "ASUS UX3402", ALC245_FIXUP_CS35L41_SPI_2),
From: Luka Guzenko l.guzenko@web.de
commit d510acb610e6aa07a04b688236868b2a5fd60deb upstream.
This HP notebook uses the ALC236 codec, with COEF 0x07 idx 1 controlling the mute LED. Enable the already existing quirk for this device.
Signed-off-by: Luka Guzenko l.guzenko@web.de
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230725111509.623773-1-l.guzenko@web.de
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_realtek.c | 1 +
 1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9502,6 +9502,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x103c, 0x880d, "HP EliteBook 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
 	SND_PCI_QUIRK(0x103c, 0x8811, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
 	SND_PCI_QUIRK(0x103c, 0x8812, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
+	SND_PCI_QUIRK(0x103c, 0x881d, "HP 250 G8 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
 	SND_PCI_QUIRK(0x103c, 0x8846, "HP EliteBook 850 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
 	SND_PCI_QUIRK(0x103c, 0x8847, "HP EliteBook x360 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
 	SND_PCI_QUIRK(0x103c, 0x884b, "HP EliteBook 840 Aero G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
From: Baskaran Kannan Baski.Kannan@amd.com
commit e146503ac68418859fb063a3a0cd9ec93bc52238 upstream.
The AMD industrial processor i3255 supports temperatures from -40 to 105 degrees Celsius. The current implementation of k10temp_read_temp() rounds any negative temperature up to 0. To fix this, the following changes have been made.

A flag 'disp_negative' is added to struct k10temp_data to support AMD i3255 processors. The flag is set if a 3255 processor is found during k10temp_probe(), and k10temp_read_temp() uses it to decide whether to round negative temperatures to 0.
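The flag-gated clamping described above can be sketched in userspace C. This is an illustration only: struct temp_sensor, sensor_init() and report_temp() are hypothetical names, the model strings used with them are made up, and the sketch omits the CPU family check that the real probe code also performs.

```c
#include <stdbool.h>
#include <string.h>

struct temp_sensor {
	bool disp_negative;	/* report sub-zero readings instead of clamping */
};

/* Hypothetical probe-time helper: detect the part by its model string. */
static void sensor_init(struct temp_sensor *s, const char *model_id)
{
	s->disp_negative = strstr(model_id, "3255") != NULL;
}

/* Clamp negative readings to 0 unless the part is known to support them. */
static long report_temp(const struct temp_sensor *s, long raw_millideg)
{
	if (raw_millideg < 0 && !s->disp_negative)
		return 0;
	return raw_millideg;
}
```

Parts that are not matched keep the old behaviour, so the change stays limited to the one processor known to operate below zero.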
Signed-off-by: Baskaran Kannan Baski.Kannan@amd.com
Link: https://lore.kernel.org/r/20230727162159.1056136-1-Baski.Kannan@amd.com
Fixes: aef17ca12719 ("hwmon: (k10temp) Only apply temperature offset if result is positive")
Cc: stable@vger.kernel.org
[groeck: Fixed multi-line comment]
Signed-off-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hwmon/k10temp.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -77,6 +77,13 @@ static DEFINE_MUTEX(nb_smu_ind_mutex);
 #define ZEN_CUR_TEMP_RANGE_SEL_MASK		BIT(19)
 #define ZEN_CUR_TEMP_TJ_SEL_MASK		GENMASK(17, 16)
 
+/*
+ * AMD's Industrial processor 3255 supports temperature from -40 deg to 105 deg Celsius.
+ * Use the model name to identify 3255 CPUs and set a flag to display negative temperature.
+ * Do not round off to zero for negative Tctl or Tdie values if the flag is set
+ */
+#define AMD_I3255_STR				"3255"
+
 struct k10temp_data {
 	struct pci_dev *pdev;
 	void (*read_htcreg)(struct pci_dev *pdev, u32 *regval);
@@ -86,6 +93,7 @@ struct k10temp_data {
 	u32 show_temp;
 	bool is_zen;
 	u32 ccd_offset;
+	bool disp_negative;
 };
 
 #define TCTL_BIT	0
@@ -204,12 +212,12 @@ static int k10temp_read_temp(struct devi
 		switch (channel) {
 		case 0:		/* Tctl */
 			*val = get_raw_temp(data);
-			if (*val < 0)
+			if (*val < 0 && !data->disp_negative)
 				*val = 0;
 			break;
 		case 1:		/* Tdie */
 			*val = get_raw_temp(data) - data->temp_offset;
-			if (*val < 0)
+			if (*val < 0 && !data->disp_negative)
 				*val = 0;
 			break;
 		case 2 ... 13:		/* Tccd{1-12} */
@@ -405,6 +413,11 @@ static int k10temp_probe(struct pci_dev
 	data->pdev = pdev;
 	data->show_temp |= BIT(TCTL_BIT);	/* Always show Tctl */
 
+	if (boot_cpu_data.x86 == 0x17 &&
+	    strstr(boot_cpu_data.x86_model_id, AMD_I3255_STR)) {
+		data->disp_negative = true;
+	}
+
 	if (boot_cpu_data.x86 == 0x15 &&
 	    ((boot_cpu_data.x86_model & 0xf0) == 0x60 ||
 	     (boot_cpu_data.x86_model & 0xf0) == 0x70)) {
From: Gilles Buloz Gilles.Buloz@kontron.com
commit 54685abe660a59402344d5045ce08c43c6a5ac42 upstream.
Because the hex value 0x46 was used instead of decimal 46, the temp6 (PECI1) temperature is always declared visible and is therefore displayed even if it is disabled in the chip.
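The effect of the stray "0x" prefix can be shown with a small userspace sketch. The two helper functions are hypothetical and only mirror the shape of the PECI 1 condition from the patch; they return true when the attribute would be hidden.

```c
#include <stdbool.h>

/* Buggy form: 0x46 is hexadecimal, that is decimal 70, not 46. */
static bool peci1_hidden_buggy(int index, unsigned int reg)
{
	return index >= 0x46 && !(reg & 0x02);
}

/* Fixed form: attributes from index 46 upward belong to PECI 1. */
static bool peci1_hidden_fixed(int index, unsigned int reg)
{
	return index >= 46 && !(reg & 0x02);
}
```

With PECI 1 disabled (bit 0x02 clear), an attribute at index 46 is hidden by the fixed check but slips through the buggy one, because 46 is below 70.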
Signed-off-by: Gilles Buloz gilles.buloz@kontron.com
Link: https://lore.kernel.org/r/DU0PR10MB62526435ADBC6A85243B90E08002A@DU0PR10MB62...
Fixes: fcdc5739dce03 ("hwmon: (nct7802) add temperature sensor type attribute")
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hwmon/nct7802.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hwmon/nct7802.c
+++ b/drivers/hwmon/nct7802.c
@@ -725,7 +725,7 @@ static umode_t nct7802_temp_is_visible(s
 	if (index >= 38 && index < 46 && !(reg & 0x01))		/* PECI 0 */
 		return 0;

-	if (index >= 0x46 && (!(reg & 0x02)))			/* PECI 1 */
+	if (index >= 46 && !(reg & 0x02))			/* PECI 1 */
 		return 0;

 	return attr->mode;
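To see why the hex literal matters: 0x46 is 70 in decimal, so attribute indexes 46 through 69 never matched the old test. A small standalone sketch, with hypothetical helper names and only the comparison taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Old test: 0x46 is 70, so indexes 46..69 are never hidden. */
bool peci1_hidden_buggy(int index, unsigned int reg)
{
	return index >= 0x46 && !(reg & 0x02);
}

/* Fixed test: PECI 1 attributes start at decimal index 46. */
bool peci1_hidden_fixed(int index, unsigned int reg)
{
	return index >= 46 && !(reg & 0x02);
}
```

With PECI 1 disabled (bit 0x02 clear), index 46 stays visible under the old test and is hidden under the fixed one.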
From: Aleksa Savic savicaleksa83@gmail.com
commit a746b3689546da27125da9ccaea62b1dbaaf927c upstream.
Commit 662d20b3a5af ("hwmon: (aquacomputer_d5next) Add support for temperature sensor offsets") changed aqc_get_ctrl_val() to return the value through a parameter instead of through the return value, but didn't fix up a case that relied on the old behavior. Fix it to use the proper received value and not the return code.
Fixes: 662d20b3a5af ("hwmon: (aquacomputer_d5next) Add support for temperature sensor offsets")
Cc: stable@vger.kernel.org
Signed-off-by: Aleksa Savic savicaleksa83@gmail.com
Link: https://lore.kernel.org/r/20230714120712.16721-1-savicaleksa83@gmail.com
Signed-off-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hwmon/aquacomputer_d5next.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hwmon/aquacomputer_d5next.c
+++ b/drivers/hwmon/aquacomputer_d5next.c
@@ -969,7 +969,7 @@ static int aqc_read(struct device *dev,
 			if (ret < 0)
 				return ret;

-			*val = aqc_percent_to_pwm(ret);
+			*val = aqc_percent_to_pwm(*val);
 			break;
 		}
 		break;
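The bug pattern here is a helper that moved its result from the return value to an out-parameter while a caller kept converting the return code. A rough userspace sketch follows; get_ctrl_val() and percent_to_pwm() are hypothetical stand-ins, and the scaling (hundredths of a percent to 0..255) is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: returns 0 or a negative errno, value goes to *val. */
int get_ctrl_val(long *val)
{
	assert(val != NULL);
	*val = 5000;	/* 50.00 %, in hundredths of a percent (assumed unit) */
	return 0;
}

/* Hypothetical conversion: hundredths of a percent -> 0..255, rounded. */
long percent_to_pwm(long val)
{
	return (val * 255 + 5000) / 10000;
}

/* Old behaviour: converts the status code (0) instead of the value. */
int read_pwm_buggy(long *val)
{
	int ret = get_ctrl_val(val);

	if (ret < 0)
		return ret;
	*val = percent_to_pwm(ret);
	return 0;
}

/* Fixed behaviour: converts the value received through the parameter. */
int read_pwm_fixed(long *val)
{
	int ret = get_ctrl_val(val);

	if (ret < 0)
		return ret;
	*val = percent_to_pwm(*val);
	return 0;
}
```

In this sketch the buggy reader always reports 0, because a successful call returns 0, while the fixed reader reports 128 for a 50 % setting.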
From: Patrick Rudolph patrick.rudolph@9elements.com
commit 55aab08f1856894d7d47d0ee23abbb4bc4854345 upstream.
Refactor pmbus_is_enabled() to return the status without any additional processing as it is already done in _pmbus_is_enabled().
Fixes: df5f6b6af01c ("hwmon: (pmbus/core) Generalise pmbus get status")
Cc: stable@vger.kernel.org # v6.4
Signed-off-by: Patrick Rudolph patrick.rudolph@9elements.com
Signed-off-by: Naresh Solanki Naresh.Solanki@9elements.com
Link: https://lore.kernel.org/r/20230725125428.3966803-1-Naresh.Solanki@9elements....
[groeck: Rephrased commit message]
Signed-off-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hwmon/pmbus/pmbus_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c
index fa06325f5a7c..42fb7286805b 100644
--- a/drivers/hwmon/pmbus/pmbus_core.c
+++ b/drivers/hwmon/pmbus/pmbus_core.c
@@ -2768,7 +2768,7 @@ static int __maybe_unused pmbus_is_enabled(struct device *dev, u8 page)
 	ret = _pmbus_is_enabled(dev, page);
 	mutex_unlock(&data->update_lock);

-	return !!(ret & PB_OPERATION_CONTROL_ON);
+	return ret;
 }

 #define to_dev_attr(_dev_attr) \
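The effect of masking an already-normalized result can be shown with a few lines of plain C. This sketch assumes the "on" flag is bit 7 (0x80) of the OPERATION byte; the function names are hypothetical and error returns are left out.

```c
#include <assert.h>

#define OPERATION_CONTROL_ON 0x80	/* assumed: bit 7 of OPERATION */

/* Inner helper: already normalizes the flag to 0 or 1. */
int inner_is_enabled(unsigned char operation)
{
	return !!(operation & OPERATION_CONTROL_ON);
}

/* Old wrapper: masks the 0/1 result again, and 1 & 0x80 is 0. */
int is_enabled_buggy(unsigned char operation)
{
	int ret = inner_is_enabled(operation);

	return !!(ret & OPERATION_CONTROL_ON);
}

/* Fixed wrapper: returns the inner result unchanged. */
int is_enabled_fixed(unsigned char operation)
{
	return inner_is_enabled(operation);
}
```

Under these assumptions the old wrapper reports "disabled" even when the flag is set.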
From: Patrick Rudolph patrick.rudolph@9elements.com
commit 0bd66784274a287beada2933c2c0fa3a0ddae0d7 upstream.
Pass i2c_client to _pmbus_is_enabled to drop the assumption that a regulator device is passed in.
This will fix the issue of a NULL pointer dereference when called from _pmbus_get_flags.
Fixes: df5f6b6af01c ("hwmon: (pmbus/core) Generalise pmbus get status") Cc: stable@vger.kernel.org # v6.4 Signed-off-by: Patrick Rudolph patrick.rudolph@9elements.com Signed-off-by: Naresh Solanki Naresh.Solanki@9elements.com Link: https://lore.kernel.org/r/20230725125428.3966803-2-Naresh.Solanki@9elements.... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hwmon/pmbus/pmbus_core.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c index 42fb7286805b..30aeb59062a5 100644 --- a/drivers/hwmon/pmbus/pmbus_core.c +++ b/drivers/hwmon/pmbus/pmbus_core.c @@ -2745,9 +2745,8 @@ static const struct pmbus_status_category __maybe_unused pmbus_status_flag_map[] }, };
-static int _pmbus_is_enabled(struct device *dev, u8 page) +static int _pmbus_is_enabled(struct i2c_client *client, u8 page) { - struct i2c_client *client = to_i2c_client(dev->parent); int ret;
ret = _pmbus_read_byte_data(client, page, PMBUS_OPERATION); @@ -2758,14 +2757,13 @@ static int _pmbus_is_enabled(struct device *dev, u8 page) return !!(ret & PB_OPERATION_CONTROL_ON); }
-static int __maybe_unused pmbus_is_enabled(struct device *dev, u8 page) +static int __maybe_unused pmbus_is_enabled(struct i2c_client *client, u8 page) { - struct i2c_client *client = to_i2c_client(dev->parent); struct pmbus_data *data = i2c_get_clientdata(client); int ret;
mutex_lock(&data->update_lock); - ret = _pmbus_is_enabled(dev, page); + ret = _pmbus_is_enabled(client, page); mutex_unlock(&data->update_lock);
return ret; @@ -2844,7 +2842,7 @@ static int _pmbus_get_flags(struct pmbus_data *data, u8 page, unsigned int *flag if (status < 0) return status;
- if (_pmbus_is_enabled(dev, page)) { + if (_pmbus_is_enabled(client, page)) { if (status & PB_STATUS_OFF) { *flags |= REGULATOR_ERROR_FAIL; *event |= REGULATOR_EVENT_FAIL; @@ -2898,7 +2896,10 @@ static int __maybe_unused pmbus_get_flags(struct pmbus_data *data, u8 page, unsi #if IS_ENABLED(CONFIG_REGULATOR) static int pmbus_regulator_is_enabled(struct regulator_dev *rdev) { - return pmbus_is_enabled(rdev_get_dev(rdev), rdev_get_id(rdev)); + struct device *dev = rdev_get_dev(rdev); + struct i2c_client *client = to_i2c_client(dev->parent); + + return pmbus_is_enabled(client, rdev_get_id(rdev)); }
static int _pmbus_regulator_on_off(struct regulator_dev *rdev, bool enable)
From: Guenter Roeck linux@roeck-us.net
commit b84000f2274520f73ac9dc59fd9403260b61c4e7 upstream.
pmbus_regulator_get_status() acquires update_lock. pmbus_regulator_get_error_flags() acquires it again, resulting in an immediate deadlock.
Call _pmbus_get_flags() from pmbus_regulator_get_status() directly to avoid the problem.
Reported-by: Patrick Rudolph patrick.rudolph@9elements.com
Closes: https://lore.kernel.org/linux-hwmon/b7a3ad85-aab4-4718-a001-1d8b1c0eef36@roe...
Cc: Naresh Solanki Naresh.Solanki@9elements.com
Cc: stable@vger.kernel.org # v6.2+
Fixes: c05f477c4ba3 ("hwmon: (pmbus/core) Implement regulator get_status")
Signed-off-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/hwmon/pmbus/pmbus_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/hwmon/pmbus/pmbus_core.c
+++ b/drivers/hwmon/pmbus/pmbus_core.c
@@ -2946,6 +2946,7 @@ static int pmbus_regulator_get_status(st
 	struct pmbus_data *data = i2c_get_clientdata(client);
 	u8 page = rdev_get_id(rdev);
 	int status, ret;
+	int event;

 	mutex_lock(&data->update_lock);
 	status = pmbus_get_status(client, page, PMBUS_STATUS_WORD);
@@ -2965,7 +2966,7 @@ static int pmbus_regulator_get_status(st
 		goto unlock;
 	}

-	ret = pmbus_regulator_get_error_flags(rdev, &status);
+	ret = _pmbus_get_flags(data, rdev_get_id(rdev), &status, &event, false);
 	if (ret)
 		goto unlock;

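The locking problem is the usual one of a locked wrapper being called while the same non-recursive lock is already held. The sketch below uses a toy lock that returns -EDEADLK instead of hanging, so the difference can be observed; every name in it is hypothetical and it does not model the real mutex API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: reports -EDEADLK where a real mutex would hang. */
struct toy_lock {
	bool held;
};

int toy_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

void toy_release(struct toy_lock *l)
{
	l->held = false;
}

/* Unlocked worker: the caller must already hold the lock. */
int get_flags_unlocked(struct toy_lock *l)
{
	assert(l->held);
	return 0;
}

/* Locked wrapper around the worker. */
int get_flags_locked(struct toy_lock *l)
{
	int ret = toy_acquire(l);

	if (ret)
		return ret;
	ret = get_flags_unlocked(l);
	toy_release(l);
	return ret;
}

/* Old flow: takes the lock, then calls the locked wrapper. */
int get_status_buggy(struct toy_lock *l)
{
	int ret = toy_acquire(l);

	if (ret)
		return ret;
	ret = get_flags_locked(l);	/* second acquire of the same lock */
	toy_release(l);
	return ret;
}

/* Fixed flow: takes the lock once and calls the unlocked worker. */
int get_status_fixed(struct toy_lock *l)
{
	int ret = toy_acquire(l);

	if (ret)
		return ret;
	ret = get_flags_unlocked(l);
	toy_release(l);
	return ret;
}
```

The fixed flow mirrors the patch: the caller that already owns the lock uses the underscore-prefixed, unlocked variant directly.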
From: Naohiro Aota naohiro.aota@wdc.com
commit 95ca6599a589ee84c69f02d0e1d928c8d1367fb1 upstream.
The zoned mode needs to reset a zone before using it. We rely on btrfs's original discard functionality (discarding unused block group ranges) to do the resetting.
While commit 63a7cb130718 ("btrfs: auto enable discard=async when possible") made discard run asynchronously, a zone reset does not need to be async, as it is fast enough.
Even worse, delaying zone resets prevents those zones from being used again. So, let's disable async discard in zoned mode.
Fixes: 63a7cb130718 ("btrfs: auto enable discard=async when possible") CC: stable@vger.kernel.org # 6.3+ Reviewed-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Reviewed-by: David Sterba dsterba@suse.com [ update message text ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 7 ++++++- fs/btrfs/zoned.c | 3 +++ 2 files changed, 9 insertions(+), 1 deletion(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3692,11 +3692,16 @@ int __cold open_ctree(struct super_block * For devices supporting discard turn on discard=async automatically, * unless it's already set or disabled. This could be turned off by * nodiscard for the same mount. + * + * The zoned mode piggy backs on the discard functionality for + * resetting a zone. There is no reason to delay the zone reset as it is + * fast enough. So, do not enable async discard for zoned mode. */ if (!(btrfs_test_opt(fs_info, DISCARD_SYNC) || btrfs_test_opt(fs_info, DISCARD_ASYNC) || btrfs_test_opt(fs_info, NODISCARD)) && - fs_info->fs_devices->discardable) { + fs_info->fs_devices->discardable && + !btrfs_is_zoned(fs_info)) { btrfs_set_and_info(fs_info, DISCARD_ASYNC, "auto enabling async discard"); } --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -804,6 +804,9 @@ int btrfs_check_mountopts_zoned(struct b return -EINVAL; }
+ btrfs_clear_and_info(info, DISCARD_ASYNC, + "zoned: async discard ignored and disabled for zoned mode"); + return 0; }
From: Filipe Manana fdmanana@suse.com
commit 8dbfc14fc736eb701089aff09645c3d4ad3decb1 upstream.
When using the block group tree feature, this tree is a critical tree just like the extent, csum and free space trees, and just like them it uses the delayed refs block reserve.
So take into account the block group tree, and its current size, when calculating the size for the global reserve.
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Filipe Manana fdmanana@suse.com
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/btrfs/block-rsv.c | 5 +++++
 1 file changed, 5 insertions(+)

--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -349,6 +349,11 @@ void btrfs_update_global_block_rsv(struc
 	}
 	read_unlock(&fs_info->global_root_lock);

+	if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE)) {
+		num_bytes += btrfs_root_used(&fs_info->block_group_root->root_item);
+		min_items++;
+	}
+
 	/*
 	 * But we also want to reserve enough space so we can do the fallback
 	 * global reserve for an unlink, which is an additional
From: Filipe Manana fdmanana@suse.com
commit bf7ecbe9875061bf3fce1883e3b26b77f847d1e8 upstream.
At btrfs_wait_for_commit() we wait for a transaction to finish and then always return 0 (success) without checking if it was aborted, in which case the transaction didn't happen due to some critical error. Fix this by checking if the transaction was aborted.
Fixes: 462045928bda ("Btrfs: add START_SYNC, WAIT_SYNC ioctls")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Qu Wenruo wqu@suse.com
Signed-off-by: Filipe Manana fdmanana@suse.com
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/btrfs/transaction.c | 1 +
 1 file changed, 1 insertion(+)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -933,6 +933,7 @@ int btrfs_wait_for_commit(struct btrfs_f
 	}

 	wait_for_commit(cur_trans, TRANS_STATE_COMPLETED);
+	ret = cur_trans->aborted;
 	btrfs_put_transaction(cur_trans);
 out:
 	return ret;
From: Filipe Manana fdmanana@suse.com
commit b28ff3a7d7e97456fd86b68d24caa32e1cfa7064 upstream.
btrfs_attach_transaction_barrier() is used to get a handle pointing to the current running transaction if the transaction has not started its commit yet (its state is < TRANS_STATE_COMMIT_START). If the transaction commit has started, then we wait for the transaction to commit and finish before returning. However, we completely ignore whether the transaction was aborted due to some error during its commit and simply return ERR_PTR(-ENOENT), which makes the caller assume everything is fine and no errors happened.
This could make an fsync return success (0) to user space when in fact we had a transaction abort and the target inode changes were therefore not persisted.
Fix this by checking for the return value from btrfs_wait_for_commit(), and if it returned an error, return it back to the caller.
Fixes: d4edf39bd5db ("Btrfs: fix uncompleted transaction")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Qu Wenruo wqu@suse.com
Signed-off-by: Filipe Manana fdmanana@suse.com
Reviewed-by: David Sterba dsterba@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/btrfs/transaction.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -828,8 +828,13 @@ btrfs_attach_transaction_barrier(struct

 	trans = start_transaction(root, 0, TRANS_ATTACH,
 				  BTRFS_RESERVE_NO_FLUSH, true);
-	if (trans == ERR_PTR(-ENOENT))
-		btrfs_wait_for_commit(root->fs_info, 0);
+	if (trans == ERR_PTR(-ENOENT)) {
+		int ret;
+
+		ret = btrfs_wait_for_commit(root->fs_info, 0);
+		if (ret)
+			return ERR_PTR(ret);
+	}

 	return trans;
 }
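The two btrfs transaction fixes together make an abort visible to the caller through the error-pointer convention. The sketch below re-creates that convention in userspace; err_ptr(), ptr_err() and is_err() are stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), and wait_for_commit_result() and attach_barrier() are hypothetical simplifications that only cover the "commit already started" path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR(). */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical: 0 if the commit completed, negative errno if it aborted. */
int wait_for_commit_result(int aborted)
{
	return aborted;
}

/*
 * Hypothetical analogue of the fixed barrier for the case where the commit
 * has already started: an abort error replaces the benign -ENOENT.
 */
void *attach_barrier(int aborted)
{
	void *trans = err_ptr(-ENOENT);

	if (trans == err_ptr(-ENOENT)) {
		int ret = wait_for_commit_result(aborted);

		if (ret)
			return err_ptr(ret);
	}
	return trans;
}
```

A caller that treats -ENOENT as "nothing to do" now sees a different error, for example -EIO, when the transaction it waited for was aborted.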
From: Yazen Ghannam yazen.ghannam@amd.com
commit 3ba2e83334bed2b1980b59734e6e84dfaf96026c upstream.
AMD systems from Family 10h to 16h share MCA bank 4 across multiple CPUs. Therefore, the threshold_bank structure for bank 4, and its threshold_block structures, will be initialized once at boot time. And the kobject for the shared bank will be added to each of the CPUs that share it. Furthermore, the threshold_blocks for the shared bank will be added again to the bank's kobject. These additions will increase the refcount for the bank's kobject.
For example, a shared bank with two blocks and shared across two CPUs will be set up like this:
  CPU0 init
    bank create and add; bank refcount = 1; threshold_create_bank()
      block 0 init and add; bank refcount = 2; allocate_threshold_blocks()
      block 1 init and add; bank refcount = 3; allocate_threshold_blocks()
  CPU1 init
    bank add; bank refcount = 3; threshold_create_bank()
      block 0 add; bank refcount = 4; __threshold_add_blocks()
      block 1 add; bank refcount = 5; __threshold_add_blocks()
Currently in threshold_remove_bank(), if the bank is shared then __threshold_remove_blocks() is called. Here the shared bank's kobject and the bank's blocks' kobjects are deleted. This is done on the first call even while the structures are still shared. Subsequent calls from other CPUs that share the structures will attempt to delete the kobjects.
During kobject_del(), kobject->sd is removed. If the kobject is not part of a kset with default_groups, then subsequent kobject_del() calls seem safe even with kobject->sd == NULL.
Originally, the AMD MCA thresholding structures did not use default_groups. And so the above behavior was not apparent.
However, a recent change implemented default_groups for the thresholding structures. Therefore, kobject_del() will go down the sysfs_remove_groups() code path. In this case, the first kobject_del() may succeed and remove kobject->sd. But subsequent kobject_del() calls will give a WARNing in kernfs_remove_by_name_ns() since kobject->sd == NULL.
Use kobject_put() on the shared bank's kobject when "removing" blocks. This decrements the bank's refcount while keeping kobjects enabled until the bank is no longer shared. At that point, kobject_put() will be called on the blocks, which drives their refcount to 0, deletes them, and also decrements the bank's refcount. Finally, kobject_put() will be called on the bank, driving its refcount to 0 and deleting it.
The same example above:
  CPU1 shutdown
    bank is shared; bank refcount = 5; threshold_remove_bank()
      block 0 put parent bank; bank refcount = 4; __threshold_remove_blocks()
      block 1 put parent bank; bank refcount = 3; __threshold_remove_blocks()
  CPU0 shutdown
    bank is no longer shared; bank refcount = 3; threshold_remove_bank()
      block 0 put block; bank refcount = 2; deallocate_threshold_blocks()
      block 1 put block; bank refcount = 1; deallocate_threshold_blocks()
    put bank; bank refcount = 0; threshold_remove_bank()
Fixes: 7f99cb5e6039 ("x86/CPU/AMD: Use default_groups in kobj_type")
Reported-by: Mikulas Patocka mpatocka@redhat.com
Signed-off-by: Yazen Ghannam yazen.ghannam@amd.com
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de
Tested-by: Mikulas Patocka mpatocka@redhat.com
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/alpine.LRH.2.02.2205301145540.25840@file01.intrane...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kernel/cpu/mce/amd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1259,10 +1259,10 @@ static void __threshold_remove_blocks(st
 	struct threshold_block *pos = NULL;
 	struct threshold_block *tmp = NULL;

-	kobject_del(b->kobj);
+	kobject_put(b->kobj);

 	list_for_each_entry_safe(pos, tmp, &b->blocks->miscj, miscj)
-		kobject_del(&pos->kobj);
+		kobject_put(b->kobj);
 }

 static void threshold_remove_bank(struct threshold_bank *bank)
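The refcount walk-through in the commit message can be replayed with a toy counter. This sketch models only the numbers listed in that walk-through (two blocks, two CPUs); it is not a model of kobject internals, and struct toy_bank and its helpers are hypothetical.

```c
#include <assert.h>

/* Toy model of the shared bank's reference count. */
struct toy_bank {
	int refcount;
	int nblocks;
	int cpus;
};

/* First CPU: creating the bank gives 1 ref, each block adds a parent ref. */
void bank_create(struct toy_bank *b, int nblocks)
{
	b->nblocks = nblocks;
	b->cpus = 1;
	b->refcount = 1 + nblocks;
}

/* Further CPU: each block is added again and takes another parent ref. */
void bank_share(struct toy_bank *b)
{
	b->cpus++;
	b->refcount += b->nblocks;
}

/* CPU teardown following the fixed flow from the commit message. */
void bank_remove(struct toy_bank *b)
{
	/* One reference is dropped per block in both cases. */
	b->refcount -= b->nblocks;

	/* The last user also puts the bank itself. */
	if (b->cpus == 1)
		b->refcount -= 1;

	b->cpus--;
}
```

Replaying the example gives 3 after CPU0 init, 5 after CPU1 init, 3 after CPU1 shutdown, and 0 after CPU0 shutdown, matching the two listings in the commit message.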
From: Kim Phillips kim.phillips@amd.com
commit fd470a8beed88440b160d690344fbae05a0b9b1b upstream.
Unlike Intel's Enhanced IBRS feature, AMD's Automatic IBRS does not provide protection to processes running at CPL3/user mode, see section "Extended Feature Enable Register (EFER)" in the APM v2 at https://bugzilla.kernel.org/attachment.cgi?id=304652
Explicitly enable STIBP to protect against cross-thread CPL3 branch target injections on systems with Automatic IBRS enabled.
Also update the relevant documentation.
Fixes: e7862eda309e ("x86/cpu: Support AMD Automatic IBRS") Reported-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Kim Phillips kim.phillips@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230720194727.67022-1-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/admin-guide/hw-vuln/spectre.rst | 11 +++++++---- arch/x86/kernel/cpu/bugs.c | 15 +++++++++------ 2 files changed, 16 insertions(+), 10 deletions(-)
--- a/Documentation/admin-guide/hw-vuln/spectre.rst +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -484,11 +484,14 @@ Spectre variant 2
Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at boot, by setting the IBRS bit, and they're automatically protected against - Spectre v2 variant attacks, including cross-thread branch target injections - on SMT systems (STIBP). In other words, eIBRS enables STIBP too. + Spectre v2 variant attacks.
- Legacy IBRS systems clear the IBRS bit on exit to userspace and - therefore explicitly enable STIBP for that + On Intel's enhanced IBRS systems, this includes cross-thread branch target + injections on SMT systems (STIBP). In other words, Intel eIBRS enables + STIBP, too. + + AMD Automatic IBRS does not protect userspace, and Legacy IBRS systems clear + the IBRS bit on exit to userspace, therefore both explicitly enable STIBP.
The retpoline mitigation is turned on by default on vulnerable CPUs. It can be forced on or off by the administrator --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -1199,19 +1199,21 @@ spectre_v2_user_select_mitigation(void) }
/* - * If no STIBP, enhanced IBRS is enabled, or SMT impossible, STIBP + * If no STIBP, Intel enhanced IBRS is enabled, or SMT impossible, STIBP * is not required. * - * Enhanced IBRS also protects against cross-thread branch target + * Intel's Enhanced IBRS also protects against cross-thread branch target * injection in user-mode as the IBRS bit remains always set which * implicitly enables cross-thread protections. However, in legacy IBRS * mode, the IBRS bit is set only on kernel entry and cleared on return - * to userspace. This disables the implicit cross-thread protection, - * so allow for STIBP to be selected in that case. + * to userspace. AMD Automatic IBRS also does not protect userspace. + * These modes therefore disable the implicit cross-thread protection, + * so allow for STIBP to be selected in those cases. */ if (!boot_cpu_has(X86_FEATURE_STIBP) || !smt_possible || - spectre_v2_in_eibrs_mode(spectre_v2_enabled)) + (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && + !boot_cpu_has(X86_FEATURE_AUTOIBRS))) return;
/* @@ -2343,7 +2345,8 @@ static ssize_t mmio_stale_data_show_stat
static char *stibp_state(void) { - if (spectre_v2_in_eibrs_mode(spectre_v2_enabled)) + if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) && + !boot_cpu_has(X86_FEATURE_AUTOIBRS)) return "";
switch (spectre_v2_user_stibp) {
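The changed condition in spectre_v2_user_select_mitigation() is a plain boolean expression, so its truth table is easy to check outside the kernel. In this sketch the four inputs are ordinary parameters standing in for the kernel's feature checks, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when STIBP selection can be skipped: no STIBP support,
 * SMT impossible, or an eIBRS mode that is not AMD Automatic IBRS.
 */
bool stibp_not_required(bool has_stibp, bool smt_possible,
			bool eibrs_mode, bool has_autoibrs)
{
	return !has_stibp || !smt_possible ||
	       (eibrs_mode && !has_autoibrs);
}
```

The only case whose result changes with the patch is an eIBRS mode with Automatic IBRS present: STIBP is now considered there instead of being skipped.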
From: Christian Brauner brauner@kernel.org
commit 20ea1e7d13c1b544fe67c4a8dc3943bb1ab33e6f upstream.
The pidfd_getfd() system call allows a caller with ptrace_may_access() abilities on another process to steal a file descriptor from this process. This system call is used by debuggers, container runtimes, system call supervisors, networking proxies etc. So while it is a special interest system call it is used in common tools.
That ability ends up breaking our long-time optimization in fdget_pos(), which "knew" that if we had exclusive access to the file descriptor nobody else could access it, and we didn't need the lock for the file position.
That check for file_count(file) was always fairly subtle - it depended on __fdget() not incrementing the file count for single-threaded processes and thus included that as part of the rule - but it did mean that we didn't need to take the lock in all those traditional unix process contexts.
So it's sad to see this go, and I'd love to have some way to re-instate the optimization. At the same time, the lock obviously isn't ever contended in the case we optimized, so all we were optimizing away is the atomics and the cacheline dirtying. Let's see if anybody even notices that the optimization is gone.
Link: https://lore.kernel.org/linux-fsdevel/20230724-vfs-fdget_pos-v1-1-a4abfd7103...
Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
Cc: stable@kernel.org
Signed-off-by: Christian Brauner brauner@kernel.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/file.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/fs/file.c
+++ b/fs/file.c
@@ -1042,10 +1042,8 @@ unsigned long __fdget_pos(unsigned int f
 	struct file *file = (struct file *)(v & ~3);

 	if (file && (file->f_mode & FMODE_ATOMIC_POS)) {
-		if (file_count(file) > 1) {
-			v |= FDPUT_POS_UNLOCK;
-			mutex_lock(&file->f_pos_lock);
-		}
+		v |= FDPUT_POS_UNLOCK;
+		mutex_lock(&file->f_pos_lock);
 	}
 	return v;
 }
From: Trond Myklebust trond.myklebust@hammerspace.com
commit f75546f58a70da5cfdcec5a45ffc377885ccbee8 upstream.
If the client is calling TEST_STATEID, then it is because some event occurred that requires it to check all the stateids for validity and call FREE_STATEID on the ones that have been revoked. In this case, either the stateid exists in the list of stateids associated with that nfs4_client, in which case it should be tested, or it does not. There are no additional conditions to be considered.
Reported-by: "Frank Ch. Eigler" fche@redhat.com
Fixes: 7df302f75ee2 ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Reviewed-by: Jeff Layton jlayton@kernel.org
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/nfsd/nfs4state.c | 2 --
 1 file changed, 2 deletions(-)

--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6341,8 +6341,6 @@ static __be32 nfsd4_validate_stateid(str
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) ||
 		CLOSE_STATEID(stateid))
 		return status;
-	if (!same_clid(&stateid->si_opaque.so_clid, &cl->cl_clientid))
-		return status;
 	spin_lock(&cl->cl_lock);
 	s = find_stateid_locked(cl, stateid);
 	if (!s)
From: Namjae Jeon linkinjeon@kernel.org
commit 2b57a4322b1b14348940744fdc02f9a86cbbdbeb upstream.
Since commit 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name"), ksmbd cannot look up paths that cross mount points. If the last component is a mount point during path lookup, check whether it is crossed and follow it down. Also allow path lookup to cross a mount point when the crossmnt parameter is set to 'yes' in smb.conf.
Cc: stable@vger.kernel.org Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name") Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/ksmbd_netlink.h | 3 +- fs/smb/server/smb2pdu.c | 27 +++++++++++-------- fs/smb/server/vfs.c | 58 +++++++++++++++++++++++------------------- fs/smb/server/vfs.h | 4 +- 4 files changed, 53 insertions(+), 39 deletions(-)
--- a/fs/smb/server/ksmbd_netlink.h +++ b/fs/smb/server/ksmbd_netlink.h @@ -352,7 +352,8 @@ enum KSMBD_TREE_CONN_STATUS { #define KSMBD_SHARE_FLAG_STREAMS BIT(11) #define KSMBD_SHARE_FLAG_FOLLOW_SYMLINKS BIT(12) #define KSMBD_SHARE_FLAG_ACL_XATTR BIT(13) -#define KSMBD_SHARE_FLAG_UPDATE BIT(14) +#define KSMBD_SHARE_FLAG_UPDATE BIT(14) +#define KSMBD_SHARE_FLAG_CROSSMNT BIT(15)
/* * Tree connect request flags. --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -2467,8 +2467,9 @@ static void smb2_update_xattrs(struct ks } }
-static int smb2_creat(struct ksmbd_work *work, struct path *path, char *name, - int open_flags, umode_t posix_mode, bool is_dir) +static int smb2_creat(struct ksmbd_work *work, struct path *parent_path, + struct path *path, char *name, int open_flags, + umode_t posix_mode, bool is_dir) { struct ksmbd_tree_connect *tcon = work->tcon; struct ksmbd_share_config *share = tcon->share_conf; @@ -2495,7 +2496,7 @@ static int smb2_creat(struct ksmbd_work return rc; }
- rc = ksmbd_vfs_kern_path_locked(work, name, 0, path, 0); + rc = ksmbd_vfs_kern_path_locked(work, name, 0, parent_path, path, 0); if (rc) { pr_err("cannot get linux path (%s), err = %d\n", name, rc); @@ -2565,7 +2566,7 @@ int smb2_open(struct ksmbd_work *work) struct ksmbd_tree_connect *tcon = work->tcon; struct smb2_create_req *req; struct smb2_create_rsp *rsp; - struct path path; + struct path path, parent_path; struct ksmbd_share_config *share = tcon->share_conf; struct ksmbd_file *fp = NULL; struct file *filp = NULL; @@ -2786,7 +2787,8 @@ int smb2_open(struct ksmbd_work *work) goto err_out1; }
- rc = ksmbd_vfs_kern_path_locked(work, name, LOOKUP_NO_SYMLINKS, &path, 1); + rc = ksmbd_vfs_kern_path_locked(work, name, LOOKUP_NO_SYMLINKS, + &parent_path, &path, 1); if (!rc) { file_present = true;
@@ -2908,7 +2910,8 @@ int smb2_open(struct ksmbd_work *work)
/*create file if not present */ if (!file_present) { - rc = smb2_creat(work, &path, name, open_flags, posix_mode, + rc = smb2_creat(work, &parent_path, &path, name, open_flags, + posix_mode, req->CreateOptions & FILE_DIRECTORY_FILE_LE); if (rc) { if (rc == -ENOENT) { @@ -3323,8 +3326,9 @@ int smb2_open(struct ksmbd_work *work)
err_out: if (file_present || created) { - inode_unlock(d_inode(path.dentry->d_parent)); - dput(path.dentry); + inode_unlock(d_inode(parent_path.dentry)); + path_put(&path); + path_put(&parent_path); } ksmbd_revert_fsids(work); err_out1: @@ -5547,7 +5551,7 @@ static int smb2_create_link(struct ksmbd struct nls_table *local_nls) { char *link_name = NULL, *target_name = NULL, *pathname = NULL; - struct path path; + struct path path, parent_path; bool file_present = false; int rc;
@@ -5577,7 +5581,7 @@ static int smb2_create_link(struct ksmbd
ksmbd_debug(SMB, "target name is %s\n", target_name); rc = ksmbd_vfs_kern_path_locked(work, link_name, LOOKUP_NO_SYMLINKS, - &path, 0); + &parent_path, &path, 0); if (rc) { if (rc != -ENOENT) goto out; @@ -5607,8 +5611,9 @@ static int smb2_create_link(struct ksmbd rc = -EINVAL; out: if (file_present) { - inode_unlock(d_inode(path.dentry->d_parent)); + inode_unlock(d_inode(parent_path.dentry)); path_put(&path); + path_put(&parent_path); } if (!IS_ERR(link_name)) kfree(link_name); --- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -63,13 +63,13 @@ int ksmbd_vfs_lock_parent(struct dentry
static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf, char *pathname, unsigned int flags, + struct path *parent_path, struct path *path) { struct qstr last; struct filename *filename; struct path *root_share_path = &share_conf->vfs_path; int err, type; - struct path parent_path; struct dentry *d;
if (pathname[0] == '\0') { @@ -84,7 +84,7 @@ static int ksmbd_vfs_path_lookup_locked( return PTR_ERR(filename);
err = vfs_path_parent_lookup(filename, flags, - &parent_path, &last, &type, + parent_path, &last, &type, root_share_path); if (err) { putname(filename); @@ -92,13 +92,13 @@ static int ksmbd_vfs_path_lookup_locked( }
if (unlikely(type != LAST_NORM)) { - path_put(&parent_path); + path_put(parent_path); putname(filename); return -ENOENT; }
- inode_lock_nested(parent_path.dentry->d_inode, I_MUTEX_PARENT); - d = lookup_one_qstr_excl(&last, parent_path.dentry, 0); + inode_lock_nested(parent_path->dentry->d_inode, I_MUTEX_PARENT); + d = lookup_one_qstr_excl(&last, parent_path->dentry, 0); if (IS_ERR(d)) goto err_out;
@@ -108,15 +108,22 @@ static int ksmbd_vfs_path_lookup_locked( }
path->dentry = d; - path->mnt = share_conf->vfs_path.mnt; - path_put(&parent_path); - putname(filename); + path->mnt = mntget(parent_path->mnt); + + if (test_share_config_flag(share_conf, KSMBD_SHARE_FLAG_CROSSMNT)) { + err = follow_down(path, 0); + if (err < 0) { + path_put(path); + goto err_out; + } + }
+ putname(filename); return 0;
err_out: - inode_unlock(parent_path.dentry->d_inode); - path_put(&parent_path); + inode_unlock(d_inode(parent_path->dentry)); + path_put(parent_path); putname(filename); return -ENOENT; } @@ -1198,14 +1205,14 @@ static int ksmbd_vfs_lookup_in_dir(const * Return: 0 on success, otherwise error */ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name, - unsigned int flags, struct path *path, - bool caseless) + unsigned int flags, struct path *parent_path, + struct path *path, bool caseless) { struct ksmbd_share_config *share_conf = work->tcon->share_conf; int err; - struct path parent_path;
- err = ksmbd_vfs_path_lookup_locked(share_conf, name, flags, path); + err = ksmbd_vfs_path_lookup_locked(share_conf, name, flags, parent_path, + path); if (!err) return err;
@@ -1220,10 +1227,10 @@ int ksmbd_vfs_kern_path_locked(struct ks path_len = strlen(filepath); remain_len = path_len;
- parent_path = share_conf->vfs_path; - path_get(&parent_path); + *parent_path = share_conf->vfs_path; + path_get(parent_path);
- while (d_can_lookup(parent_path.dentry)) { + while (d_can_lookup(parent_path->dentry)) { char *filename = filepath + path_len - remain_len; char *next = strchrnul(filename, '/'); size_t filename_len = next - filename; @@ -1232,7 +1239,7 @@ int ksmbd_vfs_kern_path_locked(struct ks if (filename_len == 0) break;
- err = ksmbd_vfs_lookup_in_dir(&parent_path, filename, + err = ksmbd_vfs_lookup_in_dir(parent_path, filename, filename_len, work->conn->um); if (err) @@ -1249,8 +1256,8 @@ int ksmbd_vfs_kern_path_locked(struct ks goto out2; else if (is_last) goto out1; - path_put(&parent_path); - parent_path = *path; + path_put(parent_path); + *parent_path = *path;
next[0] = '/'; remain_len -= filename_len + 1; @@ -1258,16 +1265,17 @@ int ksmbd_vfs_kern_path_locked(struct ks
err = -EINVAL; out2: - path_put(&parent_path); + path_put(parent_path); out1: kfree(filepath); }
if (!err) { - err = ksmbd_vfs_lock_parent(parent_path.dentry, path->dentry); - if (err) - dput(path->dentry); - path_put(&parent_path); + err = ksmbd_vfs_lock_parent(parent_path->dentry, path->dentry); + if (err) { + path_put(path); + path_put(parent_path); + } } return err; } --- a/fs/smb/server/vfs.h +++ b/fs/smb/server/vfs.h @@ -115,8 +115,8 @@ int ksmbd_vfs_xattr_stream_name(char *st int ksmbd_vfs_remove_xattr(struct mnt_idmap *idmap, const struct path *path, char *attr_name); int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name, - unsigned int flags, struct path *path, - bool caseless); + unsigned int flags, struct path *parent_path, + struct path *path, bool caseless); struct dentry *ksmbd_vfs_kern_path_create(struct ksmbd_work *work, const char *name, unsigned int flags,
From: Guanghui Feng guanghuifeng@linux.alibaba.com
commit 003e6b56d780095a9adc23efc9cb4b4b4717169b upstream.
According to the ARM IORT specifications DEN 0049 issue E, the "Number of IDs" field in the ID mapping format reports the number of IDs in the mapping range minus one.
In iort_node_get_rmr_info(), we erroneously skip ID mappings whose "Number of IDs" field is equal to 0, so valid mapping nodes that map a single ID are skipped.
Fix iort_node_get_rmr_info() by removing the bogus id_count check.
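The "minus one" encoding described above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: struct id_mapping, mapping_num_ids(), mapping_last_id() and check_single_id_mapping() are hypothetical names, and the struct keeps only the two fields needed to show why an id_count of 0 still describes a valid one-ID mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an IORT ID mapping entry. */
struct id_mapping {
	uint32_t input_base;	/* first ID in the mapped range */
	uint32_t id_count;	/* "Number of IDs": IDs in the range minus one */
};

/* Number of IDs the mapping actually covers. */
static uint64_t mapping_num_ids(const struct id_mapping *map)
{
	return (uint64_t)map->id_count + 1;
}

/* Last ID covered by the mapping (inclusive). */
static uint32_t mapping_last_id(const struct id_mapping *map)
{
	return map->input_base + map->id_count;
}

/* A mapping with id_count == 0 is valid and covers exactly one ID. */
static void check_single_id_mapping(void)
{
	struct id_mapping one = { .input_base = 0x10, .id_count = 0 };

	assert(mapping_num_ids(&one) == 1);
	assert(mapping_last_id(&one) == 0x10);
}
```

Under this encoding, a test such as "if (!map->id_count) continue;" rejects exactly the single-ID case, which is the check the patch removes.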
Fixes: 491cf4a6735a ("ACPI/IORT: Add support to retrieve IORT RMR reserved regions") Signed-off-by: Guanghui Feng guanghuifeng@linux.alibaba.com Cc: stable@vger.kernel.org # 6.0.x Acked-by: Lorenzo Pieralisi lpieralisi@kernel.org Tested-by: Hanjun Guo guohanjun@huawei.com Link: https://lore.kernel.org/r/1689593625-45213-1-git-send-email-guanghuifeng@lin... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/arm64/iort.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1006,9 +1006,6 @@ static void iort_node_get_rmr_info(struc
 	for (i = 0; i < node->mapping_count; i++, map++) {
 		struct acpi_iort_node *parent;
 
-		if (!map->id_count)
-			continue;
-
 		parent = ACPI_ADD_PTR(struct acpi_iort_node, iort_table,
 				      map->output_reference);
 		if (parent != iommu)
From: Alexander Steffen Alexander.Steffen@infineon.com
commit 513253f8c293c0c8bd46d09d337fc892bf8f9f48 upstream.
recv_data either returns the number of received bytes, or a negative value representing an error code. Adding the return value directly to the total number of received bytes therefore looks a little weird, since it might add a negative error code to a sum of bytes.
The following check for size < expected usually makes the function return ETIME in that case, so it does not cause too many problems in practice. But to make the code look cleaner and because the caller might still be interested in the original error code, explicitly check for the presence of an error code and pass that through.
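The difference between the two patterns can be sketched in userspace C. This is only an illustration under stated assumptions: fake_recv() is a hypothetical stand-in for recv_data() that returns whatever result it is given, and recv_unchecked() / recv_checked() are made-up names for the old and new behaviour.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for recv_data(): returns the number of bytes
 * received, or a negative errno value on failure.
 */
static int fake_recv(int result)
{
	return result;
}

/* Old pattern: the return value is added to the byte count unchecked. */
static int recv_unchecked(int header, int expected, int result)
{
	int size = header;

	size += fake_recv(result);	/* may add a negative error code */
	if (size < expected)
		size = -ETIME;		/* the original error code is lost */
	return size;
}

/* New pattern: an error code is detected and passed through unchanged. */
static int recv_checked(int header, int expected, int result)
{
	int size = header;
	int rc = fake_recv(result);

	if (rc < 0)
		return rc;
	size += rc;
	if (size < expected)
		size = -ETIME;
	return size;
}

static void check_error_passthrough(void)
{
	/* A failing receive is reported as -ETIME by the old pattern... */
	assert(recv_unchecked(10, 30, -EIO) == -ETIME);
	/* ...and as the original -EIO by the new one. */
	assert(recv_checked(10, 30, -EIO) == -EIO);
}
```

Both variants behave the same on success and on a genuine short read; only the failing-receive case differs.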
Cc: stable@vger.kernel.org Fixes: cb5354253af2 ("[PATCH] tpm: spacing cleanups 2") Signed-off-by: Alexander Steffen Alexander.Steffen@infineon.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/tpm/tpm_tis_core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -366,8 +366,13 @@ static int tpm_tis_recv(struct tpm_chip
 		goto out;
 	}
 
-	size += recv_data(chip, &buf[TPM_HEADER_SIZE],
-			  expected - TPM_HEADER_SIZE);
+	rc = recv_data(chip, &buf[TPM_HEADER_SIZE],
+		       expected - TPM_HEADER_SIZE);
+	if (rc < 0) {
+		size = rc;
+		goto out;
+	}
+	size += rc;
 	if (size < expected) {
 		dev_err(&chip->dev, "Unable to read remainder of result\n");
 		size = -ETIME;
From: Jonas Gorski jonas.gorski@gmail.com
[ Upstream commit 55ad24857341c36616ecc1d9580af5626c226cf1 ]
The irq to block mapping is fixed, and interrupts from the first block will always be routed to the first parent IRQ. But the parent interrupts themselves can be routed to any available CPU.
This is used by the bootloader to map the first parent interrupt to the boot CPU, regardless of whether the boot CPU is the first one or the second one.
When booting from the second CPU, the assumption that the first block's IRQ is mapped to the first CPU breaks, and the system hangs because interrupts do not get routed correctly.
Fix this by passing the appropriate bcm6345_l1_cpu to the interrupt handler instead of the chip itself, so the handler always has the right block.
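The failure mode and the fix can be sketched with a small userspace model. Everything here is hypothetical and heavily simplified: struct l1_chip and struct l1_cpu only mimic the shape of the driver's structures, and the "running CPU" is passed in as a plain integer rather than read from the system.

```c
#include <assert.h>

struct l1_chip;

/* Hypothetical, simplified stand-in for struct bcm6345_l1_cpu. */
struct l1_cpu {
	struct l1_chip *intc;	/* back-pointer, as added by the fix */
	int block;		/* register block this entry describes */
};

/* Hypothetical, simplified stand-in for struct bcm6345_l1_chip. */
struct l1_chip {
	struct l1_cpu *cpus[2];
};

/* Old scheme: handler data is the chip; the block is picked by running CPU. */
static int block_from_running_cpu(const struct l1_chip *intc, int running_cpu)
{
	return intc->cpus[running_cpu]->block;
}

/* New scheme: handler data is the per-block entry itself. */
static int block_from_handler_data(const struct l1_cpu *cpu)
{
	return cpu->block;
}

/*
 * Scenario: the system booted from the second CPU, so block 0's parent
 * interrupt is routed to CPU 1 and its handler runs there.
 */
static void check_block_lookup(void)
{
	struct l1_chip chip;
	struct l1_cpu b0 = { &chip, 0 };
	struct l1_cpu b1 = { &chip, 1 };

	chip.cpus[0] = &b0;
	chip.cpus[1] = &b1;

	/* The old lookup reads block 1 although block 0 raised the IRQ. */
	assert(block_from_running_cpu(&chip, 1) == 1);
	/* The new lookup uses the entry registered with the parent IRQ. */
	assert(block_from_handler_data(&b0) == 0);
	assert(b0.intc == &chip);
}
```

The back-pointer lets the handler still reach chip-wide state while the choice of block no longer depends on which CPU services the parent interrupt.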
Fixes: c7c42ec2baa1 ("irqchips/bmips: Add bcm6345-l1 interrupt controller") Signed-off-by: Jonas Gorski jonas.gorski@gmail.com Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230629072620.62527-1-jonas.gorski@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-bcm6345-l1.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c index fa113cb2529a4..6341c0167c4ab 100644 --- a/drivers/irqchip/irq-bcm6345-l1.c +++ b/drivers/irqchip/irq-bcm6345-l1.c @@ -82,6 +82,7 @@ struct bcm6345_l1_chip { };
struct bcm6345_l1_cpu { + struct bcm6345_l1_chip *intc; void __iomem *map_base; unsigned int parent_irq; u32 enable_cache[]; @@ -115,17 +116,11 @@ static inline unsigned int cpu_for_irq(struct bcm6345_l1_chip *intc,
static void bcm6345_l1_irq_handle(struct irq_desc *desc) { - struct bcm6345_l1_chip *intc = irq_desc_get_handler_data(desc); - struct bcm6345_l1_cpu *cpu; + struct bcm6345_l1_cpu *cpu = irq_desc_get_handler_data(desc); + struct bcm6345_l1_chip *intc = cpu->intc; struct irq_chip *chip = irq_desc_get_chip(desc); unsigned int idx;
-#ifdef CONFIG_SMP - cpu = intc->cpus[cpu_logical_map(smp_processor_id())]; -#else - cpu = intc->cpus[0]; -#endif - chained_irq_enter(chip, desc);
for (idx = 0; idx < intc->n_words; idx++) { @@ -253,6 +248,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn, if (!cpu) return -ENOMEM;
+ cpu->intc = intc; cpu->map_base = ioremap(res.start, sz); if (!cpu->map_base) return -ENOMEM; @@ -271,7 +267,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn, return -EINVAL; } irq_set_chained_handler_and_data(cpu->parent_irq, - bcm6345_l1_irq_handle, intc); + bcm6345_l1_irq_handle, cpu);
return 0; }
From: Marc Zyngier maz@kernel.org
[ Upstream commit 926846a703cbf5d0635cc06e67d34b228746554b ]
We normally rely on the irq_to_cpuid_[un]lock() primitives to make sure nothing will change col->idx while performing a LPI invalidation.
However, these primitives do not cover VPE doorbells, and we have some open-coded locking for that. Unfortunately, this locking is pretty bogus.
Instead, extend the above primitives to cover VPE doorbells and convert the whole thing to it.
Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") Reported-by: Kunkun Jiang jiangkunkun@huawei.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: Zenghui Yu yuzenghui@huawei.com Cc: wanghaibin.wang@huawei.com Tested-by: Kunkun Jiang jiangkunkun@huawei.com Reviewed-by: Zenghui Yu yuzenghui@huawei.com Link: https://lore.kernel.org/r/20230617073242.3199746-1-maz@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic-v3-its.c | 75 ++++++++++++++++++++------------ 1 file changed, 46 insertions(+), 29 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 0ec2b1e1df75b..c5cb2830e8537 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -273,13 +273,23 @@ static void vpe_to_cpuid_unlock(struct its_vpe *vpe, unsigned long flags) raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags); }
+static struct irq_chip its_vpe_irq_chip; + static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags) { - struct its_vlpi_map *map = get_vlpi_map(d); + struct its_vpe *vpe = NULL; int cpu;
- if (map) { - cpu = vpe_to_cpuid_lock(map->vpe, flags); + if (d->chip == &its_vpe_irq_chip) { + vpe = irq_data_get_irq_chip_data(d); + } else { + struct its_vlpi_map *map = get_vlpi_map(d); + if (map) + vpe = map->vpe; + } + + if (vpe) { + cpu = vpe_to_cpuid_lock(vpe, flags); } else { /* Physical LPIs are already locked via the irq_desc lock */ struct its_device *its_dev = irq_data_get_irq_chip_data(d); @@ -293,10 +303,18 @@ static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags)
static void irq_to_cpuid_unlock(struct irq_data *d, unsigned long flags) { - struct its_vlpi_map *map = get_vlpi_map(d); + struct its_vpe *vpe = NULL; + + if (d->chip == &its_vpe_irq_chip) { + vpe = irq_data_get_irq_chip_data(d); + } else { + struct its_vlpi_map *map = get_vlpi_map(d); + if (map) + vpe = map->vpe; + }
- if (map) - vpe_to_cpuid_unlock(map->vpe, flags); + if (vpe) + vpe_to_cpuid_unlock(vpe, flags); }
static struct its_collection *valid_col(struct its_collection *col) @@ -1433,14 +1451,29 @@ static void wait_for_syncr(void __iomem *rdbase) cpu_relax(); }
-static void direct_lpi_inv(struct irq_data *d) +static void __direct_lpi_inv(struct irq_data *d, u64 val) { - struct its_vlpi_map *map = get_vlpi_map(d); void __iomem *rdbase; unsigned long flags; - u64 val; int cpu;
+ /* Target the redistributor this LPI is currently routed to */ + cpu = irq_to_cpuid_lock(d, &flags); + raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock); + + rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base; + gic_write_lpir(val, rdbase + GICR_INVLPIR); + wait_for_syncr(rdbase); + + raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock); + irq_to_cpuid_unlock(d, flags); +} + +static void direct_lpi_inv(struct irq_data *d) +{ + struct its_vlpi_map *map = get_vlpi_map(d); + u64 val; + if (map) { struct its_device *its_dev = irq_data_get_irq_chip_data(d);
@@ -1453,15 +1486,7 @@ static void direct_lpi_inv(struct irq_data *d) val = d->hwirq; }
- /* Target the redistributor this LPI is currently routed to */ - cpu = irq_to_cpuid_lock(d, &flags); - raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock); - rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base; - gic_write_lpir(val, rdbase + GICR_INVLPIR); - - wait_for_syncr(rdbase); - raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock); - irq_to_cpuid_unlock(d, flags); + __direct_lpi_inv(d, val); }
static void lpi_update_config(struct irq_data *d, u8 clr, u8 set) @@ -3952,18 +3977,10 @@ static void its_vpe_send_inv(struct irq_data *d) { struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
- if (gic_rdists->has_direct_lpi) { - void __iomem *rdbase; - - /* Target the redistributor this VPE is currently known on */ - raw_spin_lock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock); - rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base; - gic_write_lpir(d->parent_data->hwirq, rdbase + GICR_INVLPIR); - wait_for_syncr(rdbase); - raw_spin_unlock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock); - } else { + if (gic_rdists->has_direct_lpi) + __direct_lpi_inv(d, d->parent_data->hwirq); + else its_vpe_send_cmd(vpe, its_send_inv); - } }
static void its_vpe_mask_irq(struct irq_data *d)
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit f7853c34241807bb97673a5e97719123be39a09e ]
Henry reported that rt_mutex_adjust_prio_check() has an ordering problem and puts the lie to the comment in [7]. Sharing the sort key between lock->waiters and owner->pi_waiters *does* create problems, since unlike what the comment claims, holding [L] is insufficient.
Notably, consider:
A / \ M1 M2 | | B C
That is, task A owns both M1 and M2, B and C block on them. In this case a concurrent chain walk (B & C) will modify their resp. sort keys in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L] is meaningless, they're different Ls.
This then gives rise to a race condition between [7] and [11], where the requeue of pi_waiters will observe an inconsistent tree order.
B C
(holds M1->wait_lock, (holds M2->wait_lock, holds B->pi_lock) holds A->pi_lock)
[7] waiter_update_prio(); ... [8] raw_spin_unlock(B->pi_lock); ... [10] raw_spin_lock(A->pi_lock);
[11] rt_mutex_enqueue_pi(); // observes inconsistent A->pi_waiters // tree order
Fixing this means either extending the range of the owner lock from [10-13] to [6-13], with the immediate problem that this means [6-8] hold both blocked and owner locks, or duplicating the sort key.
Since the locking in chain walk is horrible enough without having to consider pi_lock nesting rules, duplicate the sort key instead.
By giving each tree its own sort key, the above race becomes harmless: if C sees B at the old location, then B will correct things (if they need correcting) when it walks up the chain and reaches A.
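The idea of duplicating the sort key can be sketched in userspace C. This is a simplified model, not the kernel implementation: struct waiter_node and struct waiter only mirror the shape of the new rt_waiter_node layout, node_less() always compares deadlines on a priority tie (the kernel helper is more selective), and no locking or rbtree is involved.

```c
#include <assert.h>
#include <stdint.h>

/* One copy of the sort key per tree (simplified). */
struct waiter_node {
	int prio;
	uint64_t deadline;
};

struct waiter {
	struct waiter_node tree;	/* key used in lock->waiters */
	struct waiter_node pi_tree;	/* key used in owner->pi_waiters */
};

/* Lower prio value sorts first; deadline breaks ties (simplification). */
static int node_less(const struct waiter_node *l, const struct waiter_node *r)
{
	if (l->prio != r->prio)
		return l->prio < r->prio;
	return l->deadline < r->deadline;
}

/* Step [7]: update only the lock-side copy of the key. */
static void update_tree_key(struct waiter *w, int prio, uint64_t deadline)
{
	w->tree.prio = prio;
	w->tree.deadline = deadline;
}

/* Step [11]: copy the key to the owner-side tree. */
static void clone_pi_key(struct waiter *w)
{
	w->pi_tree = w->tree;
}

static void check_keys_are_independent(void)
{
	struct waiter b = { .tree = { 10, 0 }, .pi_tree = { 10, 0 } };
	struct waiter c = { .tree = { 7, 0 }, .pi_tree = { 7, 0 } };

	update_tree_key(&b, 5, 0);

	/* The owner-side order is unchanged until B clones its key. */
	assert(b.pi_tree.prio == 10);
	assert(node_less(&c.pi_tree, &b.pi_tree));

	clone_pi_key(&b);
	assert(node_less(&b.pi_tree, &c.pi_tree));
}
```

In the patch, the lock-side copy is written under the wait_lock and the owner-side copy is written with the owner's pi_lock also held, so a concurrent walker never sees a key change inside a tree it is traversing.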
Fixes: fb00aca47440 ("rtmutex: Turn the plist into an rb-tree") Reported-by: Henry Wu triangletrap12@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Thomas Gleixner tglx@linutronix.de Tested-by: Henry Wu triangletrap12@gmail.com Link: https://lkml.kernel.org/r/20230707161052.GF2883469%40hirez.programming.kicks... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/rtmutex.c | 170 +++++++++++++++++++++----------- kernel/locking/rtmutex_api.c | 2 +- kernel/locking/rtmutex_common.h | 47 ++++++--- kernel/locking/ww_mutex.h | 12 +-- 4 files changed, 155 insertions(+), 76 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index 728f434de2bbf..21db0df0eb000 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -333,21 +333,43 @@ static __always_inline int __waiter_prio(struct task_struct *task) return prio; }
+/* + * Update the waiter->tree copy of the sort keys. + */ static __always_inline void waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task) { - waiter->prio = __waiter_prio(task); - waiter->deadline = task->dl.deadline; + lockdep_assert_held(&waiter->lock->wait_lock); + lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry)); + + waiter->tree.prio = __waiter_prio(task); + waiter->tree.deadline = task->dl.deadline; +} + +/* + * Update the waiter->pi_tree copy of the sort keys (from the tree copy). + */ +static __always_inline void +waiter_clone_prio(struct rt_mutex_waiter *waiter, struct task_struct *task) +{ + lockdep_assert_held(&waiter->lock->wait_lock); + lockdep_assert_held(&task->pi_lock); + lockdep_assert(RB_EMPTY_NODE(&waiter->pi_tree.entry)); + + waiter->pi_tree.prio = waiter->tree.prio; + waiter->pi_tree.deadline = waiter->tree.deadline; }
/* - * Only use with rt_mutex_waiter_{less,equal}() + * Only use with rt_waiter_node_{less,equal}() */ +#define task_to_waiter_node(p) \ + &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline } #define task_to_waiter(p) \ - &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline } + &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
-static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left, - struct rt_mutex_waiter *right) +static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left, + struct rt_waiter_node *right) { if (left->prio < right->prio) return 1; @@ -364,8 +386,8 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left, return 0; }
-static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left, - struct rt_mutex_waiter *right) +static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left, + struct rt_waiter_node *right) { if (left->prio != right->prio) return 0; @@ -385,7 +407,7 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left, static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter, struct rt_mutex_waiter *top_waiter) { - if (rt_mutex_waiter_less(waiter, top_waiter)) + if (rt_waiter_node_less(&waiter->tree, &top_waiter->tree)) return true;
#ifdef RT_MUTEX_BUILD_SPINLOCKS @@ -393,30 +415,30 @@ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter, * Note that RT tasks are excluded from same priority (lateral) * steals to prevent the introduction of an unbounded latency. */ - if (rt_prio(waiter->prio) || dl_prio(waiter->prio)) + if (rt_prio(waiter->tree.prio) || dl_prio(waiter->tree.prio)) return false;
- return rt_mutex_waiter_equal(waiter, top_waiter); + return rt_waiter_node_equal(&waiter->tree, &top_waiter->tree); #else return false; #endif }
#define __node_2_waiter(node) \ - rb_entry((node), struct rt_mutex_waiter, tree_entry) + rb_entry((node), struct rt_mutex_waiter, tree.entry)
static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_node *b) { struct rt_mutex_waiter *aw = __node_2_waiter(a); struct rt_mutex_waiter *bw = __node_2_waiter(b);
- if (rt_mutex_waiter_less(aw, bw)) + if (rt_waiter_node_less(&aw->tree, &bw->tree)) return 1;
if (!build_ww_mutex()) return 0;
- if (rt_mutex_waiter_less(bw, aw)) + if (rt_waiter_node_less(&bw->tree, &aw->tree)) return 0;
/* NOTE: relies on waiter->ww_ctx being set before insertion */ @@ -434,48 +456,58 @@ static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_nod static __always_inline void rt_mutex_enqueue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter) { - rb_add_cached(&waiter->tree_entry, &lock->waiters, __waiter_less); + lockdep_assert_held(&lock->wait_lock); + + rb_add_cached(&waiter->tree.entry, &lock->waiters, __waiter_less); }
static __always_inline void rt_mutex_dequeue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter) { - if (RB_EMPTY_NODE(&waiter->tree_entry)) + lockdep_assert_held(&lock->wait_lock); + + if (RB_EMPTY_NODE(&waiter->tree.entry)) return;
- rb_erase_cached(&waiter->tree_entry, &lock->waiters); - RB_CLEAR_NODE(&waiter->tree_entry); + rb_erase_cached(&waiter->tree.entry, &lock->waiters); + RB_CLEAR_NODE(&waiter->tree.entry); }
-#define __node_2_pi_waiter(node) \ - rb_entry((node), struct rt_mutex_waiter, pi_tree_entry) +#define __node_2_rt_node(node) \ + rb_entry((node), struct rt_waiter_node, entry)
-static __always_inline bool -__pi_waiter_less(struct rb_node *a, const struct rb_node *b) +static __always_inline bool __pi_waiter_less(struct rb_node *a, const struct rb_node *b) { - return rt_mutex_waiter_less(__node_2_pi_waiter(a), __node_2_pi_waiter(b)); + return rt_waiter_node_less(__node_2_rt_node(a), __node_2_rt_node(b)); }
static __always_inline void rt_mutex_enqueue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter) { - rb_add_cached(&waiter->pi_tree_entry, &task->pi_waiters, __pi_waiter_less); + lockdep_assert_held(&task->pi_lock); + + rb_add_cached(&waiter->pi_tree.entry, &task->pi_waiters, __pi_waiter_less); }
static __always_inline void rt_mutex_dequeue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter) { - if (RB_EMPTY_NODE(&waiter->pi_tree_entry)) + lockdep_assert_held(&task->pi_lock); + + if (RB_EMPTY_NODE(&waiter->pi_tree.entry)) return;
- rb_erase_cached(&waiter->pi_tree_entry, &task->pi_waiters); - RB_CLEAR_NODE(&waiter->pi_tree_entry); + rb_erase_cached(&waiter->pi_tree.entry, &task->pi_waiters); + RB_CLEAR_NODE(&waiter->pi_tree.entry); }
-static __always_inline void rt_mutex_adjust_prio(struct task_struct *p) +static __always_inline void rt_mutex_adjust_prio(struct rt_mutex_base *lock, + struct task_struct *p) { struct task_struct *pi_task = NULL;
+ lockdep_assert_held(&lock->wait_lock); + lockdep_assert(rt_mutex_owner(lock) == p); lockdep_assert_held(&p->pi_lock);
if (task_has_pi_waiters(p)) @@ -571,9 +603,14 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st * Chain walk basics and protection scope * * [R] refcount on task - * [P] task->pi_lock held + * [Pn] task->pi_lock held * [L] rtmutex->wait_lock held * + * Normal locking order: + * + * rtmutex->wait_lock + * task->pi_lock + * * Step Description Protected by * function arguments: * @task [R] @@ -588,27 +625,32 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st * again: * loop_sanity_check(); * retry: - * [1] lock(task->pi_lock); [R] acquire [P] - * [2] waiter = task->pi_blocked_on; [P] - * [3] check_exit_conditions_1(); [P] - * [4] lock = waiter->lock; [P] - * [5] if (!try_lock(lock->wait_lock)) { [P] try to acquire [L] - * unlock(task->pi_lock); release [P] + * [1] lock(task->pi_lock); [R] acquire [P1] + * [2] waiter = task->pi_blocked_on; [P1] + * [3] check_exit_conditions_1(); [P1] + * [4] lock = waiter->lock; [P1] + * [5] if (!try_lock(lock->wait_lock)) { [P1] try to acquire [L] + * unlock(task->pi_lock); release [P1] * goto retry; * } - * [6] check_exit_conditions_2(); [P] + [L] - * [7] requeue_lock_waiter(lock, waiter); [P] + [L] - * [8] unlock(task->pi_lock); release [P] + * [6] check_exit_conditions_2(); [P1] + [L] + * [7] requeue_lock_waiter(lock, waiter); [P1] + [L] + * [8] unlock(task->pi_lock); release [P1] * put_task_struct(task); release [R] * [9] check_exit_conditions_3(); [L] * [10] task = owner(lock); [L] * get_task_struct(task); [L] acquire [R] - * lock(task->pi_lock); [L] acquire [P] - * [11] requeue_pi_waiter(tsk, waiters(lock));[P] + [L] - * [12] check_exit_conditions_4(); [P] + [L] - * [13] unlock(task->pi_lock); release [P] + * lock(task->pi_lock); [L] acquire [P2] + * [11] requeue_pi_waiter(tsk, waiters(lock));[P2] + [L] + * [12] check_exit_conditions_4(); [P2] + [L] + * [13] unlock(task->pi_lock); release [P2] * unlock(lock->wait_lock); release [L] * goto again; + * + * Where P1 
is the blocking task and P2 is the lock owner; going up one step + * the owner becomes the next blocked task etc.. + * +* */ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, enum rtmutex_chainwalk chwalk, @@ -756,7 +798,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * enabled we continue, but stop the requeueing in the chain * walk. */ - if (rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { + if (rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) { if (!detect_deadlock) goto out_unlock_pi; else @@ -764,13 +806,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, }
/* - * [4] Get the next lock + * [4] Get the next lock; per holding task->pi_lock we can't unblock + * and guarantee @lock's existence. */ lock = waiter->lock; /* * [5] We need to trylock here as we are holding task->pi_lock, * which is the reverse lock order versus the other rtmutex * operations. + * + * Per the above, holding task->pi_lock guarantees lock exists, so + * inverting this lock order is infeasible from a life-time + * perspective. */ if (!raw_spin_trylock(&lock->wait_lock)) { raw_spin_unlock_irq(&task->pi_lock); @@ -874,17 +921,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * or * * DL CBS enforcement advancing the effective deadline. - * - * Even though pi_waiters also uses these fields, and that tree is only - * updated in [11], we can do this here, since we hold [L], which - * serializes all pi_waiters access and rb_erase() does not care about - * the values of the node being removed. */ waiter_update_prio(waiter, task);
rt_mutex_enqueue(lock, waiter);
- /* [8] Release the task */ + /* + * [8] Release the (blocking) task in preparation for + * taking the owner task in [10]. + * + * Since we hold lock->waiter_lock, task cannot unblock, even if we + * release task->pi_lock. + */ raw_spin_unlock(&task->pi_lock); put_task_struct(task);
@@ -908,7 +956,12 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, return 0; }
- /* [10] Grab the next task, i.e. the owner of @lock */ + /* + * [10] Grab the next task, i.e. the owner of @lock + * + * Per holding lock->wait_lock and checking for !owner above, there + * must be an owner and it cannot go away. + */ task = get_task_struct(rt_mutex_owner(lock)); raw_spin_lock(&task->pi_lock);
@@ -921,8 +974,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, * and adjust the priority of the owner. */ rt_mutex_dequeue_pi(task, prerequeue_top_waiter); + waiter_clone_prio(waiter, task); rt_mutex_enqueue_pi(task, waiter); - rt_mutex_adjust_prio(task); + rt_mutex_adjust_prio(lock, task);
} else if (prerequeue_top_waiter == waiter) { /* @@ -937,8 +991,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task, */ rt_mutex_dequeue_pi(task, waiter); waiter = rt_mutex_top_waiter(lock); + waiter_clone_prio(waiter, task); rt_mutex_enqueue_pi(task, waiter); - rt_mutex_adjust_prio(task); + rt_mutex_adjust_prio(lock, task); } else { /* * Nothing changed. No need to do any priority @@ -1154,6 +1209,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock, waiter->task = task; waiter->lock = lock; waiter_update_prio(waiter, task); + waiter_clone_prio(waiter, task);
/* Get the top priority waiter on the lock */ if (rt_mutex_has_waiters(lock)) @@ -1187,7 +1243,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock, rt_mutex_dequeue_pi(owner, top_waiter); rt_mutex_enqueue_pi(owner, waiter);
- rt_mutex_adjust_prio(owner); + rt_mutex_adjust_prio(lock, owner); if (owner->pi_blocked_on) chain_walk = 1; } else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) { @@ -1234,6 +1290,8 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh, { struct rt_mutex_waiter *waiter;
+ lockdep_assert_held(&lock->wait_lock); + raw_spin_lock(&current->pi_lock);
waiter = rt_mutex_top_waiter(lock); @@ -1246,7 +1304,7 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh, * task unblocks. */ rt_mutex_dequeue_pi(current, waiter); - rt_mutex_adjust_prio(current); + rt_mutex_adjust_prio(lock, current);
/* * As we are waking up the top waiter, and the waiter stays @@ -1482,7 +1540,7 @@ static void __sched remove_waiter(struct rt_mutex_base *lock, if (rt_mutex_has_waiters(lock)) rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
- rt_mutex_adjust_prio(owner); + rt_mutex_adjust_prio(lock, owner);
/* Store the lock on which owner is blocked or NULL */ next_lock = task_blocked_on_lock(owner); diff --git a/kernel/locking/rtmutex_api.c b/kernel/locking/rtmutex_api.c index cb9fdff76a8a3..a6974d0445930 100644 --- a/kernel/locking/rtmutex_api.c +++ b/kernel/locking/rtmutex_api.c @@ -459,7 +459,7 @@ void __sched rt_mutex_adjust_pi(struct task_struct *task) raw_spin_lock_irqsave(&task->pi_lock, flags);
waiter = task->pi_blocked_on; - if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { + if (!waiter || rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) { raw_spin_unlock_irqrestore(&task->pi_lock, flags); return; } diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h index c47e8361bfb5c..1162e07cdaea1 100644 --- a/kernel/locking/rtmutex_common.h +++ b/kernel/locking/rtmutex_common.h @@ -17,27 +17,44 @@ #include <linux/rtmutex.h> #include <linux/sched/wake_q.h>
+ +/* + * This is a helper for the struct rt_mutex_waiter below. A waiter goes in two + * separate trees and they need their own copy of the sort keys because of + * different locking requirements. + * + * @entry: rbtree node to enqueue into the waiters tree + * @prio: Priority of the waiter + * @deadline: Deadline of the waiter if applicable + * + * See rt_waiter_node_less() and waiter_*_prio(). + */ +struct rt_waiter_node { + struct rb_node entry; + int prio; + u64 deadline; +}; + /* * This is the control structure for tasks blocked on a rt_mutex, * which is allocated on the kernel stack on of the blocked task. * - * @tree_entry: pi node to enqueue into the mutex waiters tree - * @pi_tree_entry: pi node to enqueue into the mutex owner waiters tree + * @tree: node to enqueue into the mutex waiters tree + * @pi_tree: node to enqueue into the mutex owner waiters tree * @task: task reference to the blocked task * @lock: Pointer to the rt_mutex on which the waiter blocks * @wake_state: Wakeup state to use (TASK_NORMAL or TASK_RTLOCK_WAIT) - * @prio: Priority of the waiter - * @deadline: Deadline of the waiter if applicable * @ww_ctx: WW context pointer + * + * @tree is ordered by @lock->wait_lock + * @pi_tree is ordered by rt_mutex_owner(@lock)->pi_lock */ struct rt_mutex_waiter { - struct rb_node tree_entry; - struct rb_node pi_tree_entry; + struct rt_waiter_node tree; + struct rt_waiter_node pi_tree; struct task_struct *task; struct rt_mutex_base *lock; unsigned int wake_state; - int prio; - u64 deadline; struct ww_acquire_ctx *ww_ctx; };
@@ -105,7 +122,7 @@ static inline bool rt_mutex_waiter_is_top_waiter(struct rt_mutex_base *lock, { struct rb_node *leftmost = rb_first_cached(&lock->waiters);
- return rb_entry(leftmost, struct rt_mutex_waiter, tree_entry) == waiter; + return rb_entry(leftmost, struct rt_mutex_waiter, tree.entry) == waiter; }
static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base *lock) @@ -113,8 +130,10 @@ static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base * struct rb_node *leftmost = rb_first_cached(&lock->waiters); struct rt_mutex_waiter *w = NULL;
+ lockdep_assert_held(&lock->wait_lock); + if (leftmost) { - w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry); + w = rb_entry(leftmost, struct rt_mutex_waiter, tree.entry); BUG_ON(w->lock != lock); } return w; @@ -127,8 +146,10 @@ static inline int task_has_pi_waiters(struct task_struct *p)
static inline struct rt_mutex_waiter *task_top_pi_waiter(struct task_struct *p) { + lockdep_assert_held(&p->pi_lock); + return rb_entry(p->pi_waiters.rb_leftmost, struct rt_mutex_waiter, - pi_tree_entry); + pi_tree.entry); }
#define RT_MUTEX_HAS_WAITERS 1UL @@ -190,8 +211,8 @@ static inline void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter) static inline void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter) { debug_rt_mutex_init_waiter(waiter); - RB_CLEAR_NODE(&waiter->pi_tree_entry); - RB_CLEAR_NODE(&waiter->tree_entry); + RB_CLEAR_NODE(&waiter->pi_tree.entry); + RB_CLEAR_NODE(&waiter->tree.entry); waiter->wake_state = TASK_NORMAL; waiter->task = NULL; } diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h index 56f139201f246..3ad2cc4823e59 100644 --- a/kernel/locking/ww_mutex.h +++ b/kernel/locking/ww_mutex.h @@ -96,25 +96,25 @@ __ww_waiter_first(struct rt_mutex *lock) struct rb_node *n = rb_first(&lock->rtmutex.waiters.rb_root); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * __ww_waiter_next(struct rt_mutex *lock, struct rt_mutex_waiter *w) { - struct rb_node *n = rb_next(&w->tree_entry); + struct rb_node *n = rb_next(&w->tree.entry); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * __ww_waiter_prev(struct rt_mutex *lock, struct rt_mutex_waiter *w) { - struct rb_node *n = rb_prev(&w->tree_entry); + struct rb_node *n = rb_prev(&w->tree.entry); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline struct rt_mutex_waiter * @@ -123,7 +123,7 @@ __ww_waiter_last(struct rt_mutex *lock) struct rb_node *n = rb_last(&lock->rtmutex.waiters.rb_root); if (!n) return NULL; - return rb_entry(n, struct rt_mutex_waiter, tree_entry); + return rb_entry(n, struct rt_mutex_waiter, tree.entry); }
static inline void
From: Dan Carpenter dan.carpenter@linaro.org
commit 641db40f3afe7998011bfabc726dba3e698f8196 upstream.
The bug is the error handling:
if (tmp < nr_bytes) {
"tmp" can hold negative error codes but because "nr_bytes" is type size_t the negative error codes are treated as very high positive values (success). Fix this by changing "nr_bytes" to type ssize_t. The "nr_bytes" variable is used to store values between 1 and PAGE_SIZE and they can fit in ssize_t without any issue.
Link: https://lkml.kernel.org/r/b55f7eed-1c65-4adc-95d1-6c7c65a54a6e@moroto.mounta... Fixes: 5d8de293c224 ("vmcore: convert copy_oldmem_page() to take an iov_iter") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Acked-by: Baoquan He bhe@redhat.com Cc: Dave Young dyoung@redhat.com Cc: Vivek Goyal vgoyal@redhat.com Cc: Alexey Dobriyan adobriyan@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/vmcore.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
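The signedness problem described in the commit message can be reproduced in a few lines of userspace C. The function names below are hypothetical. The explicit cast in short_read_unsigned() spells out the conversion that C applies implicitly when an ssize_t is compared with a size_t, assuming (as on Linux) that both types have the same width.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Old behaviour: nr_bytes is size_t, so tmp is compared as unsigned. */
static int short_read_unsigned(ssize_t tmp, size_t nr_bytes)
{
	/* Same result as the implicit conversion in "tmp < nr_bytes". */
	return (size_t)tmp < nr_bytes;
}

/* New behaviour: both operands are signed, so errors compare as negative. */
static int short_read_signed(ssize_t tmp, ssize_t nr_bytes)
{
	return tmp < nr_bytes;
}

static void check_error_detection(void)
{
	/* A negative error code looks like a huge value and is missed. */
	assert(short_read_unsigned(-EFAULT, 4096) == 0);
	/* With a signed nr_bytes the same error trips the check. */
	assert(short_read_signed(-EFAULT, 4096) == 1);
}
```

A genuine short read, such as 100 bytes out of 4096, is detected by both variants; only the negative error codes are affected.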
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -132,7 +132,7 @@ ssize_t read_from_oldmem(struct iov_iter
 			 u64 *ppos, bool encrypted)
 {
 	unsigned long pfn, offset;
-	size_t nr_bytes;
+	ssize_t nr_bytes;
 	ssize_t read = 0, tmp;
 	int idx;
 
From: Demi Marie Obenour demi@invisiblethingslab.com
commit c04e9894846c663f3278a414f34416e6e45bbe68 upstream.
When a grant entry is still in use by the remote domain, Linux must put it on a deferred list. Normally, this list is very short, because the PV network and block protocols expect the backend to unmap the grant first. However, Qubes OS's GUI protocol is subject to the constraints of the X Window System, and as such winds up with the frontend unmapping the window first. As a result, the list can grow very large, resulting in a massive memory leak and eventual VM freeze.
To partially solve this problem, make the number of entries that the VM will attempt to free at each iteration tunable. The default is still 10, but it can be overridden via a module parameter.
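The per-iteration budget can be sketched in user space as follows. This is a simplified model, not the kernel code: drain_deferred() is a hypothetical name, a plain counter stands in for the deferred list, and every entry is assumed to be releasable (in the real code an entry still in use by the remote domain is put back on the list). A budget of 0 is treated as "no limit", matching the ignore_limit flag in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: *pending is the number of deferred entries.
 * Returns how many entries were freed in this iteration.
 */
static size_t drain_deferred(size_t *pending, unsigned int free_per_iteration)
{
	unsigned int nr = free_per_iteration;
	const bool ignore_limit = (nr == 0);	/* 0 means "free everything" */
	size_t freed = 0;

	/* Stop when the budget runs out or nothing is left to free. */
	while ((ignore_limit || nr--) && *pending > 0) {
		(*pending)--;
		freed++;
	}
	return freed;
}
```

Checking for an empty list in the loop condition is what lets the loop end when the limit is disabled.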
This is Cc: stable because (when combined with appropriate userspace changes) it fixes a severe performance and stability problem for Qubes OS users.
Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour demi@invisiblethingslab.com
Reviewed-by: Juergen Gross jgross@suse.com
Link: https://lore.kernel.org/r/20230726165354.1252-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross jgross@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/ABI/testing/sysfs-module | 11 +++++++++
 drivers/xen/grant-table.c              | 40 +++++++++++++++++++++++----------
 2 files changed, 40 insertions(+), 11 deletions(-)

--- a/Documentation/ABI/testing/sysfs-module
+++ b/Documentation/ABI/testing/sysfs-module
@@ -60,3 +60,14 @@ Description:	Module taint flags:
 			C   staging driver module
 			E   unsigned module
 			==  =====================
+
+What:		/sys/module/grant_table/parameters/free_per_iteration
+Date:		July 2023
+KernelVersion:	6.5 but backported to all supported stable branches
+Contact:	Xen developer discussion xen-devel@lists.xenproject.org
+Description:	Read and write number of grant entries to attempt to free per iteration.
+
+		Note: Future versions of Xen and Linux may provide a better
+		interface for controlling the rate of deferred grant reclaim
+		or may not need it at all.
+Users:		Qubes OS (https://www.qubes-os.org)
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -498,14 +498,21 @@ static LIST_HEAD(deferred_list);
 static void gnttab_handle_deferred(struct timer_list *);
 static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred);

+static atomic64_t deferred_count;
+static atomic64_t leaked_count;
+static unsigned int free_per_iteration = 10;
+module_param(free_per_iteration, uint, 0600);
+
 static void gnttab_handle_deferred(struct timer_list *unused)
 {
-	unsigned int nr = 10;
+	unsigned int nr = READ_ONCE(free_per_iteration);
+	const bool ignore_limit = nr == 0;
 	struct deferred_entry *first = NULL;
 	unsigned long flags;
+	size_t freed = 0;

 	spin_lock_irqsave(&gnttab_list_lock, flags);
-	while (nr--) {
+	while ((ignore_limit || nr--) && !list_empty(&deferred_list)) {
 		struct deferred_entry *entry
 			= list_first_entry(&deferred_list,
 					   struct deferred_entry, list);
@@ -515,10 +522,14 @@ static void gnttab_handle_deferred(struc
 		list_del(&entry->list);
 		spin_unlock_irqrestore(&gnttab_list_lock, flags);
 		if (_gnttab_end_foreign_access_ref(entry->ref)) {
+			uint64_t ret = atomic64_dec_return(&deferred_count);
+
 			put_free_entry(entry->ref);
-			pr_debug("freeing g.e. %#x (pfn %#lx)\n",
-				 entry->ref, page_to_pfn(entry->page));
+			pr_debug("freeing g.e. %#x (pfn %#lx), %llu remaining\n",
+				 entry->ref, page_to_pfn(entry->page),
+				 (unsigned long long)ret);
 			put_page(entry->page);
+			freed++;
 			kfree(entry);
 			entry = NULL;
 		} else {
@@ -530,21 +541,22 @@ static void gnttab_handle_deferred(struc
 		spin_lock_irqsave(&gnttab_list_lock, flags);
 		if (entry)
 			list_add_tail(&entry->list, &deferred_list);
-		else if (list_empty(&deferred_list))
-			break;
 	}
-	if (!list_empty(&deferred_list) && !timer_pending(&deferred_timer)) {
+	if (list_empty(&deferred_list))
+		WARN_ON(atomic64_read(&deferred_count));
+	else if (!timer_pending(&deferred_timer)) {
 		deferred_timer.expires = jiffies + HZ;
 		add_timer(&deferred_timer);
 	}
 	spin_unlock_irqrestore(&gnttab_list_lock, flags);
+	pr_debug("Freed %zu references", freed);
 }

 static void gnttab_add_deferred(grant_ref_t ref, struct page *page)
 {
 	struct deferred_entry *entry;
 	gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
-	const char *what = KERN_WARNING "leaking";
+	uint64_t leaked, deferred;

 	entry = kmalloc(sizeof(*entry), gfp);
 	if (!page) {
@@ -567,10 +579,16 @@ static void gnttab_add_deferred(grant_re
 			add_timer(&deferred_timer);
 		}
 		spin_unlock_irqrestore(&gnttab_list_lock, flags);
-		what = KERN_DEBUG "deferring";
+		deferred = atomic64_inc_return(&deferred_count);
+		leaked = atomic64_read(&leaked_count);
+		pr_debug("deferring g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n",
+			 ref, page ? page_to_pfn(page) : -1, deferred, leaked);
+	} else {
+		deferred = atomic64_read(&deferred_count);
+		leaked = atomic64_inc_return(&leaked_count);
+		pr_warn("leaking g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n",
+			ref, page ? page_to_pfn(page) : -1, deferred, leaked);
 	}
-	printk("%s g.e. %#x (pfn %#lx)\n",
-	       what, ref, page ? page_to_pfn(page) : -1);
 }

 int gnttab_try_end_foreign_access(grant_ref_t ref)
From: Jason Wang jasowang@redhat.com
commit 25266128fe16d5632d43ada34c847d7b8daba539 upstream.
A race was found where set_channels could be called after registering but before virtnet_set_queues() in virtnet_probe(). Fix this by moving virtnet_set_queues() before netdevice registration. While at it, use _virtnet_set_queues() to avoid holding rtnl, as the device is not even registered at that time.
Cc: stable@vger.kernel.org
Fixes: a220871be66f ("virtio-net: correctly enable multiqueue")
Signed-off-by: Jason Wang jasowang@redhat.com
Acked-by: Michael S. Tsirkin mst@redhat.com
Reviewed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com
Link: https://lore.kernel.org/r/20230725072049.617289-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/virtio_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -4110,6 +4110,8 @@ static int virtnet_probe(struct virtio_d
 	if (vi->has_rss || vi->has_rss_hash_report)
 		virtnet_init_default_rss(vi);

+	_virtnet_set_queues(vi, vi->curr_queue_pairs);
+
 	/* serialize netdev register + virtio_device_ready() with ndo_open() */
 	rtnl_lock();

@@ -4148,8 +4150,6 @@ static int virtnet_probe(struct virtio_d
 		goto free_unregister_netdev;
 	}

-	virtnet_set_queues(vi, vi->curr_queue_pairs);
-
 	/* Assume link up if device can't report link status,
 	   otherwise get link status from config. */
 	netif_carrier_off(dev);
From: Alex Elder elder@linaro.org
commit e11ec2b868af2b351c6c1e2e50eb711cc5423a10 upstream.
Last year, the code that manages GSI channel transactions switched from using spinlock-protected linked lists to using indexes into the ring buffer used for a channel. Recently, Google reported seeing transaction reference count underflows occasionally during shutdown.
Doug Anderson found a way to reproduce the issue reliably, and bisected the issue to the commit that eliminated the linked lists and the lock. The root cause was ultimately determined to be related to unused transactions being committed as part of the modem shutdown cleanup activity. Unused transactions are not normally expected (except in error cases).
The modem uses some ranges of IPA-resident memory, and whenever it shuts down we zero those ranges. In ipa_filter_reset_table() a transaction is allocated to zero modem filter table entries. If hashing is not supported, hashed table memory should not be zeroed. But currently nothing prevents that, and the result is an unused transaction. Something similar occurs when we zero routing table entries for the modem.
By preventing any attempt to clear hashed tables when hashing is not supported, the reference count underflow is avoided in this case.
Note that there likely remains an issue with properly freeing unused transactions (if they occur due to errors). This patch addresses only the underflows that Google originally reported.
Cc: stable@vger.kernel.org # 6.1.x
Fixes: d338ae28d8a8 ("net: ipa: kill all other transaction lists")
Tested-by: Douglas Anderson dianders@chromium.org
Signed-off-by: Alex Elder elder@linaro.org
Link: https://lore.kernel.org/r/20230724224055.1688854-1-elder@linaro.org
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ipa/ipa_table.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/drivers/net/ipa/ipa_table.c
+++ b/drivers/net/ipa/ipa_table.c
@@ -273,16 +273,15 @@ static int ipa_filter_reset(struct ipa *
 	if (ret)
 		return ret;

-	ret = ipa_filter_reset_table(ipa, true, false, modem);
-	if (ret)
+	ret = ipa_filter_reset_table(ipa, false, true, modem);
+	if (ret || !ipa_table_hash_support(ipa))
 		return ret;

-	ret = ipa_filter_reset_table(ipa, false, true, modem);
+	ret = ipa_filter_reset_table(ipa, true, false, modem);
 	if (ret)
 		return ret;
-	ret = ipa_filter_reset_table(ipa, true, true, modem);

-	return ret;
+	return ipa_filter_reset_table(ipa, true, true, modem);
 }

 /* The AP routes and modem routes are each contiguous within the
@@ -291,12 +290,13 @@ static int ipa_filter_reset(struct ipa *
  * */
 static int ipa_route_reset(struct ipa *ipa, bool modem)
 {
+	bool hash_support = ipa_table_hash_support(ipa);
 	u32 modem_route_count = ipa->modem_route_count;
 	struct gsi_trans *trans;
 	u16 first;
 	u16 count;

-	trans = ipa_cmd_trans_alloc(ipa, 4);
+	trans = ipa_cmd_trans_alloc(ipa, hash_support ? 4 : 2);
 	if (!trans) {
 		dev_err(&ipa->pdev->dev,
 			"no transaction for %s route reset\n",
@@ -313,10 +313,12 @@ static int ipa_route_reset(struct ipa *i
 	}

 	ipa_table_reset_add(trans, false, false, false, first, count);
-	ipa_table_reset_add(trans, false, true, false, first, count);
-
 	ipa_table_reset_add(trans, false, false, true, first, count);
-	ipa_table_reset_add(trans, false, true, true, first, count);
+
+	if (hash_support) {
+		ipa_table_reset_add(trans, false, true, false, first, count);
+		ipa_table_reset_add(trans, false, true, true, first, count);
+	}

 	gsi_trans_commit_wait(trans);

From: Christian Marangi ansuelsmth@gmail.com
commit 2c39dd025da489cf87d26469d9f5ff19715324a0 upstream.
The qca8xxx switch supports two ways to write register values: a slow way using MDIO, and a fast way that sends specially crafted mgmt packets to read/write registers.

The fast way can support up to 32 bytes of data, as Ethernet packets are used to send/receive.

This works correctly for almost the entire regmap of the switch, but kernel selftests for DSA drivers uncovered a curious hardware defect/limitation.

For some specific registers, a bulk write doesn't work and writes only part of the requested registers, so only half of the data is written. This was especially hard to track down because of how strange the problem is and because of the specific registers where it occurs.

This occurs in the registers of the ATU table, where multiple registers need to be written to compose the entire entry. It was discovered that with a bulk write of 12 bytes on QCA8K_REG_ATU_DATA0 only QCA8K_REG_ATU_DATA0 and QCA8K_REG_ATU_DATA2 were written, but QCA8K_REG_ATU_DATA1 was always zero. Tcpdump was used to make sure the specially crafted packet was correct and this was confirmed.

The problem was hard to track as the missing QCA8K_REG_ATU_DATA1 still resulted in a seemingly valid entry, since the first bytes of the MAC address are set in QCA8K_REG_ATU_DATA0 and the entry type is set in QCA8K_REG_ATU_DATA2.

Funnily enough, a bulk write starting at QCA8K_REG_ATU_DATA1 results in the same problem, with QCA8K_REG_ATU_DATA2 empty and QCA8K_REG_ATU_DATA1 and QCA8K_REG_ATU_FUNC correctly written. One speculation is that there is some kind of indirection internally when accessing these registers and they can't all be accessed together, because this is really a table mapped somewhere in the switch SRAM.

Even more curious is the fact that every other register was tested with all kinds of combinations and none of them is affected by this problem. Read operations were also tested and always worked, so they are not affected either.

The problem is not present if we limit writes to a single register at a time.

To handle this hardware defect, enable use_single_write so that the bulk API splits the write into multiple separate operations, effectively reverting to non-bulk writes.
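The effect of splitting a bulk write can be illustrated with a small user-space model. This is only a sketch of the idea, not the regmap implementation: struct model_dev, model_write_one(), model_bulk_write() and the MODEL_BASE address are all hypothetical names and values chosen for the example. The model issues one bus transaction per 32-bit register instead of one transaction covering the whole range.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_BASE	0x600u	/* hypothetical base address of a 4-register block */
#define MODEL_NREGS	4u

struct model_dev {
	uint32_t regs[MODEL_NREGS];
	unsigned int transfers;	/* number of bus transactions issued */
};

/* One transaction that writes exactly one 32-bit register. */
static int model_write_one(struct model_dev *dev, uint32_t reg, uint32_t val)
{
	uint32_t idx;

	if (reg < MODEL_BASE || (reg - MODEL_BASE) % 4 != 0)
		return -1;
	idx = (reg - MODEL_BASE) / 4;
	if (idx >= MODEL_NREGS)
		return -1;
	dev->regs[idx] = val;
	dev->transfers++;
	return 0;
}

/* Bulk write expressed as a sequence of single-register writes. */
static int model_bulk_write(struct model_dev *dev, uint32_t reg,
			    const uint32_t *vals, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int ret = model_write_one(dev, reg + (uint32_t)(i * 4), vals[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

Writing three values this way costs three transactions rather than one, which is the price paid for never exposing the hardware to a multi-register write.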
Cc: Mark Brown broonie@kernel.org
Fixes: c766e077d927 ("net: dsa: qca8k: convert to regmap read/write API")
Signed-off-by: Christian Marangi ansuelsmth@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/dsa/qca/qca8k-8xxx.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
index 09b80644c11b..efe9380d4a15 100644
--- a/drivers/net/dsa/qca/qca8k-8xxx.c
+++ b/drivers/net/dsa/qca/qca8k-8xxx.c
@@ -576,8 +576,11 @@ static struct regmap_config qca8k_regmap_config = {
 	.rd_table = &qca8k_readable_table,
 	.disable_locking = true, /* Locking is handled by qca8k read/write */
 	.cache_type = REGCACHE_NONE, /* Explicitly disable CACHE */
-	.max_raw_read = 32, /* mgmt eth can read/write up to 8 registers at time */
-	.max_raw_write = 32,
+	.max_raw_read = 32, /* mgmt eth can read up to 8 registers at time */
+	/* ATU regs suffer from a bug where some data are not correctly
+	 * written. Disable bulk write to correctly write ATU entry.
+	 */
+	.use_single_write = true,
 };

 static int
From: Christian Marangi ansuelsmth@gmail.com
commit 80248d4160894d7e40b04111bdbaa4ff93fc4bd7 upstream.
On inserting an MDB entry, fdb_search_and_insert is used to add a port to the qca8k target entry in the FDB.

An FDB entry can't be modified, so it needs to be removed and inserted again with the new values.

To detect whether an entry already exists, the SEARCH operation is used and we check the aging of the entry. If the aging is not 0, the entry exists and we proceed to delete it.

The current code has 2 main problems:
- The condition to check whether the FDB entry exists is wrong and should be the opposite.
- When an FDB entry doesn't exist, aging was never actually set to the STATIC value, resulting in an invalid entry being allocated.

Fix both problems by adding aging support to the function, calling the function with STATIC as aging by default, and correcting the condition that checks whether the entry actually exists.
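The corrected decision can be sketched outside the driver as follows. This is a hypothetical model, not the qca8k code: struct model_fdb, model_prepare_insert() and MODEL_STATUS_STATIC are illustrative names, and the value 0xf is an assumption. A non-zero aging value stands for "SEARCH found an entry".

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_STATUS_STATIC	0xf	/* assumed stand-in for the STATIC aging value */

struct model_fdb {
	uint8_t aging;		/* 0 means SEARCH found no entry */
	uint8_t port_mask;
};

/*
 * Returns true when an existing entry was found, i.e. the caller must
 * purge it before loading the updated one. For a missing entry the
 * requested aging is filled in so the new entry is valid.
 */
static bool model_prepare_insert(struct model_fdb *fdb, uint8_t port_mask,
				 uint8_t aging)
{
	bool exists = fdb->aging != 0;

	if (!exists)
		fdb->aging = aging;

	/* Add the port to the (possibly pre-existing) port mask. */
	fdb->port_mask |= port_mask;
	return exists;
}
```

Both bugs from the list above map onto this sketch: the purge must happen only when aging is non-zero, and a fresh entry must receive a valid aging value before it is loaded.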
Fixes: ba8f870dfa63 ("net: dsa: qca8k: add support for mdb_add/del")
Signed-off-by: Christian Marangi ansuelsmth@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/dsa/qca/qca8k-common.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/net/dsa/qca/qca8k-common.c
+++ b/drivers/net/dsa/qca/qca8k-common.c
@@ -244,7 +244,7 @@ void qca8k_fdb_flush(struct qca8k_priv *
 }

 static int qca8k_fdb_search_and_insert(struct qca8k_priv *priv, u8 port_mask,
-				       const u8 *mac, u16 vid)
+				       const u8 *mac, u16 vid, u8 aging)
 {
 	struct qca8k_fdb fdb = { 0 };
 	int ret;
@@ -261,10 +261,12 @@ static int qca8k_fdb_search_and_insert(s
 		goto exit;

 	/* Rule exist. Delete first */
-	if (!fdb.aging) {
+	if (fdb.aging) {
 		ret = qca8k_fdb_access(priv, QCA8K_FDB_PURGE, -1);
 		if (ret)
 			goto exit;
+	} else {
+		fdb.aging = aging;
 	}

 	/* Add port to fdb portmask */
@@ -810,7 +812,8 @@ int qca8k_port_mdb_add(struct dsa_switch
 	const u8 *addr = mdb->addr;
 	u16 vid = mdb->vid;

-	return qca8k_fdb_search_and_insert(priv, BIT(port), addr, vid);
+	return qca8k_fdb_search_and_insert(priv, BIT(port), addr, vid,
+					   QCA8K_ATU_STATUS_STATIC);
 }

 int qca8k_port_mdb_del(struct dsa_switch *ds, int port,
From: Christian Marangi ansuelsmth@gmail.com
commit ae70dcb9d9ecaf7d9836d3e1b5bef654d7ef5680 upstream.
On deleting an MDB entry for a port, fdb_search_and_del is used. An FDB entry can't be modified, so it needs to be deleted and re-added with the new portmap (with the port removed as requested).

We use the SEARCH operator to find the entry to edit by VID and MAC address, and then we check the aging to see whether we actually found an entry.

Currently the code suffers from a bug where the searched FDB entry is never read back with the found values (if found), resulting in the code always returning -EINVAL as aging was always 0.

Fix this by correctly reading the FDB entry after it was searched.
Fixes: ba8f870dfa63 ("net: dsa: qca8k: add support for mdb_add/del")
Signed-off-by: Christian Marangi ansuelsmth@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/dsa/qca/qca8k-common.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/dsa/qca/qca8k-common.c
+++ b/drivers/net/dsa/qca/qca8k-common.c
@@ -293,6 +293,10 @@ static int qca8k_fdb_search_and_del(stru
 	if (ret < 0)
 		goto exit;

+	ret = qca8k_fdb_read(priv, &fdb);
+	if (ret < 0)
+		goto exit;
+
 	/* Rule doesn't exist. Why delete? */
 	if (!fdb.aging) {
 		ret = -EINVAL;
From: Christian Marangi ansuelsmth@gmail.com
commit dfd739f182b00b02bd7470ed94d112684cc04fa2 upstream.
The qca8k switch doesn't support using 0 as VID and requires a default VID to always be set. The MDB add/del functions don't currently handle this and pass VID 0 through instead of the default VID.

Fix this by correctly handling this corner case and internally using the default VID for the VID 0 case.
Fixes: ba8f870dfa63 ("net: dsa: qca8k: add support for mdb_add/del")
Signed-off-by: Christian Marangi ansuelsmth@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/dsa/qca/qca8k-common.c | 6 ++++++
 1 file changed, 6 insertions(+)

--- a/drivers/net/dsa/qca/qca8k-common.c
+++ b/drivers/net/dsa/qca/qca8k-common.c
@@ -816,6 +816,9 @@ int qca8k_port_mdb_add(struct dsa_switch
 	const u8 *addr = mdb->addr;
 	u16 vid = mdb->vid;

+	if (!vid)
+		vid = QCA8K_PORT_VID_DEF;
+
 	return qca8k_fdb_search_and_insert(priv, BIT(port), addr, vid,
 					   QCA8K_ATU_STATUS_STATIC);
 }
@@ -828,6 +831,9 @@ int qca8k_port_mdb_del(struct dsa_switch
 	const u8 *addr = mdb->addr;
 	u16 vid = mdb->vid;

+	if (!vid)
+		vid = QCA8K_PORT_VID_DEF;
+
 	return qca8k_fdb_search_and_del(priv, BIT(port), addr, vid);
 }

From: Jens Axboe axboe@kernel.dk
commit 7b72d661f1f2f950ab8c12de7e2bc48bdac8ed69 upstream.
A previous commit made all cqring waits marked as iowait, as a way to improve performance for short schedules with pending IO. However, for use cases that have a special reaper thread that does nothing but wait on events on the ring, this causes a cosmetic issue where we now have one core marked as being "busy" with 100% iowait.
While this isn't a grave issue, it is confusing to users. Rather than always mark us as being in iowait, gate setting of current->in_iowait to 1 by whether or not the waiting task has pending requests.
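The gating can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the io_uring code: struct model_task and model_wait() are hypothetical, the inflight field stands in for the per-task counter of pending requests, and the sleep itself is omitted. The function returns the in_iowait value that accounting would observe while the task sleeps.

```c
/* Hypothetical stand-in for the fields of the waiting task that matter here. */
struct model_task {
	int in_iowait;	/* current iowait flag */
	long inflight;	/* number of pending requests */
};

/*
 * Save the flag, raise it only when requests are pending, and restore
 * it afterwards. The return value is what would be visible during the
 * (omitted) sleep.
 */
static int model_wait(struct model_task *t)
{
	int saved = t->in_iowait;
	int seen;

	if (t->inflight > 0)
		t->in_iowait = 1;
	seen = t->in_iowait;	/* the task would sleep at this point */
	t->in_iowait = saved;
	return seen;
}
```

A reaper thread with nothing in flight therefore never shows up as iowait, while a task waiting on its own requests still does.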
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/io-uring/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+U...
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217699
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217700
Reported-by: Oleksandr Natalenko oleksandr@natalenko.name
Reported-by: Phil Elwell phil@raspberrypi.com
Tested-by: Andres Freund andres@anarazel.de
Fixes: 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait")
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 io_uring/io_uring.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2579,11 +2579,20 @@ int io_run_task_work_sig(struct io_ring_
 	return 0;
 }

+static bool current_pending_io(void)
+{
+	struct io_uring_task *tctx = current->io_uring;
+
+	if (!tctx)
+		return false;
+	return percpu_counter_read_positive(&tctx->inflight);
+}
+
 /* when returns >0, the caller should retry */
 static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 					  struct io_wait_queue *iowq)
 {
-	int token, ret;
+	int io_wait, ret;

 	if (unlikely(READ_ONCE(ctx->check_cq)))
 		return 1;
@@ -2597,17 +2606,19 @@ static inline int io_cqring_wait_schedul
 		return 0;

 	/*
-	 * Use io_schedule_prepare/finish, so cpufreq can take into account
-	 * that the task is waiting for IO - turns out to be important for low
-	 * QD IO.
+	 * Mark us as being in io_wait if we have pending requests, so cpufreq
+	 * can take into account that the task is waiting for IO - turns out
+	 * to be important for low QD IO.
 	 */
-	token = io_schedule_prepare();
+	io_wait = current->in_iowait;
+	if (current_pending_io())
+		current->in_iowait = 1;
 	ret = 0;
 	if (iowq->timeout == KTIME_MAX)
 		schedule();
 	else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
 		ret = -ETIME;
-	io_schedule_finish(token);
+	current->in_iowait = io_wait;
 	return ret;
 }

From: Jason Gunthorpe jgg@nvidia.com
commit b7c822fa6b7701b17e139f1c562fc24135880ed4 upstream.
Even though the test suite covers this it somehow became obscured that this wasn't working.
The test iommufd_ioas.mock_domain.access_domain_destory would blow up rarely.
end should be set to 1 because this just pushed an item, the carry, to the pfns list.
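Why the end index has to be 1 can be seen in a reduced user-space model of the batch. This is a sketch only: struct model_batch, model_clear_carry() and the fixed slot count are hypothetical simplifications, and the caller is assumed to pass a keep_pfns value no larger than the length of the last run.

```c
#define MODEL_BATCH_SLOTS 4

/* Simplified batch: parallel arrays of run start and run length. */
struct model_batch {
	unsigned long pfns[MODEL_BATCH_SLOTS];	/* first pfn of each run */
	unsigned int npfns[MODEL_BATCH_SLOTS];	/* length of each run */
	unsigned int end;			/* number of valid runs */
	unsigned int total_pfns;
};

/*
 * Drop everything except the last keep_pfns pfns of the final run and
 * move that remainder (the carry) into slot 0.
 */
static void model_clear_carry(struct model_batch *b, unsigned int keep_pfns)
{
	if (!keep_pfns || !b->end) {
		b->end = 0;
		b->total_pfns = 0;
		return;
	}

	b->total_pfns = keep_pfns;
	b->pfns[0] = b->pfns[b->end - 1] + (b->npfns[b->end - 1] - keep_pfns);
	b->npfns[0] = keep_pfns;
	b->end = 1;	/* slot 0 now holds the carry, so one run is valid */
}
```

With end left at 0, a later consumer would treat the batch as empty even though slot 0 and total_pfns describe a live run.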
Sometimes the test would blow up with:
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP
  CPU: 5 PID: 584 Comm: iommufd Not tainted 6.5.0-rc1-dirty #1236
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:batch_unpin+0xa2/0x100 [iommufd]
  Code: 17 48 81 fe ff ff 07 00 77 70 48 8b 15 b7 be 97 e2 48 85 d2 74 14 48 8b 14 fa 48 85 d2 74 0b 40 0f b6 f6 48 c1 e6 04 48 01 f2 <48> 8b 3a 48 c1 e0 06 89 ca 48 89 de 48 83 e7 f0 48 01 c7 e8 96 dc
  RSP: 0018:ffffc90001677a58 EFLAGS: 00010246
  RAX: 00007f7e2646f000 RBX: 0000000000000000 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 00000000fefc4c8d RDI: 0000000000fefc4c
  RBP: ffffc90001677a80 R08: 0000000000000048 R09: 0000000000000200
  R10: 0000000000030b98 R11: ffffffff81f3bb40 R12: 0000000000000001
  R13: ffff888101f75800 R14: ffffc90001677ad0 R15: 00000000000001fe
  FS: 00007f9323679740(0000) GS:ffff8881ba540000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000105ede003 CR4: 00000000003706a0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   ? show_regs+0x5c/0x70
   ? __die+0x1f/0x60
   ? page_fault_oops+0x15d/0x440
   ? lock_release+0xbc/0x240
   ? exc_page_fault+0x4a4/0x970
   ? asm_exc_page_fault+0x27/0x30
   ? batch_unpin+0xa2/0x100 [iommufd]
   ? batch_unpin+0xba/0x100 [iommufd]
   __iopt_area_unfill_domain+0x198/0x430 [iommufd]
   ? __mutex_lock+0x8c/0xb80
   ? __mutex_lock+0x6aa/0xb80
   ? xa_erase+0x28/0x30
   ? iopt_table_remove_domain+0x162/0x320 [iommufd]
   ? lock_release+0xbc/0x240
   iopt_area_unfill_domain+0xd/0x10 [iommufd]
   iopt_table_remove_domain+0x195/0x320 [iommufd]
   iommufd_hw_pagetable_destroy+0xb3/0x110 [iommufd]
   iommufd_object_destroy_user+0x8e/0xf0 [iommufd]
   iommufd_device_detach+0xc5/0x140 [iommufd]
   iommufd_selftest_destroy+0x1f/0x70 [iommufd]
   iommufd_object_destroy_user+0x8e/0xf0 [iommufd]
   iommufd_destroy+0x3a/0x50 [iommufd]
   iommufd_fops_ioctl+0xfb/0x170 [iommufd]
   __x64_sys_ioctl+0x40d/0x9a0
   do_syscall_64+0x3c/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
Link: https://lore.kernel.org/r/3-v1-85aacb2af554+bc-iommufd_syz3_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages")
Reviewed-by: Kevin Tian kevin.tian@intel.com
Tested-by: Nicolin Chen nicolinc@nvidia.com
Reported-by: Nicolin Chen nicolinc@nvidia.com
Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/iommu/iommufd/pages.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -297,7 +297,7 @@ static void batch_clear_carry(struct pfn
 	batch->pfns[0] = batch->pfns[batch->end - 1] +
 			 (batch->npfns[batch->end - 1] - keep_pfns);
 	batch->npfns[0] = keep_pfns;
-	batch->end = 0;
+	batch->end = 1;
 }

 static void batch_skip_carry(struct pfn_batch *batch, unsigned int skip_pfns)
From: Sean Christopherson seanjc@google.com
commit 3bcbc20942db5d738221cca31a928efc09827069 upstream.
To allow running rseq and KVM's rseq selftests as statically linked binaries, initialize the various "trampoline" pointers to point directly at the expected glibc symbols, and skip the dlsym() lookups if the rseq size is non-zero, i.e. the binary is statically linked *and* the libc registered its own rseq.
Define weak versions of the symbols so as not to break linking against libc versions that don't support rseq in any capacity.
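The weak-definition technique can be shown in a stand-alone form. The sketch below is an illustration under assumptions: it relies on the GCC/Clang weak attribute, and model_feature_size, model_feature_size_p and model_feature_registered() are hypothetical names rather than the real rseq symbols. When no other object provides a strong definition, the weak one is used and reads as zero.

```c
/*
 * Hypothetical symbol: a library that supports the feature would supply
 * a strong definition which overrides this weak, zero-initialized one.
 */
__attribute__((weak)) unsigned int model_feature_size;

/* Point directly at the symbol so that no runtime lookup is needed. */
static const unsigned int *model_feature_size_p = &model_feature_size;

/* Non-zero size means the library registered the feature itself. */
static int model_feature_registered(void)
{
	return model_feature_size_p && *model_feature_size_p != 0;
}
```

Because the weak definition always links, the pointer is valid in a static binary even when the library knows nothing about the feature, and a zero size then tells the caller to fall back to another discovery path.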
The KVM selftests in particular are often statically linked so that they can be run on targets with very limited runtime environments, i.e. test machines.
Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35")
Cc: Aaron Lewis aaronlewis@google.com
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson seanjc@google.com
Message-Id: 20230721223352.2333911-1-seanjc@google.com
Signed-off-by: Paolo Bonzini pbonzini@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/testing/selftests/rseq/rseq.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -34,9 +34,17 @@
 #include "../kselftest.h"
 #include "rseq.h"

-static const ptrdiff_t *libc_rseq_offset_p;
-static const unsigned int *libc_rseq_size_p;
-static const unsigned int *libc_rseq_flags_p;
+/*
+ * Define weak versions to play nice with binaries that are statically linked
+ * against a libc that doesn't support registering its own rseq.
+ */
+__weak ptrdiff_t __rseq_offset;
+__weak unsigned int __rseq_size;
+__weak unsigned int __rseq_flags;
+
+static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset;
+static const unsigned int *libc_rseq_size_p = &__rseq_size;
+static const unsigned int *libc_rseq_flags_p = &__rseq_flags;

 /* Offset from the thread pointer to the rseq area. */
 ptrdiff_t rseq_offset;
@@ -155,9 +163,17 @@ unsigned int get_rseq_feature_size(void)
 static __attribute__((constructor))
 void rseq_init(void)
 {
-	libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
-	libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
-	libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
+	/*
+	 * If the libc's registered rseq size isn't already valid, it may be
+	 * because the binary is dynamically linked and not necessarily due to
+	 * libc not having registered a restartable sequence. Try to find the
+	 * symbols if that's the case.
+	 */
+	if (!*libc_rseq_size_p) {
+		libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
+		libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
+		libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
+	}
 	if (libc_rseq_size_p && libc_rseq_offset_p && libc_rseq_flags_p &&
 			*libc_rseq_size_p != 0) {
 		/* rseq registration owned by glibc */
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 016e7ba47f33064fbef8c4307a2485d2669dfd03 upstream.
If 'iptables-legacy' is available, 'ip6tables-legacy' command will be used instead of 'ip6tables'. So no need to look if 'ip6tables' is available in this case.
Cc: stable@vger.kernel.org
Fixes: 0c4cd3f86a40 ("selftests: mptcp: join: use 'iptables-legacy' if available")
Acked-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net
Signed-off-by: Mat Martineau martineau@kernel.org
Link: https://lore.kernel.org/r/20230725-send-net-20230725-v1-1-6f60fe7137a9@kerne...
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -156,9 +156,7 @@ check_tools()
 	elif ! iptables -V &> /dev/null; then
 		echo "SKIP: Could not run all tests without iptables tool"
 		exit $ksft_skip
-	fi
-
-	if ! ip6tables -V &> /dev/null; then
+	elif ! ip6tables -V &> /dev/null; then
 		echo "SKIP: Could not run all tests without ip6tables tool"
 		exit $ksft_skip
 	fi
From: Johan Hovold johan+linaro@kernel.org
commit c40d6b3249b11d60e09d81530588f56233d9aa44 upstream.
The soundwire subsystem uses two completion structures that allow drivers to wait for a soundwire device to become enumerated on the bus and initialised by its driver, respectively.
The code implementing the signalling is currently broken as it does not signal all current and future waiters and also uses the wrong reinitialisation function, which can potentially lead to memory corruption if there are still waiters on the queue.
Not signalling future waiters specifically breaks sound card probe deferrals, as codec drivers cannot tell that the soundwire device is already attached when being reprobed. Some codec runtime PM implementations suffer from similar problems, as waiting for enumeration during resume can also time out despite the device already having been enumerated.
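The difference between waking one waiter and waking all current and future waiters can be shown with a single-threaded model of a completion's counter. This is a simplified sketch, not the kernel implementation: struct model_completion and the model_*() helpers are hypothetical names, locking and wait queues are left out, and UINT_MAX is used as the "completed for everyone" marker.

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified completion: only the counter, no wait queue or locking. */
struct model_completion {
	unsigned int done;
};

/* Reset the counter while leaving the rest of the object untouched. */
static void model_reinit(struct model_completion *c)
{
	c->done = 0;
}

/* Let exactly one waiter through. */
static void model_complete(struct model_completion *c)
{
	if (c->done != UINT_MAX)
		c->done++;
}

/* Let every current and future waiter through until reinitialised. */
static void model_complete_all(struct model_completion *c)
{
	c->done = UINT_MAX;
}

/* Non-blocking wait: returns true if the waiter may proceed. */
static bool model_try_wait(struct model_completion *c)
{
	if (!c->done)
		return false;
	if (c->done != UINT_MAX)
		c->done--;	/* a single completion is consumed */
	return true;
}
```

In this model a second waiter after model_complete() finds the counter back at zero, which corresponds to a reprobed codec driver that cannot see the device is already attached; after model_complete_all() every later wait succeeds until model_reinit() is called.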
Fixes: fb9469e54fa7 ("soundwire: bus: fix race condition with enumeration_complete signaling")
Fixes: a90def068127 ("soundwire: bus: fix race condition with initialization_complete signaling")
Cc: stable@vger.kernel.org # 5.7
Cc: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Cc: Rander Wang rander.wang@linux.intel.com
Signed-off-by: Johan Hovold johan+linaro@kernel.org
Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Link: https://lore.kernel.org/r/20230705123018.30903-2-johan+linaro@kernel.org
Signed-off-by: Vinod Koul vkoul@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/soundwire/bus.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -908,8 +908,8 @@ static void sdw_modify_slave_status(stru
 			"initializing enumeration and init completion for Slave %d\n",
 			slave->dev_num);

-		init_completion(&slave->enumeration_complete);
-		init_completion(&slave->initialization_complete);
+		reinit_completion(&slave->enumeration_complete);
+		reinit_completion(&slave->initialization_complete);

 	} else if ((status == SDW_SLAVE_ATTACHED) &&
 		   (slave->status == SDW_SLAVE_UNATTACHED)) {
@@ -917,7 +917,7 @@ static void sdw_modify_slave_status(stru
 			"signaling enumeration completion for Slave %d\n",
 			slave->dev_num);

-		complete(&slave->enumeration_complete);
+		complete_all(&slave->enumeration_complete);
 	}
 	slave->status = status;
 	mutex_unlock(&bus->bus_lock);
@@ -1941,7 +1941,7 @@ int sdw_handle_slave_status(struct sdw_b
 				"signaling initialization completion for Slave %d\n",
 				slave->dev_num);

-			complete(&slave->initialization_complete);
+			complete_all(&slave->initialization_complete);

 			/*
 			 * If the manager became pm_runtime active, the peripherals will be
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit dddfa05eb58076ad60f9a66e7155a5b3502b2dd5 upstream.
This reverts commit 9b0da3f22307af693be80f5d3a89dc4c7f360a85.
The sigio.c is clearly user space code which is handled by arch/um/scripts/Makefile.rules (see USER_OBJS rule).
The above-mentioned commit simply broke this agreement: we may not use Linux kernel internal headers in such files without careful thought.
Hence, revert the wrong commit.
Link: https://lkml.kernel.org/r/20230724143131.30090-1-andriy.shevchenko@linux.int...
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com
Reported-by: kernel test robot lkp@intel.com
Closes: https://lore.kernel.org/oe-kbuild-all/202307212304.cH79zJp1-lkp@intel.com/
Cc: Anton Ivanov anton.ivanov@cambridgegreys.com
Cc: Herve Codina herve.codina@bootlin.com
Cc: Jason A. Donenfeld Jason@zx2c4.com
Cc: Johannes Berg johannes@sipsolutions.net
Cc: Rasmus Villemoes linux@rasmusvillemoes.dk
Cc: Richard Weinberger richard@nod.at
Cc: Yang Guang yang.guang5@zte.com.cn
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/um/os-Linux/sigio.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/arch/um/os-Linux/sigio.c
+++ b/arch/um/os-Linux/sigio.c
@@ -3,7 +3,6 @@
  * Copyright (C) 2002 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
  */

-#include <linux/minmax.h>
 #include <unistd.h>
 #include <errno.h>
 #include <fcntl.h>
@@ -51,7 +50,7 @@ static struct pollfds all_sigio_fds;

 static int write_sigio_thread(void *unused)
 {
-	struct pollfds *fds;
+	struct pollfds *fds, tmp;
 	struct pollfd *p;
 	int i, n, respond_fd;
 	char c;
@@ -78,7 +77,9 @@ static int write_sigio_thread(void *unus
 					       "write_sigio_thread : "
 					       "read on socket failed, "
 					       "err = %d\n", errno);
-				swap(current_poll, next_poll);
+				tmp = current_poll;
+				current_poll = next_poll;
+				next_poll = tmp;
 				respond_fd = sigio_private[1];
 			}
 			else {
From: WANG Rui wangrui@loongson.cn
commit e66d511fc92201ba481392e54896f1aeadfcf0e9 upstream.
This patch fixes an underflow issue in the return value within the exception path, specifically at .Llt8 when the remaining length is less than 8 bytes.
Cc: stable@vger.kernel.org
Fixes: 8941e93ca590 ("LoongArch: Optimize memory ops (memset/memcpy/memmove)")
Reported-by: Weihao Li liweihao@loongson.cn
Signed-off-by: WANG Rui wangrui@loongson.cn
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/loongarch/lib/clear_user.S | 3 ++-
 arch/loongarch/lib/copy_user.S  | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/loongarch/lib/clear_user.S b/arch/loongarch/lib/clear_user.S
index fd1d62b244f2..9dcf71719387 100644
--- a/arch/loongarch/lib/clear_user.S
+++ b/arch/loongarch/lib/clear_user.S
@@ -108,6 +108,7 @@ SYM_FUNC_START(__clear_user_fast)
 	addi.d	a3, a2, -8
 	bgeu	a0, a3, .Llt8
 15:	st.d	zero, a0, 0
+	addi.d	a0, a0, 8

 .Llt8:
 16:	st.d	zero, a2, -8
@@ -188,7 +189,7 @@ SYM_FUNC_START(__clear_user_fast)
 	_asm_extable 13b, .L_fixup_handle_0
 	_asm_extable 14b, .L_fixup_handle_1
 	_asm_extable 15b, .L_fixup_handle_0
-	_asm_extable 16b, .L_fixup_handle_1
+	_asm_extable 16b, .L_fixup_handle_0
 	_asm_extable 17b, .L_fixup_handle_s0
 	_asm_extable 18b, .L_fixup_handle_s0
 	_asm_extable 19b, .L_fixup_handle_s0
diff --git a/arch/loongarch/lib/copy_user.S b/arch/loongarch/lib/copy_user.S
index b21f6d5d38f5..fecd08cad702 100644
--- a/arch/loongarch/lib/copy_user.S
+++ b/arch/loongarch/lib/copy_user.S
@@ -136,6 +136,7 @@ SYM_FUNC_START(__copy_user_fast)
 	bgeu	a1, a4, .Llt8
 30:	ld.d	t0, a1, 0
 31:	st.d	t0, a0, 0
+	addi.d	a0, a0, 8

 .Llt8:
 32:	ld.d	t0, a3, -8
@@ -246,7 +247,7 @@ SYM_FUNC_START(__copy_user_fast)
 	_asm_extable 30b, .L_fixup_handle_0
 	_asm_extable 31b, .L_fixup_handle_0
 	_asm_extable 32b, .L_fixup_handle_0
-	_asm_extable 33b, .L_fixup_handle_1
+	_asm_extable 33b, .L_fixup_handle_0
 	_asm_extable 34b, .L_fixup_handle_s0
 	_asm_extable 35b, .L_fixup_handle_s0
 	_asm_extable 36b, .L_fixup_handle_s0
From: Tiezhu Yang yangtiezhu@loongson.cn
commit 4eece7e6de94d833c8aeed2f438faf487cbf94ff upstream.
As the code comment says, the original aim is to save one instruction in some corner cases: if bit[51:31] is all 0 or all 1, there is no need to call lu32id. That is to say, lu32id should be called only if bit[51:31] is neither all 0 nor all 1. The current code always calls lu32id; the result is still correct, but the logic is not what was intended, so fix it.
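To make the difference between the two conditions concrete, here is a minimal user-space sketch. The helper names are hypothetical and are not part of arch/loongarch/net/bpf_jit.h; the sketch only reproduces the bit test on bit[51:31] and does not emit any instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract bit[51:31] (21 bits) of the immediate. */
static uint32_t imm_bits_51_31(int64_t imm)
{
	return (uint32_t)(((uint64_t)imm >> 31) & 0x1fffff);
}

/* Old condition: a value cannot equal both constants, so this is always true. */
static bool needs_lu32id_old(int64_t imm)
{
	uint32_t imm_51_31 = imm_bits_51_31(imm);

	return imm_51_31 != 0 || imm_51_31 != 0x1fffff;
}

/* Fixed condition: true only if the field is neither all 0 nor all 1. */
static bool needs_lu32id_new(int64_t imm)
{
	uint32_t imm_51_31 = imm_bits_51_31(imm);

	return imm_51_31 != 0 && imm_51_31 != 0x1fffff;
}

/* Small usage example for the two helpers above. */
void lu32id_examples(void)
{
	assert(needs_lu32id_old(0));		/* old test never skips */
	assert(!needs_lu32id_new(0));		/* field is all 0 */
	assert(!needs_lu32id_new(-1));		/* field is all 1 */
	assert(needs_lu32id_new((int64_t)1 << 40));
}
```

With the old test the "skip" branch is unreachable, which is why the generated code was still correct but one instruction longer than intended in the corner cases.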
Cc: stable@vger.kernel.org # 6.1
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Reported-by: Colin King (gmail) colin.i.king@gmail.com
Closes: https://lore.kernel.org/all/bcf97046-e336-712a-ac68-7fd194f2953e@gmail.com/
Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/loongarch/net/bpf_jit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/loongarch/net/bpf_jit.h
+++ b/arch/loongarch/net/bpf_jit.h
@@ -150,7 +150,7 @@ static inline void move_imm(struct jit_c
 	 * no need to call lu32id to do a new filled operation.
 	 */
 	imm_51_31 = (imm >> 31) & 0x1fffff;
-	if (imm_51_31 != 0 || imm_51_31 != 0x1fffff) {
+	if (imm_51_31 != 0 && imm_51_31 != 0x1fffff) {
 		/* lu32id rd, imm_51_32 */
 		imm_51_32 = (imm >> 32) & 0xfffff;
 		emit_insn(ctx, lu32id, rd, imm_51_32);
From: Chenguang Zhao zhaochenguang@kylinos.cn
commit de0e30bee86d0f99c696a1fea34474e556a946ec upstream.
Currently nettrace does not work on LoongArch due to missing bpf_probe_read{,str}() support, with the error message:
ERROR: failed to load kprobe-based eBPF
ERROR: failed to load kprobe-based bpf
According to commit 0ebeea8ca8a4d1d ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work"), we only need to select CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE to add said support, because LoongArch does have non-overlapping address ranges for kernel and userspace.
Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Chenguang Zhao zhaochenguang@kylinos.cn
Signed-off-by: Huacai Chen chenhuacai@loongson.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/loongarch/Kconfig | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -12,6 +12,7 @@ config LOONGARCH
 	select ARCH_HAS_ACPI_TABLE_UPGRADE	if ACPI
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
+	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_INLINE_READ_LOCK if !PREEMPTION
From: Dominique Martinet asmadeus@codewreck.org
commit eee4a119e96c2f58cfd1b6d4de42095abc5f8877 upstream.
The retval from filemap_fdatawrite() was immediately overwritten by the following p9_fid_put(): preserve the error from filemap_fdatawrite() if there was one, and only otherwise report the result of p9_fid_put().
This fixes the following scan-build warning:
fs/9p/vfs_dir.c:220:4: warning: Value stored to 'retval' is never read [deadcode.DeadStores]
                        retval = filemap_fdatawrite(inode->i_mapping);
                        ^        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
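The "keep the first error" rule used by the fix can be sketched in isolation as follows. The helper name is hypothetical and does not exist in fs/9p; it only mirrors the expression `retval < 0 ? retval : put_err` from the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in fs/9p): return the earlier status if it is
 * an error (negative errno style), otherwise return the later status.
 */
static int keep_first_error(int earlier, int later)
{
	return earlier < 0 ? earlier : later;
}

/* Small usage example: the earlier error is never overwritten. */
void keep_first_error_examples(void)
{
	assert(keep_first_error(-5, 0) == -5);	/* writeback failed, put ok */
	assert(keep_first_error(0, -2) == -2);	/* writeback ok, put failed */
	assert(keep_first_error(-5, -2) == -5);	/* both failed: first wins */
	assert(keep_first_error(0, 0) == 0);	/* both succeeded */
}
```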
Fixes: 89c58cb395ec ("fs/9p: fix error reporting in v9fs_dir_release")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman simon.horman@corigine.com
Signed-off-by: Dominique Martinet asmadeus@codewreck.org
Signed-off-by: Eric Van Hensbergen ericvh@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/9p/vfs_dir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index 45b684b7d8d7..4102759a5cb5 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -208,7 +208,7 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
 	struct p9_fid *fid;
 	__le32 version;
 	loff_t i_size;
-	int retval = 0;
+	int retval = 0, put_err;

 	fid = filp->private_data;
 	p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
@@ -221,7 +221,8 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
 		spin_lock(&inode->i_lock);
 		hlist_del(&fid->ilist);
 		spin_unlock(&inode->i_lock);
-		retval = p9_fid_put(fid);
+		put_err = p9_fid_put(fid);
+		retval = retval < 0 ? retval : put_err;
 	}

 	if ((filp->f_mode & FMODE_WRITE)) {
From: Eric Van Hensbergen ericvh@kernel.org
commit 75b396821cb71164dac3a1ad51dda4781ea8dbad upstream.
This removes an overly restrictive check for shared mappings that prevented read-only mmaps when writeback caching was not enabled.
Cc: stable@vger.kernel.org
Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes")
Reported-by: Robert Schwebel r.schwebel@pengutronix.de
Closes: https://lore.kernel.org/v9fs/ZK25XZ%2BGpR3KHIB%2F@pengutronix.de
Reviewed-by: Dominique Martinet asmadeus@codewreck.org
Reviewed-by: Christian Schoenebeck linux_oss@crudebyte.com
Signed-off-by: Eric Van Hensbergen ericvh@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/9p/vfs_file.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -483,9 +483,7 @@ v9fs_file_mmap(struct file *filp, struct
 	p9_debug(P9_DEBUG_MMAP, "filp :%p\n", filp);

 	if (!(v9ses->cache & CACHE_WRITEBACK)) {
-		p9_debug(P9_DEBUG_CACHE, "(no mmap mode)");
-		if (vma->vm_flags & VM_MAYSHARE)
-			return -ENODEV;
+		p9_debug(P9_DEBUG_CACHE, "(read-only mmap mode)");
 		invalidate_inode_pages2(filp->f_mapping);
 		return generic_file_readonly_mmap(filp, vma);
 	}
From: Eric Van Hensbergen ericvh@kernel.org
commit 878cb3e0337d7c3096aee301a2a3cd358dc8aa81 upstream.
There appears to be a typo in the comparison statement for the logic which sets a file's cache mode based on mount flags.
Cc: stable@vger.kernel.org
Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes")
Reviewed-by: Christian Schoenebeck linux_oss@crudebyte.com
Reviewed-by: Dominique Martinet asmadeus@codewreck.org
Signed-off-by: Eric Van Hensbergen ericvh@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/9p/fid.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -57,7 +57,7 @@ static inline void v9fs_fid_add_modes(st
 	   (s_flags & V9FS_DIRECT_IO) || (f_flags & O_DIRECT)) {
 		fid->mode |= P9L_DIRECT; /* no read or write cache */
 	} else if ((!(s_cache & CACHE_WRITEBACK)) ||
-		   (f_flags & O_DSYNC) | (s_flags & V9FS_SYNC)) {
+		   (f_flags & O_DSYNC) || (s_flags & V9FS_SYNC)) {
 		fid->mode |= P9L_NOWRITECACHE;
 	}
 }
From: Eric Van Hensbergen ericvh@kernel.org
commit 09430aba3a9ffd986834614a3406a13588170bde upstream.
Two flag parameters (s_flags and s_cache) of the file cache mode helper function were incorrectly declared with a signed type.
Cc: stable@vger.kernel.org
Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes")
Reviewed-by: Dominique Martinet asmadeus@codewreck.org
Signed-off-by: Eric Van Hensbergen ericvh@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/9p/fid.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index 297c2c377e3d..29281b7c3887 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -46,8 +46,8 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
  * NOTE: these are set after open so only reflect 9p client not
  * underlying file system on server.
  */
-static inline void v9fs_fid_add_modes(struct p9_fid *fid, int s_flags,
-	int s_cache, unsigned int f_flags)
+static inline void v9fs_fid_add_modes(struct p9_fid *fid, unsigned int s_flags,
+	unsigned int s_cache, unsigned int f_flags)
 {
 	if (fid->qid.type != P9_QTFILE)
 		return;
From: Eric Van Hensbergen ericvh@kernel.org
commit 350cd9b959757e7c571f45fab29d116d5f67cbff upstream.
An invalidate_inode_pages2() call was added to the read-only mmap path; it is unnecessary since that path is only entered when the writeback cache is disabled on mount.
Cc: stable@vger.kernel.org
Fixes: 1543b4c5071c ("fs/9p: remove writeback fid and fix per-file modes")
Reviewed-by: Christian Schoenebeck linux_oss@crudebyte.com
Signed-off-by: Eric Van Hensbergen ericvh@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/9p/vfs_file.c | 1 -
 1 file changed, 1 deletion(-)

--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -484,7 +484,6 @@ v9fs_file_mmap(struct file *filp, struct

 	if (!(v9ses->cache & CACHE_WRITEBACK)) {
 		p9_debug(P9_DEBUG_CACHE, "(read-only mmap mode)");
-		invalidate_inode_pages2(filp->f_mapping);
 		return generic_file_readonly_mmap(filp, vma);
 	}

From: Stefan Haberland sth@linux.ibm.com
commit 05f1d8ed03f547054efbc4d29bb7991c958ede95 upstream.
Quiesce and resume are functions that tell the DASD driver to stop/resume issuing I/Os to a specific DASD.
On resume, dasd_schedule_block_bh() is called to kick handling of I/O requests again. Unfortunately, this does not cover internal requests, which are used for path verification, for example.
This could lead to a hanging device when a path event or anything else that triggers internal requests occurs on a quiesced device.
Fix by also calling dasd_schedule_device_bh() which triggers handling of internal requests on resume.
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Haberland sth@linux.ibm.com
Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com
Link: https://lore.kernel.org/r/20230721193647.3889634-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/s390/block/dasd_ioctl.c | 1 +
 1 file changed, 1 insertion(+)

--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -131,6 +131,7 @@ static int dasd_ioctl_resume(struct dasd
 	spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);

 	dasd_schedule_block_bh(block);
+	dasd_schedule_device_bh(base);
 	return 0;
 }

From: Stefan Haberland sth@linux.ibm.com
commit 856d8e3c633b183df23549ce760ae84478a7098d upstream.
The DASD driver has certain types of requests that might be rejected by the storage server or z/VM because they are not supported. Since the missing support of the command is not a real issue there is no user visible kernel error message for this.
For copy pair setups there is a specific error that IO is not allowed on secondary devices. This error case is explicitly handled and an error message is printed.
The code checking for this error used a bitwise 'and', which tests for individual bits. In this case, however, the whole sense byte has to match.
As a result, the copy pair related error message is erroneously printed for other error cases that are usually not reported. This might heavily confuse users and lead to follow-on actions that might disrupt application processing.
Fix by checking the sense byte for the exact value and not single bits.
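The effect of the two comparisons can be illustrated with a small user-space sketch. The constant value 0x0e and the helper names below are assumptions made for illustration only; the authoritative definition of SNS7_INVALID_ON_SEC lives in the DASD driver headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only; see the driver for the real one. */
#define TOY_SNS7_INVALID_ON_SEC	0x0e

/* Old check: true whenever any of the bits in the constant is set. */
static bool toy_match_bitwise(uint8_t sense7)
{
	return (sense7 & TOY_SNS7_INVALID_ON_SEC) != 0;
}

/* New check: true only when the whole byte equals the constant. */
static bool toy_match_exact(uint8_t sense7)
{
	return sense7 == TOY_SNS7_INVALID_ON_SEC;
}

/* Small usage example: 0x02 shares one bit with the constant. */
void toy_sense_examples(void)
{
	assert(toy_match_bitwise(0x02));	/* false positive */
	assert(!toy_match_exact(0x02));
	assert(toy_match_exact(TOY_SNS7_INVALID_ON_SEC));
}
```

Any sense byte that shares at least one set bit with the constant passes the bitwise test, which is how unrelated rejections ended up printing the copy pair message.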
Cc: stable@vger.kernel.org # 6.1+
Fixes: 1fca631a1185 ("s390/dasd: suppress generic error messages for PPRC secondary devices")
Signed-off-by: Stefan Haberland sth@linux.ibm.com
Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com
Link: https://lore.kernel.org/r/20230721193647.3889634-5-sth@linux.ibm.com
Signed-off-by: Jens Axboe axboe@kernel.dk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/s390/block/dasd_3990_erp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/s390/block/dasd_3990_erp.c
+++ b/drivers/s390/block/dasd_3990_erp.c
@@ -1050,7 +1050,7 @@ dasd_3990_erp_com_rej(struct dasd_ccw_re
 		dev_err(&device->cdev->dev, "An I/O request was rejected"
 			" because writing is inhibited\n");
 		erp = dasd_3990_erp_cleanup(erp, DASD_CQR_FAILED);
-	} else if (sense[7] & SNS7_INVALID_ON_SEC) {
+	} else if (sense[7] == SNS7_INVALID_ON_SEC) {
 		dev_err(&device->cdev->dev, "An I/O request was rejected on a copy pair secondary device\n");
 		/* suppress dump of sense data for this error */
 		set_bit(DASD_CQR_SUPPRESS_CR, &erp->refers->flags);
From: Paolo Abeni pabeni@redhat.com
commit 21d9b73a7d5241905367098d260a3c68b811da32 upstream.
Currently the mptcp code generates a "new listener" event even if the actual listen() syscall fails. Address the issue by moving the event generation call under the successful branch.
Cc: stable@vger.kernel.org
Fixes: f8c9dfbd875b ("mptcp: add pm listener events")
Reviewed-by: Mat Martineau martineau@kernel.org
Signed-off-by: Paolo Abeni pabeni@redhat.com
Signed-off-by: Mat Martineau martineau@kernel.org
Link: https://lore.kernel.org/r/20230725-send-net-20230725-v1-2-6f60fe7137a9@kerne...
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/mptcp/protocol.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3717,10 +3717,9 @@ static int mptcp_listen(struct socket *s
 	if (!err) {
 		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 		mptcp_copy_inaddrs(sk, ssock->sk);
+		mptcp_event_pm_listener(ssock->sk, MPTCP_EVENT_LISTENER_CREATED);
 	}

-	mptcp_event_pm_listener(ssock->sk, MPTCP_EVENT_LISTENER_CREATED);
-
 unlock:
 	release_sock(sk);
 	return err;
From: Mark Brown broonie@kernel.org
commit f061e2be8689057cb4ec0dbffa9f03e1a23cdcb2 upstream.
The WM8904_ADC_TEST_0 register is modified as part of updating the OSR controls but does not have a cache default, leading to errors when we try to modify these controls in cache only mode with no prior read:
wm8904 3-001a: ASoC: error at snd_soc_component_update_bits on wm8904.3-001a for register: [0x000000c6] -16
Add a read of the register to probe() to fill the cache and avoid both the error messages and the misconfiguration of the chip which will result.
Acked-by: Charles Keepax ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown broonie@kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230723-asoc-fix-wm8904-adc-test-read-v1-1-2cdf2e...
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/soc/codecs/wm8904.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/sound/soc/codecs/wm8904.c
+++ b/sound/soc/codecs/wm8904.c
@@ -2308,6 +2308,9 @@ static int wm8904_i2c_probe(struct i2c_c
 	regmap_update_bits(wm8904->regmap, WM8904_BIAS_CONTROL_0,
 			    WM8904_POBCTRL, 0);

+	/* Fill the cache for the ADC test register */
+	regmap_read(wm8904->regmap, WM8904_ADC_TEST_0, &val);
+
 	/* Can leave the device powered off until we need it */
 	regcache_cache_only(wm8904->regmap, true);
 	regulator_bulk_disable(ARRAY_SIZE(wm8904->supplies), wm8904->supplies);
From: Mark Brown broonie@kernel.org
commit 05d881b85b48c7ac6a7c92ce00aa916c4a84d052 upstream.
As part of fixing the allocation of the buffer for SVE state when changing the SME vector length, we introduced an immediate reallocation of the SVE state; this is also done when changing the SVE vector length for consistency. Unfortunately, this reallocation is done prior to writing the new vector length to the task struct, meaning the allocation is done with the old vector length and can lead to memory corruption due to an undersized buffer being used.
Move the update of the vector length before the allocation to ensure that the new vector length is taken into account.
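The ordering problem can be sketched with a toy user-space model. Everything below (the struct, the helper names and the size formula) is hypothetical and heavily simplified; it is not the arm64 fpsimd code, it only shows why sizing a buffer before updating the length field yields an undersized buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BYTES_PER_VL	2	/* arbitrary scale factor for the sketch */

struct toy_task {
	unsigned int vl;	/* current vector length */
	void *state;		/* state buffer sized from vl */
	size_t state_size;
};

/* Reallocate the state buffer from whatever vl the task currently holds. */
static int toy_realloc_state(struct toy_task *t)
{
	free(t->state);
	t->state_size = (size_t)t->vl * TOY_BYTES_PER_VL;
	t->state = calloc(1, t->state_size);
	return t->state ? 0 : -1;
}

/* Old order: the buffer is sized from the old vl, then vl is updated. */
static int toy_set_vl_old(struct toy_task *t, unsigned int vl)
{
	int ret = toy_realloc_state(t);

	t->vl = vl;
	return ret;
}

/* Fixed order: vl is updated first, so the buffer matches the new vl. */
static int toy_set_vl_new(struct toy_task *t, unsigned int vl)
{
	t->vl = vl;
	return toy_realloc_state(t);
}

/* Small usage example comparing the two orderings. */
void toy_vl_examples(void)
{
	struct toy_task a = { .vl = 16 };
	struct toy_task b = { .vl = 16 };

	assert(toy_set_vl_old(&a, 64) == 0);
	assert(toy_set_vl_new(&b, 64) == 0);
	assert(a.state_size < b.state_size);	/* old order: undersized */
	free(a.state);
	free(b.state);
}
```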
For some reason this isn't triggering any problems when running tests on the arm64 fixes branch (even after repeated tries) but is triggering issues very often after merge into mainline.
Fixes: d4d5be94a878 ("arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes")
Signed-off-by: Mark Brown broonie@kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230726-arm64-fix-sme-fix-v1-1-7752ec58af27@kerne...
Signed-off-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/arm64/kernel/fpsimd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -917,6 +917,8 @@ int vec_set_vector_length(struct task_st
 	if (task == current)
 		put_cpu_fpsimd_context();

+	task_set_vl(task, type, vl);
+
 	/*
 	 * Free the changed states if they are not in use, SME will be
 	 * reallocated to the correct size on next use and we just
@@ -931,8 +933,6 @@ int vec_set_vector_length(struct task_st
 	if (free_sme)
 		sme_free(task);

-	task_set_vl(task, type, vl);
-
 out:
 	update_tsk_thread_flag(task, vec_vl_inherit_flag(type),
 			       flags & PR_SVE_VL_INHERIT);
From: Johan Hovold johan+linaro@kernel.org
commit 8527beb12087238d4387607597b4020bc393c4b4 upstream.
The decision whether to enable a wake irq during suspend can not be done based on the runtime PM state directly as a driver may use wake irqs without implementing runtime PM. Such drivers specifically leave the state set to the default 'suspended' and the wake irq is thus never enabled at suspend.
Add a new wake irq flag to track whether a dedicated wake irq has been enabled at runtime suspend and therefore must not be enabled at system suspend.
Note that pm_runtime_enabled() can not be used as runtime PM is always disabled during late suspend.
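The bookkeeping described above can be sketched as a toy state machine. The names and the counter are hypothetical stand-ins (the counter plays the role of the IRQ enable depth); this is not the code in drivers/base/power/wakeirq.c, only an illustration of why a dedicated flag works for drivers that never use runtime PM.

```c
#include <assert.h>

#define TOY_WAKE_IRQ_ALLOCATED	(1u << 0)
#define TOY_WAKE_IRQ_ENABLED	(1u << 3)

struct toy_wake_irq {
	unsigned int status;
	int enable_count;	/* stands in for the IRQ enable depth */
};

/* Runtime suspend path: enable the IRQ and remember that we did. */
static void toy_runtime_enable(struct toy_wake_irq *w)
{
	w->enable_count++;
	w->status |= TOY_WAKE_IRQ_ENABLED;
}

/* System suspend: enable only if runtime suspend has not already done so. */
static void toy_arm(struct toy_wake_irq *w)
{
	if ((w->status & TOY_WAKE_IRQ_ALLOCATED) &&
	    !(w->status & TOY_WAKE_IRQ_ENABLED))
		w->enable_count++;
}

/* System resume: undo exactly what toy_arm() did. */
static void toy_disarm(struct toy_wake_irq *w)
{
	if ((w->status & TOY_WAKE_IRQ_ALLOCATED) &&
	    !(w->status & TOY_WAKE_IRQ_ENABLED))
		w->enable_count--;
}

/* Small usage example for a driver that never uses runtime PM. */
void toy_wake_irq_examples(void)
{
	struct toy_wake_irq w = { .status = TOY_WAKE_IRQ_ALLOCATED };

	toy_arm(&w);
	assert(w.enable_count == 1);	/* IRQ is enabled at system suspend */
	toy_disarm(&w);
	assert(w.enable_count == 0);
}
```

A device whose wake IRQ was already enabled through `toy_runtime_enable()` keeps a count of 1 across `toy_arm()` and `toy_disarm()`, so the enable and disable calls stay balanced in both cases.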
Fixes: 69728051f5bf ("PM / wakeirq: Fix unbalanced IRQ enable for wakeirq")
Cc: 4.16+ stable@vger.kernel.org # 4.16+
Signed-off-by: Johan Hovold johan+linaro@kernel.org
Reviewed-by: Tony Lindgren tony@atomide.com
Tested-by: Tony Lindgren tony@atomide.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/base/power/power.h   |  1 +
 drivers/base/power/wakeirq.c | 12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

--- a/drivers/base/power/power.h
+++ b/drivers/base/power/power.h
@@ -29,6 +29,7 @@ extern u64 pm_runtime_active_time(struct
 #define WAKE_IRQ_DEDICATED_MASK		(WAKE_IRQ_DEDICATED_ALLOCATED | \
 					 WAKE_IRQ_DEDICATED_MANAGED | \
 					 WAKE_IRQ_DEDICATED_REVERSE)
+#define WAKE_IRQ_DEDICATED_ENABLED	BIT(3)

 struct wake_irq {
 	struct device *dev;
--- a/drivers/base/power/wakeirq.c
+++ b/drivers/base/power/wakeirq.c
@@ -314,8 +314,10 @@ void dev_pm_enable_wake_irq_check(struct
 		return;

 enable:
-	if (!can_change_status || !(wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
+	if (!can_change_status || !(wirq->status & WAKE_IRQ_DEDICATED_REVERSE)) {
 		enable_irq(wirq->irq);
+		wirq->status |= WAKE_IRQ_DEDICATED_ENABLED;
+	}
 }

 /**
@@ -336,8 +338,10 @@ void dev_pm_disable_wake_irq_check(struc
 	if (cond_disable && (wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
 		return;

-	if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED)
+	if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED) {
+		wirq->status &= ~WAKE_IRQ_DEDICATED_ENABLED;
 		disable_irq_nosync(wirq->irq);
+	}
 }

 /**
@@ -376,7 +380,7 @@ void dev_pm_arm_wake_irq(struct wake_irq

 	if (device_may_wakeup(wirq->dev)) {
 		if (wirq->status & WAKE_IRQ_DEDICATED_ALLOCATED &&
-		    !pm_runtime_status_suspended(wirq->dev))
+		    !(wirq->status & WAKE_IRQ_DEDICATED_ENABLED))
 			enable_irq(wirq->irq);

 		enable_irq_wake(wirq->irq);
@@ -399,7 +403,7 @@ void dev_pm_disarm_wake_irq(struct wake_
 		disable_irq_wake(wirq->irq);

 		if (wirq->status & WAKE_IRQ_DEDICATED_ALLOCATED &&
-		    !pm_runtime_status_suspended(wirq->dev))
+		    !(wirq->status & WAKE_IRQ_DEDICATED_ENABLED))
 			disable_irq_nosync(wirq->irq);
 	}
 }
From: Ahmad Fatoum a.fatoum@pengutronix.de
commit ac4436a5b20e0ef1f608a9ef46c08d5d142f8da6 upstream.
Since commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure"), thermal_zone_device_register() allocates a copy of the tzp argument and frees it when unregistering, so thermal_of_zone_register() now ends up leaking its original tzp and double-freeing the tzp copy. Fix this by locating tzp on stack instead.
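The ownership rule behind this fix can be sketched in plain C. The types and function names below are hypothetical stand-ins, not the thermal core API: the point is only that once the registration function copies its parameter block and owns the copy, the caller can keep the original on the stack and never has to free it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_params {
	int slope;
	int offset;
};

struct toy_zone {
	struct toy_params *params;	/* private copy owned by the zone */
};

/* Registration copies *p, so the caller keeps ownership of the original. */
static struct toy_zone *toy_zone_register(const struct toy_params *p)
{
	struct toy_zone *z = malloc(sizeof(*z));

	if (!z)
		return NULL;
	z->params = malloc(sizeof(*z->params));
	if (!z->params) {
		free(z);
		return NULL;
	}
	memcpy(z->params, p, sizeof(*p));
	return z;
}

/* Unregistration frees the copy exactly once. */
static void toy_zone_unregister(struct toy_zone *z)
{
	free(z->params);
	free(z);
}

/* Caller keeps the parameters on the stack: nothing to leak or double-free. */
static struct toy_zone *toy_of_zone_register(int slope, int offset)
{
	struct toy_params p = { .slope = slope, .offset = offset };

	return toy_zone_register(&p);
}

/* Small usage example: the zone still sees the values after the caller returns. */
void toy_zone_examples(void)
{
	struct toy_zone *z = toy_of_zone_register(3, 7);

	assert(z && z->params->slope == 3 && z->params->offset == 7);
	toy_zone_unregister(z);
}
```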
Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure") Signed-off-by: Ahmad Fatoum a.fatoum@pengutronix.de Acked-by: Daniel Lezcano daniel.lezcano@linaro.org Cc: 6.4+ stable@vger.kernel.org # 6.4+: 8bcbb18c61d6: thermal: core: constify params in thermal_zone_device_register Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thermal/thermal_of.c | 27 ++++++--------------------- 1 file changed, 6 insertions(+), 21 deletions(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c index 6fb14e521197..bc07ae1c284c 100644 --- a/drivers/thermal/thermal_of.c +++ b/drivers/thermal/thermal_of.c @@ -238,17 +238,13 @@ static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdel return 0; }
-static struct thermal_zone_params *thermal_of_parameters_init(struct device_node *np) +static void thermal_of_parameters_init(struct device_node *np, + struct thermal_zone_params *tzp) { - struct thermal_zone_params *tzp; int coef[2]; int ncoef = ARRAY_SIZE(coef); int prop, ret;
- tzp = kzalloc(sizeof(*tzp), GFP_KERNEL); - if (!tzp) - return ERR_PTR(-ENOMEM); - tzp->no_hwmon = true;
if (!of_property_read_u32(np, "sustainable-power", &prop)) @@ -267,8 +263,6 @@ static struct thermal_zone_params *thermal_of_parameters_init(struct device_node
tzp->slope = coef[0]; tzp->offset = coef[1]; - - return tzp; }
static struct device_node *thermal_of_zone_get_by_name(struct thermal_zone_device *tz) @@ -442,13 +436,11 @@ static int thermal_of_unbind(struct thermal_zone_device *tz, static void thermal_of_zone_unregister(struct thermal_zone_device *tz) { struct thermal_trip *trips = tz->trips; - struct thermal_zone_params *tzp = tz->tzp; struct thermal_zone_device_ops *ops = tz->ops;
thermal_zone_device_disable(tz); thermal_zone_device_unregister(tz); kfree(trips); - kfree(tzp); kfree(ops); }
@@ -477,7 +469,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node * { struct thermal_zone_device *tz; struct thermal_trip *trips; - struct thermal_zone_params *tzp; + struct thermal_zone_params tzp = {}; struct thermal_zone_device_ops *of_ops; struct device_node *np; int delay, pdelay; @@ -509,12 +501,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node * goto out_kfree_trips; }
- tzp = thermal_of_parameters_init(np); - if (IS_ERR(tzp)) { - ret = PTR_ERR(tzp); - pr_err("Failed to initialize parameter from %pOFn: %d\n", np, ret); - goto out_kfree_trips; - } + thermal_of_parameters_init(np, &tzp);
of_ops->bind = thermal_of_bind; of_ops->unbind = thermal_of_unbind; @@ -522,12 +509,12 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node * mask = GENMASK_ULL((ntrips) - 1, 0);
tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips, - mask, data, of_ops, tzp, + mask, data, of_ops, &tzp, pdelay, delay); if (IS_ERR(tz)) { ret = PTR_ERR(tz); pr_err("Failed to register thermal zone %pOFn: %d\n", np, ret); - goto out_kfree_tzp; + goto out_kfree_trips; }
ret = thermal_zone_device_enable(tz); @@ -540,8 +527,6 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
return tz;
-out_kfree_tzp: - kfree(tzp); out_kfree_trips: kfree(trips); out_kfree_of_ops:
From: Xiubo Li xiubli@redhat.com
commit 50164507f6b7b7ed85d8c3ac0266849fbd908db7 upstream.
Even when 'disable_send_metrics' is true, opening a session always triggers sending the metrics for the first time.
Cc: stable@vger.kernel.org
Signed-off-by: Xiubo Li xiubli@redhat.com
Reviewed-by: Venky Shankar vshankar@redhat.com
Reviewed-by: Jeff Layton jlayton@kernel.org
Signed-off-by: Ilya Dryomov idryomov@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/ceph/metric.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ceph/metric.c
+++ b/fs/ceph/metric.c
@@ -208,7 +208,7 @@ static void metric_delayed_work(struct w
 	struct ceph_mds_client *mdsc =
 		container_of(m, struct ceph_mds_client, metric);

-	if (mdsc->stopping)
+	if (mdsc->stopping || disable_send_metrics)
 		return;

 	if (!m->session || !check_session_state(m->session)) {
From: Radhakrishna Sripada radhakrishna.sripada@intel.com
commit 3844ed5e78823eebb5f0f1edefc403310693d402 upstream.
DPT objects that are created as internal objects get evicted when there is memory pressure and are not restored when pinned during scanout. The pinned page table entries then look corrupted, and programming the display engine with the incorrect PTEs results in the DE throwing pipe faults.
Create DPT objects from shmem and mark the object as dirty when pinning so that the object is restored when the shrinker evicts an unpinned buffer object.
v2: Unconditionally mark the dpt objects dirty during pinning (Chris).
Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt")
Cc: stable@vger.kernel.org # v6.0+
Cc: Ville Syrjälä ville.syrjala@linux.intel.com
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com
Suggested-by: Chris Wilson chris.p.wilson@intel.com
Signed-off-by: Fei Yang fei.yang@intel.com
Signed-off-by: Radhakrishna Sripada radhakrishna.sripada@intel.com
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20230718225118.2562132-1-radha...
(cherry picked from commit e91a777a6e602ba0e3366e053e4e094a334a1244)
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/i915/display/intel_dpt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -166,6 +166,8 @@ struct i915_vma *intel_dpt_pin(struct i9
 		i915_vma_get(vma);
 	}

+	dpt->obj->mm.dirty = true;
+
 	atomic_dec(&i915->gpu_error.pending_fb_pin);
 	intel_runtime_pm_put(&i915->runtime_pm, wakeref);

@@ -261,7 +263,7 @@ intel_dpt_create(struct intel_framebuffe
 		dpt_obj = i915_gem_object_create_stolen(i915, size);
 	if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
 		drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
-		dpt_obj = i915_gem_object_create_internal(i915, size);
+		dpt_obj = i915_gem_object_create_shmem(i915, size);
 	}
 	if (IS_ERR(dpt_obj))
 		return ERR_CAST(dpt_obj);
From: Joe Thornber ejt@redhat.com
commit 1e4ab7b4c881cf26c1c72b3f56519e03475486fb upstream.
When using the cleaner policy to decommission the cache, no writeback is ever started from the cache because it is constantly delayed by normal I/O keeping the device busy, meaning @idle=false was always being passed to clean_target_met().
Fix this by adding a specific 'cleaner' flag that is set when the cleaner policy is configured. This flag serves to always allow the cleaner's writeback work to be queued until the cache is decommissioned (even if the cache isn't idle).
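A simplified model of the decision makes the effect of the flag visible. The struct and function below are hypothetical and reduce the policy to a dirty-block counter; the real clean_target_met() in drivers/md/dm-cache-policy-smq.c works on the policy's queues.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the policy state; not the real struct smq_policy. */
struct toy_policy {
	bool cleaner;		/* set when the cleaner policy is configured */
	unsigned int nr_dirty;	/* number of dirty cache blocks */
};

/*
 * Returns true when no more writeback needs to be queued. When the device
 * is busy and the cleaner flag is clear, cleaning is not attempted at all.
 */
static bool toy_clean_target_met(const struct toy_policy *p, bool idle)
{
	if (idle || p->cleaner)
		return p->nr_dirty == 0;	/* we'd like to clean everything */

	return true;
}

/* Small usage example: a busy device with five dirty blocks. */
void toy_cleaner_examples(void)
{
	struct toy_policy smq = { .cleaner = false, .nr_dirty = 5 };
	struct toy_policy cleaner = { .cleaner = true, .nr_dirty = 5 };

	assert(toy_clean_target_met(&smq, false));	/* writeback deferred */
	assert(!toy_clean_target_met(&cleaner, false));	/* writeback queued */
}
```

Without the flag, a constantly busy device always reports the target as met, so the dirty count never reaches zero and the cache cannot be decommissioned.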
Reported-by: David Jeffery djeffery@redhat.com Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2") Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber ejt@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-cache-policy-smq.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/drivers/md/dm-cache-policy-smq.c +++ b/drivers/md/dm-cache-policy-smq.c @@ -857,7 +857,13 @@ struct smq_policy {
struct background_tracker *bg_work;
- bool migrations_allowed; + bool migrations_allowed:1; + + /* + * If this is set the policy will try and clean the whole cache + * even if the device is not idle. + */ + bool cleaner:1; };
/*----------------------------------------------------------------*/ @@ -1138,7 +1144,7 @@ static bool clean_target_met(struct smq_ * Cache entries may not be populated. So we cannot rely on the * size of the clean queue. */ - if (idle) { + if (idle || mq->cleaner) { /* * We'd like to clean everything. */ @@ -1722,11 +1728,9 @@ static void calc_hotspot_params(sector_t *hotspot_block_size /= 2u; }
-static struct dm_cache_policy *__smq_create(dm_cblock_t cache_size, - sector_t origin_size, - sector_t cache_block_size, - bool mimic_mq, - bool migrations_allowed) +static struct dm_cache_policy * +__smq_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size, + bool mimic_mq, bool migrations_allowed, bool cleaner) { unsigned int i; unsigned int nr_sentinels_per_queue = 2u * NR_CACHE_LEVELS; @@ -1813,6 +1817,7 @@ static struct dm_cache_policy *__smq_cre goto bad_btracker;
mq->migrations_allowed = migrations_allowed; + mq->cleaner = cleaner;
return &mq->policy;
@@ -1836,21 +1841,24 @@ static struct dm_cache_policy *smq_creat sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, false, true); + return __smq_create(cache_size, origin_size, cache_block_size, + false, true, false); }
static struct dm_cache_policy *mq_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, true, true); + return __smq_create(cache_size, origin_size, cache_block_size, + true, true, false); }
static struct dm_cache_policy *cleaner_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size) { - return __smq_create(cache_size, origin_size, cache_block_size, false, false); + return __smq_create(cache_size, origin_size, cache_block_size, + false, false, true); }
/*----------------------------------------------------------------*/
From: Ilya Dryomov idryomov@gmail.com
commit f38cb9d9c2045dad16eead4a2e1aedfddd94603b upstream.
Make the "num_lockers can be only 0 or 1" assumption explicit and simplify the API by getting rid of output parameters in preparation for calling get_lock_owner_info() twice before blocklisting.
Signed-off-by: Ilya Dryomov idryomov@gmail.com
Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/block/rbd.c | 84 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 51 insertions(+), 33 deletions(-)
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3849,10 +3849,17 @@ static void wake_lock_waiters(struct rbd
 	list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list);
 }
 
-static int get_lock_owner_info(struct rbd_device *rbd_dev,
-			       struct ceph_locker **lockers, u32 *num_lockers)
+static void free_locker(struct ceph_locker *locker)
+{
+	if (locker)
+		ceph_free_lockers(locker, 1);
+}
+
+static struct ceph_locker *get_lock_owner_info(struct rbd_device *rbd_dev)
 {
 	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
+	struct ceph_locker *lockers;
+	u32 num_lockers;
 	u8 lock_type;
 	char *lock_tag;
 	int ret;
@@ -3861,39 +3868,45 @@ static int get_lock_owner_info(struct rb
 
 	ret = ceph_cls_lock_info(osdc, &rbd_dev->header_oid,
 				 &rbd_dev->header_oloc, RBD_LOCK_NAME,
-				 &lock_type, &lock_tag, lockers, num_lockers);
-	if (ret)
-		return ret;
+				 &lock_type, &lock_tag, &lockers, &num_lockers);
+	if (ret) {
+		rbd_warn(rbd_dev, "failed to retrieve lockers: %d", ret);
+		return ERR_PTR(ret);
+	}
 
-	if (*num_lockers == 0) {
+	if (num_lockers == 0) {
 		dout("%s rbd_dev %p no lockers detected\n", __func__, rbd_dev);
+		lockers = NULL;
 		goto out;
 	}
 
 	if (strcmp(lock_tag, RBD_LOCK_TAG)) {
 		rbd_warn(rbd_dev, "locked by external mechanism, tag %s",
 			 lock_tag);
-		ret = -EBUSY;
-		goto out;
+		goto err_busy;
 	}
 
 	if (lock_type == CEPH_CLS_LOCK_SHARED) {
 		rbd_warn(rbd_dev, "shared lock type detected");
-		ret = -EBUSY;
-		goto out;
+		goto err_busy;
 	}
 
-	if (strncmp((*lockers)[0].id.cookie, RBD_LOCK_COOKIE_PREFIX,
+	WARN_ON(num_lockers != 1);
+	if (strncmp(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX,
 		    strlen(RBD_LOCK_COOKIE_PREFIX))) {
 		rbd_warn(rbd_dev, "locked by external mechanism, cookie %s",
-			 (*lockers)[0].id.cookie);
-		ret = -EBUSY;
-		goto out;
+			 lockers[0].id.cookie);
+		goto err_busy;
 	}
 
 out:
 	kfree(lock_tag);
-	return ret;
+	return lockers;
+
+err_busy:
+	kfree(lock_tag);
+	ceph_free_lockers(lockers, num_lockers);
+	return ERR_PTR(-EBUSY);
 }
 
 static int find_watcher(struct rbd_device *rbd_dev,
@@ -3947,51 +3960,56 @@ out:
 static int rbd_try_lock(struct rbd_device *rbd_dev)
 {
 	struct ceph_client *client = rbd_dev->rbd_client->client;
-	struct ceph_locker *lockers;
-	u32 num_lockers;
+	struct ceph_locker *locker;
 	int ret;
 
 	for (;;) {
+		locker = NULL;
+
 		ret = rbd_lock(rbd_dev);
 		if (ret != -EBUSY)
-			return ret;
+			goto out;
 
 		/* determine if the current lock holder is still alive */
-		ret = get_lock_owner_info(rbd_dev, &lockers, &num_lockers);
-		if (ret)
-			return ret;
-
-		if (num_lockers == 0)
+		locker = get_lock_owner_info(rbd_dev);
+		if (IS_ERR(locker)) {
+			ret = PTR_ERR(locker);
+			locker = NULL;
+			goto out;
+		}
+		if (!locker)
 			goto again;
 
-		ret = find_watcher(rbd_dev, lockers);
+		ret = find_watcher(rbd_dev, locker);
 		if (ret)
 			goto out; /* request lock or error */
 
 		rbd_warn(rbd_dev, "breaking header lock owned by %s%llu",
-			 ENTITY_NAME(lockers[0].id.name));
+			 ENTITY_NAME(locker->id.name));
 
 		ret = ceph_monc_blocklist_add(&client->monc,
-					      &lockers[0].info.addr);
+					      &locker->info.addr);
 		if (ret) {
-			rbd_warn(rbd_dev, "blocklist of %s%llu failed: %d",
-				 ENTITY_NAME(lockers[0].id.name), ret);
+			rbd_warn(rbd_dev, "failed to blocklist %s%llu: %d",
+				 ENTITY_NAME(locker->id.name), ret);
 			goto out;
 		}
 
 		ret = ceph_cls_break_lock(&client->osdc, &rbd_dev->header_oid,
 					  &rbd_dev->header_oloc, RBD_LOCK_NAME,
-					  lockers[0].id.cookie,
-					  &lockers[0].id.name);
-		if (ret && ret != -ENOENT)
+					  locker->id.cookie, &locker->id.name);
+		if (ret && ret != -ENOENT) {
+			rbd_warn(rbd_dev, "failed to break header lock: %d",
+				 ret);
 			goto out;
+		}
 
 again:
-		ceph_free_lockers(lockers, num_lockers);
+		free_locker(locker);
 	}
 
 out:
-	ceph_free_lockers(lockers, num_lockers);
+	free_locker(locker);
 	return ret;
 }
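The calling convention this patch moves to, where get_lock_owner_info() returns a valid pointer, NULL (no lockers) or an ERR_PTR()-encoded errno, can be sketched in userspace C. This is only an illustration: the ERR_PTR()/IS_ERR()/PTR_ERR() helpers below are simplified stand-ins for the kernel's <linux/err.h> versions, and struct locker, lookup_owner() and classify() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct ceph_locker. */
struct locker {
	const char *cookie;
};

/*
 * Hypothetical lookup with the same three outcomes as the patched
 * get_lock_owner_info(): a locker, NULL (nobody holds the lock), or an
 * ERR_PTR() carrying a negative errno.
 */
static struct locker *lookup_owner(struct locker *held, int err)
{
	assert(err <= 0 && err >= -MAX_ERRNO);
	if (err)
		return ERR_PTR(err);
	return held;		/* may be NULL: no lockers detected */
}

/* Caller-side handling, mirroring the order of checks in rbd_try_lock(). */
static int classify(struct locker *locker)
{
	if (IS_ERR(locker))
		return (int)PTR_ERR(locker);	/* propagate the error */
	if (!locker)
		return 0;			/* lock is free, retry */
	return 1;				/* a single owner to examine */
}
```

The order of the checks matters: IS_ERR() has to come before the NULL test, because an error pointer is non-NULL and would otherwise be treated as a valid locker.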
From: Ilya Dryomov idryomov@gmail.com
commit 8ff2c64c9765446c3cef804fb99da04916603e27 upstream.
- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting
  invalid handles
- bail if locker has a blank address

Signed-off-by: Ilya Dryomov idryomov@gmail.com
Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/block/rbd.c | 21 +++++++++++++++------
 net/ceph/messenger.c | 1 +
 2 files changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3862,10 +3862,9 @@ static struct ceph_locker *get_lock_owne
 	u32 num_lockers;
 	u8 lock_type;
 	char *lock_tag;
+	u64 handle;
 	int ret;
 
-	dout("%s rbd_dev %p\n", __func__, rbd_dev);
-
 	ret = ceph_cls_lock_info(osdc, &rbd_dev->header_oid,
 				 &rbd_dev->header_oloc, RBD_LOCK_NAME,
 				 &lock_type, &lock_tag, &lockers, &num_lockers);
@@ -3886,18 +3885,28 @@ static struct ceph_locker *get_lock_owne
 		goto err_busy;
 	}
 
-	if (lock_type == CEPH_CLS_LOCK_SHARED) {
-		rbd_warn(rbd_dev, "shared lock type detected");
+	if (lock_type != CEPH_CLS_LOCK_EXCLUSIVE) {
+		rbd_warn(rbd_dev, "incompatible lock type detected");
 		goto err_busy;
 	}
 
 	WARN_ON(num_lockers != 1);
-	if (strncmp(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX,
-		    strlen(RBD_LOCK_COOKIE_PREFIX))) {
+	ret = sscanf(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX " %llu",
+		     &handle);
+	if (ret != 1) {
 		rbd_warn(rbd_dev, "locked by external mechanism, cookie %s",
 			 lockers[0].id.cookie);
 		goto err_busy;
 	}
+	if (ceph_addr_is_blank(&lockers[0].info.addr)) {
+		rbd_warn(rbd_dev, "locker has a blank address");
+		goto err_busy;
+	}
+
+	dout("%s rbd_dev %p got locker %s%llu@%pISpc/%u handle %llu\n",
+	     __func__, rbd_dev, ENTITY_NAME(lockers[0].id.name),
+	     &lockers[0].info.addr.in_addr,
+	     le32_to_cpu(lockers[0].info.addr.nonce), handle);
 
 out:
 	kfree(lock_tag);
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1123,6 +1123,7 @@ bool ceph_addr_is_blank(const struct cep
 		return true;
 	}
 }
+EXPORT_SYMBOL(ceph_addr_is_blank);
 
 int ceph_addr_port(const struct ceph_entity_addr *addr)
 {
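The difference between the old prefix comparison and the sscanf() parse can be seen in a small userspace sketch. It assumes the cookie prefix is the string "auto" (the patch only shows the RBD_LOCK_COOKIE_PREFIX macro name), uses the C library's sscanf() rather than the kernel's, and the two function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed value; the patch only refers to RBD_LOCK_COOKIE_PREFIX. */
#define LOCK_COOKIE_PREFIX "auto"

/* Old-style check: accepts anything that merely starts with the prefix. */
static int cookie_has_prefix(const char *cookie)
{
	assert(cookie != NULL);
	return strncmp(cookie, LOCK_COOKIE_PREFIX,
		       strlen(LOCK_COOKIE_PREFIX)) == 0;
}

/*
 * New-style check: the prefix must be followed by a parsable number.
 * Returns 1 and stores the handle on success, 0 otherwise.
 */
static int cookie_parse_handle(const char *cookie, unsigned long long *handle)
{
	assert(cookie != NULL && handle != NULL);
	return sscanf(cookie, LOCK_COOKIE_PREFIX " %llu", handle) == 1;
}
```

A cookie such as "automatic" passes the prefix comparison but fails the parse, which is the kind of invalid handle the commit message wants to stop admitting.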
From: Ilya Dryomov idryomov@gmail.com
commit 588159009d5b7a09c3e5904cffddbe4a4e170301 upstream.
An attempt to acquire exclusive lock can race with the current lock owner closing the image:
1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted
Particularly impacted is mirror snapshot scheduler in snapshot-based mirroring since it happens to open and close images a lot (images are opened only for as long as it takes to take the next mirror snapshot, the same client instance is used for all images).
To reduce the potential for erroneous blocklisting, retrieve the lock owner again after find_watcher() returns 0. If it's still there, make sure it matches the previously detected lock owner.
Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL
Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov idryomov@gmail.com
Reviewed-by: Dongsheng Yang dongsheng.yang@easystack.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/block/rbd.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3849,6 +3849,15 @@ static void wake_lock_waiters(struct rbd
 	list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list);
 }
 
+static bool locker_equal(const struct ceph_locker *lhs,
+			 const struct ceph_locker *rhs)
+{
+	return lhs->id.name.type == rhs->id.name.type &&
+	       lhs->id.name.num == rhs->id.name.num &&
+	       !strcmp(lhs->id.cookie, rhs->id.cookie) &&
+	       ceph_addr_equal_no_type(&lhs->info.addr, &rhs->info.addr);
+}
+
 static void free_locker(struct ceph_locker *locker)
 {
 	if (locker)
@@ -3969,11 +3978,11 @@ out:
 static int rbd_try_lock(struct rbd_device *rbd_dev)
 {
 	struct ceph_client *client = rbd_dev->rbd_client->client;
-	struct ceph_locker *locker;
+	struct ceph_locker *locker, *refreshed_locker;
 	int ret;
 
 	for (;;) {
-		locker = NULL;
+		locker = refreshed_locker = NULL;
 
 		ret = rbd_lock(rbd_dev);
 		if (ret != -EBUSY)
@@ -3993,6 +4002,16 @@ static int rbd_try_lock(struct rbd_devic
 		if (ret)
 			goto out; /* request lock or error */
 
+		refreshed_locker = get_lock_owner_info(rbd_dev);
+		if (IS_ERR(refreshed_locker)) {
+			ret = PTR_ERR(refreshed_locker);
+			refreshed_locker = NULL;
+			goto out;
+		}
+		if (!refreshed_locker ||
+		    !locker_equal(locker, refreshed_locker))
+			goto again;
+
 		rbd_warn(rbd_dev, "breaking header lock owned by %s%llu",
 			 ENTITY_NAME(locker->id.name));
 
@@ -4014,10 +4033,12 @@ static int rbd_try_lock(struct rbd_devic
 		}
 
 again:
+		free_locker(refreshed_locker);
 		free_locker(locker);
 	}
 
 out:
+	free_locker(refreshed_locker);
 	free_locker(locker);
 	return ret;
 }
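The re-check that this patch adds between find_watcher() and the blocklist call can be modelled in userspace roughly as follows. The struct is a hypothetical, flattened stand-in for struct ceph_locker, the address is compared as plain text instead of with ceph_addr_equal_no_type(), and should_blocklist() is an illustrative name, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened stand-in for struct ceph_locker. */
struct locker {
	int type;			/* entity type */
	unsigned long long num;		/* entity number, e.g. 123 for client123 */
	const char *cookie;		/* lock cookie */
	const char *addr;		/* instance address, as text for brevity */
};

static bool locker_equal(const struct locker *lhs, const struct locker *rhs)
{
	return lhs->type == rhs->type &&
	       lhs->num == rhs->num &&
	       !strcmp(lhs->cookie, rhs->cookie) &&
	       !strcmp(lhs->addr, rhs->addr);
}

/*
 * Decide whether to go ahead with blocklisting. "first" is the owner seen
 * before the watcher check and must not be NULL; "refreshed" is the owner
 * seen after it and is NULL if the lock was released in the meantime.
 */
static bool should_blocklist(const struct locker *first,
			     const struct locker *refreshed)
{
	assert(first != NULL);
	return refreshed != NULL && locker_equal(first, refreshed);
}
```

As the commit message says, this narrows the window for the race rather than closing it: the owner can still go away after the second lookup.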
From: Jann Horn jannh@google.com
commit d8ab9f7b644a2c9b64de405c1953c905ff219dc9 upstream.
When VMAs are merged, dup_anon_vma() is called with `dst` pointing to the VMA that is being expanded to cover the area previously occupied by another VMA. This currently happens while `dst` is not write-locked.
This means that, in the `src->anon_vma && !dst->anon_vma` case, as soon as the assignment `dst->anon_vma = src->anon_vma` has happened, concurrent page faults can happen on `dst` under the per-VMA lock. This is already icky in itself, since such page faults can now install pages into `dst` that are attached to an `anon_vma` that is not yet tied back to the `anon_vma` with an `anon_vma_chain`. But if `anon_vma_clone()` fails due to an out-of-memory error, things get much worse: `anon_vma_clone()` then reverts `dst->anon_vma` back to NULL, and `dst` remains completely unconnected to the `anon_vma`, even though we can have pages in the area covered by `dst` that point to the `anon_vma`.
This means the `anon_vma` of such pages can be freed while the pages are still mapped into userspace, which leads to UAF when a helper like folio_lock_anon_vma_read() tries to look up the anon_vma of such a page.
This theoretically is a security bug, but I believe it is really hard to actually trigger as an unprivileged user because it requires that you can make an order-0 GFP_KERNEL allocation fail, and the page allocator tries pretty hard to prevent that.
I think doing the vma_start_write() call inside dup_anon_vma() is the most straightforward fix for now.
For a kernel-assisted reproducer, see the notes section of the patch mail.
Link: https://lkml.kernel.org/r/20230721034643.616851-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn jannh@google.com
Reviewed-by: Suren Baghdasaryan surenb@google.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/mmap.c | 1 +
 1 file changed, 1 insertion(+)

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -647,6 +647,7 @@ static inline int dup_anon_vma(struct vm
 	 * anon pages imported.
 	 */
 	if (src->anon_vma && !dst->anon_vma) {
+		vma_start_write(dst);
 		dst->anon_vma = src->anon_vma;
 		return anon_vma_clone(dst, src);
 	}
From: Jann Horn jannh@google.com
commit b1f02b95758d05b799731d939e76a0bd6da312db upstream.
mm->mm_lock_seq effectively functions as a read/write lock; therefore it must be used with acquire/release semantics.
A specific example is the interaction between userfaultfd_register() and lock_vma_under_rcu().
userfaultfd_register() does the following from the point where it changes a VMA's flags to the point where concurrent readers are permitted again (in a simple scenario where only a single private VMA is accessed and no merging/splitting is involved):
    userfaultfd_register
      userfaultfd_set_vm_flags
        vm_flags_reset
          vma_start_write
            down_write(&vma->vm_lock->lock)
            vma->vm_lock_seq = mm_lock_seq [marks VMA as busy]
            up_write(&vma->vm_lock->lock)
          vm_flags_init
            [sets VM_UFFD_* in __vm_flags]
      vma->vm_userfaultfd_ctx.ctx = ctx
      mmap_write_unlock
        vma_end_write_all
          WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA]
There are no memory barriers in between the __vm_flags update and the mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered to above the `vm_flags_init()` call, which means from the perspective of a concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not VMA-locked. That's bad, we definitely need a store-release for the unlock operation.
The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly fine because all accesses to vma->vm_lock_seq that matter are always protected by the VMA lock. There is a racy read in vma_start_read() though that can tolerate false-positives, so we should be using WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN).
On the other side, lock_vma_under_rcu() works as follows in the relevant region for locking and userfaultfd check:
    lock_vma_under_rcu
      vma_start_read
        vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout]
        down_read_trylock(&vma->vm_lock->lock)
        vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check]
      userfaultfd_armed
        checks vma->vm_flags & __VM_UFFD_FLAGS
Here, the interesting aspect is how far down the mm->mm_lock_seq read can be reordered - if this read is reordered down below the vma->vm_flags access, this could cause lock_vma_under_rcu() to partly operate on information that was read while the VMA was supposed to be locked. To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we need to read it with a load-acquire.
Some of the comment wording is based on suggestions by Suren.
BACKPORT WARNING: One of the functions changed by this patch (which I've written against Linus' tree) is vma_try_start_write(), but this function no longer exists in mm/mm-everything. I don't know whether the merged version of this patch will be ordered before or after the patch that removes vma_try_start_write(). If you're backporting this patch to a tree with vma_try_start_write(), make sure this patch changes that function.
Link: https://lkml.kernel.org/r/20230721225107.942336-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn jannh@google.com
Reviewed-by: Suren Baghdasaryan surenb@google.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/mm.h | 29 +++++++++++++++++++++++------
 include/linux/mm_types.h | 28 ++++++++++++++++++++++++++++
 include/linux/mmap_lock.h | 10 ++++++++--
 3 files changed, 59 insertions(+), 8 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2dd73e4f3d8e..406ab9ea818f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -641,8 +641,14 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
  */
 static inline bool vma_start_read(struct vm_area_struct *vma)
 {
-	/* Check before locking. A race might cause false locked result. */
-	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
+	/*
+	 * Check before locking. A race might cause false locked result.
+	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
+	 * ACQUIRE semantics, because this is just a lockless check whose result
+	 * we don't rely on for anything - the mm_lock_seq read against which we
+	 * need ordering is below.
+	 */
+	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq))
 		return false;
 
 	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
@@ -653,8 +659,13 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * False unlocked result is impossible because we modify and check
 	 * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
 	 * modification invalidates all existing locks.
+	 *
+	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
+	 * racing with vma_end_write_all(), we only start reading from the VMA
+	 * after it has been unlocked.
+	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
-	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
+	if (unlikely(vma->vm_lock_seq == smp_load_acquire(&vma->vm_mm->mm_lock_seq))) {
 		up_read(&vma->vm_lock->lock);
 		return false;
 	}
@@ -676,7 +687,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
 	 * current task is holding mmap_write_lock, both vma->vm_lock_seq and
 	 * mm->mm_lock_seq can't be concurrently modified.
 	 */
-	*mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);
+	*mm_lock_seq = vma->vm_mm->mm_lock_seq;
 	return (vma->vm_lock_seq == *mm_lock_seq);
 }
 
@@ -688,7 +699,13 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 		return;
 
 	down_write(&vma->vm_lock->lock);
-	vma->vm_lock_seq = mm_lock_seq;
+	/*
+	 * We should use WRITE_ONCE() here because we can have concurrent reads
+	 * from the early lockless pessimistic check in vma_start_read().
+	 * We don't really care about the correctness of that early check, but
+	 * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
+	 */
+	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
 	up_write(&vma->vm_lock->lock);
 }
 
@@ -702,7 +719,7 @@ static inline bool vma_try_start_write(struct vm_area_struct *vma)
 	if (!down_write_trylock(&vma->vm_lock->lock))
 		return false;
 
-	vma->vm_lock_seq = mm_lock_seq;
+	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
 	up_write(&vma->vm_lock->lock);
 	return true;
 }
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index de10fc797c8e..5e74ce4a28cd 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -514,6 +514,20 @@ struct vm_area_struct {
 	};
 
 #ifdef CONFIG_PER_VMA_LOCK
+	/*
+	 * Can only be written (using WRITE_ONCE()) while holding both:
+	 *  - mmap_lock (in write mode)
+	 *  - vm_lock->lock (in write mode)
+	 * Can be read reliably while holding one of:
+	 *  - mmap_lock (in read or write mode)
+	 *  - vm_lock->lock (in read or write mode)
+	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
+	 * while holding nothing (except RCU to keep the VMA struct allocated).
+	 *
+	 * This sequence counter is explicitly allowed to overflow; sequence
+	 * counter reuse can only lead to occasional unnecessary use of the
+	 * slowpath.
+	 */
 	int vm_lock_seq;
 	struct vma_lock *vm_lock;
 
@@ -679,6 +693,20 @@ struct mm_struct {
 					  * by mmlist_lock
 					  */
 #ifdef CONFIG_PER_VMA_LOCK
+		/*
+		 * This field has lock-like semantics, meaning it is sometimes
+		 * accessed with ACQUIRE/RELEASE semantics.
+		 * Roughly speaking, incrementing the sequence number is
+		 * equivalent to releasing locks on VMAs; reading the sequence
+		 * number can be part of taking a read lock on a VMA.
+		 *
+		 * Can be modified under write mmap_lock using RELEASE
+		 * semantics.
+		 * Can be read with no other protection when holding write
+		 * mmap_lock.
+		 * Can be read with ACQUIRE semantics if not holding write
+		 * mmap_lock.
+		 */
 		int mm_lock_seq;
 #endif
 
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index aab8f1b28d26..e05e167dbd16 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -76,8 +76,14 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
 static inline void vma_end_write_all(struct mm_struct *mm)
 {
 	mmap_assert_write_locked(mm);
-	/* No races during update due to exclusive mmap_lock being held */
-	WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	/*
+	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
+	 * mmap_lock being held.
+	 * We need RELEASE semantics here to ensure that preceding stores into
+	 * the VMA take effect before we unlock it with this store.
+	 * Pairs with ACQUIRE semantics in vma_start_read().
+	 */
+	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
 }
 #else
 static inline void vma_end_write_all(struct mm_struct *mm) {}
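The release/acquire pairing described in this patch can be approximated in userspace with C11 atomics. The sketch below is a deliberately reduced model, not kernel code: struct mm_model, struct vma_model and the model_*() functions are hypothetical, the per-VMA rwsem is left out entirely, and memory_order_release/memory_order_acquire stand in for smp_store_release()/smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced models of mm_struct and vm_area_struct. */
struct mm_model {
	atomic_int mm_lock_seq;
};

struct vma_model {
	struct mm_model *mm;
	int vm_lock_seq;	/* equals mm_lock_seq while write-locked */
	unsigned long flags;	/* payload protected by the "lock" */
};

/* Writer: mark the VMA busy before modifying it. */
static void model_start_write(struct vma_model *vma)
{
	vma->vm_lock_seq = atomic_load_explicit(&vma->mm->mm_lock_seq,
						memory_order_relaxed);
}

/*
 * Writer: unlock every VMA at once. The release store keeps all earlier
 * stores into the VMA (e.g. to flags) ordered before the unlock.
 */
static void model_end_write_all(struct mm_model *mm)
{
	int seq = atomic_load_explicit(&mm->mm_lock_seq, memory_order_relaxed);

	atomic_store_explicit(&mm->mm_lock_seq, seq + 1, memory_order_release);
}

/*
 * Reader: the acquire load keeps later reads of the VMA from being
 * reordered before the check. Returns false if the VMA is write-locked.
 */
static bool model_start_read(struct vma_model *vma, unsigned long *flags)
{
	int seq;

	assert(vma != NULL && flags != NULL);
	seq = atomic_load_explicit(&vma->mm->mm_lock_seq, memory_order_acquire);
	if (vma->vm_lock_seq == seq)
		return false;
	*flags = vma->flags;
	return true;
}
```

Run single-threaded, the model only demonstrates the locked/unlocked state transitions; the ordering guarantees themselves matter once the writer and the reader run on different CPUs.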
From: Sidhartha Kumar sidhartha.kumar@oracle.com
commit 6c54312f9689fbe27c70db5d42eebd29d04b672e upstream.
It was pointed out[1] that using folio_test_hwpoison() is wrong as we need to check the individual page that has poison. folio_test_hwpoison() only checks the head page so go back to using PageHWPoison().
User-visible effects include existing hwpoison-inject tests possibly failing as unpoisoning a single subpage could lead to unpoisoning an entire folio. Memory unpoisoning could also not work as expected as the function will break early due to only checking the head page and not the actually poisoned subpage.
[1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@casper.infradead.org/
Link: https://lkml.kernel.org/r/20230717181812.167757-1-sidhartha.kumar@oracle.com
Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios")
Signed-off-by: Sidhartha Kumar sidhartha.kumar@oracle.com
Reported-by: Matthew Wilcox (Oracle) willy@infradead.org
Acked-by: Naoya Horiguchi naoya.horiguchi@nec.com
Reviewed-by: Miaohe Lin linmiaohe@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/memory-failure.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2490,7 +2490,7 @@ int unpoison_memory(unsigned long pfn)
 		goto unlock_mutex;
 	}
 
-	if (!folio_test_hwpoison(folio)) {
+	if (!PageHWPoison(p)) {
 		unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
 				 pfn, &unpoison_rs);
 		goto unlock_mutex;
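Why a head-only check misses a poisoned subpage can be shown with a toy model. Everything below is hypothetical and greatly simplified: a folio is modelled as a fixed array of pages with page 0 as the head, and the model_*() helpers only imitate the behaviour the commit message describes for folio_test_hwpoison() and PageHWPoison().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 4

/* Hypothetical model: a folio is a run of pages, page 0 being the head. */
struct page_model {
	bool hwpoison;
};

struct folio_model {
	struct page_model pages[MODEL_NR_PAGES];
};

/* Models a head-only check, which is what the replaced code relied on. */
static bool model_folio_test_hwpoison(const struct folio_model *folio)
{
	return folio->pages[0].hwpoison;
}

/* Models the per-page check that the fix goes back to. */
static bool model_page_hwpoison(const struct folio_model *folio, size_t idx)
{
	assert(idx < MODEL_NR_PAGES);
	return folio->pages[idx].hwpoison;
}
```

With only a tail page poisoned, the head-only check reports the folio as clean, so in this model unpoison_memory() would bail out early with "already unpoisoned" even though the requested page is still poisoned.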
From: Jann Horn jannh@google.com
commit 6c21e066f9256ea1df6f88768f6ae1080b7cf509 upstream.
mbind() calls down into vma_replace_policy() without taking the per-VMA locks, replaces the VMA's vma->vm_policy pointer, and frees the old policy. That's bad; a concurrent page fault might still be using the old policy (in vma_alloc_folio()), resulting in use-after-free.
Normally this will manifest as a use-after-free read first, but it can result in memory corruption, including because vma_alloc_folio() can call mpol_cond_put() on the freed policy, which conditionally changes the policy's refcount member.
This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA systems as long as the kernel was built with CONFIG_NUMA.
Signed-off-by: Jann Horn jannh@google.com
Reviewed-by: Suren Baghdasaryan surenb@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/mempolicy.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -384,8 +384,10 @@ void mpol_rebind_mm(struct mm_struct *mm
 	VMA_ITERATOR(vmi, mm, 0);
 
 	mmap_write_lock(mm);
-	for_each_vma(vmi, vma)
+	for_each_vma(vmi, vma) {
+		vma_start_write(vma);
 		mpol_rebind_policy(vma->vm_policy, new);
+	}
 	mmap_write_unlock(mm);
 }
 
@@ -765,6 +767,8 @@ static int vma_replace_policy(struct vm_
 	struct mempolicy *old;
 	struct mempolicy *new;
 
+	vma_assert_write_locked(vma);
+
 	pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
 		 vma->vm_start, vma->vm_end, vma->vm_pgoff,
 		 vma->vm_ops, vma->vm_file,
@@ -1313,6 +1317,14 @@ static long do_mbind(unsigned long start
 	if (err)
 		goto mpol_out;
 
+	/*
+	 * Lock the VMAs before scanning for pages to migrate, to ensure we don't
+	 * miss a concurrently inserted page.
+	 */
+	vma_iter_init(&vmi, mm, start);
+	for_each_vma_range(vmi, vma, end)
+		vma_start_write(vma);
+
 	ret = queue_pages_range(mm, start, end, nmask,
 			  flags | MPOL_MF_INVERT, &pagelist);
 
@@ -1538,6 +1550,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node,
 			break;
 		}
 
+		vma_start_write(vma);
 		new->home_node = home_node;
 		err = mbind_range(&vmi, vma, &prev, start, end, new);
 		mpol_put(new);
From: Christian König christian.koenig@amd.com
commit f781f661e8c99b0cb34129f2e374234d61864e77 upstream.
Some Android CTS is testing if the signaling time keeps consistent during merges.
v2: use the current time if the fence is still in the signaling path and
    the timestamp not yet available.
v3: improve comment, fix one more case to use the correct timestamp

Signed-off-by: Christian König christian.koenig@amd.com
Reviewed-by: Luben Tuikov luben.tuikov@amd.com
Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christ...
Cc: Jindong Yue jindong.yue@nxp.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/dma-buf/dma-fence-unwrap.c | 26 ++++++++++++++++++++++----
 drivers/dma-buf/dma-fence.c | 5 +++--
 drivers/gpu/drm/drm_syncobj.c | 2 +-
 include/linux/dma-fence.h | 2 +-
 4 files changed, 27 insertions(+), 8 deletions(-)
--- a/drivers/dma-buf/dma-fence-unwrap.c
+++ b/drivers/dma-buf/dma-fence-unwrap.c
@@ -66,18 +66,36 @@ struct dma_fence *__dma_fence_unwrap_mer
 {
 	struct dma_fence_array *result;
 	struct dma_fence *tmp, **array;
+	ktime_t timestamp;
 	unsigned int i;
 	size_t count;
 
 	count = 0;
+	timestamp = ns_to_ktime(0);
 	for (i = 0; i < num_fences; ++i) {
-		dma_fence_unwrap_for_each(tmp, &iter[i], fences[i])
-			if (!dma_fence_is_signaled(tmp))
+		dma_fence_unwrap_for_each(tmp, &iter[i], fences[i]) {
+			if (!dma_fence_is_signaled(tmp)) {
 				++count;
+			} else if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
+					    &tmp->flags)) {
+				if (ktime_after(tmp->timestamp, timestamp))
+					timestamp = tmp->timestamp;
+			} else {
+				/*
+				 * Use the current time if the fence is
+				 * currently signaling.
+				 */
+				timestamp = ktime_get();
+			}
+		}
 	}
 
+	/*
+	 * If we couldn't find a pending fence just return a private signaled
+	 * fence with the timestamp of the last signaled one.
+	 */
 	if (count == 0)
-		return dma_fence_get_stub();
+		return dma_fence_allocate_private_stub(timestamp);
 
 	array = kmalloc_array(count, sizeof(*array), GFP_KERNEL);
 	if (!array)
@@ -138,7 +156,7 @@ restart:
 	} while (tmp);
 
 	if (count == 0) {
-		tmp = dma_fence_get_stub();
+		tmp = dma_fence_allocate_private_stub(ktime_get());
 		goto return_tmp;
 	}
 
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -150,10 +150,11 @@ EXPORT_SYMBOL(dma_fence_get_stub);
 
 /**
  * dma_fence_allocate_private_stub - return a private, signaled fence
+ * @timestamp: timestamp when the fence was signaled
  *
  * Return a newly allocated and signaled stub fence.
  */
-struct dma_fence *dma_fence_allocate_private_stub(void)
+struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp)
 {
 	struct dma_fence *fence;
 
@@ -169,7 +170,7 @@ struct dma_fence *dma_fence_allocate_pri
 	set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
 		&fence->flags);
 
-	dma_fence_signal(fence);
+	dma_fence_signal_timestamp(fence, timestamp);
 
 	return fence;
 }
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -353,7 +353,7 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence)
  */
 static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-	struct dma_fence *fence = dma_fence_allocate_private_stub();
+	struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get());
 
 	if (IS_ERR(fence))
 		return PTR_ERR(fence);
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -606,7 +606,7 @@ static inline signed long dma_fence_wait
 void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
 
 struct dma_fence *dma_fence_get_stub(void);
-struct dma_fence *dma_fence_allocate_private_stub(void);
+struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp);
 u64 dma_fence_context_alloc(unsigned num);
 
 extern const struct dma_fence_ops dma_fence_array_ops;
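The timestamp selection in the patched merge loop can be followed more easily in a small userspace model. The struct and function below are hypothetical: struct fence_model only carries the three pieces of state the loop looks at, and the now_ns argument stands in for ktime_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the fence state that the merge loop looks at. */
struct fence_model {
	bool signaled;
	bool has_timestamp;	/* models DMA_FENCE_FLAG_TIMESTAMP_BIT */
	int64_t timestamp_ns;
};

/*
 * Walk the fences as the patched loop does: count pending fences and track
 * the signaling time to use for a stub. "now_ns" stands in for ktime_get().
 */
static size_t merge_scan(const struct fence_model *fences, size_t n,
			 int64_t now_ns, int64_t *timestamp_ns)
{
	size_t pending = 0;
	size_t i;

	assert(fences != NULL || n == 0);
	*timestamp_ns = 0;
	for (i = 0; i < n; i++) {
		if (!fences[i].signaled) {
			pending++;
		} else if (fences[i].has_timestamp) {
			if (fences[i].timestamp_ns > *timestamp_ns)
				*timestamp_ns = fences[i].timestamp_ns;
		} else {
			/* Signaled, but the timestamp is not yet available. */
			*timestamp_ns = now_ns;
		}
	}
	return pending;
}
```

When the function returns 0 pending fences, the value left in *timestamp_ns corresponds to what the patch passes to dma_fence_allocate_private_stub().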
From: Dan Carpenter dan.carpenter@linaro.org
commit 00ae1491f970acc454be0df63f50942d94825860 upstream.
Smatch detected potential error pointer dereference.
drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline()
error: 'fence' dereferencing possible ERR_PTR()
The error pointer comes from dma_fence_allocate_private_stub(). One caller expected error pointers and one expected NULL pointers. Change it to return NULL and update the caller which expected error pointers, drm_syncobj_assign_null_handle(), to check for NULL instead.
Fixes: f781f661e8c9 ("dma-buf: keep the signaling time of merged fences v3")
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Reviewed-by: Christian König christian.koenig@amd.com
Reviewed-by: Sumit Semwal sumit.semwal@linaro.org
Signed-off-by: Sumit Semwal sumit.semwal@linaro.org
Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68...
Cc: Jindong Yue jindong.yue@nxp.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/dma-buf/dma-fence.c | 2 +-
 drivers/gpu/drm/drm_syncobj.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -160,7 +160,7 @@ struct dma_fence *dma_fence_allocate_pri
 
 	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
 	if (fence == NULL)
-		return ERR_PTR(-ENOMEM);
+		return NULL;
 
 	dma_fence_init(fence,
 		       &dma_fence_stub_ops,
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -355,8 +355,8 @@ static int drm_syncobj_assign_null_handl
 {
 	struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get());
 
-	if (IS_ERR(fence))
-		return PTR_ERR(fence);
+	if (!fence)
+		return -ENOMEM;
 
 	drm_syncobj_replace_fence(syncobj, fence);
 	dma_fence_put(fence);
On Tue, 01 Aug 2023 11:17:44 +0200, Greg Kroah-Hartman wrote:
[...]
All tests passing for Tegra ...
Test results for stable-v6.4:
    11 builds: 11 pass, 0 fail
    28 boots: 28 pass, 0 fail
    120 tests: 120 pass, 0 fail

Linux version: 6.4.8-rc1-g2c273bf138a4
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Tue, Aug 01, 2023 at 11:17:44AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.8 release. There are 239 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
Hello,
On Tue, 1 Aug 2023 11:17:44 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[...]
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr
[2] 2c273bf138a4 ("Linux 6.4.8-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: debugfs_rm_non_contexts.sh
ok 8 selftests: damon: sysfs.sh
ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh
ok 10 selftests: damon: reclaim.sh
ok 11 selftests: damon: lru_sort.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_m68k.sh
ok 12 selftests: damon-tests: build_arm64.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
On 8/1/23 03:17, Greg Kroah-Hartman wrote:
[...]
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks,
-- Shuah
On 8/1/23 02:17, Greg Kroah-Hartman wrote:
[...]
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Tue, 1 Aug 2023 at 15:11, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
[...]
The following kselftest build regression was found:

selftests/rseq: Play nice with binaries statically linked against glibc 2.35+

commit 3bcbc20942db5d738221cca31a928efc09827069 upstream.
To allow running rseq and KVM's rseq selftests as statically linked binaries, initialize the various "trampoline" pointers to point directly at the expected glibc symbols, and skip the dlsym() lookups if the rseq size is non-zero, i.e. the binary is statically linked *and* the libc registered its own rseq.
Define weak versions of the symbols so as not to break linking against libc versions that don't support rseq in any capacity.
The KVM selftests in particular are often statically linked so that they can be run on targets with very limited runtime environments, i.e. test machines.
Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35")
Cc: Aaron Lewis aaronlewis@google.com
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson seanjc@google.com
Message-Id: 20230721223352.2333911-1-seanjc@google.com
Signed-off-by: Paolo Bonzini pbonzini@redhat.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Build log:
----
x86_64-linux-gnu-gcc -O2 -Wall -g -I./ -isystem /home/tuxbuild/.cache/tuxmake/builds/1/build/usr/include -L/home/tuxbuild/.cache/tuxmake/builds/1/build/kselftest/rseq -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -ldl -o /home/tuxbuild/.cache/tuxmake/builds/1/build/kselftest/rseq/librseq.so
rseq.c:41:1: error: unknown type name '__weak'
   41 | __weak ptrdiff_t __rseq_offset;
      | ^~~~~~
rseq.c:41:18: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__rseq_offset'
   41 | __weak ptrdiff_t __rseq_offset;
      |                  ^~~~~~~~~~~~~
rseq.c:42:7: error: expected ';' before 'unsigned'
   42 | __weak unsigned int __rseq_size;
      |       ^~~~~~~~~
      |       ;
rseq.c:43:7: error: expected ';' before 'unsigned'
   43 | __weak unsigned int __rseq_flags;
      |       ^~~~~~~~~
      |       ;
rseq.c:45:47: error: '__rseq_offset' undeclared here (not in a function); did you mean 'rseq_offset'?
   45 | static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset;
      |                                               ^~~~~~~~~~~~~
      |                                               rseq_offset
make[3]: Leaving directory 'tools/testing/selftests/rseq'
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Links:
 - https://storage.tuxsuite.com/public/linaro/lkft/builds/2TNSVjRCfcIaJWQNkPwDQ...
 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.4.y/build/v6.4.7-...

--
Linaro LKFT
https://lkft.linaro.org
On Wed, Aug 02, 2023 at 08:22:59AM +0530, Naresh Kamboju wrote:
[...]
Odd this didn't also show up in 6.1. I'll go drop the offending commit for now.
thanks,
greg k-h
On 8/1/23 2:17 AM, Greg Kroah-Hartman wrote:
[...]
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Tue, Aug 01, 2023 at 11:17:44AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.4.8 release. There are 239 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully compiled and installed bindeb-pkgs on my computer (Acer Aspire E15, Intel Core i3 Haswell). No noticeable regressions.
Tested-by: Bagas Sanjaya bagasdotme@gmail.com