[PATCH AUTOSEL 6.1 01/27] PCI: Only override AMD USB controller if required

List overview All Threads
Download

newer

older

[PATCH AUTOSEL 6.7 01/23] wifi:...

[PATCH AUTOSEL 5.10 1/8] wifi:...

Sasha Levin

28 Jan 2024 28 Jan '24

4:13 p.m.

From: "Guilherme G. Piccoli" gpiccoli@igalia.com

[ Upstream commit e585a37e5061f6d5060517aed1ca4ccb2e56a34c ]

By running a Van Gogh device (Steam Deck), the following message was noticed in the kernel log:

pci 0000:04:00.3: PCI class overridden (0x0c03fe -> 0x0c03fe) so dwc3 driver can claim this instead of xhci

Effectively this means the quirk executed but changed nothing, since the class of this device was already the proper one (likely adjusted by newer firmware versions).

Check and perform the override only if necessary.

Link: https://lore.kernel.org/r/20231120160531.361552-1-gpiccoli@igalia.com Signed-off-by: Guilherme G. Piccoli gpiccoli@igalia.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: Huang Rui ray.huang@amd.com Cc: Vicki Pfau vi@endrift.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 8765544bac35..75b297c15cf5 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -607,10 +607,13 @@ static void quirk_amd_dwc_class(struct pci_dev *pdev) { u32 class = pdev->class;

- /* Use "USB Device (not host controller)" class */ - pdev->class = PCI_CLASS_SERIAL_USB_DEVICE; - pci_info(pdev, "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n", - class, pdev->class); + if (class != PCI_CLASS_SERIAL_USB_DEVICE) { + /* Use "USB Device (not host controller)" class */ + pdev->class = PCI_CLASS_SERIAL_USB_DEVICE; + pci_info(pdev, + "PCI class overridden (%#08x -> %#08x) so dwc3 driver can claim this instead of xhci\n", + class, pdev->class); + } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_NL_USB, quirk_amd_dwc_class);

-- 2.43.0

Show replies by date

Sasha Levin

28 Jan 28 Jan

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 02/27] PCI: switchtec: Fix stdev_release() crash after surprise hot remove

From: Daniel Stodden dns@arista.com

[ Upstream commit df25461119d987b8c81d232cfe4411e91dcabe66 ]

A PCI device hot removal may occur while stdev->cdev is held open. The call to stdev_release() then happens during close or exit, at a point way past switchtec_pci_remove(). Otherwise the last ref would vanish with the trailing put_device(), just before return.

At that later point in time, the devm cleanup has already removed the stdev->mmio_mrpc mapping. Also, the stdev->pdev reference was not a counted one. Therefore, in DMA mode, the iowrite32() in stdev_release() will cause a fatal page fault, and the subsequent dma_free_coherent(), if reached, would pass a stale &stdev->pdev->dev pointer.

Fix by moving MRPC DMA shutdown into switchtec_pci_remove(), after stdev_kill(). Counting the stdev->pdev ref is now optional, but may prevent future accidents.

Reproducible via the script at https://lore.kernel.org/r/20231113212150.96410-1-dns@arista.com

Link: https://lore.kernel.org/r/20231122042316.91208-2-dns@arista.com Signed-off-by: Daniel Stodden dns@arista.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Logan Gunthorpe logang@deltatee.com Reviewed-by: Dmitry Safonov dima@arista.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/switch/switchtec.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c index 0c1faa6c1973..3f3320d0a4f8 100644 --- a/drivers/pci/switch/switchtec.c +++ b/drivers/pci/switch/switchtec.c @@ -1308,13 +1308,6 @@ static void stdev_release(struct device *dev) { struct switchtec_dev *stdev = to_stdev(dev);

- if (stdev->dma_mrpc) { - iowrite32(0, &stdev->mmio_mrpc->dma_en); - flush_wc_buf(stdev); - writeq(0, &stdev->mmio_mrpc->dma_addr); - dma_free_coherent(&stdev->pdev->dev, sizeof(*stdev->dma_mrpc), - stdev->dma_mrpc, stdev->dma_mrpc_dma_addr); - } kfree(stdev); }

@@ -1358,7 +1351,7 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev) return ERR_PTR(-ENOMEM);

stdev->alive = true; - stdev->pdev = pdev; + stdev->pdev = pci_dev_get(pdev); INIT_LIST_HEAD(&stdev->mrpc_queue); mutex_init(&stdev->mrpc_mutex); stdev->mrpc_busy = 0; @@ -1391,6 +1384,7 @@ static struct switchtec_dev *stdev_create(struct pci_dev *pdev) return stdev;

err_put: + pci_dev_put(stdev->pdev); put_device(&stdev->dev); return ERR_PTR(rc); } @@ -1646,6 +1640,18 @@ static int switchtec_init_pci(struct switchtec_dev *stdev, return 0; }

+static void switchtec_exit_pci(struct switchtec_dev *stdev) +{ + if (stdev->dma_mrpc) { + iowrite32(0, &stdev->mmio_mrpc->dma_en); + flush_wc_buf(stdev); + writeq(0, &stdev->mmio_mrpc->dma_addr); + dma_free_coherent(&stdev->pdev->dev, sizeof(*stdev->dma_mrpc), + stdev->dma_mrpc, stdev->dma_mrpc_dma_addr); + stdev->dma_mrpc = NULL; + } +} + static int switchtec_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) { @@ -1705,6 +1711,9 @@ static void switchtec_pci_remove(struct pci_dev *pdev) ida_free(&switchtec_minor_ida, MINOR(stdev->dev.devt)); dev_info(&stdev->dev, "unregistered.\n"); stdev_kill(stdev); + switchtec_exit_pci(stdev); + pci_dev_put(stdev->pdev); + stdev->pdev = NULL; put_device(&stdev->dev); }

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 03/27] perf cs-etm: Bump minimum OpenCSD version to ensure a bugfix is present

From: James Clark james.clark@arm.com

[ Upstream commit 2dbba30fd69b604802a9535b74bddb5bcca23793 ]

Since commit d927ef5004ef ("perf cs-etm: Add exception level consistency check"), the exception that was added to Perf will be triggered unless the following bugfix from OpenCSD is present:

- _Version 1.2.1_: - __Bugfix__: ETM4x / ETE - output of context elements to client can in some circumstances be delayed until after subsequent atoms have been processed leading to incorrect memory decode access via the client callbacks. Fixed to flush context elements immediately they are committed.

Rather than remove the assert and silently fail, just increase the minimum version requirement to avoid hard to debug issues and regressions.

Reviewed-by: Ian Rogers irogers@google.com Signed-off-by: James Clark james.clark@arm.com Tested-by: Leo Yan leo.yan@linaro.org Cc: John Garry john.g.garry@oracle.com Cc: Mike Leach mike.leach@linaro.org Cc: Will Deacon will@kernel.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230901133716.677499-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/build/feature/test-libopencsd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/build/feature/test-libopencsd.c b/tools/build/feature/test-libopencsd.c index eb6303ff446e..4cfcef9da3e4 100644 --- a/tools/build/feature/test-libopencsd.c +++ b/tools/build/feature/test-libopencsd.c @@ -4,9 +4,9 @@ /* * Check OpenCSD library version is sufficient to provide required features */ -#define OCSD_MIN_VER ((1 << 16) | (1 << 8) | (1)) +#define OCSD_MIN_VER ((1 << 16) | (2 << 8) | (1)) #if !defined(OCSD_VER_NUM) || (OCSD_VER_NUM < OCSD_MIN_VER) -#error "OpenCSD >= 1.1.1 is required" +#error "OpenCSD >= 1.2.1 is required" #endif

int main(void)

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 04/27] xhci: fix possible null pointer deref during xhci urb enqueue

From: Mathias Nyman mathias.nyman@linux.intel.com

[ Upstream commit e2e2aacf042f52854c92775b7800ba668e0bdfe4 ]

There is a short gap between urb being submitted and actually added to the endpoint queue (linked). If the device is disconnected during this time then usb core is not yet aware of the pending urb, and device may be freed just before xhci_urq_enqueue() continues, dereferencing the freed device.

Freeing the device is protected by the xhci spinlock, so make sure we take and keep the lock while checking that device exists, dereference it, and add the urb to the queue.

Remove the unnecessary URB check, usb core checks it before calling xhci_urb_enqueue()

Suggested-by: Kuen-Han Tsai khtsai@google.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20231201150647.1307406-20-mathias.nyman@linux.inte... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/xhci.c | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index c02ad4f76bb3..127fbad32a75 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1654,24 +1654,7 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag struct urb_priv *urb_priv; int num_tds;

- if (!urb) - return -EINVAL; - ret = xhci_check_args(hcd, urb->dev, urb->ep, - true, true, __func__); - if (ret <= 0) - return ret ? ret : -EINVAL; - - slot_id = urb->dev->slot_id; ep_index = xhci_get_endpoint_index(&urb->ep->desc); - ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state; - - if (!HCD_HW_ACCESSIBLE(hcd)) - return -ESHUTDOWN; - - if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) { - xhci_dbg(xhci, "Can't queue urb, port error, link inactive\n"); - return -ENODEV; - }

if (usb_endpoint_xfer_isoc(&urb->ep->desc)) num_tds = urb->number_of_packets; @@ -1710,12 +1693,35 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag

spin_lock_irqsave(&xhci->lock, flags);

+ ret = xhci_check_args(hcd, urb->dev, urb->ep, + true, true, __func__); + if (ret <= 0) { + ret = ret ? ret : -EINVAL; + goto free_priv; + } + + slot_id = urb->dev->slot_id; + + if (!HCD_HW_ACCESSIBLE(hcd)) { + ret = -ESHUTDOWN; + goto free_priv; + } + + if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) { + xhci_dbg(xhci, "Can't queue urb, port error, link inactive\n"); + ret = -ENODEV; + goto free_priv; + } + if (xhci->xhc_state & XHCI_STATE_DYING) { xhci_dbg(xhci, "Ep 0x%x: URB %p submitted for non-responsive xHCI host.\n", urb->ep->desc.bEndpointAddress, urb); ret = -ESHUTDOWN; goto free_priv; } + + ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state; + if (*ep_state & (EP_GETTING_STREAMS | EP_GETTING_NO_STREAMS)) { xhci_warn(xhci, "WARN: Can't enqueue URB, ep in streams transition state %x\n", *ep_state);

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 05/27] usb: hub: Replace hardcoded quirk value with BIT() macro

From: Hardik Gajjar hgajjar@de.adit-jv.com

[ Upstream commit 6666ea93d2c422ebeb8039d11e642552da682070 ]

This patch replaces the hardcoded quirk value in the macro with BIT().

Signed-off-by: Hardik Gajjar hgajjar@de.adit-jv.com Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20231205181829.127353-1-hgajjar@de.adit-jv.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/core/hub.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 81c8f564cf87..9163fd5af046 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -47,8 +47,8 @@ #define USB_VENDOR_TEXAS_INSTRUMENTS 0x0451 #define USB_PRODUCT_TUSB8041_USB3 0x8140 #define USB_PRODUCT_TUSB8041_USB2 0x8142 -#define HUB_QUIRK_CHECK_PORT_AUTOSUSPEND 0x01 -#define HUB_QUIRK_DISABLE_AUTOSUSPEND 0x02 +#define HUB_QUIRK_CHECK_PORT_AUTOSUSPEND BIT(0) +#define HUB_QUIRK_DISABLE_AUTOSUSPEND BIT(1)

#define USB_TP_TRANSMISSION_DELAY 40 /* ns */ #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 06/27] usb: hub: Add quirk to decrease IN-ep poll interval for Microchip USB491x hub

From: Hardik Gajjar hgajjar@de.adit-jv.com

[ Upstream commit 855d75cf8311fee156fabb5639bb53757ca83dd4 ]

There is a potential delay in notifying Linux USB drivers of downstream USB bus activity when connecting a high-speed or superSpeed device via the Microchip USB491x hub. This delay is due to the fixed bInterval value of 12 in the silicon of the Microchip USB491x hub.

Microchip requested to ignore the device descriptor and decrease that value to 9 as it was too late to modify that in silicon.

This patch speeds up the USB enummeration process that helps to pass Apple Carplay certifications and improve the User experience when utilizing the USB device via Microchip Multihost USB491x Hub.

A new hub quirk HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL speeds up the notification process for Microchip USB491x hub by limiting the maximum bInterval value to 9.

Signed-off-by: Hardik Gajjar hgajjar@de.adit-jv.com Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20231205181829.127353-2-hgajjar@de.adit-jv.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/core/hub.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 9163fd5af046..4f181110d00d 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -47,12 +47,18 @@ #define USB_VENDOR_TEXAS_INSTRUMENTS 0x0451 #define USB_PRODUCT_TUSB8041_USB3 0x8140 #define USB_PRODUCT_TUSB8041_USB2 0x8142 +#define USB_VENDOR_MICROCHIP 0x0424 +#define USB_PRODUCT_USB4913 0x4913 +#define USB_PRODUCT_USB4914 0x4914 +#define USB_PRODUCT_USB4915 0x4915 #define HUB_QUIRK_CHECK_PORT_AUTOSUSPEND BIT(0) #define HUB_QUIRK_DISABLE_AUTOSUSPEND BIT(1) +#define HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL BIT(2)

#define USB_TP_TRANSMISSION_DELAY 40 /* ns */ #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */ #define USB_PING_RESPONSE_TIME 400 /* ns */ +#define USB_REDUCE_FRAME_INTR_BINTERVAL 9

/* Protect struct usb_device->state and ->children members * Note: Both are also protected by ->dev.sem, except that ->state can @@ -1904,6 +1910,14 @@ static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id) usb_autopm_get_interface_no_resume(intf); }

+ if ((id->driver_info & HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL) && + desc->endpoint[0].desc.bInterval > USB_REDUCE_FRAME_INTR_BINTERVAL) { + desc->endpoint[0].desc.bInterval = + USB_REDUCE_FRAME_INTR_BINTERVAL; + /* Tell the HCD about the interrupt ep's new bInterval */ + usb_set_interface(hdev, 0, 0); + } + if (hub_configure(hub, &desc->endpoint[0].desc) >= 0) { onboard_hub_create_pdevs(hdev, &hub->onboard_hub_devs);

@@ -5885,6 +5899,21 @@ static const struct usb_device_id hub_id_table[] = { .idVendor = USB_VENDOR_TEXAS_INSTRUMENTS, .idProduct = USB_PRODUCT_TUSB8041_USB3, .driver_info = HUB_QUIRK_DISABLE_AUTOSUSPEND}, + { .match_flags = USB_DEVICE_ID_MATCH_VENDOR + | USB_DEVICE_ID_MATCH_PRODUCT, + .idVendor = USB_VENDOR_MICROCHIP, + .idProduct = USB_PRODUCT_USB4913, + .driver_info = HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL}, + { .match_flags = USB_DEVICE_ID_MATCH_VENDOR + | USB_DEVICE_ID_MATCH_PRODUCT, + .idVendor = USB_VENDOR_MICROCHIP, + .idProduct = USB_PRODUCT_USB4914, + .driver_info = HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL}, + { .match_flags = USB_DEVICE_ID_MATCH_VENDOR + | USB_DEVICE_ID_MATCH_PRODUCT, + .idVendor = USB_VENDOR_MICROCHIP, + .idProduct = USB_PRODUCT_USB4915, + .driver_info = HUB_QUIRK_REDUCE_FRAME_INTR_BINTERVAL}, { .match_flags = USB_DEVICE_ID_MATCH_DEV_CLASS, .bDeviceClass = USB_CLASS_HUB}, { .match_flags = USB_DEVICE_ID_MATCH_INT_CLASS,

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 07/27] selftests/sgx: Fix linker script asserts

From: Jo Van Bulck jo.vanbulck@cs.kuleuven.be

[ Upstream commit 9fd552ee32c6c1e27c125016b87d295bea6faea7 ]

DEFINED only considers symbols, not section names. Hence, replace the check for .got.plt with the _GLOBAL_OFFSET_TABLE_ symbol and remove other (non-essential) asserts.

Signed-off-by: Jo Van Bulck jo.vanbulck@cs.kuleuven.be Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Link: https://lore.kernel.org/all/20231005153854.25566-10-jo.vanbulck%40cs.kuleuve... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/sgx/test_encl.lds | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/testing/selftests/sgx/test_encl.lds b/tools/testing/selftests/sgx/test_encl.lds index a1ec64f7d91f..108bc11d1d8c 100644 --- a/tools/testing/selftests/sgx/test_encl.lds +++ b/tools/testing/selftests/sgx/test_encl.lds @@ -34,8 +34,4 @@ SECTIONS } }

-ASSERT(!DEFINED(.altinstructions), "ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.altinstr_replacement), "ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.discard.retpoline_safe), "RETPOLINE ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.discard.nospec), "RETPOLINE ALTERNATIVES are not supported in enclaves") -ASSERT(!DEFINED(.got.plt), "Libcalls are not supported in enclaves") +ASSERT(!DEFINED(_GLOBAL_OFFSET_TABLE_), "Libcalls through GOT are not supported in enclaves")

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 08/27] tty: allow TIOCSLCKTRMIOS with CAP_CHECKPOINT_RESTORE

From: Adrian Reber areber@redhat.com

[ Upstream commit e0f25b8992345aa5f113da2815f5add98738c611 ]

The capability CAP_CHECKPOINT_RESTORE was introduced to allow non-root users to checkpoint and restore processes as non-root with CRIU.

This change extends CAP_CHECKPOINT_RESTORE to enable the CRIU option '--shell-job' as non-root. CRIU's man-page describes the '--shell-job' option like this:

Allow one to dump shell jobs. This implies the restored task will inherit session and process group ID from the criu itself. This option also allows to migrate a single external tty connection, to migrate applications like top.

TIOCSLCKTRMIOS can only be done if the process has CAP_SYS_ADMIN and this change extends it to CAP_SYS_ADMIN or CAP_CHECKPOINT_RESTORE.

With this change it is possible to checkpoint and restore processes which have a tty connection as non-root if CAP_CHECKPOINT_RESTORE is set.

Acked-by: Christian Brauner brauner@kernel.org Signed-off-by: Adrian Reber areber@redhat.com Acked-by: Andrei Vagin avagin@gmail.com Link: https://lore.kernel.org/r/20231208143656.1019-1-areber@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/tty_ioctl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/tty_ioctl.c b/drivers/tty/tty_ioctl.c index ad1cf51ecd11..41004720d4ae 100644 --- a/drivers/tty/tty_ioctl.c +++ b/drivers/tty/tty_ioctl.c @@ -859,7 +859,7 @@ int tty_mode_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) ret = -EFAULT; return ret; case TIOCSLCKTRMIOS: - if (!capable(CAP_SYS_ADMIN)) + if (!checkpoint_restore_ns_capable(&init_user_ns)) return -EPERM; copy_termios_locked(real_tty, &kterm); if (user_termios_to_kernel_termios(&kterm, @@ -876,7 +876,7 @@ int tty_mode_ioctl(struct tty_struct *tty, unsigned int cmd, unsigned long arg) ret = -EFAULT; return ret; case TIOCSLCKTRMIOS: - if (!capable(CAP_SYS_ADMIN)) + if (!checkpoint_restore_ns_capable(&init_user_ns)) return -EPERM; copy_termios_locked(real_tty, &kterm); if (user_termios_to_kernel_termios_1(&kterm,

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 09/27] fs/kernfs/dir: obey S_ISGID

From: Max Kellermann max.kellermann@ionos.com

[ Upstream commit 5133bee62f0ea5d4c316d503cc0040cac5637601 ]

Handling of S_ISGID is usually done by inode_init_owner() in all other filesystems, but kernfs doesn't use that function. In kernfs, struct kernfs_node is the primary data structure, and struct inode is only created from it on demand. Therefore, inode_init_owner() can't be used and we need to imitate its behavior.

S_ISGID support is useful for the cgroup filesystem; it allows subtrees managed by an unprivileged process to retain a certain owner gid, which then enables sharing access to the subtree with another unprivileged process.

-- v1 -> v2: minor coding style fix (comment)

Signed-off-by: Max Kellermann max.kellermann@ionos.com Acked-by: Tejun Heo tj@kernel.org Link: https://lore.kernel.org/r/20231208093310.297233-2-max.kellermann@ionos.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/kernfs/dir.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 44842e6cf0a9..a00e11ebfa77 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -669,6 +669,18 @@ struct kernfs_node *kernfs_new_node(struct kernfs_node *parent, { struct kernfs_node *kn;

+ if (parent->mode & S_ISGID) { + /* this code block imitates inode_init_owner() for + * kernfs + */ + + if (parent->iattr) + gid = parent->iattr->ia_gid; + + if (flags & KERNFS_DIR) + mode |= S_ISGID; + } + kn = __kernfs_new_node(kernfs_root(parent), parent, name, mode, uid, gid, flags); if (kn) {

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 10/27] spmi: mediatek: Fix UAF on device remove

From: Yu-Che Cheng giver@chromium.org

[ Upstream commit e821d50ab5b956ed0effa49faaf29912fd4106d9 ]

The pmif driver data that contains the clocks is allocated along with spmi_controller. On device remove, spmi_controller will be freed first, and then devres , including the clocks, will be cleanup. This leads to UAF because putting the clocks will access the clocks in the pmif driver data, which is already freed along with spmi_controller.

This can be reproduced by enabling DEBUG_TEST_DRIVER_REMOVE and building the kernel with KASAN.

Fix the UAF issue by using unmanaged clk_bulk_get() and putting the clocks before freeing spmi_controller.

Reported-by: Fei Shao fshao@chromium.org Signed-off-by: Yu-Che Cheng giver@chromium.org Link: https://lore.kernel.org/r/20230717173934.1.If004a6e055a189c7f2d0724fa814422c... Tested-by: Fei Shao fshao@chromium.org Reviewed-by: Fei Shao fshao@chromium.org Reviewed-by: Chen-Yu Tsai wenst@chromium.org Signed-off-by: Stephen Boyd sboyd@kernel.org Link: https://lore.kernel.org/r/20231206231733.4031901-3-sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spmi/spmi-mtk-pmif.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/spmi/spmi-mtk-pmif.c b/drivers/spmi/spmi-mtk-pmif.c index ad511f2c3324..739ad0cbd3bb 100644 --- a/drivers/spmi/spmi-mtk-pmif.c +++ b/drivers/spmi/spmi-mtk-pmif.c @@ -465,7 +465,7 @@ static int mtk_spmi_probe(struct platform_device *pdev) for (i = 0; i < arb->nclks; i++) arb->clks[i].id = pmif_clock_names[i];

- err = devm_clk_bulk_get(&pdev->dev, arb->nclks, arb->clks); + err = clk_bulk_get(&pdev->dev, arb->nclks, arb->clks); if (err) { dev_err(&pdev->dev, "Failed to get clocks: %d\n", err); goto err_put_ctrl; @@ -474,7 +474,7 @@ static int mtk_spmi_probe(struct platform_device *pdev) err = clk_bulk_prepare_enable(arb->nclks, arb->clks); if (err) { dev_err(&pdev->dev, "Failed to enable clocks: %d\n", err); - goto err_put_ctrl; + goto err_put_clks; }

ctrl->cmd = pmif_arb_cmd; @@ -498,6 +498,8 @@ static int mtk_spmi_probe(struct platform_device *pdev)

err_domain_remove: clk_bulk_disable_unprepare(arb->nclks, arb->clks); +err_put_clks: + clk_bulk_put(arb->nclks, arb->clks); err_put_ctrl: spmi_controller_put(ctrl); return err; @@ -509,6 +511,7 @@ static int mtk_spmi_remove(struct platform_device *pdev) struct pmif *arb = spmi_controller_get_drvdata(ctrl);

clk_bulk_disable_unprepare(arb->nclks, arb->clks); + clk_bulk_put(arb->nclks, arb->clks); spmi_controller_remove(ctrl); spmi_controller_put(ctrl); return 0;

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 11/27] PCI: Fix 64GT/s effective data rate calculation

From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com

[ Upstream commit ac4f1897fa5433a1b07a625503a91b6aa9d7e643 ]

Unlike the lower rates, the PCIe 64GT/s Data Rate uses 1b/1b encoding, not 128b/130b (PCIe r6.1 sec 1.2, Table 1-1). Correct the PCIE_SPEED2MBS_ENC() calculation to reflect that.

Link: https://lore.kernel.org/r/20240102172701.65501-1-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index ed6d75d138c7..e1d02b7c6029 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -274,7 +274,7 @@ void pci_bus_put(struct pci_bus *bus);

/* PCIe speed to Mb/s reduced by encoding overhead */ #define PCIE_SPEED2MBS_ENC(speed) \ - ((speed) == PCIE_SPEED_64_0GT ? 64000*128/130 : \ + ((speed) == PCIE_SPEED_64_0GT ? 64000*1/1 : \ (speed) == PCIE_SPEED_32_0GT ? 32000*128/130 : \ (speed) == PCIE_SPEED_16_0GT ? 16000*128/130 : \ (speed) == PCIE_SPEED_8_0GT ? 8000*128/130 : \

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 12/27] PCI/AER: Decode Requester ID when no error info found

From: Bjorn Helgaas bhelgaas@google.com

[ Upstream commit 1291b716bbf969e101d517bfb8ba18d958f758b8 ]

When a device with AER detects an error, it logs error information in its own AER Error Status registers. It may send an Error Message to the Root Port (RCEC in the case of an RCiEP), which logs the fact that an Error Message was received (Root Error Status) and the Requester ID of the message source (Error Source Identification).

aer_print_port_info() prints the Requester ID from the Root Port Error Source in the usual Linux "bb:dd.f" format, but when find_source_device() finds no error details in the hierarchy below the Root Port, it printed the raw Requester ID without decoding it.

Decode the Requester ID in the usual Linux format so it matches other messages.

Sample message changes:

- pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5 - pcieport 0000:00:1c.5: AER: can't find device of ID00e5 + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5 + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5

Link: https://lore.kernel.org/r/20231206224231.732765-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aer.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index e2d8a74f83c3..5426f450ce91 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -748,7 +748,7 @@ static void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info) u8 bus = info->id >> 8; u8 devfn = info->id & 0xff;

- pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n", + pci_info(dev, "%s%s error message received from %04x:%02x:%02x.%d\n", info->multi_error_valid ? "Multiple " : "", aer_error_severity_string[info->severity], pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn), @@ -936,7 +936,12 @@ static bool find_source_device(struct pci_dev *parent, pci_walk_bus(parent->subordinate, find_device_iter, e_info);

if (!e_info->error_dev_num) { - pci_info(parent, "can't find device of ID%04x\n", e_info->id); + u8 bus = e_info->id >> 8; + u8 devfn = e_info->id & 0xff; + + pci_info(parent, "found no error details for %04x:%02x:%02x.%d\n", + pci_domain_nr(parent->bus), bus, PCI_SLOT(devfn), + PCI_FUNC(devfn)); return false; } return true;

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 13/27] 9p: Fix initialisation of netfs_inode for 9p

From: David Howells dhowells@redhat.com

[ Upstream commit 9546ac78b232bac56ff975072b1965e0e755ebd4 ]

The 9p filesystem is calling netfs_inode_init() in v9fs_init_inode() - before the struct inode fields have been initialised from the obtained file stats (ie. after v9fs_stat2inode*() has been called), but netfslib wants to set a couple of its fields from i_size.

Reported-by: Marc Dionne marc.dionne@auristor.com Signed-off-by: David Howells dhowells@redhat.com Tested-by: Marc Dionne marc.dionne@auristor.com Tested-by: Dominique Martinet asmadeus@codewreck.org Acked-by: Dominique Martinet asmadeus@codewreck.org cc: Eric Van Hensbergen ericvh@kernel.org cc: Latchesar Ionkov lucho@ionkov.net cc: Dominique Martinet asmadeus@codewreck.org cc: Christian Schoenebeck linux_oss@crudebyte.com cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/9p/v9fs_vfs.h | 1 + fs/9p/vfs_inode.c | 6 +++--- fs/9p/vfs_inode_dotl.c | 1 + 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h index bc417da7e9c1..633fe4f527b8 100644 --- a/fs/9p/v9fs_vfs.h +++ b/fs/9p/v9fs_vfs.h @@ -46,6 +46,7 @@ struct inode *v9fs_alloc_inode(struct super_block *sb); void v9fs_free_inode(struct inode *inode); struct inode *v9fs_get_inode(struct super_block *sb, umode_t mode, dev_t rdev); +void v9fs_set_netfs_context(struct inode *inode); int v9fs_init_inode(struct v9fs_session_info *v9ses, struct inode *inode, umode_t mode, dev_t rdev); void v9fs_evict_inode(struct inode *inode); diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 4d1a4a8d9277..5e2657c1dbbe 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -250,7 +250,7 @@ void v9fs_free_inode(struct inode *inode) /* * Set parameters for the netfs library */ -static void v9fs_set_netfs_context(struct inode *inode) +void v9fs_set_netfs_context(struct inode *inode) { struct v9fs_inode *v9inode = V9FS_I(inode); netfs_inode_init(&v9inode->netfs, &v9fs_req_ops); @@ -344,8 +344,6 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses, err = -EINVAL; goto error; } - - v9fs_set_netfs_context(inode); error: return err;

@@ -377,6 +375,7 @@ struct inode *v9fs_get_inode(struct super_block *sb, umode_t mode, dev_t rdev) iput(inode); return ERR_PTR(err); } + v9fs_set_netfs_context(inode); return inode; }

@@ -479,6 +478,7 @@ static struct inode *v9fs_qid_iget(struct super_block *sb, goto error;

v9fs_stat2inode(st, inode, sb, 0); + v9fs_set_netfs_context(inode); v9fs_cache_inode_get_cookie(inode); unlock_new_inode(inode); return inode; diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c index 5cfa4b4f070f..e15ad46833e0 100644 --- a/fs/9p/vfs_inode_dotl.c +++ b/fs/9p/vfs_inode_dotl.c @@ -130,6 +130,7 @@ static struct inode *v9fs_qid_iget_dotl(struct super_block *sb, goto error;

v9fs_stat2inode_dotl(st, inode, 0); + v9fs_set_netfs_context(inode); v9fs_cache_inode_get_cookie(inode); retval = v9fs_get_acl(inode, fid); if (retval)

-- 2.43.0

Sasha Levin

4:13 p.m.

New subject: [PATCH AUTOSEL 6.1 14/27] misc: lis3lv02d_i2c: Add missing setting of the reg_ctrl callback

From: Hans de Goede hdegoede@redhat.com

[ Upstream commit b1b9f7a494400c0c39f8cd83de3aaa6111c55087 ]

The lis3lv02d_i2c driver was missing a line to set the lis3_dev's reg_ctrl callback.

lis3_reg_ctrl(on) is called from the init callback, but due to the missing reg_ctrl callback the regulators where never turned off again leading to the following oops/backtrace when detaching the driver:

[ 82.313527] ------------[ cut here ]------------ [ 82.313546] WARNING: CPU: 1 PID: 1724 at drivers/regulator/core.c:2396 _regulator_put+0x219/0x230 ... [ 82.313695] RIP: 0010:_regulator_put+0x219/0x230 ... [ 82.314767] Call Trace: [ 82.314770] <TASK> [ 82.314772] ? _regulator_put+0x219/0x230 [ 82.314777] ? __warn+0x81/0x170 [ 82.314784] ? _regulator_put+0x219/0x230 [ 82.314791] ? report_bug+0x18d/0x1c0 [ 82.314801] ? handle_bug+0x3c/0x80 [ 82.314806] ? exc_invalid_op+0x13/0x60 [ 82.314812] ? asm_exc_invalid_op+0x16/0x20 [ 82.314845] ? _regulator_put+0x219/0x230 [ 82.314857] regulator_bulk_free+0x39/0x60 [ 82.314865] i2c_device_remove+0x22/0xb0

Add the missing setting of the callback so that the regulators properly get turned off again when not used.

Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20231224183402.95640-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/lis3lv02d/lis3lv02d_i2c.c | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/misc/lis3lv02d/lis3lv02d_i2c.c b/drivers/misc/lis3lv02d/lis3lv02d_i2c.c index d7daa01fe7ca..fdec2c30eb16 100644 --- a/drivers/misc/lis3lv02d/lis3lv02d_i2c.c +++ b/drivers/misc/lis3lv02d/lis3lv02d_i2c.c @@ -151,6 +151,7 @@ static int lis3lv02d_i2c_probe(struct i2c_client *client, lis3_dev.init = lis3_i2c_init; lis3_dev.read = lis3_i2c_read; lis3_dev.write = lis3_i2c_write; + lis3_dev.reg_ctrl = lis3_reg_ctrl; lis3_dev.irq = client->irq; lis3_dev.ac = lis3lv02d_axis_map; lis3_dev.pm_dev = &client->dev;

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 15/27] libsubcmd: Fix memory leak in uniq()

From: Ian Rogers irogers@google.com

[ Upstream commit ad30469a841b50dbb541df4d6971d891f703c297 ]

uniq() will write one command name over another causing the overwritten string to be leaked. Fix by doing a pass that removes duplicates and a second that removes the holes.

Signed-off-by: Ian Rogers irogers@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Chenyuan Mi cymi20@fudan.edu.cn Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20231208000515.1693746-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/subcmd/help.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/lib/subcmd/help.c b/tools/lib/subcmd/help.c index bf02d62a3b2b..42f57b640f11 100644 --- a/tools/lib/subcmd/help.c +++ b/tools/lib/subcmd/help.c @@ -50,11 +50,21 @@ void uniq(struct cmdnames *cmds) if (!cmds->cnt) return;

- for (i = j = 1; i < cmds->cnt; i++) - if (strcmp(cmds->names[i]->name, cmds->names[i-1]->name)) - cmds->names[j++] = cmds->names[i]; - + for (i = 1; i < cmds->cnt; i++) { + if (!strcmp(cmds->names[i]->name, cmds->names[i-1]->name)) + zfree(&cmds->names[i - 1]); + } + for (i = 0, j = 0; i < cmds->cnt; i++) { + if (cmds->names[i]) { + if (i == j) + j++; + else + cmds->names[j++] = cmds->names[i]; + } + } cmds->cnt = j; + while (j < i) + cmds->names[j++] = NULL; }

void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes)

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 16/27] drm/amdkfd: Fix lock dependency warning

From: Felix Kuehling felix.kuehling@amd.com

[ Upstream commit 47bf0f83fc86df1bf42b385a91aadb910137c5c9 ]

====================================================== WARNING: possible circular locking dependency detected 6.5.0-kfd-fkuehlin #276 Not tainted ------------------------------------------------------ kworker/8:2/2676 is trying to acquire lock: ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock: ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}: __mutex_lock+0x97/0xd30 kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu] kfd_ioctl+0x1b2/0x5d0 [amdgpu] __x64_sys_ioctl+0x86/0xc0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}: down_read+0x42/0x160 svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu] process_one_work+0x27a/0x540 worker_thread+0x53/0x3e0 kthread+0xeb/0x120 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}: __lock_acquire+0x1426/0x2200 lock_acquire+0xc1/0x2b0 __flush_work+0x80/0x550 __cancel_work_timer+0x109/0x190 svm_range_bo_release+0xdc/0x1c0 [amdgpu] svm_range_free+0x175/0x180 [amdgpu] svm_range_deferred_list_work+0x15d/0x340 [amdgpu] process_one_work+0x27a/0x540 worker_thread+0x53/0x3e0 kthread+0xeb/0x120 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of: (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

Possible unsafe locking scenario:

CPU0 CPU1 ---- ---- lock(&svms->lock); lock(&mm->mmap_lock); lock(&svms->lock); lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO refcount is non-0. That means it's impossible that svm_range_bo_release is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in svm_range_schedule_evict_svm_bo instead of in the worker. That way it's impossible for a BO to get freed while eviction work is pending and the cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also removed redundant checks that are already done in amdkfd_fence_enable_signaling.

Signed-off-by: Felix Kuehling felix.kuehling@amd.com Reviewed-by: Philip Yang philip.yang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 208812512d8a..4ecc4be1a910 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -380,14 +380,9 @@ static void svm_range_bo_release(struct kref *kref) spin_lock(&svm_bo->list_lock); } spin_unlock(&svm_bo->list_lock); - if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) { - /* We're not in the eviction worker. - * Signal the fence and synchronize with any - * pending eviction work. - */ + if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) + /* We're not in the eviction worker. Signal the fence. */ dma_fence_signal(&svm_bo->eviction_fence->base); - cancel_work_sync(&svm_bo->eviction_work); - } dma_fence_put(&svm_bo->eviction_fence->base); amdgpu_bo_unref(&svm_bo->bo); kfree(svm_bo); @@ -3310,13 +3305,14 @@ svm_range_trigger_migration(struct mm_struct *mm, struct svm_range *prange,

int svm_range_schedule_evict_svm_bo(struct amdgpu_amdkfd_fence *fence) { - if (!fence) - return -EINVAL; - - if (dma_fence_is_signaled(&fence->base)) - return 0; - - if (fence->svm_bo) { + /* Dereferencing fence->svm_bo is safe here because the fence hasn't + * signaled yet and we're under the protection of the fence->lock. + * After the fence is signaled in svm_range_bo_release, we cannot get + * here any more. + * + * Reference is dropped in svm_range_evict_svm_bo_worker. + */ + if (svm_bo_ref_unless_zero(fence->svm_bo)) { WRITE_ONCE(fence->svm_bo->evicting, 1); schedule_work(&fence->svm_bo->eviction_work); } @@ -3331,8 +3327,6 @@ static void svm_range_evict_svm_bo_worker(struct work_struct *work) int r = 0;

svm_bo = container_of(work, struct svm_range_bo, eviction_work); - if (!svm_bo_ref_unless_zero(svm_bo)) - return; /* svm_bo was freed while eviction was pending */

if (mmget_not_zero(svm_bo->eviction_fence->mm)) { mm = svm_bo->eviction_fence->mm;

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 17/27] drm/amdkfd: Fix lock dependency warning with srcu

From: Philip Yang Philip.Yang@amd.com

[ Upstream commit 2a9de42e8d3c82c6990d226198602be44f43f340 ]

====================================================== WARNING: possible circular locking dependency detected 6.5.0-kfd-yangp #2289 Not tainted ------------------------------------------------------ kworker/0:2/996 is trying to acquire lock: (srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock: ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}, at: process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}: __flush_work+0x88/0x4f0 svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu] svm_range_set_attr+0xd6/0x14c0 [amdgpu] kfd_ioctl+0x1d1/0x630 [amdgpu] __x64_sys_ioctl+0x88/0xc0

-> #2 (&info->lock#2){+.+.}-{3:3}: __mutex_lock+0x99/0xc70 amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu] restore_process_helper+0x22/0x80 [amdgpu] restore_process_worker+0x2d/0xa0 [amdgpu] process_one_work+0x29b/0x560 worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(&process->restore_work)->work)){+.+.}-{0:0}: __flush_work+0x88/0x4f0 __cancel_work_timer+0x12c/0x1c0 kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu] __mmu_notifier_release+0xad/0x240 exit_mmap+0x6a/0x3a0 mmput+0x6a/0x120 do_exit+0x322/0xb90 do_group_exit+0x37/0xa0 __x64_sys_exit_group+0x18/0x20 do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}: __lock_acquire+0x1521/0x2510 lock_sync+0x5f/0x90 __synchronize_srcu+0x4f/0x1a0 __mmu_notifier_release+0x128/0x240 exit_mmap+0x6a/0x3a0 mmput+0x6a/0x120 svm_range_deferred_list_work+0x19f/0x350 [amdgpu] process_one_work+0x29b/0x560 worker_thread+0x3d/0x3d0

other info that might help us debug this: Chain exists of: srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work)

Possible unsafe locking scenario:

CPU0 CPU1 ---- ---- lock((work_completion)(&svms->deferred_list_work)); lock(&info->lock#2); lock((work_completion)(&svms->deferred_list_work)); sync(srcu);

Signed-off-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 4ecc4be1a910..5188c4d2e7c0 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2241,8 +2241,10 @@ static void svm_range_deferred_list_work(struct work_struct *work) mutex_unlock(&svms->lock); mmap_write_unlock(mm);

- /* Pairs with mmget in svm_range_add_list_work */ - mmput(mm); + /* Pairs with mmget in svm_range_add_list_work. If dropping the + * last mm refcount, schedule release work to avoid circular locking + */ + mmput_async(mm);

spin_lock(&svms->deferred_list_lock); }

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 18/27] virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings

From: Zhu Yanjun yanjun.zhu@linux.dev

[ Upstream commit e3fe8d28c67bf6c291e920c6d04fa22afa14e6e4 ]

Fix the warnings when building virtio_net driver.

" drivers/net/virtio_net.c: In function ‘init_vqs’: drivers/net/virtio_net.c:4551:48: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Wformat-overflow=] 4551 | sprintf(vi->rq[i].name, "input.%d", i); | ^~ In function ‘virtnet_find_vqs’, inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8: drivers/net/virtio_net.c:4551:41: note: directive argument in the range [-2147483643, 65534] 4551 | sprintf(vi->rq[i].name, "input.%d", i); | ^~~~~~~~~~ drivers/net/virtio_net.c:4551:17: note: ‘sprintf’ output between 8 and 18 bytes into a destination of size 16 4551 | sprintf(vi->rq[i].name, "input.%d", i); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/virtio_net.c: In function ‘init_vqs’: drivers/net/virtio_net.c:4552:49: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 9 [-Wformat-overflow=] 4552 | sprintf(vi->sq[i].name, "output.%d", i); | ^~ In function ‘virtnet_find_vqs’, inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8: drivers/net/virtio_net.c:4552:41: note: directive argument in the range [-2147483643, 65534] 4552 | sprintf(vi->sq[i].name, "output.%d", i); | ^~~~~~~~~~~ drivers/net/virtio_net.c:4552:17: note: ‘sprintf’ output between 9 and 19 bytes into a destination of size 16 4552 | sprintf(vi->sq[i].name, "output.%d", i);

Reviewed-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Zhu Yanjun yanjun.zhu@linux.dev Link: https://lore.kernel.org/r/20240104020902.2753599-1-yanjun.zhu@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/virtio_net.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 21d3461fb5d1..45f1a871b7da 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -3474,10 +3474,11 @@ static int virtnet_find_vqs(struct virtnet_info *vi) { vq_callback_t **callbacks; struct virtqueue **vqs; - int ret = -ENOMEM; - int i, total_vqs; const char **names; + int ret = -ENOMEM; + int total_vqs; bool *ctx; + u16 i;

/* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by * possible N-1 RX/TX queue pairs used in multiqueue mode, followed by @@ -3514,8 +3515,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi) for (i = 0; i < vi->max_queue_pairs; i++) { callbacks[rxq2vq(i)] = skb_recv_done; callbacks[txq2vq(i)] = skb_xmit_done; - sprintf(vi->rq[i].name, "input.%d", i); - sprintf(vi->sq[i].name, "output.%d", i); + sprintf(vi->rq[i].name, "input.%u", i); + sprintf(vi->sq[i].name, "output.%u", i); names[rxq2vq(i)] = vi->rq[i].name; names[txq2vq(i)] = vi->sq[i].name; if (ctx)

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 19/27] blk-mq: fix IO hang from sbitmap wakeup race

From: Ming Lei ming.lei@redhat.com

[ Upstream commit 5266caaf5660529e3da53004b8b7174cab6374ed ]

In blk_mq_mark_tag_wait(), __add_wait_queue() may be re-ordered with the following blk_mq_get_driver_tag() in case of getting driver tag failure.

Then in __sbitmap_queue_wake_up(), waitqueue_active() may not observe the added waiter in blk_mq_mark_tag_wait() and wake up nothing, meantime blk_mq_mark_tag_wait() can't get driver tag successfully.

This issue can be reproduced by running the following test in loop, and fio hang can be observed in < 30min when running it on my test VM in laptop.

modprobe -r scsi_debug modprobe scsi_debug delay=0 dev_size_mb=4096 max_queue=1 host_max_queue=1 submit_queues=4 dev=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename` fio --filename=/dev/"$dev" --direct=1 --rw=randrw --bs=4k --iodepth=1 \ --runtime=100 --numjobs=40 --time_based --name=test \ --ioengine=libaio

Fix the issue by adding one explicit barrier in blk_mq_mark_tag_wait(), which is just fine in case of running out of tag.

Cc: Jan Kara jack@suse.cz Cc: Kemeng Shi shikemeng@huaweicloud.com Reported-by: Changhui Zhong czhong@redhat.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20240112122626.4181044-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c index 368f1947c895..ebfefbcb3604 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1859,6 +1859,22 @@ static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx, wait->flags &= ~WQ_FLAG_EXCLUSIVE; __add_wait_queue(wq, wait);

+ /* + * Add one explicit barrier since blk_mq_get_driver_tag() may + * not imply barrier in case of failure. + * + * Order adding us to wait queue and allocating driver tag. + * + * The pair is the one implied in sbitmap_queue_wake_up() which + * orders clearing sbitmap tag bits and waitqueue_active() in + * __sbitmap_queue_wake_up(), since waitqueue_active() is lockless + * + * Otherwise, re-order of adding wait queue and getting driver tag + * may cause __sbitmap_queue_wake_up() to wake up nothing because + * the waitqueue_active() may not observe us in wait queue. + */ + smp_mb(); + /* * It's possible that a tag was freed in the window between the * allocation failure and adding the hardware queue to the wait

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 20/27] ceph: reinitialize mds feature bit even when session in open

From: Venky Shankar vshankar@redhat.com

[ Upstream commit f48e0342a74d7770cdf1d11894bdc3b6d989b29e ]

Following along the same lines as per the user-space fix. Right now this isn't really an issue with the ceph kernel driver because of the feature bit laginess, however, that can change over time (when the new snaprealm info type is ported to the kernel driver) and depending on the MDS version that's being upgraded can cause message decoding issues - so, fix that early on.

Link: http://tracker.ceph.com/issues/63188 Signed-off-by: Venky Shankar vshankar@redhat.com Reviewed-by: Xiubo Li xiubli@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/mds_client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 82874be94524..da9fcf48ab6c 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -3650,11 +3650,11 @@ static void handle_session(struct ceph_mds_session *session, if (session->s_state == CEPH_MDS_SESSION_RECONNECTING) pr_info("mds%d reconnect success\n", session->s_mds);

+ session->s_features = features; if (session->s_state == CEPH_MDS_SESSION_OPEN) { pr_notice("mds%d is already opened\n", session->s_mds); } else { session->s_state = CEPH_MDS_SESSION_OPEN; - session->s_features = features; renewed_caps(mdsc, session, 0); if (test_bit(CEPHFS_FEATURE_METRIC_COLLECT, &session->s_features))

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 21/27] ceph: fix deadlock or deadcode of misusing dget()

From: Xiubo Li xiubli@redhat.com

[ Upstream commit b493ad718b1f0357394d2cdecbf00a44a36fa085 ]

The lock order is incorrect between denty and its parent, we should always make sure that the parent get the lock first.

But since this deadcode is never used and the parent dir will always be set from the callers, let's just remove it.

Link: https://lore.kernel.org/r/20231116081919.GZ1957730@ZenIV Reported-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 36052a362683..111938a6307e 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4597,12 +4597,14 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, struct inode *dir, int mds, int drop, int unless) { - struct dentry *parent = NULL; struct ceph_mds_request_release *rel = *p; struct ceph_dentry_info *di = ceph_dentry(dentry); int force = 0; int ret;

+ /* This shouldn't happen */ + BUG_ON(!dir); + /* * force an record for the directory caps if we have a dentry lease. * this is racy (can't take i_ceph_lock and d_lock together), but it @@ -4612,14 +4614,9 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, spin_lock(&dentry->d_lock); if (di->lease_session && di->lease_session->s_mds == mds) force = 1; - if (!dir) { - parent = dget(dentry->d_parent); - dir = d_inode(parent); - } spin_unlock(&dentry->d_lock);

ret = ceph_encode_inode_release(p, dir, mds, drop, unless, force); - dput(parent);

spin_lock(&dentry->d_lock); if (ret && di->lease_session && di->lease_session->s_mds == mds) {

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 22/27] ceph: fix invalid pointer access if get_quota_realm return ERR_PTR

From: Wenchao Hao haowenchao2@huawei.com

[ Upstream commit 0f4cf64eabc6e16cfc2704f1960e82dc79d91c8d ]

This issue is reported by smatch that get_quota_realm() might return ERR_PTR but we did not handle it. It's not a immediate bug, while we still should address it to avoid potential bugs if get_quota_realm() is changed to return other ERR_PTR in future.

Set ceph_snap_realm's pointer in get_quota_realm()'s to address this issue, the pointer would be set to NULL if get_quota_realm() failed to get struct ceph_snap_realm, so no ERR_PTR would happen any more.

[ xiubli: minor code style clean up ]

Signed-off-by: Wenchao Hao haowenchao2@huawei.com Reviewed-by: Xiubo Li xiubli@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/quota.c | 39 ++++++++++++++++++++++----------------- 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c index f7fcf7f08ec6..ca4932e6f71b 100644 --- a/fs/ceph/quota.c +++ b/fs/ceph/quota.c @@ -194,10 +194,10 @@ void ceph_cleanup_quotarealms_inodes(struct ceph_mds_client *mdsc) }

/* - * This function walks through the snaprealm for an inode and returns the - * ceph_snap_realm for the first snaprealm that has quotas set (max_files, + * This function walks through the snaprealm for an inode and set the + * realmp with the first snaprealm that has quotas set (max_files, * max_bytes, or any, depending on the 'which_quota' argument). If the root is - * reached, return the root ceph_snap_realm instead. + * reached, set the realmp with the root ceph_snap_realm instead. * * Note that the caller is responsible for calling ceph_put_snap_realm() on the * returned realm. @@ -208,18 +208,19 @@ void ceph_cleanup_quotarealms_inodes(struct ceph_mds_client *mdsc) * this function will return -EAGAIN; otherwise, the snaprealms walk-through * will be restarted. */ -static struct ceph_snap_realm *get_quota_realm(struct ceph_mds_client *mdsc, - struct inode *inode, - enum quota_get_realm which_quota, - bool retry) +static int get_quota_realm(struct ceph_mds_client *mdsc, struct inode *inode, + enum quota_get_realm which_quota, + struct ceph_snap_realm **realmp, bool retry) { struct ceph_inode_info *ci = NULL; struct ceph_snap_realm *realm, *next; struct inode *in; bool has_quota;

+ if (realmp) + *realmp = NULL; if (ceph_snap(inode) != CEPH_NOSNAP) - return NULL; + return 0;

restart: realm = ceph_inode(inode)->i_snap_realm; @@ -245,7 +246,7 @@ static struct ceph_snap_realm *get_quota_realm(struct ceph_mds_client *mdsc, break; ceph_put_snap_realm(mdsc, realm); if (!retry) - return ERR_PTR(-EAGAIN); + return -EAGAIN; goto restart; }

@@ -254,8 +255,11 @@ static struct ceph_snap_realm *get_quota_realm(struct ceph_mds_client *mdsc, iput(in);

next = realm->parent; - if (has_quota || !next) - return realm; + if (has_quota || !next) { + if (realmp) + *realmp = realm; + return 0; + }

ceph_get_snap_realm(mdsc, next); ceph_put_snap_realm(mdsc, realm); @@ -264,7 +268,7 @@ static struct ceph_snap_realm *get_quota_realm(struct ceph_mds_client *mdsc, if (realm) ceph_put_snap_realm(mdsc, realm);

- return NULL; + return 0; }

bool ceph_quota_is_same_realm(struct inode *old, struct inode *new) @@ -272,6 +276,7 @@ bool ceph_quota_is_same_realm(struct inode *old, struct inode *new) struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(old->i_sb); struct ceph_snap_realm *old_realm, *new_realm; bool is_same; + int ret;

restart: /* @@ -281,9 +286,9 @@ bool ceph_quota_is_same_realm(struct inode *old, struct inode *new) * dropped and we can then restart the whole operation. */ down_read(&mdsc->snap_rwsem); - old_realm = get_quota_realm(mdsc, old, QUOTA_GET_ANY, true); - new_realm = get_quota_realm(mdsc, new, QUOTA_GET_ANY, false); - if (PTR_ERR(new_realm) == -EAGAIN) { + get_quota_realm(mdsc, old, QUOTA_GET_ANY, &old_realm, true); + ret = get_quota_realm(mdsc, new, QUOTA_GET_ANY, &new_realm, false); + if (ret == -EAGAIN) { up_read(&mdsc->snap_rwsem); if (old_realm) ceph_put_snap_realm(mdsc, old_realm); @@ -485,8 +490,8 @@ bool ceph_quota_update_statfs(struct ceph_fs_client *fsc, struct kstatfs *buf) bool is_updated = false;

down_read(&mdsc->snap_rwsem); - realm = get_quota_realm(mdsc, d_inode(fsc->sb->s_root), - QUOTA_GET_MAX_BYTES, true); + get_quota_realm(mdsc, d_inode(fsc->sb->s_root), QUOTA_GET_MAX_BYTES, + &realm, true); up_read(&mdsc->snap_rwsem); if (!realm) return false;

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 23/27] drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

From: Srinivasan Shanmugam srinivasan.shanmugam@amd.com

[ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ]

In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect structure type is passed to sizeof() in kzalloc, larger structure types were used, thus using correct type 'struct phm_ppm_table' fixes the below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table'

Cc: Eric Huang JinHuiEric.Huang@amd.com Cc: Christian König christian.koenig@amd.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index f2a55c1413f5..17882f8dfdd3 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -200,7 +200,7 @@ static int get_platform_power_management_table( struct pp_hwmgr *hwmgr, ATOM_Tonga_PPM_Table *atom_ppm_table) { - struct phm_ppm_table *ptr = kzalloc(sizeof(ATOM_Tonga_PPM_Table), GFP_KERNEL); + struct phm_ppm_table *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL); struct phm_ppt_v1_information *pp_table_information = (struct phm_ppt_v1_information *)(hwmgr->pptable);

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 24/27] drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'

From: Srinivasan Shanmugam srinivasan.shanmugam@amd.com

[ Upstream commit fac4ebd79fed60e79cccafdad45a2bb8d3795044 ]

The amdgpu_gmc_vram_checking() function in emulation checks whether all of the memory range of shared system memory could be accessed by GPU, from this aspect, -EIO is returned for error scenarios.

Fixes the below: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing error code? 'r' drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing error code? 'r'

Cc: Xiaojian Du Xiaojian.Du@amd.com Cc: Lijo Lazar lijo.lazar@amd.com Cc: Christian König christian.koenig@amd.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Suggested-by: Christian König christian.koenig@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 2bc791ed8830..ea0fb079f942 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -808,19 +808,26 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev) * seconds, so here, we just pick up three parts for emulation. */ ret = memcmp(vram_ptr, cptr, 10); - if (ret) - return ret; + if (ret) { + ret = -EIO; + goto release_buffer; + }

ret = memcmp(vram_ptr + (size / 2), cptr, 10); - if (ret) - return ret; + if (ret) { + ret = -EIO; + goto release_buffer; + }

ret = memcmp(vram_ptr + size - 10, cptr, 10); - if (ret) - return ret; + if (ret) { + ret = -EIO; + goto release_buffer; + }

+release_buffer: amdgpu_bo_free_kernel(&vram_bo, &vram_gpu, &vram_ptr);

- return 0; + return ret; }

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 25/27] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

From: Srinivasan Shanmugam srinivasan.shanmugam@amd.com

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' - 'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu Monk.Liu@amd.com Cc: Christian König christian.koenig@amd.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Suggested-by: Lijo Lazar lijo.lazar@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a5352e5e2bd4..4b91f95066ec 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1310,6 +1310,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev) return true;

fw_ver = *((uint32_t *)adev->pm.fw->data + 69); + release_firmware(adev->pm.fw); if (fw_ver < 0x00160e00) return true; }

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 26/27] drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

From: Srinivasan Shanmugam srinivasan.shanmugam@amd.com

[ Upstream commit d7a254fad873775ce6c32b77796c81e81e6b7f2e ]

Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next return value still needs NULL check, thus modified from "node" to "rb_node".

Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL?

Suggested-by: Philip Yang Philip.Yang@amd.com Cc: Felix Kuehling Felix.Kuehling@amd.com Cc: Christian König christian.koenig@amd.com Cc: Alex Deucher alexander.deucher@amd.com Signed-off-by: Srinivasan Shanmugam srinivasan.shanmugam@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 5188c4d2e7c0..7fa5e70f1aac 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2553,6 +2553,7 @@ svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, { struct vm_area_struct *vma; struct interval_tree_node *node; + struct rb_node *rb_node; unsigned long start_limit, end_limit;

vma = find_vma(p->mm, addr << PAGE_SHIFT); @@ -2575,16 +2576,15 @@ svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, if (node) { end_limit = min(end_limit, node->start); /* Last range that ends before the fault address */ - node = container_of(rb_prev(&node->rb), - struct interval_tree_node, rb); + rb_node = rb_prev(&node->rb); } else { /* Last range must end before addr because * there was no range after addr */ - node = container_of(rb_last(&p->svms.objects.rb_root), - struct interval_tree_node, rb); + rb_node = rb_last(&p->svms.objects.rb_root); } - if (node) { + if (rb_node) { + node = container_of(rb_node, struct interval_tree_node, rb); if (node->last >= addr) { WARN(1, "Overlap with prev node and page fault addr\n"); return -EFAULT;

-- 2.43.0

Sasha Levin

4:14 p.m.

New subject: [PATCH AUTOSEL 6.1 27/27] i2c: rk3x: Adjust mask/value offset for i2c2 on rv1126

From: Tim Lunn tim@feathertop.org

[ Upstream commit 92a85b7c6262f19c65a1c115cf15f411ba65a57c ]

Rockchip RV1126 is using old style i2c controller, the i2c2 bus uses a non-sequential offset in the grf register for the mask/value bits for this bus.

This patch fixes i2c2 bus on rv1126 SoCs.

Signed-off-by: Tim Lunn tim@feathertop.org Acked-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-rk3x.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c index 6aa4f1f06240..c8cd5cadcf56 100644 --- a/drivers/i2c/busses/i2c-rk3x.c +++ b/drivers/i2c/busses/i2c-rk3x.c @@ -1295,8 +1295,12 @@ static int rk3x_i2c_probe(struct platform_device *pdev) return -EINVAL; }

- /* 27+i: write mask, 11+i: value */ - value = BIT(27 + bus_nr) | BIT(11 + bus_nr); + /* rv1126 i2c2 uses non-sequential write mask 20, value 4 */ + if (i2c->soc_data == &rv1126_soc_data && bus_nr == 2) + value = BIT(20) | BIT(4); + else + /* 27+i: write mask, 11+i: value */ + value = BIT(27 + bus_nr) | BIT(11 + bus_nr);

ret = regmap_write(grf, i2c->soc_data->grf_offset, value); if (ret != 0) {

-- 2.43.0

Tim Lunn

11:19 p.m.

New subject: [PATCH AUTOSEL 6.1 27/27] i2c: rk3x: Adjust mask/value offset for i2c2 on rv1126

Hi Sasha,

Support for the rv1126 SoC was only added around linux 6.2 and 6.3, thus doesnt make sense to pick this patch up in 6.1

Regards Tim

On 1/29/24 03:14, Sasha Levin wrote:

...

From: Tim Lunn tim@feathertop.org

[ Upstream commit 92a85b7c6262f19c65a1c115cf15f411ba65a57c ]

Rockchip RV1126 is using old style i2c controller, the i2c2 bus uses a non-sequential offset in the grf register for the mask/value bits for this bus.

This patch fixes i2c2 bus on rv1126 SoCs.

Signed-off-by: Tim Lunn tim@feathertop.org Acked-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org

drivers/i2c/busses/i2c-rk3x.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c index 6aa4f1f06240..c8cd5cadcf56 100644 --- a/drivers/i2c/busses/i2c-rk3x.c +++ b/drivers/i2c/busses/i2c-rk3x.c @@ -1295,8 +1295,12 @@ static int rk3x_i2c_probe(struct platform_device *pdev) return -EINVAL; }
/* 27+i: write mask, 11+i: value */
value = BIT(27 + bus_nr) | BIT(11 + bus_nr);
/* rv1126 i2c2 uses non-sequential write mask 20, value 4 */
if (i2c->soc_data == &rv1126_soc_data && bus_nr == 2)
	value = BIT(20) | BIT(4);
else
	/* 27+i: write mask, 11+i: value */
	value = BIT(27 + bus_nr) | BIT(11 + bus_nr);
ret = regmap_write(grf, i2c->soc_data->grf_offset, value); if (ret != 0) {

Sasha Levin

22 Feb 22 Feb

12:37 p.m.

New subject: [PATCH AUTOSEL 6.1 27/27] i2c: rk3x: Adjust mask/value offset for i2c2 on rv1126

On Mon, Jan 29, 2024 at 10:19:35AM +1100, Tim Lunn wrote:

...

Hi Sasha,

Support for the rv1126 SoC was only added around linux 6.2 and 6.3, thus doesnt make sense to pick this patch up in 6.1

I'll drop it from 6.1 (and older), thanks!

-- Thanks, Sasha

684

days inactive

709

days old

linux-stable-mirror@lists.linaro.org

28 comments

participants

tags (0)

participants (2)

Sasha Levin
Tim Lunn