[PATCH 6.3 00/13] 6.3.12-rc1 review

Greg Kroah-Hartman

3 Jul 2023

6:54 p.m.

This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series; all will be posted as responses to this one. If anyone has any issues with these being applied, please let me know.

Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000. Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1....
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman gregkh@linuxfoundation.org
    Linux 6.3.12-rc1

Bas Nieuwenhuizen bas@basnieuwenhuizen.nl
    drm/amdgpu: Validate VM ioctl flags.

Demi Marie Obenour demi@invisiblethingslab.com
    dm ioctl: Avoid double-fetch of version

Ahmed S. Darwish darwi@linutronix.de
    docs: Set minimal gtags / GNU GLOBAL version to 6.6.5

Ahmed S. Darwish darwi@linutronix.de
    scripts/tags.sh: Resolve gtags empty index generation

Mike Kravetz mike.kravetz@oracle.com
    hugetlb: revert use of page_cache_next_miss()

Finn Thain fthain@linux-m68k.org
    nubus: Partially revert proc_create_single_data() conversion

Dan Williams dan.j.williams@intel.com
    Revert "cxl/port: Enable the HDM decoder capability for switch ports"

Jeff Layton jlayton@kernel.org
    nfs: don't report STATX_BTIME in ->getattr

Linus Torvalds torvalds@linux-foundation.org
    execve: always mark stack as growing down during early stack setup

Mario Limonciello mario.limonciello@amd.com
    PCI/ACPI: Call _REG when transitioning D-states

Bjorn Helgaas bhelgaas@google.com
    PCI/ACPI: Validate acpi_pci_set_power_state() parameter

Aric Cyr aric.cyr@amd.com
    drm/amd/display: Do not update DRR while BW optimizations pending

Max Filippov jcmvbkbc@gmail.com
    xtensa: fix lock_mm_and_find_vma in case VMA not found

-------------

Diffstat:

Greg Kroah-Hartman

3 Jul

6:54 p.m.

New subject: [PATCH 6.3 01/13] xtensa: fix lock_mm_and_find_vma in case VMA not found

From: Max Filippov jcmvbkbc@gmail.com

commit 03f889378f33aa9a9d8e5f49ba94134cf6158090 upstream.

MMU version of lock_mm_and_find_vma releases the mm lock before returning when VMA is not found. Do the same in noMMU version. This fixes hang on an attempt to handle protection fault.

Fixes: d85a143b69ab ("xtensa: fix NOMMU build with lock_mm_and_find_vma() conversion")
Signed-off-by: Max Filippov jcmvbkbc@gmail.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/nommu.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -637,8 +637,13 @@ EXPORT_SYMBOL(find_vma);
 struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
 			unsigned long addr, struct pt_regs *regs)
 {
+	struct vm_area_struct *vma;
+
 	mmap_read_lock(mm);
-	return vma_lookup(mm, addr);
+	vma = vma_lookup(mm, addr);
+	if (!vma)
+		mmap_read_unlock(mm);
+	return vma;
 }
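
The locking contract this patch restores can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: `struct region`, `table_read_lock()`, `table_read_unlock()` and `lock_and_find()` are hypothetical stand-ins for the VMA, `mmap_read_lock()`, `mmap_read_unlock()` and `lock_mm_and_find_vma()`, and a plain hold counter replaces the real lock so the example stays single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: a half-open address range. */
struct region {
	unsigned long start;
	unsigned long end;
};

static struct region regions[] = {
	{ 0x1000, 0x2000 },
	{ 0x4000, 0x8000 },
};

/* Hypothetical stand-in for the mmap read lock: a plain hold counter. */
static int read_lock_depth;

static void table_read_lock(void)
{
	read_lock_depth++;
}

static void table_read_unlock(void)
{
	assert(read_lock_depth > 0);	/* unlocking an unheld lock is a bug */
	read_lock_depth--;
}

/*
 * Take the read lock and look up addr.
 * Returns the region with the lock still held, or NULL with the lock
 * already released, so the caller never unlocks on the failure path.
 */
static struct region *lock_and_find(unsigned long addr)
{
	struct region *found = NULL;
	size_t i;

	table_read_lock();
	for (i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		if (addr >= regions[i].start && addr < regions[i].end) {
			found = &regions[i];
			break;
		}
	}
	if (!found)
		table_read_unlock();
	return found;
}
```

A caller written against this contract unlocks only when the lookup succeeds. If the failure path returned with the lock still held, as the pre-patch noMMU version did, that caller would leave the lock taken forever.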

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 02/13] drm/amd/display: Do not update DRR while BW optimizations pending

From: Aric Cyr aric.cyr@amd.com

commit 32953485c558cecf08f33fbfa251e80e44cef981 upstream.

[why] While bandwidth optimizations are pending, it's possible a pstate change will occur. During this time, the VSYNC handler should not also try to update DRR parameters, since doing so can cause a pstate hang.

[how] Do not adjust DRR if optimize bandwidth is set.

Reviewed-by: Aric Cyr aric.cyr@amd.com
Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com
Signed-off-by: Aric Cyr aric.cyr@amd.com
Tested-by: Daniel Wheeler daniel.wheeler@amd.com
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Cc: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 48 ++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 19 deletions(-)

--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -400,6 +400,13 @@ bool dc_stream_adjust_vmin_vmax(struct d
 {
 	int i;

+	/*
+	 * Don't adjust DRR while there's bandwidth optimizations pending to
+	 * avoid conflicting with firmware updates.
+	 */
+	if (dc->optimized_required || dc->wm_optimized_required)
+		return false;
+
 	stream->adjust.v_total_max = adjust->v_total_max;
 	stream->adjust.v_total_mid = adjust->v_total_mid;
 	stream->adjust.v_total_mid_frame_num = adjust->v_total_mid_frame_num;
@@ -2201,27 +2208,33 @@ void dc_post_update_surfaces_to_stream(s

 	post_surface_trace(dc);

-	if (dc->ctx->dce_version >= DCE_VERSION_MAX)
-		TRACE_DCN_CLOCK_STATE(&context->bw_ctx.bw.dcn.clk);
-	else
+	/*
+	 * Only relevant for DCN behavior where we can guarantee the optimization
+	 * is safe to apply - retain the legacy behavior for DCE.
+	 */
+
+	if (dc->ctx->dce_version < DCE_VERSION_MAX)
 		TRACE_DCE_CLOCK_STATE(&context->bw_ctx.bw.dce);
+	else {
+		TRACE_DCN_CLOCK_STATE(&context->bw_ctx.bw.dcn.clk);

-	if (is_flip_pending_in_pipes(dc, context))
-		return;
+		if (is_flip_pending_in_pipes(dc, context))
+			return;

-	for (i = 0; i < dc->res_pool->pipe_count; i++)
-		if (context->res_ctx.pipe_ctx[i].stream == NULL ||
-		    context->res_ctx.pipe_ctx[i].plane_state == NULL) {
-			context->res_ctx.pipe_ctx[i].pipe_idx = i;
-			dc->hwss.disable_plane(dc, &context->res_ctx.pipe_ctx[i]);
-		}
+		for (i = 0; i < dc->res_pool->pipe_count; i++)
+			if (context->res_ctx.pipe_ctx[i].stream == NULL ||
+			    context->res_ctx.pipe_ctx[i].plane_state == NULL) {
+				context->res_ctx.pipe_ctx[i].pipe_idx = i;
+				dc->hwss.disable_plane(dc, &context->res_ctx.pipe_ctx[i]);
+			}

-	process_deferred_updates(dc);
+		process_deferred_updates(dc);

-	dc->hwss.optimize_bandwidth(dc, context);
+		dc->hwss.optimize_bandwidth(dc, context);

-	if (dc->debug.enable_double_buffered_dsc_pg_support)
-		dc->hwss.update_dsc_pg(dc, context, true);
+		if (dc->debug.enable_double_buffered_dsc_pg_support)
+			dc->hwss.update_dsc_pg(dc, context, true);
+	}

 	dc->optimized_required = false;
 	dc->wm_optimized_required = false;
@@ -4203,12 +4216,9 @@ void dc_commit_updates_for_stream(struct
 				if (new_pipe->plane_state && new_pipe->plane_state != old_pipe->plane_state)
 					new_pipe->plane_state->force_full_update = true;
 			}
-	} else if (update_type == UPDATE_TYPE_FAST && dc_ctx->dce_version >= DCE_VERSION_MAX) {
+	} else if (update_type == UPDATE_TYPE_FAST) {
 		/*
 		 * Previous frame finished and HW is ready for optimization.
-		 *
-		 * Only relevant for DCN behavior where we can guarantee the optimization
-		 * is safe to apply - retain the legacy behavior for DCE.
 		 */
 		dc_post_update_surfaces_to_stream(dc);
 	}

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 03/13] PCI/ACPI: Validate acpi_pci_set_power_state() parameter

From: Bjorn Helgaas bhelgaas@google.com

commit 5557b62634abbd55bab7b154ce4bca348ad7f96f upstream.

Previously acpi_pci_set_power_state() assumed the requested power state was valid (PCI_D0 ... PCI_D3cold). If a caller supplied something else, we could index outside the state_conv[] array and pass junk to acpi_device_set_power().

Validate the pci_power_t parameter and return -EINVAL if it's invalid.

Link: https://lore.kernel.org/r/20230621222857.GA122930@bhelgaas
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/pci-acpi.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1053,32 +1053,37 @@ int acpi_pci_set_power_state(struct pci_
 		[PCI_D3hot] = ACPI_STATE_D3_HOT,
 		[PCI_D3cold] = ACPI_STATE_D3_COLD,
 	};
-	int error = -EINVAL;
+	int error;

 	/* If the ACPI device has _EJ0, ignore the device */
 	if (!adev || acpi_has_method(adev->handle, "_EJ0"))
 		return -ENODEV;

 	switch (state) {
-	case PCI_D3cold:
-		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
-				PM_QOS_FLAGS_ALL) {
-			error = -EBUSY;
-			break;
-		}
-		fallthrough;
 	case PCI_D0:
 	case PCI_D1:
 	case PCI_D2:
 	case PCI_D3hot:
-		error = acpi_device_set_power(adev, state_conv[state]);
+	case PCI_D3cold:
+		break;
+	default:
+		return -EINVAL;
 	}

-	if (!error)
-		pci_dbg(dev, "power state changed by ACPI to %s\n",
-			acpi_power_state_string(adev->power.state));
+	if (state == PCI_D3cold) {
+		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
+				PM_QOS_FLAGS_ALL)
+			return -EBUSY;
+	}
+
+	error = acpi_device_set_power(adev, state_conv[state]);
+	if (error)
+		return error;
+
+	pci_dbg(dev, "power state changed by ACPI to %s\n",
+		acpi_power_state_string(adev->power.state));

-	return error;
+	return 0;
 }

 pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
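
The validate-before-index pattern in this patch can be shown in isolation. The sketch below is a userspace illustration under stated assumptions: `enum pwr_state`, the numeric table values and `convert_state()` are hypothetical stand-ins for `pci_power_t`, the ACPI state constants and the body of `acpi_pci_set_power_state()`. Only the shape of the check mirrors the diff.

```c
#include <errno.h>

/* Hypothetical stand-ins for the PCI power states. */
enum pwr_state { PWR_D0, PWR_D1, PWR_D2, PWR_D3HOT, PWR_D3COLD };

/* Hypothetical conversion table; the values are illustrative only. */
static const int state_conv[] = {
	[PWR_D0]     = 0,
	[PWR_D1]     = 1,
	[PWR_D2]     = 2,
	[PWR_D3HOT]  = 3,
	[PWR_D3COLD] = 4,
};

/*
 * Translate a requested state, rejecting anything that would index
 * outside state_conv[]. On success *out holds the translated value;
 * on failure *out is left untouched.
 */
static int convert_state(int state, int *out)
{
	switch (state) {
	case PWR_D0:
	case PWR_D1:
	case PWR_D2:
	case PWR_D3HOT:
	case PWR_D3COLD:
		break;
	default:
		return -EINVAL;
	}

	*out = state_conv[state];
	return 0;
}
```

Listing the accepted values in a `switch` rather than comparing against a numeric range keeps the check correct even if the enumeration gains values that are not valid table indices.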

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 04/13] PCI/ACPI: Call _REG when transitioning D-states

From: Mario Limonciello mario.limonciello@amd.com

commit 112a7f9c8edbf76f7cb83856a6cb6b60a210b659 upstream.

ACPI r6.5, sec 6.5.4, describes how AML is unable to access an OperationRegion unless _REG has been called to connect a handler:

The OS runs _REG control methods to inform AML code of a change in the availability of an operation region. When an operation region handler is unavailable, AML cannot access data fields in that region. (Operation region writes will be ignored and reads will return indeterminate data.)

The PCI core does not call _REG at any time, leading to the undefined behavior mentioned in the spec.

The spec explains that _REG should be executed to indicate whether a given region can be accessed:

Once _REG has been executed for a particular operation region, indicating that the operation region handler is ready, a control method can access fields in the operation region. Conversely, control methods must not access fields in operation regions when _REG method execution has not indicated that the operation region handler is ready.

An example included in the spec demonstrates calling _REG when devices are turned off: "when the host controller or bridge controller is turned off or disabled, PCI Config Space Operation Regions for child devices are no longer available. As such, ETH0’s _REG method will be run when it is turned off and will again be run when PCI1 is turned off."

It is reported that ASMedia PCIe GPIO controllers fail functional tests after the system has returned from suspend (S3 or s2idle). This is because the BIOS checks whether the OSPM has called the _REG method to determine whether it can interact with the OperationRegion assigned to the device as part of the other AML called for the device.

To fix this issue, call acpi_evaluate_reg() when devices are transitioning to D3cold or D0.

[bhelgaas: split pci_power_t checking to preliminary patch]
Link: https://uefi.org/specs/ACPI/6.5/06_Device_Configuration.html#reg-region
Link: https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Reviewed-by: Rafael J. Wysocki rafael@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/pci/pci-acpi.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *
 	return false;
 }

+static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
+{
+	int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
+	int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
+				    ACPI_ADR_SPACE_PCI_CONFIG, val);
+	if (ret)
+		pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
+			enable ? "connect" : "disconnect", ret);
+}
+
 int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 {
 	struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
@@ -1074,6 +1084,9 @@ int acpi_pci_set_power_state(struct pci_
 		if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
 				PM_QOS_FLAGS_ALL)
 			return -EBUSY;
+
+		/* Notify AML lack of PCI config space availability */
+		acpi_pci_config_space_access(dev, false);
 	}

 	error = acpi_device_set_power(adev, state_conv[state]);
@@ -1083,6 +1096,15 @@ int acpi_pci_set_power_state(struct pci_
 	pci_dbg(dev, "power state changed by ACPI to %s\n",
 		acpi_power_state_string(adev->power.state));

+	/*
+	 * Notify AML of PCI config space availability. Config space is
+	 * accessible in all states except D3cold; the only transitions
+	 * that change availability are transitions to D3cold and from
+	 * D3cold to D0.
+	 */
+	if (state == PCI_D0)
+		acpi_pci_config_space_access(dev, true);
+
 	return 0;
 }
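
The ordering the diff establishes (disconnect before entering D3cold, connect after reaching D0) can be traced with a small model. This is a sketch under assumptions, not driver code: `struct trace`, `record()` and `set_power_state()` are hypothetical, and the error paths of the real function, which return before the connect notification, are left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the PCI power states. */
enum pwr_state { PWR_D0, PWR_D1, PWR_D2, PWR_D3HOT, PWR_D3COLD };

/* Steps the model can take, recorded in the order they happen. */
enum event { EV_REG_DISCONNECT, EV_SET_POWER, EV_REG_CONNECT };

#define MAX_EVENTS 4

struct trace {
	enum event ev[MAX_EVENTS];
	size_t n;
};

static void record(struct trace *t, enum event e)
{
	if (t->n < MAX_EVENTS)
		t->ev[t->n++] = e;
}

/*
 * Model of the notification order: config space becomes unavailable
 * when entering D3cold, so AML is told before the power change; it is
 * available again once D0 is reached, so AML is told afterwards.
 */
static void set_power_state(struct trace *t, enum pwr_state state)
{
	if (state == PWR_D3COLD)
		record(t, EV_REG_DISCONNECT);

	record(t, EV_SET_POWER);

	if (state == PWR_D0)
		record(t, EV_REG_CONNECT);
}
```

Transitions to D1, D2 and D3hot record only the power change in this model, matching the comment in the diff that config space stays accessible in every state except D3cold.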

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 05/13] execve: always mark stack as growing down during early stack setup

From: Linus Torvalds torvalds@linux-foundation.org

commit f66066bc5136f25e36a2daff4896c768f18c211e upstream.

While our user stacks can grow either down (all common architectures) or up (parisc and the ia64 register stack), the initial stack setup when we copy the argument and environment strings to the new stack at execve() time is always done by extending the stack downwards.

But it turns out that in commit 8d7071af8907 ("mm: always expand the stack with the mmap write lock held"), as part of making the stack growing code more robust, 'expand_downwards()' was now made to actually check the vma flags:

        if (!(vma->vm_flags & VM_GROWSDOWN))
                return -EFAULT;

and that meant that this execve-time stack expansion started failing on parisc, because on that architecture, the stack flags do not contain the VM_GROWSDOWN bit.

At the same time the new check in expand_downwards() is clearly correct, and simplified the callers, so let's not remove it.

The solution is instead to just codify the fact that yes, during execve(), the stack grows down. This not only matches reality, it ends up being particularly simple: we already have special execve-time flags for the stack (VM_STACK_INCOMPLETE_SETUP) and use those flags to avoid page migration during this setup time (see vma_is_temporary_stack() and invalid_migration_vma()).

So just add VM_GROWSDOWN to that set of temporary flags, and now our stack flags automatically match reality, and the parisc stack expansion works again.

Note that the VM_STACK_INCOMPLETE_SETUP bits will be cleared when the stack is finalized, so we only add the extra VM_GROWSDOWN bit on CONFIG_STACK_GROWSUP architectures (ie parisc) rather than adding it in general.

Link: https://lore.kernel.org/all/612eaa53-6904-6e16-67fc-394f4faa0e16@bell.net/
Link: https://lore.kernel.org/all/5fd98a09-4792-1433-752d-029ae3545168@gmx.de/
Fixes: 8d7071af8907 ("mm: always expand the stack with the mmap write lock held")
Reported-by: John David Anglin dave.anglin@bell.net
Reported-and-tested-by: Helge Deller deller@gmx.de
Reported-and-tested-by: Guenter Roeck linux@roeck-us.net
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/mm.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -384,7 +384,7 @@ extern unsigned int kobjsize(const void
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */

 /* Bits set in the VMA until the stack is in its final location */
-#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ)
+#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)

 #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0)

@@ -406,8 +406,10 @@ extern unsigned int kobjsize(const void

 #ifdef CONFIG_STACK_GROWSUP
 #define VM_STACK	VM_GROWSUP
+#define VM_STACK_EARLY	VM_GROWSDOWN
 #else
 #define VM_STACK	VM_GROWSDOWN
+#define VM_STACK_EARLY	0
 #endif

 #define VM_STACK_FLAGS	(VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)
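
The effect of the new macro can be worked through with plain bit arithmetic. The sketch below is illustrative only: the bit values are placeholders rather than the kernel's definitions, the default and accounting flags are omitted, and a runtime `grows_up` parameter stands in for the compile-time `CONFIG_STACK_GROWSUP` switch so that both configurations fit in one program. All function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder bit values; the real ones live in include/linux/mm.h. */
#define VM_GROWSDOWN	0x00000100UL
#define VM_GROWSUP	0x00000200UL
#define VM_SEQ_READ	0x00008000UL
#define VM_RAND_READ	0x00010000UL

/* Extra bit carried only during early setup, as in the patch. */
static unsigned long vm_stack_early(bool grows_up)
{
	return grows_up ? VM_GROWSDOWN : 0;
}

/* Permanent growth direction of the stack. */
static unsigned long vm_stack(bool grows_up)
{
	return grows_up ? VM_GROWSUP : VM_GROWSDOWN;
}

/* Temporary bits set until the stack is in its final location. */
static unsigned long incomplete_setup(bool grows_up)
{
	return VM_RAND_READ | VM_SEQ_READ | vm_stack_early(grows_up);
}

/* Flags while execve() is still copying strings to the new stack. */
static unsigned long early_flags(bool grows_up)
{
	return vm_stack(grows_up) | incomplete_setup(grows_up);
}

/* Flags once the stack is finalized: the temporary bits are cleared. */
static unsigned long final_flags(bool grows_up)
{
	return early_flags(grows_up) & ~incomplete_setup(grows_up);
}

/* Same test as the check quoted in the commit message. */
static int expand_downwards_check(unsigned long vm_flags)
{
	if (!(vm_flags & VM_GROWSDOWN))
		return -EFAULT;
	return 0;
}
```

On a grows-up configuration the check passes for the early flags and fails again once the temporary bits are cleared, which is the behaviour the commit message describes. On a grows-down configuration the early macro contributes nothing and the permanent flag is unaffected.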

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 06/13] nfs: don't report STATX_BTIME in ->getattr

From: Jeff Layton jlayton@kernel.org

commit cded49ba366220ae7009d71c5804baa01acfb860 upstream.

NFS doesn't properly support reporting the btime in getattr (yet), but 61a968b4f05e mistakenly added it to the request_mask. This causes statx for STATX_BTIME to report a zeroed out btime instead of properly clearing the flag.

Cc: stable@vger.kernel.org # v6.3+
Fixes: 61a968b4f05e ("nfs: report the inode version in getattr if requested")
Signed-off-by: Jeff Layton jlayton@kernel.org
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2214134
Reported-by: Boyang Xue bxue@redhat.com
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/nfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index a910b9a638c5..8172dd4135a1 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -845,7 +845,7 @@ int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,

 	if ((query_flags & AT_STATX_DONT_SYNC) && !force_sync) {

--
2.41.0

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 07/13] Revert "cxl/port: Enable the HDM decoder capability for switch ports"

From: Dan Williams dan.j.williams@intel.com

commit 8f0220af58c3b73e9041377a23708d37600b33c1 upstream.

commit eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports")

...was added on the observation of CXL memory not being accessible after setting up a region on a "cold-plugged" device. A "cold-plugged" CXL device is one that was not present at boot, so platform-firmware/BIOS has no chance to set it up.

While it is true that the debug found the enable bit clear in the host-bridge's instance of the global control register (CXL 3.0 8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is described as:

"This bit is only applicable to CXL.mem devices and shall return 0 on CXL Host Bridges and Upstream Switch Ports."

So it is meant to be zero, and further testing confirmed that this "fix" had no effect on the failure. Revert it, and be more vigilant about proposed fixes in the future. Since the original copied stable@, flag this revert for stable@ as well.

Cc: stable@vger.kernel.org
Fixes: eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports")
Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Reviewed-by: Dave Jiang dave.jiang@intel.com
Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dw...
Signed-off-by: Dan Williams dan.j.williams@intel.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/cxl/core/pci.c        | 27 ++++-----------------------
 drivers/cxl/cxl.h             |  1 -
 drivers/cxl/port.c            | 14 +++++---------
 tools/testing/cxl/Kbuild      |  1 -
 tools/testing/cxl/test/mock.c | 15 ---------------
 5 files changed, 9 insertions(+), 49 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 67f4ab6daa34..74962b18e3b2 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -308,36 +308,17 @@ static void disable_hdm(void *_cxlhdm)
 	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
 }

-int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
+static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
 {
-	void __iomem *hdm;
+	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
 	u32 global_ctrl;

-	/*
-	 * If the hdm capability was not mapped there is nothing to enable and
-	 * the caller is responsible for what happens next. For example,
-	 * emulate a passthrough decoder.
-	 */
-	if (IS_ERR(cxlhdm))
-		return 0;
-
-	hdm = cxlhdm->regs.hdm_decoder;
 	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
-
-	/*
-	 * If the HDM decoder capability was enabled on entry, skip
-	 * registering disable_hdm() since this decode capability may be
-	 * owned by platform firmware.
-	 */
-	if (global_ctrl & CXL_HDM_DECODER_ENABLE)
-		return 0;
-
 	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
 	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);

-	return devm_add_action_or_reset(&port->dev, disable_hdm, cxlhdm);
+	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
 }
-EXPORT_SYMBOL_NS_GPL(devm_cxl_enable_hdm, CXL);

 int cxl_dvsec_rr_decode(struct device *dev, int d,
 			struct cxl_endpoint_dvsec_info *info)
@@ -511,7 +492,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
 	if (info->mem_enabled)
 		return 0;

-	rc = devm_cxl_enable_hdm(port, cxlhdm);
+	rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
 	if (rc)
 		return rc;

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f309b1387858..f0c428cb9a71 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -710,7 +710,6 @@ struct cxl_endpoint_dvsec_info {
 struct cxl_hdm;
 struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
 				   struct cxl_endpoint_dvsec_info *info);
-int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm);
 int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
 				struct cxl_endpoint_dvsec_info *info);
 int devm_cxl_add_passthrough_decoder(struct cxl_port *port);
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index c23b6164e1c0..07c5ac598da1 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -60,17 +60,13 @@ static int discover_region(struct device *dev, void *root)
 static int cxl_switch_port_probe(struct cxl_port *port)
 {
 	struct cxl_hdm *cxlhdm;
-	int rc, nr_dports;
-
-	nr_dports = devm_cxl_port_enumerate_dports(port);
-	if (nr_dports < 0)
-		return nr_dports;
+	int rc;

-	cxlhdm = devm_cxl_setup_hdm(port, NULL);
-	rc = devm_cxl_enable_hdm(port, cxlhdm);
-	if (rc)
+	rc = devm_cxl_port_enumerate_dports(port);
+	if (rc < 0)
 		return rc;

+	cxlhdm = devm_cxl_setup_hdm(port, NULL);
 	if (!IS_ERR(cxlhdm))
 		return devm_cxl_enumerate_decoders(cxlhdm, NULL);

@@ -79,7 +75,7 @@ static int cxl_switch_port_probe(struct cxl_port *port)
 		return PTR_ERR(cxlhdm);
 	}

-	if (nr_dports == 1) {
+	if (rc == 1) {
 		dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
 		return devm_cxl_add_passthrough_decoder(port);
 	}
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 6f9347ade82c..fba7bec96acd 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -6,7 +6,6 @@ ldflags-y += --wrap=acpi_pci_find_root
 ldflags-y += --wrap=nvdimm_bus_register
 ldflags-y += --wrap=devm_cxl_port_enumerate_dports
 ldflags-y += --wrap=devm_cxl_setup_hdm
-ldflags-y += --wrap=devm_cxl_enable_hdm
 ldflags-y += --wrap=devm_cxl_add_passthrough_decoder
 ldflags-y += --wrap=devm_cxl_enumerate_decoders
 ldflags-y += --wrap=cxl_await_media_ready
diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
index 284416527644..de3933a776fd 100644
--- a/tools/testing/cxl/test/mock.c
+++ b/tools/testing/cxl/test/mock.c
@@ -149,21 +149,6 @@ struct cxl_hdm *__wrap_devm_cxl_setup_hdm(struct cxl_port *port,
 }
 EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_setup_hdm, CXL);

-int __wrap_devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
-{
-	int index, rc;
-	struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
-
-	if (ops && ops->is_mock_port(port->uport))
-		rc = 0;
-	else
-		rc = devm_cxl_enable_hdm(port, cxlhdm);
-	put_cxl_mock_ops(index);
-
-	return rc;
-}
-EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_enable_hdm, CXL);
-
 int __wrap_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
 {
 	int rc, index;

--
2.41.0

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 08/13] nubus: Partially revert proc_create_single_data() conversion

From: Finn Thain fthain@linux-m68k.org

commit 0e96647cff9224db564a1cee6efccb13dbe11ee2 upstream.

The conversion to proc_create_single_data() introduced a regression whereby reading a file in /proc/bus/nubus results in a seg fault:

    # grep -r . /proc/bus/nubus/e/
    Data read fault at 0x00000020 in Super Data (pc=0x1074c2)
    BAD KERNEL BUSERR
    Oops: 00000000
    Modules linked in:
    PC: [<001074c2>] PDE_DATA+0xc/0x16
    SR: 2010  SP: 38284958  a2: 01152370
    d0: 00000001    d1: 01013000    d2: 01002790    d3: 00000000
    d4: 00000001    d5: 0008ce2e    a0: 00000000    a1: 00222a40
    Process grep (pid: 45, task=142f8727)
    Frame format=B ssw=074d isc=2008 isb=4e5e daddr=00000020 dobuf=01199e70
    baddr=001074c8 dibuf=ffffffff ver=f
    Stack from 01199e48:
            01199e70 00222a58 01002790 00000000 011a3000 01199eb0 015000c0 00000000
            00000000 01199ec0 01199ec0 000d551a 011a3000 00000001 00000000 00018000
            d003f000 00000003 00000001 0002800d 01052840 01199fa8 c01f8000 00000000
            00000029 0b532b80 00000000 00000000 00000029 0b532b80 01199ee4 00103640
            011198c0 d003f000 00018000 01199fa8 00000000 011198c0 00000000 01199f4c
            000b3344 011198c0 d003f000 00018000 01199fa8 00000000 00018000 011198c0
    Call Trace: [<00222a58>] nubus_proc_rsrc_show+0x18/0xa0
     [<000d551a>] seq_read+0xc4/0x510
     [<00018000>] fp_fcos+0x2/0x82
     [<0002800d>] __sys_setreuid+0x115/0x1c6
     [<00103640>] proc_reg_read+0x5c/0xb0
     [<00018000>] fp_fcos+0x2/0x82
     [<000b3344>] __vfs_read+0x2c/0x13c
     [<00018000>] fp_fcos+0x2/0x82
     [<00018000>] fp_fcos+0x2/0x82
     [<000b8aa2>] sys_statx+0x60/0x7e
     [<000b34b6>] vfs_read+0x62/0x12a
     [<00018000>] fp_fcos+0x2/0x82
     [<00018000>] fp_fcos+0x2/0x82
     [<000b39c2>] ksys_read+0x48/0xbe
     [<00018000>] fp_fcos+0x2/0x82
     [<000b3a4e>] sys_read+0x16/0x1a
     [<00018000>] fp_fcos+0x2/0x82
     [<00002b84>] syscall+0x8/0xc
     [<00018000>] fp_fcos+0x2/0x82
     [<0000c016>] not_ext+0xa/0x18
    Code: 4e5e 4e75 4e56 0000 206e 0008 2068 ffe8 <2068> 0020 2008 4e5e 4e75 4e56 0000 2f0b 206e 0008 2068 0004 2668 0020 206b ffe8
    Disabling lock debugging due to kernel taint

    Segmentation fault

The proc_create_single_data() conversion does not work because single_open(file, nubus_proc_rsrc_show, PDE_DATA(inode)) is not equivalent to the original code.

Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}")
Cc: Christoph Hellwig hch@lst.de
Cc: stable@vger.kernel.org # 5.6+
Signed-off-by: Finn Thain fthain@linux-m68k.org
Reviewed-by: Geert Uytterhoeven geert@linux-m68k.org
Link: https://lore.kernel.org/r/d4e2a586e793cc8d9442595684ab8a077c0fe726.167878391...
Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/nubus/proc.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

--- a/drivers/nubus/proc.c
+++ b/drivers/nubus/proc.c
@@ -137,6 +137,18 @@ static int nubus_proc_rsrc_show(struct s
 	return 0;
 }

+static int nubus_rsrc_proc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, nubus_proc_rsrc_show, inode);
+}
+
+static const struct proc_ops nubus_rsrc_proc_ops = {
+	.proc_open	= nubus_rsrc_proc_open,
+	.proc_read	= seq_read,
+	.proc_lseek	= seq_lseek,
+	.proc_release	= single_release,
+};
+
 void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
 			     const struct nubus_dirent *ent,
 			     unsigned int size)
@@ -152,8 +164,8 @@ void nubus_proc_add_rsrc_mem(struct proc
 		pded = nubus_proc_alloc_pde_data(nubus_dirptr(ent), size);
 	else
 		pded = NULL;
-	proc_create_single_data(name, S_IFREG | 0444, procdir,
-			nubus_proc_rsrc_show, pded);
+	proc_create_data(name, S_IFREG | 0444, procdir,
+			 &nubus_rsrc_proc_ops, pded);
 }

 void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
@@ -166,9 +178,9 @@ void nubus_proc_add_rsrc(struct proc_dir
 		return;

 	snprintf(name, sizeof(name), "%x", ent->type);
-	proc_create_single_data(name, S_IFREG | 0444, procdir,
-			nubus_proc_rsrc_show,
-			nubus_proc_alloc_pde_data(data, 0));
+	proc_create_data(name, S_IFREG | 0444, procdir,
+			 &nubus_rsrc_proc_ops,
+			 nubus_proc_alloc_pde_data(data, 0));
 }

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 09/13] hugetlb: revert use of page_cache_next_miss()

From: Mike Kravetz mike.kravetz@oracle.com

commit fd4aed8d985a3236d0877ff6d0c80ad39d4ce81a upstream.

Ackerley Tng reported an issue with hugetlbfs fallocate as noted in the Closes tag. The issue showed up after the conversion of hugetlb page cache lookup code to use page_cache_next_miss. User visible effects are:

- hugetlbfs fallocate incorrectly returns -EEXIST if pages are present in
  the file.
- hugetlb pages will not be included in core dumps if they need to be
  brought in via GUP.
- userfaultfd UFFDIO_COPY will not notice pages already present in the
  cache. It may try to allocate a new page and potentially return ENOMEM
  as opposed to EEXIST.

Revert the use of page_cache_next_miss() in hugetlb code.

IMPORTANT NOTE FOR STABLE BACKPORTS: This patch will apply cleanly to v6.3. However, due to the change of filemap_get_folio() return values, it will not function correctly. This patch must be modified for stable backports.

[dan.carpenter@linaro.org: fix hugetlbfs_pagecache_present()]
  Link: https://lkml.kernel.org/r/efa86091-6a2c-4064-8f55-9b44e1313015@moroto.mounta...
Link: https://lkml.kernel.org/r/20230621212403.174710-2-mike.kravetz@oracle.com
Fixes: d0ce0e47b323 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
Signed-off-by: Mike Kravetz mike.kravetz@oracle.com
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Reported-by: Ackerley Tng ackerleytng@google.com
Closes: https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com
Reviewed-by: Sidhartha Kumar sidhartha.kumar@oracle.com
Cc: Erdem Aktas erdemaktas@google.com
Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org
Cc: Matthew Wilcox willy@infradead.org
Cc: Muchun Song songmuchun@bytedance.com
Cc: Vishal Annapurve vannapurve@google.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Sidhartha Kumar sidhartha.kumar@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/hugetlbfs/inode.c |  8 +++-----
 mm/hugetlb.c         | 12 ++++++------
 2 files changed, 9 insertions(+), 11 deletions(-)

--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct f
 		 */
 		struct folio *folio;
 		unsigned long addr;
-		bool present;

 		cond_resched();

@@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct f
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);

 		/* See if already present in mapping to avoid alloc/free */
-		rcu_read_lock();
-		present = page_cache_next_miss(mapping, index, 1) != index;
-		rcu_read_unlock();
-		if (present) {
+		folio = filemap_get_folio(mapping, index);
+		if (!IS_ERR(folio)) {
+			folio_put(folio);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 			hugetlb_drop_vma_policy(&pseudo_vma);
 			continue;
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5672,13 +5672,13 @@ static bool hugetlbfs_pagecache_present(
 {
 	struct address_space *mapping = vma->vm_file->f_mapping;
 	pgoff_t idx = vma_hugecache_offset(h, vma, address);
-	bool present;
+	struct folio *folio;

-	rcu_read_lock();
-	present = page_cache_next_miss(mapping, idx, 1) != idx;
-	rcu_read_unlock();
-
-	return present;
+	folio = filemap_get_folio(mapping, idx);
+	if (IS_ERR(folio))
+		return false;
+	folio_put(folio);
+	return true;
 }

 int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
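
The backport note above turns on how a cache lookup reports a miss. The sketch below imitates the kernel's error-pointer convention in userspace to show why the two styles of check are not interchangeable. It is an illustration under assumptions: `err_ptr()`, `is_err()`, `cache_get()`, `folio_put()` and `cache_present()` are hypothetical stand-ins, and which convention `filemap_get_folio()` follows in a given tree has to be checked against that tree.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (userspace imitation). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr falls in the top MAX_ERRNO values of the address space. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a cached folio with a reference count. */
struct folio {
	int refcount;
};

static struct folio cached = { .refcount = 1 };

/*
 * Hypothetical lookup: index 0 is cached, everything else is a miss.
 * A hit returns the folio with an extra reference; a miss returns an
 * encoded error pointer, never NULL.
 */
static struct folio *cache_get(unsigned long index)
{
	if (index != 0)
		return err_ptr(-ENOENT);
	cached.refcount++;
	return &cached;
}

static void folio_put(struct folio *folio)
{
	folio->refcount--;
}

/* Presence test in the style of the diff: check, then drop the reference. */
static bool cache_present(unsigned long index)
{
	struct folio *folio = cache_get(index);

	if (is_err(folio))
		return false;
	folio_put(folio);
	return true;
}
```

With a lookup that encodes errors this way, a plain NULL test would treat every miss as a hit, since the error pointer is non-NULL. With a lookup that returns NULL on a miss, the `is_err()` test fails in the opposite direction, because NULL is not in the error range.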

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 10/13] scripts/tags.sh: Resolve gtags empty index generation

From: Ahmed S. Darwish darwi@linutronix.de

commit e1b37563caffc410bb4b55f153ccb14dede66815 upstream.

gtags considers any file outside of its current working directory "outside the source tree" and refuses to index it. For O= kernel builds, or when "make" is invoked from a directory other than the kernel source tree, gtags ignores the entire kernel source and generates an empty index.

Force-set gtags current working directory to the kernel source tree.

Due to commit 9da0763bdd82 ("kbuild: Use relative path when building in a subdir of the source tree"), if the kernel build is done in a sub-directory of the kernel source tree, the kernel Makefile will set the kernel's $srctree to ".." for shorter compile-time and run-time warnings. Consequently, the list of files to be indexed will be in the "../*" form, rendering all such paths invalid once gtags switches to the kernel source tree as its current working directory.

If gtags indexing is requested and the build directory is not the kernel source tree, index all files in absolute-path form.

Note, indexing in absolute-path form will not affect the generated index, as paths in gtags indices are always relative to the gtags "root directory" anyway (as evidenced by "gtags --dump").

Signed-off-by: Ahmed S. Darwish darwi@linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Masahiro Yamada masahiroy@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 scripts/tags.sh | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -32,6 +32,13 @@ else
 	tree=${srctree}/
 fi
 
+# gtags(1) refuses to index any file outside of its current working dir.
+# If gtags indexing is requested and the build output directory is not
+# the kernel source tree, index all files in absolute-path form.
+if [[ "$1" == "gtags" && -n "${tree}" ]]; then
+	tree=$(realpath "$tree")/
+fi
+
 # Detect if ALLSOURCE_ARCHS is set. If not, we assume SRCARCH
 if [ "${ALLSOURCE_ARCHS}" = "" ]; then
 	ALLSOURCE_ARCHS=${SRCARCH}
@@ -131,7 +138,7 @@ docscope()
 
 dogtags()
 {
-	all_target_sources | gtags -i -f -
+	all_target_sources | gtags -i -C "${tree:-.}" -f - "$PWD"
 }
 
 # Basic regular expressions with an optional /kind-spec/ for ctags and

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 11/13] docs: Set minimal gtags / GNU GLOBAL version to 6.6.5

From: Ahmed S. Darwish darwi@linutronix.de

commit b230235b386589d8f0d631b1c77a95ca79bb0732 upstream.

Kernel build now uses the gtags "-C (--directory)" option, available since GNU GLOBAL v6.6.5. Update the documentation accordingly.

Signed-off-by: Ahmed S. Darwish darwi@linutronix.de
Cc: stable@vger.kernel.org
Link: https://lists.gnu.org/archive/html/info-global/2020-09/msg00000.html
Signed-off-by: Masahiro Yamada masahiroy@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 Documentation/process/changes.rst | 7 +++++++
 1 file changed, 7 insertions(+)

--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -60,6 +60,7 @@ openssl & libcrypto 1.0.0
 bc                     1.06.95         bc --version
 Sphinx\ [#f1]_         1.7             sphinx-build --version
 cpio                   any             cpio --version
+gtags (optional)       6.6.5           gtags --version
 ====================== =============== ========================================
 
 .. [#f1] Sphinx is needed only to build the Kernel documentation
@@ -174,6 +175,12 @@ You will need openssl to build kernels 3
 enabled. You will also need openssl development packages to build kernels 4.3
 and higher.
 
+gtags / GNU GLOBAL (optional)
+-----------------------------
+
+The kernel build requires GNU GLOBAL version 6.6.5 or later to generate
+tag files through ``make gtags``. This is due to its use of the gtags
+``-C (--directory)`` flag.
 
 System utilities
 ****************

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 12/13] dm ioctl: Avoid double-fetch of version

From: Demi Marie Obenour demi@invisiblethingslab.com

commit 249bed821b4db6d95a99160f7d6d236ea5fe6362 upstream.

The version is fetched once in check_version(), which then does some validation and then overwrites the version in userspace with the API version supported by the kernel. copy_params() then fetches the version from userspace *again*, and this time no validation is done. The result is that the kernel's version number is completely controllable by userspace, provided that userspace can win a race condition.

Fix this flaw by not copying the version back to the kernel the second time. This is not exploitable as the version is not further used in the kernel. However, it could become a problem if future patches start relying on the version field.

Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour demi@invisiblethingslab.com
Signed-off-by: Mike Snitzer snitzer@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/md/dm-ioctl.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

--- a/drivers/md/dm-ioctl.c
+++ b/drivers/md/dm-ioctl.c
@@ -1830,30 +1830,36 @@ static ioctl_fn lookup_ioctl(unsigned in
  * As well as checking the version compatibility this always
  * copies the kernel interface version out.
  */
-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
+static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
+			 struct dm_ioctl *kernel_params)
 {
-	uint32_t version[3];
 	int r = 0;
 
-	if (copy_from_user(version, user->version, sizeof(version)))
+	/* Make certain version is first member of dm_ioctl struct */
+	BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
+
+	if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
 		return -EFAULT;
 
-	if ((version[0] != DM_VERSION_MAJOR) ||
-	    (version[1] > DM_VERSION_MINOR)) {
+	if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
+	    (kernel_params->version[1] > DM_VERSION_MINOR)) {
 		DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
 		      DM_VERSION_MAJOR, DM_VERSION_MINOR,
 		      DM_VERSION_PATCHLEVEL,
-		      version[0], version[1], version[2], cmd);
+		      kernel_params->version[0],
+		      kernel_params->version[1],
+		      kernel_params->version[2],
+		      cmd);
 		r = -EINVAL;
 	}
 
 	/*
 	 * Fill in the kernel version.
 	 */
-	version[0] = DM_VERSION_MAJOR;
-	version[1] = DM_VERSION_MINOR;
-	version[2] = DM_VERSION_PATCHLEVEL;
-	if (copy_to_user(user->version, version, sizeof(version)))
+	kernel_params->version[0] = DM_VERSION_MAJOR;
+	kernel_params->version[1] = DM_VERSION_MINOR;
+	kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
+	if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
 		return -EFAULT;
 
 	return r;
@@ -1879,7 +1885,10 @@ static int copy_params(struct dm_ioctl _
 	const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
 	unsigned int noio_flag;
 
-	if (copy_from_user(param_kernel, user, minimum_data_size))
+	/* check_version() already copied version from userspace, avoid TOCTOU */
+	if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
+			   (char __user *)user + sizeof(param_kernel->version),
+			   minimum_data_size - sizeof(param_kernel->version)))
 		return -EFAULT;
 
 	if (param_kernel->data_size < minimum_data_size) {
@@ -1991,7 +2000,7 @@ static int ctl_ioctl(struct file *file,
 	 * Check the interface version passed in. This also
 	 * writes out the kernel's interface version.
 	 */
-	r = check_version(cmd, user);
+	r = check_version(cmd, user, &param_kernel);
 	if (r)
 		return r;
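The single-fetch idea in this patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the dm-ioctl code itself: `struct params`, `fetch()` and the `VERSION_*` values are hypothetical stand-ins (for `struct dm_ioctl`, `copy_from_user()` and the `DM_VERSION_*` constants), and an ordinary second buffer plays the role of user memory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct dm_ioctl: version first, other fields after. */
struct params {
	uint32_t version[3];
	uint32_t data_size;
	uint32_t flags;
};

/* Illustrative values only, not the kernel's DM_VERSION_* constants. */
#define VERSION_MAJOR 4u
#define VERSION_MINOR 48u

/* Same layout assumption the patch enforces with BUILD_BUG_ON(). */
_Static_assert(offsetof(struct params, version) == 0,
	       "version must be the first member");

/* Stand-in for copy_from_user(); returns 0 on success. */
static int fetch(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Fetch the version exactly once, into the kernel-side copy, then validate it. */
static int check_version(const struct params *user, struct params *kernel)
{
	if (fetch(kernel->version, user->version, sizeof(kernel->version)))
		return -EFAULT;
	if (kernel->version[0] != VERSION_MAJOR ||
	    kernel->version[1] > VERSION_MINOR)
		return -EINVAL;
	return 0;
}

/* Fetch only the bytes after the version; the validated copy is not touched. */
static int copy_params(const struct params *user, struct params *kernel)
{
	const size_t skip = sizeof(kernel->version);

	if (fetch((char *)kernel + skip, (const char *)user + skip,
		  sizeof(*kernel) - skip))
		return -EFAULT;
	return 0;
}
```

If the "user" buffer changes its version between the two calls, the kernel-side copy still holds the value that was validated, which is the property the patch is after.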

Greg Kroah-Hartman

6:54 p.m.

New subject: [PATCH 6.3 13/13] drm/amdgpu: Validate VM ioctl flags.

From: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl

commit a2b308044dcaca8d3e580959a4f867a1d5c37fac upstream.

None have been defined yet, so reject anybody setting any. Mesa sets it to 0 anyway.

Signed-off-by: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl
Signed-off-by: Alex Deucher alexander.deucher@amd.com
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2371,6 +2371,10 @@ int amdgpu_vm_ioctl(struct drm_device *d
 	struct amdgpu_fpriv *fpriv = filp->driver_priv;
 	int r;
 
+	/* No valid flags defined yet */
+	if (args->in.flags)
+		return -EINVAL;
+
 	switch (args->in.op) {
 	case AMDGPU_VM_OP_RESERVE_VMID:
 		/* We only have requirement to reserve vmid from gfxhub */
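The check added here follows a common pattern for extensible interfaces: reject every flag bit that has no defined meaning, so that those bits can be assigned later without old callers having set them by accident. A minimal sketch of the pattern, with a hypothetical `struct vm_request` and `VM_KNOWN_FLAGS` mask that are not part of the amdgpu UAPI:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical request layout; only the fields needed for the sketch. */
struct vm_request {
	uint32_t op;
	uint32_t flags;
};

/* Mask of flag bits this interface understands; none are defined yet. */
#define VM_KNOWN_FLAGS 0u

/* Return 0 if the request only uses known flag bits, -EINVAL otherwise. */
static int vm_validate(const struct vm_request *req)
{
	if (req->flags & ~VM_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

With an empty mask this is equivalent to the `if (args->in.flags)` test in the patch; writing it as a mask only makes it easier to extend once a flag is defined.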

Naresh Kamboju

4 Jul

7:34 a.m.

On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

...

This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.

Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000. Anything received after that time might be too late.

The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y and the diffstat can be found below.

thanks,

greg k-h

While running LTP hugetlb testing on x86, the following kernel BUG was noticed on stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:
=========
[ 54.386939] hugefallocate01 (410): drop_caches: 3
g tests.......
tst_hugepage.c:83: TINFO: 2 huge[ 54.396708] BUG: kernel NULL pointer dereference, address: 0000000000000034
[ 54.404495] #PF: supervisor write access in kernel mode
[ 54.409718] #PF: error_code(0x0002) - not-present page
[ 54.414849] PGD 800000010394a067 P4D 800000010394a067 PUD 1033ba067 PMD 0
[ 54.421721] Oops: 0002 [#1] PREEMPT SMP PTI
[ 54.425900] CPU: 3 PID: 411 Comm: hugefallocate01 Not tainted 6.3.12-rc1 #1
[ 54.432860] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.5 11/26/2020
[ 54.440244] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 54.445296] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 54.464041] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 54.469260] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000000
[ 54.476390] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe006b253c0
[ 54.483514] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 54.490640] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 54.497762] R13: ffff9fe006a68010 R14: ffff9fe006a68188 R15: 0000000000000000
[ 54.504887] FS: 00007f8bec2ff740(0000) GS:ffff9fe367b80000(0000) knlGS:0000000000000000
[ 54.512965] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.518702] CR2: 0000000000000034 CR3: 0000000101cd2003 CR4: 00000000003706e0
[ 54.525826] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.532950] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 54.540075] Call Trace:
[ 54.542519] <TASK>
[ 54.544618] ? show_regs+0x6e/0x80
[ 54.548022] ? __die+0x29/0x70
[ 54.551080] ? page_fault_oops+0x154/0x470
[ 54.555186] ? do_user_addr_fault+0x2f3/0x580
[ 54.559551] ? exc_page_fault+0x6b/0x170
[ 54.563502] ? asm_exc_page_fault+0x2b/0x30
[ 54.567686] ? hugetlbfs_fallocate+0x256/0x580
[ 54.572164] vfs_fallocate+0x156/0x360
[ 54.575921] __x64_sys_fallocate+0x47/0x80
[ 54.580018] do_syscall_64+0x3c/0x90
[ 54.583590] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 54.588642] RIP: 0033:0x7f8bec402baa
[ 54.592213] Code: d8 64 89 02 b8 ff ff ff ff eb bd 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 1d 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 54.610952] RSP: 002b:00007fff3fa4bcd8 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
[ 54.618515] RAX: ffffffffffffffda RBX: 00007fff3fa4bcf8 RCX: 00007f8bec402baa
[ 54.625639] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
[ 54.632765] RBP: 0000000000000002 R08: 0000000000000007 R09: 0000000000fd82a0
[ 54.639889] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 54.647011] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 54.654151] </TASK>
[ 54.656364] Modules linked in: x86_pkg_temp_thermal
[ 54.661244] CR2: 0000000000000034
[ 54.664554] ---[ end trace 0000000000000000 ]---
[ 54.669186] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 54.674235] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 54.692972] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 54.698190] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000000
[ 54.705315] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe006b253c0
[ 54.712447] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 54.719578] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 54.726703] R13: ffff9fe006a68010 R14: ffff9fe006a68188 R15: 0000000000000000
[ 54.733827] FS: 00007f8bec2ff740(0000) GS:ffff9fe367b80000(0000) knlGS:0000000000000000
[ 54.741904] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.747641] CR2: 0000000000000034 CR3: 0000000101cd2003 CR4: 00000000003706e0
[ 54.754765] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.761890] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 54.769012] note: hugefallocate01[411] exited with irqs disabled
page(s) reserved
tst_test.c:1558: TINFO: Timeout per run is 0h 05m 00s
tst_test.c:1612: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1614: TBROK: Test killed! (ti[ 54.791850] hugefallocate02 (412): drop_caches: 3
meout?)

Summa[ 54.798824] BUG: kernel NULL pointer dereference, address: 0000000000000034
[ 54.806185] #PF: supervisor write access in kernel mode
[ 54.811402] #PF: error_code(0x0002) - not-present page
[ 54.816532] PGD 8000000103ace067 P4D 8000000103ace067 PUD 1076d0067 PMD 0
[ 54.823406] Oops: 0002 [#2] PREEMPT SMP PTI
[ 54.827584] CPU: 1 PID: 413 Comm: hugefallocate02 Tainted: G D 6.3.12-rc1 #1
[ 54.836015] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.5 11/26/2020
[ 54.843400] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 54.848478] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 54.867215] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 54.872434] RAX: 0000000000000000 RBX: 00000000000000c0 RCX: 0000000000000000
[ 54.879566] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe003b2c300
[ 54.886689] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 54.893813] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 54.900939] R13: ffff9fe004f3c010 R14: ffff9fe004f3c188 R15: 0000000000000000
[ 54.908070] FS: 00007f8454bee740(0000) GS:ffff9fe367a80000(0000) knlGS:0000000000000000
[ 54.916189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.921928] CR2: 0000000000000034 CR3: 0000000106a24001 CR4: 00000000003706e0
[ 54.929051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.936193] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 54.943317] Call Trace:
[ 54.945763] <TASK>
[ 54.947859] ? show_regs+0x6e/0x80
[ 54.951290] ? __die+0x29/0x70
[ 54.954341] ? page_fault_oops+0x154/0x470
[ 54.958435] ? do_user_addr_fault+0x2f3/0x580
[ 54.962818] ? exc_page_fault+0x6b/0x170
[ 54.966745] ? asm_exc_page_fault+0x2b/0x30
[ 54.970931] ? hugetlbfs_fallocate+0x256/0x580
[ 54.975407] vfs_fallocate+0x156/0x360
[ 54.979161] __x64_sys_fallocate+0x47/0x80
[ 54.983254] do_syscall_64+0x3c/0x90
[ 54.986832] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 54.991876] RIP: 0033:0x7f8454cf1baa
[ 54.995449] Code: d8 64 89 02 b8 ff ff ff ff eb bd 0f 1f 44 00 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 1d 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 55.014193] RSP: 002b:00007ffdbcb38428 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
[ 55.021752] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8454cf1baa
[ 55.028877] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
[ 55.036009] RBP: 0000000000000003 R08: 0000000000000040 R09: 00007f8454de10a0
[ 55.043155] R10: 0000000000600000 R11: 0000000000000246 R12: 00007ffdbcb38448
[ 55.050333] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 55.057459] </TASK>
[ 55.059641] Modules linked in: x86_pkg_temp_thermal
[ 55.064513] CR2: 0000000000000034
[ 55.067825] ---[ end trace 0000000000000000 ]---
[ 55.072442] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 55.077495] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 55.096235] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 55.101458] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000000
[ 55.108584] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe006b253c0
[ 55.115708] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 55.122838] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 55.129962] R13: ffff9fe006a68010 R14: ffff9fe006a68188 R15: 0000000000000000
[ 55.137087] FS: 00007f8454bee740(0000) GS:ffff9fe367a80000(0000) knlGS:0000000000000000
[ 55.145200] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 55.150937] CR2: 0000000000000034 CR3: 0000000106a24001 CR4: 00000000003706e0
[ 55.158062] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 55.165194] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 55.172318] note: hugefallocate02[413] exited with irqs disabled
ry: passed 0[ 55.179411] hugefallocate02 (412) used greatest stack depth: 12520 bytes left
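For readers unfamiliar with the "#PF: error_code(0x0002)" lines in the log, the low bits of the x86 page-fault error code can be decoded as in the sketch below. The `PF_*` macro names and `struct pf_info` are local to this example (the kernel uses its own names); only the bit meanings are taken from the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low three bits of the x86 page-fault error code. */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access,      1: write access */
#define PF_USER  (1u << 2) /* 0: supervisor access, 1: user-mode access */

struct pf_info {
	bool present; /* the faulting page was mapped */
	bool write;   /* the access was a write */
	bool user;    /* the access came from user mode */
};

/* Split an error code into the three flags reported in the oops header. */
static struct pf_info pf_decode(uint32_t error_code)
{
	struct pf_info info = {
		.present = (error_code & PF_PROT) != 0,
		.write   = (error_code & PF_WRITE) != 0,
		.user    = (error_code & PF_USER) != 0,
	};
	return info;
}
```

Decoding 0x0002 gives a write to a not-present page from supervisor mode, matching the "supervisor write access" and "not-present page" lines above.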

Links: - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.3.y/build/v6.3.11... - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.3.y/build/v6.3.11... - https://lkft.validation.linaro.org/scheduler/job/6568036#L1767 - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.3.y/build/v6.3.11...

metadata: git_ref: linux-6.3.y git_repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc git_sha: ec916e7bb7e9c20ced0d1fbf4caf972af2cecec9 git_describe: v6.3.11-14-gec916e7bb7e9 kernel_version: 6.3.12-rc1 kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2S4gySfB44bWUQMh7uOgH... build-url: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc/-/pipelines/91... artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2S4gySfB44bWUQMh7uOgH... toolchain: gcc-11

## Build * kernel: 6.3.12-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.3.y * git commit: ec916e7bb7e9c20ced0d1fbf4caf972af2cecec9 * git describe: v6.3.11-14-gec916e7bb7e9 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.3.y/build/v6.3.11...

## Test Regressions (compared to v6.3.11)

* qemu-x86_64, log-parser-test - check-kernel-bug - check-kernel-exception - check-kernel-invalid-opcode - check-kernel-kasan - check-kernel-kfence - check-kernel-oops - check-kernel-panic - check-kernel-warning

* x86, log-parser-boot - check-kernel-warning

* x86, log-parser-test - check-kernel-bug - check-kernel-oops

* x86-kasan, log-parser-boot - check-kernel-warning

* x86_64-clang, log-parser-boot - check-kernel-warning

* x86_64-clang, log-parser-test - check-kernel-bug - check-kernel-oops

## Metric Regressions (compared to v6.3.11)

## Test Fixes (compared to v6.3.11)

## Metric Fixes (compared to v6.3.11)

## Test result summary total: 161897, pass: 132775, fail: 1650, skip: 27332, xfail: 140

## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 145 total, 144 passed, 1 failed * arm64: 54 total, 53 passed, 1 failed * i386: 41 total, 40 passed, 1 failed * mips: 30 total, 28 passed, 2 failed * parisc: 4 total, 4 passed, 0 failed * powerpc: 38 total, 36 passed, 2 failed * riscv: 26 total, 25 passed, 1 failed * s390: 16 total, 14 passed, 2 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 46 total, 46 passed, 0 failed

## Test suites summary * boot * fwts * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-mincore * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * rcutorture * v4l2-compliance * vdso

-- Linaro LKFT https://lkft.linaro.org

Greg Kroah-Hartman

7:43 a.m.

On Tue, Jul 04, 2023 at 01:04:36PM +0530, Naresh Kamboju wrote:

...

On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

...
This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.

Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000. Anything received after that time might be too late.

The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y and the diffstat can be found below.

thanks,

greg k-h

While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:

[ 54.386939] hugefallocate01 (410): drop_caches: 3 g tests.......

And this worked on 6.3.11 just fine?

Trying to narrow down what would have caused this...

Any chance you can run Linus's tree with this LTP test as well?

thanks,

greg k-h

Greg Kroah-Hartman

8:24 a.m.

On Tue, Jul 04, 2023 at 08:43:54AM +0100, Greg Kroah-Hartman wrote:

...

On Tue, Jul 04, 2023 at 01:04:36PM +0530, Naresh Kamboju wrote:

...
On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

...
This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.

Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000. Anything received after that time might be too late.

The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y and the diffstat can be found below.

thanks,

greg k-h

While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:

[ 54.386939] hugefallocate01 (410): drop_caches: 3 g tests.......

And this worked on 6.3.11 just fine?

Trying to narrow down what would have caused this...

Any chance you can run Linus's tree with this LTP test as well?

Ah, I can hit this here locally too! Let me bisect...

thanks,

greg k-h

Greg Kroah-Hartman

8:39 a.m.

On Tue, Jul 04, 2023 at 09:24:37AM +0100, Greg Kroah-Hartman wrote:

...

On Tue, Jul 04, 2023 at 08:43:54AM +0100, Greg Kroah-Hartman wrote:

...
On Tue, Jul 04, 2023 at 01:04:36PM +0530, Naresh Kamboju wrote:

...
On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:

...
This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.

Responses should be made by Wed, 05 Jul 2023 18:45:08 +0000. Anything received after that time might be too late.

The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.3.12-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.3.y and the diffstat can be found below.

thanks,

greg k-h

While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:

[ 54.386939] hugefallocate01 (410): drop_caches: 3 g tests.......

And this worked on 6.3.11 just fine?

Trying to narrow down what would have caused this...

Any chance you can run Linus's tree with this LTP test as well?

Ah, I can hit this here locally too! Let me bisect...

Found it. I'll drop the offending patch and push out a new -rc release, thanks.

greg k-h

Harshit Mogalapalli

8:43 a.m.

Hi Greg,

On 04/07/23 1:54 pm, Greg Kroah-Hartman wrote:

...

...
...
While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:

[ 54.386939] hugefallocate01 (410): drop_caches: 3 g tests.......

And this worked on 6.3.11 just fine?

Trying to narrow down what would have caused this...

Any chance you can run Linus's tree with this LTP test as well?

Ah, I can hit this here locally too! Let me bisect...

Have you looked at Patch 9 of this series:

https://lore.kernel.org/stable/2023070416-wow-phrasing-b92c@gregkh/T/#m12068...

Looks very much related, it also has a note on Backporting. As I think it could be related, I am sharing this. (But haven't tested anything)

Thanks, Harshit

...

thanks,

greg k-h

Greg Kroah-Hartman

8:47 a.m.

On Tue, Jul 04, 2023 at 02:13:03PM +0530, Harshit Mogalapalli wrote:

...

Hi Greg,

On 04/07/23 1:54 pm, Greg Kroah-Hartman wrote:

...
...
...
While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Reported-by: Linux Kernel Functional Testing lkft@linaro.org

Crash log:

[ 54.386939] hugefallocate01 (410): drop_caches: 3 g tests.......

And this worked on 6.3.11 just fine?

Trying to narrow down what would have caused this...

Any chance you can run Linus's tree with this LTP test as well?

Ah, I can hit this here locally too! Let me bisect...

Have you looked at Patch 9 of this series:

https://lore.kernel.org/stable/2023070416-wow-phrasing-b92c@gregkh/T/#m12068...

Looks very much related, it also has a note on Backporting. As I think it could be related, I am sharing this. (But haven't tested anything)

Yes, that's the offending patch. I should have read over the full changelogs before doing bisection, but bisection/test proved that this was not correct for 6.3.y at this point in time.

thanks,

greg k-h
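One plausible reading of why the backport misbehaves, offered as an assumption since the thread only states that the patch was not correct for 6.3.y: the new code tests the lookup result with `IS_ERR()`, which only helps if a miss is reported as an error pointer. If the 6.3.y lookup instead reports a miss as NULL, `IS_ERR(NULL)` is false and the following `folio_put()` would operate on a NULL pointer, which would fit the oops showing RAX at zero and a write to address 0x34. The sketch below re-creates the error-pointer helpers in userspace to show the distinction; `lookup_hit()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creation of the kernel's error-pointer encoding. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* True only for pointers in the top MAX_ERRNO bytes of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper: treat both miss conventions (NULL, ERR_PTR) as absent. */
static bool lookup_hit(const void *folio)
{
	return folio != NULL && !IS_ERR(folio);
}
```

The point of the sketch is only that a `!IS_ERR()` test accepts NULL as a valid object, so code written for one return convention is unsafe on a tree that uses the other.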

Thorsten Leemhuis

9:56 a.m.

On 04.07.23 10:47, Greg Kroah-Hartman wrote:

...

On Tue, Jul 04, 2023 at 02:13:03PM +0530, Harshit Mogalapalli wrote:

...
On 04/07/23 1:54 pm, Greg Kroah-Hartman wrote:

...
...
...
While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Have you looked at Patch 9 of this series:

https://lore.kernel.org/stable/2023070416-wow-phrasing-b92c@gregkh/T/#m12068...

Looks very much related, it also has a note on Backporting. As I think it could be related, I am sharing this. (But haven't tested anything)

Yes, that's the offending patch. I should have read over the full changelogs before doing bisection, but bisection/test proved that this was not correct for 6.3.y at this point in time.

FWIW, I'm preparing a few small tweaks for Documentation/process/stable-kernel-rules.rst (to be submitted after the merge window). Among other things, I am considering adding something like this, which might help avoid this situation:

```
To delay pick up of patches submitted via :ref:`option_1`, use the
following format:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # after 4 weeks in mainline

For any other requests related to patches submitted via :ref:`option_1`,
just add a note to the stable tag. This for example can be used to point
out known problems:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # see patch description, needs adjustments for 6.3 and earlier
```

Greg, if this is stupid or in case you want it to say something else, just say so.

Ciao, Thorsten
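To make the proposed tag format concrete, here is a small sketch of how a tool could pull the free-form note out of such a line. It is purely illustrative: `stable_tag_note()` is a hypothetical helper, no existing stable tooling is implied to work this way, and it accepts the address with or without angle brackets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the note that follows '#' on a stable Cc: line,
 * or NULL if the line is not a stable tag or carries no note.
 */
static const char *stable_tag_note(const char *line)
{
	static const char addr[] = "stable@vger.kernel.org";
	const char *p;

	if (strncmp(line, "Cc:", 3) != 0)
		return NULL;

	p = strstr(line, addr);
	if (p == NULL)
		return NULL;

	p = strchr(p + sizeof(addr) - 1, '#');
	if (p == NULL)
		return NULL;

	/* Skip the '#' and any spaces before the note text. */
	p++;
	while (*p == ' ')
		p++;
	return p;
}
```

For the first example above this yields "after 4 weeks in mainline"; a plain `Cc: stable@vger.kernel.org` line with no comment yields NULL.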

Greg Kroah-Hartman

12:28 p.m.

On Tue, Jul 04, 2023 at 11:56:11AM +0200, Thorsten Leemhuis wrote:

...

On 04.07.23 10:47, Greg Kroah-Hartman wrote:

...
On Tue, Jul 04, 2023 at 02:13:03PM +0530, Harshit Mogalapalli wrote:

...
On 04/07/23 1:54 pm, Greg Kroah-Hartman wrote:

...
...
...
While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Have you looked at Patch 9 of this series:

https://lore.kernel.org/stable/2023070416-wow-phrasing-b92c@gregkh/T/#m12068...

Looks very much related, it also has a note on Backporting. As I think it could be related, I am sharing this. (But haven't tested anything)

Yes, that's the offending patch. I should have read over the full changelogs before doing bisection, but bisection/test proved that this was not correct for 6.3.y at this point in time.

FWIW, I'm preparing a few small tweaks for Documentation/process/stable-kernel-rules.rst (to be submitted after the merge window). I among others consider adding something like this that might help avoiding this situation:
To delay pick up of patches submitted via :ref:`option_1`, use the
following format:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # after 4 weeks in mainline

For any other requests related to patches submitted via :ref:`option_1`,
just add a note to the stable tag. This for example can be used to point
out known problems:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # see patch description, needs adjustments for 6.3 and earlier

Greg, if this is stupid or in case you want it to say something else, just say so.

That looks great, hopefully people notice this. We still have a huge number of people refusing to even put cc: stable in a patch, let alone these extra hints :)

thanks,

greg k-h

Mike Kravetz

5 Jul

5:36 p.m.

On 07/04/23 13:28, Greg Kroah-Hartman wrote:

...

On Tue, Jul 04, 2023 at 11:56:11AM +0200, Thorsten Leemhuis wrote:

...
On 04.07.23 10:47, Greg Kroah-Hartman wrote:

...
On Tue, Jul 04, 2023 at 02:13:03PM +0530, Harshit Mogalapalli wrote:

...
On 04/07/23 1:54 pm, Greg Kroah-Hartman wrote:

...
...
While running LTP hugetlb testing on x86 the following kernel BUG noticed on running stable-rc 6.3.12-rc1.

Have you looked at Patch 9 of this series:

https://lore.kernel.org/stable/2023070416-wow-phrasing-b92c@gregkh/T/#m12068...

Looks very much related, it also has a note on Backporting. As I think it could be related, I am sharing this. (But haven't tested anything)

Yes, that's the offending patch. I should have read over the full changelogs before doing bisection, but bisection/test proved that this was not correct for 6.3.y at this point in time.

FWIW, I'm preparing a few small tweaks for Documentation/process/stable-kernel-rules.rst (to be submitted after the merge window). Among other things, I'm considering adding something like this, which might help avoid this situation:

To delay pick up of patches submitted via :ref:`option_1`, use the
following format:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # after 4 weeks in mainline

For any other requests related to patches submitted via :ref:`option_1`,
just add a note to the stable tag. This for example can be used to point
out known problems:

.. code-block:: none

     Cc: <stable@vger.kernel.org> # see patch description, needs adjustments for 6.3 and earlier

Greg, if this is stupid or in case you want it to say something else, just say so.

That looks great, hopefully people notice this. We still have a huge number of people refusing to even put cc: stable in a patch, let alone these extra hints :)

We were trying to follow "Option 2" of the stable rules with this patch. Because of the issue with 6.3.y, cc: stable was intentionally left off the upstream patch. And, after the patch was in Linus's tree a 6.3.y specific version was sent: https://lore.kernel.org/lkml/20230629211817.194786-1-sidhartha.kumar@oracle....

To complicate matters, a bug was found and fixed in the upstream patch during this process.

Apologies if things were not done correctly.

-- Mike Kravetz

Arnd Bergmann

4 Jul

10:53 a.m.

On Tue, Jul 4, 2023, at 09:34, Naresh Kamboju wrote:

...

On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman

[ 54.386939] hugefallocate01 (410): drop_caches: 3
g tests....... tst_hugepage.c:83: TINFO: 2 huge[ 54.396708] BUG: kernel NULL pointer dereference, address: 0000000000000034
[ 54.404495] #PF: supervisor write access in kernel mode
[ 54.409718] #PF: error_code(0x0002) - not-present page
[ 54.414849] PGD 800000010394a067 P4D 800000010394a067 PUD 1033ba067 PMD 0
[ 54.421721] Oops: 0002 [#1] PREEMPT SMP PTI
[ 54.425900] CPU: 3 PID: 411 Comm: hugefallocate01 Not tainted 6.3.12-rc1 #1
[ 54.432860] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.5 11/26/2020
[ 54.440244] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 54.445296] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 54.464041] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 54.469260] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000000
[ 54.476390] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe006b253c0
[ 54.483514] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 54.490640] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 54.497762] R13: ffff9fe006a68010 R14: ffff9fe006a68188 R15: 0000000000000000
[ 54.504887] FS: 00007f8bec2ff740(0000) GS:ffff9fe367b80000(0000) knlGS:0000000000000000
[ 54.512965] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.518702] CR2: 0000000000000034 CR3: 0000000101cd2003 CR4: 00000000003706e0
[ 54.525826] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.532950] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 54.540075] Call Trace:
[ 54.542519] <TASK>
[ 54.544618] ? show_regs+0x6e/0x80
[ 54.548022] ? __die+0x29/0x70
[ 54.551080] ? page_fault_oops+0x154/0x470
[ 54.555186] ? do_user_addr_fault+0x2f3/0x580
[ 54.559551] ? exc_page_fault+0x6b/0x170
[ 54.563502] ? asm_exc_page_fault+0x2b/0x30
[ 54.567686] ? hugetlbfs_fallocate+0x256/0x580

From your vmlinux file I see this hugetlbfs_fallocate+0x256/0x580 is folio_put(NULL):

ffffffff815bdd29: e8 72 a6 de ff        call ffffffff813a83a0 <__filemap_get_folio>
ffffffff815bdd2e: 48 3d 00 f0 ff ff     cmp $0xfffffffffffff000,%rax
ffffffff815bdd34: 77 53                 ja ffffffff815bdd89 <hugetlbfs_fallocate+0x2a9>
ffffffff815bdd36: f0 ff 48 34           lock decl 0x34(%rax)

            /* See if already present in mapping to avoid alloc/free */
            folio = filemap_get_folio(mapping, index);
            if (!IS_ERR(folio)) {
                    folio_put(folio);

It looks like filemap_get_folio() in 6.3.y returns NULL on error rather than an error pointer, so the !IS_ERR(folio) check passes and folio_put() is called on NULL.

Arnd

Greg Kroah-Hartman

12:29 p.m.

On Tue, Jul 04, 2023 at 12:53:16PM +0200, Arnd Bergmann wrote:

...

On Tue, Jul 4, 2023, at 09:34, Naresh Kamboju wrote:

...
On Tue, 4 Jul 2023 at 00:26, Greg Kroah-Hartman

[ 54.386939] hugefallocate01 (410): drop_caches: 3
g tests....... tst_hugepage.c:83: TINFO: 2 huge[ 54.396708] BUG: kernel NULL pointer dereference, address: 0000000000000034
[ 54.404495] #PF: supervisor write access in kernel mode
[ 54.409718] #PF: error_code(0x0002) - not-present page
[ 54.414849] PGD 800000010394a067 P4D 800000010394a067 PUD 1033ba067 PMD 0
[ 54.421721] Oops: 0002 [#1] PREEMPT SMP PTI
[ 54.425900] CPU: 3 PID: 411 Comm: hugefallocate01 Not tainted 6.3.12-rc1 #1
[ 54.432860] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.5 11/26/2020
[ 54.440244] RIP: 0010:hugetlbfs_fallocate+0x256/0x580
[ 54.445296] Code: 3d 6f 37 06 02 89 c3 48 c1 e3 05 48 01 df e8 71 fa cb 00 31 c9 31 d2 4c 89 e6 4c 89 f7 e8 72 a6 de ff 48 3d 00 f0 ff ff 77 53 <f0> ff 48 34 74 43 48 03 1d 3d 37 06 02 48 89 df e8 25 f0 cb 00 48
[ 54.464041] RSP: 0018:ffffab24409f7dc0 EFLAGS: 00010207
[ 54.469260] RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000000
[ 54.476390] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9fe006b253c0
[ 54.483514] RBP: ffffab24409f7ec0 R08: 0000000000000000 R09: 0000000000000000
[ 54.490640] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 54.497762] R13: ffff9fe006a68010 R14: ffff9fe006a68188 R15: 0000000000000000
[ 54.504887] FS: 00007f8bec2ff740(0000) GS:ffff9fe367b80000(0000) knlGS:0000000000000000
[ 54.512965] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.518702] CR2: 0000000000000034 CR3: 0000000101cd2003 CR4: 00000000003706e0
[ 54.525826] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 54.532950] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 54.540075] Call Trace:
[ 54.542519] <TASK>
[ 54.544618] ? show_regs+0x6e/0x80
[ 54.548022] ? __die+0x29/0x70
[ 54.551080] ? page_fault_oops+0x154/0x470
[ 54.555186] ? do_user_addr_fault+0x2f3/0x580
[ 54.559551] ? exc_page_fault+0x6b/0x170
[ 54.563502] ? asm_exc_page_fault+0x2b/0x30
[ 54.567686] ? hugetlbfs_fallocate+0x256/0x580

...
From your vmlinux file I see this hugetlbfs_fallocate+0x256/0x580 is folio_put(NULL):

ffffffff815bdd29: e8 72 a6 de ff        call ffffffff813a83a0 <__filemap_get_folio>
ffffffff815bdd2e: 48 3d 00 f0 ff ff     cmp $0xfffffffffffff000,%rax
ffffffff815bdd34: 77 53                 ja ffffffff815bdd89 <hugetlbfs_fallocate+0x2a9>
ffffffff815bdd36: f0 ff 48 34           lock decl 0x34(%rax)

            /* See if already present in mapping to avoid alloc/free */
            folio = filemap_get_folio(mapping, index);
            if (!IS_ERR(folio)) {
                    folio_put(folio);
It looks like filemap_get_folio() in 6.3.y returns NULL on error rather than an error pointer, so the !IS_ERR(folio) check passes and folio_put() is called on NULL.

Yeah, this needs to be reworked for 6.3.y, as the commit message said. I just missed it, my fault.

Hopefully 6.3.y doesn't live much longer (maybe a few days), then we don't have to deal with this api mismatch which will only cause problems with backports...

thanks,

greg k-h

Conor Dooley

5 Jul

7:09 a.m.

On Mon, Jul 03, 2023 at 08:54:10PM +0200, Greg Kroah-Hartman wrote:

...

This is the start of the stable review cycle for the 6.3.12 release. There are 13 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.

Tested-by: Conor Dooley <conor.dooley@microchip.com>

Cheers, Conor.
