From: Johannes Berg johannes.berg@intel.com
[ Upstream commit b2296eeac91555bd13f774efa7ab7d4b12fb71ef ]
Now that UML has PCI support, this driver must depend also on !UML since it pokes at X86_64 architecture internals that don't exist on ARCH=um.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-by: Dave Jiang dave.jiang@intel.com Acked-By: Anton Ivanov anton.ivanov@cambridgegreys.com Link: https://lore.kernel.org/r/20210625103810.fe877ae0aef4.If240438e3f50ae226f3f7... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index 39b5b46e880f..f450e4231db7 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -279,7 +279,7 @@ config INTEL_IDMA64
config INTEL_IDXD tristate "Intel Data Accelerators support" - depends on PCI && X86_64 + depends on PCI && X86_64 && !UML depends on PCI_MSI depends on SBITMAP select DMA_ENGINE
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 4faee8b65ec32346f8096e64c5fa1d5a73121742 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Reviewed-by: Baolin Wang baolin.wang7@gmail.com Link: https://lore.kernel.org/r/1620094977-70146-1-git-send-email-zou_wei@huawei.c... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sprd-dma.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c index 0ef5ca81ba4d..4357d2395e6b 100644 --- a/drivers/dma/sprd-dma.c +++ b/drivers/dma/sprd-dma.c @@ -1265,6 +1265,7 @@ static const struct of_device_id sprd_dma_match[] = { { .compatible = "sprd,sc9860-dma", }, {}, }; +MODULE_DEVICE_TABLE(of, sprd_dma_match);
static int __maybe_unused sprd_dma_runtime_suspend(struct device *dev) {
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/driver-api/cxl/memory-devices.rst | 2 +- drivers/cxl/Makefile | 4 +--- drivers/cxl/core/Makefile | 5 +++++ drivers/cxl/{core.c => core/bus.c} | 4 ++-- drivers/cxl/{mem.h => cxlmem.h} | 0 drivers/cxl/pci.c | 2 +- drivers/cxl/pmem.c | 2 +- 7 files changed, 11 insertions(+), 8 deletions(-) create mode 100644 drivers/cxl/core/Makefile rename drivers/cxl/{core.c => core/bus.c} (99%) rename drivers/cxl/{mem.h => cxlmem.h} (100%)
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 487ce4f41d77..a86e2c7c551a 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -36,7 +36,7 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal:
-.. kernel-doc:: drivers/cxl/core.c +.. kernel-doc:: drivers/cxl/core/bus.c :doc: cxl core
External Interfaces diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 32954059b37b..d1aaabc940f3 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,11 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CXL_BUS) += cxl_core.o +obj-$(CONFIG_CXL_BUS) += core/ obj-$(CONFIG_CXL_MEM) += cxl_pci.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
-ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -cxl_core-y := core.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile new file mode 100644 index 000000000000..ad137f96e5c8 --- /dev/null +++ b/drivers/cxl/core/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CXL_BUS) += cxl_core.o + +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl +cxl_core-y := bus.o diff --git a/drivers/cxl/core.c b/drivers/cxl/core/bus.c similarity index 99% rename from drivers/cxl/core.c rename to drivers/cxl/core/bus.c index a2e4d54fc7bc..0815eec23944 100644 --- a/drivers/cxl/core.c +++ b/drivers/cxl/core/bus.c @@ -6,8 +6,8 @@ #include <linux/pci.h> #include <linux/slab.h> #include <linux/idr.h> -#include "cxl.h" -#include "mem.h" +#include <cxlmem.h> +#include <cxl.h>
/** * DOC: cxl core diff --git a/drivers/cxl/mem.h b/drivers/cxl/cxlmem.h similarity index 100% rename from drivers/cxl/mem.h rename to drivers/cxl/cxlmem.h diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 4cf351a3cf99..a945c5fda292 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -12,9 +12,9 @@ #include <linux/pci.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> +#include "cxlmem.h" #include "pci.h" #include "cxl.h" -#include "mem.h"
/** * DOC: cxl pci diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index 0088e41dd2f3..9652c3ee41e7 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -6,7 +6,7 @@ #include <linux/ndctl.h> #include <linux/async.h> #include <linux/slab.h> -#include "mem.h" +#include "cxlmem.h" #include "cxl.h"
/*
On Mon, 13 Sep 2021 18:33:17 -0400 Sasha Levin sashal@kernel.org wrote:
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Not a fix...
I'm guessing this got picked up on the basis of the Reported-by: tag? I think that was added for a minor tweak as this went through review rather than referring to the whole patch.
Jonathan
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org
Documentation/driver-api/cxl/memory-devices.rst | 2 +- drivers/cxl/Makefile | 4 +--- drivers/cxl/core/Makefile | 5 +++++ drivers/cxl/{core.c => core/bus.c} | 4 ++-- drivers/cxl/{mem.h => cxlmem.h} | 0 drivers/cxl/pci.c | 2 +- drivers/cxl/pmem.c | 2 +- 7 files changed, 11 insertions(+), 8 deletions(-) create mode 100644 drivers/cxl/core/Makefile rename drivers/cxl/{core.c => core/bus.c} (99%) rename drivers/cxl/{mem.h => cxlmem.h} (100%)
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 487ce4f41d77..a86e2c7c551a 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -36,7 +36,7 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal: -.. kernel-doc:: drivers/cxl/core.c +.. kernel-doc:: drivers/cxl/core/bus.c :doc: cxl core External Interfaces diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 32954059b37b..d1aaabc940f3 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,11 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CXL_BUS) += cxl_core.o +obj-$(CONFIG_CXL_BUS) += core/ obj-$(CONFIG_CXL_MEM) += cxl_pci.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -cxl_core-y := core.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile new file mode 100644 index 000000000000..ad137f96e5c8 --- /dev/null +++ b/drivers/cxl/core/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CXL_BUS) += cxl_core.o
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl +cxl_core-y := bus.o diff --git a/drivers/cxl/core.c b/drivers/cxl/core/bus.c similarity index 99% rename from drivers/cxl/core.c rename to drivers/cxl/core/bus.c index a2e4d54fc7bc..0815eec23944 100644 --- a/drivers/cxl/core.c +++ b/drivers/cxl/core/bus.c @@ -6,8 +6,8 @@ #include <linux/pci.h> #include <linux/slab.h> #include <linux/idr.h> -#include "cxl.h" -#include "mem.h" +#include <cxlmem.h> +#include <cxl.h> /**
- DOC: cxl core
diff --git a/drivers/cxl/mem.h b/drivers/cxl/cxlmem.h similarity index 100% rename from drivers/cxl/mem.h rename to drivers/cxl/cxlmem.h diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 4cf351a3cf99..a945c5fda292 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -12,9 +12,9 @@ #include <linux/pci.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> +#include "cxlmem.h" #include "pci.h" #include "cxl.h" -#include "mem.h" /**
- DOC: cxl pci
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index 0088e41dd2f3..9652c3ee41e7 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -6,7 +6,7 @@ #include <linux/ndctl.h> #include <linux/async.h> #include <linux/slab.h> -#include "mem.h" +#include "cxlmem.h" #include "cxl.h" /*
On Tue, 14 Sep 2021 09:56:23 +0100 Jonathan Cameron Jonathan.Cameron@Huawei.com wrote:
On Mon, 13 Sep 2021 18:33:17 -0400 Sasha Levin sashal@kernel.org wrote:
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Not a fix...
I'm guessing this got picked up on the basis of the Reported-by: tag? I think that was added for a minor tweak as this went through review rather than referring to the whole patch.
Or possibly because it was a precursor to the fix in the next patch.
Hmm. Ben, Dan, does it make sense for these two to go into stable?
Jonathan
Jonathan
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org
Documentation/driver-api/cxl/memory-devices.rst | 2 +- drivers/cxl/Makefile | 4 +--- drivers/cxl/core/Makefile | 5 +++++ drivers/cxl/{core.c => core/bus.c} | 4 ++-- drivers/cxl/{mem.h => cxlmem.h} | 0 drivers/cxl/pci.c | 2 +- drivers/cxl/pmem.c | 2 +- 7 files changed, 11 insertions(+), 8 deletions(-) create mode 100644 drivers/cxl/core/Makefile rename drivers/cxl/{core.c => core/bus.c} (99%) rename drivers/cxl/{mem.h => cxlmem.h} (100%)
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 487ce4f41d77..a86e2c7c551a 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -36,7 +36,7 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal: -.. kernel-doc:: drivers/cxl/core.c +.. kernel-doc:: drivers/cxl/core/bus.c :doc: cxl core External Interfaces diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 32954059b37b..d1aaabc940f3 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,11 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CXL_BUS) += cxl_core.o +obj-$(CONFIG_CXL_BUS) += core/ obj-$(CONFIG_CXL_MEM) += cxl_pci.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -cxl_core-y := core.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile new file mode 100644 index 000000000000..ad137f96e5c8 --- /dev/null +++ b/drivers/cxl/core/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CXL_BUS) += cxl_core.o
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl +cxl_core-y := bus.o diff --git a/drivers/cxl/core.c b/drivers/cxl/core/bus.c similarity index 99% rename from drivers/cxl/core.c rename to drivers/cxl/core/bus.c index a2e4d54fc7bc..0815eec23944 100644 --- a/drivers/cxl/core.c +++ b/drivers/cxl/core/bus.c @@ -6,8 +6,8 @@ #include <linux/pci.h> #include <linux/slab.h> #include <linux/idr.h> -#include "cxl.h" -#include "mem.h" +#include <cxlmem.h> +#include <cxl.h> /**
- DOC: cxl core
diff --git a/drivers/cxl/mem.h b/drivers/cxl/cxlmem.h similarity index 100% rename from drivers/cxl/mem.h rename to drivers/cxl/cxlmem.h diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 4cf351a3cf99..a945c5fda292 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -12,9 +12,9 @@ #include <linux/pci.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> +#include "cxlmem.h" #include "pci.h" #include "cxl.h" -#include "mem.h" /**
- DOC: cxl pci
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index 0088e41dd2f3..9652c3ee41e7 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -6,7 +6,7 @@ #include <linux/ndctl.h> #include <linux/async.h> #include <linux/slab.h> -#include "mem.h" +#include "cxlmem.h" #include "cxl.h" /*
On 21-09-14 09:57:49, Jonathan Cameron wrote:
On Tue, 14 Sep 2021 09:56:23 +0100 Jonathan Cameron Jonathan.Cameron@Huawei.com wrote:
On Mon, 13 Sep 2021 18:33:17 -0400 Sasha Levin sashal@kernel.org wrote:
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Not a fix...
I'm guessing this got picked up on the basis of the Reported-by: tag? I think that was added for a minor tweak as this went through review rather than referring to the whole patch.
Or possibly because it was a precursor to the fix in the next patch.
Hmm. Ben, Dan, does it make sense for these two to go into stable?
Jonathan
As of now, no, but having this will make future fixes much easier to cherry pick.
Jonathan
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org
Documentation/driver-api/cxl/memory-devices.rst | 2 +- drivers/cxl/Makefile | 4 +--- drivers/cxl/core/Makefile | 5 +++++ drivers/cxl/{core.c => core/bus.c} | 4 ++-- drivers/cxl/{mem.h => cxlmem.h} | 0 drivers/cxl/pci.c | 2 +- drivers/cxl/pmem.c | 2 +- 7 files changed, 11 insertions(+), 8 deletions(-) create mode 100644 drivers/cxl/core/Makefile rename drivers/cxl/{core.c => core/bus.c} (99%) rename drivers/cxl/{mem.h => cxlmem.h} (100%)
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 487ce4f41d77..a86e2c7c551a 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -36,7 +36,7 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal: -.. kernel-doc:: drivers/cxl/core.c +.. kernel-doc:: drivers/cxl/core/bus.c :doc: cxl core External Interfaces diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 32954059b37b..d1aaabc940f3 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,11 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CXL_BUS) += cxl_core.o +obj-$(CONFIG_CXL_BUS) += core/ obj-$(CONFIG_CXL_MEM) += cxl_pci.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o -ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -cxl_core-y := core.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile new file mode 100644 index 000000000000..ad137f96e5c8 --- /dev/null +++ b/drivers/cxl/core/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CXL_BUS) += cxl_core.o
+ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl +cxl_core-y := bus.o diff --git a/drivers/cxl/core.c b/drivers/cxl/core/bus.c similarity index 99% rename from drivers/cxl/core.c rename to drivers/cxl/core/bus.c index a2e4d54fc7bc..0815eec23944 100644 --- a/drivers/cxl/core.c +++ b/drivers/cxl/core/bus.c @@ -6,8 +6,8 @@ #include <linux/pci.h> #include <linux/slab.h> #include <linux/idr.h> -#include "cxl.h" -#include "mem.h" +#include <cxlmem.h> +#include <cxl.h> /**
- DOC: cxl core
diff --git a/drivers/cxl/mem.h b/drivers/cxl/cxlmem.h similarity index 100% rename from drivers/cxl/mem.h rename to drivers/cxl/cxlmem.h diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 4cf351a3cf99..a945c5fda292 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -12,9 +12,9 @@ #include <linux/pci.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> +#include "cxlmem.h" #include "pci.h" #include "cxl.h" -#include "mem.h" /**
- DOC: cxl pci
diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index 0088e41dd2f3..9652c3ee41e7 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -6,7 +6,7 @@ #include <linux/ndctl.h> #include <linux/async.h> #include <linux/slab.h> -#include "mem.h" +#include "cxlmem.h" #include "cxl.h" /*
On Tue, Sep 14, 2021 at 8:06 AM Ben Widawsky ben.widawsky@intel.com wrote:
On 21-09-14 09:57:49, Jonathan Cameron wrote:
On Tue, 14 Sep 2021 09:56:23 +0100 Jonathan Cameron Jonathan.Cameron@Huawei.com wrote:
On Mon, 13 Sep 2021 18:33:17 -0400 Sasha Levin sashal@kernel.org wrote:
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Not a fix...
I'm guessing this got picked up on the basis of the Reported-by: tag? I think that was added for a minor tweak as this went through review rather than referring to the whole patch.
Or possibly because it was a precursor to the fix in the next patch.
Hmm. Ben, Dan, does it make sense for these two to go into stable?
Jonathan
As of now, no, but having this will make future fixes much easier to cherry pick.
Sasha, please drop this. The CXL subsystem is still in major feature development. I would rather manually backport small fixes rather than backport major code movement just to make small fix backport easier.
Let me know if there was a specific fix autosel was trying to resolve and we'll take a look at whether it makes sense to do a custom backport for -stable.
On Tue, Sep 14, 2021 at 08:05:58AM -0700, Ben Widawsky wrote:
On 21-09-14 09:57:49, Jonathan Cameron wrote:
On Tue, 14 Sep 2021 09:56:23 +0100 Jonathan Cameron Jonathan.Cameron@Huawei.com wrote:
On Mon, 13 Sep 2021 18:33:17 -0400 Sasha Levin sashal@kernel.org wrote:
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Not a fix...
I'm guessing this got picked up on the basis of the Reported-by: tag? I think that was added for a minor tweak as this went through review rather than referring to the whole patch.
Or possibly because it was a precursor to the fix in the next patch.
Hmm. Ben, Dan, does it make sense for these two to go into stable?
Jonathan
As of now, no, but having this will make future fixes much easier to cherry
It was picked because the next patch depends on it. I'll just drop both if you don't want the next one. Thanks!
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 9cc238c7a526dba9ee8c210fa2828886fc65db66 ]
In preparation for moving cxl_memdev allocation to the core, introduce cdevm_file_operations to coordinate file operations shutdown relative to driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond better file organization of sysfs attributes in core/ and drivers in cxl/), is that device lifetime is longer than module lifetime. The cxl_pci module should be free to come and go without needing to coordinate with devices that need the text associated with cxl_memdev_release() to stay resident. The move will fix a use after free bug when looping driver load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev creation flow is to allow for alternate drivers, like unit test code, to define their own ioctl backends.
Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792539962.368511.2962268954245340288.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/cxlmem.h | 15 ++++++++++ drivers/cxl/pci.c | 65 ++++++++++++++++++++++++++------------------ 2 files changed, 53 insertions(+), 27 deletions(-)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 8f02d02b26b4..0cd463de1342 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -34,6 +34,21 @@ */ #define CXL_MEM_MAX_DEVS 65536
+/** + * struct cdevm_file_operations - devm coordinated cdev file operations + * @fops: file operations that are synchronized against @shutdown + * @shutdown: disconnect driver data + * + * @shutdown is invoked in the devres release path to disconnect any + * driver instance data from @dev. It assumes synchronization with any + * fops operation that requires driver data. After @shutdown an + * operation may only reference @device data. + */ +struct cdevm_file_operations { + struct file_operations fops; + void (*shutdown)(struct device *dev); +}; + /** * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device * @dev: driver core device object diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index a945c5fda292..f7a5ad5e1f4a 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -806,13 +806,30 @@ static int cxl_memdev_release_file(struct inode *inode, struct file *file) return 0; }
-static const struct file_operations cxl_memdev_fops = { - .owner = THIS_MODULE, - .unlocked_ioctl = cxl_memdev_ioctl, - .open = cxl_memdev_open, - .release = cxl_memdev_release_file, - .compat_ioctl = compat_ptr_ioctl, - .llseek = noop_llseek, +static struct cxl_memdev *to_cxl_memdev(struct device *dev) +{ + return container_of(dev, struct cxl_memdev, dev); +} + +static void cxl_memdev_shutdown(struct device *dev) +{ + struct cxl_memdev *cxlmd = to_cxl_memdev(dev); + + down_write(&cxl_memdev_rwsem); + cxlmd->cxlm = NULL; + up_write(&cxl_memdev_rwsem); +} + +static const struct cdevm_file_operations cxl_memdev_fops = { + .fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = cxl_memdev_ioctl, + .open = cxl_memdev_open, + .release = cxl_memdev_release_file, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, + }, + .shutdown = cxl_memdev_shutdown, };
static inline struct cxl_mem_command *cxl_mem_find_command(u16 opcode) @@ -1161,11 +1178,6 @@ static int cxl_mem_setup_regs(struct cxl_mem *cxlm) return ret; }
-static struct cxl_memdev *to_cxl_memdev(struct device *dev) -{ - return container_of(dev, struct cxl_memdev, dev); -} - static void cxl_memdev_release(struct device *dev) { struct cxl_memdev *cxlmd = to_cxl_memdev(dev); @@ -1281,24 +1293,22 @@ static const struct device_type cxl_memdev_type = { .groups = cxl_memdev_attribute_groups, };
-static void cxl_memdev_shutdown(struct cxl_memdev *cxlmd) -{ - down_write(&cxl_memdev_rwsem); - cxlmd->cxlm = NULL; - up_write(&cxl_memdev_rwsem); -} - static void cxl_memdev_unregister(void *_cxlmd) { struct cxl_memdev *cxlmd = _cxlmd; struct device *dev = &cxlmd->dev; + struct cdev *cdev = &cxlmd->cdev; + const struct cdevm_file_operations *cdevm_fops; + + cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops); + cdevm_fops->shutdown(dev);
cdev_device_del(&cxlmd->cdev, dev); - cxl_memdev_shutdown(cxlmd); put_device(dev); }
-static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm) +static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm, + const struct file_operations *fops) { struct pci_dev *pdev = cxlm->pdev; struct cxl_memdev *cxlmd; @@ -1324,7 +1334,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm) device_set_pm_not_required(dev);
cdev = &cxlmd->cdev; - cdev_init(cdev, &cxl_memdev_fops); + cdev_init(cdev, fops); return cxlmd;
err: @@ -1332,15 +1342,16 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm) return ERR_PTR(rc); }
-static struct cxl_memdev *devm_cxl_add_memdev(struct device *host, - struct cxl_mem *cxlm) +static struct cxl_memdev * +devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm, + const struct cdevm_file_operations *cdevm_fops) { struct cxl_memdev *cxlmd; struct device *dev; struct cdev *cdev; int rc;
- cxlmd = cxl_memdev_alloc(cxlm); + cxlmd = cxl_memdev_alloc(cxlm, &cdevm_fops->fops); if (IS_ERR(cxlmd)) return cxlmd;
@@ -1370,7 +1381,7 @@ static struct cxl_memdev *devm_cxl_add_memdev(struct device *host, * The cdev was briefly live, shutdown any ioctl operations that * saw that state. */ - cxl_memdev_shutdown(cxlmd); + cdevm_fops->shutdown(dev); put_device(dev); return ERR_PTR(rc); } @@ -1611,7 +1622,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc;
- cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm); + cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm, &cxl_memdev_fops); if (IS_ERR(cxlmd)) return PTR_ERR(cxlmd);
On Mon, Sep 13, 2021 at 3:33 PM Sasha Levin sashal@kernel.org wrote:
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 9cc238c7a526dba9ee8c210fa2828886fc65db66 ]
In preparation for moving cxl_memdev allocation to the core, introduce cdevm_file_operations to coordinate file operations shutdown relative to driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond better file organization of sysfs attributes in core/ and drivers in cxl/), is that device lifetime is longer than module lifetime. The cxl_pci module should be free to come and go without needing to coordinate with devices that need the text associated with cxl_memdev_release() to stay resident. The move will fix a use after free bug when looping driver load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev creation flow is to allow for alternate drivers, like unit test code, to define their own ioctl backends.
Hi Sasha,
Please drop this. It's not a fix, it's just a reorganization for easing the addition of new features and capabilities.
On Tue, Sep 14, 2021 at 08:42:04AM -0700, Dan Williams wrote:
On Mon, Sep 13, 2021 at 3:33 PM Sasha Levin sashal@kernel.org wrote:
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 9cc238c7a526dba9ee8c210fa2828886fc65db66 ]
In preparation for moving cxl_memdev allocation to the core, introduce cdevm_file_operations to coordinate file operations shutdown relative to driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond better file organization of sysfs attributes in core/ and drivers in cxl/), is that device lifetime is longer than module lifetime. The cxl_pci module should be free to come and go without needing to coordinate with devices that need the text associated with cxl_memdev_release() to stay resident. The move will fix a use after free bug when looping driver load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev creation flow is to allow for alternate drivers, like unit test code, to define their own ioctl backends.
Hi Sasha,
Please drop this. It's not a fix, it's just a reorganization for easing the addition of new features and capabilities.
I'll drop it, but just to satisfy my curiousity: the description says it fixes a use-after-free bug in the existing code, is it not the case?
On Tue, Sep 14, 2021 at 10:01 AM Sasha Levin sashal@kernel.org wrote:
On Tue, Sep 14, 2021 at 08:42:04AM -0700, Dan Williams wrote:
On Mon, Sep 13, 2021 at 3:33 PM Sasha Levin sashal@kernel.org wrote:
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 9cc238c7a526dba9ee8c210fa2828886fc65db66 ]
In preparation for moving cxl_memdev allocation to the core, introduce cdevm_file_operations to coordinate file operations shutdown relative to driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond better file organization of sysfs attributes in core/ and drivers in cxl/), is that device lifetime is longer than module lifetime. The cxl_pci module should be free to come and go without needing to coordinate with devices that need the text associated with cxl_memdev_release() to stay resident. The move will fix a use after free bug when looping driver load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev creation flow is to allow for alternate drivers, like unit test code, to define their own ioctl backends.
Hi Sasha,
Please drop this. It's not a fix, it's just a reorganization for easing the addition of new features and capabilities.
I'll drop it, but just to satisfy my curiousity: the description says it fixes a use-after-free bug in the existing code, is it not the case?
It does fix a problem if the final put_device() happens after the module text has been unloaded. However, I am only aware of the artificial trigger for that (CONFIG_DEBUG_KOBJECT_RELEASE=y). I.e. if CONFIG_DEBUG_KOBJECT_RELEASE=n I am not aware of any agent that will hold a device reference besides the driver itself. That was the rationale for not tagging this for -stable.
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit d75b9fa053e4cd278281386d860c26fdbfbe9d03 ]
The permission check in gfs2_setattr is an old and outdated version of may_setattr(). Switch to the updated version.
Fixes fstest generic/079.
Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 6e15434b23ac..3130f85d2b3f 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1985,8 +1985,8 @@ static int gfs2_setattr(struct user_namespace *mnt_userns, if (error) goto out;
- error = -EPERM; - if (IS_IMMUTABLE(inode) || IS_APPEND(inode)) + error = may_setattr(&init_user_ns, inode, attr->ia_valid); + if (error) goto error;
error = setattr_prepare(&init_user_ns, dentry, attr);
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit bbac7a92a46f0876e588722ebe552ddfe6fd790f ]
Now that UML has PCI support, this driver must depend also on !UML since it pokes at X86_64 architecture internals that don't exist on ARCH=um.
Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/20210809112409.a3a0974874d2.I2ffe3d11ed37f735da2f3... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index f450e4231db7..4f70cf57471a 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -315,7 +315,7 @@ config INTEL_IDXD_PERFMON
config INTEL_IOATDMA tristate "Intel I/OAT DMA support" - depends on PCI && X86_64 + depends on PCI && X86_64 && !UML select DMA_ENGINE select DMA_ENGINE_RAID select DCA
From: Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com
[ Upstream commit aac6c0f90799d66b8989be1e056408f33fd99fe6 ]
The xilinx dma driver uses the consistent allocations, so for correct operation also set the DMA mask for coherent APIs. It fixes the below kernel crash with dmatest client when DMA IP is configured with 64-bit address width and linux is booted from high (>4GB) memory.
Call trace: [ 489.531257] dma_alloc_from_pool+0x8c/0x1c0 [ 489.535431] dma_direct_alloc+0x284/0x330 [ 489.539432] dma_alloc_attrs+0x80/0xf0 [ 489.543174] dma_pool_alloc+0x160/0x2c0 [ 489.547003] xilinx_cdma_prep_memcpy+0xa4/0x180 [ 489.551524] dmatest_func+0x3cc/0x114c [ 489.555266] kthread+0x124/0x130 [ 489.558486] ret_from_fork+0x10/0x3c [ 489.562051] ---[ end trace 248625b2d596a90a ]---
Signed-off-by: Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com Reviewed-by: Harini Katakam harini.katakam@xilinx.com Link: https://lore.kernel.org/r/1629363528-30347-1-git-send-email-radhey.shyam.pan... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/xilinx/xilinx_dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 4b9530a7bf65..434b1ff22e31 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -3077,7 +3077,7 @@ static int xilinx_dma_probe(struct platform_device *pdev) xdev->ext_addr = false;
/* Set the dma mask bits */ - dma_set_mask(xdev->dev, DMA_BIT_MASK(addr_width)); + dma_set_mask_and_coherent(xdev->dev, DMA_BIT_MASK(addr_width));
/* Initialize the DMA engine */ xdev->common.dev = &pdev->dev;
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 436fc4feeabbf103d78d50a8e091b3aac28cc37f ]
kmemleak with enabled auto scanning reports that our stack allocation is lost. This is because we're saving the pointer + STACK_INIT_OFFSET to lowcore. When kmemleak now scans the objects, it thinks that this one is lost because it can't find a corresponding pointer.
Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Signed-off-by: Sven Schnelle svens@linux.ibm.com Tested-by: Marc Hartmayer mhartmay@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/setup.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index ff0f9e838916..08c2263271e3 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -50,6 +50,7 @@ #include <linux/compat.h> #include <linux/start_kernel.h> #include <linux/hugetlb.h> +#include <linux/kmemleak.h>
#include <asm/boot_data.h> #include <asm/ipl.h> @@ -312,9 +313,12 @@ void *restart_stack; unsigned long stack_alloc(void) { #ifdef CONFIG_VMAP_STACK - return (unsigned long)__vmalloc_node(THREAD_SIZE, THREAD_SIZE, - THREADINFO_GFP, NUMA_NO_NODE, - __builtin_return_address(0)); + void *ret; + + ret = __vmalloc_node(THREAD_SIZE, THREAD_SIZE, THREADINFO_GFP, + NUMA_NO_NODE, __builtin_return_address(0)); + kmemleak_not_leak(ret); + return (unsigned long)ret; #else return __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER); #endif
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 5f939f49771002f347039edf984aca42f30fc31a ]
commit 63f2f9cceb09f8 ("ASoC: audio-graph: remove Platform support") removed Platform support from audio-graph, because it doesn't have "plat" support on DT (simple-card has). But, Platform support is needed if user is using snd_dmaengine_pcm_register() which adds generic DMA as Platform. And this Platform dev is using CPU dev.
Without this patch, at least STM32MP15 audio sound card is no more functional (v5.13 or later). This patch respawn Platform Support on audio-graph again.
Reported-by: Olivier MOYSAN olivier.moysan@foss.st.com Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Tested-by: Olivier MOYSAN olivier.moysan@foss.st.com Link: https://lore.kernel.org/r/878s0jzrpf.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/generic/audio-graph-card.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c index 5e71382467e8..546f6fd0609e 100644 --- a/sound/soc/generic/audio-graph-card.c +++ b/sound/soc/generic/audio-graph-card.c @@ -285,6 +285,7 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv, if (li->cpu) { struct snd_soc_card *card = simple_priv_to_card(priv); struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0); + struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0); int is_single_links = 0;
/* Codec is dummy */ @@ -313,6 +314,7 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv, dai_link->no_pcm = 1;
asoc_simple_canonicalize_cpu(cpus, is_single_links); + asoc_simple_canonicalize_platform(platforms, cpus); } else { struct snd_soc_codec_conf *cconf = simple_props_to_codec_conf(dai_props, 0); struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0); @@ -366,6 +368,7 @@ static int graph_dai_link_of(struct asoc_simple_priv *priv, struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link); struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0); struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0); + struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0); char dai_name[64]; int ret, is_single_links = 0;
@@ -383,6 +386,7 @@ static int graph_dai_link_of(struct asoc_simple_priv *priv, "%s-%s", cpus->dai_name, codecs->dai_name);
asoc_simple_canonicalize_cpu(cpus, is_single_links); + asoc_simple_canonicalize_platform(platforms, cpus);
ret = graph_link_init(priv, cpu_ep, codec_ep, li, dai_name); if (ret < 0) @@ -608,6 +612,7 @@ static int graph_count_noml(struct asoc_simple_priv *priv,
li->num[li->link].cpus = 1; li->num[li->link].codecs = 1; + li->num[li->link].platforms = 1;
li->link += 1; /* 1xCPU-Codec */
@@ -630,6 +635,7 @@ static int graph_count_dpcm(struct asoc_simple_priv *priv,
if (li->cpu) { li->num[li->link].cpus = 1; + li->num[li->link].platforms = 1;
li->link++; /* 1xCPU-dummy */ } else {
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit fa209644a7124b3f4cf811ced55daef49ae39ac6 ]
It was reported that on "HP ENVY x360" that power LED does not come back, certain keys like brightness controls do not work, and the fan never spins up, even under load on 5.14 final.
In analysis of the SSDT it's clear that the Microsoft UUID doesn't provide functional support, but rather the AMD UUID should be supporting this system.
Because this is a gap in the expected logic, we checked back with internal team. The conclusion was that on Windows AMD uPEP *does* run even when Microsoft UUID present, but most OEM systems have adopted value of "0x3" for supported functions and hence nothing runs.
Henceforth add support for running both Microsoft and AMD methods. This approach will also allow the same logic on Intel systems if desired at a future time as well by pulling the evaluation of `lps0_dsm_func_mask_microsoft` out of the `if` block for `acpi_s2idle_vendor_amd`.
Link: https://gitlab.freedesktop.org/drm/amd/uploads/9fbcd7ec3a385cc6949c9bacf45dc... BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1691 Reported-by: Maxwell Beck max@ryt.one Signed-off-by: Mario Limonciello mario.limonciello@amd.com [ rjw: Edits of the new comments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/s2idle.c | 67 +++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 3a308461246a..bd92b549fd5a 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -449,25 +449,30 @@ int acpi_s2idle_prepare_late(void) if (pm_debug_messages_on) lpi_check_constraints();
- if (lps0_dsm_func_mask_microsoft > 0) { + /* Screen off */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_SCREEN_OFF_AMD : + ACPI_LPS0_SCREEN_OFF, + lps0_dsm_func_mask, lps0_dsm_guid); + + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + + /* LPS0 entry */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_ENTRY_AMD : + ACPI_LPS0_ENTRY, + lps0_dsm_func_mask, lps0_dsm_guid); + if (lps0_dsm_func_mask_microsoft > 0) { acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - } else if (acpi_s2idle_vendor_amd()) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - } else { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, - lps0_dsm_func_mask, lps0_dsm_guid); + /* modern standby entry */ + acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY, + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); } - return 0; }
@@ -476,24 +481,30 @@ void acpi_s2idle_restore_early(void) if (!lps0_device_handle || sleep_no_lps0) return;
- if (lps0_dsm_func_mask_microsoft > 0) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + /* Modern standby exit */ + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - } else if (acpi_s2idle_vendor_amd()) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - } else { + + /* LPS0 exit */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_EXIT_AMD : + ACPI_LPS0_EXIT, + lps0_dsm_func_mask, lps0_dsm_guid); + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT, - lps0_dsm_func_mask, lps0_dsm_guid); + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + + /* Screen on */ + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON, - lps0_dsm_func_mask, lps0_dsm_guid); - } + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_SCREEN_ON_AMD : + ACPI_LPS0_SCREEN_ON, + lps0_dsm_func_mask, lps0_dsm_guid); }
static const struct platform_s2idle_ops acpi_s2idle_ops_lps0 = {
[Public]
-----Original Message----- From: Sasha Levin sashal@kernel.org Sent: Monday, September 13, 2021 17:33 To: linux-kernel@vger.kernel.org; stable@vger.kernel.org Cc: Limonciello, Mario Mario.Limonciello@amd.com; Maxwell Beck max@ryt.one; Rafael J . Wysocki rafael.j.wysocki@intel.com; Sasha Levin sashal@kernel.org; linux-acpi@vger.kernel.org Subject: [PATCH AUTOSEL 5.14 10/25] ACPI: PM: s2idle: Run both AMD and Microsoft methods if both are supported
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit fa209644a7124b3f4cf811ced55daef49ae39ac6 ]
It was reported that on "HP ENVY x360" that power LED does not come back, certain keys like brightness controls do not work, and the fan never spins up, even under load on 5.14 final.
In analysis of the SSDT it's clear that the Microsoft UUID doesn't provide functional support, but rather the AMD UUID should be supporting this system.
Because this is a gap in the expected logic, we checked back with internal team. The conclusion was that on Windows AMD uPEP *does* run even when Microsoft UUID present, but most OEM systems have adopted value of "0x3" for supported functions and hence nothing runs.
Henceforth add support for running both Microsoft and AMD methods. This approach will also allow the same logic on Intel systems if desired at a future time as well by pulling the evaluation of `lps0_dsm_func_mask_microsoft` out of the `if` block for `acpi_s2idle_vendor_amd`.
Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr eedesktop.org%2Fdrm%2Famd%2Fuploads%2F9fbcd7ec3a385cc6949c9bacf45d c41b%2Facpi- f.20.bin&data=04%7C01%7Cmario.limonciello%40amd.com%7Ce1f8dfc3d bfb45fc44ef08d97706917f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C 0%7C637671692363481559%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sda ta=wP0oz8OMnby9PA4MFrbY1ZAT%2FKv1jctTyXl%2BDteNHqY%3D&reserv ed=0 BugLink: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr eedesktop.org%2Fdrm%2Famd%2F- %2Fissues%2F1691&data=04%7C01%7Cmario.limonciello%40amd.com%7 Ce1f8dfc3dbfb45fc44ef08d97706917f%7C3dd8961fe4884e608e11a82d994e183 d%7C0%7C0%7C637671692363481559%7CUnknown%7CTWFpbGZsb3d8eyJWIj oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C100 0&sdata=MVhVz%2BYBTdwgvkkSRRFsL5QdfDLPgTzoMBjD4dsFfMA%3D&a mp;reserved=0 Reported-by: Maxwell Beck max@ryt.one Signed-off-by: Mario Limonciello mario.limonciello@amd.com [ rjw: Edits of the new comments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/acpi/x86/s2idle.c | 67 +++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 3a308461246a..bd92b549fd5a 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -449,25 +449,30 @@ int acpi_s2idle_prepare_late(void) if (pm_debug_messages_on) lpi_check_constraints();
- if (lps0_dsm_func_mask_microsoft > 0) {
- /* Screen off */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_SCREEN_OFF_AMD :
ACPI_LPS0_SCREEN_OFF,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* LPS0 entry */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_ENTRY_AMD :
ACPI_LPS0_ENTRY,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) { acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- } else if (acpi_s2idle_vendor_amd()) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
- } else {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY,
lps0_dsm_func_mask, lps0_dsm_guid);
/* modern standby entry */
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft); }
- return 0;
}
@@ -476,24 +481,30 @@ void acpi_s2idle_restore_early(void) if (!lps0_device_handle || sleep_no_lps0) return;
- if (lps0_dsm_func_mask_microsoft > 0) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* Modern standby exit */
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- } else if (acpi_s2idle_vendor_amd()) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
- } else {
- /* LPS0 exit */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_EXIT_AMD :
ACPI_LPS0_EXIT,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT,
lps0_dsm_func_mask, lps0_dsm_guid);
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* Screen on */
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask, lps0_dsm_guid);
- }
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_SCREEN_ON_AMD :
ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask, lps0_dsm_guid);
}
static const struct platform_s2idle_ops acpi_s2idle_ops_lps0 = {
2.30.2
I noticed this didn't get picked up automatically for 5.14.6, so as the submitter of the original patch here is an explicit:
Acked-by: Mario Limonciello mario.limonciello@amd.com
On Tue, Sep 21, 2021 at 02:57:26PM +0000, Limonciello, Mario wrote:
[Public]
-----Original Message----- From: Sasha Levin sashal@kernel.org Sent: Monday, September 13, 2021 17:33 To: linux-kernel@vger.kernel.org; stable@vger.kernel.org Cc: Limonciello, Mario Mario.Limonciello@amd.com; Maxwell Beck max@ryt.one; Rafael J . Wysocki rafael.j.wysocki@intel.com; Sasha Levin sashal@kernel.org; linux-acpi@vger.kernel.org Subject: [PATCH AUTOSEL 5.14 10/25] ACPI: PM: s2idle: Run both AMD and Microsoft methods if both are supported
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit fa209644a7124b3f4cf811ced55daef49ae39ac6 ]
It was reported that on "HP ENVY x360" that power LED does not come back, certain keys like brightness controls do not work, and the fan never spins up, even under load on 5.14 final.
In analysis of the SSDT it's clear that the Microsoft UUID doesn't provide functional support, but rather the AMD UUID should be supporting this system.
Because this is a gap in the expected logic, we checked back with internal team. The conclusion was that on Windows AMD uPEP *does* run even when Microsoft UUID present, but most OEM systems have adopted value of "0x3" for supported functions and hence nothing runs.
Henceforth add support for running both Microsoft and AMD methods. This approach will also allow the same logic on Intel systems if desired at a future time as well by pulling the evaluation of `lps0_dsm_func_mask_microsoft` out of the `if` block for `acpi_s2idle_vendor_amd`.
Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr eedesktop.org%2Fdrm%2Famd%2Fuploads%2F9fbcd7ec3a385cc6949c9bacf45d c41b%2Facpi- f.20.bin&data=04%7C01%7Cmario.limonciello%40amd.com%7Ce1f8dfc3d bfb45fc44ef08d97706917f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C 0%7C637671692363481559%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sda ta=wP0oz8OMnby9PA4MFrbY1ZAT%2FKv1jctTyXl%2BDteNHqY%3D&reserv ed=0 BugLink: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr eedesktop.org%2Fdrm%2Famd%2F- %2Fissues%2F1691&data=04%7C01%7Cmario.limonciello%40amd.com%7 Ce1f8dfc3dbfb45fc44ef08d97706917f%7C3dd8961fe4884e608e11a82d994e183 d%7C0%7C0%7C637671692363481559%7CUnknown%7CTWFpbGZsb3d8eyJWIj oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C100 0&sdata=MVhVz%2BYBTdwgvkkSRRFsL5QdfDLPgTzoMBjD4dsFfMA%3D&a mp;reserved=0 Reported-by: Maxwell Beck max@ryt.one Signed-off-by: Mario Limonciello mario.limonciello@amd.com [ rjw: Edits of the new comments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/acpi/x86/s2idle.c | 67 +++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 3a308461246a..bd92b549fd5a 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -449,25 +449,30 @@ int acpi_s2idle_prepare_late(void) if (pm_debug_messages_on) lpi_check_constraints();
- if (lps0_dsm_func_mask_microsoft > 0) {
- /* Screen off */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_SCREEN_OFF_AMD :
ACPI_LPS0_SCREEN_OFF,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* LPS0 entry */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_ENTRY_AMD :
ACPI_LPS0_ENTRY,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) { acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- } else if (acpi_s2idle_vendor_amd()) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
- } else {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY,
lps0_dsm_func_mask, lps0_dsm_guid);
/* modern standby entry */
acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft); }
- return 0;
}
@@ -476,24 +481,30 @@ void acpi_s2idle_restore_early(void) if (!lps0_device_handle || sleep_no_lps0) return;
- if (lps0_dsm_func_mask_microsoft > 0) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* Modern standby exit */
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT, lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- } else if (acpi_s2idle_vendor_amd()) {
acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON_AMD,
lps0_dsm_func_mask, lps0_dsm_guid);
- } else {
- /* LPS0 exit */
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_EXIT_AMD :
ACPI_LPS0_EXIT,
lps0_dsm_func_mask, lps0_dsm_guid);
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT,
lps0_dsm_func_mask, lps0_dsm_guid);
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- /* Screen on */
- if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask, lps0_dsm_guid);
- }
lps0_dsm_func_mask_microsoft,
lps0_dsm_guid_microsoft);
- if (lps0_dsm_func_mask > 0)
acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ?
ACPI_LPS0_SCREEN_ON_AMD :
ACPI_LPS0_SCREEN_ON,
lps0_dsm_func_mask, lps0_dsm_guid);
}
static const struct platform_s2idle_ops acpi_s2idle_ops_lps0 = {
2.30.2
I noticed this didn't get picked up automatically for 5.14.6, so as the submitter of the original patch here is an explicit:
Acked-by: Mario Limonciello mario.limonciello@amd.com
Oh it did now, AUTOSEL just has a lengthier review process. Thanks!
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 2ad32cf09bd28a21e6ad1595355a023ed631b529 ]
If we hit a decoding error late in the frame, then we might exit the function without putting the pool_ns string. Ensure that we always put that reference on the way out of the function.
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 39db97f149b9..c2d654156783 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4134,8 +4134,9 @@ void ceph_handle_caps(struct ceph_mds_session *session, done: mutex_unlock(&session->s_mutex); done_unlocked: - ceph_put_string(extra_info.pool_ns); iput(inode); +out: + ceph_put_string(extra_info.pool_ns); return;
flush_cap_releases: @@ -4150,7 +4151,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, bad: pr_err("ceph_handle_caps: corrupt message\n"); ceph_msg_dump(msg); - return; + goto out; }
/*
From: Jeff Layton jlayton@kernel.org
[ Upstream commit b11ed50346683a749632ea664959b28d524d7395 ]
The current code will update the mtime and then try to get caps to handle the write. If we end up having to request caps from the MDS, then the mtime in the cap grant will clobber the updated mtime and it'll be lost.
This is most noticable when two clients are alternately writing to the same file. Fw caps are continually being granted and revoked, and the mtime ends up stuck because the updated mtimes are always being overwritten with the old one.
Fix this by changing the order of operations in ceph_write_iter to get the caps before updating the times. Also, make sure we check the pool full conditions before even getting any caps or uninlining.
URL: https://tracker.ceph.com/issues/46574 Reported-by: Jozef Kováč kovac@firma.zoznam.sk Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Xiubo Li xiubli@redhat.com Reviewed-by: Luis Henriques lhenriques@suse.de Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/file.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/fs/ceph/file.c b/fs/ceph/file.c index d1755ac1d964..3daebfaec8c6 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1722,32 +1722,26 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) goto out; }
- err = file_remove_privs(file); - if (err) + down_read(&osdc->lock); + map_flags = osdc->osdmap->flags; + pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id); + up_read(&osdc->lock); + if ((map_flags & CEPH_OSDMAP_FULL) || + (pool_flags & CEPH_POOL_FLAG_FULL)) { + err = -ENOSPC; goto out; + }
- err = file_update_time(file); + err = file_remove_privs(file); if (err) goto out;
- inode_inc_iversion_raw(inode); - if (ci->i_inline_version != CEPH_INLINE_NONE) { err = ceph_uninline_data(file, NULL); if (err < 0) goto out; }
- down_read(&osdc->lock); - map_flags = osdc->osdmap->flags; - pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id); - up_read(&osdc->lock); - if ((map_flags & CEPH_OSDMAP_FULL) || - (pool_flags & CEPH_POOL_FLAG_FULL)) { - err = -ENOSPC; - goto out; - } - dout("aio_write %p %llx.%llx %llu~%zd getting caps. i_size %llu\n", inode, ceph_vinop(inode), pos, count, i_size_read(inode)); if (fi->fmode & CEPH_FILE_MODE_LAZY) @@ -1759,6 +1753,12 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) if (err < 0) goto out;
+ err = file_update_time(file); + if (err) + goto out_caps; + + inode_inc_iversion_raw(inode); + dout("aio_write %p %llx.%llx %llu~%zd got cap refs on %s\n", inode, ceph_vinop(inode), pos, count, ceph_cap_string(got));
@@ -1842,6 +1842,8 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) }
goto out_unlocked; +out_caps: + ceph_put_cap_refs(ci, got); out: if (direct_lock) ceph_end_io_direct(inode);
From: Xiubo Li xiubli@redhat.com
[ Upstream commit a6d37ccdd240e80f26aaea0e62cda310e0e184d7 ]
capsnaps will take inode references via ihold when queueing to flush. When force unmounting, the client will just close the sessions and may never get a flush reply, causing a leak and inode ref leak.
Fix this by removing the capsnaps for an inode when removing the caps.
URL: https://tracker.ceph.com/issues/52295 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 68 +++++++++++++++++++++++++++++++++----------- fs/ceph/mds_client.c | 31 +++++++++++++++++++- fs/ceph/super.h | 6 ++++ 3 files changed, 87 insertions(+), 18 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index c2d654156783..73d8487a71d8 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -3114,7 +3114,16 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, break; } } - BUG_ON(!found); + + if (!found) { + /* + * The capsnap should already be removed when removing + * auth cap in the case of a forced unmount. + */ + WARN_ON_ONCE(ci->i_auth_cap); + goto unlock; + } + capsnap->dirty_pages -= nr; if (capsnap->dirty_pages == 0) { complete_capsnap = true; @@ -3136,6 +3145,7 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, complete_capsnap ? " (complete capsnap)" : ""); }
+unlock: spin_unlock(&ci->i_ceph_lock);
if (last) { @@ -3606,6 +3616,43 @@ static void handle_cap_flush_ack(struct inode *inode, u64 flush_tid, iput(inode); }
+void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc; + bool ret; + + lockdep_assert_held(&ci->i_ceph_lock); + + dout("removing capsnap %p, inode %p ci %p\n", capsnap, inode, ci); + + list_del_init(&capsnap->ci_item); + ret = __detach_cap_flush_from_ci(ci, &capsnap->cap_flush); + if (wake_ci) + *wake_ci = ret; + + spin_lock(&mdsc->cap_dirty_lock); + if (list_empty(&ci->i_cap_flush_list)) + list_del_init(&ci->i_flushing_item); + + ret = __detach_cap_flush_from_mdsc(mdsc, &capsnap->cap_flush); + if (wake_mdsc) + *wake_mdsc = ret; + spin_unlock(&mdsc->cap_dirty_lock); +} + +void ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + + lockdep_assert_held(&ci->i_ceph_lock); + + WARN_ON_ONCE(capsnap->dirty_pages || capsnap->writing); + __ceph_remove_capsnap(inode, capsnap, wake_ci, wake_mdsc); +} + /* * Handle FLUSHSNAP_ACK. MDS has flushed snap data to disk and we can * throw away our cap_snap. @@ -3643,23 +3690,10 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid, capsnap, capsnap->follows); } } - if (flushed) { - WARN_ON(capsnap->dirty_pages || capsnap->writing); - dout(" removing %p cap_snap %p follows %lld\n", - inode, capsnap, follows); - list_del(&capsnap->ci_item); - wake_ci |= __detach_cap_flush_from_ci(ci, &capsnap->cap_flush); - - spin_lock(&mdsc->cap_dirty_lock); - - if (list_empty(&ci->i_cap_flush_list)) - list_del_init(&ci->i_flushing_item); - - wake_mdsc |= __detach_cap_flush_from_mdsc(mdsc, - &capsnap->cap_flush); - spin_unlock(&mdsc->cap_dirty_lock); - } + if (flushed) + ceph_remove_capsnap(inode, capsnap, &wake_ci, &wake_mdsc); spin_unlock(&ci->i_ceph_lock); + if (flushed) { ceph_put_snap_context(capsnap->context); ceph_put_cap_snap(capsnap); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 0b69aec23e5c..bbd46b728eeb 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1583,14 +1583,39 @@ int ceph_iterate_session_caps(struct ceph_mds_session *session, return ret; }
+static int remove_capsnaps(struct ceph_mds_client *mdsc, struct inode *inode) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_cap_snap *capsnap; + int capsnap_release = 0; + + lockdep_assert_held(&ci->i_ceph_lock); + + dout("removing capsnaps, ci is %p, inode is %p\n", ci, inode); + + while (!list_empty(&ci->i_cap_snaps)) { + capsnap = list_first_entry(&ci->i_cap_snaps, + struct ceph_cap_snap, ci_item); + __ceph_remove_capsnap(inode, capsnap, NULL, NULL); + ceph_put_snap_context(capsnap->context); + ceph_put_cap_snap(capsnap); + capsnap_release++; + } + wake_up_all(&ci->i_cap_wq); + wake_up_all(&mdsc->cap_flushing_wq); + return capsnap_release; +} + static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, void *arg) { struct ceph_fs_client *fsc = (struct ceph_fs_client *)arg; + struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_inode_info *ci = ceph_inode(inode); LIST_HEAD(to_remove); bool dirty_dropped = false; bool invalidate = false; + int capsnap_release = 0;
dout("removing cap %p, ci is %p, inode is %p\n", cap, ci, &ci->vfs_inode); @@ -1598,7 +1623,6 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, __ceph_remove_cap(cap, false); if (!ci->i_auth_cap) { struct ceph_cap_flush *cf; - struct ceph_mds_client *mdsc = fsc->mdsc;
if (READ_ONCE(fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN) { if (inode->i_data.nrpages > 0) @@ -1662,6 +1686,9 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, list_add(&ci->i_prealloc_cap_flush->i_list, &to_remove); ci->i_prealloc_cap_flush = NULL; } + + if (!list_empty(&ci->i_cap_snaps)) + capsnap_release = remove_capsnaps(mdsc, inode); } spin_unlock(&ci->i_ceph_lock); while (!list_empty(&to_remove)) { @@ -1678,6 +1705,8 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, ceph_queue_invalidate(inode); if (dirty_dropped) iput(inode); + while (capsnap_release--) + iput(inode); return 0; }
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index b1a363641beb..2200ed76b123 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1163,6 +1163,12 @@ extern void ceph_put_cap_refs_no_check_caps(struct ceph_inode_info *ci, int had); extern void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, struct ceph_snap_context *snapc); +extern void __ceph_remove_capsnap(struct inode *inode, + struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc); +extern void ceph_remove_capsnap(struct inode *inode, + struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc); extern void ceph_flush_snaps(struct ceph_inode_info *ci, struct ceph_mds_session **psession); extern bool __ceph_should_report_size(struct ceph_inode_info *ci);
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 3eaf5aa1cfa8c97c72f5824e2e9263d6cc977b03 ]
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 73d8487a71d8..1fa961a5a22f 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1856,6 +1856,8 @@ static u64 __mark_caps_flushing(struct inode *inode, * try to invalidate mapping pages without blocking. */ static int try_nonblocking_invalidate(struct inode *inode) + __releases(ci->i_ceph_lock) + __acquires(ci->i_ceph_lock) { struct ceph_inode_info *ci = ceph_inode(inode); u32 invalidating_gen = ci->i_rdcache_gen;
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 88b604263f3d6eedae0b1c2c3bbd602d1e2e8775 ]
current_stack_pointer() simply returns current value of %r15. If current_stack_pointer() caller allocates stack (which is the case in unwind code) %r15 points to a stack frame allocated for callees, meaning current_stack_pointer() caller (e.g. stack_trace_save) will end up in the stacktrace. This is not expected by stack_trace_save*() callers and causes problems.
current_frame_address() on the other hand returns function stack frame address, which matches %r15 upon function invocation. Using it in get_stack_pointer() makes it more aligned with x86 implementation (according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*() caller's expectations, notably KCSAN.
Also make sure unwind_start is always inlined.
Reported-by: Nathan Chancellor nathan@kernel.org Suggested-by: Marco Elver elver@google.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Tested-by: Marco Elver elver@google.com Tested-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/patch.git-04dd26be3043.your-ad-here.call-016305048... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/stacktrace.h | 20 ++++++++++---------- arch/s390/include/asm/unwind.h | 8 ++++---- 2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h index 3d8a4b94c620..dd00d98804ec 100644 --- a/arch/s390/include/asm/stacktrace.h +++ b/arch/s390/include/asm/stacktrace.h @@ -34,16 +34,6 @@ static inline bool on_stack(struct stack_info *info, return addr >= info->begin && addr + len <= info->end; }
-static __always_inline unsigned long get_stack_pointer(struct task_struct *task, - struct pt_regs *regs) -{ - if (regs) - return (unsigned long) kernel_stack_pointer(regs); - if (task == current) - return current_stack_pointer(); - return (unsigned long) task->thread.ksp; -} - /* * Stack layout of a C stack frame. */ @@ -74,6 +64,16 @@ struct stack_frame { ((unsigned long)__builtin_frame_address(0) - \ offsetof(struct stack_frame, back_chain))
+static __always_inline unsigned long get_stack_pointer(struct task_struct *task, + struct pt_regs *regs) +{ + if (regs) + return (unsigned long)kernel_stack_pointer(regs); + if (task == current) + return current_frame_address(); + return (unsigned long)task->thread.ksp; +} + /* * To keep this simple mark register 2-6 as being changed (volatile) * by the called function, even though register 6 is saved/nonvolatile. diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h index de9006b0cfeb..5ebf534ef753 100644 --- a/arch/s390/include/asm/unwind.h +++ b/arch/s390/include/asm/unwind.h @@ -55,10 +55,10 @@ static inline bool unwind_error(struct unwind_state *state) return state->error; }
-static inline void unwind_start(struct unwind_state *state, - struct task_struct *task, - struct pt_regs *regs, - unsigned long first_frame) +static __always_inline void unwind_start(struct unwind_state *state, + struct task_struct *task, + struct pt_regs *regs, + unsigned long first_frame) { task = task ?: current; first_frame = first_frame ?: get_stack_pointer(task, regs);
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 8f96a5bfa1503e0a5f3c78d51e993a1794d4aff1 ]
We update the ctime/mtime of a block device when we remove it so that blkid knows the device changed. However we do this by re-opening the block device and calling filp_update_time. This is more correct because it'll call the inode->i_op->update_time if it exists, but the block dev inodes do not do this. Instead call generic_update_time() on the bd_inode in order to avoid the blkdev_open path and get rid of the following lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #406 Not tainted ------------------------------------------------------ losetup/11596 is trying to acquire lock: ffff939640d2f538 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_get_by_dev.part.0+0x56/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 file_open_name+0xc7/0x170 filp_open+0x2c/0x50 btrfs_scratch_superblocks.part.0+0x10f/0x170 btrfs_rm_device.cold+0xe8/0xed btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11596: #0: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace: CPU: 1 PID: 11596 Comm: losetup Not tainted 5.14.0-rc2+ #406 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 354ffd8f81af..8dc14985128f 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1925,15 +1925,17 @@ static int btrfs_add_dev_item(struct btrfs_trans_handle *trans, * Function to update ctime/mtime for a given device path. * Mainly used for ctime/mtime based probe like libblkid. */ -static void update_dev_time(const char *path_name) +static void update_dev_time(struct block_device *bdev) { - struct file *filp; + struct inode *inode = bdev->bd_inode; + struct timespec64 now;
- filp = filp_open(path_name, O_RDWR, 0); - if (IS_ERR(filp)) + /* Shouldn't happen but just in case. */ + if (!inode) return; - file_update_time(filp); - filp_close(filp, NULL); + + now = current_time(inode); + generic_update_time(inode, &now, S_MTIME | S_CTIME); }
static int btrfs_rm_dev_item(struct btrfs_device *device) @@ -2113,7 +2115,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info, btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
/* Update ctime/mtime for device path for libblkid */ - update_dev_time(device_path); + update_dev_time(bdev); }
int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, @@ -2766,7 +2768,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path btrfs_forget_devices(device_path);
/* Update ctime/mtime for blkid or udev */ - update_dev_time(device_path); + update_dev_time(bdev);
return ret;
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 3fa421dedbc82f985f030c5a6480ea2d784334c3 ]
When removing the device we call blkdev_put() on the device once we've removed it, and because we have an EXCL open we need to take the ->open_mutex on the block device to clean it up. Unfortunately during device remove we are holding the sb writers lock, which results in the following lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #407 Not tainted ------------------------------------------------------ losetup/11595 is trying to acquire lock: ffff973ac35dd138 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_put+0x3a/0x220 btrfs_rm_device.cold+0x62/0xe5 btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11595: #0: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace: CPU: 0 PID: 11595 Comm: losetup Not tainted 5.14.0-rc2+ #407 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc21255d4cb
So instead save the bdev and do the put once we've dropped the sb writers lock in order to avoid the lockdep recursion.
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ioctl.c | 15 +++++++++++---- fs/btrfs/volumes.c | 23 +++++++++++++++++------ fs/btrfs/volumes.h | 3 ++- 3 files changed, 30 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 0ba98e08a029..50e12989e84a 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3205,6 +3205,8 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args_v2 *vol_args; + struct block_device *bdev = NULL; + fmode_t mode; int ret; bool cancel = false;
@@ -3237,9 +3239,9 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) /* Exclusive operation is now claimed */
if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) - ret = btrfs_rm_device(fs_info, NULL, vol_args->devid); + ret = btrfs_rm_device(fs_info, NULL, vol_args->devid, &bdev, &mode); else - ret = btrfs_rm_device(fs_info, vol_args->name, 0); + ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode);
btrfs_exclop_finish(fs_info);
@@ -3255,6 +3257,8 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) kfree(vol_args); err_drop: mnt_drop_write_file(file); + if (bdev) + blkdev_put(bdev, mode); return ret; }
@@ -3263,6 +3267,8 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args *vol_args; + struct block_device *bdev = NULL; + fmode_t mode; int ret; bool cancel;
@@ -3284,7 +3290,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE, cancel); if (ret == 0) { - ret = btrfs_rm_device(fs_info, vol_args->name, 0); + ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode); if (!ret) btrfs_info(fs_info, "disk deleted %s", vol_args->name); btrfs_exclop_finish(fs_info); @@ -3293,7 +3299,8 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) kfree(vol_args); out_drop_write: mnt_drop_write_file(file); - + if (bdev) + blkdev_put(bdev, mode); return ret; }
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 8dc14985128f..7f38f81f1a92 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2119,7 +2119,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info, }
int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, - u64 devid) + u64 devid, struct block_device **bdev, fmode_t *mode) { struct btrfs_device *device; struct btrfs_fs_devices *cur_devices; @@ -2233,15 +2233,26 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, mutex_unlock(&fs_devices->device_list_mutex);
/* - * at this point, the device is zero sized and detached from - * the devices list. All that's left is to zero out the old - * supers and free the device. + * At this point, the device is zero sized and detached from the + * devices list. All that's left is to zero out the old supers and + * free the device. + * + * We cannot call btrfs_close_bdev() here because we're holding the sb + * write lock, and blkdev_put() will pull in the ->open_mutex on the + * block device and it's dependencies. Instead just flush the device + * and let the caller do the final blkdev_put. */ - if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) + if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) { btrfs_scratch_superblocks(fs_info, device->bdev, device->name->str); + if (device->bdev) { + sync_blockdev(device->bdev); + invalidate_bdev(device->bdev); + } + }
- btrfs_close_bdev(device); + *bdev = device->bdev; + *mode = device->mode; synchronize_rcu(); btrfs_free_device(device);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 55a8ba244716..f77f869dfd2c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -472,7 +472,8 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, const u8 *uuid); void btrfs_free_device(struct btrfs_device *device); int btrfs_rm_device(struct btrfs_fs_info *fs_info, - const char *device_path, u64 devid); + const char *device_path, u64 devid, + struct block_device **bdev, fmode_t *mode); void __exit btrfs_cleanup_fs_uuids(void); int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len); int btrfs_grow_device(struct btrfs_trans_handle *trans,
From: Anand Jain anand.jain@oracle.com
[ Upstream commit c124706900c20dee70f921bb3a90492431561a0a ]
Following test case reproduces lockdep warning.
Test case:
$ mkfs.btrfs -f <dev1> $ btrfstune -S 1 <dev1> $ mount <dev1> <mnt> $ btrfs device add <dev2> <mnt> -f $ umount <mnt> $ mount <dev2> <mnt> $ umount <mnt>
The warning claims a possible ABBA deadlock between the threads initiated by [#1] btrfs device add and [#0] the mount.
[ 540.743122] WARNING: possible circular locking dependency detected [ 540.743129] 5.11.0-rc7+ #5 Not tainted [ 540.743135] ------------------------------------------------------ [ 540.743142] mount/2515 is trying to acquire lock: [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs] [ 540.743458] but task is already holding lock: [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743541] which lock already depends on the new lock. [ 540.743543] the existing dependency chain (in reverse order) is:
[ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: [ 540.743566] down_read_nested+0x48/0x2b0 [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs] [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] [ 540.744166] __x64_sys_ioctl+0xe4/0x140 [ 540.744170] do_syscall_64+0x4b/0x80 [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: [ 540.744184] __lock_acquire+0x155f/0x2360 [ 540.744188] lock_acquire+0x10b/0x5c0 [ 540.744190] __mutex_lock+0xb1/0xf80 [ 540.744193] mutex_lock_nested+0x27/0x30 [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] [ 540.744472] legacy_get_tree+0x38/0x90 [ 540.744475] vfs_get_tree+0x30/0x140 [ 540.744478] fc_mount+0x16/0x60 [ 540.744482] vfs_kern_mount+0x91/0x100 [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] [ 540.744536] legacy_get_tree+0x38/0x90 [ 540.744537] vfs_get_tree+0x30/0x140 [ 540.744539] path_mount+0x8d8/0x1070 [ 540.744541] do_mount+0x8d/0xc0 [ 540.744543] __x64_sys_mount+0x125/0x160 [ 540.744545] do_syscall_64+0x4b/0x80 [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 540.744551] other info that might help us debug this: [ 540.744552] Possible unsafe locking scenario:
[ 540.744553] CPU0 CPU1 [ 540.744554] ---- ---- [ 540.744555] lock(btrfs-chunk-00); [ 540.744557] lock(&fs_devs->device_list_mutex); [ 540.744560] lock(btrfs-chunk-00); [ 540.744562] lock(&fs_devs->device_list_mutex); [ 540.744564] *** DEADLOCK ***
[ 540.744565] 3 locks held by mount/2515: [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450 [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.744708] stack backtrace: [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5
But the device_list_mutex in clone_fs_devices() is redundant, as explained below. Two threads [1] and [2] (below) could lead to clone_fs_device().
[1] open_ctree <== mount sprout fs btrfs_read_chunk_tree() mutex_lock(&uuid_mutex) <== global lock read_one_dev() open_seed_devices() clone_fs_devices() <== seed fs_devices mutex_lock(&orig->device_list_mutex) <== seed fs_devices
[2] btrfs_init_new_device() <== sprouting mutex_lock(&uuid_mutex); <== global lock btrfs_prepare_sprout() lockdep_assert_held(&uuid_mutex) clone_fs_devices(seed_fs_device) <== seed fs_devices
Both of these threads hold uuid_mutex which is sufficient to protect getting the seed device(s) freed while we are trying to clone it for sprouting [2] or mounting a sprout [1] (as above). A mounted seed device can not free/write/replace because it is read-only. An unmounted seed device can be freed by btrfs_free_stale_devices(), but it needs uuid_mutex. So this patch removes the unnecessary device_list_mutex in clone_fs_devices(). And adds a lockdep_assert_held(&uuid_mutex) in clone_fs_devices().
Reported-by: Su Yue l@damenly.su Tested-by: Su Yue l@damenly.su Signed-off-by: Anand Jain anand.jain@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 7f38f81f1a92..51846e582257 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -570,6 +570,8 @@ static int btrfs_free_stale_devices(const char *path, struct btrfs_device *device, *tmp_device; int ret = 0;
+ lockdep_assert_held(&uuid_mutex); + if (path) ret = -ENOENT;
@@ -1000,11 +1002,12 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) struct btrfs_device *orig_dev; int ret = 0;
+ lockdep_assert_held(&uuid_mutex); + fs_devices = alloc_fs_devices(orig->fsid, NULL); if (IS_ERR(fs_devices)) return fs_devices;
- mutex_lock(&orig->device_list_mutex); fs_devices->total_devices = orig->total_devices;
list_for_each_entry(orig_dev, &orig->devices, dev_list) { @@ -1036,10 +1039,8 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) device->fs_devices = fs_devices; fs_devices->num_devices++; } - mutex_unlock(&orig->device_list_mutex); return fs_devices; error: - mutex_unlock(&orig->device_list_mutex); free_fs_devices(fs_devices); return ERR_PTR(ret); }
From: Ohhoon Kwon ohoono.kwon@samsung.com
[ Upstream commit c2f273ebd89a79ed87ef1025753343e327b99ac9 ]
While comm change event via prctl has been reported to proc connector by 'commit f786ecba4158 ("connector: add comm change event report to proc connector")', connector listeners were missing comm changes by explicit writes on /proc/[pid]/comm.
Let explicit writes on /proc/[pid]/comm report to proc connector.
Link: https://lkml.kernel.org/r/20210701133458epcms1p68e9eb9bd0eee8903ba26679a37d9... Signed-off-by: Ohhoon Kwon ohoono.kwon@samsung.com Cc: Ingo Molnar mingo@kernel.org Cc: David S. Miller davem@davemloft.net Cc: Christian Brauner christian.brauner@ubuntu.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: Alexey Dobriyan adobriyan@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/proc/base.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index e5b5f7709d48..533d5836eb9a 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -95,6 +95,7 @@ #include <linux/posix-timers.h> #include <linux/time_namespace.h> #include <linux/resctrl.h> +#include <linux/cn_proc.h> #include <trace/events/oom.h> #include "internal.h" #include "fd.h" @@ -1674,8 +1675,10 @@ static ssize_t comm_write(struct file *file, const char __user *buf, if (!p) return -ESRCH;
- if (same_thread_group(current, p)) + if (same_thread_group(current, p)) { set_task_comm(p, buffer); + proc_comm_connector(p); + } else count = -EINVAL;
Sasha Levin sashal@kernel.org writes:
From: Ohhoon Kwon ohoono.kwon@samsung.com
[ Upstream commit c2f273ebd89a79ed87ef1025753343e327b99ac9 ]
While comm change event via prctl has been reported to proc connector by 'commit f786ecba4158 ("connector: add comm change event report to proc connector")', connector listeners were missing comm changes by explicit writes on /proc/[pid]/comm.
Let explicit writes on /proc/[pid]/comm report to proc connector.
This is a potential userspace ABI breakage? Why backport it?
Especially if there is no one asking for the behavior change in userspace?
Eric
Link: https://lkml.kernel.org/r/20210701133458epcms1p68e9eb9bd0eee8903ba26679a37d9... Signed-off-by: Ohhoon Kwon ohoono.kwon@samsung.com Cc: Ingo Molnar mingo@kernel.org Cc: David S. Miller davem@davemloft.net Cc: Christian Brauner christian.brauner@ubuntu.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: Alexey Dobriyan adobriyan@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org
fs/proc/base.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index e5b5f7709d48..533d5836eb9a 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -95,6 +95,7 @@ #include <linux/posix-timers.h> #include <linux/time_namespace.h> #include <linux/resctrl.h> +#include <linux/cn_proc.h> #include <trace/events/oom.h> #include "internal.h" #include "fd.h" @@ -1674,8 +1675,10 @@ static ssize_t comm_write(struct file *file, const char __user *buf, if (!p) return -ESRCH;
- if (same_thread_group(current, p))
- if (same_thread_group(current, p)) { set_task_comm(p, buffer);
proc_comm_connector(p);
- } else count = -EINVAL;
On Wed, Sep 15, 2021 at 08:45:37AM -0500, Eric W. Biederman wrote:
Sasha Levin sashal@kernel.org writes:
From: Ohhoon Kwon ohoono.kwon@samsung.com
[ Upstream commit c2f273ebd89a79ed87ef1025753343e327b99ac9 ]
While comm change event via prctl has been reported to proc connector by 'commit f786ecba4158 ("connector: add comm change event report to proc connector")', connector listeners were missing comm changes by explicit writes on /proc/[pid]/comm.
Let explicit writes on /proc/[pid]/comm report to proc connector.
This is a potential userspace ABI breakage? Why backport it?
Especially if there is no one asking for the behavior change in userspace?
This sounds like a concern with the patch going upstream rather than going to stable? stable has the same policy around ABI changes such as upstream.
Sasha Levin sashal@kernel.org writes:
On Wed, Sep 15, 2021 at 08:45:37AM -0500, Eric W. Biederman wrote:
Sasha Levin sashal@kernel.org writes:
From: Ohhoon Kwon ohoono.kwon@samsung.com
[ Upstream commit c2f273ebd89a79ed87ef1025753343e327b99ac9 ]
While comm change event via prctl has been reported to proc connector by 'commit f786ecba4158 ("connector: add comm change event report to proc connector")', connector listeners were missing comm changes by explicit writes on /proc/[pid]/comm.
Let explicit writes on /proc/[pid]/comm report to proc connector.
This is a potential userspace ABI breakage? Why backport it?
Especially if there is no one asking for the behavior change in userspace?
This sounds like a concern with the patch going upstream rather than going to stable? stable has the same policy around ABI changes such as upstream.
Let me say it another way. This looks more like an evolution of the functionality rather than a bug fix.
With something like this unless someone cares I don't think it should be backported. It is all risk and no benefit.
This is all doubly so because I think there are about 2 connector users and connector is not especially good at the job it tries to fulfill. It is for that exact reason that connector does not work in containers. We couldn't find any users who cared.
After the fiasco with the rlimit/ucount changes getting backported before they are even stable is that I am tired of saying about backports meh whatever.
If there is no one who actually cares (which is what I learned about autosel from the rlimit/ucount fiasco) it makes no sense to backport things unless they really are bug fixes.
Backporting this just looks like senseless churn.
Eric
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 5f5dec07aca7067216ed4c1342e464e7307a9197 ]
Patch series "nilfs2: fix incorrect usage of kobject".
This patchset from Nanyong Sun fixes memory leak issues and a NULL pointer dereference issue caused by incorrect usage of kboject in nilfs2 sysfs implementation.
This patch (of 6):
Reported by syzkaller:
BUG: memory leak unreferenced object 0xffff888100ca8988 (size 8): comm "syz-executor.1", pid 1930, jiffies 4294745569 (age 18.052s) hex dump (first 8 bytes): 6c 6f 6f 70 31 00 ff ff loop1... backtrace: kstrdup+0x36/0x70 mm/util.c:60 kstrdup_const+0x35/0x60 mm/util.c:83 kvasprintf_const+0xf1/0x180 lib/kasprintf.c:48 kobject_set_name_vargs+0x56/0x150 lib/kobject.c:289 kobject_add_varg lib/kobject.c:384 [inline] kobject_init_and_add+0xc9/0x150 lib/kobject.c:473 nilfs_sysfs_create_device_group+0x150/0x7d0 fs/nilfs2/sysfs.c:986 init_nilfs+0xa21/0xea0 fs/nilfs2/the_nilfs.c:637 nilfs_fill_super fs/nilfs2/super.c:1046 [inline] nilfs_mount+0x7b4/0xe80 fs/nilfs2/super.c:1316 legacy_get_tree+0x105/0x210 fs/fs_context.c:592 vfs_get_tree+0x8e/0x2d0 fs/super.c:1498 do_new_mount fs/namespace.c:2905 [inline] path_mount+0xf9b/0x1990 fs/namespace.c:3235 do_mount+0xea/0x100 fs/namespace.c:3248 __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount fs/namespace.c:3433 [inline] __x64_sys_mount+0x14b/0x1f0 fs/namespace.c:3433 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
If kobject_init_and_add return with error, then the cleanup of kobject is needed because memory may be allocated in kobject_init_and_add without freeing.
And the place of cleanup_dev_kobject should use kobject_put to free the memory associated with the kobject. As the section "Kobject removal" of "Documentation/core-api/kobject.rst" says, kobject_del() just makes the kobject "invisible", but it is not cleaned up. And no more cleanup will do after cleanup_dev_kobject, so kobject_put is needed here.
Link: https://lkml.kernel.org/r/1625651306-10829-1-git-send-email-konishi.ryusuke@... Link: https://lkml.kernel.org/r/1625651306-10829-2-git-send-email-konishi.ryusuke@... Reported-by: Hulk Robot hulkci@huawei.com Link: https://lkml.kernel.org/r/20210629022556.3985106-2-sunnanyong@huawei.com Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 68e8d61e28dd..d2d8ea89937a 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -986,7 +986,7 @@ int nilfs_sysfs_create_device_group(struct super_block *sb) err = kobject_init_and_add(&nilfs->ns_dev_kobj, &nilfs_dev_ktype, NULL, "%s", sb->s_id); if (err) - goto free_dev_subgroups; + goto cleanup_dev_kobject;
err = nilfs_sysfs_create_mounted_snapshots_group(nilfs); if (err) @@ -1023,9 +1023,7 @@ int nilfs_sysfs_create_device_group(struct super_block *sb) nilfs_sysfs_delete_mounted_snapshots_group(nilfs);
cleanup_dev_kobject: - kobject_del(&nilfs->ns_dev_kobj); - -free_dev_subgroups: + kobject_put(&nilfs->ns_dev_kobj); kfree(nilfs->ns_dev_subgroups);
failed_create_device_group:
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit dbc6e7d44a514f231a64d9d5676e001b660b6448 ]
In nilfs_##name##_attr_release, kobj->parent should not be referenced because it is a NULL pointer. The release() method of kobject is always called in kobject_put(kobj), in the implementation of kobject_put(), the kobj->parent will be assigned as NULL before call the release() method. So just use kobj to get the subgroups, which is more efficient and can fix a NULL pointer reference problem.
Link: https://lkml.kernel.org/r/20210629022556.3985106-3-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-3-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index d2d8ea89937a..ec85ac53720d 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -51,11 +51,9 @@ static const struct sysfs_ops nilfs_##name##_attr_ops = { \ #define NILFS_DEV_INT_GROUP_TYPE(name, parent_name) \ static void nilfs_##name##_attr_release(struct kobject *kobj) \ { \ - struct nilfs_sysfs_##parent_name##_subgroups *subgroups; \ - struct the_nilfs *nilfs = container_of(kobj->parent, \ - struct the_nilfs, \ - ns_##parent_name##_kobj); \ - subgroups = nilfs->ns_##parent_name##_subgroups; \ + struct nilfs_sysfs_##parent_name##_subgroups *subgroups = container_of(kobj, \ + struct nilfs_sysfs_##parent_name##_subgroups, \ + sg_##name##_kobj); \ complete(&subgroups->sg_##name##_kobj_unregister); \ } \ static struct kobj_type nilfs_##name##_ktype = { \
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 24f8cb1ed057c840728167dab33b32e44147c86f ]
If kobject_init_and_add return with error, kobject_put() is needed here to avoid memory leak, because kobject_init_and_add may return error without freeing the memory associated with the kobject it allocated.
Link: https://lkml.kernel.org/r/20210629022556.3985106-4-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-4-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index ec85ac53720d..6305e4ef7e39 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -79,8 +79,8 @@ static int nilfs_sysfs_create_##name##_group(struct the_nilfs *nilfs) \ err = kobject_init_and_add(kobj, &nilfs_##name##_ktype, parent, \ #name); \ if (err) \ - return err; \ - return 0; \ + kobject_put(kobj); \ + return err; \ } \ static void nilfs_sysfs_delete_##name##_group(struct the_nilfs *nilfs) \ { \
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit a3e181259ddd61fd378390977a1e4e2316853afa ]
The kobject_put() should be used to cleanup the memory associated with the kobject instead of kobject_del. See the section "Kobject removal" of "Documentation/core-api/kobject.rst".
Link: https://lkml.kernel.org/r/20210629022556.3985106-5-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-5-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 6305e4ef7e39..d989e6500bd7 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -84,7 +84,7 @@ static int nilfs_sysfs_create_##name##_group(struct the_nilfs *nilfs) \ } \ static void nilfs_sysfs_delete_##name##_group(struct the_nilfs *nilfs) \ { \ - kobject_del(&nilfs->ns_##parent_name##_subgroups->sg_##name##_kobj); \ + kobject_put(&nilfs->ns_##parent_name##_subgroups->sg_##name##_kobj); \ }
/************************************************************************
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit b2fe39c248f3fa4bbb2a20759b4fdd83504190f7 ]
If kobject_init_and_add returns with error, kobject_put() is needed here to avoid memory leak, because kobject_init_and_add may return error without freeing the memory associated with the kobject it allocated.
Link: https://lkml.kernel.org/r/20210629022556.3985106-6-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-6-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index d989e6500bd7..5ba87573ad3b 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -195,9 +195,9 @@ int nilfs_sysfs_create_snapshot_group(struct nilfs_root *root) }
if (err) - return err; + kobject_put(&root->snapshot_kobj);
- return 0; + return err; }
void nilfs_sysfs_delete_snapshot_group(struct nilfs_root *root)
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 17243e1c3072b8417a5ebfc53065d0a87af7ca77 ]
kobject_put() should be used to cleanup the memory associated with the kobject instead of kobject_del(). See the section "Kobject removal" of "Documentation/core-api/kobject.rst".
Link: https://lkml.kernel.org/r/20210629022556.3985106-7-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-7-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 5ba87573ad3b..62f8a7ac19c8 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -202,7 +202,7 @@ int nilfs_sysfs_create_snapshot_group(struct nilfs_root *root)
void nilfs_sysfs_delete_snapshot_group(struct nilfs_root *root) { - kobject_del(&root->snapshot_kobj); + kobject_put(&root->snapshot_kobj); }
/************************************************************************
linux-stable-mirror@lists.linaro.org