From: Pavel Kozlov <pavel.kozlov(a)synopsys.com>
Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *")
a memory leakage problem occurs. Memory allocated for page table entries
not released during process termination. This issue can be reproduced by
a small program that allocates a large amount of memory. After several
runs, you'll see that the amount of free memory has reduced and will
continue to reduce after each run. All ARC CPUs are effected by this
issue. The issue was introduced since the kernel stable release v5.15-rc1.
As described in commit d9820ff after switch pgtable_t back to struct
page *, a pointer to "struct page" and appropriate functions are used to
allocate and free a memory page for PTEs, but the pmd_pgtable macro hasn't
changed and returns the direct virtual address from the PMD (PGD) entry.
Than this address used as a parameter in the __pte_free() and as a result
this function couldn't release memory page allocated for PTEs.
Fix this issue by changing the pmd_pgtable macro and returning pointer to
struct page.
Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *")
Signed-off-by: Pavel Kozlov <pavel.kozlov(a)synopsys.com>
Cc: Vineet Gupta <vgupta(a)kernel.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: <stable(a)vger.kernel.org> # 4.15.x
---
arch/arc/include/asm/pgtable-levels.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arc/include/asm/pgtable-levels.h b/arch/arc/include/asm/pgtable-levels.h
index 64ca25d199be..ef68758b69f7 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -161,7 +161,7 @@
#define pmd_pfn(pmd) ((pmd_val(pmd) & PAGE_MASK) >> PAGE_SHIFT)
#define pmd_page(pmd) virt_to_page(pmd_page_vaddr(pmd))
#define set_pmd(pmdp, pmd) (*(pmdp) = pmd)
-#define pmd_pgtable(pmd) ((pgtable_t) pmd_page_vaddr(pmd))
+#define pmd_pgtable(pmd) ((pgtable_t) pmd_page(pmd))
/*
* 4th level paging: pte
--
2.25.1
When the idxd_user_drv driver is bound to a Work Queue (WQ) device
without IOMMU or with IOMMU Passthrough without Shared Virtual
Addressing (SVA), the application gains direct access to physical
memory via the device by programming physical address to a submitted
descriptor. This allows direct userspace read and write access to
arbitrary physical memory. This is inconsistent with the security
goals of a good kernel API.
Unlike vfio_pci driver, the IDXD char device driver does not provide any
ways to pin user pages and translate the address from user VA to IOVA or
PA without IOMMU SVA. Therefore the application has no way to instruct the
device to perform DMA function. This makes the char device not usable for
normal application usage.
Since user type WQ without SVA cannot be used for normal application usage
and presents the security issue, bind idxd_user_drv driver and enable user
type WQ only when SVA is enabled (i.e. user PASID is enabled).
Fixes: 448c3de8ac83 ("dmaengine: idxd: create user driver for wq 'device'")
Cc: stable(a)vger.kernel.org
Suggested-by: Arjan Van De Ven <arjan.van.de.ven(a)intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu(a)intel.com>
Reviewed-by: Dave Jiang <dave.jiang(a)intel.com>
---
v2:
- Update changlog per Dave Hansen's comments
drivers/dma/idxd/cdev.c | 18 ++++++++++++++++++
include/uapi/linux/idxd.h | 1 +
2 files changed, 19 insertions(+)
diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c
index c2808fd081d6..a9b96b18772f 100644
--- a/drivers/dma/idxd/cdev.c
+++ b/drivers/dma/idxd/cdev.c
@@ -312,6 +312,24 @@ static int idxd_user_drv_probe(struct idxd_dev *idxd_dev)
if (idxd->state != IDXD_DEV_ENABLED)
return -ENXIO;
+ /*
+ * User type WQ is enabled only when SVA is enabled for two reasons:
+ * - If no IOMMU or IOMMU Passthrough without SVA, userspace
+ * can directly access physical address through the WQ.
+ * - The IDXD cdev driver does not provide any ways to pin
+ * user pages and translate the address from user VA to IOVA or
+ * PA without IOMMU SVA. Therefore the application has no way
+ * to instruct the device to perform DMA function. This makes
+ * the cdev not usable for normal application usage.
+ */
+ if (!device_user_pasid_enabled(idxd)) {
+ idxd->cmd_status = IDXD_SCMD_WQ_USER_NO_IOMMU;
+ dev_dbg(&idxd->pdev->dev,
+ "User type WQ cannot be enabled without SVA.\n");
+
+ return -EOPNOTSUPP;
+ }
+
mutex_lock(&wq->wq_lock);
wq->type = IDXD_WQT_USER;
rc = drv_enable_wq(wq);
diff --git a/include/uapi/linux/idxd.h b/include/uapi/linux/idxd.h
index 095299c75828..2b9e7feba3f3 100644
--- a/include/uapi/linux/idxd.h
+++ b/include/uapi/linux/idxd.h
@@ -29,6 +29,7 @@ enum idxd_scmd_stat {
IDXD_SCMD_WQ_NO_SIZE = 0x800e0000,
IDXD_SCMD_WQ_NO_PRIV = 0x800f0000,
IDXD_SCMD_WQ_IRQ_ERR = 0x80100000,
+ IDXD_SCMD_WQ_USER_NO_IOMMU = 0x80110000,
};
#define IDXD_SCMD_SOFTERR_MASK 0x80000000
--
2.32.0
commit 456797da792fa7cbf6698febf275fe9b36691f78 upstream.
arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.
This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.
With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.
Reviewed-by: Sudeep Holla <sudeep.holla(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Reviewed-by: Atish Patra <atishp(a)rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
---
arch/arm64/kernel/topology.c | 40 ------------------------------------
drivers/base/arch_topology.c | 19 +++++++++++++++++
2 files changed, 19 insertions(+), 40 deletions(-)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 113903db666c..3d3a673c6704 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -21,46 +21,6 @@
#include <asm/cputype.h>
#include <asm/topology.h>
-void store_cpu_topology(unsigned int cpuid)
-{
- struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
- u64 mpidr;
-
- if (cpuid_topo->package_id != -1)
- goto topology_populated;
-
- mpidr = read_cpuid_mpidr();
-
- /* Uniprocessor systems can rely on default topology values */
- if (mpidr & MPIDR_UP_BITMASK)
- return;
-
- /*
- * This would be the place to create cpu topology based on MPIDR.
- *
- * However, it cannot be trusted to depict the actual topology; some
- * pieces of the architecture enforce an artificial cap on Aff0 values
- * (e.g. GICv3's ICC_SGI1R_EL1 limits it to 15), leading to an
- * artificial cycling of Aff1, Aff2 and Aff3 values. IOW, these end up
- * having absolutely no relationship to the actual underlying system
- * topology, and cannot be reasonably used as core / package ID.
- *
- * If the MT bit is set, Aff0 *could* be used to define a thread ID, but
- * we still wouldn't be able to obtain a sane core ID. This means we
- * need to entirely ignore MPIDR for any topology deduction.
- */
- cpuid_topo->thread_id = -1;
- cpuid_topo->core_id = cpuid;
- cpuid_topo->package_id = cpu_to_node(cpuid);
-
- pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n",
- cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
- cpuid_topo->thread_id, mpidr);
-
-topology_populated:
- update_siblings_masks(cpuid);
-}
-
#ifdef CONFIG_ACPI
static bool __init acpi_cpu_is_threaded(int cpu)
{
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 503404f3280e..0dfe20d18b8d 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -538,4 +538,23 @@ void __init init_cpu_topology(void)
else if (of_have_populated_dt() && parse_dt_topology())
reset_cpu_topology();
}
+
+void store_cpu_topology(unsigned int cpuid)
+{
+ struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
+
+ if (cpuid_topo->package_id != -1)
+ goto topology_populated;
+
+ cpuid_topo->thread_id = -1;
+ cpuid_topo->core_id = cpuid;
+ cpuid_topo->package_id = cpu_to_node(cpuid);
+
+ pr_debug("CPU%u: package %d core %d thread %d\n",
+ cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
+ cpuid_topo->thread_id);
+
+topology_populated:
+ update_siblings_masks(cpuid);
+}
#endif
--
2.38.0