Without EXCLUSIVE_SYSTEM_RAM, users are allowed to map arbitrary
physical memory regions into the userspace via /dev/mem. At the same
time, pages may change their properties (e.g., from anonymous pages to
named pages) while they are still being mapped in the userspace, leading
to "corruption" detected by the page table check.
To avoid these false positives, this patch makes PAGE_TABLE_CHECK
depends on EXCLUSIVE_SYSTEM_RAM. This dependency is understandable
because PAGE_TABLE_CHECK is a hardening technique but /dev/mem without
STRICT_DEVMEM (i.e., !EXCLUSIVE_SYSTEM_RAM) is itself a security
problem.
Even with EXCLUSIVE_SYSTEM_RAM, I/O pages may be still allowed to be
mapped via /dev/mem. However, these pages are always considered as named
pages, so they won't break the logic used in the page table check.
Cc: <stable(a)vger.kernel.org> # 5.17
Signed-off-by: Ruihan Li <lrh2000(a)pku.edu.cn>
---
Documentation/mm/page_table_check.rst | 19 +++++++++++++++++++
mm/Kconfig.debug | 1 +
2 files changed, 20 insertions(+)
diff --git a/Documentation/mm/page_table_check.rst b/Documentation/mm/page_table_check.rst
index cfd8f4117..c12838ce6 100644
--- a/Documentation/mm/page_table_check.rst
+++ b/Documentation/mm/page_table_check.rst
@@ -52,3 +52,22 @@ Build kernel with:
Optionally, build kernel with PAGE_TABLE_CHECK_ENFORCED in order to have page
table support without extra kernel parameter.
+
+Implementation notes
+====================
+
+We specifically decided not to use VMA information in order to avoid relying on
+MM states (except for limited "struct page" info). The page table check is a
+separate from Linux-MM state machine that verifies that the user accessible
+pages are not falsely shared.
+
+PAGE_TABLE_CHECK depends on EXCLUSIVE_SYSTEM_RAM. The reason is that without
+EXCLUSIVE_SYSTEM_RAM, users are allowed to map arbitrary physical memory
+regions into the userspace via /dev/mem. At the same time, pages may change
+their properties (e.g., from anonymous pages to named pages) while they are
+still being mapped in the userspace, leading to "corruption" detected by the
+page table check.
+
+Even with EXCLUSIVE_SYSTEM_RAM, I/O pages may be still allowed to be mapped via
+/dev/mem. However, these pages are always considered as named pages, so they
+won't break the logic used in the page table check.
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index a925415b4..018a5bd2f 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -98,6 +98,7 @@ config PAGE_OWNER
config PAGE_TABLE_CHECK
bool "Check for invalid mappings in user page tables"
depends on ARCH_SUPPORTS_PAGE_TABLE_CHECK
+ depends on EXCLUSIVE_SYSTEM_RAM
select PAGE_EXTENSION
help
Check that anonymous page is not being mapped twice with read write
--
2.40.1
When hcd->localmem_pool is non-null, localmem_pool is used to allocate
DMA memory. In this case, the dma address will be properly returned (in
dma_handle), and dma_mmap_coherent should be used to map this memory
into the user space. However, the current implementation uses
pfn_remap_range, which is supposed to map normal pages.
Instead of repeating the logic in the memory allocation function, this
patch introduces a more robust solution. Here, the type of allocated
memory is checked by testing whether dma_handle is properly set. If
dma_handle is properly returned, it means some DMA pages are allocated
and dma_mmap_coherent should be used to map them. Otherwise, normal
pages are allocated and pfn_remap_range should be called. This ensures
that the correct mmap functions are used consistently, independently
with logic details that determine which type of memory gets allocated.
Fixes: a0e710a7def4 ("USB: usbfs: fix mmap dma mismatch")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ruihan Li <lrh2000(a)pku.edu.cn>
---
drivers/usb/core/devio.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 3936ca7f7..fcf68818e 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -235,7 +235,7 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
size_t size = vma->vm_end - vma->vm_start;
void *mem;
unsigned long flags;
- dma_addr_t dma_handle;
+ dma_addr_t dma_handle = DMA_MAPPING_ERROR;
int ret;
ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
@@ -265,7 +265,14 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);
- if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ /*
+ * In DMA-unavailable cases, hcd_buffer_alloc_pages allocates
+ * normal pages and assigns DMA_MAPPING_ERROR to dma_handle. Check
+ * whether we are in such cases, and then use remap_pfn_range (or
+ * dma_mmap_coherent) to map normal (or DMA) pages into the user
+ * space, respectively.
+ */
+ if (dma_handle == DMA_MAPPING_ERROR) {
if (remap_pfn_range(vma, vma->vm_start,
virt_to_phys(usbm->mem) >> PAGE_SHIFT,
size, vma->vm_page_prot) < 0) {
--
2.40.1
When a GIC local interrupt is not routable, it's vl_map will be used
to control some internal states for core (providing IPTI, IPPCI, IPFDC
input signal for core). Overriding it will interfere core's intetrupt
controller.
Do not touch vl_map if a local interrupt is not routable, we are not
going to remap it.
Before dd098a0e0319 (" irqchip/mips-gic: Get rid of the reliance on
irq_cpu_online()"), if a local interrupt is not routable, then it won't
be requested from GIC Local domain, and thus gic_all_vpes_irq_cpu_online
won't be called for that particular interrupt.
Fixes: dd098a0e0319 (" irqchip/mips-gic: Get rid of the reliance on irq_cpu_online()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
---
drivers/irqchip/irq-mips-gic.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index 046c355e120b..b568d55ef7c5 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -399,6 +399,8 @@ static void gic_all_vpes_irq_cpu_online(void)
unsigned int intr = local_intrs[i];
struct gic_all_vpes_chip_data *cd;
+ if (!gic_local_irq_is_routable(intr))
+ continue;
cd = &gic_all_vpes_chip_data[intr];
write_gic_vl_map(mips_gic_vx_map_reg(intr), cd->map);
if (cd->mask)
--
2.34.1
fprobe_hander and fprobe_kprobe_handler has guarded ftrace recursion
detection but fprobe_exit_handler has not, which possibly introduce
recursive calls if the fprobe exit callback calls any traceable
functions. Checking in fprobe_hander or fprobe_kprobe_handler
is not enough and misses this case.
So add recursion free guard the same way as fprobe_hander. Since
ftrace recursion check does not employ ip(s), so here use entry_ip and
entry_parent_ip the same as fprobe_handler.
Fixes: 5b0ab78998e3 ("fprobe: Add exit_handler support")
Signed-off-by: Ze Gao <zegao(a)tencent.com>
Cc: stable(a)vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
---
kernel/trace/fprobe.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 097c740799ba..a9580a88cc15 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -17,6 +17,7 @@
struct fprobe_rethook_node {
struct rethook_node node;
unsigned long entry_ip;
+ unsigned long entry_parent_ip;
char data[];
};
@@ -39,6 +40,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long
}
fpr = container_of(rh, struct fprobe_rethook_node, node);
fpr->entry_ip = ip;
+ fpr->entry_parent_ip = parent_ip;
if (fp->entry_data_size)
entry_data = fpr->data;
}
@@ -114,14 +116,25 @@ static void fprobe_exit_handler(struct rethook_node *rh, void *data,
{
struct fprobe *fp = (struct fprobe *)data;
struct fprobe_rethook_node *fpr;
+ int bit;
if (!fp || fprobe_disabled(fp))
return;
fpr = container_of(rh, struct fprobe_rethook_node, node);
+ /* we need to assure no calls to traceable functions in-between the
+ * end of fprobe_handler and the beginning of fprobe_exit_handler.
+ */
+ bit = ftrace_test_recursion_trylock(fpr->entry_ip, fpr->entry_parent_ip);
+ if (bit < 0) {
+ fp->nmissed++;
+ return;
+ }
+
fp->exit_handler(fp, fpr->entry_ip, regs,
fp->entry_data_size ? (void *)fpr->data : NULL);
+ ftrace_test_recursion_unlock(bit);
}
NOKPROBE_SYMBOL(fprobe_exit_handler);
--
2.40.1
This patch replaces preempt_{disable, enable} with its corresponding
notrace version in rethook_trampoline_handler so no worries about stack
recursion or overflow introduced by preempt_count_{add, sub} under
fprobe + rethook context.
Fixes: 54ecbe6f1ed5 ("rethook: Add a generic return hook")
Signed-off-by: Ze Gao <zegao(a)tencent.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
---
kernel/trace/rethook.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index 32c3dfdb4d6a..60f6cb2b486b 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -288,7 +288,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
* These loops must be protected from rethook_free_rcu() because those
* are accessing 'rhn->rethook'.
*/
- preempt_disable();
+ preempt_disable_notrace();
/*
* Run the handler on the shadow stack. Do not unlink the list here because
@@ -321,7 +321,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
first = first->next;
rethook_recycle(rhn);
}
- preempt_enable();
+ preempt_enable_notrace();
return correct_ret_addr;
}
--
2.40.1
Hello all,
I recently did have a regression on v6.4rc1, and it seems that the same
exact issue is now happening also on v6.1.28.
I was not able yet to bisect it (yet), but what is happening is that
libusbgx[1] that we use to configure a USB NCM gadget interface[2][3] just
hang completely at boot.
This is happening with multiple ARM32 and ARM64 i.MX SOC (i.MX6, i.MX7,
i.MX8MM).
The logs is something like that
```
[* �F] A start job is running for Load def…t schema g1.schema (6s / no limit)
M[K[** �F] A start job is running for Load def…t schema g1.schema (7s / no limit)
M[K[*** �F] A start job is running for Load def…t schema g1.schema (8s / no limit)
M[K[ *** �F] A start job is running for Load def…t schema g1.schema (8s / no limit)
```
I will try to bisect this and provide more useful feedback ASAP, I
decided to not wait for it and just send this email in case someone has
some insight on what is going on.
Francesco
[1] https://github.com/linux-usb-gadgets/libusbgx
[2] https://git.toradex.com/cgit/meta-toradex-bsp-common.git/tree/recipes-suppo…
[3] https://git.toradex.com/cgit/meta-toradex-bsp-common.git/tree/recipes-suppo…