From: Shuai Xue <xueshuai@linux.alibaba.com>
commit 77677cdbc2aa4b5d5d839562793d3d126201d18d upstream.
The GHES code calls memory_failure_queue() from IRQ context to queue work into a workqueue and schedule it on the current CPU. The work is then processed in memory_failure_work_func() by a kworker, which calls memory_failure().
When a page is already poisoned, commit a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") makes memory_failure() call kill_accessing_process(), which:
  - holds mmap locking of current->mm
  - does pagetable walk to find the error virtual address
  - and sends SIGBUS to the current process with error info.
However, the mm of kworker is not valid, resulting in a null-pointer dereference. So check mm when killing the accessing process.
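For illustration only, the guard can be modelled in user space. This is a minimal sketch, not the kernel code: struct mm_model, struct task_model and kill_accessing_process_model() are hypothetical stand-ins for mm_struct, task_struct and kill_accessing_process(), and the "walked" flag merely stands in for mmap_read_lock() plus walk_page_range().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's mm_struct and task_struct. */
struct mm_model {
	int walked;	/* set when the (modelled) page table walk runs */
};

struct task_model {
	struct mm_model *mm;	/* NULL for kernel threads such as a kworker */
};

/* Models the guard: bail out before touching a missing mm. */
int kill_accessing_process_model(struct task_model *p)
{
	if (!p->mm)
		return -EFAULT;

	/* Stands in for mmap_read_lock() and walk_page_range(). */
	p->mm->walked = 1;
	return 0;
}
```

Under these assumptions, a task whose mm is NULL gets -EFAULT back instead of dereferencing the pointer, while a task with a valid mm proceeds to the walk.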
[akpm@linux-foundation.org: remove unrelated whitespace alteration]
Link: https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@linux.alibaba.com
Fixes: a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Bixuan Cui <cuibixuan@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory-failure.c |    3 +++
 1 file changed, 3 insertions(+)
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -700,6 +700,9 @@ static int kill_accessing_process(struct
 	};
 	priv.tk.tsk = p;
 
+	if (!p->mm)
+		return -EFAULT;
+
 	mmap_read_lock(p->mm);
 	ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops,
 			      (void *)&priv);