We can do a sleeping allocation from an irq context when CONFIG_NUMA is enabled. Fix this by initializing the NUMA crng instances in a workqueue.
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+9de458f6a5e713ee8c1a@syzkaller.appspotmail.com
Fixes: 8ef35c866f8862df ("random: set up the NUMA crng instances...")
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 drivers/char/random.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 3cd3aae24d6d..e182cca7e6cd 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -789,7 +789,7 @@ static void crng_initialize(struct crng_state *crng)
 }
 
 #ifdef CONFIG_NUMA
-static void numa_crng_init(void)
+static void do_numa_crng_init(struct work_struct *work)
 {
 	int i;
 	struct crng_state *crng;
@@ -810,6 +810,13 @@ static void numa_crng_init(void)
 		kfree(pool);
 	}
 }
+
+DECLARE_WORK(numa_crng_init_work, do_numa_crng_init);
+
+static void numa_crng_init(void)
+{
+	schedule_work(&numa_crng_init_work);
+}
 #else
 static void numa_crng_init(void) {}
 #endif
Theodore Ts'o wrote:
We can do a sleeping allocation from an irq context when CONFIG_NUMA is enabled. Fix this by initializing the NUMA crng instances in a workqueue.
Offloading to workqueue context itself would be OK, but this patch makes linux.git unbootable because
	if (crng == &primary_crng && crng_init < 2) {
		invalidate_batched_entropy();
		numa_crng_init(); // <= Deferred to workqueue context.
		crng_init = 2; // <= Not waiting for workqueue context, and oops before console becomes ready. ;-)
		process_random_ready_list();
		wake_up_interruptible(&crng_init_wait);
		pr_notice("random: crng init done\n");
	}
Please don't pretend rng_ready() before workqueue context is processed.
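The ordering concern raised here can be modelled in user space. The following is only a rough sketch, not kernel code: the single-slot queue, `schedule_work_sketch()`, `run_pending_work()`, `crng_init_sketch` and `numa_pool_ready` are all hypothetical stand-ins invented for illustration. It shows that once the NUMA setup is deferred, the "ready" state becomes visible before the deferred work has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_fn)(void);

/* Toy single-slot queue standing in for the kernel workqueue (assumption). */
static work_fn pending_work;

/* Only records the work item; nothing runs at this point. */
static void schedule_work_sketch(work_fn fn)
{
	pending_work = fn;
}

/* Runs the queued item later, as a worker would at some later time. */
static void run_pending_work(void)
{
	if (pending_work) {
		work_fn fn = pending_work;

		pending_work = NULL;
		fn();
	}
}

static int crng_init_sketch;   /* models crng_init */
static bool numa_pool_ready;   /* models "per-node instances are set up" */

/* Models the deferred NUMA initialization. */
static void do_numa_init_sketch(void)
{
	numa_pool_ready = true;
}

/* Models the quoted block: defer the NUMA setup, then mark the crng ready. */
static void mark_crng_ready(void)
{
	if (crng_init_sketch < 2) {
		schedule_work_sketch(do_numa_init_sketch);
		crng_init_sketch = 2; /* set before the deferred work has run */
	}
}

/* Walks through the window between "ready" and "NUMA setup done". */
void ordering_self_check(void)
{
	mark_crng_ready();
	assert(crng_init_sketch == 2 && !numa_pool_ready);
	run_pending_work();
	assert(numa_pool_ready);
}
```

Whether that window is harmful depends on what readers do when the per-node instances are not yet available.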
On Wed, Apr 25, 2018 at 09:46:42AM +0900, Tetsuo Handa wrote:
Please don't pretend rng_ready() before workqueue context is processed.
Where's the oops? It's not oopsing for me, and if the NUMA crng is not initialized, the code in extract_crng falls back to using the primary_crng:
static void extract_crng(__u32 out[CHACHA20_BLOCK_WORDS])
{
	struct crng_state *crng = NULL;

#ifdef CONFIG_NUMA
	if (crng_node_pool)
		crng = crng_node_pool[numa_node_id()];
	if (crng == NULL)
#endif
		crng = &primary_crng;
	_extract_crng(crng, out);
}
- Ted
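The fallback described above can be exercised outside the kernel. This is a minimal user-space sketch under stated assumptions: `select_crng()`, `current_node()`, `NR_NODES`, `node_crngs`, `pool_storage` and `do_numa_crng_init_sketch()` are hypothetical names, and `struct crng_state` is reduced to a single field. Only the NULL check on `crng_node_pool` mirrors the quoted function.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 2 /* hypothetical node count for this sketch */

/* Reduced stand-in for the kernel structure (assumption). */
struct crng_state {
	int id;
};

static struct crng_state primary_crng = { .id = -1 };
static struct crng_state node_crngs[NR_NODES] = { { .id = 0 }, { .id = 1 } };

/* NULL until the deferred initialization publishes the per-node table. */
static struct crng_state **crng_node_pool;
static struct crng_state *pool_storage[NR_NODES];

/* Stand-in for numa_node_id(); always node 1 in this sketch. */
static int current_node(void)
{
	return 1;
}

/* Same selection logic as the quoted function, without the #ifdef. */
static struct crng_state *select_crng(void)
{
	struct crng_state *crng = NULL;

	if (crng_node_pool)
		crng = crng_node_pool[current_node()];
	if (crng == NULL)
		crng = &primary_crng;
	return crng;
}

/* Stand-in for the deferred work item: fill the table, then publish it. */
static void do_numa_crng_init_sketch(void)
{
	for (int i = 0; i < NR_NODES; i++)
		pool_storage[i] = &node_crngs[i];
	crng_node_pool = pool_storage;
}

/* Before the work runs the primary instance is used; afterwards node 1's. */
void fallback_self_check(void)
{
	assert(select_crng() == &primary_crng);
	do_numa_crng_init_sketch();
	assert(select_crng() == &node_crngs[1]);
}
```

In this model a caller that arrives before the deferred initialization simply gets the primary instance, which is the behavior the reply relies on.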
Theodore Y. Ts'o wrote:
Where's the oops?
I assumed an oops had happened because the kernel printed no messages even one minute after the guest was powered on, and CPU usage (seen from the host side) showed one CPU busy-looping, which is what typically happens when the kernel panic()s at a very early stage. Reverting only your patch solved the problem.
But I can no longer reproduce it. I should have saved the kernel config... So, if nobody sees a regression, please go with your patch.
-DECLARE_WORK(numa_crng_init_work, do_numa_crng_init);
+static DECLARE_WORK(numa_crng_init_work, do_numa_crng_init);