Thanks for your patient explanations.
STEP2: In IRQ context, ghes_proc_in_irq() queues memory failure work on the current CPU's workqueue and adds task work to sync with the workqueue.
Why is there a difference if the interrupted task was a user task vs. a kernel thread?
It seems arbitrary. If the error can be handled in the kernel thread case without a task_work_add() to the current process, can't all errors be handled this way?
The current thread likely has nothing to do with the error. It is just a matter of chance what is running when the NMI is delivered, right?
-Tony