On Thu, Aug 22, 2019 at 3:21 PM Andrew Morton akpm@linux-foundation.org wrote:
On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi joseph.qi@linux.alibaba.com wrote:
The user receives POLLPRI correctly only on the first call to the poll syscall. After that, the user always fails to receive the event signal.
Steps to reproduce:
- Get the monitor code in Documentation/accounting/psi.txt
- Run it and wait for the event to trigger.
- Kill and restart the process.
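For readers without the kernel tree at hand, the monitor used in the steps above can be sketched roughly as follows. This is a minimal sketch modeled on the example in the psi documentation, not a copy of it: the helper names psi_open_trigger and psi_wait_event are hypothetical, and the path and trigger string in the usage note are assumptions taken from that documentation's example.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a pressure file and register a trigger by
 * writing the trigger string (including its terminating NUL) to it.
 * Returns the file descriptor, or -1 on failure.
 */
int psi_open_trigger(const char *path, const char *trig)
{
    int fd = open(path, O_RDWR | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (write(fd, trig, strlen(trig) + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: wait for one trigger event.
 * Returns 1 if POLLPRI was reported, 0 on timeout or any other wakeup,
 * and -1 if poll() failed or reported POLLERR.
 */
int psi_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would open something like /proc/pressure/memory with a trigger such as "some 150000 1000000" and loop on psi_wait_event(fd, -1). With the bug described here, the first run of such a program reports events, while a run after kill and restart does not.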
The question is why we can end up with poll_scheduled = 1 while the work is not running (running it would reset the flag to 0). The answer is that the scheduling side sees group->poll_kworker under RCU protection and then schedules the work, but here we cancel the work and destroy the worker. The cancel needs to be paired with a reset of the poll_scheduled flag.
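The stale-flag problem described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not the kernel code, it is single-threaded, and every name in it (model_group, model_schedule_poll, and so on) is hypothetical. It models just the interaction between the poll_scheduled flag and a cancelled work item.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the state discussed in the thread. */
struct model_group {
    atomic_int poll_scheduled; /* 1 while poll work is believed to be queued */
    bool work_pending;         /* work is queued but has not run yet */
};

/* Scheduling side: queue the work only if the flag flips from 0 to 1. */
static bool model_schedule_poll(struct model_group *g)
{
    int expected = 0;

    if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
        return false; /* flag already set, so nothing is queued */
    g->work_pending = true;
    return true;
}

/* Worker side: running the work is what normally clears the flag. */
static void model_run_work(struct model_group *g)
{
    if (!g->work_pending)
        return;
    g->work_pending = false;
    atomic_store(&g->poll_scheduled, 0);
}

/* Destroy side: cancel pending work; optionally reset the flag as well. */
static void model_destroy_trigger(struct model_group *g, bool reset_flag)
{
    g->work_pending = false; /* cancelled work never runs */
    if (reset_flag)
        atomic_store(&g->poll_scheduled, 0);
}

/* Returns true if work can still be scheduled after a destroy/restart. */
bool model_restart_can_schedule(bool reset_flag)
{
    struct model_group g;

    atomic_init(&g.poll_scheduled, 0);
    g.work_pending = false;

    model_schedule_poll(&g);               /* first monitor run queues work */
    model_destroy_trigger(&g, reset_flag); /* monitor is killed */
    model_run_work(&g);                    /* no-op: the work was cancelled */
    return model_schedule_poll(&g);        /* restarted monitor tries again */
}
```

In this model, model_restart_can_schedule(false) returns false because the flag stays at 1 with nothing left to clear it, while model_restart_can_schedule(true) returns true. That mirrors the point made above: the cancel has to be paired with resetting the flag.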
Should this be backported into -stable kernels?
Adding GregKH and stable@vger.kernel.org
I was able to cleanly apply this patch to the stable master and linux-5.2.y branches (these are the only branches that have psi triggers). Greg, Andrew got this patch into the -mm tree. Please advise on how we should proceed to land it in stable 5.2.y and master.
Thanks,
Suren.
On Thu, 22 Aug 2019 16:11:15 -0700 Suren Baghdasaryan surenb@google.com wrote:
I was able to cleanly apply this patch to the stable master and linux-5.2.y branches (these are the only branches that have psi triggers). Greg, Andrew got this patch into the -mm tree. Please advise on how we should proceed to land it in stable 5.2.y and master.
That isn't the point - we know how to merge patches ;)
What I'm asking is whether it is desirable that -stable kernels have this patch. It certainly sounds like it from the changelog, so I'm wondering if the omission of cc:stable was intentional?
On Thu, Aug 22, 2019 at 4:17 PM Andrew Morton akpm@linux-foundation.org wrote:
That isn't the point - we know how to merge patches ;)
What I'm asking is whether it is desirable that -stable kernels have this patch. It certainly sounds like it from the changelog, so I'm wondering if the omission of cc:stable was intentional?
Sorry for my misunderstanding. I believe the cc:stable omission was unintentional. It's a fix for a bug that exists in the stable branches I mentioned above. Thanks!