Paul,
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
Guillaume.
On Thu, Jan 20, 2022 at 07:55:01PM +0100, Guillaume Morin wrote:
Paul,
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
From a quick glance at v5.4, it looks quite plausible to me.
I do suggest that you try building and testing, given that the hardware's idea of what is plausible overrides that of either of us. ;-)
Thanx, Paul
On 20 Jan 11:16, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 07:55:01PM +0100, Guillaume Morin wrote:
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
From a quick glance at v5.4, it looks quite plausible to me.
I do suggest that you try building and testing, given that the hardware's idea of what is plausible overrides that of either of us. ;-)
We've had a few dozen lockups on 5.4 and 5.10 due to this bug (which is what led me to write to you back in Sep). The original bugzilla report is on 5.4 as well, see https://bugzilla.kernel.org/show_bug.cgi?id=208685. So I am positive that the issue is reachable in both kernels.
Also I do know for sure it fixes the problem for 5.10. I don't have a test rig anymore for 5.4. But considering we know it's reachable with 5.4, I think the patch should be applied for 5.4+. Obviously, you're the expert here though.
On Thu, Jan 20, 2022 at 08:26:54PM +0100, Guillaume Morin wrote:
On 20 Jan 11:16, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 07:55:01PM +0100, Guillaume Morin wrote:
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
From a quick glance at v5.4, it looks quite plausible to me.
I do suggest that you try building and testing, given that the hardware's idea of what is plausible overrides that of either of us. ;-)
We've had a few dozen lockups on 5.4 and 5.10 due to this bug (which is what led me to write to you back in Sep). The original bugzilla report is on 5.4 as well, see https://bugzilla.kernel.org/show_bug.cgi?id=208685. So I am positive that the issue is reachable in both kernels.
Also I do know for sure it fixes the problem for 5.10. I don't have a test rig anymore for 5.4. But considering we know it's reachable with 5.4, I think the patch should be applied for 5.4+. Obviously, you're the expert here though.
Au contraire! I do not claim much expertise on -stable validation.
If it was me, I would run a quick touch-test like this from the top-level directory of the Linux-kernel source tree on a qemu/KVM-capable system:
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus N --duration 10 --configs "TREE01 TREE04"
Where "N" is replaced by the number of CPUs on your system, which should preferably be at least eight.
This will take somewhere between 15 minutes and an hour to run, depending on your system.
Sadly, v5.4 isn't quite as good at analyzing results as are current versions, but please feel free to send me the output.
Does that help?
Thanx, Paul
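The touch-test invocation suggested above can be assembled with a small dry-run helper. This is only a sketch, not part of rcutorture: the function name rcutorture_cmd and the use of nproc/getconf to fill in "N" are assumptions made for illustration. It prints the command rather than running it, so the line can be reviewed first.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not shipped with the kernel): prints the
# rcutorture touch-test command so it can be reviewed before running.
# Assumption: nproc (GNU coreutils) or getconf is available to count CPUs.

# rcutorture_cmd CPUS DURATION CONFIGS -> prints the kvm.sh invocation
rcutorture_cmd() {
    printf '%s --cpus %s --duration %s --configs "%s"\n' \
        tools/testing/selftests/rcutorture/bin/kvm.sh "$1" "$2" "$3"
}

ncpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
ncpus=${ncpus:-1}

# The message above prefers at least eight CPUs; warn, do not refuse.
if [ "$ncpus" -lt 8 ]; then
    echo "warning: only $ncpus CPUs online, eight or more preferred" >&2
fi

rcutorture_cmd "$ncpus" 10 "TREE01 TREE04"
```

Run it from the top-level directory of the kernel source tree on a qemu/KVM-capable system, then execute the printed line once it looks right.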
On 20 Jan 12:57, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 08:26:54PM +0100, Guillaume Morin wrote:
On 20 Jan 11:16, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 07:55:01PM +0100, Guillaume Morin wrote:
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
From a quick glance at v5.4, it looks quite plausible to me.
I do suggest that you try building and testing, given that the hardware's idea of what is plausible overrides that of either of us. ;-)
We've had a few dozen lockups on 5.4 and 5.10 due to this bug (which is what led me to write to you back in Sep). The original bugzilla report is on 5.4 as well, see https://bugzilla.kernel.org/show_bug.cgi?id=208685. So I am positive that the issue is reachable in both kernels.
Also I do know for sure it fixes the problem for 5.10. I don't have a test rig anymore for 5.4. But considering we know it's reachable with 5.4, I think the patch should be applied for 5.4+. Obviously, you're the expert here though.
Au contraire! I do not claim much expertise on -stable validation.
If it was me, I would run a quick touch-test like this from the top-level directory of the Linux-kernel source tree on a qemu/KVM-capable system:
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus N --duration 10 --configs "TREE01 TREE04"
Where "N" is replaced by the number of CPUs on your system, which should preferably be at least eight.
This will take somewhere between 15 minutes and an hour to run, depending on your system.
Sadly, v5.4 isn't quite as good at analyzing results as are current versions, but please feel free to send me the output.
Does that help?
Ok I did a quick run with 614ddad17f22a22e035e2ea37a04815f50362017 applied on top of the 5.4 stable branch. Not quite sure how I got suckered into running a test on a kernel I don't even run, but hey I guess everybody must do their part :-)
Not sure about the CONFIG_HOTPLUG_CPU thing at the end.
tools/testing/selftests/rcutorture/initrd/init already exists, no need to create it
Results directory: /usr/scratch/kernel/tools/testing/selftests/rcutorture/res/2022.01.20-17:02:37
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 60 --duration 10 --configs TREE01 TREE04
----Start batch 1: Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Starting build. Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Waiting for build to complete. Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Build complete. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Starting build. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Waiting for build to complete. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Build complete. Thu 20 Jan 2022 05:03:55 PM EST
---- TREE01 8: Kernel present. Thu 20 Jan 2022 05:03:55 PM EST
---- TREE04 8: Kernel present. Thu 20 Jan 2022 05:03:55 PM EST
---- Starting kernels. Thu 20 Jan 2022 05:03:55 PM EST
---- All kernel runs complete. Thu 20 Jan 2022 05:14:05 PM EST
---- TREE01 8: Build/run results:
--- Thu 20 Jan 2022 05:02:37 PM EST: Starting build
--- Thu 20 Jan 2022 05:03:55 PM EST: Starting kernel
CPU-hotplug kernel, adding rcutorture onoff.
Monitoring qemu job at pid 46081
Grace period for qemu job at pid 46081
---- TREE04 8: Build/run results:
--- Thu 20 Jan 2022 05:03:16 PM EST: Starting build
:CONFIG_HOTPLUG_CPU: improperly set
--- Thu 20 Jan 2022 05:03:55 PM EST: Starting kernel
CPU-hotplug kernel, adding rcutorture onoff.
Monitoring qemu job at pid 45847
Grace period for qemu job at pid 45847
--- Thu 20 Jan 2022 05:02:37 PM EST Test summary:
Results directory: /usr/scratch/kernel/tools/testing/selftests/rcutorture/res/2022.01.20-17:02:37
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 60 --duration 10 --configs TREE01 TREE04
TREE01 ------- 12719 GPs (21.1983/s) [rcu: g94609 f0x0 ]
TREE04 ------- 3128 GPs (5.21333/s) [rcu: g23621 f0x0 ]
:CONFIG_HOTPLUG_CPU: improperly set
On Thu, Jan 20, 2022 at 11:21:36PM +0100, Guillaume Morin wrote:
On 20 Jan 12:57, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 08:26:54PM +0100, Guillaume Morin wrote:
On 20 Jan 11:16, Paul E. McKenney wrote:
On Thu, Jan 20, 2022 at 07:55:01PM +0100, Guillaume Morin wrote:
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
Does that make sense to you?
From a quick glance at v5.4, it looks quite plausible to me.
I do suggest that you try building and testing, given that the hardware's idea of what is plausible overrides that of either of us. ;-)
We've had a few dozen lockups on 5.4 and 5.10 due to this bug (which is what led me to write to you back in Sep). The original bugzilla report is on 5.4 as well, see https://bugzilla.kernel.org/show_bug.cgi?id=208685. So I am positive that the issue is reachable in both kernels.
Also I do know for sure it fixes the problem for 5.10. I don't have a test rig anymore for 5.4. But considering we know it's reachable with 5.4, I think the patch should be applied for 5.4+. Obviously, you're the expert here though.
Au contraire! I do not claim much expertise on -stable validation.
If it was me, I would run a quick touch-test like this from the top-level directory of the Linux-kernel source tree on a qemu/KVM-capable system:
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus N --duration 10 --configs "TREE01 TREE04"
Where "N" is replaced by the number of CPUs on your system, which should preferably be at least eight.
This will take somewhere between 15 minutes and an hour to run, depending on your system.
Sadly, v5.4 isn't quite as good at analyzing results as are current versions, but please feel free to send me the output.
Does that help?
Ok I did a quick run with 614ddad17f22a22e035e2ea37a04815f50362017 applied on top of the 5.4 stable branch. Not quite sure how I got suckered into running a test on a kernel I don't even run, but hey I guess everybody must do their part :-)
That is indeed what I keep telling myself. ;-)
Not sure about the CONFIG_HOTPLUG_CPU thing at the end.
tools/testing/selftests/rcutorture/initrd/init already exists, no need to create it
Results directory: /usr/scratch/kernel/tools/testing/selftests/rcutorture/res/2022.01.20-17:02:37
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 60 --duration 10 --configs TREE01 TREE04
----Start batch 1: Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Starting build. Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Waiting for build to complete. Thu 20 Jan 2022 05:02:37 PM EST
TREE01 8: Build complete. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Starting build. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Waiting for build to complete. Thu 20 Jan 2022 05:03:16 PM EST
TREE04 8: Build complete. Thu 20 Jan 2022 05:03:55 PM EST
39 seconds to build each kernel. Not bad! ;-)
---- TREE01 8: Kernel present. Thu 20 Jan 2022 05:03:55 PM EST
---- TREE04 8: Kernel present. Thu 20 Jan 2022 05:03:55 PM EST
---- Starting kernels. Thu 20 Jan 2022 05:03:55 PM EST
---- All kernel runs complete. Thu 20 Jan 2022 05:14:05 PM EST
---- TREE01 8: Build/run results:
--- Thu 20 Jan 2022 05:02:37 PM EST: Starting build
--- Thu 20 Jan 2022 05:03:55 PM EST: Starting kernel
CPU-hotplug kernel, adding rcutorture onoff.
Monitoring qemu job at pid 46081
Grace period for qemu job at pid 46081
---- TREE04 8: Build/run results:
--- Thu 20 Jan 2022 05:03:16 PM EST: Starting build
:CONFIG_HOTPLUG_CPU: improperly set
--- Thu 20 Jan 2022 05:03:55 PM EST: Starting kernel
CPU-hotplug kernel, adding rcutorture onoff.
Monitoring qemu job at pid 45847
Grace period for qemu job at pid 45847
--- Thu 20 Jan 2022 05:02:37 PM EST Test summary:
Results directory: /usr/scratch/kernel/tools/testing/selftests/rcutorture/res/2022.01.20-17:02:37
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 60 --duration 10 --configs TREE01 TREE04
TREE01 ------- 12719 GPs (21.1983/s) [rcu: g94609 f0x0 ]
TREE04 ------- 3128 GPs (5.21333/s) [rcu: g23621 f0x0 ]
:CONFIG_HOTPLUG_CPU: improperly set
This run was successful, so good!
You are quite correct to be suspicious of the "improperly set" message, but it is OK in this particular case.
This message appears because security-related changes made it quite difficult to disable CPU hotplug on x86. The rcutorture test suite is therefore complaining that even though it tried disabling CPU hotplug for the TREE04 test scenario, it found that the kernel nevertheless built with CONFIG_HOTPLUG_CPU=y. And later versions of rcutorture resigned themselves to always testing with CONFIG_HOTPLUG_CPU=y.
So again, this run was successful. And thank you for checking it!
Thanx, Paul
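For readers who want to pull the per-scenario numbers out of a test summary like the one above, a minimal sketch follows. The function name summarize_rcutorture is hypothetical, and the line layout it matches is inferred only from the two summary lines quoted in this thread, so other rcutorture versions may print something different. It reports the grace-period count for each scenario and notes the CONFIG_HOTPLUG_CPU message, which the explanation above describes as harmless in this case. It is not rcutorture's own pass/fail check.

```shell
#!/bin/sh
# Hypothetical helper: summarize an rcutorture "Test summary" read from stdin.
# Assumption: scenario lines look like
#   TREE01 ------- 12719 GPs (21.1983/s) [rcu: g94609 f0x0 ]
# as in the output quoted in this thread.
summarize_rcutorture() {
    awk '
        # Scenario line: name, a run of dashes, a count, then "GPs".
        $2 ~ /^-+$/ && $4 == "GPs" {
            printf "%s: %d grace periods\n", $1, $3
            seen++
        }
        # The warning discussed above; reported, not treated as an error.
        /:CONFIG_HOTPLUG_CPU: improperly set/ {
            print "note: CONFIG_HOTPLUG_CPU improperly set"
        }
        # Exit nonzero only if no scenario line was recognized at all.
        END { if (seen == 0) exit 1 }
    '
}

# Usage with the summary lines from this thread:
summarize_rcutorture <<'EOF'
TREE01 ------- 12719 GPs (21.1983/s) [rcu: g94609 f0x0 ]
TREE04 ------- 3128 GPs (5.21333/s) [rcu: g23621 f0x0 ]
:CONFIG_HOTPLUG_CPU: improperly set
EOF
```

The helper deliberately exits nonzero only when it recognizes no scenario lines, since the thread gives no numeric threshold for a passing run.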
On 20 Jan 15:33, Paul E. McKenney wrote:
This run was successful, so good!
Greg, so could you please queue up 614ddad17f22a22e035e2ea37a04815f50362017 for 5.4+ stable branches?
Thank you
Guillaume.
On Mon, Jan 24, 2022 at 07:12:55PM +0100, Guillaume Morin wrote:
On 20 Jan 15:33, Paul E. McKenney wrote:
This run was successful, so good!
Greg, so could you please queue up 614ddad17f22a22e035e2ea37a04815f50362017 for 5.4+ stable branches?
Will do after this next round is out, thanks.
greg k-h
On Mon, Jan 24, 2022 at 07:16:38PM +0100, Greg KH wrote:
On Mon, Jan 24, 2022 at 07:12:55PM +0100, Guillaume Morin wrote:
On 20 Jan 15:33, Paul E. McKenney wrote:
This run was successful, so good!
Greg, so could you please queue up 614ddad17f22a22e035e2ea37a04815f50362017 for 5.4+ stable branches?
Will do after this next round is out, thanks.
Now queued up, thanks.
greg k-h
On Thu, Jan 20, 2022 at 8:03 PM Guillaume Morin <guillaume@morinfr.org> wrote:
Paul,
I believe commit 614ddad17f22a22e035e2ea37a04815f50362017 (slated for 5.17) should be queued for all 5.4+ stable branches as it fixes a serious lockup bug. FWIW I have verified it applies cleanly on all 4 branches.
I agree here. 5.4+ suffers this bug and the mentioned commit addresses that issue.
--nX
Does that make sense to you?
Guillaume.
-- Guillaume Morin <guillaume@morinfr.org>