Hello,
4.4.147 is the last kernel of the 4.4.x series that boots for me. All subsequent versions panic on boot. How do I report this bug? If I'm supposed to use https://bugzilla.kernel.org/ I don't know what to fill into the fields. I don't even know if the longterm kernel falls under "Mainline" or some other tree.
MSB
On Fri, Aug 24, 2018 at 02:24:08PM +0200, Matthias B. wrote:
> Hello,
> 4.4.147 is the last kernel of the 4.4.x series that boots for me. All subsequent versions panic on boot. How do I report this bug? If I'm supposed to use https://bugzilla.kernel.org/ I don't know what to fill into the fields. I don't even know if the longterm kernel falls under "Mainline" or some other tree.
It depends on what the panic looks like :)
Any hints? You can post it here if you want.
Also, if you can run 'git bisect' to track it down to the commit that causes the problem, that is even better, as we can cc: all of the people on that patch to get help from.
thanks,
greg k-h
On Fri, 24 Aug 2018 14:59:50 +0200 Greg KH <gregkh@linuxfoundation.org> wrote:
> It depends on what the panic looks like :)
> Any hints? You can post it here if you want.
The attached image is everything I see on screen. Scrolling up does not work.
> Also, if you can run 'git bisect' to track it down to the commit that causes the problem, that is even better, as we can cc: all of the people on that patch to get help from.
I can try. Which git repository and branch stores the 4.4.x kernel and which changeset/tag is the 4.4.147 kernel (which is the last one that works for me)?
MSB
On Fri, Aug 24, 2018 at 03:43:44PM +0200, Matthias B. wrote:
> On Fri, 24 Aug 2018 14:59:50 +0200 Greg KH <gregkh@linuxfoundation.org> wrote:
> > It depends on what the panic looks like :)
> > Any hints? You can post it here if you want.
> The attached image is everything I see on screen. Scrolling up does not work.
> > Also, if you can run 'git bisect' to track it down to the commit that causes the problem, that is even better, as we can cc: all of the people on that patch to get help from.
> I can try. Which git repository and branch stores the 4.4.x kernel and which changeset/tag is the 4.4.147 kernel (which is the last one that works for me)?
All of the stable trees are here:
    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
You want the linux-4.4.y branch to work off of.
All of the kernel releases are tagged, so you can start with v4.4.147 as the good entry for 'git bisect'.
thanks,
greg k-h
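The bisect setup described above can be sketched as a small shell function. This is only a sketch, not something posted in the thread: the name `begin_bisect` is hypothetical, and v4.4.152 as the bad tag is an assumption (any tagged release that panics would do). It expects to run inside a clone of the stable tree with the linux-4.4.y branch checked out.

```shell
# Sketch only. Assumes the current directory is a clone of
#   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
# with the linux-4.4.y branch checked out.
# begin_bisect is a hypothetical helper name, not part of git.
begin_bisect() {
    good_tag="${1:-v4.4.147}"   # last release known to boot
    bad_tag="${2:-v4.4.152}"    # assumption: a release that panics
    git bisect start &&
    git bisect bad "$bad_tag" &&
    git bisect good "$good_tag"
}
```

After each build and boot test, 'git bisect good' or 'git bisect bad' records the result and checks out the next candidate; 'git bisect reset' returns to the original branch when done.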
On Fri, 24 Aug 2018 16:12:54 +0200 Greg KH <gregkh@linuxfoundation.org> wrote:
> All of the stable trees are here:
>     https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
> You want the linux-4.4.y branch to work off of.
> All of the kernel releases are tagged, so you can start with v4.4.147 as the good entry for 'git bisect'.
I've never used git bisect on the kernel. Can I simply do "make" for each step or do I need to "make mrproper" every time?
MSB
On Fri, Aug 24, 2018 at 04:33:38PM +0200, Matthias B. wrote:
> On Fri, 24 Aug 2018 16:12:54 +0200 Greg KH <gregkh@linuxfoundation.org> wrote:
> > All of the stable trees are here:
> >     https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
> > You want the linux-4.4.y branch to work off of.
> > All of the kernel releases are tagged, so you can start with v4.4.147 as the good entry for 'git bisect'.
> I've never used git bisect on the kernel. Can I simply do "make" for each step or do I need to "make mrproper" every time?
'make oldconfig' and then 'make' should be just fine. But add '-j10' or so (the number of your cpus*2), so the build goes faster.
thanks,
greg k-h
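The per-step rebuild suggested above might look like the following. It is a sketch under assumptions: `build_jobs` and `build_step` are hypothetical helper names, `nproc` (GNU coreutils) is assumed to be available, and a .config from the working kernel is assumed to be in the tree already.

```shell
# Sketch only: rebuild the kernel at the current bisect step.
# build_jobs and build_step are hypothetical helper names.

build_jobs() {
    # Twice the number of available CPUs, following the "cpus*2" suggestion.
    echo $(( $(nproc) * 2 ))
}

build_step() {
    # Reuse the existing .config, then build in parallel.
    # 'make oldconfig' may ask about options that are new at this commit.
    make oldconfig &&
    make -j"$(build_jobs)"
}
```

Run `build_step` from the top of the kernel tree after each 'git bisect' checkout, then install and boot the result before marking the step good or bad.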
Bisect identified the problem. It's the attached patch. I applied it to 4.4.152 with patch -Rp1 and I'm running the resulting kernel now.
MSB
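Reverse-applying the identified patch, as described above, can be sketched like this. The helper name `revert_patch` and the dry-run check are additions for illustration, not what was actually run; the sketch assumes GNU patch and a patch file whose paths carry one leading directory component (hence -p1).

```shell
# Sketch only: reverse-apply a patch from the top of a source tree.
# revert_patch is a hypothetical helper name.
revert_patch() {
    patch_file="$1"
    # First check that the reversal applies cleanly without touching files,
    # then apply it for real.
    patch -Rp1 --dry-run < "$patch_file" >/dev/null &&
    patch -Rp1 < "$patch_file"
}
```

For example, `revert_patch ../bad-commit.patch` run from the top of the 4.4.152 tree, where the patch file name is a placeholder.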
On Fri, Aug 24, 2018 at 06:19:19PM +0200, Matthias B. wrote:
> Bisect identified the problem. It's the attached patch. I applied it to 4.4.152 with patch -Rp1 and I'm running the resulting kernel now.
> MSB
> --
> For every idiot-proof system there exists at least one system-proof idiot.
> From 02ff2769edbce2261e981effbc3c4b98fae4faf0 Mon Sep 17 00:00:00 2001
> From: Andi Kleen <ak@linux.intel.com>
> Date: Tue, 7 Aug 2018 15:09:39 -0700
> Subject: [PATCH] x86/mm/pat: Make set_memory_np() L1TF safe
> commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream
> set_memory_np() is used to mark kernel mappings not present, but it has its own open coded mechanism which does not have the L1TF protection of inverting the address bits.
> Replace the open coded PTE manipulation with the L1TF protecting low level PTE routines.
> Passes the CPA self test.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> [ dwmw2: Pull in pud_mkhuge() from commit a00cc7d9dd, and pfn_pud() ]
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> [groeck: port to 4.4]
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>  arch/x86/include/asm/pgtable.h | 27 +++++++++++++++++++++++++++
>  arch/x86/mm/pageattr.c         |  8 ++++----
>  2 files changed, 31 insertions(+), 4 deletions(-)
> <snip>
Guenter, another report of this patch causing an issue. Any ideas? I am away from test systems this weekend, but can push out patches if needed.
thanks,
greg k-h
The following is the dmesg output from my working 4.4.147 around the time when the newer kernel panics.
[ 0.425380] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[ 0.425932] Bluetooth: HIDP socket layer initialized
[ 0.426617] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x19
[ 0.427203] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x19
[ 0.427772] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x19
[ 0.428688] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x19
[ 0.429221] microcode: CPU4 sig=0x306c3, pf=0x2, revision=0x19
[ 0.429744] microcode: CPU5 sig=0x306c3, pf=0x2, revision=0x19
[ 0.430255] microcode: CPU6 sig=0x306c3, pf=0x2, revision=0x19
[ 0.430761] microcode: CPU7 sig=0x306c3, pf=0x2, revision=0x19
[ 0.431271] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[ 0.431815] AVX2 version of gcm_enc/dec engaged.
[ 0.432347] AES CTR mode by8 optimization enabled
[ 0.433358] registered taskstats version 1
[ 0.434086] Btrfs loaded
[ 0.434991] rtc_cmos 00:02: setting system clock to 2018-08-24 13:35:45 UTC (1535117745)
[ 0.435586] ALSA device list:
[ 0.436074] No soundcards found.
All of the times visible on screen for the panicking kernel are in this gap.
[ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
[ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
[ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
[ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
On Fri, Aug 24, 2018 at 04:09:47PM +0200, Matthias B. wrote:
> The following is the dmesg output from my working 4.4.147 around the time when the newer kernel panics.
> [ 0.425380] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> [ 0.425932] Bluetooth: HIDP socket layer initialized
> [ 0.426617] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.427203] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.427772] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.428688] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.429221] microcode: CPU4 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.429744] microcode: CPU5 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.430255] microcode: CPU6 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.430761] microcode: CPU7 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.431271] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
> [ 0.431815] AVX2 version of gcm_enc/dec engaged.
> [ 0.432347] AES CTR mode by8 optimization enabled
> [ 0.433358] registered taskstats version 1
> [ 0.434086] Btrfs loaded
> [ 0.434991] rtc_cmos 00:02: setting system clock to 2018-08-24 13:35:45 UTC (1535117745)
> [ 0.435586] ALSA device list:
> [ 0.436074] No soundcards found.
> All of the times visible on screen for the panicking kernel are in this gap.
> [ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> [ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> [ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> [ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> [ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
> [ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
Have you tried enabling CONFIG_SND_DYNAMIC_MINORS like the kernel is asking you to here? Does that help?
thanks,
greg k-h
On Fri, 24 Aug 2018 16:11:20 +0200 Greg KH <gregkh@linuxfoundation.org> wrote:
> > [ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> > [ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> > [ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> > [ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> > [ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
> > [ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
> Have you tried enabling CONFIG_SND_DYNAMIC_MINORS like the kernel is asking you to here? Does that help?
I've just tried and it doesn't help. Note that the above message is from my working kernel and has a timestamp later than the panic, so if the timing of the 152 kernel is somewhat similar to the 147, the panic occurs before that snd_hda_intel code is even reached.
MSB
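Whether an option such as CONFIG_SND_DYNAMIC_MINORS is actually enabled in a given build can be checked against the .config file. The following is a minimal sketch; `config_enabled` is a hypothetical helper name, and it only matches options built in with "=y", not modules ("=m").

```shell
# Sketch only: test whether a kernel option is set to "y" in a .config file.
# config_enabled is a hypothetical helper name.
# Usage: config_enabled SND_DYNAMIC_MINORS [path/to/.config]
config_enabled() {
    grep -q "^CONFIG_$1=y\$" "${2:-.config}"
}
```

For example, `config_enabled SND_DYNAMIC_MINORS && echo enabled` run from the top of the configured kernel tree.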
linux-stable-mirror@lists.linaro.org