On Fri, 2023-06-02 at 18:33 +0200, Henning Schild wrote:
On Fri, 2 Jun 2023 09:14:08 -0700, Matt Roper <matthew.d.roper@intel.com> wrote:
On Fri, Jun 02, 2023 at 06:05:05PM +0200, Henning Schild wrote:
This fixes the following problem which was seen on a Tigerlake running Debian 10 with a 5.10 kernel.
[ 111.408631] Missing case (val == 65535)
[ 111.408698] WARNING: CPU: 2 PID: 446 at drivers/gpu/drm/i915/intel_dram.c:95 skl_dram_get_dimm_info+0x72/0x1a0 [i915]
[ 111.408699] Modules linked in: intel_powerclamp coretemp i915(+) joydev kvm_intel kvm hid_generic irqbypass crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec aesni_intel glue_helper crypto_simd cryptd snd_hwdep snd_hda_core drm_kms_helper uas snd_pcm usb_storage intel_cstate mei_wdt mei_hdcp snd_timer scsi_mod usbhid hid cec wdat_wdt snd mei_me evdev mei intel_uncore rc_core watchdog pcspkr soundcore video acpi_pad acpi_tad button drm fuse configfs loop(+) efi_pstore efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 dm_mod xhci_pci xhci_hcd marvell dwmac_intel stmmac igb e1000e usbcore nvme nvme_core pcs_xpcs phylink libphy i2c_algo_bit t10_pi dca crc_t10dif crct10dif_generic intel_lpss_pci ptp vmd intel_lpss i2c_i801 crct10dif_pclmul pps_core crct10dif_common idma64 usb_common crc32c_intel i2c_smbus
[ 111.408755] CPU: 2 PID: 446 Comm: (udev-worker) Not tainted 5.10.180 #2
[ 111.408756] Hardware name: SIEMENS AG SIMATIC IPC427G/no information, BIOS T29.01.02.D3.0 10/11/2022
[ 111.408797] RIP: 0010:skl_dram_get_dimm_info+0x72/0x1a0 [i915]
[ 111.408799] Code: 01 00 00 0f 84 31 01 00 00 66 3d 00 01 0f 84 27 01 00 00 41 0f b7 d0 48 c7 c6 ba 81 7c c1 48 c7 c7 be 81 7c c1 e8 d2 75 89 fa <0f> 0b c6 45 01 00 44 0f b6 4d 00 31 f6 b9 01 00 00 00 c1 fb 09 83
[ 111.408801] RSP: 0018:ffffa23a40b53b10 EFLAGS: 00010286
[ 111.408802] RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000027
[ 111.408803] RDX: ffff947b278a0908 RSI: 0000000000000001 RDI: ffff947b278a0900
[ 111.408804] RBP: ffffa23a40b53b78 R08: 0000000000000000 R09: ffffa23a40b53920
[ 111.408805] R10: ffffa23a40b53918 R11: 0000000000000003 R12: 0000000000000000
[ 111.408806] R13: 000000000000004c R14: ffff9479b1a00000 R15: ffff9479b1a00000
[ 111.408808] FS: 00007ff9626478c0(0000) GS:ffff947b27880000(0000) knlGS:0000000000000000
[ 111.408809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 111.408810] CR2: 0000555753cbf15c CR3: 0000000108432002 CR4: 0000000000770ee0
[ 111.408810] PKRU: 55555554
[ 111.408811] Call Trace:
[ 111.408854] skl_dram_get_channel_info+0x24/0x160 [i915]
[ 111.408892] intel_dram_detect+0xef/0x630 [i915]
[ 111.408931] i915_driver_probe+0xb18/0xc40 [i915]
[ 111.408969] ? i915_pci_probe+0x3f/0x160 [i915]
[ 111.408973] local_pci_probe+0x3b/0x80
[ 111.408975] pci_device_probe+0xfc/0x1b0
[ 111.408979] really_probe+0x26e/0x460
[ 111.408981] driver_probe_device+0xb4/0x100
[ 111.408983] device_driver_attach+0xa9/0xb0
[ 111.408984] ? device_driver_attach+0xb0/0xb0
[ 111.408985] __driver_attach+0xa1/0x140
[ 111.408987] ? device_driver_attach+0xb0/0xb0
[ 111.408989] bus_for_each_dev+0x84/0xd0
[ 111.408991] bus_add_driver+0x13e/0x200
[ 111.408993] driver_register+0x89/0xe0
[ 111.409036] i915_init+0x60/0x75 [i915]
[ 111.409038] ? 0xffffffffc18b8000
[ 111.409041] do_one_initcall+0x56/0x1f0
[ 111.409044] do_init_module+0x4a/0x240
[ 111.409047] __do_sys_finit_module+0xaa/0x110
[ 111.409050] do_syscall_64+0x30/0x40
[ 111.409053] entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 111.409054] RIP: 0033:0x7ff962d534f9
[ 111.409056] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d7 08 0d 00 f7 d8 64 89 01 48
[ 111.409057] RSP: 002b:00007fff57caf5d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 111.409058] RAX: ffffffffffffffda RBX: 0000555753cad120 RCX: 00007ff962d534f9
[ 111.409059] RDX: 0000000000000000 RSI: 00007ff962ee6efd RDI: 0000000000000010
[ 111.409060] RBP: 00007ff962ee6efd R08: 0000000000000000 R09: 0000555753c7a320
[ 111.409061] R10: 0000000000000010 R11: 0000000000000246 R12: 0000000000020000
[ 111.409062] R13: 0000000000000000 R14: 0000555753ca4f30 R15: 0000555752d40e4f
[ 111.409064] ---[ end trace 2da6ec0bd6f7c3a1 ]---
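For readers of the trace: the value in the warning is easier to interpret in hex. 65535 is 0xffff, i.e. every bit of a 16-bit field set, which matches RBX (000000000000ffff) in the register dump. Below is a minimal Python sketch for decoding such a line. The helper name `decode_missing_case` and the 16-bit width are assumptions made for illustration (the width is inferred from the value, not stated in the log), and this is not part of any kernel tooling.

```python
import re

# Hypothetical helper for illustration only; not part of any kernel tool.
MISSING_CASE_RE = re.compile(r"Missing case \((\w+) == (\d+)\)")


def decode_missing_case(line, width_bits=16):
    """Parse a 'Missing case' log line.

    Returns (name, value, is_all_ones), or None if the line does not match.
    width_bits is an assumed field width, not something the log reports.
    """
    m = MISSING_CASE_RE.search(line)
    if m is None:
        return None
    name, value = m.group(1), int(m.group(2))
    all_ones = (1 << width_bits) - 1
    return name, value, value == all_ones


line = "[ 111.408631] Missing case (val == 65535)"
name, value, all_ones = decode_missing_case(line)
print(f"{name} = {value:#06x}, all ones in 16 bits: {all_ones}")
```

For the line from this report, the sketch prints `val = 0xffff, all ones in 16 bits: True`.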
José Roberto de Souza (1):
  drm/i915/gen11+: Only load DRAM information from pcode

Matt Roper (1):
  drm/i915/dg1: Wait for pcode/uncore handshake at startup
This second patch should only be needed for discrete GPUs (none of which are fully enabled on a 5.10 kernel). Are you seeing it cause a change in behavior on an integrated TGL platform?
When I started looking at this I thought it would be a bisect job, because Debian 12 with 6.1 is not affected. Then I found an Ubuntu bug and basically just applied the patch that bug was talking about (p2)
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1933274
which needs p1 for that new function
With both applied, the problem I saw is indeed gone. Not that I understand in depth what that code does.
I think it is Tigerlake, but I always have a hard time with the "lakes".
CPU: Intel(R) Xeon(R) W-11865MLE @ 1.50GHz
W-11865MLE integrated GPU.
which to me suggests "discrete GPU"
Henning
Matt
 drivers/gpu/drm/i915/display/intel_bw.c | 80 +++---------------------
 drivers/gpu/drm/i915/i915_drv.c         |  4 ++
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_reg.h         |  3 +
 drivers/gpu/drm/i915/intel_dram.c       | 82 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_sideband.c   | 15 +++++
 drivers/gpu/drm/i915/intel_sideband.h   |  2 +
 7 files changed, 114 insertions(+), 73 deletions(-)

-- 
2.39.3