On Thu, Mar 20, 2025 at 05:15:40PM +0000, Bryan O'Donoghue wrote:
On 20/03/2025 09:50, Naresh Kamboju wrote:
Regressions on arm64 Dragonboard 845c boot failed with stable-rc 6.13.8-rc1
Regressions found on Dragonboard 845c :
- boot (debug Kconfigs)
Regression Analysis:
- New regression? Not sure. But the crash looks new.
- Reproducible? Intermittent
Since it is not easy to reproduce this crash, it is hard to bisect.
Boot regression: Dragonboard 845c kernel NULL pointer dereference

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
## Boot log
[ 7.871211] xhci-pci-renesas 0000:01:00.0: failed to load firmware renesas_usb_fw.mem, fallback to ROM
[ 7.877652] CAN device driver interface
[ 7.879182] Bluetooth: hci0: setting up wcn399x
[ 7.884439] Bluetooth: HCI UART protocol Marvell registered
[ 7.890767] xhci-pci-renesas 0000:01:00.0: xHCI Host Controller
[ 7.938433] xhci-pci-renesas 0000:01:00.0: new USB bus registered, assigned bus number 3
[ 7.941274] spi_master spi0: will run message pump with realtime priority
[ 7.946642] xhci-pci-renesas 0000:01:00.0: Zeroing 64bit base registers, expecting fault
[ 7.969396] ath10k_snoc 18800000.wifi: Adding to iommu group 16
[ 7.983424] mcp251xfd spi0.0 can0: MCP2517FD rev0.0 (-RX_INT -PLL +MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD o:40.00MHz c:40.00MHz m:10.00MHz rs:10.00MHz es:0.00MHz rf:10.00MHz ef:0.00MHz) successfully initialized.
[ 7.987793] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
                                                                                              ^^
drivers/media/platform/qcom/camss/camss.c
struct media_entity *camss_find_sensor(struct media_entity *entity)
{
	struct media_pad *pad;

	while (1) {
		pad = &entity->pads[0];
		if (!(pad->flags & MEDIA_PAD_FL_SINK))
0x30 matches really nicely with a NULL entity->pads pointer: pad would then be NULL too, and the pad->flags read above is the access that faults.
			return NULL;

		pad = media_pad_remote_pad_first(pad);
		if (!pad || !is_media_entity_v4l2_subdev(pad->entity))
			return NULL;

		entity = pad->entity;

		if (entity->function == MEDIA_ENT_F_CAM_SENSOR)
			return entity;
	}
}
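
To make the offset arithmetic concrete, here is a rough userspace sketch. The mock_* structs are hypothetical stand-ins, not the kernel definitions; their field order is only assumed to resemble struct media_pad on a 64-bit build. With a NULL pads pointer, the address touched by the flags read is just the offset of flags inside the pad structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins, NOT the kernel structures. The field order is
 * an assumption meant to resemble struct media_pad on a 64-bit build.
 */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

struct mock_gobj {
	void *mdev;
	uint32_t id;
	struct mock_list_head list;
};

struct mock_pad {
	struct mock_gobj graph_obj;
	void *entity;
	uint16_t index;
	uint16_t num_links;
	int sig_type;
	unsigned long flags;
};

struct mock_entity {
	struct mock_pad *pads;
};

/*
 * Address that a read of pad->flags would touch when pad is
 * &entity->pads[0]. Computed with integer arithmetic so that nothing
 * is actually dereferenced.
 */
static uintptr_t flags_access_address(const struct mock_entity *entity)
{
	assert(entity != NULL);
	return (uintptr_t)entity->pads + offsetof(struct mock_pad, flags);
}
```

If the real layout is close to this mock, an entity whose pads pointer is still NULL yields a small non-zero fault address like the one in the oops, rather than address 0.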
Hand waving ensues:
The fact that it's intermittent suggests that we're calling video open before the subdevices are registered. So maybe either camss_subdev_notifier_bound() or camss_subdev_notifier_complete() needs to set a flag, and then vfe_set_power() could do:

	if (!everything_configured)
		return -EPROBE_DEFER;

or something.
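
A minimal userspace sketch of that flag idea, untested against the real driver. Everything prefixed mock_ or MOCK_ is hypothetical, the struct is not the real struct camss, and the error code is only a placeholder for whatever the open path should really return.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value as the kernel's EPROBE_DEFER, redefined here
 * because it is not part of the userspace errno.h. */
#define MOCK_EPROBE_DEFER 517

/* Hypothetical stand-in for the driver state, NOT struct camss. */
struct mock_camss {
	atomic_bool everything_configured;
	int power_count;
};

/* Stands in for the notifier callback that would set the flag once
 * all subdevices are registered. */
static void mock_notifier_complete(struct mock_camss *camss)
{
	assert(camss != NULL);
	atomic_store(&camss->everything_configured, true);
}

/* Stands in for the power path: refuse to proceed until the flag is set. */
static int mock_vfe_set_power(struct mock_camss *camss, int on)
{
	assert(camss != NULL);
	if (!atomic_load(&camss->everything_configured))
		return -MOCK_EPROBE_DEFER;

	camss->power_count += on ? 1 : -1;
	return 0;
}
```

The point is only the ordering: an early open gets an error back instead of walking a half-initialized media graph, and the same call succeeds once the notifier has run.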
regards,
dan carpenter