On Wed, Jun 4, 2025 at 3:30 PM SDL <sdl@nppct.ru> wrote:
On Sat, May 24, 2025 at 2:14 AM Alexey Nepomnyashih <sdl@nppct.ru> wrote:
A potential NULL pointer dereference may occur when accessing tmp_mqd->cp_hqd_pq_control without verifying that tmp_mqd is non-NULL. This may happen if mqd_backup[mqd_idx] is unexpectedly NULL.
Although a NULL check for mqd_backup[mqd_idx] existed previously, it was moved to a position after the dereference in a recent commit, which renders it ineffective.
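For illustration only, the ordering issue described above can be reduced to a standalone C sketch. The struct, the function name, and the return convention are hypothetical and are not the amdgpu code; only the field name cp_hqd_pq_control and the mqd_backup[mqd_idx] lookup come from the description. The point is simply that the NULL test has to come before the first read through the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the MQD; only the field named in the thread. */
struct mqd_sketch {
	unsigned int cp_hqd_pq_control;
};

/*
 * Returns 1 if the queue state was restored from the backup, 0 if the
 * caller has to fall back to a full (re)initialization.
 */
static int restore_from_backup_sketch(struct mqd_sketch *mqd,
				      struct mqd_sketch **mqd_backup,
				      size_t mqd_idx)
{
	struct mqd_sketch *tmp_mqd = mqd_backup[mqd_idx];

	/* Check the pointer first; only then read through it. */
	if (!tmp_mqd || !tmp_mqd->cp_hqd_pq_control)
		return 0;

	memcpy(mqd, tmp_mqd, sizeof(*mqd));
	return 1;
}
```

With the two conditions in the opposite order, a NULL backup entry would be dereferenced before the check ever ran, which is the situation the patch description is concerned about.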
I don't think it's possible for mqd_backup to be NULL at this point. We would have failed earlier in init if the mqd backup allocation failed.
Alex
In scenarios such as GPU reset or power management resume, there is no strict guarantee that amdgpu_gfx_mqd_sw_init() (via ->sw_init()) is invoked before gfx_v9_0_kiq_init_queue(). As a result, mqd_backup[] may remain uninitialized, and dereferencing it without a NULL check can lead to a crash.
Most other uses of mqd_backup[] in the driver explicitly check for NULL, indicating that uninitialized entries are an expected condition and should be handled accordingly.
sw_init() is only called once at driver load time. Everything is allocated at that point. If that fails, the driver would not have loaded in the first place. I don't think it's possible for it to be NULL.
Alex
Alexey
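
Alex's argument rests on an all-or-nothing init: every backup buffer is allocated once at load time, and any allocation failure aborts the load, so later code can assume the entries are non-NULL. A minimal standalone sketch of that pattern follows, assuming hypothetical names (gfx_sketch, sw_init_sketch, sw_fini_sketch) and an injectable allocator added purely so the failure path can be exercised. It is not the amdgpu implementation and does not settle whether the real sw_init() behaves this way.

```c
#include <errno.h>
#include <stdlib.h>

#define NUM_QUEUES_SKETCH 4	/* hypothetical queue count */

/* Hypothetical container for the per-queue backup buffers. */
struct gfx_sketch {
	void *mqd_backup[NUM_QUEUES_SKETCH];
};

/* Allocator type, injectable so the error path can be tested. */
typedef void *(*alloc_fn_sketch)(size_t size);

/* Test helper: an allocator that always reports out-of-memory. */
static void *always_fail_alloc_sketch(size_t size)
{
	(void)size;
	return NULL;
}

/* Release every backup buffer and reset the slots to NULL. */
static void sw_fini_sketch(struct gfx_sketch *gfx)
{
	for (size_t i = 0; i < NUM_QUEUES_SKETCH; i++) {
		free(gfx->mqd_backup[i]);
		gfx->mqd_backup[i] = NULL;
	}
}

/*
 * Allocate all backup buffers or none: on the first failure, undo the
 * partial work and return -ENOMEM so the caller can refuse to load.
 */
static int sw_init_sketch(struct gfx_sketch *gfx, size_t mqd_size,
			  alloc_fn_sketch alloc)
{
	for (size_t i = 0; i < NUM_QUEUES_SKETCH; i++)
		gfx->mqd_backup[i] = NULL;

	for (size_t i = 0; i < NUM_QUEUES_SKETCH; i++) {
		gfx->mqd_backup[i] = alloc(mqd_size);
		if (!gfx->mqd_backup[i]) {
			sw_fini_sketch(gfx);
			return -ENOMEM;
		}
	}
	return 0;
}
```

If init is structured like this and its error return is honored, a successful load implies every mqd_backup[] slot is populated. The disagreement in the thread is about whether that invariant actually holds on the reset and resume paths.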