On driver unload any pending descriptors are flushed and pending DMA descriptors are explicitly completed: idxd_dmaengine_drv_remove() -> drv_disable_wq() -> idxd_wq_free_irq() -> idxd_flush_pending_descs() -> idxd_dma_complete_txd()
With this done during driver unload any remaining descriptor is likely stuck and can be dropped. Even so, the descriptor may still have a callback set that could no longer be accessible. An example of such a problem is when the dmatest fails and the dmatest module is unloaded. The failure of dmatest leaves descriptors with dma_async_tx_descriptor::callback pointing to code that no longer exist. This causes a page fault as below at the time the IDXD driver is unloaded when it attempts to run the callback: BUG: unable to handle page fault for address: ffffffffc0665190 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page
Fix this by clearing the callback pointers on the transmit descriptors only when workqueue is disabled.
Fixes: 403a2e236538 ("dmaengine: idxd: change MSIX allocation based on per wq activation") Signed-off-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Dave Jiang dave.jiang@intel.com Reviewed-by: Fenghua Yu fenghua.yu@intel.com Cc: stable@vger.kernel.org --- Changes since V1: - Add Dave and Fenghua's Reviewed-by tags. - Cc stable team (Fenghua). - Move declaration local to block needing it (Fenghua). - Add appropriate Fixes tag (Fenghua).
drivers/dma/idxd/device.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index b4d7bb923a40..6d8ff664fdfb 100644 --- a/drivers/dma/idxd/device.c +++ b/drivers/dma/idxd/device.c @@ -1173,8 +1173,19 @@ static void idxd_flush_pending_descs(struct idxd_irq_entry *ie) spin_unlock(&ie->list_lock);
list_for_each_entry_safe(desc, itr, &flist, list) { + struct dma_async_tx_descriptor *tx; + list_del(&desc->list); ctype = desc->completion->status ? IDXD_COMPLETE_NORMAL : IDXD_COMPLETE_ABORT; + /* + * wq is being disabled. Any remaining descriptors are + * likely to be stuck and can be dropped. callback could + * point to code that is no longer accessible, for example + * if dmatest module has been unloaded. + */ + tx = &desc->txd; + tx->callback = NULL; + tx->callback_result = NULL; idxd_dma_complete_txd(desc, ctype, true); } }