While working on a BeaglePlay with the UART controller and making it use DMA, I figured the DMA controller completion helper was clearly misbehaving.
On one side, with slow devices like a UART controller, the helper uses an unusual polling delay of 1s.
On the other side, the helper can also perform 0us sleeps which means the driver does not sleep at all in some situations and keeps the CPU busy waiting.
Finally, while I was digging into these issues, it felt like the helper was a bit too complex and could be simplified, which is what I did in patch 3. I initially worked on v6.7.x which did not had the spinlocks, I hope I didn't got them wrong.
I also tested this with v6.17-rc7.
Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com --- Miquel Raynal (3): dmaengine: ti: k3-udma: Reduce completion polling delay dmaengine: ti: k3-udma: Ensure a minimum polling delay dmaengine: ti: k3-udma: Simplify the completion worker
drivers/dma/ti/k3-udma.c | 85 ++++++++++++++++++++---------------------------- 1 file changed, 36 insertions(+), 49 deletions(-) --- base-commit: b59220b9fa2c3da4295d71913146cd64b869fcf9 change-id: 20250731-ti-dma-timeout-1c64868b5baf
Best regards,
DMA transfers may be slow if the peer device is slow (eg. a UART controller, a SPI controller...). The completion worker is started very early compared to the actual draining speed, leading to low speed devices suffering from a full 1s delay before noticing the transfers are done, further delaying the execution of their callback.
1s seems overly large for a polling delay, reduce it to arbitrarily 100us which is the minimum amount of time to transfer a byte (and see some progress on the residue variable) on a 100kHz bus. Based on this first measure, the next sleep will be much closer to what is actually needed for the transfer to complete.
The 1 second polling delay involves that any device driver with a (very standard) 1 second timeout will error out indicating a transfer error, whereas the data has actually been transferred correctly.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: stable@vger.kernel.org # 5.7 Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com --- This issue has been observed by playing with the DMA controller on Beagle Play.
The patch will not apply before v5.7 and probably do not need to be backported before anyway. --- drivers/dma/ti/k3-udma.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index aa2dc762140f6eee334f4506a592e72600ae9834..b2059baed1b2ffc81c10feca797c763e2a04a357 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -1115,19 +1115,18 @@ static void udma_check_tx_completion(struct work_struct *work) time_diff = ktime_sub(uc->tx_drain.tstamp, time_diff) + 1; residue_diff -= uc->tx_drain.residue; + /* + * Try to guess when we should check next time by + * calculating rate at which data is being drained at + * the peer device. Slow devices might have not yet + * started, showing no progress. Use an arbitrary delay + * in this case. + */ if (residue_diff) { - /* - * Try to guess when we should check - * next time by calculating rate at - * which data is being drained at the - * peer device - */ delay = (time_diff / residue_diff) * uc->tx_drain.residue; } else { - /* No progress, check again in 1 second */ - schedule_delayed_work(&uc->tx_drain.work, HZ); - break; + delay = 100000; }
spin_unlock_irqrestore(&uc->vc.lock, flags);
ktime_to_us() returns 0 if the time (ktime_t stores nanoseconds) is too small, leading to a while loop without sleeps, kind of conflicting with the initial aim of this function at observing a small delay and then guessing the amount of time needed to finish draining the descriptor.
Make sure we always sleep for (abitrarily) at least 1us to reduce pressure on the CPU, while not waiting too much either.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: stable@vger.kernel.org # 5.7 Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com --- drivers/dma/ti/k3-udma.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index b2059baed1b2ffc81c10feca797c763e2a04a357..11232b306475edd5e1ed75d938bbf49ed9c2aabd 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -1125,6 +1125,8 @@ static void udma_check_tx_completion(struct work_struct *work) if (residue_diff) { delay = (time_diff / residue_diff) * uc->tx_drain.residue; + if (delay < 1000) + delay = 1000; } else { delay = 100000; }
linux-stable-mirror@lists.linaro.org