On Thu, Jul 06, 2023 at 01:54:35PM +0000, Haiyang Zhang wrote:
> > This waiting loop is needed to let the pending Tx packets be sent. If they weren't sent within 1 second, it most likely no longer matters whether they get sent at all -- the destination host won't wait for them that long. You say this can only happen in case of a HW issue. If so, I assume you need to fix it somehow, e.g. with a HW reset? And if so, why bother waiting for Tx completions if Tx is hung? You free all skbs later either way, so there are no leaks.
> At that point, we don't actually care whether the pending packets are sent or not. But if we free the queues too soon and the HW is slow for unexpected reasons, a delayed completion notice will DMA into the freed memory and cause corruption. That's why we have a longer waiting time.
Aieiiie, that is a horrible HW design, to not have a strong fence for DMA.
"just wait and hope the HW doesn't UAF the kernel with DMA" is really awful.
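To make the pattern concrete, here is a rough userspace sketch of the bounded wait described above. It is not the driver code: struct txq, pending_sends, wait_tx_drained() and the two fake_hw_*() helpers are all made-up names for illustration, and the "device" is simulated by a callback.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Tx queue; not the real driver structure. */
struct txq {
	atomic_int pending_sends;	/* posted Tx requests with no completion yet */
};

/*
 * Poll until every posted send has completed or the poll budget runs out.
 * Returns true if the queue drained. On false, the device may still DMA a
 * late completion, so freeing the queue memory at that point risks a
 * use-after-free.
 */
static bool wait_tx_drained(struct txq *q, int max_polls,
			    void (*poll_delay)(struct txq *))
{
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&q->pending_sends) == 0)
			return true;
		poll_delay(q);	/* a real driver would sleep here instead */
	}
	return atomic_load(&q->pending_sends) == 0;
}

/* Simulated healthy device: one completion arrives per poll interval. */
static void fake_hw_completes_one(struct txq *q)
{
	if (atomic_load(&q->pending_sends) > 0)
		atomic_fetch_sub(&q->pending_sends, 1);
}

/* Simulated hung device: nothing ever completes. */
static void fake_hw_hung(struct txq *q)
{
	(void)q;
}
```

With fake_hw_completes_one() the wait returns true once the counter hits zero. With fake_hw_hung() it returns false with sends still pending, and no timeout value makes the free safe in that case, because nothing in the loop stops the device from writing later.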
Jason