On 2022-03-23 17:27, Linus Torvalds wrote:
On Wed, Mar 23, 2022 at 12:19 AM Oleksandr Natalenko <oleksandr@natalenko.name> wrote:
The following upstream commits:
aa6f8dcbab47 swiotlb: rework "fix info leak with DMA_FROM_DEVICE"
ddbd89deb7d3 swiotlb: fix info leak with DMA_FROM_DEVICE
break ath9k-based Wi-Fi access point for me. The AP emits beacons, but no client can connect to it, either from the very beginning, or shortly after start. These are the only symptoms I've noticed (i.e., no BUG/WARNING messages in `dmesg` etc).
Funky, but clearly true:
These commits appeared in v5.17 and v5.16.15, and both kernels are broken for me. I'm pretty confident these commits make the difference since I've built both v5.17 and v5.16.15 without them, and it fixed the issue.
Can you double-check (or just explicitly confirm if you already did that test) that you need to revert *both* of those commits, and it's the later "rework" fix that triggers it?
So, I do understand this might be an issue with regard to SG I/O handling in ath9k, hence relevant people in Cc.
Yeah, almost certainly an ath9k bug, but a regression is a regression, so if people can't find the issue in ath9k, we'll have to revert those commits.
Honestly, I personally think they were a bit draconian to begin with, and didn't limit their effects sufficiently.
I'm assuming that the ath9k issue is that it gives DMA mapping a big enough area to handle any possible packet size, and just expects - quite reasonably - smaller packets to only fill the part they need.
Which that "info leak" patch obviously breaks entirely.
Except that's the exact case which the new patch is addressing. Because the whole original area is copied into the SWIOTLB bounce buffer to begin with, bouncing the whole lot back after the device has only updated part of it now overwrites the non-updated parts with the same original contents, rather than whatever random crap happened to be left in the SWIOTLB buffer by its previous user. I'm extremely puzzled how any driver could somehow be dependent on non-device-written data getting replaced with random crap, given that it wouldn't happen with a real IOMMU, or if SWIOTLB just didn't need to bounce, and the data would hardly be deterministic either.
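To make the partial-write case concrete, here is a rough userspace model of a DMA_FROM_DEVICE bounce. It is only a sketch: `struct bounce_slot`, the `model_*()` helpers, and the byte values are all made up for illustration and are not the kernel's SWIOTLB code. It just shows that with the initial copy the bytes beyond what the device wrote come back unchanged, and without it they come back as whatever the previous user left in the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8

/* Hypothetical stand-in for one bounce slot; not a kernel structure. */
struct bounce_slot {
    unsigned char data[BUF_LEN];
};

/* "Map": optionally prefill the slot with the original buffer contents. */
void model_map(struct bounce_slot *slot, const unsigned char *orig, int prefill)
{
    if (prefill)
        memcpy(slot->data, orig, BUF_LEN);
    /* Otherwise the slot keeps whatever its previous user left behind. */
}

/* The device writes only the first len bytes of the mapping. */
void model_device_write(struct bounce_slot *slot, const unsigned char *pkt,
                        size_t len)
{
    assert(len <= BUF_LEN);
    memcpy(slot->data, pkt, len);
}

/* "Unmap": the whole slot is bounced back, not just the written part. */
void model_unmap(const struct bounce_slot *slot, unsigned char *orig)
{
    memcpy(orig, slot->data, BUF_LEN);
}

/*
 * Run one map/write/unmap cycle and count how many bytes beyond the
 * device-written part of the original buffer ended up modified.
 */
size_t model_tail_bytes_changed(int prefill)
{
    unsigned char orig[BUF_LEN];
    unsigned char before[BUF_LEN];
    struct bounce_slot slot;
    const unsigned char pkt[3] = { 0xAA, 0xBB, 0xCC };
    size_t i, changed = 0;

    memset(orig, 0x11, sizeof(orig));           /* driver's buffer contents */
    memcpy(before, orig, sizeof(orig));
    memset(slot.data, 0xEE, sizeof(slot.data)); /* stale data in the slot */

    model_map(&slot, orig, prefill);
    model_device_write(&slot, pkt, sizeof(pkt));
    model_unmap(&slot, orig);

    for (i = sizeof(pkt); i < BUF_LEN; i++)
        if (orig[i] != before[i])
            changed++;
    return changed;
}
```

In this model `model_tail_bytes_changed(1)` leaves the tail of the buffer alone, while `model_tail_bytes_changed(0)` replaces all five trailing bytes with the stale slot contents.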
I think I can see how aa6f8dcbab47 might increase the severity of a driver bug where it calls dma_sync_*_for_device() on part of a DMA_FROM_DEVICE mapping that the device *has* written to, without having called a corresponding dma_sync_*_for_cpu() first - previously that would have had no effect, but now SWIOTLB will effectively behave more like an eagerly-prefetching non-coherent cache and write back old data over new - but if ddbd89deb7d3 alone makes a difference then something really weird must be going on.
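The ordering problem I have in mind can be modelled in userspace roughly as below. Again this is a sketch under assumptions: `struct dma_model` and the `dm_*()` helpers are hypothetical names, not the DMA API, and the `sync_for_device_copies` flag merely stands in for whether a sync-for-device on a DMA_FROM_DEVICE mapping copies the CPU buffer into the bounce slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_LEN 4

/* Hypothetical model of a bounced DMA_FROM_DEVICE mapping. */
struct dma_model {
    unsigned char cpu_buf[SLOT_LEN]; /* what the driver reads */
    unsigned char bounce[SLOT_LEN];  /* what the device writes */
    int sync_for_device_copies;      /* models the post-rework behaviour */
};

/* Sync for CPU: bounce slot -> CPU buffer. */
void dm_sync_for_cpu(struct dma_model *m)
{
    assert(m != NULL);
    memcpy(m->cpu_buf, m->bounce, SLOT_LEN);
}

/* Sync for device: CPU buffer -> bounce slot, if the model copies at all. */
void dm_sync_for_device(struct dma_model *m)
{
    assert(m != NULL);
    if (m->sync_for_device_copies)
        memcpy(m->bounce, m->cpu_buf, SLOT_LEN);
}

/* The device fills the whole slot with a recognisable value. */
void dm_device_write(struct dma_model *m, unsigned char value)
{
    assert(m != NULL);
    memset(m->bounce, value, SLOT_LEN);
}

/*
 * Returns 1 if the device's data reaches the CPU buffer, 0 if it is lost.
 * sync_cpu_first selects whether the driver syncs for the CPU before
 * syncing for the device.
 */
int dm_packet_survives(int sync_for_device_copies, int sync_cpu_first)
{
    struct dma_model m;
    size_t i;

    memset(m.cpu_buf, 0x00, SLOT_LEN); /* old data on the CPU side */
    memset(m.bounce, 0x00, SLOT_LEN);
    m.sync_for_device_copies = sync_for_device_copies;

    dm_device_write(&m, 0x5A);         /* new data lands in the slot */
    if (sync_cpu_first)
        dm_sync_for_cpu(&m);
    dm_sync_for_device(&m);            /* may write old data over new */
    dm_sync_for_cpu(&m);               /* driver finally reads the buffer */

    for (i = 0; i < SLOT_LEN; i++)
        if (m.cpu_buf[i] != 0x5A)
            return 0;
    return 1;
}
```

Here the only losing combination is a copying sync-for-device without a preceding sync-for-CPU, i.e. `dm_packet_survives(1, 0)`. The other three combinations deliver the device's data intact.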
Has anyone run a sanity check with CONFIG_DMA_API_DEBUG enabled to see if that flags anything up?
Robin.