On Thu, Apr 28, 2011 at 12:32:32PM +0200, Marek Szyprowski wrote:
On Thursday, April 28, 2011 11:38 AM Russell King - ARM Linux wrote:
- Implement dma_alloc_noncoherent on ARM. Marek pointed out that this is needed, and it currently is not implemented, with an outdated comment explaining why it used to not be possible to do it.
dma_alloc_noncoherent is an entirely pointless API afaics.
I was about to ask what the point is ... (what are the expected semantics? Memory that is reachable but not necessarily cache coherent?)
As far as I can see, dma_alloc_noncoherent() should just be a wrapper around the normal page allocation function. I don't see it ever needing to do anything special - and the advantage of just being the normal page allocation function is that its properties are well known and architecture independent.
If there is an IOMMU chip that supports pages larger than 4KiB, then dma_alloc_noncoherent() might try to allocate such larger pages, which would result in faster access to the buffer (a lower iommu tlb miss ratio). For large buffers, even 64KiB 'pages' give a significant performance improvement.
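To put rough numbers on the page-size argument, here is a minimal sketch that counts how many IOMMU mappings a buffer needs at a given page size. The helper name iommu_pages_needed() is made up for illustration, and the buffer is assumed to be aligned to the page size; how this translates into an actual TLB miss ratio depends on the hardware.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: number of IOMMU page mappings
 * needed to cover `size` bytes with pages of `page_size` bytes.
 * Assumes the buffer starts on a page_size boundary.
 */
static size_t iommu_pages_needed(size_t size, size_t page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* round up so a partial trailing page still gets a mapping */
	return (size + page_size - 1) / page_size;
}
```

For a 1MiB buffer this gives 256 mappings with 4KiB pages but only 16 with 64KiB pages, so far fewer distinct translations have to be kept in the iommu tlb.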
The memory allocated by dma_alloc_noncoherent() (and dma_alloc_coherent()) has to be virtually contiguous, and DMA contiguous. It is assumed by all drivers that:
	virt = dma_alloc_foo(size, &dma);
	cpuaddr = virt + offset;
	dmaaddr = dma + offset;
results in the CPU and the DMA device ultimately seeing the same memory location through cpuaddr and dmaaddr respectively, for 0 <= offset < size.
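The assumption can be checked in a userspace toy model, sketched below. Everything here is a stand-in: fake_dma_alloc(), fake_dma_to_virt() and FAKE_BUS_OFFSET are invented for illustration, malloc() plays the role of the real allocator, and a constant offset plays the role of the CPU-to-bus translation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;	/* userspace stand-in for the kernel type */

/* Hypothetical constant CPU-to-bus offset, purely for illustration. */
#define FAKE_BUS_OFFSET 0x80000000ULL

/* Toy stand-in for dma_alloc_foo(): one contiguous block, one DMA handle. */
static void *fake_dma_alloc(size_t size, dma_addr_t *dma)
{
	void *virt = malloc(size);

	if (virt)
		*dma = (dma_addr_t)(uintptr_t)virt + FAKE_BUS_OFFSET;
	return virt;
}

/* What the "device" resolves a DMA address to in this toy model. */
static void *fake_dma_to_virt(dma_addr_t dmaaddr)
{
	return (void *)(uintptr_t)(dmaaddr - FAKE_BUS_OFFSET);
}

/* Returns 1 if every offset in [0, size) names the same byte on both sides. */
static int offsets_agree(size_t size)
{
	dma_addr_t dma;
	char *virt = fake_dma_alloc(size, &dma);
	int ok = 1;

	if (!virt)
		return 0;

	for (size_t offset = 0; offset < size; offset++) {
		char *cpuaddr = virt + offset;
		dma_addr_t dmaaddr = dma + offset;

		if (fake_dma_to_virt(dmaaddr) != (void *)cpuaddr)
			ok = 0;
	}

	free(virt);
	return ok;
}
```

The check only holds because the block is contiguous on both the CPU side and the DMA side; a scattered allocation would break the simple base-plus-offset arithmetic that drivers rely on.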
The standard alloc_pages() also ensures that if you ask for an order-N page, you'll end up with that allocation being contiguous - so there's no difference there.
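For reference, an order-N allocation covers 2^N pages, so a byte size has to be rounded up to an order first. In the kernel that is what get_order() is for; the userspace sketch below mimics it under the assumption of 4KiB pages, with size_to_order() and SKETCH_PAGE_SIZE being names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed 4KiB base page */

/*
 * Hypothetical helper: smallest order N such that
 * (SKETCH_PAGE_SIZE << N) >= size.
 */
static unsigned int size_to_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

A 64KiB buffer therefore maps to order 4, i.e. 16 contiguous 4KiB pages, while anything just over one page already needs order 1.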
What I'd suggest is that dma_alloc_noncoherent() should be architecture independent, and should call into whatever iommu support the device has to set up an appropriate iommu mapping. IOW, I don't see any need for every architecture to provide its own dma_alloc_noncoherent() allocation function - or indeed every iommu implementation to deal with the allocation issues either.
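A rough userspace sketch of that split might look like the following. None of these names exist in the kernel: struct sketch_iommu_ops, struct sketch_device and sketch_alloc_noncoherent() are hypothetical, malloc() stands in for the page allocator, and the no-iommu case is modelled as an identity mapping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t dma_addr_t;	/* userspace stand-in for the kernel type */

/* Hypothetical per-device iommu hook; a NULL pointer means "no iommu". */
struct sketch_iommu_ops {
	dma_addr_t (*map)(void *cpu_addr, size_t size);
};

struct sketch_device {
	const struct sketch_iommu_ops *iommu;
};

/*
 * Generic, architecture-independent allocator: a plain allocation,
 * followed by an optional call into the device's iommu support.
 */
static void *sketch_alloc_noncoherent(struct sketch_device *dev,
				      size_t size, dma_addr_t *dma)
{
	void *virt = malloc(size);	/* stands in for the page allocator */

	if (!virt)
		return NULL;

	if (dev->iommu && dev->iommu->map)
		*dma = dev->iommu->map(virt, size);
	else
		*dma = (dma_addr_t)(uintptr_t)virt;	/* toy identity mapping */

	return virt;
}

/* Example iommu backend: maps every buffer at one fixed DMA window. */
static dma_addr_t sketch_fixed_window_map(void *cpu_addr, size_t size)
{
	(void)cpu_addr;
	(void)size;
	return 0x10000000ULL;	/* arbitrary illustrative DMA address */
}
```

The allocation policy lives in one place and the iommu code only has to provide the mapping step, which is the division of labour suggested above.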