On Sun, May 16, 2021 at 12:22:43PM -0700, Linus Torvalds wrote:
> On Sun, May 16, 2021 at 12:09 PM Matthew Wilcox <willy@infradead.org> wrote:
> > That was the other problem fixed by this patch -- on big-endian 32-bit platforms with 64-bit dma_addr_t (mips, ppc), a DMA address with bit 32 set inadvertently sets the PageTail bit. So we need to store the low bits in the first word, even on big-endian platforms.
> Ouch. And yes, that would have shot down the "dma page frame number" model too.
> Oh how I wish PageTail was in "flags". Yes, our compound_head() thing is "clever", but it's a pain,
> That said, that union entry is "5 words", so the dma_addr_t thing could easily just have had a dummy word at the beginning.
Ah, if you just put one dummy word in front, then dma_addr_t overlaps with page->mapping, which used to be fine, but now that we can map network queues to userspace, page->mapping has to be NULL. So there are only two places to put dma_addr: either overlapping compound_head or overlapping pfmemalloc.
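To make the compound_head overlap concrete, here is a rough userspace model of the big-endian problem quoted above. It is only a sketch, not the kernel code: struct page_model, dma_addr_model_t and all the helper names are made up for illustration, uint32_t stands in for a 32-bit unsigned long, and it assumes the DMA address has bit 0 clear (true for any page-aligned address).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit dma_addr_t on a 32-bit platform. */
typedef uint64_t dma_addr_model_t;

/* Hypothetical, heavily simplified page: word[0] plays the role of the
 * word that aliases compound_head, word[1] is the word after it. */
struct page_model {
	uint32_t flags;
	uint32_t word[2];
};

/* PageTail is encoded as bit 0 of the compound_head word. */
static int looks_like_tail(const struct page_model *p)
{
	return p->word[0] & 1;
}

/* What a plain 64-bit store amounts to on big-endian: the high half
 * lands in the first word, so bit 32 of the address becomes bit 0 there. */
static void store_naive_big_endian(struct page_model *p, dma_addr_model_t addr)
{
	p->word[0] = (uint32_t)(addr >> 32);
	p->word[1] = (uint32_t)addr;
}

/* Low half in the first word regardless of endianness. */
static void store_low_word_first(struct page_model *p, dma_addr_model_t addr)
{
	assert((addr & 1) == 0);	/* assumption: bit 0 of the address is clear */
	p->word[0] = (uint32_t)addr;
	p->word[1] = (uint32_t)(addr >> 32);
}

/* Reassemble the address from the two words. */
static dma_addr_model_t load_low_word_first(const struct page_model *p)
{
	return (dma_addr_model_t)p->word[0] |
	       ((dma_addr_model_t)p->word[1] << 32);
}
```

With an address such as 0x100002000 (bit 32 set), store_naive_big_endian() leaves the page looking like a tail page, while store_low_word_first() does not and still round-trips the full address.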
I don't think PageTail is movable -- the issue is needing an atomic read of both PageTail _and_ the location of the head page. Even if x86 has something, there are a lot of architectures that don't.
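For anyone following along, the reason a single word works is that a tail page stores the head page's address with bit 0 set, so one load yields both answers. A minimal userspace sketch of that encoding, using C11 atomics as a stand-in for the kernel's READ_ONCE()/WRITE_ONCE(); struct cpage, set_tail() and head_of() are hypothetical names, not kernel API:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified page: bit 0 set means "tail page", and the
 * remaining bits are then the address of the head page. */
struct cpage {
	_Atomic uintptr_t compound_head;
};

/* Mark 'tail' as a tail page of 'head' with one store. */
static void set_tail(struct cpage *tail, struct cpage *head)
{
	atomic_store_explicit(&tail->compound_head,
			      (uintptr_t)head | 1, memory_order_relaxed);
}

/* One load gives both the tail bit and the head location, so there is
 * no window in which the two can disagree. */
static struct cpage *head_of(struct cpage *page)
{
	uintptr_t v = atomic_load_explicit(&page->compound_head,
					   memory_order_relaxed);

	if (v & 1)
		return (struct cpage *)(v - 1);
	return page;	/* not a tail page: it is its own head */
}
```

If the tail bit lived in a separate flags word, a reader would need two loads and some way to keep them consistent with each other, which is the atomicity issue described above.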
While I've got you on the subject of compound_head ... have you had a look at the folio work? It decreases the number of calls to compound_head() by about 25%, as well as shrinking the (compiled) size of the kernel and laying the groundwork for supporting things like 32kB anonymous pages and adaptive page sizes in the page cache. Andrew's a bit nervous of it, probably because it's such a large change.