On 09.08.22 22:30, Linus Torvalds wrote:
On Tue, Aug 9, 2022 at 1:20 PM David Hildenbrand <david@redhat.com> wrote:
IIUC VM_MAYSHARE is always set in a MAP_SHARED mapping, but for file mappings we only set VM_SHARED if the file allows for writes
Heh.
This is a horrific hack, and probably should go away.
Yeah, we have that
	if (!(file->f_mode & FMODE_WRITE))
		vm_flags &= ~(VM_MAYWRITE | VM_SHARED);
but I think that's _entirely_ historical.
Long long ago, in a galaxy far away, we didn't handle shared mmap() very well. In fact, we used to not handle it at all.
But nntpd would use write() to update the spool file, and then read it through a shared mmap.
And since our mmap() *was* coherent with people doing write() system calls, but didn't handle actual dirty shared mmap, what Linux used to do was to just say "Oh, you want a read-only shared file mmap? I can do that - I'll just downgrade it to a read-only _private_ mapping, and it actually ends up with the same semantics".
And here we are, 30 years later, and it still does that, but it leaves the VM_MAYSHARE flag so that /proc/<pid>/maps can show that it's a shared mapping.
I was suspecting that this code is full of legacy :)
What would make sense to me is to just have VM_SHARED and make it correspond to MAP_SHARED, that would at least confuse me less. Once I have some spare cycles I'll see how easy that might be to achieve.
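For reference, the flag handling discussed above can be sketched as a small userspace model. This is only an illustration, not the kernel code: the DEMO_ constants are redefined locally (their values are meant to mirror the VM_* bits in include/linux/mm.h), and both helper functions are hypothetical names made up for this sketch. It assumes that a MAP_SHARED request starts out with both VM_SHARED and VM_MAYSHARE set, and that /proc/<pid>/maps picks 's' or 'p' from VM_MAYSHARE alone.

```c
#include <stdbool.h>

/*
 * Illustrative flag bits. The values are meant to mirror the VM_* bits in
 * include/linux/mm.h, but they are redefined here with a DEMO_ prefix so
 * that the sketch compiles in userspace.
 */
#define DEMO_VM_SHARED   0x08UL
#define DEMO_VM_MAYWRITE 0x20UL
#define DEMO_VM_MAYSHARE 0x80UL

/*
 * Hypothetical helper: compute the flags a MAP_SHARED file mapping ends up
 * with. file_writable stands in for (file->f_mode & FMODE_WRITE).
 */
unsigned long demo_shared_file_flags(unsigned long vm_flags, bool file_writable)
{
	/* Assumption: MAP_SHARED initially sets both bits. */
	vm_flags |= DEMO_VM_SHARED | DEMO_VM_MAYSHARE;

	/* The historical downgrade: VM_MAYSHARE is deliberately left alone. */
	if (!file_writable)
		vm_flags &= ~(DEMO_VM_MAYWRITE | DEMO_VM_SHARED);

	return vm_flags;
}

/* Hypothetical helper: the sharing column as shown in /proc/<pid>/maps. */
char demo_maps_sharing_char(unsigned long vm_flags)
{
	return (vm_flags & DEMO_VM_MAYSHARE) ? 's' : 'p';
}
```

With a file that was not opened for writing, demo_shared_file_flags() returns a value that has VM_SHARED and VM_MAYWRITE cleared but VM_MAYSHARE still set, so demo_maps_sharing_char() still reports 's'. That is the mismatch between VM_SHARED and MAP_SHARED that the proposal above would remove.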