On Wed, Nov 20, 2024 at 3:08 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
On Wed, Nov 20, 2024 at 02:44:17AM +0100, Mateusz Guzik wrote:
Pardon me, but I am unable to follow your reasoning.
I suspect the argument is that the overhead of issuing a syscall is big enough that the extra cost of taking the lock trip won't be visible, but that's not accurate -- atomics are measurable when added to syscalls, even on modern CPUs.
Blocking is even more noticeable, and the sucker can be contended. And not just by chmod() et al. - write() will do it, for example.
Yeah, I was going for the best-case scenario.
Nonetheless, as an example, say an inode is owned by 0:0 and is being chowned to 1:1, which is handled by setattr_copy().
The ids are updated one after another:
[snip]
i_uid_update(idmap, attr, inode);
i_gid_update(idmap, attr, inode);
[/snip]
So at least in principle, someone issuing getattr in parallel may happen to spot 1:0 (as opposed to 0:0 or 1:1), which was never set on the inode and is merely an artifact of the timing.
This would be a bug, but I don't believe it is serious enough to justify taking the inode lock to avoid it.
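To make the 1:0 case concrete, here is a minimal userspace model of the interleaving described above. It is only a sketch: struct model_inode, struct model_attr and every model_* function are hypothetical stand-ins, not the kernel's types or helpers, and idmapping is ignored entirely. The "parallel getattr" is simulated deterministically by sampling both fields between the two update steps, rather than with real threads.

```c
#include <assert.h>

/* Hypothetical stand-ins; NOT the kernel's struct inode / struct iattr. */
struct model_inode { unsigned int uid, gid; };
struct model_attr  { unsigned int uid, gid; };

/* Model of the first step: only the uid changes. */
static inline void model_uid_update(const struct model_attr *attr,
                                    struct model_inode *inode)
{
    inode->uid = attr->uid;
}

/* Model of the second step: only the gid changes. */
static inline void model_gid_update(const struct model_attr *attr,
                                    struct model_inode *inode)
{
    inode->gid = attr->gid;
}

/*
 * Apply the chown in two steps and return what a lockless reader would
 * see if it happened to sample both fields between the steps.
 */
static inline struct model_inode
model_chown_observed_midway(const struct model_attr *attr,
                            struct model_inode *inode)
{
    struct model_inode seen;

    model_uid_update(attr, inode);
    seen = *inode;              /* the simulated parallel getattr */
    model_gid_update(attr, inode);
    return seen;
}

/* Returns 1 if the mixed 1:0 state was observable mid-update. */
static inline int model_demo(void)
{
    struct model_inode inode = { 0, 0 };
    const struct model_attr attr = { 1, 1 };
    struct model_inode seen = model_chown_observed_midway(&attr, &inode);

    /* The final state is the requested 1:1 ... */
    assert(inode.uid == 1 && inode.gid == 1);
    /* ... but the snapshot is 1:0, a pair nobody ever asked for. */
    return seen.uid == 1 && seen.gid == 0;
}
```

In this model model_demo() reports that the snapshot is 1:0 while the final state is 1:1. Whether a real reader ever lands in that window is a matter of timing.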
If anything, such scenarios would be more interesting for permission checks...
This indeed came up in that context; I can't be arsed to find the specific e-mail. Somewhere around looking at eliding lockref in favor of RCU-only operation, I noted that inodes can arbitrarily change during permission checks (including LSMs) and that there are currently no means to detect that. If memory serves, Christian said this is known and that if LSMs want better, it's their business to do it. FWIW, I think some machinery for perms (maybe with sequence counters) is warranted, but I have no interest in fighting about the subject.
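For readers unfamiliar with the sequence counter idea mentioned above, a rough userspace sketch with C11 atomics follows. It is not a proposed patch and not the kernel API: all model_* names are hypothetical, the shape is only loosely patterned after the kernel's begin/retry style of seqcount readers, and writers are assumed to be serialized by some outer lock. The writer bumps the counter to odd before touching the ids and back to even afterwards; a reader that sees an odd or changed counter knows its snapshot may be mixed and retries.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a uid/gid pair guarded by a sequence counter. */
struct model_ids {
    atomic_uint seq;   /* odd while a write is in progress */
    atomic_uint uid;
    atomic_uint gid;
};

static inline void model_ids_init(struct model_ids *p, unsigned uid, unsigned gid)
{
    atomic_init(&p->seq, 0);
    atomic_init(&p->uid, uid);
    atomic_init(&p->gid, gid);
}

/* Assumption: writers are already serialized by an outer lock. */
static inline void model_write_begin(struct model_ids *p)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    /* Order the odd counter before the data stores that follow. */
    atomic_thread_fence(memory_order_release);
}

static inline void model_write_end(struct model_ids *p)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);
    /* Release: data stores become visible before the even counter. */
    atomic_store_explicit(&p->seq, s + 1, memory_order_release);
}

static inline void model_ids_write(struct model_ids *p, unsigned uid, unsigned gid)
{
    model_write_begin(p);
    atomic_store_explicit(&p->uid, uid, memory_order_relaxed);
    atomic_store_explicit(&p->gid, gid, memory_order_relaxed);
    model_write_end(p);
}

static inline unsigned model_read_begin(struct model_ids *p)
{
    return atomic_load_explicit(&p->seq, memory_order_acquire);
}

/* True if the snapshot taken since 'start' may be inconsistent. */
static inline bool model_read_retry(struct model_ids *p, unsigned start)
{
    /* Order the data loads before re-reading the counter. */
    atomic_thread_fence(memory_order_acquire);
    return (start & 1u) ||
           atomic_load_explicit(&p->seq, memory_order_relaxed) != start;
}

/* Lockless reader: loop until a consistent pair has been sampled. */
static inline void model_ids_read(struct model_ids *p, unsigned *uid, unsigned *gid)
{
    unsigned start;

    do {
        start = model_read_begin(p);
        *uid = atomic_load_explicit(&p->uid, memory_order_relaxed);
        *gid = atomic_load_explicit(&p->gid, memory_order_relaxed);
    } while (model_read_retry(p, start));
}

/* Deterministic demo: a write lands between the reader's two loads. */
static inline bool model_seq_demo(void)
{
    struct model_ids ids;
    unsigned start, uid, gid;

    model_ids_init(&ids, 0, 0);
    start = model_read_begin(&ids);
    uid = atomic_load_explicit(&ids.uid, memory_order_relaxed);
    model_ids_write(&ids, 1, 1);
    gid = atomic_load_explicit(&ids.gid, memory_order_relaxed);
    assert(uid == 0 && gid == 1);          /* a mixed pair was sampled */
    return model_read_retry(&ids, start);  /* and it is detected */
}
```

The trade-off in this sketch is that readers stay free of stores and locks, at the price of an extra pair of loads and a possible retry; the writer side pays two extra stores to the counter.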