thing to do is to just special-case S_ISDIR. Not lovely, but whatever.
So something like this instead? It's a smaller diff anyway, and it gets the crazy afds/ceph cases right too.
If you really care about this we can do it. But if we can live with just
I see you went with the S_ISDIR thing for now. How do you feel about adding something like the appended patch (untested) on top of this?
So instead of relying on the inode we could just check f_ops for iterate/iterate_shared. That should amount to the same thing(*) but looks cleaner to me. Alternatively we can do the flag thing you mentioned ofc.
(*) I suffered from a proper cold so my brain is in a half-working state.
On Sat, 5 Aug 2023 at 04:47, Christian Brauner brauner@kernel.org wrote:
So instead of relying on the inode we could just check f_ops for iterate/iterate_shared.
Yes. Except I hate having two of those iterate functions. We should have gotten rid of them absolutely years ago.
You shamed me into fixing that.
Linus
On Sat, 5 Aug 2023 at 11:47, Linus Torvalds torvalds@linux-foundation.org wrote:
Yes. Except I hate having two of those iterate functions. We should have gotten rid of them absolutely years ago.
You shamed me into fixing that.
Ok, final build test done and extensive 'git grep' to make sure I didn't miss anything.
Here's two patches to finally get rid of the old legacy '->iterate()' case entirely from the file ops structure.
Yes, some filesystems then still get the inode lock in write mode, but now it's the filesystem itself that wraps its own iterator, rather than cause pain for the callers.
So no more "Do you have an ->iterate() _or_ ->iterate_shared() function?" and associated "I need to do locking differently" nastiness. Only odd filesystems that never got the memo on "don't use .iterate".
I looked at some of the filesystems I added the wrapper to, and they looked like maybe they would be perfectly happy to just iterate with the lock held shared. But this really was meant to be a completely mindless conversion with no semantic changes.
Adding some filesystem maintainers where that second patch adds the wrapper use. It would be lovely if in another seven years we could get rid of that wrapper too ;)
The patches are entirely untested, but they do build cleanly for me, and look very straightforward. The wrapper function is trivial, and is mostly comments.
Anybody?
Linus
Yes, some filesystems then still get the inode lock in write mode, but now it's the filesystem itself that wraps its own iterator, rather than cause pain for the callers.
And, btrfs already does this in some cases where it first upgrades from shared to non-shared and then downgrades again before returning in btrfs_real_readdir(). So it's not like this isn't already happening.
So no more "Do you have an ->iterate() _or_ ->iterate_shared() function?" and associated "I need to do locking differently" nastiness. Only odd filesystems that never got the memo on "don't use .iterate".
Ack. It's not pretty but that hasn't stopped us before and it's less ugly than double inode methods which pass exactly the same parameters. I think removing the double methods is good and I see no problem in shaming filesystems that didn't manage to convert properly in the last 7 years.
And let's please not get stuck on incoming pinky promises that everyone will have converted before the next 7 years are over. I'd prefer to see the iop wiped and leave the ugliness to the individual filesystems rn.
On Sun, Aug 06, 2023 at 08:10:40AM +0200, Christian Brauner wrote:
Yes, some filesystems then still get the inode lock in write mode, but now it's the filesystem itself that wraps its own iterator, rather than cause pain for the callers.
And, btrfs already does this in some cases where it first upgrades from shared to non-shared and then downgrades again before returning in btrfs_real_readdir(). So it's not like this isn't already happening.
So no more "Do you have an ->iterate() _or_ ->iterate_shared() function?" and associated "I need to do locking differently" nastiness. Only odd filesystems that never got the memo on "don't use .iterate".
Ack. It's not pretty but that hasn't stopped us before and it's less ugly than double inode methods which pass exactly the same parameters. I think removing the double methods is good and I see no problem in shaming filesystems that didn't manage to convert properly in the last 7 years.
And let's please not get stuck on incoming pinky promises that everyone will have converted before the next 7 years are over. I'd prefer to see the iop wiped and leave the ugliness to the individual filesystems rn.
We got sent a fix for a wrong check for O_TMPFILE during RESOLVE_CACHED lookup which I've put on vfs.fixes yesterday:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.5-rc5.vfs.resolve_cached.fix
But in case you planned on applying this directly instead of waiting for next cycle I've added your two appended patches on top of it and my earlier patch for massaging the file_needs_f_pos_lock() check that triggered this whole thing:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.5-rc5.vfs.fixes
On Sun, 6 Aug 2023 at 06:26, Christian Brauner brauner@kernel.org wrote:
We got sent a fix for a wrong check for O_TMPFILE during RESOLVE_CACHED lookup which I've put on vfs.fixes yesterday:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.5-rc5.vfs.resolve_cached.fix
But in case you planned on applying this directly instead of waiting for next cycle I've added your two appended patches on top of it and my earlier patch for massaging the file_needs_f_pos_lock() check that triggered this whole thing:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.5-rc5.vfs.fixes
I had actually planned on just waiting for the 6.6 merge window, but then you made this _so_ easy for me that I ended up taking these things right now.
The timing may not be entirely right, but I'm very comfortable with the "get rid of '->iterate' op" change since it (a) would clearly fail the build on a missed conversion and (b) doesn't touch any core filesystems anyway.
And now that file_needs_f_pos_lock() does look better, and as you say in teh commit, makes it clearer why that locking rule exists.
Linus
Linus
linux-stable-mirror@lists.linaro.org