On Thu, 2019-01-31 at 12:44 +1100, Dave Chinner wrote:
Indeed, the fs/inode.c change definitely needs reverting, because that is just *plain wrong* and breaks long-standing memory reclaim behaviour.
The long-standing behavior may be wrong here, because of just how incredibly slow ext4 and XFS are when it comes to reclaiming inodes with lots of dirty pages.
We have observed some real system stalls when the reclaim code hits an ext4 or XFS inode with dozens of megabytes of dirty data that needs to be synced out before the inode can be reclaimed.
Have you observed any regressions due to not reclaiming inodes with cached pages attached?
If so, what kind of behavioral differences are you seeing due to that regression?
It would be nice if we could figure out a way to avoid both bad behaviors...
I seriously disagree with shovelling a different, largely untested and contentious change to the shrinker algorithm to try and patch over the symptoms of the original change. It leaves the underlying problem unfixed (dying memcgs need a reaper to shrink the remaining slab objects that pin that specific memcg) and instead plays "whack-a-mole" on what we already know is a fundamentally broken assumption (i.e. that shrinking small slabs more aggressively is side-effect free).
My patch shrinks small slabs with the same pressure as larger slabs. It also ensures that slabs from dead memcgs will get eventually reclaimed.
What am I missing?