On Aug 2, 2019, at 15:39, Theodore Y. Ts'o tytso@mit.edu wrote:
On Fri, Aug 02, 2019 at 09:00:52PM +0200, Arnd Bergmann wrote:
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
When allocating new inodes, i_extra_isize is set to s_want_extra_isize. When modifying existing inodes, if i_extra_isize is less than s_min_extra_isize, then we will attempt to move out extended attribute(s) to the external xattr block. So the s_min_extra_isize field is not a guarantee, but rather an aspirational goal. The idea is that at some point when we want to enable a new feature, which needs more extra inode space, we can adjust s_min_extra_isize and s_want_extra_isize, and the file system will migrate things to meet these constraints.
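To make the "aspirational goal" point concrete, here is a small Python model of the behavior described above. It is only a sketch, not the kernel code: the class and function names are made up for illustration, xattrs are treated as a plain byte count rather than whole entries, and the in-inode xattr header is ignored.

```python
from dataclasses import dataclass

GOOD_OLD_INODE_SIZE = 128  # size of the original fixed part of the on-disk inode


@dataclass
class Superblock:  # hypothetical stand-in for the relevant superblock fields
    inode_size: int
    min_extra_isize: int
    want_extra_isize: int


@dataclass
class Inode:  # hypothetical stand-in for one on-disk inode
    extra_isize: int
    in_inode_xattr_bytes: int = 0  # xattr bytes stored after the extra fields


def allocate_inode(sb: Superblock) -> Inode:
    """New inodes start out with s_want_extra_isize."""
    return Inode(extra_isize=sb.want_extra_isize)


def try_expand(sb: Superblock, inode: Inode, external_block_free: int) -> bool:
    """Best-effort expansion when an existing inode is modified.

    Returns True if the inode now meets s_min_extra_isize, False if the
    xattrs could not be moved out and the inode was left unchanged.
    """
    if inode.extra_isize >= sb.min_extra_isize:
        return True
    needed = sb.min_extra_isize - inode.extra_isize
    free = (sb.inode_size - GOOD_OLD_INODE_SIZE
            - inode.extra_isize - inode.in_inode_xattr_bytes)
    if free < needed:
        to_move = needed - free
        # Assumption: xattr bytes can be moved in arbitrary amounts.
        if to_move > inode.in_inode_xattr_bytes or to_move > external_block_free:
            return False  # nowhere to put the xattrs; the goal is not met
        inode.in_inode_xattr_bytes -= to_move
    inode.extra_isize = sb.min_extra_isize
    return True
```

In this model an old inode with i_extra_isize = 4 and a nearly full in-inode xattr area stays at 4 when the external block has no room, which is exactly the case where i_extra_isize ends up smaller than s_min_extra_isize.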
The plan was to teach e2fsck how to fix all of the inodes to meet the s_min_extra_isize value, but that never got implemented,
Actually, we _did_ implement this feature for e2fsck years ago, but haven't landed it upstream for whatever reason. I don't know if I never submitted it, or I did and it wasn't accepted for some reason. I definitely would be happy to get it landed.
The patch is in our "master-lustre" branch: e2fsck: add support for expanding the inode size: https://git.whamcloud.com/?p=tools/e2fsprogs.git%3Ba=commit%3Bh=ab1465f9ae2b...
And the test cases are in a separate patch: https://git.whamcloud.com/?p=tools/e2fsprogs.git%3Ba=commit%3Bh=7b8a9fdf8627...
and even then, e2fsck would have to deal with the case where it couldn't move the extended attribute(s) in the inode out, because there was no place to put them.
In this case, e2fsck will loop asking to abort or delete an xattr, regardless of whether -n or -y is used.
In practice, this hasn't been that much of a limitation because we haven't been adding that many extra inode fields. Keep in mind that Red Hat, for example, has explicitly said they will *never* support adding new features to an existing file system. Their only supported method is to back up the file system, reformat it with the new file system features, and then restore the file system.
Of course, if the backup/restore includes backing up the extended attributes, and then restoring them, the xattr restore could fail, unless the user also increased the inode size (e.g., from 256 bytes to 512 bytes).
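As a rough illustration of why the restore can run out of room, the space left for in-inode xattrs shrinks as i_extra_isize grows. The sketch below assumes the usual layout (128-byte base inode, then i_extra_isize bytes of extra fields, then a 4-byte magic number ahead of the xattr entries); the function name is made up, and the result is an upper bound that ignores per-entry headers and alignment.

```python
GOOD_OLD_INODE_SIZE = 128   # fixed part of the on-disk inode
XATTR_IBODY_MAGIC_SIZE = 4  # 32-bit magic at the start of the in-inode xattr area


def in_inode_xattr_space(inode_size: int, extra_isize: int) -> int:
    """Upper bound on bytes available for xattr entries inside the inode."""
    space = (inode_size - GOOD_OLD_INODE_SIZE
             - extra_isize - XATTR_IBODY_MAGIC_SIZE)
    return max(space, 0)


# Same 256-byte inode, before and after growing the extra fields:
print(in_inode_xattr_space(256, 28))  # 96
print(in_inode_xattr_space(256, 32))  # 92
# Reformatting with 512-byte inodes leaves far more room:
print(in_inode_xattr_space(512, 32))  # 348
```

So xattrs that just fit in the inode on the old file system may no longer fit after a reformat with a larger i_extra_isize, unless the inode size is raised as well.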
Getting this right in the general case is *hard*. Fortunately, the corner cases really don't happen that often in practice, at least not for pure Linux workloads. Windows, which can have arbitrarily large security IDs and ACLs, might make this harder, of course --- although ext4's EA in inode feature would make this better, modulo needing to write more complex file system code to handle moving xattrs around.
Since the extended timestamps were one of the first extra inode fields to be added, I strongly suggest that we not try to borrow trouble. Solving the general case problem is *hard*.
- Ted