On Fri, Aug 2, 2019 at 11:39 PM Theodore Y. Ts'o tytso@mit.edu wrote:
On Fri, Aug 02, 2019 at 09:00:52PM +0200, Arnd Bergmann wrote:
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
When allocating new inodes, i_extra_isize is set to s_want_extra_isize. When modifying existing inodes, if i_extra_isize is less than s_min_extra_isize, then we will attempt to move out extended attribute(s) to the external xattr block. So the s_min_extra_isize field is not a guarantee, but rather an aspirationa goal. The idea is that at some point when we want to enable a new feature, which needs more extra inode space, we can adjust s_min_extra_size and s_want_extra_size, and the file system will migrate things to meet these constraints.
I see in the ext4 code that we always try to expand i_extra_size to s_want_extra_isize in ext4_mark_inode_dirty(), and that s_want_extra_isize is always at least s_min_extra_isize, so we constantly try to expand the inode to fit.
What I still don't see is how any inode on the file system image could have ended up with less than s_min_extra_isize in the first place if s_min_extra_isize is never modified and all inodes in the file system would have originally been created with i_extra_isize >= s_min_extra_isize if EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE is set.
Did older versions of ext4 or ext3 ignore s_min_extra_isize when creating inodes despite EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE, or is there another possibility I'm missing?
Since the extended timestamps were one of the first extra inode fields to be added, I strongly suggest that we not try to borrow trouble. Solving the general case problem is *hard*.
As I said before, I absolutely don't suggest we solve the problem of reliably setting the timestamps, I'm just trying to find out if there is a way to know for sure that it cannot happen and alert the user otherwise. So far I think we have concluded
- just checking s_inode_size is not sufficient because ext3 may have created inodes with s_extra_isize too small - checking s_min_extra_isize may not be sufficient either, for similar reasons I don't yet fully understand (see above).
If there is any other way to be sure that the file system has never been mounted as a writable ext3, maybe that can be used instead?
Arnd