The series is an update and a more complete version of the previously posted series at https://lore.kernel.org/linux-fsdevel/20180122020426.2988-1-deepa.kernel@gma...
Thanks to Arnd Bergmann for doing a few preliminary reviews. They helped me fix a few issues I had overlooked.
The limits (sometimes granularity also) for the filesystems updated here are according to the following table:
File system Time type Start year Expiration year Granularity cramfs fixed 0 0 romfs fixed 0 0 pstore ascii seconds (27 digit ascii) S64_MIN S64_MAX NSEC_PER_USEC coda INT64 S64_MIN S64_MAX 1 omfs 64-bit milliseconds 0 U64_MAX/ 1000 NSEC_PER_MSEC befs unsigned 48-bit seconds 0 0xffffffffffff alloc_super bfs unsigned 32-bit seconds 0 U32_MAX alloc_super efs unsigned 32-bit seconds 0 U32_MAX alloc_super ext2 signed 32-bit seconds S32_MIN S32_MAX alloc_super ext3 signed 32-bit seconds S32_MIN S32_MAX alloc_super ext4 (old) signed 32-bit seconds S32_MIN S32_MAX alloc_super ext4 (extra) 34-bit seconds, 30-bit ns S32_MIN 0x37fffffff 1 freevxfs u32 secs/usecs 0 U32_MAX alloc_super jffs2 unsigned 32-bit seconds 0 U32_MAX alloc_super jfs unsigned 32-bit seconds/ns 0 U32_MAX 1 minix unsigned 32-bit seconds 0 U32_MAX alloc_super orangefs u64 seconds 0 U64_MAX alloc_super qnx4 unsigned 32-bit seconds 0 U32_MAX alloc_super qnx6 unsigned 32-bit seconds 0 U32_MAX alloc_super reiserfs unsigned 32-bit seconds 0 U32_MAX alloc_super squashfs unsigned 32-bit seconds 0 U32_MAX alloc_super ufs1 signed 32-bit seconds S32_MIN S32_MAX NSEC_PER_SEC ufs2 signed 64-bit seconds/u32 ns S64_MIN S64_MAX 1 xfs signed 32-bit seconds/ns S32_MIN S32_MAX 1 ceph unsigned 32-bit second/ns 0 U32_MAX 1000 sysv unsigned 32-bit seconds 0 U32_MAX alloc_super affs u32 day, min, ticks 1978 u32_max days NSEC_PER_SEC nfsv2 unsigned 32-bit seconds/ns 0 U32_MAX 1 nfsv3 unsigned 32-bit seconds/ns 0 U32_MAX 1000 nfsv4 u64 seconds/u32 ns S64_MIN S64_MAX 1000 isofs u8 year since 1900 (fixable) 1900 2155 alloc_super hpfs unsigned 32-bit seconds 1970 2106 alloc_super fat 7-bit years, 2s resolution 1980 2107 cifs (smb) 7-bit years 1980 2107 cifs (modern) 64-bit 100ns since 1601 1601 30828 adfs 40-bit cs since 1900 1900 2248 9p (9P2000) unsigned 32-bit seconds 1970 2106 9p (9P2000.L) signed 64-bit seconds, ns 1970 S64_MAX
Granularity column filled in by the alloc_super() in the above table indicates that the granularity is NSEC_PER_SEC. Note that anything not mentioned above still has the default limits S64_MIN..S64_MAX.
The patches in the series are as structured below: 1. Add vfs support to maintain the limits per filesystem. 2. Add a new timestamp_truncate() api for clamping timestamps according to the filesystem limits. 3. Add a warning for mount syscall to indicate the impending expiry of timestamps. 4. Modify utimes to clamp the timestamps. 5. Fill in limits for filesystems.
An updated version of the test for checking file system timestamp limits has been posted at https://www.spinics.net/lists/fstests/msg12262.html
Changes from previous version: * No change in mount behavior because of expiry of timestamps. * Included limits for more filesystems.
Deepa Dinamani (20): vfs: Add file timestamp range support vfs: Add timestamp_truncate() api timestamp_truncate: Replace users of timespec64_trunc mount: Add mount warning for impending timestamp expiry utimes: Clamp the timestamps before update fs: Fill in max and min timestamps in superblock 9p: Fill min and max timestamps in sb adfs: Fill in max and min timestamps in sb ext4: Initialize timestamps limits fs: nfs: Initialize filesystem timestamp ranges fs: cifs: Initialize filesystem timestamp ranges fs: fat: Initialize filesystem timestamp ranges fs: affs: Initialize filesystem timestamp ranges fs: sysv: Initialize filesystem timestamp ranges fs: ceph: Initialize filesystem timestamp ranges fs: orangefs: Initialize filesystem timestamp ranges fs: hpfs: Initialize filesystem timestamp ranges fs: omfs: Initialize filesystem timestamp ranges pstore: fs superblock limits isofs: Initialize filesystem timestamp ranges
fs/9p/vfs_super.c | 6 +++++- fs/adfs/adfs.h | 13 +++++++++++++ fs/adfs/inode.c | 8 ++------ fs/adfs/super.c | 2 ++ fs/affs/amigaffs.c | 2 +- fs/affs/amigaffs.h | 3 +++ fs/affs/inode.c | 4 ++-- fs/affs/super.c | 4 ++++ fs/attr.c | 21 ++++++++++++--------- fs/befs/linuxvfs.c | 2 ++ fs/bfs/inode.c | 2 ++ fs/ceph/super.c | 2 ++ fs/cifs/cifsfs.c | 22 ++++++++++++++++++++++ fs/cifs/netmisc.c | 14 +++++++------- fs/coda/inode.c | 3 +++ fs/configfs/inode.c | 12 ++++++------ fs/cramfs/inode.c | 2 ++ fs/efs/super.c | 2 ++ fs/ext2/super.c | 2 ++ fs/ext4/ext4.h | 4 ++++ fs/ext4/super.c | 17 +++++++++++++++-- fs/f2fs/file.c | 21 ++++++++++++--------- fs/fat/inode.c | 12 ++++++++++++ fs/fat/misc.c | 5 +++-- fs/freevxfs/vxfs_super.c | 2 ++ fs/hpfs/hpfs_fn.h | 6 ++---- fs/hpfs/super.c | 2 ++ fs/inode.c | 33 ++++++++++++++++++++++++++++++++- fs/isofs/inode.c | 7 +++++++ fs/jffs2/fs.c | 3 +++ fs/jfs/super.c | 2 ++ fs/kernfs/inode.c | 6 +++--- fs/minix/inode.c | 2 ++ fs/namespace.c | 11 +++++++++++ fs/nfs/super.c | 20 +++++++++++++++++++- fs/ntfs/inode.c | 21 ++++++++++++--------- fs/omfs/inode.c | 4 ++++ fs/orangefs/super.c | 2 ++ fs/pstore/inode.c | 4 +++- fs/qnx4/inode.c | 2 ++ fs/qnx6/inode.c | 2 ++ fs/reiserfs/super.c | 3 +++ fs/romfs/super.c | 2 ++ fs/squashfs/super.c | 2 ++ fs/super.c | 2 ++ fs/sysv/super.c | 5 ++++- fs/ubifs/file.c | 21 ++++++++++++--------- fs/ufs/super.c | 7 +++++++ fs/utimes.c | 17 +++++++++++++---- fs/xfs/xfs_super.c | 2 ++ include/linux/fs.h | 5 +++++ include/linux/time64.h | 2 ++ 52 files changed, 304 insertions(+), 78 deletions(-)
Add fields to the superblock to track the min and max timestamps supported by filesystems.
Initially, when a superblock is allocated, initialize it to the max and min values the fields can hold. Individual filesystems override these to match their actual limits.
Pseudo filesystems are assumed to always support the min and max allowable values for the fields.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/super.c | 2 ++ include/linux/fs.h | 3 +++ include/linux/time64.h | 2 ++ 3 files changed, 7 insertions(+)
diff --git a/fs/super.c b/fs/super.c index 2739f57515f8..e5835c38d336 100644 --- a/fs/super.c +++ b/fs/super.c @@ -257,6 +257,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags, s->s_maxbytes = MAX_NON_LFS; s->s_op = &default_op; s->s_time_gran = 1000000000; + s->s_time_min = TIME64_MIN; + s->s_time_max = TIME64_MAX; s->cleancache_poolid = CLEANCACHE_NO_POOL;
s->s_shrink.seeks = DEFAULT_SEEKS; diff --git a/include/linux/fs.h b/include/linux/fs.h index f1a6ca872943..e9d04e4e5628 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1448,6 +1448,9 @@ struct super_block {
/* Granularity of c/m/atime in ns (cannot be worse than a second) */ u32 s_time_gran; + /* Time limits for c/m/atime in seconds */ + time64_t s_time_min; + time64_t s_time_max; #ifdef CONFIG_FSNOTIFY __u32 s_fsnotify_mask; struct fsnotify_mark_connector __rcu *s_fsnotify_marks; diff --git a/include/linux/time64.h b/include/linux/time64.h index a620ee610b9f..19125489ae94 100644 --- a/include/linux/time64.h +++ b/include/linux/time64.h @@ -30,6 +30,8 @@ struct itimerspec64 {
/* Located here for timespec[64]_valid_strict */ #define TIME64_MAX ((s64)~((u64)1 << 63)) +#define TIME64_MIN (-TIME64_MAX - 1) + #define KTIME_MAX ((s64)~((u64)1 << 63)) #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC)
timespec_trunc() function is used to truncate a filesystem timestamp to the right granularity. But, the function does not clamp tv_sec part of the timestamps according to the filesystem timestamp limits.
The replacement api: timestamp_truncate() also alters the signature of the function to accommodate filesystem timestamp clamping according to flesystem limits.
Note that the tv_nsec part is set to 0 if tv_sec is not within the range supported for the filesystem.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/inode.c | 33 ++++++++++++++++++++++++++++++++- include/linux/fs.h | 2 ++ 2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/fs/inode.c b/fs/inode.c index 5f5431ec3d62..0fb1f0fb296a 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2166,6 +2166,37 @@ struct timespec64 timespec64_trunc(struct timespec64 t, unsigned gran) } EXPORT_SYMBOL(timespec64_trunc);
+/** + * timestamp_truncate - Truncate timespec to a granularity + * @t: Timespec + * @inode: inode being updated + * + * Truncate a timespec to the granularity supported by the fs + * containing the inode. Always rounds down. gran must + * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns). + */ +struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode) +{ + struct super_block *sb = inode->i_sb; + unsigned int gran = sb->s_time_gran; + + t.tv_sec = clamp(t.tv_sec, sb->s_time_min, sb->s_time_max); + if (unlikely(t.tv_sec == sb->s_time_max || t.tv_sec == sb->s_time_min)) + t.tv_nsec = 0; + + /* Avoid division in the common cases 1 ns and 1 s. */ + if (gran == 1) + ; /* nothing */ + else if (gran == NSEC_PER_SEC) + t.tv_nsec = 0; + else if (gran > 1 && gran < NSEC_PER_SEC) + t.tv_nsec -= t.tv_nsec % gran; + else + WARN(1, "invalid file time granularity: %u", gran); + return t; +} +EXPORT_SYMBOL(timestamp_truncate); + /** * current_time - Return FS time * @inode: inode. @@ -2187,6 +2218,6 @@ struct timespec64 current_time(struct inode *inode) return now; }
- return timespec64_trunc(now, inode->i_sb->s_time_gran); + return timestamp_truncate(now, inode); } EXPORT_SYMBOL(current_time); diff --git a/include/linux/fs.h b/include/linux/fs.h index e9d04e4e5628..fdfe51d096fa 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -726,6 +726,8 @@ struct inode { void *i_private; /* fs or device private pointer */ } __randomize_layout;
+struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode); + static inline unsigned int i_blocksize(const struct inode *node) { return (1 << node->i_blkbits);
Update the inode timestamp updates to use timestamp_truncate() instead of timespec64_trunc().
The change was mostly generated by the following coccinelle script.
virtual context virtual patch
@r1 depends on patch forall@ struct inode *inode; identifier i_xtime =~ "^i_[acm]time$"; expression e; @@
inode->i_xtime = - timespec64_trunc( + timestamp_truncate( ..., - e); + inode);
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: adrian.hunter@intel.com Cc: anton@tuxera.com Cc: dedekind1@gmail.com Cc: gregkh@linuxfoundation.org Cc: hch@lst.de Cc: hirofumi@mail.parknet.co.jp Cc: jaegeuk@kernel.org Cc: jlbec@evilplan.org Cc: richard@nod.at Cc: tj@kernel.org Cc: yuchao0@huawei.com Cc: linux-f2fs-devel@lists.sourceforge.net Cc: linux-ntfs-dev@lists.sourceforge.net Cc: linux-mtd@lists.infradead.org --- fs/attr.c | 21 ++++++++++++--------- fs/configfs/inode.c | 12 ++++++------ fs/f2fs/file.c | 21 ++++++++++++--------- fs/fat/misc.c | 5 +++-- fs/kernfs/inode.c | 6 +++--- fs/ntfs/inode.c | 21 ++++++++++++--------- fs/ubifs/file.c | 21 ++++++++++++--------- 7 files changed, 60 insertions(+), 47 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c index d22e8187477f..df28035aa23e 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -183,15 +183,18 @@ void setattr_copy(struct inode *inode, const struct iattr *attr) inode->i_uid = attr->ia_uid; if (ia_valid & ATTR_GID) inode->i_gid = attr->ia_gid; - if (ia_valid & ATTR_ATIME) - inode->i_atime = timespec64_trunc(attr->ia_atime, - inode->i_sb->s_time_gran); - if (ia_valid & ATTR_MTIME) - inode->i_mtime = timespec64_trunc(attr->ia_mtime, - inode->i_sb->s_time_gran); - if (ia_valid & ATTR_CTIME) - inode->i_ctime = timespec64_trunc(attr->ia_ctime, - inode->i_sb->s_time_gran); + if (ia_valid & ATTR_ATIME) { + inode->i_atime = timestamp_truncate(attr->ia_atime, + inode); + } + if (ia_valid & ATTR_MTIME) { + inode->i_mtime = timestamp_truncate(attr->ia_mtime, + inode); + } + if (ia_valid & ATTR_CTIME) { + inode->i_ctime = timestamp_truncate(attr->ia_ctime, + inode); + } if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode;
diff --git a/fs/configfs/inode.c b/fs/configfs/inode.c index ab0284321912..884dcf06cfbe 100644 --- a/fs/configfs/inode.c +++ b/fs/configfs/inode.c @@ -76,14 +76,14 @@ int configfs_setattr(struct dentry * dentry, struct iattr * iattr) if (ia_valid & ATTR_GID) sd_iattr->ia_gid = iattr->ia_gid; if (ia_valid & ATTR_ATIME) - sd_iattr->ia_atime = timespec64_trunc(iattr->ia_atime, - inode->i_sb->s_time_gran); + sd_iattr->ia_atime = timestamp_truncate(iattr->ia_atime, + inode); if (ia_valid & ATTR_MTIME) - sd_iattr->ia_mtime = timespec64_trunc(iattr->ia_mtime, - inode->i_sb->s_time_gran); + sd_iattr->ia_mtime = timestamp_truncate(iattr->ia_mtime, + inode); if (ia_valid & ATTR_CTIME) - sd_iattr->ia_ctime = timespec64_trunc(iattr->ia_ctime, - inode->i_sb->s_time_gran); + sd_iattr->ia_ctime = timestamp_truncate(iattr->ia_ctime, + inode); if (ia_valid & ATTR_MODE) { umode_t mode = iattr->ia_mode;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 45b45f37d347..faf1e160961b 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -744,15 +744,18 @@ static void __setattr_copy(struct inode *inode, const struct iattr *attr) inode->i_uid = attr->ia_uid; if (ia_valid & ATTR_GID) inode->i_gid = attr->ia_gid; - if (ia_valid & ATTR_ATIME) - inode->i_atime = timespec64_trunc(attr->ia_atime, - inode->i_sb->s_time_gran); - if (ia_valid & ATTR_MTIME) - inode->i_mtime = timespec64_trunc(attr->ia_mtime, - inode->i_sb->s_time_gran); - if (ia_valid & ATTR_CTIME) - inode->i_ctime = timespec64_trunc(attr->ia_ctime, - inode->i_sb->s_time_gran); + if (ia_valid & ATTR_ATIME) { + inode->i_atime = timestamp_truncate(attr->ia_atime, + inode); + } + if (ia_valid & ATTR_MTIME) { + inode->i_mtime = timestamp_truncate(attr->ia_mtime, + inode); + } + if (ia_valid & ATTR_CTIME) { + inode->i_ctime = timestamp_truncate(attr->ia_ctime, + inode); + } if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode;
diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1e08bd54c5fb..53bb7c6bf993 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -307,8 +307,9 @@ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) inode->i_atime = (struct timespec64){ seconds, 0 }; } if (flags & S_CTIME) { - if (sbi->options.isvfat) - inode->i_ctime = timespec64_trunc(*now, 10000000); + if (sbi->options.isvfat) { + inode->i_ctime = timestamp_truncate(*now, inode); + } else inode->i_ctime = fat_timespec64_trunc_2secs(*now); } diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c index f3f3984cce80..892a58cfe7a1 100644 --- a/fs/kernfs/inode.c +++ b/fs/kernfs/inode.c @@ -161,9 +161,9 @@ static inline void set_inode_attr(struct inode *inode, struct super_block *sb = inode->i_sb; inode->i_uid = attrs->ia_uid; inode->i_gid = attrs->ia_gid; - inode->i_atime = timespec64_trunc(attrs->ia_atime, sb->s_time_gran); - inode->i_mtime = timespec64_trunc(attrs->ia_mtime, sb->s_time_gran); - inode->i_ctime = timespec64_trunc(attrs->ia_ctime, sb->s_time_gran); + inode->i_atime = timestamp_truncate(attrs->ia_atime, inode); + inode->i_mtime = timestamp_truncate(attrs->ia_mtime, inode); + inode->i_ctime = timestamp_truncate(attrs->ia_ctime, inode); }
static void kernfs_refresh_inode(struct kernfs_node *kn, struct inode *inode) diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c index 8baa34baf548..6c7388430ad3 100644 --- a/fs/ntfs/inode.c +++ b/fs/ntfs/inode.c @@ -2899,15 +2899,18 @@ int ntfs_setattr(struct dentry *dentry, struct iattr *attr) ia_valid |= ATTR_MTIME | ATTR_CTIME; } } - if (ia_valid & ATTR_ATIME) - vi->i_atime = timespec64_trunc(attr->ia_atime, - vi->i_sb->s_time_gran); - if (ia_valid & ATTR_MTIME) - vi->i_mtime = timespec64_trunc(attr->ia_mtime, - vi->i_sb->s_time_gran); - if (ia_valid & ATTR_CTIME) - vi->i_ctime = timespec64_trunc(attr->ia_ctime, - vi->i_sb->s_time_gran); + if (ia_valid & ATTR_ATIME) { + vi->i_atime = timestamp_truncate(attr->ia_atime, + vi); + } + if (ia_valid & ATTR_MTIME) { + vi->i_mtime = timestamp_truncate(attr->ia_mtime, + vi); + } + if (ia_valid & ATTR_CTIME) { + vi->i_ctime = timestamp_truncate(attr->ia_ctime, + vi); + } mark_inode_dirty(vi); out: return err; diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c index 400970d740bb..cd52585c8f4f 100644 --- a/fs/ubifs/file.c +++ b/fs/ubifs/file.c @@ -1078,15 +1078,18 @@ static void do_attr_changes(struct inode *inode, const struct iattr *attr) inode->i_uid = attr->ia_uid; if (attr->ia_valid & ATTR_GID) inode->i_gid = attr->ia_gid; - if (attr->ia_valid & ATTR_ATIME) - inode->i_atime = timespec64_trunc(attr->ia_atime, - inode->i_sb->s_time_gran); - if (attr->ia_valid & ATTR_MTIME) - inode->i_mtime = timespec64_trunc(attr->ia_mtime, - inode->i_sb->s_time_gran); - if (attr->ia_valid & ATTR_CTIME) - inode->i_ctime = timespec64_trunc(attr->ia_ctime, - inode->i_sb->s_time_gran); + if (attr->ia_valid & ATTR_ATIME) { + inode->i_atime = timestamp_truncate(attr->ia_atime, + inode); + } + if (attr->ia_valid & ATTR_MTIME) { + inode->i_mtime = timestamp_truncate(attr->ia_mtime, + inode); + } + if (attr->ia_valid & ATTR_CTIME) { + inode->i_ctime = timestamp_truncate(attr->ia_ctime, + inode); + } if (attr->ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode;
Deepa Dinamani deepa.kernel@gmail.com writes:
diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1e08bd54c5fb..53bb7c6bf993 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -307,8 +307,9 @@ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) inode->i_atime = (struct timespec64){ seconds, 0 }; } if (flags & S_CTIME) {
if (sbi->options.isvfat)
inode->i_ctime = timespec64_trunc(*now, 10000000);
if (sbi->options.isvfat) {
inode->i_ctime = timestamp_truncate(*now, inode);
else inode->i_ctime = fat_timespec64_trunc_2secs(*now); }}
Looks like broken. It changed to sb->s_time_gran from 10000000, and changed coding style.
Thanks.
On Tue, Jul 30, 2019 at 1:27 AM OGAWA Hirofumi hirofumi@mail.parknet.co.jp wrote:
Deepa Dinamani deepa.kernel@gmail.com writes:
diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1e08bd54c5fb..53bb7c6bf993 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -307,8 +307,9 @@ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) inode->i_atime = (struct timespec64){ seconds, 0 }; } if (flags & S_CTIME) {
if (sbi->options.isvfat)
inode->i_ctime = timespec64_trunc(*now, 10000000);
if (sbi->options.isvfat) {
inode->i_ctime = timestamp_truncate(*now, inode);
} else inode->i_ctime = fat_timespec64_trunc_2secs(*now); }
Looks like broken. It changed to sb->s_time_gran from 10000000, and changed coding style.
This is using a new api: timestamp_truncate(). granularity is gotten by inode->sb->s_time_gran. See Patch [2/20]: https://lkml.org/lkml/2019/7/29/1853
So this is not broken if fat is filling in the right granularity in the sb.
-Deepa
Hi Deepa,
On 30 Jul 2019, at 18:26, Deepa Dinamani deepa.kernel@gmail.com wrote:
On Tue, Jul 30, 2019 at 1:27 AM OGAWA Hirofumi hirofumi@mail.parknet.co.jp wrote:
Deepa Dinamani deepa.kernel@gmail.com writes:
diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1e08bd54c5fb..53bb7c6bf993 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -307,8 +307,9 @@ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) inode->i_atime = (struct timespec64){ seconds, 0 }; } if (flags & S_CTIME) {
if (sbi->options.isvfat)
inode->i_ctime = timespec64_trunc(*now, 10000000);
if (sbi->options.isvfat) {
inode->i_ctime = timestamp_truncate(*now, inode);
}} else inode->i_ctime = fat_timespec64_trunc_2secs(*now);
Looks like broken. It changed to sb->s_time_gran from 10000000, and changed coding style.
This is using a new api: timestamp_truncate(). granularity is gotten by inode->sb->s_time_gran. See Patch [2/20]: https://lkml.org/lkml/2019/7/29/1853
So this is not broken if fat is filling in the right granularity in the sb.
It is broken for FAT because FAT has different granularities for different timestamps so it cannot put the correct value in the sb as that only allows one granularity. Your patch is totally broken for fat as it would be immediately obvious if you spent a few minutes looking at the code...
Best regards,
Anton
-Deepa
On Tue, Jul 30, 2019 at 3:28 PM Anton Altaparmakov anton@tuxera.com wrote:
Hi Deepa,
On 30 Jul 2019, at 18:26, Deepa Dinamani deepa.kernel@gmail.com wrote:
On Tue, Jul 30, 2019 at 1:27 AM OGAWA Hirofumi hirofumi@mail.parknet.co.jp wrote:
Deepa Dinamani deepa.kernel@gmail.com writes:
diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1e08bd54c5fb..53bb7c6bf993 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -307,8 +307,9 @@ int fat_truncate_time(struct inode *inode, struct timespec64 *now, int flags) inode->i_atime = (struct timespec64){ seconds, 0 }; } if (flags & S_CTIME) {
if (sbi->options.isvfat)
inode->i_ctime = timespec64_trunc(*now, 10000000);
if (sbi->options.isvfat) {
inode->i_ctime = timestamp_truncate(*now, inode);
}} else inode->i_ctime = fat_timespec64_trunc_2secs(*now);
Looks like broken. It changed to sb->s_time_gran from 10000000, and changed coding style.
This is using a new api: timestamp_truncate(). granularity is gotten by inode->sb->s_time_gran. See Patch [2/20]: https://lkml.org/lkml/2019/7/29/1853
So this is not broken if fat is filling in the right granularity in the sb.
It is broken for FAT because FAT has different granularities for different timestamps so it cannot put the correct value in the sb as that only allows one granularity. Your patch is totally broken for fat as it would be immediately obvious if you spent a few minutes looking at the code...
It seemed to me that FAT had already covered the special cases (2s and 1d) granularities by using internal functions. This one could also be an inlined calculation, but I will just drop the FAT part from this patch and leave it as is for now.
Thanks, Deepa
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/namespace.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..5314fac8035e 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2739,6 +2739,17 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags); if (error < 0) mntput(mnt); + + if (!error && sb->s_time_max && + (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) { + char *buf = (char *)__get_free_page(GFP_KERNEL); + char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM); + + pr_warn("Mounted %s file system at %s supports timestamps until 0x%llx\n", + fc->fs_type->name, mntpath, (unsigned long long)sb->s_time_max); + free_page((unsigned long)buf); + } + return error; }
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/namespace.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..5314fac8035e 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2739,6 +2739,17 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags); if (error < 0) mntput(mnt);
- if (!error && sb->s_time_max &&
I don't know why you are testing sb->s_time_max here - it should always be non-zero since alloc_super() sets it to TIME64_MAX.
(ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
char *buf = (char *)__get_free_page(GFP_KERNEL);
char *mntpath = buf ? d_path(mountpoint, buf, PAGE_SIZE) : ERR_PTR(-ENOMEM);
pr_warn("Mounted %s file system at %s supports timestamps until 0x%llx\n",
fc->fs_type->name, mntpath, (unsigned long long)sb->s_time_max);
This doesn't seem like a helpful way to log the time. Maybe use time64_to_tm() to convert to "broken down" time and then print it with "%ptR"... but that wants struct rtc_time. If you apply the attached patch, however, you should then be able to print struct tm with "%ptT".
Ben.
free_page((unsigned long)buf);
- }
- return error;
}
On Mon, Aug 5, 2019 at 4:12 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/namespace.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..5314fac8035e 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2739,6 +2739,17 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags); if (error < 0) mntput(mnt);
if (!error && sb->s_time_max &&
I don't know why you are testing sb->s_time_max here - it should always be non-zero since alloc_super() sets it to TIME64_MAX.
I think we support some writable file systems that have no timestamps at all, so both the minimum and maximum default to 0 (1970-01-01).
For these, there is no point in printing a warning, they just work as designed, even though the maximum is expired.
Arnd
This doesn't seem like a helpful way to log the time. Maybe use time64_to_tm() to convert to "broken down" time and then print it with "%ptR"... but that wants struct rtc_time. If you apply the attached patch, however, you should then be able to print struct tm with "%ptT".
OK. Will print a more user friendly date string here.
Thanks, -Deepa
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Ben.
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
-Deepa
On Sat, 2019-08-10 at 13:44 -0700, Deepa Dinamani wrote:
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
It seems to me that the warning is needed if there is a possibility of data loss (incorrect timestamps, potentially leading to incorrect decisions about which files are newer). This can happen only when a filesystem is mounted read-write, or when a filesystem image is created.
I think that warning for read-only mounts would be an annoyance to users retrieving files from old filesystems.
Ben.
On Mon, Aug 12, 2019 at 3:25 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Sat, 2019-08-10 at 13:44 -0700, Deepa Dinamani wrote:
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
It seems to me that the warning is needed if there is a possibility of data loss (incorrect timestamps, potentially leading to incorrect decisions about which files are newer). This can happen only when a filesystem is mounted read-write, or when a filesystem image is created.
I think that warning for read-only mounts would be an annoyance to users retrieving files from old filesystems.
I agree, the warning is not helpful for read-only mounts. An earlier plan was to completely disallow writable mounts that might risk an overflow (in some configurations at least). The warning replaces that now, and I think it should also just warn for the cases that would otherwise have been dangerous.
Arnd
On Mon, Aug 12, 2019 at 7:11 AM Arnd Bergmann arnd@arndb.de wrote:
On Mon, Aug 12, 2019 at 3:25 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Sat, 2019-08-10 at 13:44 -0700, Deepa Dinamani wrote:
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
It seems to me that the warning is needed if there is a possibility of data loss (incorrect timestamps, potentially leading to incorrect decisions about which files are newer). This can happen only when a filesystem is mounted read-write, or when a filesystem image is created.
I think that warning for read-only mounts would be an annoyance to users retrieving files from old filesystems.
I agree, the warning is not helpful for read-only mounts. An earlier plan was to completely disallow writable mounts that might risk an overflow (in some configurations at least). The warning replaces that now, and I think it should also just warn for the cases that would otherwise have been dangerous.
Ok, I will make the change to exclude new read only mounts. I will use __mnt_is_readonly() so that it also exculdes filesystems that are readonly also. The diff looks like below:
- if (!error && sb->s_time_max && + if (!error && !__mnt_is_readonly(mnt) && (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
Note that we can get rid of checking for non zero sb->s_time_max now.
-Deepa
On Mon, Aug 12, 2019 at 9:09 AM Deepa Dinamani deepa.kernel@gmail.com wrote:
On Mon, Aug 12, 2019 at 7:11 AM Arnd Bergmann arnd@arndb.de wrote:
On Mon, Aug 12, 2019 at 3:25 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Sat, 2019-08-10 at 13:44 -0700, Deepa Dinamani wrote:
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
The warning reuses the uptime max of 30 years used by the setitimeofday().
Note that the warning is only added for new filesystem mounts through the mount syscall. Automounts do not have the same warning.
[...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
It seems to me that the warning is needed if there is a possibility of data loss (incorrect timestamps, potentially leading to incorrect decisions about which files are newer). This can happen only when a filesystem is mounted read-write, or when a filesystem image is created.
I think that warning for read-only mounts would be an annoyance to users retrieving files from old filesystems.
I agree, the warning is not helpful for read-only mounts. An earlier plan was to completely disallow writable mounts that might risk an overflow (in some configurations at least). The warning replaces that now, and I think it should also just warn for the cases that would otherwise have been dangerous.
Ok, I will make the change to exclude new read only mounts. I will use __mnt_is_readonly() so that it also exculdes filesystems that are readonly also. The diff looks like below:
if (!error && sb->s_time_max &&
if (!error && !__mnt_is_readonly(mnt) && (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
Note that we can get rid of checking for non zero sb->s_time_max now.
One more thing, we will probably have to add a second warning for when the filesystem is re-mounted rw after the initial readonly mount.
-Deepa
On Mon, 2019-08-12 at 09:15 -0700, Deepa Dinamani wrote:
On Mon, Aug 12, 2019 at 9:09 AM Deepa Dinamani deepa.kernel@gmail.com wrote:
On Mon, Aug 12, 2019 at 7:11 AM Arnd Bergmann arnd@arndb.de wrote:
On Mon, Aug 12, 2019 at 3:25 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Sat, 2019-08-10 at 13:44 -0700, Deepa Dinamani wrote:
On Mon, Aug 5, 2019 at 7:14 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote: > The warning reuses the uptime max of 30 years used by the > setitimeofday(). > > Note that the warning is only added for new filesystem mounts > through the mount syscall. Automounts do not have the same warning. [...]
Another thing - perhaps this warning should be suppressed for read-only mounts?
Many filesystems support read only mounts only. We do fill in right granularities and limits for these filesystems as well. In keeping with the trend, I have added the warning accordingly. I don't think I have a preference either way. But, not warning for the red only mounts adds another if case. If you have a strong preference, I could add it in.
It seems to me that the warning is needed if there is a possibility of data loss (incorrect timestamps, potentially leading to incorrect decisions about which files are newer). This can happen only when a filesystem is mounted read-write, or when a filesystem image is created.
I think that warning for read-only mounts would be an annoyance to users retrieving files from old filesystems.
I agree, the warning is not helpful for read-only mounts. An earlier plan was to completely disallow writable mounts that might risk an overflow (in some configurations at least). The warning replaces that now, and I think it should also just warn for the cases that would otherwise have been dangerous.
Ok, I will make the change to exclude new read only mounts. I will use __mnt_is_readonly() so that it also exculdes filesystems that are readonly also. The diff looks like below:
if (!error && sb->s_time_max &&
if (!error && !__mnt_is_readonly(mnt) && (ktime_get_real_seconds() + TIME_UPTIME_SEC_MAX > sb->s_time_max)) {
Note that we can get rid of checking for non zero sb->s_time_max now.
One more thing, we will probably have to add a second warning for when the filesystem is re-mounted rw after the initial readonly mount.
Indeed, there would need to be a check for remount-read-write. I didn't check whether remount uses this same code path.
Ben.
POSIX is ambiguous on the behavior of timestamps for futimens, utimensat and utimes. Whether to return an error or silently clamp a timestamp beyond the range supported by the underlying filesystems is not clear.
POSIX.1 section for futimens, utimensat and utimes says: (http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html)
The file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the specified time.
If the tv_nsec field of a timespec structure has the special value UTIME_NOW, the file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the current time.
[EINVAL] A new file timestamp would be a value whose tv_sec component is not a value supported by the file system.
The patch chooses to clamp the timestamps according to the filesystem timestamp ranges and does not return an error. This is in line with the behavior of utime syscall also since the POSIX page(http://pubs.opengroup.org/onlinepubs/009695399/functions/utime.html) for utime does not mention returning an error or clamping like above.
Same for utimes http://pubs.opengroup.org/onlinepubs/009695399/functions/utimes.html
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/utimes.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/utimes.c b/fs/utimes.c index 350c9c16ace1..4c1a2ce90bbc 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -21,6 +21,7 @@ static int utimes_common(const struct path *path, struct timespec64 *times) int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode; + struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL;
error = mnt_want_write(path->mnt); @@ -36,16 +37,24 @@ static int utimes_common(const struct path *path, struct timespec64 *times) if (times[0].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_ATIME; else if (times[0].tv_nsec != UTIME_NOW) { - newattrs.ia_atime.tv_sec = times[0].tv_sec; - newattrs.ia_atime.tv_nsec = times[0].tv_nsec; + newattrs.ia_atime.tv_sec = + clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max); + if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min) + newattrs.ia_atime.tv_nsec = 0; + else + newattrs.ia_atime.tv_nsec = times[0].tv_nsec; newattrs.ia_valid |= ATTR_ATIME_SET; }
if (times[1].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_MTIME; else if (times[1].tv_nsec != UTIME_NOW) { - newattrs.ia_mtime.tv_sec = times[1].tv_sec; - newattrs.ia_mtime.tv_nsec = times[1].tv_nsec; + newattrs.ia_mtime.tv_sec = + clamp(times[1].tv_sec, sb->s_time_min, sb->s_time_max); + if (times[1].tv_sec >= sb->s_time_max || times[1].tv_sec == sb->s_time_min) + newattrs.ia_mtime.tv_nsec = 0; + else + newattrs.ia_mtime.tv_nsec = times[1].tv_nsec; newattrs.ia_valid |= ATTR_MTIME_SET; } /*
On Mon, Jul 29, 2019 at 06:49:09PM -0700, Deepa Dinamani wrote:
POSIX is ambiguous on the behavior of timestamps for futimens, utimensat and utimes. Whether to return an error or silently clamp a timestamp beyond the range supported by the underlying filesystems is not clear.
POSIX.1 section for futimens, utimensat and utimes says: (http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html)
The file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the specified time.
If the tv_nsec field of a timespec structure has the special value UTIME_NOW, the file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the current time.
[EINVAL] A new file timestamp would be a value whose tv_sec component is not a value supported by the file system.
The patch chooses to clamp the timestamps according to the filesystem timestamp ranges and does not return an error. This is in line with the behavior of utime syscall also since the POSIX page(http://pubs.opengroup.org/onlinepubs/009695399/functions/utime.html) for utime does not mention returning an error or clamping like above.
Same for utimes http://pubs.opengroup.org/onlinepubs/009695399/functions/utimes.html
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/utimes.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/utimes.c b/fs/utimes.c index 350c9c16ace1..4c1a2ce90bbc 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -21,6 +21,7 @@ static int utimes_common(const struct path *path, struct timespec64 *times) int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode;
- struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL;
error = mnt_want_write(path->mnt); @@ -36,16 +37,24 @@ static int utimes_common(const struct path *path, struct timespec64 *times) if (times[0].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_ATIME; else if (times[0].tv_nsec != UTIME_NOW) {
newattrs.ia_atime.tv_sec = times[0].tv_sec;
newattrs.ia_atime.tv_nsec = times[0].tv_nsec;
newattrs.ia_atime.tv_sec =
clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min)
newattrs.ia_atime.tv_nsec = 0;
else
}newattrs.ia_atime.tv_nsec = times[0].tv_nsec; newattrs.ia_valid |= ATTR_ATIME_SET;
if (times[1].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_MTIME; else if (times[1].tv_nsec != UTIME_NOW) {
newattrs.ia_mtime.tv_sec = times[1].tv_sec;
newattrs.ia_mtime.tv_nsec = times[1].tv_nsec;
newattrs.ia_mtime.tv_sec =
clamp(times[1].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[1].tv_sec >= sb->s_time_max || times[1].tv_sec == sb->s_time_min)
Line length.
Also, didn't you just introduce a function to clamp tv_sec and fix granularity? Why not just use it here? I think this is the third time I've seen this open-coded logic.
--D
newattrs.ia_mtime.tv_nsec = 0;
else
} /*newattrs.ia_mtime.tv_nsec = times[1].tv_nsec; newattrs.ia_valid |= ATTR_MTIME_SET;
-- 2.17.1
On Wed, Jul 31, 2019 at 8:15 AM Darrick J. Wong darrick.wong@oracle.com wrote:
On Mon, Jul 29, 2019 at 06:49:09PM -0700, Deepa Dinamani wrote:
POSIX is ambiguous on the behavior of timestamps for futimens, utimensat and utimes. Whether to return an error or silently clamp a timestamp beyond the range supported by the underlying filesystems is not clear.
POSIX.1 section for futimens, utimensat and utimes says: (http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html)
The file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the specified time.
If the tv_nsec field of a timespec structure has the special value UTIME_NOW, the file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the current time.
[EINVAL] A new file timestamp would be a value whose tv_sec component is not a value supported by the file system.
The patch chooses to clamp the timestamps according to the filesystem timestamp ranges and does not return an error. This is in line with the behavior of utime syscall also since the POSIX page(http://pubs.opengroup.org/onlinepubs/009695399/functions/utime.html) for utime does not mention returning an error or clamping like above.
Same for utimes http://pubs.opengroup.org/onlinepubs/009695399/functions/utimes.html
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/utimes.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/utimes.c b/fs/utimes.c index 350c9c16ace1..4c1a2ce90bbc 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -21,6 +21,7 @@ static int utimes_common(const struct path *path, struct timespec64 *times) int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode;
struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL; error = mnt_want_write(path->mnt);
@@ -36,16 +37,24 @@ static int utimes_common(const struct path *path, struct timespec64 *times) if (times[0].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_ATIME; else if (times[0].tv_nsec != UTIME_NOW) {
newattrs.ia_atime.tv_sec = times[0].tv_sec;
newattrs.ia_atime.tv_nsec = times[0].tv_nsec;
newattrs.ia_atime.tv_sec =
clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min)
newattrs.ia_atime.tv_nsec = 0;
else
newattrs.ia_atime.tv_nsec = times[0].tv_nsec; newattrs.ia_valid |= ATTR_ATIME_SET; } if (times[1].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_MTIME; else if (times[1].tv_nsec != UTIME_NOW) {
newattrs.ia_mtime.tv_sec = times[1].tv_sec;
newattrs.ia_mtime.tv_nsec = times[1].tv_nsec;
newattrs.ia_mtime.tv_sec =
clamp(times[1].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[1].tv_sec >= sb->s_time_max || times[1].tv_sec == sb->s_time_min)
Line length.
Also, didn't you just introduce a function to clamp tv_sec and fix granularity? Why not just use it here? I think this is the third time I've seen this open-coded logic.
Yes, we can use that now. Earlier we were not setting the tv_nsec to 0 in timestamp_truncate() which is why this was opencoded here. I will make the change to include this.
Thanks, Deepa
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
POSIX is ambiguous on the behavior of timestamps for futimens, utimensat and utimes. Whether to return an error or silently clamp a timestamp beyond the range supported by the underlying filesystems is not clear.
POSIX.1 section for futimens, utimensat and utimes says: (http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html)
The file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the specified time.
If the tv_nsec field of a timespec structure has the special value UTIME_NOW, the file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the current time.
[EINVAL] A new file timestamp would be a value whose tv_sec component is not a value supported by the file system.
The patch chooses to clamp the timestamps according to the filesystem timestamp ranges and does not return an error. This is in line with the behavior of utime syscall also since the POSIX page(http://pubs.opengroup.org/onlinepubs/009695399/functions/utime.html) for utime does not mention returning an error or clamping like above.
Same for utimes http://pubs.opengroup.org/onlinepubs/009695399/functions/utimes.html
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/utimes.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/utimes.c b/fs/utimes.c index 350c9c16ace1..4c1a2ce90bbc 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -21,6 +21,7 @@ static int utimes_common(const struct path *path, struct timespec64 *times) int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode;
- struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL;
error = mnt_want_write(path->mnt); @@ -36,16 +37,24 @@ static int utimes_common(const struct path *path, struct timespec64 *times) if (times[0].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_ATIME; else if (times[0].tv_nsec != UTIME_NOW) {
newattrs.ia_atime.tv_sec = times[0].tv_sec;
newattrs.ia_atime.tv_nsec = times[0].tv_nsec;
newattrs.ia_atime.tv_sec =
clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min)
This is testing the un-clamped value.
newattrs.ia_atime.tv_nsec = 0;
else
}newattrs.ia_atime.tv_nsec = times[0].tv_nsec; newattrs.ia_valid |= ATTR_ATIME_SET;
if (times[1].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_MTIME; else if (times[1].tv_nsec != UTIME_NOW) {
newattrs.ia_mtime.tv_sec = times[1].tv_sec;
newattrs.ia_mtime.tv_nsec = times[1].tv_nsec;
newattrs.ia_mtime.tv_sec =
clamp(times[1].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[1].tv_sec >= sb->s_time_max || times[1].tv_sec == sb->s_time_min)
Similarly here, for the minimum.
I suggest testing for clamping like this:
if (newattrs.ia_atime.tv_sec != times[0].tv_sec) ... if (newattrs.ia_mtime.tv_sec != times[1].tv_sec) ...
Ben.
newattrs.ia_mtime.tv_nsec = 0;
else
} /*newattrs.ia_mtime.tv_nsec = times[1].tv_nsec; newattrs.ia_valid |= ATTR_MTIME_SET;
On Mon, Aug 5, 2019 at 6:30 AM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Mon, 2019-07-29 at 18:49 -0700, Deepa Dinamani wrote:
POSIX is ambiguous on the behavior of timestamps for futimens, utimensat and utimes. Whether to return an error or silently clamp a timestamp beyond the range supported by the underlying filesystems is not clear.
POSIX.1 section for futimens, utimensat and utimes says: (http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html)
The file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the specified time.
If the tv_nsec field of a timespec structure has the special value UTIME_NOW, the file's relevant timestamp shall be set to the greatest value supported by the file system that is not greater than the current time.
[EINVAL] A new file timestamp would be a value whose tv_sec component is not a value supported by the file system.
The patch chooses to clamp the timestamps according to the filesystem timestamp ranges and does not return an error. This is in line with the behavior of utime syscall also since the POSIX page(http://pubs.opengroup.org/onlinepubs/009695399/functions/utime.html) for utime does not mention returning an error or clamping like above.
Same for utimes http://pubs.opengroup.org/onlinepubs/009695399/functions/utimes.html
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
fs/utimes.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/fs/utimes.c b/fs/utimes.c index 350c9c16ace1..4c1a2ce90bbc 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -21,6 +21,7 @@ static int utimes_common(const struct path *path, struct timespec64 *times) int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode;
struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL; error = mnt_want_write(path->mnt);
@@ -36,16 +37,24 @@ static int utimes_common(const struct path *path, struct timespec64 *times) if (times[0].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_ATIME; else if (times[0].tv_nsec != UTIME_NOW) {
newattrs.ia_atime.tv_sec = times[0].tv_sec;
newattrs.ia_atime.tv_nsec = times[0].tv_nsec;
newattrs.ia_atime.tv_sec =
clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min)
This is testing the un-clamped value.
newattrs.ia_atime.tv_nsec = 0;
else
newattrs.ia_atime.tv_nsec = times[0].tv_nsec; newattrs.ia_valid |= ATTR_ATIME_SET; } if (times[1].tv_nsec == UTIME_OMIT) newattrs.ia_valid &= ~ATTR_MTIME; else if (times[1].tv_nsec != UTIME_NOW) {
newattrs.ia_mtime.tv_sec = times[1].tv_sec;
newattrs.ia_mtime.tv_nsec = times[1].tv_nsec;
newattrs.ia_mtime.tv_sec =
clamp(times[1].tv_sec, sb->s_time_min, sb->s_time_max);
if (times[1].tv_sec >= sb->s_time_max || times[1].tv_sec == sb->s_time_min)
Similarly here, for the minimum.
I suggest testing for clamping like this:
if (newattrs.ia_atime.tv_sec != times[0].tv_sec) ... if (newattrs.ia_mtime.tv_sec != times[1].tv_sec) ...
Ben.
newattrs.ia_mtime.tv_nsec = 0;
else
newattrs.ia_mtime.tv_nsec = times[1].tv_nsec; newattrs.ia_valid |= ATTR_MTIME_SET; }
Darrick pointed out that maybe we could use timestamp_truncate() here to clamp. I think it is ok to truncate to the right granularity also here. setattr callbacks do it already. So the diff here looks like below:
- newattrs.ia_atime.tv_sec = - clamp(times[0].tv_sec, sb->s_time_min, sb->s_time_max); - if (times[0].tv_sec == sb->s_time_max || times[0].tv_sec == sb->s_time_min) - newattrs.ia_atime.tv_nsec = 0; - else - newattrs.ia_atime.tv_nsec = times[0].tv_nsec; + newattrs.ia_atime = timestamp_truncate(times[0], inode);
Thanks, Deepa
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Even though some filesystems are read-only, fill in the timestamps to reflect the on-disk representation.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: aivazian.tigran@gmail.com Cc: al@alarsen.net Cc: coda@cs.cmu.edu Cc: darrick.wong@oracle.com Cc: dushistov@mail.ru Cc: dwmw2@infradead.org Cc: hch@infradead.org Cc: jack@suse.com Cc: jaharkes@cs.cmu.edu Cc: luisbg@kernel.org Cc: nico@fluxnic.net Cc: phillip@squashfs.org.uk Cc: richard@nod.at Cc: salah.triki@gmail.com Cc: shaggy@kernel.org Cc: linux-xfs@vger.kernel.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: reiserfs-devel@vger.kernel.org --- fs/befs/linuxvfs.c | 2 ++ fs/bfs/inode.c | 2 ++ fs/coda/inode.c | 3 +++ fs/cramfs/inode.c | 2 ++ fs/efs/super.c | 2 ++ fs/ext2/super.c | 2 ++ fs/freevxfs/vxfs_super.c | 2 ++ fs/jffs2/fs.c | 3 +++ fs/jfs/super.c | 2 ++ fs/minix/inode.c | 2 ++ fs/qnx4/inode.c | 2 ++ fs/qnx6/inode.c | 2 ++ fs/reiserfs/super.c | 3 +++ fs/romfs/super.c | 2 ++ fs/squashfs/super.c | 2 ++ fs/ufs/super.c | 7 +++++++ fs/xfs/xfs_super.c | 2 ++ 17 files changed, 42 insertions(+)
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c index 462d096ff3e9..64cdf4d8e424 100644 --- a/fs/befs/linuxvfs.c +++ b/fs/befs/linuxvfs.c @@ -893,6 +893,8 @@ befs_fill_super(struct super_block *sb, void *data, int silent) sb_set_blocksize(sb, (ulong) befs_sb->block_size); sb->s_op = &befs_sops; sb->s_export_op = &befs_export_operations; + sb->s_time_min = 0; + sb->s_time_max = 0xffffffffffffll; root = befs_iget(sb, iaddr2blockno(sb, &(befs_sb->root_dir))); if (IS_ERR(root)) { ret = PTR_ERR(root); diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c index 5e97bed073d7..f8ce1368218b 100644 --- a/fs/bfs/inode.c +++ b/fs/bfs/inode.c @@ -324,6 +324,8 @@ static int bfs_fill_super(struct super_block *s, void *data, int silent) return -ENOMEM; mutex_init(&info->bfs_lock); s->s_fs_info = info; + s->s_time_min = 0; + s->s_time_max = U32_MAX;
sb_set_blocksize(s, BFS_BSIZE);
diff --git a/fs/coda/inode.c b/fs/coda/inode.c index 321f56e487cb..b1c70e2b9b1e 100644 --- a/fs/coda/inode.c +++ b/fs/coda/inode.c @@ -188,6 +188,9 @@ static int coda_fill_super(struct super_block *sb, void *data, int silent) sb->s_magic = CODA_SUPER_MAGIC; sb->s_op = &coda_super_operations; sb->s_d_op = &coda_dentry_operations; + sb->s_time_gran = 1; + sb->s_time_min = S64_MIN; + sb->s_time_max = S64_MAX;
error = super_setup_bdi(sb); if (error) diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index 9352487bd0fc..4d1d8b7761ed 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -597,6 +597,8 @@ static int cramfs_finalize_super(struct super_block *sb,
/* Set it all up.. */ sb->s_flags |= SB_RDONLY; + sb->s_time_min = 0; + sb->s_time_max = 0; sb->s_op = &cramfs_ops; root = get_cramfs_inode(sb, cramfs_root, 0); if (IS_ERR(root)) diff --git a/fs/efs/super.c b/fs/efs/super.c index 867fc24dee20..4a6ebff2af76 100644 --- a/fs/efs/super.c +++ b/fs/efs/super.c @@ -257,6 +257,8 @@ static int efs_fill_super(struct super_block *s, void *d, int silent) if (!sb) return -ENOMEM; s->s_fs_info = sb; + s->s_time_min = 0; + s->s_time_max = U32_MAX;
s->s_magic = EFS_SUPER_MAGIC; if (!sb_set_blocksize(s, EFS_BLOCKSIZE)) { diff --git a/fs/ext2/super.c b/fs/ext2/super.c index 44eb6e7eb492..baa36c6fb71e 100644 --- a/fs/ext2/super.c +++ b/fs/ext2/super.c @@ -1002,6 +1002,8 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
sb->s_maxbytes = ext2_max_size(sb->s_blocksize_bits); sb->s_max_links = EXT2_LINK_MAX; + sb->s_time_min = S32_MIN; + sb->s_time_max = S32_MAX;
if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV) { sbi->s_inode_size = EXT2_GOOD_OLD_INODE_SIZE; diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c index a89f68c3cbed..578a5062706e 100644 --- a/fs/freevxfs/vxfs_super.c +++ b/fs/freevxfs/vxfs_super.c @@ -229,6 +229,8 @@ static int vxfs_fill_super(struct super_block *sbp, void *dp, int silent)
sbp->s_op = &vxfs_super_ops; sbp->s_fs_info = infp; + sbp->s_time_min = 0; + sbp->s_time_max = U32_MAX;
if (!vxfs_try_sb_magic(sbp, silent, 1, (__force __fs32)cpu_to_le32(VXFS_SUPER_MAGIC))) { diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c index 8a20ddd25f2d..d0b59d03a7a9 100644 --- a/fs/jffs2/fs.c +++ b/fs/jffs2/fs.c @@ -590,6 +590,9 @@ int jffs2_do_fill_super(struct super_block *sb, void *data, int silent) sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = JFFS2_SUPER_MAGIC; + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; + if (!sb_rdonly(sb)) jffs2_start_garbage_collect_thread(c); return 0; diff --git a/fs/jfs/super.c b/fs/jfs/super.c index f4e10cb9f734..b2dc4d1f9dcc 100644 --- a/fs/jfs/super.c +++ b/fs/jfs/super.c @@ -503,6 +503,8 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_fs_info = sbi; sb->s_max_links = JFS_LINK_MAX; + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; sbi->sb = sb; sbi->uid = INVALID_UID; sbi->gid = INVALID_GID; diff --git a/fs/minix/inode.c b/fs/minix/inode.c index f96073f25432..7cb5fd38eb14 100644 --- a/fs/minix/inode.c +++ b/fs/minix/inode.c @@ -277,6 +277,8 @@ static int minix_fill_super(struct super_block *s, void *data, int silent)
/* set up enough so that it can read an inode */ s->s_op = &minix_sops; + s->s_time_min = 0; + s->s_time_max = U32_MAX; root_inode = minix_iget(s, MINIX_ROOT_INO); if (IS_ERR(root_inode)) { ret = PTR_ERR(root_inode); diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c index 922d083bbc7c..e8da1cde87b9 100644 --- a/fs/qnx4/inode.c +++ b/fs/qnx4/inode.c @@ -201,6 +201,8 @@ static int qnx4_fill_super(struct super_block *s, void *data, int silent) s->s_op = &qnx4_sops; s->s_magic = QNX4_SUPER_MAGIC; s->s_flags |= SB_RDONLY; /* Yup, read-only yet */ + s->s_time_min = 0; + s->s_time_max = U32_MAX;
/* Check the superblock signature. Since the qnx4 code is dangerous, we should leave as quickly as possible diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c index 0f8b0ff1ba43..345db56c98fd 100644 --- a/fs/qnx6/inode.c +++ b/fs/qnx6/inode.c @@ -429,6 +429,8 @@ static int qnx6_fill_super(struct super_block *s, void *data, int silent) s->s_op = &qnx6_sops; s->s_magic = QNX6_SUPER_MAGIC; s->s_flags |= SB_RDONLY; /* Yup, read-only yet */ + s->s_time_min = 0; + s->s_time_max = U32_MAX;
/* ease the later tree level calculations */ sbi = QNX6_SB(s); diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c index ab028ea0e561..d69b4ac0ae2f 100644 --- a/fs/reiserfs/super.c +++ b/fs/reiserfs/super.c @@ -1976,6 +1976,9 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent) goto error_unlocked; }
+ s->s_time_min = 0; + s->s_time_max = U32_MAX; + rs = SB_DISK_SUPER_BLOCK(s); /* * Let's do basic sanity check to verify that underlying device is not diff --git a/fs/romfs/super.c b/fs/romfs/super.c index 7d580f7c3f1d..a42c0e3079dc 100644 --- a/fs/romfs/super.c +++ b/fs/romfs/super.c @@ -478,6 +478,8 @@ static int romfs_fill_super(struct super_block *sb, void *data, int silent) sb->s_maxbytes = 0xFFFFFFFF; sb->s_magic = ROMFS_MAGIC; sb->s_flags |= SB_RDONLY | SB_NOATIME; + sb->s_time_min = 0; + sb->s_time_max = 0; sb->s_op = &romfs_super_ops;
#ifdef CONFIG_ROMFS_ON_MTD diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c index effa638d6d85..a9e9837617a9 100644 --- a/fs/squashfs/super.c +++ b/fs/squashfs/super.c @@ -183,6 +183,8 @@ static int squashfs_fill_super(struct super_block *sb, void *data, int silent) (u64) le64_to_cpu(sblk->id_table_start));
sb->s_maxbytes = MAX_LFS_FILESIZE; + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; sb->s_flags |= SB_RDONLY; sb->s_op = &squashfs_super_ops;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c index 4ed0dca52ec8..1da0be667409 100644 --- a/fs/ufs/super.c +++ b/fs/ufs/super.c @@ -843,6 +843,10 @@ static int ufs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_maxbytes = MAX_LFS_FILESIZE;
+ sb->s_time_gran = NSEC_PER_SEC; + sb->s_time_min = S32_MIN; + sb->s_time_max = S32_MAX; + switch (sbi->s_mount_opt & UFS_MOUNT_UFSTYPE) { case UFS_MOUNT_UFSTYPE_44BSD: UFSD("ufstype=44bsd\n"); @@ -861,6 +865,9 @@ static int ufs_fill_super(struct super_block *sb, void *data, int silent) uspi->s_fshift = 9; uspi->s_sbsize = super_block_size = 1536; uspi->s_sbbase = 0; + sb->s_time_gran = 1; + sb->s_time_min = S64_MIN; + sb->s_time_max = S64_MAX; flags |= UFS_TYPE_UFS2 | UFS_DE_44BSD | UFS_UID_44BSD | UFS_ST_44BSD | UFS_CG_44BSD; break; diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index a14d11d78bd8..1a0daf46bae8 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1685,6 +1685,8 @@ xfs_fs_fill_super( sb->s_maxbytes = xfs_max_file_offset(sb->s_blocksize_bits); sb->s_max_links = XFS_MAXLINK; sb->s_time_gran = 1; + sb->s_time_min = S32_MIN; + sb->s_time_max = S32_MAX; set_posix_acl_flag(sb);
/* version 5 superblocks support inode version counters. */
On Mon, Jul 29, 2019 at 06:49:10PM -0700, Deepa Dinamani wrote:
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Even though some filesystems are read-only, fill in the timestamps to reflect the on-disk representation.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: aivazian.tigran@gmail.com Cc: al@alarsen.net Cc: coda@cs.cmu.edu Cc: darrick.wong@oracle.com Cc: dushistov@mail.ru Cc: dwmw2@infradead.org Cc: hch@infradead.org Cc: jack@suse.com Cc: jaharkes@cs.cmu.edu Cc: luisbg@kernel.org Cc: nico@fluxnic.net Cc: phillip@squashfs.org.uk Cc: richard@nod.at Cc: salah.triki@gmail.com Cc: shaggy@kernel.org Cc: linux-xfs@vger.kernel.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: reiserfs-devel@vger.kernel.org
fs/befs/linuxvfs.c | 2 ++ fs/bfs/inode.c | 2 ++ fs/coda/inode.c | 3 +++ fs/cramfs/inode.c | 2 ++ fs/efs/super.c | 2 ++ fs/ext2/super.c | 2 ++ fs/freevxfs/vxfs_super.c | 2 ++ fs/jffs2/fs.c | 3 +++ fs/jfs/super.c | 2 ++ fs/minix/inode.c | 2 ++ fs/qnx4/inode.c | 2 ++ fs/qnx6/inode.c | 2 ++ fs/reiserfs/super.c | 3 +++ fs/romfs/super.c | 2 ++ fs/squashfs/super.c | 2 ++ fs/ufs/super.c | 7 +++++++ fs/xfs/xfs_super.c | 2 ++ 17 files changed, 42 insertions(+)
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c index 462d096ff3e9..64cdf4d8e424 100644 --- a/fs/befs/linuxvfs.c +++ b/fs/befs/linuxvfs.c @@ -893,6 +893,8 @@ befs_fill_super(struct super_block *sb, void *data, int silent) sb_set_blocksize(sb, (ulong) befs_sb->block_size); sb->s_op = &befs_sops; sb->s_export_op = &befs_export_operations;
- sb->s_time_min = 0;
- sb->s_time_max = 0xffffffffffffll; root = befs_iget(sb, iaddr2blockno(sb, &(befs_sb->root_dir))); if (IS_ERR(root)) { ret = PTR_ERR(root);
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c index 5e97bed073d7..f8ce1368218b 100644 --- a/fs/bfs/inode.c +++ b/fs/bfs/inode.c @@ -324,6 +324,8 @@ static int bfs_fill_super(struct super_block *s, void *data, int silent) return -ENOMEM; mutex_init(&info->bfs_lock); s->s_fs_info = info;
- s->s_time_min = 0;
- s->s_time_max = U32_MAX;
sb_set_blocksize(s, BFS_BSIZE); diff --git a/fs/coda/inode.c b/fs/coda/inode.c index 321f56e487cb..b1c70e2b9b1e 100644 --- a/fs/coda/inode.c +++ b/fs/coda/inode.c @@ -188,6 +188,9 @@ static int coda_fill_super(struct super_block *sb, void *data, int silent) sb->s_magic = CODA_SUPER_MAGIC; sb->s_op = &coda_super_operations; sb->s_d_op = &coda_dentry_operations;
- sb->s_time_gran = 1;
- sb->s_time_min = S64_MIN;
- sb->s_time_max = S64_MAX;
error = super_setup_bdi(sb); if (error) diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index 9352487bd0fc..4d1d8b7761ed 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -597,6 +597,8 @@ static int cramfs_finalize_super(struct super_block *sb, /* Set it all up.. */ sb->s_flags |= SB_RDONLY;
- sb->s_time_min = 0;
- sb->s_time_max = 0; sb->s_op = &cramfs_ops; root = get_cramfs_inode(sb, cramfs_root, 0); if (IS_ERR(root))
diff --git a/fs/efs/super.c b/fs/efs/super.c index 867fc24dee20..4a6ebff2af76 100644 --- a/fs/efs/super.c +++ b/fs/efs/super.c @@ -257,6 +257,8 @@ static int efs_fill_super(struct super_block *s, void *d, int silent) if (!sb) return -ENOMEM; s->s_fs_info = sb;
- s->s_time_min = 0;
- s->s_time_max = U32_MAX;
s->s_magic = EFS_SUPER_MAGIC; if (!sb_set_blocksize(s, EFS_BLOCKSIZE)) { diff --git a/fs/ext2/super.c b/fs/ext2/super.c index 44eb6e7eb492..baa36c6fb71e 100644 --- a/fs/ext2/super.c +++ b/fs/ext2/super.c @@ -1002,6 +1002,8 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent) sb->s_maxbytes = ext2_max_size(sb->s_blocksize_bits); sb->s_max_links = EXT2_LINK_MAX;
- sb->s_time_min = S32_MIN;
- sb->s_time_max = S32_MAX;
if (le32_to_cpu(es->s_rev_level) == EXT2_GOOD_OLD_REV) { sbi->s_inode_size = EXT2_GOOD_OLD_INODE_SIZE; diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c index a89f68c3cbed..578a5062706e 100644 --- a/fs/freevxfs/vxfs_super.c +++ b/fs/freevxfs/vxfs_super.c @@ -229,6 +229,8 @@ static int vxfs_fill_super(struct super_block *sbp, void *dp, int silent) sbp->s_op = &vxfs_super_ops; sbp->s_fs_info = infp;
- sbp->s_time_min = 0;
- sbp->s_time_max = U32_MAX;
if (!vxfs_try_sb_magic(sbp, silent, 1, (__force __fs32)cpu_to_le32(VXFS_SUPER_MAGIC))) { diff --git a/fs/jffs2/fs.c b/fs/jffs2/fs.c index 8a20ddd25f2d..d0b59d03a7a9 100644 --- a/fs/jffs2/fs.c +++ b/fs/jffs2/fs.c @@ -590,6 +590,9 @@ int jffs2_do_fill_super(struct super_block *sb, void *data, int silent) sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = JFFS2_SUPER_MAGIC;
- sb->s_time_min = 0;
- sb->s_time_max = U32_MAX;
- if (!sb_rdonly(sb)) jffs2_start_garbage_collect_thread(c); return 0;
diff --git a/fs/jfs/super.c b/fs/jfs/super.c index f4e10cb9f734..b2dc4d1f9dcc 100644 --- a/fs/jfs/super.c +++ b/fs/jfs/super.c @@ -503,6 +503,8 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent) sb->s_fs_info = sbi; sb->s_max_links = JFS_LINK_MAX;
- sb->s_time_min = 0;
- sb->s_time_max = U32_MAX; sbi->sb = sb; sbi->uid = INVALID_UID; sbi->gid = INVALID_GID;
diff --git a/fs/minix/inode.c b/fs/minix/inode.c index f96073f25432..7cb5fd38eb14 100644 --- a/fs/minix/inode.c +++ b/fs/minix/inode.c @@ -277,6 +277,8 @@ static int minix_fill_super(struct super_block *s, void *data, int silent) /* set up enough so that it can read an inode */ s->s_op = &minix_sops;
- s->s_time_min = 0;
- s->s_time_max = U32_MAX; root_inode = minix_iget(s, MINIX_ROOT_INO); if (IS_ERR(root_inode)) { ret = PTR_ERR(root_inode);
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c index 922d083bbc7c..e8da1cde87b9 100644 --- a/fs/qnx4/inode.c +++ b/fs/qnx4/inode.c @@ -201,6 +201,8 @@ static int qnx4_fill_super(struct super_block *s, void *data, int silent) s->s_op = &qnx4_sops; s->s_magic = QNX4_SUPER_MAGIC; s->s_flags |= SB_RDONLY; /* Yup, read-only yet */
- s->s_time_min = 0;
- s->s_time_max = U32_MAX;
/* Check the superblock signature. Since the qnx4 code is dangerous, we should leave as quickly as possible diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c index 0f8b0ff1ba43..345db56c98fd 100644 --- a/fs/qnx6/inode.c +++ b/fs/qnx6/inode.c @@ -429,6 +429,8 @@ static int qnx6_fill_super(struct super_block *s, void *data, int silent) s->s_op = &qnx6_sops; s->s_magic = QNX6_SUPER_MAGIC; s->s_flags |= SB_RDONLY; /* Yup, read-only yet */
- s->s_time_min = 0;
- s->s_time_max = U32_MAX;
/* ease the later tree level calculations */ sbi = QNX6_SB(s); diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c index ab028ea0e561..d69b4ac0ae2f 100644 --- a/fs/reiserfs/super.c +++ b/fs/reiserfs/super.c @@ -1976,6 +1976,9 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent) goto error_unlocked; }
- s->s_time_min = 0;
- s->s_time_max = U32_MAX;
- rs = SB_DISK_SUPER_BLOCK(s); /*
- Let's do basic sanity check to verify that underlying device is not
diff --git a/fs/romfs/super.c b/fs/romfs/super.c index 7d580f7c3f1d..a42c0e3079dc 100644 --- a/fs/romfs/super.c +++ b/fs/romfs/super.c @@ -478,6 +478,8 @@ static int romfs_fill_super(struct super_block *sb, void *data, int silent) sb->s_maxbytes = 0xFFFFFFFF; sb->s_magic = ROMFS_MAGIC; sb->s_flags |= SB_RDONLY | SB_NOATIME;
- sb->s_time_min = 0;
- sb->s_time_max = 0; sb->s_op = &romfs_super_ops;
#ifdef CONFIG_ROMFS_ON_MTD diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c index effa638d6d85..a9e9837617a9 100644 --- a/fs/squashfs/super.c +++ b/fs/squashfs/super.c @@ -183,6 +183,8 @@ static int squashfs_fill_super(struct super_block *sb, void *data, int silent) (u64) le64_to_cpu(sblk->id_table_start)); sb->s_maxbytes = MAX_LFS_FILESIZE;
- sb->s_time_min = 0;
- sb->s_time_max = U32_MAX; sb->s_flags |= SB_RDONLY; sb->s_op = &squashfs_super_ops;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c index 4ed0dca52ec8..1da0be667409 100644 --- a/fs/ufs/super.c +++ b/fs/ufs/super.c @@ -843,6 +843,10 @@ static int ufs_fill_super(struct super_block *sb, void *data, int silent) sb->s_maxbytes = MAX_LFS_FILESIZE;
- sb->s_time_gran = NSEC_PER_SEC;
- sb->s_time_min = S32_MIN;
- sb->s_time_max = S32_MAX;
- switch (sbi->s_mount_opt & UFS_MOUNT_UFSTYPE) { case UFS_MOUNT_UFSTYPE_44BSD: UFSD("ufstype=44bsd\n");
@@ -861,6 +865,9 @@ static int ufs_fill_super(struct super_block *sb, void *data, int silent) uspi->s_fshift = 9; uspi->s_sbsize = super_block_size = 1536; uspi->s_sbbase = 0;
sb->s_time_gran = 1;
sb->s_time_min = S64_MIN;
flags |= UFS_TYPE_UFS2 | UFS_DE_44BSD | UFS_UID_44BSD | UFS_ST_44BSD | UFS_CG_44BSD; break;sb->s_time_max = S64_MAX;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index a14d11d78bd8..1a0daf46bae8 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1685,6 +1685,8 @@ xfs_fs_fill_super( sb->s_maxbytes = xfs_max_file_offset(sb->s_blocksize_bits); sb->s_max_links = XFS_MAXLINK; sb->s_time_gran = 1;
- sb->s_time_min = S32_MIN;
- sb->s_time_max = S32_MAX;
For the XFS part,
Reviewed-by: Darrick J. Wong darrick.wong@oracle.com
--D
set_posix_acl_flag(sb); /* version 5 superblocks support inode version counters. */ -- 2.17.1
struct p9_wstat and struct p9_stat_dotl indicate that the wire transport uses u32 and u64 fields for timestamps. Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Note that the upper bound for V9FS_PROTO_2000L is retained as S64_MAX. This is because that is the upper bound supported by vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: ericvh@gmail.com Cc: lucho@ionkov.net Cc: asmadeus@codewreck.org Cc: v9fs-developer@lists.sourceforge.net --- fs/9p/vfs_super.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c index 08112fbcaece..ca243e658d71 100644 --- a/fs/9p/vfs_super.c +++ b/fs/9p/vfs_super.c @@ -69,8 +69,12 @@ v9fs_fill_super(struct super_block *sb, struct v9fs_session_info *v9ses, if (v9fs_proto_dotl(v9ses)) { sb->s_op = &v9fs_super_ops_dotl; sb->s_xattr = v9fs_xattr_handlers; - } else + } else { sb->s_op = &v9fs_super_ops; + sb->s_time_max = U32_MAX; + } + + sb->s_time_min = 0;
ret = super_setup_bdi(sb); if (ret)
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Note that the min timestamp is assumed to be 01 Jan 1970 00:00:00 (Unix epoch). This is consistent with the way we convert timestamps in adfs_adfs2unix_time().
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/adfs/adfs.h | 13 +++++++++++++ fs/adfs/inode.c | 8 ++------ fs/adfs/super.c | 2 ++ 3 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/fs/adfs/adfs.h b/fs/adfs/adfs.h index 804c6a77c5db..781b1c3817e0 100644 --- a/fs/adfs/adfs.h +++ b/fs/adfs/adfs.h @@ -2,6 +2,19 @@ #include <linux/fs.h> #include <linux/adfs_fs.h>
+/* + * 01 Jan 1970 00:00:00 (Unix epoch) as seconds since + * 01 Jan 1900 00:00:00 (RISC OS epoch) + */ +#define RISC_OS_EPOCH_DELTA 2208988800LL + +/* + * Convert 40 bit centi seconds to seconds + * since 01 Jan 1900 00:00:00 (RISC OS epoch) + * The result is 2248-06-03 06:57:57 GMT + */ +#define ADFS_MAX_TIMESTAMP ((0xFFFFFFFFFFLL / 100) - RISC_OS_EPOCH_DELTA) + /* Internal data structures for ADFS */
#define ADFS_FREE_FRAG 0 diff --git a/fs/adfs/inode.c b/fs/adfs/inode.c index 66621e96f9af..3f75cefc0380 100644 --- a/fs/adfs/inode.c +++ b/fs/adfs/inode.c @@ -170,11 +170,7 @@ static void adfs_adfs2unix_time(struct timespec64 *tv, struct inode *inode) { unsigned int high, low; - /* 01 Jan 1970 00:00:00 (Unix epoch) as nanoseconds since - * 01 Jan 1900 00:00:00 (RISC OS epoch) - */ - static const s64 nsec_unix_epoch_diff_risc_os_epoch = - 2208988800000000000LL; + static const s64 nsec_unix_epoch_diff_risc_os_epoch = RISC_OS_EPOCH_DELTA * NSEC_PER_SEC; s64 nsec;
if (ADFS_I(inode)->stamped == 0) @@ -219,7 +215,7 @@ adfs_unix2adfs_time(struct inode *inode, unsigned int secs) if (ADFS_I(inode)->stamped) { /* convert 32-bit seconds to 40-bit centi-seconds */ low = (secs & 255) * 100; - high = (secs / 256) * 100 + (low >> 8) + 0x336e996a; + high = (secs / 256) * 100 + (low >> 8) + (RISC_OS_EPOCH_DELTA*100/256);
ADFS_I(inode)->loadaddr = (high >> 24) | (ADFS_I(inode)->loadaddr & ~0xff); diff --git a/fs/adfs/super.c b/fs/adfs/super.c index 2a83655c408f..0a0854ef9e3c 100644 --- a/fs/adfs/super.c +++ b/fs/adfs/super.c @@ -449,6 +449,8 @@ static int adfs_fill_super(struct super_block *sb, void *data, int silent) asb->s_size = adfs_discsize(dr, sb->s_blocksize_bits); asb->s_version = dr->format_version; asb->s_log2sharesize = dr->log2sharesize; + sb->s_time_min = 0; + sb->s_time_max = ADFS_MAX_TIMESTAMP;
asb->s_map = adfs_read_map(sb, dr); if (IS_ERR(asb->s_map)) {
ext4 has different overflow limits for max filesystem timestamps based on the extra bytes available.
The timestamp limits are calculated according to the encoding table in a4dad1ae24f85i(ext4: Fix handling of extended tv_sec):
* extra msb of adjust for signed * epoch 32-bit 32-bit tv_sec to * bits time decoded 64-bit tv_sec 64-bit tv_sec valid time range * 0 0 1 -0x80000000..-0x00000001 0x000000000 1901-12-13..1969-12-31 * 0 0 0 0x000000000..0x07fffffff 0x000000000 1970-01-01..2038-01-19 * 0 1 1 0x080000000..0x0ffffffff 0x100000000 2038-01-19..2106-02-07 * 0 1 0 0x100000000..0x17fffffff 0x100000000 2106-02-07..2174-02-25 * 1 0 1 0x180000000..0x1ffffffff 0x200000000 2174-02-25..2242-03-16 * 1 0 0 0x200000000..0x27fffffff 0x200000000 2242-03-16..2310-04-04 * 1 1 1 0x280000000..0x2ffffffff 0x300000000 2310-04-04..2378-04-22 * 1 1 0 0x300000000..0x37fffffff 0x300000000 2378-04-22..2446-05-10
Note that the time limits are not correct for deletion times.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Reviewed-by: Andreas Dilger adilger@dilger.ca Cc: tytso@mit.edu Cc: adilger.kernel@dilger.ca Cc: linux-ext4@vger.kernel.org --- fs/ext4/ext4.h | 4 ++++ fs/ext4/super.c | 17 +++++++++++++++-- 2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 1cb67859e051..3f13cf12ae7f 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1631,6 +1631,10 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei)
#define EXT4_GOOD_OLD_INODE_SIZE 128
+#define EXT4_EXTRA_TIMESTAMP_MAX (((s64)1 << 34) - 1 + S32_MIN) +#define EXT4_NON_EXTRA_TIMESTAMP_MAX S32_MAX +#define EXT4_TIMESTAMP_MIN S32_MIN + /* * Feature set definitions */ diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 4079605d437a..3ea2d60f33aa 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4035,8 +4035,21 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_inode_size); goto failed_mount; } - if (sbi->s_inode_size > EXT4_GOOD_OLD_INODE_SIZE) - sb->s_time_gran = 1 << (EXT4_EPOCH_BITS - 2); + /* + * i_atime_extra is the last extra field available for [acm]times in + * struct ext4_inode. Checking for that field should suffice to ensure + * we have extra space for all three. + */ + if (sbi->s_inode_size >= offsetof(struct ext4_inode, i_atime_extra) + + sizeof(((struct ext4_inode *)0)->i_atime_extra)) { + sb->s_time_gran = 1; + sb->s_time_max = EXT4_EXTRA_TIMESTAMP_MAX; + } else { + sb->s_time_gran = NSEC_PER_SEC; + sb->s_time_max = EXT4_NON_EXTRA_TIMESTAMP_MAX; + } + + sb->s_time_min = EXT4_TIMESTAMP_MIN; }
sbi->s_desc_size = le16_to_cpu(es->s_desc_size);
On Mon, Jul 29, 2019 at 06:49:13PM -0700, Deepa Dinamani wrote:
ext4 has different overflow limits for max filesystem timestamps based on the extra bytes available.
The timestamp limits are calculated according to the encoding table in a4dad1ae24f85i(ext4: Fix handling of extended tv_sec):
- extra msb of adjust for signed
- epoch 32-bit 32-bit tv_sec to
- bits time decoded 64-bit tv_sec 64-bit tv_sec valid time range
- 0 0 1 -0x80000000..-0x00000001 0x000000000 1901-12-13..1969-12-31
- 0 0 0 0x000000000..0x07fffffff 0x000000000 1970-01-01..2038-01-19
- 0 1 1 0x080000000..0x0ffffffff 0x100000000 2038-01-19..2106-02-07
- 0 1 0 0x100000000..0x17fffffff 0x100000000 2106-02-07..2174-02-25
- 1 0 1 0x180000000..0x1ffffffff 0x200000000 2174-02-25..2242-03-16
- 1 0 0 0x200000000..0x27fffffff 0x200000000 2242-03-16..2310-04-04
- 1 1 1 0x280000000..0x2ffffffff 0x300000000 2310-04-04..2378-04-22
- 1 1 0 0x300000000..0x37fffffff 0x300000000 2378-04-22..2446-05-10
My recollection of ext4 has gotten rusty, so this could be a bogus question:
Say you have a filesystem with s_inode_size > 128 where not all of the ondisk inodes have been upgraded to i_extra_isize > 0 and therefore don't support nanoseconds or times beyond 2038. I think this happens on ext3 filesystems that reserved extra space for inode attrs that are subsequently converted to ext4?
In any case, that means that you have some inodes that support 34-bit tv_sec and some inodes that only support 32-bit tv_sec. For the inodes with 32-bit tv_sec, I think all that happens is that a large timestamp will be truncated further, right?
And no mount time warning because at least /some/ of the inodes are ready to go for the next 30 years?
--D
Note that the time limits are not correct for deletion times.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Reviewed-by: Andreas Dilger adilger@dilger.ca Cc: tytso@mit.edu Cc: adilger.kernel@dilger.ca Cc: linux-ext4@vger.kernel.org
fs/ext4/ext4.h | 4 ++++ fs/ext4/super.c | 17 +++++++++++++++-- 2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 1cb67859e051..3f13cf12ae7f 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1631,6 +1631,10 @@ static inline void ext4_clear_state_flags(struct ext4_inode_info *ei) #define EXT4_GOOD_OLD_INODE_SIZE 128 +#define EXT4_EXTRA_TIMESTAMP_MAX (((s64)1 << 34) - 1 + S32_MIN) +#define EXT4_NON_EXTRA_TIMESTAMP_MAX S32_MAX +#define EXT4_TIMESTAMP_MIN S32_MIN
/*
- Feature set definitions
*/ diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 4079605d437a..3ea2d60f33aa 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4035,8 +4035,21 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_inode_size); goto failed_mount; }
if (sbi->s_inode_size > EXT4_GOOD_OLD_INODE_SIZE)
sb->s_time_gran = 1 << (EXT4_EPOCH_BITS - 2);
/*
* i_atime_extra is the last extra field available for [acm]times in
* struct ext4_inode. Checking for that field should suffice to ensure
* we have extra space for all three.
*/
if (sbi->s_inode_size >= offsetof(struct ext4_inode, i_atime_extra) +
sizeof(((struct ext4_inode *)0)->i_atime_extra)) {
sb->s_time_gran = 1;
sb->s_time_max = EXT4_EXTRA_TIMESTAMP_MAX;
} else {
sb->s_time_gran = NSEC_PER_SEC;
sb->s_time_max = EXT4_NON_EXTRA_TIMESTAMP_MAX;
}
}sb->s_time_min = EXT4_TIMESTAMP_MIN;
sbi->s_desc_size = le16_to_cpu(es->s_desc_size); -- 2.17.1
On Wed, Jul 31, 2019 at 8:26 AM Darrick J. Wong darrick.wong@oracle.com wrote:
On Mon, Jul 29, 2019 at 06:49:13PM -0700, Deepa Dinamani wrote:
ext4 has different overflow limits for max filesystem timestamps based on the extra bytes available.
The timestamp limits are calculated according to the encoding table in a4dad1ae24f85i(ext4: Fix handling of extended tv_sec):
- extra msb of adjust for signed
- epoch 32-bit 32-bit tv_sec to
- bits time decoded 64-bit tv_sec 64-bit tv_sec valid time range
- 0 0 1 -0x80000000..-0x00000001 0x000000000 1901-12-13..1969-12-31
- 0 0 0 0x000000000..0x07fffffff 0x000000000 1970-01-01..2038-01-19
- 0 1 1 0x080000000..0x0ffffffff 0x100000000 2038-01-19..2106-02-07
- 0 1 0 0x100000000..0x17fffffff 0x100000000 2106-02-07..2174-02-25
- 1 0 1 0x180000000..0x1ffffffff 0x200000000 2174-02-25..2242-03-16
- 1 0 0 0x200000000..0x27fffffff 0x200000000 2242-03-16..2310-04-04
- 1 1 1 0x280000000..0x2ffffffff 0x300000000 2310-04-04..2378-04-22
- 1 1 0 0x300000000..0x37fffffff 0x300000000 2378-04-22..2446-05-10
My recollection of ext4 has gotten rusty, so this could be a bogus question:
Say you have a filesystem with s_inode_size > 128 where not all of the ondisk inodes have been upgraded to i_extra_isize > 0 and therefore don't support nanoseconds or times beyond 2038. I think this happens on ext3 filesystems that reserved extra space for inode attrs that are subsequently converted to ext4?
In any case, that means that you have some inodes that support 34-bit tv_sec and some inodes that only support 32-bit tv_sec. For the inodes with 32-bit tv_sec, I think all that happens is that a large timestamp will be truncated further, right?
And no mount time warning because at least /some/ of the inodes are ready to go for the next 30 years?
I'm confused about ext3 being converted to ext4. If the converted inodes have extra space, then ext4_iget() will start using the extra space when it modifies the on disk inode, won't it?
But, if there are 32 bit tv_sec and 34 bit tv_sec in a superblock then from this macro below, if an inode has space for extra bits in timestamps, it uses it. Otherwise, only the first 32 bits are copied to the on disk timestamp. This matches the behavior today for 32 bit tv_sec. But, the 34 bit tv_sec has a better behavior after the series because of the clamping and warning.
#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \ do { \ (raw_inode)->xtime = cpu_to_le32((inode)->xtime.tv_sec); \ if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) {\ (raw_inode)->xtime ## _extra = \ ext4_encode_extra_time(&(inode)->xtime); \ } \ } while (0)
I'm not sure if this corner case if important. Maybe the maintainers can help me here. If this is important, then the inode time updates for an ext4 inode should always be through ext4_setattr() and we can clamp the timestamps there as a special case. And, this patch can be added separately?
Thanks, Deepa
On Thu, Aug 01, 2019 at 12:18:28PM -0700, Deepa Dinamani wrote:
Say you have a filesystem with s_inode_size > 128 where not all of the ondisk inodes have been upgraded to i_extra_isize > 0 and therefore don't support nanoseconds or times beyond 2038. I think this happens on ext3 filesystems that reserved extra space for inode attrs that are subsequently converted to ext4?
I'm confused about ext3 being converted to ext4. If the converted inodes have extra space, then ext4_iget() will start using the extra space when it modifies the on disk inode, won't it?i
It is possible that you can have an ext3 file system with (for example) 256 byte inodes, and all of the extra space was used for extended attributes, then ext4 won't have the extra space available. This is going toh be on an inode-by-inode basis, and if an extended attribute is motdified or deleted, the space would become available,t and then inode would start getting a higher resolution timestamp.
I really don't think it's worth worrying about that, though. It's highly unlikely ext3 file systems will be still be in service by the time it's needed in 2038. And if so, it's highly unlikely they would be converted to ext4.
- Ted
On Fri, Aug 2, 2019 at 12:43 AM Theodore Y. Ts'o tytso@mit.edu wrote:
On Thu, Aug 01, 2019 at 12:18:28PM -0700, Deepa Dinamani wrote:
Say you have a filesystem with s_inode_size > 128 where not all of the ondisk inodes have been upgraded to i_extra_isize > 0 and therefore don't support nanoseconds or times beyond 2038. I think this happens on ext3 filesystems that reserved extra space for inode attrs that are subsequently converted to ext4?
I'm confused about ext3 being converted to ext4. If the converted inodes have extra space, then ext4_iget() will start using the extra space when it modifies the on disk inode, won't it?i
It is possible that you can have an ext3 file system with (for example) 256 byte inodes, and all of the extra space was used for extended attributes, then ext4 won't have the extra space available. This is going toh be on an inode-by-inode basis, and if an extended attribute is motdified or deleted, the space would become available,t and then inode would start getting a higher resolution timestamp.
Is it correct to assume that this kind of file would have to be created using the ext3.ko file system implementation that was removed in linux-4.3, but not using ext2.ko or ext4.ko (which would always set the extended timestamps even in "-t ext2" or "-t ext3" mode)?
I tried to reproduce this on a modern kernel and with and moderately old debugfs (1.42.13) but failed.
I really don't think it's worth worrying about that, though. It's highly unlikely ext3 file systems will be still be in service by the time it's needed in 2038. And if so, it's highly unlikely they would be converted to ext4.
As the difference is easily visible even before y2038 by using utimensat(old_inode, future_date) on a file, we should at least decide what the sanest behavior is that we can easily implement, and then document what is expected to happen here.
If we check for s_min_extra_isize instead of s_inode_size to determine s_time_gran/s_time_max, we would warn at mount time as well as and consistently truncate all timestamps to full 32-bit seconds, regardless of whether there is actually space or not.
Alternatively, we could warn if s_min_extra_isize is too small, but use i_inode_size to determine s_time_gran/s_time_max anyway.
From looking at e2fsprogs git history, I see that
s_min_extra_isize has always been set by mkfs since 2008, but I'm not sure if there would have been a case in which it remains set but the ext3.ko would ignore it and use that space anyway.
Arnd
On Fri, Aug 02, 2019 at 12:39:41PM +0200, Arnd Bergmann wrote:
Is it correct to assume that this kind of file would have to be created using the ext3.ko file system implementation that was removed in linux-4.3, but not usiing ext2.ko or ext4.ko (which would always set the extended timestamps even in "-t ext2" or "-t ext3" mode)?
Correct. Some of the enterprise distro's were using ext4 to support "mount -t ext3" even before 4.3. There's a CONFIG option to enable using ext4 for ext2 or ext3 if they aren't enabled.
If we check for s_min_extra_isize instead of s_inode_size to determine s_time_gran/s_time_max, we would warn at mount time as well as and consistently truncate all timestamps to full 32-bit seconds, regardless of whether there is actually space or not.
Alternatively, we could warn if s_min_extra_isize is too small, but use i_inode_size to determine s_time_gran/s_time_max anyway.
Even with ext4, s_min_extra_isize doesn't guarantee that will be able to expand the inode. This can fail if (a) we aren't able to expand existing the transaction handle because there isn't enough space in the journal, or (b) there is already an external xattr block which is also full, so there is no space to evacuate an extended attribute out of the inode's extra space.
We could be more aggressive by trying to expand make room in the inode in ext4_iget (when we're reading in the inode, assuming the file system isn't mounted read/only), instead of in the middle of mark_inode_dirty(). That will eliminate failure mode (a) --- which is statistically rare --- but it won't eliminate failure mode (b).
Ultimately, the question is which is worse: having a timestamp be wrong, or randomly dropping an xattr from the inode to make room for the extended timestamp. We've come down on it being less harmful to have the timestamp be wrong.
But again, this is a pretty rare case. I'm not convinced it's worth stressing about, since it's going to require multiple things to go wrong before a timestamp will be bad.
- Ted
On Fri, Aug 2, 2019 at 5:43 PM Theodore Y. Ts'o tytso@mit.edu wrote:
On Fri, Aug 02, 2019 at 12:39:41PM +0200, Arnd Bergmann wrote:
Is it correct to assume that this kind of file would have to be created using the ext3.ko file system implementation that was removed in linux-4.3, but not usiing ext2.ko or ext4.ko (which would always set the extended timestamps even in "-t ext2" or "-t ext3" mode)?
Correct. Some of the enterprise distro's were using ext4 to support "mount -t ext3" even before 4.3. There's a CONFIG option to enable using ext4 for ext2 or ext3 if they aren't enabled.
If we check for s_min_extra_isize instead of s_inode_size to determine s_time_gran/s_time_max, we would warn at mount time as well as and consistently truncate all timestamps to full 32-bit seconds, regardless of whether there is actually space or not.
Alternatively, we could warn if s_min_extra_isize is too small, but use i_inode_size to determine s_time_gran/s_time_max anyway.
Even with ext4, s_min_extra_isize doesn't guarantee that will be able to expand the inode. This can fail if (a) we aren't able to expand existing the transaction handle because there isn't enough space in the journal, or (b) there is already an external xattr block which is also full, so there is no space to evacuate an extended attribute out of the inode's extra space.
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
We could be more aggressive by trying to expand make room in the inode in ext4_iget (when we're reading in the inode, assuming the file system isn't mounted read/only), instead of in the middle of mark_inode_dirty(). That will eliminate failure mode (a) --- which is statistically rare --- but it won't eliminate failure mode (b).
Ultimately, the question is which is worse: having a timestamp be wrong, or randomly dropping an xattr from the inode to make room for the extended timestamp. We've come down on it being less harmful to have the timestamp be wrong.
But again, this is a pretty rare case. I'm not convinced it's worth stressing about, since it's going to require multiple things to go wrong before a timestamp will be bad.
Agreed, I'm not overly worried about this happening frequently, I'd just feel better if we could reliably warn about the few instances that might theoretically be affected.
Arnd
On Fri, Aug 02, 2019 at 09:00:52PM +0200, Arnd Bergmann wrote:
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
When allocating new inodes, i_extra_isize is set to s_want_extra_isize. When modifying existing inodes, if i_extra_isize is less than s_min_extra_isize, then we will attempt to move out extended attribute(s) to the external xattr block. So the s_min_extra_isize field is not a guarantee, but rather an aspirationa goal. The idea is that at some point when we want to enable a new feature, which needs more extra inode space, we can adjust s_min_extra_size and s_want_extra_size, and the file system will migrate things to meet these constraints.
The plan was to teach e2fsck how to fix all of the inodes to meet theh s_min_extra_size value, but that never got implemented, and we even then, e2fsck would have to deal with the case where tit couldn't move the extended attribute(s) in the inode out, because there was no place to put them.
In practice, this hasn't been that much of a limitation because we haven't been adding that many extra inode fields. Keep in mind that Red Hat for example, has explicitly said they will *never* support adding new features to an existing file system. Their only supported method is back up the file system, reformat it with the new file system features, and then restore the file system.
Of course, if the backup/restore includes backing up the extended attributes, and then restoring them, the xattr restore could fail, unless the user also increased the inode size (e.g., from 256 bytes to 512 bytes).
Getting this right in the general case is *hard*. Fortunately, the corner cases really don't happen that often in practice, at least not for pure Linux workloads. Windows which can have arbitrarily large security id's and ACL's might make this harder, of course --- although ext4's EA in inode feature would make this better, modulo needing to write more complex file system code to handle moving xattrs around.
Since the extended timestamps were one of the first extra inode fields to be added, I strongly suggest that we not try to borrow trouble. Solving the general case problem is *hard*.
- Ted
On Aug 2, 2019, at 15:39, Theodore Y. Ts'o tytso@mit.edu wrote:
On Fri, Aug 02, 2019 at 09:00:52PM +0200, Arnd Bergmann wrote:
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
When allocating new inodes, i_extra_isize is set to s_want_extra_isize. When modifying existing inodes, if i_extra_isize is less than s_min_extra_isize, then we will attempt to move out extended attribute(s) to the external xattr block. So the s_min_extra_isize field is not a guarantee, but rather an aspirationa goal. The idea is that at some point when we want to enable a new feature, which needs more extra inode space, we can adjust s_min_extra_size and s_want_extra_size, and the file system will migrate things to meet these constraints.
The plan was to teach e2fsck how to fix all of the inodes to meet theh s_min_extra_size value, but that never got implemented,
Actually, we _did_ implement this feature for e2fsck years ago, but haven't landed it upstream for whatever reason. I don't know if I never submitted it, or I did and wasn't accepted for some reason. I definitely would be happy to get it landed.
The patch is in our "master-lustre" branch: e2fsck: add support for expanding the inode size: https://git.whamcloud.com/?p=tools/e2fsprogs.git%3Ba=commit%3Bh=ab1465f9ae2b...
And the test cases are in a separate patch: https://git.whamcloud.com/?p=tools/e2fsprogs.git%3Ba=commit%3Bh=7b8a9fdf8627...
and we even then, e2fsck would have to deal with the case where tit couldn't move the extended attribute(s) in the inode out, because there was no place to put them.
In this case, e2fsck will loop asking to abort or delete am xattr, regardless whether -n or -y is used.
In practice, this hasn't been that much of a limitation because we haven't been adding that many extra inode fields. Keep in mind that Red Hat for example, has explicitly said they will *never* support adding new features to an existing file system. Their only supported method is back up the file system, reformat it with the new file system features, and then restore the file system.
Of course, if the backup/restore includes backing up the extended attributes, and then restoring them, the xattr restore could fail, unless the user also increased the inode size (e.g., from 256 bytes to 512 bytes).
Getting this right in the general case is *hard*. Fortunately, the corner cases really don't happen that often in practice, at least not for pure Linux workloads. Windows which can have arbitrarily large security id's and ACL's might make this harder, of course --- although ext4's EA in inode feature would make this better, modulo needing to write more complex file system code to handle moving xattrs around.
Since the extended timestamps were one of the first extra inode fields to be added, I strongly suggest that we not try to borrow trouble. Solving the general case problem is *hard*.
- Ted
On Fri, Aug 2, 2019 at 11:39 PM Theodore Y. Ts'o tytso@mit.edu wrote:
On Fri, Aug 02, 2019 at 09:00:52PM +0200, Arnd Bergmann wrote:
I must have misunderstood what the field says. I expected that with s_min_extra_isize set beyond the nanosecond fields, there would be a guarantee that all inodes have at least as many extra bytes already allocated. What circumstances would lead to an i_extra_isize smaller than s_min_extra_isize?
When allocating new inodes, i_extra_isize is set to s_want_extra_isize. When modifying existing inodes, if i_extra_isize is less than s_min_extra_isize, then we will attempt to move out extended attribute(s) to the external xattr block. So the s_min_extra_isize field is not a guarantee, but rather an aspirationa goal. The idea is that at some point when we want to enable a new feature, which needs more extra inode space, we can adjust s_min_extra_size and s_want_extra_size, and the file system will migrate things to meet these constraints.
I see in the ext4 code that we always try to expand i_extra_size to s_want_extra_isize in ext4_mark_inode_dirty(), and that s_want_extra_isize is always at least s_min_extra_isize, so we constantly try to expand the inode to fit.
What I still don't see is how any inode on the file system image could have ended up with less than s_min_extra_isize in the first place if s_min_extra_isize is never modified and all inodes in the file system would have originally been created with i_extra_isize >= s_min_extra_isize if EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE is set.
Did older versions of ext4 or ext3 ignore s_min_extra_isize when creating inodes despite EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE, or is there another possibility I'm missing?
Since the extended timestamps were one of the first extra inode fields to be added, I strongly suggest that we not try to borrow trouble. Solving the general case problem is *hard*.
As I said before, I absolutely don't suggest we solve the problem of reliably setting the timestamps, I'm just trying to find out if there is a way to know for sure that it cannot happen and alert the user otherwise. So far I think we have concluded
- just checking s_inode_size is not sufficient because ext3 may have created inodes with s_extra_isize too small - checking s_min_extra_isize may not be sufficient either, for similar reasons I don't yet fully understand (see above).
If there is any other way to be sure that the file system has never been mounted as a writable ext3, maybe that can be used instead?
Arnd
On Sat, Aug 03, 2019 at 11:30:22AM +0200, Arnd Bergmann wrote:
I see in the ext4 code that we always try to expand i_extra_size to s_want_extra_isize in ext4_mark_inode_dirty(), and that s_want_extra_isize is always at least s_min_extra_isize, so we constantly try to expand the inode to fit.
Yes, we *try*. But we may not succeed. There may actually be a problem here if the cause is due to there simply is no space in the external xattr block, so we might try and try every time we try to modify that inode, and it would be a performance mess. If it's due to there being no room in the current transaction, then it's highly likely it will succeed the next time.
Did older versions of ext4 or ext3 ignore s_min_extra_isize when creating inodes despite EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE, or is there another possibility I'm missing?
s_min_extra_isize could get changed in order to make room for some new file system feature --- such as extended timestamps. That's how we might take an old ext3 file system with an inode size > 128, and try to evacuate space for extended timestamps, on a best efforts basis. And since it's best efforts is why Red Hat refuses to support that case. It'll work 99.9% of the time, but they don't want to deal with the 0.01% cases showing up at their help desk.
If you want to pretend that file systems never get upgraded, then life is much simpler. The general approach is that for less-sophisticated customers (e.g., most people running enterprise distros) file system upgrades are not a thing. But for sophisticated users, we do try to make thing work for people who are aware of the risks / caveats / rough edges. Google won't have been able to upgrade thousands and thousands of servers in data centers all over the world if we limited ourselves to Red Hat's support restrictions. Backup / reformat / restore really isn't a practical rollout strategy for many exabytes of file systems.
It sounds like your safety checks / warnings are mostly targeted at low-information customers, no?
- Ted
On Sat, Aug 3, 2019 at 6:03 PM Theodore Y. Ts'o tytso@mit.edu wrote:
On Sat, Aug 03, 2019 at 11:30:22AM +0200, Arnd Bergmann wrote:
I see in the ext4 code that we always try to expand i_extra_size to s_want_extra_isize in ext4_mark_inode_dirty(), and that s_want_extra_isize is always at least s_min_extra_isize, so we constantly try to expand the inode to fit.
Yes, we *try*. But we may not succeed. There may actually be a problem here if the cause is due to there simply is no space in the external xattr block, so we might try and try every time we try to modify that inode, and it would be a performance mess. If it's due to there being no room in the current transaction, then it's highly likely it will succeed the next time.
Did older versions of ext4 or ext3 ignore s_min_extra_isize when creating inodes despite EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE, or is there another possibility I'm missing?
s_min_extra_isize could get changed in order to make room for some new file system feature --- such as extended timestamps.
Ok, that explains it. I assumed s_min_extra_isize was meant to not be modifiable, and did not find a way to change it using the kernel or tune2fs, but now I can see that debugfs can set it.
If you want to pretend that file systems never get upgraded, then life is much simpler. The general approach is that for less-sophisticated customers (e.g., most people running enterprise distros) file system upgrades are not a thing. But for sophisticated users, we do try to make thing work for people who are aware of the risks / caveats / rough edges. Google won't have been able to upgrade thousands and thousands of servers in data centers all over the world if we limited ourselves to Red Hat's support restrictions. Backup / reformat / restore really isn't a practical rollout strategy for many exabytes of file systems.
It sounds like your safety checks / warnings are mostly targeted at low-information customers, no?
Yes, that seems like a reasonable compromise: just warn based on s_min_extra_isize, and assume that anyone who used debugfs to set s_min_extra_isize to a higher value from an ext3 file system during the migration to ext4 was aware of the risks already.
That leaves the question of what we should set the s_time_gran and s_time_max to on a superblock with s_min_extra_isize<16 and s_want_extra_isize>=16.
If we base it on s_min_extra_isize, we never try to set a timestamp later than 2038 and so will never fail, but anyone with a grandfathered s_min_extra_isize from ext3 won't be able to set extended timestamps on any files any more. Based on s_want_extra_isize we would keep the current behavior, but could add a custom warning in the ext4 code about the small s_min_extra_isize indicating a theoretical problem.
Arnd
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
The time formats for various verious is detailed in the RFCs as below:
https://tools.ietf.org/html/rfc7862(time metadata) https://tools.ietf.org/html/rfc7530:
nfstime4
struct nfstime4 { int64_t seconds; uint32_t nseconds; };
https://tools.ietf.org/html/rfc1094
struct timeval { unsigned int seconds; unsigned int useconds; };
https://tools.ietf.org/html/rfc1813
struct nfstime3 { uint32 seconds; uint32 nseconds; };
Use the limits as per the RFC.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: trond.myklebust@hammerspace.com Cc: anna.schumaker@netapp.com Cc: linux-nfs@vger.kernel.org --- fs/nfs/super.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c index f88ddac2dcdf..54eb5a47f180 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -2360,6 +2360,15 @@ void nfs_fill_super(struct super_block *sb, struct nfs_mount_info *mount_info) sb->s_flags |= SB_POSIXACL; sb->s_time_gran = 1; sb->s_export_op = &nfs_export_ops; + } else + sb->s_time_gran = 1000; + + if (server->nfs_client->rpc_ops->version != 4) { + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; + } else { + sb->s_time_min = S64_MIN; + sb->s_time_max = S64_MAX; }
nfs_initialise_sb(sb); @@ -2380,7 +2389,6 @@ static void nfs_clone_super(struct super_block *sb, sb->s_maxbytes = old_sb->s_maxbytes; sb->s_xattr = old_sb->s_xattr; sb->s_op = old_sb->s_op; - sb->s_time_gran = 1; sb->s_export_op = old_sb->s_export_op;
if (server->nfs_client->rpc_ops->version != 2) { @@ -2388,6 +2396,16 @@ static void nfs_clone_super(struct super_block *sb, * so ourselves when necessary. */ sb->s_flags |= SB_POSIXACL; + sb->s_time_gran = 1; + } else + sb->s_time_gran = 1000; + + if (server->nfs_client->rpc_ops->version != 4) { + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; + } else { + sb->s_time_min = S64_MIN; + sb->s_time_max = S64_MAX; }
nfs_initialise_sb(sb);
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Also fixed cnvrtDosUnixTm calculations to avoid int overflow while computing maximum date.
References:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-cifs/d416ff7...
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: sfrench@samba.org Cc: linux-cifs@vger.kernel.org --- fs/cifs/cifsfs.c | 22 ++++++++++++++++++++++ fs/cifs/netmisc.c | 14 +++++++------- 2 files changed, 29 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index 81a16b4e1b48..94a52a63b9ea 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -56,6 +56,15 @@ #include "dfs_cache.h" #endif
+/* + * DOS dates from 1980/1/1 through 2107/12/31 + * Protocol specifications indicate the range should be to 119, which + * limits maximum year to 2099. But this range has not been checked. + */ +#define SMB_DATE_MAX (127<<9 | 12<<5 | 31) +#define SMB_DATE_MIN (0<<9 | 1<<5 | 1) +#define SMB_TIME_MAX (23<<11 | 59<<5 | 29) + int cifsFYI = 0; bool traceSMB; bool enable_oplocks = true; @@ -142,6 +151,7 @@ cifs_read_super(struct super_block *sb) struct inode *inode; struct cifs_sb_info *cifs_sb; struct cifs_tcon *tcon; + struct timespec64 ts; int rc = 0;
cifs_sb = CIFS_SB(sb); @@ -161,6 +171,18 @@ cifs_read_super(struct super_block *sb) /* BB FIXME fix time_gran to be larger for LANMAN sessions */ sb->s_time_gran = 100;
+ if (tcon->unix_ext) { + ts = cifs_NTtimeToUnix(0); + sb->s_time_min = ts.tv_sec; + ts = cifs_NTtimeToUnix(cpu_to_le64(S64_MAX)); + sb->s_time_max = ts.tv_sec; + } else { + ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MIN), 0, 0); + sb->s_time_min = ts.tv_sec; + ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MAX), cpu_to_le16(SMB_TIME_MAX), 0); + sb->s_time_max = ts.tv_sec; + } + sb->s_magic = CIFS_MAGIC_NUMBER; sb->s_op = &cifs_super_ops; sb->s_xattr = cifs_xattr_handlers; diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index ed92958e842d..49c17ee18254 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -949,8 +949,8 @@ static const int total_days_of_prev_months[] = { struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { struct timespec64 ts; - time64_t sec; - int min, days, month, year; + time64_t sec, days; + int min, day, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time); SMB_TIME *st = (SMB_TIME *)&time; @@ -966,15 +966,15 @@ struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) sec += 60 * 60 * st->Hours; if (st->Hours > 24) cifs_dbg(VFS, "illegal hours %d\n", st->Hours); - days = sd->Day; + day = sd->Day; month = sd->Month; - if (days < 1 || days > 31 || month < 1 || month > 12) { - cifs_dbg(VFS, "illegal date, month %d day: %d\n", month, days); - days = clamp(days, 1, 31); + if (day < 1 || day > 31 || month < 1 || month > 12) { + cifs_dbg(VFS, "illegal date, month %d day: %d\n", month, day); + day = clamp(day, 1, 31); month = clamp(month, 1, 12); } month -= 1; - days += total_days_of_prev_months[month]; + days = day + total_days_of_prev_months[month]; days += 3652; /* account for difference in days between 1980 and 1970 */ year = sd->Year; days += year * 365;
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Some FAT variants indicate that the years after 2099 are not supported. Since commit 7decd1cb0305 ("fat: Fix and cleanup timestamp conversion"), we support the full range of years that can be represented, up to 2107.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: hirofumi@mail.parknet.co.jp --- fs/fat/inode.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 05689198f5af..5f04c5c810fb 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -31,6 +31,11 @@
#define KB_IN_SECTORS 2
+/* DOS dates from 1980/1/1 through 2107/12/31 */ +#define FAT_DATE_MIN (0<<9 | 1<<5 | 1) +#define FAT_DATE_MAX (127<<9 | 12<<5 | 31) +#define FAT_TIME_MAX (23<<11 | 59<<5 | 29) + /* * A deserialized copy of the on-disk structure laid out in struct * fat_boot_sector. @@ -1605,6 +1610,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, int debug; long error; char buf[50]; + struct timespec64 ts;
/* * GFP_KERNEL is ok here, because while we do hold the @@ -1698,6 +1704,12 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, sbi->free_clus_valid = 0; sbi->prev_free = FAT_START_ENT; sb->s_maxbytes = 0xffffffff; + fat_time_fat2unix(sbi, &ts, 0, cpu_to_le16(FAT_DATE_MIN), 0); + sb->s_time_min = ts.tv_sec; + + fat_time_fat2unix(sbi, &ts, cpu_to_le16(FAT_TIME_MAX), + cpu_to_le16(FAT_DATE_MAX), 0); + sb->s_time_max = ts.tv_sec;
if (!sbi->fat_length && bpb.fat32_length) { struct fat_boot_fsinfo *fsinfo;
Deepa Dinamani deepa.kernel@gmail.com writes:
+/* DOS dates from 1980/1/1 through 2107/12/31 */ +#define FAT_DATE_MIN (0<<9 | 1<<5 | 1) +#define FAT_DATE_MAX (127<<9 | 12<<5 | 31) +#define FAT_TIME_MAX (23<<11 | 59<<5 | 29)
/*
- A deserialized copy of the on-disk structure laid out in struct
- fat_boot_sector.
@@ -1605,6 +1610,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, int debug; long error; char buf[50];
- struct timespec64 ts;
/* * GFP_KERNEL is ok here, because while we do hold the @@ -1698,6 +1704,12 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, sbi->free_clus_valid = 0; sbi->prev_free = FAT_START_ENT; sb->s_maxbytes = 0xffffffff;
- fat_time_fat2unix(sbi, &ts, 0, cpu_to_le16(FAT_DATE_MIN), 0);
- sb->s_time_min = ts.tv_sec;
- fat_time_fat2unix(sbi, &ts, cpu_to_le16(FAT_TIME_MAX),
cpu_to_le16(FAT_DATE_MAX), 0);
- sb->s_time_max = ts.tv_sec;
At least, it is wrong to call fat_time_fat2unix() before setup parameters in sbi.
And please move those timestamp stuff to fat/misc.c like other fat timestamp helpers. (Maybe, provide fat_time_{min,max}() from fat/misc.c, or fat_init_time() such?).
Thanks.
On Tue, Jul 30, 2019 at 2:31 AM OGAWA Hirofumi hirofumi@mail.parknet.co.jp wrote:
Deepa Dinamani deepa.kernel@gmail.com writes:
+/* DOS dates from 1980/1/1 through 2107/12/31 */ +#define FAT_DATE_MIN (0<<9 | 1<<5 | 1) +#define FAT_DATE_MAX (127<<9 | 12<<5 | 31) +#define FAT_TIME_MAX (23<<11 | 59<<5 | 29)
/*
- A deserialized copy of the on-disk structure laid out in struct
- fat_boot_sector.
@@ -1605,6 +1610,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, int debug; long error; char buf[50];
struct timespec64 ts; /* * GFP_KERNEL is ok here, because while we do hold the
@@ -1698,6 +1704,12 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, sbi->free_clus_valid = 0; sbi->prev_free = FAT_START_ENT; sb->s_maxbytes = 0xffffffff;
fat_time_fat2unix(sbi, &ts, 0, cpu_to_le16(FAT_DATE_MIN), 0);
sb->s_time_min = ts.tv_sec;
fat_time_fat2unix(sbi, &ts, cpu_to_le16(FAT_TIME_MAX),
cpu_to_le16(FAT_DATE_MAX), 0);
sb->s_time_max = ts.tv_sec;
At least, it is wrong to call fat_time_fat2unix() before setup parameters in sbi.
All the parameters that fat_time_fat2unix() cares in sbi is accessed through
static inline int fat_tz_offset(struct msdos_sb_info *sbi) { return (sbi->options.tz_set ? -sbi->options.time_offset : sys_tz.tz_minuteswest) * SECS_PER_MIN; }
Both the sbi fields sbi->options.tz_set and sbi->options.time_offset are set by the call to parse_options(). And, parse_options() is called before the calls to fat_time_fat2unix().:
int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, void (*setup)(struct super_block *)) { <snip>
error = parse_options(sb, data, isvfat, silent, &debug, &sbi->options); if (error) goto out_fail;
<snip>
sbi->prev_free = FAT_START_ENT; sb->s_maxbytes = 0xffffffff; fat_time_fat2unix(sbi, &ts, 0, cpu_to_le16(FAT_DATE_MIN), 0); sb->s_time_min = ts.tv_sec;
fat_time_fat2unix(sbi, &ts, cpu_to_le16(FAT_TIME_MAX), cpu_to_le16(FAT_DATE_MAX), 0); sb->s_time_max = ts.tv_sec;
<snip> }
I do not see what the problem is.
-Deepa
Deepa Dinamani deepa.kernel@gmail.com writes:
At least, it is wrong to call fat_time_fat2unix() before setup parameters in sbi.
All the parameters that fat_time_fat2unix() cares in sbi is accessed through
static inline int fat_tz_offset(struct msdos_sb_info *sbi) { return (sbi->options.tz_set ? -sbi->options.time_offset : sys_tz.tz_minuteswest) * SECS_PER_MIN; }
Both the sbi fields sbi->options.tz_set and sbi->options.time_offset are set by the call to parse_options(). And, parse_options() is called before the calls to fat_time_fat2unix().:
int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, void (*setup)(struct super_block *)) { <snip>
error = parse_options(sb, data, isvfat, silent, &debug, &sbi->options); if (error) goto out_fail;
<snip>
sbi->prev_free = FAT_START_ENT; sb->s_maxbytes = 0xffffffff; fat_time_fat2unix(sbi, &ts, 0, cpu_to_le16(FAT_DATE_MIN), 0); sb->s_time_min = ts.tv_sec; fat_time_fat2unix(sbi, &ts, cpu_to_le16(FAT_TIME_MAX), cpu_to_le16(FAT_DATE_MAX), 0); sb->s_time_max = ts.tv_sec;
<snip> }
I do not see what the problem is.
Ouch, you are right. I was reading that patch wrongly, sorry.
Thanks.
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Also fix timestamp calculation to avoid overflow while converting from days to seconds.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: dsterba@suse.com --- fs/affs/amigaffs.c | 2 +- fs/affs/amigaffs.h | 3 +++ fs/affs/inode.c | 4 ++-- fs/affs/super.c | 4 ++++ 4 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/affs/amigaffs.c b/fs/affs/amigaffs.c index 14a6c1b90c9f..f708c45d5f66 100644 --- a/fs/affs/amigaffs.c +++ b/fs/affs/amigaffs.c @@ -375,7 +375,7 @@ affs_secs_to_datestamp(time64_t secs, struct affs_date *ds) u32 minute; s32 rem;
- secs -= sys_tz.tz_minuteswest * 60 + ((8 * 365 + 2) * 24 * 60 * 60); + secs -= sys_tz.tz_minuteswest * 60 + AFFS_EPOCH_DELTA; if (secs < 0) secs = 0; days = div_s64_rem(secs, 86400, &rem); diff --git a/fs/affs/amigaffs.h b/fs/affs/amigaffs.h index f9bef9056659..81fb396d4dfa 100644 --- a/fs/affs/amigaffs.h +++ b/fs/affs/amigaffs.h @@ -32,6 +32,9 @@
#define AFFS_ROOT_BMAPS 25
+/* Seconds since Amiga epoch of 1978/01/01 to UNIX */ +#define AFFS_EPOCH_DELTA ((8 * 365 + 2) * 86400LL) + struct affs_date { __be32 days; __be32 mins; diff --git a/fs/affs/inode.c b/fs/affs/inode.c index 73598bff8506..a346cf7659f1 100644 --- a/fs/affs/inode.c +++ b/fs/affs/inode.c @@ -150,10 +150,10 @@ struct inode *affs_iget(struct super_block *sb, unsigned long ino) }
inode->i_mtime.tv_sec = inode->i_atime.tv_sec = inode->i_ctime.tv_sec - = (be32_to_cpu(tail->change.days) * (24 * 60 * 60) + + = (be32_to_cpu(tail->change.days) * 86400LL + be32_to_cpu(tail->change.mins) * 60 + be32_to_cpu(tail->change.ticks) / 50 + - ((8 * 365 + 2) * 24 * 60 * 60)) + + AFFS_EPOCH_DELTA) + sys_tz.tz_minuteswest * 60; inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec = inode->i_atime.tv_nsec = 0; affs_brelse(bh); diff --git a/fs/affs/super.c b/fs/affs/super.c index e7d036efbaa1..cc463ae47c12 100644 --- a/fs/affs/super.c +++ b/fs/affs/super.c @@ -355,6 +355,10 @@ static int affs_fill_super(struct super_block *sb, void *data, int silent) sb->s_op = &affs_sops; sb->s_flags |= SB_NODIRATIME;
+ sb->s_time_gran = NSEC_PER_SEC; + sb->s_time_min = sys_tz.tz_minuteswest * 60 + AFFS_EPOCH_DELTA; + sb->s_time_max = 86400LL * U32_MAX + 86400 + sb->s_time_min; + sbi = kzalloc(sizeof(struct affs_sb_info), GFP_KERNEL); if (!sbi) return -ENOMEM;
On Mon, Jul 29, 2019 at 06:49:17PM -0700, Deepa Dinamani wrote:
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Also fix timestamp calculation to avoid overflow while converting from days to seconds.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: dsterba@suse.com
Acked-by: David Sterba dsterba@suse.com
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: hch@infradead.org --- fs/sysv/super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/sysv/super.c b/fs/sysv/super.c index d788b1daa7eb..cc8e2ed155c8 100644 --- a/fs/sysv/super.c +++ b/fs/sysv/super.c @@ -368,7 +368,8 @@ static int sysv_fill_super(struct super_block *sb, void *data, int silent) sbi->s_block_base = 0; mutex_init(&sbi->s_lock); sb->s_fs_info = sbi; - + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; sb_set_blocksize(sb, BLOCK_SIZE);
for (i = 0; i < ARRAY_SIZE(flavours) && !size; i++) { @@ -487,6 +488,8 @@ static int v7_fill_super(struct super_block *sb, void *data, int silent) sbi->s_type = FSTYPE_V7; mutex_init(&sbi->s_lock); sb->s_fs_info = sbi; + sb->s_time_min = 0; + sb->s_time_max = U32_MAX; sb_set_blocksize(sb, 512);
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
According to the disscussion in https://patchwork.kernel.org/patch/8308691/ we agreed to use unsigned 32 bit timestamps on ceph. Update the limits accordingly.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: zyan@redhat.com Cc: sage@redhat.com Cc: idryomov@gmail.com Cc: ceph-devel@vger.kernel.org --- fs/ceph/super.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/ceph/super.c b/fs/ceph/super.c index d57fa60dcd43..6cf3a4fceac1 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -981,6 +981,8 @@ static int ceph_set_super(struct super_block *s, void *data) s->s_export_op = &ceph_export_ops;
s->s_time_gran = 1000; /* 1000 ns == 1 us */ + s->s_time_min = 0; + s->s_time_max = U32_MAX;
ret = set_anon_super(s, NULL); /* what is that second arg for? */ if (ret != 0)
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Assume the limits as unsigned according to the below commit 98e8eef557a9 ("changed PVFS_time from int64_t to uint64_t") in https://github.com/waltligon/orangefs
Author: Neill Miller neillm@mcs.anl.gov Date: Thu Sep 2 15:00:38 2004 +0000
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: hubcap@omnibond.com Cc: martin@omnibond.com Cc: devel@lists.orangefs.org --- fs/orangefs/super.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/orangefs/super.c b/fs/orangefs/super.c index ee5efdc35cc1..dcd97e8158b1 100644 --- a/fs/orangefs/super.c +++ b/fs/orangefs/super.c @@ -439,6 +439,8 @@ static int orangefs_fill_sb(struct super_block *sb, sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; sb->s_maxbytes = MAX_LFS_FILESIZE; + sb->s_time_min = 0; + sb->s_time_max = S64_MAX;
ret = super_setup_bdi(sb); if (ret)
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Also change the local_to_gmt() to use time64_t instead of time32_t.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: mikulas@artax.karlin.mff.cuni.cz --- fs/hpfs/hpfs_fn.h | 6 ++---- fs/hpfs/super.c | 2 ++ 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/hpfs/hpfs_fn.h b/fs/hpfs/hpfs_fn.h index ab2e7cc2ff33..1cca83218fb5 100644 --- a/fs/hpfs/hpfs_fn.h +++ b/fs/hpfs/hpfs_fn.h @@ -334,7 +334,7 @@ long hpfs_ioctl(struct file *file, unsigned cmd, unsigned long arg); * local time (HPFS) to GMT (Unix) */
-static inline time64_t local_to_gmt(struct super_block *s, time32_t t) +static inline time64_t local_to_gmt(struct super_block *s, time64_t t) { extern struct timezone sys_tz; return t + sys_tz.tz_minuteswest * 60 + hpfs_sb(s)->sb_timeshift; @@ -343,9 +343,7 @@ static inline time64_t local_to_gmt(struct super_block *s, time32_t t) static inline time32_t gmt_to_local(struct super_block *s, time64_t t) { extern struct timezone sys_tz; - t = t - sys_tz.tz_minuteswest * 60 - hpfs_sb(s)->sb_timeshift; - - return clamp_t(time64_t, t, 0, U32_MAX); + return t - sys_tz.tz_minuteswest * 60 - hpfs_sb(s)->sb_timeshift; }
static inline time32_t local_get_seconds(struct super_block *s) diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c index 9db6d84f0d62..0a677a9aaf34 100644 --- a/fs/hpfs/super.c +++ b/fs/hpfs/super.c @@ -614,6 +614,8 @@ static int hpfs_fill_super(struct super_block *s, void *options, int silent) s->s_magic = HPFS_SUPER_MAGIC; s->s_op = &hpfs_sops; s->s_d_op = &hpfs_dentry_operations; + s->s_time_min = local_to_gmt(s, 0); + s->s_time_max = local_to_gmt(s, U32_MAX);
sbi->sb_root = le32_to_cpu(superblock->root); sbi->sb_fs_size = le32_to_cpu(superblock->n_sectors);
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: me@bobcopeland.com Cc: linux-karma-devel@lists.sourceforge.net --- fs/omfs/inode.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c index 08226a835ec3..b76ec6b88ded 100644 --- a/fs/omfs/inode.c +++ b/fs/omfs/inode.c @@ -478,6 +478,10 @@ static int omfs_fill_super(struct super_block *sb, void *data, int silent)
sb->s_maxbytes = 0xffffffff;
+ sb->s_time_gran = NSEC_PER_MSEC; + sb->s_time_min = 0; + sb->s_time_max = U64_MAX / MSEC_PER_SEC; + sb_set_blocksize(sb, 0x200);
bh = sb_bread(sb, 0);
On Mon, Jul 29, 2019 at 06:49:22PM -0700, Deepa Dinamani wrote:
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: me@bobcopeland.com Cc: linux-karma-devel@lists.sourceforge.net
fs/omfs/inode.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c index 08226a835ec3..b76ec6b88ded 100644 --- a/fs/omfs/inode.c +++ b/fs/omfs/inode.c @@ -478,6 +478,10 @@ static int omfs_fill_super(struct super_block *sb, void *data, int silent) sb->s_maxbytes = 0xffffffff;
- sb->s_time_gran = NSEC_PER_MSEC;
- sb->s_time_min = 0;
- sb->s_time_max = U64_MAX / MSEC_PER_SEC;
I honestly don't know if it should be s64 rather than u64, but considering that none of the devices with this filesystem ever exposed dates to users in the negative era, it should be fine.
Acked-by: Bob Copeland me@bobcopeland.com
Also update the gran since pstore has microsecond granularity.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: anton@enomsg.org Cc: ccross@android.com Cc: keescook@chromium.org Cc: tony.luck@intel.com --- fs/pstore/inode.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c index 89a80b568a17..ee752f9fda57 100644 --- a/fs/pstore/inode.c +++ b/fs/pstore/inode.c @@ -388,7 +388,9 @@ static int pstore_fill_super(struct super_block *sb, void *data, int silent) sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = PSTOREFS_MAGIC; sb->s_op = &pstore_ops; - sb->s_time_gran = 1; + sb->s_time_gran = NSEC_PER_USEC; + sb->s_time_min = S64_MIN; + sb->s_time_max = S64_MAX;
parse_options(data);
On Mon, Jul 29, 2019 at 06:49:23PM -0700, Deepa Dinamani wrote:
Also update the gran since pstore has microsecond granularity.
So, I'm fine with this, but technically the granularity depends on the backend storage... many have no actual time keeping, though. My point is, pstore's timestamps are really mostly a lie, but the most common backend (ramoops) is seconds-granularity.
So, I'm fine with this, but it's a lie but it's a lie that doesn't matter, so ...
Acked-by: Kees Cook keescook@chromium.org
I'm open to suggestions to improve it...
-Kees
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com Cc: anton@enomsg.org Cc: ccross@android.com Cc: keescook@chromium.org Cc: tony.luck@intel.com
fs/pstore/inode.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c index 89a80b568a17..ee752f9fda57 100644 --- a/fs/pstore/inode.c +++ b/fs/pstore/inode.c @@ -388,7 +388,9 @@ static int pstore_fill_super(struct super_block *sb, void *data, int silent) sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = PSTOREFS_MAGIC; sb->s_op = &pstore_ops;
- sb->s_time_gran = 1;
- sb->s_time_gran = NSEC_PER_USEC;
- sb->s_time_min = S64_MIN;
- sb->s_time_max = S64_MAX;
parse_options(data); -- 2.17.1
On Tue, Jul 30, 2019 at 6:31 AM Kees Cook keescook@chromium.org wrote:
On Mon, Jul 29, 2019 at 06:49:23PM -0700, Deepa Dinamani wrote:
Also update the gran since pstore has microsecond granularity.
So, I'm fine with this, but technically the granularity depends on the backend storage... many have no actual time keeping, though. My point is, pstore's timestamps are really mostly a lie, but the most common backend (ramoops) is seconds-granularity.
So, I'm fine with this, but it's a lie but it's a lie that doesn't matter, so ...
Acked-by: Kees Cook keescook@chromium.org
I'm open to suggestions to improve it...
If we don't care about using sub-second granularity, then setting it to one second unconditionally here will make it always use that and report it correctly.
Arnd
On Tue, Jul 30, 2019 at 12:36 AM Arnd Bergmann arnd@arndb.de wrote:
On Tue, Jul 30, 2019 at 6:31 AM Kees Cook keescook@chromium.org wrote:
On Mon, Jul 29, 2019 at 06:49:23PM -0700, Deepa Dinamani wrote:
Also update the gran since pstore has microsecond granularity.
So, I'm fine with this, but technically the granularity depends on the backend storage... many have no actual time keeping, though. My point is, pstore's timestamps are really mostly a lie, but the most common backend (ramoops) is seconds-granularity.
So, I'm fine with this, but it's a lie but it's a lie that doesn't matter, so ...
Acked-by: Kees Cook keescook@chromium.org
I'm open to suggestions to improve it...
If we don't care about using sub-second granularity, then setting it to one second unconditionally here will make it always use that and report it correctly.
Should this printf in ramoops_write_kmsg_hdr() also be fixed then?
RAMOOPS_KERNMSG_HDR "%lld.%06lu-%c\n", (time64_t)record->time.tv_sec, record->time.tv_nsec / 1000, record->compressed ? 'C' : 'D'); persistent_ram_write(prz, hdr, len);
ramoops_read_kmsg_hdr() doesn't read this as microseconds. Seems like a mismatch from above.
If we want to agree that we just want seconds granularity for pstore, we could replace the tv_nsec part to be all 0's if anybody else is depending on this format. I could drop this patch from the series and post that patch seperately.
Thanks, -Deepa
On Fri, Aug 2, 2019 at 4:26 AM Deepa Dinamani deepa.kernel@gmail.com wrote:
On Tue, Jul 30, 2019 at 12:36 AM Arnd Bergmann arnd@arndb.de wrote:
On Tue, Jul 30, 2019 at 6:31 AM Kees Cook keescook@chromium.org wrote:
On Mon, Jul 29, 2019 at 06:49:23PM -0700, Deepa Dinamani wrote:
Also update the gran since pstore has microsecond granularity.
So, I'm fine with this, but technically the granularity depends on the backend storage... many have no actual time keeping, though. My point is, pstore's timestamps are really mostly a lie, but the most common backend (ramoops) is seconds-granularity.
So, I'm fine with this, but it's a lie but it's a lie that doesn't matter, so ...
Acked-by: Kees Cook keescook@chromium.org
I'm open to suggestions to improve it...
If we don't care about using sub-second granularity, then setting it to one second unconditionally here will make it always use that and report it correctly.
Should this printf in ramoops_write_kmsg_hdr() also be fixed then?
RAMOOPS_KERNMSG_HDR "%lld.%06lu-%c\n", (time64_t)record->time.tv_sec, record->time.tv_nsec / 1000, record->compressed ? 'C' : 'D'); persistent_ram_write(prz, hdr, len);
ramoops_read_kmsg_hdr() doesn't read this as microseconds. Seems like a mismatch from above.
Good catch. This seems to go back to commit 3f8f80f0cfeb ("pstore/ram: Read and write to the 'compressed' flag of pstore"), which introduced the nanosecond read. The write function however has always used microseconds, and that was kept when the implementation changed from timeval to timespec in commit 1e817fb62cd1 ("time: create __getnstimeofday for WARNless calls").
If we want to agree that we just want seconds granularity for pstore, we could replace the tv_nsec part to be all 0's if anybody else is depending on this format. I could drop this patch from the series and post that patch seperately.
We should definitely fix it to not produce a bogus nanosecond value. Whether using full seconds or microsecond resolution is better here, I don't know. It seems that pstore records generally get created with a nanosecond nanosecond accurate timestamp from ktime_get_real_fast_ns() and then truncated to the resolution of the backend, rather than the normal jiffies-accurate inode timestamps that we have for regular file systems.
This might mean that we do want the highest possible resolution and not further truncate here, in case that information ends up being useful afterwards.
Arnd
On Fri, Aug 2, 2019 at 12:15 AM Arnd Bergmann arnd@arndb.de wrote:
On Fri, Aug 2, 2019 at 4:26 AM Deepa Dinamani deepa.kernel@gmail.com wrote:
On Tue, Jul 30, 2019 at 12:36 AM Arnd Bergmann arnd@arndb.de wrote:
On Tue, Jul 30, 2019 at 6:31 AM Kees Cook keescook@chromium.org wrote:
On Mon, Jul 29, 2019 at 06:49:23PM -0700, Deepa Dinamani wrote:
Also update the gran since pstore has microsecond granularity.
So, I'm fine with this, but technically the granularity depends on the backend storage... many have no actual time keeping, though. My point is, pstore's timestamps are really mostly a lie, but the most common backend (ramoops) is seconds-granularity.
So, I'm fine with this, but it's a lie but it's a lie that doesn't matter, so ...
Acked-by: Kees Cook keescook@chromium.org
I'm open to suggestions to improve it...
If we don't care about using sub-second granularity, then setting it to one second unconditionally here will make it always use that and report it correctly.
Should this printf in ramoops_write_kmsg_hdr() also be fixed then?
RAMOOPS_KERNMSG_HDR "%lld.%06lu-%c\n", (time64_t)record->time.tv_sec, record->time.tv_nsec / 1000, record->compressed ? 'C' : 'D'); persistent_ram_write(prz, hdr, len);
ramoops_read_kmsg_hdr() doesn't read this as microseconds. Seems like a mismatch from above.
Good catch. This seems to go back to commit 3f8f80f0cfeb ("pstore/ram: Read and write to the 'compressed' flag of pstore"), which introduced the nanosecond read. The write function however has always used microseconds, and that was kept when the implementation changed from timeval to timespec in commit 1e817fb62cd1 ("time: create __getnstimeofday for WARNless calls").
If we want to agree that we just want seconds granularity for pstore, we could replace the tv_nsec part to be all 0's if anybody else is depending on this format. I could drop this patch from the series and post that patch seperately.
We should definitely fix it to not produce a bogus nanosecond value. Whether using full seconds or microsecond resolution is better here, I don't know. It seems that pstore records generally get created with a nanosecond nanosecond accurate timestamp from ktime_get_real_fast_ns() and then truncated to the resolution of the backend, rather than the normal jiffies-accurate inode timestamps that we have for regular file systems.
This might mean that we do want the highest possible resolution and not further truncate here, in case that information ends up being useful afterwards.
I made a list of granularities used by pstore drivers using pstore_register():
1. efi - seconds 2. ramoops - microsecond 3. erst - seconds 4. powerpc/nvram64 - seconds
I will leave pstore granularity at nanoseconds and fix the ramoops read.
-Deepa
Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range.
Reference: http://www.ecma-international.org/publications/standards/Ecma-119.htm
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/isofs/inode.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c index 9e30d8703735..62c0462dc89f 100644 --- a/fs/isofs/inode.c +++ b/fs/isofs/inode.c @@ -30,6 +30,9 @@ #include "isofs.h" #include "zisofs.h"
+/* max tz offset is 13 hours */ +#define MAX_TZ_OFFSET (52*15*60) + #define BEQUIET
static int isofs_hashi(const struct dentry *parent, struct qstr *qstr); @@ -801,6 +804,10 @@ static int isofs_fill_super(struct super_block *s, void *data, int silent) */ s->s_maxbytes = 0x80000000000LL;
+ /* ECMA-119 timestamp from 1900/1/1 with tz offset */ + s->s_time_min = mktime64(1900, 1, 1, 0, 0, 0) - MAX_TZ_OFFSET; + s->s_time_max = mktime64(U8_MAX+1900, 12, 31, 23, 59, 59) + MAX_TZ_OFFSET; + /* Set this for reference. Its not currently used except on write which we don't have .. */