Introduction
This is a follow on to the series: https://lkml.org/lkml/2016/1/7/20 [1]. This is aimed at reaching a consensus on how to transition the vfs timestamps to use 64 bit time. This demonstrates three ways (2a, 2b and 2c) of solving this problem. Each of the proposals has its own cover letter that explains the individual approach. Proposals 2b and 2c also outline variant approaches which are similar to the respective proposals. This drives the proposal count to 5. All the changes have been discussed with Arnd Bergmann, who posted the original series: https://lkml.org/lkml/2014/5/30/669 [2]
The series has been simplified to include only the 64 bit timestamp changes as per Dave Chinner’s suggestion.
Motivation
The problem is how to change the vfs inode timestamps to use 64 bit times to overcome the 2038 problem.
Below table [3] gives an overview of the extent/ type of changes needed of changes needed. The series is aimed at obtaining small manageable patches for all the cases in [3].
Table [3] Terminology: vfs_time – data type of timestamps used in the vfs layer. Access type # of instances
1. timespec_*(struct vfs_time, struct timespec) / 34 timespec_*(struct vfs_time, struct vfs_time)
2. struct vfs_time = struct vfs_time 50
3. vfs_time = current_fs_time/ CURRENT_TIME/ CURRENT_TIME_SEC 312
4. setattr vfs_time assignments 141
5. vfs_time = other data types, outside of setattr() (timespec, s32, s64..) 74
6. other data types, outside of getattr() (timespec, s32, s64..) = vfs_time 85
7. internal individual fs funtions using inode timestamps as args 80
8. extra timestamp fields in individual filesystems ~10
9. VFS callback - int (*update_time)(struct inode *, struct timespec *, int) 3
10. VFS function - void lease_get_mtime(struct inode *inode, struct timespec *time) 3
Each series is used to demonstrate how each of the above cases is solved using their respective approaches. The example filesystems (btrfs, xfs, cifs, and ceph) were selected in such a way so as to showcase all these issues in table [3].
Source Tree
The tree is hosted at github.com/deepa-hub/vfs.git
The branches for the three approaches are
2a. https://github.com/deepa-hub/vfs.git refs/heads/vfs_time 2b. https://github.com/deepa-hub/vfs.git refs/heads/vfs_time_to_timespec 2c. https://github.com/deepa-hub/vfs.git refs/heads/vfs_time_to_ts64
All the above series are based off of: https://lkml.org/lkml/2016/2/3/34 [4] and a couple of other patches.
Only the minimal changes are posted here to keep the series simple.
There are a couple of bug fixes like data type conversion bugs that will be sent directly to the corresponding filesystem lists.
Next steps
The approaches 2a, 2b and 2c are posted as responses to this cover letter.
Testing
All the approaches have been compile tested only.
Introduction
The series is one of the proposals on how to transition VFS timestamps to use 64 bit time. This is the inode_timespec idea proposed in the original RFC series. The type name has been changed to vfs_time based on Dave Chinner’s suggestion.
Solution
This series defines a new type name for vfs timestamps: vfs_time. All the individual file systems will use struct vfs_time to access vfs timestamps. Current time APIs also return vfs times and are considered to be part of vfs as they are exclusively used for inode timestamps.
These are the steps involved:
1.Define vfs_time as an alias to timespec 2.Change individual filesystems to use only vfs_time data type for timestamps. Make sure each filesystem will be safe when converted to use timespec64 3.Change vfs_time to be an alias for timespec64. 4.Change all filesystems to use timespec64 directly.
Note that the series only includes patches for steps 1 and 2.
Concerns
1.Before the vfs layer is done, it might be confusing to developers as individual filesystems now will have two options for access: timespec and vfs_time type.
Deepa Dinamani (12): vfs: Add vfs_time abstractions fs: cifs: Change cifs_fscache_inode_auxdata to use vfs_time data type fs: cifs: Change cifs_fattr timestamps data type to vfs_time fs: cifs: Make cnvrtDosUnixTm() y2038 safe fs: cifs: Use vfs_time_get_real_* time functions fs: btrfs: Change btrfs_inode.i_otime to use vfs_time data type fs: btrfs: Use vfs_time data type for btrfs_update_time() fs: btrfs: Change timespec data types to use vfs_time fs: ceph: Change encode and decode functions to use vfs_time fs: ceph: Replace timespec data type with vfs_time net: ceph: use vfs_time data type instead of timespec fs: xfs: change inode times to use vfs_time data type
fs/btrfs/btrfs_inode.h | 2 +- fs/btrfs/file.c | 6 +++--- fs/btrfs/inode.c | 2 +- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/root-tree.c | 2 +- fs/btrfs/transaction.c | 2 +- fs/ceph/cache.c | 2 +- fs/ceph/caps.c | 6 +++--- fs/ceph/dir.c | 4 ++-- fs/ceph/file.c | 6 +++--- fs/ceph/inode.c | 32 ++++++++++++++++---------------- fs/ceph/mds_client.c | 2 +- fs/ceph/mds_client.h | 2 +- fs/ceph/super.h | 8 ++++---- fs/ceph/xattr.c | 2 +- fs/cifs/cache.c | 4 ++-- fs/cifs/cifsencrypt.c | 4 ++-- fs/cifs/cifsglob.h | 6 +++--- fs/cifs/cifsproto.h | 6 +++--- fs/cifs/cifssmb.c | 10 +++++----- fs/cifs/inode.c | 2 +- fs/cifs/netmisc.c | 15 ++++++++------- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_iops.c | 2 +- fs/xfs/xfs_trans_inode.c | 6 +++--- include/linux/ceph/decode.h | 11 ++++++----- include/linux/ceph/messenger.h | 1 + include/linux/ceph/osd_client.h | 4 ++-- include/linux/fs.h | 19 +++++++++++++++++++ net/ceph/osd_client.c | 2 +- 30 files changed, 99 insertions(+), 77 deletions(-)
Add vfs_time aliases to help convert vfs timestamps to use 64 bit times. These create an abstraction layer so that vfs inode times can be switched to use struct timespec64 instead of struct timespec.
Use uapi exposed data types, timespec and timespec64 here to keep minimal timestamp data type conversions in API's interfacing with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- include/linux/fs.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h index 4af612f..ec25603 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1412,6 +1412,25 @@ static inline struct timespec current_fs_time_sec(struct super_block *sb) return (struct timespec) { get_seconds(), 0 }; }
+/* Place holder defines until it is safe to use timespec64 + * in the vfs layer. + * timespec64 data type and functions will be used at that + * time directly by all filesystems and these defines will be deleted. + */ +#define vfs_time timespec + +#define vfs_time_compare timespec_compare +#define vfs_time_equal timespec_equal +#define vfs_time_add timespec_add + +/* This returns current time in a timespec/seconds format to match vfs + * timestamp data type. + * This is for timestamps that use the same functions + * as vfs_time but have nothing to do with inode timestamps. + */ +#define vfs_get_real_ts ktime_get_real_ts +#define vfs_get_real_seconds get_seconds + /* * Snapshotting support. */
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
aux data timestamps are only used to read in inode timestamps. Hence, they need to change data type along with vfs times.
struct timespec64 is the same as struct timespec on 64 bit systems. So it is a no-op on 64 bit systems.
The buffer length(datalen) passed in to read in auxdata in cifs_fscache_inode_get_aux() and cifs_fscache_inode_check_aux() should be big enough for the data type change from struct timespec to struct timespec64 to be safe on 32 bit systems.
Following provide support for safe usage on 32 bit systems: 1. datalen already accounts for struct timespec64 on 64 bit systems. 2. datalen passed in is a constant with sufficient space to accommodate auxdata. 3. The keylen subtracted from datalen is a constant and also leaves in sufficient space for increase.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cache.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c index 6c665bf..f0892ac 100644 --- a/fs/cifs/cache.c +++ b/fs/cifs/cache.c @@ -221,8 +221,8 @@ const struct fscache_cookie_def cifs_fscache_super_index_def = { * Auxiliary data attached to CIFS inode within the cache */ struct cifs_fscache_inode_auxdata { - struct timespec last_write_time; - struct timespec last_change_time; + struct vfs_time last_write_time; + struct vfs_time last_change_time; u64 eof; };
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
This is necessary because we are matching cifs_fattr data type to that of the vfs timestamps.
This change is safe because cifs_fattr is accessed in following ways:
1. Assigning to/ from/ compared with struct inode timestamps, which will match vfs_time alias. 2. Assigned from cifs_NTtimeToUnix(). And, this function does not care about the data type sizes of tv_sec and tv_nsec fields. 3. Assigned from current_fs_time(), which will match vfs_time alias. 4. Assigned from cnvrtDosUnixTm(). And, this function does not care about the data type sizes of tv_sec and tv_nsec fields.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cifsencrypt.c | 2 +- fs/cifs/cifsglob.h | 6 +++--- fs/cifs/cifsproto.h | 6 +++--- fs/cifs/cifssmb.c | 4 ++-- fs/cifs/inode.c | 2 +- fs/cifs/netmisc.c | 10 +++++----- 6 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index f86e07d..ee1e674 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -460,7 +460,7 @@ find_timestamp(struct cifs_ses *ses) unsigned char *blobptr; unsigned char *blobend; struct ntlmssp2_name *attrptr; - struct timespec ts; + struct vfs_time ts;
if (!ses->auth_key.len || !ses->auth_key.response) return 0; diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index a25b251..b08749e 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1393,9 +1393,9 @@ struct cifs_fattr { dev_t cf_rdev; unsigned int cf_nlink; unsigned int cf_dtype; - struct timespec cf_atime; - struct timespec cf_mtime; - struct timespec cf_ctime; + struct vfs_time cf_atime; + struct vfs_time cf_mtime; + struct vfs_time cf_ctime; };
static inline void free_dfs_info_param(struct dfs_info3_param *param) diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index eed7ff5..93e9638 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -126,9 +126,9 @@ extern enum securityEnum select_sectype(struct TCP_Server_Info *server, enum securityEnum requested); extern int CIFS_SessSetup(const unsigned int xid, struct cifs_ses *ses, const struct nls_table *nls_cp); -extern struct timespec cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); -extern u64 cifs_UnixTimeToNT(struct timespec); -extern struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, +extern struct vfs_time cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); +extern u64 cifs_UnixTimeToNT(struct vfs_time); +extern struct vfs_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset); extern void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock); extern int cifs_get_writer(struct cifsInodeInfo *cinode); diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 1a9e43d..b3b3fa4 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -478,7 +478,7 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr) * this requirement. */ int val, seconds, remain, result; - struct timespec ts; + struct vfs_time ts; unsigned long utc = get_seconds(); ts = cnvrtDosUnixTm(rsp->SrvTime.Date, rsp->SrvTime.Time, 0); @@ -4000,7 +4000,7 @@ QInfRetry: if (rc) { cifs_dbg(FYI, "Send error in QueryInfo = %d\n", rc); } else if (data) { - struct timespec ts; + struct vfs_time ts; __u32 time = le32_to_cpu(pSMBr->last_write_time);
/* decode response */ diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index fa72359..39068b4 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -110,7 +110,7 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) }
/* revalidate if mtime or size have changed */ - if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) && + if (vfs_time_equal(&inode->i_mtime, &fattr->cf_mtime) && cifs_i->server_eof == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", __func__, cifs_i->uniqueid); diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index abae6dd..86bc0e5 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -918,10 +918,10 @@ smbCalcSize(void *buf) * Convert the NT UTC (based 1601-01-01, in hundred nanosecond units) * into Unix UTC (based 1970-01-01, in seconds). */ -struct timespec +struct vfs_time cifs_NTtimeToUnix(__le64 ntutc) { - struct timespec ts; + struct vfs_time ts; /* BB what about the timezone? BB */
/* Subtract the NTFS time offset, then convert to 1s intervals. */ @@ -949,7 +949,7 @@ cifs_NTtimeToUnix(__le64 ntutc)
/* Convert the Unix UTC into NT UTC. */ u64 -cifs_UnixTimeToNT(struct timespec t) +cifs_UnixTimeToNT(struct vfs_time t) { /* Convert to 100ns intervals and then add the NTFS time offset. */ return (u64) t.tv_sec * 10000000 + t.tv_nsec/100 + NTFS_TIME_OFFSET; @@ -959,9 +959,9 @@ static const int total_days_of_prev_months[] = { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
-struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) +struct vfs_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { - struct timespec ts; + struct vfs_time ts; int sec, min, days, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time);
The seconds calculated from the server(DOS format) cannot be saved in int data type as this cannot save seconds since epoch after the year 2038.
Use long long for seconds field in cnvrtDosUnixTm(). This will help represent 64 bit time even on 32 bit systems.
Note that even though the theoretical max on DOS times is 2107, its APIs only support until the year 2099. This means we can get away with 32 bit unsigned sec field. But, the sec field uses long long to maintain uniformity in the kernel as the theoretical max is used everywhere else.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/netmisc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index 86bc0e5..52b41bd 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -962,7 +962,8 @@ static const int total_days_of_prev_months[] = { struct vfs_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { struct vfs_time ts; - int sec, min, days, month, year; + long long sec; + int min, days, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time); SMB_TIME *st = (SMB_TIME *)&time; @@ -973,7 +974,7 @@ struct vfs_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) sec = 2 * st->TwoSeconds; min = st->Minutes; if ((sec > 59) || (min > 59)) - cifs_dbg(VFS, "illegal time min %d sec %d\n", min, sec); + cifs_dbg(VFS, "illegal time min %d sec %lld\n", min, sec); sec += (min * 60); sec += 60 * 60 * st->Hours; if (st->Hours > 24)
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
Variables in the patch are passed into functions that also take inode times as argument. Hence, using vfs_time aliases here is necessary so that they can change to use 64 bit time along with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cifsencrypt.c | 2 +- fs/cifs/cifssmb.c | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index ee1e674..2b606e2 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -485,7 +485,7 @@ find_timestamp(struct cifs_ses *ses) blobptr += attrsize; /* advance attr value */ }
- ktime_get_real_ts(&ts); + vfs_get_real_ts(&ts); return cpu_to_le64(cifs_UnixTimeToNT(ts)); }
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index b3b3fa4..4c1deae 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -479,11 +479,11 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr) */ int val, seconds, remain, result; struct vfs_time ts; - unsigned long utc = get_seconds(); + unsigned long long utc = vfs_get_real_seconds(); ts = cnvrtDosUnixTm(rsp->SrvTime.Date, rsp->SrvTime.Time, 0); - cifs_dbg(FYI, "SrvTime %d sec since 1970 (utc: %d) diff: %d\n", - (int)ts.tv_sec, (int)utc, + cifs_dbg(FYI, "SrvTime %lld sec since 1970 (utc: %lld) diff: %d\n", + (long long)ts.tv_sec, (long long)utc, (int)(utc - ts.tv_sec)); val = (int)(utc - ts.tv_sec); seconds = abs(val);
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times. struct btrfs_inode is the in memory inode structure for btrfs. i_otime is a member of the btrfs_inode that represents file creation times.
Like all the other inode timestamps in struct inode, i_otime is assigned to/ from disk or struct inode times or 0.
Hence, i_otime needs to change along with other inode timestamps. Use vfs_time data type to make the transition to 64 bit time along with the other inode timestamps.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/btrfs_inode.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 61205e3..e45b7af 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -190,7 +190,7 @@ struct btrfs_inode { struct btrfs_delayed_node *delayed_node;
/* File creation time. */ - struct timespec i_otime; + struct vfs_time i_otime;
/* Hook into fs_info->delayed_iputs */ struct list_head delayed_iput;
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
This is set as a vfs callback function for inode operations. This accepts inode timestamp as an argument. And, needs to switch to 64 bit time representation along with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 59c0e22..6f0417b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5942,7 +5942,7 @@ static int btrfs_dirty_inode(struct inode *inode) * This is a copy of file_update_time. We need this so we can return error on * ENOSPC for updating the inode in the case of file write and mmap writes. */ -static int btrfs_update_time(struct inode *inode, struct timespec *now, +static int btrfs_update_time(struct inode *inode, struct vfs_time *now, int flags) { struct btrfs_root *root = BTRFS_I(inode)->root;
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times. The following needs to switch along with vfs time representation.
1. inode times set/get. 2. For inode times comparison. 3. getting times from current_fs_time()
btrfs_timespec already uses 64 bits to represent seconds in timestamps. The switch to 64 bit using vfs_time will allow for timestamps beyond 2038 to be represented corretly.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/file.c | 6 +++--- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/root-tree.c | 2 +- fs/btrfs/transaction.c | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 610f569..f7d1e14 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1732,16 +1732,16 @@ out:
static void update_time_for_write(struct inode *inode) { - struct timespec now; + struct vfs_time now;
if (IS_NOCMTIME(inode)) return;
now = current_fs_time(inode->i_sb); - if (!timespec_equal(&inode->i_mtime, &now)) + if (!vfs_time_equal(&inode->i_mtime, &now)) inode->i_mtime = now;
- if (!timespec_equal(&inode->i_ctime, &now)) + if (!vfs_time_equal(&inode->i_ctime, &now)) inode->i_ctime = now;
if (IS_I_VERSION(inode)) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 6f35d9c..471037f 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -443,7 +443,7 @@ static noinline int create_subvol(struct inode *dir, struct btrfs_root *root = BTRFS_I(dir)->root; struct btrfs_root *new_root; struct btrfs_block_rsv block_rsv; - struct timespec cur_time = current_fs_time(dir->i_sb); + struct vfs_time cur_time = current_fs_time(dir->i_sb); struct inode *inode; int ret; int err; @@ -4956,7 +4956,7 @@ static long _btrfs_ioctl_set_received_subvol(struct file *file, struct btrfs_root *root = BTRFS_I(inode)->root; struct btrfs_root_item *root_item = &root->root_item; struct btrfs_trans_handle *trans; - struct timespec ct = current_fs_time(inode->i_sb); + struct vfs_time ct = current_fs_time(inode->i_sb); int ret = 0; int received_uuid_changed;
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c index a25f3b2..0a309f6b 100644 --- a/fs/btrfs/root-tree.c +++ b/fs/btrfs/root-tree.c @@ -488,7 +488,7 @@ void btrfs_update_root_times(struct btrfs_trans_handle *trans, struct btrfs_root *root) { struct btrfs_root_item *item = &root->root_item; - struct timespec ct = current_fs_time(root->fs_info->sb); + struct vfs_time ct = current_fs_time(root->fs_info->sb);
spin_lock(&root->root_item_lock); btrfs_set_root_ctransid(item, trans->transid); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 37562d6..5481ee0 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1333,7 +1333,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, struct dentry *dentry; struct extent_buffer *tmp; struct extent_buffer *old; - struct timespec cur_time; + struct vfs_time cur_time; int ret = 0; u64 to_reserve = 0; u64 index = 0;
Encode is used for iattr times, inode times and current filesystem times. Decode is mostly used for inode times. Hence, these need to use vfs_time to switch to 64 bit times along with vfs.
Decode is also used for keepalive times and processing authentication tickets. Since inode times also use same functions it is necessary to use vfs_time here as well.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- include/linux/ceph/decode.h | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/linux/ceph/decode.h b/include/linux/ceph/decode.h index 633a130..8139ed9 100644 --- a/include/linux/ceph/decode.h +++ b/include/linux/ceph/decode.h @@ -4,6 +4,7 @@ #include <linux/err.h> #include <linux/bug.h> #include <linux/time.h> +#include <linux/fs.h> #include <asm/unaligned.h>
#include <linux/ceph/types.h> @@ -132,16 +133,16 @@ bad: }
/* - * struct ceph_timespec <-> struct timespec + * struct ceph_timespec <-> struct vfs_time */ -static inline void ceph_decode_timespec(struct timespec *ts, +static inline void ceph_decode_timespec(struct vfs_time *ts, const struct ceph_timespec *tv) { - ts->tv_sec = (__kernel_time_t)le32_to_cpu(tv->tv_sec); - ts->tv_nsec = (long)le32_to_cpu(tv->tv_nsec); + ts->tv_sec = (s64)(u32)le32_to_cpu(tv->tv_sec); + ts->tv_nsec = (long)(u32)le32_to_cpu(tv->tv_nsec); } static inline void ceph_encode_timespec(struct ceph_timespec *tv, - const struct timespec *ts) + const struct vfs_time *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec);
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
Use vfs_time data type for accessing inode timestamps. This is necessary as these accesses need to switch to using 64 bit timestamps along with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/ceph/cache.c | 2 +- fs/ceph/caps.c | 6 +++--- fs/ceph/dir.c | 4 ++-- fs/ceph/file.c | 6 +++--- fs/ceph/inode.c | 32 ++++++++++++++++---------------- fs/ceph/mds_client.c | 2 +- fs/ceph/mds_client.h | 2 +- fs/ceph/super.h | 8 ++++---- fs/ceph/xattr.c | 2 +- 9 files changed, 32 insertions(+), 32 deletions(-)
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c index a351480..4b5b2da 100644 --- a/fs/ceph/cache.c +++ b/fs/ceph/cache.c @@ -25,7 +25,7 @@ #include "cache.h"
struct ceph_aux_inode { - struct timespec mtime; + struct vfs_time mtime; loff_t size; };
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cdbf8cf..f407819 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -990,7 +990,7 @@ static int send_cap_msg(struct ceph_mds_session *session, int caps, int wanted, int dirty, u32 seq, u64 flush_tid, u64 oldest_flush_tid, u32 issue_seq, u32 mseq, u64 size, u64 max_size, - struct timespec *mtime, struct timespec *atime, + struct vfs_time *mtime, struct vfs_time *atime, u64 time_warp_seq, kuid_t uid, kgid_t gid, umode_t mode, u64 xattr_version, @@ -1116,7 +1116,7 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap, int held, revoking, dropping, keep; u64 seq, issue_seq, mseq, time_warp_seq, follows; u64 size, max_size; - struct timespec mtime, atime; + struct vfs_time mtime, atime; int wake = 0; umode_t mode; kuid_t uid; @@ -2764,7 +2764,7 @@ static void handle_cap_grant(struct ceph_mds_client *mdsc, int used, wanted, dirty; u64 size = le64_to_cpu(grant->size); u64 max_size = le64_to_cpu(grant->max_size); - struct timespec mtime, atime, ctime; + struct vfs_time mtime, atime, ctime; int check_caps = 0; bool wake = false; bool writeback = false; diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index fd11fb2..83e6602 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1210,7 +1210,7 @@ static ssize_t ceph_read_dir(struct file *file, char __user *buf, size_t size, " rfiles: %20lld\n" " rsubdirs: %20lld\n" "rbytes: %20lld\n" - "rctime: %10ld.%09ld\n", + "rctime: %10lld.%09ld\n", ci->i_files + ci->i_subdirs, ci->i_files, ci->i_subdirs, @@ -1218,7 +1218,7 @@ static ssize_t ceph_read_dir(struct file *file, char __user *buf, size_t size, ci->i_rfiles, ci->i_rsubdirs, ci->i_rbytes, - (long)ci->i_rctime.tv_sec, + (long long)ci->i_rctime.tv_sec, (long)ci->i_rctime.tv_nsec); }
diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 9b338ff..c8a6f0d 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -555,7 +555,7 @@ struct ceph_aio_request { struct list_head osd_reqs; unsigned num_reqs; atomic_t pending_reqs; - struct timespec mtime; + struct vfs_time mtime; struct ceph_cap_flush *prealloc_cf; };
@@ -783,7 +783,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, int num_pages = 0; int flags; int ret; - struct timespec mtime = current_fs_time(inode->i_sb); + struct vfs_time mtime = current_fs_time(inode->i_sb); size_t count = iov_iter_count(iter); loff_t pos = iocb->ki_pos; bool write = iov_iter_rw(iter) == WRITE; @@ -988,7 +988,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, int flags; int check_caps = 0; int ret; - struct timespec mtime = current_fs_time(inode->i_sb); + struct vfs_time mtime = current_fs_time(inode->i_sb); size_t count = iov_iter_count(from);
if (ceph_snap(file_inode(file)) != CEPH_NOSNAP) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 63d0198..b5bce93 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -590,8 +590,8 @@ int ceph_fill_file_size(struct inode *inode, int issued, }
void ceph_fill_file_time(struct inode *inode, int issued, - u64 time_warp_seq, struct timespec *ctime, - struct timespec *mtime, struct timespec *atime) + u64 time_warp_seq, struct vfs_time *ctime, + struct vfs_time *mtime, struct vfs_time *atime) { struct ceph_inode_info *ci = ceph_inode(inode); int warn = 0; @@ -601,7 +601,7 @@ void ceph_fill_file_time(struct inode *inode, int issued, CEPH_CAP_FILE_BUFFER| CEPH_CAP_AUTH_EXCL| CEPH_CAP_XATTR_EXCL)) { - if (timespec_compare(ctime, &inode->i_ctime) > 0) { + if (vfs_time_compare(ctime, &inode->i_ctime) > 0) { dout("ctime %ld.%09ld -> %ld.%09ld inc w/ cap\n", inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, ctime->tv_sec, ctime->tv_nsec); @@ -620,14 +620,14 @@ void ceph_fill_file_time(struct inode *inode, int issued, ci->i_time_warp_seq = time_warp_seq; } else if (time_warp_seq == ci->i_time_warp_seq) { /* nobody did utimes(); take the max */ - if (timespec_compare(mtime, &inode->i_mtime) > 0) { + if (vfs_time_compare(mtime, &inode->i_mtime) > 0) { dout("mtime %ld.%09ld -> %ld.%09ld inc\n", inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, mtime->tv_sec, mtime->tv_nsec); inode->i_mtime = *mtime; } - if (timespec_compare(atime, &inode->i_atime) > 0) { + if (vfs_time_compare(atime, &inode->i_atime) > 0) { dout("atime %ld.%09ld -> %ld.%09ld inc\n", inode->i_atime.tv_sec, inode->i_atime.tv_nsec, @@ -670,7 +670,7 @@ static int fill_inode(struct inode *inode, struct page *locked_page, struct ceph_mds_reply_inode *info = iinfo->in; struct ceph_inode_info *ci = ceph_inode(inode); int issued = 0, implemented, new_issued; - struct timespec mtime, atime, ctime; + struct vfs_time mtime, atime, ctime; struct ceph_buffer *xattr_blob = NULL; struct ceph_cap *new_cap = NULL; int err = 0; @@ -1871,12 +1871,12 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_atime, + vfs_time_compare(&inode->i_atime, &attr->ia_atime) < 0) { inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_atime, &attr->ia_atime)) { + !vfs_time_equal(&inode->i_atime, &attr->ia_atime)) { ceph_encode_timespec(&req->r_args.setattr.atime, &attr->ia_atime); mask |= CEPH_SETATTR_ATIME; @@ -1885,20 +1885,20 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) } } if (ia_valid & ATTR_MTIME) { - dout("setattr %p mtime %ld.%ld -> %ld.%ld\n", inode, - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, - attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); + dout("setattr %p mtime %lld.%ld -> %lld.%ld\n", inode, + (long long) inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, + (long long) attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_mtime, + vfs_time_compare(&inode->i_mtime, &attr->ia_mtime) < 0) { inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_mtime, &attr->ia_mtime)) { + !vfs_time_equal(&inode->i_mtime, &attr->ia_mtime)) { ceph_encode_timespec(&req->r_args.setattr.mtime, &attr->ia_mtime); mask |= CEPH_SETATTR_MTIME; @@ -1932,9 +1932,9 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) if (ia_valid & ATTR_CTIME) { bool only = (ia_valid & (ATTR_SIZE|ATTR_MTIME|ATTR_ATIME| ATTR_MODE|ATTR_UID|ATTR_GID)) == 0; - dout("setattr %p ctime %ld.%ld -> %ld.%ld (%s)\n", inode, - inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, - attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, + dout("setattr %p ctime %lld.%ld -> %lld.%ld (%s)\n", inode, + (long long) inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, + (long long) attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, only ? "ctime only" : "ignored"); inode->i_ctime = attr->ia_ctime; if (only) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 348b22e..793be8b 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1721,7 +1721,7 @@ ceph_mdsc_create_request(struct ceph_mds_client *mdsc, int op, int mode) init_completion(&req->r_safe_completion); INIT_LIST_HEAD(&req->r_unsafe_item);
- ktime_get_real_ts(&req->r_stamp); + vfs_get_real_ts(&req->r_stamp);
req->r_op = op; req->r_direct_mode = mode; diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index ccf11ef..51f8c9d 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -202,7 +202,7 @@ struct ceph_mds_request { int r_fmode; /* file mode, if expecting cap */ kuid_t r_uid; kgid_t r_gid; - struct timespec r_stamp; + struct vfs_time r_stamp;
/* for choosing which mds to send this request to */ int r_direct_mode; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 75b7d12..2acbc96 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -169,7 +169,7 @@ struct ceph_cap_snap { u64 xattr_version;
u64 size; - struct timespec mtime, atime, ctime; + struct vfs_time mtime, atime, ctime; u64 time_warp_seq; int writing; /* a sync write is still in progress */ int dirty_pages; /* dirty pages awaiting writeback */ @@ -290,7 +290,7 @@ struct ceph_inode_info { char *i_symlink;
/* for dirs */ - struct timespec i_rctime; + struct vfs_time i_rctime; u64 i_rbytes, i_rfiles, i_rsubdirs; u64 i_files, i_subdirs;
@@ -764,8 +764,8 @@ extern struct inode *ceph_get_snapdir(struct inode *parent); extern int ceph_fill_file_size(struct inode *inode, int issued, u32 truncate_seq, u64 truncate_size, u64 size); extern void ceph_fill_file_time(struct inode *inode, int issued, - u64 time_warp_seq, struct timespec *ctime, - struct timespec *mtime, struct timespec *atime); + u64 time_warp_seq, struct vfs_time *ctime, + struct vfs_time *mtime, struct vfs_time *atime); extern int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req, struct ceph_mds_session *session); diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 1e1c00a..db5328d 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -198,7 +198,7 @@ static size_t ceph_vxattrcb_dir_rbytes(struct ceph_inode_info *ci, char *val, static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val, size_t size) { - return snprintf(val, size, "%ld.09%ld", (long)ci->i_rctime.tv_sec, + return snprintf(val, size, "%lld.09%ld", (long long)ci->i_rctime.tv_sec, (long)ci->i_rctime.tv_nsec); }
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
These timestamps are passed in as arguments to functions using inode timestamps. Hence, these need to change along with vfs to support 64 bit timestamps. vfs_time helps do this transition.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- include/linux/ceph/messenger.h | 1 + include/linux/ceph/osd_client.h | 4 ++-- net/ceph/osd_client.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h index afe886b..28bba12 100644 --- a/include/linux/ceph/messenger.h +++ b/include/linux/ceph/messenger.h @@ -8,6 +8,7 @@ #include <linux/radix-tree.h> #include <linux/uio.h> #include <linux/workqueue.h> +#include <linux/fs.h> #include <net/net_namespace.h>
#include <linux/ceph/types.h> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 7506b48..2b6f08b 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -322,7 +322,7 @@ extern struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client * extern void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id, - struct timespec *mtime); + struct vfs_time *mtime);
extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *, struct ceph_file_layout *layout, @@ -364,7 +364,7 @@ extern int ceph_osdc_writepages(struct ceph_osd_client *osdc, struct ceph_snap_context *sc, u64 off, u64 len, u32 truncate_seq, u64 truncate_size, - struct timespec *mtime, + struct vfs_time *mtime, struct page **pages, int nr_pages);
/* watch/notify events */ diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index f8f2359..1273db6 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2401,7 +2401,7 @@ bad: */ void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id, - struct timespec *mtime) + struct vfs_time *mtime) { struct ceph_msg *msg = req->r_request; void *p;
On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
These timestamps are passed in as arguments to functions using inode timestamps. Hence, these need to change along with vfs to support 64 bit timestamps. vfs_time helps do this transition.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
Just a point to highlight the problem with this approach:
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index f8f2359..1273db6 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2401,7 +2401,7 @@ bad: */ void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id,
struct timespec *mtime)
struct vfs_time *mtime)
{ struct ceph_msg *msg = req->r_request; void *p;
So this change assumes that mtime is not passed by reference to another function. If we change vfs_time to be a timespec64, then dereferencing in this function works fine, but passing to another function will not because that function will be expecting a timespec.
That, indeed, is what happens here. A few lines into this function:
if (req->r_flags & CEPH_OSD_FLAG_WRITE) ceph_encode_timespec(p, mtime);
And that function:
static inline void ceph_encode_timespec(struct ceph_timespec *tv, const struct timespec *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec); }
expects a timespec. It will silently lose 64 bit times even if it did compile. I note in version 2b, the mtime was not passed by reference as a vfs time, but converted at the call site to a timespec and so the internal usage of the timestamp remains unchanged and unaffected by a VFS level timespec->timespec64 change.
I think an approach that requires changes to the API without actually beign able to verify they are correct, fully propagated or don't impact on write/disk formats before the final change of the VFS type is not going to fly. This is the sort of subtle bug that can occur with type changes, and hence why I think that the fs developers should be left to do the conversion of their filesystem to support 64 bit times (i.e. approach 2b).
Any change is going to take a significant amount of testing and verification, and that's something we don't have yet. Nobody has written any tests for xfstests to verify correct 64 bit timestamp behaviour, nor do we have tests to verify 32 bit timestamp behaviour on a 64 bit time kernel. These are things that we are going to need; all filesystems should behave the same w.r.t. these configurations, so we really do need regression tests for this....
Cheers,
Dave.
On Sat, Feb 13, 2016 at 2:08 PM, Dave Chinner david@fromorbit.com wrote:
On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
These timestamps are passed in as arguments to functions using inode timestamps. Hence, these need to change along with vfs to support 64 bit timestamps. vfs_time helps do this transition.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
Just a point to highlight the problem with this approach:
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index f8f2359..1273db6 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2401,7 +2401,7 @@ bad: */ void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id,
struct timespec *mtime)
struct vfs_time *mtime)
{ struct ceph_msg *msg = req->r_request; void *p;
So this change assumes that mtime is not passed by reference to another function. If we change vfs_time to be a timespec64, then dereferencing in this function works fine, but passing to another function will not because that function will be expecting a timespec.
That, indeed, is what happens here. A few lines into this function:
if (req->r_flags & CEPH_OSD_FLAG_WRITE) ceph_encode_timespec(p, mtime);
And that function:
static inline void ceph_encode_timespec(struct ceph_timespec *tv, const struct timespec *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec); }
I'm not sure where you picked up this encode function from.
You might be missing the patches( 9 and 10) before this?:
2b5f8e517c6fd7121fc1b890c51c6256bc21beb6 net: ceph: use vfs_time data type instead of timespec ca5b82952a6a522ae058ccede57ba1a71da498c5 fs: ceph: Replace timespec data type with vfs_time 3a3ac0bdd23284c4f27a7ab1c133056c1a998075 fs: ceph: Change encode and decode functions to use vfs_time
So encode function actually looks like
-static inline void ceph_decode_timespec(struct timespec *ts, +static inline void ceph_decode_timespec(struct vfs_time *ts, const struct ceph_timespec *tv) { - ts->tv_sec = (__kernel_time_t)le32_to_cpu(tv->tv_sec); - ts->tv_nsec = (long)le32_to_cpu(tv->tv_nsec); + ts->tv_sec = (s64)(u32)le32_to_cpu(tv->tv_sec); + ts->tv_nsec = (long)(u32)le32_to_cpu(tv->tv_nsec); }
There is a conversion error here which I will be fixing separately from the series.
Also, there is another commit in my tree that is pointed to in the cover letter that is also required here:
40219ae801c0284a233ed908b07bb468d219cbc8 net: ceph: Remove CURRENT_TIME references
Changes have been split so that they can done in manageable chunks just like how we are not changing all filesystems at once.
I think an approach that requires changes to the API without actually beign able to verify they are correct, fully propagated or don't impact on write/disk formats before the final change of the VFS type is not going to fly. This is the sort of subtle bug that can occur with type changes, and hence why I think that the fs developers should be left to do the conversion of their filesystem to support 64 bit times (i.e. approach 2b).
Approach 2b has the same problem of not being able to verify the conversion before the vfs switch. Consider what happens if you miss changing one of the instances of direct inode time access. So 2b is also not completely verifiable that the changes are completely propagated. The only approach that does this is the 2c because the data types in the individual filesystems are decoupled from vfs data types from the get go.
Besides, anything omitted by 2a or 2b in the process of conversion should be easily verifiable when vfs does switch. At this point, there will be warnings in case of pointer conversion or errors in case of pass by value, if the data types do not match.
Apart from this, process of how individual filesystems are converted will help avoid these bugs. Here is one of the tricks I plan to do (consider example approach 2a):
1. Change an individual filesystem to use vfs_time. 2. Change vfs to timespec64 and verify that the targeted filesystem will actually compile.
Tagging tricks are also useful here.
Keep in mind that timespec = timespec64 already on 64 bit systems. But, still there might be some tricky cases which should be okay because it will have to reviewed by individual filesystem maintainers.
Any change is going to take a significant amount of testing and verification, and that's something we don't have yet. Nobody has written any tests for xfstests to verify correct 64 bit timestamp behaviour, nor do we have tests to verify 32 bit timestamp behaviour on a 64 bit time kernel. These are things that we are going to need; all filesystems should behave the same w.r.t. these configurations, so we really do need regression tests for this....
Agree. This is needed regardless of what approach is chosen. And, this is a problem for all filesystems even today.
-Deepa
static inline void ceph_encode_timespec(struct ceph_timespec *tv, const struct timespec *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec); }
Pointed to decode function change in previous email. Pasting encode function change also here:
static inline void ceph_encode_timespec(struct ceph_timespec *tv, - const struct timespec *ts) + const struct vfs_time *ts)
-Deepa
On Sat, Feb 13, 2016 at 05:46:11PM -0800, Deepa Dinamani wrote:
On Sat, Feb 13, 2016 at 2:08 PM, Dave Chinner david@fromorbit.com wrote:
On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
These timestamps are passed in as arguments to functions using inode timestamps. Hence, these need to change along with vfs to support 64 bit timestamps. vfs_time helps do this transition.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com
Just a point to highlight the problem with this approach:
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index f8f2359..1273db6 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2401,7 +2401,7 @@ bad: */ void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id,
struct timespec *mtime)
struct vfs_time *mtime)
{ struct ceph_msg *msg = req->r_request; void *p;
So this change assumes that mtime is not passed by reference to another function. If we change vfs_time to be a timespec64, then dereferencing in this function works fine, but passing to another function will not because that function will be expecting a timespec.
That, indeed, is what happens here. A few lines into this function:
if (req->r_flags & CEPH_OSD_FLAG_WRITE) ceph_encode_timespec(p, mtime);
And that function:
static inline void ceph_encode_timespec(struct ceph_timespec *tv, const struct timespec *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec); }
I'm not sure where you picked up this encode function from.
You might be missing the patches( 9 and 10) before this?:
2b5f8e517c6fd7121fc1b890c51c6256bc21beb6 net: ceph: use vfs_time data type instead of timespec ca5b82952a6a522ae058ccede57ba1a71da498c5 fs: ceph: Replace timespec data type with vfs_time 3a3ac0bdd23284c4f27a7ab1c133056c1a998075 fs: ceph: Change encode and decode functions to use vfs_time
So I missed this last patch when quickly reading throught he series. I think this is being broken up way too much, and that makes it hard to see how the changes relate.
i.e. this series could have easily been just 3 or 4 patches - it's only ~100 lines of code total that is changed by the series.
FWIW, let me put this in context for you, so maybe you'll understand why I think this timespec64 changeover is actually a trivial, simply thing. Last week I reviewed about 10,000 lines of new code amongst some 14,000 lines of change (twice) for xfstests. That was amongst getting ~1000 lines of my own changes reviewed and committed into the XFS kernel tree, handling user problems, reviewing and commenting on DAX changes, etc.
This week I only have a libxfs kernel/userspace code sync to do (only 500 lines of change this time), and then I've about 10,000 lines of complex new XFS kernel code to review (i.e. for reverse mapping, reflink, dedupe and copy-on-write support). I'll have to review that more than once, and once that is done and all th changes have been propagated over into the userspace code, I've got another 10,000 lines of code in userspace (again for reflink, etc) to review, test and merge.
So, excuse me if I made a mistake and missed something in a patchset that a) had 3 different versions posted, b) is way too fine-grained, and c) being treated like a mountain when it's really a tiny molehill. I do have much more important things to do with my time than be dragged into another silly "oh this is so difficult and hard" bikeshedding argument when I could easily write the entire patchset to do a timespec64 changeover for the VFS in a couple of hours. It's just not that hard to do.
And, FWIW, I'm still waiting to hear how we're going to regression test all this. Has anyone written any xfstests yet to ensure that all the filesystems behave the same and we won't break anything in future as we add 64 bit timestamp support to filesystem on-disk formats? IMO, there's more work in writing the regression tests to make sure everything works correctly in all the different possible combinations of filesystem, kernel and userspace support (e.g. 32 on 32, 32 on 64, 64 on 32 and 64 on 64). I'm much more concerned about this aspect of the problem than actually changing the VFS code, because without it we can't verify the changes we are making are behaving correctly...
Cheers,
Dave.
On Monday 15 February 2016 08:00:50 Dave Chinner wrote:
On Sat, Feb 13, 2016 at 05:46:11PM -0800, Deepa Dinamani wrote:
On Sat, Feb 13, 2016 at 2:08 PM, Dave Chinner david@fromorbit.com wrote:
On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
So, excuse me if I made a mistake and missed something in a patchset that a) had 3 different versions posted, b) is way too fine-grained, and c) being treated like a mountain when it's really a tiny molehill. I do have much more important things to do with my time than be dragged into another silly "oh this is so difficult and hard" bikeshedding argument when I could easily write the entire patchset to do a timespec64 changeover for the VFS in a couple of hours. It's just not that hard to do.
And, FWIW, I'm still waiting to hear how we're going to regression test all this. Has anyone written any xfstests yet to ensure that all the filesystems behave the same and we won't break anything in future as we add 64 bit timestamp support to filesystem on-disk formats? IMO, there's more work in writing the regression tests to make sure everything works correctly in all the different possible combinations of filesystem, kernel and userspace support (e.g. 32 on 32, 32 on 64, 64 on 32 and 64 on 64). I'm much more concerned about this aspect of the problem than actually changing the VFS code, because without it we can't verify the changes we are making are behaving correctly...
You are mixing up way too many things here, for this series all we need is for you to say that one of the approaches is ok, and they are all to the point where they are simple enough that they don't really do much at all. Deepa is taking baby steps here because you complained about v1 being too complex.
This series is not about changing the on-disk format, it is not even changing the VFS time format (yet), it's just a preparation so we can eventually change it.
There are four different things that are going on at the same time, all independent of one another:
1. Changing the file systems so we are able to do the change in struct inode, this series. The *only* part we care about here is that this does not change the existing behavior on either 32-bit or 64-bit systems, and that should be trivial to review.
2. Changing the file systems to provide information to VFS about the time stamp ranges they support in order to do proper handling of overflows in VFS. Deepa has posted a first set of patches to always use current_fs_time() consistently, work on that is continuing and once done, we can debate the policy for what should happen in case of overflow.
3. Writing test cases in xfstests and/or LTP. Yes, we need them, and I think Deepa has started on those, but I don't think they are needed at this point as there is little to test before steps 1 and 2 are done.
4. Changing file systems to use longer on-disk timestamps where needed. This is completely independent of anything else and up to the individual file system developers. Anyone can test this now on 64-bit architectures, and most file systems we care about (xfs being the notable exception, ext4 also until very recently) already do this properly.
After 1, 2 and 3 are done, the simple patch to switch over VFS can be implemented and tested, followed by whatever work remains to switch over file systems to use 64-bit timestamps in the kernel (independent of what they use on disk, again).
My current line of thinking is that for step 1, I'd let Deepa pick one of the three approaches she posted (I don't think we found any showstoppers), and put the patches in my y2038 tree for merging in 4.6. We can easily leave out the file systems that have conflicts against linux-next, and you can put Deepa's patch or another implementation of that into 4.7.
Arnd
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time aliases here to access inode times.
vfs_time is an abstraction that hides the type of timestamps across vfs. This is necessary because we want to change the data types of vfs timestamps to support 64 bit times.
current_fs_time() will change along with vfs timestamp data type changes.
xfs_vn_update_time() is a .update callback for inode operations and this needs to change along with vfs inode times.
All the accesses to or from struct inode timestamps and current_fs_time() are also changed to vfs_time.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_iops.c | 2 +- fs/xfs/xfs_trans_inode.c | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ceba1a8..ebf76a3 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -765,7 +765,7 @@ xfs_ialloc( xfs_inode_t *ip; uint flags; int error; - struct timespec tv; + struct vfs_time tv;
/* * Call the space management code to pick diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 76b71a1..7f8a897 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -973,7 +973,7 @@ xfs_vn_setattr( STATIC int xfs_vn_update_time( struct inode *inode, - struct timespec *now, + struct vfs_time *now, int flags) { struct xfs_inode *ip = XFS_I(inode); diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c index b97f1df..54fc3c4 100644 --- a/fs/xfs/xfs_trans_inode.c +++ b/fs/xfs/xfs_trans_inode.c @@ -68,7 +68,7 @@ xfs_trans_ichgtime( int flags) { struct inode *inode = VFS_I(ip); - struct timespec tv; + struct vfs_time tv;
ASSERT(tp); ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); @@ -76,13 +76,13 @@ xfs_trans_ichgtime( tv = current_fs_time(inode->i_sb);
if ((flags & XFS_ICHGTIME_MOD) && - !timespec_equal(&inode->i_mtime, &tv)) { + !vfs_time_equal(&inode->i_mtime, &tv)) { inode->i_mtime = tv; ip->i_d.di_mtime.t_sec = tv.tv_sec; ip->i_d.di_mtime.t_nsec = tv.tv_nsec; } if ((flags & XFS_ICHGTIME_CHG) && - !timespec_equal(&inode->i_ctime, &tv)) { + !vfs_time_equal(&inode->i_ctime, &tv)) { inode->i_ctime = tv; ip->i_d.di_ctime.t_sec = tv.tv_sec; ip->i_d.di_ctime.t_nsec = tv.tv_nsec;
Introduction
The series is one of the proposals on how to transition VFS timestamps to use 64 bit time. This idea was proposed by Arnd Bergmann and David Chinner on the RFC: https://lkml.org/lkml/2016/1/7/20 [1].
Solution
1.Define accessor functions for inode timestamps. 2.Change all individual filesystems to use these accessors. 3.Change vfs timestamps to use timespec64. 4.Individual filesystem developers can change their filesystems to use timespec64 at their own convenience.
Note that the series only includes patches for steps 1 and 2.
Variant
We could use timespec64 in the end filesystems instead of timespec.
Concerns
1. Before the vfs layer is done, it might be confusing to confusing to developers as individual filesystems now will have two options for access: directly and accessor.
2. After the vfs layer is done, it might be confusing to developers as individual filesystems now will have two options for accessors: timespec64_to_timespec and vfs_time_to_timespec.
Deepa Dinamani (5): vfs: Add vfs_time accessors fs: cifs: Use vfs_time accessors fs: btrfs: Use vfs_time accessors fs: ceph: Use vfs timestamp accessors fs: xfs: change inode times to use vfs_time data type
fs/btrfs/file.c | 14 +++++---- fs/btrfs/inode.c | 4 +-- fs/ceph/addr.c | 20 +++++++++---- fs/ceph/cache.c | 4 +-- fs/ceph/caps.c | 4 +-- fs/ceph/file.c | 4 ++- fs/ceph/inode.c | 75 ++++++++++++++++++++++++++++-------------------- fs/ceph/mds_client.c | 5 +++- fs/ceph/snap.c | 6 ++-- fs/cifs/cache.c | 12 +++++--- fs/cifs/inode.c | 31 +++++++++++--------- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_trans_inode.c | 24 +++++++++------- include/linux/fs.h | 15 ++++++++++ 14 files changed, 139 insertions(+), 81 deletions(-)
Add vfs_time accessors to help convert vfs timestamps to use 64 bit times. These create an abstraction layer so that vfs inode times can be switched to use struct timespec64 from struct timespec without breaking the individual filesystems after they have incorporated these.
Use uapi exposed data types, timespec and timespec64 here to keep minimal timestamp data type conversions in API's interfacing with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- include/linux/fs.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h index 4af612f..1623c95 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1412,6 +1412,21 @@ static inline struct timespec current_fs_time_sec(struct super_block *sb) return (struct timespec) { get_seconds(), 0 }; }
+/* Place holder defines to ensure safe transition to timespec64 + * in the vfs layer. + * These can be deleted after all filesystems and vfs are switched + * over to using 64 bit time. + */ +static inline struct timespec vfs_time_to_timespec(struct timespec inode_ts) +{ + return inode_ts; +} + +static inline struct timespec timespec_to_vfs_time(struct timespec ts) +{ + return ts; +} + /* * Snapshotting support. */
This is preparation to make vfs inode timestamps y2038 safe. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time access functions to access inode times.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cache.c | 12 ++++++++---- fs/cifs/inode.c | 31 ++++++++++++++++++------------- 2 files changed, 26 insertions(+), 17 deletions(-)
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c index 6c665bf..18c861f 100644 --- a/fs/cifs/cache.c +++ b/fs/cifs/cache.c @@ -259,8 +259,10 @@ cifs_fscache_inode_get_aux(const void *cookie_netfs_data, void *buffer,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + vfs_time_to_timespec(cifsi->vfs_inode.i_mtime); + auxdata.last_change_time = + vfs_time_to_timespec(cifsi->vfs_inode.i_ctime);
if (maxbuf > sizeof(auxdata)) maxbuf = sizeof(auxdata); @@ -283,8 +285,10 @@ fscache_checkaux cifs_fscache_inode_check_aux(void *cookie_netfs_data,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + vfs_time_to_timespec(cifsi->vfs_inode.i_mtime); + auxdata.last_change_time = + vfs_time_to_timespec(cifsi->vfs_inode.i_ctime);
if (memcmp(data, &auxdata, datalen) != 0) return FSCACHE_CHECKAUX_OBSOLETE; diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index fa72359..423bb0c 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -92,6 +92,7 @@ static void cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); + struct timespec mtime;
cifs_dbg(FYI, "%s: revalidating inode %llu\n", __func__, cifs_i->uniqueid); @@ -109,8 +110,9 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) return; }
+ mtime = vfs_time_to_timespec(inode->i_mtime); /* revalidate if mtime or size have changed */ - if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) && + if (timespec_equal(&mtime, &fattr->cf_mtime) && cifs_i->server_eof == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", __func__, cifs_i->uniqueid); @@ -159,9 +161,9 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) cifs_revalidate_cache(inode, fattr);
spin_lock(&inode->i_lock); - inode->i_atime = fattr->cf_atime; - inode->i_mtime = fattr->cf_mtime; - inode->i_ctime = fattr->cf_ctime; + inode->i_atime = timespec_to_vfs_time(fattr->cf_atime); + inode->i_mtime = timespec_to_vfs_time(fattr->cf_mtime); + inode->i_ctime = timespec_to_vfs_time(fattr->cf_ctime); inode->i_rdev = fattr->cf_rdev; cifs_nlink_fattr_to_inode(inode, fattr); inode->i_uid = fattr->cf_uid; @@ -1066,15 +1068,15 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, unsigned int xid,
if (attrs->ia_valid & ATTR_ATIME) { set_time = true; - info_buf.LastAccessTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_atime)); + info_buf.LastAccessTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec(attrs->ia_atime))); } else info_buf.LastAccessTime = 0;
if (attrs->ia_valid & ATTR_MTIME) { set_time = true; - info_buf.LastWriteTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_mtime)); + info_buf.LastWriteTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec(attrs->ia_mtime))); } else info_buf.LastWriteTime = 0;
@@ -1086,8 +1088,8 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, unsigned int xid, */ if (set_time && (attrs->ia_valid & ATTR_CTIME)) { cifs_dbg(FYI, "CIFS - CTIME changed\n"); - info_buf.ChangeTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_ctime)); + info_buf.ChangeTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec(attrs->ia_ctime))); } else info_buf.ChangeTime = 0;
@@ -2193,17 +2195,20 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs) args->gid = INVALID_GID; /* no change */
if (attrs->ia_valid & ATTR_ATIME) - args->atime = cifs_UnixTimeToNT(attrs->ia_atime); + args->atime = cifs_UnixTimeToNT( + vfs_time_to_timespec(attrs->ia_atime)); else args->atime = NO_CHANGE_64;
if (attrs->ia_valid & ATTR_MTIME) - args->mtime = cifs_UnixTimeToNT(attrs->ia_mtime); + args->mtime = cifs_UnixTimeToNT( + vfs_time_to_timespec(attrs->ia_mtime)); else args->mtime = NO_CHANGE_64;
if (attrs->ia_valid & ATTR_CTIME) - args->ctime = cifs_UnixTimeToNT(attrs->ia_ctime); + args->ctime = cifs_UnixTimeToNT( + vfs_time_to_timespec(attrs->ia_ctime)); else args->ctime = NO_CHANGE_64;
This is in preparation for changing VFS inode timestamps to use 64 bit time. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time functions to access inode times.
Note that the btrfs_update_time() has not been changed in this patch. This will need to be updated along with vfs time data type changes. This is because the function signature is decided by the VFS layer since it is a VFS inode operations callback.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/file.c | 14 +++++++++----- fs/btrfs/inode.c | 4 ++-- 2 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 610f569..3bccb97 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1733,16 +1733,20 @@ out: static void update_time_for_write(struct inode *inode) { struct timespec now; + struct timespec ts;
if (IS_NOCMTIME(inode)) return;
- now = current_fs_time(inode->i_sb); - if (!timespec_equal(&inode->i_mtime, &now)) - inode->i_mtime = now; + now = vfs_time_to_timespec(current_fs_time(inode->i_sb));
- if (!timespec_equal(&inode->i_ctime, &now)) - inode->i_ctime = now; + ts = vfs_time_to_timespec(inode->i_mtime); + if (!timespec_equal(&ts, &now)) + inode->i_mtime = timespec_to_vfs_time(now); + + ts = vfs_time_to_timespec(inode->i_mtime); + if (!timespec_equal(&ts, &now)) + inode->i_ctime = timespec_to_vfs_time(now);
if (IS_I_VERSION(inode)) inode_inc_iversion(inode); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 59c0e22..5b93cf8 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5592,7 +5592,7 @@ static struct inode *new_simple_dir(struct super_block *s, inode->i_mtime = current_fs_time(inode->i_sb); inode->i_atime = inode->i_mtime; inode->i_ctime = inode->i_mtime; - BTRFS_I(inode)->i_otime = inode->i_mtime; + BTRFS_I(inode)->i_otime = vfs_time_to_timespec(inode->i_mtime);
return inode; } @@ -6164,7 +6164,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans, inode->i_mtime = current_fs_time(inode->i_sb); inode->i_atime = inode->i_mtime; inode->i_ctime = inode->i_mtime; - BTRFS_I(inode)->i_otime = inode->i_mtime; + BTRFS_I(inode)->i_otime = vfs_time_to_timespec(inode->i_mtime);
inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0], struct btrfs_inode_item);
On Friday 12 February 2016 01:45:47 Deepa Dinamani wrote:
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_mtime = timespec_to_vfs_time(now);
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_ctime = timespec_to_vfs_time(now);
The second one needs to be fs_time_to_timespec(inode->i_ctime), not i_mtime.
Arnd
On Fri, Feb 12, 2016 at 5:57 AM, Arnd Bergmann arnd@arndb.de wrote:
On Friday 12 February 2016 01:45:47 Deepa Dinamani wrote:
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_mtime = timespec_to_vfs_time(now);
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_ctime = timespec_to_vfs_time(now);
The second one needs to be fs_time_to_timespec(inode->i_ctime), not i_mtime.
Yes, you are correct. I will wait for some consensus on the proposal to figure out which version to post again.
Thanks, -Deepa
This is in preparation for changing VFS inode timestamps to use 64 bit time. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time functions to access inode times.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/ceph/addr.c | 20 +++++++++----- fs/ceph/cache.c | 4 +-- fs/ceph/caps.c | 4 +-- fs/ceph/file.c | 4 ++- fs/ceph/inode.c | 75 ++++++++++++++++++++++++++++++---------------------- fs/ceph/mds_client.c | 5 +++- fs/ceph/snap.c | 6 ++--- 7 files changed, 72 insertions(+), 46 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index c222137..30d72e6 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -479,6 +479,7 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) struct ceph_fs_client *fsc; struct ceph_osd_client *osdc; struct ceph_snap_context *snapc, *oldest; + struct timespec ts; loff_t page_off = page_offset(page); loff_t snap_size = -1; long writeback_stat; @@ -540,11 +541,12 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) ceph_readpage_to_fscache(inode, page);
set_page_writeback(page); + ts = vfs_time_to_timespec(inode->i_mtime); err = ceph_osdc_writepages(osdc, ceph_vino(inode), &ci->i_layout, snapc, page_off, len, truncate_seq, truncate_size, - &inode->i_mtime, &page, 1); + &ts, &page, 1); if (err < 0) { dout("writepage setting page/mapping error %d %p\n", err, page); SetPageError(page); @@ -699,6 +701,7 @@ static int ceph_writepages_start(struct address_space *mapping, int rc = 0; unsigned wsize = 1 << inode->i_blkbits; struct ceph_osd_request *req = NULL; + struct timespec ts; int do_sync = 0; loff_t snap_size, i_size; u64 truncate_size; @@ -978,8 +981,9 @@ get_more_pages: osd_req_op_extent_update(req, 0, len);
vino = ceph_vino(inode); + ts = vfs_time_to_timespec(inode->i_mtime); ceph_osdc_build_request(req, offset, snapc, vino.snap, - &inode->i_mtime); + &ts);
rc = ceph_osdc_start_request(&fsc->client->osdc, req, true); BUG_ON(rc); @@ -1465,6 +1469,7 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_request *req; struct page *page = NULL; + struct timespec ts; u64 len, inline_version; int err = 0; bool from_pagecache = false; @@ -1528,7 +1533,8 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) goto out; }
- ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &inode->i_mtime); + ts = vfs_time_to_timespec(inode->i_mtime); + ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &ts); err = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!err) err = ceph_osdc_wait_request(&fsc->client->osdc, req); @@ -1572,7 +1578,7 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) goto out_put; }
- ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &inode->i_mtime); + ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &ts); err = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!err) err = ceph_osdc_wait_request(&fsc->client->osdc, req); @@ -1622,6 +1628,7 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, u32 pool) struct ceph_osd_request *rd_req = NULL, *wr_req = NULL; struct rb_node **p, *parent; struct ceph_pool_perm *perm; + struct timespec ts; struct page **pages; int err = 0, err2 = 0, have = 0;
@@ -1701,12 +1708,13 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, u32 pool)
osd_req_op_raw_data_in_pages(rd_req, 0, pages, PAGE_SIZE, 0, false, true); + ts = vfs_time_to_timespec(ci->vfs_inode.i_mtime); ceph_osdc_build_request(rd_req, 0, NULL, CEPH_NOSNAP, - &ci->vfs_inode.i_mtime); + &ts); err = ceph_osdc_start_request(&fsc->client->osdc, rd_req, false);
ceph_osdc_build_request(wr_req, 0, NULL, CEPH_NOSNAP, - &ci->vfs_inode.i_mtime); + &ts); err2 = ceph_osdc_start_request(&fsc->client->osdc, wr_req, false);
if (!err) diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c index a351480..789121c 100644 --- a/fs/ceph/cache.c +++ b/fs/ceph/cache.c @@ -105,7 +105,7 @@ static uint16_t ceph_fscache_inode_get_aux(const void *cookie_netfs_data, const struct inode* inode = &ci->vfs_inode;
memset(&aux, 0, sizeof(aux)); - aux.mtime = inode->i_mtime; + aux.mtime = vfs_time_to_timespec(inode->i_mtime); aux.size = i_size_read(inode);
memcpy(buffer, &aux, sizeof(aux)); @@ -131,7 +131,7 @@ static enum fscache_checkaux ceph_fscache_inode_check_aux( return FSCACHE_CHECKAUX_OBSOLETE;
memset(&aux, 0, sizeof(aux)); - aux.mtime = inode->i_mtime; + aux.mtime = vfs_time_to_timespec(inode->i_mtime); aux.size = i_size_read(inode);
if (memcmp(data, &aux, sizeof(aux)) != 0) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cdbf8cf..03b7212 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1178,8 +1178,8 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap, ci->i_reported_size = size; max_size = ci->i_wanted_max_size; ci->i_requested_max_size = max_size; - mtime = inode->i_mtime; - atime = inode->i_atime; + mtime = vfs_time_to_timespec(inode->i_mtime); + atime = vfs_time_to_timespec(inode->i_atime); time_warp_seq = ci->i_time_warp_seq; uid = inode->i_uid; gid = inode->i_gid; diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 9b338ff..0b782c6 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1497,6 +1497,7 @@ static int ceph_zero_partial_object(struct inode *inode, struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_request *req; + struct timespec ts; int ret = 0; loff_t zero = 0; int op; @@ -1520,8 +1521,9 @@ static int ceph_zero_partial_object(struct inode *inode, goto out; }
+ ts = vfs_time_to_timespec(inode->i_mtime); ceph_osdc_build_request(req, offset, NULL, ceph_vino(inode).snap, - &inode->i_mtime); + &ts);
ret = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!ret) { diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 63d0198..7e560bf 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -594,6 +594,7 @@ void ceph_fill_file_time(struct inode *inode, int issued, struct timespec *mtime, struct timespec *atime) { struct ceph_inode_info *ci = ceph_inode(inode); + struct timespec ts; int warn = 0;
if (issued & (CEPH_CAP_FILE_EXCL| @@ -601,38 +602,42 @@ void ceph_fill_file_time(struct inode *inode, int issued, CEPH_CAP_FILE_BUFFER| CEPH_CAP_AUTH_EXCL| CEPH_CAP_XATTR_EXCL)) { - if (timespec_compare(ctime, &inode->i_ctime) > 0) { + ts = vfs_time_to_timespec(inode->i_ctime); + if (timespec_compare(ctime, &ts) > 0) { dout("ctime %ld.%09ld -> %ld.%09ld inc w/ cap\n", - inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, + ts.tv_sec, ts.tv_nsec, ctime->tv_sec, ctime->tv_nsec); - inode->i_ctime = *ctime; + inode->i_ctime = timespec_to_vfs_time(*ctime); } if (ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) > 0) { /* the MDS did a utimes() */ + ts = vfs_time_to_timespec(inode->i_mtime); dout("mtime %ld.%09ld -> %ld.%09ld " "tw %d -> %d\n", - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, + ts.tv_sec, ts.tv_nsec, mtime->tv_sec, mtime->tv_nsec, ci->i_time_warp_seq, (int)time_warp_seq);
- inode->i_mtime = *mtime; - inode->i_atime = *atime; + inode->i_mtime = timespec_to_vfs_time(*mtime); + inode->i_atime = timespec_to_vfs_time(*atime); ci->i_time_warp_seq = time_warp_seq; } else if (time_warp_seq == ci->i_time_warp_seq) { /* nobody did utimes(); take the max */ - if (timespec_compare(mtime, &inode->i_mtime) > 0) { + ts = vfs_time_to_timespec(inode->i_mtime); + if (timespec_compare(mtime, &ts) > 0) { dout("mtime %ld.%09ld -> %ld.%09ld inc\n", - inode->i_mtime.tv_sec, - inode->i_mtime.tv_nsec, + ts.tv_sec, + ts.tv_nsec, mtime->tv_sec, mtime->tv_nsec); - inode->i_mtime = *mtime; + inode->i_mtime = timespec_to_vfs_time(*mtime); } - if (timespec_compare(atime, &inode->i_atime) > 0) { + ts = vfs_time_to_timespec(inode->i_atime); + if (timespec_compare(atime, &ts) > 0) { dout("atime %ld.%09ld -> %ld.%09ld inc\n", - inode->i_atime.tv_sec, - inode->i_atime.tv_nsec, + ts.tv_sec, + ts.tv_nsec, atime->tv_sec, atime->tv_nsec); - inode->i_atime = *atime; + inode->i_atime = timespec_to_vfs_time(*atime); } } else if (issued & CEPH_CAP_FILE_EXCL) { /* we did a utimes(); ignore mds values */ @@ -642,9 +647,9 @@ void ceph_fill_file_time(struct inode *inode, int issued, } else { /* we have no write|excl caps; whatever the MDS says is true */ if (ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) >= 0) { - inode->i_ctime = *ctime; - inode->i_mtime = *mtime; - inode->i_atime = *atime; + inode->i_ctime = timespec_to_vfs_time(*ctime); + inode->i_mtime = timespec_to_vfs_time(*mtime); + inode->i_atime = timespec_to_vfs_time(*atime); ci->i_time_warp_seq = time_warp_seq; } else { warn = 1; @@ -1776,6 +1781,8 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) struct ceph_mds_request *req; struct ceph_mds_client *mdsc = ceph_sb_to_client(dentry->d_sb)->mdsc; struct ceph_cap_flush *prealloc_cf; + struct timespec inode_ts; + struct timespec iattr_ts; int issued; int release = 0, dirtied = 0; int mask = 0; @@ -1862,45 +1869,49 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) } }
+ inode_ts = vfs_time_to_timespec(inode->i_atime); + iattr_ts = vfs_time_to_timespec(attr->ia_atime); if (ia_valid & ATTR_ATIME) { dout("setattr %p atime %ld.%ld -> %ld.%ld\n", inode, - inode->i_atime.tv_sec, inode->i_atime.tv_nsec, - attr->ia_atime.tv_sec, attr->ia_atime.tv_nsec); + inode_ts.tv_sec, inode_ts.tv_nsec, + iattr_ts.tv_sec, iattr_ts.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_atime, - &attr->ia_atime) < 0) { + timespec_compare(&inode_ts, + &iattr_ts) < 0) { inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_atime, &attr->ia_atime)) { + !timespec_equal(&inode_ts, &iattr_ts)) { ceph_encode_timespec(&req->r_args.setattr.atime, - &attr->ia_atime); + &iattr_ts); mask |= CEPH_SETATTR_ATIME; release |= CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; } } + inode_ts = vfs_time_to_timespec(inode->i_mtime); + iattr_ts = vfs_time_to_timespec(attr->ia_mtime); if (ia_valid & ATTR_MTIME) { dout("setattr %p mtime %ld.%ld -> %ld.%ld\n", inode, - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, - attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); + inode_ts.tv_sec, inode_ts.tv_nsec, + iattr_ts.tv_sec, iattr_ts.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_mtime, - &attr->ia_mtime) < 0) { + timespec_compare(&inode_ts, + &iattr_ts) < 0) { inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_mtime, &attr->ia_mtime)) { + !timespec_equal(&inode_ts, &iattr_ts)) { ceph_encode_timespec(&req->r_args.setattr.mtime, - &attr->ia_mtime); + &iattr_ts); mask |= CEPH_SETATTR_MTIME; release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; @@ -1929,12 +1940,14 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) }
/* these do nothing */ + inode_ts = vfs_time_to_timespec(inode->i_ctime); + iattr_ts = vfs_time_to_timespec(attr->ia_ctime); if (ia_valid & ATTR_CTIME) { bool only = (ia_valid & (ATTR_SIZE|ATTR_MTIME|ATTR_ATIME| ATTR_MODE|ATTR_UID|ATTR_GID)) == 0; dout("setattr %p ctime %ld.%ld -> %ld.%ld (%s)\n", inode, - inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, - attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, + inode_ts.tv_sec, inode_ts.tv_nsec, + iattr_ts.tv_sec, iattr_ts.tv_nsec, only ? "ctime only" : "ignored"); inode->i_ctime = attr->ia_ctime; if (only) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 348b22e..01811f9 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2791,6 +2791,7 @@ static int encode_caps_cb(struct inode *inode, struct ceph_cap *cap, struct ceph_inode_info *ci; struct ceph_reconnect_state *recon_state = arg; struct ceph_pagelist *pagelist = recon_state->pagelist; + struct timespec ts; char *path; int pathlen, err; u64 pathbase; @@ -2839,7 +2840,9 @@ static int encode_caps_cb(struct inode *inode, struct ceph_cap *cap, rec.v1.wanted = cpu_to_le32(__ceph_caps_wanted(ci)); rec.v1.issued = cpu_to_le32(cap->issued); rec.v1.size = cpu_to_le64(inode->i_size); - ceph_encode_timespec(&rec.v1.mtime, &inode->i_mtime); + ts = vfs_time_to_timespec(inode->i_mtime); + ceph_encode_timespec(&rec.v1.mtime, &ts); + ts = vfs_time_to_timespec(inode->i_atime); ceph_encode_timespec(&rec.v1.atime, &inode->i_atime); rec.v1.snaprealm = cpu_to_le64(ci->i_snap_realm->ino); rec.v1.pathbase = cpu_to_le64(pathbase); diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index 4aa7122..aea3344 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -593,9 +593,9 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci,
BUG_ON(capsnap->writing); capsnap->size = inode->i_size; - capsnap->mtime = inode->i_mtime; - capsnap->atime = inode->i_atime; - capsnap->ctime = inode->i_ctime; + capsnap->mtime = vfs_time_to_timespec(inode->i_mtime); + capsnap->atime = vfs_time_to_timespec(inode->i_atime); + capsnap->ctime = vfs_time_to_timespec(inode->i_ctime); capsnap->time_warp_seq = ci->i_time_warp_seq; if (capsnap->dirty_pages) { dout("finish_cap_snap %p cap_snap %p snapc %p %llu %s s=%llu "
This is in preparation for changing VFS inode timestamps to use 64 bit time. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time functions to access inode times.
current_fs_time() will change along with vfs timestamp data type changes.
xfs_vn_update_time() is a .update callback for inode operations and this needs to change along with vfs inode times.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_trans_inode.c | 24 ++++++++++++++---------- 2 files changed, 15 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ceba1a8..fc20e65 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -831,7 +831,7 @@ xfs_ialloc( ip->i_d.di_nextents = 0; ASSERT(ip->i_d.di_nblocks == 0);
- tv = current_fs_time(mp->m_super); + tv = vfs_time_to_timespec(current_fs_time(mp->m_super)); ip->i_d.di_mtime.t_sec = (__int32_t)tv.tv_sec; ip->i_d.di_mtime.t_nsec = (__int32_t)tv.tv_nsec; ip->i_d.di_atime = ip->i_d.di_mtime; diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c index b97f1df..e469e6a 100644 --- a/fs/xfs/xfs_trans_inode.c +++ b/fs/xfs/xfs_trans_inode.c @@ -68,24 +68,28 @@ xfs_trans_ichgtime( int flags) { struct inode *inode = VFS_I(ip); - struct timespec tv; + struct timespec now; + struct timespec ts;
ASSERT(tp); ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
- tv = current_fs_time(inode->i_sb); + now = vfs_time_to_timespec(current_fs_time(inode->i_sb));
+ ts = vfs_time_to_timespec(inode->i_mtime); if ((flags & XFS_ICHGTIME_MOD) && - !timespec_equal(&inode->i_mtime, &tv)) { - inode->i_mtime = tv; - ip->i_d.di_mtime.t_sec = tv.tv_sec; - ip->i_d.di_mtime.t_nsec = tv.tv_nsec; + !timespec_equal(&ts, &now)) { + inode->i_mtime = timespec_to_vfs_time(now); + ip->i_d.di_mtime.t_sec = now.tv_sec; + ip->i_d.di_mtime.t_nsec = now.tv_nsec; } + + ts = vfs_time_to_timespec(inode->i_ctime); if ((flags & XFS_ICHGTIME_CHG) && - !timespec_equal(&inode->i_ctime, &tv)) { - inode->i_ctime = tv; - ip->i_d.di_ctime.t_sec = tv.tv_sec; - ip->i_d.di_ctime.t_nsec = tv.tv_nsec; + !timespec_equal(&ts, &now)) { + inode->i_ctime = timespec_to_vfs_time(now); + ip->i_d.di_ctime.t_sec = now.tv_sec; + ip->i_d.di_ctime.t_nsec = now.tv_nsec; } }
On Fri, Feb 12, 2016 at 01:45:49AM -0800, Deepa Dinamani wrote:
This is in preparation for changing VFS inode timestamps to use 64 bit time. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time functions to access inode times.
current_fs_time() will change along with vfs timestamp data type changes.
xfs_vn_update_time() is a .update callback for inode operations and this needs to change along with vfs inode times.
This code is all different in the current XFS for-next branch. XFS no longer has it's own internal timestamps - it only uses the timestamps in the struct inode now.
Cheers,
Dave.
On Saturday 13 February 2016 13:18:32 Dave Chinner wrote:
On Fri, Feb 12, 2016 at 01:45:49AM -0800, Deepa Dinamani wrote:
This is in preparation for changing VFS inode timestamps to use 64 bit time. The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use vfs_time functions to access inode times.
current_fs_time() will change along with vfs timestamp data type changes.
xfs_vn_update_time() is a .update callback for inode operations and this needs to change along with vfs inode times.
This code is all different in the current XFS for-next branch. XFS no longer has it's own internal timestamps - it only uses the timestamps in the struct inode now.
Right, but as far as I can tell, this changes only contexts for variants 2a and 2c, and makes patch 2b a few lines shorter but does not impact whether any of the three approaches is workable or not.
As you had some strong opinions on v1 of the series, could you take a look at the three variants of v2 and say if you have any preferences? It would be nice to agree on one approach so we can merge that patch 1/xx as soon as possible and start merging and adding further patches for the remaining file systems so we can eventually get to the interface change.
Arnd
Introduction
The series is one of the proposals on how to transition VFS timestamps to use 64 bit time. The credit for the idea goes to Arnd Bergmann and it’s a mix of approaches 2a and 2b.
Solution
1.Define vfs_time as alias to timespec. 2.Define accessor functions for inode timestamps. 3.Change individual filesystems to use timespec64 timestamps. Make sure each filesystem will be safe when converted to use timespec64. 4.Change individual filesystems to use only vfs_time data type for inode timestamps. Make sure each filesystem will be safe when converted to use timespec64. 5.Change vfs_time to timespec64. 6.Delete all accessors from individual filesystem and access inode timestamps directly.
Note that the series only includes patches for steps 1-4.
Variant
We could make the solution similar to 2a and use timespec in the individual filesystems instead of timespec64.
Concerns
1. It is not clear when vfs_time or accessors should be used. Even though it is advisable to make the conversions at vfs boundary, it is not easy to catch the exceptions. 2. Maintenance of the code base before vfs layer is changed is going to be a little hard as developers might be confused because of 1. 3. Has the same problem as changing all filesystems to 64 bit: the switch over to timespec64 has to be done globally in a single, huge patch for some filesystems.
Deepa Dinamani (8): vfs: Add vfs_time abstractions fs: cifs: Change auxdata to struct timespec64 data type fs: cifs: Change cifs_fattr timestamps data type to timespec64 fs: cifs: Make cnvrtDosUnixTm() y2038 safe fs: btrfs: Change btrfs_inode.i_otime to vfs_time data type fs: btrfs: Use vfs timestamp abstraction helper fs: ceph: Use vfs timestamp abstraction helpers fs: xfs: change inode times to use vfs_time data type
fs/btrfs/btrfs_inode.h | 2 +- fs/btrfs/file.c | 7 ++-- fs/btrfs/inode.c | 6 +-- fs/btrfs/ioctl.c | 4 +- fs/btrfs/root-tree.c | 2 +- fs/btrfs/transaction.c | 2 +- fs/ceph/addr.c | 20 ++++++--- fs/ceph/cache.c | 6 +-- fs/ceph/caps.c | 10 ++--- fs/ceph/dir.c | 4 +- fs/ceph/file.c | 14 +++++-- fs/ceph/inode.c | 91 +++++++++++++++++++++++------------------ fs/ceph/mds_client.c | 9 ++-- fs/ceph/mds_client.h | 2 +- fs/ceph/snap.c | 6 +-- fs/ceph/super.h | 9 ++-- fs/ceph/xattr.c | 2 +- fs/cifs/cache.c | 16 +++++--- fs/cifs/cifsencrypt.c | 4 +- fs/cifs/cifsglob.h | 6 +-- fs/cifs/cifsproto.h | 6 +-- fs/cifs/cifssmb.c | 6 +-- fs/cifs/inode.c | 35 +++++++++------- fs/cifs/netmisc.c | 16 ++++---- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_iops.c | 2 +- fs/xfs/xfs_trans_inode.c | 7 ++-- include/linux/ceph/decode.h | 9 ++-- include/linux/ceph/osd_client.h | 4 +- include/linux/fs.h | 21 ++++++++++ net/ceph/osd_client.c | 4 +- 31 files changed, 199 insertions(+), 135 deletions(-)
Add vfs_time accessors to help convert vfs timestamps to use 64 bit times. These create an abstraction layer so that vfs inode times can be switched to struct timespec64 from struct timespec without breaking the individual filesystems after they have incorporated these.
Add vfs_time data type aliases to help convert vfs timestamps to use 64 bit times. These create an abstraction layer so that vfs inode times can be switched to struct timespec64 instead of struct timespec.
Use uapi exposed data types, timespec and timespec64 here to keep minimal timestamp data type conversions in API's interfacing with vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- include/linux/fs.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/include/linux/fs.h b/include/linux/fs.h index 4af612f..56e6373 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1412,6 +1412,27 @@ static inline struct timespec current_fs_time_sec(struct super_block *sb) return (struct timespec) { get_seconds(), 0 }; }
+/* Place holder defines until it is safe to use timespec64 + * in the vfs layer. + * timespec64 data type and functions will be used at that + * time directly by all filesystems and these defines will be deleted. + */ +static inline struct timespec64 vfs_time_to_timespec64(struct timespec inode_ts) +{ + return timespec_to_timespec64(inode_ts); +} + +static inline struct timespec timespec64_to_vfs_time(struct timespec64 ts) +{ + return timespec64_to_timespec(ts); +} + + +#define vfs_time timespec + +#define vfs_time_compare timespec_compare +#define vfs_time_equal timespec_equal + /* * Snapshotting support. */
Aux data timestamps are only used to read in inode timestamps for fscache.
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use accessor functions to access inode times before the vfs change.
Also convert the timestamp data types in the individual filesystems to use y2038 safe struct timespec64.
The data type change from struct timespec to struct timespec64 is safe because 1.Size of auxdata on 64 bit systems already accounts for struct timespec64. 2.Only auxdata length changes(increases) by a maximum of 128 bits (data type increase and alignment adjustments). 3.Everything else is the same size as before and possibly less big than on 64 bit systems. 4.64 bit data is aligned on a 64 bit boundary even on 32 bit systems, making the alignment requirements also same as on 64 bit systems, except for i386 where alignment requirements pack the struct tighter.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cache.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c index 6c665bf..713055b 100644 --- a/fs/cifs/cache.c +++ b/fs/cifs/cache.c @@ -221,8 +221,8 @@ const struct fscache_cookie_def cifs_fscache_super_index_def = { * Auxiliary data attached to CIFS inode within the cache */ struct cifs_fscache_inode_auxdata { - struct timespec last_write_time; - struct timespec last_change_time; + struct timespec64 last_write_time; + struct timespec64 last_change_time; u64 eof; };
@@ -259,8 +259,10 @@ cifs_fscache_inode_get_aux(const void *cookie_netfs_data, void *buffer,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + vfs_time_to_timespec64(cifsi->vfs_inode.i_mtime); + auxdata.last_change_time = + vfs_time_to_timespec64(cifsi->vfs_inode.i_ctime);
if (maxbuf > sizeof(auxdata)) maxbuf = sizeof(auxdata); @@ -283,8 +285,10 @@ fscache_checkaux cifs_fscache_inode_check_aux(void *cookie_netfs_data,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + vfs_time_to_timespec64(cifsi->vfs_inode.i_mtime); + auxdata.last_change_time = + vfs_time_to_timespec64(cifsi->vfs_inode.i_ctime);
if (memcmp(data, &auxdata, datalen) != 0) return FSCACHE_CHECKAUX_OBSOLETE;
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use timespec64 and conversion functions here to access inode times.
This change is safe because cifs_fattr is accessed in following ways:
1. Assigning to/ from/ compared with struct inode timestamps, which will match vfs_time alias. 2. Assigned from cifs_NTtimeToUnix(). And, this function does not care about the data type sizes of tv_sec and tv_nsec fields. 3. Assigned from current_fs_time(), which will match vfs_time alias. 4. Assigned from cnvrtDosUnixTm(). And, this function does not care about the data type sizes of tv_sec and tv_nsec fields.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/cifsencrypt.c | 4 ++-- fs/cifs/cifsglob.h | 6 +++--- fs/cifs/cifsproto.h | 6 +++--- fs/cifs/cifssmb.c | 6 +++--- fs/cifs/inode.c | 35 ++++++++++++++++++++--------------- fs/cifs/netmisc.c | 11 ++++++----- 6 files changed, 37 insertions(+), 31 deletions(-)
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index f86e07d..2917d1d 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -460,7 +460,7 @@ find_timestamp(struct cifs_ses *ses) unsigned char *blobptr; unsigned char *blobend; struct ntlmssp2_name *attrptr; - struct timespec ts; + struct timespec64 ts;
if (!ses->auth_key.len || !ses->auth_key.response) return 0; @@ -485,7 +485,7 @@ find_timestamp(struct cifs_ses *ses) blobptr += attrsize; /* advance attr value */ }
- ktime_get_real_ts(&ts); + ktime_get_real_ts64(&ts); return cpu_to_le64(cifs_UnixTimeToNT(ts)); }
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index a25b251..7dfb0e2 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1393,9 +1393,9 @@ struct cifs_fattr { dev_t cf_rdev; unsigned int cf_nlink; unsigned int cf_dtype; - struct timespec cf_atime; - struct timespec cf_mtime; - struct timespec cf_ctime; + struct timespec64 cf_atime; + struct timespec64 cf_mtime; + struct timespec64 cf_ctime; };
static inline void free_dfs_info_param(struct dfs_info3_param *param) diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index eed7ff5..2d78814 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -126,9 +126,9 @@ extern enum securityEnum select_sectype(struct TCP_Server_Info *server, enum securityEnum requested); extern int CIFS_SessSetup(const unsigned int xid, struct cifs_ses *ses, const struct nls_table *nls_cp); -extern struct timespec cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); -extern u64 cifs_UnixTimeToNT(struct timespec); -extern struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, +extern struct timespec64 cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); +extern u64 cifs_UnixTimeToNT(struct timespec64); +extern struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset); extern void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock); extern int cifs_get_writer(struct cifsInodeInfo *cinode); diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 1a9e43d..db268ff 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -478,8 +478,8 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr) * this requirement. */ int val, seconds, remain, result; - struct timespec ts; - unsigned long utc = get_seconds(); + struct timespec64 ts; + unsigned long utc = ktime_get_real_seconds(); ts = cnvrtDosUnixTm(rsp->SrvTime.Date, rsp->SrvTime.Time, 0); cifs_dbg(FYI, "SrvTime %d sec since 1970 (utc: %d) diff: %d\n", @@ -4000,7 +4000,7 @@ QInfRetry: if (rc) { cifs_dbg(FYI, "Send error in QueryInfo = %d\n", rc); } else if (data) { - struct timespec ts; + struct timespec64 ts; __u32 time = le32_to_cpu(pSMBr->last_write_time);
/* decode response */ diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index fa72359..1ead483 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -92,6 +92,7 @@ static void cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); + struct timespec64 mtime;
cifs_dbg(FYI, "%s: revalidating inode %llu\n", __func__, cifs_i->uniqueid); @@ -109,8 +110,9 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) return; }
+ mtime = vfs_time_to_timespec64(inode->i_mtime); /* revalidate if mtime or size have changed */ - if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) && + if (timespec64_equal(&mtime, &fattr->cf_mtime) && cifs_i->server_eof == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", __func__, cifs_i->uniqueid); @@ -159,9 +161,9 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) cifs_revalidate_cache(inode, fattr);
spin_lock(&inode->i_lock); - inode->i_atime = fattr->cf_atime; - inode->i_mtime = fattr->cf_mtime; - inode->i_ctime = fattr->cf_ctime; + inode->i_atime = timespec64_to_vfs_time(fattr->cf_atime); + inode->i_mtime = timespec64_to_vfs_time(fattr->cf_mtime); + inode->i_ctime = timespec64_to_vfs_time(fattr->cf_ctime); inode->i_rdev = fattr->cf_rdev; cifs_nlink_fattr_to_inode(inode, fattr); inode->i_uid = fattr->cf_uid; @@ -321,7 +323,7 @@ cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct super_block *sb) fattr->cf_uid = cifs_sb->mnt_uid; fattr->cf_gid = cifs_sb->mnt_gid; fattr->cf_atime = fattr->cf_ctime = - fattr->cf_mtime = current_fs_time(sb); + fattr->cf_mtime = vfs_time_to_timespec64(current_fs_time(sb)); fattr->cf_nlink = 2; fattr->cf_flags |= CIFS_FATTR_DFS_REFERRAL; } @@ -597,7 +599,7 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info, if (info->LastAccessTime) fattr->cf_atime = cifs_NTtimeToUnix(info->LastAccessTime); else - fattr->cf_atime = current_fs_time(sb); + fattr->cf_atime = vfs_time_to_timespec64(current_fs_time(sb));
fattr->cf_ctime = cifs_NTtimeToUnix(info->ChangeTime); fattr->cf_mtime = cifs_NTtimeToUnix(info->LastWriteTime); @@ -1066,15 +1068,15 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, unsigned int xid,
if (attrs->ia_valid & ATTR_ATIME) { set_time = true; - info_buf.LastAccessTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_atime)); + info_buf.LastAccessTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec64(attrs->ia_atime))); } else info_buf.LastAccessTime = 0;
if (attrs->ia_valid & ATTR_MTIME) { set_time = true; - info_buf.LastWriteTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_mtime)); + info_buf.LastWriteTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec64(attrs->ia_mtime))); } else info_buf.LastWriteTime = 0;
@@ -1086,8 +1088,8 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, unsigned int xid, */ if (set_time && (attrs->ia_valid & ATTR_CTIME)) { cifs_dbg(FYI, "CIFS - CTIME changed\n"); - info_buf.ChangeTime = - cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_ctime)); + info_buf.ChangeTime = cpu_to_le64( + cifs_UnixTimeToNT(vfs_time_to_timespec64(attrs->ia_ctime))); } else info_buf.ChangeTime = 0;
@@ -2193,17 +2195,20 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs) args->gid = INVALID_GID; /* no change */
if (attrs->ia_valid & ATTR_ATIME) - args->atime = cifs_UnixTimeToNT(attrs->ia_atime); + args->atime = cifs_UnixTimeToNT( + vfs_time_to_timespec64(attrs->ia_atime)); else args->atime = NO_CHANGE_64;
if (attrs->ia_valid & ATTR_MTIME) - args->mtime = cifs_UnixTimeToNT(attrs->ia_mtime); + args->mtime = cifs_UnixTimeToNT( + vfs_time_to_timespec64(attrs->ia_mtime)); else args->mtime = NO_CHANGE_64;
if (attrs->ia_valid & ATTR_CTIME) - args->ctime = cifs_UnixTimeToNT(attrs->ia_ctime); + args->ctime = cifs_UnixTimeToNT( + vfs_time_to_timespec64(attrs->ia_ctime)); else args->ctime = NO_CHANGE_64;
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index abae6dd..bf1b52e 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -918,10 +918,11 @@ smbCalcSize(void *buf) * Convert the NT UTC (based 1601-01-01, in hundred nanosecond units) * into Unix UTC (based 1970-01-01, in seconds). */ -struct timespec +struct timespec64 cifs_NTtimeToUnix(__le64 ntutc) { - struct timespec ts; + struct timespec64 ts; + /* BB what about the timezone? BB */
/* Subtract the NTFS time offset, then convert to 1s intervals. */ @@ -949,7 +950,7 @@ cifs_NTtimeToUnix(__le64 ntutc)
/* Convert the Unix UTC into NT UTC. */ u64 -cifs_UnixTimeToNT(struct timespec t) +cifs_UnixTimeToNT(struct timespec64 t) { /* Convert to 100ns intervals and then add the NTFS time offset. */ return (u64) t.tv_sec * 10000000 + t.tv_nsec/100 + NTFS_TIME_OFFSET; @@ -959,9 +960,9 @@ static const int total_days_of_prev_months[] = { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
-struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) +struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { - struct timespec ts; + struct timespec64 ts; int sec, min, days, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time);
The seconds calculated from the server(DOS format) cannot be saved in int data type as this cannot save seconds since epoch after the year 2038.
Use long long for seconds field in cnvrtDosUnixTm(). This will help represent 64 bit time even on 32 bit systems.
Note that even though the theoretical max on DOS times is 2107, its api's only support until the year 2099. This means we can get away with 32 bit unsigned sec field. But, the sec field uses long long to maintain uniformity in the kernel, where everyone uses the theoretical max.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/cifs/netmisc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index bf1b52e..ecbb7162 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -963,7 +963,8 @@ static const int total_days_of_prev_months[] = { struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { struct timespec64 ts; - int sec, min, days, month, year; + long long sec; + int min, days, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time); SMB_TIME *st = (SMB_TIME *)&time; @@ -974,7 +975,7 @@ struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) sec = 2 * st->TwoSeconds; min = st->Minutes; if ((sec > 59) || (min > 59)) - cifs_dbg(VFS, "illegal time min %d sec %d\n", min, sec); + cifs_dbg(VFS, "illegal time min %d sec %lld\n", min, sec); sec += (min * 60); sec += 60 * 60 * st->Hours; if (st->Hours > 24)
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use timespec64 and conversion functions here to access inode times.
struct btrfs_inode is the in memory inode structure for btrfs. i_otime is a member of the btrfs_inode that represents file creation times. Use struct timespec64 to represent this timestamp.
Like all the other inode timestamps in struct inode, i_otime is assigned to/ from disk or struct inode times or 0.
Hence, i_otime needs to use accessor functions to access inode timestamps.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/btrfs_inode.h | 2 +- fs/btrfs/inode.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 61205e3..5200e68 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -190,7 +190,7 @@ struct btrfs_inode { struct btrfs_delayed_node *delayed_node;
/* File creation time. */ - struct timespec i_otime; + struct timespec64 i_otime;
/* Hook into fs_info->delayed_iputs */ struct list_head delayed_iput; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 59c0e22..bcd223c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5592,7 +5592,7 @@ static struct inode *new_simple_dir(struct super_block *s, inode->i_mtime = current_fs_time(inode->i_sb); inode->i_atime = inode->i_mtime; inode->i_ctime = inode->i_mtime; - BTRFS_I(inode)->i_otime = inode->i_mtime; + BTRFS_I(inode)->i_otime = vfs_time_to_timespec64(inode->i_mtime);
return inode; } @@ -6164,7 +6164,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans, inode->i_mtime = current_fs_time(inode->i_sb); inode->i_atime = inode->i_mtime; inode->i_ctime = inode->i_mtime; - BTRFS_I(inode)->i_otime = inode->i_mtime; + BTRFS_I(inode)->i_otime = vfs_time_to_timespec64(inode->i_mtime);
inode_item = btrfs_item_ptr(path->nodes[0], path->slots[0], struct btrfs_inode_item);
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use timespec64 and conversion functions here to access inode times.
The following needs to switch along with vfs time representation.
1. inode times set/get. 2. For inode times comparison. 3. getting times from current_fs_time(). 4. btrfs_timespec already uses 64 bits to represent seconds in timestamps as 64 bits in btrfs_timespec. 5. btrfs_update_time() is a inode_ops callback.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/btrfs/file.c | 7 ++++--- fs/btrfs/inode.c | 2 +- fs/btrfs/ioctl.c | 4 ++-- fs/btrfs/root-tree.c | 2 +- fs/btrfs/transaction.c | 2 +- 5 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 610f569..a5fb13c 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1732,16 +1732,17 @@ out:
static void update_time_for_write(struct inode *inode) { - struct timespec now; + struct vfs_time now;
if (IS_NOCMTIME(inode)) return;
now = current_fs_time(inode->i_sb); - if (!timespec_equal(&inode->i_mtime, &now)) + + if (!vfs_time_equal(&inode->i_mtime, &now)) inode->i_mtime = now;
- if (!timespec_equal(&inode->i_ctime, &now)) + if (!vfs_time_equal(&inode->i_ctime, &now)) inode->i_ctime = now;
if (IS_I_VERSION(inode)) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index bcd223c..860e5e6 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5942,7 +5942,7 @@ static int btrfs_dirty_inode(struct inode *inode) * This is a copy of file_update_time. We need this so we can return error on * ENOSPC for updating the inode in the case of file write and mmap writes. */ -static int btrfs_update_time(struct inode *inode, struct timespec *now, +static int btrfs_update_time(struct inode *inode, struct vfs_time *now, int flags) { struct btrfs_root *root = BTRFS_I(inode)->root; diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 6f35d9c..471037f 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -443,7 +443,7 @@ static noinline int create_subvol(struct inode *dir, struct btrfs_root *root = BTRFS_I(dir)->root; struct btrfs_root *new_root; struct btrfs_block_rsv block_rsv; - struct timespec cur_time = current_fs_time(dir->i_sb); + struct vfs_time cur_time = current_fs_time(dir->i_sb); struct inode *inode; int ret; int err; @@ -4956,7 +4956,7 @@ static long _btrfs_ioctl_set_received_subvol(struct file *file, struct btrfs_root *root = BTRFS_I(inode)->root; struct btrfs_root_item *root_item = &root->root_item; struct btrfs_trans_handle *trans; - struct timespec ct = current_fs_time(inode->i_sb); + struct vfs_time ct = current_fs_time(inode->i_sb); int ret = 0; int received_uuid_changed;
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c index a25f3b2..0a309f6b 100644 --- a/fs/btrfs/root-tree.c +++ b/fs/btrfs/root-tree.c @@ -488,7 +488,7 @@ void btrfs_update_root_times(struct btrfs_trans_handle *trans, struct btrfs_root *root) { struct btrfs_root_item *item = &root->root_item; - struct timespec ct = current_fs_time(root->fs_info->sb); + struct vfs_time ct = current_fs_time(root->fs_info->sb);
spin_lock(&root->root_item_lock); btrfs_set_root_ctransid(item, trans->transid); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 37562d6..5481ee0 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1333,7 +1333,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, struct dentry *dentry; struct extent_buffer *tmp; struct extent_buffer *old; - struct timespec cur_time; + struct vfs_time cur_time; int ret = 0; u64 to_reserve = 0; u64 index = 0;
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use timespec64 and conversion functions here to access inode times.
All the timestamps are converted to use struct timespec64 data type. And, all the vfs timestamps are converted to timespec64 at the boundary of vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/ceph/addr.c | 20 ++++++--- fs/ceph/cache.c | 6 +-- fs/ceph/caps.c | 10 ++--- fs/ceph/dir.c | 4 +- fs/ceph/file.c | 14 +++++-- fs/ceph/inode.c | 91 +++++++++++++++++++++++------------------ fs/ceph/mds_client.c | 9 ++-- fs/ceph/mds_client.h | 2 +- fs/ceph/snap.c | 6 +-- fs/ceph/super.h | 9 ++-- fs/ceph/xattr.c | 2 +- include/linux/ceph/decode.h | 9 ++-- include/linux/ceph/osd_client.h | 4 +- net/ceph/osd_client.c | 4 +- 14 files changed, 110 insertions(+), 80 deletions(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index c222137..09a97b5 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -479,6 +479,7 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) struct ceph_fs_client *fsc; struct ceph_osd_client *osdc; struct ceph_snap_context *snapc, *oldest; + struct timespec64 ts; loff_t page_off = page_offset(page); loff_t snap_size = -1; long writeback_stat; @@ -540,11 +541,12 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc) ceph_readpage_to_fscache(inode, page);
set_page_writeback(page); + ts = vfs_time_to_timespec64(inode->i_mtime); err = ceph_osdc_writepages(osdc, ceph_vino(inode), &ci->i_layout, snapc, page_off, len, truncate_seq, truncate_size, - &inode->i_mtime, &page, 1); + &ts, &page, 1); if (err < 0) { dout("writepage setting page/mapping error %d %p\n", err, page); SetPageError(page); @@ -699,6 +701,7 @@ static int ceph_writepages_start(struct address_space *mapping, int rc = 0; unsigned wsize = 1 << inode->i_blkbits; struct ceph_osd_request *req = NULL; + struct timespec64 ts; int do_sync = 0; loff_t snap_size, i_size; u64 truncate_size; @@ -978,8 +981,9 @@ get_more_pages: osd_req_op_extent_update(req, 0, len);
vino = ceph_vino(inode); + ts = vfs_time_to_timespec64(inode->i_mtime); ceph_osdc_build_request(req, offset, snapc, vino.snap, - &inode->i_mtime); + &ts);
rc = ceph_osdc_start_request(&fsc->client->osdc, req, true); BUG_ON(rc); @@ -1465,6 +1469,7 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_request *req; struct page *page = NULL; + struct timespec64 ts; u64 len, inline_version; int err = 0; bool from_pagecache = false; @@ -1528,7 +1533,8 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) goto out; }
- ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &inode->i_mtime); + ts = vfs_time_to_timespec64(inode->i_mtime); + ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &ts); err = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!err) err = ceph_osdc_wait_request(&fsc->client->osdc, req); @@ -1572,7 +1578,7 @@ int ceph_uninline_data(struct file *filp, struct page *locked_page) goto out_put; }
- ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &inode->i_mtime); + ceph_osdc_build_request(req, 0, NULL, CEPH_NOSNAP, &ts); err = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!err) err = ceph_osdc_wait_request(&fsc->client->osdc, req); @@ -1622,6 +1628,7 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, u32 pool) struct ceph_osd_request *rd_req = NULL, *wr_req = NULL; struct rb_node **p, *parent; struct ceph_pool_perm *perm; + struct timespec64 ts; struct page **pages; int err = 0, err2 = 0, have = 0;
@@ -1701,12 +1708,13 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, u32 pool)
osd_req_op_raw_data_in_pages(rd_req, 0, pages, PAGE_SIZE, 0, false, true); + ts = vfs_time_to_timespec64(ci->vfs_inode.i_mtime); ceph_osdc_build_request(rd_req, 0, NULL, CEPH_NOSNAP, - &ci->vfs_inode.i_mtime); + &ts); err = ceph_osdc_start_request(&fsc->client->osdc, rd_req, false);
ceph_osdc_build_request(wr_req, 0, NULL, CEPH_NOSNAP, - &ci->vfs_inode.i_mtime); + &ts); err2 = ceph_osdc_start_request(&fsc->client->osdc, wr_req, false);
if (!err) diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c index a351480..a773543 100644 --- a/fs/ceph/cache.c +++ b/fs/ceph/cache.c @@ -25,7 +25,7 @@ #include "cache.h"
struct ceph_aux_inode { - struct timespec mtime; + struct timespec64 mtime; loff_t size; };
@@ -105,7 +105,7 @@ static uint16_t ceph_fscache_inode_get_aux(const void *cookie_netfs_data, const struct inode* inode = &ci->vfs_inode;
memset(&aux, 0, sizeof(aux)); - aux.mtime = inode->i_mtime; + aux.mtime = vfs_time_to_timespec64(inode->i_mtime); aux.size = i_size_read(inode);
memcpy(buffer, &aux, sizeof(aux)); @@ -131,7 +131,7 @@ static enum fscache_checkaux ceph_fscache_inode_check_aux( return FSCACHE_CHECKAUX_OBSOLETE;
memset(&aux, 0, sizeof(aux)); - aux.mtime = inode->i_mtime; + aux.mtime = vfs_time_to_timespec64(inode->i_mtime); aux.size = i_size_read(inode);
if (memcmp(data, &aux, sizeof(aux)) != 0) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cdbf8cf..ab186a1 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -990,7 +990,7 @@ static int send_cap_msg(struct ceph_mds_session *session, int caps, int wanted, int dirty, u32 seq, u64 flush_tid, u64 oldest_flush_tid, u32 issue_seq, u32 mseq, u64 size, u64 max_size, - struct timespec *mtime, struct timespec *atime, + struct timespec64 *mtime, struct timespec64 *atime, u64 time_warp_seq, kuid_t uid, kgid_t gid, umode_t mode, u64 xattr_version, @@ -1116,7 +1116,7 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap, int held, revoking, dropping, keep; u64 seq, issue_seq, mseq, time_warp_seq, follows; u64 size, max_size; - struct timespec mtime, atime; + struct timespec64 mtime, atime; int wake = 0; umode_t mode; kuid_t uid; @@ -1178,8 +1178,8 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap, ci->i_reported_size = size; max_size = ci->i_wanted_max_size; ci->i_requested_max_size = max_size; - mtime = inode->i_mtime; - atime = inode->i_atime; + mtime = vfs_time_to_timespec64(inode->i_mtime); + atime = vfs_time_to_timespec64(inode->i_atime); time_warp_seq = ci->i_time_warp_seq; uid = inode->i_uid; gid = inode->i_gid; @@ -2764,7 +2764,7 @@ static void handle_cap_grant(struct ceph_mds_client *mdsc, int used, wanted, dirty; u64 size = le64_to_cpu(grant->size); u64 max_size = le64_to_cpu(grant->max_size); - struct timespec mtime, atime, ctime; + struct timespec64 mtime, atime, ctime; int check_caps = 0; bool wake = false; bool writeback = false; diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index fd11fb2..83e6602 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1210,7 +1210,7 @@ static ssize_t ceph_read_dir(struct file *file, char __user *buf, size_t size, " rfiles: %20lld\n" " rsubdirs: %20lld\n" "rbytes: %20lld\n" - "rctime: %10ld.%09ld\n", + "rctime: %10lld.%09ld\n", ci->i_files + ci->i_subdirs, ci->i_files, ci->i_subdirs, @@ -1218,7 +1218,7 @@ static ssize_t ceph_read_dir(struct file *file, char __user *buf, size_t size, ci->i_rfiles, ci->i_rsubdirs, ci->i_rbytes, - (long)ci->i_rctime.tv_sec, + (long long)ci->i_rctime.tv_sec, (long)ci->i_rctime.tv_nsec); }
diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 9b338ff..ec77278 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -555,7 +555,7 @@ struct ceph_aio_request { struct list_head osd_reqs; unsigned num_reqs; atomic_t pending_reqs; - struct timespec mtime; + struct timespec64 mtime; struct ceph_cap_flush *prealloc_cf; };
@@ -783,11 +783,13 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, int num_pages = 0; int flags; int ret; - struct timespec mtime = current_fs_time(inode->i_sb); + struct timespec64 mtime; size_t count = iov_iter_count(iter); loff_t pos = iocb->ki_pos; bool write = iov_iter_rw(iter) == WRITE;
+ mtime = vfs_time_to_timespec64(current_fs_time(inode->i_sb)); + if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS;
@@ -988,9 +990,11 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, int flags; int check_caps = 0; int ret; - struct timespec mtime = current_fs_time(inode->i_sb); + struct timespec64 mtime; size_t count = iov_iter_count(from);
+ mtime = vfs_time_to_timespec64(current_fs_time(inode->i_sb)); + if (ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS;
@@ -1497,6 +1501,7 @@ static int ceph_zero_partial_object(struct inode *inode, struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_request *req; + struct timespec64 ts; int ret = 0; loff_t zero = 0; int op; @@ -1520,8 +1525,9 @@ static int ceph_zero_partial_object(struct inode *inode, goto out; }
+ ts = vfs_time_to_timespec64(inode->i_mtime); ceph_osdc_build_request(req, offset, NULL, ceph_vino(inode).snap, - &inode->i_mtime); + &ts);
ret = ceph_osdc_start_request(&fsc->client->osdc, req, false); if (!ret) { diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 63d0198..ecfdbe5 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -590,10 +590,11 @@ int ceph_fill_file_size(struct inode *inode, int issued, }
void ceph_fill_file_time(struct inode *inode, int issued, - u64 time_warp_seq, struct timespec *ctime, - struct timespec *mtime, struct timespec *atime) + u64 time_warp_seq, struct timespec64 *ctime, + struct timespec64 *mtime, struct timespec64 *atime) { struct ceph_inode_info *ci = ceph_inode(inode); + struct vfs_time vfs_ts; int warn = 0;
if (issued & (CEPH_CAP_FILE_EXCL| @@ -601,38 +602,42 @@ void ceph_fill_file_time(struct inode *inode, int issued, CEPH_CAP_FILE_BUFFER| CEPH_CAP_AUTH_EXCL| CEPH_CAP_XATTR_EXCL)) { - if (timespec_compare(ctime, &inode->i_ctime) > 0) { - dout("ctime %ld.%09ld -> %ld.%09ld inc w/ cap\n", - inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, + vfs_ts = timespec64_to_vfs_time(*ctime); + if (vfs_time_compare(&vfs_ts, &inode->i_ctime) > 0) { + dout("ctime %lld.%09ld -> %lld.%09ld inc w/ cap\n", + (long long)inode->i_ctime.tv_sec, + inode->i_ctime.tv_nsec, ctime->tv_sec, ctime->tv_nsec); - inode->i_ctime = *ctime; + inode->i_ctime = vfs_ts; } if (ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) > 0) { /* the MDS did a utimes() */ - dout("mtime %ld.%09ld -> %ld.%09ld " + dout("mtime %lld.%09ld -> %lld.%09ld " "tw %d -> %d\n", - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, + (long long)inode->i_mtime.tv_sec, + inode->i_mtime.tv_nsec, mtime->tv_sec, mtime->tv_nsec, ci->i_time_warp_seq, (int)time_warp_seq); - - inode->i_mtime = *mtime; - inode->i_atime = *atime; + inode->i_mtime = timespec64_to_vfs_time(*mtime); + inode->i_atime = timespec64_to_vfs_time(*atime); ci->i_time_warp_seq = time_warp_seq; } else if (time_warp_seq == ci->i_time_warp_seq) { /* nobody did utimes(); take the max */ - if (timespec_compare(mtime, &inode->i_mtime) > 0) { - dout("mtime %ld.%09ld -> %ld.%09ld inc\n", - inode->i_mtime.tv_sec, - inode->i_mtime.tv_nsec, - mtime->tv_sec, mtime->tv_nsec); - inode->i_mtime = *mtime; + vfs_ts = timespec64_to_vfs_time(*mtime); + if (vfs_time_compare(&vfs_ts, &inode->i_mtime) > 0) { + dout("mtime %lld.%09ld -> %lld.%09ld inc\n", + (long long)inode->i_mtime.tv_sec, + inode->i_mtime.tv_nsec, + mtime->tv_sec, mtime->tv_nsec); + inode->i_mtime = vfs_ts; } - if (timespec_compare(atime, &inode->i_atime) > 0) { - dout("atime %ld.%09ld -> %ld.%09ld inc\n", - inode->i_atime.tv_sec, + vfs_ts = timespec64_to_vfs_time(*atime); + if (vfs_time_compare(&vfs_ts, &inode->i_atime) > 0) { + dout("atime %lld.%09ld -> %lld.%09ld inc\n", + (long long)inode->i_atime.tv_sec, inode->i_atime.tv_nsec, atime->tv_sec, atime->tv_nsec); - inode->i_atime = *atime; + inode->i_atime = vfs_ts; } } else if (issued & CEPH_CAP_FILE_EXCL) { /* we did a utimes(); ignore mds values */ @@ -642,9 +647,9 @@ void ceph_fill_file_time(struct inode *inode, int issued, } else { /* we have no write|excl caps; whatever the MDS says is true */ if (ceph_seq_cmp(time_warp_seq, ci->i_time_warp_seq) >= 0) { - inode->i_ctime = *ctime; - inode->i_mtime = *mtime; - inode->i_atime = *atime; + inode->i_ctime = timespec64_to_vfs_time(*ctime); + inode->i_mtime = timespec64_to_vfs_time(*mtime); + inode->i_atime = timespec64_to_vfs_time(*atime); ci->i_time_warp_seq = time_warp_seq; } else { warn = 1; @@ -670,7 +675,7 @@ static int fill_inode(struct inode *inode, struct page *locked_page, struct ceph_mds_reply_inode *info = iinfo->in; struct ceph_inode_info *ci = ceph_inode(inode); int issued = 0, implemented, new_issued; - struct timespec mtime, atime, ctime; + struct timespec64 mtime, atime, ctime; struct ceph_buffer *xattr_blob = NULL; struct ceph_cap *new_cap = NULL; int err = 0; @@ -1776,6 +1781,7 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) struct ceph_mds_request *req; struct ceph_mds_client *mdsc = ceph_sb_to_client(dentry->d_sb)->mdsc; struct ceph_cap_flush *prealloc_cf; + struct timespec64 iattr_ts; int issued; int release = 0, dirtied = 0; int mask = 0; @@ -1863,44 +1869,46 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) }
if (ia_valid & ATTR_ATIME) { - dout("setattr %p atime %ld.%ld -> %ld.%ld\n", inode, - inode->i_atime.tv_sec, inode->i_atime.tv_nsec, - attr->ia_atime.tv_sec, attr->ia_atime.tv_nsec); + dout("setattr %p atime %lld.%ld -> %lld.%ld\n", inode, + (long long)inode->i_mtime.tv_sec, + inode->i_mtime.tv_nsec, + (long long)attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_atime, + vfs_time_compare(&inode->i_atime, &attr->ia_atime) < 0) { inode->i_atime = attr->ia_atime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_atime, &attr->ia_atime)) { + !vfs_time_equal(&inode->i_atime, &attr->ia_atime)) { + iattr_ts = vfs_time_to_timespec64(attr->ia_atime); ceph_encode_timespec(&req->r_args.setattr.atime, - &attr->ia_atime); + &iattr_ts); mask |= CEPH_SETATTR_ATIME; release |= CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; } } if (ia_valid & ATTR_MTIME) { - dout("setattr %p mtime %ld.%ld -> %ld.%ld\n", inode, - inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, - attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); + dout("setattr %p mtime %lld.%ld -> %lld.%ld\n", inode, + (long long)inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec, + (long long)attr->ia_mtime.tv_sec, attr->ia_mtime.tv_nsec); if (issued & CEPH_CAP_FILE_EXCL) { ci->i_time_warp_seq++; inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_EXCL; } else if ((issued & CEPH_CAP_FILE_WR) && - timespec_compare(&inode->i_mtime, + vfs_time_compare(&inode->i_mtime, &attr->ia_mtime) < 0) { inode->i_mtime = attr->ia_mtime; dirtied |= CEPH_CAP_FILE_WR; } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || - !timespec_equal(&inode->i_mtime, &attr->ia_mtime)) { + !vfs_time_equal(&inode->i_mtime, &attr->ia_mtime)) { ceph_encode_timespec(&req->r_args.setattr.mtime, - &attr->ia_mtime); + &iattr_ts); mask |= CEPH_SETATTR_MTIME; release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; @@ -1932,9 +1940,12 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr) if (ia_valid & ATTR_CTIME) { bool only = (ia_valid & (ATTR_SIZE|ATTR_MTIME|ATTR_ATIME| ATTR_MODE|ATTR_UID|ATTR_GID)) == 0; - dout("setattr %p ctime %ld.%ld -> %ld.%ld (%s)\n", inode, - inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec, - attr->ia_ctime.tv_sec, attr->ia_ctime.tv_nsec, + dout("setattr %p ctime %lld.%ld -> %lld.%ld (%s)\n", inode, + (long long)inode->i_ctime.tv_sec, + inode->i_ctime.tv_nsec, + (long long)attr->ia_ctime.tv_sec, + attr->ia_ctime.tv_nsec, + only ? "ctime only" : "ignored"); inode->i_ctime = attr->ia_ctime; if (only) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 348b22e..bdf77065 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1721,7 +1721,7 @@ ceph_mdsc_create_request(struct ceph_mds_client *mdsc, int op, int mode) init_completion(&req->r_safe_completion); INIT_LIST_HEAD(&req->r_unsafe_item);
- ktime_get_real_ts(&req->r_stamp); + getnstimeofday64(&req->r_stamp);
req->r_op = op; req->r_direct_mode = mode; @@ -2791,6 +2791,7 @@ static int encode_caps_cb(struct inode *inode, struct ceph_cap *cap, struct ceph_inode_info *ci; struct ceph_reconnect_state *recon_state = arg; struct ceph_pagelist *pagelist = recon_state->pagelist; + struct timespec64 ts; char *path; int pathlen, err; u64 pathbase; @@ -2839,8 +2840,10 @@ static int encode_caps_cb(struct inode *inode, struct ceph_cap *cap, rec.v1.wanted = cpu_to_le32(__ceph_caps_wanted(ci)); rec.v1.issued = cpu_to_le32(cap->issued); rec.v1.size = cpu_to_le64(inode->i_size); - ceph_encode_timespec(&rec.v1.mtime, &inode->i_mtime); - ceph_encode_timespec(&rec.v1.atime, &inode->i_atime); + ts = vfs_time_to_timespec64(inode->i_mtime); + ceph_encode_timespec(&rec.v1.mtime, &ts); + ts = vfs_time_to_timespec64(inode->i_atime); + ceph_encode_timespec(&rec.v1.atime, &ts); rec.v1.snaprealm = cpu_to_le64(ci->i_snap_realm->ino); rec.v1.pathbase = cpu_to_le64(pathbase); reclen = sizeof(rec.v1); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index ccf11ef..6fb8568 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -202,7 +202,7 @@ struct ceph_mds_request { int r_fmode; /* file mode, if expecting cap */ kuid_t r_uid; kgid_t r_gid; - struct timespec r_stamp; + struct timespec64 r_stamp;
/* for choosing which mds to send this request to */ int r_direct_mode; diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index 4aa7122..312cdc2 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -593,9 +593,9 @@ int __ceph_finish_cap_snap(struct ceph_inode_info *ci,
BUG_ON(capsnap->writing); capsnap->size = inode->i_size; - capsnap->mtime = inode->i_mtime; - capsnap->atime = inode->i_atime; - capsnap->ctime = inode->i_ctime; + capsnap->mtime = vfs_time_to_timespec64(inode->i_mtime); + capsnap->atime = vfs_time_to_timespec64(inode->i_atime); + capsnap->ctime = vfs_time_to_timespec64(inode->i_ctime); capsnap->time_warp_seq = ci->i_time_warp_seq; if (capsnap->dirty_pages) { dout("finish_cap_snap %p cap_snap %p snapc %p %llu %s s=%llu " diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 75b7d12..19c31ba 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -169,7 +169,7 @@ struct ceph_cap_snap { u64 xattr_version;
u64 size; - struct timespec mtime, atime, ctime; + struct timespec64 mtime, atime, ctime; u64 time_warp_seq; int writing; /* a sync write is still in progress */ int dirty_pages; /* dirty pages awaiting writeback */ @@ -290,7 +290,7 @@ struct ceph_inode_info { char *i_symlink;
/* for dirs */ - struct timespec i_rctime; + struct timespec64 i_rctime; u64 i_rbytes, i_rfiles, i_rsubdirs; u64 i_files, i_subdirs;
@@ -764,8 +764,9 @@ extern struct inode *ceph_get_snapdir(struct inode *parent); extern int ceph_fill_file_size(struct inode *inode, int issued, u32 truncate_seq, u64 truncate_size, u64 size); extern void ceph_fill_file_time(struct inode *inode, int issued, - u64 time_warp_seq, struct timespec *ctime, - struct timespec *mtime, struct timespec *atime); + u64 time_warp_seq, struct timespec64 *ctime, + struct timespec64 *mtime, + struct timespec64 *atime); extern int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req, struct ceph_mds_session *session); diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 1e1c00a..db5328d 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -198,7 +198,7 @@ static size_t ceph_vxattrcb_dir_rbytes(struct ceph_inode_info *ci, char *val, static size_t ceph_vxattrcb_dir_rctime(struct ceph_inode_info *ci, char *val, size_t size) { - return snprintf(val, size, "%ld.09%ld", (long)ci->i_rctime.tv_sec, + return snprintf(val, size, "%lld.09%ld", (long long)ci->i_rctime.tv_sec, (long)ci->i_rctime.tv_nsec); }
diff --git a/include/linux/ceph/decode.h b/include/linux/ceph/decode.h index 633a130..b68edf5 100644 --- a/include/linux/ceph/decode.h +++ b/include/linux/ceph/decode.h @@ -4,6 +4,7 @@ #include <linux/err.h> #include <linux/bug.h> #include <linux/time.h> +#include <linux/fs.h> #include <asm/unaligned.h>
#include <linux/ceph/types.h> @@ -132,16 +133,16 @@ bad: }
/* - * struct ceph_timespec <-> struct timespec + * struct ceph_timespec <-> struct timespec64 */ -static inline void ceph_decode_timespec(struct timespec *ts, +static inline void ceph_decode_timespec(struct timespec64 *ts, const struct ceph_timespec *tv) { - ts->tv_sec = (__kernel_time_t)le32_to_cpu(tv->tv_sec); + ts->tv_sec = (s64)le32_to_cpu(tv->tv_sec); ts->tv_nsec = (long)le32_to_cpu(tv->tv_nsec); } static inline void ceph_encode_timespec(struct ceph_timespec *tv, - const struct timespec *ts) + const struct timespec64 *ts) { tv->tv_sec = cpu_to_le32((u32)ts->tv_sec); tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec); diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 7506b48..f4fdd97 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -322,7 +322,7 @@ extern struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client * extern void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id, - struct timespec *mtime); + struct timespec64 *mtime);
extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *, struct ceph_file_layout *layout, @@ -364,7 +364,7 @@ extern int ceph_osdc_writepages(struct ceph_osd_client *osdc, struct ceph_snap_context *sc, u64 off, u64 len, u32 truncate_seq, u64 truncate_size, - struct timespec *mtime, + struct timespec64 *mtime, struct page **pages, int nr_pages);
/* watch/notify events */ diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index f8f2359..af1692a 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2401,7 +2401,7 @@ bad: */ void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off, struct ceph_snap_context *snapc, u64 snap_id, - struct timespec *mtime) + struct timespec64 *mtime) { struct ceph_msg *msg = req->r_request; void *p; @@ -2735,7 +2735,7 @@ int ceph_osdc_writepages(struct ceph_osd_client *osdc, struct ceph_vino vino, struct ceph_snap_context *snapc, u64 off, u64 len, u32 truncate_seq, u64 truncate_size, - struct timespec *mtime, + struct timespec64 *mtime, struct page **pages, int num_pages) { struct ceph_osd_request *req;
The VFS inode timestamps are not y2038 safe as they use struct timespec. These will be changed to use struct timespec64 instead and that is y2038 safe. But, since the above data type conversion will break the end file systems, use timespec64 and conversion functions here to access inode times.
All the timestamps are converted to use struct timespec64 data type. And, all the vfs timestamps are converted to timespec64 at the boundary of vfs.
Signed-off-by: Deepa Dinamani deepa.kernel@gmail.com --- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_iops.c | 2 +- fs/xfs/xfs_trans_inode.c | 7 ++++--- 3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ceba1a8..ebf76a3 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -765,7 +765,7 @@ xfs_ialloc( xfs_inode_t *ip; uint flags; int error; - struct timespec tv; + struct vfs_time tv;
/* * Call the space management code to pick diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 76b71a1..7f8a897 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -973,7 +973,7 @@ xfs_vn_setattr( STATIC int xfs_vn_update_time( struct inode *inode, - struct timespec *now, + struct vfs_time *now, int flags) { struct xfs_inode *ip = XFS_I(inode); diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c index b97f1df..1242502 100644 --- a/fs/xfs/xfs_trans_inode.c +++ b/fs/xfs/xfs_trans_inode.c @@ -68,7 +68,7 @@ xfs_trans_ichgtime( int flags) { struct inode *inode = VFS_I(ip); - struct timespec tv; + struct vfs_time tv;
ASSERT(tp); ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); @@ -76,13 +76,14 @@ xfs_trans_ichgtime( tv = current_fs_time(inode->i_sb);
if ((flags & XFS_ICHGTIME_MOD) && - !timespec_equal(&inode->i_mtime, &tv)) { + !vfs_time_equal(&inode->i_mtime, &tv)) { inode->i_mtime = tv; ip->i_d.di_mtime.t_sec = tv.tv_sec; ip->i_d.di_mtime.t_nsec = tv.tv_nsec; } + if ((flags & XFS_ICHGTIME_CHG) && - !timespec_equal(&inode->i_ctime, &tv)) { + !vfs_time_equal(&inode->i_ctime, &tv)) { inode->i_ctime = tv; ip->i_d.di_ctime.t_sec = tv.tv_sec; ip->i_d.di_ctime.t_nsec = tv.tv_nsec;
On Friday 12 February 2016 01:21:59 Deepa Dinamani wrote:
Introduction
This is a follow on to the series: https://lkml.org/lkml/2016/1/7/20 [1]. This is aimed at reaching a consensus on how to transition the vfs timestamps to use 64 bit time. This demonstrates three ways (2a, 2b and 2c) of solving this problem. Each of the proposals has its own cover letter that explains the individual approach. Proposals 2b and 2c also outline variant approaches which are similar to the respective proposals. This drives the proposal count to 5. All the changes have been discussed with Arnd Bergmann, who posted the original series: https://lkml.org/lkml/2014/5/30/669 [2]
The series has been simplified to include only the 64 bit timestamp changes as per Dave Chinner’s suggestion.
Motivation
The problem is how to change the vfs inode timestamps to use 64 bit times to overcome the 2038 problem.
Below table [3] gives an overview of the extent/ type of changes needed of changes needed. The series is aimed at obtaining small manageable patches for all the cases in [3].
Hi Deepa,
Thanks a lot for posting this updated series, very nice work!
Regarding the three versions, I think all of them are doable doable, and they all have their upsides and downsides but no showstoppers.
Let me summarize what I see in the patches:
2a is the smallest set of changes in number of lines, as you indicated in the previous discussion (I was skeptical here initially, but you were right). The main downside is that each patch has to carefully consider what happens at the point when the type gets flipped, so that printk format strings are correct and assignments to local variables don't truncate the range. It also requires changing the types again after the VFS change, but that is something we can automate using coccinelle.
2b has the main advantage of not changing behavior with the flip, so we can convert all file systems to use vfs_time relatively easily and then later make them actually use 64-bit timestamps with a patch that each file system developer can do for themselves. One downside is that it leads to rather ugly code as discussed before, examples are in "[RFC v2b 5/5] fs: xfs: change inode times to use vfs_time data type" and "[RFC v2b 3/5] fs: btrfs: Use vfs_time accessors".
2c gets us the furthest along the way for the conversion, close to where we want to end up in the long run, so we could do that to file systems one by one. The behavior change is immediate, so there are fewer possible surprises than with 2a, but it also means the most upfront work.
Arnd
Regarding the three versions, I think all of them are doable doable, and they all have their upsides and downsides but no showstoppers.
I agree that all the approaches are doable.
Let me summarize what I see in the patches:
2a is the smallest set of changes in number of lines, as you indicated in the previous discussion (I was skeptical here initially, but you were right). The main downside is that each patch has to carefully consider what happens at the point when the type gets flipped, so that printk format strings are correct and assignments to local variables don't truncate the range. It also requires changing the types again after the VFS change, but that is something we can automate using coccinelle.
2c has the same downside as this. It also has to carefully consider what happens when you switch end filesystems to timespec64, be it for printks or assignments. I would say that the effort to do this was the same for 2a and 2c.
And, 2c also needs to get rid of the abstraction macros when vfs is transitioned to using timespec64.
2b has the main advantage of not changing behavior with the flip, so we can convert all file systems to use vfs_time relatively easily and then later make them actually use 64-bit timestamps with a patch that each file system developer can do for themselves. One downside is that it leads to rather ugly code as discussed before, examples are in "[RFC v2b 5/5] fs: xfs: change inode times to use vfs_time data type" and "[RFC v2b 3/5] fs: btrfs: Use vfs_time accessors".
Here is the breakup of the number of changes required from the table in the cover letter(https://lkml.org/lkml/2016/2/12/76):
* # Changes needed in 2a = row 1 + row 7 + row 8 + row 9 + row 10 = 34 + 80 + 10 + 3 + 3 = 130 * # Changes needed in 2b = row 1 + row 4 + row 5 + row 6 + row 7 * (~3) = 34 + 80 + 141 + 74 + 85 + 240 = 654 * # Changes needed in 2c = Changes in 2b + some more
It is clear to see from the above table that number of such changes will be considerably more for approaches 2b and 2c.
And, 2b is not even close to what we want to achieve and will again confuse developers even more as there will be 2 sets of abstraction apis now: 1. vfs_time apis 2. timespec64 to timespec/ timespec to timespec64 apis Since there is no clean up effort here after vfs is switched over, we are just making all filesystems that use these apis harder to read.
2c gets us the furthest along the way for the conversion, close to where we want to end up in the long run, so we could do that to file systems one by one. The behavior change is immediate, so there are fewer possible surprises than with 2a, but it also means the most upfront work.
2c abstractions can be used in more than one way. And, 2c also introduces a new timestamp data type along with timespec64 in the filesystem code. The above two factors can make it confusing for the developers until we transition vfs and remove abstractions from individual filesystems. And, this is a problem as we want to remove abstractions in a different kernel release than the one we do the transition in, as we discussed previously.
2a still seems like the right choice to me. And, will have the least number of changes.
As Arnd thinks all of them are doable, if anybody else has other concerns we missed please comment.
-Deepa
Changing a few things and accounting for false positives (assuming worst case and rounding up for case 2a, rounding down for 2b)
- # Changes needed in 2a = row 1 + row 7 + row 8 + row 9 + row 10 = 34 + 80 + 10 + 3 + 3 = 130
- # Changes needed in 2b = row 1 + row 4 + row 5 + row 6 + row 7 * (~3) = 34 + 80 + 141 + 74 + 85 + 240 = 654
- # Changes needed in 2c = Changes in 2b + some more
becomes
* # Changes needed in 2a = row 1 + row 4 + row 7 + row 8 + row 9 + row 10 = 34 + 80 + 141 + 10 + 3 + 3 = 271 * # Changes needed in 2b = row 1 + row 4 + row 5 + row 6 + row 7 * (~2) = 34 + 141 + 74 + 85 + 160 = 494 * # Changes needed in 2c = Changes in 2b + some more
-Deepa
<resend because of html editor>
Changing a few things, accounting for false positives (assuming worst case and rounding up for case 2a, rounding down for 2b)
- # Changes needed in 2a = row 1 + row 7 + row 8 + row 9 + row 10 = 34 + 80 + 10 + 3 + 3 = 130
- # Changes needed in 2b = row 1 + row 4 + row 5 + row 6 + row 7 * (~3) = 34 + 80 + 141 + 74 + 85 + 240 = 654
- # Changes needed in 2c = Changes in 2b + some more
becomes
* # Changes needed in 2a = row 1 + row 4 + row 7 + row 8 + row 9 + row 10 = 34 + 80 + 141 + 10 + 3 + 3 = 271 * # Changes needed in 2b = row 1 + row 4 + row 5 + row 6 + row 7 * (~2) = 34 + 141 + 74 + 85 + 160 = 494 * # Changes needed in 2c = Changes in 2b + some more
-Deepa
On Fri, 12 Feb 2016, Arnd Bergmann wrote:
Regarding the three versions, I think all of them are doable doable, and they all have their upsides and downsides but no showstoppers.
Let me summarize what I see in the patches:
2a is the smallest set of changes in number of lines, as you indicated in the previous discussion (I was skeptical here initially, but you were right). The main downside is that each patch has to carefully consider what happens at the point when the type gets flipped, so that printk format strings are correct and assignments to local variables don't truncate the range. It also requires changing the types again after the VFS change, but that is something we can automate using coccinelle.
Yes, you can do that final change with coccinelle, but all in all this has the highest risk.
2b has the main advantage of not changing behavior with the flip, so we can convert all file systems to use vfs_time relatively easily and then later make them actually use 64-bit timestamps with a patch that each file system developer can do for themselves.
And that's a very clear advantage of this approch. The change is purely mechanical and can be largely done with coccinelle.
One downside is that it leads to rather ugly code as discussed before, examples are in "[RFC v2b 5/5] fs: xfs: change inode times to use vfs_time data type" and "[RFC v2b 3/5] fs: btrfs: Use vfs_time accessors".
- now = current_fs_time(inode->i_sb); - if (!timespec_equal(&inode->i_mtime, &now)) - inode->i_mtime = now; + now = vfs_time_to_timespec(current_fs_time(inode->i_sb)); + ts = vfs_time_to_timespec(inode->i_mtime); + if (!timespec_equal(&ts, &now)) + inode->i_mtime = timespec_to_vfs_time(now); + + ts = vfs_time_to_timespec(inode->i_mtime); + if (!timespec_equal(&ts, &now)) + inode->i_ctime = timespec_to_vfs_time(now);
You can either provide a helper function which does that m/c_time update at the VFS level where you can hide the mess or just accept that this transition will introduce some odd constructs like the above.
2c gets us the furthest along the way for the conversion, close to where we want to end up in the long run, so we could do that to file systems one by one. The behavior change is immediate, so there are fewer possible surprises than with 2a, but it also means the most upfront work.
I would'nt worry about the upfront work to much, if it is the cleanest way to do the conversion. Though, I'm not seing how that really makes the whole thing simpler.
So far we had good results with changing core code first and then fix up the users one by one and at the end remove any intermediary helpers which we introduced along the way. So I'd chose 2b as it's a very clear transition mechanism with pretty low risk.
Thanks,
tglx
On Wed, 24 Feb 2016, Thomas Gleixner wrote:
On Fri, 12 Feb 2016, Arnd Bergmann wrote:
Regarding the three versions, I think all of them are doable doable, and they all have their upsides and downsides but no showstoppers.
Let me summarize what I see in the patches:
2a is the smallest set of changes in number of lines, as you indicated in the previous discussion (I was skeptical here initially, but you were right). The main downside is that each patch has to carefully consider what happens at the point when the type gets flipped, so that printk format strings are correct and assignments to local variables don't truncate the range. It also requires changing the types again after the VFS change, but that is something we can automate using coccinelle.
Yes, you can do that final change with coccinelle, but all in all this has the highest risk.
Recent versions of Coccinelle can help a bit with format string changes. See demos/format.cocci.
julia
2b has the main advantage of not changing behavior with the flip, so we can convert all file systems to use vfs_time relatively easily and then later make them actually use 64-bit timestamps with a patch that each file system developer can do for themselves.
And that's a very clear advantage of this approch. The change is purely mechanical and can be largely done with coccinelle.
One downside is that it leads to rather ugly code as discussed before, examples are in "[RFC v2b 5/5] fs: xfs: change inode times to use vfs_time data type" and "[RFC v2b 3/5] fs: btrfs: Use vfs_time accessors".
now = current_fs_time(inode->i_sb);
if (!timespec_equal(&inode->i_mtime, &now))
inode->i_mtime = now;
now = vfs_time_to_timespec(current_fs_time(inode->i_sb));
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_mtime = timespec_to_vfs_time(now);
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_ctime = timespec_to_vfs_time(now);
You can either provide a helper function which does that m/c_time update at the VFS level where you can hide the mess or just accept that this transition will introduce some odd constructs like the above.
2c gets us the furthest along the way for the conversion, close to where we want to end up in the long run, so we could do that to file systems one by one. The behavior change is immediate, so there are fewer possible surprises than with 2a, but it also means the most upfront work.
I would'nt worry about the upfront work to much, if it is the cleanest way to do the conversion. Though, I'm not seing how that really makes the whole thing simpler.
So far we had good results with changing core code first and then fix up the users one by one and at the end remove any intermediary helpers which we introduced along the way. So I'd chose 2b as it's a very clear transition mechanism with pretty low risk.
Thanks,
tglx
Y2038 mailing list Y2038@lists.linaro.org https://lists.linaro.org/mailman/listinfo/y2038
On Wednesday 24 February 2016 13:19:07 Thomas Gleixner wrote:
On Fri, 12 Feb 2016, Arnd Bergmann wrote:
2b has the main advantage of not changing behavior with the flip, so we can convert all file systems to use vfs_time relatively easily and then later make them actually use 64-bit timestamps with a patch that each file system developer can do for themselves.
And that's a very clear advantage of this approch. The change is purely mechanical and can be largely done with coccinelle.
Actually Deepa pointed out one common case where the behavior actually changes:
inode->i_mtime.tv_sec = le32_to_cpu(on_disk_mtime);
will change behavior when tv_sec is changed to a time64_t, but it only changes in a way that is consistent with 64-bit architectures, and it will not change for user space that uses the existing syscalls to actually read the timestamps, so the risk of introducing a regression here is still really low.
One downside is that it leads to rather ugly code as discussed before, examples are in "[RFC v2b 5/5] fs: xfs: change inode times to use vfs_time data type" and "[RFC v2b 3/5] fs: btrfs: Use vfs_time accessors".
now = current_fs_time(inode->i_sb);
if (!timespec_equal(&inode->i_mtime, &now))
inode->i_mtime = now;
now = vfs_time_to_timespec(current_fs_time(inode->i_sb));
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_mtime = timespec_to_vfs_time(now);
ts = vfs_time_to_timespec(inode->i_mtime);
if (!timespec_equal(&ts, &now))
inode->i_ctime = timespec_to_vfs_time(now);
You can either provide a helper function which does that m/c_time update at the VFS level where you can hide the mess or just accept that this transition will introduce some odd constructs like the above.
I think we can live with this. There are around 10 files that do timespec_compare() or timespec_equal(), comparing an inode timestamp with a local variable or a filesystem specific timespec value, but they are all slightly different so there is no obvious helper function to address multiple file systems at once without also introducing a 'struct vfs_time'.
2c gets us the furthest along the way for the conversion, close to where we want to end up in the long run, so we could do that to file systems one by one. The behavior change is immediate, so there are fewer possible surprises than with 2a, but it also means the most upfront work.
I would'nt worry about the upfront work to much, if it is the cleanest way to do the conversion. Though, I'm not seing how that really makes the whole thing simpler.
So far we had good results with changing core code first and then fix up the users one by one and at the end remove any intermediary helpers which we introduced along the way. So I'd chose 2b as it's a very clear transition mechanism with pretty low risk.
Ok, thanks for your input, let's do that then.
I'd like to get the preparation patch ([RFC v2b 1/5] vfs: Add vfs_time accessors) merged as soon as possible so we can do the rest of the series through the file system maintainer trees where possible.
I guess even though the patch by itself is trivial, it's now too late to do for 4.5, but we should be able to just put that into your tip tree, or Al's vfs tree for 4.6 and then try to get all file systems moved over for 4.7, followed by the vfs patch to actually change the type.
Arnd