This is an update to Arnd Bergmann's RFC patch series: https://lkml.org/lkml/2014/5/30/669 .
The syscalls and runtime libraries will be handled in separate patch series.
The filling of max and min timestamps for individual filesystems can be leveraged from the patch series: https://lkml.org/lkml/2015/11/20/413
1. Objective: To transition all file system code to use 64 bit time. This translates to using timespec64 across the fs code for any timestamp representation.
2. Problem Description: struct timespec cannot represent times after year 2038 on 32 bit systems. The alternative to timespec in the kernel world is timespec64 to get around the above problem. This will be the UAPI exposed interface.
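To make the limit in (2) concrete, here is a minimal user-space sketch, not kernel code. It assumes a 32 bit time_t behaves like int32_t; demo_fits_in_time32() is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if a seconds-since-epoch value fits in a
 * signed 32 bit field (the assumed width of tv_sec in struct timespec on
 * 32 bit systems), 0 otherwise.
 */
static int demo_fits_in_time32(int64_t secs)
{
	return secs >= INT32_MIN && secs <= INT32_MAX;
}

/*
 * Hypothetical helper: what a 32 bit field would hold after storing a
 * 64 bit value, emulating the usual two's-complement truncation.
 */
static int32_t demo_truncate_to_time32(int64_t secs)
{
	return (int32_t)(uint32_t)(uint64_t)secs;
}
```

INT32_MAX seconds after the epoch falls in January 2038; one second later no longer fits, which is the case timespec64 is meant to cover.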
3. Design objectives: The goal of this approach was to come up with small manageable patches to address the above problem. Preferably, a single patch per filesystem that can be merged independently.
Also, a more generic approach that all the individual filesystems could follow was preferred.
4. Solution: The solution incorporated in the patch series involves stages defined below:
4.1. CONFIG_FS_USES_64BIT_TIME: This new config is defined to #ifdef code that is required to support 64 bit times.
4.2. struct inode times: The struct inode saves {a,c,m}timestamps as struct timespec. This leads to 2 problems:
   a. The size of the structure depends on whether the machine is 32 or 64 bit.
   b. The y2038 problem described above.
struct timespec64 also has the same problem as in (a) above. Choosing scalar types to store these timestamps (time64_t and s32) solves both the above problems. Adding accessors to access these timestamps would hide the internal data type so that this can be changed according to config in (4.1).
4.3. struct inode_timespec: Use inode_timespec for all other timestamp representation throughout VFS and individual filesystems. inode_timespec is aliased to timespec or timespec64 based on config defined in (4.1). Using timespec64 in this stage would require ifdef-ing the code, which is spread throughout the code.
4.4. Enable config in (4.1).
4.5. Convert inode_timespec to timespec64: Drop all references to inode_timespec in (4.3). Replace it with timespec64.
5. Alternate Solution: Steps involved are:
5.1. Change VFS code to handle both timespec64 and timespec: There are a few APIs in the VFS that take timestamps as arguments: generic_update_time(), inode->i_op->update_time(), lease_get_mtime(), fsstack_copy_attr_all(), setattr_copy(), generic_fillattr(). The attr and kstat properties could have accessors like inode. But, the other functions will have to maintain copies or be updated simultaneously along with the affected filesystems.
5.2. struct timespec64: Change individual fs to use timespec64.
5.3. struct inode times: The struct inode saves {a,c,m}timestamps as struct timespec.
This leads to 2 problems:
   a. The size of the structure depends on whether the machine is 32 or 64 bit.
   b. The y2038 problem described above.
struct timespec64 also has the same problem as in (a) above. Choosing scalar types to store these timestamps (time64_t and s32) solves both the above problems.
Change individual filesystems to use macros to access inode times. Inode macros can assume conversion from timespec64 always.
5.4. VFS: Change vfs code also as above in (1) and (2).
5.5. Drop timespec: This involves dropping any API or struct support that was kept for timespec, so that VFS uses only timespec64 for timestamps.
6. Rationale: The advantage of the method described in (5) is that we do not have inode_timespec aliases only to be dropped later.
But, the method suffers from disadvantages:
   a. As mentioned in (5.1), our process is affected as all the filesystems using each API must be changed simultaneously or VFS should have a copy.
   b. While individual filesystems are being changed, VFS will have to have 2 copies of a few APIs. This will mean that at this time, any new code being added might use either. This might lead to confusion.
Misc:
7. Range check: Patches include range check and clamping of timestamps. This topic did not have a conclusion on the previous RFC.
7.1. Rationale: The method incorporated in the patch series is based on the following principles:
   a. Linux does not impose any fixed format for on-disk inodes. LKML discussion is still ongoing concerning the best handling of file systems used or updated after their expiration date.
   b. Linux cannot surmise the side effects to a file system because of the wrong timestamps as each fs saves timestamps differently.
   c. Individual filesystems must be able to say what to do when timestamps are clamped.
7.2. Solution: Based on the above principles, the solution is described below:
7.2.1. There are 2 instances that the solution handles differently:
   a. While mounting a filesystem: A filesystem that has already exceeded the range of its timestamp fields.
   b. While doing operations on a mounted filesystem: Timestamps start getting clamped after the filesystem is mounted.
7.2.2. In both the above cases, a function is invoked as per the callbacks registered by filesystems.
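The two cases in (7.2.1) and the callback in (7.2.2) could be sketched roughly as follows. This is a user-space illustration under stated assumptions: struct demo_sb, its time_min/time_max fields, the on_clamp hook, demo_check_mount() and demo_clamp_time() are hypothetical names and are not taken from the patches. Resetting nanoseconds to 0 when clamping is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

struct demo_ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Hypothetical per-filesystem limits plus a policy hook (may be NULL). */
struct demo_sb {
	int64_t time_min;
	int64_t time_max;
	void (*on_clamp)(struct demo_sb *sb);
	unsigned int clamp_count;
};

/* Example policy: only count how often the hook was invoked. */
static void demo_count_clamp(struct demo_sb *sb)
{
	sb->clamp_count++;
}

/* Case (a): at mount time, detect a filesystem whose range has expired. */
static int demo_check_mount(struct demo_sb *sb, int64_t now_sec)
{
	if (now_sec <= sb->time_max)
		return 0;
	if (sb->on_clamp)
		sb->on_clamp(sb);
	return 1;
}

/* Case (b): on a mounted filesystem, clamp and notify the filesystem. */
static struct demo_ts64 demo_clamp_time(struct demo_sb *sb, struct demo_ts64 ts)
{
	int clamped = 0;

	if (ts.tv_sec < sb->time_min) {
		ts.tv_sec = sb->time_min;
		clamped = 1;
	} else if (ts.tv_sec > sb->time_max) {
		ts.tv_sec = sb->time_max;
		clamped = 1;
	}
	if (clamped) {
		ts.tv_nsec = 0;	/* assumption: drop sub-second part on clamp */
		if (sb->on_clamp)
			sb->on_clamp(sb);
	}
	return ts;
}
```

A filesystem that wants a different policy (warn, remount read-only, ignore) would register a different hook in place of demo_count_clamp().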
8. Testing: This is a proof of concept implementation. I want to get some feedback before I convert the rest of the file systems.
I've done some initial testing based on the patches below on x86 64 bit arch. Testing was mainly done on the root filesystem. mount, stat, touch, read, write system calls were used for testing timestamp clamps and other functionality.
Patches 8-15 are only included to provide a complete picture.
Deepa Dinamani (15):
  fs: add Kconfig entry CONFIG_FS_USES_64BIT_TIME
  vfs: Change all structures to support 64 bit time
  kernel: time: Add macros and functions to support 64 bit time
  vfs: Add support for vfs code to use 64 bit time
  fs: cifs: Add support for cifs to use 64 bit time
  fs: fat: convert fat to 64 bit time
  fs: ext4: convert to use 64 bit time
  fs: Enable 64 bit time
  fs: cifs: replace inode_timespec with timespec64
  fs: fat: replace inode_timespec with timespec64
  fs: ext4: replace inode_timespec with timespec64
  vfs: remove inode_timespec and timespec references
  kernel: time: change inode_timespec to timespec64
  vfs: Remove inode_timespec aliases
  fs: Drop CONFIG_FS_USES_64BIT_TIME
 fs/attr.c | 15 ++---
 fs/bad_inode.c | 10 ++-
 fs/binfmt_misc.c | 7 +-
 fs/cifs/cache.c | 16 +++--
 fs/cifs/cifsencrypt.c | 2 +-
 fs/cifs/cifsglob.h | 6 +-
 fs/cifs/cifsproto.h | 9 +--
 fs/cifs/cifssmb.c | 17 +++--
 fs/cifs/file.c | 9 ++-
 fs/cifs/inode.c | 65 ++++++++++++-------
 fs/cifs/netmisc.c | 26 ++++----
 fs/ext4/acl.c | 3 +-
 fs/ext4/ext4.h | 44 +++++++------
 fs/ext4/extents.c | 25 ++++++--
 fs/ext4/ialloc.c | 9 ++-
 fs/ext4/inline.c | 10 ++-
 fs/ext4/inode.c | 16 +++--
 fs/ext4/ioctl.c | 16 +++--
 fs/ext4/namei.c | 40 ++++++++----
 fs/ext4/super.c | 6 +-
 fs/ext4/xattr.c | 2 +-
 fs/fat/dir.c | 7 +-
 fs/fat/fat.h | 8 ++-
 fs/fat/file.c | 10 ++-
 fs/fat/inode.c | 46 ++++++++++----
 fs/fat/misc.c | 7 +-
 fs/fat/namei_msdos.c | 40 +++++++-----
 fs/fat/namei_vfat.c | 41 ++++++++----
 fs/inode.c | 53 +++++++++++-----
 fs/libfs.c | 50 ++++++++++++---
 fs/locks.c | 5 +-
 fs/nsfs.c | 6 +-
 fs/pipe.c | 6 +-
 fs/posix_acl.c | 2 +-
 fs/stack.c | 6 +-
 fs/stat.c | 6 +-
 fs/super.c | 10 +++
 fs/utimes.c | 6 +-
 include/linux/fs.h | 101 +++++++++++++++++++++++++----
 include/linux/fs_stack.h | 9 +--
 include/linux/stat.h | 6 +-
 include/linux/time64.h | 4 ++
 kernel/time/time.c | 162 ++++++++++++++++++++++++++++++++++++++++++-----
 43 files changed, 691 insertions(+), 253 deletions(-)
This config will be used to #ifdef code that will be required for switching over all file systems to use 64 bit time.
The config should remain turned off until all the support in vfs and other file systems has been added.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/Kconfig | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index 922893f..a11934b 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -8,6 +8,16 @@ menu "File systems"
 config DCACHE_WORD_ACCESS
        bool
 
+#use 64 bit timestamps
+config FS_USES_64BIT_TIME
+	bool
+	default n
+	help
+	  Temporary configuration to switch over all file systems to
+	  use 64 bit time.
+	  Need to be enabled only after all individual file system
+	  and vfs changes are in place.
+
 if BLOCK
 
 source "fs/ext2/Kconfig"
The current representation of inode times in struct inode, struct iattr, and struct kstat uses struct timespec. timespec is not y2038 safe.
Use scalar data types (seconds and nanoseconds stored separately) to represent timestamps in struct inode in order to maintain same size for times across 32 bit and 64 bit architectures. In addition, lay them out such that they are packed on a naturally aligned boundary on 64 bit arch as 4 bytes are used to store nsec. This makes each tuple (sec, nsec) use 12 bytes instead of 16 bytes. This will help save RAM space as inode structure is cached in memory. The other structures are transient and do not benefit from these changes.
Add accessors for inode timestamps. These provide a way to access the time field members. Accessors abstract the timestamp representation so that any logic to convert between the struct inode timestamps and other interfaces can be placed here. The plan is to globally change all references to these types through these accessors only. So when the actual internal representation changes, it will be transparent to the outside world. This can be extended to add code to validate the inode times that are being set. Macros are chosen as accessors rather than functions because we can condense 3 {a,c,m} time functions into a single macro. After we agree on an approach, the implementation could be changed to use static inline functions if it suits more.
Add inode_timespec aliases to help convert kstat and iattr times to use 64 bit times. These hide the internal data type. Use uapi exposed data types here to keep timestamp data type conversions to a minimum in APIs interfacing with vfs.
After CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec aliases will be removed and timespec64 data types and APIs will be used directly.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 include/linux/fs.h | 55 +++++++++++++++++++++++++++++++++++++++++++-------
 include/linux/stat.h | 6 +++---
 include/linux/time64.h | 21 +++++++++++++++++++
 3 files changed, 72 insertions(+), 10 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 12ba937..b9f3cee 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -245,13 +245,13 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate);
  */
 struct iattr {
 	unsigned int	ia_valid;
-	umode_t		ia_mode;
-	kuid_t		ia_uid;
-	kgid_t		ia_gid;
-	loff_t		ia_size;
-	struct timespec	ia_atime;
-	struct timespec	ia_mtime;
-	struct timespec	ia_ctime;
+	umode_t			ia_mode;
+	kuid_t			ia_uid;
+	kgid_t			ia_gid;
+	loff_t			ia_size;
+	struct inode_timespec	ia_atime;
+	struct inode_timespec	ia_mtime;
+	struct inode_timespec	ia_ctime;
 
 	/*
 	 * Not an attribute, but an auxiliary info for filesystems wanting to
@@ -616,9 +616,18 @@ struct inode {
 	};
 	dev_t			i_rdev;
 	loff_t			i_size;
+#ifdef CONFIG_FS_USES_64BIT_TIME
+	time64_t		i_atime_sec;
+	time64_t		i_mtime_sec;
+	time64_t		i_ctime_sec;
+	s32			i_atime_nsec;
+	s32			i_mtime_nsec;
+	s32			i_ctime_nsec;
+#else
 	struct timespec		i_atime;
 	struct timespec		i_mtime;
 	struct timespec		i_ctime;
+#endif
 	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
 	unsigned short          i_bytes;
 	unsigned int		i_blkbits;
@@ -679,6 +688,38 @@ struct inode {
 	void			*i_private; /* fs or device private pointer */
 };
 
+#ifdef CONFIG_FS_USES_64BIT_TIME
+
+#define VFS_INODE_SET_XTIME(xtime, inode, ts64)				\
+	do {								\
+		struct inode_timespec __ts = (ts64);			\
+		(inode)->xtime##_sec = __ts.tv_sec;			\
+		(inode)->xtime##_nsec = __ts.tv_nsec;			\
+	} while (0)
+
+#define VFS_INODE_GET_XTIME(xtime, inode)				\
+	(struct timespec64){.tv_sec = (inode)->xtime##_sec,		\
+				.tv_nsec = (inode)->xtime##_nsec}
+
+#else
+
+#define VFS_INODE_SET_XTIME(xtime, inode, ts)				\
+	((inode)->xtime = (ts))
+
+#define VFS_INODE_GET_XTIME(xtime, inode)				\
+	((inode)->xtime)
+
+#endif
+
+#define VFS_INODE_SWAP_XTIME(xtime, inode1, inode2)			\
+	do {								\
+		struct inode_timespec __ts =				\
+			VFS_INODE_GET_XTIME(xtime, inode1);		\
+		VFS_INODE_SET_XTIME(xtime, inode1,			\
+			VFS_INODE_GET_XTIME(xtime, inode2));		\
+		VFS_INODE_SET_XTIME(xtime, inode2, __ts);		\
+	} while (0)
+
 static inline int inode_unhashed(struct inode *inode)
 {
 	return hlist_unhashed(&inode->i_hash);
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 075cb0c..559983f 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -27,9 +27,9 @@ struct kstat {
 	kgid_t		gid;
 	dev_t		rdev;
 	loff_t		size;
-	struct timespec  atime;
-	struct timespec	mtime;
-	struct timespec	ctime;
+	struct inode_timespec	atime;
+	struct inode_timespec	mtime;
+	struct inode_timespec	ctime;
 	unsigned long	blksize;
 	unsigned long long	blocks;
 };
diff --git a/include/linux/time64.h b/include/linux/time64.h
index 367d5af..be98201 100644
--- a/include/linux/time64.h
+++ b/include/linux/time64.h
@@ -26,6 +26,27 @@ struct itimerspec64 {
 
 #endif
 
+#ifdef CONFIG_FS_USES_64BIT_TIME
+
+/* Place holder defines until CONFIG_FS_USES_64BIT_TIME
+ * is enabled.
+ * timespec64 data type and functions will be used at that
+ * time directly and these defines will be deleted.
+ */
+#define inode_timespec		timespec64
+
+#define inode_timespec_compare	timespec64_compare
+#define inode_timespec_equal	timespec64_equal
+
+#else
+
+#define inode_timespec		timespec
+
+#define inode_timespec_compare	timespec_compare
+#define inode_timespec_equal	timespec_equal
+
+#endif
+
 /* Parameters used to convert the timespec values: */
 #define MSEC_PER_SEC	1000L
 #define USEC_PER_MSEC	1000L
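For readers who want to experiment with the accessor idea outside the kernel, the following is a user-space sketch modelled on the CONFIG_FS_USES_64BIT_TIME branch of the macros in the patch above. It is not the patch itself: the demo_ and DEMO_ names are stand-ins chosen here, struct demo_inode keeps only the timestamp fields, and demo_touch() is a hypothetical caller.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct timespec64. */
struct demo_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Stand-in for struct inode: seconds and nanoseconds stored as scalars. */
struct demo_inode {
	int64_t i_atime_sec;
	int64_t i_mtime_sec;
	int64_t i_ctime_sec;
	int32_t i_atime_nsec;
	int32_t i_mtime_nsec;
	int32_t i_ctime_nsec;
};

/* Token pasting selects the _sec/_nsec pair for i_atime, i_mtime or i_ctime. */
#define DEMO_INODE_SET_XTIME(xtime, inode, ts64)			\
	do {								\
		struct demo_timespec64 ts_ = (ts64);			\
		(inode)->xtime##_sec = ts_.tv_sec;			\
		(inode)->xtime##_nsec = (int32_t)ts_.tv_nsec;		\
	} while (0)

#define DEMO_INODE_GET_XTIME(xtime, inode)				\
	((struct demo_timespec64){ .tv_sec = (inode)->xtime##_sec,	\
				   .tv_nsec = (inode)->xtime##_nsec })

/* Hypothetical caller: update access and modification time together. */
static void demo_touch(struct demo_inode *inode, struct demo_timespec64 now)
{
	DEMO_INODE_SET_XTIME(i_atime, inode, now);
	DEMO_INODE_SET_XTIME(i_mtime, inode, now);
}
```

Callers only ever see struct demo_timespec64, so the storage inside the inode could later change without touching them, which is the property the commit message argues for.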
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
> The current representation of inode times in struct inode, struct iattr, and struct kstat uses struct timespec. timespec is not y2038 safe.
> Use scalar data types (seconds and nanoseconds stored separately) to represent timestamps in struct inode in order to maintain same size for times across 32 bit and 64 bit architectures. In addition, lay them out such that they are packed on a naturally aligned boundary on 64 bit arch as 4 bytes are used to store nsec. This makes each tuple (sec, nsec) use 12 bytes instead of 16 bytes. This will help save RAM space as inode structure is cached in memory. The other structures are transient and do not benefit from these changes.
IMO, this decision sends the patch series immediately down the wrong path. To me, this is a severe case of premature optimisation because everything gets way more complex just to save those 8 bytes, especially as those holes can be filled simply by changing the variable declaration order in the structure and adding a simple comment.
And, really, I don't like those VFS_INODE_[GS]ET_XTIME macros at all; you've got to touch lots of code(*), making it all shouty and harder to read. They seem only to exist because the above structural change requires an abstract timestamp accessor while CONFIG_FS_USES_64BIT_TIME exists. Given that goes away at the end of the series, so should the macro - if we use a struct timespec64 in the first place, it isn't even necessary as a temporary construct.
(*) I note you haven't touched XFS, which means you've probably broken lots of other filesystem code. e.g. in XFS, functions like xfs_vn_getattr() and xfs_vn_update_time() access inode->i_[acm]time directly and hence are not going to compile after this patch series.
Cheers,
Dave.
On Jan 11, 2016, at 04:33, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
> > The current representation of inode times in struct inode, struct iattr, and struct kstat uses struct timespec. timespec is not y2038 safe.
> > Use scalar data types (seconds and nanoseconds stored separately) to represent timestamps in struct inode in order to maintain same size for times across 32 bit and 64 bit architectures. In addition, lay them out such that they are packed on a naturally aligned boundary on 64 bit arch as 4 bytes are used to store nsec. This makes each tuple (sec, nsec) use 12 bytes instead of 16 bytes. This will help save RAM space as inode structure is cached in memory. The other structures are transient and do not benefit from these changes.
> IMO, this decision sends the patch series immediately down the wrong path.
There are other things the patch does that I would like to get comments on: inode_timespec aliases, range check, individual fs changes etc. These are independent of the inode timestamp representation changes.
> To me, this is a severe case of premature optimisation because everything gets way more complex just to save those 8 bytes, especially as those holes can be filled simply by changing the variable declaration order in the structure and adding a simple comment.
I had tried rearranging the structure and the pahole tool does not show any difference unless you pack and align the struct to 4 bytes on 64 bit arch. The change actually saves 16 bytes on x86_64 and adds 12 bytes on i386.
Here is the breakdown for struct inode before and after the patch:
x86_64:
before: /* size: 544, cachelines: 9, members: 44 */
        /* sum members: 534, holes: 3, sum holes: 10 */
after:  /* size: 528, cachelines: 9, members: 47 */
        /* sum members: 522, holes: 2, sum holes: 6 */
i386:
before: /* size: 328, cachelines: 6, members: 45 */
        /* sum members: 326, holes: 1, sum holes: 2 */
after:  /* size: 340, cachelines: 6, members: 48 */
        /* sum members: 338, holes: 1, sum holes: 2 */
According to /proc/slabinfo I estimated savings of 4MB on a lightly loaded system.
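The per-timestamp arithmetic behind these numbers can be checked with a small user-space sketch. It only models the three timestamps, not the whole struct inode, so it will not reproduce the pahole totals; the demo_ structures are stand-ins introduced here and the exact sizeof results depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct timespec64: 64 bit seconds plus a long for nsec. */
struct demo_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Three timestamps kept as structs (the timespec64 option). */
struct demo_times_struct {
	struct demo_timespec64 atime;
	struct demo_timespec64 mtime;
	struct demo_timespec64 ctime;
};

/* Three timestamps kept as scalars, seconds first, then nanoseconds. */
struct demo_times_split {
	int64_t atime_sec;
	int64_t mtime_sec;
	int64_t ctime_sec;
	int32_t atime_nsec;
	int32_t mtime_nsec;
	int32_t ctime_nsec;
};

/* Payload of the split layout: 3 * (8 + 4) = 36 bytes, 12 per timestamp. */
static size_t demo_split_payload(void)
{
	return 3 * sizeof(int64_t) + 3 * sizeof(int32_t);
}
```

On a typical LP64 ABI struct demo_timespec64 is 16 bytes, so the struct form occupies 48 bytes against a 36 byte payload for the split form; how much of that difference survives in struct inode depends on the neighbouring members.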
> And, really, I don't like those VFS_INODE_[GS]ET_XTIME macros at all; you've got to touch lots of code(*), making it all shouty and harder to read. They seem only to exist because the above structural change requires an abstract timestamp accessor while CONFIG_FS_USES_64BIT_TIME exists. Given that goes away at the end of the series, so should the macro - if we use a struct timespec64 in the first place, it isn't even necessary as a temporary construct.
timespec64 was the first option considered here. The problem with using timespec64 is the long data type to represent nsec. If it were possible to change timespec64 nsec to int data type then it might be okay to use that if we are not worried about holes. I do not see why time stamps should have different representations on a 32 bit vs a 64 bit arch. This left us with the option to define a new data type to represent timestamps. I agreed with the concerns on the earlier RFC series that there are already very many data types to represent time in the kernel. So this left me with the option of using scalar types to represent these. The scalar types were not used for optimization. They just happened to serve that purpose as well. This could be in a follow-on patch, but as long as we are changing the representation everywhere, I don't see why there should be an intermediate step to change it to timespec64 only to change it to this representation later.
As far as accessors are concerned, there already are accessors in the VFS: generic_fillattr() and setattr_copy(). The problem really is that there is more than one way of updating these attributes(timestamps in this particular case). The side effect of this is that we don't always call timespec_trunc() before assigning timestamps which can lead to inconsistencies between on disk and in memory inode timestamps. Also, since these also touch other attributes, these become more restrictive. The accessors were an idea to streamline all accesses to timestamps in inode. Right now the accessor macros also figure out if the timestamps were clamped and then call the registered callback. But, this can be extended to include fs_time_trunc() and then all the end users can just use these and not worry about the right granularity or range. As the commit text says, these can be changed to inline functions to avoid shouty case.
> (*) I note you haven't touched XFS, which means you've probably broken lots of other filesystem code. e.g. in XFS, functions like xfs_vn_getattr() and xfs_vn_update_time() access inode->i_[acm]time directly and hence are not going to compile after this patch series.
I think I should have explained this more in my cover letter, as this has come up twice now. Patches 1-7 are the only ones that are relevant and compiled and tested. These change three example filesystems as an illustration of the proposed solution. Of course, every filesystem will have to be changed similarly before patches 8-15 and a few more to change additional filesystems to use timespec64 can be merged. Patches 8-15 were included merely to provide a complete picture, as I thought patches explained the concept better than words only. These have not even been compiled, as these are for illustration purposes as noted in the cover letter.
-Deepa
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
> On Jan 11, 2016, at 04:33, Dave Chinner <david@fromorbit.com> wrote:
> > On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
> > > The current representation of inode times in struct inode, struct iattr, and struct kstat uses struct timespec. timespec is not y2038 safe.
> > > Use scalar data types (seconds and nanoseconds stored separately) to represent timestamps in struct inode in order to maintain same size for times across 32 bit and 64 bit architectures. In addition, lay them out such that they are packed on a naturally aligned boundary on 64 bit arch as 4 bytes are used to store nsec. This makes each tuple (sec, nsec) use 12 bytes instead of 16 bytes. This will help save RAM space as inode structure is cached in memory. The other structures are transient and do not benefit from these changes.
> > IMO, this decision sends the patch series immediately down the wrong path.
> There are other things the patch does that I would like to get comments on: inode_timespec aliases, range check, individual fs changes etc. These are independent of the inode timestamp representation changes.
The inode_timespec alias is part of the problem - AFAICT it exists because you need a representation that is independent of both the old and new in-memory structures.
The valid timestamp range stuff in the superblock is absolutely necessary, but that's something that can be done completely independently to the changes for supporting a different time storage format.
And the fs changes cannot really be commented on until the VFS time representation is sorted out properly...
> > To me, this is a severe case of premature optimisation because everything gets way more complex just to save those 8 bytes, especially as those holes can be filled simply by changing the variable declaration order in the structure and adding a simple comment.
> I had tried rearranging the structure and the pahole tool does not show any difference unless you pack and align the struct to 4 bytes on 64 bit arch. The change actually saves 16 bytes on x86_64 and adds 12 bytes on i386.
> Here is the breakdown for struct inode before and after the patch:
> x86_64:
> before: /* size: 544, cachelines: 9, members: 44 */
>         /* sum members: 534, holes: 3, sum holes: 10 */
> after:  /* size: 528, cachelines: 9, members: 47 */
>         /* sum members: 522, holes: 2, sum holes: 6 */
> i386:
> before: /* size: 328, cachelines: 6, members: 45 */
>         /* sum members: 326, holes: 1, sum holes: 2 */
> after:  /* size: 340, cachelines: 6, members: 48 */
>         /* sum members: 338, holes: 1, sum holes: 2 */
> According to /proc/slabinfo I estimated savings of 4MB on a lightly loaded system.
> > And, really, I don't like those VFS_INODE_[GS]ET_XTIME macros at all; you've got to touch lots of code(*), making it all shouty and harder to read. They seem only to exist because the above structural change requires an abstract timestamp accessor while CONFIG_FS_USES_64BIT_TIME exists. Given that goes away at the end of the series, so should the macro - if we use a struct timespec64 in the first place, it isn't even necessary as a temporary construct.
> timespec64 was the first option considered here. The problem with using timespec64 is the long data type to represent nsec. If it were possible to change timespec64 nsec to int data type then it might be okay to use that if we are not worried about holes. I do not see why time stamps should have different representations on a 32 bit vs a 64 bit arch.
What's it matter? It's irrelevant to the problem at hand.
Besides, have you looked at the existing timestamp definitions? They use a struct timespec, which on a 64 bit system:
	struct timespec            i_atime;              /*    88    16 */
	struct timespec            i_mtime;              /*   104    16 */
	struct timespec            i_ctime;              /*   120    16 */
use 2 longs and are identical to a timespec64 in everything but name. These should just be changed to a timespec64, and we suck up the fact it increases the size of the 32 bit inode as we have to increase its size to support time > y2038 anyway.
This is what I meant about premature optimisation - you've got a wonderfully complex solution to a problem that we don't need to solve to support timestamps >y2038. It's also why it goes down the wrong path at this point - most of the changes are not necessary if all we need to do is a simple timespec -> timespec64 type change and the addition of timestamp range limiting in the existing truncation function...
> The problem really is that there is more than one way of updating these attributes (timestamps in this particular case). The side effect of this is that we don't always call timespec_trunc() before assigning timestamps which can lead to inconsistencies between on disk and in memory inode timestamps.
That's a problem that can be fixed independently of y2038 support. Indeed, we can be quite lazy about updating timestamps - by intent and design we usually have different timestamps in memory compared to on disk, which is one of the reasons why there are so many different ways to change and update timestamps....
Cheers,
Dave.
On Tuesday 12 January 2016 19:29:57 Dave Chinner wrote:
> This is what I meant about premature optimisation - you've got a wonderfully complex solution to a problem that we don't need to solve to support timestamps >y2038. It's also why it goes down the wrong path at this point - most of the changes are not necessary if all we need to do is a simple timespec -> timespec64 type change and the addition of timestamp range limiting in the existing truncation function...
I originally suggested doing the split representation because I was worried about the downsides of using timespec64 on 32-bit systems after looking at actual memory consumption on my test box.
At this moment, I have a total of 145712700 inodes in memory on a machine with 64GB ram, saving 12 bytes on each amounts to a total of 145MB. I think it was more than that when I first looked, so it's between 0.2% and 0.3% of savings in total memory, which is certainly worth discussing, given the renewed interest in conserving RAM in general. If we want to save this memory, then doing it at the same time as the timespec64 conversion is the right time so we don't need to touch every file twice.
One point that I had not considered though is on the 32-bit systems we are talking about, not only is RAM much smaller, but also there would be a smaller fraction of RAM available to store inodes, so there is not as much to gain.
Arnd
On Tue, Jan 12, 2016 at 10:27:07AM +0100, Arnd Bergmann wrote:
> On Tuesday 12 January 2016 19:29:57 Dave Chinner wrote:
> > This is what I meant about premature optimisation - you've got a wonderfully complex solution to a problem that we don't need to solve to support timestamps >y2038. It's also why it goes down the wrong path at this point - most of the changes are not necessary if all we need to do is a simple timespec -> timespec64 type change and the addition of timestamp range limiting in the existing truncation function...
> I originally suggested doing the split representation because I was worried about the downsides of using timespec64 on 32-bit systems after looking at actual memory consumption on my test box.
> At this moment, I have a total of 145712700 inodes in memory on a machine
Is that all? :P
> with 64GB ram, saving 12 bytes on each amounts to a total of 145MB.
I just posted a patchset that knocks 104 bytes off the XFS inode (~12% reduction in size). We need the changes in that patchset to sanely support >y2038k support in XFS, and it means we now won't need to grow the XFS inode to add that support, either.
> I think it was more than that when I first looked, so it's between 0.2% and 0.3% of savings in total memory, which is certainly worth discussing, given the renewed interest in conserving RAM in general. If we want to save this memory, then doing it at the same time as the timespec64 conversion is the right time so we don't need to touch every file twice.
You just uttered the key words: "If we want to save this memory"
So let's stop conflating two different lines of development because we only actually *need* y2038k support.
The fact we haven't made timestamp space optimisations means that nobody has thought it necessary or worthwhile. y2038k support doesn't change the landscape under which we might consider the optimisation, so we need to determine if the benefit outweighs the cost in terms of code complexity and maintainability.
So separate the two changes - make the y2038k change simple and obviously correct first by changing everything to timespec64. Then it won't get delayed by bikeshedding about an optimisation that is of questionable benefit.
Cheers,
Dave.
On Wednesday 13 January 2016 17:27:16 Dave Chinner wrote:
> > I think it was more than that when I first looked, so it's between 0.2% and 0.3% of savings in total memory, which is certainly worth discussing, given the renewed interest in conserving RAM in general. If we want to save this memory, then doing it at the same time as the timespec64 conversion is the right time so we don't need to touch every file twice.
> You just uttered the key words: "If we want to save this memory"
> So let's stop conflating two different lines of development because we only actually *need* y2038k support.
> The fact we haven't made timestamp space optimisations means that nobody has thought it necessary or worthwhile. y2038k support doesn't change the landscape under which we might consider the optimisation, so we need to determine if the benefit outweighs the cost in terms of code complexity and maintainability.
> So separate the two changes - make the y2038k change simple and obviously correct first by changing everything to timespec64. Then it won't get delayed by bikeshedding about an optimisation that is of questionable benefit.
Fine with me. I think Deepa already started simplifying the series. I agree that for 64-bit machines, there is no need to optimize that code now, since we are not regressing in terms of memory size.
For 32-bit machines, we are regressing anyway, the question is whether it's by 12 or 24 bytes per inode. Let me try to estimate the worst-case scenario here: let's assume that we have 1GB of RAM (anything more on a 32-bit system gets you into trouble, and if you have less, there will be less of a problem). Filling all of system ram with small tmpfs files means a single 4K page plus 280 bytes for the minimum inode, so we need an additional 6MB or 12MB to store the extra timespec bits. Probably not too bad for a worst-case scenario, but there is also the case of storing just the inodes but no pages, and that would be worse.
I've added the linux-arm and linux-mips lists to cc, to see if anyone has strong opinions on this matter. We don't have to worry about x86-32 here, because sizeof(struct timespec64) is 12 bytes there anyway, and I don't think there are any other 32-bit architectures that have large-scale deployments or additional requirements we don't already have on ARM.
Arnd
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
On Jan 11, 2016, at 04:33, Dave Chinner david@fromorbit.com wrote:
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
Summarizing, here are the open questions that need to be sorted before another series:
1. What should be part of the series?
   a. y2038 safe data types conversion.
   b. range check and clamping
2. How to achieve a seamless transition? Is inode_timespec solution agreed upon to achieve 1a? An alternate approach is included in the cover letter.
3. policy for handling out of range timestamps: There was no conclusion on this from the previous series as noted in the cover letter. I think this can be solved by figuring out the answer to question: who should have a say in deciding the course of action if the timestamps are out of range:
   a. sysadmin through sysctl (Arnd's suggestion)
   b. have default vfs handlers with an option for individual fs to override.
   c. clamp and ignore
   d. disable expired fs at compile time (Arnd's suggestion)
The inode_timespec alias is part of the problem - AFAICT it exists because you need a representation that is independent of both the old and new in-memory structures.
Maybe you are misunderstanding the fact that only struct inode is changed to have individual fields. Whereas, timespec64 is used everywhere else in the series. inode_timespec is an alias to transition timespec references to timespec64 without breaking anything. This is needed because we don't want to change all references to timespec in a single patch like the cover letter says. The same RFC Patch 2 has more details.
The valid timestamp range stuff in the superblock is absolutely necessary, but that's something that can be done completely independently to the changes for supporting a different time storage format.
If I'm defining new functions to support new format, then I should at least get the function signatures right before using them. These can be in a different patch, but should be in the same patch series, before they are used anywhere. For instance, struct timespec timespec_trunc(struct timespec t, unsigned gran); should now take superblock as an argument instead of gran.
And the fs changes cannot really be commented on until the VFS time representation is sorted out properly...
Each fs is changed twice in the current approach to transition everything to timespec64. And, there are different ways of doing this. For instance, Arnd had an idea different from mine as to how this can be done: He was suggesting using something like these accessor macros and incorporating timespec64 from the beginning in the individual filesystems rather than inode_timespec. Again, this is independent of how timestamps are stored in struct inode. There are others that are independent of inode timestamp representation as well.
Besides, have you looked at the existing timestamp definitions? They use a struct timespec, which on a 64 bit system:
	struct timespec    i_atime;    /*    88    16 */
	struct timespec    i_mtime;    /*   104    16 */
	struct timespec    i_ctime;    /*   120    16 */
use 2 longs and are identical to a timespec64 in everything but name. These should just be changed to a timespec64, and we suck up the fact it increases the size of the 32 bit inode as we have to increase its size to support time > y2038 anyway.
This is what I meant about premature optimisation - you've got a wonderfully complex solution to a problem that we don't need to solve to support timestamps >y2038. It's also why it goes down the wrong path at this point - most of the changes are not necessary if all we need to do is a simple timespec -> timespec64 type change and the addition of timestamp range limiting in the existing truncation function...
The pahole output I pasted in the previous email (on the left) was for timespec.
Yes, I do know that timespec64 is same as timespec on 64 bit systems:
#if __BITS_PER_LONG == 64
# define timespec64 timespec
I think it's been agreed upon now that inode timestamps will be changed to use timespec64 as Arnd mentioned, if I do not hear any objections. The whole purpose of this is to gather comments.
The problem really is that there is more than one way of updating these attributes (timestamps in this particular case). The side effect of this is that we don't always call timespec_trunc() before assigning timestamps, which can lead to inconsistencies between on disk and in memory inode timestamps.
That's a problem that can be fixed independently of y2038 support. Indeed, we can be quite lazy about updating timestamps - by intent and design we usually have different timestamps in memory compared to on disk, which is one of the reasons why there are so many different ways to change and update timestamps....
This has nothing to do with lazy updates. This is about writing wrong granularities and non clamped values to in-memory inode.
-Deepa
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
On Jan 11, 2016, at 04:33, Dave Chinner david@fromorbit.com wrote:
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
Summarizing, here are the open questions that need to be sorted before another series:
- What should be part of the series? a. y2038 safe data types conversion. b. range check and clamping
Yes.
- How to achieve a seamless transition? Is inode_timespec solution agreed upon to achieve 1a?
No. Just convert direct to timespec64.
An alternate approach is included in the cover letter.
3. policy for handling out of range timestamps: There was no conclusion on this from the previous series as noted in the cover letter.
   a. sysadmin through sysctl (Arnd's suggestion)
   b. have default vfs handlers with an option for individual fs to override.
   c. clamp and ignore
I think it's a mix - if the timestamps come in from userspace, fail with ERANGE. That could be controlled by sysctl via VFS part of the ->setattr operation, or in each of the individual FS implementations. If they come from the kernel (e.g. atime update) then the generic behaviour is to warn and continue; filesystems can otherwise select their own policy for kernel updates via ->update_time.
d. disable expired fs at compile time (Arnd's suggestion)
Not really an option, because it means we can't use filesystems that interop with other systems (e.g. cameras, etc) because they won't support y2038k timestamps for a long time, if ever (e.g. vfat).
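The mixed policy described above (reject out-of-range timestamps from userspace with ERANGE, warn and continue for kernel-generated ones) could be sketched roughly as follows. This is a userspace illustration only: `ts_source`, `fs_time_range` and `apply_range_policy` are hypothetical names, and clamping the kernel-side value to the nearest limit is an assumption about what "continue" would mean in practice.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Where a timestamp originated (hypothetical classification). */
enum ts_source { TS_FROM_USER, TS_FROM_KERNEL };

/* Range of seconds a filesystem can store on disk (hypothetical). */
struct fs_time_range {
	int64_t min_sec;
	int64_t max_sec;
};

/*
 * Apply the mixed policy to *sec.
 * Returns 0 on success or -ERANGE for an out-of-range userspace value.
 * For kernel-generated values, *sec is clamped (an assumption) and
 * *warned is set so the caller can log a warning.
 */
static int apply_range_policy(const struct fs_time_range *r,
			      enum ts_source src, int64_t *sec, bool *warned)
{
	*warned = false;

	if (*sec >= r->min_sec && *sec <= r->max_sec)
		return 0;

	if (src == TS_FROM_USER)
		return -ERANGE;          /* e.g. a utimes()-style request */

	*warned = true;                  /* e.g. an atime update */
	*sec = (*sec < r->min_sec) ? r->min_sec : r->max_sec;
	return 0;
}
```

Whether the userspace branch is switchable through a sysctl, as discussed in the thread, is left out of this sketch.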
The problem really is that there is more than one way of updating these attributes (timestamps in this particular case). The side effect of this is that we don't always call timespec_trunc() before assigning timestamps, which can lead to inconsistencies between on disk and in memory inode timestamps.
That's a problem that can be fixed independently of y2038 support. Indeed, we can be quite lazy about updating timestamps - by intent and design we usually have different timestamps in memory compared to on disk, which is one of the reasons why there are so many different ways to change and update timestamps....
This has nothing to do with lazy updates. This is about writing wrong granularities and non clamped values to in-memory inode.
Which really shouldn't happen because we should be clamping and/or truncating timestamps at the creation/entry point into the VFS/filesystem.
e.g. current_fs_time(sb) is how filesystems grab the current kernel time for timestamp updates. Add an equivalent current_fs_time64(sb) to return timespec64 and do clamping and limit warning, and now you have a simple vehicle for converting the VFS and filesystems to support y2038k clean date formats.
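As a rough illustration of what such a helper might do, here is a userspace sketch that clamps a 64-bit timestamp to a per-superblock range and then truncates it to the filesystem's granularity. The struct layouts, the field names `time_min`, `time_max` and `time_gran`, and the function `model_fs_time64` are all assumptions made for the example, not the kernel's actual definitions; the limit warning is omitted.

```c
#include <stdint.h>

/* Userspace stand-in for struct timespec64. */
struct model_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Hypothetical per-superblock timestamp limits and granularity (ns). */
struct model_sb {
	int64_t      time_min;
	int64_t      time_max;
	unsigned int time_gran;
};

/*
 * Clamp 'now' into [time_min, time_max], then round tv_nsec down to a
 * multiple of time_gran. Zeroing tv_nsec on a clamp is a choice made
 * for this sketch.
 */
static struct model_timespec64
model_fs_time64(const struct model_sb *sb, struct model_timespec64 now)
{
	if (now.tv_sec < sb->time_min) {
		now.tv_sec = sb->time_min;
		now.tv_nsec = 0;
	} else if (now.tv_sec > sb->time_max) {
		now.tv_sec = sb->time_max;
		now.tv_nsec = 0;
	}

	/* A granularity of 0 or 1 ns means nothing to truncate. */
	if (sb->time_gran > 1)
		now.tv_nsec -= now.tv_nsec % (long)sb->time_gran;

	return now;
}
```

With a granularity of one second (1000000000 ns) the modulo removes the whole nanosecond part, which matches the behaviour expected from a filesystem that only stores seconds.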
If there are places where filesystems are receiving or using unchecked timestamps then those are bugs that need fixing. Those need to be in separate patches to y2038k support...
Cheers,
Dave.
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
On Jan 11, 2016, at 04:33, Dave Chinner david@fromorbit.com wrote:
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
- How to achieve a seamless transition? Is inode_timespec solution agreed upon to achieve 1a?
No. Just convert direct to timespec64.
The hard part here is how to split that change into logical patches per file system. We have already discussed all sorts of ways to do that, but there is no ideal solution, as you usually end up either having some really large patches, or you have to modify the same lines multiple times.
The most promising approaches are:
a) In Deepa's current patch set, some infrastructure is first introduced by changing the type from timespec to an identical inode_timespec, which lets us convert one file system at a time to inode_timespec and then change the type once they are all done. The downside is then that all file systems have to get touched twice so we end up with timespec64 everywhere.
b) A variation of that which I would do is to have a smaller set of infrastructure first, so we can change one file system at a time to timespec64 while leaving the common structures to use timespec until all file systems are converted. The downside is the use of some conversion macros when accessing the times in the inode. When the common code is changed, those accessor macros get turned into trivial assignments that can be removed later or changed in the same patch.
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
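To make approach b) more concrete, the following userspace sketch shows what conversion accessors of that kind might look like while the common inode still stores the old type. All names here (`model_inode`, `MODEL_INODE_GET_MTIME`, `MODEL_INODE_SET_MTIME`, `to_ts`, `to_ts64`) are hypothetical and only illustrate the idea; they are not taken from the patch series.

```c
#include <stdint.h>

/* Stand-ins for the old and new time types. */
struct model_timespec   { long    tv_sec; long tv_nsec; };
struct model_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Common structure still using the old type during the transition. */
struct model_inode {
	struct model_timespec i_mtime;
};

static inline struct model_timespec64 to_ts64(struct model_timespec ts)
{
	struct model_timespec64 r = { ts.tv_sec, ts.tv_nsec };
	return r;
}

static inline struct model_timespec to_ts(struct model_timespec64 ts64)
{
	/* Narrowing: on 32-bit this loses values beyond 2038. */
	struct model_timespec r = { (long)ts64.tv_sec, ts64.tv_nsec };
	return r;
}

/*
 * Filesystem code works with the 64-bit type and goes through these
 * accessors. Once the inode field itself becomes 64-bit, both macros
 * collapse into plain assignments.
 */
#define MODEL_INODE_GET_MTIME(inode)       to_ts64((inode)->i_mtime)
#define MODEL_INODE_SET_MTIME(inode, ts64) ((inode)->i_mtime = to_ts(ts64))
```

Approach c) would use the same kind of helpers in the opposite direction: the inode field would be 64-bit first and unconverted filesystems would narrow and widen at the boundary.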
An alternate approach is included in the cover letter.
3. policy for handling out of range timestamps: There was no conclusion on this from the previous series as noted in the cover letter.
   a. sysadmin through sysctl (Arnd's suggestion)
   b. have default vfs handlers with an option for individual fs to override.
   c. clamp and ignore
I think it's a mix - if the timestamps come in from userspace, fail with ERANGE. That could be controlled by sysctl via VFS part of the ->setattr operation, or in each of the individual FS implementations. If they come from the kernel (e.g. atime update) then the generic behaviour is to warn and continue; filesystems can otherwise select their own policy for kernel updates via ->update_time.
I'd prefer not to have it done by the individual file system implementation, so we get a consistent behavior. Normally you either care about correct time stamps, or you care about interoperability and you don't want to have errors returned here.
It could be done per mount, but that seems overly complicated for rather little to be gained.
d. disable expired fs at compile time (Arnd's suggestion)
Not really an option, because it means we can't use filesystems that interop with other systems (e.g. cameras, etc) because they won't support y2038k timestamps for a long time, if ever (e.g. vfat).
Let me clarify what my idea is here: I want a global kernel option that disables all code that has known y2038 issues. If anyone tries to build an embedded system with support beyond 2038, that should disable all of those things, including file systems, drivers and system calls, so we can reasonably assume that everything that works today with that kernel build will keep working in the future and not break in random ways.
For a file system, this can be done in a number of ways:
* Most file systems today interpret the time as an unsigned 32-bit number (as opposed to signed as ext3, xfs and few others do), so as long as we use timespec64 in the syscalls, we are ok.
* Some legacy file systems (maybe hfs) can remain disabled, as nobody cares about them any more.
* If we still care about them (e.g. ext2), we can make them support only read-only mode. In ext4, this would mean forbidding write access to file systems that don't have the extended inode format enabled.
Normal users that don't care about not breaking in 2038 obviously won't set the option, and have the same level of backwards compatibility support as today.
The problem really is that there is more than one way of updating these attributes (timestamps in this particular case). The side effect of this is that we don't always call timespec_trunc() before assigning timestamps, which can lead to inconsistencies between on disk and in memory inode timestamps.
That's a problem that can be fixed independently of y2038 support. Indeed, we can be quite lazy about updating timestamps - by intent and design we usually have different timestamps in memory compared to on disk, which is one of the reasons why there are so many different ways to change and update timestamps....
This has nothing to do with lazy updates. This is about writing wrong granularities and non clamped values to in-memory inode.
Which really shouldn't happen because we should be clamping and/or truncating timestamps at the creation/entry point into the VFS/filesystem.
e.g. current_fs_time(sb) is how filesystems grab the current kernel time for timestamp updates. Add an equivalent current_fs_time64(sb) to return timespec64 and do clamping and limit warning, and now you have a simple vehicle for converting the VFS and filesystems to support y2038k clean date formats.
I think the current patch series does this already.
If there are places where filesystems are receiving or using unchecked timestamps then those are bugs that need fixing. Those need to be in separate patches to y2038k support...
Fair enough, but that probably means that patch series will have to come first. This will also reduce the number of places in which a separate type conversion function needs to be added.
Arnd
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
On Jan 11, 2016, at 04:33, Dave Chinner david@fromorbit.com wrote:
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
- How to achieve a seamless transition? Is inode_timespec solution agreed upon to achieve 1a?
No. Just convert direct to timespec64.
The hard part here is how to split that change into logical patches per file system. We have already discussed all sorts of ways to do that, but there is no ideal solution, as you usually end up either having some really large patches, or you have to modify the same lines multiple times.
The most promising approaches are:
a) In Deepa's current patch set, some infrastructure is first introduced by changing the type from timespec to an identical inode_timespec, which lets us convert one file system at a time to inode_timespec and then change the type once they are all done. The downside is then that all file systems have to get touched twice so we end up with timespec64 everywhere.
b) A variation of that which I would do is to have a smaller set of infrastructure first, so we can change one file system at a time to timespec64 while leaving the common structures to use timespec until all file systems are converted. The downside is the use of some conversion macros when accessing the times in the inode. When the common code is changed, those accessor macros get turned into trivial assignments that can be removed later or changed in the same patch.
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
Just a clarification here: approaches b and c also need some functions that take times as arguments, including a function pointer in the vfs layer to be supported in both forms: timespec and timespec64 concurrently. As included in the cover letter, these are: generic_update_time(), inode->i_op->update_time(), lease_get_mtime(), fstack_copy_attr_all(), setattr_copy(), generic_fillattr().
-Deepa
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
On Jan 11, 2016, at 04:33, Dave Chinner david@fromorbit.com wrote:
On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
- How to achieve a seamless transition? Is inode_timespec solution agreed upon to achieve 1a?
No. Just convert direct to timespec64.
The hard part here is how to split that change into logical patches per file system. We have already discussed all sorts of ways to do that, but there is no ideal solution, as you usually end up either having some really large patches, or you have to modify the same lines multiple times.
The most promising approaches are:
a) In Deepa's current patch set, some infrastructure is first introduced by changing the type from timespec to an identical inode_timespec, which lets us convert one file system at a time to inode_timespec and then change the type once they are all done. The downside is then that all file systems have to get touched twice so we end up with timespec64 everywhere.
Yes, and the result is not pretty.
b) A variation of that which I would do is to have a smaller set of infrastructure first, so we can change one file system at a time to timespec64 while leaving the common structures to use timespec until all file systems are converted. The downside is the use of some conversion macros when accessing the times in the inode. When the common code is changed, those accessor macros get turned into trivial assignments that can be removed later or changed in the same patch.
Doesn't really make a lot of sense to me. We have to change everything eventually, and it's not that much work to do so up front...
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
No new helpers are necessary - we've already got the helper functions we need. This:
 int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = d_inode(old_dentry);
+	struct inode_timespec now = current_fs_time(inode->i_sb);
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	VFS_INODE_SET_XTIME(i_ctime, inode, now);
+	VFS_INODE_SET_XTIME(i_mtime, dir, now);
+	VFS_INODE_SET_XTIME(i_ctime, dir, now);
 	inc_nlink(inode);
 .....
is just wrong. All the type conversion and clamping and checking done in that VFS_INODE_SET_XTIME() should be done in current_fs_time() and have it return a timespec64 directly. Indeed, it already does truncation, and can easily be made to do range clamping, too. i.e. the change should simply be:
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_fs_time(inode->i_sb);
This will work cleanly with a s/timespec/timespec64/ API changeover, too, without the need for any additional helpers at all.
i.e. the first thing to do is make sure everything that creates/converts timestamps uses/returns a struct timespec that is clamped and truncated at the earliest possible opportunity. i.e. on entry to the kernel and when pulled from the kernel time. Once all the code is using and passing around truncated+clamped struct timespec (like above), then we can simply do a s/timespec/timespec64/ on the inode and API, and we're done.
I think it's a mix - if the timestamps come in from userspace, fail with ERANGE. That could be controlled by sysctl via VFS part of the ->setattr operation, or in each of the individual FS implementations. If they come from the kernel (e.g. atime update) then the generic behaviour is to warn and continue; filesystems can otherwise select their own policy for kernel updates via ->update_time.
I'd prefer not to have it done by the individual file system implementation, so we get a consistent behavior.
Can't be helped, because different filesystems have different timestamp behaviours, and not all use the generic VFS code for timestamp updates. The filesystems need to use the correct helper functions to obtain a valid current time, but you can't stop them from storing and using arbitrary timestamp formats if they so desire...
d. disable expired fs at compile time (Arnd's suggestion)
Not really an option, because it means we can't use filesystems that interop with other systems (e.g. cameras, etc) because they won't support y2038k timestamps for a long time, if ever (e.g. vfat).
Let me clarify what my idea is here: I want a global kernel option that disables all code that has known y2038 issues. If anyone tries to build an embedded system with support beyond 2038, that should disable all of those things, including file systems, drivers and system calls, so we can reasonably assume that everything that works today with that kernel build will keep working in the future and not break in random ways.
It's not that black and white when it comes to filesystems. y2038k support is determined by the on-disk structure of the filesystem being mounted, and that is determined at mount time. When the filesystem mounts and sets its valid timestamp ranges the VFS will need to decide as to whether the filesystem is allowed to continue mounting or not.
For a file system, this can be done in a number of ways:
- Most file systems today interpret the time as an unsigned 32-bit number (as opposed to signed as ext3, xfs and few others do), so as long as we use timespec64 in the syscalls, we are ok.
Actually, sign conversion is a problem we currently have to be very careful of. See, for example, xfstests:tests/generic/258, which tests timestamps recording times before epoch. i.e. in XFS we have to convert the unsigned 32 bit disk timestamp to signed 32 bit before storing it in the VFS timestamp so it behaves correctly on 64 bit systems. This results in us needing to do this when reading the inode from disk:
+	/*
+	 * time is signed, so need to convert to signed 32 bit before
+	 * storing in inode timestamp which may be 64 bit. Otherwise
+	 * a time before epoch is converted to a time long after epoch
+	 * on 64 bit systems.
+	 */
+	inode->i_atime.tv_sec = (int)be32_to_cpu(from->di_atime.t_sec);
+	inode->i_atime.tv_nsec = (int)be32_to_cpu(from->di_atime.t_nsec);
+	inode->i_mtime.tv_sec = (int)be32_to_cpu(from->di_mtime.t_sec);
+	inode->i_mtime.tv_nsec = (int)be32_to_cpu(from->di_mtime.t_nsec);
+	inode->i_ctime.tv_sec = (int)be32_to_cpu(from->di_ctime.t_sec);
+	inode->i_ctime.tv_nsec = (int)be32_to_cpu(from->di_ctime.t_nsec);
+
(http://oss.sgi.com/archives/xfs/2016-01/msg00456.html)
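The effect of the cast can be reproduced in plain userspace C. The sketch below assumes the 32-bit value has already been converted to host byte order (the kernel code uses be32_to_cpu() for that); the two helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Interpret a raw 32-bit on-disk seconds field as signed before
 * widening to 64 bits. Converting an out-of-range value to int32_t is
 * implementation-defined before C23; common compilers wrap as
 * two's complement, which is what this sketch relies on.
 */
static inline int64_t disk_sec_as_signed(uint32_t raw)
{
	return (int64_t)(int32_t)raw;
}

/* Widen without the cast: the value is treated as unsigned. */
static inline int64_t disk_sec_as_unsigned(uint32_t raw)
{
	return (int64_t)raw;
}
```

For a raw value of 0xFFFFFFFF the signed variant yields -1 (one second before the epoch), while the unsigned variant yields 4294967295, a time in 2106. That is the "time long after epoch" the comment in the patch warns about.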
Some legacy file systems (maybe hfs) can remain disabled, as nobody cares about them any more.
If we still care about them (e.g. ext2), we can make them support only read-only mode. In ext4, this would mean forbidding write access to file systems that don't have the extended inode format enabled.
For ext2/4, that would have to be handled internally by the filesystem with feature masks. For other legacy filesystems, then the VFS mount time checking could allow RO mounts if the supported ranges are not y2038k clean. Compile time options are not really the best approach here...
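A mount-time check of the kind suggested here might look roughly like the sketch below. It is a userspace illustration under stated assumptions: the filesystem reports the largest timestamp it can store, "y2038 clean" is taken to mean that this limit exceeds the signed 32-bit maximum, and `require_y2038_clean` stands in for whatever policy switch (sysctl, mount option or build option) ends up being chosen. `model_mount_writable` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a mount may be writable.
 * fs_time_max:          largest seconds value the on-disk format stores
 * require_y2038_clean:  policy switch (assumed, see lead-in)
 * want_rw:              whether the caller asked for a read-write mount
 */
static bool model_mount_writable(int64_t fs_time_max,
				 bool require_y2038_clean, bool want_rw)
{
	bool clean = fs_time_max > INT32_MAX;

	if (!want_rw)
		return false;           /* read-only was requested anyway */
	if (require_y2038_clean && !clean)
		return false;           /* force a read-only mount */
	return true;
}
```

A filesystem whose limit is the signed 32-bit maximum would then mount read-write only when the strict policy is off.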
If there are places where filesystems are receiving or using unchecked timestamps then those are bugs that need fixing. Those need to be in separate patches to y2038k support...
Fair enough, but that probably means that patch series will have to come first.
Yes, that is normal practice for structuring a non-trivial feature addition patch series. Bug fixes first, then cleanups, then the functionality changes.
This will also reduce the number of places in which a separate type conversion function needs to be added.
Precisely. I'm pretty sure this should come down to "no new conversion functions needed" for the vast majority of the code.
Cheers,
Dave.
On Friday 15 January 2016 08:00:01 Dave Chinner wrote:
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
No new helpers are necessary - we've already got the helper functions we need. This:
 int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = d_inode(old_dentry);
+	struct inode_timespec now = current_fs_time(inode->i_sb);
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	VFS_INODE_SET_XTIME(i_ctime, inode, now);
+	VFS_INODE_SET_XTIME(i_mtime, dir, now);
+	VFS_INODE_SET_XTIME(i_ctime, dir, now);
 	inc_nlink(inode);
 .....
is just wrong. All the type conversion and clamping and checking done in that VFS_INODE_SET_XTIME() should be done in current_fs_time() and have it return a timespec64 directly. Indeed, it already does truncation, and can easily be made to do range clamping, too. i.e. the change should simply be:
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_fs_time(inode->i_sb);
Yes, that is the obvious case, and I guess works for at least half the file systems when they always assign righthand side and lefthand side of the time stamps using the external types or helpers like CURRENT_TIME and current_fs_time().
However, there are a couple of file systems that need a bit more refactoring before we can do this, e.g. in ntfs_truncate:
	if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
		struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
		int sync_it = 0;

		if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
		    !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
			sync_it = 1;
		VFS_I(base_ni)->i_mtime = now;
		VFS_I(base_ni)->i_ctime = now;
	}
The type of the local variable must match the return type of current_fs_time(), so if we change over i_mtime and current_fs_time globally, this either has to be rewritten first to avoid the use of local variables, or it needs temporary conversion helpers, or it has to be changed in the same patch. None of those is particularly appealing. There are a few dozen such things in various file systems.
I think it's a mix - if the timestamps come in from userspace, fail with ERANGE. That could be controlled by sysctl via VFS part of the ->setattr operation, or in each of the individual FS implementations. If they come from the kernel (e.g. atime update) then the generic behaviour is to warn and continue; filesystems can otherwise select their own policy for kernel updates via ->update_time.
I'd prefer not to have it done by the individual file system implementation, so we get a consistent behavior.
Can't be helped, because different filesystems have different timestamp behaviours, and not all use the generic VFS code for timestamp updates. The filesystems need to use the correct helper functions to obtain a valid current time, but you can't stop them from storing and using arbitrary timestamp formats if they so desire...
I mean the decision whether to clamp or error on an overflow should be done consistently. Having a global sysctl knob or a compile-time option is better than having each file system implementor take a guess at what users might prefer, if we can't come up with a behavior (e.g. clamp all the time, or error out all the time) that everybody agrees is always correct.
d. disable expired fs at compile time (Arnd's suggestion)
Not really an option, because it means we can't use filesystems that interop with other systems (e.g. cameras, etc) because they won't support y2038k timestamps for a long time, if ever (e.g. vfat).
Let me clarify what my idea is here: I want a global kernel option that disables all code that has known y2038 issues. If anyone tries to build an embedded system with support beyond 2038, that should disable all of those things, including file systems, drivers and system calls, so we can reasonably assume that everything that works today with that kernel build will keep working in the future and not break in random ways.
It's not that black and white when it comes to filesystems. y2038k support is determined by the on-disk structure of the filesystem being mounted, and that is determined at mount time. When the filesystem mounts and sets its valid timestamp ranges the VFS will need to decide as to whether the filesystem is allowed to continue mounting or not.
Some file systems are always broken around 2038 (e.g. HFS in 2040), so if we can't fix them, I want to be able to turn them off in Kconfig along with the 32-bit time_t syscalls. For those where y2038 support depends on on-disk feature flags (ext4 and xfs I guess, maybe one or two more), the config option can turn off write support for the old format.
This is a rather important part of the y2038 work: If anyone cares about y2038 problems in this decade, they want to deploy systems with extremely long service contracts and don't want to update them 20 years from now to fix an obscure bug that could be prevented today, so we try hard to identify every line of code that won't work then as it does today.
For a file system, this can be done in a number of ways:
- Most file systems today interpret the time as an unsigned 32-bit number (as opposed to signed as ext3, xfs and few others do), so as long as we use timespec64 in the syscalls, we are ok.
Actually, sign conversion is a problem we currently have to be very careful of. See, for example, xfstests:tests/generic/258, which tests timestamps recording times before epoch. i.e. in XFS we have to convert the unsigned 32 bit disk timestamp to signed 32 bit before storing it in the VFS timestamp so it behaves correctly on 64 bit systems. This results in us needing to do this when reading the inode from disk:
/*
* time is signed, so need to convert to signed 32 bit before
* storing in inode timestamp which may be 64 bit. Otherwise
* a time before epoch is converted to a time long after epoch
* on 64 bit systems.
*/
inode->i_atime.tv_sec = (int)be32_to_cpu(from->di_atime.t_sec);
inode->i_atime.tv_nsec = (int)be32_to_cpu(from->di_atime.t_nsec);
inode->i_mtime.tv_sec = (int)be32_to_cpu(from->di_mtime.t_sec);
inode->i_mtime.tv_nsec = (int)be32_to_cpu(from->di_mtime.t_nsec);
inode->i_ctime.tv_sec = (int)be32_to_cpu(from->di_ctime.t_sec);
inode->i_ctime.tv_nsec = (int)be32_to_cpu(from->di_ctime.t_nsec);
I'm very aware of this issue. Most file system developers however were not, so e.g. a timestamp on btrfs is interpreted differently on 32-bit and 64-bit kernels. Some file systems (AFS, NFSv3?) explicitly define the timestamps as unsigned, and most others that store 32-bit seconds apparently never thought about the issue and happen to use unsigned interpretation on 64-bit systems, since that is what you get out of be32_to_cpu() without adding the cast.
For the y2038 case, this means we are lucky: almost all the users today have 64-bit hardware, so they can already represent timestamps in the range from 1970 to 2106. Once we have 64-bit timestamps in 32-bit user space, we just need to make that use the same format we use on the 64-bit machines already.
ext2/3/4, xfs and ocfs2 (maybe one or two more, I'd have to check) currently behave in a consistent manner across 32-bit and 64-bit architectures by allowing a range between 1902 and 2037, and we obviously don't have a choice there but to keep that current behavior, and extend the time format in one way or another to store additional bits for the epoch.
Some legacy file systems (maybe hfs) can remain disabled, as nobody cares about them any more.
If we still care about them (e.g. ext2), we can make them support only read-only mode. In ext4, this would mean forbidding write access to file systems that don't have the extended inode format enabled.
For ext2/4, that would have to be handled internally by the filesystem with feature masks. For other legacy filesystems, then the VFS mount time checking could allow RO mounts if the supported ranges are not y2038k clean. Compile time options are not really the best approach here...
I'm not following the line of thought here. We have some users that want ext4 to mount old file system images without long inodes writable, because they don't care about the 2038 problem. We also have other users that want to force the same file system image to be read-only because they want to ensure that it does not stop working correctly when the time overflow happens while the fs is mounted.
If you don't want a compile-time option for it, how do you suggest we decide which case we have?
Arnd
On Thursday 14 January 2016 23:46:16 Arnd Bergmann wrote:
I'm not following the line of thought here. We have some users that want ext4 to mount old file system images without long inodes writable, because they don't care about the 2038 problem. We also have other users that want to force the same file system image to be read-only because they want to ensure that it does not stop working correctly when the time overflow happens while the fs is mounted.
If you don't want a compile-time option for it, how do you suggest we decide which case we have?
In case that came across wrong, I'm assuming that the first user also wants all the system calls enabled that pass 32-bit time_t values, while the second one wants them all left out from the kernel to ensure that no user space program gets incorrect data. This could be done using a sysctl of course, but I still think we want a compile-time option for the syscalls for clarity, and I would simply use the same compile-time option to determine the behavior of the file system, network protocols and device drivers that deal with 32-bit timestamps outside of the kernel.
Arnd
On Thu, Jan 14, 2016 at 11:54:36PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 23:46:16 Arnd Bergmann wrote:
I'm not following the line of thought here. We have some users that want ext4 to mount old file system images without long inodes writable, because they don't care about the 2038 problem. We also have other users that want to force the same file system image to be read-only because they want to ensure that it does not stop working correctly when the time overflow happens while the fs is mounted.
If you don't want a compile-time option for it, how do you suggest we decide which case we have?
In case that came across wrong, I'm assuming that the first user also wants all the system calls enabled that pass 32-bit time_t values, while the second one wants them all left out from the kernel to ensure that no user space program gets incorrect data.
system call API support is a completely different class of problem. It's out of the scope of this patchset, and really I don't care what you do with them.
The point I'm making is that we'll have to modify all the existing filesystem code to supply a valid timestamp range to the VFS at mount time for the range checking/clamping, similar to how we do the granularity specification right now. That means we can do rejection of non-y2038k compliant filesystems at runtime based on what the filesystem tells the VFS it supports. Set up the default to be reject if rw, allow if ro, and provide a mount option to override and allow mounting rw.
Users can then make the decision when mounting their filesystems. If they are system/automatically mounted filesystems and aren't y2038k compliant, then the override option can be added to /etc/fstab and we're all good. If the truly paranoid users want to disallow the override and/or read only mount options, then add a sysctl to control that.
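A rough userspace sketch of that mount-time decision follows. Every name in it is hypothetical (there is no such VFS function), the "required" limit is an assumed policy input, and the -EINVAL return value is an arbitrary choice for the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: the range a filesystem reports to the VFS at mount time. */
struct fs_time_range {
	int64_t min_sec;	/* earliest timestamp the format can store */
	int64_t max_sec;	/* latest timestamp the format can store */
};

/* Assumed policy input: the format must reach past the 32-bit limit. */
#define REQUIRED_MAX_SEC	((int64_t)INT32_MAX + 1)

/*
 * Default: reject if rw, allow if ro. A mount option can override the
 * rejection unless a (hypothetical) sysctl disallows the override.
 */
static int y2038_mount_check(const struct fs_time_range *r, bool read_only,
			     bool override_opt, bool sysctl_allow_override)
{
	if (r->max_sec >= REQUIRED_MAX_SEC)
		return 0;	/* format is y2038 clean */
	if (read_only)
		return 0;	/* ro mounts are allowed by default */
	if (override_opt && sysctl_allow_override)
		return 0;	/* admin explicitly accepted the risk */
	return -EINVAL;		/* reject the rw mount */
}
```

An /etc/fstab entry carrying the override option would correspond to override_opt being true here.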
Cheers,
Dave.
On Friday 15 January 2016 13:27:34 Dave Chinner wrote:
On Thu, Jan 14, 2016 at 11:54:36PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 23:46:16 Arnd Bergmann wrote:
I'm not following the line of thought here. We have some users that want ext4 to mount old file system images without long inodes writable, because they don't care about the 2038 problem. We also have other users that want to force the same file system image to be read-only because they want to ensure that it does not stop working correctly when the time overflow happens while the fs is mounted.
If you don't want a compile-time option for it, how do you suggest we decide which case we have?
In case that came across wrong, I'm assuming that the first user also wants all the system calls enabled that pass 32-bit time_t values, while the second one wants them all left out from the kernel to ensure that no user space program gets incorrect data.
system call API support is a completely different class of problem. It's out of the scope of this patchset, and really I don't care what you do with them.
Sure, I was just providing some background about why we want a compile-time option in general.
The point I'm making is that we'll have to modify all the existing filesystem code to supply a valid timestamp range to the VFS at mount time for the range checking/clamping, similar to how we do the granularity specification right now. That means we can do rejection of non-y2038k compliant filesystems at runtime based on what the filesystem tells the VFS it supports. Set up the default to be reject if rw, allow if ro, and provide a mount option to override and allow mounting rw.
We can't really default to "reject if rw", because that would break all systems using ext3 or xfs, unless users modify their fstab or set the flag that makes the partition y2038 compliant.
The compile-time option that I'm thinking of would change the default between "always allow" and "reject if rw", based on whether the system cares about this issue or not. Almost everyone today won't care about it at all and would be rather annoyed by being unable to mount their rootfs, but some people care about the behavior a lot.
Having a global sysctl or mount option as an override would be good, maybe both if that isn't over-engineering the problem when we already have a compile-time option.
Arnd
On Fri, Jan 15, 2016 at 06:01:36PM +0100, Arnd Bergmann wrote:
On Friday 15 January 2016 13:27:34 Dave Chinner wrote:
The point I'm making is that we'll have to modify all the existing filesystem code to supply a valid timestamp range to the VFS at mount time for the range checking/clamping, similar to how we do the granularity specification right now. That means we can do rejection of non-y2038k compliant filesystems at runtime based on what the filesystem tells the VFS it supports. Set up the default to be reject if rw, allow if ro, and provide a mount option to override and allow mounting rw.
We can't really default to "reject if rw", because that would break all systems using ext3 or xfs, unless users modify their fstab or set the flag that makes the partition y2038 compliant.
Right, I was referring to the behaviour of a y2038k compliant kernel. A current non-compliant kernel will have the default behaviour you are suggesting.
The compile-time option that I'm thinking of would change the default between "always allow" and "reject if rw", based on whether the system cares about this issue or not. Almost everyone today won't care about it at all and would be rather annoyed by being unable to mount their rootfs, but some people care about the behavior a lot.
Yup, that's exactly what I was implying.
Having a global sysctl or mount option as an override would be good, maybe both if that isn't over-engineering the problem when we already have a compile-time option.
Distros should not be forced to ship multiple kernels just to provide all the different runtime compliance behaviours their users require. Make the policy runtime enforceable, but select the default behaviour and supported policies via compile time options.
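One way to picture "compile-time default, runtime policy" is the sketch below. CONFIG_Y2038_STRICT and all other names are invented for this example and do not exist in the kernel; the runtime setter stands in for whatever sysctl or mount option would eventually carry the policy.

```c
#include <stdbool.h>

enum y2038_policy {
	Y2038_ALWAYS_ALLOW,	/* mount rw regardless of timestamp range */
	Y2038_REJECT_RW,	/* refuse rw mounts of non-clean filesystems */
};

/* Hypothetical Kconfig symbol: it only selects the default. */
#ifdef CONFIG_Y2038_STRICT
#define Y2038_DEFAULT_POLICY	Y2038_REJECT_RW
#else
#define Y2038_DEFAULT_POLICY	Y2038_ALWAYS_ALLOW
#endif

static enum y2038_policy y2038_current_policy = Y2038_DEFAULT_POLICY;

/* Stand-in for a sysctl write handler. */
static void y2038_set_policy(enum y2038_policy p)
{
	y2038_current_policy = p;
}

static bool y2038_rw_mount_allowed(bool fs_is_clean)
{
	return fs_is_clean || y2038_current_policy == Y2038_ALWAYS_ALLOW;
}
```

With this shape a single distro kernel can serve both kinds of user: the build picks the default, and the administrator can still switch policy at runtime.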
Cheers,
Dave
On Thu, Jan 14, 2016 at 11:46:16PM +0100, Arnd Bergmann wrote:
On Friday 15 January 2016 08:00:01 Dave Chinner wrote:
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
No new helpers are necessary - we've already got the helper functions we need. This:
 int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = d_inode(old_dentry);
+	struct inode_timespec now = current_fs_time(inode->i_sb);

-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	VFS_INODE_SET_XTIME(i_ctime, inode, now);
+	VFS_INODE_SET_XTIME(i_mtime, dir, now);
+	VFS_INODE_SET_XTIME(i_ctime, dir, now);
 	inc_nlink(inode);
.....
is just wrong. All the type conversion and clamping and checking done in that VFS_INODE_SET_XTIME() should be done in current_fs_time() and have it return a timespec64 directly. Indeed, it already does truncation, and can easily be made to do range clamping, too. i.e. the change should simply be:
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_fs_time(inode->i_sb);
Yes, that is the obvious case, and I guess works for at least half the file systems when they always assign righthand side and lefthand side of the time stamps using the external types or helpers like CURRENT_TIME and current_fs_time().
However, there are a couple of file systems that need a bit more refactoring before we can do this, e.g. in ntfs_truncate:
Sure, and nfs is a pain because of all its internal use of timespecs, too. But we have these timespec_to_timespec64 helper functions, and that's what we should use in these cases where the filesystem cannot support full 64 bit timestamps internally. In those cases, they'll be telling the superblock this at mount time, so things like current_fs_time() won't be returning a timestamp that is out of range for a 32 bit timestamp....
	if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
		struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
		int sync_it = 0;

		if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
		    !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
			sync_it = 1;
		VFS_I(base_ni)->i_mtime = now;
		VFS_I(base_ni)->i_ctime = now;
	}
The type of the local variable must match the return code of current_fs_time(), so if we change over i_mtime and current_fs_time globally, this either has to be rewritten first to avoid the use of local variables, or it needs temporary conversion helpers, or it has to be changed in the same patch. None of those is particularly appealing. There are a few dozen such things in various file systems.
it gets rewritten to:
	struct timespec now;

	now = timespec64_to_timespec(current_fs_time(VFS_I(base_ni)->i_sb));
	....
and the valid timestamp range for ntfs is set to 32bit timestamps. This then leaves it up to the filesystem developers to make the ntfs filesystem code 64 bit timestamp clean if the on disk format is ever changed to support 64 bit times.
Same goes for NFS, and any of the other filesystems that use struct timespec internally for time representation.
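The reason the valid range has to be declared as 32-bit in that case can be shown with a userspace sketch. The struct and function names are stand-ins invented for this example (they mimic, but are not, the kernel's timespec64_to_timespec() and timespec_to_timespec64()): narrowing is only lossless while the seconds value fits in 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for timespec64 and a 32-bit struct timespec. */
struct ts64 { int64_t tv_sec; long tv_nsec; };
struct ts32 { int32_t tv_sec; long tv_nsec; };

/* True if narrowing to 32-bit seconds loses no information. */
static bool ts64_fits_ts32(struct ts64 t)
{
	return t.tv_sec >= INT32_MIN && t.tv_sec <= INT32_MAX;
}

/* Narrowing conversion; only meaningful when ts64_fits_ts32() holds. */
static struct ts32 ts64_to_ts32(struct ts64 t)
{
	struct ts32 r = { (int32_t)t.tv_sec, t.tv_nsec };
	return r;
}

/* Widening conversion; always lossless. */
static struct ts64 ts32_to_ts64(struct ts32 t)
{
	struct ts64 r = { t.tv_sec, t.tv_nsec };
	return r;
}
```

If the superblock range guarantees that current_fs_time() never returns a value outside the 32-bit range, the narrowing call in the filesystem is always on the safe side of ts64_fits_ts32().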
I think it's a mix - if the timestamps come in from userspace, fail with ERANGE. That could be controlled by sysctl via VFS part of the ->setattr operation, or in each of the individual FS implementations. If they come from the kernel (e.g. atime update) then the generic behaviour is to warn and continue, filesystems can otherwise select their own policy for kernel updates via ->update_time.
I'd prefer not to have it done by the individual file system implementation, so we get a consistent behavior.
Can't be helped, because different filesystems have different timestamp behaviours, and not all use the generic VFS code for timestamp updates. The filesystems need to use the correct helper functions to obtain a valid current time, but you can't stop them from storing and using arbitrary timestamp formats if they so desire...
I mean the decision whether to clamp or error on an overflow should be done consistently.
Sure, but that comes from using the helpers we already have, and applying the clamping at a point where errors can be returned to userspace. current_fs_time() should never return a timestamp outside what the filesystem has said it supports and we've validated that behaviour at mount time. Hence it's only user provided timestamps that we need range errors on.
Having a global sysctl knob or a compile-time option is better than having each file system implementor take a guess at what users might prefer, if we can't come up with a behavior (e.g. clamp all the time, or error out all the time) that everybody agrees is always correct.
filesystem implementors will use the helper functions that are provided for this. If they don't (like all the current use of CURRENT_TIME), then that's a bug that needs fixing. i.e. we need a timespec_clamp() function, similar to timespec_trunc(), and y2038k compliant filesystems and syscalls need to use them....
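As a sketch of the split being discussed (clamp what the kernel generates, return a range error for what userspace supplies), something like the following would do. The types and function names are hypothetical stand-ins, not the proposed kernel API, and zeroing the nanoseconds on a clamp is an arbitrary choice made for this example.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for timespec64 and the per-superblock timestamp range. */
struct ts64 { int64_t tv_sec; long tv_nsec; };
struct sb_range { int64_t time_min; int64_t time_max; };

/* Kernel-generated timestamps: clamp into the supported range. */
static struct ts64 ts64_clamp(struct ts64 t, const struct sb_range *sb)
{
	if (t.tv_sec < sb->time_min) {
		t.tv_sec = sb->time_min;
		t.tv_nsec = 0;
	} else if (t.tv_sec > sb->time_max) {
		t.tv_sec = sb->time_max;
		t.tv_nsec = 0;
	}
	return t;
}

/* User-supplied timestamps: report out-of-range values as an error. */
static int user_time_check(const struct sb_range *sb, int64_t sec)
{
	if (sec < sb->time_min || sec > sb->time_max)
		return -ERANGE;
	return 0;
}
```

The check would sit on the setattr-style path where an error can still reach userspace; the clamp would sit behind helpers such as current_fs_time().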
Not really an option, because it means we can't use filesystems that interop with other systems (e.g. cameras, etc) because they won't support y2038k timestamps for a long time, if ever (e.g. vfat).
Let me clarify what my idea is here: I want a global kernel option that disables all code that has known y2038 issues. If anyone tries to build an embedded system with support beyond 2038, that should disable all of those things, including file systems, drivers and system calls, so we can reasonably assume that everything that works today with that kernel build will keep working in the future and not break in random ways.
It's not that black and white when it comes to filesystems. y2038k support is determined by the on-disk structure of the filesystem being mounted, and that is determined at mount time. When the filesystem mounts and sets its valid timestamp ranges the VFS will need to decide as to whether the filesystem is allowed to continue mounting or not.
Some file systems are always broken around 2038 (e.g. HFS in 2040), so if we can't fix them, I want to be able to turn them off in Kconfig along with the 32-bit time_t syscalls.
That can be done with kconfig depends rules - it has nothing to do with this patch set.
ext2/3/4, xfs and ocfs2 (maybe one or two more, I'd have to check) currently behave in a consistent manner across 32-bit and 64-bit architectures by allowing a range between 1902 and 2037, and we obviously don't have a choice there but to keep that current behavior, and extend the time format in one way or another to store additional bits for the epoch.
That's a filesystem implementation problem, not a generic inode timestamp problem. i.e. this is handled when the filesystem converts the inode timestamp from a timespec64 in the struct inode to whatever format it stores the timestamp on disk. That conversion does not change just because the VFS inode moves from a timespec to a timespec64. Again, those on-disk format changes to support beyond the current epoch are outside the scope of this patchset, because they are not affected by the timestamp format the VFS chooses to use.
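For illustration only, "additional bits for the epoch" could look like the sketch below. The layout is hypothetical and is not claimed to be the actual format of any filesystem: a signed 32-bit seconds field is kept as before, and a 2-bit epoch counter adds multiples of 2^32 seconds, so old inodes (epoch 0) decode exactly as they do today.

```c
#include <stdint.h>

/* Hypothetical on-disk timestamp: legacy seconds plus a small epoch field. */
struct disk_time {
	uint32_t sec_lo;	/* low 32 bits, interpreted as signed */
	uint8_t epoch;		/* count of 2^32-second steps added, 0..3 */
};

/* Input must lie in [-2^31, 2^34 - 2^31 - 1] for this layout. */
static struct disk_time disk_time_encode(int64_t sec)
{
	struct disk_time d;

	d.sec_lo = (uint32_t)sec;
	/* Whatever the signed low word does not cover goes in the epoch. */
	d.epoch = (uint8_t)(((sec - (int32_t)d.sec_lo) >> 32) & 3);
	return d;
}

static int64_t disk_time_decode(struct disk_time d)
{
	return (int64_t)(int32_t)d.sec_lo + ((int64_t)d.epoch << 32);
}
```

The conversion lives entirely in the filesystem's inode read/write path, which is why it is independent of whether the VFS inode holds a timespec or a timespec64.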
Cheers,
Dave.
On Friday 15 January 2016 13:49:37 Dave Chinner wrote:
On Thu, Jan 14, 2016 at 11:46:16PM +0100, Arnd Bergmann wrote:
On Friday 15 January 2016 08:00:01 Dave Chinner wrote:
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
Yes, that is the obvious case, and I guess works for at least half the file systems when they always assign righthand side and lefthand side of the time stamps using the external types or helpers like CURRENT_TIME and current_fs_time().
However, there are a couple of file systems that need a bit more refactoring before we can do this, e.g. in ntfs_truncate:
Sure, and nfs is a pain because of all its internal use of timespecs, too.
lustre is probably the worst.
But we have these timespec_to_timespec64 helper functions, and that's what we should use in these cases where the filesystem cannot support full 64 bit timestamps internally. In those cases, they'll be telling the superblock this at mount time, so things like current_fs_time() won't be returning a timestamp that is out of range for a 32 bit timestamp....
I'm not worried about the runtime problems here, just how to get a series of patches that are each doing one reasonable thing at a time.
	if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
		struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
		int sync_it = 0;

		if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
		    !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
			sync_it = 1;
		VFS_I(base_ni)->i_mtime = now;
		VFS_I(base_ni)->i_ctime = now;
	}
The type of the local variable must match the return code of current_fs_time(), so if we change over i_mtime and current_fs_time globally, this either has to be rewritten first to avoid the use of local variables, or it needs temporary conversion helpers, or it has to be changed in the same patch. None of those is particularly appealing. There are a few dozen such things in various file systems.
it gets rewritten to:
	struct timespec now;

	now = timespec64_to_timespec(current_fs_time(VFS_I(base_ni)->i_sb));
	....
and the valid timestamp range for ntfs is set to 32bit timestamps. This then leaves it up to the filesystem developers to make the ntfs filesystem code 64 bit timestamp clean if the on disk format is ever changed to support 64 bit times.
Same goes for NFS, and any of the other filesystems that use struct timespec internally for time representation.
That is what I meant in my previous mail about approach c) being ugly because it requires sprinkling lots of timespec64_to_timespec() and timespec_to_timespec64() in the initial patch in order to atomically change the types in inode/iattr/kstat/... without introducing build regressions.
It's a rather horrible patch, and quite likely to cause conflicts with other patches that introduce another use of those structures in the merge window.
Having a global sysctl knob or a compile-time option is better than having each file system implementor take a guess at what users might prefer, if we can't come up with a behavior (e.g. clamp all the time, or error out all the time) that everybody agrees is always correct.
filesystem implementors will use the helper functions that are provided for this. If they don't (like all the current use of CURRENT_TIME), then that's a bug that needs fixing.
Ok, then we are in total agreement here: the policy remains to be decided by common code, but the implementation can differ per file system.
i.e. we need a timespec_clamp() function, similar to timespec_trunc(), and y2038k compliant filesystems and syscalls need to use them....
I was thinking we end up with a single function that does both clamp() and trunc(), but that's an implementation detail.
Let me clarify what my idea is here: I want a global kernel option that disables all code that has known y2038 issues. If anyone tries to build an embedded system with support beyond 2038, that should disable all of those things, including file systems, drivers and system calls, so we can reasonably assume that everything that works today with that kernel build will keep working in the future and not break in random ways.
It's not that black and white when it comes to filesystems. y2038k support is determined by the on-disk structure of the filesystem being mounted, and that is determined at mount time. When the filesystem mounts and sets its valid timestamp ranges the VFS will need to decide as to whether the filesystem is allowed to continue mounting or not.
Some file systems are always broken around 2038 (e.g. HFS in 2040), so if we can't fix them, I want to be able to turn them off in Kconfig along with the 32-bit time_t syscalls.
That can be done with kconfig depends rules - it has nothing to do with this patch set.
kconfig dependencies is what I meant for the simple cases where a file system is known to always be broken, we just need a small modification for the cases you mentioned below.
ext2/3/4, xfs and ocfs2 (maybe one or two more, I'd have to check) currently behave in a consistent manner across 32-bit and 64-bit architectures by allowing a range between 1902 and 2037, and we obviously don't have a choice there but to keep that current behavior, and extend the time format in one way or another to store additional bits for the epoch.
That's a filesystem implementation problem, not a generic inode timestamp problem. i.e. this is handled when the filesystem converts the inode timestamp from a timespec64 in the struct inode to whatever format it stores the timestamp on disk. That conversion does not change just because the VFS inode moves from a timespec to a timespec64. Again, those on-disk format changes to support beyond the current epoch are outside the scope of this patchset, because they are not affected by the timestamp format the VFS chooses to use.
Fine with me, we can have another series to add the Kconfig dependencies and modify the file systems that need this.
Arnd
On Fri, Jan 15, 2016 at 08:00:01AM +1100, Dave Chinner wrote:
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
On Tue, Jan 12, 2016 at 07:29:57PM +1100, Dave Chinner wrote:
On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote:
> On Jan 11, 2016, at 04:33, Dave Chinner <david@fromorbit.com> wrote:
>> On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote:
c) The opposite direction from b) is to first change the common code, but then any direct assignment between a timespec in a file system and the timespec64 in the inode/iattr/kstat/etc first needs a conversion helper so we can build cleanly, and then we do one file system at a time to remove them all again while changing the internal structures in the file system from timespec to timespec64.
No new helpers are necessary - we've already got the helper functions we need. This:
 int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = d_inode(old_dentry);
+	struct inode_timespec now = current_fs_time(inode->i_sb);

-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	VFS_INODE_SET_XTIME(i_ctime, inode, now);
+	VFS_INODE_SET_XTIME(i_mtime, dir, now);
+	VFS_INODE_SET_XTIME(i_ctime, dir, now);
 	inc_nlink(inode);
.....
is just wrong. All the type conversion and clamping and checking done in that VFS_INODE_SET_XTIME() should be done in current_fs_time() and have it return a timespec64 directly. Indeed, it already does truncation, and can easily be made to do range clamping, too. i.e. the change should simply be:
-	inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_fs_time(inode->i_sb);
The current patch series already does this and macros don't do any clamping. The accessors are only illustrating a way of using a callback when clamping is detected. And, if we don't need accessors, I will find a different way of doing that if we agree it is necessary.
+struct timespec64 current_fs_time(struct super_block *sb)
+{
+	struct timespec64 now = current_kernel_time64();
+
+	return fs_time_trunc(now, sb);
+}
+EXPORT_SYMBOL(current_fs_time);
+
+struct timespec64 current_fs_time_sec(struct super_block *sb)
+{
+	struct timespec64 ts = {ktime_get_real_seconds(), 0};
+
+	/* range check for time. */
+	fs_time_range_check(sb, &ts);
+
+	return ts;
+}

+struct inode_timespec
+fs_time_trunc(struct inode_timespec t, struct super_block *sb)
+{
+	u32 gran = sb->s_time_gran;
+
+	/* range check for time. */
+	fs_time_range_check(sb, &t);
+	if (unlikely(is_fs_timestamp_bad(t)))
+		return t;
+
+	/* Avoid division in the common cases 1 ns and 1 s. */
+	if (gran == 1)
+		;/* nothing */
+	else if (gran == NSEC_PER_SEC)
+		t.tv_nsec = 0;
+	else if (gran > 1 && gran < NSEC_PER_SEC)
+		t.tv_nsec -= t.tv_nsec % gran;
+	else
+		WARN(1, "illegal file time granularity: %u", gran);
+
+	return t;
+}
But really, you are still misunderstanding what inode_timespec does. It only introduces aliases and not the accessors. This is only a way to avoid code duplication since we cannot change all fs at one time.
And, this is what you would do if you were coding in any object oriented language. This is object polymorphism.
So, I will paste here again. Whatever is below is all the extra code inode_timespec introduces:
+#ifdef CONFIG_FS_USES_64BIT_TIME
+
+/* Place holder defines until CONFIG_FS_USES_64BIT_TIME
+ * is enabled.
+ * timespec64 data type and functions will be used at that
+ * time directly and these defines will be deleted.
+ */
+#define inode_timespec timespec64
+
+#define inode_timespec_compare timespec64_compare
+#define inode_timespec_equal timespec64_equal
+
+#else
+
+#define inode_timespec timespec
+
+#define inode_timespec_compare timespec_compare
+#define inode_timespec_equal timespec_equal
+
+#endif
+
-Deepa
The current_fs_time function is not y2038 safe because of the use of struct timespec.
The macros CURRENT_TIME and CURRENT_TIME_SEC do not represent file system times correctly as they cannot perform range checks or truncations. These are also not y2038 safe. Add 64 bit versions of the above macros.
Provide a new set of FS_TIME macros which will return time in timespec or timespec64 based on CONFIG_FS_USES_64BIT_TIME. These are meant to be used only within file systems because of being tied to the above config. Once the config is enabled, the timespec version of it can be deleted and the 64 bit time version can be used elsewhere also.
Add struct timespec64 version for current_fs_time(). Current version of current_fs_time() can be dropped after enabling CONFIG_FS_USES_64BIT_TIME.
Provide an alternative to timespec_trunc(): fs_time_trunc(). This function takes super block as an argument in addition to timestamp so that it can include range and precision checks. Additionally, the function uses y2038 safe timespec64 instead of timespec for timestamp representation.
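The precision part of that function can be illustrated in isolation with a userspace sketch. The struct and function name below are stand-ins invented for the example; the branch structure follows the granularity handling in the patch, minus the range check and the WARN().

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Stand-in for timespec64. */
struct ts64 { int64_t tv_sec; long tv_nsec; };

/* Round tv_nsec down to a multiple of gran (in ns, 1..NSEC_PER_SEC). */
static struct ts64 ts64_trunc(struct ts64 t, uint32_t gran)
{
	if (gran == NSEC_PER_SEC)
		t.tv_nsec = 0;			/* whole seconds only */
	else if (gran > 1 && gran < NSEC_PER_SEC)
		t.tv_nsec -= t.tv_nsec % (long)gran;
	/* gran == 1 needs no work; invalid values are left untouched here */
	return t;
}
```

For example, a filesystem with microsecond granularity (gran = 1000) would store 123456789 ns as 123456000 ns.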
Add function: current_fs_time_sec() to obtain only the seconds portion of the current time (equivalent to CURRENT_TIME_SEC). This function has two versions selected by the config CONFIG_FS_USES_64BIT_TIME. The 32 bit version support can be dropped after the above config is enabled globally.
All calls to timespec_trunc() will be eventually replaced by fs_time_trunc(). At which point, timespec_trunc() can be deleted.
All the above function calls use fs_time_range_check() to clamp the timestamps.
Inodes that are saved in memory and on disk always have valid timestamps. But the accessors can detect a clamped timestamp while saving the timestamps into inodes. The clamped timestamp handling is split into two separate cases:
a. Mounting a fs that has exceeded its current timestamp needs.
b. A mounted fs exceeds timestamps needs.
Both the above cases are handled using separate callbacks: superblock bad_timestamp_mount and bad_timestamp operations.
Motivation for the above callbacks being that the Linux kernel does not internally use timestamps and it cannot decide how catastrophic these timestamp clamps can be for the on disk file system or user space applications that use it.
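A userspace sketch of how a filesystem might plug into the two callbacks is shown below. Only the callback names come from the patch; the demo structs, the demo policy (refuse rw mounts, remember that a warning fired) and the helper functions are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

struct demo_sb;

/* Cut-down stand-in for struct super_operations. */
struct demo_sb_ops {
	int (*bad_timestamp_mount)(struct demo_sb *sb);
	void (*bad_timestamp)(struct demo_sb *sb);
};

/* Cut-down stand-in for struct super_block. */
struct demo_sb {
	const struct demo_sb_ops *ops;
	int read_only;
	int warned;
};

/* Case a: the fs is mounted after its timestamp range was exceeded. */
static int demo_bad_timestamp_mount(struct demo_sb *sb)
{
	return sb->read_only ? 0 : -EINVAL;
}

/* Case b: an already mounted fs hits a clamped timestamp. */
static void demo_bad_timestamp(struct demo_sb *sb)
{
	sb->warned = 1;
}

static const struct demo_sb_ops demo_ops = {
	.bad_timestamp_mount = demo_bad_timestamp_mount,
	.bad_timestamp = demo_bad_timestamp,
};

/* Both callbacks are optional, mirroring the NULL checks in the patch. */
static int notify_bad_mount(struct demo_sb *sb)
{
	if (sb->ops && sb->ops->bad_timestamp_mount)
		return sb->ops->bad_timestamp_mount(sb);
	return 0;
}

static void notify_bad_timestamp(struct demo_sb *sb)
{
	if (sb->ops && sb->ops->bad_timestamp)
		sb->ops->bad_timestamp(sb);
}
```

A filesystem that does not care simply leaves both pointers NULL and keeps today's behaviour.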
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/libfs.c             |   5 ++
 fs/super.c             |  10 ++++
 include/linux/fs.h     |  50 ++++++++++++++--
 include/linux/time64.h |   4 ++
 kernel/time/time.c     | 156 +++++++++++++++++++++++++++++++++++++++++--------
 5 files changed, 196 insertions(+), 29 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c index 8dc37fc..4fa2002 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -227,6 +227,9 @@ struct dentry *mount_pseudo(struct file_system_type *fs_type, char *name, s->s_magic = magic; s->s_op = ops ? ops : &simple_super_operations; s->s_time_gran = 1; + s->s_time_min = FS_DEFAULT_MIN_TIMESTAMP; + s->s_time_max = FS_DEFAULT_MAX_TIMESTAMP; + root = new_inode(s); if (!root) goto Enomem; @@ -482,6 +485,8 @@ int simple_fill_super(struct super_block *s, unsigned long magic, s->s_magic = magic; s->s_op = &simple_super_operations; s->s_time_gran = 1; + s->s_time_min = FS_DEFAULT_MIN_TIMESTAMP; + s->s_time_max = FS_DEFAULT_MAX_TIMESTAMP;
inode = new_inode(s); if (!inode) diff --git a/fs/super.c b/fs/super.c index 7ea56de..3f53def 100644 --- a/fs/super.c +++ b/fs/super.c @@ -239,6 +239,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags) s->s_maxbytes = MAX_NON_LFS; s->s_op = &default_op; s->s_time_gran = 1000000000; + s->s_time_min = FS_DEFAULT_MIN_TIMESTAMP; + s->s_time_max = FS_DEFAULT_MAX_TIMESTAMP; s->cleancache_poolid = CLEANCACHE_NO_POOL;
s->s_shrink.seeks = DEFAULT_SEEKS; @@ -1143,6 +1145,14 @@ mount_fs(struct file_system_type *type, int flags, const char *name, void *data) WARN((sb->s_maxbytes < 0), "%s set sb->s_maxbytes to " "negative value (%lld)\n", type->name, sb->s_maxbytes);
+ /* check timestamp range */ + if (unlikely(is_fs_timestamp_bad(current_fs_time(sb))) && + (sb->s_op->bad_timestamp_mount)) { + error = sb->s_op->bad_timestamp_mount(sb); + if (error) + goto out_sb; + } + up_write(&sb->s_umount); free_secdata(secdata); return root; diff --git a/include/linux/fs.h b/include/linux/fs.h index b9f3cee..5112bc2 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -693,8 +693,14 @@ struct inode { #define VFS_INODE_SET_XTIME(xtime, inode, ts64) \ do { \ struct inode_timespec __ts = (ts64); \ + struct super_block *__sb = inode->i_sb; \ (inode)->xtime##_sec = __ts.tv_sec; \ (inode)->xtime##_nsec = __ts.tv_nsec; \ + if (unlikely(is_fs_timestamp_bad(ts64))) { \ + (inode)->xtime##_nsec = 0; \ + if (__sb->s_op->bad_timestamp) \ + __sb->s_op->bad_timestamp(__sb); \ + } \ } while (0)
#define VFS_INODE_GET_XTIME(xtime, inode) \ @@ -703,8 +709,16 @@ struct inode {
#else
-#define VFS_INODE_SET_XTIME(xtime, inode, ts) \ - ((inode)->xtime = (ts)) +#define VFS_INODE_SET_XTIME(xtime, inode, ts) \ + do { \ + struct super_block *__sb = inode->i_sb; \ + (inode)->xtime = (ts); \ + if (unlikely(is_fs_timestamp_bad((inode)->xtime))) { \ + (inode)->xtime.tv_nsec = 0; \ + if (__sb->s_op->bad_timestamp) \ + __sb->s_op->bad_timestamp(__sb); \ + } \ + } while (0)
#define VFS_INODE_GET_XTIME(xtime, inode) \ ((inode)->xtime) @@ -1355,6 +1369,9 @@ struct super_block { unsigned int s_max_links; fmode_t s_mode;
+ /* Max and min values of c/m/atime in UNIX time. */ + time64_t s_time_max; + time64_t s_time_min; /* Granularity of c/m/atime in ns. Cannot be worse than a second */ u32 s_time_gran; @@ -1416,7 +1433,26 @@ struct super_block { struct list_head s_inodes; /* all inodes */ };
-extern struct timespec current_fs_time(struct super_block *sb); +/* Temporary macros to be used within fs code for current times. + * To aid moving all of fs code to timespec64. + */ +#ifdef CONFIG_FS_USES_64BIT_TIME + +#define FS_TIME CURRENT_TIME64 +#define FS_TIME_SEC CURRENT_TIME64_SEC + +#else + +#define FS_TIME CURRENT_TIME +#define FS_TIME_SEC CURRENT_TIME_SEC + +#endif + +extern int is_fs_timestamp_bad(struct inode_timespec ts); +extern struct inode_timespec current_fs_time(struct super_block *sb); +extern struct inode_timespec current_fs_time_sec(struct super_block *sb); +extern struct inode_timespec +fs_time_trunc(struct inode_timespec ts, struct super_block *sb);
/* * Snapshotting support. @@ -1635,6 +1671,11 @@ struct block_device_operations; #define NOMMU_VMFLAGS \ (NOMMU_MAP_READ | NOMMU_MAP_WRITE | NOMMU_MAP_EXEC)
+#define FS_TIMESTAMP_NSEC_NOT_VALID	INT_MAX
+/* Max timestamp is set to (2038-01-19 03:14:07 UTC) */
+#define FS_DEFAULT_MAX_TIMESTAMP	INT_MAX
+/* Min timestamp is set to Epoch (1970-01-01 UTC). */
+#define FS_DEFAULT_MIN_TIMESTAMP	0
struct iov_iter;
@@ -1732,8 +1773,9 @@ extern int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
struct super_operations { struct inode *(*alloc_inode)(struct super_block *sb); + int (*bad_timestamp_mount)(struct super_block *); + void (*bad_timestamp)(struct super_block *); void (*destroy_inode)(struct inode *); - void (*dirty_inode) (struct inode *, int flags); int (*write_inode) (struct inode *, struct writeback_control *wbc); int (*drop_inode) (struct inode *); diff --git a/include/linux/time64.h b/include/linux/time64.h index be98201..eb3cdc0 100644 --- a/include/linux/time64.h +++ b/include/linux/time64.h @@ -47,6 +47,10 @@ struct itimerspec64 {
#endif
+#define CURRENT_TIME64 (current_kernel_time64()) +#define CURRENT_TIME64_SEC \ + ((struct timespec64) { ktime_get_real_seconds(), 0 }) + /* Parameters used to convert the timespec values: */ #define MSEC_PER_SEC 1000L #define USEC_PER_MSEC 1000L diff --git a/kernel/time/time.c b/kernel/time/time.c index 86751c6..24ca258 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -230,6 +230,103 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p) return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret; }
+/* fs_time_range_check:
+ * Function to check if a given timestamp is in the range allowed for a
+ * filesystem.
+ * Assume input timespec is normalized.
+ * Clamp it to max or min value allowed for seconds, whenever values are
+ * out of range.
+ * Also set ts->tv_nsec to FS_TIMESTAMP_NSEC_NOT_VALID if clamped.
+ * tv_nsec is set to 0 if it is not in the allowed range.
+ */
+static void
+fs_time_range_check(struct super_block *sb, struct inode_timespec *ts)
+{
+	if (unlikely(sb->s_time_max < ts->tv_sec ||
+		sb->s_time_min > ts->tv_sec)) {
+		ts->tv_sec = clamp_val(ts->tv_sec, sb->s_time_min, sb->s_time_max);
+		ts->tv_nsec = FS_TIMESTAMP_NSEC_NOT_VALID;
+		return;
+	}
+
+	if (unlikely(ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC))
+		ts->tv_nsec = 0;
+}
+
+/* returns -1 if timestamp is bad/ clamped according to
+ * fs_time_range_check.
+ * returns 0 otherwise.
+ */
+int is_fs_timestamp_bad(struct inode_timespec ts)
+{
+	if (ts.tv_nsec == FS_TIMESTAMP_NSEC_NOT_VALID)
+		return -1;
+
+	return 0;
+}
+EXPORT_SYMBOL(is_fs_timestamp_bad);
+
+/*
+ * fs_time_trunc - Truncate inode_timespec to a granularity
+ * @t: inode_timespec
+ * @sb: Super block.
+ *
+ * Truncate a timespec to a granularity. Always rounds down. Granularity
+ * must not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns).
+ * Returns the truncated timestamp. If the range check clamps the value,
+ * it is returned with tv_nsec set to FS_TIMESTAMP_NSEC_NOT_VALID.
+ */
+struct inode_timespec
+fs_time_trunc(struct inode_timespec t, struct super_block *sb)
+{
+	u32 gran = sb->s_time_gran;
+
+	/* range check for time. */
+	fs_time_range_check(sb, &t);
+	if (unlikely(is_fs_timestamp_bad(t)))
+		return t;
+
+	/* Avoid division in the common cases 1 ns and 1 s. */
+	if (gran == 1)
+		;/* nothing */
+	else if (gran == NSEC_PER_SEC)
+		t.tv_nsec = 0;
+	else if (gran > 1 && gran < NSEC_PER_SEC)
+		t.tv_nsec -= t.tv_nsec % gran;
+	else
+		WARN(1, "illegal file time granularity: %u", gran);
+
+	return t;
+}
+EXPORT_SYMBOL(fs_time_trunc);
+
+/**
+ * timespec_trunc - Truncate timespec to a granularity
+ * @t: Timespec
+ * @gran: Granularity in ns.
+ *
+ * Truncate a timespec to a granularity. Always rounds down. gran must
+ * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns).
+ *
+ * This function is deprecated and should no longer be used for filesystems.
+ * fs_time_trunc should be used instead.
+ */
+struct timespec timespec_trunc(struct timespec t, unsigned gran)
+{
+
+	/* Avoid division in the common cases 1 ns and 1 s. */
+	if (gran == 1) {
+		/* nothing */
+	} else if (gran == NSEC_PER_SEC) {
+		t.tv_nsec = 0;
+	} else if (gran > 1 && gran < NSEC_PER_SEC) {
+		t.tv_nsec -= t.tv_nsec % gran;
+	} else {
+		WARN(1, "illegal file time granularity: %u", gran);
+	}
+	return t;
+}
+EXPORT_SYMBOL(timespec_trunc);
+
 /**
  * current_fs_time - Return FS time
  * @sb: Superblock.
@@ -237,13 +334,46 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p)
  * Return the current time truncated to the time granularity supported by
  * the fs.
  */
+#ifdef CONFIG_FS_USES_64BIT_TIME
+struct timespec64 current_fs_time(struct super_block *sb)
+{
+	struct timespec64 now = current_kernel_time64();
+
+	return fs_time_trunc(now, sb);
+}
+EXPORT_SYMBOL(current_fs_time);
+
+struct timespec64 current_fs_time_sec(struct super_block *sb)
+{
+	struct timespec64 ts = {ktime_get_real_seconds(), 0};
+
+	/* range check for time. */
+	fs_time_range_check(sb, &ts);
+
+	return ts;
+}
+EXPORT_SYMBOL(current_fs_time_sec);
+#else
 struct timespec current_fs_time(struct super_block *sb)
 {
 	struct timespec now = current_kernel_time();
-	return timespec_trunc(now, sb->s_time_gran);
+
+	return fs_time_trunc(now, sb);
 }
 EXPORT_SYMBOL(current_fs_time);
+struct timespec current_fs_time_sec(struct super_block *sb)
+{
+	struct timespec ts = { get_seconds(), 0 };
+
+	/* range check for time. */
+	fs_time_range_check(sb, &ts);
+
+	return ts;
+}
+EXPORT_SYMBOL(current_fs_time_sec);
+#endif
+
 /*
  * Convert jiffies to milliseconds and back.
  *
@@ -286,30 +416,6 @@ unsigned int jiffies_to_usecs(const unsigned long j)
 }
 EXPORT_SYMBOL(jiffies_to_usecs);
-/**
- * timespec_trunc - Truncate timespec to a granularity
- * @t: Timespec
- * @gran: Granularity in ns.
- *
- * Truncate a timespec to a granularity. Always rounds down. gran must
- * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns).
- */
-struct timespec timespec_trunc(struct timespec t, unsigned gran)
-{
-	/* Avoid division in the common cases 1 ns and 1 s. */
-	if (gran == 1) {
-		/* nothing */
-	} else if (gran == NSEC_PER_SEC) {
-		t.tv_nsec = 0;
-	} else if (gran > 1 && gran < NSEC_PER_SEC) {
-		t.tv_nsec -= t.tv_nsec % gran;
-	} else {
-		WARN(1, "illegal file time granularity: %u", gran);
-	}
-	return t;
-}
-EXPORT_SYMBOL(timespec_trunc);
-
 /*
  * mktime64 - Converts date to seconds.
  * Converts Gregorian date to seconds since 1970-01-01 00:00:00.
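The clamp-then-truncate behaviour of fs_time_range_check() and fs_time_trunc() in the hunk above can be modelled in userspace. This is only a rough sketch, not the kernel code: struct ts64, struct fs_limits, model_range_check() and model_time_trunc() are hypothetical stand-ins for struct timespec64, the s_time_min/s_time_max/s_time_gran superblock fields and the two kernel functions, and an illegal granularity is silently ignored here instead of triggering WARN().

```c
#include <stdint.h>
#include <limits.h>

#define NSEC_PER_SEC	1000000000L
/* Sentinel mirroring FS_TIMESTAMP_NSEC_NOT_VALID from the patch. */
#define NSEC_NOT_VALID	INT_MAX

/* Hypothetical stand-in for struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Hypothetical stand-in for the per-superblock limits. */
struct fs_limits {
	int64_t time_min;	/* models sb->s_time_min */
	int64_t time_max;	/* models sb->s_time_max */
	uint32_t gran;		/* models sb->s_time_gran, in ns */
};

/* Clamp seconds into [time_min, time_max]; mark a clamped value via tv_nsec. */
static struct ts64 model_range_check(struct ts64 t, const struct fs_limits *l)
{
	if (t.tv_sec < l->time_min || t.tv_sec > l->time_max) {
		t.tv_sec = t.tv_sec < l->time_min ? l->time_min : l->time_max;
		t.tv_nsec = NSEC_NOT_VALID;
	} else if (t.tv_nsec < 0 || t.tv_nsec >= NSEC_PER_SEC) {
		t.tv_nsec = 0;
	}
	return t;
}

/* Range check first, then round tv_nsec down to the granularity. */
static struct ts64 model_time_trunc(struct ts64 t, const struct fs_limits *l)
{
	int64_t gran = l->gran;

	t = model_range_check(t, l);
	if (t.tv_nsec == NSEC_NOT_VALID)
		return t;	/* clamped: leave the sentinel for the caller */

	if (gran > 1 && gran <= NSEC_PER_SEC)
		t.tv_nsec -= (int32_t)(t.tv_nsec % gran);
	return t;
}
```

With limits of { 0, INT_MAX, 1000 }, an in-range timestamp only loses its sub-microsecond digits, while a timestamp one second past INT_MAX comes back clamped to INT_MAX with the sentinel in tv_nsec. That sentinel is the condition the VFS_INODE_SET_XTIME macro tests for before it calls the bad_timestamp callback.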
VFS currently uses struct timespec timestamps which are not y2038 safe.
Access all struct inode timestamps through the accessor macros only. The accessors hide the underlying data type, which eases the switch to 64 bit times.
Use the struct inode_timespec alias everywhere. The alias changes the timestamp data type to struct timespec64 when the switch to 64 bit time occurs.
Change all calls to CURRENT_TIME to current_fs_time(). The CURRENT_TIME macro is not suitable for file system code: it neither performs range checks on timestamps nor honours the timestamp granularity of the individual file system.

Change all calls to timespec_trunc() to fs_time_trunc(). The latter also performs range checks on timestamps.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/attr.c                | 15 +++++++-------
 fs/bad_inode.c           | 10 +++++++--
 fs/binfmt_misc.c         |  7 +++++--
 fs/inode.c               | 53 ++++++++++++++++++++++++++++++++----------------
 fs/libfs.c               | 45 ++++++++++++++++++++++++++++++++--------
 fs/locks.c               |  5 ++---
 fs/nsfs.c                |  6 +++++-
 fs/pipe.c                |  6 +++++-
 fs/posix_acl.c           |  2 +-
 fs/stack.c               |  6 +++---
 fs/stat.c                |  6 +++---
 fs/utimes.c              |  6 ++++--
 include/linux/fs_stack.h |  9 ++++----
 13 files changed, 121 insertions(+), 55 deletions(-)
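For readers new to the series, the accessor idea used throughout this patch can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the fs.h definition: struct model_timespec, struct model_inode, MODEL_SET_XTIME and MODEL_GET_XTIME are hypothetical names, and the range-check hook of the real VFS_INODE_SET_XTIME is left out. Only the xtime##_sec / xtime##_nsec token pasting is taken from the series.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct inode_timespec (timespec64 flavour). */
struct model_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/*
 * Timestamps are stored as scalars, so the layout does not depend on
 * whether the machine is 32 or 64 bit.
 */
struct model_inode {
	int64_t i_atime_sec;
	int32_t i_atime_nsec;
	int64_t i_mtime_sec;
	int32_t i_mtime_nsec;
	int64_t i_ctime_sec;
	int32_t i_ctime_nsec;
};

/* Store a timespec-like value into the scalar pair named by xtime. */
#define MODEL_SET_XTIME(xtime, inode, ts) \
	do { \
		struct model_timespec __ts = (ts); \
		(inode)->xtime##_sec = __ts.tv_sec; \
		(inode)->xtime##_nsec = (int32_t)__ts.tv_nsec; \
	} while (0)

/* Rebuild a timespec-like value from the scalar pair named by xtime. */
#define MODEL_GET_XTIME(xtime, inode) \
	((struct model_timespec){ (inode)->xtime##_sec, \
				  (inode)->xtime##_nsec })
```

Because callers only ever see struct model_timespec values, the storage behind the macros can change (for example from an embedded timespec to the scalar pair above) without touching the call sites, which is the property the conversions in this patch rely on.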
diff --git a/fs/attr.c b/fs/attr.c index 6530ced..4156239 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -148,14 +148,14 @@ void setattr_copy(struct inode *inode, const struct iattr *attr) if (ia_valid & ATTR_GID) inode->i_gid = attr->ia_gid; if (ia_valid & ATTR_ATIME) - inode->i_atime = timespec_trunc(attr->ia_atime, - inode->i_sb->s_time_gran); + VFS_INODE_SET_XTIME(i_atime, inode, + fs_time_trunc(attr->ia_atime, inode->i_sb)); if (ia_valid & ATTR_MTIME) - inode->i_mtime = timespec_trunc(attr->ia_mtime, - inode->i_sb->s_time_gran); + VFS_INODE_SET_XTIME(i_mtime, inode, + fs_time_trunc(attr->ia_mtime, inode->i_sb)); if (ia_valid & ATTR_CTIME) - inode->i_ctime = timespec_trunc(attr->ia_ctime, - inode->i_sb->s_time_gran); + VFS_INODE_SET_XTIME(i_ctime, inode, + fs_time_trunc(attr->ia_ctime, inode->i_sb)); if (ia_valid & ATTR_MODE) { umode_t mode = attr->ia_mode;
@@ -192,7 +192,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de struct inode *inode = dentry->d_inode; umode_t mode = inode->i_mode; int error; - struct timespec now; + struct inode_timespec now; unsigned int ia_valid = attr->ia_valid;
WARN_ON_ONCE(!mutex_is_locked(&inode->i_mutex)); @@ -210,7 +210,6 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de }
now = current_fs_time(inode->i_sb); - attr->ia_ctime = now; if (!(ia_valid & ATTR_ATIME_SET)) attr->ia_atime = now; diff --git a/fs/bad_inode.c b/fs/bad_inode.c index 103f5d7..3c51e22 100644 --- a/fs/bad_inode.c +++ b/fs/bad_inode.c @@ -169,11 +169,17 @@ static const struct inode_operations bad_inode_ops =
void make_bad_inode(struct inode *inode) { + struct inode_timespec now; + remove_inode_hash(inode);
inode->i_mode = S_IFREG; - inode->i_atime = inode->i_mtime = inode->i_ctime = - current_fs_time(inode->i_sb); + + now = current_fs_time(inode->i_sb); + + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); inode->i_op = &bad_inode_ops; inode->i_fop = &bad_file_ops; } diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c index 78f005f..4fd4437 100644 --- a/fs/binfmt_misc.c +++ b/fs/binfmt_misc.c @@ -562,12 +562,15 @@ static void entry_status(Node *e, char *page) static struct inode *bm_get_inode(struct super_block *sb, int mode) { struct inode *inode = new_inode(sb); + struct inode_timespec now;
if (inode) { inode->i_ino = get_next_ino(); inode->i_mode = mode; - inode->i_atime = inode->i_mtime = inode->i_ctime = - current_fs_time(inode->i_sb); + now = current_fs_time(inode->i_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); } return inode; } diff --git a/fs/inode.c b/fs/inode.c index 4c8f719..d3d64dc 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1532,27 +1532,36 @@ EXPORT_SYMBOL(bmap); * passed since the last atime update. */ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode, - struct timespec now) + struct inode_timespec now) {
+ struct inode_timespec ctime; + struct inode_timespec mtime; + struct inode_timespec atime; + if (!(mnt->mnt_flags & MNT_RELATIME)) return 1; + + atime = VFS_INODE_GET_XTIME(i_atime, inode); + ctime = VFS_INODE_GET_XTIME(i_ctime, inode); + mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + /* * Is mtime younger than atime? If yes, update atime: */ - if (timespec_compare(&inode->i_mtime, &inode->i_atime) >= 0) + if (inode_timespec_compare(&mtime, &atime) >= 0) return 1; /* * Is ctime younger than atime? If yes, update atime: */ - if (timespec_compare(&inode->i_ctime, &inode->i_atime) >= 0) + if (inode_timespec_compare(&ctime, &atime) >= 0) return 1;
/* * Is the previous atime value older than a day? If yes, * update atime: */ - if ((long)(now.tv_sec - inode->i_atime.tv_sec) >= 24*60*60) + if ((long)(now.tv_sec - atime.tv_sec) >= 24*60*60) return 1; /* * Good, we can skip the atime update: @@ -1560,18 +1569,19 @@ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode, return 0; }
-int generic_update_time(struct inode *inode, struct timespec *time, int flags) +int generic_update_time(struct inode *inode, struct inode_timespec *time, + int flags) { int iflags = I_DIRTY_TIME;
if (flags & S_ATIME) - inode->i_atime = *time; + VFS_INODE_SET_XTIME(i_atime, inode, *time); if (flags & S_VERSION) inode_inc_iversion(inode); if (flags & S_CTIME) - inode->i_ctime = *time; + VFS_INODE_SET_XTIME(i_ctime, inode, *time); if (flags & S_MTIME) - inode->i_mtime = *time; + VFS_INODE_SET_XTIME(i_mtime, inode, *time);
if (!(inode->i_sb->s_flags & MS_LAZYTIME) || (flags & S_VERSION)) iflags |= I_DIRTY_SYNC; @@ -1584,9 +1594,10 @@ EXPORT_SYMBOL(generic_update_time); * This does the actual work of updating an inodes time or version. Must have * had called mnt_want_write() before calling this. */ -static int update_time(struct inode *inode, struct timespec *time, int flags) +static int update_time(struct inode *inode, struct inode_timespec *time, + int flags) { - int (*update_time)(struct inode *, struct timespec *, int); + int (*update_time)(struct inode *, struct inode_timespec *, int);
update_time = inode->i_op->update_time ? inode->i_op->update_time : generic_update_time; @@ -1606,7 +1617,8 @@ static int update_time(struct inode *inode, struct timespec *time, int flags) bool atime_needs_update(const struct path *path, struct inode *inode) { struct vfsmount *mnt = path->mnt; - struct timespec now; + struct inode_timespec now; + struct inode_timespec atime;
if (inode->i_flags & S_NOATIME) return false; @@ -1621,11 +1633,11 @@ bool atime_needs_update(const struct path *path, struct inode *inode) return false;
now = current_fs_time(inode->i_sb); - if (!relatime_need_update(mnt, inode, now)) return false;
- if (timespec_equal(&inode->i_atime, &now)) + atime = VFS_INODE_GET_XTIME(i_atime, inode); + if (inode_timespec_equal(&atime, &now)) return false;
return true; @@ -1635,7 +1647,7 @@ void touch_atime(const struct path *path) { struct vfsmount *mnt = path->mnt; struct inode *inode = d_inode(path->dentry); - struct timespec now; + struct inode_timespec now;
if (!atime_needs_update(path, inode)) return; @@ -1770,7 +1782,10 @@ EXPORT_SYMBOL(file_remove_privs); int file_update_time(struct file *file) { struct inode *inode = file_inode(file); - struct timespec now; + struct inode_timespec now; + struct inode_timespec mtime; + struct inode_timespec ctime; + int sync_it = 0; int ret;
@@ -1779,10 +1794,14 @@ int file_update_time(struct file *file) return 0;
now = current_fs_time(inode->i_sb); - if (!timespec_equal(&inode->i_mtime, &now)) + + mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + ctime = VFS_INODE_GET_XTIME(i_ctime, inode); + + if (!inode_timespec_equal(&mtime, &now)) sync_it = S_MTIME;
- if (!timespec_equal(&inode->i_ctime, &now)) + if (!inode_timespec_equal(&ctime, &now)) sync_it |= S_CTIME;
if (IS_I_VERSION(inode)) diff --git a/fs/libfs.c b/fs/libfs.c index 4fa2002..5a0c7c2 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -216,6 +216,7 @@ struct dentry *mount_pseudo(struct file_system_type *fs_type, char *name, struct dentry *dentry; struct inode *root; struct qstr d_name = QSTR_INIT(name, strlen(name)); + struct inode_timespec now;
s = sget(fs_type, NULL, set_anon_super, MS_NOUSER, NULL); if (IS_ERR(s)) @@ -240,7 +241,11 @@ struct dentry *mount_pseudo(struct file_system_type *fs_type, char *name, */ root->i_ino = 1; root->i_mode = S_IFDIR | S_IRUSR | S_IWUSR; - root->i_atime = root->i_mtime = root->i_ctime = CURRENT_TIME; + now = current_fs_time(s); + VFS_INODE_SET_XTIME(i_atime, root, now); + VFS_INODE_SET_XTIME(i_ctime, root, now); + VFS_INODE_SET_XTIME(i_mtime, root, now); + dentry = __d_alloc(s, &d_name); if (!dentry) { iput(root); @@ -269,8 +274,11 @@ EXPORT_SYMBOL(simple_open); int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry) { struct inode *inode = d_inode(old_dentry); + struct inode_timespec now = current_fs_time(inode->i_sb);
- inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME; + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); inc_nlink(inode); ihold(inode); dget(dentry); @@ -303,8 +311,11 @@ EXPORT_SYMBOL(simple_empty); int simple_unlink(struct inode *dir, struct dentry *dentry) { struct inode *inode = d_inode(dentry); + struct inode_timespec now = current_fs_time(inode->i_sb);
- inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME; + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); drop_nlink(inode); dput(dentry); return 0; @@ -328,6 +339,7 @@ int simple_rename(struct inode *old_dir, struct dentry *old_dentry, { struct inode *inode = d_inode(old_dentry); int they_are_dirs = d_is_dir(old_dentry); + struct inode_timespec now;
if (!simple_empty(new_dentry)) return -ENOTEMPTY; @@ -343,8 +355,13 @@ int simple_rename(struct inode *old_dir, struct dentry *old_dentry, inc_nlink(new_dir); }
- old_dir->i_ctime = old_dir->i_mtime = new_dir->i_ctime = - new_dir->i_mtime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(inode->i_sb); + + VFS_INODE_SET_XTIME(i_ctime, old_dir, now); + VFS_INODE_SET_XTIME(i_mtime, old_dir, now); + VFS_INODE_SET_XTIME(i_ctime, new_dir, now); + VFS_INODE_SET_XTIME(i_mtime, new_dir, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now);
return 0; } @@ -478,6 +495,7 @@ int simple_fill_super(struct super_block *s, unsigned long magic, struct inode *inode; struct dentry *root; struct dentry *dentry; + struct inode_timespec now; int i;
s->s_blocksize = PAGE_CACHE_SIZE; @@ -497,7 +515,10 @@ int simple_fill_super(struct super_block *s, unsigned long magic, */ inode->i_ino = 1; inode->i_mode = S_IFDIR | 0755; - inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(inode->i_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); inode->i_op = &simple_dir_inode_operations; inode->i_fop = &simple_dir_operations; set_nlink(inode, 2); @@ -523,7 +544,10 @@ int simple_fill_super(struct super_block *s, unsigned long magic, goto out; } inode->i_mode = S_IFREG | files->mode; - inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(inode->i_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); inode->i_fop = files->ops; inode->i_ino = i; d_add(dentry, inode); @@ -1056,6 +1080,7 @@ struct inode *alloc_anon_inode(struct super_block *s) .set_page_dirty = anon_set_page_dirty, }; struct inode *inode = new_inode_pseudo(s); + struct inode_timespec now;
if (!inode) return ERR_PTR(-ENOMEM); @@ -1074,7 +1099,11 @@ struct inode *alloc_anon_inode(struct super_block *s) inode->i_uid = current_fsuid(); inode->i_gid = current_fsgid(); inode->i_flags |= S_PRIVATE; - inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(s); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + return inode; } EXPORT_SYMBOL(alloc_anon_inode); diff --git a/fs/locks.c b/fs/locks.c index 15e2b60..2b818eb 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1491,7 +1491,7 @@ EXPORT_SYMBOL(__break_lease); * exclusive leases. The justification is that if someone has an * exclusive lease, then they could be modifying it. */ -void lease_get_mtime(struct inode *inode, struct timespec *time) +void lease_get_mtime(struct inode *inode, struct inode_timespec *time) { bool has_lease = false; struct file_lock_context *ctx; @@ -1510,9 +1510,8 @@ void lease_get_mtime(struct inode *inode, struct timespec *time) if (has_lease) *time = current_fs_time(inode->i_sb); else - *time = inode->i_mtime; + *time = VFS_INODE_GET_XTIME(i_mtime, inode); } - EXPORT_SYMBOL(lease_get_mtime);
/** diff --git a/fs/nsfs.c b/fs/nsfs.c index 8f20d60..a079fc9 100644 --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -51,6 +51,7 @@ void *ns_get_path(struct path *path, struct task_struct *task, struct qstr qname = { .name = "", }; struct dentry *dentry; struct inode *inode; + struct inode_timespec now; struct ns_common *ns; unsigned long d;
@@ -82,7 +83,10 @@ slow: return ERR_PTR(-ENOMEM); } inode->i_ino = ns->inum; - inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(mnt->mnt_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); inode->i_flags |= S_IMMUTABLE; inode->i_mode = S_IFREG | S_IRUGO; inode->i_fop = &ns_file_operations; diff --git a/fs/pipe.c b/fs/pipe.c index 42cf8dd..5d414a3 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -637,6 +637,7 @@ static struct inode * get_pipe_inode(void) { struct inode *inode = new_inode_pseudo(pipe_mnt->mnt_sb); struct pipe_inode_info *pipe; + struct inode_timespec now;
if (!inode) goto fail_inode; @@ -662,7 +663,10 @@ static struct inode * get_pipe_inode(void) inode->i_mode = S_IFIFO | S_IRUSR | S_IWUSR; inode->i_uid = current_fsuid(); inode->i_gid = current_fsgid(); - inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + now = current_fs_time(pipe_mnt->mnt_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now);
return inode;
diff --git a/fs/posix_acl.c b/fs/posix_acl.c index 711dd51..0ac55c9 100644 --- a/fs/posix_acl.c +++ b/fs/posix_acl.c @@ -859,7 +859,7 @@ int simple_set_acl(struct inode *inode, struct posix_acl *acl, int type) acl = NULL; }
- inode->i_ctime = CURRENT_TIME; + VFS_INODE_SET_XTIME(i_ctime, inode, current_fs_time(inode->i_sb)); set_cached_acl(inode, type, acl); return 0; } diff --git a/fs/stack.c b/fs/stack.c index a54e33e..812bbbb 100644 --- a/fs/stack.c +++ b/fs/stack.c @@ -66,9 +66,9 @@ void fsstack_copy_attr_all(struct inode *dest, const struct inode *src) dest->i_uid = src->i_uid; dest->i_gid = src->i_gid; dest->i_rdev = src->i_rdev; - dest->i_atime = src->i_atime; - dest->i_mtime = src->i_mtime; - dest->i_ctime = src->i_ctime; + VFS_INODE_SET_XTIME(i_atime, dest, VFS_INODE_GET_XTIME(i_atime, src)); + VFS_INODE_SET_XTIME(i_mtime, dest, VFS_INODE_GET_XTIME(i_mtime, src)); + VFS_INODE_SET_XTIME(i_ctime, dest, VFS_INODE_GET_XTIME(i_ctime, src)); dest->i_blkbits = src->i_blkbits; dest->i_flags = src->i_flags; set_nlink(dest, src->i_nlink); diff --git a/fs/stat.c b/fs/stat.c index bc045c7..c448313 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -28,9 +28,9 @@ void generic_fillattr(struct inode *inode, struct kstat *stat) stat->gid = inode->i_gid; stat->rdev = inode->i_rdev; stat->size = i_size_read(inode); - stat->atime = inode->i_atime; - stat->mtime = inode->i_mtime; - stat->ctime = inode->i_ctime; + stat->atime = VFS_INODE_GET_XTIME(i_atime, inode); + stat->mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + stat->ctime = VFS_INODE_GET_XTIME(i_ctime, inode); stat->blksize = (1 << inode->i_blkbits); stat->blocks = inode->i_blocks; } diff --git a/fs/utimes.c b/fs/utimes.c index aa138d6..c23c8e6 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -48,11 +48,12 @@ static bool nsec_valid(long nsec) return nsec >= 0 && nsec <= 999999999; }
-static int utimes_common(struct path *path, struct timespec *times) +static int utimes_common(struct path *path, struct inode_timespec *times) { int error; struct iattr newattrs; struct inode *inode = path->dentry->d_inode; + struct super_block *sb = inode->i_sb; struct inode *delegated_inode = NULL;
error = mnt_want_write(path->mnt); @@ -133,7 +134,8 @@ out: * must be owner or have write permission. * Else, update from *times, must be owner or super user. */ -long do_utimes(int dfd, const char __user *filename, struct timespec *times, +long do_utimes(int dfd, const char __user *filename, + struct inode_timespec *times, int flags) { int error = -EINVAL; diff --git a/include/linux/fs_stack.h b/include/linux/fs_stack.h index da317c7..2d2bb50 100644 --- a/include/linux/fs_stack.h +++ b/include/linux/fs_stack.h @@ -15,15 +15,16 @@ extern void fsstack_copy_inode_size(struct inode *dst, struct inode *src); static inline void fsstack_copy_attr_atime(struct inode *dest, const struct inode *src) { - dest->i_atime = src->i_atime; + VFS_INODE_SET_XTIME(i_atime, dest, VFS_INODE_GET_XTIME(i_atime, src)); }
static inline void fsstack_copy_attr_times(struct inode *dest, const struct inode *src) { - dest->i_atime = src->i_atime; - dest->i_mtime = src->i_mtime; - dest->i_ctime = src->i_ctime; + VFS_INODE_SET_XTIME(i_atime, dest, VFS_INODE_GET_XTIME(i_atime, src)); + VFS_INODE_SET_XTIME(i_mtime, dest, VFS_INODE_GET_XTIME(i_mtime, src)); + VFS_INODE_SET_XTIME(i_ctime, dest, VFS_INODE_GET_XTIME(i_ctime, src)); + }
#endif /* _LINUX_FS_STACK_H */
Change all struct timespec references to struct inode_timespec. Use inode timestamp accessors to access inode time fields. This will help the switch to struct timespec64 when CONFIG_FS_USES_64BIT_TIME is enabled.
Use current_fs_time() instead of the CURRENT_TIME macros so that range checks and granularity truncation are applied.
Truncate and perform range checks before saving the times in struct inode.
Switch connection times over to the FS_TIME macro instead of CURRENT_TIME. Since FS_TIME is also defined under CONFIG_FS_USES_64BIT_TIME, this will help the switch to timespec64.
Use long long for the seconds field in cnvrtDosUnixTm() so that it can represent 64 bit time. Since DOS uses a 1980 epoch, all DOS timestamps are positive when represented in UNIX format, so change all the arithmetic to unsigned. Note that even though the theoretical maximum for DOS times is the year 2107, its APIs only support dates up to the year 2099. A 32 bit unsigned seconds field would therefore be sufficient, but the field uses long long for uniformity with the rest of the kernel, where everyone uses the theoretical maximum.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/cifs/cache.c       | 16 ++++++++-----
 fs/cifs/cifsencrypt.c |  2 +-
 fs/cifs/cifsglob.h    |  6 ++---
 fs/cifs/cifsproto.h   |  9 +++----
 fs/cifs/cifssmb.c     | 17 +++++++++-----
 fs/cifs/file.c        |  9 ++++---
 fs/cifs/inode.c       | 65 +++++++++++++++++++++++++++++++++------------------
 fs/cifs/netmisc.c     | 26 +++++++++++----------
 8 files changed, 92 insertions(+), 58 deletions(-)
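To see why the seconds value in the DOS conversion needs more than 32 signed bits, the arithmetic can be sketched in userspace. This is not the cnvrtDosUnixTm() implementation: dos_to_unix_sec() and days_from_civil() are hypothetical helpers, the inputs are assumed to be host-endian and valid, no timezone offset is applied, and the day count uses a standard civil-calendar formula rather than the kernel's own code. The bit layout assumed is the usual DOS one (day in bits 0-4, month in bits 5-8, years since 1980 in bits 9-15; two-second units in bits 0-4, minutes in bits 5-10, hours in bits 11-15).

```c
#include <stdint.h>

/* Days since 1970-01-01 for a Gregorian date (month 1..12, day 1..31). */
static int64_t days_from_civil(int64_t y, unsigned m, unsigned d)
{
	int64_t era;
	unsigned yoe, doy, doe;

	y -= (m <= 2);			/* treat Jan/Feb as months of the previous year */
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = (unsigned)(y - era * 400);
	doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	return era * 146097 + (int64_t)doe - 719468;
}

/* Hypothetical helper: DOS date/time words to seconds since the UNIX epoch. */
static long long dos_to_unix_sec(uint16_t date, uint16_t time)
{
	unsigned day   = date & 0x1f;
	unsigned month = (date >> 5) & 0x0f;
	unsigned year  = 1980 + (date >> 9);	/* 7 bits: 1980..2107 */
	unsigned sec   = 2 * (time & 0x1f);	/* stored in 2 s units */
	unsigned min   = (time >> 5) & 0x3f;
	unsigned hour  = time >> 11;

	return days_from_civil(year, month, day) * 86400LL +
	       hour * 3600LL + min * 60LL + sec;
}
```

The DOS epoch itself (date word 0x0021, time word 0) maps to 315532800 seconds, and the first representable instant after 2038-01-19 03:14:07 UTC maps to 2147483648, one past INT_MAX, which a signed 32 bit seconds field cannot hold.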
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c index 6c665bf..8d27e69b 100644 --- a/fs/cifs/cache.c +++ b/fs/cifs/cache.c @@ -221,8 +221,8 @@ const struct fscache_cookie_def cifs_fscache_super_index_def = { * Auxiliary data attached to CIFS inode within the cache */ struct cifs_fscache_inode_auxdata { - struct timespec last_write_time; - struct timespec last_change_time; + struct inode_timespec last_write_time; + struct inode_timespec last_change_time; u64 eof; };
@@ -259,8 +259,10 @@ cifs_fscache_inode_get_aux(const void *cookie_netfs_data, void *buffer,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + VFS_INODE_GET_XTIME(i_mtime, &cifsi->vfs_inode); + auxdata.last_change_time = + VFS_INODE_GET_XTIME(i_ctime, &cifsi->vfs_inode);
if (maxbuf > sizeof(auxdata)) maxbuf = sizeof(auxdata); @@ -283,8 +285,10 @@ fscache_checkaux cifs_fscache_inode_check_aux(void *cookie_netfs_data,
memset(&auxdata, 0, sizeof(auxdata)); auxdata.eof = cifsi->server_eof; - auxdata.last_write_time = cifsi->vfs_inode.i_mtime; - auxdata.last_change_time = cifsi->vfs_inode.i_ctime; + auxdata.last_write_time = + VFS_INODE_GET_XTIME(i_mtime, &cifsi->vfs_inode); + auxdata.last_change_time = + VFS_INODE_GET_XTIME(i_ctime, &cifsi->vfs_inode);
if (memcmp(data, &auxdata, datalen) != 0) return FSCACHE_CHECKAUX_OBSOLETE; diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index afa09fc..b0ef587 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -483,7 +483,7 @@ find_timestamp(struct cifs_ses *ses) blobptr += attrsize; /* advance attr value */ }
- return cpu_to_le64(cifs_UnixTimeToNT(CURRENT_TIME)); + return cpu_to_le64(cifs_UnixTimeToNT(FS_TIME)); }
static int calc_ntlmv2_hash(struct cifs_ses *ses, char *ntlmv2_hash, diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index a25b251..c95dce7 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1393,9 +1393,9 @@ struct cifs_fattr { dev_t cf_rdev; unsigned int cf_nlink; unsigned int cf_dtype; - struct timespec cf_atime; - struct timespec cf_mtime; - struct timespec cf_ctime; + struct inode_timespec cf_atime; + struct inode_timespec cf_mtime; + struct inode_timespec cf_ctime; };
static inline void free_dfs_info_param(struct dfs_info3_param *param) diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index eed7ff5..9979c74 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -126,10 +126,11 @@ extern enum securityEnum select_sectype(struct TCP_Server_Info *server, enum securityEnum requested); extern int CIFS_SessSetup(const unsigned int xid, struct cifs_ses *ses, const struct nls_table *nls_cp); -extern struct timespec cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); -extern u64 cifs_UnixTimeToNT(struct timespec); -extern struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, - int offset); +extern struct inode_timespec + cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); +extern u64 cifs_UnixTimeToNT(struct inode_timespec); +extern struct inode_timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, + int offset); extern void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock); extern int cifs_get_writer(struct cifsInodeInfo *cinode); extern void cifs_put_writer(struct cifsInodeInfo *cinode); diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 90b4f9f..a813bcd 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -478,13 +478,17 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr) * this requirement. 
*/ int val, seconds, remain, result; - struct timespec ts, utc; - utc = CURRENT_TIME; + struct inode_timespec ts, utc; + + utc = FS_TIME; ts = cnvrtDosUnixTm(rsp->SrvTime.Date, rsp->SrvTime.Time, 0); - cifs_dbg(FYI, "SrvTime %d sec since 1970 (utc: %d) diff: %d\n", - (int)ts.tv_sec, (int)utc.tv_sec, - (int)(utc.tv_sec - ts.tv_sec)); + cifs_dbg(FYI, "SrvTime %lld sec since 1970 (utc: %lld) diff: %lld\n", + (long long)ts.tv_sec, (long long)utc.tv_sec, + (long long)(utc.tv_sec - ts.tv_sec)); + /* Assume difference cannot be more than + * INT_MAX or INT_MIN + */ val = (int)(utc.tv_sec - ts.tv_sec); seconds = abs(val); result = (seconds / MIN_TZ_ADJ) * MIN_TZ_ADJ; @@ -4000,7 +4004,8 @@ QInfRetry: if (rc) { cifs_dbg(FYI, "Send error in QueryInfo = %d\n", rc); } else if (data) { - struct timespec ts; + struct inode_timespec ts; + __u32 time = le32_to_cpu(pSMBr->last_write_time);
/* decode response */ diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 0a2752b..2d226cf 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1839,6 +1839,7 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to) int bytes_written = 0; struct inode *inode; struct cifsFileInfo *open_file; + struct inode_timespec now;
if (!mapping || !mapping->host) return -EFAULT; @@ -1870,7 +1871,9 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to) write_data, to - from, &offset); cifsFileInfo_put(open_file); /* Does mm or vfs already set times? */ - inode->i_atime = inode->i_mtime = current_fs_time(inode->i_sb); + now = current_fs_time(inode->i_sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); if ((bytes_written > 0) && (offset)) rc = 0; else if (bytes_written < 0) @@ -3567,6 +3570,7 @@ static int cifs_readpage_worker(struct file *file, struct page *page, loff_t *poffset) { char *read_data; + struct inode *inode = file_inode(file); int rc;
/* Is the page cached? */ @@ -3584,8 +3588,7 @@ static int cifs_readpage_worker(struct file *file, struct page *page, else cifs_dbg(FYI, "Bytes read %d\n", rc);
- file_inode(file)->i_atime = - current_fs_time(file_inode(file)->i_sb); + VFS_INODE_SET_XTIME(i_atime, inode, current_fs_time(inode->i_sb));
if (PAGE_CACHE_SIZE > rc) memset(read_data + rc, 0, PAGE_CACHE_SIZE - rc); diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index aeb26db..bb91bf7 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -92,6 +92,7 @@ static void cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); + struct inode_timespec mtime;
cifs_dbg(FYI, "%s: revalidating inode %llu\n", __func__, cifs_i->uniqueid); @@ -110,12 +111,13 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) }
/* revalidate if mtime or size have changed */ - if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) && - cifs_i->server_eof == fattr->cf_eof) { + mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + if (inode_timespec_equal(&mtime, &fattr->cf_mtime) + && cifs_i->server_eof == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", - __func__, cifs_i->uniqueid); - return; - } + __func__, cifs_i->uniqueid); + return; + }
cifs_dbg(FYI, "%s: invalidating inode %llu mapping\n", __func__, cifs_i->uniqueid); @@ -155,13 +157,17 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); + struct super_block *sb = inode->i_sb;
cifs_revalidate_cache(inode, fattr);
spin_lock(&inode->i_lock); - inode->i_atime = fattr->cf_atime; - inode->i_mtime = fattr->cf_mtime; - inode->i_ctime = fattr->cf_ctime; + VFS_INODE_SET_XTIME(i_atime, inode, + fs_time_trunc(fattr->cf_atime, sb)); + VFS_INODE_SET_XTIME(i_mtime, inode, + fs_time_trunc(fattr->cf_mtime, sb)); + VFS_INODE_SET_XTIME(i_ctime, inode, + fs_time_trunc(fattr->cf_ctime, sb)); inode->i_rdev = fattr->cf_rdev; cifs_nlink_fattr_to_inode(inode, fattr); inode->i_uid = fattr->cf_uid; @@ -231,6 +237,7 @@ cifs_unix_basic_to_fattr(struct cifs_fattr *fattr, FILE_UNIX_BASIC_INFO *info, fattr->cf_atime = cifs_NTtimeToUnix(info->LastAccessTime); fattr->cf_mtime = cifs_NTtimeToUnix(info->LastModificationTime); fattr->cf_ctime = cifs_NTtimeToUnix(info->LastStatusChange); + fattr->cf_mode = le64_to_cpu(info->Permissions);
/* @@ -288,7 +295,7 @@ cifs_unix_basic_to_fattr(struct cifs_fattr *fattr, FILE_UNIX_BASIC_INFO *info, fattr->cf_uid = uid; } } - + fattr->cf_gid = cifs_sb->mnt_gid; if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_OVERR_GID)) { u64 id = le64_to_cpu(info->Gid); @@ -313,6 +320,7 @@ static void cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct super_block *sb) { struct cifs_sb_info *cifs_sb = CIFS_SB(sb); + struct inode_timespec now;
cifs_dbg(FYI, "creating fake fattr for DFS referral\n");
@@ -320,9 +328,10 @@ cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct super_block *sb) fattr->cf_mode = S_IFDIR | S_IXUGO | S_IRWXU; fattr->cf_uid = cifs_sb->mnt_uid; fattr->cf_gid = cifs_sb->mnt_gid; - fattr->cf_atime = CURRENT_TIME; - fattr->cf_ctime = CURRENT_TIME; - fattr->cf_mtime = CURRENT_TIME; + now = current_fs_time(sb); + fattr->cf_atime = now; + fattr->cf_ctime = now; + fattr->cf_mtime = now; fattr->cf_nlink = 2; fattr->cf_flags |= CIFS_FATTR_DFS_REFERRAL; } @@ -584,9 +593,10 @@ static int cifs_sfu_mode(struct cifs_fattr *fattr, const unsigned char *path, /* Fill a cifs_fattr struct with info from FILE_ALL_INFO */ static void cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info, - struct cifs_sb_info *cifs_sb, bool adjust_tz, + struct super_block *sb, bool adjust_tz, bool symlink) { + struct cifs_sb_info *cifs_sb = CIFS_SB(sb); struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
memset(fattr, 0, sizeof(*fattr)); @@ -597,7 +607,7 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info, if (info->LastAccessTime) fattr->cf_atime = cifs_NTtimeToUnix(info->LastAccessTime); else - fattr->cf_atime = CURRENT_TIME; + fattr->cf_atime = current_fs_time(sb);
fattr->cf_ctime = cifs_NTtimeToUnix(info->ChangeTime); fattr->cf_mtime = cifs_NTtimeToUnix(info->LastWriteTime); @@ -657,7 +667,6 @@ cifs_get_file_info(struct file *filp) FILE_ALL_INFO find_data; struct cifs_fattr fattr; struct inode *inode = file_inode(filp); - struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); struct cifsFileInfo *cfile = filp->private_data; struct cifs_tcon *tcon = tlink_tcon(cfile->tlink); struct TCP_Server_Info *server = tcon->ses->server; @@ -669,7 +678,7 @@ cifs_get_file_info(struct file *filp) rc = server->ops->query_file_info(xid, tcon, &cfile->fid, &find_data); switch (rc) { case 0: - cifs_all_info_to_fattr(&fattr, &find_data, cifs_sb, false, + cifs_all_info_to_fattr(&fattr, &find_data, inode->i_sb, false, false); break; case -EREMOTE: @@ -751,7 +760,7 @@ cifs_get_inode_info(struct inode **inode, const char *full_path, }
if (!rc) { - cifs_all_info_to_fattr(&fattr, data, cifs_sb, adjust_tz, + cifs_all_info_to_fattr(&fattr, data, sb, adjust_tz, symlink); } else if (rc == -EREMOTE) { cifs_create_dfs_fattr(&fattr, sb); @@ -1252,6 +1261,7 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) unsigned int xid; char *full_path = NULL; struct inode *inode = d_inode(dentry); + struct inode_timespec now; struct cifsInodeInfo *cifs_inode; struct super_block *sb = dir->i_sb; struct cifs_sb_info *cifs_sb = CIFS_SB(sb); @@ -1343,9 +1353,11 @@ out_reval: cifs_inode = CIFS_I(inode); cifs_inode->time = 0; /* will force revalidate to get info when needed */ - inode->i_ctime = current_fs_time(sb); + VFS_INODE_SET_XTIME(i_ctime, inode, current_fs_time(sb)); } - dir->i_ctime = dir->i_mtime = current_fs_time(sb); + now = current_fs_time(sb); + VFS_INODE_SET_XTIME(i_ctime, dir, now); + VFS_INODE_SET_XTIME(i_mtime, dir, now); cifs_inode = CIFS_I(dir); CIFS_I(dir)->time = 0; /* force revalidate of dir as well */ unlink_out: @@ -1565,6 +1577,7 @@ int cifs_rmdir(struct inode *inode, struct dentry *direntry) struct TCP_Server_Info *server; char *full_path = NULL; struct cifsInodeInfo *cifsInode; + struct inode_timespec now;
cifs_dbg(FYI, "cifs_rmdir, inode = 0x%p\n", inode);
@@ -1612,8 +1625,10 @@ int cifs_rmdir(struct inode *inode, struct dentry *direntry) */ cifsInode->time = 0;
- d_inode(direntry)->i_ctime = inode->i_ctime = inode->i_mtime = - current_fs_time(inode->i_sb); + now = current_fs_time(inode->i_sb); + VFS_INODE_SET_XTIME(i_ctime, d_inode(direntry), now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now);
rmdir_exit: kfree(full_path); @@ -1692,6 +1707,7 @@ cifs_rename2(struct inode *source_dir, struct dentry *source_dentry, struct cifs_tcon *tcon; FILE_UNIX_BASIC_INFO *info_buf_source = NULL; FILE_UNIX_BASIC_INFO *info_buf_target; + struct inode_timespec now; unsigned int xid; int rc, tmprc;
@@ -1785,8 +1801,11 @@ unlink_target: /* force revalidate to go get info when needed */ CIFS_I(source_dir)->time = CIFS_I(target_dir)->time = 0;
- source_dir->i_ctime = source_dir->i_mtime = target_dir->i_ctime = - target_dir->i_mtime = current_fs_time(source_dir->i_sb); + now = current_fs_time(source_dir->i_sb); + VFS_INODE_SET_XTIME(i_ctime, source_dir, now); + VFS_INODE_SET_XTIME(i_mtime, source_dir, now); + VFS_INODE_SET_XTIME(i_ctime, target_dir, now); + VFS_INODE_SET_XTIME(i_mtime, target_dir, now);
cifs_rename_exit: kfree(info_buf_source); diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index 301d3d4..4a26260 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -918,10 +918,10 @@ smbCalcSize(void *buf) * Convert the NT UTC (based 1601-01-01, in hundred nanosecond units) * into Unix UTC (based 1970-01-01, in seconds). */ -struct timespec +struct inode_timespec cifs_NTtimeToUnix(__le64 ntutc) { - struct timespec ts; + struct inode_timespec ts; /* BB what about the timezone? BB */
/* Subtract the NTFS time offset, then convert to 1s intervals. */ @@ -949,7 +949,7 @@ cifs_NTtimeToUnix(__le64 ntutc)
/* Convert the Unix UTC into NT UTC. */ u64 -cifs_UnixTimeToNT(struct timespec t) +cifs_UnixTimeToNT(struct inode_timespec t) { /* Convert to 100ns intervals and then add the NTFS time offset. */ return (u64) t.tv_sec * 10000000 + t.tv_nsec/100 + NTFS_TIME_OFFSET; @@ -959,10 +959,11 @@ static const int total_days_of_prev_months[] = { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
-struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) +struct inode_timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { - struct timespec ts; - int sec, min, days, month, year; + struct inode_timespec ts; + unsigned long long sec; + unsigned int min, days, month, year; u16 date = le16_to_cpu(le_date); u16 time = le16_to_cpu(le_time); SMB_TIME *st = (SMB_TIME *)&time; @@ -973,7 +974,7 @@ struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) sec = 2 * st->TwoSeconds; min = st->Minutes; if ((sec > 59) || (min > 59)) - cifs_dbg(VFS, "illegal time min %d sec %d\n", min, sec); + cifs_dbg(VFS, "illegal time min %u sec %llu\n", min, sec); sec += (min * 60); sec += 60 * 60 * st->Hours; if (st->Hours > 24) @@ -992,11 +993,12 @@ struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) days += year * 365; days += (year/4); /* leap year */ /* generalized leap year calculation is more complex, ie no leap year - for years/100 except for years/400, but since the maximum number for DOS - year is 2**7, the last year is 1980+127, which means we need only - consider 2 special case years, ie the years 2000 and 2100, and only - adjust for the lack of leap year for the year 2100, as 2000 was a - leap year (divisable by 400) */ + * for years/100 except for years/400, but since the maximum number for + * DOS year is 2**7, the last year is 1980+127, which means we need only + * consider 2 special case years, ie the years 2000 and 2100, and only + * adjust for the lack of leap year for the year 2100, as 2000 was a + * leap year (divisible by 400) + */ if (year >= 120) /* the year 2100 */ days = days - 1; /* do not count leap year for the year 2100 */
The FAT filesystem supports timestamps up to the year 2099, even though the on-disk format could theoretically represent dates up to 2107. However, struct timespec overflows in the year 2038 on 32-bit systems.
Use inode_timespec throughout the file system code so that the timestamps can switch to y2038 safe struct timespec64 when CONFIG_FS_USES_64BIT_TIME is turned on.
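As a rough illustration of the aliasing and accessor idea described in the cover letter, the following userspace sketch stores a timestamp as scalar fields and reads it back through helpers. All names here (timespec64_sketch, inode_sketch, sketch_get_mtime, sketch_set_mtime) are hypothetical stand-ins, not the definitions used by the series; the real VFS_INODE_GET_XTIME/VFS_INODE_SET_XTIME macros may look quite different.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct timespec64. */
struct timespec64_sketch {
	int64_t tv_sec;		/* seconds since the epoch, 64 bit everywhere */
	long tv_nsec;		/* nanoseconds, 0..999999999 */
};

/*
 * Assumption: with CONFIG_FS_USES_64BIT_TIME enabled, inode_timespec
 * would simply name the 64-bit structure.
 */
#define inode_timespec timespec64_sketch

/* Scalar storage, so the layout does not depend on the word size. */
struct inode_sketch {
	int64_t i_mtime_sec;
	int32_t i_mtime_nsec;
};

/* Accessors hide the storage type from filesystem code. */
static inline struct inode_timespec
sketch_get_mtime(const struct inode_sketch *inode)
{
	struct inode_timespec ts = { inode->i_mtime_sec, inode->i_mtime_nsec };

	return ts;
}

static inline void
sketch_set_mtime(struct inode_sketch *inode, struct inode_timespec ts)
{
	inode->i_mtime_sec = ts.tv_sec;
	inode->i_mtime_nsec = (int32_t)ts.tv_nsec;
}
```

Because callers only touch the accessors, the storage fields could presumably change later without another tree-wide conversion.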
Use a larger data type for seconds in fat_time_fat2unix(). This extends timestamps beyond the year 2038.
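To show why the wider seconds type matters, here is a simplified userspace sketch of a FAT date/time to Unix seconds conversion that accumulates into a 64-bit value. It is loosely modelled on the approach in fat_time_fat2unix() but is not the kernel code: the function name sketch_fat_to_unix is hypothetical, and the timezone offset and the time_cs field are ignored.

```c
#include <stdint.h>

#define SECS_PER_DAY		86400LL
#define DAYS_1970_TO_1980	(365 * 10 + 2)	/* 1972 and 1976 were leap years */
#define FAT_YEAR_2100		120		/* 2100 - 1980 */

/* Days before each month in a non-leap year; index 1 is January. */
static const int days_before_month[13] = {
	0, 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

/* Hypothetical helper: FAT date and time words to seconds since 1970. */
static int64_t sketch_fat_to_unix(uint16_t date, uint16_t time)
{
	int year = date >> 9;		/* years since 1980, 0..127 */
	int month = (date >> 5) & 0xf;
	int day = date & 0x1f;
	int leap_days;
	int64_t days, seconds;

	if (month < 1 || month > 12)	/* clamp invalid on-disk values */
		month = 1;
	if (day < 1)
		day = 1;

	/* Leap days contributed by the completed years since 1980. */
	leap_days = (year + 3) / 4;
	if (year > FAT_YEAR_2100)	/* 2100 is not a leap year */
		leap_days--;
	/* Leap day of the current year, once February has passed. */
	if ((year % 4) == 0 && year != FAT_YEAR_2100 && month > 2)
		leap_days++;

	days = (int64_t)year * 365 + leap_days + days_before_month[month]
		+ (day - 1) + DAYS_1970_TO_1980;

	seconds = (time & 0x1f) * 2;		/* stored in 2-second units */
	seconds += ((time >> 5) & 0x3f) * 60;	/* minutes */
	seconds += (time >> 11) * 3600;		/* hours */

	/* 64-bit arithmetic: the result may exceed INT32_MAX after 2038. */
	return days * SECS_PER_DAY + seconds;
}
```

For a FAT timestamp of 2038-01-19 03:14:08 the result is 2147483648, one more than a signed 32-bit seconds counter can hold, which is the case the larger type is meant to cover.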
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/fat/dir.c         |  7 +++++--
 fs/fat/fat.h         |  8 +++++---
 fs/fat/file.c        | 10 ++++++++--
 fs/fat/inode.c       | 46 ++++++++++++++++++++++++++++++++--------------
 fs/fat/misc.c        |  7 ++++---
 fs/fat/namei_msdos.c | 40 +++++++++++++++++++++++++---------------
 fs/fat/namei_vfat.c  | 41 +++++++++++++++++++++++++++--------------
 7 files changed, 106 insertions(+), 53 deletions(-)
diff --git a/fs/fat/dir.c b/fs/fat/dir.c index 7def96c..fa8a922 100644 --- a/fs/fat/dir.c +++ b/fs/fat/dir.c @@ -1034,6 +1034,7 @@ int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo) struct super_block *sb = dir->i_sb; struct msdos_dir_entry *de; struct buffer_head *bh; + struct inode_timespec now; int err = 0, nr_slots;
/* @@ -1071,7 +1072,9 @@ int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo) } }
- dir->i_mtime = dir->i_atime = CURRENT_TIME_SEC; + now = current_fs_time_sec(sb); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_atime, dir, now); if (IS_DIRSYNC(dir)) (void)fat_sync_inode(dir); else @@ -1130,7 +1133,7 @@ error: return err; }
-int fat_alloc_new_dir(struct inode *dir, struct timespec *ts) +int fat_alloc_new_dir(struct inode *dir, struct inode_timespec *ts) { struct super_block *sb = dir->i_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); diff --git a/fs/fat/fat.h b/fs/fat/fat.h index e6b764a..cabb0fd 100644 --- a/fs/fat/fat.h +++ b/fs/fat/fat.h @@ -303,7 +303,7 @@ extern int fat_scan_logstart(struct inode *dir, int i_logstart, struct fat_slot_info *sinfo); extern int fat_get_dotdot_entry(struct inode *dir, struct buffer_head **bh, struct msdos_dir_entry **de); -extern int fat_alloc_new_dir(struct inode *dir, struct timespec *ts); +extern int fat_alloc_new_dir(struct inode *dir, struct inode_timespec *ts); extern int fat_add_entries(struct inode *dir, void *slots, int nr_slots, struct fat_slot_info *sinfo); extern int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo); @@ -405,9 +405,11 @@ void fat_msg(struct super_block *sb, const char *level, const char *fmt, ...); } while (0) extern int fat_clusters_flush(struct super_block *sb); extern int fat_chain_add(struct inode *inode, int new_dclus, int nr_cluster); -extern void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts, +extern void fat_time_fat2unix(struct msdos_sb_info *sbi, + struct inode_timespec *ts, __le16 __time, __le16 __date, u8 time_cs); -extern void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec *ts, +extern void fat_time_unix2fat(struct msdos_sb_info *sbi, + struct inode_timespec *ts, __le16 *time, __le16 *date, u8 *time_cs); extern int fat_sync_bhs(struct buffer_head **bhs, int nr_bhs);
diff --git a/fs/fat/file.c b/fs/fat/file.c index 43d3475..e7f060f 100644 --- a/fs/fat/file.c +++ b/fs/fat/file.c @@ -188,13 +188,16 @@ static int fat_cont_expand(struct inode *inode, loff_t size) { struct address_space *mapping = inode->i_mapping; loff_t start = inode->i_size, count = size - inode->i_size; + struct inode_timespec ts; int err;
err = generic_cont_expand_simple(inode, size); if (err) goto out;
- inode->i_ctime = inode->i_mtime = CURRENT_TIME_SEC; + ts = current_fs_time_sec(inode->i_sb); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); mark_inode_dirty(inode); if (IS_SYNC(inode)) { int err2; @@ -280,6 +283,7 @@ error: static int fat_free(struct inode *inode, int skip) { struct super_block *sb = inode->i_sb; + struct inode_timespec ts; int err, wait, free_start, i_start, i_logstart;
if (MSDOS_I(inode)->i_start == 0) @@ -297,7 +301,9 @@ static int fat_free(struct inode *inode, int skip) MSDOS_I(inode)->i_logstart = 0; } MSDOS_I(inode)->i_attrs |= ATTR_ARCH; - inode->i_ctime = inode->i_mtime = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); if (wait) { err = fat_sync_inode(inode); if (err) { diff --git a/fs/fat/inode.c b/fs/fat/inode.c index a559905..a1eba05 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -232,12 +232,15 @@ static int fat_write_end(struct file *file, struct address_space *mapping, struct page *pagep, void *fsdata) { struct inode *inode = mapping->host; + struct inode_timespec ts; int err; err = generic_write_end(file, mapping, pos, len, copied, pagep, fsdata); if (err < len) fat_write_failed(mapping, pos + len); if (!(err < 0) && !(MSDOS_I(inode)->i_attrs & ATTR_ARCH)) { - inode->i_mtime = inode->i_ctime = CURRENT_TIME_SEC; + ts = current_fs_time_sec(inode->i_sb); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); MSDOS_I(inode)->i_attrs |= ATTR_ARCH; mark_inode_dirty(inode); } @@ -502,6 +505,7 @@ static int fat_validate_dir(struct inode *dir) int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); + struct inode_timespec mtime, ctime, atime; int error;
MSDOS_I(inode)->i_pos = 0; @@ -551,13 +555,21 @@ int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de) inode->i_blocks = ((inode->i_size + (sbi->cluster_size - 1)) & ~((loff_t)sbi->cluster_size - 1)) >> 9;
- fat_time_fat2unix(sbi, &inode->i_mtime, de->time, de->date, 0); + mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + fat_time_fat2unix(sbi, &mtime, de->time, de->date, 0); if (sbi->options.isvfat) { - fat_time_fat2unix(sbi, &inode->i_ctime, de->ctime, - de->cdate, de->ctime_cs); - fat_time_fat2unix(sbi, &inode->i_atime, 0, de->adate, 0); - } else - inode->i_ctime = inode->i_atime = inode->i_mtime; + ctime = VFS_INODE_GET_XTIME(i_ctime, inode); + fat_time_fat2unix(sbi, + &ctime, + de->ctime, de->cdate, de->ctime_cs); + atime = VFS_INODE_GET_XTIME(i_atime, inode); + fat_time_fat2unix(sbi, &atime, + 0, de->adate, 0); + } else { + mtime = VFS_INODE_GET_XTIME(i_mtime, inode); + VFS_INODE_SET_XTIME(i_atime, inode, mtime); + VFS_INODE_SET_XTIME(i_ctime, inode, mtime); + }
return 0; } @@ -828,6 +840,7 @@ static int __fat_write_inode(struct inode *inode, int wait) struct msdos_sb_info *sbi = MSDOS_SB(sb); struct buffer_head *bh; struct msdos_dir_entry *raw_entry; + struct inode_timespec ts; loff_t i_pos; sector_t blocknr; int err, offset; @@ -861,14 +874,18 @@ retry: raw_entry->size = cpu_to_le32(inode->i_size); raw_entry->attr = fat_make_attrs(inode); fat_set_start(raw_entry, MSDOS_I(inode)->i_logstart); - fat_time_unix2fat(sbi, &inode->i_mtime, &raw_entry->time, - &raw_entry->date, NULL); + ts = VFS_INODE_GET_XTIME(i_mtime, inode); + fat_time_unix2fat(sbi, &ts, + &raw_entry->time, &raw_entry->date, NULL); if (sbi->options.isvfat) { __le16 atime; - fat_time_unix2fat(sbi, &inode->i_ctime, &raw_entry->ctime, + ts = VFS_INODE_GET_XTIME(i_ctime, inode); + fat_time_unix2fat(sbi, &ts, + &raw_entry->ctime, &raw_entry->cdate, &raw_entry->ctime_cs); - fat_time_unix2fat(sbi, &inode->i_atime, &atime, - &raw_entry->adate, NULL); + ts = VFS_INODE_GET_XTIME(i_atime, inode); + fat_time_unix2fat(sbi, &ts, + &atime, &raw_entry->adate, NULL); } spin_unlock(&sbi->inode_hash_lock); mark_buffer_dirty(bh); @@ -1385,8 +1402,9 @@ static int fat_read_root(struct inode *inode) MSDOS_I(inode)->mmu_private = inode->i_size;
fat_save_attrs(inode, ATTR_DIR); - inode->i_mtime.tv_sec = inode->i_atime.tv_sec = inode->i_ctime.tv_sec = 0; - inode->i_mtime.tv_nsec = inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec = 0; + VFS_INODE_SET_XTIME(i_atime, inode, ((struct inode_timespec) {0, 0})); + VFS_INODE_SET_XTIME(i_mtime, inode, ((struct inode_timespec) {0, 0})); + VFS_INODE_SET_XTIME(i_ctime, inode, ((struct inode_timespec) {0, 0})); set_nlink(inode, fat_subdirs(inode)+2);
return 0; diff --git a/fs/fat/misc.c b/fs/fat/misc.c index c4589e9..1544498 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -186,11 +186,12 @@ static time_t days_in_year[] = { };
/* Convert a FAT time/date pair to a UNIX date (seconds since 1 1 70). */ -void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts, +void fat_time_fat2unix(struct msdos_sb_info *sbi, struct inode_timespec *ts, __le16 __time, __le16 __date, u8 time_cs) { u16 time = le16_to_cpu(__time), date = le16_to_cpu(__date); - time_t second, day, leap_day, month, year; + long long second; + time_t day, leap_day, month, year;
year = date >> 9; month = max(1, (date >> 5) & 0xf); @@ -224,7 +225,7 @@ void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts, }
/* Convert linear UNIX date to a FAT time/date pair. */ -void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec *ts, +void fat_time_unix2fat(struct msdos_sb_info *sbi, struct inode_timespec *ts, __le16 *time, __le16 *date, u8 *time_cs) { struct tm tm; diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c index b7e2b33..457dcfb 100644 --- a/fs/fat/namei_msdos.c +++ b/fs/fat/namei_msdos.c @@ -224,7 +224,8 @@ static struct dentry *msdos_lookup(struct inode *dir, struct dentry *dentry, /***** Creates a directory entry (name is already formatted). */ static int msdos_add_entry(struct inode *dir, const unsigned char *name, int is_dir, int is_hid, int cluster, - struct timespec *ts, struct fat_slot_info *sinfo) + struct inode_timespec *ts, + struct fat_slot_info *sinfo) { struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb); struct msdos_dir_entry de; @@ -249,7 +250,8 @@ static int msdos_add_entry(struct inode *dir, const unsigned char *name, if (err) return err;
- dir->i_ctime = dir->i_mtime = *ts; + VFS_INODE_SET_XTIME(i_ctime, dir, *ts); + VFS_INODE_SET_XTIME(i_mtime, dir, *ts); if (IS_DIRSYNC(dir)) (void)fat_sync_inode(dir); else @@ -265,7 +267,7 @@ static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct super_block *sb = dir->i_sb; struct inode *inode = NULL; struct fat_slot_info sinfo; - struct timespec ts; + struct inode_timespec ts; unsigned char msdos_name[MSDOS_NAME]; int err, is_hid;
@@ -283,7 +285,7 @@ static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode, goto out; }
- ts = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); err = msdos_add_entry(dir, msdos_name, 0, is_hid, 0, &ts, &sinfo); if (err) goto out; @@ -293,7 +295,9 @@ static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode, err = PTR_ERR(inode); goto out; } - inode->i_mtime = inode->i_atime = inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_atime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); /* timestamp is already written, so mark_inode_dirty() is unneeded. */
d_instantiate(dentry, inode); @@ -330,7 +334,7 @@ static int msdos_rmdir(struct inode *dir, struct dentry *dentry) drop_nlink(dir);
clear_nlink(inode); - inode->i_ctime = CURRENT_TIME_SEC; + VFS_INODE_SET_XTIME(i_ctime, inode, current_fs_time_sec(sb)); fat_detach(inode); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); @@ -347,7 +351,7 @@ static int msdos_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) struct fat_slot_info sinfo; struct inode *inode; unsigned char msdos_name[MSDOS_NAME]; - struct timespec ts; + struct inode_timespec ts; int err, is_hid, cluster;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -364,7 +368,7 @@ static int msdos_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) goto out; }
- ts = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); cluster = fat_alloc_new_dir(dir, &ts); if (cluster < 0) { err = cluster; @@ -383,7 +387,9 @@ static int msdos_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) goto out; } set_nlink(inode, 2); - inode->i_mtime = inode->i_atime = inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_atime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); /* timestamp is already written, so mark_inode_dirty() is unneeded. */
d_instantiate(dentry, inode); @@ -416,7 +422,7 @@ static int msdos_unlink(struct inode *dir, struct dentry *dentry) if (err) goto out; clear_nlink(inode); - inode->i_ctime = CURRENT_TIME_SEC; + VFS_INODE_SET_XTIME(i_ctime, inode, current_fs_time_sec(sb)); fat_detach(inode); out: mutex_unlock(&MSDOS_SB(sb)->s_lock); @@ -434,8 +440,9 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, struct buffer_head *dotdot_bh; struct msdos_dir_entry *dotdot_de; struct inode *old_inode, *new_inode; + struct super_block *sb = old_dir->i_sb; struct fat_slot_info old_sinfo, sinfo; - struct timespec ts; + struct inode_timespec ts; loff_t new_i_pos; int err, old_attrs, is_dir, update_dotdot, corrupt = 0;
@@ -481,7 +488,9 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, mark_inode_dirty(old_inode);
old_dir->i_version++; - old_dir->i_ctime = old_dir->i_mtime = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); + VFS_INODE_SET_XTIME(i_ctime, old_dir, ts); + VFS_INODE_SET_XTIME(i_mtime, old_dir, ts); if (IS_DIRSYNC(old_dir)) (void)fat_sync_inode(old_dir); else @@ -490,7 +499,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, } }
- ts = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); if (new_inode) { if (err) goto out; @@ -541,7 +550,8 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, if (err) goto error_dotdot; old_dir->i_version++; - old_dir->i_ctime = old_dir->i_mtime = ts; + VFS_INODE_SET_XTIME(i_ctime, old_dir, ts); + VFS_INODE_SET_XTIME(i_mtime, old_dir, ts); if (IS_DIRSYNC(old_dir)) (void)fat_sync_inode(old_dir); else @@ -551,7 +561,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, drop_nlink(new_inode); if (is_dir) drop_nlink(new_inode); - new_inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_ctime, new_inode, ts); } out: brelse(sinfo.bh); diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 7092584..31da5b6 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -577,7 +577,7 @@ xlate_to_uni(const unsigned char *name, int len, unsigned char *outname,
static int vfat_build_slots(struct inode *dir, const unsigned char *name, int len, int is_dir, int cluster, - struct timespec *ts, + struct inode_timespec *ts, struct msdos_dir_slot *slots, int *nr_slots) { struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb); @@ -653,7 +653,7 @@ out_free: }
static int vfat_add_entry(struct inode *dir, struct qstr *qname, int is_dir, - int cluster, struct timespec *ts, + int cluster, struct inode_timespec *ts, struct fat_slot_info *sinfo) { struct msdos_dir_slot *slots; @@ -678,7 +678,9 @@ static int vfat_add_entry(struct inode *dir, struct qstr *qname, int is_dir, goto cleanup;
/* update timestamp */ - dir->i_ctime = dir->i_mtime = dir->i_atime = *ts; + VFS_INODE_SET_XTIME(i_ctime, dir, *ts); + VFS_INODE_SET_XTIME(i_mtime, dir, *ts); + VFS_INODE_SET_XTIME(i_atime, dir, *ts); if (IS_DIRSYNC(dir)) (void)fat_sync_inode(dir); else @@ -772,12 +774,12 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct super_block *sb = dir->i_sb; struct inode *inode; struct fat_slot_info sinfo; - struct timespec ts; + struct inode_timespec ts; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock);
- ts = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); err = vfat_add_entry(dir, &dentry->d_name, 0, 0, &ts, &sinfo); if (err) goto out; @@ -790,7 +792,9 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode, goto out; } inode->i_version++; - inode->i_mtime = inode->i_atime = inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_atime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); /* timestamp is already written, so mark_inode_dirty() is unneeded. */
d_instantiate(dentry, inode); @@ -804,6 +808,7 @@ static int vfat_rmdir(struct inode *dir, struct dentry *dentry) struct inode *inode = d_inode(dentry); struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; + struct inode_timespec now; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -821,7 +826,9 @@ static int vfat_rmdir(struct inode *dir, struct dentry *dentry) drop_nlink(dir);
clear_nlink(inode); - inode->i_mtime = inode->i_atime = CURRENT_TIME_SEC; + now = current_fs_time_sec(sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); fat_detach(inode); dentry->d_time = dir->i_version; out: @@ -835,6 +842,7 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) struct inode *inode = d_inode(dentry); struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; + struct inode_timespec now; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -847,7 +855,9 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) if (err) goto out; clear_nlink(inode); - inode->i_mtime = inode->i_atime = CURRENT_TIME_SEC; + now = current_fs_time_sec(sb); + VFS_INODE_SET_XTIME(i_atime, inode, now); + VFS_INODE_SET_XTIME(i_mtime, inode, now); fat_detach(inode); dentry->d_time = dir->i_version; out: @@ -861,7 +871,7 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) struct super_block *sb = dir->i_sb; struct inode *inode; struct fat_slot_info sinfo; - struct timespec ts; + struct inode_timespec ts; int err, cluster;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -887,7 +897,9 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) } inode->i_version++; set_nlink(inode, 2); - inode->i_mtime = inode->i_atime = inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_atime, inode, ts); + VFS_INODE_SET_XTIME(i_mtime, inode, ts); + VFS_INODE_SET_XTIME(i_ctime, inode, ts); /* timestamp is already written, so mark_inode_dirty() is unneeded. */
d_instantiate(dentry, inode); @@ -909,7 +921,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry, struct msdos_dir_entry *dotdot_de; struct inode *old_inode, *new_inode; struct fat_slot_info old_sinfo, sinfo; - struct timespec ts; + struct inode_timespec ts; loff_t new_i_pos; int err, is_dir, update_dotdot, corrupt = 0; struct super_block *sb = old_dir->i_sb; @@ -931,7 +943,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry, } }
- ts = CURRENT_TIME_SEC; + ts = current_fs_time_sec(sb); if (new_inode) { if (is_dir) { err = fat_dir_empty(new_inode); @@ -976,7 +988,8 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry, if (err) goto error_dotdot; old_dir->i_version++; - old_dir->i_ctime = old_dir->i_mtime = ts; + VFS_INODE_SET_XTIME(i_ctime, old_dir, ts); + VFS_INODE_SET_XTIME(i_mtime, old_dir, ts); if (IS_DIRSYNC(old_dir)) (void)fat_sync_inode(old_dir); else @@ -986,7 +999,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry, drop_nlink(new_inode); if (is_dir) drop_nlink(new_inode); - new_inode->i_ctime = ts; + VFS_INODE_SET_XTIME(i_ctime, new_inode, ts); } out: brelse(sinfo.bh);
struct timespec is not y2038 safe. ext4 uses the {a,c,m,cr}time_extra fields to extend the corresponding timestamps until the year 2446.
Use struct inode_timespec in place of struct timespec. inode_timespec resolves to struct timespec64 once CONFIG_FS_USES_64BIT_TIME is enabled.
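The extra-field scheme can be sketched in userspace as follows: the low 32 bits of the seconds go in the main field, two epoch bits in the extra field carry the overflow, and the remaining 30 bits hold the nanoseconds. This is a simplified illustration under those assumptions, with hypothetical names (sketch_ts, sketch_encode_extra, sketch_decode); it does not reproduce the kernel's ext4_encode_extra_time()/ext4_decode_extra_time(), in particular not their handling of older encodings.

```c
#include <stdint.h>

#define EPOCH_BITS	2
#define EPOCH_MASK	((1u << EPOCH_BITS) - 1)
#define TWO_POW_32	4294967296LL

/* Hypothetical stand-in for a 64-bit timestamp. */
struct sketch_ts {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Interpret a 32-bit on-disk value as a signed quantity. */
static int64_t sign_extend32(uint32_t v)
{
	return (v & 0x80000000u) ? (int64_t)v - TWO_POW_32 : (int64_t)v;
}

/* Low 32 bits of the seconds, as stored in the main timestamp field. */
static uint32_t sketch_encode_seconds(const struct sketch_ts *ts)
{
	return (uint32_t)ts->tv_sec;
}

/* Epoch bits in the low 2 bits, nanoseconds in the upper 30 bits. */
static uint32_t sketch_encode_extra(const struct sketch_ts *ts)
{
	int64_t low = sign_extend32((uint32_t)ts->tv_sec);
	uint32_t epoch = (uint32_t)((ts->tv_sec - low) / TWO_POW_32) & EPOCH_MASK;

	return epoch | ((uint32_t)ts->tv_nsec << EPOCH_BITS);
}

/* Rebuild the 64-bit timestamp from the two on-disk fields. */
static void sketch_decode(struct sketch_ts *ts, uint32_t seconds, uint32_t extra)
{
	ts->tv_sec = sign_extend32(seconds) +
		     (int64_t)(extra & EPOCH_MASK) * TWO_POW_32;
	ts->tv_nsec = (int32_t)(extra >> EPOCH_BITS);
}
```

With two epoch bits the largest representable value in this sketch is 2^31 - 1 + 3 * 2^32 seconds, which falls in the year 2446 and is consistent with the limit stated above.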
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/ext4/acl.c     |  3 ++-
 fs/ext4/ext4.h    | 44 ++++++++++++++++++++++++--------------------
 fs/ext4/extents.c | 25 +++++++++++++++++++------
 fs/ext4/ialloc.c  |  9 +++++++--
 fs/ext4/inline.c  | 10 ++++++++--
 fs/ext4/inode.c   | 16 ++++++++++++----
 fs/ext4/ioctl.c   | 16 ++++++++++------
 fs/ext4/namei.c   | 40 ++++++++++++++++++++++++++++------------
 fs/ext4/super.c   |  6 +++++-
 fs/ext4/xattr.c   |  2 +-
 10 files changed, 116 insertions(+), 55 deletions(-)
diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c index 69b1e73..e8073d5 100644 --- a/fs/ext4/acl.c +++ b/fs/ext4/acl.c @@ -200,7 +200,8 @@ __ext4_set_acl(handle_t *handle, struct inode *inode, int type, if (error < 0) return error; else { - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, + ext4_current_time(inode)); ext4_mark_inode_dirty(handle, inode); if (error == 0) acl = NULL; diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index c569430..4bb2604 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -754,14 +754,15 @@ struct move_extent { * affected filesystem before 2242. */
-static inline __le32 ext4_encode_extra_time(struct timespec *time) +static inline __le32 ext4_encode_extra_time(struct inode_timespec *time) { u32 extra = sizeof(time->tv_sec) > 4 ? ((time->tv_sec - (s32)time->tv_sec) >> 32) & EXT4_EPOCH_MASK : 0; return cpu_to_le32(extra | (time->tv_nsec << EXT4_EPOCH_BITS)); }
-static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra) +static inline void ext4_decode_extra_time(struct inode_timespec *time, + __le32 extra) { if (unlikely(sizeof(time->tv_sec) > 4 && (extra & cpu_to_le32(EXT4_EPOCH_MASK)))) { @@ -784,12 +785,13 @@ static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra) time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS; }
-#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \ -do { \ - (raw_inode)->xtime = cpu_to_le32((inode)->xtime.tv_sec); \ - if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ - (raw_inode)->xtime ## _extra = \ - ext4_encode_extra_time(&(inode)->xtime); \ +#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \ +do { \ + struct inode_timespec __ts = VFS_INODE_GET_XTIME(xtime, inode); \ + (raw_inode)->xtime = cpu_to_le32(__ts.tv_sec); \ + if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ + (raw_inode)->xtime ## _extra = \ + ext4_encode_extra_time(&__ts); \ } while (0)
#define EXT4_EINODE_SET_XTIME(xtime, einode, raw_inode) \ @@ -801,14 +803,16 @@ do { \ ext4_encode_extra_time(&(einode)->xtime); \ } while (0)
-#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \ -do { \ - (inode)->xtime.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \ - if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ - ext4_decode_extra_time(&(inode)->xtime, \ - raw_inode->xtime ## _extra); \ - else \ - (inode)->xtime.tv_nsec = 0; \ +#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \ +do { \ + struct inode_timespec __ts = VFS_INODE_GET_XTIME(xtime, inode); \ + __ts.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \ + if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ + ext4_decode_extra_time(&__ts, \ + raw_inode->xtime ## _extra); \ + else \ + __ts.tv_nsec = 0; \ + VFS_INODE_SET_XTIME(xtime, inode, __ts); \ } while (0)
#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode) \ @@ -931,9 +935,9 @@ struct ext4_inode_info {
 	/*
 	 * File creation time. Its function is same as that of
-	 * struct timespec i_{a,c,m}time in the generic inode.
+	 * struct inode_timespec i_{a,c,m}time in the generic inode.
 	 */
-	struct timespec i_crtime;
+	struct inode_timespec i_crtime;
/* mballoc */ struct list_head i_prealloc_list; @@ -1441,10 +1445,10 @@ static inline struct ext4_inode_info *EXT4_I(struct inode *inode) return container_of(inode, struct ext4_inode_info, vfs_inode); }
-static inline struct timespec ext4_current_time(struct inode *inode)
+static inline struct inode_timespec ext4_current_time(struct inode *inode)
 {
 	return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
-		current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
+		current_fs_time(inode->i_sb) : current_fs_time_sec(inode->i_sb);
 }
static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index b52fea3..99c4800 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4726,12 +4726,13 @@ retry: map.m_lblk += ret; map.m_len = len = len - ret; epos = (loff_t)map.m_lblk << inode->i_blkbits; - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, ext4_current_time(inode)); if (new_size) { if (epos > new_size) epos = new_size; if (ext4_update_inode_size(inode, epos) & 0x1) - inode->i_mtime = inode->i_ctime; + VFS_INODE_SET_XTIME(i_mtime, inode, + VFS_INODE_GET_XTIME(i_ctime, inode)); } else { if (epos > inode->i_size) ext4_set_inode_flag(inode, @@ -4755,6 +4756,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, loff_t len, int mode) { struct inode *inode = file_inode(file); + struct inode_timespec now; handle_t *handle = NULL; unsigned int max_blocks; loff_t new_size = 0; @@ -4854,7 +4856,9 @@ static long ext4_zero_range(struct file *file, loff_t offset, } /* Now release the pages and zero block aligned part of pages */ truncate_pagecache_range(inode, start, end - 1); - inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now);
ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags, mode); @@ -4879,7 +4883,9 @@ static long ext4_zero_range(struct file *file, loff_t offset, goto out_dio; }
- inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); if (new_size) { ext4_update_inode_size(inode, new_size); } else { @@ -5459,6 +5465,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) { struct super_block *sb = inode->i_sb; ext4_lblk_t punch_start, punch_stop; + struct inode_timespec now; handle_t *handle; unsigned int credits; loff_t new_size, ioffset; @@ -5578,7 +5585,10 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) up_write(&EXT4_I(inode)->i_data_sem); if (IS_SYNC(inode)) ext4_handle_sync(handle); - inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + ext4_mark_inode_dirty(handle, inode);
out_stop: @@ -5606,6 +5616,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) struct ext4_ext_path *path; struct ext4_extent *extent; ext4_lblk_t offset_lblk, len_lblk, ee_start_lblk = 0; + struct inode_timespec now; unsigned int credits, ee_len; int ret = 0, depth, split_flag = 0; loff_t ioffset; @@ -5688,7 +5699,9 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) /* Expand file to avoid data loss if there is error while shifting */ inode->i_size += len; EXT4_I(inode)->i_disksize += len; - inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ret = ext4_mark_inode_dirty(handle, inode); if (ret) goto out_stop; diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 1b8024d..6f16598 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -756,6 +756,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir, ext4_group_t i; ext4_group_t flex_group; struct ext4_group_info *grp; + struct inode_timespec ts; int encrypt = 0;
/* Cannot create files in a deleted directory */ @@ -1029,8 +1030,12 @@ got: inode->i_ino = ino + group * EXT4_INODES_PER_GROUP(sb); /* This is the optimal IO size (for stat), not the fs block size */ inode->i_blocks = 0; - inode->i_mtime = inode->i_atime = inode->i_ctime = ei->i_crtime = - ext4_current_time(inode); + ts = ei->i_crtime = ext4_current_time(inode); + if (unlikely(is_fs_timestamp_bad(ei->i_crtime))) + ei->i_crtime.tv_nsec = 0; + VFS_INODE_SET_XTIME(i_mtime, inode, ts); + VFS_INODE_SET_XTIME(i_atime, inode, ts); + VFS_INODE_SET_XTIME(i_ctime, inode, ts);
memset(ei->i_data, 0, sizeof(ei->i_data)); ei->i_dir_start_lookup = 0; diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index d884989..a53fb3b 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -1003,6 +1003,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle, struct inode *dir = d_inode(dentry->d_parent); int err; struct ext4_dir_entry_2 *de; + struct inode_timespec now;
err = ext4_find_dest_de(dir, inode, iloc->bh, inline_start, inline_size, fname, &de); @@ -1028,7 +1029,9 @@ static int ext4_add_dirent_to_inline(handle_t *handle, * happen is that the times are slightly out of date * and/or different from the directory change time. */ - dir->i_mtime = dir->i_ctime = ext4_current_time(dir); + now = ext4_current_time(dir); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); ext4_update_dx_flag(dir); dir->i_version++; ext4_mark_inode_dirty(handle, dir); @@ -1896,6 +1899,7 @@ void ext4_inline_data_truncate(struct inode *inode, int *has_inline) int inline_size, value_len, needed_blocks; size_t i_size; void *value = NULL; + struct inode_timespec now; struct ext4_xattr_ibody_find is = { .s = { .not_found = -ENODATA, }, }; @@ -1973,7 +1977,9 @@ out: if (inode->i_nlink) ext4_orphan_del(handle, inode);
- inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ext4_mark_inode_dirty(handle, inode); if (IS_SYNC(inode)) ext4_handle_sync(handle); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 3ce5db6..078fd58 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3689,6 +3689,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) struct super_block *sb = inode->i_sb; ext4_lblk_t first_block, stop_block; struct address_space *mapping = inode->i_mapping; + struct inode_timespec now; loff_t first_block_offset, last_block_offset; handle_t *handle; unsigned int credits; @@ -3804,7 +3805,9 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) if (IS_SYNC(inode)) ext4_handle_sync(handle);
- inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ext4_mark_inode_dirty(handle, inode); out_stop: ext4_journal_stop(handle); @@ -3875,6 +3878,7 @@ void ext4_truncate(struct inode *inode) unsigned int credits; handle_t *handle; struct address_space *mapping = inode->i_mapping; + struct inode_timespec now;
/* * There is a possibility that we're either freeing the inode @@ -3958,7 +3962,9 @@ out_stop: if (inode->i_nlink) ext4_orphan_del(handle, inode);
- inode->i_mtime = inode->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle);
@@ -4825,6 +4831,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode) int ext4_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = d_inode(dentry); + struct inode_timespec now; int error, rc = 0; int orphan = 0; const unsigned int ia_valid = attr->ia_valid; @@ -4905,8 +4912,9 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) * update c/mtime in shrink case below */ if (!shrink) { - inode->i_mtime = ext4_current_time(inode); - inode->i_ctime = inode->i_mtime; + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); } down_write(&EXT4_I(inode)->i_data_sem); EXT4_I(inode)->i_disksize = attr->ia_size; diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index afb51f5..3825eb7 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -67,9 +67,9 @@ static void swap_inode_data(struct inode *inode1, struct inode *inode2) memswap(&inode1->i_blocks, &inode2->i_blocks, sizeof(inode1->i_blocks)); memswap(&inode1->i_bytes, &inode2->i_bytes, sizeof(inode1->i_bytes)); - memswap(&inode1->i_atime, &inode2->i_atime, sizeof(inode1->i_atime)); - memswap(&inode1->i_mtime, &inode2->i_mtime, sizeof(inode1->i_mtime)); - + VFS_INODE_SWAP_XTIME(i_ctime, inode1, inode2); + VFS_INODE_SWAP_XTIME(i_atime, inode1, inode2); + VFS_INODE_SWAP_XTIME(i_mtime, inode1, inode2); memswap(ei1->i_data, ei2->i_data, sizeof(ei1->i_data)); memswap(&ei1->i_flags, &ei2->i_flags, sizeof(ei1->i_flags)); memswap(&ei1->i_disksize, &ei2->i_disksize, sizeof(ei1->i_disksize)); @@ -98,6 +98,7 @@ static long swap_inode_boot_loader(struct super_block *sb, struct inode *inode_bl; struct ext4_inode_info *ei_bl; struct ext4_sb_info *sbi = EXT4_SB(sb); + struct inode_timespec now;
if (inode->i_nlink != 1 || !S_ISREG(inode->i_mode)) return -EINVAL; @@ -154,7 +155,9 @@ static long swap_inode_boot_loader(struct super_block *sb,
swap_inode_data(inode, inode_bl);
- inode->i_ctime = inode_bl->i_ctime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode_bl, now);
spin_lock(&sbi->s_next_gen_lock); inode->i_generation = sbi->s_next_generation++; @@ -298,7 +301,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) }
ext4_set_inode_flags(inode); - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, ext4_current_time(inode));
err = ext4_mark_iloc_dirty(handle, inode, &iloc); flags_err: @@ -357,7 +360,8 @@ flags_out: } err = ext4_reserve_inode_write(handle, inode, &iloc); if (err == 0) { - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, + ext4_current_time(inode)); inode->i_generation = generation; err = ext4_mark_iloc_dirty(handle, inode, &iloc); } diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index bfc026d..34c2d91 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1876,6 +1876,7 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, struct buffer_head *bh) { unsigned int blocksize = dir->i_sb->s_blocksize; + struct inode_timespec now; int csum_size = 0; int err;
@@ -1912,7 +1913,9 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, * happen is that the times are slightly out of date * and/or different from the directory change time. */ - dir->i_mtime = dir->i_ctime = ext4_current_time(dir); + now = ext4_current_time(dir); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); ext4_update_dx_flag(dir); dir->i_version++; ext4_mark_inode_dirty(handle, dir); @@ -2911,6 +2914,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry) struct buffer_head *bh; struct ext4_dir_entry_2 *de; handle_t *handle = NULL; + struct inode_timespec now;
/* Initialize quotas before so that eventual writes go in * separate transaction */ @@ -2964,7 +2968,10 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry) * recovery. */ inode->i_size = 0; ext4_orphan_add(handle, inode); - inode->i_ctime = dir->i_ctime = dir->i_mtime = ext4_current_time(inode); + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ext4_mark_inode_dirty(handle, inode); ext4_dec_count(handle, dir); ext4_update_dx_flag(dir); @@ -2984,6 +2991,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry) struct buffer_head *bh; struct ext4_dir_entry_2 *de; handle_t *handle = NULL; + struct inode_timespec now;
trace_ext4_unlink_enter(dir, dentry); /* Initialize quotas before so that eventual writes go @@ -3027,13 +3035,15 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry) retval = ext4_delete_entry(handle, dir, de, bh); if (retval) goto end_unlink; - dir->i_ctime = dir->i_mtime = ext4_current_time(dir); + now = ext4_current_time(dir); + VFS_INODE_SET_XTIME(i_mtime, dir, now); + VFS_INODE_SET_XTIME(i_ctime, dir, now); ext4_update_dx_flag(dir); ext4_mark_inode_dirty(handle, dir); drop_nlink(inode); if (!inode->i_nlink) ext4_orphan_add(handle, inode); - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, ext4_current_time(inode)); ext4_mark_inode_dirty(handle, inode);
end_unlink: @@ -3226,7 +3236,7 @@ retry: if (IS_DIRSYNC(dir)) ext4_handle_sync(handle);
- inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, ext4_current_time(inode)); ext4_inc_count(handle, inode); ihold(inode);
@@ -3342,6 +3352,7 @@ static int ext4_rename_dir_finish(handle_t *handle, struct ext4_renament *ent, static int ext4_setent(handle_t *handle, struct ext4_renament *ent, unsigned ino, unsigned file_type) { + struct inode_timespec now; int retval;
BUFFER_TRACE(ent->bh, "get write access"); @@ -3352,8 +3363,9 @@ static int ext4_setent(handle_t *handle, struct ext4_renament *ent, if (ext4_has_feature_filetype(ent->dir->i_sb)) ent->de->file_type = file_type; ent->dir->i_version++; - ent->dir->i_ctime = ent->dir->i_mtime = - ext4_current_time(ent->dir); + now = ext4_current_time(ent->dir); + VFS_INODE_SET_XTIME(i_mtime, ent->dir, now); + VFS_INODE_SET_XTIME(i_ctime, ent->dir, now); ext4_mark_inode_dirty(handle, ent->dir); BUFFER_TRACE(ent->bh, "call ext4_handle_dirty_metadata"); if (!ent->inlined) { @@ -3489,6 +3501,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry, int force_reread; int retval; struct inode *whiteout = NULL; + struct inode_timespec now; int credits; u8 old_file_type;
@@ -3619,7 +3632,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry, * Like most other Unix systems, set the ctime for inodes on a * rename. */ - old.inode->i_ctime = ext4_current_time(old.inode); + VFS_INODE_SET_XTIME(i_ctime, old.inode, ext4_current_time(old.inode)); ext4_mark_inode_dirty(handle, old.inode);
if (!whiteout) { @@ -3631,9 +3644,12 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
if (new.inode) { ext4_dec_count(handle, new.inode); - new.inode->i_ctime = ext4_current_time(new.inode); + VFS_INODE_SET_XTIME(i_ctime, new.inode, + ext4_current_time(new.inode)); } - old.dir->i_ctime = old.dir->i_mtime = ext4_current_time(old.dir); + now = ext4_current_time(old.dir); + VFS_INODE_SET_XTIME(i_mtime, old.dir, now); + VFS_INODE_SET_XTIME(i_ctime, old.dir, now); ext4_update_dx_flag(old.dir); if (old.dir_bh) { retval = ext4_rename_dir_finish(handle, &old, new.dir->i_ino); @@ -3785,8 +3801,8 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry, * Like most other Unix systems, set the ctime for inodes on a * rename. */ - old.inode->i_ctime = ext4_current_time(old.inode); - new.inode->i_ctime = ext4_current_time(new.inode); + VFS_INODE_SET_XTIME(i_ctime, old.inode, ext4_current_time(old.inode)); + VFS_INODE_SET_XTIME(i_ctime, new.inode, ext4_current_time(new.inode)); ext4_mark_inode_dirty(handle, old.inode); ext4_mark_inode_dirty(handle, new.inode);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 8f8aa69..2547697 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5116,6 +5116,7 @@ static int ext4_enable_quotas(struct super_block *sb) static int ext4_quota_off(struct super_block *sb, int type) { struct inode *inode = sb_dqopt(sb)->files[type]; + struct inode_timespec now; handle_t *handle;
/* Force all delayed allocation blocks to be allocated. @@ -5131,7 +5132,10 @@ static int ext4_quota_off(struct super_block *sb, int type) handle = ext4_journal_start(inode, EXT4_HT_QUOTA, 1); if (IS_ERR(handle)) goto out; - inode->i_mtime = inode->i_ctime = CURRENT_TIME; + + now = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_mtime, inode, now); + VFS_INODE_SET_XTIME(i_ctime, inode, now); ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle);
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index e9b9afd..58a9392 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1170,7 +1170,7 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index, } if (!error) { ext4_xattr_update_super_block(handle, inode->i_sb); - inode->i_ctime = ext4_current_time(inode); + VFS_INODE_SET_XTIME(i_ctime, inode, ext4_current_time(inode)); if (!value) ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND); error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
64 bit time support is now in place for all filesystems. Enable CONFIG_FS_USES_64BIT_TIME by default to start using 64 bit time everywhere.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
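Note (illustrative only, not part of the patch): flipping this default changes what inode_timespec resolves to, as described in 4.3. The userspace sketch below models that aliasing. The demo_* structs, the locally defined CONFIG macro and the #define-based alias are assumptions made for the example; the real definition lives elsewhere in the series and may differ.

```c
#include <stdint.h>

/* Stand-in for the Kconfig symbol; in the kernel this comes from autoconf. */
#define CONFIG_FS_USES_64BIT_TIME 1

/* Hypothetical models of a 32-bit struct timespec and of struct timespec64. */
struct demo_timespec   { int32_t tv_sec; long tv_nsec; };
struct demo_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Assumed shape of the alias: one name, two possible backing types. */
#ifdef CONFIG_FS_USES_64BIT_TIME
#define inode_timespec demo_timespec64
#else
#define inode_timespec demo_timespec
#endif

/* Returns 1 if the aliased type stores 'seconds' without truncation. */
static int demo_can_represent(int64_t seconds)
{
	struct inode_timespec ts = { 0, 0 };

	ts.tv_sec = seconds;
	return ts.tv_sec == seconds;
}
```

With the macro defined, demo_can_represent(2147483648LL) holds, i.e. the first second after the 32-bit limit in January 2038 is stored intact; with it undefined, the same call reports truncation on common compilers.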
diff --git a/fs/Kconfig b/fs/Kconfig
index a11934b..bfeefce 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -11,7 +11,7 @@ config DCACHE_WORD_ACCESS
 #use 64 bit timestamps
 config FS_USES_64BIT_TIME
 	bool
-	default n
+	default y
 	help
 	  Temporary configuration to switch over all file systems to use 64 bit time.
Substitute the inode_timespec alias with timespec64 in cifs. Since CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec references already resolve to timespec64 internally.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/cifs/cache.c     |  4 ++--
 fs/cifs/cifsglob.h  |  6 +++---
 fs/cifs/cifsproto.h |  6 +++---
 fs/cifs/cifssmb.c   |  4 ++--
 fs/cifs/file.c      |  2 +-
 fs/cifs/inode.c     | 12 ++++++------
 fs/cifs/netmisc.c   | 10 +++++-----
 7 files changed, 22 insertions(+), 22 deletions(-)
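Note (illustrative only, not part of the patch): the netmisc.c helpers touched below convert between Unix time and NT time (100ns units since 1601-01-01). The userspace sketch below mirrors the arithmetic visible in cifs_UnixTimeToNT() and adds a simplified inverse. The demo_* names are hypothetical, and the inverse is an assumption that only handles times at or after 1970; the kernel's cifs_NTtimeToUnix() is not reproduced here.

```c
#include <stdint.h>

/* 100ns intervals between 1601-01-01 and 1970-01-01. */
#define DEMO_NTFS_TIME_OFFSET \
	((uint64_t)(369 * 365 + 89) * 24 * 3600 * 10000000)

/* Hypothetical userspace model of struct timespec64. */
struct demo_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Convert to 100ns intervals, then add the 1601 offset. */
static uint64_t demo_unix_to_nt(struct demo_timespec64 t)
{
	return (uint64_t)t.tv_sec * 10000000 + t.tv_nsec / 100 +
	       DEMO_NTFS_TIME_OFFSET;
}

/* Simplified inverse; assumes nt is not earlier than 1970-01-01. */
static struct demo_timespec64 demo_nt_to_unix(uint64_t nt)
{
	struct demo_timespec64 ts;
	int64_t t = (int64_t)(nt - DEMO_NTFS_TIME_OFFSET);

	ts.tv_sec = t / 10000000;
	ts.tv_nsec = (long)(t % 10000000) * 100;
	return ts;
}
```

A 64-bit tv_sec lets a timestamp such as 2147483648 seconds (just past the 32-bit limit) survive the round trip, which is the property this rename preserves.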
diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c
index 8d27e69b..61c21bf 100644
--- a/fs/cifs/cache.c
+++ b/fs/cifs/cache.c
@@ -221,8 +221,8 @@ const struct fscache_cookie_def cifs_fscache_super_index_def = {
  * Auxiliary data attached to CIFS inode within the cache
  */
 struct cifs_fscache_inode_auxdata {
-	struct inode_timespec last_write_time;
-	struct inode_timespec last_change_time;
+	struct timespec64 last_write_time;
+	struct timespec64 last_change_time;
 	u64 eof;
 };
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index c95dce7..7dfb0e2 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1393,9 +1393,9 @@ struct cifs_fattr {
 	dev_t cf_rdev;
 	unsigned int cf_nlink;
 	unsigned int cf_dtype;
-	struct inode_timespec cf_atime;
-	struct inode_timespec cf_mtime;
-	struct inode_timespec cf_ctime;
+	struct timespec64 cf_atime;
+	struct timespec64 cf_mtime;
+	struct timespec64 cf_ctime;
 };
static inline void free_dfs_info_param(struct dfs_info3_param *param) diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index 9979c74..663b8a4 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -126,10 +126,10 @@ extern enum securityEnum select_sectype(struct TCP_Server_Info *server, enum securityEnum requested); extern int CIFS_SessSetup(const unsigned int xid, struct cifs_ses *ses, const struct nls_table *nls_cp); -extern struct inode_timespec +extern struct timespec64 cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); -extern u64 cifs_UnixTimeToNT(struct inode_timespec); -extern struct inode_timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, +extern u64 cifs_UnixTimeToNT(struct timespec64); +extern struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset); extern void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock); extern int cifs_get_writer(struct cifsInodeInfo *cinode); diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index a813bcd..465e089 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -478,7 +478,7 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr) * this requirement. */ int val, seconds, remain, result; - struct inode_timespec ts, utc; + struct timespec64 ts, utc;
utc = FS_TIME; ts = cnvrtDosUnixTm(rsp->SrvTime.Date, @@ -4004,7 +4004,7 @@ QInfRetry: if (rc) { cifs_dbg(FYI, "Send error in QueryInfo = %d\n", rc); } else if (data) { - struct inode_timespec ts; + struct timespec64 ts;
__u32 time = le32_to_cpu(pSMBr->last_write_time);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 2d226cf..656e799 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1839,7 +1839,7 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to) int bytes_written = 0; struct inode *inode; struct cifsFileInfo *open_file; - struct inode_timespec now; + struct timespec64 now;
if (!mapping || !mapping->host) return -EFAULT; diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index bb91bf7..c65d7e3 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -92,7 +92,7 @@ static void cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); - struct inode_timespec mtime; + struct timespec64 mtime;
cifs_dbg(FYI, "%s: revalidating inode %llu\n", __func__, cifs_i->uniqueid); @@ -112,7 +112,7 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr)
/* revalidate if mtime or size have changed */ mtime = VFS_INODE_GET_XTIME(i_mtime, inode); - if (inode_timespec_equal(&mtime, &fattr->cf_mtime) + if (timespec64_equal(&mtime, &fattr->cf_mtime) && cifs_i->server_eof == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", __func__, cifs_i->uniqueid); @@ -320,7 +320,7 @@ static void cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct super_block *sb) { struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - struct inode_timespec now; + struct timespec64 now;
cifs_dbg(FYI, "creating fake fattr for DFS referral\n");
@@ -1261,7 +1261,7 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) unsigned int xid; char *full_path = NULL; struct inode *inode = d_inode(dentry); - struct inode_timespec now; + struct timespec64 now; struct cifsInodeInfo *cifs_inode; struct super_block *sb = dir->i_sb; struct cifs_sb_info *cifs_sb = CIFS_SB(sb); @@ -1577,7 +1577,7 @@ int cifs_rmdir(struct inode *inode, struct dentry *direntry) struct TCP_Server_Info *server; char *full_path = NULL; struct cifsInodeInfo *cifsInode; - struct inode_timespec now; + struct timespec64 now;
cifs_dbg(FYI, "cifs_rmdir, inode = 0x%p\n", inode);
@@ -1707,7 +1707,7 @@ cifs_rename2(struct inode *source_dir, struct dentry *source_dentry, struct cifs_tcon *tcon; FILE_UNIX_BASIC_INFO *info_buf_source = NULL; FILE_UNIX_BASIC_INFO *info_buf_target; - struct inode_timespec now; + struct timespec64 now; unsigned int xid; int rc, tmprc;
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index 4a26260..bb1570f 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -918,10 +918,10 @@ smbCalcSize(void *buf) * Convert the NT UTC (based 1601-01-01, in hundred nanosecond units) * into Unix UTC (based 1970-01-01, in seconds). */ -struct inode_timespec +struct timespec64 cifs_NTtimeToUnix(__le64 ntutc) { - struct inode_timespec ts; + struct timespec64 ts; /* BB what about the timezone? BB */
/* Subtract the NTFS time offset, then convert to 1s intervals. */ @@ -949,7 +949,7 @@ cifs_NTtimeToUnix(__le64 ntutc)
/* Convert the Unix UTC into NT UTC. */ u64 -cifs_UnixTimeToNT(struct inode_timespec t) +cifs_UnixTimeToNT(struct timespec64 t) { /* Convert to 100ns intervals and then add the NTFS time offset. */ return (u64) t.tv_sec * 10000000 + t.tv_nsec/100 + NTFS_TIME_OFFSET; @@ -959,9 +959,9 @@ static const int total_days_of_prev_months[] = { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
-struct inode_timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) +struct timespec64 cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) { - struct inode_timespec ts; + struct timespec64 ts; unsigned long long sec; unsigned int min, days, month, year; u16 date = le16_to_cpu(le_date);
Substitute the inode_timespec alias with timespec64 in fat. Since CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec references already resolve to timespec64 internally.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/fat/dir.c         |  4 ++--
 fs/fat/fat.h         |  6 +++---
 fs/fat/file.c        |  4 ++--
 fs/fat/inode.c       | 12 ++++++------
 fs/fat/misc.c        |  4 ++--
 fs/fat/namei_msdos.c |  8 ++++----
 fs/fat/namei_vfat.c  | 14 +++++++-------
 7 files changed, 26 insertions(+), 26 deletions(-)
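Note (illustrative only, not part of the patch): the fat_read_root() hunk below passes a compound literal to VFS_INODE_SET_XTIME wrapped in an extra pair of parentheses. The preprocessor splits macro arguments on commas that are not inside parentheses, and braces do not protect them, so the extra parentheses are required. The sketch below shows this with hypothetical accessors over scalar storage, along the lines of 4.2. All DEMO_*/demo_* names and the field layout are assumptions; the real VFS_INODE_{GET,SET}_XTIME definitions are not shown in this patch.

```c
#include <stdint.h>

/* Hypothetical userspace model of struct timespec64. */
struct demo_timespec64 { int64_t tv_sec; long tv_nsec; };

/* Hypothetical inode storing each timestamp as scalars (s64 + s32). */
struct demo_inode {
	int64_t i_mtime_sec;
	int32_t i_mtime_nsec;
	int64_t i_ctime_sec;
	int32_t i_ctime_nsec;
};

/* Store a timespec64-like value into the scalar fields of one timestamp. */
#define DEMO_INODE_SET_XTIME(xtime, inode, ts)			\
do {								\
	struct demo_timespec64 tmp_ts = (ts);			\
	(inode)->xtime##_sec = tmp_ts.tv_sec;			\
	(inode)->xtime##_nsec = (int32_t)tmp_ts.tv_nsec;	\
} while (0)

/* Rebuild a timespec64-like value from the scalar fields. */
#define DEMO_INODE_GET_XTIME(xtime, inode)			\
	((struct demo_timespec64){ (inode)->xtime##_sec,	\
				   (inode)->xtime##_nsec })

/* Mirrors the "now" pattern: read the clock once, set both fields. */
static void demo_touch(struct demo_inode *inode, struct demo_timespec64 now)
{
	DEMO_INODE_SET_XTIME(i_mtime, inode, now);
	DEMO_INODE_SET_XTIME(i_ctime, inode, now);
}
```

Calling DEMO_INODE_SET_XTIME(i_mtime, &ino, ((struct demo_timespec64){0, 0})) compiles; dropping the outer parentheses makes the preprocessor see four arguments and reject the call.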
diff --git a/fs/fat/dir.c b/fs/fat/dir.c index fa8a922..683f973 100644 --- a/fs/fat/dir.c +++ b/fs/fat/dir.c @@ -1034,7 +1034,7 @@ int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo) struct super_block *sb = dir->i_sb; struct msdos_dir_entry *de; struct buffer_head *bh; - struct inode_timespec now; + struct timespec64 now; int err = 0, nr_slots;
/* @@ -1133,7 +1133,7 @@ error: return err; }
-int fat_alloc_new_dir(struct inode *dir, struct inode_timespec *ts) +int fat_alloc_new_dir(struct inode *dir, struct timespec64 *ts) { struct super_block *sb = dir->i_sb; struct msdos_sb_info *sbi = MSDOS_SB(sb); diff --git a/fs/fat/fat.h b/fs/fat/fat.h index cabb0fd..859626a 100644 --- a/fs/fat/fat.h +++ b/fs/fat/fat.h @@ -303,7 +303,7 @@ extern int fat_scan_logstart(struct inode *dir, int i_logstart, struct fat_slot_info *sinfo); extern int fat_get_dotdot_entry(struct inode *dir, struct buffer_head **bh, struct msdos_dir_entry **de); -extern int fat_alloc_new_dir(struct inode *dir, struct inode_timespec *ts); +extern int fat_alloc_new_dir(struct inode *dir, struct timespec64 *ts); extern int fat_add_entries(struct inode *dir, void *slots, int nr_slots, struct fat_slot_info *sinfo); extern int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo); @@ -406,10 +406,10 @@ void fat_msg(struct super_block *sb, const char *level, const char *fmt, ...); extern int fat_clusters_flush(struct super_block *sb); extern int fat_chain_add(struct inode *inode, int new_dclus, int nr_cluster); extern void fat_time_fat2unix(struct msdos_sb_info *sbi, - struct inode_timespec *ts, + struct timespec64 *ts, __le16 __time, __le16 __date, u8 time_cs); extern void fat_time_unix2fat(struct msdos_sb_info *sbi, - struct inode_timespec *ts, + struct timespec64 *ts, __le16 *time, __le16 *date, u8 *time_cs); extern int fat_sync_bhs(struct buffer_head **bhs, int nr_bhs);
diff --git a/fs/fat/file.c b/fs/fat/file.c index e7f060f..48c985d 100644 --- a/fs/fat/file.c +++ b/fs/fat/file.c @@ -188,7 +188,7 @@ static int fat_cont_expand(struct inode *inode, loff_t size) { struct address_space *mapping = inode->i_mapping; loff_t start = inode->i_size, count = size - inode->i_size; - struct inode_timespec ts; + struct timespec64 ts; int err;
err = generic_cont_expand_simple(inode, size); @@ -283,7 +283,7 @@ error: static int fat_free(struct inode *inode, int skip) { struct super_block *sb = inode->i_sb; - struct inode_timespec ts; + struct timespec64 ts; int err, wait, free_start, i_start, i_logstart;
if (MSDOS_I(inode)->i_start == 0) diff --git a/fs/fat/inode.c b/fs/fat/inode.c index a1eba05..03a8fa0 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -232,7 +232,7 @@ static int fat_write_end(struct file *file, struct address_space *mapping, struct page *pagep, void *fsdata) { struct inode *inode = mapping->host; - struct inode_timespec ts; + struct timespec64 ts; int err; err = generic_write_end(file, mapping, pos, len, copied, pagep, fsdata); if (err < len) @@ -505,7 +505,7 @@ static int fat_validate_dir(struct inode *dir) int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); - struct inode_timespec mtime, ctime, atime; + struct timespec64 mtime, ctime, atime; int error;
MSDOS_I(inode)->i_pos = 0; @@ -840,7 +840,7 @@ static int __fat_write_inode(struct inode *inode, int wait) struct msdos_sb_info *sbi = MSDOS_SB(sb); struct buffer_head *bh; struct msdos_dir_entry *raw_entry; - struct inode_timespec ts; + struct timespec64 ts; loff_t i_pos; sector_t blocknr; int err, offset; @@ -1402,9 +1402,9 @@ static int fat_read_root(struct inode *inode) MSDOS_I(inode)->mmu_private = inode->i_size;
fat_save_attrs(inode, ATTR_DIR); - VFS_INODE_SET_XTIME(i_atime, inode, ((struct inode_timespec) {0, 0})); - VFS_INODE_SET_XTIME(i_mtime, inode, ((struct inode_timespec) {0, 0})); - VFS_INODE_SET_XTIME(i_ctime, inode, ((struct inode_timespec) {0, 0})); + VFS_INODE_SET_XTIME(i_atime, inode, ((struct timespec64) {0, 0})); + VFS_INODE_SET_XTIME(i_mtime, inode, ((struct timespec64) {0, 0})); + VFS_INODE_SET_XTIME(i_ctime, inode, ((struct timespec64) {0, 0})); set_nlink(inode, fat_subdirs(inode)+2);
return 0; diff --git a/fs/fat/misc.c b/fs/fat/misc.c index 1544498..3ed5fbd 100644 --- a/fs/fat/misc.c +++ b/fs/fat/misc.c @@ -186,7 +186,7 @@ static time_t days_in_year[] = { };
/* Convert a FAT time/date pair to a UNIX date (seconds since 1 1 70). */ -void fat_time_fat2unix(struct msdos_sb_info *sbi, struct inode_timespec *ts, +void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec64 *ts, __le16 __time, __le16 __date, u8 time_cs) { u16 time = le16_to_cpu(__time), date = le16_to_cpu(__date); @@ -225,7 +225,7 @@ void fat_time_fat2unix(struct msdos_sb_info *sbi, struct inode_timespec *ts, }
/* Convert linear UNIX date to a FAT time/date pair. */ -void fat_time_unix2fat(struct msdos_sb_info *sbi, struct inode_timespec *ts, +void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec64 *ts, __le16 *time, __le16 *date, u8 *time_cs) { struct tm tm; diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c index 457dcfb..faa1274 100644 --- a/fs/fat/namei_msdos.c +++ b/fs/fat/namei_msdos.c @@ -224,7 +224,7 @@ static struct dentry *msdos_lookup(struct inode *dir, struct dentry *dentry, /***** Creates a directory entry (name is already formatted). */ static int msdos_add_entry(struct inode *dir, const unsigned char *name, int is_dir, int is_hid, int cluster, - struct inode_timespec *ts, + struct timespec64 *ts, struct fat_slot_info *sinfo) { struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb); @@ -267,7 +267,7 @@ static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct super_block *sb = dir->i_sb; struct inode *inode = NULL; struct fat_slot_info sinfo; - struct inode_timespec ts; + struct timespec64 ts; unsigned char msdos_name[MSDOS_NAME]; int err, is_hid;
@@ -351,7 +351,7 @@ static int msdos_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) struct fat_slot_info sinfo; struct inode *inode; unsigned char msdos_name[MSDOS_NAME]; - struct inode_timespec ts; + struct timespec64 ts; int err, is_hid, cluster;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -442,7 +442,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name, struct inode *old_inode, *new_inode; struct super_block *sb = old_dir->i_sb; struct fat_slot_info old_sinfo, sinfo; - struct inode_timespec ts; + struct timespec64 ts; loff_t new_i_pos; int err, old_attrs, is_dir, update_dotdot, corrupt = 0;
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index 31da5b6..84c309d 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -577,7 +577,7 @@ xlate_to_uni(const unsigned char *name, int len, unsigned char *outname,
static int vfat_build_slots(struct inode *dir, const unsigned char *name, int len, int is_dir, int cluster, - struct inode_timespec *ts, + struct timespec64 *ts, struct msdos_dir_slot *slots, int *nr_slots) { struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb); @@ -653,7 +653,7 @@ out_free: }
static int vfat_add_entry(struct inode *dir, struct qstr *qname, int is_dir, - int cluster, struct inode_timespec *ts, + int cluster, struct timespec64 *ts, struct fat_slot_info *sinfo) { struct msdos_dir_slot *slots; @@ -774,7 +774,7 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct super_block *sb = dir->i_sb; struct inode *inode; struct fat_slot_info sinfo; - struct inode_timespec ts; + struct timespec64 ts; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -808,7 +808,7 @@ static int vfat_rmdir(struct inode *dir, struct dentry *dentry) struct inode *inode = d_inode(dentry); struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; - struct inode_timespec now; + struct timespec64 now; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -842,7 +842,7 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry) struct inode *inode = d_inode(dentry); struct super_block *sb = dir->i_sb; struct fat_slot_info sinfo; - struct inode_timespec now; + struct timespec64 now; int err;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -871,7 +871,7 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode) struct super_block *sb = dir->i_sb; struct inode *inode; struct fat_slot_info sinfo; - struct inode_timespec ts; + struct timespec64 ts; int err, cluster;
mutex_lock(&MSDOS_SB(sb)->s_lock); @@ -921,7 +921,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry, struct msdos_dir_entry *dotdot_de; struct inode *old_inode, *new_inode; struct fat_slot_info old_sinfo, sinfo; - struct inode_timespec ts; + struct timespec64 ts; loff_t new_i_pos; int err, is_dir, update_dotdot, corrupt = 0; struct super_block *sb = old_dir->i_sb;
Substitute inode_timespec aliases with timespec64. Since CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec references already resolve to timespec64 internally.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/ext4/ext4.h    | 14 +++++++-------
 fs/ext4/extents.c |  6 +++---
 fs/ext4/ialloc.c  |  2 +-
 fs/ext4/inline.c  |  4 ++--
 fs/ext4/inode.c   |  6 +++---
 fs/ext4/ioctl.c   |  2 +-
 fs/ext4/namei.c   | 10 +++++-----
 fs/ext4/super.c   |  2 +-
 8 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 4bb2604..2d4bef0 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -754,14 +754,14 @@ struct move_extent { * affected filesystem before 2242. */
-static inline __le32 ext4_encode_extra_time(struct inode_timespec *time) +static inline __le32 ext4_encode_extra_time(struct timespec64 *time) { u32 extra = sizeof(time->tv_sec) > 4 ? ((time->tv_sec - (s32)time->tv_sec) >> 32) & EXT4_EPOCH_MASK : 0; return cpu_to_le32(extra | (time->tv_nsec << EXT4_EPOCH_BITS)); }
-static inline void ext4_decode_extra_time(struct inode_timespec *time, +static inline void ext4_decode_extra_time(struct timespec64 *time, __le32 extra) { if (unlikely(sizeof(time->tv_sec) > 4 && @@ -787,7 +787,7 @@ static inline void ext4_decode_extra_time(struct inode_timespec *time,
#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \ do { \ - struct inode_timespec __ts = VFS_INODE_GET_XTIME(xtime, inode); \ + struct timespec64 __ts = VFS_INODE_GET_XTIME(xtime, inode); \ (raw_inode)->xtime = cpu_to_le32(__ts.tv_sec); \ if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ (raw_inode)->xtime ## _extra = \ @@ -805,7 +805,7 @@ do { \
#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \ do { \ - struct inode_timespec __ts = VFS_INODE_GET_XTIME(xtime, inode); \ + struct timespec64 __ts = VFS_INODE_GET_XTIME(xtime, inode); \ __ts.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \ if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \ ext4_decode_extra_time(&__ts, \ @@ -935,9 +935,9 @@ struct ext4_inode_info {
/* * File creation time. Its function is same as that of - * struct inode_timespec i_{a,c,m}time in the generic inode. + * struct timespec64 i_{a,c,m}time in the generic inode. */ - struct inode_timespec i_crtime; + struct timespec64 i_crtime;
/* mballoc */ struct list_head i_prealloc_list; @@ -1445,7 +1445,7 @@ static inline struct ext4_inode_info *EXT4_I(struct inode *inode) return container_of(inode, struct ext4_inode_info, vfs_inode); }
-static inline struct inode_timespec ext4_current_time(struct inode *inode) +static inline struct timespec64 ext4_current_time(struct inode *inode) { return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ? current_fs_time(inode->i_sb) : current_fs_time_sec(inode->i_sb); diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 99c4800..ec1a912 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4756,7 +4756,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, loff_t len, int mode) { struct inode *inode = file_inode(file); - struct inode_timespec now; + struct timespec64 now; handle_t *handle = NULL; unsigned int max_blocks; loff_t new_size = 0; @@ -5465,7 +5465,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) { struct super_block *sb = inode->i_sb; ext4_lblk_t punch_start, punch_stop; - struct inode_timespec now; + struct timespec64 now; handle_t *handle; unsigned int credits; loff_t new_size, ioffset; @@ -5616,7 +5616,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) struct ext4_ext_path *path; struct ext4_extent *extent; ext4_lblk_t offset_lblk, len_lblk, ee_start_lblk = 0; - struct inode_timespec now; + struct timespec64 now; unsigned int credits, ee_len; int ret = 0, depth, split_flag = 0; loff_t ioffset; diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 6f16598..929c092 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -756,7 +756,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir, ext4_group_t i; ext4_group_t flex_group; struct ext4_group_info *grp; - struct inode_timespec ts; + struct timespec64 ts; int encrypt = 0;
/* Cannot create files in a deleted directory */ diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index a53fb3b..08550a4 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -1003,7 +1003,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle, struct inode *dir = d_inode(dentry->d_parent); int err; struct ext4_dir_entry_2 *de; - struct inode_timespec now; + struct timespec64 now;
err = ext4_find_dest_de(dir, inode, iloc->bh, inline_start, inline_size, fname, &de); @@ -1899,7 +1899,7 @@ void ext4_inline_data_truncate(struct inode *inode, int *has_inline) int inline_size, value_len, needed_blocks; size_t i_size; void *value = NULL; - struct inode_timespec now; + struct timespec64 now; struct ext4_xattr_ibody_find is = { .s = { .not_found = -ENODATA, }, }; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 078fd58..be618a7 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3689,7 +3689,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) struct super_block *sb = inode->i_sb; ext4_lblk_t first_block, stop_block; struct address_space *mapping = inode->i_mapping; - struct inode_timespec now; + struct timespec64 now; loff_t first_block_offset, last_block_offset; handle_t *handle; unsigned int credits; @@ -3878,7 +3878,7 @@ void ext4_truncate(struct inode *inode) unsigned int credits; handle_t *handle; struct address_space *mapping = inode->i_mapping; - struct inode_timespec now; + struct timespec64 now;
/* * There is a possibility that we're either freeing the inode @@ -4831,7 +4831,7 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode) int ext4_setattr(struct dentry *dentry, struct iattr *attr) { struct inode *inode = d_inode(dentry); - struct inode_timespec now; + struct timespec64 now; int error, rc = 0; int orphan = 0; const unsigned int ia_valid = attr->ia_valid; diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 3825eb7..94be0f8 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -98,7 +98,7 @@ static long swap_inode_boot_loader(struct super_block *sb, struct inode *inode_bl; struct ext4_inode_info *ei_bl; struct ext4_sb_info *sbi = EXT4_SB(sb); - struct inode_timespec now; + struct timespec64 now;
if (inode->i_nlink != 1 || !S_ISREG(inode->i_mode)) return -EINVAL; diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 34c2d91..9ced45b 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1876,7 +1876,7 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname, struct buffer_head *bh) { unsigned int blocksize = dir->i_sb->s_blocksize; - struct inode_timespec now; + struct timespec64 now; int csum_size = 0; int err;
@@ -2914,7 +2914,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry) struct buffer_head *bh; struct ext4_dir_entry_2 *de; handle_t *handle = NULL; - struct inode_timespec now; + struct timespec64 now;
/* Initialize quotas before so that eventual writes go in * separate transaction */ @@ -2991,7 +2991,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry) struct buffer_head *bh; struct ext4_dir_entry_2 *de; handle_t *handle = NULL; - struct inode_timespec now; + struct timespec64 now;
trace_ext4_unlink_enter(dir, dentry); /* Initialize quotas before so that eventual writes go @@ -3352,7 +3352,7 @@ static int ext4_rename_dir_finish(handle_t *handle, struct ext4_renament *ent, static int ext4_setent(handle_t *handle, struct ext4_renament *ent, unsigned ino, unsigned file_type) { - struct inode_timespec now; + struct timespec64 now; int retval;
BUFFER_TRACE(ent->bh, "get write access"); @@ -3501,7 +3501,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry, int force_reread; int retval; struct inode *whiteout = NULL; - struct inode_timespec now; + struct timespec64 now; int credits; u8 old_file_type;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 2547697..32f6b19 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5116,7 +5116,7 @@ static int ext4_enable_quotas(struct super_block *sb) static int ext4_quota_off(struct super_block *sb, int type) { struct inode *inode = sb_dqopt(sb)->files[type]; - struct inode_timespec now; + struct timespec64 now; handle_t *handle;
/* Force all delayed allocation blocks to be allocated.
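The ext4_encode_extra_time() hunk above packs the bits of tv_sec that do not fit in the 32-bit on-disk seconds field together with the nanoseconds. The following userspace sketch mirrors that arithmetic. The sketch_ names are hypothetical, the epoch field width of 2 bits is an assumption (EXT4_EPOCH_BITS is not shown in these hunks), and the result stays in host byte order where the kernel applies cpu_to_le32().

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct timespec64 (hypothetical name). */
struct sketch_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Assumed width of the epoch field; not shown in the hunks above. */
#define SKETCH_EPOCH_BITS 2
#define SKETCH_EPOCH_MASK ((1u << SKETCH_EPOCH_BITS) - 1)

/*
 * Pack the epoch bits (low 2 bits) and the nanoseconds (upper 30 bits)
 * into one 32-bit word, following ext4_encode_extra_time().
 * tv_sec is always 64 bits wide here, so the sizeof() check is dropped.
 */
static uint32_t sketch_encode_extra_time(const struct sketch_timespec64 *ts)
{
	/*
	 * Low 32 bits of tv_sec, sign-extended the way the on-disk seconds
	 * field is read back. The narrowing cast relies on the usual
	 * two's-complement wraparound (as gcc and clang implement it).
	 */
	int64_t low = (int32_t)(uint32_t)ts->tv_sec;
	uint32_t extra = (uint32_t)((ts->tv_sec - low) >> 32) & SKETCH_EPOCH_MASK;

	return extra | ((uint32_t)ts->tv_nsec << SKETCH_EPOCH_BITS);
}
```

For example, a timestamp of 2^32 seconds and 5 ns has a low word of 0 and an epoch value of 1, so the packed word is (5 << 2) | 1.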
Substitute inode_timespec aliases with timespec64. Since CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec references already resolve to timespec64 internally.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/attr.c            |  2 +-
 fs/bad_inode.c       |  2 +-
 fs/binfmt_misc.c     |  2 +-
 fs/inode.c           | 36 ++++++++++++++++++------------------
 fs/libfs.c           | 12 ++++++------
 fs/locks.c           |  2 +-
 fs/nsfs.c            |  2 +-
 fs/pipe.c            |  2 +-
 fs/utimes.c          |  4 ++--
 include/linux/fs.h   | 24 +++++++++---------------
 include/linux/stat.h |  6 +++---
 11 files changed, 44 insertions(+), 50 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c index 4156239..ec5e9ad 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -192,7 +192,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de struct inode *inode = dentry->d_inode; umode_t mode = inode->i_mode; int error; - struct inode_timespec now; + struct timespec64 now; unsigned int ia_valid = attr->ia_valid;
WARN_ON_ONCE(!mutex_is_locked(&inode->i_mutex)); diff --git a/fs/bad_inode.c b/fs/bad_inode.c index 3c51e22..2a8daef 100644 --- a/fs/bad_inode.c +++ b/fs/bad_inode.c @@ -169,7 +169,7 @@ static const struct inode_operations bad_inode_ops =
void make_bad_inode(struct inode *inode) { - struct inode_timespec now; + struct timespec64 now;
remove_inode_hash(inode);
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c index 4fd4437..c58ecb7 100644 --- a/fs/binfmt_misc.c +++ b/fs/binfmt_misc.c @@ -562,7 +562,7 @@ static void entry_status(Node *e, char *page) static struct inode *bm_get_inode(struct super_block *sb, int mode) { struct inode *inode = new_inode(sb); - struct inode_timespec now; + struct timespec64 now;
if (inode) { inode->i_ino = get_next_ino(); diff --git a/fs/inode.c b/fs/inode.c index d3d64dc..86218f6 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1532,12 +1532,12 @@ EXPORT_SYMBOL(bmap); * passed since the last atime update. */ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode, - struct inode_timespec now) + struct timespec64 now) {
- struct inode_timespec ctime; - struct inode_timespec mtime; - struct inode_timespec atime; + struct timespec64 ctime; + struct timespec64 mtime; + struct timespec64 atime;
if (!(mnt->mnt_flags & MNT_RELATIME)) return 1; @@ -1549,12 +1549,12 @@ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode, /* * Is mtime younger than atime? If yes, update atime: */ - if (inode_timespec_compare(&mtime, &atime) >= 0) + if (timespec64_compare(&mtime, &atime) >= 0) return 1; /* * Is ctime younger than atime? If yes, update atime: */ - if (inode_timespec_compare(&ctime, &atime) >= 0) + if (timespec64_compare(&ctime, &atime) >= 0) return 1;
/* @@ -1569,7 +1569,7 @@ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode, return 0; }
-int generic_update_time(struct inode *inode, struct inode_timespec *time, +int generic_update_time(struct inode *inode, struct timespec64 *time, int flags) { int iflags = I_DIRTY_TIME; @@ -1594,10 +1594,10 @@ EXPORT_SYMBOL(generic_update_time); * This does the actual work of updating an inodes time or version. Must have * had called mnt_want_write() before calling this. */ -static int update_time(struct inode *inode, struct inode_timespec *time, +static int update_time(struct inode *inode, struct timespec64 *time, int flags) { - int (*update_time)(struct inode *, struct inode_timespec *, int); + int (*update_time)(struct inode *, struct timespec64 *, int);
update_time = inode->i_op->update_time ? inode->i_op->update_time : generic_update_time; @@ -1617,8 +1617,8 @@ static int update_time(struct inode *inode, struct inode_timespec *time, bool atime_needs_update(const struct path *path, struct inode *inode) { struct vfsmount *mnt = path->mnt; - struct inode_timespec now; - struct inode_timespec atime; + struct timespec64 now; + struct timespec64 atime;
if (inode->i_flags & S_NOATIME) return false; @@ -1637,7 +1637,7 @@ bool atime_needs_update(const struct path *path, struct inode *inode) return false;
atime = VFS_INODE_GET_XTIME(i_atime, inode); - if (inode_timespec_equal(&atime, &now)) + if (timespec64_equal(&atime, &now)) return false;
return true; @@ -1647,7 +1647,7 @@ void touch_atime(const struct path *path) { struct vfsmount *mnt = path->mnt; struct inode *inode = d_inode(path->dentry); - struct inode_timespec now; + struct timespec64 now;
if (!atime_needs_update(path, inode)) return; @@ -1782,9 +1782,9 @@ EXPORT_SYMBOL(file_remove_privs); int file_update_time(struct file *file) { struct inode *inode = file_inode(file); - struct inode_timespec now; - struct inode_timespec mtime; - struct inode_timespec ctime; + struct timespec64 now; + struct timespec64 mtime; + struct timespec64 ctime;
int sync_it = 0; int ret; @@ -1798,10 +1798,10 @@ int file_update_time(struct file *file) mtime = VFS_INODE_GET_XTIME(i_mtime, inode); ctime = VFS_INODE_GET_XTIME(i_ctime, inode);
- if (!inode_timespec_equal(&mtime, &now)) + if (!timespec64_equal(&mtime, &now)) sync_it = S_MTIME;
- if (!inode_timespec_equal(&ctime, &now)) + if (!timespec64_equal(&ctime, &now)) sync_it |= S_CTIME;
if (IS_I_VERSION(inode)) diff --git a/fs/libfs.c b/fs/libfs.c index 5a0c7c2..ffa9e65 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -216,7 +216,7 @@ struct dentry *mount_pseudo(struct file_system_type *fs_type, char *name, struct dentry *dentry; struct inode *root; struct qstr d_name = QSTR_INIT(name, strlen(name)); - struct inode_timespec now; + struct timespec64 now;
s = sget(fs_type, NULL, set_anon_super, MS_NOUSER, NULL); if (IS_ERR(s)) @@ -274,7 +274,7 @@ EXPORT_SYMBOL(simple_open); int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry) { struct inode *inode = d_inode(old_dentry); - struct inode_timespec now = current_fs_time(inode->i_sb); + struct timespec64 now = current_fs_time(inode->i_sb);
VFS_INODE_SET_XTIME(i_ctime, inode, now); VFS_INODE_SET_XTIME(i_mtime, dir, now); @@ -311,7 +311,7 @@ EXPORT_SYMBOL(simple_empty); int simple_unlink(struct inode *dir, struct dentry *dentry) { struct inode *inode = d_inode(dentry); - struct inode_timespec now = current_fs_time(inode->i_sb); + struct timespec64 now = current_fs_time(inode->i_sb);
VFS_INODE_SET_XTIME(i_ctime, inode, now); VFS_INODE_SET_XTIME(i_mtime, dir, now); @@ -339,7 +339,7 @@ int simple_rename(struct inode *old_dir, struct dentry *old_dentry, { struct inode *inode = d_inode(old_dentry); int they_are_dirs = d_is_dir(old_dentry); - struct inode_timespec now; + struct timespec64 now;
if (!simple_empty(new_dentry)) return -ENOTEMPTY; @@ -495,7 +495,7 @@ int simple_fill_super(struct super_block *s, unsigned long magic, struct inode *inode; struct dentry *root; struct dentry *dentry; - struct inode_timespec now; + struct timespec64 now; int i;
s->s_blocksize = PAGE_CACHE_SIZE; @@ -1080,7 +1080,7 @@ struct inode *alloc_anon_inode(struct super_block *s) .set_page_dirty = anon_set_page_dirty, }; struct inode *inode = new_inode_pseudo(s); - struct inode_timespec now; + struct timespec64 now;
if (!inode) return ERR_PTR(-ENOMEM); diff --git a/fs/locks.c b/fs/locks.c index 2b818eb..ccf9c23 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1491,7 +1491,7 @@ EXPORT_SYMBOL(__break_lease); * exclusive leases. The justification is that if someone has an * exclusive lease, then they could be modifying it. */ -void lease_get_mtime(struct inode *inode, struct inode_timespec *time) +void lease_get_mtime(struct inode *inode, struct timespec64 *time) { bool has_lease = false; struct file_lock_context *ctx; diff --git a/fs/nsfs.c b/fs/nsfs.c index a079fc9..e9012b5 100644 --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -51,7 +51,7 @@ void *ns_get_path(struct path *path, struct task_struct *task, struct qstr qname = { .name = "", }; struct dentry *dentry; struct inode *inode; - struct inode_timespec now; + struct timespec64 now; struct ns_common *ns; unsigned long d;
diff --git a/fs/pipe.c b/fs/pipe.c index 5d414a3..3bcb870 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -637,7 +637,7 @@ static struct inode * get_pipe_inode(void) { struct inode *inode = new_inode_pseudo(pipe_mnt->mnt_sb); struct pipe_inode_info *pipe; - struct inode_timespec now; + struct timespec64 now;
if (!inode) goto fail_inode; diff --git a/fs/utimes.c b/fs/utimes.c index c23c8e6..6c0d208 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -48,7 +48,7 @@ static bool nsec_valid(long nsec) return nsec >= 0 && nsec <= 999999999; }
-static int utimes_common(struct path *path, struct inode_timespec *times) +static int utimes_common(struct path *path, struct timespec64 *times) { int error; struct iattr newattrs; @@ -135,7 +135,7 @@ out: * Else, update from *times, must be owner or super user. */ long do_utimes(int dfd, const char __user *filename, - struct inode_timespec *times, + struct timespec64 *times, int flags) { int error = -EINVAL; diff --git a/include/linux/fs.h b/include/linux/fs.h index 5112bc2..4a754a2 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -249,9 +249,9 @@ struct iattr { kuid_t ia_uid; kgid_t ia_gid; loff_t ia_size; - struct inode_timespec ia_atime; - struct inode_timespec ia_mtime; - struct inode_timespec ia_ctime; + struct timespec64 ia_atime; + struct timespec64 ia_mtime; + struct timespec64 ia_ctime;
/* * Not an attribute, but an auxiliary info for filesystems wanting to @@ -616,18 +616,12 @@ struct inode { }; dev_t i_rdev; loff_t i_size; -#ifdef CONFIG_FS_USES_64BIT_TIME time64_t i_atime_sec; time64_t i_mtime_sec; time64_t i_ctime_sec; s32 i_atime_nsec; s32 i_mtime_nsec; s32 i_ctime_nsec; -#else - struct timespec i_atime; - struct timespec i_mtime; - struct timespec i_ctime; -#endif spinlock_t i_lock; /* i_blocks, i_bytes, maybe i_size */ unsigned short i_bytes; unsigned int i_blkbits; @@ -727,7 +721,7 @@ struct inode {
#define VFS_INODE_SWAP_XTIME(xtime, inode1, inode2) \ do { \ - struct inode_timespec __ts = \ + struct timespec64 __ts = \ VFS_INODE_GET_XTIME(xtime, inode1); \ VFS_INODE_SET_XTIME(xtime, inode1, \ VFS_INODE_GET_XTIME(xtime, inode2)); \ @@ -1448,11 +1442,11 @@ struct super_block {
#endif
-extern int is_fs_timestamp_bad(struct inode_timespec ts); -extern struct inode_timespec current_fs_time(struct super_block *sb); -extern struct inode_timespec current_fs_time_sec(struct super_block *sb); -extern struct inode_timespec -fs_time_trunc(struct inode_timespec ts, struct super_block *sb); +extern int is_fs_timestamp_bad(struct timespec64 ts); +extern struct timespec64 current_fs_time(struct super_block *sb); +extern struct timespec64 current_fs_time_sec(struct super_block *sb); +extern struct timespec64 +fs_time_trunc(struct timespec64 ts, struct super_block *sb);
/* * Snapshotting support. diff --git a/include/linux/stat.h b/include/linux/stat.h index 559983f..5561337 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -27,9 +27,9 @@ struct kstat { kgid_t gid; dev_t rdev; loff_t size; - struct inode_timespec atime; - struct inode_timespec mtime; - struct inode_timespec ctime; + struct timespec64 atime; + struct timespec64 mtime; + struct timespec64 ctime; unsigned long blksize; unsigned long long blocks; };
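The include/linux/fs.h hunk above leaves struct inode with scalar time64_t and s32 fields that callers reach through VFS_INODE_GET_XTIME and VFS_INODE_SET_XTIME. The definitions of those accessors are not part of this patch, so the following is only a sketch of how such accessors could be written. Every sketch_ and SKETCH_ name is hypothetical, and the struct holds only the timestamp fields.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct timespec64 (hypothetical name). */
struct sketch_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Only the timestamp fields of struct inode, laid out as in the hunk above. */
struct sketch_inode {
	int64_t i_atime_sec;
	int64_t i_mtime_sec;
	int64_t i_ctime_sec;
	int32_t i_atime_nsec;
	int32_t i_mtime_nsec;
	int32_t i_ctime_nsec;
};

/* Helper that builds a timespec value from the two scalar fields. */
static inline struct sketch_timespec64 sketch_make_ts(int64_t sec, int32_t nsec)
{
	struct sketch_timespec64 ts = { sec, nsec };

	return ts;
}

/* Read i_<x>time_sec / i_<x>time_nsec back as one timespec value. */
#define SKETCH_INODE_GET_XTIME(xtime, inode) \
	sketch_make_ts((inode)->xtime##_sec, (inode)->xtime##_nsec)

/* Split a timespec value into the two scalar fields of the inode. */
#define SKETCH_INODE_SET_XTIME(xtime, inode, ts)		\
do {								\
	struct sketch_timespec64 v_ = (ts);			\
	(inode)->xtime##_sec = v_.tv_sec;			\
	(inode)->xtime##_nsec = (int32_t)v_.tv_nsec;		\
} while (0)
```

Because callers only see the accessor macros, the field types inside the inode can change without touching filesystem code, which is the property the series relies on.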
Substitute inode_timespec aliases with timespec64. Since CONFIG_FS_USES_64BIT_TIME is enabled, all inode_timespec references already resolve to timespec64 internally.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Conflicts:
	kernel/time/time.c
---
 kernel/time/time.c | 130 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 76 insertions(+), 54 deletions(-)
diff --git a/kernel/time/time.c b/kernel/time/time.c index 24ca258..87d9a4c 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -230,6 +230,76 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p) return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret; }
+<<<<<<< HEAD +======= +/** + * current_fs_time - Return FS time + * @sb: Superblock. + * + * Return the current time truncated to the time granularity supported by + * the fs. + */ +struct timespec64 current_fs_time(struct super_block *sb) +{ + struct timespec64 now = current_kernel_time64(); + + return fs_time_trunc(now, sb); +} +EXPORT_SYMBOL(current_fs_time); + +struct timespec64 current_fs_time_sec(struct super_block *sb) +{ + struct timespec64 ts = {ktime_get_real_seconds(), 0}; + + /* range check for time. */ + fs_time_range_check(sb, &ts); + + return ts; +} +EXPORT_SYMBOL(current_fs_time_sec); + +/* + * Convert jiffies to milliseconds and back. + * + * Avoid unnecessary multiplications/divisions in the + * two most common HZ cases: + */ +unsigned int jiffies_to_msecs(const unsigned long j) +{ +#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ) + return (MSEC_PER_SEC / HZ) * j; +#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC) + return (j + (HZ / MSEC_PER_SEC) - 1)/(HZ / MSEC_PER_SEC); +#else +# if BITS_PER_LONG == 32 + return (HZ_TO_MSEC_MUL32 * j) >> HZ_TO_MSEC_SHR32; +# else + return (j * HZ_TO_MSEC_NUM) / HZ_TO_MSEC_DEN; +# endif +#endif +} +EXPORT_SYMBOL(jiffies_to_msecs); + +unsigned int jiffies_to_usecs(const unsigned long j) +{ + /* + * Hz usually doesn't go much further MSEC_PER_SEC. + * jiffies_to_usecs() and usecs_to_jiffies() depend on that. + */ + BUILD_BUG_ON(HZ > USEC_PER_SEC); + +#if !(USEC_PER_SEC % HZ) + return (USEC_PER_SEC / HZ) * j; +#else +# if BITS_PER_LONG == 32 + return (HZ_TO_USEC_MUL32 * j) >> HZ_TO_USEC_SHR32; +# else + return (j * HZ_TO_USEC_NUM) / HZ_TO_USEC_DEN; +# endif +#endif +} +EXPORT_SYMBOL(jiffies_to_usecs); + /* fs_time_range_check: * Function to check if a given timestamp is in the range allowed for a * filesystem. @@ -240,7 +310,7 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p) * nsec is set to 0 if not in allowed range. 
*/ static void -fs_time_range_check(struct super_block *sb, struct inode_timespec *ts) +fs_time_range_check(struct super_block *sb, struct timespec64 *ts) { if (unlikely(sb->s_time_max < ts->tv_sec || sb->s_time_min > ts->tv_sec)) { @@ -257,7 +327,7 @@ fs_time_range_check(struct super_block *sb, struct inode_timespec *ts) * fs_time_range_check. * returns 0 otherwise. */ -int is_fs_timestamp_bad(struct inode_timespec ts) +int is_fs_timestamp_bad(struct timespec64 ts) { if (ts.tv_nsec == FS_TIMESTAMP_NSEC_NOT_VALID) return -1; @@ -267,16 +337,16 @@ int is_fs_timestamp_bad(struct inode_timespec ts) EXPORT_SYMBOL(is_fs_timestamp_bad);
/* - * fs_time_trunc - Truncate inode_timespec to a granularity - * @t: inode_timespec + * fs_time_trunc - Truncate timespec64 to a granularity + * @t: timespec64 * @sb: Super block. * * Truncate a timespec to a granularity. Always rounds down. Granularity * must * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns). * Returns 1 on error, 0 otherwise. */ -struct inode_timespec -fs_time_trunc(struct inode_timespec t, struct super_block *sb) +struct timespec64 +fs_time_trunc(struct timespec64 t, struct super_block *sb) { u32 gran = sb->s_time_gran;
@@ -300,34 +370,6 @@ fs_time_trunc(struct inode_timespec t, struct super_block *sb) EXPORT_SYMBOL(fs_time_trunc);
/** - * timespec_trunc - Truncate timespec to a granularity - * @t: Timespec - * @gran: Granularity in ns. - * - * Truncate a timespec to a granularity. Always rounds down. gran must - * not be 0 nor greater than a second (NSEC_PER_SEC, or 10^9 ns). - * - * This function is deprecated and should no longer be used for filesystems. - * fs_time_trunc should be used instead. - */ -struct timespec timespec_trunc(struct timespec t, unsigned gran) -{ - - /* Avoid division in the common cases 1 ns and 1 s. */ - if (gran == 1) { - /* nothing */ - } else if (gran == NSEC_PER_SEC) { - t.tv_nsec = 0; - } else if (gran > 1 && gran < NSEC_PER_SEC) { - t.tv_nsec -= t.tv_nsec % gran; - } else { - WARN(1, "illegal file time granularity: %u", gran); - } - return t; -} -EXPORT_SYMBOL(timespec_trunc); - -/** * current_fs_time - Return FS time * @sb: Superblock. * @@ -353,26 +395,6 @@ struct timespec64 current_fs_time_sec(struct super_block *sb) return ts; } EXPORT_SYMBOL(current_fs_time_sec); -#else -struct timespec current_fs_time(struct super_block *sb) -{ - struct timespec now = current_kernel_time(); - - return fs_time_trunc(now, sb); -} -EXPORT_SYMBOL(current_fs_time); - -struct timespec current_fs_time_sec(struct super_block *sb) -{ - struct timespec ts = { get_seconds(), 0 }; - - /* range check for time. */ - fs_time_range_check(sb, &ts); - - return ts; -} -EXPORT_SYMBOL(current_fs_time_sec); -#endif
/* * Convert jiffies to milliseconds and back.
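The granularity rule described in the fs_time_trunc() and timespec_trunc() comments above can be tried out in userspace. This sketch follows the body of the removed timespec_trunc() and takes the granularity as a parameter rather than reading sb->s_time_gran. The sketch_ names are hypothetical, and the range check done by fs_time_range_check() is left out.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Userspace stand-in for the kernel's struct timespec64 (hypothetical name). */
struct sketch_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/*
 * Round tv_nsec down to a multiple of gran (in nanoseconds).
 * gran must be between 1 and SKETCH_NSEC_PER_SEC inclusive.
 */
static struct sketch_timespec64
sketch_time_trunc(struct sketch_timespec64 t, uint32_t gran)
{
	/* Avoid division in the common cases 1 ns and 1 s. */
	if (gran == 1) {
		/* nothing */
	} else if (gran == SKETCH_NSEC_PER_SEC) {
		t.tv_nsec = 0;
	} else if (gran > 1 && gran < SKETCH_NSEC_PER_SEC) {
		t.tv_nsec -= t.tv_nsec % (long)gran;
	}
	/* Any other granularity is invalid; the kernel code warns in that case. */
	return t;
}
```

With a granularity of 1000 ns, a timestamp of 10 s and 123456789 ns becomes 10 s and 123456000 ns; tv_sec is never changed.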
Now that CONFIG_FS_USES_64BIT_TIME is enabled, the aliases for inode_timespec are no longer used and can be removed.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 include/linux/time64.h | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/include/linux/time64.h b/include/linux/time64.h
index eb3cdc0..f30c910 100644
--- a/include/linux/time64.h
+++ b/include/linux/time64.h
@@ -26,27 +26,6 @@ struct itimerspec64 {
 
 #endif
 
-#ifdef CONFIG_FS_USES_64BIT_TIME
-
-/* Place holder defines until CONFIG_FS_USES_64BIT_TIME
- * is enabled.
- * timespec64 data type and functions will be used at that
- * time directly and these defines will be deleted.
- */
-#define inode_timespec timespec64
-
-#define inode_timespec_compare timespec64_compare
-#define inode_timespec_equal timespec64_equal
-
-#else
-
-#define inode_timespec timespec
-
-#define inode_timespec_compare timespec_compare
-#define inode_timespec_equal timespec_equal
-
-#endif
-
 #define CURRENT_TIME64 (current_kernel_time64())
 #define CURRENT_TIME64_SEC \
 	((struct timespec64) { ktime_get_real_seconds(), 0 })
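The defines deleted above worked purely at the preprocessor level: code written against inode_timespec compiled as timespec64 code once the config was on. A minimal userspace sketch of that mechanism follows. All sketch_ names are hypothetical, and the compare helper is a simplified stand-in rather than the kernel's timespec64_compare().

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct timespec64 (hypothetical name). */
struct sketch_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Returns a negative, zero or positive value for lhs <, ==, > rhs. */
static int sketch_timespec64_compare(const struct sketch_timespec64 *lhs,
				     const struct sketch_timespec64 *rhs)
{
	if (lhs->tv_sec < rhs->tv_sec)
		return -1;
	if (lhs->tv_sec > rhs->tv_sec)
		return 1;
	if (lhs->tv_nsec < rhs->tv_nsec)
		return -1;
	if (lhs->tv_nsec > rhs->tv_nsec)
		return 1;
	return 0;
}

/* Transitional aliases in the style of the defines deleted above. */
#define sketch_inode_timespec         sketch_timespec64
#define sketch_inode_timespec_compare sketch_timespec64_compare

/* Written against the alias; the preprocessor turns it into timespec64 code. */
static int sketch_is_newer(const struct sketch_inode_timespec *a,
			   const struct sketch_inode_timespec *b)
{
	return sketch_inode_timespec_compare(a, b) > 0;
}
```

Once every caller names the 64-bit type directly, the alias layer adds nothing, which is why the patch can delete it without a functional change.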
All file system code already uses 64 bit time, so this config is no longer required.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
---
 fs/Kconfig | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index bfeefce..922893f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -8,16 +8,6 @@ menu "File systems"
 config DCACHE_WORD_ACCESS
 	bool
 
-#use 64 bit timestamps
-config FS_USES_64BIT_TIME
-	bool
-	default y
-	help
-	  Temporary configuration to switch over all file systems to
-	  use 64 bit time.
-	  Need to be enabled only after all individual file system
-	  and vfs changes are in place.
-
 if BLOCK
 
 source "fs/ext2/Kconfig"