On Thu, Mar 03, 2016 at 03:08:22PM +0100, Arnd Bergmann wrote:
On Thursday 03 March 2016 10:45:11 Dave Chinner wrote:
On Thu, Mar 03, 2016 at 12:07:42AM +0100, Arnd Bergmann wrote:
On Thursday 03 March 2016 09:19:05 Dave Chinner wrote:
On Wed, Mar 02, 2016 at 07:51:34AM -0800, Deepa Dinamani wrote:
--- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -927,6 +927,12 @@ static inline struct file *get_file(struct file *f) #define MAX_LFS_FILESIZE ((loff_t)0x7fffffffffffffffLL) #endif +#define MAX_VFS_TIME S64_MAX +#define MIN_VFS_TIME S64_MIN
These. Anything ouside these ranges is invalid.
As such, I think this is wrong for 32 bit systems as the min/max VFS times right now are S32_MAX/S32_MIN...
I think the file system should always set the actual on-disk format limits in the superblock, regardless of the word size of the CPU architecture.
Eventually, yes. But right now, 32 bit systems are limited by the platform architecture. Hence no matter what the filesystem on-disk format supports, these are going to hard limits until filesystems start modifying the defaults to indicate what their on-disk format supports.
The purpose of the patch for now is to allow file systems to set the defaults, we are not checking them yet, until all file systems are converted, hence setting the defaults to invalid.
Wrong way around, IMO. Set the defaults to be limiting first, then as filesystems are converted the limits get expanded.
How about setting the defaults for minimum and maximum to '-1'? That would allow us to detect whether file systems have set this correctly, and it would be less confusing to the reader.
Perhaps, but I'm yet to understand why we even care if filesystems have set limits or not? If the filesystem has not set it's own limits, then surely we should assume that the filesystem is not y2038k compatible and have the default limits set appropriately for that?
i.e. 32 bit systems should default to 32 bit time limits until the filesystem developers come along and verify that conversion from their on-disk format to 64 bit time works correctly. Then they set their real limits which may (or may not) be >y2038 compatible, and this means that filesystems that have not been converted to >y2038 compatible do not need any modifications to behave correctly on 32 bit systems...
We can set the limits in the superblock if you like, but I would not do it by changing the constants here:
Most file systems have arbitrary other maximum dates that they need to set (on 64-bit systems), e.g. 2028 (iso9660), 2040 (hfsplus), 2106 (ceph), 2107 (fat, cifs), 2262 (logfs), 2514 (ext4).
Yes, and that needs to be done when the filesystem is converted to use the proper timespec64 representation.
i.e. the "provide timestamp limit support" patchset starts out by limiting all times on all filesystems on all platforms to 32 bit times. Then after each filesystem is converted to use the VFS native 64 bit timestamps, we ensure that the filesystem sets it's known-valid timestamp limits in the superblock to correctly indicate the range of timestamps they support.
Hence by the end of the patchset we have the situation where filesystems that are not >y2038 compatible are limited to 32 bit timestamps, and those that can represent larger dates expose only the exact dates they can support to the VFS.
The file systems specific code should just set whatever it actually supports and then we can find a way to limit that further for 32-bit tasks.
Yes, 32 bit tasks is something for the syscall layer to care about.
But that doesn't change the fact that the *default timestamp range* should be "no >y2038 support", and it is up to the filesystem to override that with it's own limits if they are different.
We already support 32-bit time_t in compat tasks on 64-bit architectures, and we will support 64-bit time_t in normal tasks on 32-bit architectures in the future, and there is no need to range-check the timestamps written to disk.
The one range-check that we probably want for current 32-bit tasks is in the vfs_fstatat/vfs_getattr code, so a kernel that internally handles post-2038 timestamps can limit them to S32_MAX instead of wrapping.
Those should return EOVERFLOW rather than an incorrect timestamp, like we do for other 64 bit fields stat returns that can't fit in 32 bit stat fields. That's a clear indication of a process that should be using stat64() instead.
On most 32-bit architectures, stat64 still has a 32-bit time value, and even if it is 64-bit wide, glibc turns it into a 32-bit number when copying to its version of 'struct stat64' that differs from the kernel's and contains a 'struct timespec'.
So stat64 also returns EOVERLFOW to indicate to the application that it needs to use a different, 64 bit time aware syscall.
I don't care if we return EOVERFLOW or give some other indication of the overflow, but this is fundamentally a separate problem from large file support, and we need a new syscall for it.
Yes, but you seem to have missed my point that the we don't need to care about 32 bit times, whether it be syscalls or tasks, at the VFS/filesystem level. That's all a syscall layer issue. i.e. vfs_fstatat() only uses 64 bit times and only places where struct kstat contents get converted to whatever structure userspace has supplied (e.g. cp_new_stat()) need to detect/care about timestamp overflows like they already do with inode numbers and link counts.
Cheers,
Dave.