Re: [Y2038] [RFC 02/15] vfs: Change all structures to support 64 bit time

15 Jan 2016

      On Thu, Jan 14, 2016 at 11:46:16PM +0100, Arnd Bergmann wrote:
...
On Friday 15 January 2016 08:00:01 Dave Chinner wrote:
...
On Thu, Jan 14, 2016 at 05:53:21PM +0100, Arnd Bergmann wrote:
...
On Thursday 14 January 2016 08:04:36 Dave Chinner wrote:
...
On Wed, Jan 13, 2016 at 08:33:16AM -0800, Deepa Dinamani wrote:
c) The opposite direction from b) is to first change the common
   code, but then any direct assignment between a timespec in
   a file system and the timespec64 in the inode/iattr/kstat/etc
   first needs a conversion helper so we can build cleanly,
   and then we do one file system at a time to remove them all
   again while changing the internal structures in the
   file system from timespec to timespec64.
No new helpers are necessary - we've already got the helper
functions we need. This:
int simple_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 {
        struct inode *inode = d_inode(old_dentry);

  struct inode_timespec now = current_fs_time(inode->i_sb);

  inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;

  VFS_INODE_SET_XTIME(i_ctime, inode, now);

  VFS_INODE_SET_XTIME(i_mtime, dir, now);

  VFS_INODE_SET_XTIME(i_ctime, dir, now);
  inc_nlink(inode);

.....
is just wrong. All the type conversion and clamping and checking
done in that VFS_INODE_SET_XTIME() should be done in
current_fs_time() and have it return a timespec64 directly. Indeed,
it already does truncation, and can easily be made to do range
clamping, too.  i.e.  the change should simply be:

  inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;

inode->i_ctime = dir->i_ctime = dir->i_mtime = current_fs_time(inode->i_sb);

Yes, that is the obvious case, and I guess works for at least half the
file systems when they always assign righthand side and lefthand
side of the time stamps using the external types or helpers like
CURRENT_TIME and current_fs_time().
However, there are a couple of file systems that need a bit more refactoring
before we can do this, e.g. in ntfs_truncate:
Sure, and nfs is a pain because of all it's internal use of
timespecs, too. But we have these timespec_to_timespec64 helper
functions, and that's what we should use in these cases where the
filesystem cannot support full 64 bit timestamps internally. In
those cases, they'll be telling the superblock this at mount time
things like current_fs_time() won't be returning then a timestamp
that is out of range for a 32 bit timestamp....
...
    if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
            struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
            int sync_it = 0;

            if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
                !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
                    sync_it = 1;
            VFS_I(base_ni)->i_mtime = now;
            VFS_I(base_ni)->i_ctime = now;

}
The type of the local variable must match the return code of
current_fs_time(), so if we change over i_mtime and current_fs_time
globally, this either has to be rewritten first to avoid the use of
local variables, or it needs temporary conversion helpers, or
it has to be changed in the same patch. None of those is particularly
appealing. There are a few dozen such things in various file systems.
it gets rewritten to:
struct timespec now;
now = timespec64_to_timespec(current_fs_time(VFS_I(base_ni)->i_sb));
....
and the valid timestamp range for ntfs is set to 32bit timestamps.
This then leaves it up to the filesystem developers to make
the ntfs filesystem code 64 bit timestamp clean if the on disk
format is ever changed to support 64 bit times.
Same goes for NFS, and any of the other filesystems that use struct
timespec internally for time representation.
...
...
...
...
I think it's a mix - if the timestamps come in from userspace,
fail with ERANGE. That could be controlled by sysctl via VFS
part of the ->setattr operation, or in each of the individual FS
implementations. If they come from the kernel (e.g. atime update) then
the generic behvaiour is to warn and continue, filesystems can
otherwise select their own policy for kernel updates via
->update_time.
I'd prefer not to have it done by the individual file system
implementation, so we get a consistent behavior.
Can't be helped, because different filesystems have different
timestamp behaviours, and not all use the generic VFS code for
timestamp updates. The filesystems need to use the correct helper
functions to obtain a valid current time, but you can't stop them
from storing and using arbitrary timestamp formats if they so
desire...
I mean the decision whether to clamp or error on an overflow
should be done consistently.
Sure, but that comes from using the helpers we already have, and
applying the clamping at a point where errors can be returned to
userspace. current_fs_time() should never return a timestamp outside
what the filesystem has said it supports and we've validated that
behaviour at mount time. hence it's only user provided timestamps
that we need range errors on.
...
Having a global sysctl knob or
a compile-time option is better than having each file system
implementor take a guess at what users might prefer, if we can't
come up with a behavior (e.g. clamp all the time, or error out
all the time) that everybody agrees is always correct.
filesystem implementors will use the helper funtions that are
provided for this. If they don't (like all the current use of
CURRENT_TIME), then that's a bug that needs fixing. i.e. we need
a timespec_clamp() function, similar to timespec_trunc(), and y2038k
compliant filesystems and syscalls need to use them....
...
...
...
...
Not really an option, because it means we can't use filesystems that
interop with other systems (e.g. cameras, etc) because they won't
support y2038k timestamps for a long time, if ever (e.g. vfat).
Let me clarify what my idea is here: I want a global kernel option
that disables all code that has known y2038 issues. If anyone tries
to build an embedded system with support beyond 2038, that should
disable all of those things, including file systems, drivers and
system calls, so we can reasonably assume that everything that works
today with that kernel build will keep working in the future and
not break in random ways.
It's not that black and white when it comes to filesystems. y2038k
support is determined by the on-disk structure of the filesystem
being mounted, and that is determined at mount time.  When the
filesystem mounts and sets it's valid timestamp ranges the VFS
will need to decide as to whether the filesystem is allowed to
continue mounting or not.
Some file systems are always broken around 2038 (e.g. HFS in 2040),
so if we can't fix them, I want to be able to turn them off
in Kconfig along with the 32-bit time_t syscalls.
That can be done with kconfig depends rules - it has nothing to do
with this patch set.
...
ext2/3/4, xfs and ocfs2 (maybe one or two more, I'd have to check)
currently behave in a consistent manner across 32-bit and 64-bit
architectures by allowing a range between 1902 and 2037, and we
obviously don't have a choice there but to keep that current
behavior, and extend the time format in one way or another to
store additional bits for the epoch.
That's a filesystem implementation problem, not a generic inode
timestamp problem. i.e. this is handled when the filesystem converts
the inode timestamp from a timespec64 in the struct inode to
whatever format it stores the timestamp on disk. That conversion
does not change just because the VFS inode moves from a timespec to
a timespec64. Again, those on-disk format changes to support beyond
the current epoch are outside the scope of this patchset, because
they are not affected by the timestamp format the VFS choses to use.
Cheers,
Dave.
-- 
Dave Chinner
david@fromorbit.com

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

Re: [Y2038] [RFC 02/15] vfs: Change all structures to support 64 bit time