This is a mostly unchanged copy of a series I sent back in April for
an initial review. All the earlier syscall patches that Deepa or I sent
have now been merged, and this is the largest chunk of remaining patches.
Changes this time are:
- This is actually tested with the LTP syscalls test suite,
both before and after the CONFIG_64BIT_TIME change (which is not
included here). I have created a patch series for musl libc to use
64-bit time_t and change all the system calls over to the new entry
points for this. The only bugs I found during that testing were in
later parts of the conversion that I have not posted yet.
- I rewrote the sys_io_getevents conversion after the
introduction of sys_io_pgetevents. We obviously don't need to have
two of each, so we will only provide sys_io_pgetevents() with 64-bit
time_t but not sys_io_getevents(), which the libc can implement on
top of the former.
- While we have Deepa's POSIX timer conversion merged now, we
still need to decide on how we want to do the replacement
ABI for getitimer()/setitimer(). Like getrusage()/waitid() and
clock_adjtime() and unlike the system calls I'm posting here,
there is no one obvious ABI.
- For ppoll()/pselect6(), the ABI is fairly clear, but the
implementation still needs to be done. I tested with a simple
prototype based on the existing compat code, but we can
probably improve that. This is something that Deepa still
wants to work on.
- Finally, Christoph Hellwig objected[1] to the idea of reusing the
compat_ namespace for the 32-bit native case. Changing that
would be a departure from our plans so far[2], and would make
some things end up differently. Until we have decided on how this
is to be done, I've decided to not change the code for this
post. We can clearly rename all the symbols and I've implemented
that in [3] for the current linux-next (not including the
series here). This is something we can definitely do, but I'd
need to know soon whether we can merge this series unchanged
for 4.19 or if I should rebase it on top of that patch with the
alternative naming.
Arnd
---
Previous cover letter announcement below, see [4] for the full
series:
After the first timekeeping series from Deepa (merged into -tip now)
and my follow-up for IPC system calls, this is a third set of system
call conversions following the same principle.
Most of the changes are straightforward, so I'm grouping them into a
larger series even though the system calls are mostly unrelated to one
another. After this series, the remaining calls that need to be changed
are getrusage()/waitid(), pselect6/ppoll(), timer{,fd}_{get,set}time()
and getitimer()/setitimer(). Those will be sent separately, once they
have matured enough.
To put the changes into perspective, a list of all system calls that
require changes is available in a spreadsheet[5] and I have made
another experimental patch that changes over x86[6] and arm[7] to
actually use them.
Link [1] https://lore.kernel.org/lkml/20180712082034.GA8802@infradead.org/
Link [2] https://lwn.net/Articles/643234/
Link [3] https://lore.kernel.org/lkml/20180713133204.3123939-1-arnd@arndb.de/
Link [4] https://lore.kernel.org/lkml/20180425160311.2718314-1-arnd@arndb.de/
Link [5] https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T…
Link [6] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/…
Link [7] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/…
Arnd Bergmann (17):
y2038: compat: Move common compat types to asm-generic/compat.h
y2038: Remove newstat family from default syscall set
y2038: Remove stat64 family from default syscall set
asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro
asm-generic: Remove empty asm/unistd.h
y2038: Change sys_utimensat() to use __kernel_timespec
y2038: Compile utimes()/futimesat() conditionally
y2038: utimes: Rework #ifdef guards for compat syscalls
y2038: futex: Move compat implementation into futex.c
y2038: futex: Add support for __kernel_timespec
y2038: Prepare sched_rr_get_interval for __kernel_timespec
y2038: aio: Prepare sys_io_{p,}getevents for __kernel_timespec
y2038: socket: Convert recvmmsg to __kernel_timespec
y2038: socket: Add compat_sys_recvmmsg_time64
y2038: signal: Change rt_sigtimedwait to use __kernel_timespec
y2038: Make compat_sys_rt_sigtimedwait usable on 32-bit
y2038: signal: Add compat_sys_rt_sigtimedwait_time64
arch/alpha/include/asm/unistd.h | 2 +
arch/arc/include/uapi/asm/unistd.h | 1 +
arch/arm/include/asm/unistd.h | 4 +-
arch/arm64/include/asm/compat.h | 20 +--
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/c6x/include/uapi/asm/unistd.h | 1 +
arch/h8300/include/uapi/asm/unistd.h | 1 +
arch/hexagon/include/uapi/asm/unistd.h | 1 +
arch/ia64/include/asm/unistd.h | 3 +
arch/m68k/include/asm/unistd.h | 2 +-
arch/microblaze/include/asm/unistd.h | 2 +-
arch/mips/include/asm/compat.h | 22 +---
arch/mips/include/asm/unistd.h | 3 +-
arch/nds32/include/uapi/asm/unistd.h | 1 +
arch/nios2/include/uapi/asm/unistd.h | 1 +
arch/openrisc/include/uapi/asm/unistd.h | 1 +
arch/parisc/include/asm/compat.h | 18 +--
arch/parisc/include/asm/unistd.h | 3 +-
arch/powerpc/include/asm/compat.h | 18 +--
arch/powerpc/include/asm/unistd.h | 3 +-
arch/s390/include/asm/compat.h | 18 +--
arch/s390/include/asm/unistd.h | 3 +-
arch/sh/include/asm/unistd.h | 2 +-
arch/sparc/include/asm/compat.h | 19 +--
arch/sparc/include/asm/unistd.h | 3 +-
arch/unicore32/include/uapi/asm/unistd.h | 1 +
arch/x86/include/asm/compat.h | 19 +--
arch/x86/include/asm/unistd.h | 3 +-
arch/xtensa/include/asm/unistd.h | 2 +-
fs/aio.c | 77 ++++++++++--
fs/read_write.c | 2 +-
fs/stat.c | 3 +
fs/utimes.c | 59 +++++----
include/asm-generic/compat.h | 24 +++-
include/asm-generic/unistd.h | 13 --
include/linux/compat.h | 12 +-
include/linux/compat_time.h | 5 +
include/linux/futex.h | 8 --
include/linux/socket.h | 19 ++-
include/linux/syscalls.h | 25 ++--
include/uapi/asm-generic/unistd.h | 2 +
kernel/Makefile | 3 -
kernel/futex.c | 207 +++++++++++++++++++++++++++++--
kernel/futex_compat.c | 202 ------------------------------
kernel/sched/core.c | 4 +-
kernel/signal.c | 68 ++++++++--
kernel/sys_ni.c | 1 +
net/compat.c | 16 +--
net/socket.c | 55 ++++++--
50 files changed, 524 insertions(+), 461 deletions(-)
delete mode 100644 include/asm-generic/unistd.h
delete mode 100644 kernel/futex_compat.c
--
2.9.0
get_seconds() is deprecated because of the 32-bit overflow and will
be removed. All callers in ufs also truncate to a 32-bit number, so
nothing changes during the conversion. The truncation should be harmless,
as the superblock and cylinder group timestamps are not visible to user
space, except for checking the fs-dirty state, which works fine across
the overflow.
This moves the call to get_seconds() into a new inline function, with
a comment explaining the constraints, while converting it to
ktime_get_real_seconds().
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent on June 19, got an Ack but nobody picked up the
patch.
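
For anyone who wants to look at the wrap-versus-clamp choice from the
comment in ufs_get_seconds() in isolation, here is a minimal user-space
sketch. It is not part of the patch: wrap_seconds(), clamp_seconds() and
check_wrap_vs_clamp() are hypothetical names, and the mask only stands
in for what lower_32_bits() does in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap: keep only the low 32 bits of a 64-bit seconds value. */
uint32_t wrap_seconds(int64_t now)
{
	return (uint32_t)((uint64_t)now & 0xffffffffu);
}

/* Clamp: saturate at the largest unsigned 32-bit value instead. */
uint32_t clamp_seconds(int64_t now)
{
	if (now < 0)
		return 0;
	if (now > (int64_t)UINT32_MAX)
		return UINT32_MAX;
	return (uint32_t)now;
}

/* Illustrative checks around the 2^31 (y2038) and 2^32 (y2106) boundaries. */
void check_wrap_vs_clamp(void)
{
	const int64_t y2038 = INT64_C(0x80000000);	/* first value past the signed 32-bit range */
	const int64_t y2106 = INT64_C(0x100000000);	/* first value past the unsigned 32-bit range */

	/* Read as unsigned, a stamp taken past the signed limit is still exact. */
	assert(wrap_seconds(y2038) == 0x80000000u);
	assert(clamp_seconds(y2038) == 0x80000000u);

	/* Past 2^32, wrapped stamps keep changing while clamped stamps are stuck. */
	assert(wrap_seconds(y2106) == 0);
	assert(wrap_seconds(y2106 + 1) == 1);
	assert(clamp_seconds(y2106) == clamp_seconds(y2106 + 1));
}
```

The last assertion is my reading of why the comment prefers wrapping: a
saturated stamp would no longer differ between two updates, while a
wrapped one still does.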
---
fs/ufs/balloc.c | 4 ++--
fs/ufs/ialloc.c | 2 +-
fs/ufs/super.c | 4 ++--
fs/ufs/util.h | 14 ++++++++++++++
4 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c
index e727ee07dbe4..075d3d9114c8 100644
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -547,7 +547,7 @@ static u64 ufs_add_fragments(struct inode *inode, u64 fragment,
/*
* Block can be extended
*/
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
for (i = newcount; i < (uspi->s_fpb - fragoff); i++)
if (ubh_isclr (UCPI_UBH(ucpi), ucpi->c_freeoff, fragno + i))
break;
@@ -639,7 +639,7 @@ static u64 ufs_alloc_fragments(struct inode *inode, unsigned cgno,
if (!ufs_cg_chkmagic(sb, ucg))
ufs_panic (sb, "ufs_alloc_fragments",
"internal error, bad magic number on cg %u", cgno);
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
if (count == uspi->s_fpb) {
result = ufs_alloccg_block (inode, ucpi, goal, err);
diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c
index e1ef0f0a1353..c678fff2a04d 100644
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -89,7 +89,7 @@ void ufs_free_inode (struct inode * inode)
if (!ufs_cg_chkmagic(sb, ucg))
ufs_panic (sb, "ufs_free_fragments", "internal error, bad cg magic number");
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
is_directory = S_ISDIR(inode->i_mode);
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 96a20a76e3c4..f48a5b802221 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -698,7 +698,7 @@ static int ufs_sync_fs(struct super_block *sb, int wait)
usb1 = ubh_get_usb_first(uspi);
usb3 = ubh_get_usb_third(uspi);
- usb1->fs_time = cpu_to_fs32(sb, get_seconds());
+ usb1->fs_time = ufs_get_seconds(sb);
if ((flags & UFS_ST_MASK) == UFS_ST_SUN ||
(flags & UFS_ST_MASK) == UFS_ST_SUNOS ||
(flags & UFS_ST_MASK) == UFS_ST_SUNx86)
@@ -1344,7 +1344,7 @@ static int ufs_remount (struct super_block *sb, int *mount_flags,
*/
if (*mount_flags & SB_RDONLY) {
ufs_put_super_internal(sb);
- usb1->fs_time = cpu_to_fs32(sb, get_seconds());
+ usb1->fs_time = ufs_get_seconds(sb);
if ((flags & UFS_ST_MASK) == UFS_ST_SUN
|| (flags & UFS_ST_MASK) == UFS_ST_SUNOS
|| (flags & UFS_ST_MASK) == UFS_ST_SUNx86)
diff --git a/fs/ufs/util.h b/fs/ufs/util.h
index 1907be6d5808..1fd3011ea623 100644
--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -590,3 +590,17 @@ static inline int ufs_is_data_ptr_zero(struct ufs_sb_private_info *uspi,
else
return *(__fs32 *)p == 0;
}
+
+static inline __fs32 ufs_get_seconds(struct super_block *sbp)
+{
+ time64_t now = ktime_get_real_seconds();
+
+ /* Signed 32-bit interpretation wraps around in 2038, which
+ * happens in ufs1 inode stamps but not ufs2 using 64-bits
+ * stamps. For superblock and blockgroup, let's assume
+ * unsigned 32-bit stamps, which are good until y2106.
+ * Wrap around rather than clamp here to make the dirty
+ * file system detection work in the superblock stamp.
+ */
+ return cpu_to_fs32(sbp, lower_32_bits(now));
+}
--
2.9.0
The get_seconds function is deprecated now since it returns a 32-bit
value that will eventually overflow, and we are replacing it throughout
the kernel with ktime_get_seconds() or ktime_get_real_seconds() that
return a time64_t.
bcache uses get_seconds() to read the current system time and store it in
the superblock as well as in uuid_entry structures that are user visible.
Unfortunately, the two structures are still limited to 32 bits, so this
won't fix any real problems, and the fields will still overflow in year
2106. Let's at least document that properly, so that it can be fixed if
we get an updated format in the future. We still have a long time before
the overflow, and checking the tools at
https://github.com/koverstreet/bcache-tools
reveals no access to any of them.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
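As a cross-check of the y2106 figure in the commit message, here is a
small user-space sketch that converts a seconds count to a calendar date
and shows what a __u32 field keeps of a 64-bit value. It is not part of
the patch: struct civil_date, date_from_seconds(), stored_stamp() and
check_overflow_date() are hypothetical names, and the date conversion is
the usual days-to-civil arithmetic for the proleptic Gregorian calendar,
valid here for non-negative inputs only.

```c
#include <assert.h>
#include <stdint.h>

struct civil_date {
	int64_t year;
	unsigned int month;	/* 1..12 */
	unsigned int day;	/* 1..31 */
};

/* Convert non-negative seconds since 1970-01-01 UTC to a calendar date. */
struct civil_date date_from_seconds(int64_t secs)
{
	int64_t z = secs / 86400 + 719468;	/* days since 0000-03-01 */
	int64_t era = z / 146097;		/* completed 400-year cycles */
	unsigned int doe = (unsigned int)(z - era * 146097);
	unsigned int yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
	unsigned int doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
	unsigned int mp = (5 * doy + 2) / 153;	/* month index, March = 0 */
	struct civil_date d;

	d.day = doy - (153 * mp + 2) / 5 + 1;
	d.month = mp < 10 ? mp + 3 : mp - 9;
	d.year = (int64_t)yoe + era * 400 + (d.month <= 2);
	return d;
}

/* What a 32-bit on-disk field keeps of a 64-bit seconds value. */
uint32_t stored_stamp(int64_t now)
{
	return (uint32_t)now;
}

/* Illustrative checks: the last representable second and the first wrapped one. */
void check_overflow_date(void)
{
	struct civil_date last = date_from_seconds(INT64_C(0xffffffff));

	assert(last.year == 2106 && last.month == 2 && last.day == 7);
	assert(stored_stamp(INT64_C(0xffffffff)) == 0xffffffffu);
	assert(stored_stamp(INT64_C(0x100000000)) == 0);
}
```

So an unsigned reading of these fields stays exact until 2106-02-07, and
the stored value restarts from zero on the following second.
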
drivers/md/bcache/super.c | 12 ++++++------
include/uapi/linux/bcache.h | 4 ++--
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa4058e43202..74746d8ee05e 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -181,7 +181,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
goto err;
}
- sb->last_mount = get_seconds();
+ sb->last_mount = (u32)ktime_get_real_seconds();
err = NULL;
get_page(bh->b_page);
@@ -701,7 +701,7 @@ static void bcache_device_detach(struct bcache_device *d)
SET_UUID_FLASH_ONLY(u, 0);
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32((u32)ktime_get_real_seconds());
bch_uuid_write(d->c);
}
@@ -1027,7 +1027,7 @@ void bch_cached_dev_detach(struct cached_dev *dc)
int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
uint8_t *set_uuid)
{
- uint32_t rtime = cpu_to_le32(get_seconds());
+ uint32_t rtime = cpu_to_le32((u32)ktime_get_real_seconds());
struct uuid_entry *u;
struct cached_dev *exist_dc, *t;
@@ -1070,7 +1070,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
(BDEV_STATE(&dc->sb) == BDEV_STATE_STALE ||
BDEV_STATE(&dc->sb) == BDEV_STATE_NONE)) {
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32((u32)ktime_get_real_seconds());
u = NULL;
}
@@ -1390,7 +1390,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
get_random_bytes(u->uuid, 16);
memset(u->label, 0, 32);
- u->first_reg = u->last_reg = cpu_to_le32(get_seconds());
+ u->first_reg = u->last_reg = cpu_to_le32((u32)ktime_get_real_seconds());
SET_UUID_FLASH_ONLY(u, 1);
u->sectors = size >> 9;
@@ -1894,7 +1894,7 @@ static void run_cache_set(struct cache_set *c)
goto err;
closure_sync(&cl);
- c->sb.last_mount = get_seconds();
+ c->sb.last_mount = (u32)ktime_get_real_seconds();
bcache_write_super(c);
list_for_each_entry_safe(dc, t, &uncached_devices, list)
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 821f71a2e48f..8d19e02d752a 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -195,7 +195,7 @@ struct cache_sb {
};
};
- __u32 last_mount; /* time_t */
+ __u32 last_mount; /* time overflow in y2106 */
__u16 first_bucket;
union {
@@ -318,7 +318,7 @@ struct uuid_entry {
struct {
__u8 uuid[16];
__u8 label[32];
- __u32 first_reg;
+ __u32 first_reg; /* time overflow in y2106 */
__u32 last_reg;
__u32 invalidated;
--
2.9.0
get_seconds() is deprecated in favor of ktime_get_real_seconds(),
which returns a 64-bit timestamp.
In the SYSV file system, the superblock timestamp is only 32 bits
wide, and it is used to check whether a file system is clean, so
the best solution seems to be to force a wraparound and explicitly
convert it to an unsigned 32-bit value.
This is independent of the inode timestamps, which are also 32 bits
wide on disk and come from current_time().
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent on Jun 19, got an Ack but no other reply.
Christoph apparently hasn't applied any sysvfs patches in many years,
so I'd like someone else to take this one.
Al or Andrew, could you take this patch for 4.19 as well?
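
To illustrate why forcing the arithmetic to unsigned 32 bits keeps the
state check consistent, here is a minimal user-space sketch of the
update done in sysv_sync_fs(). It is not part of the patch: the macro
and function names are hypothetical, only the constant 0x7c269d38 is
taken from the code, and byte-order conversion is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Constant from fs/sysv/inode.c; the macro name is made up for this sketch. */
#define SYSV_STATE_MAGIC 0x7c269d38u

/* State value that matches a given superblock time, modulo 2^32. */
uint32_t sysv_state_for(uint32_t sb_time)
{
	return SYSV_STATE_MAGIC - sb_time;
}

/*
 * Mirror of the update step: if the old state matched the old time,
 * re-derive it for the new, truncated time; otherwise leave it alone.
 */
uint32_t sysv_restamp(uint32_t state, uint32_t old_time, int64_t now64)
{
	uint32_t now = (uint32_t)now64;	/* forced wraparound */

	if (state == sysv_state_for(old_time))
		state = sysv_state_for(now);
	return state;
}

/* Illustrative checks on both sides of the two wraparound points. */
void check_sysv_state(void)
{
	uint32_t old_time = 0x7c269d37u;		/* one second below the constant */
	uint32_t state = sysv_state_for(old_time);	/* == 1 */

	/* Time above the constant: the subtraction wraps below zero. */
	state = sysv_restamp(state, old_time, INT64_C(0x7c269d39));
	assert(state == 0xffffffffu);
	assert(state == sysv_state_for(0x7c269d39u));

	/* Time past 2^32: the truncated time is small again. */
	state = sysv_restamp(state, 0x7c269d39u, INT64_C(0x100000010));
	assert(state == sysv_state_for(0x10u));

	/* A state that did not match the old time is not touched. */
	assert(sysv_restamp(0x1234u, old_time, INT64_C(0x7c269d39)) == 0x1234u);
}
```

Since both the stored time and the stored state are reduced modulo 2^32
in the same way, the comparison keeps matching across either wraparound.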
---
fs/sysv/inode.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 47f66bbc4578..e8927ea70d12 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -35,7 +35,7 @@
static int sysv_sync_fs(struct super_block *sb, int wait)
{
struct sysv_sb_info *sbi = SYSV_SB(sb);
- unsigned long time = get_seconds(), old_time;
+ u32 time = (u32)ktime_get_real_seconds(), old_time;
mutex_lock(&sbi->s_lock);
@@ -46,8 +46,8 @@ static int sysv_sync_fs(struct super_block *sb, int wait)
*/
old_time = fs32_to_cpu(sbi, *sbi->s_sb_time);
if (sbi->s_type == FSTYPE_SYSV4) {
- if (*sbi->s_sb_state == cpu_to_fs32(sbi, 0x7c269d38 - old_time))
- *sbi->s_sb_state = cpu_to_fs32(sbi, 0x7c269d38 - time);
+ if (*sbi->s_sb_state == cpu_to_fs32(sbi, 0x7c269d38u - old_time))
+ *sbi->s_sb_state = cpu_to_fs32(sbi, 0x7c269d38u - time);
*sbi->s_sb_time = cpu_to_fs32(sbi, time);
mark_buffer_dirty(sbi->s_bh2);
}
--
2.9.0