This is a series of cleanups for the y2038 work, mostly intended for namespace cleaning: the kernel defines the traditional time_t, timeval and timespec types that often lead to y2038-unsafe code. Even though the unsafe usage is mostly gone from the kernel, having the types and associated functions around means that we can still grow new users, and that we may be missing conversions to safe types that actually matter.
As there is no rush on any of these patches, I would either queue them up in linux-next through my y2038 branch, or Thomas could add them to the tip tree if he wants.
As mentioned in another series, this is part of a larger effort to fix all the remaining bits and pieces that are not completed yet from the y2038 conversion, and the full set can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=y...
Maintainers, please review and provide Acks.
Let me know if you have any opinion on whether we should do the include last two patches of this series or not.
Arnd
Arnd Bergmann (23): y2038: remove CONFIG_64BIT_TIME y2038: add __kernel_old_timespec and __kernel_old_time_t y2038: vdso: change timeval to __kernel_old_timeval y2038: vdso: change timespec to __kernel_old_timespec y2038: vdso: change time_t to __kernel_old_time_t y2038: vdso: nds32: open-code timespec_add_ns() y2038: vdso: powerpc: avoid timespec references y2038: ipc: remove __kernel_time_t reference from headers y2038: stat: avoid 'time_t' in 'struct stat' y2038: uapi: change __kernel_time_t to __kernel_old_time_t y2038: rusage: use __kernel_old_timeval y2038: syscalls: change remaining timeval to __kernel_old_timeval y2038: socket: remove timespec reference in timestamping y2038: make ns_to_compat_timeval use __kernel_old_timeval y2038: elfcore: Use __kernel_old_timeval for process times y2038: timerfd: Use timespec64 internally y2038: time: avoid timespec usage in settimeofday() y2038: itimer: compat handling to itimer.c y2038: use compat_{get,set}_itimer on alpha y2038: move itimer reset into itimer.c y2038: itimer: change implementation to timespec64 [RFC] y2038: itimer: use ktime_t internally y2038: allow disabling time32 system calls
arch/Kconfig | 11 +- arch/alpha/kernel/osf_sys.c | 67 +----- arch/alpha/kernel/syscalls/syscall.tbl | 4 +- arch/ia64/kernel/asm-offsets.c | 2 +- arch/mips/include/uapi/asm/msgbuf.h | 6 +- arch/mips/include/uapi/asm/sembuf.h | 4 +- arch/mips/include/uapi/asm/shmbuf.h | 6 +- arch/mips/include/uapi/asm/stat.h | 16 +- arch/mips/kernel/binfmt_elfn32.c | 4 +- arch/mips/kernel/binfmt_elfo32.c | 4 +- arch/nds32/kernel/vdso/gettimeofday.c | 61 +++-- arch/parisc/include/uapi/asm/msgbuf.h | 6 +- arch/parisc/include/uapi/asm/sembuf.h | 4 +- arch/parisc/include/uapi/asm/shmbuf.h | 6 +- arch/powerpc/include/asm/asm-prototypes.h | 3 +- arch/powerpc/include/asm/vdso_datapage.h | 6 +- arch/powerpc/include/uapi/asm/msgbuf.h | 6 +- arch/powerpc/include/uapi/asm/sembuf.h | 4 +- arch/powerpc/include/uapi/asm/shmbuf.h | 6 +- arch/powerpc/include/uapi/asm/stat.h | 2 +- arch/powerpc/kernel/asm-offsets.c | 18 +- arch/powerpc/kernel/syscalls.c | 4 +- arch/powerpc/kernel/time.c | 5 +- arch/powerpc/kernel/vdso32/gettimeofday.S | 6 +- arch/powerpc/kernel/vdso64/gettimeofday.S | 8 +- arch/sparc/include/uapi/asm/msgbuf.h | 6 +- arch/sparc/include/uapi/asm/sembuf.h | 4 +- arch/sparc/include/uapi/asm/shmbuf.h | 6 +- arch/sparc/include/uapi/asm/stat.h | 24 +- arch/sparc/vdso/vclock_gettime.c | 36 +-- arch/x86/entry/vdso/vclock_gettime.c | 6 +- arch/x86/entry/vsyscall/vsyscall_64.c | 4 +- arch/x86/include/uapi/asm/msgbuf.h | 6 +- arch/x86/include/uapi/asm/sembuf.h | 4 +- arch/x86/include/uapi/asm/shmbuf.h | 6 +- arch/x86/um/vdso/um_vdso.c | 12 +- fs/aio.c | 2 +- fs/binfmt_elf.c | 12 +- fs/binfmt_elf_fdpic.c | 12 +- fs/compat_binfmt_elf.c | 4 +- fs/select.c | 10 +- fs/timerfd.c | 14 +- fs/utimes.c | 8 +- include/linux/compat.h | 19 +- include/linux/syscalls.h | 16 +- include/linux/time.h | 9 +- include/linux/time32.h | 2 +- include/linux/types.h | 2 +- include/trace/events/timer.h | 29 +-- include/uapi/asm-generic/msgbuf.h | 12 +- include/uapi/asm-generic/posix_types.h | 1 + include/uapi/asm-generic/sembuf.h | 7 +- include/uapi/asm-generic/shmbuf.h | 12 +- include/uapi/linux/cyclades.h | 6 +- include/uapi/linux/elfcore.h | 8 +- include/uapi/linux/errqueue.h | 7 + include/uapi/linux/msg.h | 6 +- include/uapi/linux/ppp_defs.h | 4 +- include/uapi/linux/resource.h | 4 +- include/uapi/linux/sem.h | 4 +- include/uapi/linux/shm.h | 6 +- include/uapi/linux/time.h | 6 +- include/uapi/linux/time_types.h | 5 + include/uapi/linux/utime.h | 4 +- ipc/syscall.c | 2 +- kernel/compat.c | 24 -- kernel/power/power.h | 2 +- kernel/sys.c | 4 +- kernel/sys_ni.c | 23 ++ kernel/time/hrtimer.c | 2 +- kernel/time/itimer.c | 280 ++++++++++++++-------- kernel/time/time.c | 32 ++- lib/vdso/gettimeofday.c | 4 +- net/core/scm.c | 6 +- net/socket.c | 2 +- security/selinux/hooks.c | 10 +- 76 files changed, 501 insertions(+), 504 deletions(-)
The CONFIG_64BIT_TIME option is defined on all architectures, and can be removed for simplicity now.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/Kconfig | 8 -------- fs/aio.c | 2 +- ipc/syscall.c | 2 +- kernel/time/hrtimer.c | 2 +- kernel/time/time.c | 4 ++-- net/socket.c | 2 +- 6 files changed, 6 insertions(+), 14 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig index 5f8a5d84dbbe..0e1fded2940e 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -796,14 +796,6 @@ config OLD_SIGACTION config COMPAT_OLD_SIGACTION bool
-config 64BIT_TIME - def_bool y - help - This should be selected by all architectures that need to support - new system calls with a 64-bit time_t. This is relevant on all 32-bit - architectures, and 64-bit architectures as part of compat syscall - handling. - config COMPAT_32BIT_TIME def_bool !64BIT || COMPAT help diff --git a/fs/aio.c b/fs/aio.c index 01e0fb9ae45a..447e3a0c572c 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -2056,7 +2056,7 @@ static long do_io_getevents(aio_context_t ctx_id, * specifies an infinite timeout. Note that the timeout pointed to by * timeout is relative. Will fail with -ENOSYS if not implemented. */ -#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT) +#ifdef CONFIG_64BIT
SYSCALL_DEFINE5(io_getevents, aio_context_t, ctx_id, long, min_nr, diff --git a/ipc/syscall.c b/ipc/syscall.c index 581bdff4e7c5..dfb0e988d542 100644 --- a/ipc/syscall.c +++ b/ipc/syscall.c @@ -30,7 +30,7 @@ int ksys_ipc(unsigned int call, int first, unsigned long second, return ksys_semtimedop(first, (struct sembuf __user *)ptr, second, NULL); case SEMTIMEDOP: - if (IS_ENABLED(CONFIG_64BIT) || !IS_ENABLED(CONFIG_64BIT_TIME)) + if (IS_ENABLED(CONFIG_64BIT)) return ksys_semtimedop(first, ptr, second, (const struct __kernel_timespec __user *)fifth); else if (IS_ENABLED(CONFIG_COMPAT_32BIT_TIME)) diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 65605530ee34..9e20873148c6 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -1940,7 +1940,7 @@ long hrtimer_nanosleep(const struct timespec64 *rqtp, return ret; }
-#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT) +#ifdef CONFIG_64BIT
SYSCALL_DEFINE2(nanosleep, struct __kernel_timespec __user *, rqtp, struct __kernel_timespec __user *, rmtp) diff --git a/kernel/time/time.c b/kernel/time/time.c index 45a358953f09..ddbddf504c23 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -267,7 +267,7 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, } #endif
-#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT) +#ifdef CONFIG_64BIT SYSCALL_DEFINE1(adjtimex, struct __kernel_timex __user *, txc_p) { struct __kernel_timex txc; /* Local copy of parameter */ @@ -884,7 +884,7 @@ int get_timespec64(struct timespec64 *ts, ts->tv_sec = kts.tv_sec;
/* Zero out the padding for 32 bit systems or in compat mode */ - if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall()) + if (in_compat_syscall()) kts.tv_nsec &= 0xFFFFFFFFUL;
ts->tv_nsec = kts.tv_nsec; diff --git a/net/socket.c b/net/socket.c index 6a9ab7a8b1d2..98f6544b0096 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2833,7 +2833,7 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args) a[2], true); break; case SYS_RECVMMSG: - if (IS_ENABLED(CONFIG_64BIT) || !IS_ENABLED(CONFIG_64BIT_TIME)) + if (IS_ENABLED(CONFIG_64BIT)) err = __sys_recvmmsg(a0, (struct mmsghdr __user *)a1, a[2], a[3], (struct __kernel_timespec __user *)a[4],
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -267,7 +267,7 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, } #endif -#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT) +#ifdef CONFIG_64BIT SYSCALL_DEFINE1(adjtimex, struct __kernel_timex __user *, txc_p) { struct __kernel_timex txc; /* Local copy of parameter */ @@ -884,7 +884,7 @@ int get_timespec64(struct timespec64 *ts, ts->tv_sec = kts.tv_sec; /* Zero out the padding for 32 bit systems or in compat mode */
- if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
- if (in_compat_syscall()) kts.tv_nsec &= 0xFFFFFFFFUL;
ts->tv_nsec = kts.tv_nsec;
[...]
It's not a problem with this patch, but I noticed that this condition doesn't match what the comment says. It looks like it was broken by:
commit 98f76206b33504b934209d16196477dfa519a807 Author: Dmitry Safonov dima@arista.com Date: Fri Oct 12 14:42:53 2018 +0100
compat: Cleanup in_compat_syscall() callers
Ben.
On 11/20/19 10:28 PM, Ben Hutchings wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -267,7 +267,7 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, } #endif -#if !defined(CONFIG_64BIT_TIME) || defined(CONFIG_64BIT) +#ifdef CONFIG_64BIT SYSCALL_DEFINE1(adjtimex, struct __kernel_timex __user *, txc_p) { struct __kernel_timex txc; /* Local copy of parameter */ @@ -884,7 +884,7 @@ int get_timespec64(struct timespec64 *ts, ts->tv_sec = kts.tv_sec; /* Zero out the padding for 32 bit systems or in compat mode */
- if (IS_ENABLED(CONFIG_64BIT_TIME) && in_compat_syscall())
- if (in_compat_syscall()) kts.tv_nsec &= 0xFFFFFFFFUL;
ts->tv_nsec = kts.tv_nsec;
[...]
It's not a problem with this patch, but I noticed that this condition doesn't match what the comment says. It looks like it was broken by:
commit 98f76206b33504b934209d16196477dfa519a807 Author: Dmitry Safonov dima@arista.com Date: Fri Oct 12 14:42:53 2018 +0100
compat: Cleanup in_compat_syscall() callers
Ugh, you right. I've failed to read the condition and thought it's related to CONFIG_COMPAT :( I'll send a fix shortly, thanks for spotting this!
The 'struct timespec' definition can no longer be part of the uapi headers because it conflicts with a a now incompatible libc definition. Also, we really want to remove it in order to prevent new uses from creeping in.
The same namespace conflict exists with time_t, which should also be removed. __kernel_time_t could be used safely, but adding 'old' in the name makes it clearer that this should not be used for new interfaces.
Add a replacement __kernel_old_timespec structure and __kernel_old_time_t along the lines of __kernel_old_timeval.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/uapi/asm-generic/posix_types.h | 1 + include/uapi/linux/time_types.h | 5 +++++ 2 files changed, 6 insertions(+)
diff --git a/include/uapi/asm-generic/posix_types.h b/include/uapi/asm-generic/posix_types.h index f0733a26ebfc..2f9c80595ba7 100644 --- a/include/uapi/asm-generic/posix_types.h +++ b/include/uapi/asm-generic/posix_types.h @@ -86,6 +86,7 @@ typedef struct { */ typedef __kernel_long_t __kernel_off_t; typedef long long __kernel_loff_t; +typedef __kernel_long_t __kernel_old_time_t; typedef __kernel_long_t __kernel_time_t; typedef long long __kernel_time64_t; typedef __kernel_long_t __kernel_clock_t; diff --git a/include/uapi/linux/time_types.h b/include/uapi/linux/time_types.h index 27bfc8fc6904..60b37f29842d 100644 --- a/include/uapi/linux/time_types.h +++ b/include/uapi/linux/time_types.h @@ -28,6 +28,11 @@ struct __kernel_old_timeval { }; #endif
+struct __kernel_old_timespec { + __kernel_time_t tv_sec; /* seconds */ + long tv_nsec; /* nanoseconds */ +}; + struct __kernel_sock_timeval { __s64 tv_sec; __s64 tv_usec;
Acked-by: Deepa Dinamani deepa.kernel@gmail.com
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
The 'struct timespec' definition can no longer be part of the uapi headers because it conflicts with a a now incompatible libc definition. Also, we really want to remove it in order to prevent new uses from creeping in.
The same namespace conflict exists with time_t, which should also be removed. __kernel_time_t could be used safely, but adding 'old' in the name makes it clearer that this should not be used for new interfaces.
Add a replacement __kernel_old_timespec structure and __kernel_old_time_t along the lines of __kernel_old_timeval.
[...]
--- a/include/uapi/linux/time_types.h +++ b/include/uapi/linux/time_types.h @@ -28,6 +28,11 @@ struct __kernel_old_timeval { }; #endif +struct __kernel_old_timespec {
- __kernel_time_t tv_sec; /* seconds */
Should this be __kernel_old_time_t for consistency?
Ben.
- long tv_nsec; /* nanoseconds */
+};
struct __kernel_sock_timeval { __s64 tv_sec; __s64 tv_usec;
On Wed, Nov 20, 2019 at 11:30 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
The 'struct timespec' definition can no longer be part of the uapi headers because it conflicts with a a now incompatible libc definition. Also, we really want to remove it in order to prevent new uses from creeping in.
The same namespace conflict exists with time_t, which should also be removed. __kernel_time_t could be used safely, but adding 'old' in the name makes it clearer that this should not be used for new interfaces.
Add a replacement __kernel_old_timespec structure and __kernel_old_time_t along the lines of __kernel_old_timeval.
[...]
--- a/include/uapi/linux/time_types.h +++ b/include/uapi/linux/time_types.h @@ -28,6 +28,11 @@ struct __kernel_old_timeval { }; #endif
+struct __kernel_old_timespec {
__kernel_time_t tv_sec; /* seconds */
Should this be __kernel_old_time_t for consistency?
Yes. I had already noticed this and changed it in the current version of "y2038: uapi: change __kernel_time_t to __kernel_old_time_t".
Arnd
The gettimeofday() function in vdso uses the traditional 'timeval' structure layout, which will be incompatible with future versions of glibc on 32-bit architectures that use a 64-bit time_t.
This interface is problematic for y2038, when time_t overflows on 32-bit architectures, but the plan so far is that a libc with 64-bit time_t will not call into the gettimeofday() vdso helper at all, and only have a method for entering clock_gettime(). This means we don't have to fix it here, though we probably want to add a new clock_gettime() entry point using a 64-bit version of 'struct timespec' at some point.
Changing the vdso code to use __kernel_old_timeval helps isolate this usage from the other ones that still need to be fixed properly, and it gets us closer to removing the 'timeval' definition from the kernel sources.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/nds32/kernel/vdso/gettimeofday.c | 6 +++--- arch/powerpc/kernel/asm-offsets.c | 8 ++++---- arch/sparc/vdso/vclock_gettime.c | 12 ++++++------ arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/um/vdso/um_vdso.c | 4 ++-- 5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/nds32/kernel/vdso/gettimeofday.c b/arch/nds32/kernel/vdso/gettimeofday.c index b02581891c33..1e69fd5b067b 100644 --- a/arch/nds32/kernel/vdso/gettimeofday.c +++ b/arch/nds32/kernel/vdso/gettimeofday.c @@ -230,10 +230,10 @@ notrace int __vdso_clock_getres(clockid_t clk_id, struct timespec *res) return 0; }
-static notrace inline int gettimeofday_fallback(struct timeval *_tv, +static notrace inline int gettimeofday_fallback(struct __kernel_old_timeval *_tv, struct timezone *_tz) { - register struct timeval *tv asm("$r0") = _tv; + register struct __kernel_old_timeval *tv asm("$r0") = _tv; register struct timezone *tz asm("$r1") = _tz; register int ret asm("$r0");
@@ -246,7 +246,7 @@ static notrace inline int gettimeofday_fallback(struct timeval *_tv, return ret; }
-notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) +notrace int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { struct timespec ts; struct vdso_data *vdata; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 484f54dab247..827f4c354e13 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -393,8 +393,8 @@ int main(void) OFFSET(CFG_DCACHE_LOGBLOCKSZ, vdso_data, dcache_log_block_size); #ifdef CONFIG_PPC64 OFFSET(CFG_SYSCALL_MAP64, vdso_data, syscall_map_64); - OFFSET(TVAL64_TV_SEC, timeval, tv_sec); - OFFSET(TVAL64_TV_USEC, timeval, tv_usec); + OFFSET(TVAL64_TV_SEC, __kernel_old_timeval, tv_sec); + OFFSET(TVAL64_TV_USEC, __kernel_old_timeval, tv_usec); OFFSET(TVAL32_TV_SEC, old_timeval32, tv_sec); OFFSET(TVAL32_TV_USEC, old_timeval32, tv_usec); OFFSET(TSPC64_TV_SEC, timespec, tv_sec); @@ -402,8 +402,8 @@ int main(void) OFFSET(TSPC32_TV_SEC, old_timespec32, tv_sec); OFFSET(TSPC32_TV_NSEC, old_timespec32, tv_nsec); #else - OFFSET(TVAL32_TV_SEC, timeval, tv_sec); - OFFSET(TVAL32_TV_USEC, timeval, tv_usec); + OFFSET(TVAL32_TV_SEC, __kernel_old_timeval, tv_sec); + OFFSET(TVAL32_TV_USEC, __kernel_old_timeval, tv_usec); OFFSET(TSPC32_TV_SEC, timespec, tv_sec); OFFSET(TSPC32_TV_NSEC, timespec, tv_nsec); #endif diff --git a/arch/sparc/vdso/vclock_gettime.c b/arch/sparc/vdso/vclock_gettime.c index fc5bdd14de76..a20c5030578d 100644 --- a/arch/sparc/vdso/vclock_gettime.c +++ b/arch/sparc/vdso/vclock_gettime.c @@ -74,7 +74,7 @@ notrace static long vdso_fallback_gettime(long clock, struct timespec *ts) return o0; }
-notrace static long vdso_fallback_gettimeofday(struct timeval *tv, struct timezone *tz) +notrace static long vdso_fallback_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { register long num __asm__("g1") = __NR_gettimeofday; register long o0 __asm__("o0") = (long) tv; @@ -304,7 +304,7 @@ __vdso_clock_gettime_stick(clockid_t clock, struct timespec *ts) }
notrace int -__vdso_gettimeofday(struct timeval *tv, struct timezone *tz) +__vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { struct vvar_data *vvd = get_vvar_data();
@@ -312,7 +312,7 @@ __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) if (likely(tv != NULL)) { union tstv_t { struct timespec ts; - struct timeval tv; + struct __kernel_old_timeval tv; } *tstv = (union tstv_t *) tv; do_realtime(vvd, &tstv->ts); /* @@ -336,11 +336,11 @@ __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) return vdso_fallback_gettimeofday(tv, tz); } int -gettimeofday(struct timeval *, struct timezone *) +gettimeofday(struct __kernel_old_timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday")));
notrace int -__vdso_gettimeofday_stick(struct timeval *tv, struct timezone *tz) +__vdso_gettimeofday_stick(struct __kernel_old_timeval *tv, struct timezone *tz) { struct vvar_data *vvd = get_vvar_data();
@@ -348,7 +348,7 @@ __vdso_gettimeofday_stick(struct timeval *tv, struct timezone *tz) if (likely(tv != NULL)) { union tstv_t { struct timespec ts; - struct timeval tv; + struct __kernel_old_timeval tv; } *tstv = (union tstv_t *) tv; do_realtime_stick(vvd, &tstv->ts); /* diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index e7c596dea947..76e62bcb8d87 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -184,7 +184,7 @@ bool emulate_vsyscall(unsigned long error_code, */ switch (vsyscall_nr) { case 0: - if (!write_ok_or_segv(regs->di, sizeof(struct timeval)) || + if (!write_ok_or_segv(regs->di, sizeof(struct __kernel_old_timeval)) || !write_ok_or_segv(regs->si, sizeof(struct timezone))) { ret = -EFAULT; goto check_fault; diff --git a/arch/x86/um/vdso/um_vdso.c b/arch/x86/um/vdso/um_vdso.c index 891868756a51..845336c11364 100644 --- a/arch/x86/um/vdso/um_vdso.c +++ b/arch/x86/um/vdso/um_vdso.c @@ -25,7 +25,7 @@ int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) int clock_gettime(clockid_t, struct timespec *) __attribute__((weak, alias("__vdso_clock_gettime")));
-int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz) +int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { long ret;
@@ -34,7 +34,7 @@ int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
return ret; } -int gettimeofday(struct timeval *, struct timezone *) +int gettimeofday(struct __kernel_old_timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday")));
time_t __vdso_time(time_t *t)
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
The gettimeofday() function in vdso uses the traditional 'timeval' structure layout, which will be incompatible with future versions of glibc on 32-bit architectures that use a 64-bit time_t.
This interface is problematic for y2038, when time_t overflows on 32-bit architectures, but the plan so far is that a libc with 64-bit time_t will not call into the gettimeofday() vdso helper at all, and only have a method for entering clock_gettime(). This means we don't have to fix it here, though we probably want to add a new clock_gettime() entry point using a 64-bit version of 'struct timespec' at some point.
Changing the vdso code to use __kernel_old_timeval helps isolate this usage from the other ones that still need to be fixed properly, and it gets us closer to removing the 'timeval' definition from the kernel sources.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Thomas Gleixner tglx@linutronix.de
In order to remove 'timespec' completely from the kernel, all internal uses should be converted to a y2038-safe type, while those that are only for compatibity with existing user space should be marked appropriately.
Change vdso to use __kernel_old_timespec in order to avoid the deprecated type and mark these interfaces as outdated.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/ia64/kernel/asm-offsets.c | 2 +- arch/nds32/kernel/vdso/gettimeofday.c | 22 +++++++++++----------- arch/sparc/vdso/vclock_gettime.c | 24 ++++++++++++------------ arch/x86/um/vdso/um_vdso.c | 4 ++-- 4 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/arch/ia64/kernel/asm-offsets.c b/arch/ia64/kernel/asm-offsets.c index 00e8e2a1eb19..fb0deb8a4221 100644 --- a/arch/ia64/kernel/asm-offsets.c +++ b/arch/ia64/kernel/asm-offsets.c @@ -211,7 +211,7 @@ void foo(void) offsetof (struct cpuinfo_ia64, ptce_stride)); BLANK(); DEFINE(IA64_TIMESPEC_TV_NSEC_OFFSET, - offsetof (struct timespec, tv_nsec)); + offsetof (struct __kernel_old_timespec, tv_nsec)); DEFINE(IA64_TIME_SN_SPEC_SNSEC_OFFSET, offsetof (struct time_sn_spec, snsec));
diff --git a/arch/nds32/kernel/vdso/gettimeofday.c b/arch/nds32/kernel/vdso/gettimeofday.c index 1e69fd5b067b..687abc7145f5 100644 --- a/arch/nds32/kernel/vdso/gettimeofday.c +++ b/arch/nds32/kernel/vdso/gettimeofday.c @@ -48,9 +48,9 @@ static notrace int vdso_read_retry(const struct vdso_data *vdata, u32 start) }
static notrace long clock_gettime_fallback(clockid_t _clkid, - struct timespec *_ts) + struct __kernel_old_timespec *_ts) { - register struct timespec *ts asm("$r1") = _ts; + register struct __kernel_old_timespec *ts asm("$r1") = _ts; register clockid_t clkid asm("$r0") = _clkid; register long ret asm("$r0");
@@ -63,7 +63,7 @@ static notrace long clock_gettime_fallback(clockid_t _clkid, return ret; }
-static notrace int do_realtime_coarse(struct timespec *ts, +static notrace int do_realtime_coarse(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { u32 seq; @@ -78,7 +78,7 @@ static notrace int do_realtime_coarse(struct timespec *ts, return 0; }
-static notrace int do_monotonic_coarse(struct timespec *ts, +static notrace int do_monotonic_coarse(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { struct timespec tomono; @@ -115,7 +115,7 @@ static notrace inline u64 vgetsns(struct vdso_data *vdso) return ((u64) cycle_delta & vdso->cs_mask) * vdso->cs_mult; }
-static notrace int do_realtime(struct timespec *ts, struct vdso_data *vdata) +static notrace int do_realtime(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { unsigned count; u64 ns; @@ -133,7 +133,7 @@ static notrace int do_realtime(struct timespec *ts, struct vdso_data *vdata) return 0; }
-static notrace int do_monotonic(struct timespec *ts, struct vdso_data *vdata) +static notrace int do_monotonic(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { struct timespec tomono; u64 nsecs; @@ -158,7 +158,7 @@ static notrace int do_monotonic(struct timespec *ts, struct vdso_data *vdata) return 0; }
-notrace int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts) +notrace int __vdso_clock_gettime(clockid_t clkid, struct __kernel_old_timespec *ts) { struct vdso_data *vdata; int ret = -1; @@ -191,10 +191,10 @@ notrace int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts) }
static notrace int clock_getres_fallback(clockid_t _clk_id, - struct timespec *_res) + struct __kernel_old_timespec *_res) { register clockid_t clk_id asm("$r0") = _clk_id; - register struct timespec *res asm("$r1") = _res; + register struct __kernel_old_timespec *res asm("$r1") = _res; register int ret asm("$r0");
asm volatile ("movi $r15, %3\n" @@ -206,7 +206,7 @@ static notrace int clock_getres_fallback(clockid_t _clk_id, return ret; }
-notrace int __vdso_clock_getres(clockid_t clk_id, struct timespec *res) +notrace int __vdso_clock_getres(clockid_t clk_id, struct __kernel_old_timespec *res) { struct vdso_data *vdata = __get_datapage();
@@ -248,7 +248,7 @@ static notrace inline int gettimeofday_fallback(struct __kernel_old_timeval *_tv
notrace int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { - struct timespec ts; + struct __kernel_old_timespec ts; struct vdso_data *vdata; int ret;
diff --git a/arch/sparc/vdso/vclock_gettime.c b/arch/sparc/vdso/vclock_gettime.c index a20c5030578d..e794edde6755 100644 --- a/arch/sparc/vdso/vclock_gettime.c +++ b/arch/sparc/vdso/vclock_gettime.c @@ -63,7 +63,7 @@ notrace static __always_inline struct vvar_data *get_vvar_data(void) return (struct vvar_data *) ret; }
-notrace static long vdso_fallback_gettime(long clock, struct timespec *ts) +notrace static long vdso_fallback_gettime(long clock, struct __kernel_old_timespec *ts) { register long num __asm__("g1") = __NR_clock_gettime; register long o0 __asm__("o0") = clock; @@ -144,7 +144,7 @@ notrace static __always_inline u64 vgetsns_stick(struct vvar_data *vvar) }
notrace static __always_inline int do_realtime(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq; u64 ns; @@ -164,7 +164,7 @@ notrace static __always_inline int do_realtime(struct vvar_data *vvar, }
notrace static __always_inline int do_realtime_stick(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq; u64 ns; @@ -184,7 +184,7 @@ notrace static __always_inline int do_realtime_stick(struct vvar_data *vvar, }
notrace static __always_inline int do_monotonic(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq; u64 ns; @@ -204,7 +204,7 @@ notrace static __always_inline int do_monotonic(struct vvar_data *vvar, }
notrace static __always_inline int do_monotonic_stick(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq; u64 ns; @@ -224,7 +224,7 @@ notrace static __always_inline int do_monotonic_stick(struct vvar_data *vvar, }
notrace static int do_realtime_coarse(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq;
@@ -237,7 +237,7 @@ notrace static int do_realtime_coarse(struct vvar_data *vvar, }
notrace static int do_monotonic_coarse(struct vvar_data *vvar, - struct timespec *ts) + struct __kernel_old_timespec *ts) { unsigned long seq;
@@ -251,7 +251,7 @@ notrace static int do_monotonic_coarse(struct vvar_data *vvar, }
notrace int -__vdso_clock_gettime(clockid_t clock, struct timespec *ts) +__vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts) { struct vvar_data *vvd = get_vvar_data();
@@ -275,11 +275,11 @@ __vdso_clock_gettime(clockid_t clock, struct timespec *ts) return vdso_fallback_gettime(clock, ts); } int -clock_gettime(clockid_t, struct timespec *) +clock_gettime(clockid_t, struct __kernel_old_timespec *) __attribute__((weak, alias("__vdso_clock_gettime")));
notrace int -__vdso_clock_gettime_stick(clockid_t clock, struct timespec *ts) +__vdso_clock_gettime_stick(clockid_t clock, struct __kernel_old_timespec *ts) { struct vvar_data *vvd = get_vvar_data();
@@ -311,7 +311,7 @@ __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) if (likely(vvd->vclock_mode != VCLOCK_NONE)) { if (likely(tv != NULL)) { union tstv_t { - struct timespec ts; + struct __kernel_old_timespec ts; struct __kernel_old_timeval tv; } *tstv = (union tstv_t *) tv; do_realtime(vvd, &tstv->ts); @@ -347,7 +347,7 @@ __vdso_gettimeofday_stick(struct __kernel_old_timeval *tv, struct timezone *tz) if (likely(vvd->vclock_mode != VCLOCK_NONE)) { if (likely(tv != NULL)) { union tstv_t { - struct timespec ts; + struct __kernel_old_timespec ts; struct __kernel_old_timeval tv; } *tstv = (union tstv_t *) tv; do_realtime_stick(vvd, &tstv->ts); diff --git a/arch/x86/um/vdso/um_vdso.c b/arch/x86/um/vdso/um_vdso.c index 845336c11364..371724cf70da 100644 --- a/arch/x86/um/vdso/um_vdso.c +++ b/arch/x86/um/vdso/um_vdso.c @@ -13,7 +13,7 @@ #include <linux/getcpu.h> #include <asm/unistd.h>
-int __vdso_clock_gettime(clockid_t clock, struct timespec *ts) +int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts) { long ret;
@@ -22,7 +22,7 @@ int __vdso_clock_gettime(clockid_t clock, struct timespec *ts)
return ret; } -int clock_gettime(clockid_t, struct timespec *) +int clock_gettime(clockid_t, struct __kernel_old_timespec *) __attribute__((weak, alias("__vdso_clock_gettime")));
int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
Only x86 uses the 'time' syscall in vdso, so change that to __kernel_old_time_t as a preparation for removing 'time_t' and '__kernel_time_t' later.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/x86/entry/vdso/vclock_gettime.c | 6 +++--- arch/x86/entry/vsyscall/vsyscall_64.c | 2 +- arch/x86/um/vdso/um_vdso.c | 4 ++-- lib/vdso/gettimeofday.c | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c index d9ff616bb0f6..7d70935b6758 100644 --- a/arch/x86/entry/vdso/vclock_gettime.c +++ b/arch/x86/entry/vdso/vclock_gettime.c @@ -15,7 +15,7 @@ #include "../../../../lib/vdso/gettimeofday.c"
extern int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz); -extern time_t __vdso_time(time_t *t); +extern __kernel_old_time_t __vdso_time(__kernel_old_time_t *t);
int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) { @@ -25,12 +25,12 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) int gettimeofday(struct __kernel_old_timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday")));
-time_t __vdso_time(time_t *t) +__kernel_old_time_t __vdso_time(__kernel_old_time_t *t) { return __cvdso_time(t); }
-time_t time(time_t *t) __attribute__((weak, alias("__vdso_time"))); +__kernel_old_time_t time(__kernel_old_time_t *t) __attribute__((weak, alias("__vdso_time")));
#if defined(CONFIG_X86_64) && !defined(BUILD_VDSO32_64) diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index 76e62bcb8d87..bba5bfdb2a56 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -194,7 +194,7 @@ bool emulate_vsyscall(unsigned long error_code, break;
case 1: - if (!write_ok_or_segv(regs->di, sizeof(time_t))) { + if (!write_ok_or_segv(regs->di, sizeof(__kernel_old_time_t))) { ret = -EFAULT; goto check_fault; } diff --git a/arch/x86/um/vdso/um_vdso.c b/arch/x86/um/vdso/um_vdso.c index 371724cf70da..2112b8d14668 100644 --- a/arch/x86/um/vdso/um_vdso.c +++ b/arch/x86/um/vdso/um_vdso.c @@ -37,7 +37,7 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) int gettimeofday(struct __kernel_old_timeval *, struct timezone *) __attribute__((weak, alias("__vdso_gettimeofday")));
-time_t __vdso_time(time_t *t) +__kernel_old_time_t __vdso_time(__kernel_old_time_t *t) { long secs;
@@ -47,7 +47,7 @@ time_t __vdso_time(time_t *t)
return secs; } -time_t time(time_t *t) __attribute__((weak, alias("__vdso_time"))); +__kernel_old_time_t time(__kernel_old_time_t *t) __attribute__((weak, alias("__vdso_time")));
long __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused) diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c index 45f57fd2db64..9ecfd3b547ba 100644 --- a/lib/vdso/gettimeofday.c +++ b/lib/vdso/gettimeofday.c @@ -164,10 +164,10 @@ __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz) }
#ifdef VDSO_HAS_TIME -static __maybe_unused time_t __cvdso_time(time_t *time) +static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time) { const struct vdso_data *vd = __arch_get_vdso_data(); - time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec); + __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
if (time) *time = t;
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
Only x86 uses the 'time' syscall in vdso, so change that to __kernel_old_time_t as a preparation for removing 'time_t' and '__kernel_time_t' later.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Thomas Gleixner tglx@linutronix.de
The nds32 vdso is now the last user of the deprecated timespec_add_ns().
Change it to an open-coded version like the one it already uses in do_realtime(). What we should really do though is to use the generic vdso implementation that is now used in x86. arm and mips.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/nds32/kernel/vdso/gettimeofday.c | 33 ++++++++++++--------------- 1 file changed, 15 insertions(+), 18 deletions(-)
diff --git a/arch/nds32/kernel/vdso/gettimeofday.c b/arch/nds32/kernel/vdso/gettimeofday.c index 687abc7145f5..9ec03cf0ec54 100644 --- a/arch/nds32/kernel/vdso/gettimeofday.c +++ b/arch/nds32/kernel/vdso/gettimeofday.c @@ -81,22 +81,20 @@ static notrace int do_realtime_coarse(struct __kernel_old_timespec *ts, static notrace int do_monotonic_coarse(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { - struct timespec tomono; u32 seq; + u64 ns;
do { seq = vdso_read_begin(vdata);
- ts->tv_sec = vdata->xtime_coarse_sec; - ts->tv_nsec = vdata->xtime_coarse_nsec; - - tomono.tv_sec = vdata->wtm_clock_sec; - tomono.tv_nsec = vdata->wtm_clock_nsec; + ts->tv_sec = vdata->xtime_coarse_sec + vdata->wtm_clock_sec; + ns = vdata->xtime_coarse_nsec + vdata->wtm_clock_nsec;
} while (vdso_read_retry(vdata, seq));
- ts->tv_sec += tomono.tv_sec; - timespec_add_ns(ts, tomono.tv_nsec); + ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns); + ts->tv_nsec = ns; + return 0; }
@@ -135,26 +133,25 @@ static notrace int do_realtime(struct __kernel_old_timespec *ts, struct vdso_dat
static notrace int do_monotonic(struct __kernel_old_timespec *ts, struct vdso_data *vdata) { - struct timespec tomono; - u64 nsecs; + u64 ns; u32 seq;
do { seq = vdso_read_begin(vdata);
ts->tv_sec = vdata->xtime_clock_sec; - nsecs = vdata->xtime_clock_nsec; - nsecs += vgetsns(vdata); - nsecs >>= vdata->cs_shift; + ns = vdata->xtime_clock_nsec; + ns += vgetsns(vdata); + ns >>= vdata->cs_shift;
- tomono.tv_sec = vdata->wtm_clock_sec; - tomono.tv_nsec = vdata->wtm_clock_nsec; + ts->tv_sec += vdata->wtm_clock_sec; + ns += vdata->wtm_clock_nsec;
} while (vdso_read_retry(vdata, seq));
- ts->tv_sec += tomono.tv_sec; - ts->tv_nsec = 0; - timespec_add_ns(ts, nsecs + tomono.tv_nsec); + ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns); + ts->tv_nsec = ns; + return 0; }
As a preparation to stop using 'struct timespec' in the kernel, change the powerpc vdso implementation:
- split up the vdso data definition to have equivalent members for seconds and nanoseconds instead of an xtime structure
- use timespec64 as an intermediate for the xtime update
- change the asm-offsets definition to be based the appropriate fixed-length types
This is only a temporary fix for changing the types, in order to actually support a 64-bit safe vdso32 version of clock_gettime(), the entire powerpc vdso should be replaced with the generic lib/vdso/ implementation. If that happens first, this patch becomes obsolete.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/powerpc/include/asm/vdso_datapage.h | 6 ++++-- arch/powerpc/kernel/asm-offsets.c | 14 +++++--------- arch/powerpc/kernel/time.c | 5 +++-- arch/powerpc/kernel/vdso32/gettimeofday.S | 6 ++---- arch/powerpc/kernel/vdso64/gettimeofday.S | 8 ++++---- 5 files changed, 18 insertions(+), 21 deletions(-)
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h index c61d59ed3b45..a115970a6809 100644 --- a/arch/powerpc/include/asm/vdso_datapage.h +++ b/arch/powerpc/include/asm/vdso_datapage.h @@ -81,7 +81,8 @@ struct vdso_data { __u32 stamp_sec_fraction; /* fractional seconds of stamp_xtime */ __s32 wtom_clock_nsec; /* Wall to monotonic clock nsec */ __s64 wtom_clock_sec; /* Wall to monotonic clock sec */ - struct timespec stamp_xtime; /* xtime as at tb_orig_stamp */ + __s64 stamp_xtime_sec; /* xtime secs as at tb_orig_stamp */ + __s64 stamp_xtime_nsec; /* xtime nsecs as at tb_orig_stamp */ __u32 syscall_map_64[SYSCALL_MAP_SIZE]; /* map of syscalls */ __u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */ }; @@ -101,7 +102,8 @@ struct vdso_data { __u32 tz_dsttime; /* Type of dst correction 0x5C */ __s32 wtom_clock_sec; /* Wall to monotonic clock */ __s32 wtom_clock_nsec; - struct timespec stamp_xtime; /* xtime as at tb_orig_stamp */ + __s32 stamp_xtime_sec; /* xtime seconds as at tb_orig_stamp */ + __s32 stamp_xtime_nsec; /* xtime nsecs as at tb_orig_stamp */ __u32 stamp_sec_fraction; /* fractional seconds of stamp_xtime */ __u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */ __u32 dcache_block_size; /* L1 d-cache block size */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 827f4c354e13..f22bd6d1fe93 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -385,7 +385,8 @@ int main(void) OFFSET(CFG_SYSCALL_MAP32, vdso_data, syscall_map_32); OFFSET(WTOM_CLOCK_SEC, vdso_data, wtom_clock_sec); OFFSET(WTOM_CLOCK_NSEC, vdso_data, wtom_clock_nsec); - OFFSET(STAMP_XTIME, vdso_data, stamp_xtime); + OFFSET(STAMP_XTIME_SEC, vdso_data, stamp_xtime_sec); + OFFSET(STAMP_XTIME_NSEC, vdso_data, stamp_xtime_nsec); OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction); OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size); OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size); @@ -395,18 +396,13 @@ int main(void) OFFSET(CFG_SYSCALL_MAP64, vdso_data, syscall_map_64); OFFSET(TVAL64_TV_SEC, __kernel_old_timeval, tv_sec); OFFSET(TVAL64_TV_USEC, __kernel_old_timeval, tv_usec); +#endif + OFFSET(TSPC64_TV_SEC, __kernel_timespec, tv_sec); + OFFSET(TSPC64_TV_NSEC, __kernel_timespec, tv_nsec); OFFSET(TVAL32_TV_SEC, old_timeval32, tv_sec); OFFSET(TVAL32_TV_USEC, old_timeval32, tv_usec); - OFFSET(TSPC64_TV_SEC, timespec, tv_sec); - OFFSET(TSPC64_TV_NSEC, timespec, tv_nsec); OFFSET(TSPC32_TV_SEC, old_timespec32, tv_sec); OFFSET(TSPC32_TV_NSEC, old_timespec32, tv_nsec); -#else - OFFSET(TVAL32_TV_SEC, __kernel_old_timeval, tv_sec); - OFFSET(TVAL32_TV_USEC, __kernel_old_timeval, tv_usec); - OFFSET(TSPC32_TV_SEC, timespec, tv_sec); - OFFSET(TSPC32_TV_NSEC, timespec, tv_nsec); -#endif /* timeval/timezone offsets for use by vdso */ OFFSET(TZONE_TZ_MINWEST, timezone, tz_minuteswest); OFFSET(TZONE_TZ_DSTTIME, timezone, tz_dsttime); diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 694522308cd5..1fad5a04d083 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -885,7 +885,7 @@ static notrace u64 timebase_read(struct clocksource *cs)
void update_vsyscall(struct timekeeper *tk) { - struct timespec xt; + struct timespec64 xt; struct clocksource *clock = tk->tkr_mono.clock; u32 mult = tk->tkr_mono.mult; u32 shift = tk->tkr_mono.shift; @@ -957,7 +957,8 @@ void update_vsyscall(struct timekeeper *tk) vdso_data->tb_to_xs = new_tb_to_xs; vdso_data->wtom_clock_sec = tk->wall_to_monotonic.tv_sec; vdso_data->wtom_clock_nsec = tk->wall_to_monotonic.tv_nsec; - vdso_data->stamp_xtime = xt; + vdso_data->stamp_xtime_sec = xt.sec; + vdso_data->stamp_xtime_nsec = xt.nsec; vdso_data->stamp_sec_fraction = frac_sec; smp_wmb(); ++(vdso_data->tb_update_count); diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S index 4327665ad86f..37ba4c3d965b 100644 --- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -15,10 +15,8 @@ /* Offset for the low 32-bit part of a field of long type */ #if defined(CONFIG_PPC64) && defined(CONFIG_CPU_BIG_ENDIAN) #define LOPART 4 -#define TSPEC_TV_SEC TSPC64_TV_SEC+LOPART #else #define LOPART 0 -#define TSPEC_TV_SEC TSPC32_TV_SEC #endif
.text @@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
- lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9) + lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
cmplwi r11,0 /* check if t is NULL */ beq 2f @@ -268,7 +266,7 @@ __do_get_tspec: * as a 32.32 fixed-point number in r3 and r4. * Load & add the xtime stamp. */ - lwz r5,STAMP_XTIME+TSPEC_TV_SEC(r9) + lwz r5,STAMP_XTIME_SEC+LOWPART(r9) lwz r6,STAMP_SEC_FRAC(r9) addc r4,r4,r6 adde r3,r3,r5 diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S index 07bfe33fe874..1f24e411af80 100644 --- a/arch/powerpc/kernel/vdso64/gettimeofday.S +++ b/arch/powerpc/kernel/vdso64/gettimeofday.S @@ -116,8 +116,8 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime) * CLOCK_REALTIME_COARSE, below values are needed for MONOTONIC_COARSE * too */ - ld r4,STAMP_XTIME+TSPC64_TV_SEC(r3) - ld r5,STAMP_XTIME+TSPC64_TV_NSEC(r3) + ld r4,STAMP_XTIME_SEC(r3) + ld r5,STAMP_XTIME_NSEC(r3) bne cr6,75f
/* CLOCK_MONOTONIC_COARSE */ @@ -220,7 +220,7 @@ V_FUNCTION_BEGIN(__kernel_time) mr r11,r3 /* r11 holds t */ bl V_LOCAL_FUNC(__get_datapage)
- ld r4,STAMP_XTIME+TSPC64_TV_SEC(r3) + ld r4,STAMP_XTIME_SEC(r3)
cmpldi r11,0 /* check if t is NULL */ beq 2f @@ -265,7 +265,7 @@ V_FUNCTION_BEGIN(__do_get_tspec) mulhdu r6,r6,r5 /* in units of 2^-32 seconds */
/* Add stamp since epoch */ - ld r4,STAMP_XTIME+TSPC64_TV_SEC(r3) + ld r4,STAMP_XTIME_SEC(r3) lwz r5,STAMP_SEC_FRAC(r3) or r0,r4,r5 or r0,r0,r6
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -15,10 +15,8 @@ /* Offset for the low 32-bit part of a field of long type */ #if defined(CONFIG_PPC64) && defined(CONFIG_CPU_BIG_ENDIAN) #define LOPART 4 -#define TSPEC_TV_SEC TSPC64_TV_SEC+LOPART #else #define LOPART 0 -#define TSPEC_TV_SEC TSPC32_TV_SEC #endif .text @@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
- lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
- lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
cmplwi r11,0 /* check if t is NULL */ beq 2f @@ -268,7 +266,7 @@ __do_get_tspec: * as a 32.32 fixed-point number in r3 and r4. * Load & add the xtime stamp. */
- lwz r5,STAMP_XTIME+TSPEC_TV_SEC(r9)
- lwz r5,STAMP_XTIME_SEC+LOWPART(r9)
Same here.
lwz r6,STAMP_SEC_FRAC(r9) addc r4,r4,r6 adde r3,r3,r5
[...]
On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -15,10 +15,8 @@ /* Offset for the low 32-bit part of a field of long type */ #if defined(CONFIG_PPC64) && defined(CONFIG_CPU_BIG_ENDIAN) #define LOPART 4 -#define TSPEC_TV_SEC TSPC64_TV_SEC+LOPART #else #define LOPART 0 -#define TSPEC_TV_SEC TSPC32_TV_SEC #endif
.text
@@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
Thanks, fixed both instances in a patch on top now. I considered folding it into the original patch, but as it's close to the merge window I'd rather not rebase it, and this way I also give you credit for finding the bug.
I'm surprised that the 0-day bot did not report this already.
Thanks fro the careful review!
Arnd
commit 1c11ca7a0584ddede5b8c93057b40d31e8a96d3d (HEAD) Author: Arnd Bergmann arnd@arndb.de Date: Thu Nov 21 15:19:49 2019 +0100
y2038: fix typo in powerpc vdso "LOPART"
The earlier patch introduced a typo, change LOWPART back to LOPART.
Fixes: 176ed98c8a76 ("y2038: vdso: powerpc: avoid timespec references") Reported-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Arnd Bergmann arnd@arndb.de
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S index a7180b0f4aa1..c8e6902cb01b 100644 --- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -190,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
- lwz r3,STAMP_XTIME_SEC+LOWPART(r9) + lwz r3,STAMP_XTIME_SEC+LOPART(r9)
cmplwi r11,0 /* check if t is NULL */ beq 2f @@ -266,7 +266,7 @@ __do_get_tspec: * as a 32.32 fixed-point number in r3 and r4. * Load & add the xtime stamp. */ - lwz r5,STAMP_XTIME_SEC+LOWPART(r9) + lwz r5,STAMP_XTIME_SEC+LOPART(r9)
lwz r6,STAMP_SEC_FRAC(r9) addc r4,r4,r6 adde r3,r3,r5
Arnd Bergmann arnd@arndb.de a écrit :
On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -15,10 +15,8 @@ /* Offset for the low 32-bit part of a field of long type */ #if defined(CONFIG_PPC64) && defined(CONFIG_CPU_BIG_ENDIAN) #define LOPART 4 -#define TSPEC_TV_SEC TSPC64_TV_SEC+LOPART #else #define LOPART 0 -#define TSPEC_TV_SEC TSPC32_TV_SEC #endif
.text
@@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
Thanks, fixed both instances in a patch on top now. I considered folding it into the original patch, but as it's close to the merge window I'd rather not rebase it, and this way I also give you credit for finding the bug.
Take care, might conflict with https://github.com/linuxppc/linux/commit/5e381d727fe8834ca5a126f510194a7a4ac...
Christophe
I'm surprised that the 0-day bot did not report this already.
Thanks fro the careful review!
Arnd
commit 1c11ca7a0584ddede5b8c93057b40d31e8a96d3d (HEAD) Author: Arnd Bergmann arnd@arndb.de Date: Thu Nov 21 15:19:49 2019 +0100
y2038: fix typo in powerpc vdso "LOPART" The earlier patch introduced a typo, change LOWPART back to LOPART. Fixes: 176ed98c8a76 ("y2038: vdso: powerpc: avoid timespec references") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S index a7180b0f4aa1..c8e6902cb01b 100644 --- a/arch/powerpc/kernel/vdso32/gettimeofday.S +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S @@ -190,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
lwz r3,STAMP_XTIME_SEC+LOPART(r9) cmplwi r11,0 /* check if t is NULL */ beq 2f
@@ -266,7 +266,7 @@ __do_get_tspec: * as a 32.32 fixed-point number in r3 and r4. * Load & add the xtime stamp. */
lwz r5,STAMP_XTIME_SEC+LOWPART(r9)
lwz r5,STAMP_XTIME_SEC+LOPART(r9) lwz r6,STAMP_SEC_FRAC(r9) addc r4,r4,r6 adde r3,r3,r5
On Thu, Nov 21, 2019 at 5:25 PM Christophe Leroy christophe.leroy@c-s.fr wrote:
Arnd Bergmann arnd@arndb.de a écrit :
On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
@@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
Thanks, fixed both instances in a patch on top now. I considered folding it into the original patch, but as it's close to the merge window I'd rather not rebase it, and this way I also give you credit for finding the bug.
Take care, might conflict with https://github.com/linuxppc/linux/commit/5e381d727fe8834ca5a126f510194a7a4ac...
Sorry for my late reply. I see this commit and no other variant of it has made it into linux-next by now, so I assume this is not getting sent for v5.5 and it's not stopping me from sending my own pull request.
Please let me know if I missed something and this will cause problems.
On a related note: are you still working on the generic lib/vdso support for powerpc? Without that, future libc implementations that use 64-bit time_t will have to use the slow clock_gettime64 syscall instead of the vdso, which has a significant performance impact.
Arnd
Le 27/11/2019 à 12:03, Arnd Bergmann a écrit :
On Thu, Nov 21, 2019 at 5:25 PM Christophe Leroy christophe.leroy@c-s.fr wrote:
Arnd Bergmann arnd@arndb.de a écrit :
On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
@@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
Thanks, fixed both instances in a patch on top now. I considered folding it into the original patch, but as it's close to the merge window I'd rather not rebase it, and this way I also give you credit for finding the bug.
Take care, might conflict with https://github.com/linuxppc/linux/commit/5e381d727fe8834ca5a126f510194a7a4ac...
Sorry for my late reply. I see this commit and no other variant of it has made it into linux-next by now, so I assume this is not getting sent for v5.5 and it's not stopping me from sending my own pull request.
Please let me know if I missed something and this will cause problems.
On a related note: are you still working on the generic lib/vdso support for powerpc? Without that, future libc implementations that use 64-bit time_t will have to use the slow clock_gettime64 syscall instead of the vdso, which has a significant performance impact.
I have left this generic lib/vdso subject aside for the moment, because performance is disappointing and its architecture doesn't real fit with powerpc ABI.
From a performance point of view, it is manipulating 64 bits vars where is could use 32 bits vars. Of course I understand that y2038 will anyway force the use of 64 bits for seconds, but at the time being powerpc32 VDSO is using 32 bits vars for both secs and ns, it make the difference. Also, the generic VDSO is playing too much with data on stacks and associated memory read/write/copies, which kills performance on RISC processors like powerpc. Inlining do_hres() for instance significantly improves that as it allow handling the 'struct __kernel_timespec ts' on registers instead of using stack.
Regarding powerpc ABI, the issue is that errors shall be reported by setting the SO bit in CR register, and this cannot be done in C. This means: - The VDSO entry point must be in ASM and the generic VDSO C function must be called from there, it cannot be the VDSO entry point. - The VDSO fallback (ie the system call) cannot be done from the generic VDSO C function, it must be called from the ASM as well.
Last point/question, what's the point in using 64 bits for nanoseconds on 32 bits arches ?
Christophe
On Mon, Dec 2, 2019 at 1:55 PM Christophe Leroy christophe.leroy@c-s.fr wrote:
Le 27/11/2019 à 12:03, Arnd Bergmann a écrit :
On Thu, Nov 21, 2019 at 5:25 PM Christophe Leroy christophe.leroy@c-s.fr wrote:
Arnd Bergmann arnd@arndb.de a écrit :
On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
@@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time) bl __get_datapage@local mr r9, r3 /* datapage ptr in r9 */
lwz r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
lwz r3,STAMP_XTIME_SEC+LOWPART(r9)
"LOWPART" should be "LOPART".
Thanks, fixed both instances in a patch on top now. I considered folding it into the original patch, but as it's close to the merge window I'd rather not rebase it, and this way I also give you credit for finding the bug.
Take care, might conflict with https://github.com/linuxppc/linux/commit/5e381d727fe8834ca5a126f510194a7a4ac...
Sorry for my late reply. I see this commit and no other variant of it has made it into linux-next by now, so I assume this is not getting sent for v5.5 and it's not stopping me from sending my own pull request.
Please let me know if I missed something and this will cause problems.
On a related note: are you still working on the generic lib/vdso support for powerpc? Without that, future libc implementations that use 64-bit time_t will have to use the slow clock_gettime64 syscall instead of the vdso, which has a significant performance impact.
I have left this generic lib/vdso subject aside for the moment, because performance is disappointing and its architecture doesn't real fit with powerpc ABI.
From a performance point of view, it is manipulating 64 bits vars where is could use 32 bits vars. Of course I understand that y2038 will anyway force the use of 64 bits for seconds, but at the time being powerpc32 VDSO is using 32 bits vars for both secs and ns, it make the difference.
Do you think we could optimize the common code? This sounds like it could improve things for other architectures as well.
Also, the generic VDSO is playing too much with data on stacks and associated memory read/write/copies, which kills performance on RISC processors like powerpc. Inlining do_hres() for instance significantly improves that as it allow handling the 'struct __kernel_timespec ts' on registers instead of using stack.
That should be easy enough to change in the common code, as long as adding 'inline' does not cause harm on x86 and arm.
Regarding powerpc ABI, the issue is that errors shall be reported by setting the SO bit in CR register, and this cannot be done in C. This means:
- The VDSO entry point must be in ASM and the generic VDSO C function
must be called from there, it cannot be the VDSO entry point.
- The VDSO fallback (ie the system call) cannot be done from the generic
VDSO C function, it must be called from the ASM as well.
As far as I can tell, both the VDSO entry point and the fallback are in architecture specific code on all architectures, so this does not seem to be a show-stopper.
It also seems that they might be combined as long the current powerpc code could be changed to use the generic vdso_data structure definition: the existing code can keep being used for gettimeofday(), clock_gettime(CLOCK_MONOTONIC, ...) and clock_gettime(CLOCK_REALTIME), while the generic implementation can be called for clock_gettime64(), clock_getres() and clock_gettime() with other time clock IDs.
Last point/question, what's the point in using 64 bits for nanoseconds on 32 bits arches ?
The __kernel_timespec structure is defined with two 64-bit members so it has the same layout on both 32-bit and 64-bit architectures, which lets us share the implementation of the compat syscall handlers even on big-endian architectures, and it avoids accidentally leaking four bytes of stack data when copying a timespec from kernel to user space. The high 32 bits of the nanosecond are expected to always be zero when copying to user space, and to be ignored when copied into the kernel (see get_timespec64()).
Note that C99 and POSIX require tv_nsec to be 'long', so 64-bit architectures have to make it 64-bit wide, and 32-bit architectures end up including padding for it.
In the vdso_data, the "nsec" value is shifted, so it actually needs more bits. I don't know if this is a strict requirement, or if we could change it to be 32 bits non-shifted during the update at the cost of losing 1 nanosecond of accuracy.
Arnd
There are two structures based on time_t that conflict between libc and kernel: timeval and timespec. Both are now renamed to __kernel_old_timeval and __kernel_old_timespec.
For time_t, the old typedef is still __kernel_time_t. There is nothing wrong with that name, but it would be nice to not use that going forward as this type is used almost only in deprecated interfaces because of the y2038 overflow.
In the IPC headers (msgbuf.h, sembuf.h, shmbuf.h), __kernel_time_t is only used for the 64-bit variants, which are not deprecated.
Change these to a plain 'long', which is the same type as __kernel_time_t on all 64-bit architectures anyway, to reduce the number of users of the old type.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/mips/include/uapi/asm/msgbuf.h | 6 +++--- arch/mips/include/uapi/asm/sembuf.h | 4 ++-- arch/mips/include/uapi/asm/shmbuf.h | 6 +++--- arch/parisc/include/uapi/asm/msgbuf.h | 6 +++--- arch/parisc/include/uapi/asm/sembuf.h | 4 ++-- arch/parisc/include/uapi/asm/shmbuf.h | 6 +++--- arch/powerpc/include/uapi/asm/msgbuf.h | 6 +++--- arch/powerpc/include/uapi/asm/sembuf.h | 4 ++-- arch/powerpc/include/uapi/asm/shmbuf.h | 6 +++--- arch/sparc/include/uapi/asm/msgbuf.h | 6 +++--- arch/sparc/include/uapi/asm/sembuf.h | 4 ++-- arch/sparc/include/uapi/asm/shmbuf.h | 6 +++--- arch/x86/include/uapi/asm/msgbuf.h | 6 +++--- arch/x86/include/uapi/asm/sembuf.h | 4 ++-- arch/x86/include/uapi/asm/shmbuf.h | 6 +++--- include/uapi/asm-generic/msgbuf.h | 12 ++++++------ include/uapi/asm-generic/sembuf.h | 7 +++---- include/uapi/asm-generic/shmbuf.h | 12 ++++++------ 18 files changed, 55 insertions(+), 56 deletions(-)
diff --git a/arch/mips/include/uapi/asm/msgbuf.h b/arch/mips/include/uapi/asm/msgbuf.h index 46aa15b13e4e..9e0c2e230274 100644 --- a/arch/mips/include/uapi/asm/msgbuf.h +++ b/arch/mips/include/uapi/asm/msgbuf.h @@ -15,9 +15,9 @@ #if defined(__mips64) struct msqid64_ds { struct ipc64_perm msg_perm; - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ unsigned long msg_cbytes; /* current number of bytes on queue */ unsigned long msg_qnum; /* number of messages in queue */ unsigned long msg_qbytes; /* max number of bytes on queue */ diff --git a/arch/mips/include/uapi/asm/sembuf.h b/arch/mips/include/uapi/asm/sembuf.h index 60c89e6cb25b..43e1b4a2f68a 100644 --- a/arch/mips/include/uapi/asm/sembuf.h +++ b/arch/mips/include/uapi/asm/sembuf.h @@ -14,8 +14,8 @@ #ifdef __mips64 struct semid64_ds { struct ipc64_perm sem_perm; /* permissions .. see ipc.h */ - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* last change time */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ unsigned long sem_nsems; /* no. of semaphores in array */ unsigned long __unused1; unsigned long __unused2; diff --git a/arch/mips/include/uapi/asm/shmbuf.h b/arch/mips/include/uapi/asm/shmbuf.h index 9b9bba3401f2..680bb95b2240 100644 --- a/arch/mips/include/uapi/asm/shmbuf.h +++ b/arch/mips/include/uapi/asm/shmbuf.h @@ -17,9 +17,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ size_t shm_segsz; /* size of segment (bytes) */ - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ __kernel_pid_t shm_cpid; /* pid of creator */ __kernel_pid_t shm_lpid; /* pid of last operator */ unsigned long shm_nattch; /* no. of current attaches */ diff --git a/arch/parisc/include/uapi/asm/msgbuf.h b/arch/parisc/include/uapi/asm/msgbuf.h index 6a2e9ab2ef8d..3b877335da38 100644 --- a/arch/parisc/include/uapi/asm/msgbuf.h +++ b/arch/parisc/include/uapi/asm/msgbuf.h @@ -16,9 +16,9 @@ struct msqid64_ds { struct ipc64_perm msg_perm; #if __BITS_PER_LONG == 64 - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ #else unsigned long msg_stime_high; unsigned long msg_stime; /* last msgsnd time */ diff --git a/arch/parisc/include/uapi/asm/sembuf.h b/arch/parisc/include/uapi/asm/sembuf.h index 3c31163b1241..8241cf126018 100644 --- a/arch/parisc/include/uapi/asm/sembuf.h +++ b/arch/parisc/include/uapi/asm/sembuf.h @@ -16,8 +16,8 @@ struct semid64_ds { struct ipc64_perm sem_perm; /* permissions .. see ipc.h */ #if __BITS_PER_LONG == 64 - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* last change time */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ #else unsigned long sem_otime_high; unsigned long sem_otime; /* last semop time */ diff --git a/arch/parisc/include/uapi/asm/shmbuf.h b/arch/parisc/include/uapi/asm/shmbuf.h index c89b3dd8db21..5da3089be65e 100644 --- a/arch/parisc/include/uapi/asm/shmbuf.h +++ b/arch/parisc/include/uapi/asm/shmbuf.h @@ -16,9 +16,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ #if __BITS_PER_LONG == 64 - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ #else unsigned long shm_atime_high; unsigned long shm_atime; /* last attach time */ diff --git a/arch/powerpc/include/uapi/asm/msgbuf.h b/arch/powerpc/include/uapi/asm/msgbuf.h index 2b1b37797a47..969bd83e4d3d 100644 --- a/arch/powerpc/include/uapi/asm/msgbuf.h +++ b/arch/powerpc/include/uapi/asm/msgbuf.h @@ -11,9 +11,9 @@ struct msqid64_ds { struct ipc64_perm msg_perm; #ifdef __powerpc64__ - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ #else unsigned long msg_stime_high; unsigned long msg_stime; /* last msgsnd time */ diff --git a/arch/powerpc/include/uapi/asm/sembuf.h b/arch/powerpc/include/uapi/asm/sembuf.h index 3f60946f77e3..008ae77c6746 100644 --- a/arch/powerpc/include/uapi/asm/sembuf.h +++ b/arch/powerpc/include/uapi/asm/sembuf.h @@ -26,8 +26,8 @@ struct semid64_ds { unsigned long sem_ctime_high; unsigned long sem_ctime; /* last change time */ #else - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* last change time */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ #endif unsigned long sem_nsems; /* no. of semaphores in array */ unsigned long __unused3; diff --git a/arch/powerpc/include/uapi/asm/shmbuf.h b/arch/powerpc/include/uapi/asm/shmbuf.h index b591c4d7e4c5..00422b2f3c63 100644 --- a/arch/powerpc/include/uapi/asm/shmbuf.h +++ b/arch/powerpc/include/uapi/asm/shmbuf.h @@ -22,9 +22,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ #ifdef __powerpc64__ - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ #else unsigned long shm_atime_high; unsigned long shm_atime; /* last attach time */ diff --git a/arch/sparc/include/uapi/asm/msgbuf.h b/arch/sparc/include/uapi/asm/msgbuf.h index ffc46c211d6d..eeeb91933280 100644 --- a/arch/sparc/include/uapi/asm/msgbuf.h +++ b/arch/sparc/include/uapi/asm/msgbuf.h @@ -13,9 +13,9 @@ struct msqid64_ds { struct ipc64_perm msg_perm; #if defined(__sparc__) && defined(__arch64__) - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ #else unsigned long msg_stime_high; unsigned long msg_stime; /* last msgsnd time */ diff --git a/arch/sparc/include/uapi/asm/sembuf.h b/arch/sparc/include/uapi/asm/sembuf.h index f3d309c2e1cd..cbcbaa4e7128 100644 --- a/arch/sparc/include/uapi/asm/sembuf.h +++ b/arch/sparc/include/uapi/asm/sembuf.h @@ -14,8 +14,8 @@ struct semid64_ds { struct ipc64_perm sem_perm; /* permissions .. see ipc.h */ #if defined(__sparc__) && defined(__arch64__) - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* last change time */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ #else unsigned long sem_otime_high; unsigned long sem_otime; /* last semop time */ diff --git a/arch/sparc/include/uapi/asm/shmbuf.h b/arch/sparc/include/uapi/asm/shmbuf.h index 06618b84822d..a5d7d8d681c4 100644 --- a/arch/sparc/include/uapi/asm/shmbuf.h +++ b/arch/sparc/include/uapi/asm/shmbuf.h @@ -14,9 +14,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ #if defined(__sparc__) && defined(__arch64__) - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ #else unsigned long shm_atime_high; unsigned long shm_atime; /* last attach time */ diff --git a/arch/x86/include/uapi/asm/msgbuf.h b/arch/x86/include/uapi/asm/msgbuf.h index 90ab9a795b49..7c5bb43ed8af 100644 --- a/arch/x86/include/uapi/asm/msgbuf.h +++ b/arch/x86/include/uapi/asm/msgbuf.h @@ -15,9 +15,9 @@
struct msqid64_ds { struct ipc64_perm msg_perm; - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + __kernel_long_t msg_stime; /* last msgsnd time */ + __kernel_long_t msg_rtime; /* last msgrcv time */ + __kernel_long_t msg_ctime; /* last change time */ __kernel_ulong_t msg_cbytes; /* current number of bytes on queue */ __kernel_ulong_t msg_qnum; /* number of messages in queue */ __kernel_ulong_t msg_qbytes; /* max number of bytes on queue */ diff --git a/arch/x86/include/uapi/asm/sembuf.h b/arch/x86/include/uapi/asm/sembuf.h index 89de6cd9f0a7..7c1b156695ba 100644 --- a/arch/x86/include/uapi/asm/sembuf.h +++ b/arch/x86/include/uapi/asm/sembuf.h @@ -21,9 +21,9 @@ struct semid64_ds { unsigned long sem_ctime; /* last change time */ unsigned long sem_ctime_high; #else - __kernel_time_t sem_otime; /* last semop time */ + long sem_otime; /* last semop time */ __kernel_ulong_t __unused1; - __kernel_time_t sem_ctime; /* last change time */ + long sem_ctime; /* last change time */ __kernel_ulong_t __unused2; #endif __kernel_ulong_t sem_nsems; /* no. of semaphores in array */ diff --git a/arch/x86/include/uapi/asm/shmbuf.h b/arch/x86/include/uapi/asm/shmbuf.h index 644421f3823b..f0305dc660c9 100644 --- a/arch/x86/include/uapi/asm/shmbuf.h +++ b/arch/x86/include/uapi/asm/shmbuf.h @@ -16,9 +16,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ size_t shm_segsz; /* size of segment (bytes) */ - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + __kernel_long_t shm_atime; /* last attach time */ + __kernel_long_t shm_dtime; /* last detach time */ + __kernel_long_t shm_ctime; /* last change time */ __kernel_pid_t shm_cpid; /* pid of creator */ __kernel_pid_t shm_lpid; /* pid of last operator */ __kernel_ulong_t shm_nattch; /* no. of current attaches */ diff --git a/include/uapi/asm-generic/msgbuf.h b/include/uapi/asm-generic/msgbuf.h index 9fe4881557cb..af95aa89012e 100644 --- a/include/uapi/asm-generic/msgbuf.h +++ b/include/uapi/asm-generic/msgbuf.h @@ -13,9 +13,9 @@ * everyone just ended up making identical copies without specific * optimizations, so we may just as well all use the same one. * - * 64 bit architectures typically define a 64 bit __kernel_time_t, - * so they do not need the first three padding words. - * On big-endian systems, the padding is in the wrong place. + * 64 bit architectures use a 64-bit long time field here, while + * 32 bit architectures have a pair of unsigned long values. + * On big-endian systems, the lower half is in the wrong place. * * Pad space is left for: * - 2 miscellaneous 32-bit values @@ -24,9 +24,9 @@ struct msqid64_ds { struct ipc64_perm msg_perm; #if __BITS_PER_LONG == 64 - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + long msg_stime; /* last msgsnd time */ + long msg_rtime; /* last msgrcv time */ + long msg_ctime; /* last change time */ #else unsigned long msg_stime; /* last msgsnd time */ unsigned long msg_stime_high; diff --git a/include/uapi/asm-generic/sembuf.h b/include/uapi/asm-generic/sembuf.h index 0bae010f1b64..137606018c6a 100644 --- a/include/uapi/asm-generic/sembuf.h +++ b/include/uapi/asm-generic/sembuf.h @@ -13,9 +13,8 @@ * everyone just ended up making identical copies without specific * optimizations, so we may just as well all use the same one. * - * 64 bit architectures use a 64-bit __kernel_time_t here, while + * 64 bit architectures use a 64-bit long time field here, while * 32 bit architectures have a pair of unsigned long values. - * so they do not need the first two padding words. * * On big-endian systems, the padding is in the wrong place for * historic reasons, so user space has to reconstruct a time_t @@ -29,8 +28,8 @@ struct semid64_ds { struct ipc64_perm sem_perm; /* permissions .. see ipc.h */ #if __BITS_PER_LONG == 64 - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* last change time */ + long sem_otime; /* last semop time */ + long sem_ctime; /* last change time */ #else unsigned long sem_otime; /* last semop time */ unsigned long sem_otime_high; diff --git a/include/uapi/asm-generic/shmbuf.h b/include/uapi/asm-generic/shmbuf.h index e504422fc501..2bab955e0fed 100644 --- a/include/uapi/asm-generic/shmbuf.h +++ b/include/uapi/asm-generic/shmbuf.h @@ -13,9 +13,9 @@ * everyone just ended up making identical copies without specific * optimizations, so we may just as well all use the same one. * - * 64 bit architectures typically define a 64 bit __kernel_time_t, - * so they do not need the first two padding words. - * On big-endian systems, the padding is in the wrong place. + * 64 bit architectures use a 64-bit long time field here, while + * 32 bit architectures have a pair of unsigned long values. + * On big-endian systems, the lower half is in the wrong place. * * * Pad space is left for: @@ -26,9 +26,9 @@ struct shmid64_ds { struct ipc64_perm shm_perm; /* operation perms */ size_t shm_segsz; /* size of segment (bytes) */ #if __BITS_PER_LONG == 64 - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + long shm_atime; /* last attach time */ + long shm_dtime; /* last detach time */ + long shm_ctime; /* last change time */ #else unsigned long shm_atime; /* last attach time */ unsigned long shm_atime_high;
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/arch/x86/include/uapi/asm/sembuf.h +++ b/arch/x86/include/uapi/asm/sembuf.h @@ -21,9 +21,9 @@ struct semid64_ds { unsigned long sem_ctime; /* last change time */ unsigned long sem_ctime_high; #else
- __kernel_time_t sem_otime; /* last semop time */
- long sem_otime; /* last semop time */ __kernel_ulong_t __unused1;
- __kernel_time_t sem_ctime; /* last change time */
- long sem_ctime; /* last change time */ __kernel_ulong_t __unused2;
#endif __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
[...]
We need to use __kernel_long_t here to do the right thing on x32.
Ben.
On Wed, Nov 20, 2019 at 11:49 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote: [...]
--- a/arch/x86/include/uapi/asm/sembuf.h +++ b/arch/x86/include/uapi/asm/sembuf.h @@ -21,9 +21,9 @@ struct semid64_ds { unsigned long sem_ctime; /* last change time */ unsigned long sem_ctime_high; #else
__kernel_time_t sem_otime; /* last semop time */
long sem_otime; /* last semop time */ __kernel_ulong_t __unused1;
__kernel_time_t sem_ctime; /* last change time */
long sem_ctime; /* last change time */ __kernel_ulong_t __unused2;
#endif __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
[...]
We need to use __kernel_long_t here to do the right thing on x32.
Good catch, thanks for the review!
I applied the patch below now on top.
Arnd
commit c7ebd8a1c1825c3197732ea692cf3dde34a644f5 (HEAD) Author: Arnd Bergmann arnd@arndb.de Date: Thu Nov 21 15:25:04 2019 +0100
y2038: ipc: fix x32 ABI breakage
The correct type on x32 is 64-bit wide, same as for the other struct members around it, so use __kernel_long_t in place of the original __kernel_time_t here, corresponding to the rest of the structure.
Fixes: caf5e32d4ea7 ("y2038: ipc: remove __kernel_time_t reference from headers") Reported-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Arnd Bergmann arnd@arndb.de
diff --git a/arch/x86/include/uapi/asm/sembuf.h b/arch/x86/include/uapi/asm/sembuf.h index 7c1b156695ba..20cab43c4b15 100644 --- a/arch/x86/include/uapi/asm/sembuf.h +++ b/arch/x86/include/uapi/asm/sembuf.h @@ -21,9 +21,9 @@ struct semid64_ds { unsigned long sem_ctime; /* last change time */ unsigned long sem_ctime_high; #else - long sem_otime; /* last semop time */ + __kernel_long_t sem_otime; /* last semop time */ __kernel_ulong_t __unused1; - long sem_ctime; /* last change time */ + __kenrel_long_t sem_ctime; /* last change time */ __kernel_ulong_t __unused2; #endif __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
The time_t definition may differ between user space and kernel space, so replace time_t with an unambiguous 'long' for the mips and sparc.
The same structures also contain 'off_t', which has the same problem, so replace that as well on those two architectures and powerpc.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/mips/include/uapi/asm/stat.h | 16 ++++++++-------- arch/powerpc/include/uapi/asm/stat.h | 2 +- arch/sparc/include/uapi/asm/stat.h | 24 ++++++++++++------------ 3 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/arch/mips/include/uapi/asm/stat.h b/arch/mips/include/uapi/asm/stat.h index 95416f366d7f..3d2a3b71845c 100644 --- a/arch/mips/include/uapi/asm/stat.h +++ b/arch/mips/include/uapi/asm/stat.h @@ -26,17 +26,17 @@ struct stat { gid_t st_gid; unsigned st_rdev; long st_pad2[2]; - off_t st_size; + long st_size; long st_pad3; /* * Actually this should be timestruc_t st_atime, st_mtime and st_ctime * but we don't have it under Linux. */ - time_t st_atime; + long st_atime; long st_atime_nsec; - time_t st_mtime; + long st_mtime; long st_mtime_nsec; - time_t st_ctime; + long st_ctime; long st_ctime_nsec; long st_blksize; long st_blocks; @@ -70,13 +70,13 @@ struct stat64 { * Actually this should be timestruc_t st_atime, st_mtime and st_ctime * but we don't have it under Linux. */ - time_t st_atime; + long st_atime; unsigned long st_atime_nsec; /* Reserved for st_atime expansion */
- time_t st_mtime; + long st_mtime; unsigned long st_mtime_nsec; /* Reserved for st_mtime expansion */
- time_t st_ctime; + long st_ctime; unsigned long st_ctime_nsec; /* Reserved for st_ctime expansion */
unsigned long st_blksize; @@ -105,7 +105,7 @@ struct stat { unsigned int st_rdev; unsigned int st_pad1[3]; /* Reserved for st_rdev expansion */
- off_t st_size; + long st_size;
/* * Actually this should be timestruc_t st_atime, st_mtime and st_ctime diff --git a/arch/powerpc/include/uapi/asm/stat.h b/arch/powerpc/include/uapi/asm/stat.h index afd25f2ff4e8..7871055e5e32 100644 --- a/arch/powerpc/include/uapi/asm/stat.h +++ b/arch/powerpc/include/uapi/asm/stat.h @@ -40,7 +40,7 @@ struct stat { uid_t st_uid; gid_t st_gid; unsigned long st_rdev; - off_t st_size; + long st_size; unsigned long st_blksize; unsigned long st_blocks; unsigned long st_atime; diff --git a/arch/sparc/include/uapi/asm/stat.h b/arch/sparc/include/uapi/asm/stat.h index b6ec4eb217f7..732c41720e24 100644 --- a/arch/sparc/include/uapi/asm/stat.h +++ b/arch/sparc/include/uapi/asm/stat.h @@ -14,12 +14,12 @@ struct stat { uid_t st_uid; gid_t st_gid; unsigned int st_rdev; - off_t st_size; - time_t st_atime; - time_t st_mtime; - time_t st_ctime; - off_t st_blksize; - off_t st_blocks; + long st_size; + long st_atime; + long st_mtime; + long st_ctime; + long st_blksize; + long st_blocks; unsigned long __unused4[2]; };
@@ -57,15 +57,15 @@ struct stat { unsigned short st_uid; unsigned short st_gid; unsigned short st_rdev; - off_t st_size; - time_t st_atime; + long st_size; + long st_atime; unsigned long st_atime_nsec; - time_t st_mtime; + long st_mtime; unsigned long st_mtime_nsec; - time_t st_ctime; + long st_ctime; unsigned long st_ctime_nsec; - off_t st_blksize; - off_t st_blocks; + long st_blksize; + long st_blocks; unsigned long __unused4[2]; };
This is mainly a patch for clarification, and to let us remove the time_t definition from the kernel to prevent new users from creeping in that might not be y2038-safe.
All remaining uses of 'time_t' or '__kernel_time_t' are part of the user API that cannot be changed by that either have a replacement or that do not suffer from the y2038 overflow.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/linux/syscalls.h | 4 ++-- include/linux/time32.h | 2 +- include/linux/types.h | 2 +- include/uapi/linux/cyclades.h | 6 +++--- include/uapi/linux/msg.h | 6 +++--- include/uapi/linux/ppp_defs.h | 4 ++-- include/uapi/linux/sem.h | 4 ++-- include/uapi/linux/shm.h | 6 +++--- include/uapi/linux/time.h | 6 +++--- include/uapi/linux/time_types.h | 4 ++-- include/uapi/linux/utime.h | 4 ++-- kernel/time/time.c | 6 +++--- 12 files changed, 27 insertions(+), 27 deletions(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index f7c561c4dcdd..2f27bc9d5ef0 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1076,7 +1076,7 @@ asmlinkage long sys_fadvise64(int fd, loff_t offset, size_t len, int advice); asmlinkage long sys_alarm(unsigned int seconds); asmlinkage long sys_getpgrp(void); asmlinkage long sys_pause(void); -asmlinkage long sys_time(time_t __user *tloc); +asmlinkage long sys_time(__kernel_old_time_t __user *tloc); asmlinkage long sys_time32(old_time32_t __user *tloc); #ifdef __ARCH_WANT_SYS_UTIME asmlinkage long sys_utime(char __user *filename, @@ -1116,7 +1116,7 @@ asmlinkage long sys_sysfs(int option, asmlinkage long sys_fork(void);
/* obsolete: kernel/time/time.c */ -asmlinkage long sys_stime(time_t __user *tptr); +asmlinkage long sys_stime(__kernel_old_time_t __user *tptr); asmlinkage long sys_stime32(old_time32_t __user *tptr);
/* obsolete: kernel/signal.c */ diff --git a/include/linux/time32.h b/include/linux/time32.h index 0a1f302a1753..cad4c3186002 100644 --- a/include/linux/time32.h +++ b/include/linux/time32.h @@ -12,7 +12,7 @@ #include <linux/time64.h> #include <linux/timex.h>
-#define TIME_T_MAX (time_t)((1UL << ((sizeof(time_t) << 3) - 1)) - 1) +#define TIME_T_MAX (__kernel_old_time_t)((1UL << ((sizeof(__kernel_old_time_t) << 3) - 1)) - 1)
typedef s32 old_time32_t;
diff --git a/include/linux/types.h b/include/linux/types.h index 05030f608be3..e32c1180b742 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -67,7 +67,7 @@ typedef __kernel_ptrdiff_t ptrdiff_t;
#ifndef _TIME_T #define _TIME_T -typedef __kernel_time_t time_t; +typedef __kernel_old_time_t time_t; #endif
#ifndef _CLOCK_T diff --git a/include/uapi/linux/cyclades.h b/include/uapi/linux/cyclades.h index 8279bc3d60ca..fc0add2194a9 100644 --- a/include/uapi/linux/cyclades.h +++ b/include/uapi/linux/cyclades.h @@ -83,9 +83,9 @@ struct cyclades_monitor { * open) */ struct cyclades_idle_stats { - __kernel_time_t in_use; /* Time device has been in use (secs) */ - __kernel_time_t recv_idle; /* Time since last char received (secs) */ - __kernel_time_t xmit_idle; /* Time since last char transmitted (secs) */ + __kernel_old_time_t in_use; /* Time device has been in use (secs) */ + __kernel_old_time_t recv_idle; /* Time since last char received (secs) */ + __kernel_old_time_t xmit_idle; /* Time since last char transmitted (secs) */ unsigned long recv_bytes; /* Bytes received */ unsigned long xmit_bytes; /* Bytes transmitted */ unsigned long overruns; /* Input overruns */ diff --git a/include/uapi/linux/msg.h b/include/uapi/linux/msg.h index e4a0d9a9a9e8..01ee8d54c1c8 100644 --- a/include/uapi/linux/msg.h +++ b/include/uapi/linux/msg.h @@ -19,9 +19,9 @@ struct msqid_ds { struct ipc_perm msg_perm; struct msg *msg_first; /* first message on queue,unused */ struct msg *msg_last; /* last message in queue,unused */ - __kernel_time_t msg_stime; /* last msgsnd time */ - __kernel_time_t msg_rtime; /* last msgrcv time */ - __kernel_time_t msg_ctime; /* last change time */ + __kernel_old_time_t msg_stime; /* last msgsnd time */ + __kernel_old_time_t msg_rtime; /* last msgrcv time */ + __kernel_old_time_t msg_ctime; /* last change time */ unsigned long msg_lcbytes; /* Reuse junk fields for 32 bit */ unsigned long msg_lqbytes; /* ditto */ unsigned short msg_cbytes; /* current number of bytes on queue */ diff --git a/include/uapi/linux/ppp_defs.h b/include/uapi/linux/ppp_defs.h index 0039fa39a358..20286bd90ab5 100644 --- a/include/uapi/linux/ppp_defs.h +++ b/include/uapi/linux/ppp_defs.h @@ -148,8 +148,8 @@ struct ppp_comp_stats { * based on the libc time_t. */ struct ppp_idle { - __kernel_time_t xmit_idle; /* time since last NP packet sent */ - __kernel_time_t recv_idle; /* time since last NP packet received */ + __kernel_old_time_t xmit_idle; /* time since last NP packet sent */ + __kernel_old_time_t recv_idle; /* time since last NP packet received */ };
struct ppp_idle32 { diff --git a/include/uapi/linux/sem.h b/include/uapi/linux/sem.h index 39a1876f039e..75aa3b273cd9 100644 --- a/include/uapi/linux/sem.h +++ b/include/uapi/linux/sem.h @@ -24,8 +24,8 @@ /* Obsolete, used only for backwards compatibility and libc5 compiles */ struct semid_ds { struct ipc_perm sem_perm; /* permissions .. see ipc.h */ - __kernel_time_t sem_otime; /* last semop time */ - __kernel_time_t sem_ctime; /* create/last semctl() time */ + __kernel_old_time_t sem_otime; /* last semop time */ + __kernel_old_time_t sem_ctime; /* create/last semctl() time */ struct sem *sem_base; /* ptr to first semaphore in array */ struct sem_queue *sem_pending; /* pending operations to be processed */ struct sem_queue **sem_pending_last; /* last pending operation */ diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h index 6507ad0afc81..8d1f17a4e08e 100644 --- a/include/uapi/linux/shm.h +++ b/include/uapi/linux/shm.h @@ -28,9 +28,9 @@ struct shmid_ds { struct ipc_perm shm_perm; /* operation perms */ int shm_segsz; /* size of segment (bytes) */ - __kernel_time_t shm_atime; /* last attach time */ - __kernel_time_t shm_dtime; /* last detach time */ - __kernel_time_t shm_ctime; /* last change time */ + __kernel_old_time_t shm_atime; /* last attach time */ + __kernel_old_time_t shm_dtime; /* last detach time */ + __kernel_old_time_t shm_ctime; /* last change time */ __kernel_ipc_pid_t shm_cpid; /* pid of creator */ __kernel_ipc_pid_t shm_lpid; /* pid of last operator */ unsigned short shm_nattch; /* no. of current attaches */ diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h index 958932effc5e..a655aa28dc6e 100644 --- a/include/uapi/linux/time.h +++ b/include/uapi/linux/time.h @@ -8,13 +8,13 @@ #ifndef _STRUCT_TIMESPEC #define _STRUCT_TIMESPEC struct timespec { - __kernel_time_t tv_sec; /* seconds */ - long tv_nsec; /* nanoseconds */ + __kernel_old_time_t tv_sec; /* seconds */ + long tv_nsec; /* nanoseconds */ }; #endif
struct timeval { - __kernel_time_t tv_sec; /* seconds */ + __kernel_old_time_t tv_sec; /* seconds */ __kernel_suseconds_t tv_usec; /* microseconds */ };
diff --git a/include/uapi/linux/time_types.h b/include/uapi/linux/time_types.h index 60b37f29842d..074e391d73a1 100644 --- a/include/uapi/linux/time_types.h +++ b/include/uapi/linux/time_types.h @@ -29,8 +29,8 @@ struct __kernel_old_timeval { #endif
struct __kernel_old_timespec { - __kernel_time_t tv_sec; /* seconds */ - long tv_nsec; /* nanoseconds */ + __kernel_old_time_t tv_sec; /* seconds */ + long tv_nsec; /* nanoseconds */ };
struct __kernel_sock_timeval { diff --git a/include/uapi/linux/utime.h b/include/uapi/linux/utime.h index fd9aa26b6860..bc8f13e81d6e 100644 --- a/include/uapi/linux/utime.h +++ b/include/uapi/linux/utime.h @@ -5,8 +5,8 @@ #include <linux/types.h>
struct utimbuf { - __kernel_time_t actime; - __kernel_time_t modtime; + __kernel_old_time_t actime; + __kernel_old_time_t modtime; };
#endif diff --git a/kernel/time/time.c b/kernel/time/time.c index ddbddf504c23..7eba7c9a7e3e 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -59,9 +59,9 @@ EXPORT_SYMBOL(sys_tz); * why not move it into the appropriate arch directory (for those * architectures that need it). */ -SYSCALL_DEFINE1(time, time_t __user *, tloc) +SYSCALL_DEFINE1(time, __kernel_old_time_t __user *, tloc) { - time_t i = (time_t)ktime_get_real_seconds(); + __kernel_old_time_t i = (__kernel_old_time_t)ktime_get_real_seconds();
if (tloc) { if (put_user(i,tloc)) @@ -78,7 +78,7 @@ SYSCALL_DEFINE1(time, time_t __user *, tloc) * architectures that need it). */
-SYSCALL_DEFINE1(stime, time_t __user *, tptr) +SYSCALL_DEFINE1(stime, __kernel_old_time_t __user *, tptr) { struct timespec64 tv; int err;
Acked-by: Deepa Dinamani deepa.kernel@gmail.com
On Fri, Nov 08, 2019 at 10:12:09PM +0100, Arnd Bergmann wrote:
This is mainly a patch for clarification, and to let us remove the time_t definition from the kernel to prevent new users from creeping in that might not be y2038-safe.
All remaining uses of 'time_t' or '__kernel_time_t' are part of the user API that cannot be changed by that either have a replacement or that do not suffer from the y2038 overflow.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Acked-by: Christian Brauner christian.brauner@ubuntu.com
There are two 'struct timeval' fields in 'struct rusage'.
Unfortunately the definition of timeval is now ambiguous when used in user space with a libc that has a 64-bit time_t, and this also changes the 'rusage' definition in user space in a way that is incompatible with the system call interface.
While there is no good solution to avoid all ambiguity here, change the definition in the kernel headers to be compatible with the kernel ABI, using __kernel_old_timeval as an unambiguous base type.
In previous discussions, there was also a plan to add a replacement for rusage based on 64-bit timestamps and nanosecond resolution, i.e. 'struct __kernel_timespec'. I have patches for that as well, if anyone thinks we should do that.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous? --- arch/alpha/kernel/osf_sys.c | 2 +- include/uapi/linux/resource.h | 4 ++-- kernel/sys.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c index bf497b8b0ec6..bbe7a0da6264 100644 --- a/arch/alpha/kernel/osf_sys.c +++ b/arch/alpha/kernel/osf_sys.c @@ -963,7 +963,7 @@ put_tv32(struct timeval32 __user *o, struct timespec64 *i) }
static inline long -put_tv_to_tv32(struct timeval32 __user *o, struct timeval *i) +put_tv_to_tv32(struct timeval32 __user *o, struct __kernel_old_timeval *i) { return copy_to_user(o, &(struct timeval32){ .tv_sec = i->tv_sec, diff --git a/include/uapi/linux/resource.h b/include/uapi/linux/resource.h index cc00fd079631..74ef57b38f9f 100644 --- a/include/uapi/linux/resource.h +++ b/include/uapi/linux/resource.h @@ -22,8 +22,8 @@ #define RUSAGE_THREAD 1 /* only the calling thread */
struct rusage { - struct timeval ru_utime; /* user time used */ - struct timeval ru_stime; /* system time used */ + struct __kernel_old_timeval ru_utime; /* user time used */ + struct __kernel_old_timeval ru_stime; /* system time used */ __kernel_long_t ru_maxrss; /* maximum resident set size */ __kernel_long_t ru_ixrss; /* integral shared memory size */ __kernel_long_t ru_idrss; /* integral unshared data size */ diff --git a/kernel/sys.c b/kernel/sys.c index a611d1d58c7d..d3aef31e24dc 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -1763,8 +1763,8 @@ void getrusage(struct task_struct *p, int who, struct rusage *r) unlock_task_sighand(p, &flags);
out: - r->ru_utime = ns_to_timeval(utime); - r->ru_stime = ns_to_timeval(stime); + r->ru_utime = ns_to_kernel_old_timeval(utime); + r->ru_stime = ns_to_kernel_old_timeval(stime);
if (who != RUSAGE_CHILDREN) { struct mm_struct *mm = get_task_mm(p);
On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
There are two 'struct timeval' fields in 'struct rusage'.
Unfortunately the definition of timeval is now ambiguous when used in user space with a libc that has a 64-bit time_t, and this also changes the 'rusage' definition in user space in a way that is incompatible with the system call interface.
While there is no good solution to avoid all ambiguity here, change the definition in the kernel headers to be compatible with the kernel ABI, using __kernel_old_timeval as an unambiguous base type.
In previous discussions, there was also a plan to add a replacement for rusage based on 64-bit timestamps and nanosecond resolution, i.e. 'struct __kernel_timespec'. I have patches for that as well, if anyone thinks we should do that.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous?
The patch looks ok to me. I must confess I looked into rusage long ago so __kernel_timespec type used in uapi made me nervious at first, but then i found that we've this type defined in time_types.h uapi so userspace should be safe. I also like the idea of __kernel_rusage but definitely on top of the series.
Reviewed-by: Cyrill Gorcunov gorcunov@gmail.com
On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov gorcunov@gmail.com wrote:
On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous?
The patch looks ok to me. I must confess I looked into rusage long ago so __kernel_timespec type used in uapi made me nervious at first, but then i found that we've this type defined in time_types.h uapi so userspace should be safe. I also like the idea of __kernel_rusage but definitely on top of the series.
There are clearly too many time types at the moment, but I'm in the process of throwing out the ones we no longer need now.
I do have a number patches implementing other variants for the syscall, and I suppose that if we end up adding __kernel_rusage, that would have to go with a set of syscalls using 64-bit seconds/nanoseconds rather than the old 32/64 microseconds. I don't know what other changes remain that anyone would want from sys_waitid() now that it does support pidfd.
If there is still a need for a new waitid() replacement, that should take that new __kernel_rusage I think, but until then I hope we are fine with today's getrusage+waitid based on the current struct rusage.
BSD has wait6() to return separate rusage structures for 'self' and 'children', but I could not find any application (using the freebsd sources and debian code search) that actually uses that information, so there might not be any demand for that.
Reviewed-by: Cyrill Gorcunov gorcunov@gmail.com
Thanks,
Arnd
On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote: ...
There are clearly too many time types at the moment, but I'm in the process of throwing out the ones we no longer need now.
Cool!
I do have a number patches implementing other variants for the syscall, and I suppose that if we end up adding __kernel_rusage, that would have to go with a set of syscalls using 64-bit seconds/nanoseconds rather than the old 32/64 microseconds. I don't know what other changes remain that anyone would want from sys_waitid() now that it does support pidfd.
If there is still a need for a new waitid() replacement, that should take that new __kernel_rusage I think, but until then I hope we are fine with today's getrusage+waitid based on the current struct rusage.
Definitely.
BSD has wait6() to return separate rusage structures for 'self' and 'children', but I could not find any application (using the freebsd sources and debian code search) that actually uses that information, so there might not be any demand for that.
Thanks for detailed info Arnd!
On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote:
On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov gorcunov@gmail.com wrote:
On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous?
The patch looks ok to me. I must confess I looked into rusage long ago so __kernel_timespec type used in uapi made me nervious at first, but then i found that we've this type defined in time_types.h uapi so userspace should be safe. I also like the idea of __kernel_rusage but definitely on top of the series.
There are clearly too many time types at the moment, but I'm in the process of throwing out the ones we no longer need now.
I do have a number patches implementing other variants for the syscall, and I suppose that if we end up adding __kernel_rusage, that would have to go with a set of syscalls using 64-bit seconds/nanoseconds rather than the old 32/64 microseconds. I don't know what other changes remain that anyone would want from sys_waitid() now that it does support pidfd.
If there is still a need for a new waitid() replacement, that should take that new __kernel_rusage I think, but until then I hope we are fine with today's getrusage+waitid based on the current struct rusage.
Note, that glibc does _not_ expose the rusage argument, i.e. most of userspace is unaware that waitid() does allow you to get rusage information. So users first need to know that waitid() has an rusage argument and then need to call the waitid() syscall directly.
BSD has wait6() to return separate rusage structures for 'self' and 'children', but I could not find any application (using the freebsd sources and debian code search) that actually uses that information, so there might not be any demand for that.
Speaking specifically for Linux now, I think that rusage does not actually expose the information most relevant users are interested in. On Linux nowadays it is _way_ more interesting to retrieve stats relative to the cgroup the task lived in etc. Doing a git grep -i rusage in the systemd source code shows that rusage is used _nowhere_. And I consider an init system to be the most likely candidate to be interested in rusage.
Christian
On Thu, Nov 14, 2019 at 1:38 AM Christian Brauner christian.brauner@ubuntu.com wrote:
On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote:
On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov gorcunov@gmail.com wrote:
On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous?
The patch looks ok to me. I must confess I looked into rusage long ago so __kernel_timespec type used in uapi made me nervious at first, but then i found that we've this type defined in time_types.h uapi so userspace should be safe. I also like the idea of __kernel_rusage but definitely on top of the series.
There are clearly too many time types at the moment, but I'm in the process of throwing out the ones we no longer need now.
I do have a number patches implementing other variants for the syscall, and I suppose that if we end up adding __kernel_rusage, that would have to go with a set of syscalls using 64-bit seconds/nanoseconds rather than the old 32/64 microseconds. I don't know what other changes remain that anyone would want from sys_waitid() now that it does support pidfd.
If there is still a need for a new waitid() replacement, that should take that new __kernel_rusage I think, but until then I hope we are fine with today's getrusage+waitid based on the current struct rusage.
Note, that glibc does _not_ expose the rusage argument, i.e. most of userspace is unaware that waitid() does allow you to get rusage information. So users first need to know that waitid() has an rusage argument and then need to call the waitid() syscall directly.
On architectures that don't have a wait4 syscall (riscv32 for now), glibc uses waitid to implement wait4 and wait3.
BSD has wait6() to return separate rusage structures for 'self' and 'children', but I could not find any application (using the freebsd sources and debian code search) that actually uses that information, so there might not be any demand for that.
Speaking specifically for Linux now, I think that rusage does not actually expose the information most relevant users are interested in. On Linux nowadays it is _way_ more interesting to retrieve stats relative to the cgroup the task lived in etc. Doing a git grep -i rusage in the systemd source code shows that rusage is used _nowhere_. And I consider an init system to be the most likely candidate to be interested in rusage.
I looked at a couple of implementations of time(1), this is one example that sometimes uses wait3(), though other implementations just call getrusage() in the parent process before the fork/exec. None of them actually seem to report better than millisecond resolution, so there is not a strict reason to do a timespec replacement for these.
Arnd
[+Cc Florian, libc-alpha]
On Thu, Nov 14, 2019 at 11:18:15AM +0100, Arnd Bergmann wrote:
On Thu, Nov 14, 2019 at 1:38 AM Christian Brauner christian.brauner@ubuntu.com wrote:
On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote:
On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov gorcunov@gmail.com wrote:
On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
Question: should we also rename 'struct rusage' into 'struct __kernel_rusage' here, to make them completely unambiguous?
The patch looks ok to me. I must confess I looked into rusage long ago so __kernel_timespec type used in uapi made me nervious at first, but then i found that we've this type defined in time_types.h uapi so userspace should be safe. I also like the idea of __kernel_rusage but definitely on top of the series.
There are clearly too many time types at the moment, but I'm in the process of throwing out the ones we no longer need now.
I do have a number patches implementing other variants for the syscall, and I suppose that if we end up adding __kernel_rusage, that would have to go with a set of syscalls using 64-bit seconds/nanoseconds rather than the old 32/64 microseconds. I don't know what other changes remain that anyone would want from sys_waitid() now that it does support pidfd.
If there is still a need for a new waitid() replacement, that should take that new __kernel_rusage I think, but until then I hope we are fine with today's getrusage+waitid based on the current struct rusage.
Note, that glibc does _not_ expose the rusage argument, i.e. most of userspace is unaware that waitid() does allow you to get rusage information. So users first need to know that waitid() has an rusage argument and then need to call the waitid() syscall directly.
On architectures that don't have a wait4 syscall (riscv32 for now), glibc uses waitid to implement wait4 and wait3.
Yes, and there's an ongoing discussion to implement wait4() on all arches through waitid(), I think. I haven't followed it too closely.
BSD has wait6() to return separate rusage structures for 'self' and 'children', but I could not find any application (using the freebsd sources and debian code search) that actually uses that information, so there might not be any demand for that.
Speaking specifically for Linux now, I think that rusage does not actually expose the information most relevant users are interested in. On Linux nowadays it is _way_ more interesting to retrieve stats relative to the cgroup the task lived in etc. Doing a git grep -i rusage in the systemd source code shows that rusage is used _nowhere_. And I consider an init system to be the most likely candidate to be interested in rusage.
I looked at a couple of implementations of time(1), this is one example that sometimes uses wait3(), though other implementations just call getrusage() in the parent process before the fork/exec. None of them actually seem to report better than millisecond resolution, so there is not a strict reason to do a timespec replacement for these.
Right, though I have to say that for the sake of consistency I'd much rather have a replacement. We're doing all this work right now so we might as well. But I get the point.
Christian
All of the remaining syscalls that pass a timeval (gettimeofday, utime, futimesat) can trivially be changed to pass a __kernel_old_timeval instead, which has a compatible layout, but avoids ambiguity with the timeval type in user space.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/powerpc/include/asm/asm-prototypes.h | 3 ++- arch/powerpc/kernel/syscalls.c | 4 ++-- fs/select.c | 10 +++++----- fs/utimes.c | 8 ++++---- include/linux/syscalls.h | 10 +++++----- kernel/power/power.h | 2 +- kernel/time/time.c | 2 +- 7 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 8561498e653c..2c25dc079cb9 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -92,7 +92,8 @@ long sys_swapcontext(struct ucontext __user *old_ctx, long sys_debug_setcontext(struct ucontext __user *ctx, int ndbg, struct sig_dbg_op __user *dbg); int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct timeval __user *tvp); +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, + struct __kernel_old_timeval __user *tvp); unsigned long __init early_init(unsigned long dt_ptr); void __init machine_init(u64 dt_ptr); #endif diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index 3bfb3888e897..078608ec2e92 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -79,7 +79,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, * sys_select() with the appropriate args. -- Cort */ int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct timeval __user *tvp) +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp) { if ( (unsigned long)n >= 4096 ) { @@ -89,7 +89,7 @@ ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, s || __get_user(inp, ((fd_set __user * __user *)(buffer+1))) || __get_user(outp, ((fd_set __user * __user *)(buffer+2))) || __get_user(exp, ((fd_set __user * __user *)(buffer+3))) - || __get_user(tvp, ((struct timeval __user * __user *)(buffer+4)))) + || __get_user(tvp, ((struct __kernel_old_timeval __user * __user *)(buffer+4)))) return -EFAULT; } return sys_select(n, inp, outp, exp, tvp); diff --git a/fs/select.c b/fs/select.c index 53a0c149f528..11d0285d46b7 100644 --- a/fs/select.c +++ b/fs/select.c @@ -321,7 +321,7 @@ static int poll_select_finish(struct timespec64 *end_time, switch (pt_type) { case PT_TIMEVAL: { - struct timeval rtv; + struct __kernel_old_timeval rtv;
if (sizeof(rtv) > sizeof(rtv.tv_sec) + sizeof(rtv.tv_usec)) memset(&rtv, 0, sizeof(rtv)); @@ -698,10 +698,10 @@ int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp, }
static int kern_select(int n, fd_set __user *inp, fd_set __user *outp, - fd_set __user *exp, struct timeval __user *tvp) + fd_set __user *exp, struct __kernel_old_timeval __user *tvp) { struct timespec64 end_time, *to = NULL; - struct timeval tv; + struct __kernel_old_timeval tv; int ret;
if (tvp) { @@ -720,7 +720,7 @@ static int kern_select(int n, fd_set __user *inp, fd_set __user *outp, }
SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp, - fd_set __user *, exp, struct timeval __user *, tvp) + fd_set __user *, exp, struct __kernel_old_timeval __user *, tvp) { return kern_select(n, inp, outp, exp, tvp); } @@ -810,7 +810,7 @@ SYSCALL_DEFINE6(pselect6_time32, int, n, fd_set __user *, inp, fd_set __user *, struct sel_arg_struct { unsigned long n; fd_set __user *inp, *outp, *exp; - struct timeval __user *tvp; + struct __kernel_old_timeval __user *tvp; };
SYSCALL_DEFINE1(old_select, struct sel_arg_struct __user *, arg) diff --git a/fs/utimes.c b/fs/utimes.c index 1ba3f7883870..c952b6b3d8a0 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -161,9 +161,9 @@ SYSCALL_DEFINE4(utimensat, int, dfd, const char __user *, filename, * utimensat() instead. */ static long do_futimesat(int dfd, const char __user *filename, - struct timeval __user *utimes) + struct __kernel_old_timeval __user *utimes) { - struct timeval times[2]; + struct __kernel_old_timeval times[2]; struct timespec64 tstimes[2];
if (utimes) { @@ -190,13 +190,13 @@ static long do_futimesat(int dfd, const char __user *filename,
SYSCALL_DEFINE3(futimesat, int, dfd, const char __user *, filename, - struct timeval __user *, utimes) + struct __kernel_old_timeval __user *, utimes) { return do_futimesat(dfd, filename, utimes); }
SYSCALL_DEFINE2(utimes, char __user *, filename, - struct timeval __user *, utimes) + struct __kernel_old_timeval __user *, utimes) { return do_futimesat(AT_FDCWD, filename, utimes); } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 2f27bc9d5ef0..e665920fa359 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -51,7 +51,7 @@ struct statx; struct __sysctl_args; struct sysinfo; struct timespec; -struct timeval; +struct __kernel_old_timeval; struct __kernel_timex; struct timezone; struct tms; @@ -732,7 +732,7 @@ asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3, asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct getcpu_cache __user *cache);
/* kernel/time.c */ -asmlinkage long sys_gettimeofday(struct timeval __user *tv, +asmlinkage long sys_gettimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); asmlinkage long sys_settimeofday(struct timeval __user *tv, struct timezone __user *tz); @@ -1082,9 +1082,9 @@ asmlinkage long sys_time32(old_time32_t __user *tloc); asmlinkage long sys_utime(char __user *filename, struct utimbuf __user *times); asmlinkage long sys_utimes(char __user *filename, - struct timeval __user *utimes); + struct __kernel_old_timeval __user *utimes); asmlinkage long sys_futimesat(int dfd, const char __user *filename, - struct timeval __user *utimes); + struct __kernel_old_timeval __user *utimes); #endif asmlinkage long sys_futimesat_time32(unsigned int dfd, const char __user *filename, @@ -1098,7 +1098,7 @@ asmlinkage long sys_getdents(unsigned int fd, struct linux_dirent __user *dirent, unsigned int count); asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp, - fd_set __user *exp, struct timeval __user *tvp); + fd_set __user *exp, struct __kernel_old_timeval __user *tvp); asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, int timeout); asmlinkage long sys_epoll_wait(int epfd, struct epoll_event __user *events, diff --git a/kernel/power/power.h b/kernel/power/power.h index 44bee462ff57..7cdc64dc2373 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -179,7 +179,7 @@ extern void swsusp_close(fmode_t); extern int swsusp_unmark(void); #endif
-struct timeval; +struct __kernel_old_timeval; /* kernel/power/swsusp.c */ extern void swsusp_show_speed(ktime_t, ktime_t, unsigned int, char *);
diff --git a/kernel/time/time.c b/kernel/time/time.c index 7eba7c9a7e3e..bc114f0be8f1 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -137,7 +137,7 @@ SYSCALL_DEFINE1(stime32, old_time32_t __user *, tptr) #endif /* __ARCH_WANT_SYS_TIME32 */ #endif
-SYSCALL_DEFINE2(gettimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(gettimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { if (likely(tv != NULL)) {
On Fri, Nov 08, 2019 at 10:12:11PM +0100, Arnd Bergmann wrote:
All of the remaining syscalls that pass a timeval (gettimeofday, utime, futimesat) can trivially be changed to pass a __kernel_old_timeval instead, which has a compatible layout, but avoids ambiguity with the timeval type in user space.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Seems reasonable. Acked-by: Christian Brauner christian.brauner@ubuntu.com
On Friday, November 8, 2019 10:12:11 PM CET Arnd Bergmann wrote:
All of the remaining syscalls that pass a timeval (gettimeofday, utime, futimesat) can trivially be changed to pass a __kernel_old_timeval instead, which has a compatible layout, but avoids ambiguity with the timeval type in user space.
Signed-off-by: Arnd Bergmann arnd@arndb.de
For the change in power/power.h
Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
arch/powerpc/include/asm/asm-prototypes.h | 3 ++- arch/powerpc/kernel/syscalls.c | 4 ++-- fs/select.c | 10 +++++----- fs/utimes.c | 8 ++++---- include/linux/syscalls.h | 10 +++++----- kernel/power/power.h | 2 +- kernel/time/time.c | 2 +- 7 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 8561498e653c..2c25dc079cb9 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -92,7 +92,8 @@ long sys_swapcontext(struct ucontext __user *old_ctx, long sys_debug_setcontext(struct ucontext __user *ctx, int ndbg, struct sig_dbg_op __user *dbg); int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct timeval __user *tvp); +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp,
struct __kernel_old_timeval __user *tvp);
unsigned long __init early_init(unsigned long dt_ptr); void __init machine_init(u64 dt_ptr); #endif diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index 3bfb3888e897..078608ec2e92 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -79,7 +79,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
- sys_select() with the appropriate args. -- Cort
*/ int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct timeval __user *tvp) +ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp) { if ( (unsigned long)n >= 4096 ) { @@ -89,7 +89,7 @@ ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, s || __get_user(inp, ((fd_set __user * __user *)(buffer+1))) || __get_user(outp, ((fd_set __user * __user *)(buffer+2))) || __get_user(exp, ((fd_set __user * __user *)(buffer+3)))
|| __get_user(tvp, ((struct timeval __user * __user *)(buffer+4))))
} return sys_select(n, inp, outp, exp, tvp);|| __get_user(tvp, ((struct __kernel_old_timeval __user * __user *)(buffer+4)))) return -EFAULT;
diff --git a/fs/select.c b/fs/select.c index 53a0c149f528..11d0285d46b7 100644 --- a/fs/select.c +++ b/fs/select.c @@ -321,7 +321,7 @@ static int poll_select_finish(struct timespec64 *end_time, switch (pt_type) { case PT_TIMEVAL: {
struct timeval rtv;
struct __kernel_old_timeval rtv;
if (sizeof(rtv) > sizeof(rtv.tv_sec) + sizeof(rtv.tv_usec)) memset(&rtv, 0, sizeof(rtv)); @@ -698,10 +698,10 @@ int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp, } static int kern_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timeval __user *tvp)
fd_set __user *exp, struct __kernel_old_timeval __user *tvp)
{ struct timespec64 end_time, *to = NULL;
- struct timeval tv;
- struct __kernel_old_timeval tv; int ret;
if (tvp) { @@ -720,7 +720,7 @@ static int kern_select(int n, fd_set __user *inp, fd_set __user *outp, } SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
fd_set __user *, exp, struct timeval __user *, tvp)
fd_set __user *, exp, struct __kernel_old_timeval __user *, tvp)
{ return kern_select(n, inp, outp, exp, tvp); } @@ -810,7 +810,7 @@ SYSCALL_DEFINE6(pselect6_time32, int, n, fd_set __user *, inp, fd_set __user *, struct sel_arg_struct { unsigned long n; fd_set __user *inp, *outp, *exp;
- struct timeval __user *tvp;
- struct __kernel_old_timeval __user *tvp;
}; SYSCALL_DEFINE1(old_select, struct sel_arg_struct __user *, arg) diff --git a/fs/utimes.c b/fs/utimes.c index 1ba3f7883870..c952b6b3d8a0 100644 --- a/fs/utimes.c +++ b/fs/utimes.c @@ -161,9 +161,9 @@ SYSCALL_DEFINE4(utimensat, int, dfd, const char __user *, filename,
- utimensat() instead.
*/ static long do_futimesat(int dfd, const char __user *filename,
struct timeval __user *utimes)
struct __kernel_old_timeval __user *utimes)
{
- struct timeval times[2];
- struct __kernel_old_timeval times[2]; struct timespec64 tstimes[2];
if (utimes) { @@ -190,13 +190,13 @@ static long do_futimesat(int dfd, const char __user *filename, SYSCALL_DEFINE3(futimesat, int, dfd, const char __user *, filename,
struct timeval __user *, utimes)
struct __kernel_old_timeval __user *, utimes)
{ return do_futimesat(dfd, filename, utimes); } SYSCALL_DEFINE2(utimes, char __user *, filename,
struct timeval __user *, utimes)
struct __kernel_old_timeval __user *, utimes)
{ return do_futimesat(AT_FDCWD, filename, utimes); } diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 2f27bc9d5ef0..e665920fa359 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -51,7 +51,7 @@ struct statx; struct __sysctl_args; struct sysinfo; struct timespec; -struct timeval; +struct __kernel_old_timeval; struct __kernel_timex; struct timezone; struct tms; @@ -732,7 +732,7 @@ asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3, asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct getcpu_cache __user *cache); /* kernel/time.c */ -asmlinkage long sys_gettimeofday(struct timeval __user *tv, +asmlinkage long sys_gettimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); asmlinkage long sys_settimeofday(struct timeval __user *tv, struct timezone __user *tz); @@ -1082,9 +1082,9 @@ asmlinkage long sys_time32(old_time32_t __user *tloc); asmlinkage long sys_utime(char __user *filename, struct utimbuf __user *times); asmlinkage long sys_utimes(char __user *filename,
struct timeval __user *utimes);
struct __kernel_old_timeval __user *utimes);
asmlinkage long sys_futimesat(int dfd, const char __user *filename,
struct timeval __user *utimes);
struct __kernel_old_timeval __user *utimes);
#endif asmlinkage long sys_futimesat_time32(unsigned int dfd, const char __user *filename, @@ -1098,7 +1098,7 @@ asmlinkage long sys_getdents(unsigned int fd, struct linux_dirent __user *dirent, unsigned int count); asmlinkage long sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timeval __user *tvp);
fd_set __user *exp, struct __kernel_old_timeval __user *tvp);
asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, int timeout); asmlinkage long sys_epoll_wait(int epfd, struct epoll_event __user *events, diff --git a/kernel/power/power.h b/kernel/power/power.h index 44bee462ff57..7cdc64dc2373 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -179,7 +179,7 @@ extern void swsusp_close(fmode_t); extern int swsusp_unmark(void); #endif -struct timeval; +struct __kernel_old_timeval; /* kernel/power/swsusp.c */ extern void swsusp_show_speed(ktime_t, ktime_t, unsigned int, char *); diff --git a/kernel/time/time.c b/kernel/time/time.c index 7eba7c9a7e3e..bc114f0be8f1 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -137,7 +137,7 @@ SYSCALL_DEFINE1(stime32, old_time32_t __user *, tptr) #endif /* __ARCH_WANT_SYS_TIME32 */ #endif -SYSCALL_DEFINE2(gettimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(gettimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { if (likely(tv != NULL)) {
In order to remove the 'struct timespec' definition and the timespec64_to_timespec() helper function, change over the in-kernel definition of 'struct scm_timestamping' to use the __kernel_old_timespec replacement and open-code the assignment.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/uapi/linux/errqueue.h | 7 +++++++ net/core/scm.c | 6 ++++-- 2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/errqueue.h b/include/uapi/linux/errqueue.h index 28491dac074b..0cca19670fd2 100644 --- a/include/uapi/linux/errqueue.h +++ b/include/uapi/linux/errqueue.h @@ -37,9 +37,16 @@ struct sock_extended_err { * The timestamping interfaces SO_TIMESTAMPING, MSG_TSTAMP_* * communicate network timestamps by passing this struct in a cmsg with * recvmsg(). See Documentation/networking/timestamping.txt for details. + * User space sees a timespec definition that matches either + * __kernel_timespec or __kernel_old_timespec, in the kernel we + * require two structure definitions to provide both. */ struct scm_timestamping { +#ifdef __KERNEL__ + struct __kernel_old_timespec ts[3]; +#else struct timespec ts[3]; +#endif };
struct scm_timestamping64 { diff --git a/net/core/scm.c b/net/core/scm.c index 31a38239c92f..dc6fed1f221c 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -268,8 +268,10 @@ void put_cmsg_scm_timestamping(struct msghdr *msg, struct scm_timestamping_inter struct scm_timestamping tss; int i;
- for (i = 0; i < ARRAY_SIZE(tss.ts); i++) - tss.ts[i] = timespec64_to_timespec(tss_internal->ts[i]); + for (i = 0; i < ARRAY_SIZE(tss.ts); i++) { + tss.ts[i].tv_sec = tss_internal->ts[i].tv_sec; + tss.ts[i].tv_nsec = tss_internal->ts[i].tv_nsec; + }
put_cmsg(msg, SOL_SOCKET, SO_TIMESTAMPING_OLD, sizeof(tss), &tss); }
Acked-by: Deepa Dinamani deepa.kernel@gmail.com
On Fri, Nov 8, 2019 at 10:15 PM Arnd Bergmann arnd@arndb.de wrote:
diff --git a/include/uapi/linux/errqueue.h b/include/uapi/linux/errqueue.h index 28491dac074b..0cca19670fd2 100644 --- a/include/uapi/linux/errqueue.h +++ b/include/uapi/linux/errqueue.h @@ -37,9 +37,16 @@ struct sock_extended_err {
The timestamping interfaces SO_TIMESTAMPING, MSG_TSTAMP_*
communicate network timestamps by passing this struct in a cmsg with
recvmsg(). See Documentation/networking/timestamping.txt for details.
User space sees a timespec definition that matches either
__kernel_timespec or __kernel_old_timespec, in the kernel we
*/
require two structure definitions to provide both.
struct scm_timestamping { +#ifdef __KERNEL__
struct __kernel_old_timespec ts[3];
+#else struct timespec ts[3]; +#endif };
The kbuild bot pointed out that the way I sent this series, the use of __kernel_old_timespec causes a build failure, because I introduce this in a separate submission. I'm moving this patch over to the other series, and changing the subject to
y2038: socket: use __kernel_old_timespec instead of timespec
With the expectation of merging it along with the other core patches.
Arnd
This gets us one step closer to removing 'struct timeval' from the kernel. We still keep __kernel_old_timeval for interfaces that we cannot fix otherwise, and ns_to_compat_timeval() is provably safe for interfaces that are legitimate users of __kernel_old_timeval on native kernels, so this is an obvious change.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/linux/compat.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h index 16dafd9f4b86..3735a22bfbc0 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -937,10 +937,10 @@ static inline bool in_compat_syscall(void) { return is_compat_task(); } */ static inline struct old_timeval32 ns_to_old_timeval32(s64 nsec) { - struct timeval tv; + struct __kernel_old_timeval tv; struct old_timeval32 ctv;
- tv = ns_to_timeval(nsec); + tv = ns_to_kernel_old_timeval(nsec); ctv.tv_sec = tv.tv_sec; ctv.tv_usec = tv.tv_usec;
We store elapsed time for a crashed process in struct elf_prstatus using 'timeval' structures. Once glibc starts using 64-bit time_t, this becomes incompatible with the kernel's idea of timeval since the structure layout no longer matches on 32-bit architectures.
This changes the definition of the elf_prstatus structure to use __kernel_old_timeval instead, which is hardcoded to the currently used binary layout. There is no risk of overflow in y2038 though, because the time values are all relative times, and can store up to 68 years of process elapsed time.
There is a risk of applications breaking at build time when they use the new kernel headers and expect the type to be exactly 'timeval' rather than a structure that has the same fields as before. Those applications have to be modified to deal with 64-bit time_t anyway.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/mips/kernel/binfmt_elfn32.c | 4 ++-- arch/mips/kernel/binfmt_elfo32.c | 4 ++-- fs/binfmt_elf.c | 12 ++++++------ fs/binfmt_elf_fdpic.c | 12 ++++++------ fs/compat_binfmt_elf.c | 4 ++-- include/uapi/linux/elfcore.h | 8 ++++---- 6 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/arch/mips/kernel/binfmt_elfn32.c b/arch/mips/kernel/binfmt_elfn32.c index 7a12763d553a..6ee3f7218c67 100644 --- a/arch/mips/kernel/binfmt_elfn32.c +++ b/arch/mips/kernel/binfmt_elfn32.c @@ -100,7 +100,7 @@ jiffies_to_old_timeval32(unsigned long jiffies, struct old_timeval32 *value) #undef TASK_SIZE #define TASK_SIZE TASK_SIZE32
-#undef ns_to_timeval -#define ns_to_timeval ns_to_old_timeval32 +#undef ns_to_kernel_old_timeval +#define ns_to_kernel_old_timeval ns_to_old_timeval32
#include "../../../fs/binfmt_elf.c" diff --git a/arch/mips/kernel/binfmt_elfo32.c b/arch/mips/kernel/binfmt_elfo32.c index e6db06a1d31a..6dd103d3cebb 100644 --- a/arch/mips/kernel/binfmt_elfo32.c +++ b/arch/mips/kernel/binfmt_elfo32.c @@ -103,7 +103,7 @@ jiffies_to_old_timeval32(unsigned long jiffies, struct old_timeval32 *value) #undef TASK_SIZE #define TASK_SIZE TASK_SIZE32
-#undef ns_to_timeval -#define ns_to_timeval ns_to_old_timeval32 +#undef ns_to_kernel_old_timeval +#define ns_to_kernel_old_timeval ns_to_old_timeval32
#include "../../../fs/binfmt_elf.c" diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index c5642bcb6b46..5372eabd276a 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1489,18 +1489,18 @@ static void fill_prstatus(struct elf_prstatus *prstatus, * group-wide total, not its individual thread total. */ thread_group_cputime(p, &cputime); - prstatus->pr_utime = ns_to_timeval(cputime.utime); - prstatus->pr_stime = ns_to_timeval(cputime.stime); + prstatus->pr_utime = ns_to_kernel_old_timeval(cputime.utime); + prstatus->pr_stime = ns_to_kernel_old_timeval(cputime.stime); } else { u64 utime, stime;
task_cputime(p, &utime, &stime); - prstatus->pr_utime = ns_to_timeval(utime); - prstatus->pr_stime = ns_to_timeval(stime); + prstatus->pr_utime = ns_to_kernel_old_timeval(utime); + prstatus->pr_stime = ns_to_kernel_old_timeval(stime); }
- prstatus->pr_cutime = ns_to_timeval(p->signal->cutime); - prstatus->pr_cstime = ns_to_timeval(p->signal->cstime); + prstatus->pr_cutime = ns_to_kernel_old_timeval(p->signal->cutime); + prstatus->pr_cstime = ns_to_kernel_old_timeval(p->signal->cstime); }
static int fill_psinfo(struct elf_prpsinfo *psinfo, struct task_struct *p, diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c index d86ebd0dcc3d..240f66663543 100644 --- a/fs/binfmt_elf_fdpic.c +++ b/fs/binfmt_elf_fdpic.c @@ -1359,17 +1359,17 @@ static void fill_prstatus(struct elf_prstatus *prstatus, * group-wide total, not its individual thread total. */ thread_group_cputime(p, &cputime); - prstatus->pr_utime = ns_to_timeval(cputime.utime); - prstatus->pr_stime = ns_to_timeval(cputime.stime); + prstatus->pr_utime = ns_to_kernel_old_timeval(cputime.utime); + prstatus->pr_stime = ns_to_kernel_old_timeval(cputime.stime); } else { u64 utime, stime;
task_cputime(p, &utime, &stime); - prstatus->pr_utime = ns_to_timeval(utime); - prstatus->pr_stime = ns_to_timeval(stime); + prstatus->pr_utime = ns_to_kernel_old_timeval(utime); + prstatus->pr_stime = ns_to_kernel_old_timeval(stime); } - prstatus->pr_cutime = ns_to_timeval(p->signal->cutime); - prstatus->pr_cstime = ns_to_timeval(p->signal->cstime); + prstatus->pr_cutime = ns_to_kernel_old_timeval(p->signal->cutime); + prstatus->pr_cstime = ns_to_kernel_old_timeval(p->signal->cstime);
prstatus->pr_exec_fdpic_loadmap = p->mm->context.exec_fdpic_loadmap; prstatus->pr_interp_fdpic_loadmap = p->mm->context.interp_fdpic_loadmap; diff --git a/fs/compat_binfmt_elf.c b/fs/compat_binfmt_elf.c index b7f9ffa1d5f1..aaad4ca1217e 100644 --- a/fs/compat_binfmt_elf.c +++ b/fs/compat_binfmt_elf.c @@ -48,8 +48,8 @@ #define elf_prstatus compat_elf_prstatus #define elf_prpsinfo compat_elf_prpsinfo
-#undef ns_to_timeval -#define ns_to_timeval ns_to_old_timeval32 +#undef ns_to_kernel_old_timeval +#define ns_to_kernel_old_timeval ns_to_old_timeval32
/* * To use this file, asm/elf.h must define compat_elf_check_arch. diff --git a/include/uapi/linux/elfcore.h b/include/uapi/linux/elfcore.h index 0b2c9e16e345..baf03562306d 100644 --- a/include/uapi/linux/elfcore.h +++ b/include/uapi/linux/elfcore.h @@ -53,10 +53,10 @@ struct elf_prstatus pid_t pr_ppid; pid_t pr_pgrp; pid_t pr_sid; - struct timeval pr_utime; /* User time */ - struct timeval pr_stime; /* System time */ - struct timeval pr_cutime; /* Cumulative user time */ - struct timeval pr_cstime; /* Cumulative system time */ + struct __kernel_old_timeval pr_utime; /* User time */ + struct __kernel_old_timeval pr_stime; /* System time */ + struct __kernel_old_timeval pr_cutime; /* Cumulative user time */ + struct __kernel_old_timeval pr_cstime; /* Cumulative system time */ #if 0 long pr_instr; /* Current instruction */ #endif
timerfd_show() uses a 'struct itimerspec' internally, but that is deprecated because of the time_t overflow and a conflict with the glibc type of the same name that is now incompatible in user space.
Use a pair of timespec64 variables instead as a simple replacement.
As this removes the last use of itimerspec from the kernel, allowing the removal of the definition from the uapi headers along with timespec and timeval later.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- fs/timerfd.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/timerfd.c b/fs/timerfd.c index 48305ba41e3c..ac7f59a58f94 100644 --- a/fs/timerfd.c +++ b/fs/timerfd.c @@ -302,11 +302,11 @@ static ssize_t timerfd_read(struct file *file, char __user *buf, size_t count, static void timerfd_show(struct seq_file *m, struct file *file) { struct timerfd_ctx *ctx = file->private_data; - struct itimerspec t; + struct timespec64 value, interval;
spin_lock_irq(&ctx->wqh.lock); - t.it_value = ktime_to_timespec(timerfd_get_remaining(ctx)); - t.it_interval = ktime_to_timespec(ctx->tintv); + value = ktime_to_timespec64(timerfd_get_remaining(ctx)); + interval = ktime_to_timespec64(ctx->tintv); spin_unlock_irq(&ctx->wqh.lock);
seq_printf(m, @@ -318,10 +318,10 @@ static void timerfd_show(struct seq_file *m, struct file *file) ctx->clockid, (unsigned long long)ctx->ticks, ctx->settime_flags, - (unsigned long long)t.it_value.tv_sec, - (unsigned long long)t.it_value.tv_nsec, - (unsigned long long)t.it_interval.tv_sec, - (unsigned long long)t.it_interval.tv_nsec); + (unsigned long long)value.tv_sec, + (unsigned long long)value.tv_nsec, + (unsigned long long)interval.tv_sec, + (unsigned long long)interval.tv_nsec); } #else #define timerfd_show NULL
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
timerfd_show() uses a 'struct itimerspec' internally, but that is deprecated because of the time_t overflow and a conflict with the glibc type of the same name that is now incompatible in user space.
Use a pair of timespec64 variables instead as a simple replacement.
As this removes the last use of itimerspec from the kernel, allowing the removal of the definition from the uapi headers along with timespec and timeval later.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Thomas Gleixner tglx@linutronix.de
The compat_get_timeval() and timeval_valid() interfaces are deprecated and getting removed along with the definition of struct timeval itself.
Change the two implementations of the settimeofday() system call to open-code these helpers and completely avoid references to timeval.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/linux/syscalls.h | 2 +- kernel/time/time.c | 20 +++++++++----------- 2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index e665920fa359..d0391cc2dae9 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -734,7 +734,7 @@ asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct g /* kernel/time.c */ asmlinkage long sys_gettimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); -asmlinkage long sys_settimeofday(struct timeval __user *tv, +asmlinkage long sys_settimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); asmlinkage long sys_adjtimex(struct __kernel_timex __user *txc_p); asmlinkage long sys_adjtimex_time32(struct old_timex32 __user *txc_p); diff --git a/kernel/time/time.c b/kernel/time/time.c index bc114f0be8f1..6bfbe640fd3b 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -196,22 +196,21 @@ int do_sys_settimeofday64(const struct timespec64 *tv, const struct timezone *tz return 0; }
-SYSCALL_DEFINE2(settimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts; - struct timeval user_tv; struct timezone new_tz;
if (tv) { - if (copy_from_user(&user_tv, tv, sizeof(*tv))) + if (get_user(new_ts.tv_sec, &tv->tv_sec) || + get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
- if (!timeval_valid(&user_tv)) + if (tv->tv_usec > USEC_PER_SEC) return -EINVAL;
- new_ts.tv_sec = user_tv.tv_sec; - new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC; + new_ts.tv_nsec *= NSEC_PER_USEC; } if (tz) { if (copy_from_user(&new_tz, tz, sizeof(*tz))) @@ -245,18 +244,17 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts; - struct timeval user_tv; struct timezone new_tz;
if (tv) { - if (compat_get_timeval(&user_tv, tv)) + if (get_user(new_ts.tv_sec, &tv->tv_sec) || + get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
- if (!timeval_valid(&user_tv)) + if (new_ts.tv_nsec > USEC_PER_SEC) return -EINVAL;
- new_ts.tv_sec = user_tv.tv_sec; - new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC; + new_ts.tv_nsec *= NSEC_PER_USEC; } if (tz) { if (copy_from_user(&new_tz, tz, sizeof(*tz)))
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
-SYSCALL_DEFINE2(settimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
- struct timeval user_tv; struct timezone new_tz;
if (tv) {
if (copy_from_user(&user_tv, tv, sizeof(*tv)))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
How is that supposed to be correct on a 32bit kernel?
if (!timeval_valid(&user_tv))
if (tv->tv_usec > USEC_PER_SEC) return -EINVAL;
That's incomplete:
static inline bool timeval_valid(const struct timeval *tv) { /* Dates before 1970 are bogus */ if (tv->tv_sec < 0) return false;
/* Can't have more microseconds then a second */ if (tv->tv_usec < 0 || tv->tv_usec >= USEC_PER_SEC) return false;
return true; }
new_ts.tv_sec = user_tv.tv_sec;
new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC;
} if (tz) { if (copy_from_user(&new_tz, tz, sizeof(*tz)))new_ts.tv_nsec *= NSEC_PER_USEC;
@@ -245,18 +244,17 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
- struct timeval user_tv; struct timezone new_tz;
if (tv) {
if (compat_get_timeval(&user_tv, tv))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (!timeval_valid(&user_tv))
if (new_ts.tv_nsec > USEC_PER_SEC) return -EINVAL;
Ditto.
Thanks,
tglx
On Wed, Nov 13, 2019 at 10:53 PM Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
-SYSCALL_DEFINE2(settimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
struct timeval user_tv; struct timezone new_tz; if (tv) {
if (copy_from_user(&user_tv, tv, sizeof(*tv)))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
How is that supposed to be correct on a 32bit kernel?
I don't see the problem you are referring to. This should behave the same way on a 32-bit kernel and on a 64-bit kernel, sign-extending the tv_sec field, and copying the user tv_usec field into the kernel tv_nsec, to be multiplied by 1000 a few lines later.
Am I missing something?
if (!timeval_valid(&user_tv))
if (tv->tv_usec > USEC_PER_SEC) return -EINVAL;
That's incomplete:
static inline bool timeval_valid(const struct timeval *tv) { /* Dates before 1970 are bogus */ if (tv->tv_sec < 0) return false;
/* Can't have more microseconds then a second */ if (tv->tv_usec < 0 || tv->tv_usec >= USEC_PER_SEC) return false; return true;
}
My idea was to not duplicate the range check that is done in do_sys_settimeofday64() and again in do_settimeofday64:
if (!timespec64_valid_settod(ts)) return -EINVAL;
The only check we should need in addition to this is to ensure that passing an invalid tv_usec number doesn't become an unexpectedly valid tv_nsec after the multiplication.
I agree the patch looks like I'm missing a check here, but the code after the patch appears clear enough to me.
Arnd
On Thu, 14 Nov 2019, Arnd Bergmann wrote:
On Wed, Nov 13, 2019 at 10:53 PM Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
-SYSCALL_DEFINE2(settimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
struct timeval user_tv; struct timezone new_tz; if (tv) {
if (copy_from_user(&user_tv, tv, sizeof(*tv)))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
How is that supposed to be correct on a 32bit kernel?
I don't see the problem you are referring to. This should behave the same way on a 32-bit kernel and on a 64-bit kernel, sign-extending the tv_sec field, and copying the user tv_usec field into the kernel tv_nsec, to be multiplied by 1000 a few lines later.
You're right. Tired brain failed to see the implicit sign extension in get_user().
Am I missing something?
No.
if (!timeval_valid(&user_tv))
if (tv->tv_usec > USEC_PER_SEC) return -EINVAL;
That's incomplete:
static inline bool timeval_valid(const struct timeval *tv) { /* Dates before 1970 are bogus */ if (tv->tv_sec < 0) return false;
/* Can't have more microseconds then a second */ if (tv->tv_usec < 0 || tv->tv_usec >= USEC_PER_SEC) return false; return true;
}
My idea was to not duplicate the range check that is done in do_sys_settimeofday64() and again in do_settimeofday64:
if (!timespec64_valid_settod(ts)) return -EINVAL;
The only check we should need in addition to this is to ensure that passing an invalid tv_usec number doesn't become an unexpectedly valid tv_nsec after the multiplication.
Right, but please add a proper comment as you/we are going to scratch heads 4 weeks from now when staring at that check and wondering why it is incomplete.
Thanks,
tglx
On Thu, Nov 14, 2019 at 3:04 PM Thomas Gleixner tglx@linutronix.de wrote:
On Thu, 14 Nov 2019, Arnd Bergmann wrote:
On Wed, Nov 13, 2019 at 10:53 PM Thomas Gleixner tglx@linutronix.de wrote:
My idea was to not duplicate the range check that is done in do_sys_settimeofday64() and again in do_settimeofday64:
if (!timespec64_valid_settod(ts)) return -EINVAL;
The only check we should need in addition to this is to ensure that passing an invalid tv_usec number doesn't become an unexpectedly valid tv_nsec after the multiplication.
Right, but please add a proper comment as you/we are going to scratch heads 4 weeks from now when staring at that check and wondering why it is incomplete.
Ok, done. I had just uploaded the branch with the fixup for the __user pointer access in the same patch, but that version had introduced another typo. I hope the version I uploaded now has all known issues addressed for tomorrow's linux-next.
Arnd
On 19-11-08 22:12:16, Arnd Bergmann wrote:
The compat_get_timeval() and timeval_valid() interfaces are deprecated and getting removed along with the definition of struct timeval itself.
Change the two implementations of the settimeofday() system call to open-code these helpers and completely avoid references to timeval.
I get the following rcu stalls due to this patch on riscv64 (on qemu):
[root@riscv ~]# uname -a Linux riscv 5.4.0-rc6-00018-gadde74306a4b #112 SMP Fri Nov 15 00:46:20 EET 2019 riscv64 riscv64 riscv64 GNU/Linux [root@riscv ~]# [ 420.135710] rcu: INFO: rcu_sched self-detected stall on CPU [ 420.136839] rcu: 3-....: (99702 ticks this GP) idle=482/1/0x4000000000000002 softirq=3322/3322 fqs=48784 [ 420.138917] (t=99768 jiffies g=4985 q=8343) [ 420.139772] Task dump for CPU 3: [ 420.140236] rdate R running task 0 254 1 0x00000008 [ 420.142226] Call Trace: [ 420.142791] [<ffffffe000037954>] walk_stackframe+0x0/0xa6 [ 420.143911] [<ffffffe000037aba>] show_stack+0x2a/0x34 [ 420.145010] [<ffffffe0000569c8>] sched_show_task+0xf0/0x116 [ 420.145996] [<ffffffe00005b502>] dump_cpu_task+0x3e/0x48 [ 420.147073] [<ffffffe000084e5e>] rcu_dump_cpu_stacks+0x7c/0xb4 [ 420.148243] [<ffffffe0000842f6>] rcu_sched_clock_irq+0x3d6/0x582 [ 420.149349] [<ffffffe0000897b4>] update_process_times+0x1e/0x42 [ 420.150306] [<ffffffe000093a34>] tick_sched_handle.isra.0+0x2a/0x3a [ 420.150997] [<ffffffe000093ce8>] tick_sched_timer+0x4e/0x92 [ 420.151603] [<ffffffe000089eb6>] __hrtimer_run_queues+0xae/0x108 [ 420.152639] [<ffffffe00008a5ac>] hrtimer_interrupt+0xca/0x1d4 [ 420.153629] [<ffffffe0004de564>] riscv_timer_interrupt+0x32/0x3a [ 420.154629] [<ffffffe000612ad4>] do_IRQ+0xa4/0xb8 [ 420.155294] [<ffffffe000036814>] ret_from_exception+0x0/0xc [ 420.156073] [<ffffffe000036814>] ret_from_exception+0x0/0xc [ 451.556189] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-... } 100725 jiffies s: 53 root: 0x8/. [ 451.558689] rcu: blocking rcu_node structures: [ 451.559501] Task dump for CPU 3: [ 451.560518] rdate R running task 0 254 1 0x00000008 [ 451.561396] Call Trace: [ 451.561675] [<ffffffe00060ed48>] __schedule+0x158/0x36a [ 483.147733] rcu: INFO: rcu_sched self-detected stall on CPU [ 483.148723] rcu: 3-....: (115448 ticks this GP) idle=482/1/0x4000000000000002 softirq=3322/3322 fqs=56510 [ 483.150220] (t=115521 jiffies g=4985 q=8400) [ 483.150885] Task dump for CPU 3: [ 483.151392] rdate R running task 0 254 1 0x00000008 [ 483.152321] Call Trace: [ 483.152755] [<ffffffe000037954>] walk_stackframe+0x0/0xa6 [ 483.153600] [<ffffffe000037aba>] show_stack+0x2a/0x34 [ 483.154428] [<ffffffe0000569c8>] sched_show_task+0xf0/0x116 [ 483.155325] [<ffffffe00005b502>] dump_cpu_task+0x3e/0x48 [ 483.156199] [<ffffffe000084e5e>] rcu_dump_cpu_stacks+0x7c/0xb4 [ 483.157163] [<ffffffe0000842f6>] rcu_sched_clock_irq+0x3d6/0x582 [ 483.158166] [<ffffffe0000897b4>] update_process_times+0x1e/0x42 [ 483.159257] [<ffffffe000093a34>] tick_sched_handle.isra.0+0x2a/0x3a [ 483.160240] [<ffffffe000093ce8>] tick_sched_timer+0x4e/0x92 [ 483.160992] [<ffffffe000089eb6>] __hrtimer_run_queues+0xae/0x108 [ 483.161881] [<ffffffe00008a5ac>] hrtimer_interrupt+0xca/0x1d4 [ 483.162778] [<ffffffe0004de564>] riscv_timer_interrupt+0x32/0x3a [ 483.163542] [<ffffffe000612ad4>] do_IRQ+0xa4/0xb8 [ 483.164241] [<ffffffe000036814>] ret_from_exception+0x0/0xc [ 483.165108] [<ffffffe000036814>] ret_from_exception+0x0/0xc [ 515.044254] rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-... } 116597 jiffies s: 53 root: 0x8/. [ 515.046221] rcu: blocking rcu_node structures: [ 515.046799] Task dump for CPU 3: [ 515.047180] rdate R running task 0 254 1 0x00000008 [ 515.048476] Call Trace: [ 515.048895] [<ffffffe00060ed48>] __schedule+0x158/0x36a
I will dig up some more into this tomorrow since it's way past by midnight here.
Signed-off-by: Arnd Bergmann arnd@arndb.de
include/linux/syscalls.h | 2 +- kernel/time/time.c | 20 +++++++++----------- 2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index e665920fa359..d0391cc2dae9 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -734,7 +734,7 @@ asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct g /* kernel/time.c */ asmlinkage long sys_gettimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); -asmlinkage long sys_settimeofday(struct timeval __user *tv, +asmlinkage long sys_settimeofday(struct __kernel_old_timeval __user *tv, struct timezone __user *tz); asmlinkage long sys_adjtimex(struct __kernel_timex __user *txc_p); asmlinkage long sys_adjtimex_time32(struct old_timex32 __user *txc_p); diff --git a/kernel/time/time.c b/kernel/time/time.c index bc114f0be8f1..6bfbe640fd3b 100644 --- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -196,22 +196,21 @@ int do_sys_settimeofday64(const struct timespec64 *tv, const struct timezone *tz return 0; } -SYSCALL_DEFINE2(settimeofday, struct timeval __user *, tv, +SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
- struct timeval user_tv; struct timezone new_tz;
if (tv) {
if (copy_from_user(&user_tv, tv, sizeof(*tv)))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (!timeval_valid(&user_tv))
if (tv->tv_usec > USEC_PER_SEC) return -EINVAL;
new_ts.tv_sec = user_tv.tv_sec;
new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC;
} if (tz) { if (copy_from_user(&new_tz, tz, sizeof(*tz)))new_ts.tv_nsec *= NSEC_PER_USEC;
@@ -245,18 +244,17 @@ COMPAT_SYSCALL_DEFINE2(settimeofday, struct old_timeval32 __user *, tv, struct timezone __user *, tz) { struct timespec64 new_ts;
- struct timeval user_tv; struct timezone new_tz;
if (tv) {
if (compat_get_timeval(&user_tv, tv))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (!timeval_valid(&user_tv))
if (new_ts.tv_nsec > USEC_PER_SEC) return -EINVAL;
new_ts.tv_sec = user_tv.tv_sec;
new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC;
} if (tz) { if (copy_from_user(&new_tz, tz, sizeof(*tz)))new_ts.tv_nsec *= NSEC_PER_USEC;
-- 2.20.0
On Fri, Nov 15, 2019 at 12:01 AM Abel Vesa abelvesa@linux.com wrote:
On 19-11-08 22:12:16, Arnd Bergmann wrote:
The compat_get_timeval() and timeval_valid() interfaces are deprecated and getting removed along with the definition of struct timeval itself.
Change the two implementations of the settimeofday() system call to open-code these helpers and completely avoid references to timeval.
I'm not sure how we get to the RCU stall, but this is almost certainly another symptom of a typo I had introduced in the patch, which others have also reported. This is the the fix in today's linux-next:
--- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -207,7 +207,7 @@ SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
- if (tv->tv_usec > USEC_PER_SEC) + if (new_ts->tv_usec > USEC_PER_SEC) return -EINVAL;
new_ts.tv_nsec *= NSEC_PER_USEC;
Arnd
I get the following rcu stalls due to this patch on riscv64 (on qemu):
[root@riscv ~]# uname -a Linux riscv 5.4.0-rc6-00018-gadde74306a4b #112 SMP Fri Nov 15 00:46:20 EET 2019 riscv64 riscv64 riscv64 GNU/Linux [root@riscv ~]# [ 420.135710] rcu: INFO: rcu_sched self-detected stall on CPU [ 420.136839] rcu: 3-....: (99702 ticks this GP) idle=482/1/0x4000000000000002 softirq=3322/3322 fqs=48784 [ 420.138917] (t=99768 jiffies g=4985 q=8343) [ 420.139772] Task dump for CPU 3: [ 420.140236] rdate R running task 0 254 1 0x00000008 [ 420.142226] Call Trace: [ 420.142791] [<ffffffe000037954>] walk_stackframe+0x0/0xa6 [ 420.143911] [<ffffffe000037aba>] show_stack+0x2a/0x34 [ 420.145010] [<ffffffe0000569c8>] sched_show_task+0xf0/0x116 [ 420.145996] [<ffffffe00005b502>] dump_cpu_task+0x3e/0x48 [ 420.147073] [<ffffffe000084e5e>] rcu_dump_cpu_stacks+0x7c/0xb4 [ 420.148243] [<ffffffe0000842f6>] rcu_sched_clock_irq+0x3d6/0x582 [ 420.149349] [<ffffffe0000897b4>] update_process_times+0x1e/0x42 [ 420.150306] [<ffffffe000093a34>] tick_sched_handle.isra.0+0x2a/0x3a [ 420.150997] [<ffffffe000093ce8>] tick_sched_timer+0x4e/0x92 [ 420.151603] [<ffffffe000089eb6>] __hrtimer_run_queues+0xae/0x108 [ 420.152639] [<ffffffe00008a5ac>] hrtimer_interrupt+0xca/0x1d4 [ 420.153629] [<ffffffe0004de564>] riscv_timer_interrupt+0x32/0x3a [ 420.154629] [<ffffffe000612ad4>] do_IRQ+0xa4/0xb8 [ 420.155294] [<ffffffe000036814>] ret_from_exception+0x0/0xc [ 420.156073] [<ffffffe000036814>] ret_from_exception+0x0/0xc
On 15/11/2019 08.58, Arnd Bergmann wrote:
On Fri, Nov 15, 2019 at 12:01 AM Abel Vesa abelvesa@linux.com wrote:
On 19-11-08 22:12:16, Arnd Bergmann wrote:
The compat_get_timeval() and timeval_valid() interfaces are deprecated and getting removed along with the definition of struct timeval itself.
Change the two implementations of the settimeofday() system call to open-code these helpers and completely avoid references to timeval.
I'm not sure how we get to the RCU stall, but this is almost certainly another symptom of a typo I had introduced in the patch, which others have also reported. This is the the fix in today's linux-next:
--- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -207,7 +207,7 @@ SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (tv->tv_usec > USEC_PER_SEC)
if (new_ts->tv_usec > USEC_PER_SEC) return -EINVAL;
Hopefully not :)
new_ts.tv_nsec *= NSEC_PER_USEC;
So the actual patch in next-20191115 does
- if (copy_from_user(&user_tv, tv, sizeof(*tv))) + if (get_user(new_ts.tv_sec, &tv->tv_sec) || + get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
- if (!timeval_valid(&user_tv)) + if (new_ts.tv_nsec > USEC_PER_SEC) return -EINVAL;
- new_ts.tv_sec = user_tv.tv_sec; - new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC; + new_ts.tv_nsec *= NSEC_PER_USEC;
But removing the "user value is < 0" check, relying on the timespec value being rejected later, is wrong: 1000=8*125. Multiplying by 8 always gives a value with the low three bits clear, multiplying by 125 is reversible. So you can take any target value with the three low bits clear, logic shift right by 3, multiply by 0x1cac083126e978d5 , and flip the top three bits as you wish to generate 8 pre-images of that target value. Four of those will be negative. A trivial example is 0x80..000 (aka LONG_MIN) and its cousins 0xa0..000, 0xc0..000, 0xe0..000 which all become 0 and thus accepted after multiplying by NSEC_PER_USEC. But also -858989233 (or -3689348814741906097 if long is 64 bit) become 4226200 which isn't even a multiple of 1000 - there's 500M examples to choose from :)
I'm pretty sure it doesn't generate worse code, gcc is smart enough to compile "foo > BAR || foo < 0" as if it was written "(unsigned version of foo)foo > BAR". And while a value of USEC_PER_SEC itself will not overflow and then be rejeted because the real comparison done later is ">= NSEC_PER_SEC", I think it's clearer to say "foo >= USEC_PER_SEC || foo < 0) just so the same pattern is used.
Rasmus
On Fri, Nov 15, 2019 at 11:27 AM Rasmus Villemoes linux@rasmusvillemoes.dk wrote:
On 15/11/2019 08.58, Arnd Bergmann wrote:
On Fri, Nov 15, 2019 at 12:01 AM Abel Vesa abelvesa@linux.com wrote:
--- a/kernel/time/time.c +++ b/kernel/time/time.c @@ -207,7 +207,7 @@ SYSCALL_DEFINE2(settimeofday, struct __kernel_old_timeval __user *, tv, get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (tv->tv_usec > USEC_PER_SEC)
if (new_ts->tv_usec > USEC_PER_SEC) return -EINVAL;
Hopefully not :)
No, I misquoted from a fix that I had temporarily applied, not the version in linux-next.
new_ts.tv_nsec *= NSEC_PER_USEC;
So the actual patch in next-20191115 does
if (copy_from_user(&user_tv, tv, sizeof(*tv)))
if (get_user(new_ts.tv_sec, &tv->tv_sec) ||
get_user(new_ts.tv_nsec, &tv->tv_usec)) return -EFAULT;
if (!timeval_valid(&user_tv))
if (new_ts.tv_nsec > USEC_PER_SEC) return -EINVAL;
new_ts.tv_sec = user_tv.tv_sec;
new_ts.tv_nsec = user_tv.tv_usec * NSEC_PER_USEC;
new_ts.tv_nsec *= NSEC_PER_USEC;
But removing the "user value is < 0" check, relying on the timespec value being rejected later, is wrong
You are right of course, so many ways to get this one line wrong... Pushed more more update to the branch now.
Thanks for the careful review!
Arnd
The structure is only used in one place, moving it there simplifies the interface and helps with later changes to this code.
Rename it to match the other time32 structures in the process.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/linux/compat.h | 15 ++++----------- kernel/compat.c | 24 ------------------------ kernel/time/itimer.c | 42 +++++++++++++++++++++++++++++++++++------- 3 files changed, 39 insertions(+), 42 deletions(-)
diff --git a/include/linux/compat.h b/include/linux/compat.h index 3735a22bfbc0..906a0ea933cd 100644 --- a/include/linux/compat.h +++ b/include/linux/compat.h @@ -116,14 +116,7 @@ typedef __compat_gid32_t compat_gid_t; struct compat_sel_arg_struct; struct rusage;
-struct compat_itimerval { - struct old_timeval32 it_interval; - struct old_timeval32 it_value; -}; - -struct itimerval; -int get_compat_itimerval(struct itimerval *, const struct compat_itimerval __user *); -int put_compat_itimerval(struct compat_itimerval __user *, const struct itimerval *); +struct old_itimerval32;
struct compat_tms { compat_clock_t tms_utime; @@ -668,10 +661,10 @@ compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
/* kernel/itimer.c */ asmlinkage long compat_sys_getitimer(int which, - struct compat_itimerval __user *it); + struct old_itimerval32 __user *it); asmlinkage long compat_sys_setitimer(int which, - struct compat_itimerval __user *in, - struct compat_itimerval __user *out); + struct old_itimerval32 __user *in, + struct old_itimerval32 __user *out);
/* kernel/kexec.c */ asmlinkage long compat_sys_kexec_load(compat_ulong_t entry, diff --git a/kernel/compat.c b/kernel/compat.c index a2bc1d6ceb57..95005f849c68 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -90,30 +90,6 @@ int compat_put_timespec(const struct timespec *ts, void __user *uts) } EXPORT_SYMBOL_GPL(compat_put_timespec);
-int get_compat_itimerval(struct itimerval *o, const struct compat_itimerval __user *i) -{ - struct compat_itimerval v32; - - if (copy_from_user(&v32, i, sizeof(struct compat_itimerval))) - return -EFAULT; - o->it_interval.tv_sec = v32.it_interval.tv_sec; - o->it_interval.tv_usec = v32.it_interval.tv_usec; - o->it_value.tv_sec = v32.it_value.tv_sec; - o->it_value.tv_usec = v32.it_value.tv_usec; - return 0; -} - -int put_compat_itimerval(struct compat_itimerval __user *o, const struct itimerval *i) -{ - struct compat_itimerval v32; - - v32.it_interval.tv_sec = i->it_interval.tv_sec; - v32.it_interval.tv_usec = i->it_interval.tv_usec; - v32.it_value.tv_sec = i->it_value.tv_sec; - v32.it_value.tv_usec = i->it_value.tv_usec; - return copy_to_user(o, &v32, sizeof(struct compat_itimerval)) ? -EFAULT : 0; -} - #ifdef __ARCH_WANT_SYS_SIGPROCMASK
/* diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index 77f1e5635cc1..c52ebb40b60b 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -112,19 +112,34 @@ SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value) }
#ifdef CONFIG_COMPAT +struct old_itimerval32 { + struct old_timeval32 it_interval; + struct old_timeval32 it_value; +}; + +static int put_old_itimerval32(struct old_itimerval32 __user *o, const struct itimerval *i) +{ + struct old_itimerval32 v32; + + v32.it_interval.tv_sec = i->it_interval.tv_sec; + v32.it_interval.tv_usec = i->it_interval.tv_usec; + v32.it_value.tv_sec = i->it_value.tv_sec; + v32.it_value.tv_usec = i->it_value.tv_usec; + return copy_to_user(o, &v32, sizeof(struct old_itimerval32)) ? -EFAULT : 0; +} + COMPAT_SYSCALL_DEFINE2(getitimer, int, which, - struct compat_itimerval __user *, it) + struct old_itimerval32 __user *, it) { struct itimerval kit; int error = do_getitimer(which, &kit);
- if (!error && put_compat_itimerval(it, &kit)) + if (!error && put_old_itimerval32(it, &kit)) error = -EFAULT; return error; } #endif
- /* * The timer is automagically restarted, when interval != 0 */ @@ -310,15 +325,28 @@ SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value, }
#ifdef CONFIG_COMPAT +static int get_old_itimerval32(struct itimerval *o, const struct old_itimerval32 __user *i) +{ + struct old_itimerval32 v32; + + if (copy_from_user(&v32, i, sizeof(struct old_itimerval32))) + return -EFAULT; + o->it_interval.tv_sec = v32.it_interval.tv_sec; + o->it_interval.tv_usec = v32.it_interval.tv_usec; + o->it_value.tv_sec = v32.it_value.tv_sec; + o->it_value.tv_usec = v32.it_value.tv_usec; + return 0; +} + COMPAT_SYSCALL_DEFINE3(setitimer, int, which, - struct compat_itimerval __user *, in, - struct compat_itimerval __user *, out) + struct old_itimerval32 __user *, in, + struct old_itimerval32 __user *, out) { struct itimerval kin, kout; int error;
if (in) { - if (get_compat_itimerval(&kin, in)) + if (get_old_itimerval32(&kin, in)) return -EFAULT; } else { memset(&kin, 0, sizeof(kin)); @@ -327,7 +355,7 @@ COMPAT_SYSCALL_DEFINE3(setitimer, int, which, error = do_setitimer(which, &kin, out ? &kout : NULL); if (error || !out) return error; - if (put_compat_itimerval(out, &kout)) + if (put_old_itimerval32(out, &kout)) return -EFAULT; return 0; }
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
The structure is only used in one place, moving it there simplifies the interface and helps with later changes to this code.
Rename it to match the other time32 structures in the process.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Thomas Gleixner tglx@linutronix.de
The itimer handling for the old alpha osf_setitimer/osf_getitimer system calls is identical to the compat version of getitimer/setitimer, so just use those directly.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/alpha/kernel/osf_sys.c | 65 -------------------------- arch/alpha/kernel/syscalls/syscall.tbl | 4 +- kernel/time/itimer.c | 4 +- 3 files changed, 4 insertions(+), 69 deletions(-)
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c index bbe7a0da6264..94e4cde8071a 100644 --- a/arch/alpha/kernel/osf_sys.c +++ b/arch/alpha/kernel/osf_sys.c @@ -971,30 +971,6 @@ put_tv_to_tv32(struct timeval32 __user *o, struct __kernel_old_timeval *i) sizeof(struct timeval32)); }
-static inline long -get_it32(struct itimerval *o, struct itimerval32 __user *i) -{ - struct itimerval32 itv; - if (copy_from_user(&itv, i, sizeof(struct itimerval32))) - return -EFAULT; - o->it_interval.tv_sec = itv.it_interval.tv_sec; - o->it_interval.tv_usec = itv.it_interval.tv_usec; - o->it_value.tv_sec = itv.it_value.tv_sec; - o->it_value.tv_usec = itv.it_value.tv_usec; - return 0; -} - -static inline long -put_it32(struct itimerval32 __user *o, struct itimerval *i) -{ - return copy_to_user(o, &(struct itimerval32){ - .it_interval.tv_sec = o->it_interval.tv_sec, - .it_interval.tv_usec = o->it_interval.tv_usec, - .it_value.tv_sec = o->it_value.tv_sec, - .it_value.tv_usec = o->it_value.tv_usec}, - sizeof(struct itimerval32)); -} - static inline void jiffies_to_timeval32(unsigned long jiffies, struct timeval32 *value) { @@ -1039,47 +1015,6 @@ SYSCALL_DEFINE2(osf_settimeofday, struct timeval32 __user *, tv,
asmlinkage long sys_ni_posix_timers(void);
-SYSCALL_DEFINE2(osf_getitimer, int, which, struct itimerval32 __user *, it) -{ - struct itimerval kit; - int error; - - if (!IS_ENABLED(CONFIG_POSIX_TIMERS)) - return sys_ni_posix_timers(); - - error = do_getitimer(which, &kit); - if (!error && put_it32(it, &kit)) - error = -EFAULT; - - return error; -} - -SYSCALL_DEFINE3(osf_setitimer, int, which, struct itimerval32 __user *, in, - struct itimerval32 __user *, out) -{ - struct itimerval kin, kout; - int error; - - if (!IS_ENABLED(CONFIG_POSIX_TIMERS)) - return sys_ni_posix_timers(); - - if (in) { - if (get_it32(&kin, in)) - return -EFAULT; - } else - memset(&kin, 0, sizeof(kin)); - - error = do_setitimer(which, &kin, out ? &kout : NULL); - if (error || !out) - return error; - - if (put_it32(out, &kout)) - return -EFAULT; - - return 0; - -} - SYSCALL_DEFINE2(osf_utimes, const char __user *, filename, struct timeval32 __user *, tvs) { diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index 728fe028c02c..8e13b0b2928d 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -89,10 +89,10 @@ 80 common setgroups sys_setgroups 81 common osf_old_getpgrp sys_ni_syscall 82 common setpgrp sys_setpgid -83 common osf_setitimer sys_osf_setitimer +83 common osf_setitimer compat_sys_setitimer 84 common osf_old_wait sys_ni_syscall 85 common osf_table sys_ni_syscall -86 common osf_getitimer sys_osf_getitimer +86 common osf_getitimer compat_sys_getitimer 87 common gethostname sys_gethostname 88 common sethostname sys_sethostname 89 common getdtablesize sys_getdtablesize diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index c52ebb40b60b..4664c6addf69 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -111,7 +111,7 @@ SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value) return error; }
-#ifdef CONFIG_COMPAT +#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA) struct old_itimerval32 { struct old_timeval32 it_interval; struct old_timeval32 it_value; @@ -324,7 +324,7 @@ SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value, return 0; }
-#ifdef CONFIG_COMPAT +#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA) static int get_old_itimerval32(struct itimerval *o, const struct old_itimerval32 __user *i) { struct old_itimerval32 v32;
On Fri, Nov 08, 2019 at 10:12:18PM +0100, Arnd Bergmann wrote:
The itimer handling for the old alpha osf_setitimer/osf_getitimer system calls is identical to the compat version of getitimer/setitimer, so just use those directly.
Signed-off-by: Arnd Bergmann arnd@arndb.de
alpha:allnoconfig, alpha:tinyconfig:
alpha-linux-ld: arch/alpha/kernel/systbls.o: in function `sys_call_table': (.data+0x298): undefined reference to `compat_sys_setitimer' alpha-linux-ld: (.data+0x2b0): undefined reference to `compat_sys_getitimer'
Guenter
Preparing for a change to the itimer internals, stop using the do_setitimer() symbol and instead use a new higher-level interface.
The do_getitimer()/do_setitimer functions can now be made static, allowing the compiler to potentially produce better object code.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/linux/time.h | 9 +++++---- kernel/time/itimer.c | 15 +++++++++++++-- security/selinux/hooks.c | 10 +++------- 3 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/include/linux/time.h b/include/linux/time.h index 27d83fd2ae61..0760a4f5a15c 100644 --- a/include/linux/time.h +++ b/include/linux/time.h @@ -35,10 +35,11 @@ extern time64_t mktime64(const unsigned int year, const unsigned int mon, extern u32 (*arch_gettimeoffset)(void); #endif
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value, - struct itimerval *ovalue); -extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
extern long do_utimes(int dfd, const char __user *filename, struct timespec64 *times, int flags);
diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index 4664c6addf69..ce9cd19ce72e 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -73,7 +73,7 @@ static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, value->it_interval = ns_to_timeval(interval); }
-int do_getitimer(int which, struct itimerval *value) +static int do_getitimer(int which, struct itimerval *value) { struct task_struct *tsk = current;
@@ -197,7 +197,7 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
-int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) +static int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) { struct task_struct *tsk = current; struct hrtimer *timer; @@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX +void clear_itimer(void) +{ + struct itimerval v = {}; + int i; + + for (i = 0; i < 3; i++) + do_setitimer(i, &v, NULL); +} +#endif + #ifdef __ARCH_WANT_SYS_ALARM
/** diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 9625b99e677f..c3f2e89acb87 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2549,9 +2549,8 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm) static void selinux_bprm_committed_creds(struct linux_binprm *bprm) { const struct task_security_struct *tsec = selinux_cred(current_cred()); - struct itimerval itimer; u32 osid, sid; - int rc, i; + int rc;
osid = tsec->osid; sid = tsec->sid; @@ -2569,11 +2568,8 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm) rc = avc_has_perm(&selinux_state, osid, sid, SECCLASS_PROCESS, PROCESS__SIGINH, NULL); if (rc) { - if (IS_ENABLED(CONFIG_POSIX_TIMERS)) { - memset(&itimer, 0, sizeof itimer); - for (i = 0; i < 3; i++) - do_setitimer(i, &itimer, NULL); - } + if (IS_ENABLED(CONFIG_POSIX_TIMERS)) + clear_itimer(); spin_lock_irq(¤t->sighand->siglock); if (!fatal_signal_pending(current)) { flush_sigqueue(¤t->pending);
On Fri, Nov 8, 2019 at 10:18 PM Arnd Bergmann arnd@arndb.de wrote:
Preparing for a change to the itimer internals, stop using the do_setitimer() symbol and instead use a new higher-level interface.
The do_getitimer()/do_setitimer functions can now be made static, allowing the compiler to potentially produce better object code.
Signed-off-by: Arnd Bergmann arnd@arndb.de
include/linux/time.h | 9 +++++---- kernel/time/itimer.c | 15 +++++++++++++-- security/selinux/hooks.c | 10 +++------- 3 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/include/linux/time.h b/include/linux/time.h index 27d83fd2ae61..0760a4f5a15c 100644 --- a/include/linux/time.h +++ b/include/linux/time.h @@ -35,10 +35,11 @@ extern time64_t mktime64(const unsigned int year, const unsigned int mon, extern u32 (*arch_gettimeoffset)(void); #endif
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value,
struct itimerval *ovalue);
-extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
extern long do_utimes(int dfd, const char __user *filename, struct timespec64 *times, int flags);
diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index 4664c6addf69..ce9cd19ce72e 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -73,7 +73,7 @@ static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, value->it_interval = ns_to_timeval(interval); }
-int do_getitimer(int which, struct itimerval *value) +static int do_getitimer(int which, struct itimerval *value) { struct task_struct *tsk = current;
@@ -197,7 +197,7 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
-int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) +static int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) { struct task_struct *tsk = current; struct hrtimer *timer; @@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX
Did you mean "#ifdef CONFIG_POSIX_TIMERS" here to match the header?
+void clear_itimer(void) +{
struct itimerval v = {};
int i;
for (i = 0; i < 3; i++)
do_setitimer(i, &v, NULL);
+} +#endif
#ifdef __ARCH_WANT_SYS_ALARM
/** diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 9625b99e677f..c3f2e89acb87 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2549,9 +2549,8 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm) static void selinux_bprm_committed_creds(struct linux_binprm *bprm) { const struct task_security_struct *tsec = selinux_cred(current_cred());
struct itimerval itimer; u32 osid, sid;
int rc, i;
int rc; osid = tsec->osid; sid = tsec->sid;
@@ -2569,11 +2568,8 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm) rc = avc_has_perm(&selinux_state, osid, sid, SECCLASS_PROCESS, PROCESS__SIGINH, NULL); if (rc) {
if (IS_ENABLED(CONFIG_POSIX_TIMERS)) {
memset(&itimer, 0, sizeof itimer);
for (i = 0; i < 3; i++)
do_setitimer(i, &itimer, NULL);
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS))
clear_itimer();
Since you already define a no-op fallback for the case of !IS_ENABLED(CONFIG_POSIX_TIMERS) in time.h, why not simply call clear_itimer() unconditionally?
spin_lock_irq(¤t->sighand->siglock); if (!fatal_signal_pending(current)) { flush_sigqueue(¤t->pending);
-- 2.20.0
On Sat, Nov 9, 2019 at 2:43 PM Ondrej Mosnacek omosnace@redhat.com wrote:
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value,
struct itimerval *ovalue);
-extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
@@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX
Did you mean "#ifdef CONFIG_POSIX_TIMERS" here to match the header?
No, this part is intentional, CONFIG_POSIX_TIMERS already controls whether itimer.c is compiled in the first place, but this function is only needed when called from the selinux driver.
}
if (IS_ENABLED(CONFIG_POSIX_TIMERS))
clear_itimer();
Since you already define a no-op fallback for the case of !IS_ENABLED(CONFIG_POSIX_TIMERS) in time.h, why not simply call clear_itimer() unconditionally?
Ah right, that was indeed my plan here when I changed the declaration in the header, I just forgot to remove the if(). Fixed now.
Thanks for the review!
Arnd
On Sat, Nov 9, 2019 at 10:03 PM Arnd Bergmann arnd@arndb.de wrote:
On Sat, Nov 9, 2019 at 2:43 PM Ondrej Mosnacek omosnace@redhat.com wrote:
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value,
struct itimerval *ovalue);
-extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
@@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX
Did you mean "#ifdef CONFIG_POSIX_TIMERS" here to match the header?
No, this part is intentional, CONFIG_POSIX_TIMERS already controls whether itimer.c is compiled in the first place, but this function is only needed when called from the selinux driver.
All right, but you declare the function in time.h even if CONFIG_SECURITY_SELINUX is not enabled... it is kind of awkward when it can happen that the function is declared but not defined anywhere (even if it shouldn't be used by new users). Maybe you could at least put the header declaration/definition inside #ifdef CONFIG_SECURITY_SELINUX as well so it is clear that this function is intended for SELinux only?
On Sun, Nov 10, 2019 at 12:07 AM Ondrej Mosnacek omosnace@redhat.com wrote:
On Sat, Nov 9, 2019 at 10:03 PM Arnd Bergmann arnd@arndb.de wrote:
On Sat, Nov 9, 2019 at 2:43 PM Ondrej Mosnacek omosnace@redhat.com wrote:
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value,
struct itimerval *ovalue);
-extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
@@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX
Did you mean "#ifdef CONFIG_POSIX_TIMERS" here to match the header?
No, this part is intentional, CONFIG_POSIX_TIMERS already controls whether itimer.c is compiled in the first place, but this function is only needed when called from the selinux driver.
All right, but you declare the function in time.h even if CONFIG_SECURITY_SELINUX is not enabled... it is kind of awkward when it can happen that the function is declared but not defined anywhere (even if it shouldn't be used by new users). Maybe you could at least put the header declaration/definition inside #ifdef CONFIG_SECURITY_SELINUX as well so it is clear that this function is intended for SELinux only?
I don't see that as a problem, we rarely put declarations inside of an #ifdef. The main effect that would have is forcing any file that includes linux/time.h to be rebuilt when selinux is turned on or off in the .config.
Arnd
On Mon, Nov 11, 2019 at 11:58 AM Arnd Bergmann arnd@arndb.de wrote:
On Sun, Nov 10, 2019 at 12:07 AM Ondrej Mosnacek omosnace@redhat.com wrote:
On Sat, Nov 9, 2019 at 10:03 PM Arnd Bergmann arnd@arndb.de wrote:
On Sat, Nov 9, 2019 at 2:43 PM Ondrej Mosnacek omosnace@redhat.com wrote:
-struct itimerval; -extern int do_setitimer(int which, struct itimerval *value,
struct itimerval *ovalue);
-extern int do_getitimer(int which, struct itimerval *value); +#ifdef CONFIG_POSIX_TIMERS +extern void clear_itimer(void); +#else +static inline void clear_itimer(void) {} +#endif
@@ -249,6 +249,17 @@ int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) return 0; }
+#ifdef CONFIG_SECURITY_SELINUX
Did you mean "#ifdef CONFIG_POSIX_TIMERS" here to match the header?
No, this part is intentional, CONFIG_POSIX_TIMERS already controls whether itimer.c is compiled in the first place, but this function is only needed when called from the selinux driver.
All right, but you declare the function in time.h even if CONFIG_SECURITY_SELINUX is not enabled... it is kind of awkward when it can happen that the function is declared but not defined anywhere (even if it shouldn't be used by new users). Maybe you could at least put the header declaration/definition inside #ifdef CONFIG_SECURITY_SELINUX as well so it is clear that this function is intended for SELinux only?
I don't see that as a problem, we rarely put declarations inside of an #ifdef. The main effect that would have is forcing any file that includes linux/time.h to be rebuilt when selinux is turned on or off in the .config.
OK, but with this patch if someone tries to use the function elsewhere, the build will succeed if SELinux is enabled in the config, but fail if it isn't. Is that intended? I would suggest at least clearly documenting it above the declaration that the function isn't supposed to be used by new users and doing so will cause build to fail under CONFIG_SECURITY_SELINUX=n.
-- Ondrej Mosnacek <omosnace at redhat dot com> Software Engineer, Security Technologies Red Hat, Inc.
On Thu, 14 Nov 2019, Ondrej Mosnacek wrote:
On Mon, Nov 11, 2019 at 11:58 AM Arnd Bergmann arnd@arndb.de wrote:
I don't see that as a problem, we rarely put declarations inside of an #ifdef. The main effect that would have is forcing any file that includes linux/time.h to be rebuilt when selinux is turned on or off in the .config.
OK, but with this patch if someone tries to use the function elsewhere, the build will succeed if SELinux is enabled in the config, but fail if it isn't. Is that intended? I would suggest at least clearly documenting it above the declaration that the function isn't supposed to be used by new users and doing so will cause build to fail under CONFIG_SECURITY_SELINUX=n.
Come on. We have enough functions in the kernel which are only available under a certain config option and if you (ab)use them elsewhere then the build fails. So what?
The #ifdef documents the limited scope and intended use clearly. If something else really needs that function, then removing the #ifdef shouldn't be rocket science either.
Thanks,
tglx
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
Preparing for a change to the itimer internals, stop using the do_setitimer() symbol and instead use a new higher-level interface.
The do_getitimer()/do_setitimer functions can now be made static, allowing the compiler to potentially produce better object code.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Reviewed-by: Thomas Gleixner tglx@linutronix.de
There is no 64-bit version of getitimer/setitimer since that is not actually needed. However, the implementation is built around the deprecated 'struct timeval' type.
Change the code to use timespec64 internally to reduce the dependencies on timeval and associated helper functions.
Minor adjustments in the code are needed to make the native and compat version work the same way, and to keep the range check working after the conversion.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/trace/events/timer.h | 8 +- kernel/time/itimer.c | 158 +++++++++++++++++++++-------------- 2 files changed, 100 insertions(+), 66 deletions(-)
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h index b7a904825e7d..939cbfc80015 100644 --- a/include/trace/events/timer.h +++ b/include/trace/events/timer.h @@ -303,7 +303,7 @@ DEFINE_EVENT(hrtimer_class, hrtimer_cancel, */ TRACE_EVENT(itimer_state,
- TP_PROTO(int which, const struct itimerval *const value, + TP_PROTO(int which, const struct itimerspec64 *const value, unsigned long long expires),
TP_ARGS(which, value, expires), @@ -321,12 +321,12 @@ TRACE_EVENT(itimer_state, __entry->which = which; __entry->expires = expires; __entry->value_sec = value->it_value.tv_sec; - __entry->value_usec = value->it_value.tv_usec; + __entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC; __entry->interval_sec = value->it_interval.tv_sec; - __entry->interval_usec = value->it_interval.tv_usec; + __entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC; ),
- TP_printk("which=%d expires=%llu it_value=%ld.%ld it_interval=%ld.%ld", + TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld", __entry->which, __entry->expires, __entry->value_sec, __entry->value_usec, __entry->interval_sec, __entry->interval_usec) diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index ce9cd19ce72e..5872db9bd5f7 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -26,7 +26,7 @@ * Returns the delta between the expiry time and now, which can be * less than zero or 1usec for an pending expired timer */ -static struct timeval itimer_get_remtime(struct hrtimer *timer) +static struct timespec64 itimer_get_remtime(struct hrtimer *timer) { ktime_t rem = __hrtimer_get_remaining(timer, true);
@@ -41,11 +41,11 @@ static struct timeval itimer_get_remtime(struct hrtimer *timer) } else rem = 0;
- return ktime_to_timeval(rem); + return ktime_to_timespec64(rem); }
static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, - struct itimerval *const value) + struct itimerspec64 *const value) { u64 val, interval; struct cpu_itimer *it = &tsk->signal->it[clock_id]; @@ -69,11 +69,11 @@ static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id,
spin_unlock_irq(&tsk->sighand->siglock);
- value->it_value = ns_to_timeval(val); - value->it_interval = ns_to_timeval(interval); + value->it_value = ns_to_timespec64(val); + value->it_interval = ns_to_timespec64(interval); }
-static int do_getitimer(int which, struct itimerval *value) +static int do_getitimer(int which, struct itimerspec64 *value) { struct task_struct *tsk = current;
@@ -82,7 +82,7 @@ static int do_getitimer(int which, struct itimerval *value) spin_lock_irq(&tsk->sighand->siglock); value->it_value = itimer_get_remtime(&tsk->signal->real_timer); value->it_interval = - ktime_to_timeval(tsk->signal->it_real_incr); + ktime_to_timespec64(tsk->signal->it_real_incr); spin_unlock_irq(&tsk->sighand->siglock); break; case ITIMER_VIRTUAL: @@ -97,17 +97,26 @@ static int do_getitimer(int which, struct itimerval *value) return 0; }
+static int put_itimerval(struct itimerval __user *o, + const struct itimerspec64 *i) +{ + struct itimerval v; + + v.it_interval.tv_sec = i->it_interval.tv_sec; + v.it_interval.tv_usec = i->it_interval.tv_nsec / NSEC_PER_USEC; + v.it_value.tv_sec = i->it_value.tv_sec; + v.it_value.tv_usec = i->it_value.tv_nsec / NSEC_PER_USEC; + return copy_to_user(o, &v, sizeof(struct itimerval)) ? -EFAULT : 0; +} + + SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value) { - int error = -EFAULT; - struct itimerval get_buffer; + struct itimerspec64 get_buffer; + int error = do_getitimer(which, &get_buffer);
- if (value) { - error = do_getitimer(which, &get_buffer); - if (!error && - copy_to_user(value, &get_buffer, sizeof(get_buffer))) - error = -EFAULT; - } + if (!error && put_itimerval(value, &get_buffer)) + error = -EFAULT; return error; }
@@ -117,24 +126,25 @@ struct old_itimerval32 { struct old_timeval32 it_value; };
-static int put_old_itimerval32(struct old_itimerval32 __user *o, const struct itimerval *i) +static int put_old_itimerval32(struct old_itimerval32 __user *o, + const struct itimerspec64 *i) { struct old_itimerval32 v32;
v32.it_interval.tv_sec = i->it_interval.tv_sec; - v32.it_interval.tv_usec = i->it_interval.tv_usec; + v32.it_interval.tv_usec = i->it_interval.tv_nsec / NSEC_PER_USEC; v32.it_value.tv_sec = i->it_value.tv_sec; - v32.it_value.tv_usec = i->it_value.tv_usec; + v32.it_value.tv_usec = i->it_value.tv_nsec / NSEC_PER_USEC; return copy_to_user(o, &v32, sizeof(struct old_itimerval32)) ? -EFAULT : 0; }
COMPAT_SYSCALL_DEFINE2(getitimer, int, which, - struct old_itimerval32 __user *, it) + struct old_itimerval32 __user *, value) { - struct itimerval kit; - int error = do_getitimer(which, &kit); + struct itimerspec64 get_buffer; + int error = do_getitimer(which, &get_buffer);
- if (!error && put_old_itimerval32(it, &kit)) + if (!error && put_old_itimerval32(value, &get_buffer)) error = -EFAULT; return error; } @@ -156,8 +166,8 @@ enum hrtimer_restart it_real_fn(struct hrtimer *timer) }
static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, - const struct itimerval *const value, - struct itimerval *const ovalue) + const struct itimerspec64 *const value, + struct itimerspec64 *const ovalue) { u64 oval, nval, ointerval, ninterval; struct cpu_itimer *it = &tsk->signal->it[clock_id]; @@ -166,8 +176,8 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, * Use the to_ktime conversion because that clamps the maximum * value to KTIME_MAX and avoid multiplication overflows. */ - nval = ktime_to_ns(timeval_to_ktime(value->it_value)); - ninterval = ktime_to_ns(timeval_to_ktime(value->it_interval)); + nval = timespec64_to_ns(&value->it_value); + ninterval = timespec64_to_ns(&value->it_interval);
spin_lock_irq(&tsk->sighand->siglock);
@@ -186,8 +196,8 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, spin_unlock_irq(&tsk->sighand->siglock);
if (ovalue) { - ovalue->it_value = ns_to_timeval(oval); - ovalue->it_interval = ns_to_timeval(ointerval); + ovalue->it_value = ns_to_timespec64(oval); + ovalue->it_interval = ns_to_timespec64(ointerval); } }
@@ -197,19 +207,13 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
-static int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue) +static int do_setitimer(int which, struct itimerspec64 *value, + struct itimerspec64 *ovalue) { struct task_struct *tsk = current; struct hrtimer *timer; ktime_t expires;
- /* - * Validate the timevals in value. - */ - if (!timeval_valid(&value->it_value) || - !timeval_valid(&value->it_interval)) - return -EINVAL; - switch (which) { case ITIMER_REAL: again: @@ -218,7 +222,7 @@ static int do_setitimer(int which, struct itimerval *value, struct itimerval *ov if (ovalue) { ovalue->it_value = itimer_get_remtime(timer); ovalue->it_interval - = ktime_to_timeval(tsk->signal->it_real_incr); + = ktime_to_timespec64(tsk->signal->it_real_incr); } /* We are sharing ->siglock with it_real_fn() */ if (hrtimer_try_to_cancel(timer) < 0) { @@ -226,10 +230,10 @@ static int do_setitimer(int which, struct itimerval *value, struct itimerval *ov hrtimer_cancel_wait_running(timer); goto again; } - expires = timeval_to_ktime(value->it_value); + expires = timespec64_to_ktime(value->it_value); if (expires != 0) { tsk->signal->it_real_incr = - timeval_to_ktime(value->it_interval); + timespec64_to_ktime(value->it_interval); hrtimer_start(timer, expires, HRTIMER_MODE_REL); } else tsk->signal->it_real_incr = 0; @@ -252,7 +256,7 @@ static int do_setitimer(int which, struct itimerval *value, struct itimerval *ov #ifdef CONFIG_SECURITY_SELINUX void clear_itimer(void) { - struct itimerval v = {}; + struct itimerspec64 v = {}; int i;
for (i = 0; i < 3; i++) @@ -276,15 +280,15 @@ void clear_itimer(void) */ static unsigned int alarm_setitimer(unsigned int seconds) { - struct itimerval it_new, it_old; + struct itimerspec64 it_new, it_old;
#if BITS_PER_LONG < 64 if (seconds > INT_MAX) seconds = INT_MAX; #endif it_new.it_value.tv_sec = seconds; - it_new.it_value.tv_usec = 0; - it_new.it_interval.tv_sec = it_new.it_interval.tv_usec = 0; + it_new.it_value.tv_nsec = 0; + it_new.it_interval.tv_sec = it_new.it_interval.tv_nsec = 0;
do_setitimer(ITIMER_REAL, &it_new, &it_old);
@@ -292,8 +296,8 @@ static unsigned int alarm_setitimer(unsigned int seconds) * We can't return 0 if we have an alarm pending ... And we'd * better return too much than too little anyway */ - if ((!it_old.it_value.tv_sec && it_old.it_value.tv_usec) || - it_old.it_value.tv_usec >= 500000) + if ((!it_old.it_value.tv_sec && it_old.it_value.tv_nsec) || + it_old.it_value.tv_nsec >= 500000) it_old.it_value.tv_sec++;
return it_old.it_value.tv_sec; @@ -310,15 +314,35 @@ SYSCALL_DEFINE1(alarm, unsigned int, seconds)
#endif
+static int get_itimerval(struct itimerspec64 *o, const struct itimerval __user *i) +{ + struct itimerval v; + + if (copy_from_user(&v, i, sizeof(struct itimerval))) + return -EFAULT; + + /* Validate the timevals in value. */ + if (!timeval_valid(&v.it_value) || + !timeval_valid(&v.it_interval)) + return -EINVAL; + + o->it_interval.tv_sec = v.it_interval.tv_sec; + o->it_interval.tv_nsec = v.it_interval.tv_usec * NSEC_PER_USEC; + o->it_value.tv_sec = v.it_value.tv_sec; + o->it_value.tv_nsec = v.it_value.tv_usec * NSEC_PER_USEC; + return 0; +} + SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value, struct itimerval __user *, ovalue) { - struct itimerval set_buffer, get_buffer; + struct itimerspec64 set_buffer, get_buffer; int error;
if (value) { - if(copy_from_user(&set_buffer, value, sizeof(set_buffer))) - return -EFAULT; + error = get_itimerval(&set_buffer, value); + if (error) + return error; } else { memset(&set_buffer, 0, sizeof(set_buffer)); printk_once(KERN_WARNING "%s calls setitimer() with new_value NULL pointer." @@ -330,43 +354,53 @@ SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value, if (error || !ovalue) return error;
- if (copy_to_user(ovalue, &get_buffer, sizeof(get_buffer))) + if (put_itimerval(ovalue, &get_buffer)) return -EFAULT; return 0; }
#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA) -static int get_old_itimerval32(struct itimerval *o, const struct old_itimerval32 __user *i) +static int get_old_itimerval32(struct itimerspec64 *o, const struct old_itimerval32 __user *i) { struct old_itimerval32 v32;
if (copy_from_user(&v32, i, sizeof(struct old_itimerval32))) return -EFAULT; + + /* Validate the timevals in value. */ + if (!timeval_valid(&v32.it_value) || + !timeval_valid(&v32.it_interval)) + return -EINVAL; + o->it_interval.tv_sec = v32.it_interval.tv_sec; - o->it_interval.tv_usec = v32.it_interval.tv_usec; + o->it_interval.tv_nsec = v32.it_interval.tv_usec * NSEC_PER_USEC; o->it_value.tv_sec = v32.it_value.tv_sec; - o->it_value.tv_usec = v32.it_value.tv_usec; + o->it_value.tv_nsec = v32.it_value.tv_usec * NSEC_PER_USEC; return 0; }
COMPAT_SYSCALL_DEFINE3(setitimer, int, which, - struct old_itimerval32 __user *, in, - struct old_itimerval32 __user *, out) + struct old_itimerval32 __user *, value, + struct old_itimerval32 __user *, ovalue) { - struct itimerval kin, kout; + struct itimerspec64 set_buffer, get_buffer; int error;
- if (in) { - if (get_old_itimerval32(&kin, in)) - return -EFAULT; + if (value) { + error = get_old_itimerval32(&set_buffer, value); + if (error) + return error; } else { - memset(&kin, 0, sizeof(kin)); + memset(&set_buffer, 0, sizeof(set_buffer)); + printk_once(KERN_WARNING "%s calls setitimer() with new_value NULL pointer." + " Misfeature support will be removed\n", + current->comm); }
- error = do_setitimer(which, &kin, out ? &kout : NULL); - if (error || !out) + error = do_setitimer(which, &set_buffer, ovalue ? &get_buffer : NULL); + if (error || !ovalue) return error; - if (put_old_itimerval32(out, &kout)) + if (put_old_itimerval32(ovalue, &get_buffer)) return -EFAULT; return 0; }
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
TRACE_EVENT(itimer_state,
- TP_PROTO(int which, const struct itimerval *const value,
- TP_PROTO(int which, const struct itimerspec64 *const value, unsigned long long expires),
TP_ARGS(which, value, expires), @@ -321,12 +321,12 @@ TRACE_EVENT(itimer_state, __entry->which = which; __entry->expires = expires; __entry->value_sec = value->it_value.tv_sec;
__entry->value_usec = value->it_value.tv_usec;
__entry->interval_sec = value->it_interval.tv_sec;__entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC;
__entry->interval_usec = value->it_interval.tv_usec;
__entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC;
Hmm, having a division in a tracepoint is clearly suboptimal.
),
- TP_printk("which=%d expires=%llu it_value=%ld.%ld it_interval=%ld.%ld",
- TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld",
We print only 6 digits after the . so that would be even correct w/o a division. But it probably does not matter much.
@@ -197,19 +207,13 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
Hrm, why do we have yet another incarnation of timeval_valid()? Can we please have only one (the inline version)?
Thanks,
tglx
On Wed, 13 Nov 2019 23:28:47 +0100 (CET) Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
TRACE_EVENT(itimer_state,
- TP_PROTO(int which, const struct itimerval *const value,
- TP_PROTO(int which, const struct itimerspec64 *const value, unsigned long long expires),
TP_ARGS(which, value, expires), @@ -321,12 +321,12 @@ TRACE_EVENT(itimer_state, __entry->which = which; __entry->expires = expires; __entry->value_sec = value->it_value.tv_sec;
__entry->value_usec = value->it_value.tv_usec;
__entry->interval_sec = value->it_interval.tv_sec;__entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC;
__entry->interval_usec = value->it_interval.tv_usec;
__entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC;
Hmm, having a division in a tracepoint is clearly suboptimal.
Right, we should move the division into the TP_printk()
__entry->interval_nsec = alue->it_interval.tv_nsec;
),
- TP_printk("which=%d expires=%llu it_value=%ld.%ld it_interval=%ld.%ld",
- TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld",
We print only 6 digits after the . so that would be even correct w/o a division. But it probably does not matter much.
Well, we still need the division in the printk, otherwise it will print more than 6. That's just the minimum and it will print the full number.
__entry->interval_nsec / NSEC_PER_USEC
-- Steve
@@ -197,19 +207,13 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
Hrm, why do we have yet another incarnation of timeval_valid()? Can we please have only one (the inline version)?
Thanks,
tglx
On Wed, 13 Nov 2019, Steven Rostedt wrote:
On Wed, 13 Nov 2019 23:28:47 +0100 (CET) Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
TRACE_EVENT(itimer_state,
- TP_PROTO(int which, const struct itimerval *const value,
- TP_PROTO(int which, const struct itimerspec64 *const value, unsigned long long expires),
TP_ARGS(which, value, expires), @@ -321,12 +321,12 @@ TRACE_EVENT(itimer_state, __entry->which = which; __entry->expires = expires; __entry->value_sec = value->it_value.tv_sec;
__entry->value_usec = value->it_value.tv_usec;
__entry->interval_sec = value->it_interval.tv_sec;__entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC;
__entry->interval_usec = value->it_interval.tv_usec;
__entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC;
Hmm, having a division in a tracepoint is clearly suboptimal.
Right, we should move the division into the TP_printk()
__entry->interval_nsec = alue->it_interval.tv_nsec;
),
- TP_printk("which=%d expires=%llu it_value=%ld.%ld it_interval=%ld.%ld",
- TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld",
We print only 6 digits after the . so that would be even correct w/o a division. But it probably does not matter much.
Well, we still need the division in the printk, otherwise it will print more than 6. That's just the minimum and it will print the full number.
That's fine. The print is not really timing critical, the tracepoint very much so.
Thanks,
tglx
On Thu, Nov 14, 2019 at 3:06 AM Steven Rostedt rostedt@goodmis.org wrote:
On Wed, 13 Nov 2019 23:28:47 +0100 (CET) Thomas Gleixner tglx@linutronix.de wrote:
__entry->value_usec = value->it_value.tv_usec;
__entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC; __entry->interval_sec = value->it_interval.tv_sec;
__entry->interval_usec = value->it_interval.tv_usec;
__entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC;
Hmm, having a division in a tracepoint is clearly suboptimal.
Right, we should move the division into the TP_printk()
__entry->interval_nsec = alue->it_interval.tv_nsec;
Ok, fixed now, thanks for the suggestion!
Arnd
On Wed, Nov 13, 2019 at 11:28 PM Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
@@ -321,12 +321,12 @@ TRACE_EVENT(itimer_state, __entry->which = which; __entry->expires = expires; __entry->value_sec = value->it_value.tv_sec;
__entry->value_usec = value->it_value.tv_usec;
__entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC; __entry->interval_sec = value->it_interval.tv_sec;
__entry->interval_usec = value->it_interval.tv_usec;
__entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC;
Hmm, having a division in a tracepoint is clearly suboptimal.
Ok, moving it to the TP_printk() as Steven suggested.
TP_printk("which=%d expires=%llu it_value=%ld.%ld it_interval=%ld.%ld",
TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld",
We print only 6 digits after the . so that would be even correct w/o a division. But it probably does not matter much.
This is just a cosmetic fix, it can be a separate patch if you care. The idea is to print the numbers as normal decimal representation, e.g. 0.001000 for a millisecond instead of the nonstandard 0.1000.
@@ -197,19 +207,13 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
Hrm, why do we have yet another incarnation of timeval_valid()?
No idea, you have to ask the author of commit 7d99b7d634d8 ("[PATCH] Validate and sanitze itimer timeval from userspace") ;-)
Can we please have only one (the inline version)?
I'm removing the inline version in a later patch along with most of the rest of include/linux/time32.h.
Having the macro version is convenient for this patch, since I'm using it on two different structures (itimerval/__kernel_old_timeval and old_itimerval32/old_timeval32), neither of which is the type used in the inline function.
I could use two local inline functions instead of the macro, or just open code both call sites if you prefer that.
Arnd
On Thu, 14 Nov 2019, Arnd Bergmann wrote:
On Wed, Nov 13, 2019 at 11:28 PM Thomas Gleixner tglx@linutronix.de wrote:
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
@@ -197,19 +207,13 @@ static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
Hrm, why do we have yet another incarnation of timeval_valid()?
No idea, you have to ask the author of commit 7d99b7d634d8 ("[PATCH] Validate and sanitze itimer timeval from userspace") ;-)
I don't know that guy. :)
Can we please have only one (the inline version)?
I'm removing the inline version in a later patch along with most of the rest of include/linux/time32.h.
Having the macro version is convenient for this patch, since I'm using it on two different structures (itimerval/__kernel_old_timeval and old_itimerval32/old_timeval32), neither of which is the type used in the inline function.
Fair enough.
Thanks,
tglx
On Fri, 2019-11-08 at 22:12 +0100, Arnd Bergmann wrote: [...]
@@ -292,8 +296,8 @@ static unsigned int alarm_setitimer(unsigned int seconds) * We can't return 0 if we have an alarm pending ... And we'd * better return too much than too little anyway */
- if ((!it_old.it_value.tv_sec && it_old.it_value.tv_usec) ||
it_old.it_value.tv_usec >= 500000)
- if ((!it_old.it_value.tv_sec && it_old.it_value.tv_nsec) ||
it_old.it_value.tv_nsec >= 500000)
[...]
This is now off by a factor of 1000. It might be helpful to use NSEC_PER_SEC / 2 here so no-one has to count the 0 digits.
Ben.
On Thu, Nov 21, 2019 at 5:52 PM Ben Hutchings ben.hutchings@codethink.co.uk wrote:
On Fri, 2019-11-08 at 22:12 +0100, Arnd Bergmann wrote: [...]
@@ -292,8 +296,8 @@ static unsigned int alarm_setitimer(unsigned int seconds) * We can't return 0 if we have an alarm pending ... And we'd * better return too much than too little anyway */
if ((!it_old.it_value.tv_sec && it_old.it_value.tv_usec) ||
it_old.it_value.tv_usec >= 500000)
if ((!it_old.it_value.tv_sec && it_old.it_value.tv_nsec) ||
it_old.it_value.tv_nsec >= 500000)
[...]
This is now off by a factor of 1000. It might be helpful to use NSEC_PER_SEC / 2 here so no-one has to count the 0 digits.
Fixed now, thanks a lot for pointing it out!
Arnd
Avoid the intermediate step going from timeval to timespec64 to ktime_t and back by using ktime_t throughout the code.
I was going back and forth between the two implementations. This patch is optional: if we want it, it could be folded into the patch converting to itimerspec64.
On an arm32 build, this version actually produces 10% larger code than the timespec64 version, while on x86-64 it's the same as before, and the number of source lines stays the same as well.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- include/trace/events/timer.h | 29 ++--- kernel/time/itimer.c | 229 ++++++++++++++++++----------------- 2 files changed, 129 insertions(+), 129 deletions(-)
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h index 939cbfc80015..021733a49f10 100644 --- a/include/trace/events/timer.h +++ b/include/trace/events/timer.h @@ -297,39 +297,36 @@ DEFINE_EVENT(hrtimer_class, hrtimer_cancel, /** * itimer_state - called when itimer is started or canceled * @which: name of the interval timer - * @value: the itimers value, itimer is canceled if value->it_value is - * zero, otherwise it is started + * @interval: the itimers value + * @interval: the itimers interval * @expires: the itimers expiry time */ TRACE_EVENT(itimer_state,
- TP_PROTO(int which, const struct itimerspec64 *const value, - unsigned long long expires), + TP_PROTO(int which, ktime_t value, ktime_t interval, unsigned long long expires),
- TP_ARGS(which, value, expires), + TP_ARGS(which, value, interval, expires),
TP_STRUCT__entry( __field( int, which ) __field( unsigned long long, expires ) - __field( long, value_sec ) - __field( long, value_usec ) - __field( long, interval_sec ) - __field( long, interval_usec ) + __field( time64_t, value_sec ) + __field( u32, value_nsec ) + __field( time64_t, interval_sec ) + __field( u32, interval_nsec ) ),
TP_fast_assign( __entry->which = which; __entry->expires = expires; - __entry->value_sec = value->it_value.tv_sec; - __entry->value_usec = value->it_value.tv_nsec / NSEC_PER_USEC; - __entry->interval_sec = value->it_interval.tv_sec; - __entry->interval_usec = value->it_interval.tv_nsec / NSEC_PER_USEC; + __entry->value_sec = div_u64_rem(value, NSEC_PER_SEC, &__entry->value_nsec); + __entry->interval_sec = div_u64_rem(interval, NSEC_PER_SEC, &__entry->interval_nsec); ),
- TP_printk("which=%d expires=%llu it_value=%ld.%06ld it_interval=%ld.%06ld", + TP_printk("which=%d expires=%llu it_value=%lld.%09d it_interval=%lld.%09d", __entry->which, __entry->expires, - __entry->value_sec, __entry->value_usec, - __entry->interval_sec, __entry->interval_usec) + __entry->value_sec, __entry->value_nsec, + __entry->interval_sec, __entry->interval_nsec) );
/** diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c index 5872db9bd5f7..707f73054d76 100644 --- a/kernel/time/itimer.c +++ b/kernel/time/itimer.c @@ -26,7 +26,7 @@ * Returns the delta between the expiry time and now, which can be * less than zero or 1usec for an pending expired timer */ -static struct timespec64 itimer_get_remtime(struct hrtimer *timer) +static ktime_t itimer_get_remtime(struct hrtimer *timer) { ktime_t rem = __hrtimer_get_remaining(timer, true);
@@ -41,11 +41,11 @@ static struct timespec64 itimer_get_remtime(struct hrtimer *timer) } else rem = 0;
- return ktime_to_timespec64(rem); + return rem; }
static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, - struct itimerspec64 *const value) + ktime_t *ovalue, ktime_t *ointerval) { u64 val, interval; struct cpu_itimer *it = &tsk->signal->it[clock_id]; @@ -69,27 +69,26 @@ static void get_cpu_itimer(struct task_struct *tsk, unsigned int clock_id,
spin_unlock_irq(&tsk->sighand->siglock);
- value->it_value = ns_to_timespec64(val); - value->it_interval = ns_to_timespec64(interval); + *ovalue = ns_to_ktime(val); + *ointerval = ns_to_ktime(interval); }
-static int do_getitimer(int which, struct itimerspec64 *value) +static int do_getitimer(int which, ktime_t *value, ktime_t *interval) { struct task_struct *tsk = current;
switch (which) { case ITIMER_REAL: spin_lock_irq(&tsk->sighand->siglock); - value->it_value = itimer_get_remtime(&tsk->signal->real_timer); - value->it_interval = - ktime_to_timespec64(tsk->signal->it_real_incr); + *value = itimer_get_remtime(&tsk->signal->real_timer); + *interval = tsk->signal->it_real_incr; spin_unlock_irq(&tsk->sighand->siglock); break; case ITIMER_VIRTUAL: - get_cpu_itimer(tsk, CPUCLOCK_VIRT, value); + get_cpu_itimer(tsk, CPUCLOCK_VIRT, value, interval); break; case ITIMER_PROF: - get_cpu_itimer(tsk, CPUCLOCK_PROF, value); + get_cpu_itimer(tsk, CPUCLOCK_PROF, value, interval); break; default: return(-EINVAL); @@ -98,26 +97,31 @@ static int do_getitimer(int which, struct itimerspec64 *value) }
static int put_itimerval(struct itimerval __user *o, - const struct itimerspec64 *i) + ktime_t value, ktime_t interval) { struct itimerval v; - - v.it_interval.tv_sec = i->it_interval.tv_sec; - v.it_interval.tv_usec = i->it_interval.tv_nsec / NSEC_PER_USEC; - v.it_value.tv_sec = i->it_value.tv_sec; - v.it_value.tv_usec = i->it_value.tv_nsec / NSEC_PER_USEC; + struct timespec64 ts; + + ts = ktime_to_timespec64(interval); + v.it_interval.tv_sec = ts.tv_sec; + v.it_interval.tv_usec = ts.tv_nsec / NSEC_PER_USEC; + ts = ktime_to_timespec64(value); + v.it_value.tv_sec = ts.tv_sec; + v.it_value.tv_usec = ts.tv_nsec / NSEC_PER_USEC; return copy_to_user(o, &v, sizeof(struct itimerval)) ? -EFAULT : 0; }
-SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value) +SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, it) { - struct itimerspec64 get_buffer; - int error = do_getitimer(which, &get_buffer); + ktime_t value, interval; + int error; + + error = do_getitimer(which, &value, &interval); + if (error) + return error;
- if (!error && put_itimerval(value, &get_buffer)) - error = -EFAULT; - return error; + return put_itimerval(it, value, interval); }
#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA) @@ -127,26 +131,31 @@ struct old_itimerval32 { };
static int put_old_itimerval32(struct old_itimerval32 __user *o, - const struct itimerspec64 *i) + ktime_t value, ktime_t interval) { struct old_itimerval32 v32; - - v32.it_interval.tv_sec = i->it_interval.tv_sec; - v32.it_interval.tv_usec = i->it_interval.tv_nsec / NSEC_PER_USEC; - v32.it_value.tv_sec = i->it_value.tv_sec; - v32.it_value.tv_usec = i->it_value.tv_nsec / NSEC_PER_USEC; + struct timespec64 ts; + + ts = ktime_to_timespec64(interval); + v32.it_interval.tv_sec = ts.tv_sec; + v32.it_interval.tv_usec = ts.tv_nsec / NSEC_PER_USEC; + ts = ktime_to_timespec64(value); + v32.it_value.tv_sec = ts.tv_sec; + v32.it_value.tv_usec = ts.tv_nsec / NSEC_PER_USEC; return copy_to_user(o, &v32, sizeof(struct old_itimerval32)) ? -EFAULT : 0; }
COMPAT_SYSCALL_DEFINE2(getitimer, int, which, - struct old_itimerval32 __user *, value) + struct old_itimerval32 __user *, it) { - struct itimerspec64 get_buffer; - int error = do_getitimer(which, &get_buffer); + ktime_t value, interval; + int error;
- if (!error && put_old_itimerval32(value, &get_buffer)) - error = -EFAULT; - return error; + error = do_getitimer(which, &value, &interval); + if (error) + return error; + + return put_old_itimerval32(it, value, interval); } #endif
@@ -166,86 +175,78 @@ enum hrtimer_restart it_real_fn(struct hrtimer *timer) }
static void set_cpu_itimer(struct task_struct *tsk, unsigned int clock_id, - const struct itimerspec64 *const value, - struct itimerspec64 *const ovalue) + ktime_t value, ktime_t interval, + ktime_t *ovalue, ktime_t *ointerval) { - u64 oval, nval, ointerval, ninterval; + u64 oval, nval; struct cpu_itimer *it = &tsk->signal->it[clock_id];
- /* - * Use the to_ktime conversion because that clamps the maximum - * value to KTIME_MAX and avoid multiplication overflows. - */ - nval = timespec64_to_ns(&value->it_value); - ninterval = timespec64_to_ns(&value->it_interval); + nval = value;
spin_lock_irq(&tsk->sighand->siglock);
oval = it->expires; - ointerval = it->incr; + if (ointerval) + *ointerval = it->incr; + if (oval || nval) { if (nval > 0) nval += TICK_NSEC; set_process_cpu_timer(tsk, clock_id, &nval, &oval); } + it->expires = nval; - it->incr = ninterval; + it->incr = interval; + trace_itimer_state(clock_id == CPUCLOCK_VIRT ? - ITIMER_VIRTUAL : ITIMER_PROF, value, nval); + ITIMER_VIRTUAL : ITIMER_PROF, value, interval, nval);
spin_unlock_irq(&tsk->sighand->siglock);
- if (ovalue) { - ovalue->it_value = ns_to_timespec64(oval); - ovalue->it_interval = ns_to_timespec64(ointerval); - } + if (ovalue) + *ovalue = oval; }
-/* - * Returns true if the timeval is in canonical form - */ #define timeval_valid(t) \ (((t)->tv_sec >= 0) && (((unsigned long) (t)->tv_usec) < USEC_PER_SEC))
-static int do_setitimer(int which, struct itimerspec64 *value, - struct itimerspec64 *ovalue) +static int do_setitimer(int which, ktime_t value, ktime_t interval, + ktime_t *ovalue, ktime_t *ointerval) { struct task_struct *tsk = current; struct hrtimer *timer; - ktime_t expires;
switch (which) { case ITIMER_REAL: again: spin_lock_irq(&tsk->sighand->siglock); timer = &tsk->signal->real_timer; - if (ovalue) { - ovalue->it_value = itimer_get_remtime(timer); - ovalue->it_interval - = ktime_to_timespec64(tsk->signal->it_real_incr); - } + if (ovalue) + *ovalue = itimer_get_remtime(timer); + if (ointerval) + *ointerval = tsk->signal->it_real_incr; /* We are sharing ->siglock with it_real_fn() */ if (hrtimer_try_to_cancel(timer) < 0) { spin_unlock_irq(&tsk->sighand->siglock); hrtimer_cancel_wait_running(timer); goto again; } - expires = timespec64_to_ktime(value->it_value); - if (expires != 0) { - tsk->signal->it_real_incr = - timespec64_to_ktime(value->it_interval); - hrtimer_start(timer, expires, HRTIMER_MODE_REL); + if (value != 0) { + tsk->signal->it_real_incr = interval; + hrtimer_start(timer, value, HRTIMER_MODE_REL); } else tsk->signal->it_real_incr = 0;
- trace_itimer_state(ITIMER_REAL, value, 0); + trace_itimer_state(ITIMER_REAL, value, interval, 0); spin_unlock_irq(&tsk->sighand->siglock); break; case ITIMER_VIRTUAL: - set_cpu_itimer(tsk, CPUCLOCK_VIRT, value, ovalue); + set_cpu_itimer(tsk, CPUCLOCK_VIRT, value, interval, + ovalue, ointerval); break; case ITIMER_PROF: - set_cpu_itimer(tsk, CPUCLOCK_PROF, value, ovalue); + set_cpu_itimer(tsk, CPUCLOCK_PROF, value, interval, + ovalue, ointerval); break; default: return -EINVAL; @@ -256,11 +257,10 @@ static int do_setitimer(int which, struct itimerspec64 *value, #ifdef CONFIG_SECURITY_SELINUX void clear_itimer(void) { - struct itimerspec64 v = {}; int i;
for (i = 0; i < 3; i++) - do_setitimer(i, &v, NULL); + do_setitimer(i, 0, 0, NULL, NULL); } #endif
@@ -280,27 +280,23 @@ void clear_itimer(void) */ static unsigned int alarm_setitimer(unsigned int seconds) { - struct itimerspec64 it_new, it_old; + ktime_t old;
#if BITS_PER_LONG < 64 if (seconds > INT_MAX) seconds = INT_MAX; #endif - it_new.it_value.tv_sec = seconds; - it_new.it_value.tv_nsec = 0; - it_new.it_interval.tv_sec = it_new.it_interval.tv_nsec = 0;
- do_setitimer(ITIMER_REAL, &it_new, &it_old); + do_setitimer(ITIMER_REAL, ktime_set(seconds, 0), 0, &old, NULL);
/* * We can't return 0 if we have an alarm pending ... And we'd * better return too much than too little anyway */ - if ((!it_old.it_value.tv_sec && it_old.it_value.tv_nsec) || - it_old.it_value.tv_nsec >= 500000) - it_old.it_value.tv_sec++; + if (old > 0 && old < NSEC_PER_SEC) + return 1;
- return it_old.it_value.tv_sec; + return div_u64(old + (NSEC_PER_SEC / 2), NSEC_PER_SEC); }
/* @@ -314,7 +310,8 @@ SYSCALL_DEFINE1(alarm, unsigned int, seconds)
#endif
-static int get_itimerval(struct itimerspec64 *o, const struct itimerval __user *i) +static int get_itimerval(ktime_t *value, ktime_t *interval, + const struct itimerval __user *i) { struct itimerval v;
@@ -326,41 +323,45 @@ static int get_itimerval(struct itimerspec64 *o, const struct itimerval __user * !timeval_valid(&v.it_interval)) return -EINVAL;
- o->it_interval.tv_sec = v.it_interval.tv_sec; - o->it_interval.tv_nsec = v.it_interval.tv_usec * NSEC_PER_USEC; - o->it_value.tv_sec = v.it_value.tv_sec; - o->it_value.tv_nsec = v.it_value.tv_usec * NSEC_PER_USEC; + *interval = ktime_set(v.it_interval.tv_sec, + v.it_interval.tv_usec * NSEC_PER_USEC); + *value = ktime_set(v.it_value.tv_sec, + v.it_value.tv_usec * NSEC_PER_USEC); + return 0; }
-SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value, - struct itimerval __user *, ovalue) + +SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, in, + struct itimerval __user *, out) { - struct itimerspec64 set_buffer, get_buffer; + ktime_t value = 0, interval = 0; + ktime_t ovalue, ointerval; int error;
- if (value) { - error = get_itimerval(&set_buffer, value); + if (in) { + error = get_itimerval(&value, &interval, in); if (error) return error; } else { - memset(&set_buffer, 0, sizeof(set_buffer)); printk_once(KERN_WARNING "%s calls setitimer() with new_value NULL pointer." " Misfeature support will be removed\n", current->comm); }
- error = do_setitimer(which, &set_buffer, ovalue ? &get_buffer : NULL); - if (error || !ovalue) + if (!out) + return do_setitimer(which, value, interval, NULL, NULL); + + error = do_setitimer(which, value, interval, &ovalue, &ointerval); + if (error) return error;
- if (put_itimerval(ovalue, &get_buffer)) - return -EFAULT; - return 0; + return put_itimerval(out, ovalue, ointerval); }
#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA) -static int get_old_itimerval32(struct itimerspec64 *o, const struct old_itimerval32 __user *i) +static int get_old_itimerval32(ktime_t *value, ktime_t *interval, + const struct old_itimerval32 __user *i) { struct old_itimerval32 v32;
@@ -372,36 +373,38 @@ static int get_old_itimerval32(struct itimerspec64 *o, const struct old_itimerva !timeval_valid(&v32.it_interval)) return -EINVAL;
- o->it_interval.tv_sec = v32.it_interval.tv_sec; - o->it_interval.tv_nsec = v32.it_interval.tv_usec * NSEC_PER_USEC; - o->it_value.tv_sec = v32.it_value.tv_sec; - o->it_value.tv_nsec = v32.it_value.tv_usec * NSEC_PER_USEC; + *interval = ktime_set(v32.it_interval.tv_sec, + v32.it_interval.tv_usec * NSEC_PER_USEC); + *value = ktime_set(v32.it_value.tv_sec, + v32.it_value.tv_usec * NSEC_PER_USEC); + return 0; }
COMPAT_SYSCALL_DEFINE3(setitimer, int, which, - struct old_itimerval32 __user *, value, - struct old_itimerval32 __user *, ovalue) + struct old_itimerval32 __user *, in, + struct old_itimerval32 __user *, out) { - struct itimerspec64 set_buffer, get_buffer; + ktime_t value = 0, interval = 0; + ktime_t ovalue, ointerval; int error;
- if (value) { - error = get_old_itimerval32(&set_buffer, value); + if (in) { + error = get_old_itimerval32(&value, &interval, in); if (error) return error; } else { - memset(&set_buffer, 0, sizeof(set_buffer)); printk_once(KERN_WARNING "%s calls setitimer() with new_value NULL pointer." " Misfeature support will be removed\n", current->comm); }
- error = do_setitimer(which, &set_buffer, ovalue ? &get_buffer : NULL); - if (error || !ovalue) + if (!out) + return do_setitimer(which, value, interval, NULL, NULL); + + error = do_setitimer(which, value, interval, &ovalue, &ointerval); + if (error) return error; - if (put_old_itimerval32(ovalue, &get_buffer)) - return -EFAULT; - return 0; + return put_old_itimerval32(out, ovalue, ointerval); } #endif
On Fri, 8 Nov 2019, Arnd Bergmann wrote:
Avoid the intermediate step going from timeval to timespec64 to ktime_t and back by using ktime_t throughout the code.
I was going back and forth between the two implementations. This patch is optional: if we want it, it could be folded into the patch converting to itimerspec64.
On an arm32 build, this version actually produces 10% larger code than the timespec64 version, while on x86-64 it's the same as before, and the number of source lines stays the same as well.
Right. For 32bit without a native 64/32 division this is going to be more text and I'm not really convinvced that this buys us anything.
Thanks,
tglx
At the moment, the compilation of the old time32 system calls depends purely on the architecture. As systems with new libc based on 64-bit time_t are getting deployed, even architectures that previously supported these (notably x86-32 and arm32 but also many others) no longer depend on them, and removing them from a kernel image results in a smaller kernel binary, the same way we can leave out many other optional system calls.
More importantly, on an embedded system that needs to keep working beyond year 2038, any user space program calling these system calls is likely a bug, so removing them from the kernel image does provide an extra debugging help for finding broken applications.
I've gone back and forth on hiding this option unless CONFIG_EXPERT is set. This version leaves it visible based on the logic that eventually it will be turned off indefinitely.
Signed-off-by: Arnd Bergmann arnd@arndb.de --- arch/Kconfig | 3 ++- kernel/sys_ni.c | 23 +++++++++++++++++++++++ 2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/Kconfig b/arch/Kconfig index 0e1fded2940e..1203955ed4d0 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -797,7 +797,8 @@ config COMPAT_OLD_SIGACTION bool
config COMPAT_32BIT_TIME - def_bool !64BIT || COMPAT + bool "Provide system calls for 32-bit time_t" + default !64BIT || COMPAT help This enables 32 bit time_t support in addition to 64 bit time_t support. This is relevant on all 32-bit architectures, and 64-bit architectures diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index 34b76895b81e..3b69a560a7ac 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -410,6 +410,29 @@ COND_SYSCALL(send); COND_SYSCALL(bdflush); COND_SYSCALL(uselib);
+/* optional: time32 */ +COND_SYSCALL(time32); +COND_SYSCALL(stime32); +COND_SYSCALL(utime32); +COND_SYSCALL(adjtimex_time32); +COND_SYSCALL(sched_rr_get_interval_time32); +COND_SYSCALL(nanosleep_time32); +COND_SYSCALL(rt_sigtimedwait_time32); +COND_SYSCALL_COMPAT(rt_sigtimedwait_time32); +COND_SYSCALL(timer_settime32); +COND_SYSCALL(timer_gettime32); +COND_SYSCALL(clock_settime32); +COND_SYSCALL(clock_gettime32); +COND_SYSCALL(clock_getres_time32); +COND_SYSCALL(clock_nanosleep_time32); +COND_SYSCALL(utimes_time32); +COND_SYSCALL(futimesat_time32); +COND_SYSCALL(pselect6_time32); +COND_SYSCALL_COMPAT(pselect6_time32); +COND_SYSCALL(ppoll_time32); +COND_SYSCALL_COMPAT(ppoll_time32); +COND_SYSCALL(utimensat_time32); +COND_SYSCALL(clock_adjtime32);
/* * The syscalls below are not found in include/uapi/asm-generic/unistd.h
On Fri, Nov 08, 2019 at 10:12:22PM +0100, Arnd Bergmann wrote:
At the moment, the compilation of the old time32 system calls depends purely on the architecture. As systems with new libc based on 64-bit time_t are getting deployed, even architectures that previously supported these (notably x86-32 and arm32 but also many others) no longer depend on them, and removing them from a kernel image results in a smaller kernel binary, the same way we can leave out many other optional system calls.
More importantly, on an embedded system that needs to keep working beyond year 2038, any user space program calling these system calls is likely a bug, so removing them from the kernel image does provide an extra debugging help for finding broken applications.
I've gone back and forth on hiding this option unless CONFIG_EXPERT is set. This version leaves it visible based on the logic that eventually it will be turned off indefinitely.
Signed-off-by: Arnd Bergmann arnd@arndb.de
Acked-by: Christian Brauner christian.brauner@ubuntu.com
On Fri, Nov 8, 2019 at 10:04 PM Arnd Bergmann arnd@arndb.de wrote:
This is a series of cleanups for the y2038 work, mostly intended for namespace cleaning: the kernel defines the traditional time_t, timeval and timespec types that often lead to y2038-unsafe code. Even though the unsafe usage is mostly gone from the kernel, having the types and associated functions around means that we can still grow new users, and that we may be missing conversions to safe types that actually matter.
As there is no rush on any of these patches, I would either queue them up in linux-next through my y2038 branch, or Thomas could add them to the tip tree if he wants.
As mentioned in another series, this is part of a larger effort to fix all the remaining bits and pieces that are not completed yet from the y2038 conversion, and the full set can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=y...
Maintainers, please review and provide Acks.
Let me know if you have any opinion on whether we should do the include last two patches of this series or not.
Arnd
Arnd Bergmann (23): y2038: remove CONFIG_64BIT_TIME y2038: add __kernel_old_timespec and __kernel_old_time_t y2038: vdso: change timeval to __kernel_old_timeval y2038: vdso: change timespec to __kernel_old_timespec y2038: vdso: change time_t to __kernel_old_time_t y2038: vdso: nds32: open-code timespec_add_ns() y2038: vdso: powerpc: avoid timespec references y2038: ipc: remove __kernel_time_t reference from headers y2038: stat: avoid 'time_t' in 'struct stat' y2038: uapi: change __kernel_time_t to __kernel_old_time_t y2038: rusage: use __kernel_old_timeval y2038: syscalls: change remaining timeval to __kernel_old_timeval y2038: socket: remove timespec reference in timestamping y2038: make ns_to_compat_timeval use __kernel_old_timeval y2038: elfcore: Use __kernel_old_timeval for process times y2038: timerfd: Use timespec64 internally y2038: time: avoid timespec usage in settimeofday() y2038: itimer: compat handling to itimer.c y2038: use compat_{get,set}_itimer on alpha y2038: move itimer reset into itimer.c y2038: itimer: change implementation to timespec64 [RFC] y2038: itimer: use ktime_t internally y2038: allow disabling time32 system calls
I've dropped the "[RFC] y2038: itimer: use ktime_t internally" patch for the moment, and added two other patches from other series:
y2038: remove CONFIG_64BIT_TIME y2038: socket: use __kernel_old_timespec instead of timespec
Tentatively pushed out the patches with the Acks I have received so far to my y2038 branch on git.kernel.org so it gets included in linux-next.
If I hear no complaints, I'll send a pull request for the merge window, along with the compat-ioctl series I have already queued up in the same branch.
Arnd