This is a mostly unchanged copy of a series I sent back in April for
an initial review. All the earlier syscall patches that Deepa or I sent
got merged now, and this is the largest chunk of remaining patches.
Changes this time are:
- This is actually tested with the LTP syscalls test suite,
both before and after the CONFIG_64BIT_TIME change (which is not
included here). I have created a patch series for musl libc to use
64-bit time_t and change all the system calls over to the new entry
points for this. The only bugs I found during that testing were in
later parts of the conversion that I have not posted yet.
- I rewrote the sys_io_getevents conversion after the
introduction of sys_sys_io_getevents. We obviously don't need to have
two of each, so we will only provide sys_io_pgetevents() with 64-bit
time_t but not sys_io_getevents(), which the libc can implement on
top of the former.
- While we have Deepa's POSIX timer conversion merged now, we
still need to decide on how we want to do the replacement
ABI for getitimer()/setitimer(). Like getrusage()/waitid() and
clock_adjtime() and unlike the system calls I'm posting here,
there is no one obvious ABI.
- For ppoll()/pselect6(), the ABI is fairly clear, but the
implementation still needs to be done. I tested with a simple
prototype based on the existing compat code, but we can
probably improve that. This is something that Deepa still
wants to work on.
- Finally, Christoph Hellwig objected to the idea of reusing the
compat_ namespace for the 32-bit native case. Changing that
would be a departure from our plans so far[2], and would make
some things end up differently. Until we have decided on how this
is to be done, I've decided to not change the code for this
post. We can clearly rename all the symbols and I've implemented
that in [3] for the current linux-next (not including the
series here). This is something we can definitely do, but I'd
need to know soon whether we can merge this series unchanged
for 4.19 or if I should rebase it on top of that patch with the
alternative naming.
Arnd
---
Previous cover letter announcement below, see [4] for the full
series:
After the first timekeeping series from Deepa (merged into -tip now)
and my follow-up for IPC system calls, this is a third set of system
call conversions following the same principle.
Most of the changes are straightforward, so I'm grouping them into a
larger series even though the system calls are mostly unrelated to one
another. After this series, the remaining calls that need to be changed
are getrusage()/waitid(), pselect6/ppoll(), timer{,fd}_{get,set}time()
and getitimer()/setitimer(). Those will be sent separately, once they
are matured enough.
To put the changes into perspective, a list of all system calls that
require changes is available in a spreadsheet[5] and I have made
another experimental patch that changes over x86[6] and arm[7] to
actually use them.
Link [1] https://lore.kernel.org/lkml/20180712082034.GA8802@infradead.org/
Link [2] https://lwn.net/Articles/643234/
Link [3] https://lore.kernel.org/lkml/20180713133204.3123939-1-arnd@arndb.de/
Link [4] https://lore.kernel.org/lkml/20180425160311.2718314-1-arnd@arndb.de/
Link [5] https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T…
Link [6] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/…
Link [7] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/commit/…
Arnd Bergmann (17):
y2038: compat: Move common compat types to asm-generic/compat.h
y2038: Remove newstat family from default syscall set
y2038: Remove stat64 family from default syscall set
asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro
asm-generic: Remove empty asm/unistd.h
y2038: Change sys_utimensat() to use __kernel_timespec
y2038: Compile utimes()/futimesat() conditionally
y2038: utimes: Rework #ifdef guards for compat syscalls
y2038: futex: Move compat implementation into futex.c
y2038: futex: Add support for __kernel_timespec
y2038: Prepare sched_rr_get_interval for __kernel_timespec
y2038: aio: Prepare sys_io_{p,}getevents for __kernel_timespec
y2038: socket: Convert recvmmsg to __kernel_timespec
y2038: socket: Add compat_sys_recvmmsg_time64
y2038: signal: Change rt_sigtimedwait to use __kernel_timespec
y2038: Make compat_sys_rt_sigtimedwait usable on 32-bit
y2038: signal: Add compat_sys_rt_sigtimedwait_time64
arch/alpha/include/asm/unistd.h | 2 +
arch/arc/include/uapi/asm/unistd.h | 1 +
arch/arm/include/asm/unistd.h | 4 +-
arch/arm64/include/asm/compat.h | 20 +--
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/uapi/asm/unistd.h | 1 +
arch/c6x/include/uapi/asm/unistd.h | 1 +
arch/h8300/include/uapi/asm/unistd.h | 1 +
arch/hexagon/include/uapi/asm/unistd.h | 1 +
arch/ia64/include/asm/unistd.h | 3 +
arch/m68k/include/asm/unistd.h | 2 +-
arch/microblaze/include/asm/unistd.h | 2 +-
arch/mips/include/asm/compat.h | 22 +---
arch/mips/include/asm/unistd.h | 3 +-
arch/nds32/include/uapi/asm/unistd.h | 1 +
arch/nios2/include/uapi/asm/unistd.h | 1 +
arch/openrisc/include/uapi/asm/unistd.h | 1 +
arch/parisc/include/asm/compat.h | 18 +--
arch/parisc/include/asm/unistd.h | 3 +-
arch/powerpc/include/asm/compat.h | 18 +--
arch/powerpc/include/asm/unistd.h | 3 +-
arch/s390/include/asm/compat.h | 18 +--
arch/s390/include/asm/unistd.h | 3 +-
arch/sh/include/asm/unistd.h | 2 +-
arch/sparc/include/asm/compat.h | 19 +--
arch/sparc/include/asm/unistd.h | 3 +-
arch/unicore32/include/uapi/asm/unistd.h | 1 +
arch/x86/include/asm/compat.h | 19 +--
arch/x86/include/asm/unistd.h | 3 +-
arch/xtensa/include/asm/unistd.h | 2 +-
fs/aio.c | 77 ++++++++++--
fs/read_write.c | 2 +-
fs/stat.c | 3 +
fs/utimes.c | 59 +++++----
include/asm-generic/compat.h | 24 +++-
include/asm-generic/unistd.h | 13 --
include/linux/compat.h | 12 +-
include/linux/compat_time.h | 5 +
include/linux/futex.h | 8 --
include/linux/socket.h | 19 ++-
include/linux/syscalls.h | 25 ++--
include/uapi/asm-generic/unistd.h | 2 +
kernel/Makefile | 3 -
kernel/futex.c | 207 +++++++++++++++++++++++++++++--
kernel/futex_compat.c | 202 ------------------------------
kernel/sched/core.c | 4 +-
kernel/signal.c | 68 ++++++++--
kernel/sys_ni.c | 1 +
net/compat.c | 16 +--
net/socket.c | 55 ++++++--
50 files changed, 524 insertions(+), 461 deletions(-)
delete mode 100644 include/asm-generic/unistd.h
delete mode 100644 kernel/futex_compat.c
--
2.9.0
get_seconds() is deprecated because of the 32-bit overflow and will
be removed. All callers in ufs also truncate to a 32-bit number, so
nothing changes during the conversion, but this should be harmless as the
superblock and cylinder group timestamps are not visible to user space,
except for checking the fs-dirty state, wich works fine across the
overflow.
This moves the call to get_seconds() into a new inline function, with
a comment explaining the constraints, while converting it to
ktime_get_real_seconds().
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent on June 19, got an Ack but nobody picked up the
patch.
---
fs/ufs/balloc.c | 4 ++--
fs/ufs/ialloc.c | 2 +-
fs/ufs/super.c | 4 ++--
fs/ufs/util.h | 14 ++++++++++++++
4 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/fs/ufs/balloc.c b/fs/ufs/balloc.c
index e727ee07dbe4..075d3d9114c8 100644
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -547,7 +547,7 @@ static u64 ufs_add_fragments(struct inode *inode, u64 fragment,
/*
* Block can be extended
*/
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
for (i = newcount; i < (uspi->s_fpb - fragoff); i++)
if (ubh_isclr (UCPI_UBH(ucpi), ucpi->c_freeoff, fragno + i))
break;
@@ -639,7 +639,7 @@ static u64 ufs_alloc_fragments(struct inode *inode, unsigned cgno,
if (!ufs_cg_chkmagic(sb, ucg))
ufs_panic (sb, "ufs_alloc_fragments",
"internal error, bad magic number on cg %u", cgno);
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
if (count == uspi->s_fpb) {
result = ufs_alloccg_block (inode, ucpi, goal, err);
diff --git a/fs/ufs/ialloc.c b/fs/ufs/ialloc.c
index e1ef0f0a1353..c678fff2a04d 100644
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -89,7 +89,7 @@ void ufs_free_inode (struct inode * inode)
if (!ufs_cg_chkmagic(sb, ucg))
ufs_panic (sb, "ufs_free_fragments", "internal error, bad cg magic number");
- ucg->cg_time = cpu_to_fs32(sb, get_seconds());
+ ucg->cg_time = ufs_get_seconds(sb);
is_directory = S_ISDIR(inode->i_mode);
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 96a20a76e3c4..f48a5b802221 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -698,7 +698,7 @@ static int ufs_sync_fs(struct super_block *sb, int wait)
usb1 = ubh_get_usb_first(uspi);
usb3 = ubh_get_usb_third(uspi);
- usb1->fs_time = cpu_to_fs32(sb, get_seconds());
+ usb1->fs_time = ufs_get_seconds(sb);
if ((flags & UFS_ST_MASK) == UFS_ST_SUN ||
(flags & UFS_ST_MASK) == UFS_ST_SUNOS ||
(flags & UFS_ST_MASK) == UFS_ST_SUNx86)
@@ -1344,7 +1344,7 @@ static int ufs_remount (struct super_block *sb, int *mount_flags,
*/
if (*mount_flags & SB_RDONLY) {
ufs_put_super_internal(sb);
- usb1->fs_time = cpu_to_fs32(sb, get_seconds());
+ usb1->fs_time = ufs_get_seconds(sb);
if ((flags & UFS_ST_MASK) == UFS_ST_SUN
|| (flags & UFS_ST_MASK) == UFS_ST_SUNOS
|| (flags & UFS_ST_MASK) == UFS_ST_SUNx86)
diff --git a/fs/ufs/util.h b/fs/ufs/util.h
index 1907be6d5808..1fd3011ea623 100644
--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -590,3 +590,17 @@ static inline int ufs_is_data_ptr_zero(struct ufs_sb_private_info *uspi,
else
return *(__fs32 *)p == 0;
}
+
+static inline __fs32 ufs_get_seconds(struct super_block *sbp)
+{
+ time64_t now = ktime_get_real_seconds();
+
+ /* Signed 32-bit interpretation wraps around in 2038, which
+ * happens in ufs1 inode stamps but not ufs2 using 64-bits
+ * stamps. For superblock and blockgroup, let's assume
+ * unsigned 32-bit stamps, which are good until y2106.
+ * Wrap around rather than clamp here to make the dirty
+ * file system detection work in the superblock stamp.
+ */
+ return cpu_to_fs32(sbp, lower_32_bits(now));
+}
--
2.9.0
The get_seconds function is deprecated now since it returns a 32-bit
value that will eventually overflow, and we are replacing it throughout
the kernel with ktime_get_seconds() or ktime_get_real_seconds() that
return a time64_t.
bcache uses get_seconds() to read the current system time and store it in
the superblock as well as in uuid_entry structures that are user visible.
Unfortunately, the two structures in are still limited to 32 bits, so this
won't fix any real problems but will still overflow in year 2106. Let's
at least document that properly, in case we get an updated format in the
future it can be fixed. We still have a long time before the overflow
and checking the tools at https://github.com/koverstreet/bcache-tools
reveals no access to any of them.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
drivers/md/bcache/super.c | 12 ++++++------
include/uapi/linux/bcache.h | 4 ++--
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa4058e43202..74746d8ee05e 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -181,7 +181,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
goto err;
}
- sb->last_mount = get_seconds();
+ sb->last_mount = (u32)ktime_get_real_seconds();
err = NULL;
get_page(bh->b_page);
@@ -701,7 +701,7 @@ static void bcache_device_detach(struct bcache_device *d)
SET_UUID_FLASH_ONLY(u, 0);
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32((u32)ktime_get_real_seconds());
bch_uuid_write(d->c);
}
@@ -1027,7 +1027,7 @@ void bch_cached_dev_detach(struct cached_dev *dc)
int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
uint8_t *set_uuid)
{
- uint32_t rtime = cpu_to_le32(get_seconds());
+ uint32_t rtime = cpu_to_le32((u32)ktime_get_real_seconds());
struct uuid_entry *u;
struct cached_dev *exist_dc, *t;
@@ -1070,7 +1070,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
(BDEV_STATE(&dc->sb) == BDEV_STATE_STALE ||
BDEV_STATE(&dc->sb) == BDEV_STATE_NONE)) {
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32((u32)ktime_get_real_seconds());
u = NULL;
}
@@ -1390,7 +1390,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
get_random_bytes(u->uuid, 16);
memset(u->label, 0, 32);
- u->first_reg = u->last_reg = cpu_to_le32(get_seconds());
+ u->first_reg = u->last_reg = cpu_to_le32((u32)ktime_get_real_seconds());
SET_UUID_FLASH_ONLY(u, 1);
u->sectors = size >> 9;
@@ -1894,7 +1894,7 @@ static void run_cache_set(struct cache_set *c)
goto err;
closure_sync(&cl);
- c->sb.last_mount = get_seconds();
+ c->sb.last_mount = (u32)ktime_get_real_seconds();
bcache_write_super(c);
list_for_each_entry_safe(dc, t, &uncached_devices, list)
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 821f71a2e48f..8d19e02d752a 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -195,7 +195,7 @@ struct cache_sb {
};
};
- __u32 last_mount; /* time_t */
+ __u32 last_mount; /* time overflow in y2106 */
__u16 first_bucket;
union {
@@ -318,7 +318,7 @@ struct uuid_entry {
struct {
__u8 uuid[16];
__u8 label[32];
- __u32 first_reg;
+ __u32 first_reg; /* time overflow in y2106 */
__u32 last_reg;
__u32 invalidated;
--
2.9.0
get_seconds() is deprecated in favor of ktime_get_real_seconds(),
which returns a 64-bit timestamp.
In the SYSV file system, the superblock timestamp is only 32 bits
wide, and it is used to check whether a file system is clean, so
the best solution seems to be to force a wraparound and explicitly
convert it to an unsigned 32-bit value.
This is independent of the inode timestamps that are also 32-bit
wide on disk and that come from current_time().
Acked-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent on Jun 19, got an Ack but no other reply.
Christoph apparently hasn't applied any sysvfs patches in many years,
so I'd like someone else to take this one.
Al or Andrew, could you take this patch for 4.19 as well?
---
fs/sysv/inode.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 47f66bbc4578..e8927ea70d12 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -35,7 +35,7 @@
static int sysv_sync_fs(struct super_block *sb, int wait)
{
struct sysv_sb_info *sbi = SYSV_SB(sb);
- unsigned long time = get_seconds(), old_time;
+ u32 time = (u32)ktime_get_real_seconds(), old_time;
mutex_lock(&sbi->s_lock);
@@ -46,8 +46,8 @@ static int sysv_sync_fs(struct super_block *sb, int wait)
*/
old_time = fs32_to_cpu(sbi, *sbi->s_sb_time);
if (sbi->s_type == FSTYPE_SYSV4) {
- if (*sbi->s_sb_state == cpu_to_fs32(sbi, 0x7c269d38 - old_time))
- *sbi->s_sb_state = cpu_to_fs32(sbi, 0x7c269d38 - time);
+ if (*sbi->s_sb_state == cpu_to_fs32(sbi, 0x7c269d38u - old_time))
+ *sbi->s_sb_state = cpu_to_fs32(sbi, 0x7c269d38u - time);
*sbi->s_sb_time = cpu_to_fs32(sbi, time);
mark_buffer_dirty(sbi->s_bh2);
}
--
2.9.0
getnstimeofday64() is just a wrapper around the ktime accessor, so
we should use that directly.
I considered using ktime_get_boottime_ts64() (to avoid leap second
problems) or ktime_get_real_seconds() (to simplify the calculation,
but in the end concluded that the existing interface is probably
the most appropriate in this case.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent on Jun 18, but got no reply
Alexandre, could you pick this up into the rtc tree?
---
drivers/rtc/class.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index d37588f08055..7fa32c922617 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -68,7 +68,7 @@ static int rtc_suspend(struct device *dev)
return 0;
}
- getnstimeofday64(&old_system);
+ ktime_get_real_ts64(&old_system);
old_rtc.tv_sec = rtc_tm_to_time64(&tm);
@@ -110,7 +110,7 @@ static int rtc_resume(struct device *dev)
return 0;
/* snapshot the current rtc and system time at resume */
- getnstimeofday64(&new_system);
+ ktime_get_real_ts64(&new_system);
err = rtc_read_time(rtc, &tm);
if (err < 0) {
pr_debug("%s: fail to read rtc time\n", dev_name(&rtc->dev));
--
2.9.0
The series introduces struct __kernel_timex as a substitute for
the non y2038 safe struct timex.
The series is based on the original series posted by Arnd Bergmann
in [1].
The overview of the series is as below:
1. Prepare for the compat timex interfaces to be used unconditionally.
2. Introduce struct __kernel_timex.
3. Use struct __kernel_timex in place of struct timex.
4. Switch syscalls to use struct __kernel_timex.
[1] https://sourceware.org/ml/libc-alpha/2015-05/msg00070.html
Changes since v1:
* Fix riscv asm/compat.h to pick up generic compat types
Deepa Dinamani (7):
arm64: Make basic compat_* types always available
sparc: Make thread_info.h available directly
riscv: Include asm-generic/compat.h
timex: prepare compat helpers for y2038 changes
time: Add struct __kernel_timex
timex: use __kernel_timex internally
timex: change syscalls to use struct __kernel_timex
arch/alpha/kernel/osf_sys.c | 2 +-
arch/arm64/include/asm/compat.h | 22 ++++-----
arch/riscv/include/asm/compat.h | 3 ++
arch/sparc/include/asm/compat.h | 2 +
drivers/ptp/ptp_clock.c | 2 +-
include/asm-generic/compat.h | 8 +++-
include/linux/compat.h | 33 --------------
include/linux/compat_time.h | 34 ++++++++++++++
include/linux/posix-clock.h | 2 +-
include/linux/syscalls.h | 5 +--
include/linux/timex.h | 9 +++-
include/uapi/linux/timex.h | 41 +++++++++++++++++
kernel/compat.c | 63 --------------------------
kernel/time/ntp.c | 12 ++---
kernel/time/ntp_internal.h | 2 +-
kernel/time/posix-clock.c | 2 +-
kernel/time/posix-timers.c | 14 ++----
kernel/time/posix-timers.h | 2 +-
kernel/time/time.c | 80 ++++++++++++++++++++++++++++++---
kernel/time/timekeeping.c | 4 +-
20 files changed, 201 insertions(+), 141 deletions(-)
base-commit: e30b8745c892204095c0a8b69405868f63ddcce1
--
2.17.1
Cc: catalin.marinas(a)arm.com
Cc: davem(a)davemloft.net
Cc: linux-alpha(a)vger.kernel.org
Cc: linux-api(a)vger.kernel.org
Cc: linux-arch(a)vger.kernel.org
Cc: linux-riscv(a)lists.infradead.org
Cc: netdev(a)vger.kernel.org
Cc: palmer(a)sifive.com
get_monotonic_boottime() is deprecated, and may not be safe to call in
every context, as it has to read a hardware clocksource.
This changes xmon to print the time using ktime_get_coarse_boottime64()
instead, which avoids the old timespec type and the HW access.
Acked-by: Balbir Singh <bsingharora(a)gmail.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Originally sent Jun 18, but this hasn't appeared in linux-next yet.
Resending to make sure this is still on the radar. Please apply
to the powerpc git for 4.19
---
arch/powerpc/xmon/xmon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47166ad2a669..45e3d0ec1246 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -918,13 +918,13 @@ static void remove_cpu_bpts(void)
static void
show_uptime(void)
{
- struct timespec uptime;
+ struct timespec64 uptime;
if (setjmp(bus_error_jmp) == 0) {
catch_memory_errors = 1;
sync();
- get_monotonic_boottime(&uptime);
+ ktime_get_coarse_boottime_ts64(&uptime);
printf("Uptime: %lu.%.2lu seconds\n", (unsigned long)uptime.tv_sec,
((unsigned long)uptime.tv_nsec / (NSEC_PER_SEC/100)));
--
2.9.0
The timespec structure and associated interfaces are deprecated and will
be removed in the future because of the y2038 overflow.
The use of ktime_to_timespec() in timeout_to_jiffies() does not
suffer from that overflow, but is easy to avoid by just converting
the ktime_t into jiffies directly.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
drivers/gpu/drm/msm/msm_drv.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index b2da1fbf81e0..cc8977476a41 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -353,8 +353,7 @@ static inline unsigned long timeout_to_jiffies(const ktime_t *timeout)
remaining_jiffies = 0;
} else {
ktime_t rem = ktime_sub(*timeout, now);
- struct timespec ts = ktime_to_timespec(rem);
- remaining_jiffies = timespec_to_jiffies(&ts);
+ remaining_jiffies = ktime_divns(rem, NSEC_PER_SEC / HZ);
}
return remaining_jiffies;
--
2.9.0
get_monotonic_boottime() is deprecated, and may not be safe to call in
every context, as it has to read a hardware clocksource.
This changes xmon to print the time using ktime_get_coarse_boottime64()
instead, which avoids the old timespec type and the HW access.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
arch/powerpc/xmon/xmon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47166ad2a669..45e3d0ec1246 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -918,13 +918,13 @@ static void remove_cpu_bpts(void)
static void
show_uptime(void)
{
- struct timespec uptime;
+ struct timespec64 uptime;
if (setjmp(bus_error_jmp) == 0) {
catch_memory_errors = 1;
sync();
- get_monotonic_boottime(&uptime);
+ ktime_get_coarse_boottime_ts64(&uptime);
printf("Uptime: %lu.%.2lu seconds\n", (unsigned long)uptime.tv_sec,
((unsigned long)uptime.tv_nsec / (NSEC_PER_SEC/100)));
--
2.9.0
32-bit CLOCK_REALTIME timestamps overflow in year 2038, so all such interfaces
are deprecated now. For the FW_CDEV_IOC_GET_CYCLE_TIMER2 ioctl, we already
support 64-bit timestamps, but the implementation still uses timespec.
This changes the code to use timespec64 instead with the appropriate
accessor functions.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Sent originally on Jun 18, but got not reply.
I notice that Stefan Richter has not been active on the mailing lists
since February 2018.
Andrew, could you pick it up in the meantime?
---
drivers/firewire/core-cdev.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index f0587273940e..d8e185582642 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -1205,7 +1205,7 @@ static int ioctl_get_cycle_timer2(struct client *client, union ioctl_arg *arg)
{
struct fw_cdev_get_cycle_timer2 *a = &arg->get_cycle_timer2;
struct fw_card *card = client->device->card;
- struct timespec ts = {0, 0};
+ struct timespec64 ts = {0, 0};
u32 cycle_time;
int ret = 0;
@@ -1214,9 +1214,9 @@ static int ioctl_get_cycle_timer2(struct client *client, union ioctl_arg *arg)
cycle_time = card->driver->read_csr(card, CSR_CYCLE_TIME);
switch (a->clk_id) {
- case CLOCK_REALTIME: getnstimeofday(&ts); break;
- case CLOCK_MONOTONIC: ktime_get_ts(&ts); break;
- case CLOCK_MONOTONIC_RAW: getrawmonotonic(&ts); break;
+ case CLOCK_REALTIME: ktime_get_real_ts64(&ts); break;
+ case CLOCK_MONOTONIC: ktime_get_ts64(&ts); break;
+ case CLOCK_MONOTONIC_RAW: ktime_get_raw_ts64(&ts); break;
default:
ret = -EINVAL;
}
--
2.9.0
According to the official documentation for HFS+ [1], inode timestamps
are supposed to cover the time range from 1904 to 2040 as originally
used in classic MacOS.
The traditional Linux usage is to convert the timestamps into an unsigned
32-bit number based on the Unix epoch and from there to a time_t. On
32-bit systems, that wraps the time from 2038 to 1902, so the last
two years of the valid time range become garbled. On 64-bit systems,
all times before 1970 get turned into timestamps between 2038 and 2106,
which is more convenient but also different from the documented behavior.
The same behavior is used in Darwin and presumaby all versions of MacOS X,
as seen in the to_hfs_time() function in [2]. It is unclear whether this
is a bug in the file system code, or intentional but undocumented behavior.
This changes Linux over to the traditional MacOS (pre MacOS X)
behavior. This means all files that are created on MacOS X or Linux
with future timestamps between 2040 and 2106 will now show up as past
dates. Timestamps between 2038 and 2040 will still be represented
incorrectly on 32-bit architectures as times between 1902 and 1904,
but that will be fixed once we have user space with 64-bit time_t.
Cc: stable(a)vger.kernel.org
Link: [1] https://developer.apple.com/library/archive/technotes/tn/tn1150.html
Link: [2] https://opensource.apple.com/source/xnu/xnu-344/bsd/hfs/MacOSStubs.c
Suggested-by: Viacheslav Dubeyko <slava(a)dubeyko.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Note: This is the patch that Viacheslav asked for, but given how
MacOS X behaves, I'm increasingly thinking this is a bad idea.
---
fs/hfs/hfs_fs.h | 2 +-
fs/hfsplus/hfsplus_fs.h | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index 6d0783e2e276..39c1f3a43ed8 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -247,7 +247,7 @@ extern void hfs_mark_mdb_dirty(struct super_block *sb);
*
*/
#define __hfs_u_to_mtime(sec) cpu_to_be32(sec + 2082844800U - sys_tz.tz_minuteswest * 60)
-#define __hfs_m_to_utime(sec) (be32_to_cpu(sec) - 2082844800U + sys_tz.tz_minuteswest * 60)
+#define __hfs_m_to_utime(sec) ((time64_t)be32_to_cpu(sec) - 2082844800U + sys_tz.tz_minuteswest * 60)
#define HFS_I(inode) (container_of(inode, struct hfs_inode_info, vfs_inode))
#define HFS_SB(sb) ((struct hfs_sb_info *)(sb)->s_fs_info)
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index d9255abafb81..57838ef4dcdc 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -530,8 +530,9 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t sector, void *buf,
void **data, int op, int op_flags);
int hfsplus_read_wrapper(struct super_block *sb);
-/* time macros */
-#define __hfsp_mt2ut(t) (be32_to_cpu(t) - 2082844800U)
+/* time macros: convert between 1904-2040 and 1970-2106 range,
+ * pre-1970 timestamps are interpreted as post-2038 times after wrap-around */
+#define __hfsp_mt2ut(t) ((time64_t)be32_to_cpu(t) - 2082844800U)
#define __hfsp_ut2mt(t) (cpu_to_be32(t + 2082844800U))
/* compatibility */
--
2.9.0
The series introduces struct __kernel_timex as a substitute for
the non y2038 safe struct timex.
The series is based on the original series posted by Arnd Bergmann
in [1].
The overview of the series is as below:
1. Prepare for the compat timex interfaces to be used unconditionally.
2. Introduce struct __kernel_timex.
3. Use struct __kernel_timex in place of struct timex.
4. Switch syscalls to use struct __kernel_timex.
Deepa Dinamani (6):
arm64: Make basic compat_* types always available
sparc: Make thread_info.h available directly
timex: prepare compat helpers for y2038 changes
time: Add struct __kernel_timex
timex: use __kernel_timex internally
timex: change syscalls to use struct __kernel_timex
arch/alpha/kernel/osf_sys.c | 2 +-
arch/arm64/include/asm/compat.h | 22 ++++-----
arch/sparc/include/asm/compat.h | 2 +
drivers/ptp/ptp_clock.c | 2 +-
include/asm-generic/compat.h | 8 +++-
include/linux/compat.h | 33 --------------
include/linux/compat_time.h | 34 ++++++++++++++
include/linux/posix-clock.h | 2 +-
include/linux/syscalls.h | 5 +--
include/linux/timex.h | 9 +++-
include/uapi/linux/timex.h | 41 +++++++++++++++++
kernel/compat.c | 63 --------------------------
kernel/time/ntp.c | 12 ++---
kernel/time/ntp_internal.h | 2 +-
kernel/time/posix-clock.c | 2 +-
kernel/time/posix-timers.c | 14 ++----
kernel/time/posix-timers.h | 2 +-
kernel/time/time.c | 80 ++++++++++++++++++++++++++++++---
kernel/time/timekeeping.c | 4 +-
19 files changed, 198 insertions(+), 141 deletions(-)
base-commit: 69877f06915f1c7a9f1704442993bcc12c13ace2
--
2.17.1
Cc: catalin.marinas(a)arm.com
Cc: davem(a)davemloft.net
Cc: linux-alpha(a)vger.kernel.org
Cc: linux-api(a)vger.kernel.org
Cc: linux-arch(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
The cfg80211 layer uses get_seconds() to read the current time
in its supend handling. This function is deprecated because of the 32-bit
time_t overflow, and it can cause unexpected behavior when the time
changes due to settimeofday() calls or leap second updates.
In many cases, we want to use monotonic time instead, however cfg80211
explicitly tracks the time spent in suspend, so this changes the
driver over to use ktime_get_boottime_seconds(), which is slightly
slower, but not used in a fastpath here.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
net/wireless/core.h | 2 +-
net/wireless/sysfs.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/wireless/core.h b/net/wireless/core.h
index 63eb1b5fdd04..7f52ef569320 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -76,7 +76,7 @@ struct cfg80211_registered_device {
struct cfg80211_scan_request *scan_req; /* protected by RTNL */
struct sk_buff *scan_msg;
struct list_head sched_scan_req_list;
- unsigned long suspend_at;
+ time64_t suspend_at;
struct work_struct scan_done_wk;
struct genl_info *cur_cmd_info;
diff --git a/net/wireless/sysfs.c b/net/wireless/sysfs.c
index 570a2b67ca10..6ab32f6a1961 100644
--- a/net/wireless/sysfs.c
+++ b/net/wireless/sysfs.c
@@ -102,7 +102,7 @@ static int wiphy_suspend(struct device *dev)
struct cfg80211_registered_device *rdev = dev_to_rdev(dev);
int ret = 0;
- rdev->suspend_at = get_seconds();
+ rdev->suspend_at = ktime_get_boottime_seconds();
rtnl_lock();
if (rdev->wiphy.registered) {
@@ -130,7 +130,7 @@ static int wiphy_resume(struct device *dev)
int ret = 0;
/* Age scan results with time spent in suspend */
- cfg80211_bss_age(rdev, get_seconds() - rdev->suspend_at);
+ cfg80211_bss_age(rdev, ktime_get_boottime_seconds() - rdev->suspend_at);
rtnl_lock();
if (rdev->wiphy.registered && rdev->ops->resume)
--
2.9.0
Hi,
I just wanted to check if you would be interested in a list of Managed
Service Providers (MSPs) and Managed Security Service Providers (MSSPs)?
• Managed Service Providers (MSP’s) – 25,000 unique companies
• Managed Security Service Providers (MSSP’s) – 7,520 unique
companies
IT Decision Makers – 6million
Business Decision Makers – 10 million
Kindly review and let me know if I can share more information on this.
I look forward to hearing from you.
Regards,
Diana
MSP List Specialist
For Opt-Out reply with “Not Interested”.
The nes infiniband driver uses current_kernel_time() to get a nanosecond
granunarity timestamp to initialize its tcp sequence counters. This is
one of only a few remaining users of that deprecated function, so we
should try to get rid of it.
Aside from using a deprecated API, there are several problems I see here:
- Using a CLOCK_REALTIME based time source makes it predictable in
case the time base is synchronized.
- Using a coarse timestamp means it only gets updated once per jiffie,
making it even more predictable in order to avoid having to access
the hardware clock source
- The upper 2 bits are always zero because the nanoseconds are at most
999999999.
For the Linux TCP implementation, we use secure_tcp_seq(), which appears
to be appropriate here as well, and solves all the above problems.
I'm doing the same change in both versions of the nes driver, with
i40iw being a later copy of the same code.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
The above change is just a guess at what it should look like,
please review carefully and Ack/Nak as appropriate.
---
drivers/infiniband/hw/i40iw/i40iw_cm.c | 8 +++++---
drivers/infiniband/hw/nes/nes_cm.c | 8 +++++---
net/core/secure_seq.c | 1 +
3 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index 7b2655128b9f..da221d07f2dd 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -57,6 +57,7 @@
#include <net/addrconf.h>
#include <net/ip6_route.h>
#include <net/ip_fib.h>
+#include <net/secure_seq.h>
#include <net/tcp.h>
#include <asm/checksum.h>
@@ -2164,7 +2165,6 @@ static struct i40iw_cm_node *i40iw_make_cm_node(
struct i40iw_cm_listener *listener)
{
struct i40iw_cm_node *cm_node;
- struct timespec ts;
int oldarpindex;
int arpindex;
struct net_device *netdev = iwdev->netdev;
@@ -2214,8 +2214,10 @@ static struct i40iw_cm_node *i40iw_make_cm_node(
cm_node->tcp_cntxt.rcv_wscale = I40IW_CM_DEFAULT_RCV_WND_SCALE;
cm_node->tcp_cntxt.rcv_wnd =
I40IW_CM_DEFAULT_RCV_WND_SCALED >> I40IW_CM_DEFAULT_RCV_WND_SCALE;
- ts = current_kernel_time();
- cm_node->tcp_cntxt.loc_seq_num = ts.tv_nsec;
+ cm_node->tcp_cntxt.loc_seq_num = secure_tcp_seq(htonl(cm_node->loc_addr[0]),
+ htonl(cm_node->rem_addr[0]),
+ htons(cm_node->loc_port),
+ htons(cm_node->rem_port));
cm_node->tcp_cntxt.mss = (cm_node->ipv4) ? (iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV4) :
(iwdev->vsi.mtu - I40IW_MTU_TO_MSS_IPV6);
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 6cdfbf8c5674..2b67ace5b614 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -58,6 +58,7 @@
#include <net/neighbour.h>
#include <net/route.h>
#include <net/ip_fib.h>
+#include <net/secure_seq.h>
#include <net/tcp.h>
#include <linux/fcntl.h>
@@ -1445,7 +1446,6 @@ static struct nes_cm_node *make_cm_node(struct nes_cm_core *cm_core,
struct nes_cm_listener *listener)
{
struct nes_cm_node *cm_node;
- struct timespec ts;
int oldarpindex = 0;
int arpindex = 0;
struct nes_device *nesdev;
@@ -1496,8 +1496,10 @@ static struct nes_cm_node *make_cm_node(struct nes_cm_core *cm_core,
cm_node->tcp_cntxt.rcv_wscale = NES_CM_DEFAULT_RCV_WND_SCALE;
cm_node->tcp_cntxt.rcv_wnd = NES_CM_DEFAULT_RCV_WND_SCALED >>
NES_CM_DEFAULT_RCV_WND_SCALE;
- ts = current_kernel_time();
- cm_node->tcp_cntxt.loc_seq_num = htonl(ts.tv_nsec);
+ cm_node->tcp_cntxt.loc_seq_num = secure_tcp_seq(htonl(cm_node->loc_addr),
+ htonl(cm_node->rem_addr),
+ htons(cm_node->loc_port),
+ htons(cm_node->rem_port));
cm_node->tcp_cntxt.mss = nesvnic->max_frame_size - sizeof(struct iphdr) -
sizeof(struct tcphdr) - ETH_HLEN - VLAN_HLEN;
cm_node->tcp_cntxt.rcv_nxt = 0;
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 7232274de334..af6ad467ed61 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -140,6 +140,7 @@ u32 secure_tcp_seq(__be32 saddr, __be32 daddr,
&net_secret);
return seq_scale(hash);
}
+EXPORT_SYMBOL_GPL(secure_tcp_seq);
u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
{
--
2.9.0
As Dave Chinner points out, we don't have a proper documentation for the
ktime_get() family of interfaces, making it rather unclear which of the
over 30 (!) interfaces one should actually use in a driver or elsewhere
in the kernel.
I wrote up an explanation from how I personally see the interfaces,
documenting what each of the functions do and hopefully making it a bit
clearer which should be used where.
This is the first time I tried writing .rst format documentation, so
in addition to any mistakes in the content, I probably also introduce
nonstandard formatting ;-)
I first tried to add an extra section to
Documentation/timers/timekeeping.txt, but this is currently not included
in the generated API, and it seems useful to have the API docs as part
of what gets generated in
https://www.kernel.org/doc/html/latest/core-api/index.html#core-utilities
instead, so I started a new file there.
I also considered adding the documentation inline in the
include/linux/timekeeping.h header, but couldn't figure out how to do
that in a way that would result both in helpful inline comments as
well as readable html output, so I settled for the latter, with
a small note pointing to it from the header.
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: John Stultz <john.stultz(a)linaro.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Stephen Boyd <sboyd(a)kernel.org>
Cc: Linus Walleij <linus.walleij(a)linaro.org>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
Documentation/core-api/index.rst | 1 +
Documentation/core-api/timekeeping.rst | 185 +++++++++++++++++++++++++++++++++
include/linux/timekeeping.h | 15 +++
3 files changed, 201 insertions(+)
create mode 100644 Documentation/core-api/timekeeping.rst
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index f5a66b72f984..989c97cc232a 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -28,6 +28,7 @@ Core utilities
printk-formats
circular-buffers
gfp_mask-from-fs-io
+ timekeeping
Interfaces for kernel debugging
===============================
diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst
new file mode 100644
index 000000000000..97dafa69dddf
--- /dev/null
+++ b/Documentation/core-api/timekeeping.rst
@@ -0,0 +1,185 @@
+ktime access
+============
+
+Device drivers can read the current time using ktime_get() and the many
+related functions declared in linux/timekeeping.h. As a rule of thumb,
+using an accessor with a shorter name is preferred over one with a longer
+name if both are equally fit for a particular use case.
+
+Basic ktime_t based interfaces
+------------------------------
+
+The recommended simplest form returns an opaque ktime_t, with variants
+that return time for different clock references:
+
+
+.. c:function:: ktime_t ktime_get( void )
+
+ CLOCK_MONOTONIC
+
+ Useful for reliable timestamps and measuring short time intervals
+ accurately. Starts at system boot time but stops during suspend.
+
+.. c:function:: ktime_t ktime_get_boottime( void )
+
+ CLOCK_BOOTTIME
+
+ Like ktime_get(), but does not stop when suspended. This can be
+ used e.g. for key expiration times that need to be synchronized
+ with other machines across a suspend operation.
+
+.. c:function:: ktime_t ktime_get_real( void )
+
+ CLOCK_REALTIME
+
+ Returns the time in relative to the UNIX epoch starting in 1970
+ using the Coordinated Universal Time (UTC), same as gettimeofday()
+ user space. This is used for all timestamps that need to
+ persist across a reboot, like inode times, but should be avoided
+ for internal uses, since it can jump backwards due to a leap
+ second update, NTP adjustment settimeofday() operation from user
+ space.
+
+.. c:function:: ktime_t ktime_get_clocktai( void )
+
+ CLOCK_TAI
+
+ Like ktime_get_real(), but uses the International Atomic Time (TAI)
+ reference instead of UTC to avoid jumping on leap second updates.
+ This is rarely useful in the kernel.
+
+.. c:function:: ktime_t ktime_get_raw( void )
+
+ CLOCK_MONOTONIC_RAW
+
+ Like ktime_get(), but runs at the same rate as the hardware
+ clocksource without (NTP) adjustments for clock drift. This is
+ also rarely needed in the kernel.
+
+nanosecond, timespec64, and second output
+-------------------------------------
+
+For all of the above, there are variants that return the time in a
+different format depending on what is required by the user:
+
+.. c:function:: u64 ktime_get_ns( void )
+ u64 ktime_get_boottime_ns( void )
+ u64 ktime_get_real_ns( void )
+ u64 ktime_get_tai_ns( void )
+ u64 ktime_get_raw_ns( void )
+
+ Same as the plain ktime_get functions, but returning a u64 number
+ of nanoseconds in the respective time reference, which may be
+ more convenient for some callers.
+
+.. c:function:: void ktime_get_ts64( struct timespec64 * )
+ void ktime_get_boottime_ts64( struct timespec64 * )
+ void ktime_get_real_ts64( struct timespec64 * )
+ void ktime_get_clocktai_ts64( struct timespec64 * )
+ void ktime_get_raw_ts64( struct timespec64 * )
+
+ Same above, but returns the time in a 'struct timespec64', split
+ into seconds and nanoseconds. This can avoid an extra division
+ when printing the time, or when passing it into an external
+ interface that expects a 'timespec' or 'timeval' structure.
+
+.. c:function:: time64_t ktime_get_seconds( void )
+ time64_t ktime_get_boottime_seconds( void )
+ time64_t ktime_get_real_seconds( void )
+ time64_t ktime_get_clocktai_seconds( void )
+ time64_t ktime_get_raw_seconds( void )
+
+ Return a coarse-grained version of the time as a scalar
+ time64_t. This avoids accessing the clock hardware and rounds
+ down the seconds to the full seconds of the last timer tick
+ using the respective reference.
+
+Coarse and fast_ns access
+-------------------------
+
+Some additional variants exist for more specialized cases:
+
+.. c:function:: ktime_t ktime_get_coarse_boottime( void )
+ ktime_t ktime_get_coarse_real( void )
+ ktime_t ktime_get_coarse_clocktai( void )
+ ktime_t ktime_get_coarse_raw( void )
+
+.. c:function:: void ktime_get_coarse_ts64( struct timespec64 * )
+ void ktime_get_coarse_boottime_ts64( struct timespec64 * )
+ void ktime_get_coarse_real_ts64( struct timespec64 * )
+ void ktime_get_coarse_clocktai_ts64( struct timespec64 * )
+ void ktime_get_coarse_raw_ts64( struct timespec64 * )
+
+ These are quicker than the non-coarse versions, but less accurate,
+ corresponding to CLOCK_MONONOTNIC_COARSE and CLOCK_REALTIME_COARSE
+ in user space, along with the equivalent boottime/tai/raw
+ timebase not available in user space.
+
+ The time returned here corresponds to the last timer tick, which
+ may be as much as 10ms in the past (for CONFIG_HZ=100), same as
+ reading the 'jiffies' variable. These are only useful when called
+ in a fast path and one still expects better than second accuracy,
+ but can't easily use 'jiffies', e.g. for inode timestamps.
+ Skipping the hardware clock access saves around 100 CPU cycles
+ on most modern machines with a reliable cycle counter, but
+ up to several microseconds on older hardware with an external
+ clocksource.
+
+.. c:function:: u64 ktime_get_mono_fast_ns( void )
+ u64 ktime_get_raw_fast_ns( void )
+ u64 ktime_get_boot_fast_ns( void )
+ u64 ktime_get_real_fast_ns( void )
+
+ These variants are safe to call from any context, including from
+ a non-maskable interrupt (NMI) during a timekeeper update, and
+ while we are entering suspend with the clocksource powered down.
+ This is useful in some tracing or debugging code as well as
+ machine check reporting, but most drivers should never call them,
+ since the time is allowed to jump under certain conditions.
+
+Deprecated time interfaces
+--------------------------
+
+Older kernels used some other interfaces that are now being phased out
+but may appear in third-party drivers being ported here. In particular,
+all interfaces returning a 'struct timeval' or 'struct timespec' have
+been replaced because the tv_sec member overflows in year 2038 on 32-bit
+architectures. These are the recommended replacements:
+
+.. c:function:: void ktime_get_ts( struct timespec * )
+
+ Use ktime_get() or ktime_get_ts64() instead.
+
+.. c:function:: struct timeval do_gettimeofday( void )
+ struct timespec getnstimeofday( void )
+ struct timespec64 getnstimeofday64( void )
+ void ktime_get_real_ts( struct timespec * )
+
+ ktime_get_real_ts64() is a direct replacement, but consider using
+ monotonic time (ktime_get_ts64()) and/or a ktime_t based interface
+ (ktime_get()/ktime_get_real()).
+
+.. c:function:: struct timespec current_kernel_time( void )
+ struct timespec64 current_kernel_time64( void )
+ struct timespec get_monotonic_coarse( void )
+ struct timespec64 get_monotonic_coarse64( void )
+
+ These are replaced by ktime_get_coarse_real_ts64() and
+ ktime_get_coarse_ts64(). However, A lot of code that wants
+ coarse-grained times can use the simple 'jiffies' instead, while
+ some drivers may actually want the higher resolution accessors
+ these days.
+
+.. c:function:: struct timespec getrawmonotonic( void )
+ struct timespec64 getrawmonotonic64( void )
+ struct timespec timekeeping_clocktai( void )
+ struct timespec64 timekeeping_clocktai64( void )
+ struct timespec get_monotonic_boottime( void )
+ struct timespec64 get_monotonic_boottime64( void )
+
+ These are replaced by ktime_get_raw()/ktime_get_raw_ts64(),
+ ktime_get_clocktai()/ktime_get_clocktai_ts64() as well
+ as ktime_get_boottime()/ktime_get_boottime_ts64().
+ However, if the particular choice of clock source is not
+ important for the user, consider converting to
+ ktime_get()/ktime_get_ts64() instead for consistency.
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 86bc2026efce..947b1b8d2d01 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -21,6 +21,21 @@ extern int do_sys_settimeofday64(const struct timespec64 *tv,
const struct timezone *tz);
/*
+ * ktime_get() family: read the current time in a multitude of ways,
+ *
+ * The default time reference is CLOCK_MONOTONIC, starting at
+ * boot time but not counting the time spent in suspend.
+ * For other references, use the functions with "real", "clocktai",
+ * "boottime" and "raw" suffixes.
+ *
+ * To get the time in a different format, use the ones wit
+ * "ns", "ts64" and "seconds" suffix.
+ *
+ * See Documentation/core-api/timekeeping.rst for more details.
+ */
+
+
+/*
* timespec64 based interfaces
*/
extern void ktime_get_raw_ts64(struct timespec64 *ts);
--
2.9.0
current_time is one of the few callers of current_kernel_time64(), which
is a wrapper around ktime_get_coarse_real_ts64(). This calls the latter
directly for consistency with the rest of the kernel that is moving to
the ktime_get_ family of time accessors.
An open questions is whether we may want to actually call the more
accurate ktime_get_real_ts64() for file systems that save high-resolution
timestamps in their on-disk format. This would add a small but measurable
overhead to each update of the inode stamps but lead to inode timestamps
to actually have a usable resolution better than one jiffy (1 to 10
milliseconds normally).
I traced the original addition of the current_kernel_time() call to set
the nanosecond fields back to linux-2.5.48, where Andi Kleen added a
patch with subject "nanosecond stat timefields". This adds the original
call to current_kernel_time and the truncation to the resolution of the
file system, but makes no mention of the intended accuracy. At the time,
we had a do_gettimeofday() interface that on some architectures could
return a microsecond-resolution timestamp, but there was no interface
for getting an accurate timestamp in nanosecond resolution, neither inside
the kernel nor from user space. This makes me suspect that the use of
coarse timestamps was never really a conscious decision but instead
a result of whatever API was available 16 years ago.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/inode.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/inode.c b/fs/inode.c
index 2c300e981796..e27bd9334939 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2133,7 +2133,9 @@ EXPORT_SYMBOL(timespec64_trunc);
*/
struct timespec64 current_time(struct inode *inode)
{
- struct timespec64 now = current_kernel_time64();
+ struct timespec64 now;
+
+ ktime_get_coarse_real_ts64(&now);
if (unlikely(!inode->i_sb)) {
WARN(1, "current_time() called with uninitialized super_block in the inode");
--
2.9.0
get_seconds() can overflow on 32-bit architectures and is deprecated
because of that. The use in the aacraid driver has the same problem due
to a limited firmware interface, it also overflows in the year 2106.
This changes all calls to get_seconds() to the non-deprecated
ktime_get_real_seconds(), which unfortunately doesn't solve that problem
but gets rid of one user of the deprecated interface.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
drivers/scsi/aacraid/rx.c | 2 +-
drivers/scsi/aacraid/sa.c | 2 +-
drivers/scsi/aacraid/src.c | 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/aacraid/rx.c b/drivers/scsi/aacraid/rx.c
index 620166694171..576cdf9cc120 100644
--- a/drivers/scsi/aacraid/rx.c
+++ b/drivers/scsi/aacraid/rx.c
@@ -319,7 +319,7 @@ static void aac_rx_start_adapter(struct aac_dev *dev)
union aac_init *init;
init = dev->init;
- init->r7.host_elapsed_seconds = cpu_to_le32(get_seconds());
+ init->r7.host_elapsed_seconds = cpu_to_le32(ktime_get_real_seconds());
// We can only use a 32 bit address here
rx_sync_cmd(dev, INIT_STRUCT_BASE_ADDRESS, (u32)(ulong)dev->init_pa,
0, 0, 0, 0, 0, NULL, NULL, NULL, NULL, NULL);
diff --git a/drivers/scsi/aacraid/sa.c b/drivers/scsi/aacraid/sa.c
index 882f40353b96..efa96c1c6aa3 100644
--- a/drivers/scsi/aacraid/sa.c
+++ b/drivers/scsi/aacraid/sa.c
@@ -251,7 +251,7 @@ static void aac_sa_start_adapter(struct aac_dev *dev)
* Fill in the remaining pieces of the init.
*/
init = dev->init;
- init->r7.host_elapsed_seconds = cpu_to_le32(get_seconds());
+ init->r7.host_elapsed_seconds = cpu_to_le32(ktime_get_real_seconds());
/* We can only use a 32 bit address here */
sa_sync_cmd(dev, INIT_STRUCT_BASE_ADDRESS,
(u32)(ulong)dev->init_pa, 0, 0, 0, 0, 0,
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index 4ebb35a29caa..5a299975a289 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -409,7 +409,7 @@ static void aac_src_start_adapter(struct aac_dev *dev)
init = dev->init;
if (dev->comm_interface == AAC_COMM_MESSAGE_TYPE3) {
- init->r8.host_elapsed_seconds = cpu_to_le32(get_seconds());
+ init->r8.host_elapsed_seconds = cpu_to_le32(ktime_get_real_seconds());
src_sync_cmd(dev, INIT_STRUCT_BASE_ADDRESS,
lower_32_bits(dev->init_pa),
upper_32_bits(dev->init_pa),
@@ -417,7 +417,7 @@ static void aac_src_start_adapter(struct aac_dev *dev)
(AAC_MAX_HRRQ - 1) * sizeof(struct _rrq),
0, 0, 0, NULL, NULL, NULL, NULL, NULL);
} else {
- init->r7.host_elapsed_seconds = cpu_to_le32(get_seconds());
+ init->r7.host_elapsed_seconds = cpu_to_le32(ktime_get_real_seconds());
// We can only use a 32 bit address here
src_sync_cmd(dev, INIT_STRUCT_BASE_ADDRESS,
(u32)(ulong)dev->init_pa, 0, 0, 0, 0, 0,
--
2.9.0
'struct rusage' contains the run times of a process in 'timeval' format
and is accessed through the wait4() and getrusage() system calls. This
is not a problem for y2038 safety by itself, but causes an issue when
the C library starts using 64-bit time_t on 32-bit architectures because
the structure layout becomes incompatible.
There are three possible ways of dealing with this:
a) deprecate the wait4() and getrusage() system calls, and create
a set of kernel interfaces based around a newly defined structure that
could solve multiple problems at once, e.g. provide more fine-grained
timestamps. The C library could then implement the posix interfaces
on top of the new system calls.
b) Extend the approach taken by the x32 ABI, and use the 64-bit
native structure layout for rusage on all architectures with new
system calls that is otherwise compatible. A downside of this
is that it requires a number of ugly hacks to deal with all the
other fields of the structure also becoming 64 bit wide.
Especially on big-endian architectures, we can't easily use the
union trick from glibc.
c) Change the definition of struct rusage to be independent of
time_t. This is the easiest change, as it does not involve new system
call entry points, but it requires the C library to convert between
the kernel format of the structure and the user space definition.
d) Add a new ABI variant of 'struct rusage' that corresponds to the
current layout with 32-bit counters but 64-bit time_t. This would
minimize the libc changes but require additional kernel code to
handle a third binary layout on 64-bit kernels.
I'm picking approach c) for its simplicity. As pointed out by reviewers,
simply using the kernel structure in user space would not be POSIX
compliant, but I have verified that none of the usual C libraries (glibc,
musl, uclibc-ng, newlib) do that. Instead, they all provide their own
definition of 'struct rusage' to applications in sys/resource.h.
To be on the safe side, I'm only changing the definition inside of
the kernel and for user space with an updated 'time_t'. All existing
users will see the traditional layout that is compatible with what the
C libraries export. A 32-bit application that includes linux/resource.h
but uses an update C library with 64-bit time_t will now see the low-level
kernel structure that corresponds to the getrusage() system call interface
but that will be different from one defined in sys/resource.h for the
getrusage library interface.
Link: https://patchwork.kernel.org/patch/10077527/
Cc: Paul Eggert <eggert(a)cs.ucla.edu>
Cc: Eric W. Biederman <ebiederm(a)xmission.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
arch/alpha/kernel/osf_sys.c | 15 +++++++++------
include/uapi/linux/resource.h | 14 ++++++++++++--
kernel/sys.c | 4 ++--
3 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 89faa6f4de47..cad03ee445b3 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1184,6 +1184,7 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,
struct rusage32 __user *, ur)
{
unsigned int status = 0;
+ struct rusage32 r32;
struct rusage r;
long err = kernel_wait4(pid, &status, options, &r);
if (err <= 0)
@@ -1192,12 +1193,14 @@ SYSCALL_DEFINE4(osf_wait4, pid_t, pid, int __user *, ustatus, int, options,
return -EFAULT;
if (!ur)
return err;
- if (put_tv_to_tv32(&ur->ru_utime, &r.ru_utime))
- return -EFAULT;
- if (put_tv_to_tv32(&ur->ru_stime, &r.ru_stime))
- return -EFAULT;
- if (copy_to_user(&ur->ru_maxrss, &r.ru_maxrss,
- sizeof(struct rusage32) - offsetof(struct rusage32, ru_maxrss)))
+ r32.ru_utime.tv_sec = r.ru_utime.tv_sec;
+ r32.ru_utime.tv_usec = r.ru_utime.tv_usec;
+ r32.ru_stime.tv_sec = r.ru_stime.tv_sec;
+ r32.ru_stime.tv_usec = r.ru_stime.tv_usec;
+ memcpy(&r32.ru_maxrss, &r.ru_maxrss,
+ sizeof(struct rusage32) - offsetof(struct rusage32, ru_maxrss));
+
+ if (copy_to_user(ur, &r32, sizeof(r32)))
return -EFAULT;
return err;
}
diff --git a/include/uapi/linux/resource.h b/include/uapi/linux/resource.h
index cc00fd079631..611d3745c70a 100644
--- a/include/uapi/linux/resource.h
+++ b/include/uapi/linux/resource.h
@@ -22,8 +22,18 @@
#define RUSAGE_THREAD 1 /* only the calling thread */
struct rusage {
- struct timeval ru_utime; /* user time used */
- struct timeval ru_stime; /* system time used */
+#if (__BITS_PER_LONG != 32 || !defined(__USE_TIME_BITS64)) && !defined(__KERNEL__)
+ struct timeval ru_utime; /* user time used */
+ struct timeval ru_stime; /* system time used */
+#else
+ /*
+ * For 32-bit user space with 64-bit time_t, the binary layout
+ * in these fields is incompatible with 'struct timeval', so the
+ * C library has to translate this into the POSIX compatible layout.
+ */
+ struct __kernel_old_timeval ru_utime;
+ struct __kernel_old_timeval ru_stime;
+#endif
__kernel_long_t ru_maxrss; /* maximum resident set size */
__kernel_long_t ru_ixrss; /* integral shared memory size */
__kernel_long_t ru_idrss; /* integral unshared data size */
diff --git a/kernel/sys.c b/kernel/sys.c
index ad692183dfe9..1de538f622e8 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1769,8 +1769,8 @@ void getrusage(struct task_struct *p, int who, struct rusage *r)
unlock_task_sighand(p, &flags);
out:
- r->ru_utime = ns_to_timeval(utime);
- r->ru_stime = ns_to_timeval(stime);
+ r->ru_utime = ns_to_kernel_old_timeval(utime);
+ r->ru_stime = ns_to_kernel_old_timeval(stime);
if (who != RUSAGE_CHILDREN) {
struct mm_struct *mm = get_task_mm(p);
--
2.9.0
The patch titled
Subject: kernel/sys.c: remove get_monotonic_boottime()
has been removed from the -mm tree. Its filename was
sysinfo-remove-get_monotonic_boottime.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Arnd Bergmann <arnd(a)arndb.de>
Subject: kernel/sys.c: remove get_monotonic_boottime()
get_monotonic_boottime() is deprecated because it uses the old 'timespec'
structure. This replaces one of the last callers with a call to
ktime_get_boottime.
Link: http://lkml.kernel.org/r/20180618150114.849216-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
Reviewed-by: Cyrill Gorcunov <gorcunov(a)gmail.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <y2038(a)lists.linaro.org>
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sys.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff -puN kernel/sys.c~sysinfo-remove-get_monotonic_boottime kernel/sys.c
--- a/kernel/sys.c~sysinfo-remove-get_monotonic_boottime
+++ a/kernel/sys.c
@@ -2523,11 +2523,11 @@ static int do_sysinfo(struct sysinfo *in
{
unsigned long mem_total, sav_total;
unsigned int mem_unit, bitcount;
- struct timespec tp;
+ struct timespec64 tp;
memset(info, 0, sizeof(struct sysinfo));
- get_monotonic_boottime(&tp);
+ ktime_get_boottime_ts64(&tp);
info->uptime = tp.tv_sec + (tp.tv_nsec ? 1 : 0);
get_avenrun(info->loads, 0, SI_LOAD_SHIFT - FSHIFT);
_
Patches currently in -mm which might be from arnd(a)arndb.de are
ocfs2-dlmglue-clean-up-timestamp-handling.patch
shmem-use-monotonic-time-for-i_generation.patch
procfs-uptime-use-ktime_get_boottime_ts64.patch
crash-print-timestamp-using-time64_t.patch
nilfs2-use-64-bit-superblock-timstamps.patch
reiserfs-remove-unused-j_timestamp.patch
reiserfs-use-monotonic-time-for-j_trans_start_time.patch
reiserfs-remove-obsolete-print_time-function.patch
fat-propagate-64-bit-inode-timestamps.patch
adfs-use-timespec64-for-time-conversion.patch
vmcore-hide-vmcoredd_mmap_dumps-for-nommu-builds.patch
This uses the deprecated time_t type but is write-only, and could be
removed, but as Jeff explains, having a timestamp can be usefule for
post-mortem analysis in crash dumps.
In order to remove one of the last instances of time_t, this changes
the type to time64_t, same as j_trans_start_time.
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
This could replace "reiserfs: remove unused j_timestamp" if we decide
that we want to keep that variable.
Posting this patch as an alternative now, while waiting for Jeff to
reply on how important this really is.
---
fs/reiserfs/reiserfs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
index 621b9a07080a..e5ca9ed79e54 100644
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -271,7 +271,7 @@ struct reiserfs_journal_list {
struct mutex j_commit_mutex;
unsigned int j_trans_id;
- time_t j_timestamp;
+ time64_t j_timestamp; /* write-only but useful for crash dump analysis */
struct reiserfs_list_bitmap *j_list_bitmap;
struct buffer_head *j_commit_bh; /* commit buffer head */
struct reiserfs_journal_cnode *j_realblock;
--
2.9.0
While working on extended rand for last_error/first_error timestamps,
I noticed that the endianess is wrong, we access the little-endian
fields in struct ext4_super_block as native-endian when we print them.
This adds a special case in ext4_attr_show() and ext4_attr_store()
to byteswap the superblock fields if needed.
In older kernels, this code was part of super.c, it got moved to sysfs.c
in linux-4.4.
Cc: stable(a)vger.kernel.org
Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors")
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/ext4/sysfs.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c
index f34da0bb8f17..b970a200f20c 100644
--- a/fs/ext4/sysfs.c
+++ b/fs/ext4/sysfs.c
@@ -274,8 +274,12 @@ static ssize_t ext4_attr_show(struct kobject *kobj,
case attr_pointer_ui:
if (!ptr)
return 0;
- return snprintf(buf, PAGE_SIZE, "%u\n",
- *((unsigned int *) ptr));
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ le32_to_cpup(ptr));
+ else
+ return snprintf(buf, PAGE_SIZE, "%u\n",
+ *((unsigned int *) ptr));
case attr_pointer_atomic:
if (!ptr)
return 0;
@@ -308,7 +312,10 @@ static ssize_t ext4_attr_store(struct kobject *kobj,
ret = kstrtoul(skip_spaces(buf), 0, &t);
if (ret)
return ret;
- *((unsigned int *) ptr) = t;
+ if (a->attr_ptr == ptr_ext4_super_block_offset)
+ *((__le32 *) ptr) = cpu_to_le32(t);
+ else
+ *((unsigned int *) ptr) = t;
return len;
case attr_inode_readahead:
return inode_readahead_blks_store(sbi, buf, len);
--
2.9.0
On Thu, Jun 21, 2018 at 7:46 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 0c4c2201b3aa..2063d4e5ed08 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -312,6 +312,20 @@ void ext4_itable_unused_set(struct super_block *sb,
>> bg->bg_itable_unused_hi = cpu_to_le16(count >> 16);
>> }
>>
>> +static void ext4_update_tstamp(__le32 *lo, __u8 *hi)
>
> Would it be better to wrap this in a macro, something like:
>
> #define ext4_update_tstamp(es, tstamp) \
> __ext4_update_tstamp(&(es)->tstamp, &(es)->tstamp ## _hi)
> #define ext4_get_tstamp(es, tstamp) \
> __ext4_get_tstamp(&(es)->tstamp, &(es)->tstamp ## _hi)
>
> So that it can be used in the callers more easily:
>
> ext4_update_tstamp(es, s_last_error_time);
> time = ext4_get_tstamp(es, s_last_error_time);
I generally try to avoid concatenating identifiers like this, as it makes
it much harder to grep for where a particular symbol or
struct member gets used.
>> +{
>> + time64_t now = ktime_get_real_seconds();
>> +
>> + now = clamp_val(now, 0, 0xffffffffffull);
>
> Long strings of "0xfff..." are hard to get correct. This looks right,
> but it may be easier to be sure it is correct with something like:
>
> /* timestamps have a 32-bit low field and 8-bit high field */
> now = clamp_val(now, 0, (1ULL << 40) - 1);
Yes, good idea. I'm surprised we don't have a generic macro for that yet
(or maybe I just couldn't find it)
>> @@ -249,6 +251,12 @@ static void *calc_ptr(struct ext4_attr *a, struct ext4_sb_info *sbi)
>> return NULL;
>> }
>>
>> +static ssize_t print_time(char *buf, __le32 lo, __u8 hi)
>
> It would probably be more consistent to name this "print_tstamp()"
> since it isn't strictly a "time" as one would expect.
Ok.
>> +{
>> + return snprintf(buf, PAGE_SIZE, "%lld",
>> + ((time64_t)hi << 32) + le32_to_cpu(lo));
>> +}
>
> Similarly, wrap this with:
>
> #define print_tstamp(buf, es, tstamp) \
> __print_tstamp(buf, &(es)->tstamp, &(es)->tstamp ## _hi)
Ok. I'll integrate all of the above and post as a non-RFC patch then
after some testing.
Arnd
On Thu, Jun 21, 2018 at 7:27 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
> On Jun 20, 2018, at 9:33 AM, Arnd Bergmann <arnd(a)arndb.de> wrote:
>>
>> We only care about the low 32-bit for i_dtime as explained in commit
>> b5f515735bea ("ext4: avoid Y2038 overflow in recently_deleted()"), so
>> the use of get_seconds() is correct here, but that function is getting
>> removed in the process of the y2038 fixes, so let's use the modern
>> ktime_get_real_seconds() here.
>>
>> Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
>
> Looks OK, one minor cleanup possible.
>
> Reviewed-by: Andreas Dilger <adilger(a)dilger.ca>
>> ext4_orphan_del(handle, inode);
>> - EXT4_I(inode)->i_dtime = get_seconds();
>> + EXT4_I(inode)->i_dtime = ktime_get_real_seconds();
>
> Not strictly necessary, but it might be good from a code clarity POV
> to use:
>
> EXT4_I(inode)->i_dtime = (__u32)ktime_get_real_seconds();
>
> so that it is more clear we are aware that this is being truncated
> to a 32-bit value.
Right, I've been a bit inconsistent here across file systems, I've
done this in some other ones, using either a cast or a lower_32_bits()
function call. Changed it as you suggested here now.
Arnd
On Thu, Jun 21, 2018 at 7:49 PM, Andreas Dilger <adilger(a)dilger.ca> wrote:
> On Jun 20, 2018, at 9:32 AM, Arnd Bergmann <arnd(a)arndb.de> wrote:
>>
>> While working on extended rand for last_error/first_error timestamps,
>> I noticed that the endianess is wrong, we access the little-endian
>> fields in struct ext4_super_block as native-endian when we print them.
>>
>> This adds a special case in ext4_attr_show() and ext4_attr_store()
>> to byteswap the superblock fields if needed.
>>
>> In older kernels, this code was part of super.c, it got moved to sysfs.c
>> in linux-4.4.
>>
>> Cc: stable(a)vger.kernel.org
>> Fixes: 52c198c6820f ("ext4: add sysfs entry showing whether the fs contains errors")
>> Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
>
> I was wondering why this didn't just use le32_to_cpu() all the time,
> but I see that these functions are being used for both ext4_super_block
> (on-disk) fields, as well as ext4_sb_info (in-memory) fields. A bit
> ugly, but I don't think there is a better solution.
>
> Reviewed-by: Andreas Dilger <adilger(a)dilger.ca>
One alternative that I considered was to just do away with helpers
for the ext4_super_block structure and only use them for ext4_sb_info,
especially after the last patch that changes this again. However,
as a bugfix for stable backports it seemed best to keep the change
as simple as possible.
Thanks for the review,
Arnd
The mount time field in the superblock uses a 64-bit timestamp, but
calling get_seconds() may truncate the current time to 32 bits.
This changes it to ktime_get_real_seconds() to avoid the potential
overflow.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/nilfs2/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index 6ffeca84d7c3..1b9067cf4511 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -834,7 +834,7 @@ static int nilfs_setup_super(struct super_block *sb, int is_mount)
sbp[0]->s_max_mnt_count = cpu_to_le16(NILFS_DFL_MAX_MNT_COUNT);
sbp[0]->s_mnt_count = cpu_to_le16(mnt_count + 1);
- sbp[0]->s_mtime = cpu_to_le64(get_seconds());
+ sbp[0]->s_mtime = cpu_to_le64(ktime_get_real_seconds());
skip_mount_setup:
sbp[0]->s_state =
--
2.9.0
bcache uses get_seconds() to read the current system time and store it in
the superblock as well as in uuid_entry structures that are user visible.
This changes over from the deprecated function to
ktime_get_real_seconds(), which returns a 64-bit timestamp as it
should. Unfortunately, the two structures are still limited to 32 bits,
so this won't fix any real problems. Let's at least document that
properly, in case we get an updated format in the future it can be
fixed. Until then, we still have some time, and checking the tools
at https://github.com/koverstreet/bcache-tools reveals no access to
any of them.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
drivers/md/bcache/super.c | 23 +++++++++++++++++------
include/uapi/linux/bcache.h | 4 ++--
2 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index fa4058e43202..aa9790ee5cb5 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -53,6 +53,17 @@ struct workqueue_struct *bcache_wq;
/* limitation of bcache devices number on single system */
#define BCACHE_DEVICE_IDX_MAX ((1U << MINORBITS)/BCACHE_MINORS)
+/*
+ * various timestamp fields in the superblock are unfortunately
+ * limited to 32 bits, which will lead to overflow in year 2106.
+ *
+ * If we ever get a new superblock format, that should be fixed.
+ */
+static inline u32 bcache_get_realtime32(void)
+{
+ return (u32)ktime_get_real_seconds();
+}
+
/* Superblock */
static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
@@ -181,7 +192,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
goto err;
}
- sb->last_mount = get_seconds();
+ sb->last_mount = bcache_get_realtime32();
err = NULL;
get_page(bh->b_page);
@@ -701,7 +712,7 @@ static void bcache_device_detach(struct bcache_device *d)
SET_UUID_FLASH_ONLY(u, 0);
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32(bcache_get_realtime32());
bch_uuid_write(d->c);
}
@@ -1027,7 +1038,7 @@ void bch_cached_dev_detach(struct cached_dev *dc)
int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
uint8_t *set_uuid)
{
- uint32_t rtime = cpu_to_le32(get_seconds());
+ uint32_t rtime = cpu_to_le32(bcache_get_realtime32());
struct uuid_entry *u;
struct cached_dev *exist_dc, *t;
@@ -1070,7 +1081,7 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c,
(BDEV_STATE(&dc->sb) == BDEV_STATE_STALE ||
BDEV_STATE(&dc->sb) == BDEV_STATE_NONE)) {
memcpy(u->uuid, invalid_uuid, 16);
- u->invalidated = cpu_to_le32(get_seconds());
+ u->invalidated = cpu_to_le32(bcache_get_realtime32());
u = NULL;
}
@@ -1390,7 +1401,7 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
get_random_bytes(u->uuid, 16);
memset(u->label, 0, 32);
- u->first_reg = u->last_reg = cpu_to_le32(get_seconds());
+ u->first_reg = u->last_reg = cpu_to_le32(bcache_get_realtime32());
SET_UUID_FLASH_ONLY(u, 1);
u->sectors = size >> 9;
@@ -1894,7 +1905,7 @@ static void run_cache_set(struct cache_set *c)
goto err;
closure_sync(&cl);
- c->sb.last_mount = get_seconds();
+ c->sb.last_mount = bcache_get_realtime32();
bcache_write_super(c);
list_for_each_entry_safe(dc, t, &uncached_devices, list)
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 821f71a2e48f..8d19e02d752a 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -195,7 +195,7 @@ struct cache_sb {
};
};
- __u32 last_mount; /* time_t */
+ __u32 last_mount; /* time overflow in y2106 */
__u16 first_bucket;
union {
@@ -318,7 +318,7 @@ struct uuid_entry {
struct {
__u8 uuid[16];
__u8 label[32];
- __u32 first_reg;
+ __u32 first_reg; /* time overflow in y2106 */
__u32 last_reg;
__u32 invalidated;
--
2.9.0
get_seconds() is deprecated because of the y2038 overflow, so users
should migrate to 64-bit timestamps using ktime_get_real_seconds().
In ext2, the timestamps in the superblock and in the inode are all
limited to 32-bit, and this won't get fixed, so let's just stop
using the deprecated interface and keep truncating.
All users of ext2 should migrate to ext4 before 2038 to prevent this
from causing problems.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/ext2/inode.c | 2 +-
fs/ext2/super.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 71635909df3b..7f7ee18fe179 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -86,7 +86,7 @@ void ext2_evict_inode(struct inode * inode)
if (want_delete) {
sb_start_intwrite(inode->i_sb);
/* set dtime */
- EXT2_I(inode)->i_dtime = get_seconds();
+ EXT2_I(inode)->i_dtime = ktime_get_real_seconds();
mark_inode_dirty(inode);
__ext2_write_inode(inode, inode_needs_sync(inode));
/* truncate to 0 */
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 25ab1274090f..25e31afe961d 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -679,7 +679,7 @@ static int ext2_setup_super (struct super_block * sb,
"running e2fsck is recommended");
else if (le32_to_cpu(es->s_checkinterval) &&
(le32_to_cpu(es->s_lastcheck) +
- le32_to_cpu(es->s_checkinterval) <= get_seconds()))
+ le32_to_cpu(es->s_checkinterval) <= ktime_get_real_seconds()))
ext2_msg(sb, KERN_WARNING,
"warning: checktime reached, "
"running e2fsck is recommended");
@@ -1245,7 +1245,7 @@ void ext2_sync_super(struct super_block *sb, struct ext2_super_block *es,
spin_lock(&EXT2_SB(sb)->s_lock);
es->s_free_blocks_count = cpu_to_le32(ext2_count_free_blocks(sb));
es->s_free_inodes_count = cpu_to_le32(ext2_count_free_inodes(sb));
- es->s_wtime = cpu_to_le32(get_seconds());
+ es->s_wtime = cpu_to_le32(ktime_get_real_seconds());
/* unlock before we do IO */
spin_unlock(&EXT2_SB(sb)->s_lock);
mark_buffer_dirty(EXT2_SB(sb)->s_sbh);
@@ -1360,7 +1360,7 @@ static int ext2_remount (struct super_block * sb, int * flags, char * data)
* the rdonly flag and then mark the partition as valid again.
*/
es->s_state = cpu_to_le16(sbi->s_mount_state);
- es->s_mtime = cpu_to_le32(get_seconds());
+ es->s_mtime = cpu_to_le32(ktime_get_real_seconds());
spin_unlock(&sbi->s_lock);
err = dquot_suspend(sb, -1);
--
2.9.0
afs uses 32-bit timestamps everywhere, but mixes signed and unsigned
usage, which is a bit inconsistent. In particular on 32-bit machines,
it currently uses unsigned timestamps (ranging from 1970 to 2106) for
locally modified files, but signed timestamps (rand 1902 to 2038) when
reading from a remote end. On 64-bit machines, we always interpret
timestamps as unsigned here.
This replaces the deprecated time_t with a new explicitly unsigned
afs_time32_t to get a consistent interpretation of inode times
according the the wire protocol definition.
This avoids the y2038 overflow on 32-bit machines, extending the range
to the end of the afs_time32_t in year 2106. On 64-bit machines, using
the shorter type saves a few bytes for each afs_file_status and
afs_volsync saves a few bytes over time_t or time64_t.
Note that mtime_server and struct afs_volsync are not currently
used in any meaningful way and could be removed completely.
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
fs/afs/afs.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/afs/afs.h b/fs/afs/afs.h
index b4ff1f7ae4ab..a17f4ce06323 100644
--- a/fs/afs/afs.h
+++ b/fs/afs/afs.h
@@ -26,6 +26,7 @@
typedef unsigned afs_volid_t;
typedef unsigned afs_vnodeid_t;
typedef unsigned long long afs_dataversion_t;
+typedef unsigned afs_time32_t;
typedef enum {
AFSVL_RWVOL, /* read/write volume */
@@ -129,8 +130,8 @@ typedef u32 afs_access_t;
struct afs_file_status {
u64 size; /* file size */
afs_dataversion_t data_version; /* current data version */
- time_t mtime_client; /* last time client changed data */
- time_t mtime_server; /* last time server changed data */
+ afs_time32_t mtime_client; /* last time client changed data */
+ afs_time32_t mtime_server; /* last time server changed data */
unsigned abort_code; /* Abort if bulk-fetching this failed */
afs_file_type_t type; /* file type */
@@ -158,7 +159,7 @@ struct afs_file_status {
* AFS volume synchronisation information
*/
struct afs_volsync {
- time_t creation; /* volume creation time */
+ afs_time32_t creation; /* volume creation time */
};
/*
--
2.9.0