On Wednesday, June 22, 2016 12:58:38 AM CEST Albert ARIBAUD wrote:
Hi all,
I have produced a fourth draft of what will eventually become the Y2038 design document:
https://sourceware.org/glibc/wiki/Y2038ProofnessDesign?rev=83
Relative to the previous draft:
the scope clarifies that *only* Y2038 is considered, and no other doomsday such as Y2106 or Y9999;
all types directly or indirectly derived from time_t are now listed;
all APIs using these types are now listed;
all functions which use time_t internally are now listed;
also listed are types and APIs related to time but which are Y2038-safe (even though they might be unsafe for some other doomsday, e.g. struct rpc_timeval being Y2106-unsafe).
As always, comments welcome.
I've cross-checked your list of data structures with the one I have for the kernel at https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T_...
I noticed 'struct sysinfo' as another interface that holds a time. It is documented as 'long', and I concluded that we won't need to change it, but that should probably be documented.
'struct rusage' is also interesting, as there is no overflow in 2038, but instead it could overflow on large machines (hundreds of CPUs) with very long-running tasks (months), so I'm unsure how to treat that from the kernel side. It's also one of very few kernel interfaces using 'timeval' rather than 'timespec' (we generally replaced the other ones). My current kernel patch series changes rusage by using 64-bit fields for everything, with the intention of having the same binary layout for 32-bit and 64-bit processes, and reusing the 64-bit syscall entry point for compat mode (running 32-bit tasks on a 64-bit kernel). Does that work for you?
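As a rough sanity check on the "hundreds of CPUs, months" estimate, here is a minimal sketch that computes how long a task must keep N CPUs busy before its accumulated CPU time reaches 2^31 seconds, the point where a signed 32-bit tv_sec overflows. The helper name is made up for illustration, and it assumes all CPUs are fully busy for the whole run.

```c
#include <assert.h>
#include <stdint.h>

/* CPU seconds at which a signed 32-bit tv_sec overflows: 2^31. */
#define TV_SEC32_LIMIT ((int64_t)INT32_MAX + 1)

static_assert(TV_SEC32_LIMIT == 2147483648LL, "limit is 2^31 seconds");

/*
 * Hypothetical helper: whole wall-clock days until a task that keeps
 * `ncpus` CPUs fully busy has accumulated 2^31 seconds of CPU time.
 * Both divisions round down.
 */
static int64_t days_until_rusage_overflow(int64_t ncpus)
{
    int64_t wall_seconds = TV_SEC32_LIMIT / ncpus;

    return wall_seconds / (24 * 60 * 60);
}
```

With 256 busy CPUs this gives 97 days, so a few months on a large machine is enough.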
I think we should expand the ioctl section a bit: I can do most of it in the kernel, but need help from glibc for a few things in which we need to agree on calling conventions. Here is what I'd suggest having in the document, feel free to take that into your document or edit as you wish:
== IOCTLs and Y2038 ==
Some Linux IOCTLs may be Y2038-unsafe, or use types defined by glibc that do not match the kernel internal types. Known important cases are:
- An ioctl command number is defined using the _IOR/_IOW/_IOWR macros by the kernel with a structure whose size changes based on glibc's time_t. The kernel can handle these transparently by implementing handlers for both command numbers with the correct structure format.
- The binary ABI changes based on the glibc time_t type, but the command number does not change. In this case, the kernel header files defining the data structure will check the "__USE_TIME_BITS64" macro [do we need a new macro for the kernel headers?] to provide a different command number for the new data structure layout. glibc will define this macro at an appropriate location [where?] to make it visible before including kernel header files.
- An ioctl command passes time information in a structure that is not based on time_t but another integer type that does not get changed. The kernel header files will provide both a new structure layout and command number when "__USE_TIME_BITS64" is set.
[I can find examples for ioctl commands in each of those categories if needed.]
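To make the first two cases concrete, here is a minimal sketch using a made-up driver. All FOO_* names, the 'f' magic number and both structures are hypothetical; only _IO/_IOR and the __USE_TIME_BITS64 check come from the text above. It is an illustration of the idea, not a proposal for a specific header.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>   /* _IO, _IOR */

/* Hypothetical driver structure as built with a 32-bit time_t. */
struct foo_stamp32 {
    int32_t sec;
    int32_t nsec;
};

/* The same structure as built with a 64-bit time_t. */
struct foo_stamp64 {
    int64_t sec;
    int64_t nsec;
};

static_assert(sizeof(struct foo_stamp32) != sizeof(struct foo_stamp64),
              "the two layouts must differ in size");

/*
 * Case 1: _IOR encodes the structure size in the command number, so
 * the two layouts get distinct numbers without any extra work.
 */
#define FOO_GET_STAMP32 _IOR('f', 1, struct foo_stamp32)
#define FOO_GET_STAMP64 _IOR('f', 1, struct foo_stamp64)

/*
 * Case 2: the number carries no size, so the header has to hand out
 * a new number explicitly when __USE_TIME_BITS64 is set.
 */
#define FOO_SET_ALARM_OLD _IO('f', 2)
#define FOO_SET_ALARM_NEW _IO('f', 3)
#ifdef __USE_TIME_BITS64
#define FOO_SET_ALARM FOO_SET_ALARM_NEW
#else
#define FOO_SET_ALARM FOO_SET_ALARM_OLD
#endif

/*
 * Sketch of the kernel-side dispatch for case 1: return the size of
 * the structure the handler should copy, or 0 for unknown commands.
 */
static unsigned int foo_payload_size(unsigned int cmd)
{
    switch (cmd) {
    case FOO_GET_STAMP32:
        return sizeof(struct foo_stamp32);
    case FOO_GET_STAMP64:
        return sizeof(struct foo_stamp64);
    default:
        return 0;
    }
}
```

User space would only ever spell FOO_SET_ALARM; which number that expands to depends on whether glibc defined __USE_TIME_BITS64 before the header was included.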
== Socket options ==
Like ioctl(), setsockopt()/getsockopt() has a few interfaces that are passing time data:
SO_TIMESTAMP/SO_TIMESTAMPNS/SO_TIMESTAMPING: These enable the timestamping infrastructure for a socket, which will subsequently return data to user space using "cmsg" data on the socket. The kernel does not know the layout of 'struct timespec' and 'struct timeval' when filling the cmsg data, so we need to define new binary values for the three flags, which then get used if __USE_TIME_BITS64 is set.
SO_RCVTIMEO/SO_SNDTIMEO: These pass a 'struct timeval' and a length. Assuming that the 'optlen' argument of the setsockopt syscall always matches the size of 'struct timeval', the kernel will be able to access the data in the same format that was passed by glibc. [Alternatively, we could handle this the same way as SO_TIMESTAMP*, using new numbers for the flags.]
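A minimal sketch of the optlen-based approach for the timeout options, assuming the two layouts are plain pairs of 32-bit or 64-bit integers. The structure and function names are hypothetical, and the sketch does no range checking on the values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width stand-ins for the two 'struct timeval' layouts. */
struct timeval32 {
    int32_t tv_sec;
    int32_t tv_usec;
};

struct timeval64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

static_assert(sizeof(struct timeval32) != sizeof(struct timeval64),
              "optlen can only tell the layouts apart if sizes differ");

/*
 * Decode a socket timeout into microseconds, picking the layout from
 * optlen alone. Returns 0 on success, -1 if optlen matches neither
 * layout.
 */
static int decode_timeout(const void *optval, size_t optlen, int64_t *usec_out)
{
    if (optlen == sizeof(struct timeval32)) {
        struct timeval32 tv;

        memcpy(&tv, optval, sizeof(tv));
        *usec_out = (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
        return 0;
    }
    if (optlen == sizeof(struct timeval64)) {
        struct timeval64 tv;

        memcpy(&tv, optval, sizeof(tv));
        *usec_out = tv.tv_sec * 1000000 + tv.tv_usec;
        return 0;
    }
    return -1;
}
```

This only works as long as every caller passes an optlen that matches the structure it actually filled in, which is the assumption stated above.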
[end quote]
Regarding the "Support for non-Y2038-safe kernels" section, I'm not sure if that can work at all: a kernel that does not have the appropriate system calls will also not have the handlers for a lot of the ioctl commands and possibly other interfaces that rely on a specific structure layout. If we can instead enforce that __USE_TIME_BITS64 is only set with a minimum version of the kernel headers, and that it implies binary compatibility with no older kernel version, we could avoid those problems.
On your note "The implementation needs further thinking about, as application code defining _TIME_BITS=64 and gets built against new kernel headers and old GLIBC headers, then GLIBC will use 32-bit time_t and kernel will expect 64-bit time_t, and there is no way to ensure detection of this case.", I think that is covered by having the kernel headers check __USE_TIME_BITS64 instead of _TIME_BITS=64, as I described above.
Arnd