On Thu, Dec 20, 2018 at 11:33:59AM +0100, Szabolcs Nagy wrote:
- Rich Felker <dalias@...c.org> [2018-12-19 19:30:44 -0500]:
On Tue, Dec 18, 2018 at 08:41:53PM +0100, Arnd Bergmann wrote:
".1" ABIs, this translation would mostly be the identity transformation, but on archs where we're already doing some hacks to fix up kernel ABI bugs (sysvipc on big endian, mips stat structure, x32 stuff, etc.) the hacks could be replaced by use of this translation infrastructure.
lesson of ilp32 was that libc cannot generally translate between a user and kernel abi (otherwise it could be done in userspace).
the problematic cases are when user talks to the kernel directly using libc types in a way that the libc cannot do the translation.
interfaces where the libc does not know the type, just an opaque pointer: ioctl, fcntl, getsockopt, setsockopt, raw syscall
Ultimately all of these *can* be translated just by enumerating all the broken interfaces and special-casing them. It's not pretty, though. What would probably happen (Arnd, do you know?) would be redefining the ioctl numbers etc. to "time64" versions of the interfaces, and for interfaces which are actually "important" to have work on old kernels, including translations to/from the corresponding old ioctl. Depending on the scope, that might be all or nearly all of them.
We've done it for most of them by now. In a lot of cases we got lucky because the ioctl command code changes with sizeof(time_t), so all we had to do in the kernel was to interpret those ioctl commands for 32-bit and 64-bit time_t.
In other cases, we have redefined the ioctl command codes in the header with some clever (hopefully not too clever) trick:
#if __BITS_PER_LONG == 64
#define LPSETTIMEOUT LPSETTIMEOUT_OLD
#else
#define LPSETTIMEOUT (sizeof(time_t) > sizeof(__kernel_long_t) ? \
        LPSETTIMEOUT_NEW : LPSETTIMEOUT_OLD)
#endif
This way, we guarantee that we can still detect the data type expected by an application calling LPSETTIMEOUT. The same approach is used for setsockopt and some other interfaces.
In other cases (in particular when we never pass absolute CLOCK_REALTIME data), we changed the type inside of a structure from time_t to 'long' or 'unsigned long', in order to keep the ABI unchanged. The disadvantage here is that it requires user space to use updated kernel headers, which is a problem for applications that ship with a copy of the kernel header.
I think for fcntl we were lucky that nothing passes a time_t.
direct communication channel to the kernel that may expose the abi incompatibility: netlink, sysfs, procfs
Netlink is the worst here since it's "hidden" behind normal read/write calls where the data is abstract bytes. If there's anything that needs to be fixed at the netlink layer it probably just requires redefining part of the _API_ to use fixed-width types rather than time_t or such.
I don't remember seeing any such case with netlink. Generally speaking, netlink already has to use fixed-width types in order to support compat mode, but there may be a couple of exceptions where the kernel requires nasty hacks here. The same is true for read/write based chardev interfaces such as /dev/input/eventX, which we had to redefine to use a structure based on 'unsigned long' instead of 'time_t', and which require the use of CLOCK_MONOTONIC to avoid the overflow.
types related to signal handling that may require sighandler wrapping to translate: siginfo_t, ucontext_t
Yes. I'm not proposing we do sighandler wrapping/translation now or in the future because it's a pain, but there are some good motivations to do it, so I'd like to keep the option open.
I'm certainly not planning to touch any of those in musl ;-)
time_t may not be affected by these, but it shows that translation is fragile in general, i wonder if we can ensure correct behaviour in all cases. there is also the problem of linux headers which may use and redefine libc types and user code may need to use those.
Redefining libc types is already broken, and the kernel headers that do it can't be used from userspace when libc headers are included. This issue is independent of type sizes/layouts matching.
I don't think any kernel headers _use_ libc types either. They generally use their own stuff.
'struct timespec' is a notable exception here, but probably not the only one. At the moment, both libc and kernel define this structure (and timeval, itimerval, itimerspec, ...), and in my work on the kernel interfaces I assumed that the libc version is the one that will prevail, while the kernel version should get removed.
Arnd