Hey Arnd,
Catching up on this thread a little late, sorry... :-/
On Wed, Mar 11, 2020 at 01:52:00PM +0100, Arnd Bergmann wrote:
As discussed before, I tried using the rebootstrap tool [1] to see what problems come up once the entire distro gets rebuilt. Based on Lukasz' recommendation, I tried the 'y2038_edge' branch with his experimental glibc patches [2], using commit c2de7ee9461 dated 2020-02-17.
Here is a rough summary of what I tried, what worked, and what problems I ran into:
- Building a Debian package from this was fairly straightforward, using
the 2.31 branch in the package git repository[3] after replacing the debian/patches/git-updates.diff file with one generated from [2] and disabling the hurd patches because of conflicts.
- After installing the modified x86 glibc package, I ran into a runtime
bug in [4], which needs to pass AT_FDCWD instead of 0 to avoid ENOTDIR errors.
- Bootstrapping a regular time32 Debian armhf with this libc took me
a few days to get right, but that was mostly for getting familiar with rebootstrap and running into known issues unrelated to time64 or the glibc changes.
Cool!
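For anyone following along, here is a minimal sketch of why passing 0 where AT_FDCWD is meant ends in ENOTDIR. It is not the code from [4]. It only assumes the affected call is one of the *at() family, and uses Python's os.stat(..., dir_fd=...) as a stand-in. The helper names stat_relative and demo are made up for illustration. In Python, dir_fd=None plays the role of AT_FDCWD, and a descriptor opened on a regular file stands in for fd 0, which is normally stdin and not a directory.

```python
import errno
import os
import tempfile


def stat_relative(name, dir_fd=None):
    """Return 0 on success, or the errno of a failed relative lookup.

    dir_fd=None asks for resolution relative to the current working
    directory, which is what AT_FDCWD expresses in C.
    """
    try:
        os.stat(name, dir_fd=dir_fd)
        return 0
    except OSError as exc:
        return exc.errno


def demo():
    """Look up the same relative path with a bad and a good directory fd."""
    old_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            # Create the file we are going to look up.
            with open("target", "w"):
                pass
            # An open descriptor that is not a directory (hypothetical
            # stand-in for fd 0 in the buggy call).
            fd = os.open("target", os.O_RDONLY)
            try:
                wrong = stat_relative("target", dir_fd=fd)
            finally:
                os.close(fd)
            # The fixed variant: resolve relative to the working directory.
            right = stat_relative("target")
        finally:
            os.chdir(old_cwd)
    return wrong, right


if __name__ == "__main__":
    wrong, right = demo()
    print("non-directory fd:", errno.errorcode.get(wrong, wrong))
    print("cwd-relative:", "ok" if right == 0 else errno.errorcode.get(right, right))
```

The kernel tries to treat the descriptor as the starting directory for a relative path, so any non-directory fd fails the same way.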
<snip glibc questions>
- There is an open question regarding the name of the Debian
architecture. For my experiments, I kept using the 'armhf' name unmodified, though there seems to be a general feeling that using a different name would be required to address the broad incompatibilities between time32 and time64 versions of all the libraries in the distro. Gradually changing them won't work because of the timeline and the number of affected libraries. However, the new name of the distro also implies having a distinct target triplet, which must then be known by glibc along with everything else using config.guess/config.sub. I expect this topic to require a lot more discussion.
ACK. I'm about to prod on this again.
- Continuing with the rebootstrap build despite the known glibc issues
and the open question on the architecture name went surprisingly well, only two out of the 152 source packages I built had compile-time problems:
building the final gcc failed in libsanitizer, which has compile-time checks to ensure some libc data structures have the expected layout. It noticed that 'struct timeb' and 'struct dirent' are different based on _TIME_BITS and _FILE_OFFSET_BITS. I disabled the checks to be able to continue. To do this properly, the library has to learn about the new data structures as well. I opened a bug report against the library[7].
libpreludecpp12 failed to build because of checks for changes in the exported functions, which are different with time64. I disabled the checks. Once we have agreed on a new debian architecture name, the symbols can be made arch specific.
Yup.
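To make the libsanitizer failure concrete, here is a rough sketch of how the layout of 'struct timeb' shifts when time_t grows from 32 to 64 bits. It is not the sanitizer's actual check. The field list is assumed from the traditional <sys/timeb.h> declaration (time, millitm, timezone, dstflag), and ctypes applies the host's alignment rules, so the numbers are illustrative and do not describe the armhf ABI authoritatively. The names make_timeb and layout are hypothetical.

```python
import ctypes


def make_timeb(time_t_type):
    """Build a ctypes mirror of struct timeb for a given time_t width."""

    class Timeb(ctypes.Structure):
        _fields_ = [
            ("time", time_t_type),          # time_t time
            ("millitm", ctypes.c_ushort),   # unsigned short millitm
            ("timezone", ctypes.c_short),   # short timezone
            ("dstflag", ctypes.c_short),    # short dstflag
        ]

    return Timeb


def layout(struct_type):
    """Return ({field: offset}, total size) for a ctypes structure."""
    offsets = {
        name: getattr(struct_type, name).offset
        for name, _ in struct_type._fields_
    }
    return offsets, ctypes.sizeof(struct_type)


Timeb32 = make_timeb(ctypes.c_int32)  # what a time32 build sees
Timeb64 = make_timeb(ctypes.c_int64)  # what a _TIME_BITS=64 build sees

if __name__ == "__main__":
    for label, struct_type in (("time32", Timeb32), ("time64", Timeb64)):
        offsets, size = layout(struct_type)
        print(label, "size", size, "offsets", offsets)
```

Any code that hard-codes the time32 size or offsets, as a layout assertion does, stops matching once the same header is compiled with the wider time_t.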
- After everything was built, I tried installing the packages into
a chroot with qemu-debootstrap, which failed because I had configured the glibc to assume it's running on a new kernel while the qemu-user binary I had lacks the new syscalls. I believe this is fixed in upstream qemu, but did not try that.
- Trying to install again I used a clean debian-arm64 installation
running in qemu-system-aarch64, and attempted installing the armhf packages using a regular debootstrap, running the 32-bit binaries in compat mode of a recent arm64 kernel. This partially worked and I could chroot into the system and use a shell, but ultimately the debootstrap did not complete because of errors. I saw that 'tar' had failed because of the stat() glibc ABI mismatch breaking its private gnulib fdutimens() implementation, and this is where I gave up.
Nod. :-/ I think it's time that somebody else picked up from you here.
I have spent more time on this now than I had planned, and don't expect to do further work on it anytime soon, but I hope my summary is useful to others that are going to need this later. I can obviously share my patches and build artifacts if anyone needs them. There are two additional approaches that would likely get a Debian bootstrap further, but that I have not tried as they were previously dismissed:
- Adding a time64 armhf as a separate (incompatible) target in glibc
that defines __TIMESIZE==64 and a 64-bit __time_t would avoid most of the remaining ABI issues and put armhf-time64 in the same category as riscv32 and arc, but this idea was so far rejected by the glibc maintainers. Depending on how hard this turns out to be, it could be used to get to the point of self-hosting though, and help find time64 related bugs in the rest of the distro.
OK. I'm thinking it's probably not worth it?
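As a reminder of what a 64-bit __time_t buys, here is a small sketch of the range limit of a signed 32-bit time_t. It says nothing about glibc internals. The helper fits_time_t is a made-up name, and the struct module is only used as a convenient way to test whether a value fits in a signed 32-bit or 64-bit integer.

```python
import struct
from datetime import datetime, timezone


def fits_time_t(seconds, bits):
    """Return True if `seconds` fits in a signed integer of `bits` width."""
    fmt = "<i" if bits == 32 else "<q"
    try:
        struct.pack(fmt, seconds)
        return True
    except struct.error:
        return False


# Largest value a signed 32-bit time_t can hold.
LAST_TIME32 = 2**31 - 1
last_moment = datetime.fromtimestamp(LAST_TIME32, tz=timezone.utc)

if __name__ == "__main__":
    print("last time32 second:", last_moment.isoformat())
    print("one second later fits in 32 bits:", fits_time_t(LAST_TIME32 + 1, 32))
    print("one second later fits in 64 bits:", fits_time_t(LAST_TIME32 + 1, 64))
```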
- Doing the bootstrap using a musleabihf target instead of gnueabihf
would avoid the current issues internal to glibc-y2038, but instead lead to new problems with packages that do not currently work with musl. Adelie Linux has shown that it's already possible to build a useful distro using musl and time64[8], and this would sidestep the question of the target triplet. While it would also help find and fix additional bugs in packages, and make an interesting unofficial Debian target, I don't see it replacing the existing armhf port any time soon.
Ditto.
Thanks for the great summary of what you've been working on!