On Sun, Nov 19, 2023 at 08:12:48PM -0500, Stefan Berger wrote:
Documentation/filesystems/ramfs-rootfs-initramfs.rst states:
If CONFIG_TMPFS is enabled, rootfs will use tmpfs instead of ramfs by default. To force ramfs, add "rootfstype=ramfs" to the kernel command line.
This currently does not work when root= is provided since then saved_root_name contains a string and rootfstype= is ignored. Therefore, ramfs is currently always chosen when root= is provided.
Maybe it is a good idea to just fully remove ramfs? initramfs will always be tmpfs. And tmpfs will always be enabled.
As well as I understand, ramfs was originally introduced, because tmpfs seemed too big. So, it seemed to be a good idea to have small fs (ramfs), which is always enabled.
I just did an experiment. I compiled the kernel with a very small config. And without TMPFS and SHMEM. I got 1059440 bytes image. Then I enabled TMPFS and SHMEM, and I got 1072976 bytes. So tmpfs adds 13536 bytes, i. e. 14k, which is a very small amount. It adds 1.3 % to the kernel even with very small config.
So I propose to remove ramfs and always enable tmpfs. This will decrease complexity.
Here are my configs (x86_64). Just enough to run busybox in "qemu -serial stdio"
make KCONFIG_ALLCONFIG="$FILE" allnoconfig
CONFIG_64BIT=y CONFIG_PRINTK=y CONFIG_SERIAL_8250=y CONFIG_TTY=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_BLK_DEV_INITRD=y CONFIG_RD_GZIP=y CONFIG_BINFMT_ELF=y CONFIG_EMBEDDED=y CONFIG_EXPERT=y CONFIG_TMPFS=n # Try to change this to "y" CONFIG_SHMEM=n # Try to change this to "y"
Here is full docker reproducer:
# Reproducible # 20230227 = 20230227T000000Z = 20230226T090712Z FROM debian:sid-20230227 ENV LC_ALL C.UTF-8 RUN sed -i 's~^URIs:.*$~URIs: http://snapshot.debian.org/archive/debian/20230226T090712Z~' /etc/apt/sources.list.d/debian.sources RUN echo 'Acquire::Check-Valid-Until "false";' > /etc/apt/apt.conf.d/02acquire-check-valid-until RUN apt-get update && apt-get install -y apt-utils whiptail RUN apt-get update && apt-get install -y busybox-static qemu-system-x86 make gcc git flex bison bc libelf-dev less nano cpio RUN git clone --depth=1 -b v6.2 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git RUN : > /.config # See Linux f8f0d06438e5c810d1e13b5f8c2fed501fe36e9c RUN echo 'CONFIG_64BIT=y' >> /.config RUN echo 'CONFIG_PRINTK=y' >> /.config RUN echo 'CONFIG_SERIAL_8250=y' >> /.config RUN echo 'CONFIG_TTY=y' >> /.config RUN echo 'CONFIG_SERIAL_8250_CONSOLE=y' >> /.config RUN echo 'CONFIG_BLK_DEV_INITRD=y' >> /.config RUN echo 'CONFIG_RD_GZIP=y' >> /.config RUN echo 'CONFIG_BINFMT_ELF=y' >> /.config RUN echo 'CONFIG_EMBEDDED=y' >> /.config RUN echo 'CONFIG_EXPERT=y' >> /.config RUN echo 'CONFIG_TMPFS=y' >> /.config # try "n" RUN echo 'CONFIG_SHMEM=y' >> /.config # try "n" RUN cd linux && make KCONFIG_ALLCONFIG=/.config allnoconfig RUN cd linux && make -j4 RUN mkdir /initramfs && cp /bin/busybox /initramfs && cd /initramfs && ln -s busybox sh && find . | cpio --create --format=newc --quiet | gzip > /initramfs.cpio.gz RUN echo "qemu-system-x86_64 -M microvm -m 64M -serial stdio -display none -kernel /linux/arch/x86/boot/bzImage -initrd /initramfs.cpio.gz -append 'quiet console=ttyS0 earlyprintk=ttyS0 rdinit=/sh' -nodefaults -no-user-config" > /root/.bash_history
On 12/19/23 20:19, Askar Safin wrote:
On Sun, Nov 19, 2023 at 08:12:48PM -0500, Stefan Berger wrote:
Documentation/filesystems/ramfs-rootfs-initramfs.rst states:
If CONFIG_TMPFS is enabled, rootfs will use tmpfs instead of ramfs by default. To force ramfs, add "rootfstype=ramfs" to the kernel command line.
I wrote that around 2005.
This currently does not work when root= is provided since then saved_root_name contains a string and rootfstype= is ignored.
I wrote the code to populate a tmpfs from initramfs in 2013 (8 years later), because nobody else had bothered to, and I've had a patch to fix the rootfstype= issue for years, but I long ago lost my touch with linux-kernel proctology:
https://lkml.iu.edu/hypermail/linux/kernel/2302.2/05601.html
Last I heard it was going upstream via somebody else (who tripled the size of the fix for no obvious reason, but that's modern linux-kernel for you), but I lost track after the guy who'd said to send it to him instead did multiple automated bounces from a bot that said "to save this one special guy's time, we're going to spam linux-kernel with autogenerated content delivered to tens of thousands of plebian mailboxes":
https://lkml.iu.edu/hypermail/linux/kernel/2311.1/05544.html https://lkml.iu.edu/hypermail/linux/kernel/2311.1/06448.html
Last I saw the rewrite of the fix had been resubmitted to the Special Guy a fourth time:
https://lkml.iu.edu/hypermail/linux/kernel/2311.2/02938.html
Maybe it'll show up in a git pull someday? Who knows...
Therefore,
ramfs is currently always chosen when root= is provided.
Maybe it is a good idea to just fully remove ramfs?
"Let's delete a 20 year old filesystem that's been built into every system that whole time, this can't POSSIBLY have any side effects."
We've already discussed this, more than once:
https://lkml.iu.edu/hypermail/linux/kernel/2311.0/00435.html
Can you build tmpfs on a nommu system? Last I checked the plumbing expects swap, but it's been a while...
initramfs will always be tmpfs. And tmpfs will always be enabled.
As well as I understand, ramfs was originally introduced, because tmpfs seemed too big.
No, ramfs came first, tmpfs was created later. Not just what initramfs uses, but the order in which the filesystems were written.
I believe modern ramfs more or less dates to Al Viro factoring out "libfs":
https://lwn.net/Articles/57369/
Although the memory grows a bit hazy. (I vaguely remember what happened, but exactly when it was, and what came before what... git log in my unified history tree is showing fs/ramfs originating from 2.3.99pre4 in 2000 but it may be confused by a rename. I remember somebody, I thought Linus, saying that ramfs was an example user of the libfs code...)
Here are my configs (x86_64). Just enough to run busybox in "qemu -serial stdio"
Back when I maintained busybox there were people in that space who cared about 14k, and that's not measuring runtime overhead.
There was a lovely ELC talk in 2015 about booting Linux in 256 kilobytes of SRAM (everything else being XIP out of ROM-alikes), but alas the entire year's videos got wiped by the Linux Foundation. The PDF is still online though:
https://elinux.org/images/9/90/Linux_for_Microcontrollers-_From_Marginal_to_...
Alas, XIP got replaced with DAX which I've never managed to get to work...
Rob
Hi, Rob. And Stefan.
First of all, this patch got to linux-next ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=... ), so it seems it soon will be in mainline.
On Thu, Dec 21, 2023 at 12:24 PM Rob Landley rob@landley.net wrote:
Can you build tmpfs on a nommu system? Last I checked the plumbing expects swap, but it's been a while...
Okay, I agree, let's not remove ramfs.
Still, I don't like this (already applied) patch. init= and rdinit= are two different options, and this is good. So, I think we should have two different options. Analogously they should be rootfstype= and rdrootfstype=. rootfstype= should be read by kernel when deciding how to mount real root (i. e. not initramfs or initrd) only and rdrootfstype= when deciding how to mount initramfs only. This will make everything cleaner. Also note that userspace tools read rootfstype= and assume that it always applies to real root. For example, this is Debian's rdinit:
https://salsa.debian.org/kernel-team/initramfs-tools/-/blob/cf964bfb4362019f...
As you can see, this shell script parses /proc/cmdline and assumes that rootfstype= always applies to real root. So, if someone sets rootfstype= to tmpfs or ramfs, this will likely break this script.
So, I think the code should look so:
+if (IS_ENABLED(CONFIG_TMPFS)) { + if (!rd_root_fs_names) // We assume rd_root_fs_names is set by rdrootfstype= + is_tmpfs = true; // Use tmpfs if rdrootfstype= is not set. To get all tmpfs benefits + else if (rd_root_fs_names && !!strstr(rd_root_fs_names, "tmpfs")) + is_tmpfs = true; +}
Yes, this will slightly break compatibility. I. e. this will make Linux always choose tmpfs if rdrootfstype= is not present. But I think there is nothing wrong with it. If a user cares, he will set rdrootfstype= . And early boot code will become a lot more clean and logical.
Rob, do you agree? Stefan, do you agree? Then I will write a patch, with doc changes (currently I use gmail web interface, of course I will use git send-email when I sent actual patch)
On 12/21/23 17:58, Askar Safin wrote:
Hi, Rob. And Stefan.
First of all, this patch got to linux-next ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=... ), so it seems it soon will be in mainline.
On Thu, Dec 21, 2023 at 12:24 PM Rob Landley rob@landley.net wrote:
Can you build tmpfs on a nommu system? Last I checked the plumbing expects swap, but it's been a while...
Okay, I agree, let's not remove ramfs.
Still, I don't like this (already applied) patch. init= and rdinit= are two different options, and this is good. So, I think we should have two different options. Analogously they should be rootfstype= and rdrootfstype=. rootfstype= should be read by kernel when deciding how to mount real root (i. e. not initramfs or initrd) only and rdrootfstype= when deciding how to mount initramfs only. This will make everything cleaner. Also note that userspace tools read rootfstype= and assume that it always applies to real root. For example, this is Debian's rdinit:
https://salsa.debian.org/kernel-team/initramfs-tools/-/blob/cf964bfb4362019f...
As you can see, this shell script parses /proc/cmdline and assumes that rootfstype= always applies to real root. So, if someone sets rootfstype= to tmpfs or ramfs, this will likely break this script.
Setting the kernel boot command line option rootfstype= to tmpfs or ramfs was possible so far and that's what the documentation and code supported so far as well. The bug surfaced when root= was provided, in which case it was ignored.
So, I think the code should look so:
+if (IS_ENABLED(CONFIG_TMPFS)) {
if (!rd_root_fs_names) // We assume rd_root_fs_names is set
by rdrootfstype=
is_tmpfs = true; // Use tmpfs if rdrootfstype= is not
set. To get all tmpfs benefits
else if (rd_root_fs_names && !!strstr(rd_root_fs_names, "tmpfs"))
is_tmpfs = true;
+}
Yes, this will slightly break compatibility. I. e. this will make Linux always choose tmpfs if rdrootfstype= is not present. But I think
You may find someone who doesn't like this change, either, ...
there is nothing wrong with it. If a user cares, he will set rdrootfstype= . And early boot code will become a lot more clean and logical.
Rob, do you agree? Stefan, do you agree? Then I will write a patch,
... but go ahead.
with doc changes (currently I use gmail web interface, of course I will use git send-email when I sent actual patch)
On 12/29/23 10:39, Stefan Berger wrote:> On 12/21/23 17:58, Askar Safin wrote:
Hi, Rob. And Stefan.
First of all, this patch got to linux-next ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=... ), so it seems it soon will be in mainline.
On Thu, Dec 21, 2023 at 12:24 PM Rob Landley rob@landley.net wrote:
Can you build tmpfs on a nommu system? Last I checked the plumbing expects swap, but it's been a while...
Okay, I agree, let's not remove ramfs.
Still, I don't like this (already applied) patch. init= and rdinit= are two different options,
Because they control two different things which are often used at the same time. (Debian has an initramfs that hands off to the final root filesystem, for example. Hence the initramfs-tools package that runs every time apt-get updates the kernel.)
So being able to specify rdinit= to intercept the ramfs layer or init= to intercept the root= layer made sense, because they did different things.
But the only reason to specify anything nontrivial for the initramfs _filesystem_ mount properties is because you intend to stay there. They don't get used together.
and this is good.
Eh, not really. Strange legacy decision we're now stuck with. The kernel only ever runs one init task per boot. If init= was _also_ checked to see which file to run out of initramfs (and the plumbing still justs silently fails and moves on if it's not found) then the debian script would have been forced to do INIT= or similar to override the overmounted root's init task separately from initrd's init task, making it clear a script (not the kernel) is making that decision.
But that would have been a user-visible change, and when initramfs was going in they were trying to avoid user-visible changes that would force sysadmins to learn new stuff because the plumbing changed out from under them. (Like the change you're proposing now would.)
So, I think we should have two different options. Analogously they should be rootfstype= and rdrootfstype=.
You can't have a root= type of initramfs or tmpfs. The specified values can't overlap. The plumbing I wrote responds to specific values but otherwise leaves it for later users.
https://salsa.debian.org/kernel-team/initramfs-tools/-/blob/cf964bfb4362019f...
As you can see, this shell script parses /proc/cmdline and assumes that rootfstype= always applies to real root.
The script is running _in_ the initramfs, which is already loaded and running at that point. Meaning the _kernel_ will not parse root= at that point, userspace has to do it.
So, if someone sets rootfstype= to tmpfs or ramfs, this will likely break this script.
Which was the same 10 years ago?
The script is running in a context where initramfs is not persistent, so overriding it to be a tmpfs has no benefit. (I mean you _can_... Nobody does, because we're gonna switch_root off of it.)
And once code _is_ running in initramfs, the kernel's internal root= automounter will never run. The initramfs code can parse /proc/cmdline to use the same arguments as the kernel, or it could much more easily use the "any unrecognized arguments get set as environment variables in PID 1" and use ROOT= or similar, like many scripts do.
Modifying kernel code that NEVER RUNS in the case you're pointing out seems silly to me.
That said, the code I wrote is doing a strstr to see if the argument's there, but doesn't care what ELSE is there, so it could easily be "rootfstype=tmpfs,ext4" and have the userspace script also filter the argument for just what it's interested in, since at that point it's NOT THE KERNEL DOING IT.
Setting the kernel boot command line option rootfstype= to tmpfs or ramfs was possible so far and that's what the documentation and code supported so far as well. The bug surfaced when root= was provided, in which case it was ignored.
No, as I explained when I wrote the initmpfs code in 2013 when you say root= you are explicitly requesting the kernel mount a second file system over rootfs (that's what root= MEANS), and thus don't bother making it a (more expensive) tmpfs because it's not sticking around.
I did that to NOT invent new arguments, and work with existing systems. The new feature I added triggered when you _requested_ it.
Yes, this will slightly break compatibility. I. e. this will make Linux always choose tmpfs if rdrootfstype= is not present. But I think
You may find someone who doesn't like this change, either, ...
Yeah, me.
there is nothing wrong with it. If a user cares, he will set rdrootfstype= . And early boot code will become a lot more clean and logical.
Rob, do you agree? Stefan, do you agree? Then I will write a patch,
I think it's a bad idea.
If a documentation change is needed, maybe clarify what root= means? That argument explicitly requests the kernel mount a second filesystem over rootfs after failing to launch init out of initramfs, so that rootfs will NOT be the / visible to PID 1.
This is what it has ALWAYS meant. Back before initramfs was invented in the old initrd days, you didn't need to say root= if you provided an initial ramdisk with an init executable in it (except the old mechanism didn't want /init it wanted /linuxrc for no apparent reason), and if you DID say root= when the kernel launched PID 1 out of initrd the superfluous argument was ignored (never parsed) because the kernel setup code never made it far enough to check it.
I also note that I never wired up "rootflags=" (see init/do_mounts.c root_data_setup) so it gets passed through to initmpfs, meaning you can't specify size= and thus it defaults to 50% of available memory as the enforced limit. That would be actually useful. (And no I don't really care if it's the original argument or a different one because it can't have any existing users to break yet, although once again the kernel CAN'T do both at once so having two is SILLY. The reason they're NOT wired up yet is ramfs takes no arguments, so only passing rootflags to tmpfs when you didn't specify a root= or when you tell it rootfstype=tmpfs makes sense, because then what you're passing better not be ext4 arguments because you're staying on initramfs and the kernel's root= plumbing doesn't run.)
That's the actual missing feature here.
Rob
On 12/29/23 13:35, Rob Landley wrote:
On 12/29/23 10:39, Stefan Berger wrote:> On 12/21/23 17:58, Askar Safin wrote:
Hi, Rob. And Stefan.
First of all, this patch got to linux-next ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/?qt=... ), so it seems it soon will be in mainline.
On Thu, Dec 21, 2023 at 12:24 PM Rob Landley rob@landley.net wrote:
Can you build tmpfs on a nommu system? Last I checked the plumbing expects swap, but it's been a while...
Okay, I agree, let's not remove ramfs.
Still, I don't like this (already applied) patch. init= and rdinit= are two different options,
Because they control two different things which are often used at the same time. (Debian has an initramfs that hands off to the final root filesystem, for example. Hence the initramfs-tools package that runs every time apt-get updates the kernel.)
So being able to specify rdinit= to intercept the ramfs layer or init= to intercept the root= layer made sense, because they did different things.
But the only reason to specify anything nontrivial for the initramfs _filesystem_ mount properties is because you intend to stay there. They don't get used together.
and this is good.
Eh, not really. Strange legacy decision we're now stuck with. The kernel only ever runs one init task per boot. If init= was _also_ checked to see which file to run out of initramfs (and the plumbing still justs silently fails and moves on if it's not found) then the debian script would have been forced to do INIT= or similar to override the overmounted root's init task separately from initrd's init task, making it clear a script (not the kernel) is making that decision.
But that would have been a user-visible change, and when initramfs was going in they were trying to avoid user-visible changes that would force sysadmins to learn new stuff because the plumbing changed out from under them. (Like the change you're proposing now would.)
So, I think we should have two different options. Analogously they should be rootfstype= and rdrootfstype=.
You can't have a root= type of initramfs or tmpfs. The specified values can't overlap. The plumbing I wrote responds to specific values but otherwise leaves it for later users.
https://salsa.debian.org/kernel-team/initramfs-tools/-/blob/cf964bfb4362019f...
As you can see, this shell script parses /proc/cmdline and assumes that rootfstype= always applies to real root.
The script is running _in_ the initramfs, which is already loaded and running at that point. Meaning the _kernel_ will not parse root= at that point, userspace has to do it.
So, if someone sets rootfstype= to tmpfs or ramfs, this will likely break this script.
Which was the same 10 years ago?
The script is running in a context where initramfs is not persistent, so overriding it to be a tmpfs has no benefit. (I mean you _can_... Nobody does, because we're gonna switch_root off of it.)
And once code _is_ running in initramfs, the kernel's internal root= automounter will never run. The initramfs code can parse /proc/cmdline to use the same arguments as the kernel, or it could much more easily use the "any unrecognized arguments get set as environment variables in PID 1" and use ROOT= or similar, like many scripts do.
Modifying kernel code that NEVER RUNS in the case you're pointing out seems silly to me.
That said, the code I wrote is doing a strstr to see if the argument's there, but doesn't care what ELSE is there, so it could easily be "rootfstype=tmpfs,ext4" and have the userspace script also filter the argument for just what it's interested in, since at that point it's NOT THE KERNEL DOING IT.
It's a bit tricky that this particular option, that can support a comma-separated list, is shared between kernel and user space and user space does not already filter-out what is not relevant for it.
Setting the kernel boot command line option rootfstype= to tmpfs or ramfs was possible so far and that's what the documentation and code supported so far as well. The bug surfaced when root= was provided, in which case it was ignored.
No, as I explained when I wrote the initmpfs code in 2013 when you say root= you are explicitly requesting the kernel mount a second file system over rootfs
From the perspective of needing xattr support in initramfs it's unfortunately not so obvious what the filesystem type of the kernel's rootfs (presumably the 1st file system) has to do with the option given for the 2nd filesystem. Though the Debian scripts are the bigger problem it seems. However, for those one could argue that the Debian scripts could be updated and for as long as they are not able to filter-out the tmpfs or ramfs options we are interested in one cannot pass these options or a comma-separated list on systems that run the current Debian scripts.
(that's what root= MEANS), and thus don't bother making it a (more expensive) tmpfs because it's not sticking around.
That's true unless you want to use IMA signature enforcement in the initramfs already and tmpfs is now required.
Stefan
On 12/29/23 13:14, Stefan Berger wrote:>> That said, the code I wrote is doing a strstr to see if the argument's there,
but doesn't care what ELSE is there, so it could easily be "rootfstype=tmpfs,ext4" and have the userspace script also filter the argument for just what it's interested in, since at that point it's NOT THE KERNEL DOING IT.
It's a bit tricky that this particular option, that can support a comma-separated list, is shared between kernel and user space and user space does not already filter-out what is not relevant for it.
Debian's code sometimes has bugs, especially their initramfs stuff doesn't get a lot of scrutiny:
https://lkml.iu.edu/hypermail/linux/kernel/1705.2/05611.html https://lkml.iu.edu/hypermail/linux/kernel/1705.3/01182.html https://lkml.org/lkml/2017/9/13/651#:~:text=Debian%27s
But they're pretty good about fixing bugs pointed out to them:
https://salsa.debian.org/kernel-team/initramfs-tools/-/commit/49e4a0555f51
The kernel having more capabilities here than Debian's userspace does isn't new, it's what gives debian's userspace the opportunity to gain new capabilities.
Although in this case, the patch in question still isn't in lkml 5 years later because Debian development is much more responsive than linux-kernel:
https://lkml.iu.edu/hypermail/linux/kernel/2302.2/05597.html
Setting the kernel boot command line option rootfstype= to tmpfs or ramfs was possible so far and that's what the documentation and code supported so far as well. The bug surfaced when root= was provided, in which case it was ignored.
No, as I explained when I wrote the initmpfs code in 2013 when you say root= you are explicitly requesting the kernel mount a second file system over rootfs
From the perspective of needing xattr support in initramfs it's unfortunately not so obvious what the filesystem type of the kernel's rootfs (presumably the 1st file system) has to do with the option given for the 2nd filesystem. Though the Debian scripts are the bigger problem it seems.
Ping Ben if initramfs-tools needs updating?
I've been following the initramfs xattr support threads forever:
https://lkml.iu.edu/hypermail/linux/kernel/2207.3/06939.html
I'm happy to add new format support to toybox cpio if anybody comes to an agreement on what that should be, but last time there was "as long as we're here 32 bit timestamps" and "sparse file support could be" and various bikeshedding...
Was there a new thread I didn't get cc'd on? The last I have is... July 2022 I think?
However, for those one could argue that the Debian scripts could be updated and for as long as they are not able to filter-out the tmpfs or ramfs options we are interested in one cannot pass these options or a comma-separated list on systems that run the current Debian scripts.
You can argue that current userspace does not take full advantage of the existing kernel API, sure.
(that's what root= MEANS), and thus don't bother making it a (more expensive) tmpfs because it's not sticking around.
That's true unless you want to use IMA signature enforcement in the initramfs already and tmpfs is now required.
I agree that if you want to add a new requirement, you may need to modify userspace.
Let me see if I understand your problem: it sounds like debian's initramfs-tools overloads the root= and rootfstype= arguments parsed by the kernel to have a second meaning (the kernel uses them for one thing, you want to use them for something else, and there's currently a semantic gap between the two.)
You want to add a new capability requiring a new build dependency in the initramfs-tools package because it's doing new stuff, but there cannot be any OTHER changes made to initramfs-tools, so the kernel should change its existing semantics instead.
You can't NOT provide root=, and you can't provide initramfstype=tmpfs... either, and those are the two existing ways to tell rootfs to be tmpfs instead of ramfs. You'd like to add a third way to specify the same thing.
Do I have that right?
Stefan
Rob
On Sat, Dec 30, 2023 at 8:01 PM Rob Landley rob@landley.net wrote:
You want to add a new capability requiring a new build dependency in the
Rob, who are you telling this to? To Stefan? It seems he doesn't propose any further changes. *I* did propose changes (i. e. adding rdrootfstype=), and I already wrote that I will not pursue further
On 12/30/23 18:46, Askar Safin wrote:
On Sat, Dec 30, 2023 at 8:01 PM Rob Landley rob@landley.net wrote:
You want to add a new capability requiring a new build dependency in the
Rob, who are you telling this to? To Stefan? It seems he doesn't propose any further changes. *I* did propose changes (i. e. adding rdrootfstype=), and I already wrote that I will not pursue further
I was trying to make sure we can support the motivating use case, and that there aren't any actual blockers with the current API.
I don't ask things like "Do I have that right?" sarcastically: I am periodically wrong about stuff. You all know your use case(s) better than I do, and I may not have understood your attempt to communicate it to me.
I get a bit exasperated at "this thing that's been there for 10 years could instead have been like this" because I think changing established kernel api is _conceptually_ expensive (either we broke the old one or now we have to support two), but "I still can't figure out how to make X work" means somebody still needs help (up to and including actually motivating an API change, but more commonly some userspace design work and/or a documentation update).
Thank you for clarifying. It looks like we're good here for the moment.
Rob
On Mon, Jan 1, 2024 at 8:42 PM Rob Landley rob@landley.net wrote:
You all know your use case(s) better than I do, and I may not have understood your attempt to communicate it to me.
I don't have any personal use cases at all. I proposed rdrootfstype= by aesthetic reasons.
On 12/30/23 12:08, Rob Landley wrote:
On 12/29/23 13:14, Stefan Berger wrote:>> That said, the code I wrote is doing a strstr to see if the argument's there,
but doesn't care what ELSE is there, so it could easily be "rootfstype=tmpfs,ext4" and have the userspace script also filter the argument for just what it's interested in, since at that point it's NOT THE KERNEL DOING IT.
It's a bit tricky that this particular option, that can support a comma-separated list, is shared between kernel and user space and user space does not already filter-out what is not relevant for it.
Debian's code sometimes has bugs, especially their initramfs stuff doesn't get a lot of scrutiny:
https://lkml.iu.edu/hypermail/linux/kernel/1705.2/05611.html https://lkml.iu.edu/hypermail/linux/kernel/1705.3/01182.html https://lkml.org/lkml/2017/9/13/651#:~:text=Debian%27s
But they're pretty good about fixing bugs pointed out to them:
https://salsa.debian.org/kernel-team/initramfs-tools/-/commit/49e4a0555f51
The kernel having more capabilities here than Debian's userspace does isn't new, it's what gives debian's userspace the opportunity to gain new capabilities.
Although in this case, the patch in question still isn't in lkml 5 years later because Debian development is much more responsive than linux-kernel:
https://lkml.iu.edu/hypermail/linux/kernel/2302.2/05597.html
Setting the kernel boot command line option rootfstype= to tmpfs or ramfs was possible so far and that's what the documentation and code supported so far as well. The bug surfaced when root= was provided, in which case it was ignored.
No, as I explained when I wrote the initmpfs code in 2013 when you say root= you are explicitly requesting the kernel mount a second file system over rootfs
From the perspective of needing xattr support in initramfs it's unfortunately not so obvious what the filesystem type of the kernel's rootfs (presumably the 1st file system) has to do with the option given for the 2nd filesystem. Though the Debian scripts are the bigger problem it seems.
Ping Ben if initramfs-tools needs updating?
I've been following the initramfs xattr support threads forever:
https://lkml.iu.edu/hypermail/linux/kernel/2207.3/06939.html
I'm happy to add new format support to toybox cpio if anybody comes to an agreement on what that should be, but last time there was "as long as we're here 32 bit timestamps" and "sparse file support could be" and various bikeshedding...
Was there a new thread I didn't get cc'd on? The last I have is... July 2022 I think?
However, for those one could argue that the Debian scripts could be updated and for as long as they are not able to filter-out the tmpfs or ramfs options we are interested in one cannot pass these options or a comma-separated list on systems that run the current Debian scripts.
You can argue that current userspace does not take full advantage of the existing kernel API, sure.
(that's what root= MEANS), and thus don't bother making it a (more expensive) tmpfs because it's not sticking around.
That's true unless you want to use IMA signature enforcement in the initramfs already and tmpfs is now required.
I agree that if you want to add a new requirement, you may need to modify userspace.
Let me see if I understand your problem: it sounds like debian's initramfs-tools overloads the root= and rootfstype= arguments parsed by the kernel to have a second meaning (the kernel uses them for one thing, you want to use them for something else, and there's currently a semantic gap between the two.)
My intention is to be able to pass rootfstype= to the kernel and have it interpreted correctly in the presence of root=, which currently does not work. User space tools that interpret the value of rootfstype= as if this option belonged to user space is not helpful, though it should be easy to teach the user space scripts to strip a leading 'tmpfs,' or 'ramfs,' from the rootfstype value and let them interpret the rest.
You want to add a new capability requiring a new build dependency in the initramfs-tools package because it's doing new stuff, but there cannot be any OTHER changes made to initramfs-tools, so the kernel should change its existing semantics instead.
I haven't even thought of what would need to be added to Debian's initramfs-tools package since my primary goal was to enable tmpfs for the initramfs on OpenBMC where we then read the xattr values from a file and write them into the filesystem because cpio format doesn't carry them. Also, I didn't expect that any user space tools would try to handle a kernel command line option as if it was theirs.
You can't NOT provide root=, and you can't provide initramfstype=tmpfs...
I only know about rootfstype= ( https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128 ). If currently handling of rootfstype= in presence of root= is not considered a bug and we should introduce initramfstype= instead, we could do that. But doesn't this become a bit confusing if rootfstype= can be passed when root= is absent but then initramfstype= must be used when root= is present?
This is 'our' patch describing the issue: https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128
either, and those are the two existing ways to tell rootfs to be tmpfs instead of ramfs. You'd like to add a third way to specify the same thing.
Do you have a link to initramfstype= handling in kernel code?
Stefan
Do I have that right?
Stefan
Rob
Hi, Stefan. Hi, Rob.
My intention is to be able to pass rootfstype= to the kernel and have it interpreted correctly in the presence of root=, which currently does not work
Stefan, let me repeat: your patch, which does exactly this, got to linux-next: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?...
I think there is some misunderstanding here. First, there is Stefan's patch (based on Rob's patch), which makes the kernel parse rootfstype= even if root= is present. As well as I understand, we all agree that this patch is needed, and it was applied to linux-next, so all is good!
Second, there was my suggestion to introduce rdrootfstype= option. You both reacted sceptically, so I withdraw this suggestion
On 12/31/23 10:03, Stefan Berger wrote:
Let me see if I understand your problem: it sounds like debian's initramfs-tools overloads the root= and rootfstype= arguments parsed by the kernel to have a second meaning (the kernel uses them for one thing, you want to use them for something else, and there's currently a semantic gap between the two.)
My intention is to be able to pass rootfstype= to the kernel and have it interpreted correctly in the presence of root=, which currently does not work. User space tools that interpret the value of rootfstype= as if this option belonged to user space is not helpful, though it should be easy to teach the user space scripts to strip a leading 'tmpfs,' or 'ramfs,' from the rootfstype value and let them interpret the rest.
Does your initramfs plumbing need to pass a rootfstype equivalent on to the userspace mount at all? In what cases does it not autodetect the type correctly?
(Even NFS and SMB mounts are generally detectable because of the leading \ or blah: although I suppose there are other network filesystem types that wouldn't be. Or if you wanted to micromanage the fat variant you were using...)
"rootfstype=" is the argument that tells the _kernel_ how to mount / and by the time init runs the kernel's already mounted what it's going to mount. The kernel only exposes one visible / mount to userspace, you don't return back into it and get another init launched running in a different root filesystem.
You want to add a new capability requiring a new build dependency in the initramfs-tools package because it's doing new stuff, but there cannot be any OTHER changes made to initramfs-tools, so the kernel should change its existing semantics instead.
I haven't even thought of what would need to be added to Debian's initramfs-tools package since my primary goal was to enable tmpfs for the initramfs on OpenBMC where we then read the xattr values from a file and write them into the filesystem because cpio format doesn't carry them.
Me, I'd have a simple initramfs extract/decrypt a tarball with the filesystem that needs xattr values into a new tmpfs mount and switch_root to that. But I tend to statically link an initramfs into the kernel image when I want to be sure what it's running, and have never quite been clear on the benefit of _additionally_ verifying data that originates from within the kernel image. (If they can change that, they can change ring 0 code.)
Still, adding xattr support to cpio comes up a lot. It seems like a couple days work tops, maybe the interested parties should do a video conference thingy, hammer out the details, and come up with a patch to add support? The userspace side sounds easy enough, I added xattr support to toybox tar in 2021 in a weekend, and have sent "would you like to keep up with toybox" patches at the busybox guys semi-regularly.
I even poked coreutils about feature parity once (the Android guys asked me to), which they said they would like to add, but which but still isn't in years later:
https://lists.gnu.org/archive/html/coreutils/2023-08/msg00009.html https://lists.gnu.org/archive/html/coreutils/2023-08/msg00100.html
But eh, I'm used to that with 30 year old projects licensed under copyleft...
Also, I didn't expect that any user space tools would try to handle a kernel command line option as if it was theirs.
Debian predates the 1.0 kernel release, so has some historical design baggage. That's why it's I tend to check them for snags in this area.
You can't NOT provide root=, and you can't provide initramfstype=tmpfs...
I only know about rootfstype= ( https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128 ). If currently handling of rootfstype= in presence of root= is not considered a bug and we should introduce initramfstype= instead, we could do that. But doesn't this become a bit confusing if rootfstype= can be passed when root= is absent but then initramfstype= must be used when root= is present?
I personally think having two would be confusing, and changing the existing API without adding new capabilities is pointless.
This is 'our' patch describing the issue: https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128
either, and those are the two existing ways to tell rootfs to be tmpfs instead of ramfs. You'd like to add a third way to specify the same thing.
Do you have a link to initramfstype= handling in kernel code?
No, it's never done that. There was a suggestion to do that earlier in this thread:
https://lkml.iu.edu/hypermail/linux/kernel/2312.2/07060.html
And I thought it was a bad idea. The submitter agreed it was a bad idea. (Over the holidays I've haven't been paying close attention and threads tend to bleed together, sorry. :)
The answer to my "do I have this right" question was, apparently, "no". I mixed together what two different people wanted...
Rob
On 1/1/24 13:50, Rob Landley wrote:
On 12/31/23 10:03, Stefan Berger wrote:
Let me see if I understand your problem: it sounds like debian's initramfs-tools overloads the root= and rootfstype= arguments parsed by the kernel to have a second meaning (the kernel uses them for one thing, you want to use them for something else, and there's currently a semantic gap between the two.)
My intention is to be able to pass rootfstype= to the kernel and have it interpreted correctly in the presence of root=, which currently does not work. User space tools that interpret the value of rootfstype= as if this option belonged to user space is not helpful, though it should be easy to teach the user space scripts to strip a leading 'tmpfs,' or 'ramfs,' from the rootfstype value and let them interpret the rest.
Does your initramfs plumbing need to pass a rootfstype equivalent on to the userspace mount at all? In what cases does it not autodetect the type correctly?
The only change I needed was to have tmpfs used for the initramfs to enable xattrs. No other changes were needed in my case (OpenBMC/Yocto).
(Even NFS and SMB mounts are generally detectable because of the leading \ or blah: although I suppose there are other network filesystem types that wouldn't be. Or if you wanted to micromanage the fat variant you were using...)
"rootfstype=" is the argument that tells the _kernel_ how to mount / and by the time init runs the kernel's already mounted what it's going to mount. The kernel only exposes one visible / mount to userspace, you don't return back into it and get another init launched running in a different root filesystem.
You want to add a new capability requiring a new build dependency in the initramfs-tools package because it's doing new stuff, but there cannot be any OTHER changes made to initramfs-tools, so the kernel should change its existing semantics instead.
I haven't even thought of what would need to be added to Debian's initramfs-tools package since my primary goal was to enable tmpfs for the initramfs on OpenBMC where we then read the xattr values from a file and write them into the filesystem because cpio format doesn't carry them.
Me, I'd have a simple initramfs extract/decrypt a tarball with the filesystem that needs xattr values into a new tmpfs mount and switch_root to that. But I tend to statically link an initramfs into the kernel image when I want to be sure what it's running, and have never quite been clear on the benefit of _additionally_ verifying data that originates from within the kernel image. (If they can change that, they can change ring 0 code.)
Still, adding xattr support to cpio comes up a lot. It seems like a couple days
Let's see where we can take this next now that we will have xattr support via tmpfs for the initramfs.
Stefan
work tops, maybe the interested parties should do a video conference thingy, hammer out the details, and come up with a patch to add support? The userspace side sounds easy enough, I added xattr support to toybox tar in 2021 in a weekend, and have sent "would you like to keep up with toybox" patches at the busybox guys semi-regularly.
I even poked coreutils about feature parity once (the Android guys asked me to), which they said they would like to add, but which but still isn't in years later:
https://lists.gnu.org/archive/html/coreutils/2023-08/msg00009.html https://lists.gnu.org/archive/html/coreutils/2023-08/msg00100.html
But eh, I'm used to that with 30 year old projects licensed under copyleft...
Also, I didn't expect that any user space tools would try to handle a kernel command line option as if it was theirs.
Debian predates the 1.0 kernel release, so has some historical design baggage. That's why it's I tend to check them for snags in this area.
You can't NOT provide root=, and you can't provide initramfstype=tmpfs...
I only know about rootfstype= ( https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128 ). If currently handling of rootfstype= in presence of root= is not considered a bug and we should introduce initramfstype= instead, we could do that. But doesn't this become a bit confusing if rootfstype= can be passed when root= is absent but then initramfstype= must be used when root= is present?
I personally think having two would be confusing, and changing the existing API without adding new capabilities is pointless.
This is 'our' patch describing the issue: https://github.com/torvalds/linux/blob/master/init/do_mounts.c#L128
either, and those are the two existing ways to tell rootfs to be tmpfs instead of ramfs. You'd like to add a third way to specify the same thing.
Do you have a link to initramfstype= handling in kernel code?
No, it's never done that. There was a suggestion to do that earlier in this thread:
https://lkml.iu.edu/hypermail/linux/kernel/2312.2/07060.html
And I thought it was a bad idea. The submitter agreed it was a bad idea. (Over the holidays I've haven't been paying close attention and threads tend to bleed together, sorry. :)
The answer to my "do I have this right" question was, apparently, "no". I mixed together what two different people wanted...
Rob
On Sat, Dec 30, 2023 at 8:01 PM Rob Landley rob@landley.net wrote:
I've been following the initramfs xattr support threads forever:
Here is my proposal: add to the kernel support for catar ( https://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.... ) in addition to cpio. catar has the following advantages:
- catar is simple and reproducible. For the same directory tree the same bit-precise catar file is generated, which is good for cryptographic signatures. As opposed to tar's monstrosity ( https://www.cyphar.com/blog/post/20190121-ociv2-images-i-tar ) - catar has support for xattr. It has support for nearly all types of metainformation Linux offers (32 bit UIDs, nanosecond timestamps, "disable CoW" flag and various other flags, selinux file labels, file capabilities, etc). All this metainformation can be disabled if needed. So, next time we will want to add some new type of metainformation, there will be no need for lengthy discussions about how it should be stored. All needed metainformation is already supported
-- Askar Safin
On 1/4/24 00:06, Askar Safin wrote:
On Sat, Dec 30, 2023 at 8:01 PM Rob Landley rob@landley.net wrote:
I've been following the initramfs xattr support threads forever:
Here is my proposal: add to the kernel support for catar ( https://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.... ) in addition to cpio. catar has the following advantages:
Didn't we just have a thread about the inadvisability of adding more ways to do the same thing, with each existing codepath still having to be supported forever?
And your solution is a link to the website of Lenart Pottering, author and maintainer of systemd. You want to put systemd code into the kernel.
- catar is simple and reproducible. For the same directory tree the
same bit-precise catar file is generated, which is good for cryptographic signatures.
I can trivially reproduce the same cpio each time from the same tree, the trick is just fetching the whole directory and sorting it before processing (to squelch hash-impacted readdir() return order from the filesystem).
As opposed to tar's monstrosity ( https://www.cyphar.com/blog/post/20190121-ociv2-images-i-tar )
- catar has support for xattr.
Adding xattr support to my toybox cpio implementation is maybe 10 lines of C, I assume the other implementations aren't that much more tangled. The question was always a largely aesthetic issue of file format.
Tar is NOT well-defined. I say this as someone who has IMPLEMENTED tar and has a pending TODO item to patch up YET ANOTHER funky corner case du jour somebody hit:
https://github.com/landley/toybox/issues/469
Inventing a NEW file format... there are multiple dozens already: zip, arj, lharc, zoo... You could theoretically extract squashfs into initramfs because technically it's an archive format.
Good grief, there's an xkcd on that:
It has support for nearly all types of
"nearly"
Please no.
metainformation Linux offers (32 bit UIDs, nanosecond timestamps, "disable CoW" flag and various other flags, selinux file labels, file capabilities, etc). All this metainformation can be disabled if needed. So, next time we will want to add some new type of metainformation, there will be no need for lengthy discussions about how it should be stored.
There's no real need NOW. "We did not come to an agreement in email" does not mean "let's add a new unrelated format". It means let's carve out an hour to actually speak to each other, by voice and maybe video, using this thing called the internet. (Without covid we'd have had a BOF at some conference by now.)
Rob
On Fri, Dec 29, 2023 at 9:29 PM Rob Landley rob@landley.net wrote:
Rob, do you agree? Stefan, do you agree? Then I will write a patch,
I think it's a bad idea.
Okay, I agree.
linux-stable-mirror@lists.linaro.org