It's true, but it's also true that a lot of these algorithms can be tuned at build time. We have various memory allocators, tiny RCU, the ability to disable SMP, we can even tune certain filesystems to use more or fewer buffers, etc. It's not that bad at all and I'm not sure that many other OSes have this level of flexibility.
It's not tuning the algorithms that is the problem. The problem is the choice of algorithm. RCU is a lovely example. The correct solution for a small single or dual CPU device is not to have RCU in the first place. Our tty layer is another - it's about ten times the size it needs to be because it has to handle all sorts of crazy stuff you don't need on a router (to the point we now have an optional second 'tty' layer choice).
Do we want to do that with everything though - a dumb scheduler alternative (the one in Linux 1.2 is actually really good for a little uniprocessor), a new VFS, a simple block layer?
Likewise on a low memory embedded router you don't need the scheduler logic we have, you don't need the VFS design Linux uses, you don't want the dcache, you don't want most of the disk optimisations, the tty layer and so on.
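For a sense of how little a "dumb" uniprocessor scheduler needs, here is a rough sketch in the spirit of the early counter-based schedulers. It is not code from any kernel release: struct task, pick_next() and tick() are made-up names for illustration, and it assumes counters and priorities are never negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record for illustration; not the kernel's task_struct. */
struct task {
    int runnable;   /* non-zero if the task can run */
    int counter;    /* ticks left in the current time slice */
    int priority;   /* ticks granted at each refill */
};

/*
 * Pick the runnable task with the most ticks left. If every runnable
 * task has used up its slice, refill all counters once and look again.
 * Returns an index into tasks[], or -1 if nothing can be scheduled.
 */
int pick_next(struct task *tasks, size_t n)
{
    assert(tasks != NULL || n == 0);

    for (int pass = 0; pass < 2; pass++) {
        int best = -1;
        int best_counter = 0;
        int any_runnable = 0;

        for (size_t i = 0; i < n; i++) {
            if (!tasks[i].runnable)
                continue;
            any_runnable = 1;
            if (tasks[i].counter > best_counter) {
                best_counter = tasks[i].counter;
                best = (int)i;
            }
        }
        if (best >= 0)
            return best;
        if (!any_runnable)
            return -1;

        /* Refill: tasks that were not runnable keep half of their unused ticks. */
        for (size_t i = 0; i < n; i++)
            tasks[i].counter = tasks[i].counter / 2 + tasks[i].priority;
    }
    return -1;
}

/* Charge one timer tick to the task that is currently running. */
void tick(struct task *t)
{
    if (t->counter > 0)
        t->counter--;
}
```

One linear scan and one refill loop is the whole policy. That is O(n) per decision, which is fine for a handful of tasks on one CPU and exactly what stops scaling on bigger machines.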
384kB were lost and not remappable by then). It was unbeatable for this purpose, though I'm not aiming at making this possible nowadays, most machines have at least a bit more RAM :-)
Memory costs power so there are pressures in both directions.
Alan