On Tue, Jul 6, 2021 at 8:31 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
On Tue, Jul 06, 2021 at 07:35:55PM +0200, Daniel Vetter wrote:
Yup. We don't care about any of the fancy pieces you build on top, nor does the compiler need to be the optimizing one. Just something that's good enough to drive the hw in some demos to see how it works and all that. Generally that's also not that hard to reverse engineer, if someone is bored enough; the real fancy stuff tends to be in how you optimize the generated code. And make it fit into the higher levels properly.
Seems reasonable to me
And it's not just nvidia, it's pretty much everyone. Like a soc company I don't want to know started collaborating with upstream and the reverse-engineered mesa team on a kernel driver, seems to work pretty well for current hardware.
What I've seen is that this only works with customer demand. Companies need to hear from their customers that upstream is what is needed, and companies cannot properly hear that until they are at least already partially invested in the upstream process and have the right customers that are sophisticated enough to care.
Embedded makes everything 10x worse because too many customers just don't care about upstream, you can hack your way through everything, and indulge in single generation thinking. Fork the whole kernel for 3 years, EOL, no problem!
It's not entirely hopeless in embedded either. Sure there's the giant pile of sell&forget abandonware, but there are lots of embedded things where multi-year to multi-decade support is required. And an upstream gfx stack beats anything the vendor has to offer on that, easily.
And on the server side it's actually pretty hard to convince customers of the upstream driver benefits, because they don't want to or can't abandon nvidia and have just learned to accept the pain. They either build a few abstraction layers on top (and demand the vendor support those), or they flat out demand you support the nvidia proprietary interfaces. And AMD has been trying to move the needle here for years, with not that much success.
It is the enterprise world, particularly with an opinionated company like RH saying NO stuck in the middle that really seems to drive things toward upstream.
Yes, vendors can work around Red Hat's No (and NVIDIA GPU is such an example) but it is incredibly time consuming, expensive and becoming more and more difficult every year.
The big point is this:
But also nvidia is never going to sell you that as the officially supported thing, unless your ask comes back with enormous amounts of sold hardware.
I think this is at the core of Linux's success in the enterprise world: big customers who care demanding open source. Any vendor, even nvidia, will want to meet customer demands.
IMHO upstream success is found by motivating the customer to demand it and making it "easy" for the vendor to supply it.
Yup, exactly the same situation here. The problem seems to be that gpu vendor stubbornness is higher than even established customer demand, or they just don't care. So in the last few years that customer demand has resulted in payments to consulting shops and hiring of engineers to reverse-engineer a full driver, instead of customer and vendor splitting the difference and the vendor upstreaming their stack. And that's for companies who've done it in the past, or at least collaborated on parts like the kernel driver, so I really have no clue why they don't just continue. We have well-established customers who do want it all open and upstream, across kernel and userspace pieces.
And it looks like it's going to repeat itself a few more times unfortunately. I'm not sure when exactly the lesson will sink in.
Maybe I missed some, but looking at current render/compute drivers I think (but not even sure on that) only drm/lima is a hobbyist project, and perhaps you want to include drm/nouveau as not paid by customers and more something Red Hat does out of principle. All the others are paid for by customers, with vendor involvement ranging from "just helping out with the kernel driver" to "pays for pretty much all of the development". And still apparently that's not enough demand for an upstream driver stack.
-Daniel