On Tue, Jul 06, 2021 at 07:31:37PM +0200, Christoph Hellwig wrote:
On Tue, Jul 06, 2021 at 02:28:28PM -0300, Jason Gunthorpe wrote:
Also on your claim that drivers/gpu is a non-upstream disaster: I've also learned that for drivers/rdma there's the upstream driver, and then there's the out-of-tree hackjob the vendor actually supports.
In the enterprise world everyone has their out-of-tree backport drivers. How much deviation there is from the upstream driver, and what commercial support relationship the vendor has with the enterprise distros, varies by vendor.
I think he means the Mellanox OFED stack, which is a complete and utter mess and which gets force-fed by Mellanox/Nvidia to unsuspecting customers. I know many big HPC sites that ignore it, but a lot of enterprise customers are dumb enough to deploy it.
No, I don't think so. While MOFED is indeed a giant mess, the mlx5 upstream driver is not some token effort to generate goodwill, and Mellanox certainly does provide full commercial support for the mlx5 drivers shipped inside various enterprise distros.
MOFED also doesn't have a big functional divergence from RDMA upstream, and it is not mandatory just to use the hardware.
I cannot say the same about other companies' RDMA driver distributions; Daniel's description of "minimal effort to get goodwill" would match others much better.
You are right that there are a lot of enterprise customers who deploy the MOFED. I can't agree with their choices, but they are not forced into using it anymore.
Jason