Re: [Linaro-mm-sig] [PATCH v4 0/2] Add p2p via dmabuf to habanalabs

6 Jul 2021


      I should stop typing and prep dinner, but I found some too hilarious
typos below.
On Tue, Jul 6, 2021 at 7:35 PM Daniel Vetter daniel.vetter@ffwll.ch wrote:
...
On Tue, Jul 6, 2021 at 6:29 PM Jason Gunthorpe jgg@ziepe.ca wrote:
...
On Tue, Jul 06, 2021 at 05:49:01PM +0200, Daniel Vetter wrote:
...
The other thing to keep in mind is that one of these drivers supports
25 years of product generations, and the other one doesn't.
Sure, but that is the point, isn't it? To have an actually useful
thing you need all of this mess
...
...
My argument is that an in-tree open kernel driver is a big help to
reverse engineering an open userspace. Having the vendors
collaboration to build that monstrous thing can only help the end goal
of an end to end open stack.
Not sure where this got lost, but we're totally fine with vendors
using the upstream driver together with their closed stack. And most
of the drivers we do have in upstream are actually, at least in parts,
supported by the vendor. E.g. if you'd have looked the drm/arm driver
you picked is actually 100% written by ARM engineers. So kinda
unfitting example.
So the argument with Habana really boils down to how much do they need
to show in the open source space to get a kernel driver? You want to
see the ISA or compiler at least?
Yup. We dont care about any of the fancy pieces you build on top, nor
does the compiler need to be the optimizing one. Just something that's
good enough to drive the hw in some demons to see how it works and all
s/demons/demos/ but hw tends to be funky enough that either fits :-)
...
that. Generally that's also not that hard to reverse engineer, if
someone is bored enough, the real fancy stuff tends to be in how you
optimize the generated code. And make it fit into the higher levels
properly.
...
That at least doesn't seem "extreme" to me.
...
...
For instance a vendor with an in-tree driver has a strong incentive to
sort out their FW licensing issues so it can be redistributed.
Nvidia has been claiming to try and sort out the FW problem for years.
They even managed to release a few things, but I think the last one is
2-3 years late now. Partially the reason is that there don't have a
stable api between the firmware and driver, it's all internal from the
same source tree, and they don't really want to change that.
Right, companies have no incentive to work in a sane way if they have
their own parallel world. I think drawing them part by part into the
standard open workflows and expectations is actually helpful to
everyone.
Well we do try to get them on board part-by-part generally starting
with the kernel and ending with a proper compiler instead of the usual
llvm hack job, but for whatever reasons they really like their
in-house stuff, see below for what I mean.
...
...
...
...
I don't think the facts on the ground support your claim here, aside
from the practical problem that nvidia is unwilling to even create an
open driver to begin with. So there isn't anything to merge.
The internet tells me there is nvgpu, it doesn't seem to have helped.
Not sure which one you mean, but every once in a while they open up a
few headers, or a few programming specs, or a small driver somewhere
for a very specific thing, and then it dies again or gets obfuscated
for the next platform, or just never updated. I've never seen anything
that comes remotely to something complete, aside from tegra socs,
which are fully supported in upstream afaik.
I understand nvgpu is the tegra driver that people actualy
use. nouveau may have good tegra support but is it used in any actual
commercial product?
I think it was almost the case. Afaik they still have their internal
userspace stack working on top of nvidia, at least last year someone
fixed up a bunch of issues in the tegra+nouveau combo to enable format
modifiers properly across the board. But also nvidia is never going to
sell you that as the officially supported thing, unless your ask comes
back with enormous amounts of sold hardware.
And it's not just nvidia, it's pretty much everyone. Like a soc
company I don't want to know started collaborating with upstream and
s/know/name/ I do know them unfortunately quite well ...
Cheers, Daniel
...
the reverse-engineered mesa team on a kernel driver, seems to work
pretty well for current hardware. But for the next generation they
decided it's going to be again only their in-house tree that
completele ignores drivers/gpu/drm, and also tosses all the
foundational work they helped build on the userspace side. And this is
consistent across all companies, over the last 20 years I know of
(often non-public) stories across every single company where they
decided that all the time invested into community/upstream
collaboration isn't useful anymore, we go all vendor solo for the next
one.
Most of those you luckily don't hear about anymore, all it results in
the upstream driver being 1-2 years late or so. But even the good ones
where we collaborate well can't seem to help themselves and want to
throw it all away every few years.
-Daniel
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [Linaro-mm-sig] [PATCH v4 0/2] Add p2p via dmabuf to habanalabs

-Daniel