On Mon, Jul 05, 2021 at 04:03:12PM +0300, Oded Gabbay wrote:
Hi, I'm sending v4 of this patch-set following the long email thread. I want to thank Jason for reviewing v3 and pointing out the errors, saving us time later to debug it :)
I consulted with Christian on how to fix patch 2 (the implementation) and at the end of the day I shamelessly copied the relevant content from amdgpu_vram_mgr_alloc_sgt() and amdgpu_dma_buf_attach(), regarding the usage of dma_map_resource() and pci_p2pdma_distance_many(), respectively.
I also made a few improvements after looking at the relevant code in amdgpu. The details are in the changelog of patch 2.
I took the time to write import code for the driver, which allowed me to check real P2P with two Gaudi devices, one as exporter and the other as importer. I'm not going to include the import code in the product; it was just for testing purposes (although I can share it if anyone wants).
I ran it in a bare-metal environment with IOMMU enabled, on a Skylake CPU with a white-listed PCIe bridge (to keep pci_p2pdma_distance_many() happy).
Greg, I hope this will be good enough for you to merge this code.
So we're officially going to use dri-devel for technical details review and then Greg for merging so we don't have to deal with other merge criteria dri-devel folks have?
I don't expect anything less by now, but it does make the original claim that drivers/misc will not step all over accelerators folks a complete farce under the totally-not-a-gpu banner.
This essentially means that for any other accelerator stack that doesn't fit the dri-devel merge criteria, even if it's acting like a gpu and uses other gpu driver stuff, you can just send it to Greg and it's good to go.
There's quite a lot of these floating around actually (and many do have semi-open runtimes, like habanalabs have now too, just not open enough to be actually useful). It's going to be absolutely lovely having to explain to these companies in background chats why habanalabs gets away with their stack and they don't.
Or maybe we should just merge them all and give up on the idea of having open cross-vendor driver stacks for these accelerators.
Thanks, Daniel
Thanks, Oded
Oded Gabbay (1):
  habanalabs: define uAPI to export FD for DMA-BUF
Tomer Tayar (1):
  habanalabs: add support for dma-buf exporter
 drivers/misc/habanalabs/Kconfig             |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  26 ++
 drivers/misc/habanalabs/common/memory.c     | 480 +++++++++++++++++++-
 drivers/misc/habanalabs/gaudi/gaudi.c       |   1 +
 drivers/misc/habanalabs/goya/goya.c         |   1 +
 include/uapi/misc/habanalabs.h              |  28 +-
 6 files changed, 532 insertions(+), 5 deletions(-)
-- 2.25.1
On Tue, Jul 06, 2021 at 10:40:37AM +0200, Daniel Vetter wrote:
FYI, I fully agree with Daniel here. Habanalabs needs to open up their runtime if they want to push any additional features into the kernel. The current situation is not sustainable.
On Tue, Jul 06, 2021 at 02:21:10PM +0200, Christoph Hellwig wrote:
Before anyone replies: The runtime is open, the compiler is still closed. This has become the new default for accel driver submissions, I think mostly because all the interesting bits for non-3d accelerators are in the accel ISA, and no longer in the runtime. So vendors are fairly happy to throw in the runtime as a freebie.
It's still incomplete, and it's still useless if you want to actually hack on the driver stack. -Daniel
On 06.07.21 at 14:23, Daniel Vetter wrote:
Well, a compiler and runtime make things easier, but the real question is whether they are really required for upstreaming a kernel driver.
I mean what we need is to be able to exercise the functionality. So wouldn't (for example) an assembler be sufficient?
It's still incomplete, and it's still useless if you want to actually hack on the driver stack.
Yeah, when you want to hack on it in the sense of extending it then this requirement is certainly true.
But as far as I can see, userspace doesn't need to be extendable to justify a kernel driver. It just needs to have enough glue to thoughtfully exercise the relevant kernel interfaces.
Applying that to GPUs, I think what you need is to be able to write shaders, but that doesn't need to be in a higher-level language requiring a compiler and runtime. Released opcodes and a low-level assembler should be sufficient.
Regards, Christian.
On Wed, Jul 7, 2021 at 2:17 PM Christian König ckoenig.leichtzumerken@gmail.com wrote:
Well a compiler and runtime makes things easier, but the real question is if they are really required for upstreaming a kernel driver?
I mean what we need is to be able to exercise the functionality. So wouldn't (for example) an assembler be sufficient?
So no one has tried this yet, but I think an assembler, or maybe even just the full PRM for the ISA, is also good enough.
I guess in practice everyone just comes with the compiler for a few reasons:
- AMD and Intel are great and release full PRMs for the gpu, but preparing those takes a lot of time. Often that's done as part of bring-up, to make sure everything is annotated properly, so that all the necessary bits are included, but none of the future stuff or silicon bring-up pieces. So in reality you have the compiler before you have the ISA docs.
- reverse-engineered drivers also tend to have demo compilers before anything like full ISA docs show up :-) But the docs tooling they have is also great.
- then there's the case of developing a driver with NDA'd docs. Again you'll have a compiler as the only real output; there's not going to be any docs or anything like that.
It's still incomplete, and it's still useless if you want to actually hack on the driver stack.
Yeah, when you want to hack on it in the sense of extending it then this requirement is certainly true.
But as far as I can see, userspace doesn't need to be extendable to justify a kernel driver. It just needs to have enough glue to thoughtfully exercise the relevant kernel interfaces.
Applying that to GPUs, I think what you need is to be able to write shaders, but that doesn't need to be in a higher-level language requiring a compiler and runtime. Released opcodes and a low-level assembler should be sufficient.
Yeah I think in theory ISA docs + assembler testcase or whatever is perfectly fine. In reality anyone who cares enough to do this properly gets to the demo quality compiler stage first, and so that's what we take for merging a new stack.
I do disagree that we're only ever asking for this and not more, e.g. if you come with a new 3d accelerator and it's not coming with a userspace driver as a mesa MR, you have to do some very serious explaining about wtf you're doing. mesa3d won, pretty much across the board, as a common project for both vulkan and opengl, and the justifications for reinventing wheels better be really good here. Also, by the time you've written enough scaffolding to show it integrates in non-stupid ways into mesa, you practically have a demo-quality driver stack anyway.
Similarly on the display side of things, over the past year the consensus on merge criteria has gone up quite a bit, e.g. there's a patch floating around to make that clearer:
https://lore.kernel.org/dri-devel/20210706161244.1038592-1-maxime@cerno.tech...
Of course this doesn't include anything grandfathered in (*cough* amdvlk *cough*), and also outside of 3d there's clearly no cross-vendor project that's established enough: media, compute, and AI/NN stuff are all very badly fragmented. That's maybe lamentable, but like you said not really a reason to reject a kernel driver. -Daniel
linaro-mm-sig@lists.linaro.org