On Wed, Nov 21, 2018 at 05:20:55PM -0800, Andrew Morton wrote:
> On Tue, 20 Nov 2018 15:12:49 -0800 Dan Williams <dan.j.williams@intel.com> wrote:
> [...]
> > I am also concerned that HMM was designed in a way to minimize further engagement with the core-MM. That, with these hooks in place, device-drivers are free to implement their own policies without much consideration for whether and how the core-MM could grow to meet that need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the core-MM should be allowed the opportunity and stimulus to change and address these new use cases as first class functionality.
>
> The arguments are compelling. I apologize for not thinking of and/or not being made aware of them at the time.
So I wanted to comment on that part. Yes, HMM is an impedance layer that works both ways, i.e. device drivers are shielded from core mm, and core mm folks do not need to understand individual drivers to modify mm; they only need to understand what HMM provides to the driver (and keep HMM's promises intact from the driver's POV, no matter how that is achieved). So this is intentional.
Nonetheless I want to grow core mm involvement in managing that memory (see the patchset I just posted about hbind() and heterogeneous memory systems). But I do not expect that core mm will be in full control, at least not for some time. The historical reason is that devices like GPUs are not only used for compute (which is where HMM gets used) but also for graphics (a simple desktop or even games). Those are two different workloads using different APIs (CUDA/OpenCL for compute, OpenGL/Vulkan for graphics) on the same underlying hardware.
Those APIs expose the hardware in incompatible ways when it comes to memory management (especially APIs like Vulkan). Managing memory page-wise is not well suited to graphics. The issue comes from the fact that we do not want to exclude either workload from running concurrently (running your desktop while some compute job runs in the background). So for this to work we need to keep the device driver in control of its memory (hence the callback when pages are freed, for instance). We also need to forbid things like pinning any device memory pages ...
I still expect some commonality to emerge across different hardware so that we can grow more things and share more code in core mm, but I want to get there organically, not by forcing everyone into a design today. I expect this will happen by starting from the high-level concepts, how things get used in userspace from the end user's POV, and working backward from there to see what common API (if any) we can provide to cater to those common use cases.
Cheers,
Jérôme