4.20-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams <dan.j.williams@intel.com>
commit 02917e9f8676207a4c577d4d94eae12bf348e9d7 upstream.
At Maintainer Summit, Greg brought up a topic I proposed around EXPORT_SYMBOL_GPL usage. The motivation was to discuss when EXPORT_SYMBOL_GPL is warranted and the criteria for taking the exceptional step of reclassifying an existing export. Specifically, I wanted to make the case that although the line is fuzzy and hard to specify in abstract terms, it is nonetheless clear that devm_memremap_pages() and HMM (Heterogeneous Memory Management) have crossed it. The devm_memremap_pages() facility should have been EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that functionality should have naturally picked up that designation as well.
Contrary to typical rules, the HMM infrastructure was merged upstream with zero in-tree consumers. There was a promise at the time that those users would be merged "soon", but it has been over a year with no drivers arriving. While the Nouveau driver is about to belatedly make good on that promise, it is clear that HMM was targeted first and foremost at an out-of-tree consumer.
HMM is derived from devm_memremap_pages(), a facility Christoph and I spearheaded to support persistent memory. It combines a device lifetime model with a dynamically created 'struct page' / memmap array for any physical address range. It enables coordination and control of the many code paths in the kernel built to interact with memory via 'struct page' objects. With HMM the integration goes even deeper by allowing device drivers to hook and manipulate page fault and page free events.
One interpretation of when EXPORT_SYMBOL is suitable is when it is exporting stable and generic leaf functionality. The devm_memremap_pages() facility continues to see expanding use cases, peer-to-peer DMA being the most recent, with no clear end date when it will stop attracting reworks and semantic changes. It is not suitable to export devm_memremap_pages() as a stable 3rd party driver API due to the fact that it is still changing and manipulates core behavior. Moreover, it is not in the best interest of the long term development of the core memory management subsystem to permit any external driver to effectively define its own system-wide memory management policies with no encouragement to engage with upstream.
I am also concerned that HMM was designed in a way to minimize further engagement with the core-MM. With these hooks in place, device drivers are free to implement their own policies without much consideration for whether and how the core-MM could grow to meet that need. Going forward, not only should HMM be EXPORT_SYMBOL_GPL, but the core-MM should be allowed the opportunity and stimulus to change and address these new use cases as first class functionality.
Original changelog:
hmm_devmem_add() and hmm_devmem_add_resource() duplicated devm_memremap_pages() and are now simple wrappers around the core facility to inject a dev_pagemap instance into the global pgmap_radix and hook page-idle events. The devm_memremap_pages() interface is base infrastructure for HMM. HMM has more and deeper ties into the kernel memory management implementation than base ZONE_DEVICE, which is itself an EXPORT_SYMBOL_GPL facility.
Originally, the HMM page structure creation routines copied the devm_memremap_pages() code and reused ZONE_DEVICE. A cleanup to unify the implementations was discussed during the initial review: http://lkml.iu.edu/hypermail/linux/kernel/1701.2/00812.html Recent work to extend devm_memremap_pages() for the peer-to-peer-DMA facility enabled this cleanup to move forward.
In addition to the integration with devm_memremap_pages() HMM depends on other GPL-only symbols:
    mmu_notifier_unregister_no_release
    percpu_ref
    region_intersects
    __class_create
It goes further to consume / indirectly expose functionality that is not exported to any other driver:
    alloc_pages_vma
    walk_page_range
HMM is derived from devm_memremap_pages(), and extends deep core-kernel fundamentals. Similar to devm_memremap_pages(), mark its entry points EXPORT_SYMBOL_GPL().
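As a rough illustration of what the reclassification means for a consumer, the following user-space sketch models the general rule that a GPL-only export resolves only for modules declaring a GPL-compatible license. It is a toy under stated assumptions: struct toy_export, toy_exports, toy_can_resolve() and toy_leaf_helper are hypothetical names invented for the example, and the kernel's real symbol resolution is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an export table entry (not the kernel's own structure). */
struct toy_export {
	const char *name;
	bool gpl_only;	/* true models EXPORT_SYMBOL_GPL, false models EXPORT_SYMBOL */
};

static const struct toy_export toy_exports[] = {
	{ "hmm_devmem_add",          true  },	/* reclassified by this patch */
	{ "hmm_devmem_add_resource", true  },	/* reclassified by this patch */
	{ "toy_leaf_helper",         false },	/* hypothetical plain export */
};

/*
 * Return true if a module may link against 'name'.
 * 'module_is_gpl_compatible' models the module's declared license class.
 */
static bool toy_can_resolve(const char *name, bool module_is_gpl_compatible)
{
	size_t i;

	for (i = 0; i < sizeof(toy_exports) / sizeof(toy_exports[0]); i++) {
		if (strcmp(toy_exports[i].name, name) != 0)
			continue;
		/* Plain exports resolve for anyone; GPL-only ones need a GPL module. */
		return !toy_exports[i].gpl_only || module_is_gpl_compatible;
	}
	return false;	/* symbol is not exported at all */
}
```

In this model, toy_can_resolve("hmm_devmem_add", false) is false while the same lookup with true succeeds, which is the practical effect the EXPORT_SYMBOL_GPL change has on a non-GPL out-of-tree consumer.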
[logang@deltatee.com: PCI/P2PDMA: match interface changes to devm_memremap_pages()]
  Link: http://lkml.kernel.org/r/20181130225911.2900-1-logang@deltatee.com
Link: http://lkml.kernel.org/r/154275560565.76910.15919297436557795278.stgit@dwill...
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/p2pdma.c |   10 ++--------
 mm/hmm.c             |    4 ++--
 2 files changed, 4 insertions(+), 10 deletions(-)

--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -82,10 +82,8 @@ static void pci_p2pdma_percpu_release(st
 	complete_all(&p2p->devmap_ref_done);
 }
 
-static void pci_p2pdma_percpu_kill(void *data)
+static void pci_p2pdma_percpu_kill(struct percpu_ref *ref)
 {
-	struct percpu_ref *ref = data;
-
 	/*
 	 * pci_p2pdma_add_resource() may be called multiple times
 	 * by a driver and may register the percpu_kill devm action multiple
@@ -198,6 +196,7 @@ int pci_p2pdma_add_resource(struct pci_d
 	pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
 	pgmap->pci_p2pdma_bus_offset = pci_bus_address(pdev, bar) -
 		pci_resource_start(pdev, bar);
+	pgmap->kill = pci_p2pdma_percpu_kill;
 
 	addr = devm_memremap_pages(&pdev->dev, pgmap);
 	if (IS_ERR(addr)) {
@@ -211,11 +210,6 @@ int pci_p2pdma_add_resource(struct pci_d
 	if (error)
 		goto pgmap_free;
 
-	error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_percpu_kill,
-					  &pdev->p2pdma->devmap_ref);
-	if (error)
-		goto pgmap_free;
-
 	pci_info(pdev, "added peer-to-peer DMA memory %pR\n",
 		 &pgmap->res);
 
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -1110,7 +1110,7 @@ struct hmm_devmem *hmm_devmem_add(const
 		return result;
 	return devmem;
 }
-EXPORT_SYMBOL(hmm_devmem_add);
+EXPORT_SYMBOL_GPL(hmm_devmem_add);
 
 struct hmm_devmem *hmm_devmem_add_resource(const struct hmm_devmem_ops *ops,
 					   struct device *device,
@@ -1164,7 +1164,7 @@ struct hmm_devmem *hmm_devmem_add_resour
 		return result;
 	return devmem;
 }
-EXPORT_SYMBOL(hmm_devmem_add_resource);
+EXPORT_SYMBOL_GPL(hmm_devmem_add_resource);
 
 /*
  * A device driver that wants to handle multiple devices memory through a
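
For readers following the p2pdma hunks, the sketch below models the shape of the interface change in user space: the owner stops registering a type-erased teardown action taking void * and instead stores a typed kill hook in the pagemap, which the core facility is then presumably responsible for invoking. Every toy_* name is a hypothetical stand-in and not a kernel structure or function. The early return on an already-dying reference is an assumption drawn from the comment in the diff, which notes that the kill action may be registered more than once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a reference counter; the real one carries more state. */
struct toy_percpu_ref {
	bool dying;
	int kill_calls;	/* bookkeeping for the example only */
};

/* Simplified stand-in for a pagemap that owns a typed teardown hook. */
struct toy_pagemap {
	struct toy_percpu_ref *ref;
	void (*kill)(struct toy_percpu_ref *ref);	/* models pgmap->kill */
};

/* Typed hook: no cast from void * is needed, unlike the removed devm action. */
static void toy_percpu_kill(struct toy_percpu_ref *ref)
{
	ref->kill_calls++;
	if (ref->dying)
		return;		/* assumed: only the first call actually kills */
	ref->dying = true;
}

/* Stand-in for a core teardown path calling back into the owner's hook. */
static void toy_pagemap_release(struct toy_pagemap *pgmap)
{
	if (pgmap->kill)
		pgmap->kill(pgmap->ref);
}
```

Calling toy_pagemap_release() twice on the same pagemap leaves the reference marked dying after the first call and makes the second call a no-op apart from the counter, which is the repeated-call tolerance the diff comment appears to require.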