Hi Dan, there is a question for you further down:
On 11/4/19 3:49 PM, Jerome Glisse wrote:
On Mon, Nov 04, 2019 at 02:49:18PM -0800, John Hubbard wrote:
...
Maybe add a small comment about wrap around :)
I don't *think* the count can wrap around, due to the checks in user_page_ref_inc().
But it's true that the documentation is a little light here...What did you have in mind?
About false positive case (and how unlikely they are) and that wrap around is properly handle. Maybe just a pointer to the documentation so that people know they can go look there for details. I know my brain tend to forget where to look for things so i like to be constantly reminded hey the doc is Documentations/foobar :)
I see. OK, here's a version with a thoroughly overhauled comment header:
/** * page_dma_pinned() - report if a page is pinned for DMA. * * This function checks if a page has been pinned via a call to * pin_user_pages*() or pin_longterm_pages*(). * * The return value is partially fuzzy: false is not fuzzy, because it means * "definitely not pinned for DMA", but true means "probably pinned for DMA, but * possibly a false positive due to having at least GUP_PIN_COUNTING_BIAS worth * of normal page references". * * False positives are OK, because: a) it's unlikely for a page to get that many * refcounts, and b) all the callers of this routine are expected to be able to * deal gracefully with a false positive. * * For more information, please see Documentation/vm/pin_user_pages.rst. * * @page: pointer to page to be queried. * @Return: True, if it is likely that the page has been "dma-pinned". * False, if the page is definitely not dma-pinned. */ static inline bool page_dma_pinned(struct page *page)
[...]
@@ -1930,12 +2028,20 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr, pgmap = get_dev_pagemap(pfn, pgmap); if (unlikely(!pgmap)) {
undo_dev_pagemap(nr, nr_start, pages);
} SetPageReferenced(page); pages[*nr] = page;undo_dev_pagemap(nr, nr_start, flags, pages); return 0;
get_page(page);
if (flags & FOLL_PIN) {
if (unlikely(!user_page_ref_inc(page))) {
undo_dev_pagemap(nr, nr_start, flags, pages);
return 0;
}
Maybe add a comment about a case that should never happens ie user_page_ref_inc() fails after the second iteration of the loop as it would be broken and a bug to call undo_dev_pagemap() after the first iteration of that loop.
Also i believe that this should never happens as if first iteration succeed than __page_cache_add_speculative() will succeed for all the iterations.
Note that the pgmap case above follows that too ie the call to get_dev_pagemap() can only fail on first iteration of the loop, well i assume you can never have a huge device page that span different pgmap ie different devices (which is a reasonable assumption). So maybe this code needs fixing ie :
pgmap = get_dev_pagemap(pfn, pgmap); if (unlikely(!pgmap)) return 0;
OK, yes that does make sense. And I think a comment is adequate, no need to check for bugs during every tail page iteration. So how about this, as a preliminary patch:
Actualy i thought about it and i think that there is pgmap per section and thus maybe one device can have multiple pgmap and that would be an issue for page bigger than section size (ie bigger than 128MB iirc). I will go double check that, but maybe Dan can chime in.
In any case my comment above is correct for the page ref increment, if the first one succeed than others will too or otherwise it means someone is doing too many put_page()/ put_user_page() which is _bad_ :)
I'll wait to hear from Dan before doing anything rash. :)
thanks,
John Hubbard NVIDIA