On 9/29/19 3:28 AM, jun.zhang(a)intel.com wrote:
> From: zhang jun <jun.zhang(a)intel.com>
>
> we see tons of warning like:
> [ 45.846872] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req
> write-combining for [mem 0x1e7a80000-0x1e7a87fff], got write-back
> [ 45.848827] x86/PAT: .vorbis.decoder:4088 map pfn RAM range req
> write-combining for [mem 0x1e7a58000-0x1e7a58fff], got write-back
> [ 45.848875] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req
> write-combining for [mem 0x1e7a48000-0x1e7a4ffff], got write-back
> [ 45.849403] x86/PAT: .vorbis.decoder:4088 map pfn RAM range
> req write-combining for [mem 0x1e7a70000-0x1e7a70fff], got write-back
>
> check the kernel Documentation/x86/pat.txt, it says:
> A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
> vm_insert_pfn
> Drivers wanting to export some pages to userspace do it by using
> mmap interface and a combination of
> 1) pgprot_noncached()
> 2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()
> With PAT support, a new API pgprot_writecombine is being added.
> So, drivers can continue to use the above sequence, with either
> pgprot_noncached() or pgprot_writecombine() in step 1, followed by step 2.
>
> In addition, step 2 internally tracks the region as UC or WC in
> memtype list in order to ensure no conflicting mapping.
>
> Note that this set of APIs only works with IO (non RAM) regions.
> If driver ants to export a RAM region, it has to do set_memory_uc() or
> set_memory_wc() as step 0 above and also track the usage of those pages
> and use set_memory_wb() before the page is freed to free pool.
>
> the fix follow the pat document, do set_memory_wc() as step 0 and
> use the set_memory_wb() before the page is freed.
>
All this work needs to be done on the new dma-buf heap rework and I
don't think it makes sense to put it on the staging version
https://lore.kernel.org/lkml/20190906184712.91980-1-john.stultz@linaro.org/
(I also continue to question the value of uncached buffers, especially on
x86)
> Signed-off-by: he, bo <bo.he(a)intel.com>
> Signed-off-by: zhang jun <jun.zhang(a)intel.com>
> Signed-off-by: Bai, Jie A <jie.a.bai(a)intel.com>
> ---
> drivers/staging/android/ion/ion_system_heap.c | 28 ++++++++++++++++++-
> 1 file changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
> index b83a1d16bd89..d298b8194820 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -13,6 +13,7 @@
> #include <linux/scatterlist.h>
> #include <linux/slab.h>
> #include <linux/vmalloc.h>
> +#include <asm/set_memory.h>
>
> #include "ion.h"
>
> @@ -134,6 +135,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
> sg = table->sgl;
> list_for_each_entry_safe(page, tmp_page, &pages, lru) {
> sg_set_page(sg, page, page_size(page), 0);
> +
> +#ifdef CONFIG_X86
> + if (!(buffer->flags & ION_FLAG_CACHED))
> + set_memory_wc((unsigned long)page_address(sg_page(sg)),
> + PAGE_ALIGN(sg->length) >> PAGE_SHIFT);
> +#endif
> +
> sg = sg_next(sg);
> list_del(&page->lru);
> }
> @@ -162,8 +170,15 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
> if (!(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE))
> ion_heap_buffer_zero(buffer);
>
> - for_each_sg(table->sgl, sg, table->nents, i)
> + for_each_sg(table->sgl, sg, table->nents, i) {
> +#ifdef CONFIG_X86
> + if (!(buffer->flags & ION_FLAG_CACHED))
> + set_memory_wb((unsigned long)page_address(sg_page(sg)),
> + PAGE_ALIGN(sg->length) >> PAGE_SHIFT);
> +#endif
> +
> free_buffer_page(sys_heap, buffer, sg_page(sg));
> + }
> sg_free_table(table);
> kfree(table);
> }
> @@ -316,6 +331,12 @@ static int ion_system_contig_heap_allocate(struct ion_heap *heap,
>
> buffer->sg_table = table;
>
> +#ifdef CONFIG_X86
> + if (!(buffer->flags & ION_FLAG_CACHED))
> + set_memory_wc((unsigned long)page_address(page),
> + PAGE_ALIGN(len) >> PAGE_SHIFT);
> +#endif
> +
> return 0;
>
> free_table:
> @@ -334,6 +355,11 @@ static void ion_system_contig_heap_free(struct ion_buffer *buffer)
> unsigned long pages = PAGE_ALIGN(buffer->size) >> PAGE_SHIFT;
> unsigned long i;
>
> +#ifdef CONFIG_X86
> + if (!(buffer->flags & ION_FLAG_CACHED))
> + set_memory_wb((unsigned long)page_address(page), pages);
> +#endif
> +
> for (i = 0; i < pages; i++)
> __free_page(page + i);
> sg_free_table(table);
>
On Sun, Sep 29, 2019 at 03:28:41PM +0800, jun.zhang(a)intel.com wrote:
> From: zhang jun <jun.zhang(a)intel.com>
>
> we see tons of warning like:
> [ 45.846872] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req
> write-combining for [mem 0x1e7a80000-0x1e7a87fff], got write-back
> [ 45.848827] x86/PAT: .vorbis.decoder:4088 map pfn RAM range req
> write-combining for [mem 0x1e7a58000-0x1e7a58fff], got write-back
> [ 45.848875] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req
> write-combining for [mem 0x1e7a48000-0x1e7a4ffff], got write-back
> [ 45.849403] x86/PAT: .vorbis.decoder:4088 map pfn RAM range
> req write-combining for [mem 0x1e7a70000-0x1e7a70fff], got write-back
>
> check the kernel Documentation/x86/pat.txt, it says:
> A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
> vm_insert_pfn
> Drivers wanting to export some pages to userspace do it by using
> mmap interface and a combination of
> 1) pgprot_noncached()
> 2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()
> With PAT support, a new API pgprot_writecombine is being added.
> So, drivers can continue to use the above sequence, with either
> pgprot_noncached() or pgprot_writecombine() in step 1, followed by step 2.
>
> In addition, step 2 internally tracks the region as UC or WC in
> memtype list in order to ensure no conflicting mapping.
>
> Note that this set of APIs only works with IO (non RAM) regions.
> If driver ants to export a RAM region, it has to do set_memory_uc() or
> set_memory_wc() as step 0 above and also track the usage of those pages
> and use set_memory_wb() before the page is freed to free pool.
>
> the fix follow the pat document, do set_memory_wc() as step 0 and
> use the set_memory_wb() before the page is freed.
>
> Signed-off-by: he, bo <bo.he(a)intel.com>
> Signed-off-by: zhang jun <jun.zhang(a)intel.com>
> Signed-off-by: Bai, Jie A <jie.a.bai(a)intel.com>
> ---
> drivers/staging/android/ion/ion_system_heap.c | 28 ++++++++++++++++++-
> 1 file changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
> index b83a1d16bd89..d298b8194820 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -13,6 +13,7 @@
> #include <linux/scatterlist.h>
> #include <linux/slab.h>
> #include <linux/vmalloc.h>
> +#include <asm/set_memory.h>
>
> #include "ion.h"
>
> @@ -134,6 +135,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
> sg = table->sgl;
> list_for_each_entry_safe(page, tmp_page, &pages, lru) {
> sg_set_page(sg, page, page_size(page), 0);
> +
> +#ifdef CONFIG_X86
> + if (!(buffer->flags & ION_FLAG_CACHED))
> + set_memory_wc((unsigned long)page_address(sg_page(sg)),
> + PAGE_ALIGN(sg->length) >> PAGE_SHIFT);
> +#endif
There is no way to do this without these #ifdefs? That feels odd, why
can't you just always test for this?
thanks,
greg k-h
On Sun, Sep 08, 2019 at 02:34:50PM +1000, Adam Zerella wrote:
> Using strncpy() does not always terminate the destination string.
> stracpy() is a alternative function that does, by using this new
> function we will no longer need to insert a null separator.
>
> Signed-off-by: Adam Zerella <adam.zerella(a)gmail.com>
> ---
> drivers/staging/android/ion/ion.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index e6b1ca141b93..17901bd626be 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -433,8 +433,7 @@ static int ion_query_heaps(struct ion_heap_query *query)
> max_cnt = query->cnt;
>
> plist_for_each_entry(heap, &dev->heaps, node) {
> - strncpy(hdata.name, heap->name, MAX_HEAP_NAME);
> - hdata.name[sizeof(hdata.name) - 1] = '\0';
> + stracpy(hdata.name, heap->name, MAX_HEAP_NAME);
stracpy() only takes two arguments. This doesn't compile.
regards,
dan carpenter
This is the new dma_fence_array based container for shared fences in the dma_resv object.
Advantage of this approach is that you can grab a reference to the current set of shared fences at any time, which allows us to drop the sequence number increment and makes the whole RCU handling much more easier.
Disadvantage is that RCU users now have to grab a reference instead of using the sequence counter. As far as I can see i915 was actually the only driver doing this.
So we optimize for adding more fences instead of reading them now.
Another behavior change worth noting is that the shared fences are now only visible after unlocking the dma_resv object or calling dma_resv_fences_commit() manually.
Please review and/or comment,
Christian.
Hi everyone,
In previous discussion it surfaced that different drivers use the shared and explicit fences in the dma_resv object with different meanings.
This is problematic when we share buffers between those drivers and requirements for implicit and explicit synchronization leaded to quite a number of workarounds related to this.
So I started an effort to get all drivers back to a common understanding of what the fences in the dma_resv object mean and be able to use the object for different kind of workloads independent of the classic DRM command submission interface.
The result is this patch set which modifies the dma_resv API to get away from a single explicit fence and multiple shared fences, towards a notation where we have explicit categories for writers, readers and others.
To do this I came up with a new container called dma_resv_fences which can store both a single fence as well as multiple fences in a dma_fence_array.
This turned out to actually be even be quite a bit simpler, since we don't need any complicated dance between RCU and sequence count protected updates any more.
Instead we can just grab a reference to the dma_fence_array under RCU and so keep the current state of synchronization alive until we are done with it.
This results in both a small performance improvement since we don't need so many barriers any more, as well as fewer lines of code in the actual implementation.
Please review and/or comment,
Christian.