Hello,
On Wednesday, April 20, 2011 5:13 PM Arnd Bergmann wrote:
I'm sorry for the slight delay in answering; I missed some mail from linaro-mm.
On Wednesday 20 April 2011, Marek Szyprowski wrote:
Hello,
On Tuesday, April 19, 2011 11:23 PM Arnd Bergmann wrote:
This may be a stupid question, but do we have an agreement that it is actually a requirement to have uncached mappings? With the streaming DMA mapping API, it should be possible to work around noncoherent DMA by flushing the caches at the right times, which probably results in better performance than simply doing noncached mappings. What is the specific requirement for noncached memory regions?
Flushing the cache for large buffers also takes significant time, especially if it is implemented by iterating over the whole buffer and issuing a flush instruction for each cache line.
For most use cases the CPU write speed is not degraded on non-cached memory areas. ARM CPUs with the write-combining feature perform really well on uncached memory.
Ok, makes sense.
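Just to make the comparison concrete, a rough sketch of the two options; dma_map_single()/dma_unmap_single() and dma_alloc_writecombine() are the existing DMA-mapping calls (the latter is the ARM helper), but the surrounding functions and parameters are made up for illustration:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>

/* Streaming mapping: memory stays cached, so the cache has to be
 * flushed/invalidated around every transfer - expensive for large
 * buffers. */
static void streaming_example(struct device *dev, void *buf, size_t size)
{
	dma_addr_t handle;

	handle = dma_map_single(dev, buf, size, DMA_TO_DEVICE);
	/* ... program the device, wait for the transfer ... */
	dma_unmap_single(dev, handle, size, DMA_TO_DEVICE);
}

/* Write-combined buffer: no per-transfer cache maintenance, and CPU
 * writes still go out quickly through the write-combine buffer. */
static void *wc_example(struct device *dev, size_t size, dma_addr_t *handle)
{
	return dma_alloc_writecombine(dev, size, handle, GFP_KERNEL);
}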
Non-cached buffers are also the only solution for buffers that need to be permanently mapped to userspace (like a framebuffer).
Why? Are the cache flush operations privileged on ARM?
I simplified this too much. Non-cached buffers are the only solution for userspace APIs that assume coherent memory. With the framebuffer example I wanted to say that fb clients can write data at any time through the mmapped window, and the kernel has no way to know when the data has been written, so it has no opportunity to flush/invalidate the cache.
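For example, a typical fbdev-style mmap simply makes the whole window write-combined up front, roughly like this (only a sketch; the function name is illustrative and error handling is omitted):

#include <linux/fb.h>
#include <linux/mm.h>

static int example_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;

	/* Userspace can write at any time, so the mapping itself has to
	 * be non-cached (write-combined) - there is no point at which
	 * the kernel could flush the cache on the clients' behalf. */
	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
	return remap_pfn_range(vma, vma->vm_start,
			       info->fix.smem_start >> PAGE_SHIFT,
			       size, vma->vm_page_prot);
}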
Non-cached mappings are also useful when one doesn't touch the memory with the CPU at all (zero-copy between two independent multimedia blocks).
I would think that if we don't want to touch the data, ideally it should not be mapped at all into the kernel address space.
That's the other possibility; however, it is hardly feasible now because of gaps in the kernel and userspace APIs for most of the subsystems.
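What I mean is something along these lines, which already works for highmem pages today (a sketch only; the function name is made up and error handling is omitted):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/gfp.h>
#include <linux/mm.h>

/* Pages that the CPU never touches do not need a kernel mapping at
 * all: highmem pages are only mapped on kmap(), so a zero-copy
 * pipeline can simply hand them over to the devices. */
static int zero_copy_example(struct device *dev, struct scatterlist *sg,
			     int nents)
{
	int i;

	sg_init_table(sg, nents);
	for (i = 0; i < nents; i++)
		sg_set_page(&sg[i], alloc_page(GFP_HIGHUSER), PAGE_SIZE, 0);

	/* Any cache maintenance happens here, once per transfer, not on
	 * every CPU access. */
	return dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
}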
(snipped)
That's why we came up with the idea of CMA (contiguous memory allocator), which can 'recycle' memory areas that are not currently used by multimedia hardware. CMA allows the system to allocate movable pages (like page cache, user process memory, etc.) from the defined CMA range and migrate them away on an allocation request for contiguous memory. For more information, please refer to: https://lkml.org/lkml/2011/3/31/213
I thought CMA was mostly about dealing with systems that don't have an IOMMU, which is a related problem, but is not the same as dealing with noncoherent DMA.
Right, the main purpose of CMA is to make it possible to allocate a large chunk of physically contiguous memory. The reason I mentioned it is that some proposed solutions to the coherent mapping problem assumed that buffers for the coherent allocator would be allocated from a separate memory region that is not covered by the kernel low-mem (linear) mapping, or not accessed by the kernel at all.
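The carve-out variant of that idea looks roughly like this (addresses, sizes and function names are made up; this is only a sketch of the approach, not code from any real board file):

#include <linux/memblock.h>
#include <linux/io.h>
#include <linux/init.h>

#define CARVEOUT_BASE	0x60000000UL	/* made-up address */
#define CARVEOUT_SIZE	(16 << 20)	/* made-up size: 16 MiB */

/* Called from the machine's ->reserve() hook at early boot: the region
 * is taken away from memblock, so it never gets a cached low-mem
 * (linear) mapping. */
static void __init example_reserve_carveout(void)
{
	memblock_remove(CARVEOUT_BASE, CARVEOUT_SIZE);
}

/* Drivers can then map buffers from that region as uncached. */
static void __iomem *example_map_buffer(unsigned long offset, size_t size)
{
	return ioremap(CARVEOUT_BASE + offset, size);
}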
I want to merge this idea with changing the kernel linear low-mem mapping, so that 2-level page mapping is used only for the defined CMA range, which should reduce TLB pressure. Once a contiguous block is allocated from the CMA range, its mapping in the low-mem area can be removed to comply with the ARM specification.
I'm not convinced that trying to solve both issues at the same time is a good idea. For systems that have an IOMMU, we just want a bunch of pages and map them virtually contiguous into the bus address space. For other systems, we really need physically contiguous memory. Independent of that, you may or may not need to unmap them from the linear mapping and make them noncached, depending on whether there is prefetching and noncoherent DMA involved.
Right, the problems of allocating the memory and keeping the correct page mappings are orthogonal to each other. I just wanted to show that some optimizations can be achieved if both problems are solved together.
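For completeness, this is the aliasing problem in a nutshell (a simplified sketch; the real ARM dma_alloc_coherent() path is more involved and the function name here is illustrative):

#include <linux/vmalloc.h>
#include <linux/mm.h>

/* The pages passed in already have a cached alias in the linear
 * low-mem mapping; creating a second, non-cached mapping gives the
 * same physical memory two mappings with conflicting attributes,
 * which the ARM architecture specification does not allow. Removing
 * (or never creating) the low-mem mapping for the CMA range avoids
 * exactly this. */
static void *uncached_alias_example(struct page **pages, unsigned int count)
{
	return vmap(pages, count, VM_MAP, pgprot_noncached(PAGE_KERNEL));
}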
Best regards