On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> I think it should certainly be documented in terms of what guarantees it provides to applications, _not_ the kinds of instructions it may or may not induce the core to execute. And if existing API can't be re-documented sanely, then deprecated and new ones added that DTRT. Possibly under a new system call, if arch's like ARM want a range flush and we don't want to expand the multiplexing behaviour of membarrier even more (sigh).
The 32-bit ARM sys_cacheflush() is there only to support self-modifying code, and takes whatever actions are necessary to support that. Exactly what actions it takes are cache implementation specific, and should be of no concern to the caller, but the underlying thing is... it's to support self-modifying code.
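For illustration, here is a minimal sketch of what a self-modifying-code caller looks like from userspace. It assumes GCC or Clang and uses the compiler builtin __builtin___clear_cache(), which, to the best of my knowledge, ends up in the cacheflush system call on 32-bit ARM Linux and compiles to nothing on architectures with coherent instruction caches. The helper name publish_code() is hypothetical and not taken from any existing code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy freshly generated instructions into dst and
 * ask the toolchain to do whatever is needed so they can be executed.
 * The caller states only the address range; it neither knows nor cares
 * which cache operations the kernel performs for that range.
 */
void *publish_code(void *dst, const void *insns, size_t len)
{
    memcpy(dst, insns, len);

    /* GCC/Clang builtin taking the [begin, end) range of modified code. */
    __builtin___clear_cache((char *)dst, (char *)dst + len);

    return dst;
}
```

A JIT or trampoline generator would call publish_code() on a buffer it later maps or marks executable. The sketch stops short of jumping into the buffer, since that part is architecture specific.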
Sadly, because it's existed for 20+ years, and it has historically been sufficient for other purposes too, it has seen quite a bit of abuse despite its design purpose not changing - it's been used by graphics drivers, for example. They quickly learnt the error of their ways with ARMv6+, since it does not do enough for their purposes given the cache architectures found there.
Let's not go around redesigning this after twenty-odd years, causing a hell of a lot of pain for users. This interface is called by code generated by GCC, so to change it you're looking at patching GCC as well as the kernel, and you will basically make new programs incompatible with older kernels - very bad news for users.