On 08/31/2018 05:12 PM, Hauke Mehrtens wrote:
On 08/30/2018 08:01 PM, Paul Burton wrote:
When a system suffers from dcache aliasing, a user program may observe stale VDSO data from an aliased cache line. Notably this can break the expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name suggests, monotonic.
In order to ensure that users observe updates to the VDSO data page as intended, align the user mappings of the VDSO data page such that their cache colouring matches that of the virtual address range which the kernel will use to update the data page - typically its unmapped address within kseg0.
This ensures that we don't introduce aliasing cache lines for the VDSO data page, and therefore that userland will observe updates without requiring cache invalidation.
Signed-off-by: Paul Burton paul.burton@mips.com
Reported-by: Hauke Mehrtens hauke@hauke-m.de
Reported-by: Rene Nielsen rene.nielsen@microsemi.com
Reported-by: Alexandre Belloni alexandre.belloni@bootlin.com
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
Cc: James Hogan jhogan@kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.4+
Tested-by: Hauke Mehrtens hauke@hauke-m.de
Tested-by: Rene Nielsen rene.nielsen@microchip.com
Without this patch, ping to a PC directly connected to the LAN port shows these results on kernel 4.19-rc1 on the Lantiq VR9 SoC:
root@OpenWrt:~# ping 192.168.1.195
PING 192.168.1.195 (192.168.1.195): 56 data bytes
64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.689 ms
64 bytes from 192.168.1.195: seq=1 ttl=64 time=236.527 ms
64 bytes from 192.168.1.195: seq=2 ttl=64 time=4294963.829 ms
64 bytes from 192.168.1.195: seq=3 ttl=64 time=4294423.824 ms
64 bytes from 192.168.1.195: seq=4 ttl=64 time=960.527 ms
64 bytes from 192.168.1.195: seq=5 ttl=64 time=472.530 ms
64 bytes from 192.168.1.195: seq=6 ttl=64 time=464.530 ms
64 bytes from 192.168.1.195: seq=7 ttl=64 time=452.530 ms
With this patch it looks like this:
root@OpenWrt:~# ping 192.168.1.195
PING 192.168.1.195 (192.168.1.195): 56 data bytes
64 bytes from 192.168.1.195: seq=0 ttl=64 time=0.638 ms
64 bytes from 192.168.1.195: seq=1 ttl=64 time=0.573 ms
64 bytes from 192.168.1.195: seq=2 ttl=64 time=0.605 ms
64 bytes from 192.168.1.195: seq=3 ttl=64 time=0.524 ms
64 bytes from 192.168.1.195: seq=4 ttl=64 time=0.534 ms
64 bytes from 192.168.1.195: seq=5 ttl=64 time=0.518 ms
64 bytes from 192.168.1.195: seq=6 ttl=64 time=0.485 ms
64 bytes from 192.168.1.195: seq=7 ttl=64 time=0.501 ms
Hi Alexandre,
Could you try this out on your Ocelot system? Hopefully it'll solve the problem just as well as James' patch, without needing the questionable change to arch_get_unmapped_area_common().
Thanks,
    Paul
 arch/mips/kernel/vdso.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..5fb617a42335 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -13,6 +13,7 @@
 #include <linux/err.h>
 #include <linux/init.h>
 #include <linux/ioport.h>
+#include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
@@ -20,6 +21,7 @@
 
 #include <asm/abi.h>
 #include <asm/mips-cps.h>
+#include <asm/page.h>
 #include <asm/vdso.h>
 
 /* Kernel-provided data used by the VDSO. */
@@ -128,12 +130,30 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	vvar_size = gic_size + PAGE_SIZE;
 	size = vvar_size + image->size;
 
+	/*
+	 * Find a region that's large enough for us to perform the
+	 * colour-matching alignment below.
+	 */
+	if (cpu_has_dc_aliases)
+		size += shm_align_mask + 1;
+
 	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
+	/*
+	 * If we suffer from dcache aliasing, ensure that the VDSO data page is
+	 * coloured the same as the kernel's mapping of that memory. This
+	 * ensures that when the kernel updates the VDSO data userland will see
+	 * it without requiring cache invalidations.
+	 */
+	if (cpu_has_dc_aliases) {
+		base = __ALIGN_MASK(base, shm_align_mask);
+		base += ((unsigned long)&vdso_data - gic_size) & shm_align_mask;
+	}
+
 	data_addr = base + gic_size;
 	vdso_addr = data_addr + PAGE_SIZE;