Re: String routines writeup with benchmarks

11 Jun 2012


On 11 June 2012 02:14, Michael Hope <michael.hope@linaro.org> wrote:
...
> We talked at Connect about finishing up the cortex-strings work by
> upstreaming them into Bionic, Newlib, and GLIBC.  I've written up one
> of our standard 'Output' pages:
> https://wiki.linaro.org/WorkingGroups/ToolChain/Outputs/CortexStrings
> with a summary of what we did, what else exists, benchmark results,
> and next steps.  This can be used to justify the routines to the
> different upstreams.
>
> The Android guys are going to upstream these to Bionic.  I need a
> volunteer to do Newlib and GLIBC.
>
> One surprise was that the Newlib plain C routines are very good on
> strings - probably due to a good end of string detector.

Those graphs end at 4k, which is well within even L1 cache.  How do
these functions compare for sizes that hit L2 or external memory?
I would expect functions doing some prefetching to perform better
there.  Some time ago, I compared a few memcpy() implementations
on large blocks, and the Bionic NEON-optimised one was several
times faster than glibc.  It is of course possible that glibc has
improved since then.
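
A minimal sketch of the kind of comparison asked about above: time
memcpy() for block sizes that grow from L1-resident up to sizes that
spill into L2 or external memory.  The helper names
(memcpy_mb_per_s, sweep_memcpy) and the total_bytes parameter are
hypothetical, chosen for illustration only; the sketch measures
whatever memcpy() the C library provides and does not reproduce the
cortex-strings benchmark.  clock() reports CPU time at coarse
resolution, so the figures are only rough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy about total_bytes in blocks of `size` bytes
   and return the throughput in MB/s.  Returns a negative value if
   allocation fails and 0.0 if the run was too short for clock(). */
static double memcpy_mb_per_s(size_t size, size_t total_bytes)
{
    assert(size > 0);

    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0x5a, size);
    memset(dst, 0, size);

    size_t iters = total_bytes / size;
    if (iters == 0)
        iters = 1;

    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (size_t i = 0; i < iters; i++) {
        src[0] = (unsigned char)i;   /* vary the input between copies */
        memcpy(dst, src, size);
        sink ^= dst[0];              /* use the result of each copy */
    }
    clock_t end = clock();
    (void)sink;

    free(src);
    free(dst);

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;
    return (double)iters * (double)size / secs / 1e6;
}

/* Hypothetical driver: double the block size from min_size to max_size,
   print one row per size, and return the number of rows printed. */
static int sweep_memcpy(size_t min_size, size_t max_size, size_t total_bytes)
{
    int rows = 0;

    if (min_size == 0 || min_size > max_size)
        return 0;

    for (size_t size = min_size; ; size *= 2) {
        double rate = memcpy_mb_per_s(size, total_bytes);
        if (rate < 0.0)
            break;                   /* allocation failed, stop here */
        printf("%10zu bytes  %10.1f MB/s\n", size, rate);
        rows++;
        if (size > max_size / 2)
            break;                   /* next doubling would exceed max_size */
    }
    return rows;
}
```

Calling, for example, sweep_memcpy(4096, (size_t)64 << 20, (size_t)1 << 30)
from main() would cover 4k up to 64 MB.  To compare implementations,
the memcpy() call would be replaced by the routine under test.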
-- 
Mans Rullgard / mru
