Matthew Gretton-Dann <matthew.gretton-dann@linaro.org> writes:
> All,
[snip]
> So before we go any further I would like to see what the view of LEG is about a better malloc. My questions boil down to:
> - Is malloc important - or do server applications just implement their own?
I was sent this question along with a list of "server applications" and did some investigation, of both the typical runtimes and the applications themselves. This is based just on source inspection and a little googling in some cases. Let me know if you want me to look into anything in more detail.
The answer is likely "sometimes" to both parts of the question :-)
These are my notes. Corrections welcome!
Runtimes
========

Perl
----
Uses glibc malloc (in practice -- it ships with its own malloc implementation but this is not used by default on Linux or in the Ubuntu builds)
Python
------
Uses its own allocator for small allocations, which are by far the commonest. Uses glibc malloc for some things (e.g. memory backing a list object), but malloc-related functions do not appear high up in perf traces.
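As an aside on methodology: besides perf, a quick-and-dirty way to see how much a process really leans on the system allocator is an LD_PRELOAD shim that counts calls. The sketch below is only illustrative -- Linux/glibc specific, not thread-safe, it ignores calloc()/realloc(), and the build line is just what I would try -- but it is the sort of check I mean:

    /* count_malloc.c -- rough sketch of an LD_PRELOAD shim that counts
     * malloc()/free() calls.  Illustrative only.
     * Build (roughly): gcc -shared -fPIC -o count_malloc.so count_malloc.c -ldl
     * Run:             LD_PRELOAD=./count_malloc.so some_program
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void *(*real_malloc)(size_t);
    static void (*real_free)(void *);
    static unsigned long malloc_calls, free_calls;

    /* dlsym() may itself allocate, so satisfy it from a static scratch
     * buffer while we are still looking up the real functions. */
    static char bootstrap[4096];
    static size_t bootstrap_used;

    static void *bootstrap_alloc(size_t size)
    {
        size = (size + 15) & ~(size_t)15;
        if (bootstrap_used + size > sizeof bootstrap)
            return NULL;
        void *p = bootstrap + bootstrap_used;
        bootstrap_used += size;
        return p;
    }

    static void report(void)
    {
        fprintf(stderr, "malloc calls: %lu, free calls: %lu\n",
                malloc_calls, free_calls);
    }

    void *malloc(size_t size)
    {
        if (real_malloc == NULL) {
            static int initialising;
            if (initialising)
                return bootstrap_alloc(size);
            initialising = 1;
            real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
            real_free   = (void (*)(void *))dlsym(RTLD_NEXT, "free");
            atexit(report);
            initialising = 0;
        }
        malloc_calls++;                 /* racy, but fine for a rough count */
        return real_malloc(size);
    }

    void free(void *ptr)
    {
        uintptr_t p = (uintptr_t)ptr;
        if (p >= (uintptr_t)bootstrap &&
            p < (uintptr_t)(bootstrap + sizeof bootstrap))
            return;                     /* came from the scratch buffer */
        free_calls++;
        if (real_free)
            real_free(ptr);
    }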
Java
----
Very much does its own heap management.
PHP
---
As Ard says, it has its own thing, and looking at its source it clearly does something quite complicated (zend_alloc.c is nearly 3000 lines). It bundles various libraries (sqlite, pcre, ...) that do call malloc(), and it does not appear to redirect those libraries to its own allocator -- so some workloads might benefit from malloc improvements.
Server processes
================

apache2
-------
As Ard says it has its own thing where it manages a pool per request. Looks like it calls malloc a fair bit though.
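For anyone who hasn't looked at this kind of allocator: the rough shape of "a pool per request" is something like the sketch below. This is just an illustration of the idea, not Apache's actual APR pool code, and all the names are invented.

    /* Minimal sketch of a per-request memory pool: grab memory in large
     * chunks, hand out pieces with a bump pointer, free everything in one
     * go when the request ends.  Not APR; illustration only. */
    #include <stdlib.h>

    #define POOL_CHUNK (8 * 1024)

    struct chunk {
        struct chunk *next;
        size_t used;
        char data[POOL_CHUNK];
    };

    struct pool {
        struct chunk *head;
    };

    struct pool *pool_create(void)
    {
        return calloc(1, sizeof(struct pool));
    }

    void *pool_alloc(struct pool *p, size_t size)
    {
        size = (size + 15) & ~(size_t)15;        /* keep results aligned */
        if (size > POOL_CHUNK)
            return NULL;                         /* a real pool special-cases big requests */
        if (p->head == NULL || p->head->used + size > POOL_CHUNK) {
            struct chunk *c = malloc(sizeof *c); /* one malloc per chunk, not per object */
            if (c == NULL)
                return NULL;
            c->next = p->head;
            c->used = 0;
            p->head = c;
        }
        void *mem = p->head->data + p->head->used;
        p->head->used += size;
        return mem;
    }

    /* End of request: throw the whole pool away; no per-object free(). */
    void pool_destroy(struct pool *p)
    {
        struct chunk *c = p->head;
        while (c != NULL) {
            struct chunk *next = c->next;
            free(c);
            c = next;
        }
        free(p);
    }

The attraction is that allocation is a pointer bump and teardown is one walk over a handful of chunks rather than a free() per object -- a property a general-purpose malloc can't really offer.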
cassandra
---------
Uses the Java heap mostly, clearly. It does store a few things "off heap" (row cache, bloom filter bitsets, compression metadata) via sun.misc.Unsafe.allocateMemory, which /probably/ backs onto glibc malloc, but mostly I think these things are allocated once at process start-up rather than on any hot path.
hadoop
------
Appears to have bits that call malloc. Hard to say more than that without inhaling the architecture more thoroughly.
ceph
----
Certainly calls malloc (and operator new) in many places. So potentially interesting.
memcached
---------
AFAICT, allocates one big chunk of memory with malloc and then does its own thing to divvy it up.
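Roughly this shape, as far as I can tell -- this is an illustration of the idea, not memcached's real slab allocator (which has multiple size classes, eviction and so on), and the numbers and names are made up.

    /* Minimal sketch of "one big malloc up front, then divvy it up":
     * reserve a region at start-up, carve fixed-size items out of it and
     * keep freed items on a free list, so the system allocator is never
     * touched again on the item path. */
    #include <stdlib.h>

    #define REGION_SIZE (64 * 1024 * 1024)   /* e.g. 64 MiB reserved up front */
    #define ITEM_SIZE   256                  /* one size class, for simplicity */

    struct item { struct item *next; };

    static char *region;
    static size_t region_used;
    static struct item *free_items;

    int store_init(void)
    {
        region = malloc(REGION_SIZE);        /* the one and only big malloc */
        return region != NULL ? 0 : -1;
    }

    void *item_alloc(void)
    {
        if (free_items != NULL) {            /* reuse a previously freed item */
            struct item *it = free_items;
            free_items = it->next;
            return it;
        }
        if (region_used + ITEM_SIZE > REGION_SIZE)
            return NULL;                     /* memcached would evict something here */
        void *p = region + region_used;
        region_used += ITEM_SIZE;
        return p;
    }

    void item_free(void *p)
    {
        struct item *it = p;
        it->next = free_items;
        free_items = it;
    }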
mongodb
-------
AIUI, pushes the problem to the kernel by mmap()ing the data files into its address space and fooling around in there. So probably not dependent on malloc() performance.
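i.e. something along these lines -- purely illustrative, not MongoDB's storage code, and the file name is made up:

    /* Minimal sketch of working directly on an mmap()ed data file: the
     * kernel pages data in and out on demand, and malloc() is not involved
     * in managing the data at all. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "data.bin";  /* hypothetical data file */

        int fd = open(path, O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) {
            fprintf(stderr, "need a non-empty file\n");
            return 1;
        }

        /* The whole file becomes directly addressable memory. */
        char *data = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        data[0] ^= 0;    /* touching data[i] reads (and here dirties) file pages */

        munmap(data, st.st_size);
        close(fd);
        return 0;
    }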
swift
-----
Seems to be pure Python, so not really dependent on malloc.
varnish
-------
Calls malloc() once per request and does its own allocation within that region -- and on Linux (incl. Ubuntu armhf) it uses a bundled version of jemalloc for even that.
haproxy
-------
I *think* this mostly uses a similar model to apache2/varnish: allocate a region once per request (there are quite a few other calls to malloc too -- I don't know whether those are on hot paths). It does just use glibc malloc to allocate that memory, though, AFAICT.
tomcat7
-------
Just uses the Java heap, AFAICT (I guess the contained JSPs can use JNI or whatever, but it looks like the container itself doesn't).
> - Do you have any benchmarks that stress malloc and would provide us with
>   some more data points?
>
> But any and all comments on the subject are welcome.
Perl and ceph almost certainly have a dependency on glibc malloc performance. In most other cases, projects that have noticed that malloc can be a little slow have implemented their own solutions. An improved system malloc might let some of these drop their own implementations, but oftentimes they are exploiting properties a general-purpose system malloc simply cannot offer (e.g. allocating an arena per request and then throwing the whole thing away in one go).
Cheers,
mwh