Re: Use of memcpy() in libpng

7 Oct 2011

The unaligned accesses in libpng are, for the large copies, a bug. Our attempt to align the row buffer to a 16-byte boundary was off by one, so we always end up misaligning it. I've posted a patch on the png-mng-implement list:
http://sourceforge.net/mailarchive/message.php?msg_id=28194444
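For readers following along, here is a minimal sketch of rounding a pointer up to a 16-byte boundary. It is not the libpng code and not the patch linked above; `align_up` and `place_row` are hypothetical names. `place_row` assumes, purely for illustration, that the thing to be aligned is the pixel data that follows the one-byte filter type at the start of each PNG row, which is one way a single byte of offset can leave every row misaligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libpng code: round p up to the next multiple of
 * `alignment`, which must be a non-zero power of two. */
static unsigned char *align_up(unsigned char *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;

    assert(alignment != 0 && (alignment & mask) == 0);

    /* Distance to the next boundary; zero when p is already aligned. */
    return p + (size_t)((alignment - (addr & mask)) & mask);
}

/* Hypothetical: choose a row pointer inside `raw` so that row + 1 (the byte
 * after the filter type) is aligned. The caller must have allocated at least
 * `alignment` spare bytes beyond the row itself. */
static unsigned char *place_row(unsigned char *raw, size_t alignment)
{
    /* Align the pixel data, then step back one byte for the filter type. */
    return align_up(raw + 1, alignment) - 1;
}
```

Aligning `raw` itself in this layout would put the filter byte on the boundary and the pixel data one byte past it on every row.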
The time spent in memcpy() is probably an illusion. The data out of zlib gets copied to one row buffer, where it is unfiltered (if necessary); then a copy is made in a separate buffer that is only used for the filter handling. If you test using images with large rows (I don't know what pngbench does), the copy buffer may well get flushed out of the second-level cache between each row, and then the memcpy will stall bringing it back in.
If you have machine-level profiling you may see this as a massive time spike on some probably unrelated instruction that just happens to be in the PC when the stall stops everything.
Anyway, I have several ideas of how to avoid the copy when it isn't required.
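One common way to avoid a per-row copy, sketched here as an illustration and not necessarily one of the ideas meant above, is to keep two row buffers and swap their pointers once a row has been unfiltered. `struct row_state` and both functions are hypothetical and do not reflect libpng's real data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder state, not libpng's real structure. */
struct row_state {
    unsigned char *cur;   /* row currently being unfiltered */
    unsigned char *prev;  /* previous unfiltered row, read by the filters */
    size_t rowbytes;      /* bytes per row */
};

/* Copying approach: duplicate the finished row so the next row's filter
 * can read it. Costs one memcpy of rowbytes per row. */
static void finish_row_copy(struct row_state *s)
{
    memcpy(s->prev, s->cur, s->rowbytes);
}

/* Swapping approach: the finished row becomes the previous row and the old
 * previous buffer is reused for the next row. No bytes are moved. */
static void finish_row_swap(struct row_state *s)
{
    unsigned char *tmp = s->prev;
    s->prev = s->cur;
    s->cur = tmp;
}
```

The swap only works if nothing else holds on to the `cur` pointer across rows, which is an assumption this sketch makes.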
John Bowler jbowler@acm.org
-----Original Message-----
From: Glenn Randers-Pehrson [mailto:glennrp@gmail.com] 
Sent: Monday, October 03, 2011 1:15 PM
To: PNG/MNG implementation discussion list
Subject: [png-mng-implement] Use of memcpy() in libpng [Fwd from linaro-toolchain list]
Re: Use of memcpy() in libpng
David Gilbert
Tue, 27 Sep 2011 06:20:14 -0700
On 27 September 2011 14:16, Christian Robottom Reis k...@linaro.org wrote:
...
On Tue, Sep 27, 2011 at 09:47:33AM +0100, Ramana Radhakrishnan wrote:
...
On 26 September 2011 21:51, Michael Hope michael.h...@linaro.org wrote:
...
Saw this on the linaro-multimedia list:

http://lists.linaro.org/pipermail/linaro-multimedia/2011-September/000074.html
libpng spends a significant amount of time in memcpy().  This might 
tie in with Ramana's investigation or the unaligned access work by 
allowing more memcpy()s to be inlined.
It's the unaligned access and the change / improvements to the memcpy 
that *might* help in this case. But that of course depends on the 
compiler knowing when it can do such a thing. Of course what might be 
more interesting is the kind of workload analysis that Dave's done in 
the past with memcpy to know what the alignment and size of the 
buffer being copied is.
If you guys could take a look at this there is a potential requirement 
for the MMWG around libpng optimization; we could fit this in along 
with other work (possible vectorizing, etc) on that component.
It wouldn't take long to analyse the memcpy calls - life would be easier if we had the test program and some details on things like what size of images were used in these benchmarks.
Dave
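As a rough illustration of the kind of memcpy analysis discussed above, a wrapper like the following could record how large each copy is and whether its pointers are 16-byte aligned. This is only a sketch under stated assumptions: `traced_memcpy`, `struct memcpy_stats` and `g_stats` are hypothetical names, and how the library's memcpy calls would be redirected to the wrapper (a macro, a linker wrap, or similar) is left open.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIZE_BUCKETS 32

/* Hypothetical statistics record for memcpy calls. */
struct memcpy_stats {
    unsigned long calls;
    /* Bucket i counts copies of 2^i .. 2^(i+1)-1 bytes; sizes 0 and 1 both
     * land in bucket 0, and very large sizes saturate in the last bucket. */
    unsigned long by_log2_size[SIZE_BUCKETS];
    unsigned long dst_misaligned;  /* destination not on a 16-byte boundary */
    unsigned long src_misaligned;  /* source not on a 16-byte boundary */
};

static struct memcpy_stats g_stats;

/* floor(log2(n)), clamped to the bucket range; 0 for n <= 1. */
static unsigned log2_bucket(size_t n)
{
    unsigned b = 0;
    while (n > 1 && b < SIZE_BUCKETS - 1) {
        n >>= 1;
        b++;
    }
    return b;
}

/* Drop-in replacement for memcpy that updates g_stats before copying. */
static void *traced_memcpy(void *dst, const void *src, size_t n)
{
    g_stats.calls++;
    g_stats.by_log2_size[log2_bucket(n)]++;
    if (((uintptr_t)dst & 15) != 0)
        g_stats.dst_misaligned++;
    if (((uintptr_t)src & 15) != 0)
        g_stats.src_misaligned++;
    return memcpy(dst, src, n);
}
```

Dumping `g_stats` at exit for a given set of test images would give the size and alignment distribution asked about in the thread.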
