Chung-Lin Tang wrote:
This reminds me of a PR that Bernd did: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40657
It also adds the r0-r3 registers to the prologue/epilogue push/pop for the sake of reducing code size, though in a sense it is even more aggressive: it tries to merge the local stack allocation (the SP sub/add) into the stm/ldm.
Bernd's patch was for Thumb-1, though I don't see why it can't be implemented for ARM/Thumb-2 too.
Chung-Lin, 'unfortunately', FSF GCC trunk can already do this for Thumb-2. 1.c is from GCC PR40657.
./fsf-mainline/install/bin/arm-none-linux-gnueabi-gcc -mthumb -mcpu=cortex-a9 -Os 1.c -c -o 1.o
00000000 <foo>:
   0:	b507      	push	{r0, r1, r2, lr}
   2:	a801      	add	r0, sp, #4
   4:	f7ff fffe 	bl	0 <bar>
   8:	9801      	ldr	r0, [sp, #4]
   a:	bd0e      	pop	{r1, r2, r3, pc}
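For readers without the PR at hand, a function of roughly the following shape would give a listing like the one above. This is only a sketch reconstructed from the disassembly, not necessarily the actual 1.c from PR40657; the local name x and the body of bar are assumptions. The push of r0-r2 reserves 12 bytes of stack in place of a separate "sub sp", the local lives at [sp, #4], and the pop into r1-r3 releases those 12 bytes while leaving the return value in r0 untouched.

```c
/* Hypothetical reconstruction from the listing; not necessarily the PR40657 testcase. */
void bar(int *p);

int foo(void)
{
    int x;      /* the one local that needs a stack slot ([sp, #4] in the listing) */
    bar(&x);    /* the address escapes, so x has to live in memory */
    return x;   /* corresponds to the ldr r0, [sp, #4] before the pop */
}

/* Stub so the snippet links on its own; the value 42 is arbitrary. */
void bar(int *p)
{
    *p = 42;
}
```

To reproduce the listing itself, bar would need to live in another translation unit (as the unresolved "bl 0 <bar>" suggests); with the stub in the same file the compiler is free to inline it.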