On Thu, Aug 26, 2021 at 03:57:13PM +0200, Petr Mladek wrote:
On Sat 2021-08-14 14:17:07, Yury Norov wrote:
The macros iterate through all set/clear bits in a bitmap. They search for the first bit using find_first_bit(), and the remaining bits using find_next_bit().
Since find_next_bit() is called shortly after find_first_bit(), we can save a few lines of I-cache by not using find_first_bit().
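For reference, the change relies on find_next_bit(addr, size, 0) returning the same bit as find_first_bit(addr, size). A minimal userspace sketch of that equivalence (simplified stand-ins, not the optimized kernel implementations in lib/find_bit.c):

/*
 * Simplified userspace stand-ins for the kernel helpers, only to
 * illustrate that a search starting at offset 0 is the same as a
 * "first bit" search.
 */
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)

static unsigned long my_find_next_bit(const unsigned long *addr,
                                      unsigned long size,
                                      unsigned long offset)
{
        unsigned long bit;

        for (bit = offset; bit < size; bit++)
                if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
                        return bit;
        return size;            /* no set bit found */
}

static unsigned long my_find_first_bit(const unsigned long *addr,
                                       unsigned long size)
{
        /* equivalent to searching from offset 0 */
        return my_find_next_bit(addr, size, 0);
}

int main(void)
{
        unsigned long bitmap[2] = { 0x10, 0x3 };        /* bit 4 is the first set bit */

        assert(my_find_first_bit(bitmap, 128) ==
               my_find_next_bit(bitmap, 128, 0));       /* both return 4 */
        return 0;
}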
Is this only speculation, or does it fix a real performance problem?
The macro is used like:
for_each_set_bit(bit, addr, size) { fn(bit); }
IMHO, the micro-optimization does not help when fn() is non-trivial.
The effect is measurable:
Start testing for_each_bit()
for_each_set_bit:        15296 ns, 1000 iterations
for_each_set_bit_from:   15225 ns, 1000 iterations

Start testing for_each_bit() with cache flushing
for_each_set_bit:       547626 ns, 1000 iterations
for_each_set_bit_from:  497899 ns, 1000 iterations
See:
https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg356151.html
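The output above presumably comes from an in-kernel test module; a rough userspace approximation of that kind of timing loop (illustrative only, not the kernel's lib/find_bit_benchmark.c) could look like:

/*
 * Rough userspace approximation of the timing loop, for illustration
 * only.  my_find_next_bit() is a simplified stand-in for the kernel
 * helper; the macro mirrors the proposed for_each_set_bit().
 */
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)
#define NBITS           64000           /* arbitrary test size */

static unsigned long bitmap[NBITS / BITS_PER_LONG];

static unsigned long my_find_next_bit(const unsigned long *addr,
                                      unsigned long size,
                                      unsigned long offset)
{
        unsigned long bit;

        for (bit = offset; bit < size; bit++)
                if (addr[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG)))
                        return bit;
        return size;
}

#define for_each_set_bit(bit, addr, size)                       \
        for ((bit) = my_find_next_bit((addr), (size), 0);       \
             (bit) < (size);                                    \
             (bit) = my_find_next_bit((addr), (size), (bit) + 1))

int main(void)
{
        struct timespec t0, t1;
        unsigned long bit;
        volatile unsigned long sum = 0;

        memset(bitmap, 0xaa, sizeof(bitmap));   /* every other bit set */

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for_each_set_bit(bit, bitmap, NBITS)
                sum += bit;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        printf("for_each_set_bit: %ld ns\n",
               (t1.tv_sec - t0.tv_sec) * 1000000000L +
               (t1.tv_nsec - t0.tv_nsec));
        return 0;
}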
Thanks, Yury
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -280,7 +280,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 #endif
 
 #define for_each_set_bit(bit, addr, size) \
-	for ((bit) = find_first_bit((addr), (size));		\
+	for ((bit) = find_next_bit((addr), (size), 0);		\
	     (bit) < (size);					\
	     (bit) = find_next_bit((addr), (size), (bit) + 1))
It is not a big deal. I just think that the original code is slightly more self-explanatory.
Best Regards, Petr