On Wed, Jan 30, 2019 at 12:30 PM Andreas Gruenbacher agruenba@redhat.com wrote:
This reverts commit 2d29f6b96d8f80322ed2dd895bca590491c38d34.
It turns out that the fix can lead to a ~20 percent performance regression in initial writes to the page cache according to iozone. Let's revert this for now to have more time for a proper fix.
Ugh. I think part of the problem is that the
n += (rbm->bii - initial_bii);
doesn't make any sense when we just set rbm->bii to 0 a couple of lines earlier. So it's basically a really odd way to write
n -= initial_bii;
which in turn really doesn't make any sense _either_.
So I'l all for reverting the commit because it caused a performance regression, but the end result really is all kinds of odd.
Is the bug as simple as "we incremented the iterator counter twice"? Because in the -E2BIG case, we first increment it by the difference in bii, but then we *also* increment it in res_covered_end_of_rgrp() (which we do *not* do for the "ret > 0" case, which goes to next_iter).
So if somebody can run the performance test again, how does it look if *instead* of the revert, we do what looks at least _slightly_ more sensible, and move the "increment iteration count" up to the next-bitmap case instead?
At that point, it will actually match the bii updates (because next_bitmap also updates rbm->bii). Hmm?
Something like the whitespace-damaged thinig below. Completely untested. But *if* this works, it would make more sense than the revert..
Hmm?
Linus
--- snip snip --- diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c index 831d7cb5a49c..5b1006d5344f 100644 --- a/fs/gfs2/rgrp.c +++ b/fs/gfs2/rgrp.c @@ -1796,10 +1796,10 @@ static int gfs2_rbm_find(struct gfs2_rbm *rbm, u8 state, u32 *minext, rbm->bii++; if (rbm->bii == rbm->rgd->rd_length) rbm->bii = 0; + n++; res_covered_end_of_rgrp: if ((rbm->bii == 0) && nowrap) break; - n++; next_iter: if (n >= iters) break;