On 2020/1/14 8:34 PM, Nix wrote:
On 6 Jan 2020, Eric Wheeler spake thusly:
On Sat, 4 Jan 2020, Coly Li wrote:
In 2007, high-performance SSDs were still expensive, so in order to save more space for real workload data or metadata, readahead I/Os for non-metadata were bypassed and not cached on the SSD.
It's also because readahead data is more likely to be useless.
Nowadays, SSD prices have dropped a lot and people can find larger SSDs at more comfortable prices. It is unnecessary to bypass normal readahead I/Os to save SSD space any more.
Hi Nix,
Doesn't this reduce the utility of the cache by polluting it with unnecessary content? It seems to me that we need at least a *little* evidence that this change is beneficial. (I mean, it might be beneficial if, on average, the data that was read ahead is actually used.)
What happens to the cache hit rates when this change has been running for a while?
I have received two reports offline and directly to me: one from a github email address, forwarded to me by Jens, and one from a local storage startup in China.
The first report complains that a desktop-PC benchmark dropped by about 50%, and the root cause was located at commit b41c9b0 ("bcache: update bio->bi_opf bypass/writeback REQ_ flag hints").
The second report complains that their small-file workload (mixed read and write) has around a 20%+ performance drop, and the suspicious change also centers on the readahead restriction.
The second reporter verified this patch and confirms the performance issue is gone. I don't know who the first reporter is, so there has been no response from them so far.
I don't have an exact hit-rate number because the reporters did not provide one (BTW, because readahead requests are bypassed, I suspect the hit rate won't count them anyway). But from the two reports and one verification, IMHO this change makes sense.
Thanks.