On Thu, Aug 16, 2018 at 05:24:09PM +0200, Greg KH wrote:
On Thu, Aug 16, 2018 at 02:33:56PM +0200, Michal Kubecek wrote:
Anyway, even at this rate, I only get ~10% of one core (Intel E5-2697).
What I can see, though, is that with the current stable 4.4 code and a modified testcase which sends something like
2:3, 3:4, ..., 3001:3002, 3003:3004, 3004:3005, ... 6001:6002, ...
I quickly eat 6 MB of memory for the receive queue of one socket, while earlier 4.4 kernels only take 200-300 KB. I haven't tested the latest 4.4 with Takashi's follow-up yet, but I'm pretty sure it will help while preserving nice performance when using the original segmentsmack testcase (with increased packet ratio).
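For illustration only, the segment pattern described above could be generated roughly as follows. This is a sketch, not the actual modified testcase: the function name, the run length of 3000 and the one-byte hole between runs are assumptions read off the listed ranges, and the pairs are relative start:end sequence numbers.

```python
def gapped_segments(first_seq=2, run_len=3000, runs=2):
    """Yield (start, end) pairs for one-byte segments.

    Segments come in contiguous runs of run_len bytes, and each run is
    followed by a one-byte hole (assumed layout, see the text above).
    """
    seq = first_seq
    for _ in range(runs):
        for _ in range(run_len):
            yield (seq, seq + 1)
            seq += 1
        seq += 1  # skip one byte so the next run is not adjacent


segs = list(gapped_segments())
print(segs[:2], segs[2999], segs[3000])
```

With the defaults, byte 3002 is never sent, so the first run ends at 3001:3002 and the second one starts at 3003:3004.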
Ok, for now I've applied Takashi's fix to the 4.4 stable queue and will push out a new 4.4-rc later tonight. Can everyone standardize on that and test and let me know if it does, or does not, fix the reported issues?
I did repeat the tests with Takashi's fix and the CPU utilization is similar to what we have now, i.e. 3-5% with 10K pkt/s. I could still saturate one CPU somewhere around 50K pkt/s but that already requires 2.75 MB/s (22 Mb/s) of throughput. (My previous tests with Mao Wenan's changes in fact used lower speeds as the change from 128 to 1024 would need to be done in two places.)
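As a quick sanity check of those numbers: 2.75 MB/s at 50K pkt/s works out to 55 bytes per packet. That per-packet size is inferred from the two figures and is not stated anywhere above, and which headers it counts is likewise an assumption. A minimal sketch of the arithmetic:

```python
def throughput(pkts_per_s, bytes_per_pkt):
    """Return (MB/s, Mb/s) for a given packet rate and packet size."""
    bytes_per_s = pkts_per_s * bytes_per_pkt
    return bytes_per_s / 1e6, bytes_per_s * 8 / 1e6


# 55 bytes per packet is an inferred value (2.75 MB/s / 50K pkt/s).
mbytes, mbits = throughput(50_000, 55)
print(f"{mbytes:.2f} MB/s, {mbits:.1f} Mb/s")
```

Under the same assumption, the 10K pkt/s case corresponds to about 0.55 MB/s (4.4 Mb/s).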
Where Takashi's patch does help is that it does not prevent collapsing of ranges of adjacent segments with total length shorter than ~4KB. This took more time to verify: it cannot be checked by watching the socket memory consumption with ss, as tcp_collapse_ofo_queue() isn't called until we reach the limits. So I needed to trace when and how tcp_collapse() is called, both with the current stable 4.4 code and with Takashi's fix applied.
If not, we can go from there and evaluate this much larger patch series. But let's try the simple thing first.
At high packet rates (say 30K pkt/s and more), we can still saturate the CPU. This is also mentioned in the announcement, with the claim that a switch to an rbtree-based queue would be necessary to fully address it. My tests seem to confirm that, but I'm still not sure it is worth backporting something that intrusive into stable 4.4.
Michal Kubecek