On 12/2/25 14:44, Paolo Abeni wrote:
On 12/1/25 12:35 AM, Pavel Begunkov wrote:
Note: these are the net/ bits only and don't include the io_uring changes, which should be merged separately and are posted separately. The full branch for convenience is at [1], and the patch is here:
https://lore.kernel.org/io-uring/7486ab32e99be1f614b3ef8d0e9bc77015b173f7.17...
Many modern NICs support configurable receive buffer lengths, and zcrx and memory providers can use buffers larger than 4K/PAGE_SIZE on x86 to improve performance. When paired with hw-gro, larger rx buffer sizes can drastically reduce the number of buffers traversing the stack and save a lot of processing time. They also make it possible to give users larger contiguous chunks of data. The idea was first floated by Saeed during netdev conf 2024 and has since been asked about by a few folks.
Single stream benchmarks showed up to ~30% CPU util improvement. E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC:
packets=23987040 (MB=2745098), rps=199559 (MB/s=22837)
CPU    %usr   %nice    %sys %iowait    %irq   %soft   %idle
  0    1.53    0.00   27.78    2.72    1.31   66.45    0.22

packets=24078368 (MB=2755550), rps=200319 (MB/s=22924)
CPU    %usr   %nice    %sys %iowait    %irq   %soft   %idle
  0    0.69    0.00    8.26   31.65    1.83   57.00    0.57
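As a rough illustration of the numbers above, the sketch below counts how many buffers a given payload needs at 4K vs 32K, and computes one possible reading of the CPU figure (%sys + %soft of each row). The mail does not say how the ~30% was derived, so that reading is an assumption, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of rx buffers needed to hold `bytes` of payload, rounded up. */
static uint64_t buffers_needed(uint64_t bytes, uint64_t buf_size)
{
	assert(buf_size > 0);
	return (bytes + buf_size - 1) / buf_size;
}

/* Kernel-side CPU share of one row: %sys + %soft (assumed metric). */
static double kernel_cpu(double sys, double soft)
{
	return sys + soft;
}

/* Relative reduction going from `before` to `after`, as a fraction. */
static double relative_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before;
}
```

For 1 MiB of payload, buffers_needed() gives 256 buffers at 4096 bytes and 32 at 32768 bytes. Feeding the two rows into kernel_cpu() and then relative_reduction() gives a fraction of about 0.31 under this reading.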
This series adds net infrastructure for memory providers to configure the buffer size and implements it for bnxt. It's an opt-in feature for drivers: they should advertise support for the parameter in the qops and must check that the hardware supports the given size. It's limited to memory providers, as that drastically simplifies the implementation. It doesn't affect the fast path zcrx uAPI, and the size is defined in zcrx terms, which allows it to be flexible and adjusted in the future; see Patch 8 for details.
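To make the "driver must check the size" requirement concrete, here is a minimal userspace sketch of the kind of validation a driver might do. It is not the qops interface from this series and not bnxt's actual code: the function name, the 4K lower bound, and the power-of-two rule are all assumptions made for illustration.

```c
#include <errno.h>

#define HYP_MIN_RX_BUF_LEN 4096u /* assumed lower bound: 4K */

/*
 * Hypothetical check, not the kernel's or bnxt's real code.
 * Returns 0 if `len` is acceptable, -EINVAL otherwise.
 * `hw_max_len` stands in for whatever limit the hardware reports.
 */
static int hyp_check_rx_buf_len(unsigned int len, unsigned int hw_max_len)
{
	/* Reject sizes outside the supported range. */
	if (len < HYP_MIN_RX_BUF_LEN || len > hw_max_len)
		return -EINVAL;

	/* Assumed constraint: only power-of-two sizes are accepted. */
	if (len & (len - 1))
		return -EINVAL;

	return 0;
}
```

With a hypothetical hardware limit of 65536, a request for 32768 passes, while 12288 (not a power of two) and 131072 (above the limit) are rejected.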
A liburing example can be found at [2]
Full branch:
[1] https://github.com/isilence/linux.git zcrx/large-buffers-v7

Liburing example:
[2] https://github.com/isilence/liburing.git zcrx/rx-buf-len
Dumb question, hoping someone could answer in a very short time...
Differently from previous revisions, this is not a PR, just a plain patch series - that in turn may cause duplicate commits when applied on different trees.
Is the above intentional? Why?
It was based on linus-rc* before and getting merged nice and clean, now there is a small conflict. In my view, it should either be a separate pull to Linus that depends on the net+io_uring trees if Jens would be willing to orchestrate that, or I'll just merge the leftover io_uring patch for-6.20. In either case, this set shouldn't get applied to any other tree directly.