On 8/9/24 15:10, Mina Almasry wrote:
On Thu, Aug 8, 2024 at 10:24 PM Jakub Kicinski <kuba@kernel.org> wrote:
On Thu, 8 Aug 2024 16:36:24 -0400 Mina Almasry wrote:
How do you know that the driver:
- supports net_iov at all (let's not make implicit assumptions based on presence of queue API);
- supports net_iov in the current configuration (e.g. header-data split is enabled);
- supports net_iov for _this_ pool (all drivers must have separate buffer pools for headers and data for this to work, some will use page pool for both).
What comes to mind is adding an "I can gobble up net_iovs from this pool" flag in page pool params (the struct that comes from the driver),
This already sorta exists in the current iteration, although maybe in an implicit way. As written, drivers need to set params.queue, otherwise core will not attempt to grab the mp information from params.queue. A driver can set params.queue for its data pages pool and not set it for the headers pool. AFAICT that deals with all 3 issues you present above.
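For illustration only, the implicit opt-in described above can be modeled in plain userspace C. This is a sketch, not the kernel's code: `rx_queue_model`, `pool_params_model`, `mp_priv` and `pool_pick_provider()` are hypothetical stand-ins, and only the idea that core consults the memory provider when the driver has set params.queue comes from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's definitions. */
struct rx_queue_model {
    void *mp_priv;                 /* provider state bound to this queue, or NULL */
};

struct pool_params_model {
    struct rx_queue_model *queue;  /* stands in for params.queue; NULL if unset */
};

/*
 * Core looks at the queue's memory provider only when the driver filled in
 * params.queue. A pool created without it (e.g. a headers pool) never sees
 * the provider.
 */
static void *pool_pick_provider(const struct pool_params_model *params)
{
    assert(params != NULL);

    if (params->queue == NULL)
        return NULL;               /* driver did not opt this pool in */

    return params->queue->mp_priv;
}
```

In this model a driver would pass a non-NULL `queue` for its data pool and leave it NULL for its headers pool, which is the per-pool distinction the third question asks about.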
The awkward part is if params.queue starts getting used for purposes other than passing mp configuration, but as of today that's not the case, so I didn't add the secondary flag. If you want a second flag to be added preemptively, I can do that, no problem. Can you confirm params.queue is not good enough?
I'd prefer a flag. Setting queue in a param struct is not a good API for conveying that the page pool is for netmem payloads only.
and then on the installation path we can check if after queue reset the refcount of the binding has increased. If it did - driver has created a pool as we expected, otherwise - fail, something must be off. Maybe that's a bit hacky?
What's missing is for core to check at binding time that the driver supports net_iov. I had relied on the implicit presence of the queue API.
What you're proposing works, but AFAICT it's quite hacky, yes. I basically need to ASSERT_RTNL in net_devmem_binding_get() to ensure nothing can increment the refcount while the binding is happening so that the refcount check is valid.
True. Shooting from the hip, but we could walk the page pools of the netdev and find the one that has the right mp installed, and matches queue? The page pools are on a list hooked up to the netdev, trivial to walk.
I think this is good, and it doesn't seem hacky to me, because we can check the page_pools of the netdev while we hold rtnl, so we can be sure nothing is messing with the pp configuration in the meantime. Like you say below it does validate the driver rather than rely on the driver saying it's doing the right thing. I'll look into putting this in the next version.
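A rough userspace sketch of the walk discussed above, assuming a simple singly linked list of pools per device. The types `pool_model` and `netdev_walk_model`, the `rtnl_held` field and the function `binding_pool_exists()` are hypothetical names made up for this example; the real list type, field names and locking helpers in the kernel may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names and layout are assumptions for illustration. */
struct pool_model {
    const void *mp_priv;        /* provider state installed in this pool, or NULL */
    unsigned int queue_idx;     /* RX queue the pool was created for */
    struct pool_model *next;    /* next pool on the device's list */
};

struct netdev_walk_model {
    struct pool_model *page_pools;  /* head of the per-device pool list */
    bool rtnl_held;                 /* stands in for "caller holds rtnl" */
};

/*
 * After the queue restart, check that the driver really created a pool for
 * this queue with the expected provider installed. The caller is assumed to
 * hold rtnl so the pool list cannot change during the walk.
 */
static bool binding_pool_exists(const struct netdev_walk_model *dev,
                                const void *binding, unsigned int queue_idx)
{
    const struct pool_model *p;

    assert(dev != NULL && dev->rtnl_held);

    for (p = dev->page_pools; p != NULL; p = p->next) {
        if (p->mp_priv == binding && p->queue_idx == queue_idx)
            return true;
    }
    return false;               /* driver ignored the provider: fail the bind */
}
```

The point of this shape is that core verifies what the driver actually did, rather than trusting a declaration from the driver.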
Why not have the driver set a flag advertising whether it supports providers, which would be checked, for instance, in netdev_rx_queue_restart()? If set, the driver should do the right thing. That's in addition to a new pp_params flag explicitly telling whether the pp should use providers. It's more explicit and feels a little less hacky.
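To make the two-flag idea concrete, here is a minimal userspace sketch. Everything in it is hypothetical: `DRV_CAP_MEM_PROVIDER_MODEL`, `PP_FLAG_USE_PROVIDER_MODEL`, `drv_model`, `pp_params_model` and both functions are invented for this example and are not existing kernel symbols; only the split between a device-level capability and a per-pool flag comes from the suggestion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, named for this sketch only. */
#define DRV_CAP_MEM_PROVIDER_MODEL  (1u << 0)  /* device: "I support providers" */
#define PP_FLAG_USE_PROVIDER_MODEL  (1u << 0)  /* pool: "use the provider here" */

struct drv_model {
    unsigned int caps;          /* capabilities advertised by the driver */
};

struct pp_params_model {
    unsigned int flags;         /* per-pool flags passed by the driver */
};

/*
 * The kind of check a queue-restart path could make before binding:
 * refuse if a provider is bound but the driver never advertised support.
 */
static int restart_check_provider(const struct drv_model *dev,
                                  bool queue_has_provider)
{
    assert(dev != NULL);

    if (queue_has_provider && !(dev->caps & DRV_CAP_MEM_PROVIDER_MODEL))
        return -EOPNOTSUPP;
    return 0;
}

/* A pool takes memory from the provider only if it explicitly asked to. */
static bool pool_uses_provider(const struct pp_params_model *params,
                               bool queue_has_provider)
{
    assert(params != NULL);

    return queue_has_provider &&
           (params->flags & PP_FLAG_USE_PROVIDER_MODEL) != 0;
}
```

With this split, a headers pool simply leaves the per-pool flag clear, and a driver without provider support is rejected up front instead of being detected after the fact.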