On Thu, Feb 13, 2025 at 5:17 AM Pavel Begunkov asml.silence@gmail.com wrote:
On 2/12/25 19:18, Mina Almasry wrote:
On Wed, Feb 12, 2025 at 7:52 AM Pavel Begunkov asml.silence@gmail.com wrote:
On 2/10/25 21:09, Mina Almasry wrote:
On Wed, Feb 5, 2025 at 4:20 AM Pavel Begunkov asml.silence@gmail.com wrote:
On 2/3/25 22:39, Mina Almasry wrote: ...
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index bb2b751d274a..3ff8f568c382 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1711,9 +1711,12 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
...
int zerocopy_fill_skb_from_iter(struct sk_buff *skb, struct iov_iter *from, size_t length);
@@ -1721,12 +1724,14 @@ int zerocopy_fill_skb_from_iter(struct sk_buff *skb, static inline int skb_zerocopy_iter_dgram(struct sk_buff *skb, struct msghdr *msg, int len) {
return __zerocopy_sg_from_iter(msg, skb->sk, skb, &msg->msg_iter, len);
return __zerocopy_sg_from_iter(msg, skb->sk, skb, &msg->msg_iter, len,
NULL);
Instead of propagating it all the way down and carving a new path, why not reuse the existing infra? You already hook into where ubuf is allocated, you can stash the binding in there. And
It looks like it's not possible to increase the side of ubuf_info at all, otherwise the BUILD_BUG_ON in msg_zerocopy_alloc() fires.
It's asserting that sizeof(ubuf_info_msgzc) <= sizeof(skb->cb), and I'm guessing increasing skb->cb size is not really the way to go.
What I may be able to do here is stash the binding somewhere in ubuf_info_msgzc via union with fields we don't need for devmem, and/or
It doesn't need to account the memory against the user, and you actually don't want that because dmabuf should take care of that. So, it should be fine to reuse ->mmp.
It's also not a real sk_buff, so maybe maintainers wouldn't mind reusing some more space out of it, if that would even be needed.
netmem skb are real sk_buff, with the modification that frags are not
We were discussing ubuf_info allocation, take a look at msg_zerocopy_alloc(), it has nothing to do with netmems and all that.
Yes. My response was regarding the suggestion that we can use space in devmem skbs however we want though.
readable, only in the case that the netmem is unreadable. I would not approve of considering netmem/devmem skbs "not real skbs", and start messing with the semantics of skb fields for devmem skbs, and having to start adding skb_is_devmem() checks through all code in the skb handlers that touch the fields being overwritten in the devmem case. No, I don't think we can re-use random fields in the skb for devmem.
stashing the binding in ubuf_info_ops (very hacky). Neither approach seems ideal, but the former may work and may be cleaner.
I'll take a deeper look here. I had looked before and concluded that we're piggybacking devmem TX on MSG_ZEROCOPY path, because we need almost all of the functionality there (no copying, send complete notifications, etc), with one minor change in the skb filling. I had concluded that if MSG_ZEROCOPY was never updated to use the existing infra, then it's appropriate for devmem TX piggybacking on top of it
MSG_ZEROCOPY does use the common infra, i.e. passing ubuf_info, but doesn't need ->sg_from_iter as zerocopy_fill_skb_from_iter() and it's what was there first.
But MSG_ZEROCOPY doesn't set msg->msg_ubuf. And not setting msg->msg_ubuf fails to trigger msg->sg_from_iter altogether.
And also currently sg_from_iter isn't set up to take in a ubuf_info. We'd need that if we stash the binding in the ubuf_info.
https://github.com/isilence/linux.git sg-iter-ops
I have old patches for all of that, they even rebased cleanly. That should do it for you, and I need to send then regardless of devmem.
These patches help a bit, but do not make any meaningful dent in addressing the concern I have in the earlier emails.
The concern is that we're piggybacking devmem TX on MSG_ZEROCOPY, and currently the MSG_ZEROCOPY code carefully avoids any code paths setting msg->[sg_from_iter|msg_ubuf].
If we want devmem to reuse both the MSG_ZEROCOPY mechanisms and the msg->[sg_from_iter|ubuf_info] mechanism, I have to dissect the MSG_ZEROCOPY code carefully so that it works with and without setting msg->[ubuf_info|msg->sg_from_iter]. Having gone through this rabbit hole so far I see that it complicates the implementation and adds more checks to the fast MSG_ZEROCOPY paths.
The complication could be worth it if there was some upside, but I don't see one tbh. Passing the binding down to zerocopy_fill_skb_from_devmem seems like a better approach to my eye so far
I'm afraid I'm going to table this for now. If there is overwhelming consensus that msg->sg_from_iter is the right approach here I will revisit, but it seems to me to complicate code without a significant upside.
-- Thanks, Mina