On Mon, Mar 10, 2025 at 3:59 PM Akihiko Odaki akihiko.odaki@daynix.com wrote:
On 2025/03/10 12:55, Jason Wang wrote:
On Fri, Mar 7, 2025 at 7:01 PM Akihiko Odaki akihiko.odaki@daynix.com wrote:
Hash reporting
Allow the guest to reuse the hash value to make receive steering consistent between the host and guest, and to save hash computation.
RSS
RSS is a receive steering algorithm that can be negotiated to use with virtio_net. Conventionally the hash calculation was done by the VMM. However, computing the hash after the queue was chosen defeats the purpose of RSS.
Another approach is to use eBPF steering program. This approach has another downside: it cannot report the calculated hash due to the restrictive nature of eBPF steering program.
Introduce the code to perform RSS to the kernel in order to overcome thse challenges. An alternative solution is to extend the eBPF steering program so that it will be able to report to the userspace, but I didn't opt for it because extending the current mechanism of eBPF steering program as is because it relies on legacy context rewriting, and introducing kfunc-based eBPF will result in non-UAPI dependency while the other relevant virtualization APIs such as KVM and vhost_net are UAPIs.
Signed-off-by: Akihiko Odaki akihiko.odaki@daynix.com Tested-by: Lei Yang leiyang@redhat.com
Documentation/networking/tuntap.rst | 7 ++ drivers/net/Kconfig | 1 + drivers/net/tap.c | 68 ++++++++++++++- drivers/net/tun.c | 98 +++++++++++++++++----- drivers/net/tun_vnet.h | 159 ++++++++++++++++++++++++++++++++++-- include/linux/if_tap.h | 2 + include/linux/skbuff.h | 3 + include/uapi/linux/if_tun.h | 75 +++++++++++++++++ net/core/skbuff.c | 4 + 9 files changed, 386 insertions(+), 31 deletions(-)
diff --git a/Documentation/networking/tuntap.rst b/Documentation/networking/tuntap.rst index 4d7087f727be5e37dfbf5066a9e9c872cc98898d..86b4ae8caa8ad062c1e558920be42ce0d4217465 100644 --- a/Documentation/networking/tuntap.rst +++ b/Documentation/networking/tuntap.rst @@ -206,6 +206,13 @@ enable is true we enable it, otherwise we disable it:: return ioctl(fd, TUNSETQUEUE, (void *)&ifr); }
[...]
diff --git a/drivers/net/tun_vnet.h b/drivers/net/tun_vnet.h index 58b9ac7a5fc4084c789fe94fe36b5f8631bf1fa4..8e7d51fb0b4742cef56e7c5ad778b156cc654bed 100644 --- a/drivers/net/tun_vnet.h +++ b/drivers/net/tun_vnet.h @@ -6,6 +6,16 @@ #define TUN_VNET_LE 0x80000000 #define TUN_VNET_BE 0x40000000
+typedef struct virtio_net_hash *(*tun_vnet_hash_add)(struct sk_buff *); +typedef const struct virtio_net_hash *(*tun_vnet_hash_find)(const struct sk_buff *);
+struct tun_vnet_hash_container {
struct tun_vnet_hash common;
I'd rename this as hash.
Everything in this structure is about hash. "common" represents its feature well.
I see a few alternative options though I don't prefer them either; they make the code verbose and I don't think they are worthwhile:
- Rename tun_vnet_hash to tun_vnet_hash_common.
- Prefix the other fields with "hash_" for consistency.
Or use different structures, one for hash_report another is for rss.
struct tun_vnet_hash_rss rss;
u32 rss_key[VIRTIO_NET_RSS_MAX_KEY_SIZE];
u16 rss_indirection_table[];
+};
Besides the separated ioctl, I'd split this structure into rss and hash part as well.
Like this.
Thanks