On Sun, Oct 8, 2023 at 7:22 AM Akihiko Odaki akihiko.odaki@daynix.com wrote:
virtio-net have two usage of hashes: one is RSS and another is hash reporting. Conventionally the hash calculation was done by the VMM. However, computing the hash after the queue was chosen defeats the purpose of RSS.
Another approach is to use eBPF steering program. This approach has another downside: it cannot report the calculated hash due to the restrictive nature of eBPF.
Introduce the code to compute hashes to the kernel in order to overcome thse challenges. An alternative solution is to extend the eBPF steering program so that it will be able to report to the userspace, but it makes little sense to allow to implement different hashing algorithms with eBPF since the hash value reported by virtio-net is strictly defined by the specification.
The hash value already stored in sk_buff is not used and computed independently since it may have been computed in a way not conformant with the specification.
Signed-off-by: Akihiko Odaki akihiko.odaki@daynix.com
+static const struct tun_vnet_hash_cap tun_vnet_hash_cap = {
.max_indirection_table_length =
TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH,
.types = VIRTIO_NET_SUPPORTED_HASH_TYPES
+};
No need to have explicit capabilities exchange like this? Tun either supports all or none.
case TUNSETSTEERINGEBPF:
ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
bpf_ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
if (IS_ERR(bpf_ret))
ret = PTR_ERR(bpf_ret);
else if (bpf_ret)
tun->vnet_hash.flags &= ~TUN_VNET_HASH_RSS;
Don't make one feature disable another.
TUNSETSTEERINGEBPF and TUNSETVNETHASH are mutually exclusive functions. If one is enabled the other call should fail, with EBUSY for instance.
case TUNSETVNETHASH:
len = sizeof(vnet_hash);
if (copy_from_user(&vnet_hash, argp, len)) {
ret = -EFAULT;
break;
}
if (((vnet_hash.flags & TUN_VNET_HASH_REPORT) &&
(tun->vnet_hdr_sz < sizeof(struct virtio_net_hdr_v1_hash) ||
!tun_is_little_endian(tun))) ||
vnet_hash.indirection_table_mask >=
TUN_VNET_HASH_MAX_INDIRECTION_TABLE_LENGTH) {
ret = -EINVAL;
break;
}
argp = (u8 __user *)argp + len;
len = (vnet_hash.indirection_table_mask + 1) * 2;
if (copy_from_user(vnet_hash_indirection_table, argp, len)) {
ret = -EFAULT;
break;
}
argp = (u8 __user *)argp + len;
len = virtio_net_hash_key_length(vnet_hash.types);
if (copy_from_user(vnet_hash_key, argp, len)) {
ret = -EFAULT;
break;
}
Probably easier and less error-prone to define a fixed size control struct with the max indirection table size.
Btw: please trim the CC: list considerably on future patches.