On 12/22/23 13:01, Pablo Neira Ayuso wrote:
On Mon, Nov 27, 2023 at 11:49:16AM +0000, Felix Huettner wrote:
conntrack zones are heavily used by tools like openvswitch to run multiple virtual "routers" on a single machine. In this context each conntrack zone matches to a single router, thereby preventing overlapping IPs from becoming issues. In these systems it is common to operate on all conntrack entries of a given zone, e.g. to delete them when a router is deleted. Previously this required these tools to dump the full conntrack table and filter out the relevant entries in userspace potentially causing performance issues.
To do this we reuse the existing CTA_ZONE attribute. This was previous parsed but not used during dump and flush requests. Now if CTA_ZONE is set we filter these operations based on the provided zone. However this means that users that previously passed CTA_ZONE will experience a difference in functionality.
Alternatively CTA_FILTER could have been used for the same functionality. However it is not yet supported during flush requests and is only available when using AF_INET or AF_INET6.
For the record, this is applied to nf-next.
Hi, Felix and Pablo.
I was looking through the code and the following part is bothering me:
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index fb0ae15e96df..4e9133f61251 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1148,6 +1149,10 @@ static int ctnetlink_filter_match(struct nf_conn *ct, void *data) if (filter->family && nf_ct_l3num(ct) != filter->family) goto ignore_entry;
+ if (filter->zone.id != NF_CT_DEFAULT_ZONE_ID && + !nf_ct_zone_equal_any(ct, &filter->zone)) + goto ignore_entry; + if (filter->orig_flags) { tuple = nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL); if (!ctnetlink_filter_match_tuple(&filter->orig, tuple,
If I'm reading that right, the default zone is always flushed, even if the user requested to flush a different zone. I.e. the entry is never ignored for a default zone. Is that correct or am I reading that wrong?
If my observation is correct, then I don't think this functionality can actually be used by applications as it does something unexpected.
Best regards, Ilya Maximets.