From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 93a8417088ea570b5721d2b526337a2d3aed9fa3 ]
Given the following order of operations:
(1) we add filter A using tc-flower (2) we send a packet that matches it (3) we read the filter's statistics to find a hit count of 1 (4) we add a second filter B with a higher preference than A, and A moves one position to the right to make room in the TCAM for it (5) we send another packet, and this matches the second filter B (6) we read the filter statistics again.
When this happens, the hit count of filter A is 2 and of filter B is 1, despite a single packet having matched each filter.
Furthermore, in an alternate history, reading the filter stats a second time between steps (3) and (4) makes the hit count of filter A remain at 1 after step (6), as expected.
The reason why this happens has to do with the filter->stats.pkts field, which is written to hardware through the call path below:
vcap_entry_set / | \ / | \ / | \ / | \ es0_entry_set is1_entry_set is2_entry_set \ | / \ | / \ | / vcap_data_set(data.counter, ...)
The primary role of filter->stats.pkts is to transport the filter hit counters from the last readout all the way from vcap_entry_get() -> ocelot_vcap_filter_stats_update() -> ocelot_cls_flower_stats(). The reason why vcap_entry_set() writes it to hardware is so that the counters (saturating and having a limited bit width) are cleared after each user space readout.
The writing of filter->stats.pkts to hardware during the TCAM entry movement procedure is an unintentional consequence of the code design, because the hit count isn't up to date at this point.
So at step (4), when filter A is moved by ocelot_vcap_filter_add() to make room for filter B, the hardware hit count is 0 (no packet matched on it in the meantime), but filter->stats.pkts is 1, because the last readout saw the earlier packet. The movement procedure programs the old hit count back to hardware, so this creates the impression to user space that more packets have been matched than they really were.
The bug can be seen when running the gact_drop_and_ok_test() from the tc_actions.sh selftest.
Fix the issue by reading back the hit count to tmp->stats.pkts before migrating the VCAP filter. Sure, this is a best-effort technique, since the packets that hit the rule between vcap_entry_get() and vcap_entry_set() won't be counted, but at least it allows the counters to be reliably used for selftests where the traffic is under control.
The vcap_entry_get() name is a bit unintuitive, but it only reads back the counter portion of the TCAM entry, not the entire entry.
The index from which we retrieve the counter is also a bit unintuitive (i - 1 during add, i + 1 during del), but this is the way in which TCAM entry movement works. The "entry index" isn't a stored integer for a TCAM filter, instead it is dynamically computed by ocelot_vcap_block_get_filter_index() based on the entry's position in the &block->rules list. That position (as well as block->count) is automatically updated by ocelot_vcap_filter_add_to_block() on add, and by ocelot_vcap_block_remove_filter() on del. So "i" is the new filter index, and "i - 1" or "i + 1" respectively are the old addresses of that TCAM entry (we only support installing/deleting one filter at a time).
Fixes: b596229448dd ("net: mscc: ocelot: Add support for tcam") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot_vcap.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/mscc/ocelot_vcap.c b/drivers/net/ethernet/mscc/ocelot_vcap.c index 709b304fde67..118572590607 100644 --- a/drivers/net/ethernet/mscc/ocelot_vcap.c +++ b/drivers/net/ethernet/mscc/ocelot_vcap.c @@ -1142,6 +1142,8 @@ int ocelot_vcap_filter_add(struct ocelot *ocelot, struct ocelot_vcap_filter *tmp;
tmp = ocelot_vcap_block_find_filter_by_index(block, i); + /* Read back the filter's counters before moving it */ + vcap_entry_get(ocelot, i - 1, tmp); vcap_entry_set(ocelot, i, tmp); }
@@ -1199,6 +1201,8 @@ int ocelot_vcap_filter_del(struct ocelot *ocelot, struct ocelot_vcap_filter *tmp;
tmp = ocelot_vcap_block_find_filter_by_index(block, i); + /* Read back the filter's counters before moving it */ + vcap_entry_get(ocelot, i + 1, tmp); vcap_entry_set(ocelot, i, tmp); }