Marc Kleine-Budde <mkl@pengutronix.de> writes:
On 31.03.2025 09:25:27, Axel Forsman wrote:
The functions kvaser_pciefd_start_xmit() and kvaser_pciefd_handle_ack_packet() raced to stop/wake TX queues and get/put echo skbs, as kvaser_pciefd_can->echo_lock was only ever taken when transmitting. E.g., this caused the following error:
can_put_echo_skb: BUG! echo_skb 5 is occupied!
Instead, use the synchronization helpers in netdev_queues.h. As those piggyback on BQL barriers, start updating in-flight packets and bytes counts as well.
This looks like it goes in the right direction. Using the netif_subqueue_completed helpers is a great idea.
What usually works even better is to have 2 counters and a mask:
- unsigned int tx_head, tx_tail
- TXFIFO_DEPTH
The tx_head is incremented in the xmit function, tail is incremented in the tx_done function.
There's no need to check how many buffers are free in the HW.
Have a look at the rockchip-canfd driver for an example.
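Roughly like this (just a sketch with made-up names, see the rockchip driver for the real thing):

	struct foo_priv {
		unsigned int tx_head;	/* bumped in ndo_start_xmit() */
		unsigned int tx_tail;	/* bumped in the TX-done/ACK handler */
	};

	static unsigned int foo_tx_free(const struct foo_priv *priv)
	{
		/* Free HW TX buffers, no register read needed. */
		return FOO_TXFIFO_DEPTH - (priv->tx_head - priv->tx_tail);
	}

	/* Buffer/echo-skb slot index, if the depth is a power of 2:
	 * priv->tx_head & (FOO_TXFIFO_DEPTH - 1)
	 */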
An attempt was made to keep this a bugfix-only commit, but I do agree it is nicer to maintain an in-flight counter instead of querying the device.
(A mask is inapplicable, as the size of the device TX FIFO is not necessarily a power-of-2, though the packets have sequence numbers so it does not matter.)
I guess a write barrier would be needed before tx_tail is incremented, so that the xmit function cannot observe the new tail before __can_get_echo_skb() has cleared the echo skb slot? (Otherwise, can_put_echo_skb() errors out if the slot is still non-NULL.) It is not obvious to me how rockchip-canfd handles this?
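Concretely, with bare counters I would have expected something along these lines (only a sketch; priv, ndev, seq and FOO_TXFIFO_DEPTH are made-up names):

	/* TX-done/ACK path */
	static void foo_tx_done_one(struct foo_priv *priv, unsigned int seq)
	{
		unsigned int frame_len;

		can_get_echo_skb(priv->ndev, seq, &frame_len);	/* clears the echo skb slot */
		/* Make the cleared slot visible before the new tail value. */
		smp_store_release(&priv->tx_tail, priv->tx_tail + 1);
	}

	/* ndo_start_xmit() path */
	if (priv->tx_head - smp_load_acquire(&priv->tx_tail) >= FOO_TXFIFO_DEPTH)
		return NETDEV_TX_BUSY;	/* full; no slot released yet */
	can_put_echo_skb(skb, ndev, priv->tx_head % FOO_TXFIFO_DEPTH,
			 can_skb_get_frame_len(skb));
	priv->tx_head++;

The netdev_queues.h helpers piggyback on the BQL barriers for this ordering, which is part of why the patch uses them.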
+	/*
+	 * Without room for a new message, stop the queue until at least
+	 * one successful transmit.
+	 */
+	if (!netif_subqueue_maybe_stop(netdev, 0, kvaser_pciefd_tx_avail(can), 1, 1))
+		return NETDEV_TX_BUSY;
Returning NETDEV_TX_BUSY is quite expensive; stop the queue at the end of this function if the buffers are full.
Will do.
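I.e. drop the early check and only stop at the tail of kvaser_pciefd_start_xmit(), something like (untested sketch):

	/* After the packet has been handed to the hardware: stop the
	 * queue if no buffer is left for another message.
	 * netif_subqueue_maybe_stop() re-checks after stopping, so a
	 * racing completion restarts the queue and NETDEV_TX_BUSY is
	 * never needed.
	 */
	netif_subqueue_maybe_stop(netdev, 0, kvaser_pciefd_tx_avail(can), 1, 1);

	return NETDEV_TX_OK;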
@@ -1638,11 +1650,25 @@ static int kvaser_pciefd_read_buffer(struct kvaser_pciefd *pcie, int dma_buf)
 {
 	int pos = 0;
 	int res = 0;
+	unsigned int i;

 	do {
 		res = kvaser_pciefd_read_packet(pcie, &pos, dma_buf);
 	} while (!res && pos > 0 && pos < KVASER_PCIEFD_DMA_SIZE);

+	for (i = 0; i < pcie->nr_channels; ++i) {
+		struct kvaser_pciefd_can *can = pcie->can[i];
+
+		if (!can->completed_tx_pkts)
+			continue;
+
+		netif_subqueue_completed_wake(can->can.dev, 0,
+					      can->completed_tx_pkts,
+					      can->completed_tx_bytes,
+					      kvaser_pciefd_tx_avail(can), 1);
You can do this as soon as one packet is finished; if you want to avoid too frequent wakeups, use a threshold of more than 1.
I did that initially, but changed it to follow the advice of the netdev_tx_completed_queue() docstring. Since the RX buffer for a single IRQ may have multiple packets, would not BQL see incorrect intervals in that case?
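For reference, the ACK side only accumulates, and the BQL report plus wakeup happens once per processed DMA buffer; roughly (sketch, not the literal patch; seq and stats are placeholders):

	/* In kvaser_pciefd_handle_ack_packet(): */
	struct net_device_stats *stats = &can->can.dev->stats;
	unsigned int frame_len = 0;

	stats->tx_bytes += can_get_echo_skb(can->can.dev, seq, &frame_len);
	stats->tx_packets++;
	can->completed_tx_pkts++;
	can->completed_tx_bytes += frame_len;

That way BQL sees one completion interval per IRQ rather than one per packet.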
Thanks for the review.

Axel Forsman