Re: [PATCH] avoid race condition between start_xmit and cm_rep_handler

16 Aug 2018


      On Wed, Aug 15, 2018 at 08:37:22PM -0400, Aaron S. Knister wrote:
...
Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which
leaves a window where cm_rep_handler can run, set the connection up,
dequeue pending packets and leave the subsequently queued packets by
start_xmit() sitting on neigh->queue until they're dropped when the
connection is torn down. This only applies to connected mode. These
dropped packets can really upset TCP, for example,  and cause
multi-minute delays in transmission for open connections.
I've got a reproducer available if it's needed.
Here's the code in start_xmit where we check to see if the connection
is up:
   if (ipoib_cm_get(neigh)) {
           if (ipoib_cm_up(neigh)) {
                   ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                   goto unref;
           }
   }

Agree with Leon on the locking. Find a more elegant way to write
this.
...
The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv->lock to dequeue pending skb's) but before the below code snippet
in start_xmit where packets are queued.
   if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
           push_pseudo_header(skb, phdr->hwaddr);
           spin_lock_irqsave(&priv->lock, flags);
           __skb_queue_tail(&neigh->queue, skb);
           spin_unlock_irqrestore(&priv->lock, flags);
   } else {
           ++dev->stats.tx_dropped;
           dev_kfree_skb_any(skb);
   }

Somehow I think the spinlock should be held across the skb_queue_len as
well. Right?
I wonder if the 'neigh->ah' has the same racing problem and needs the
same fixing:
} else if (neigh->ah) {
    	neigh_refresh_path(neigh, phdr->hwaddr, dev);
    }
??
Have a feeling that needs a READ_ONCE to be correct...
Jason

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH] avoid race condition between start_xmit and cm_rep_handler