From: Herbert Xu Date: Fri, 19 Oct 2007 05:37:58 +0000 (-0700) Subject: [NET]: Fix possible dev_deactivate race condition X-Git-Tag: v2.6.24-rc1~141^2 X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=ce0e32e65f70337e0732c97499b643205fa8ea31;p=linux-2.6 [NET]: Fix possible dev_deactivate race condition The function dev_deactivate is supposed to only return when all outstanding transmissions have completed. Unfortunately it is possible for store operations in the driver's transmit function to only become visible after dev_deactivate returns. This patch fixes this by taking the queue lock after we see the end of the queue run. This ensures that all effects of any previous transmit calls are visible. If however we detect that there is another queue run occuring, then we'll warn about it because this should never happen as we have pointed dev->qdisc to noop_qdisc within the same queue lock earlier in the functino. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller --- diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index e01d57692c..fa1a6f45dc 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -556,6 +556,7 @@ void dev_deactivate(struct net_device *dev) { struct Qdisc *qdisc; struct sk_buff *skb; + int running; spin_lock_bh(&dev->queue_lock); qdisc = dev->qdisc; @@ -571,12 +572,31 @@ void dev_deactivate(struct net_device *dev) dev_watchdog_down(dev); - /* Wait for outstanding dev_queue_xmit calls. */ + /* Wait for outstanding qdisc-less dev_queue_xmit calls. */ synchronize_rcu(); /* Wait for outstanding qdisc_run calls. */ - while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state)) - yield(); + do { + while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state)) + yield(); + + /* + * Double-check inside queue lock to ensure that all effects + * of the queue run are visible when we return. + */ + spin_lock_bh(&dev->queue_lock); + running = test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state); + spin_unlock_bh(&dev->queue_lock); + + /* + * The running flag should never be set at this point because + * we've already set dev->qdisc to noop_qdisc *inside* the same + * pair of spin locks. That is, if any qdisc_run starts after + * our initial test it should see the noop_qdisc and then + * clear the RUNNING bit before dropping the queue lock. So + * if it is set here then we've found a bug. + */ + } while (WARN_ON_ONCE(running)); } void dev_init_scheduler(struct net_device *dev)