From: Jarek Poplawski Date: Fri, 11 Jan 2008 05:21:20 +0000 (-0800) Subject: [AX25] af_ax25: Possible circular locking. X-Git-Tag: v2.6.24-rc8~7^2~9 X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=ecd2ebdea350c40e73c00d400d74c8a09c072082;p=linux-2.6 [AX25] af_ax25: Possible circular locking. Bernard Pidoux F6BVP reported: > When I killall kissattach I can see the following message. > > This happens on kernel 2.6.24-rc5 already patched with the 6 previously > patches I sent recently. > > > ======================================================= > [ INFO: possible circular locking dependency detected ] > 2.6.23.9 #1 > ------------------------------------------------------- > kissattach/2906 is trying to acquire lock: > (linkfail_lock){-+..}, at: [] ax25_link_failed+0x11/0x39 [ax25] > > but task is already holding lock: > (ax25_list_lock){-+..}, at: [] ax25_device_event+0x38/0x84 > [ax25] > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: ... lockdep is worried about the different order here: #1 (rose_neigh_list_lock){-+..}: #3 (ax25_list_lock){-+..}: #0 (linkfail_lock){-+..}: #1 (rose_neigh_list_lock){-+..}: #3 (ax25_list_lock){-+..}: #0 (linkfail_lock){-+..}: So, ax25_list_lock could be taken before and after linkfail_lock. I don't know if this three-thread clutch is very probable (or possible at all), but it seems another bug reported by Bernard ("[...] system impossible to reboot with linux-2.6.24-rc5") could have similar source - namely ax25_list_lock held by ax25_kill_by_device() during ax25_disconnect(). It looks like the only place which calls ax25_disconnect() this way, so I guess, it isn't necessary. This patch is breaking the lock for ax25_disconnect(). Reported-and-tested-by: Bernard Pidoux Signed-off-by: Jarek Poplawski Signed-off-by: David S. Miller --- diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index ecb14ee004..b4725ff317 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -87,10 +87,22 @@ static void ax25_kill_by_device(struct net_device *dev) return; spin_lock_bh(&ax25_list_lock); +again: ax25_for_each(s, node, &ax25_list) { if (s->ax25_dev == ax25_dev) { s->ax25_dev = NULL; + spin_unlock_bh(&ax25_list_lock); ax25_disconnect(s, ENETUNREACH); + spin_lock_bh(&ax25_list_lock); + + /* The entry could have been deleted from the + * list meanwhile and thus the next pointer is + * no longer valid. Play it safe and restart + * the scan. Forward progress is ensured + * because we set s->ax25_dev to NULL and we + * are never passed a NULL 'dev' argument. + */ + goto again; } } spin_unlock_bh(&ax25_list_lock);