From 74870172824a78640ec4f03058d9bd35dfa08618 Mon Sep 17 00:00:00 2001 From: Zhu Yi Date: Wed, 27 Aug 2008 14:33:00 +0800 Subject: [PATCH] lockdep: fix invalid list_del_rcu in zap_class The problem is found during iwlagn driver testing on v2.6.27-rc4-176-gb8e6c91 kernel, but it turns out to be a lockdep bug. In our testing, we frequently load and unload the iwlagn driver (>50 times). Then the MAX_STACK_TRACE_ENTRIES is reached (expected behaviour?). The error message with the call trace is as below. BUG: MAX_STACK_TRACE_ENTRIES too low! turning off the locking correctness validator. Pid: 4895, comm: iwlagn Not tainted 2.6.27-rc4 #13 Call Trace: [] save_stack_trace+0x22/0x3e [] save_trace+0x8b/0x91 [] mark_lock+0x1b0/0x8fa [] __lock_acquire+0x5b9/0x716 [] ieee80211_sta_work+0x0/0x6ea [mac80211] [] lock_acquire+0x52/0x6b [] run_workqueue+0x97/0x1ed [] run_workqueue+0xe7/0x1ed [] run_workqueue+0x97/0x1ed [] worker_thread+0xd8/0xe3 [] autoremove_wake_function+0x0/0x2e [] worker_thread+0x0/0xe3 [] kthread+0x47/0x73 [] trace_hardirqs_on_thunk+0x3a/0x3f [] child_rip+0xa/0x11 [] restore_args+0x0/0x30 [] finish_task_switch+0x0/0xcc [] kthread+0x0/0x73 [] child_rip+0x0/0x11 Although the above is harmless, when the ilwagn module is removed later lockdep will trigger a kernel oops as below. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [] zap_class+0x24/0x82 PGD 73128067 PUD 7448c067 PMD 0 Oops: 0002 [1] SMP CPU 0 Modules linked in: rfcomm l2cap bluetooth autofs4 sunrpc nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_log dm_multipath dm_mod snd_hda_intel sr_mod snd_seq_dummy snd_seq_oss snd_seq_midi_event battery snd_seq snd_seq_device cdrom button snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc e1000e snd_hwdep sg iTCO_wdt iTCO_vendor_support ac pcspkr i2c_i801 i2c_core snd soundcore video output ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: mac80211] Pid: 4941, comm: modprobe Not tainted 2.6.27-rc4 #10 RIP: 0010:[] [] zap_class+0x24/0x82 RSP: 0000:ffff88007bcb3eb0 EFLAGS: 00010046 RAX: 0000000000068ee8 RBX: ffffffff8192a0a0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000001dfb RDI: ffffffff816e70b0 RBP: ffffffffa00cd000 R08: ffffffff816818f8 R09: ffff88007c923558 R10: ffffe20002ad2408 R11: ffffffff811028ec R12: ffffffff8192a0a0 R13: 000000000002bd90 R14: 0000000000000000 R15: 0000000000000296 FS: 00007f9d1cee56f0(0000) GS:ffffffff814a58c0(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000073047000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 4941, threadinfo ffff88007bcb2000, task ffff8800758d1fc0) Stack: ffffffff81057376 0000000000000000 ffffffffa00f7b00 0000000000000000 0000000000000080 0000000000618278 00007fff24f16720 0000000000000000 ffffffff8105d37a ffffffffa00f7b00 ffffffff8105d591 313132303863616d Call Trace: [] ? lockdep_free_key_range+0x61/0xf5 [] ? free_module+0xd4/0xe4 [] ? sys_delete_module+0x1de/0x1f9 [] ? audit_syscall_entry+0x12d/0x160 [] ? system_call_fastpath+0x16/0x1b Code: b2 00 01 00 00 00 c3 31 f6 49 c7 c0 10 8a 61 81 eb 32 49 39 38 75 26 48 98 48 6b c0 38 48 8b 90 08 8a 61 81 48 8b 88 00 8a 61 81 <48> 89 51 08 48 89 0a 48 c7 80 08 8a 61 81 00 02 20 00 48 ff c6 RIP [] zap_class+0x24/0x82 RSP CR2: 0000000000000008 ---[ end trace a1297e0c4abb0f2e ]--- The root cause for this oops is in add_lock_to_list() when save_trace() fails due to MAX_STACK_TRACE_ENTRIES is reached, entry->class is assigned but entry is never added into any lock list. This makes the list_del_rcu() in zap_class() oops later when the module is unloaded. This patch fixes the problem by assigning entry->class after save_trace() returns success. Signed-off-by: Zhu Yi Signed-off-by: Ingo Molnar --- kernel/lockdep.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/lockdep.c b/kernel/lockdep.c index b5db51d280..dbda475b13 100644 --- a/kernel/lockdep.c +++ b/kernel/lockdep.c @@ -875,11 +875,11 @@ static int add_lock_to_list(struct lock_class *class, struct lock_class *this, if (!entry) return 0; - entry->class = this; - entry->distance = distance; if (!save_trace(&entry->trace)) return 0; + entry->class = this; + entry->distance = distance; /* * Since we never remove from the dependency list, the list can * be walked lockless by other CPUs, it's only allocation -- 2.39.5